NVIDIA GB300 Dominates Agentic AI Workloads With 20x Performance Leap Over Hopper As Rubin Nears Launch

Jun 14, 2026 at 08:25am EDT

NVIDIA's Blackwell GB300 has posted record performance in AA-AgentPerf, a new benchmark that measures Agentic AI workflows.

NVIDIA Blackwell Ultra GB300 is 20 Times Faster Than Hopper In Agentic AI, Records Highest Performance In Latest Benchmarks

Artificial Analysis has released a new benchmark, AA-AgentPerf, which measures how many active agents an inference deployment can support under realistic workloads.


The AA-AgentPerf benchmark measures three key metrics that form the basis of modern-day AI deployments.

NVIDIA is now publishing its first AA-AgentPerf results, measured with DeepSeek V4 Pro on its GB300 NVL72 platform. This model is representative of the frontier models that power agents today and are widely used for AI.

In the first round of benchmarks, NVIDIA recorded the fastest performance with its GB300 hardware, posting a 20x lead per megawatt (MW) over its older HGX H200 platform. GB300 sustains about 61,400 concurrent agents per MW against roughly 2,600 for H200, a massive leap over Hopper.

| Benchmark | Value of metric | NVIDIA GB300 NVL72 | NVIDIA H200 |
| --- | --- | --- | --- |
| Concurrent agents per MW | Energy efficiency: how many active agents a system can support for a given power budget | 61.4K | 2.6K |
| Concurrent agents per GPU | Hardware efficiency: how much serving capacity is achieved per GPU | 57.5 | 1.4 |
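
As a quick sanity check on the generational lead, the ratios can be worked out directly from the four figures in the table above. This is a minimal sketch: the `RESULTS` dictionary and the `ratio` helper are illustrative names rather than anything published by NVIDIA or Artificial Analysis, and because the inputs are the rounded table values, the outputs are approximate.

```python
# Figures as quoted in the table above (61.4K and 2.6K expanded to integers).
# The dictionary layout and key names are assumptions made for this example.
RESULTS = {
    "GB300 NVL72": {"agents_per_mw": 61_400, "agents_per_gpu": 57.5},
    "H200": {"agents_per_mw": 2_600, "agents_per_gpu": 1.4},
}


def ratio(metric: str, new: str = "GB300 NVL72", old: str = "H200") -> float:
    """Return how many times larger `metric` is on `new` than on `old`."""
    return RESULTS[new][metric] / RESULTS[old][metric]


if __name__ == "__main__":
    for metric in ("agents_per_mw", "agents_per_gpu"):
        # Prints 23.6x for agents_per_mw and 41.1x for agents_per_gpu.
        print(f"{metric}: {ratio(metric):.1f}x")
```

On these rounded inputs, the per-megawatt lead comes out at about 23.6x and the per-GPU lead at about 41.1x, so the headline 20x figure sits at the conservative end of what the table implies.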

NVIDIA states that the results highlight the ability of the GB300 NVL72 and the Blackwell architecture to run large-scale agentic coding workloads while keeping the GPUs fully utilized across several concurrent agent sessions.

Looking forward, NVIDIA's Rubin is on the horizon and is expected to extend these leads through a supercharged AI architecture that will offer 50 PFLOPS of NVFP4 compute. Paired with the Vera CPU, it is also expected to deliver major performance and efficiency gains in LLM tool calls and end-to-end performance.

About the author: A software engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.

