NVIDIA Wants Everyone To Rethink AI TCO, & Explains Why “Cost Per Token” Is The Only Metric That Matters

Apr 16, 2026 at 12:00pm EDT
As the AI industry enters its maturity phase, traditional metrics have become outdated, which is why NVIDIA suggests that AI TCO should now be evaluated based on "Cost Per Token".

NVIDIA Wants Everyone To Rethink AI TCO With "Cost Per Token" Metric

Tokens are the single most important metric for AI. While yesterday's data centers were evaluated on their raw computing power, today's AI factories are evaluated on their token output. But what matters is not who generates the most tokens; efficiency and cost are still the values that matter most. That is why the way AI factories think about TCO needs to change.

NVIDIA emphasizes that enterprises still rely on relative numbers such as chip specifications, compute cost, and FLOPS/$, and that this needs to change.

NVIDIA explains some of the factors that can lower token cost, using an equation for calculating the cost per million tokens. The company says that most AI enterprises focus only on the numerator, which is the cost per GPU per hour, but that's only the tip of the iceberg. The denominator of the equation is what actually helps minimize token costs and maximize revenue.
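The equation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the denominator is the number of tokens one GPU produces in an hour (tokens per second per GPU multiplied by 3,600); the function name and the example figures are hypothetical and are not taken from NVIDIA's material.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(cost_per_gpu_hour: float,
                            tokens_per_sec_per_gpu: float) -> float:
    """Return the dollar cost of producing one million tokens on one GPU.

    Assumed form of the equation:
        cost per GPU per hour / tokens per GPU per hour * 1,000,000
    """
    if tokens_per_sec_per_gpu <= 0:
        raise ValueError("token throughput must be positive")
    tokens_per_gpu_hour = tokens_per_sec_per_gpu * SECONDS_PER_HOUR
    return cost_per_gpu_hour / tokens_per_gpu_hour * 1_000_000


# Hypothetical figures, chosen only to illustrate the effect of each term.
baseline = cost_per_million_tokens(2.00, 1_000)         # $2.00/hour, 1,000 tokens/s
cheaper_gpu = cost_per_million_tokens(1.00, 1_000)      # halve the numerator
faster_serving = cost_per_million_tokens(2.00, 10_000)  # 10x the denominator

print(f"baseline:       ${baseline:.4f} per million tokens")
print(f"cheaper GPU:    ${cheaper_gpu:.4f} per million tokens")
print(f"faster serving: ${faster_serving:.4f} per million tokens")
```

In this sketch, halving the hourly GPU price only halves the token cost, while a tenfold gain in throughput cuts it tenfold, which is the point NVIDIA makes about the denominator.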

And why does all of this matter? The answer is simple: for AI enterprises, it is the cost per token that should matter, not the FLOPS per dollar.

For this, NVIDIA showcases a comparison between its Hopper and Blackwell GPUs. The cost of operating Hopper GPUs is lower than that of Blackwell, by around 2x, and the FLOPS per dollar also shows just a 2x difference in Blackwell's favor. So, going by these two metrics alone, Blackwell doesn't look like much of an upgrade, since it costs about 2x more, and that appears to offset its 2x performance advantage over the previous generation.

The actual difference lies in token throughput and the cost per million tokens. Blackwell delivers up to 65x more tokens per second per GPU than Hopper, and its cost per million tokens is 35x lower. For reference, the data was evaluated on SemiAnalysis's InferenceX v2 benchmark.

| Metric | NVIDIA Hopper (HGX H200) | NVIDIA Blackwell (GB300 NVL72) | NVIDIA Blackwell Relative to Hopper |
| --- | --- | --- | --- |
| Cost per GPU per Hour ($) | $1.41 | $2.65 | 2x |
| FLOP per Dollar (PFLOPS) | 2.8 | 5.6 | 2x |
| Tokens per Second per GPU | 90 | 6,000 | 65x |
| Tokens per Second per MW | 54K | 2.8M | 50x |
| Cost per Million Tokens ($) | $4.20 | $0.12 | 35x lower |
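As a rough sanity check, the rounded figures in the table above can be plugged into a cost-per-million-tokens calculation. This is a sketch under the assumption that the cost is the hourly GPU price divided by the tokens produced per GPU per hour; the variable and function names are illustrative, and because the published inputs are rounded, the results only approximately match the table's last row.

```python
SECONDS_PER_HOUR = 3600

# Rounded figures as published in the table (per GPU).
hopper = {"cost_per_gpu_hour": 1.41, "tokens_per_sec": 90}
blackwell = {"cost_per_gpu_hour": 2.65, "tokens_per_sec": 6_000}


def cost_per_million_tokens(gpu: dict) -> float:
    """Assumed formula: hourly cost / tokens per hour, scaled to 1M tokens."""
    tokens_per_hour = gpu["tokens_per_sec"] * SECONDS_PER_HOUR
    return gpu["cost_per_gpu_hour"] / tokens_per_hour * 1_000_000


hopper_cost = cost_per_million_tokens(hopper)
blackwell_cost = cost_per_million_tokens(blackwell)

# How much more Blackwell costs per hour, and how many more tokens it emits.
cost_ratio = blackwell["cost_per_gpu_hour"] / hopper["cost_per_gpu_hour"]
throughput_ratio = blackwell["tokens_per_sec"] / hopper["tokens_per_sec"]

# The token-cost advantage is the throughput gain divided by the price increase.
token_cost_advantage = throughput_ratio / cost_ratio

print(f"Hopper:    ${hopper_cost:.2f} per million tokens")
print(f"Blackwell: ${blackwell_cost:.2f} per million tokens")
print(f"Hourly cost ratio:    {cost_ratio:.2f}x")
print(f"Throughput ratio:     {throughput_ratio:.1f}x")
print(f"Token cost advantage: {token_cost_advantage:.1f}x lower")
```

With these rounded inputs, the calculation lands near the published $4.20 and $0.12, and it shows where the roughly 35x figure comes from: a throughput gain in the mid-60x range divided by a price increase of a little under 2x.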

Now you can treat all of this as NVIDIA's iconic "CEO Math," but there is some actual reasoning behind why these numbers matter. NVIDIA has a very powerful AI software stack and has been leading the charts across benchmarks, with others not even close.

NVIDIA's CEO has even challenged other firms to benchmark their own chips, since many claim to be ahead of NVIDIA without offering any proof.

"Nobody can demonstrate to me that any single platform in the world today has better performance TCO ratio. Not one company... I encourage them to use inference max and demonstrate their incredible inference cost. It's really really hard.. no nobody wants to show up."

Jensen Huang - NVIDIA CEO

With this rethinking of AI TCO and AI in general, NVIDIA isn't just claiming a victory in benchmarks; they are also claiming that they have the throne in metrics that matter to AI enterprises.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
