Rethinking AI TCO: Why Cost per Token Is the Only Metric That Matters

15/04/2026

Traditional data centers only stored, retrieved and processed data. In the generative and agentic AI era, these facilities have evolved into AI token factories. With AI inference becoming their primary workload, their primary output is intelligence manufactured in the form of tokens.

This transformation demands a corresponding shift in how the economics of AI infrastructure, including total cost of ownership (TCO), are assessed. Enterprises evaluating AI infrastructure still too often focus on peak chip specifications, compute cost, or floating point operations per second per dollar spent (FLOPS per dollar).

The distinction that matters is this:

Compute cost is what enterprises pay for AI infrastructure, whether rented from cloud providers or owned on premises.

FLOPS per dollar is how much raw computing power an enterprise gets for every dollar spent, but raw compute and real-world token output are not the same thing.

Cost per token is an enterprise's all-in cost to produce each delivered token, usually represented as cost per million tokens.

The first two are merely input metrics. Optimizing for inputs while the business runs on output is a fundamental mismatch.

Cost per token determines whether enterprises can profitably scale AI. It's the one TCO metric that directly accounts for hardware performance, software optimization, ecosystem support and real-world utilization - and NVIDIA delivers the lowest cost per token in the industry.

What Are the Factors That Lower Token Cost?

Understanding how to optimize token cost requires looking at the equation for calculating cost per million tokens.
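One form of that equation, reconstructed from the numerator (cost per GPU per hour) and denominator (delivered token output) described in this section, can be written as follows. It is a sketch rather than the original figure, and it assumes that delivered output is measured as tokens per second per GPU and converted to an hourly figure:

```latex
\text{Cost per million tokens}
  = \frac{\text{Cost per GPU per hour}}
         {\text{Tokens per second per GPU} \times 3600}
    \times 10^{6}
```

The factor of 3600 converts tokens per second into tokens per hour so that the units match the hourly cost in the numerator.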

In this equation, many enterprises evaluating AI infrastructure focus on the numerator: the cost per GPU per hour. For cloud deployments, this is the hourly rate paid to a cloud provider; for on-premises deployments, it's the effective hourly cost derived from amortizing owned infrastructure. The real key to reducing token cost, however, lies in the denominator: maximizing the delivered token output.

That denominator carries two business implications.

Minimize token cost: When higher token output flows through the cost equation, it drives down cost per token, which grows the profit margin on every interaction served.

Maximize revenue: More tokens delivered per second also translate to more tokens per megawatt, which means more intelligence to use in AI-powered products and services, generating more revenue from the same infrastructure investment.

So focusing only on the numerator means missing what drives the denominator. Think of it as an inference iceberg: The numerator sits above the surface, visible and easy to compare. The denominator is everything beneath the surface, which represents key factors that determine real-world token output. Accurately evaluating AI infrastructure starts with asking what lies beneath.

Surface-level inquiry:

What is the cost per GPU hour?

What are the peak petaflops and high-bandwidth memory capacity?

What are the FLOPS per dollar?

In-depth cost analysis:

What is the cost per million tokens? Specifically, what is the cost per million tokens for large-scale mixture-of-experts (MoE) reasoning models, which represent the most widely deployed type of AI models?

What is the delivered token output per megawatt? For on-premises deployments especially, where capital commitment to land, power and infrastructure is substantial, maximizing intelligence produced per megawatt is critical.

Can the scale-up interconnect handle the all-to-all traffic of MoE models?

Is FP4 precision supported? Can the inference stack make use of FP4 while maintaining high accuracy?

Does the inference runtime support speculative decoding or multi-token prediction to increase user interactivity?

Does the serving layer support disaggregated serving, KV-aware routing, KV-cache offloading and other optimizations?

Does the platform support the unique workload requirements of agentic AI - including ultralow latency, high throughput and large input sequence lengths?

Does the platform support the full lifecycle, from training and post-training to high-scale inference, across all model architectures, to ensure infrastructure fungibility and high utilization?

Every one of these algorithmic, hardware and software optimizations must be active and integrated, or the denominator collapses. A cheaper GPU that delivers significantly fewer tokens per second results in a much higher cost per token. AI infrastructure that gets it right across the full stack ensures that every optimization enhances the others.
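The point about a cheaper GPU can be made concrete with a small calculation. The sketch below uses purely hypothetical prices and throughputs (they are not taken from any benchmark) and assumes cost per million tokens is simply hourly GPU cost divided by tokens delivered per hour. The function names are illustrative.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(gpu_hour_cost: float, tokens_per_sec_per_gpu: float) -> float:
    """Hourly GPU cost divided by tokens delivered per hour, scaled to one million tokens."""
    if tokens_per_sec_per_gpu <= 0:
        raise ValueError("throughput must be positive")
    tokens_per_hour = tokens_per_sec_per_gpu * SECONDS_PER_HOUR
    return gpu_hour_cost / tokens_per_hour * 1_000_000


def break_even_throughput(cheap_cost: float, ref_cost: float, ref_tokens_per_sec: float) -> float:
    """Throughput the cheaper GPU must deliver to match the reference cost per token."""
    return ref_tokens_per_sec * cheap_cost / ref_cost


# Hypothetical figures, chosen only to illustrate the arithmetic.
premium = cost_per_million_tokens(gpu_hour_cost=4.00, tokens_per_sec_per_gpu=2000)
budget = cost_per_million_tokens(gpu_hour_cost=2.00, tokens_per_sec_per_gpu=400)
needed = break_even_throughput(cheap_cost=2.00, ref_cost=4.00, ref_tokens_per_sec=2000)

print(f"premium GPU: ${premium:.2f} per million tokens")
print(f"budget GPU:  ${budget:.2f} per million tokens")
print(f"budget GPU needs {needed:.0f} tokens/s to break even")
```

In this made-up case the GPU at half the hourly price ends up with the higher cost per token, because it would need half the reference throughput to break even and delivers only a fifth of it.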

Why Does Cost per Token Matter Much More Than FLOPS per Dollar?

The following data for the DeepSeek-R1 AI model demonstrates the difference between theoretical and actual business outcomes.

Looking at compute cost alone, the NVIDIA Blackwell platform appears to cost roughly 2x as much as NVIDIA Hopper, but compute cost says nothing about the output that investment buys. An analysis of FLOPS per dollar alone suggests a 2x advantage for NVIDIA Blackwell over the NVIDIA Hopper architecture. The actual outcome differs by more than an order of magnitude: Blackwell delivers more than 50x greater token output per watt than Hopper, resulting in nearly 35x lower cost per million tokens.

| Metric | NVIDIA Hopper (HGX H200) | NVIDIA Blackwell (GB300 NVL72) | NVIDIA Blackwell Relative to Hopper |
| --- | --- | --- | --- |
| Cost per GPU per Hour ($) | $1.41 | $2.65 | 2x |
| FLOPS per Dollar (PFLOPS) | 2.8 | 5.6 | 2x |
| Tokens per Second per GPU | 90 | 6,000 | 65x |
| Tokens per Second per MW | 54K | 2.8M | 50x |
| Cost per Million Tokens ($) | $4.20 | $0.12 | 35x lower |

Note: Data is sourced from NVIDIA analysis and the SemiAnalysis InferenceX v2 benchmark.
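As a rough cross-check, the cost-per-token figures can be approximately recomputed from the first and third rows of the table. The sketch below assumes cost per million tokens equals cost per GPU per hour divided by tokens per GPU per hour. The `Platform` class is illustrative, and because the published inputs are rounded, the results land near the table values (about $4.35 versus the listed $4.20 for Hopper) without matching them exactly.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600


@dataclass(frozen=True)
class Platform:
    name: str
    gpu_hour_cost: float           # "Cost per GPU per Hour ($)" row
    tokens_per_sec_per_gpu: float  # "Tokens per Second per GPU" row

    def cost_per_million_tokens(self) -> float:
        tokens_per_hour = self.tokens_per_sec_per_gpu * SECONDS_PER_HOUR
        return self.gpu_hour_cost / tokens_per_hour * 1_000_000


# Values copied from the table above.
hopper = Platform("HGX H200", gpu_hour_cost=1.41, tokens_per_sec_per_gpu=90)
blackwell = Platform("GB300 NVL72", gpu_hour_cost=2.65, tokens_per_sec_per_gpu=6_000)

price_ratio = blackwell.gpu_hour_cost / hopper.gpu_hour_cost
throughput_ratio = blackwell.tokens_per_sec_per_gpu / hopper.tokens_per_sec_per_gpu
cost_ratio = hopper.cost_per_million_tokens() / blackwell.cost_per_million_tokens()

for p in (hopper, blackwell):
    print(f"{p.name}: ${p.cost_per_million_tokens():.2f} per million tokens")
print(f"hourly price ratio:   {price_ratio:.1f}x")
print(f"throughput ratio:     {throughput_ratio:.1f}x")
print(f"cost per token ratio: {cost_ratio:.1f}x lower on Blackwell")
```

The cost-per-token ratio is the throughput ratio divided by the price ratio, which is why a roughly 2x higher hourly price is outweighed by a throughput gain of more than 60x.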

This divergence shows that NVIDIA Blackwell delivers a massive leap in business value over the earlier Hopper generation that far outpaces

LINK:	https://blogs.nvidia.com/blog/lowest-token-cost-ai-factories/...
