
NVIDIA Blackwell swept the new SemiAnalysis InferenceMAX v1 benchmarks, delivering the highest performance and best overall efficiency.
InferenceMAX v1 is the first independent benchmark to measure total cost of compute across diverse models and real-world scenarios.
Best return on investment: NVIDIA GB200 NVL72 delivers unmatched AI factory economics - a $5 million investment generates $75 million in DeepSeek R1 (DSR1) token revenue, a 15x return on investment.
Lowest total cost of ownership: NVIDIA B200 software optimizations achieve two cents per million tokens on gpt-oss, delivering 5x lower cost per token in just 2 months.
Best throughput and interactivity: NVIDIA B200 sets the pace with 60,000 tokens per second per GPU and 1,000 tokens per second per user on gpt-oss with the latest NVIDIA TensorRT-LLM stack.
As AI shifts from one-shot answers to complex reasoning, the demand for inference - and the economics behind it - is exploding.
The new independent InferenceMAX v1 benchmarks are the first to measure total cost of compute across real-world scenarios. The results? The NVIDIA Blackwell platform swept the field - delivering unmatched performance and best overall efficiency for AI factories.
A $5 million investment in an NVIDIA GB200 NVL72 system can generate $75 million in token revenue. That's a 15x return on investment (ROI) - the new economics of inference.
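The arithmetic behind that figure is easy to check. The sketch below uses only the two dollar amounts quoted above; the function name is illustrative and not part of any benchmark tooling, and the result is a gross revenue multiple, which is how the 15x is stated.

```python
def revenue_multiple(investment_usd: float, token_revenue_usd: float) -> float:
    """Return token revenue as a multiple of the up-front investment."""
    if investment_usd <= 0:
        raise ValueError("investment must be positive")
    return token_revenue_usd / investment_usd


# Figures quoted in the article: $5 million invested, $75 million in token revenue.
multiple = revenue_multiple(5_000_000, 75_000_000)
print(f"{multiple:.0f}x")  # 15x
```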
"Inference is where AI delivers value every day," said Ian Buck, vice president of hyperscale and high-performance computing at NVIDIA. "These results show that NVIDIA's full-stack approach gives customers the performance and efficiency they need to deploy AI at scale."
Enter InferenceMAX v1
InferenceMAX v1, a new benchmark from SemiAnalysis released Monday, is the latest to highlight Blackwell's inference leadership. It runs popular models across leading platforms, measures performance for a wide range of use cases and publishes results anyone can verify.
Why do benchmarks like this matter?
Because modern AI isn't just about raw speed - it's about efficiency and economics at scale. As models shift from one-shot replies to multistep reasoning and tool use, they generate far more tokens per query, dramatically increasing compute demands.
NVIDIA's open-source collaborations with OpenAI (gpt-oss 120B), Meta (Llama 3 70B), and DeepSeek AI (DeepSeek R1) highlight how community-driven models are advancing state-of-the-art reasoning and efficiency.
Partnering with these leading model builders and the open-source community, NVIDIA ensures the latest models are optimized for the world's largest AI inference infrastructure. These efforts reflect a broader commitment to open ecosystems - where shared innovation accelerates progress for everyone.
Deep collaborations with the FlashInfer, SGLang and vLLM communities enable codeveloped kernel and runtime enhancements that power these models at scale.
Software Optimizations Deliver Continued Performance Gains
NVIDIA continuously improves performance through hardware and software codesign optimizations. Initial gpt-oss-120b performance on an NVIDIA DGX Blackwell B200 system with the NVIDIA TensorRT LLM library was market-leading, and NVIDIA's teams and the community have since significantly optimized TensorRT LLM for open-source large language models.
The TensorRT LLM v1.0 release is a major breakthrough in making large AI models faster and more responsive for everyone.
Through advanced parallelization techniques, it uses the B200 system and NVIDIA NVLink Switch's 1,800 GB/s bidirectional bandwidth to dramatically improve the performance of the gpt-oss-120b model.
The innovation doesn't stop there. The newly released gpt-oss-120b-Eagle3-v2 model introduces speculative decoding, a clever method that predicts multiple tokens at a time.
This reduces lag and delivers even quicker results, tripling throughput at 100 tokens per second per user (TPS/user) - boosting per-GPU speeds from 6,000 to 30,000 tokens per second.
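To make the idea of speculative decoding concrete, here is a minimal toy sketch of the general greedy technique: a cheap draft model proposes several tokens, and the target model keeps only the prefix it agrees with. This is not the Eagle3 method or any TensorRT LLM API; the two "models" are hypothetical stand-in functions over integer tokens, chosen only so the example runs.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[Sequence[int]], int]


def target_model(ctx: Sequence[int]) -> int:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def draft_model(ctx: Sequence[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except after token 2."""
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


def speculative_step(context: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Propose k draft tokens, keep the prefix the target agrees with,
    then append one token from the target. Returns the new tokens."""
    proposed: List[int] = []
    ctx = list(context)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    accepted: List[int] = []
    ctx = list(context)
    for tok in proposed:
        # A real system would verify all k proposals in one batched target pass.
        if target(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)

    # The target always supplies one token: a correction at the first
    # mismatch, or a bonus token if every proposal was accepted.
    accepted.append(target(ctx))
    return accepted


new_tokens = speculative_step([0], draft_model, target_model, k=4)
print(new_tokens)  # [1, 2, 3]
```

In this toy run the draft is right twice and wrong once, so a single verification round yields three tokens instead of one, and the output is identical to what the target model would have produced on its own.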
For dense AI models like Llama 3.3 70B, which demand significant computational resources due to their large parameter count and the fact that all parameters are utilized simultaneously during inference, NVIDIA Blackwell B200 sets a new performance standard in InferenceMAX v1 benchmarks.
Blackwell delivers over 10,000 TPS per GPU at 50 TPS per user interactivity - 4x higher per-GPU throughput compared with the NVIDIA H200 GPU.
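Those two numbers can be related with simple arithmetic. The sketch below assumes, as a simplification, that per-GPU throughput is the sum of identical per-user streams; the function name is illustrative, and "over 10,000" is treated as a lower bound.

```python
def concurrent_users(gpu_tps: float, user_tps: float) -> float:
    """Users one GPU can serve if each receives user_tps tokens per second."""
    if user_tps <= 0:
        raise ValueError("user_tps must be positive")
    return gpu_tps / user_tps


blackwell_tps = 10_000   # "over 10,000 TPS per GPU", used here as a lower bound
user_tps = 50            # interactivity level quoted in the article

users = concurrent_users(blackwell_tps, user_tps)
implied_h200_tps = blackwell_tps / 4   # from the stated 4x per-GPU advantage

print(users)             # 200.0
print(implied_h200_tps)  # 2500.0
```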
Performance Efficiency Drives Value
Metrics like tokens per watt, cost per million tokens and TPS/user matter as much as throughput. In fact, for power-limited AI factories, Blackwell delivers 10x throughput per megawatt compared with the previous generation, which translates into higher token revenue.
The cost per token is crucial for evaluating AI model efficiency, directly impacting operational expenses. The NVIDIA Blackwell architecture lowered cost per million tokens by 15x versus the previous generation, leading to substantial savings and fostering wider AI deployment and innovation.
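One common way to express cost per million tokens is to divide an hourly cost by hourly token output. The sketch below is a rough illustration under stated assumptions: the $3.00 hourly figure is hypothetical and not from the benchmark, and the throughput value is only a placeholder.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars per one million generated tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical all-in hourly cost per GPU (an assumption, not a benchmark figure).
hourly_cost = 3.00
# Placeholder per-GPU throughput in tokens per second.
throughput = 10_000

cost = cost_per_million_tokens(hourly_cost, throughput)
print(f"${cost:.3f} per million tokens")  # $0.083 per million tokens
```

Under this formula, cost per token falls in direct proportion to throughput at a fixed hourly cost, which is why software optimizations that raise tokens per second also lower cost per token.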
Multidimensional Performance
InferenceMAX uses the Pareto frontier - a curve that shows the best trade-offs between different factors, such as data center throughput and responsiveness - to map performance.
But it's more than a chart. It reflects how NVIDIA Blackwell balances the full spectrum of production priorities: cost, energy efficiency, throughput and responsiveness. That balance enables the highest ROI across real-world workloads.
Systems that optimize for just one mode or scenario may show peak performance in isolation, but the economics of that approach don't scale. Blackwell's full-stack design delivers efficiency and value where it matters most: in production.
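As a rough illustration of what a Pareto frontier is, the sketch below picks out the configurations that no other configuration beats on both interactivity and throughput. The data points are made up for the example and are not InferenceMAX results; the function name is likewise illustrative.

```python
from typing import List, Tuple

# (interactivity in TPS/user, throughput in TPS/GPU); higher is better on both axes.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that are not dominated on both axes."""
    frontier: List[Point] = []
    best_throughput = float("-inf")
    # Sweep from the most interactive point down; a point joins the frontier
    # only if it beats the throughput of every more interactive point.
    for interactivity, throughput in sorted(points, key=lambda p: (-p[0], -p[1])):
        if throughput > best_throughput:
            frontier.append((interactivity, throughput))
            best_throughput = throughput
    return frontier


# Made-up serving configurations for illustration only.
configs = [(20, 9000), (50, 6000), (100, 3000), (50, 4000), (80, 2500), (150, 1000)]
print(pareto_frontier(configs))
# [(150, 1000), (100, 3000), (50, 6000), (20, 9000)]
```

The two dropped configurations, (50, 4000) and (80, 2500), are each matched or beaten on both axes by another point, so they never represent the best available trade-off.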
For a deeper look at how these curves are built - and why they matter for total cost of ownership and service-level agreement planning - check out this technical deep d