NVIDIA Blackwell Ultra Sets New Inference Records in MLPerf Debut

09/09/2025

As large language models (LLMs) grow larger, they get smarter, with open models from leading developers now featuring hundreds of billions of parameters. At the same time, today's leading models are also capable of reasoning, which means that they generate many intermediate reasoning tokens before delivering a final response to the user. The combination of these two trends, larger models that think using more tokens, drives the need for significantly higher compute performance.

Delivering the highest performance on production workloads takes a state-of-the-art technology stack spanning chips, systems, and software, along with an expansive developer ecosystem that is constantly building on that stack.

MLPerf Inference v5.1 is the latest version of the MLPerf Inference industry standard benchmark. With benchmark rounds held twice per year, the benchmark features many tests of AI inference performance and is regularly updated with new models and scenarios. This round features:

DeepSeek-R1 - a popular 671-billion parameter mixture-of-experts (MoE) reasoning model, developed by DeepSeek. In the server scenario, the time-to-first-token (TTFT) threshold is 2 seconds with a 12.5 tokens/second/user (TPS/user) target. All TPS/user targets are 99th percentile, meaning that 99% of tokens meet or exceed that TPS/user speed.

Llama 3.1 405B - MLPerf Inference v5.1 adds a new interactive scenario for the largest of the Llama 3.1 series of models, providing a faster 12.5 TPS/user threshold with a shorter 4.5 second TTFT requirement compared to the existing server scenario.

Llama 3.1 8B - an 8-billion parameter member of the Llama 3.1 series of models with offline, server (2 second TTFT, 10 TPS/user), and interactive (0.5 second TTFT, 33 TPS/user) scenarios. This replaces the GPT-J benchmark used in prior rounds.

Whisper - a popular speech recognition model that recently saw nearly 5 million downloads in a month on Hugging Face. This replaces RNN-T, which was featured in prior editions of the MLPerf Inference benchmark suite.
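The TPS/user targets above translate directly into a per-token time budget: sustaining N tokens/second for a user means each token may take at most 1000/N milliseconds on average. The following is a minimal sketch of that arithmetic using the thresholds quoted in the list. The names `LatencyTarget`, `TARGETS`, and `request_meets_target` are hypothetical, and the single-request check is a simplification assumed for illustration; it is not the 99th-percentile rule the benchmark itself applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyTarget:
    """Latency constraints for one scenario, copied from the list above."""
    name: str
    ttft_limit_s: float       # maximum time-to-first-token, in seconds
    min_tps_per_user: float   # minimum tokens/second/user

    def per_token_budget_ms(self) -> float:
        """Average time allowed per output token to sustain min_tps_per_user."""
        return 1000.0 / self.min_tps_per_user


TARGETS = [
    LatencyTarget("DeepSeek-R1 server", 2.0, 12.5),
    LatencyTarget("Llama 3.1 405B interactive", 4.5, 12.5),
    LatencyTarget("Llama 3.1 8B server", 2.0, 10.0),
    LatencyTarget("Llama 3.1 8B interactive", 0.5, 33.0),
]


def request_meets_target(target: LatencyTarget, ttft_s: float,
                         output_tokens: int, decode_s: float) -> bool:
    """Simplified single-request check (an assumption, not the benchmark's rule).

    Compares one request's TTFT and average decode rate against the target.
    """
    if output_tokens <= 0 or decode_s <= 0:
        raise ValueError("output_tokens and decode_s must be positive")
    observed_tps = output_tokens / decode_s
    return ttft_s <= target.ttft_limit_s and observed_tps >= target.min_tps_per_user


if __name__ == "__main__":
    for t in TARGETS:
        print(f"{t.name}: TTFT <= {t.ttft_limit_s} s, "
              f"about {t.per_token_budget_ms():.1f} ms per token")
```

For example, the 12.5 TPS/user targets leave 80 ms per token, while the 33 TPS/user interactive target for Llama 3.1 8B leaves roughly 30 ms.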

This round, NVIDIA submitted the first results using the new Blackwell Ultra architecture, announced in March. It came just six months after Blackwell made its debut in the available category in MLPerf Inference v5.0, setting new inference performance records. Additionally, the NVIDIA platform set new performance records on all newly added benchmarks this round (DeepSeek-R1, Llama 3.1 405B, Llama 3.1 8B, and Whisper) and continues to hold per-GPU performance records on all other MLPerf Inference benchmarks.

MLPerf Inference Per-Accelerator Records

Benchmark           | Offline                   | Server                    | Interactive
DeepSeek-R1         | 5,842 tokens/second/GPU   | 2,907 tokens/second/GPU   | **
Llama 3.1 405B      | 224 tokens/second/GPU     | 170 tokens/second/GPU     | 138 tokens/second/GPU
Llama 2 70B 99.9%   | 12,934 tokens/second/GPU  | 12,701 tokens/second/GPU  | 7,856 tokens/second/GPU
Llama 2 70B 99%     | 13,015 tokens/second/GPU  | 12,701 tokens/second/GPU  | 7,856 tokens/second/GPU
Llama 3.1 8B        | 18,370 tokens/second/GPU  | 16,099 tokens/second/GPU  | 15,284 tokens/second/GPU
Stable Diffusion XL | 4.07 samples/second/GPU   | 3.59 queries/second/GPU   | **
Mixtral 8x7B        | 16,099 tokens/second/GPU  | 16,131 tokens/second/GPU  | **
DLRMv2 99%          | 87,228 samples/second/GPU | 80,515 samples/second/GPU | **
DLRMv2 99.9%        | 48,666 samples/second/GPU | 46,259 queries/second/GPU | **
Whisper             | 5,667 tokens/second/GPU   | **                        | **
R-GAT               | 81,404 samples/second/GPU | **                        | **
Retinanet           | 1,875 samples/second/GPU  | 1,801 queries/second/GPU  | **

Table 1. Performance records per GPU based on submissions powered by the NVIDIA platform. MLPerf Inference v5.0 and v5.1, Closed Division. Results retrieved from www.mlcommons.org on September 9, 2025. NVIDIA platform results from the following entries: 5.0-0072, 5.1-0007, 5.1-0053, 5.1-0079, 5.1-0028, 5.1-0062, 5.1-0086, 5.1-0073, 5.1-0008, 5.1-0070, 5.1-0046, 5.1-0009, 5.1-0060, 5.1-0072, 5.1-0071, 5.1-0069. Per-chip performance is derived by dividing total throughput by the number of reported chips. Per-chip performance is not a primary metric of MLPerf Inference v5.0 or v5.1. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
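The per-chip derivation described in the Table 1 caption is a single division. As a minimal sketch, the helper below shows it; the function name `per_gpu_throughput` is hypothetical, and the example figures (100,000 tokens/second on 8 GPUs) are made up purely for illustration and do not correspond to any submission.

```python
def per_gpu_throughput(total_throughput: float, num_gpus: int) -> float:
    """Divide a system's total throughput by its reported accelerator count."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be a positive integer")
    return total_throughput / num_gpus


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any MLPerf entry.
    total_tokens_per_second = 100_000
    gpus_in_system = 8
    print(per_gpu_throughput(total_tokens_per_second, gpus_in_system))
```

Because the result depends on the reported chip count, per-GPU figures are best read as a normalized comparison aid rather than as a measured benchmark metric.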

NVIDIA also made extensive use of NVFP4 acceleration across all DeepSeek-R1 and Llama model submissions using the Blackwell and Blackwell Ultra architectures.

In this post, we take a closer look at these performance results and the full-stack technologies that enabled them.

Blackwell Ultra sets reasoning records in MLPerf debut

This round, NVIDIA submitted results in the available category using the GB300 NVL72 rack-scale system, the first-ever MLPerf submissions using the Blackwell Ultra architecture. Blackwell Ultra builds upon the many advances in the NVIDIA Blackwell architecture, with several key enhancements:

1.5x higher peak NVFP4 AI compute

2x higher attention-layer compute

1.5x higher HBM3e capacity

Compared to the GB200 NVL72 submission, GB300 NVL72 delivered up to 45% higher performance per GPU, setting the standard on the new DeepSeek-R1 benchmark. And compared to unverified results collected on a Hopper-based system, Blackwell Ultra delivered about 5x higher throughput per GPU, translating into significantly higher AI factory throughput and much lower cost per token.
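The "about 5x" figure can be reproduced from the per-GPU DeepSeek-R1 numbers reported in Table 2. The sketch below simply divides the Blackwell Ultra throughput by the Hopper throughput for each scenario; the dictionary and function names are hypothetical, while the values are the ones listed in the table.

```python
# Per-GPU DeepSeek-R1 throughput in tokens/second/GPU, as listed in Table 2.
HOPPER = {"offline": 1253, "server": 556}
BLACKWELL_ULTRA = {"offline": 5842, "server": 2907}


def speedup(new: float, old: float) -> float:
    """Ratio of new throughput to old throughput."""
    if old <= 0:
        raise ValueError("old throughput must be positive")
    return new / old


if __name__ == "__main__":
    for scenario in HOPPER:
        ratio = speedup(BLACKWELL_ULTRA[scenario], HOPPER[scenario])
        print(f"{scenario}: {ratio:.1f}x")
```

Rounded to one decimal place, the ratios come out to 4.7x for the offline scenario and 5.2x for the server scenario, matching the "Blackwell Ultra Advantage" row.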

DeepSeek-R1 Performance

Architecture              | Offline                 | Server
Hopper                    | 1,253 tokens/second/GPU | 556 tokens/second/GPU
Blackwell Ultra           | 5,842 tokens/second/GPU | 2,907 tokens/second/GPU
Blackwell Ultra Advantage | 4.7x                    | 5.2x

Table 2. Per-GPU performance on DeepSeek-R1. MLPerf Inference v5.1, Closed. Blackwell Ultra results based on results in entry 5.1-0072. Hopper results not verified by MLCommons Association. Per-GPU performance is not a primary metric of MLPerf Inference v5.1 and is calcu
LINK: https://developer.nvidia.com/blog/nvidia-blackwell-ultra-sets-new-infe...