
Inference performance is critical, as it directly influences the economics of an AI factory. The higher the throughput of AI factory infrastructure, the more tokens it can produce at high speed - increasing revenue, driving down total cost of ownership (TCO) and enhancing the system's overall productivity.
Less than half a year since its debut at NVIDIA GTC, the NVIDIA GB300 NVL72 rack-scale system - powered by the NVIDIA Blackwell Ultra architecture - set records on the new reasoning inference benchmark in MLPerf Inference v5.1, delivering up to 1.4x more DeepSeek-R1 inference throughput compared with NVIDIA Blackwell-based GB200 NVL72 systems.
Blackwell Ultra builds on the success of the Blackwell architecture, featuring 1.5x more NVFP4 AI compute and 2x more attention-layer acceleration than Blackwell, as well as up to 288GB of HBM3e memory per GPU.
The NVIDIA platform also set performance records on all new data center benchmarks added to the MLPerf Inference v5.1 suite - including DeepSeek-R1, Llama 3.1 405B Interactive, Llama 3.1 8B and Whisper - while continuing to hold per-GPU records on every MLPerf data center benchmark.
Stacking It All Up
Full-stack co-design plays an important role in delivering these latest benchmark results. Blackwell and Blackwell Ultra incorporate hardware acceleration for the NVFP4 data format - an NVIDIA-designed 4-bit floating point format that provides better accuracy compared with other FP4 formats, as well as comparable accuracy to higher-precision formats.
NVIDIA TensorRT Model Optimizer software quantized DeepSeek-R1, Llama 3.1 405B, Llama 2 70B and Llama 3.1 8B to NVFP4. In concert with the open-source NVIDIA TensorRT-LLM library, this optimization enabled Blackwell and Blackwell Ultra to deliver higher performance while meeting strict accuracy requirements in submissions.
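To give a rough feel for what quantizing to a 4-bit floating point format involves, here is a minimal Python sketch of generic block-scaled FP4 quantization. It is not the TensorRT Model Optimizer API and not the exact NVFP4 specification. The E2M1 value grid, the block size of 16, the plain Python float used for the per-block scale and all function names are assumptions made for illustration.

```python
# Illustrative block-scaled 4-bit float quantization (assumed scheme, not NVIDIA's code).

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed number of values sharing one scale factor


def quantize_block(block):
    """Quantize one block of floats; returns (scale, [(is_negative, index), ...])."""
    max_abs = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest representable value.
    scale = max_abs / E2M1_VALUES[-1] if max_abs > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Round to the nearest representable magnitude.
        idx = min(range(len(E2M1_VALUES)), key=lambda i: abs(E2M1_VALUES[i] - target))
        codes.append((x < 0, idx))  # sign bit plus 3-bit magnitude index
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from a block's scale and 4-bit codes."""
    return [(-1.0 if neg else 1.0) * E2M1_VALUES[idx] * scale for neg, idx in codes]


def quantize(values, block_size=BLOCK_SIZE):
    """Split values into blocks and quantize each block with its own scale."""
    return [quantize_block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def dequantize(blocks):
    """Concatenate the dequantized values of all blocks."""
    out = []
    for scale, codes in blocks:
        out.extend(dequantize_block(scale, codes))
    return out
```

Because each block carries its own scale, one outlier only coarsens the values that share its block. That is one reason block-scaled formats can stay close to higher-precision accuracy.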
Large language model inference consists of two workloads with distinct execution characteristics: 1) context for processing user input to produce the first output token and 2) generation to produce all subsequent output tokens.
A technique called disaggregated serving splits context and generation tasks so each part can be optimized independently for best overall throughput. This technique was key to record-setting performance on the Llama 3.1 405B Interactive benchmark, helping to deliver a nearly 50% increase in performance per GPU with GB200 NVL72 systems compared with each Blackwell GPU in an NVIDIA DGX B200 server running the benchmark with traditional serving.
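The split between context and generation can be pictured with a toy Python sketch. It is a simplified illustration, not the NVIDIA Dynamo or TensorRT-LLM implementation. The `Request`, `ContextWorker` and `GenerationWorker` names, the `toy_next_token` stand-in for a model forward pass and the single-process scheduling loop are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    request_id: int
    prompt_tokens: list
    max_new_tokens: int
    kv_cache: list = field(default_factory=list)      # stand-in for attention state
    output_tokens: list = field(default_factory=list)


def toy_next_token(kv_cache):
    """Hypothetical stand-in for a model forward pass."""
    return sum(kv_cache) % 100


class ContextWorker:
    """Processes the whole prompt once and produces the first output token."""

    def run(self, req):
        req.kv_cache = list(req.prompt_tokens)
        first = toy_next_token(req.kv_cache)
        req.output_tokens.append(first)
        req.kv_cache.append(first)
        return req


class GenerationWorker:
    """Produces subsequent tokens one at a time from the handed-off state."""

    def step(self, req):
        token = toy_next_token(req.kv_cache)
        req.output_tokens.append(token)
        req.kv_cache.append(token)
        return len(req.output_tokens) >= req.max_new_tokens  # True when finished


def serve(requests):
    """Route each request through the context pool, then the generation pool."""
    context_queue, generation_queue = deque(requests), deque()
    context_worker, generation_worker = ContextWorker(), GenerationWorker()
    finished = []
    while context_queue or generation_queue:
        if context_queue:
            req = context_worker.run(context_queue.popleft())
            # Hand-off point: the state built during context moves to generation.
            done = len(req.output_tokens) >= req.max_new_tokens
            (finished if done else generation_queue).append(req)
        # One generation step for every request currently in flight.
        for _ in range(len(generation_queue)):
            req = generation_queue.popleft()
            done = generation_worker.step(req)
            (finished if done else generation_queue).append(req)
    return finished
```

In a real deployment the two worker types would run on separate GPUs or nodes, so each pool can be sized and tuned for its own workload. The hand-off would also involve transferring attention state between them.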
NVIDIA also made its first submissions this round using the NVIDIA Dynamo inference framework.
NVIDIA partners - including cloud service providers and server makers - submitted great results using the NVIDIA Blackwell and/or Hopper platform. These partners include Azure, Broadcom, Cisco, CoreWeave, Dell Technologies, Giga Computing, HPE, Lambda, Lenovo, Nebius, Oracle, Quanta Cloud Technology, Supermicro and the University of Florida.
The market-leading inference performance on the NVIDIA AI platform is available from major cloud providers and server makers. This translates to lower TCO and enhanced return on investment for organizations deploying sophisticated AI applications.
Learn more about these full-stack technologies by reading the NVIDIA Technical Blog on MLPerf Inference v5.1. Plus, visit the NVIDIA DGX Cloud Performance Explorer to learn more about NVIDIA performance and model TCO, and to generate custom reports.