
As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another.
In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.
The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token.
MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
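To make the "only a few experts per inference" idea concrete, here is a minimal sketch of top-k expert routing. It assumes a top-2 router over eight experts, which is how Mixtral 8x7B is generally described. The expert functions and router scores are hypothetical toy values chosen for illustration, not anything taken from the MLPerf submissions.

```python
import math

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in top)          # subtract max for numerical stability
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum over the selected experts only; the others are never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Toy experts (assumption): each one just scales its input, standing in
# for a full feed-forward block.
experts = [lambda x, s=s: s * x for s in range(1, 9)]      # 8 experts
gate_logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.5, 1.0]   # hypothetical router scores

routes = top_k_route(gate_logits, k=2)
output = moe_forward(1.0, experts, gate_logits, k=2)

# Share of parameters touched per token, from the figures quoted above.
active_fraction = 12.9 / 46.7
print(routes, output, round(active_fraction, 3))
```

With the figures quoted above, roughly 28% of the model's parameters are active for any given token, which is where the efficiency gain over a dense model of similar total size comes from.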
The continued growth of LLMs is driving the need for more compute to process inference requests. To meet real-time latency requirements for serving today's LLMs, and to do so for as many users as possible, multi-GPU compute is a must. NVIDIA NVLink and NVSwitch provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture, which brings significant benefits for real-time, cost-effective large model inference. The Blackwell platform will further extend NVLink Switch's capabilities with larger NVLink domains of 72 GPUs.
In addition to the NVIDIA submissions, 10 NVIDIA partners - ASUSTek, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology and Supermicro - all made solid MLPerf Inference submissions, underscoring the wide availability of NVIDIA platforms.
Relentless Software Innovation
NVIDIA platforms undergo continuous software development, racking up performance and feature improvements on a monthly basis.
In the latest inference round, NVIDIA offerings, including the NVIDIA Hopper architecture, NVIDIA Jetson platform and NVIDIA Triton Inference Server, posted substantial performance gains.
The NVIDIA H200 GPU delivered up to 27% more generative AI inference performance over the previous round, underscoring the added value customers get over time from their investment in the NVIDIA platform.
Triton Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise software, is a fully featured open-source inference server that helps organizations consolidate framework-specific inference servers into a single, unified platform. This helps lower the total cost of ownership of serving AI models in production and cuts model deployment times from months to minutes.
In this round of MLPerf, Triton Inference Server delivered near-equal performance to NVIDIA's bare-metal submissions, showing that organizations no longer have to choose between using a feature-rich production-grade AI inference server and achieving peak throughput performance.
Going to the Edge
Deployed at the edge, generative AI models can transform sensor data, such as images and videos, into real-time, actionable insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics is uniquely capable of running any kind of model locally, including LLMs, vision transformers and Stable Diffusion.
In this round of MLPerf benchmarks, NVIDIA Jetson AGX Orin system-on-modules achieved more than a 6.2x throughput improvement and 2.4x latency improvement over the previous round on the GPT-J LLM workload. Rather than developing for a specific use case, developers can now use this general-purpose 6-billion-parameter model to seamlessly interface with human language, transforming generative AI at the edge.
Performance Leadership All Around
This round of MLPerf Inference showed the versatility and leading performance of NVIDIA platforms - extending from the data center to the edge - on all of the benchmark's workloads, supercharging the most innovative AI-powered applications and services. To learn more about these results, see our technical blog.
H200 GPU-powered systems are available today from CoreWeave - the first cloud service provider to announce general availability - and server makers ASUS, Dell Technologies, HPE, QCT and Supermicro.
See notice regarding software product information.