
As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another.
In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.
The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token.
MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
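To make the "only a few experts per inference" idea concrete, here is a minimal sketch of top-k expert routing for a single token. It is an illustration only, not the implementation used in Mixtral or in any MLPerf submission. The expert count of 8, the choice of k=2, the toy expert functions and the router scores are all assumptions made for the example. Only the two parameter counts at the end come from the article.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are the softmax of the selected logits only, so they sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Run only the k selected experts and blend their outputs."""
    output = 0.0
    for idx, weight in route_top_k(router_logits, k):
        output += weight * experts[idx](x)
    return output

# Hypothetical toy experts: expert i simply multiplies its input by (i + 1).
NUM_EXPERTS = 8  # assumed for illustration
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

# Hypothetical router scores for one token (one score per expert).
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]

selected = route_top_k(router_logits, k=2)
result = moe_forward(1.0, experts, router_logits, k=2)

# Figures quoted in the article for Mixtral 8x7B, in billions of parameters.
TOTAL_PARAMS_B = 46.7
ACTIVE_PARAMS_B = 12.9
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print("selected experts:", [idx for idx, _ in selected])
print("active parameter fraction: {:.1%}".format(active_fraction))
```

In this sketch, six of the eight toy experts are never evaluated for the token, which is the source of the efficiency gain described above. Using the article's figures, roughly 28% of Mixtral 8x7B's parameters are active for any given token.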
The continued growth of LLMs is driving the need for more compute to process inference requests. To meet real-time latency requirements for serving today's LLMs, and to do so for as many users as possible, multi-GPU compute is a must. NVIDIA NVLink and NVSwitch provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture and provide significant benefits for real-time, cost-effective large model inference. The Blackwell platform will further extend NVLink Switch's capabilities with larger NVLink domains of 72 GPUs.
In addition to the NVIDIA submissions, 10 NVIDIA partners - ASUSTek, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology and Supermicro - all made solid MLPerf Inference submissions, underscoring the wide availability of NVIDIA platforms.
Relentless Software Innovation
NVIDIA platforms undergo continuous software development, racking up performance and feature improvements on a monthly basis.
In the latest inference round, NVIDIA offerings, including the NVIDIA Hopper architecture, NVIDIA Jetson platform and NVIDIA Triton Inference Server, delivered substantial performance gains.
The NVIDIA H200 GPU delivered up to 27% more generative AI inference performance over the previous round, underscoring the added value customers get over time from their investment in the NVIDIA platform.
Triton Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise software, is a fully featured open-source inference server that helps organizations consolidate framework-specific inference servers into a single, unified platform. This helps lower the total cost of ownership of serving AI models in production and cuts model deployment times from months to minutes.
In this round of MLPerf, Triton Inference Server delivered near-equal performance to NVIDIA's bare-metal submissions, showing that organizations no longer have to choose between using a feature-rich production-grade AI inference server and achieving peak throughput performance.
Going to the Edge
Deployed at the edge, generative AI models can transform sensor data, such as images and videos, into real-time, actionable insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics is uniquely capable of running any kind of model locally, including LLMs, vision transformers and Stable Diffusion.
In this round of MLPerf benchmarks, NVIDIA Jetson AGX Orin system-on-modules achieved more than a 6.2x throughput improvement and 2.4x latency improvement over the previous round on the GPT-J LLM workload. Rather than developing for a specific use case, developers can now use this general-purpose 6-billion-parameter model to seamlessly interface with human language, transforming generative AI at the edge.
Performance Leadership All Around
This round of MLPerf Inference showed the versatility and leading performance of NVIDIA platforms - extending from the data center to the edge - on all of the benchmark's workloads, supercharging the most innovative AI-powered applications and services. To learn more about these results, see our technical blog.
H200 GPU-powered systems are available today from CoreWeave - the first cloud service provider to announce general availability - and server makers ASUS, Dell Technologies, HPE, QCT and Supermicro.
See notice regarding software product information.