
As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another.
In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.
The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token.
MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
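To make the efficiency point concrete, the short sketch below works out what share of Mixtral 8x7B's parameters is used for each token, using only the two figures quoted above. The function and constant names are illustrative assumptions, not part of any NVIDIA or MLPerf tooling, and the parameter share is only a rough proxy for compute per token, not a measured speedup.

```python
# Figures quoted in the article for Mixtral 8x7B (in billions of parameters).
TOTAL_PARAMS_B = 46.7   # total parameters across all experts
ACTIVE_PARAMS_B = 12.9  # parameters active for any single token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of a model's parameters used for one token.

    Hypothetical helper for illustration only.
    """
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%} of all parameters")
```

By this rough measure, a little over a quarter of the model's weights take part in processing each token, which is why an MoE model can respond faster than a dense model of similar total size.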
The continued growth of LLMs is driving the need for more compute to process inference requests. To meet real-time latency requirements for serving today's LLMs, and to do so for as many users as possible, multi-GPU compute is a must. NVIDIA NVLink and NVSwitch provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture, a significant benefit for real-time, cost-effective large model inference. The Blackwell platform will further extend NVLink Switch's capabilities with larger NVLink domains of 72 GPUs.
In addition to the NVIDIA submissions, 10 NVIDIA partners - ASUSTek, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology and Supermicro - all made solid MLPerf Inference submissions, underscoring the wide availability of NVIDIA platforms.
Relentless Software Innovation
NVIDIA platforms undergo continuous software development, racking up performance and feature improvements on a monthly basis.
In the latest inference round, NVIDIA offerings, including the NVIDIA Hopper architecture, NVIDIA Jetson platform and NVIDIA Triton Inference Server, posted substantial performance gains.
The NVIDIA H200 GPU delivered up to 27% more generative AI inference performance over the previous round, underscoring the added value customers get over time from their investment in the NVIDIA platform.
Triton Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise software, is a fully featured open-source inference server that helps organizations consolidate framework-specific inference servers into a single, unified platform. This helps lower the total cost of ownership of serving AI models in production and cuts model deployment times from months to minutes.
In this round of MLPerf, Triton Inference Server delivered near-equal performance to NVIDIA's bare-metal submissions, showing that organizations no longer have to choose between using a feature-rich production-grade AI inference server and achieving peak throughput performance.
Going to the Edge
Deployed at the edge, generative AI models can transform sensor data, such as images and videos, into real-time, actionable insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics is uniquely capable of running any kind of model locally, including LLMs, vision transformers and Stable Diffusion.
In this round of MLPerf benchmarks, NVIDIA Jetson AGX Orin system-on-modules achieved more than a 6.2x throughput improvement and 2.4x latency improvement over the previous round on the GPT-J LLM workload. Rather than developing for a specific use case, developers can now use this general-purpose 6-billion-parameter model to seamlessly interface with human language, transforming generative AI at the edge.
Performance Leadership All Around
This round of MLPerf Inference showed the versatility and leading performance of NVIDIA platforms - extending from the data center to the edge - on all of the benchmark's workloads, supercharging the most innovative AI-powered applications and services. To learn more about these results, see our technical blog.
H200 GPU-powered systems are available today from CoreWeave - the first cloud service provider to announce general availability - and server makers ASUS, Dell Technologies, HPE, QCT and Supermicro.
See notice regarding software product information.