
In its debut on the MLPerf industry benchmarks, the NVIDIA GH200 Grace Hopper Superchip ran all data center inference tests, extending the leading performance of NVIDIA H100 Tensor Core GPUs.
The overall results showed the exceptional performance and versatility of the NVIDIA AI platform from the cloud to the network's edge.
Separately, NVIDIA announced inference software that will give users leaps in performance, energy efficiency and total cost of ownership.
GH200 Superchips Shine in MLPerf
The GH200 links a Hopper GPU with a Grace CPU in one superchip. The combination provides more memory, bandwidth and the ability to automatically shift power between the CPU and GPU to optimize performance.
Separately, NVIDIA HGX H100 systems that pack eight H100 GPUs delivered the highest throughput on every MLPerf Inference test in this round.
Grace Hopper Superchips and H100 GPUs led across all of MLPerf's data center tests, including inference for computer vision, speech recognition and medical imaging, in addition to the more demanding use cases of recommendation systems and the large language models (LLMs) used in generative AI.
Overall, the results continue NVIDIA's record of demonstrating performance leadership in AI training and inference in every round since the launch of the MLPerf benchmarks in 2018.
The latest MLPerf round included an updated test of recommendation systems, as well as the first inference benchmark on GPT-J, an LLM with six billion parameters, a rough measure of an AI model's size.
TensorRT-LLM Supercharges Inference
To cut through complex workloads of every size, NVIDIA developed TensorRT-LLM, generative AI software that optimizes inference. The open-source library, which was not ready in time for the August submission to MLPerf, enables customers to more than double the inference performance of their already purchased H100 GPUs at no added cost.
NVIDIA's internal tests show that using TensorRT-LLM on H100 GPUs provides up to an 8x performance speedup compared to prior generation GPUs running GPT-J 6B without the software.
The software got its start in NVIDIA's work accelerating and optimizing LLM inference with leading companies including Meta, AnyScale, Cohere, Deci, Grammarly, Mistral AI, MosaicML (now part of Databricks), OctoML, Tabnine and Together AI.
MosaicML added features that it needs on top of TensorRT-LLM and integrated them into its existing serving stack. "It's been an absolute breeze," said Naveen Rao, vice president of engineering at Databricks.
"TensorRT-LLM is easy-to-use, feature-packed and efficient," Rao said. "It delivers state-of-the-art performance for LLM serving using NVIDIA GPUs and allows us to pass on the cost savings to our customers."
TensorRT-LLM is the latest example of continuous innovation on NVIDIA's full-stack AI platform. These ongoing software advances give users performance that grows over time at no extra cost and is versatile across diverse AI workloads.
L4 Boosts Inference on Mainstream Servers
In the latest MLPerf benchmarks, NVIDIA L4 GPUs ran the full range of workloads and delivered great performance across the board.
For example, L4 GPUs running in compact, 72W PCIe accelerators delivered up to 6x more performance than CPUs rated for nearly 5x higher power consumption.
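As a rough illustration of what those two figures imply together, the sketch below multiplies the quoted speedup by the quoted power ratio to get a performance-per-watt ratio. It assumes that rated power stands in for actual power draw and that both "up to" figures apply to the same workload, neither of which the results above confirm; the function name is hypothetical.

```python
def perf_per_watt_ratio(speedup: float, power_ratio: float) -> float:
    """Relative performance per watt of an accelerator versus a baseline.

    speedup: accelerator throughput divided by baseline throughput.
    power_ratio: baseline rated power divided by accelerator rated power.
    """
    if speedup <= 0 or power_ratio <= 0:
        raise ValueError("both ratios must be positive")
    # (perf_a / watts_a) / (perf_b / watts_b) = (perf_a / perf_b) * (watts_b / watts_a)
    return speedup * power_ratio


# Upper-bound figures quoted in the text: up to 6x the performance,
# against CPUs rated for nearly 5x the power (assumption: same workload).
l4_vs_cpu = perf_per_watt_ratio(speedup=6.0, power_ratio=5.0)
print(f"Implied performance-per-watt advantage: up to about {l4_vs_cpu:.0f}x")
```

Because both inputs are upper bounds, the product is a ceiling on the implied efficiency gain rather than a measured result.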
In addition, L4 GPUs feature dedicated media engines that, in combination with CUDA software, provide up to 120x speedups for computer vision in NVIDIA's tests.
L4 GPUs are available from Google Cloud and many system builders, serving customers in industries from consumer internet services to drug discovery.
Performance Boosts at the Edge
Separately, NVIDIA applied a new model compression technology to demonstrate up to a 4.7x performance boost running the BERT LLM on an L4 GPU. The result was in MLPerf's so-called open division, a category for showcasing new capabilities.
The technique is expected to find use across all AI workloads. It can be especially valuable when running models on edge devices constrained by size and power consumption.
In another example of leadership in edge computing, the NVIDIA Jetson Orin system-on-module showed performance increases of up to 84% compared to the prior round in object detection, a computer vision use case common in edge AI and robotics scenarios.
The Jetson Orin advance came from software taking advantage of the latest version of the chip's cores, such as a programmable vision accelerator, an NVIDIA Ampere architecture GPU and a dedicated deep learning accelerator.
Versatile Performance, Broad Ecosystem
The MLPerf benchmarks are transparent and objective, so users can rely on their results to make informed buying decisions. They also cover a wide range of use cases and scenarios, so users know they can get performance that's both dependable and flexible to deploy.
Partners submitting in this round included cloud service providers Microsoft Azure and Oracle Cloud Infrastructure and system manufacturers ASUS, Connect Tech, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, QCT and Supermicro.
Overall, MLPerf is backed by more than 70 organizations, including Alibaba, Arm, Cisco, Google, Harvard University, Intel, Meta, Microsoft and the University of Toronto.
Read a technical blog for more details on how NVIDIA achieved the latest results.
All the software used in NVIDIA's benchmarks is available from the MLPerf repository, so everyone can get the same world-class results. The optimizations are continuously folded into containers available on the NVIDIA NGC software hub for GPU applications.