
NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

27/03/2024

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI.

In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM - software that speeds and simplifies the complex job of inference on large language models - boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago.

The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI.

Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM - a set of inference microservices that includes inferencing engines like TensorRT-LLM - makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI

TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs - the latest, memory-enhanced Hopper GPUs - delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date.

The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks.

The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark.

The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.

Memory Boost for NVIDIA Hopper GPUs

NVIDIA is sampling H200 GPUs to customers today and shipping in the second quarter. They'll be available soon from nearly 20 leading system builders and cloud service providers.

H200 GPUs pack 141GB of HBM3e running at 4.8TB/s. That's 76% more memory flying 43% faster compared to H100 GPUs. These accelerators plug into the same boards and systems and use the same software as H100 GPUs.
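The percentages above can be checked with a little arithmetic. The sketch below assumes an H100 baseline of 80GB of memory at 3.35TB/s; those baseline figures are not stated in this article and are an assumption here, so treat the result as a sanity check rather than an official comparison.

```python
# Assumed H100 baseline (not given in the article): 80 GB at 3.35 TB/s.
H100_MEMORY_GB = 80
H100_BANDWIDTH_TBS = 3.35

# H200 figures quoted in the article.
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBS = 4.8


def percent_gain(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


memory_gain = percent_gain(H200_MEMORY_GB, H100_MEMORY_GB)
bandwidth_gain = percent_gain(H200_BANDWIDTH_TBS, H100_BANDWIDTH_TBS)

print(f"memory: +{memory_gain:.0f}%, bandwidth: +{bandwidth_gain:.0f}%")
```

Under that assumed baseline, the rounded gains match the 76% and 43% quoted above.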

With HBM3e memory, a single H200 GPU can run an entire Llama 2 70B model with the highest throughput, simplifying and speeding inference.
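A rough way to see why capacity matters for a 70-billion-parameter model: weight storage scales with parameter count times bytes per parameter. The sketch below counts weights only and compares them with the 141GB of an H200. The precisions listed are illustrative assumptions, not a statement of what the benchmark submission used, and a real deployment also needs memory for the KV cache and activations.

```python
PARAMS = 70e9           # Llama 2 70B parameter count, from the article
H200_MEMORY_GB = 141    # H200 capacity, from the article

# Bytes per parameter for some common weight formats (illustrative assumptions).
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1, "int4": 0.5}


def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for fmt, nbytes in BYTES_PER_PARAM.items():
    size = weight_footprint_gb(PARAMS, nbytes)
    headroom = H200_MEMORY_GB - size
    print(f"{fmt}: {size:.0f} GB of weights, {headroom:.0f} GB left over")
```

At 16-bit precision the weights alone nearly fill the GPU, which is why lower-precision formats leave far more room for serving many requests at once.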

GH200 Packs Even More Memory

Even more memory - up to 624GB of fast memory, including 144GB of HBM3e - is packed in NVIDIA GH200 Superchips, which combine on one module a Hopper architecture GPU and a power-efficient NVIDIA Grace CPU. NVIDIA accelerators are the first to use HBM3e memory technology.

With nearly 5 TB/second memory bandwidth, GH200 Superchips delivered standout performance, including on memory-intensive MLPerf tests such as recommender systems.

Sweeping Every MLPerf Test

On a per-accelerator basis, Hopper GPUs swept every test of AI inference in the latest round of the MLPerf industry benchmarks.

In addition, NVIDIA Jetson Orin remains at the forefront in MLPerf's edge category. In the last two inference rounds, Orin ran the most diverse set of models in the category, including GPT-J and Stable Diffusion XL.

The MLPerf benchmarks cover today's most popular AI workloads and scenarios, including generative AI, recommendation systems, natural language processing, speech and computer vision. NVIDIA was the only company to submit results on every workload in the latest round and every round since MLPerf's data center inference benchmarks began in October 2020.

Continued performance gains translate into lower costs for inference, a large and growing part of the daily work for the millions of NVIDIA GPUs deployed worldwide.

Advancing What's Possible

Pushing the boundaries of what's possible, NVIDIA demonstrated three innovative techniques in a special section of the benchmarks called the open division, created for testing advanced AI methods.

NVIDIA engineers used a technique called structured sparsity - a way of reducing calculations, first introduced with NVIDIA A100 Tensor Core GPUs - to deliver up to 33% speedups on inference with Llama 2.
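To make the idea concrete, the structured sparsity supported since the A100 generation uses a 2:4 pattern: in every group of four consecutive weights, at most two are nonzero, so the hardware can skip the zeroed half of the math. The sketch below is a toy illustration only. The function name is hypothetical, and choosing which weights to keep by magnitude is an assumption made for the example, not a description of how the benchmark submission was prepared.

```python
def prune_2_of_4(weights: list[float]) -> list[float]:
    """Zero the two smallest-magnitude values in each group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
print(prune_2_of_4(row))
# prints [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

In practice a model pruned this way is usually fine-tuned afterward to recover accuracy; the sketch shows only the sparsity pattern itself.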

A second open division test found inference speedups of up to 40% using pruning, a way of simplifying an AI model - in this case, an LLM - to increase inference throughput.

Finally, an optimization called DeepCache reduced the math required for inference with the Stable Diffusion XL model, accelerating performance by a whopping 74%.

All these results were run on NVIDIA H100 Tensor Core GPUs.

A Trusted Source for Users

MLPerf's tests are transparent and objective, so users can rely on the results to make informed buying decisions.

NVIDIA's partners participate in MLPerf because they know it's a valuable tool for customers evaluating AI systems and services. Partners submitting results on the NVIDIA AI platform in this round included ASUS, Cisco, Dell Technologies, Fujitsu, GIGABYTE, Google, Hewlett Packard Enterprise, Lenovo, Microsoft Azure, Oracle, QCT, Supermicro, VMware (recently acquired by Broadcom) and Wiwynn.

All the software NVIDIA used in the tests is available in the MLPerf repository. These optimizations are continuously folded into containers available on NGC, NVIDIA's software hub for GPU applications, as well as NVIDIA AI Enterprise - a secure, supported platform that includes NIM inference microservices.

The Next Big Thing

The use cases, model sizes and datasets for generative AI continue to expand. That's why MLPerf continues to evolve, adding real-world tests with popular models like Llama 2 70B and Stable Diffusion XL.

Keeping pace with the explosion in LLM model sizes, NVIDIA founder and CEO Jensen Huang announced last week at GTC that NVIDIA Blackwell architecture GPUs will deliver the new levels of performance required for multitrillion-parameter AI models.

Inference for large language models is difficult, requiring both e
LINK: https://blogs.nvidia.com/blog/tensorrt-llm-inference-mlperf/...
