NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf
27/03/2024
In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM - software that speeds and simplifies the complex job of inference on large language models - boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago.
The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI.
Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM - a set of inference microservices that includes inferencing engines like TensorRT-LLM - makes it easier than ever for businesses to deploy NVIDIA's inference platform.
Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs - the latest, memory-enhanced Hopper GPUs - delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date.
The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks.
The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark.
The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.
Memory Boost for NVIDIA Hopper GPUs
NVIDIA is sampling H200 GPUs to customers today and shipping in the second quarter. They'll be available soon from nearly 20 leading system builders and cloud service providers.
H200 GPUs pack 141GB of HBM3e running at 4.8TB/s. That's 76% more memory flying 43% faster compared to H100 GPUs. These accelerators plug into the same boards and systems and use the same software as H100 GPUs.
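The percentages above can be checked with a few lines of arithmetic. This is a minimal sketch: the H200 figures come from the article, while the H100 baseline (80GB at 3.35TB/s, the H100 SXM configuration) is an assumption, since the article does not state which H100 variant it compares against.

```python
# H200 figures are quoted in the article.
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBPS = 4.8

# ASSUMPTION: H100 SXM baseline (80 GB, 3.35 TB/s); not stated in the article.
H100_MEMORY_GB = 80
H100_BANDWIDTH_TBPS = 3.35


def percent_gain(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


memory_gain = percent_gain(H200_MEMORY_GB, H100_MEMORY_GB)
bandwidth_gain = percent_gain(H200_BANDWIDTH_TBPS, H100_BANDWIDTH_TBPS)

print(f"memory: +{memory_gain:.0f}%, bandwidth: +{bandwidth_gain:.0f}%")
# prints "memory: +76%, bandwidth: +43%"
```

Under that assumed baseline, the ratios reproduce the 76% and 43% figures quoted in the text.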
With HBM3e memory, a single H200 GPU can run an entire Llama 2 70B model with the highest throughput, simplifying and speeding inference.
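A rough estimate shows why 141GB is enough to hold a 70-billion-parameter model on one GPU. This is only a sketch: the article does not say which numeric precision was used, so the FP16 and FP8 sizes below are assumptions, and the estimate counts weights only, ignoring the KV cache and activations that also need memory during inference.

```python
PARAMETERS = 70e9       # Llama 2 70B, from the article
H200_MEMORY_GB = 141    # from the article

# ASSUMPTION: candidate weight precisions; the article does not name one.
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def weight_footprint_gb(parameters: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return parameters * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    footprint = weight_footprint_gb(PARAMETERS, size)
    headroom = H200_MEMORY_GB - footprint
    print(f"{name}: {footprint:.0f} GB of weights, {headroom:.0f} GB left over")
```

At two bytes per parameter the weights alone nearly fill the GPU, while at one byte per parameter about half the memory remains for everything else, which suggests that lower-precision formats matter for single-GPU serving of a model this size.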
GH200 Packs Even More Memory
Even more memory - up to 624GB of fast memory, including 144GB of HBM3e - is packed in NVIDIA GH200 Superchips, which combine on one module a Hopper architecture GPU and a power-efficient NVIDIA Grace CPU. NVIDIA accelerators are the first to use HBM3e memory technology.
With nearly 5 TB/second memory bandwidth, GH200 Superchips delivered standout performance, including on memory-intensive MLPerf tests such as recommender systems.
Sweeping Every MLPerf Test
On a per-accelerator basis, Hopper GPUs swept every test of AI inference in the latest round of the MLPerf industry benchmarks.
In addition, NVIDIA Jetson Orin remains at the forefront in MLPerf's edge category. In the last two inference rounds, Orin ran the most diverse set of models in the category, including GPT-J and Stable Diffusion XL.
The MLPerf benchmarks cover today's most popular AI workloads and scenarios, including generative AI, recommendation systems, natural language processing, speech and computer vision. NVIDIA was the only company to submit results on every workload in the latest round and every round since MLPerf's data center inference benchmarks began in October 2020.
Continued performance gains translate into lower costs for inference, a large and growing part of the daily work for the millions of NVIDIA GPUs deployed worldwide.
Advancing What's Possible
Pushing the boundaries of what's possible, NVIDIA demonstrated three innovative techniques in a special section of the benchmarks called the open division, created for testing advanced AI methods.
NVIDIA engineers used a technique called structured sparsity - a way of reducing calculations, first introduced with NVIDIA A100 Tensor Core GPUs - to deliver up to 33% speedups on inference with Llama 2.
A second open division test found inference speedups of up to 40% using pruning, a way of simplifying an AI model - in this case, an LLM - to increase inference throughput.
Finally, an optimization called DeepCache reduced the math required for inference with the Stable Diffusion XL model, accelerating performance by a whopping 74%.
All these results were run on NVIDIA H100 Tensor Core GPUs.
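To make the structured sparsity idea concrete, the sketch below applies the 2:4 pattern that A100 Tensor Cores introduced: in every group of four weights, two are set to zero, so the hardware can skip half of the multiplications. This is an illustration only. The function name `prune_2_of_4` and the keep-the-largest-magnitude rule are assumptions for the example, not a description of how the MLPerf submission selected its weights.

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in each group of four.

    Hypothetical helper for illustration; real pipelines typically
    fine-tune the model after pruning to recover accuracy.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest magnitudes; ties go to the earlier index.
        keep = sorted(range(4), key=lambda i: (-abs(group[i]), i))[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
print(prune_2_of_4(weights))
# prints [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because the pattern is fixed at two non-zero values per four, the positions of the kept weights can be stored compactly, which is what lets the hardware exploit the zeros.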
A Trusted Source for Users
MLPerf's tests are transparent and objective, so users can rely on the results to make informed buying decisions.
NVIDIA's partners participate in MLPerf because they know it's a valuable tool for customers evaluating AI systems and services. Partners submitting results on the NVIDIA AI platform in this round included ASUS, Cisco, Dell Technologies, Fujitsu, GIGABYTE, Google, Hewlett Packard Enterprise, Lenovo, Microsoft Azure, Oracle, QCT, Supermicro, VMware (recently acquired by Broadcom) and Wiwynn.
All the software NVIDIA used in the tests is available in the MLPerf repository. These optimizations are continuously folded into containers available on NGC, NVIDIA's software hub for GPU applications, as well as NVIDIA AI Enterprise - a secure, supported platform that includes NIM inference microservices.
The Next Big Thing
The use cases, model sizes and datasets for generative AI continue to expand. That's why MLPerf continues to evolve, adding real-world tests with popular models like Llama 2 70B and Stable Diffusion XL.
Keeping pace with the explosion in LLM sizes, NVIDIA founder and CEO Jensen Huang announced last week at GTC that NVIDIA Blackwell architecture GPUs will deliver the new levels of performance required for multitrillion-parameter AI models.
Inference for large language models is difficult, requiring both e