
NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).
The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.
Trained on NVIDIA AI
Meta engineers trained Llama 3 on computer clusters packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with RoCE and NVIDIA Quantum-2 InfiniBand networks.
To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.
Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.
From a browser, developers can try Llama 3 at ai.nvidia.com. It's packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.
Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that's part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.
Taking Llama 3 to Devices and PCs
Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.
What's more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
Get Optimal Performance with Llama 3
Best practices in deploying an LLM for a chatbot involve a balance of low latency, good reading speed and optimal GPU use to reduce costs.
Such a service needs to deliver tokens - the rough equivalent of words to an LLM - at about 10 tokens/second per user, roughly twice a user's reading speed.
Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second - enough to serve about 300 simultaneous users - in an initial test using the version of Llama 3 with 70 billion parameters.
That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
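The capacity figures above follow from simple division: aggregate throughput divided by the per-user delivery rate. The sketch below reproduces that arithmetic using only the numbers quoted in the text. The function name and constants are illustrative, and the eight-GPU total assumes throughput scales linearly across GPUs, as the text's estimate does.

```python
def concurrent_users(throughput_tokens_per_s: float,
                     per_user_tokens_per_s: float = 10.0) -> int:
    """Estimate how many users a given token throughput can serve at once.

    per_user_tokens_per_s defaults to the ~10 tokens/second target
    (about twice a user's reading speed) quoted in the article.
    """
    if per_user_tokens_per_s <= 0:
        raise ValueError("per-user rate must be positive")
    return int(throughput_tokens_per_s // per_user_tokens_per_s)


# Figures quoted in the article for Llama 3 70B in an initial test.
H200_TOKENS_PER_S = 3_000   # single H200 GPU
GPUS_PER_HGX = 8            # H200 GPUs in one HGX server

# Assumption: throughput scales linearly with GPU count.
hgx_tokens_per_s = H200_TOKENS_PER_S * GPUS_PER_HGX

single_gpu_users = concurrent_users(H200_TOKENS_PER_S)
hgx_users = concurrent_users(hgx_tokens_per_s)

print(f"Single H200: {single_gpu_users} users")
print(f"HGX server:  {hgx_tokens_per_s} tokens/s, {hgx_users} users")
```

Real deployments would also need to account for latency targets, batch sizes and prompt lengths, none of which this estimate models.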
For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
Advancing Community Models
An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
Learn more about NVIDIA's AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.