
NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).
The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.
Trained on NVIDIA AI
Meta engineers trained Llama 3 on computer clusters packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with RoCE and NVIDIA Quantum-2 InfiniBand networks.
To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.
Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.
From a browser, developers can try Llama 3 at ai.nvidia.com. It's packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.
Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that's part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.
Taking Llama 3 to Devices and PCs
Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.
What's more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
Get Optimal Performance with Llama 3
Best practices in deploying an LLM for a chatbot involve a balance of low latency, good reading speed and optimal GPU use to reduce costs.
Such a service needs to deliver tokens - the rough equivalent of words to an LLM - at about 10 tokens/second, roughly twice a user's reading speed.
Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second - enough to serve about 300 simultaneous users - in an initial test using the version of Llama 3 with 70 billion parameters.
That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
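The capacity figures above follow from simple division, which the sketch below reproduces. It is illustrative only: the constant and function names are made up for this example, and it assumes throughput scales linearly across the eight GPUs and that every user streams at a steady 10 tokens/second, which a real deployment may not match.

```python
import math

# Figures quoted in the article.
TOKENS_PER_USER_PER_SEC = 10   # per-user delivery target
H200_TOKENS_PER_SEC = 3_000    # one H200 GPU, Llama 3 70B, initial test
GPUS_PER_HGX_SERVER = 8

def concurrent_users(total_tokens_per_sec: float,
                     per_user_rate: float = TOKENS_PER_USER_PER_SEC) -> int:
    """Users that can be served at once if each needs per_user_rate tokens/second."""
    return math.floor(total_tokens_per_sec / per_user_rate)

# Single GPU: 3,000 / 10
single_gpu_users = concurrent_users(H200_TOKENS_PER_SEC)

# Eight-GPU server, assuming linear scaling (an assumption of this sketch).
server_tokens_per_sec = H200_TOKENS_PER_SEC * GPUS_PER_HGX_SERVER
server_users = concurrent_users(server_tokens_per_sec)

print(f"Single H200: {single_gpu_users} users")
print(f"HGX server:  {server_tokens_per_sec} tokens/second, {server_users} users")
```

Changing `TOKENS_PER_USER_PER_SEC` shows how sensitive the user count is to the per-user streaming target.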
For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
Advancing Community Models
An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
Learn more about NVIDIA's AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.