
NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).
The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.
Trained on NVIDIA AI
Meta engineers trained Llama 3 on computer clusters packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with RoCE and NVIDIA Quantum-2 InfiniBand networks.
To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.
Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.
From a browser, developers can try Llama 3 at ai.nvidia.com. It's packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.
Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that's part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.
Taking Llama 3 to Devices and PCs
Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.
What's more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. Together, they give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
Get Optimal Performance with Llama 3
Best practices in deploying an LLM for a chatbot involve balancing low latency, good reading speed and optimal GPU use to reduce costs.
Such a service needs to deliver tokens - the rough equivalent of words to an LLM - at about twice a user's reading speed, which works out to about 10 tokens/second.
Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second - enough to serve about 300 simultaneous users - in an initial test using the version of Llama 3 with 70 billion parameters.
That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
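The user counts above follow from dividing aggregate throughput by the per-user token rate. A minimal sketch of that arithmetic is shown below. The function name `concurrent_users` is hypothetical, and the eight-GPU figure assumes throughput scales linearly across GPUs, as the estimate in the text does.

```python
# Per-user delivery target from the text: about 10 tokens/second.
TOKENS_PER_USER = 10


def concurrent_users(throughput_tokens_per_s: float,
                     per_user_tokens_per_s: float = TOKENS_PER_USER) -> int:
    """Return how many users can be served at the target per-user rate.

    Hypothetical helper for illustration only; it ignores latency,
    batching effects and prompt length.
    """
    if per_user_tokens_per_s <= 0:
        raise ValueError("per-user rate must be positive")
    return int(throughput_tokens_per_s // per_user_tokens_per_s)


# Figures quoted in the text for Llama 3 70B.
h200_throughput = 3_000                # tokens/second on one H200 GPU
hgx_throughput = 8 * h200_throughput   # eight GPUs, assuming linear scaling

print(concurrent_users(h200_throughput))  # 300
print(concurrent_users(hgx_throughput))   # 2400
```

Real deployments would likely see different numbers depending on batch size, sequence length and latency targets, so this is a rough sizing estimate rather than a benchmark.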
For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
Advancing Community Models
An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
Learn more about NVIDIA's AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.