
NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).
The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.
Trained on NVIDIA AI
Meta engineers trained Llama 3 on computer clusters packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with RoCE and NVIDIA Quantum-2 InfiniBand networks.
To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.
Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.
From a browser, developers can try Llama 3 at ai.nvidia.com. It's packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.
Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that's part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.
Taking Llama 3 to Devices and PCs
Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.
What's more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
Get Optimal Performance with Llama 3
Best practices in deploying an LLM for a chatbot involve a balance of low latency, good reading speed and optimal GPU use to reduce costs.
Such a service needs to deliver tokens - the rough equivalent of words to an LLM - at about twice a user's reading speed, which works out to about 10 tokens/second per user.
Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second - enough to serve about 300 simultaneous users - in an initial test using the version of Llama 3 with 70 billion parameters.
That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
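The arithmetic behind these figures can be checked with a short back-of-envelope sketch. The function and constant names below are illustrative only and do not come from any NVIDIA tool. The sketch also assumes that throughput scales linearly with the number of GPUs, which is a simplification made for the example rather than a measured result.

```python
# Back-of-envelope capacity estimate using the figures quoted in the article.
# All names here are hypothetical helpers made up for this illustration.

TOKENS_PER_USER_PER_SEC = 10   # per-user delivery target (about twice reading speed)
H200_TOKENS_PER_SEC = 3_000    # single H200, Llama 3 70B, initial test
GPUS_PER_HGX_SERVER = 8        # HGX server with eight H200 GPUs


def concurrent_users(throughput_tokens_per_sec: int,
                     per_user_rate: int = TOKENS_PER_USER_PER_SEC) -> int:
    """Return how many users can be served at the given per-user token rate."""
    if per_user_rate <= 0:
        raise ValueError("per_user_rate must be positive")
    return throughput_tokens_per_sec // per_user_rate


def server_throughput(per_gpu_tokens_per_sec: int,
                      gpus: int = GPUS_PER_HGX_SERVER) -> int:
    """Aggregate throughput, ASSUMING it scales linearly with GPU count."""
    return per_gpu_tokens_per_sec * gpus


if __name__ == "__main__":
    per_gpu_users = concurrent_users(H200_TOKENS_PER_SEC)
    hgx_tokens = server_throughput(H200_TOKENS_PER_SEC)
    hgx_users = concurrent_users(hgx_tokens)
    print(f"Single H200: {per_gpu_users} users")
    print(f"HGX server:  {hgx_tokens} tokens/second, {hgx_users} users")
```

Under the linear-scaling assumption, this reproduces the 300-user and 24,000 tokens/second figures and gives 2,400 users for the eight-GPU server.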
For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
Advancing Community Models
An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
Learn more about NVIDIA's AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.