
NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).
The open model combined with NVIDIA accelerated computing equips developers, researchers and businesses to innovate responsibly across a wide variety of applications.
Trained on NVIDIA AI
Meta engineers trained Llama 3 on computer clusters packing 24,576 NVIDIA H100 Tensor Core GPUs, linked with RoCE and NVIDIA Quantum-2 InfiniBand networks.
To further advance the state of the art in generative AI, Meta recently described plans to scale its infrastructure to 350,000 H100 GPUs.
Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.
From a browser, developers can try Llama 3 at ai.nvidia.com. It's packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.
Businesses can fine-tune Llama 3 with their data using NVIDIA NeMo, an open-source framework for LLMs that's part of the secure, supported NVIDIA AI Enterprise platform. Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.
Taking Llama 3 to Devices and PCs
Llama 3 also runs on NVIDIA Jetson Orin for robotics and edge computing devices, creating interactive agents like those in the Jetson AI Lab.
What's more, NVIDIA RTX and GeForce RTX GPUs for workstations and PCs speed inference on Llama 3. These systems give developers a target of more than 100 million NVIDIA-accelerated systems worldwide.
Get Optimal Performance with Llama 3
Best practices in deploying an LLM for a chatbot involve a balance of low latency, good reading speed and optimal GPU use to reduce costs.
Such a service needs to deliver tokens - the rough equivalent of words to an LLM - at about 10 tokens/second, which is roughly twice a user's reading speed.
Applying these metrics, a single NVIDIA H200 Tensor Core GPU generated about 3,000 tokens/second - enough to serve about 300 simultaneous users - in an initial test using the version of Llama 3 with 70 billion parameters.
That means a single NVIDIA HGX server with eight H200 GPUs could deliver 24,000 tokens/second, further optimizing costs by supporting more than 2,400 users at the same time.
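The capacity figures above follow from simple division, which can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a benchmark: the function and constant names are made up for this example, and the eight-GPU figure assumes throughput scales linearly with GPU count, as the article's own numbers do.

```python
# Illustrative arithmetic only; names below are hypothetical, figures come from the article.

PER_USER_TOKENS_PER_S = 10     # target delivery rate per user
H200_TOKENS_PER_S = 3_000      # reported single-GPU throughput, Llama 3 70B
GPUS_PER_HGX = 8               # H200 GPUs in one HGX server


def concurrent_users(throughput_tokens_per_s: int,
                     per_user_tokens_per_s: int = PER_USER_TOKENS_PER_S) -> int:
    """Return how many users can each receive the per-user token rate."""
    return throughput_tokens_per_s // per_user_tokens_per_s


single_gpu_users = concurrent_users(H200_TOKENS_PER_S)

# Assumption: throughput scales linearly across the eight GPUs.
server_throughput = H200_TOKENS_PER_S * GPUS_PER_HGX
server_users = concurrent_users(server_throughput)

print(f"One H200: {single_gpu_users} users")
print(f"HGX server: {server_throughput} tokens/second, {server_users} users")
```

Real deployments would likely see different numbers depending on batch size, prompt and response lengths, and latency targets, so the result is best read as a rough sizing estimate.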
For edge devices, the version of Llama 3 with eight billion parameters generated up to 40 tokens/second on Jetson AGX Orin and 15 tokens/second on Jetson Orin Nano.
Advancing Community Models
An active open-source contributor, NVIDIA is committed to optimizing community software that helps users address their toughest challenges. Open-source models also promote AI transparency and let users broadly share work on AI safety and resilience.
Learn more about NVIDIA's AI inference platform, including how NIM, TensorRT-LLM and Triton use state-of-the-art techniques such as low-rank adaptation to accelerate the latest LLMs.