
Lightweight Champ: NVIDIA Releases Small Language Model With State-of-the-Art Accuracy

21/08/2024

Developers of generative AI typically face a tradeoff between model size and accuracy. But a new language model released by NVIDIA delivers the best of both, providing state-of-the-art accuracy in a compact form factor.

Mistral-NeMo-Minitron 8B - a miniaturized version of the open Mistral NeMo 12B model released by Mistral AI and NVIDIA last month - is small enough to run on an NVIDIA RTX-powered workstation while still excelling across multiple benchmarks for AI-powered chatbots, virtual assistants, content generators and educational tools. Minitron models are distilled by NVIDIA using NVIDIA NeMo, an end-to-end platform for developing custom generative AI.

"We combined two different AI optimization methods - pruning to shrink Mistral NeMo's 12 billion parameters into 8 billion, and distillation to improve accuracy," said Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. "By doing so, Mistral-NeMo-Minitron 8B delivers comparable accuracy to the original model at lower computational cost."

Unlike their larger counterparts, small language models can run in real time on workstations and laptops. This makes it easier for organizations with limited resources to deploy generative AI capabilities across their infrastructure while optimizing for cost, operational efficiency and energy use. Running language models locally on edge devices also delivers security benefits, since data doesn't need to be passed to a server from an edge device.

Developers can get started with Mistral-NeMo-Minitron 8B packaged as an NVIDIA NIM microservice with a standard application programming interface (API) - or they can download the model from Hugging Face. A downloadable NVIDIA NIM, which can be deployed on any GPU-accelerated system in minutes, will be available soon.

State-of-the-Art for 8 Billion Parameters

For a model of its size, Mistral-NeMo-Minitron 8B leads on nine popular benchmarks for language models. These benchmarks cover a variety of tasks including language understanding, common sense reasoning, mathematical reasoning, summarization, coding and ability to generate truthful answers.

Packaged as an NVIDIA NIM microservice, the model is optimized for low latency, which means faster responses for users, and high throughput, which corresponds to higher computational efficiency in production.

In some cases, developers may want an even smaller version of the model to run on a smartphone or an embedded device like a robot. To do so, they can download the 8-billion-parameter model and, using NVIDIA AI Foundry, prune and distill it into a smaller, optimized neural network customized for enterprise-specific applications.

The AI Foundry platform and service offers developers a full-stack solution for creating a customized foundation model packaged as a NIM microservice. It includes popular foundation models, the NVIDIA NeMo platform and dedicated capacity on NVIDIA DGX Cloud. Developers using NVIDIA AI Foundry can also access NVIDIA AI Enterprise, a software platform that provides security, stability and support for production deployments.

Since the original Mistral-NeMo-Minitron 8B model starts with a baseline of state-of-the-art accuracy, versions downsized using AI Foundry would still offer users high accuracy with a fraction of the training data and compute infrastructure.

Harnessing the Perks of Pruning and Distillation

To achieve high accuracy with a smaller model, the team used a process that combines pruning and distillation. Pruning downsizes a neural network by removing model weights that contribute the least to accuracy. During distillation, the team retrained this pruned model on a small dataset to significantly boost accuracy, which had decreased through the pruning process.

The end result is a smaller, more efficient model with the predictive accuracy of its larger counterpart.
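To make the two steps concrete, here is a minimal toy sketch in Python. It is not NVIDIA's implementation: the article does not say how weight importance is measured or which training objective is used, so this sketch assumes weight magnitude as a stand-in for importance and a temperature-softened KL divergence between teacher and student outputs as the distillation loss. The function names (`magnitude_prune`, `distillation_loss`) and all numbers are hypothetical.

```python
import math


def magnitude_prune(weights, keep_fraction):
    """Zero out the smallest-magnitude weights, keeping about keep_fraction of them.

    Assumption: magnitude stands in for "contribution to accuracy".
    """
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank indices by absolute weight, largest first, and keep the top n_keep.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened output distributions.

    Assumption: a classic knowledge-distillation objective, used here for
    illustration only.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    toy_weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
    print(magnitude_prune(toy_weights, keep_fraction=0.5))
    # The loss is zero when the student matches the teacher and positive otherwise.
    print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In a real training loop, the pruned (student) model would be updated to minimize this loss against the original (teacher) model's outputs, which is how the accuracy lost during pruning is recovered.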

With this technique, only a fraction of the original dataset is required to train each additional model within a family of related models. Pruning and distilling a larger model can cut compute cost by up to 40x compared to training a smaller model from scratch.

Read the NVIDIA Technical Blog and a technical report for details.

This week, NVIDIA also announced Nemotron-Mini-4B-Instruct, another small language model optimized for low memory usage and faster response times on NVIDIA GeForce RTX AI PCs and laptops. The model is available as an NVIDIA NIM microservice for cloud and on-device deployment and is part of NVIDIA ACE, a suite of digital human technologies that provide speech, intelligence and animation powered by generative AI.

Experience both models as NIM microservices from a browser or an API at ai.nvidia.com.

See notice regarding software product information.
LINK: https://blogs.nvidia.com/blog/mistral-nemo-minitron-8b-small-language-...