
Of around 7,000 languages in the world, a tiny fraction are supported by AI language models. NVIDIA is tackling the problem with a new dataset and models that support the development of high-quality speech recognition and translation AI for 25 European languages, including languages with limited available data like Croatian, Estonian and Maltese.
These tools will enable developers to more easily scale AI applications to support global users with fast, accurate speech technology for production-scale use cases such as multilingual chatbots, customer service voice agents and near-real-time translation services. They include:
Granary, a massive, open-source corpus of multilingual speech datasets that contains around a million hours of audio, including nearly 650,000 hours for speech recognition and over 350,000 hours for speech translation.
NVIDIA Canary-1b-v2, a billion-parameter model trained on Granary for high-quality transcription of European languages, plus translation between English and two dozen supported languages.
NVIDIA Parakeet-tdt-0.6b-v3, a streamlined, 600-million-parameter model designed for real-time or large-volume transcription of Granary's supported languages.
The paper behind Granary will be presented at Interspeech, a language processing conference taking place in the Netherlands, Aug. 17-21. The dataset and the new Canary and Parakeet models are now available on Hugging Face.
How Granary Addresses Data Scarcity

To develop the Granary dataset, the NVIDIA speech AI team collaborated with researchers from Carnegie Mellon University and Fondazione Bruno Kessler. The team passed unlabeled audio through an innovative processing pipeline, powered by the NVIDIA NeMo Speech Data Processor toolkit, that turned it into structured, high-quality data.
This pipeline allowed the researchers to convert public speech data into a usable format for AI training, without the need for resource-intensive human annotation. The pipeline is available as open source on GitHub.
With Granary's clean, ready-to-use data, developers can get a head start building models that tackle transcription and translation tasks in nearly all of the European Union's 24 official languages, plus Russian and Ukrainian.
For European languages underrepresented in human-annotated datasets, Granary provides a critical resource to develop more inclusive speech technologies that better reflect the linguistic diversity of the continent, all while using less training data.
The team demonstrated in their Interspeech paper that, compared to other popular datasets, it takes around half as much Granary training data to achieve a target accuracy level for automatic speech recognition (ASR) and automatic speech translation (AST).
Tapping NVIDIA NeMo to Turbocharge Transcription

The new Canary and Parakeet models offer examples of the kinds of models developers can build with Granary, customized to their target applications. Canary-1b-v2 is optimized for accuracy on complex tasks, while Parakeet-tdt-0.6b-v3 is designed for high-speed, low-latency tasks.
By sharing the methodology behind the Granary dataset and these two models, NVIDIA is enabling the global speech AI developer community to adapt this data processing workflow to other ASR or AST models or additional languages, accelerating speech AI innovation.
Canary-1b-v2, available under a permissive license, expands the Canary family's supported languages from four to 25. It offers transcription and translation quality comparable to models 3x larger while running inference up to 10x faster.
https://blogs.nvidia.com/wp-content/uploads/2025/08/Canary-demo.mp4
NVIDIA NeMo, a modular software suite for managing the AI agent lifecycle, accelerated speech AI model development. NeMo Curator, part of the software suite, enabled the team to filter out synthetic examples from the source data so that only high-quality samples were used for model training. The team also harnessed the NeMo Speech Data Processor toolkit for tasks like aligning transcripts with audio files and converting data into the required formats.
Parakeet-tdt-0.6b-v3 prioritizes high throughput and is capable of transcribing 24-minute audio segments in a single inference pass. The model automatically detects the input audio language and transcribes without additional prompting steps.
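For recordings longer than a single pass can handle, one option is to plan the audio as consecutive windows before transcription. The sketch below is only an illustration under stated assumptions: the 24-minute figure comes from the article, while the function name `plan_segments`, the non-overlapping windowing strategy and the use of seconds as the unit are hypothetical choices, not part of any NVIDIA tooling.

```python
from typing import List, Tuple

# 24-minute single-pass figure quoted in the article, expressed in seconds.
MAX_SEGMENT_SECONDS = 24 * 60


def plan_segments(
    total_seconds: float,
    max_seconds: float = MAX_SEGMENT_SECONDS,
) -> List[Tuple[float, float]]:
    """Split [0, total_seconds] into consecutive (start, end) windows.

    Hypothetical helper: each window is at most max_seconds long and
    windows do not overlap. Real pipelines may prefer to cut at silences.
    """
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds > 0")

    total = float(total_seconds)
    segments: List[Tuple[float, float]] = []
    start = 0.0
    while start < total:
        # Clamp the final window to the end of the recording.
        end = min(start + float(max_seconds), total)
        segments.append((start, end))
        start = end
    return segments


if __name__ == "__main__":
    # A one-hour recording is planned as two full windows plus a remainder.
    for start, end in plan_segments(3600):
        print(f"{start:.0f}s -> {end:.0f}s")
```

Each planned window could then be cut from the source file and passed to the model separately. Cutting at fixed offsets can split a word in two, so a production workflow would likely align boundaries with pauses in speech.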
Both Canary and Parakeet models provide accurate punctuation, capitalization and word-level timestamps in their outputs.
Read more on GitHub and get started with Granary on Hugging Face.