
Of around 7,000 languages in the world, a tiny fraction are supported by AI language models. NVIDIA is tackling the problem with a new dataset and models that support the development of high-quality speech recognition and translation AI for 25 European languages - including languages with limited available data like Croatian, Estonian and Maltese.
These tools will enable developers to more easily scale AI applications to support global users with fast, accurate speech technology for production-scale use cases such as multilingual chatbots, customer service voice agents and near-real-time translation services. They include:
Granary, a massive, open-source corpus of multilingual speech datasets that contains around a million hours of audio, including nearly 650,000 hours for speech recognition and over 350,000 hours for speech translation.
NVIDIA Canary-1b-v2, a billion-parameter model trained on Granary for high-quality transcription of European languages, plus translation between English and two dozen supported languages. It tops Hugging Face's leaderboard of open models for multilingual speech recognition accuracy.
NVIDIA Parakeet-tdt-0.6b-v3, a streamlined, 600-million-parameter model designed for real-time or large-volume transcription of Granary's supported languages. It has the highest throughput of multilingual models on the Hugging Face leaderboard, measured as duration of audio transcribed divided by computation time.
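The throughput metric mentioned for Parakeet, duration of audio transcribed divided by computation time, can be illustrated with a minimal sketch. The function name and the sample numbers below are hypothetical and chosen only to show the arithmetic; they are not measurements from the leaderboard.

```python
def throughput_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Return audio duration divided by computation time.

    A value of 100 means 100 seconds of audio are transcribed
    per second of compute. Name and signature are illustrative only.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


# Hypothetical example: one hour of audio transcribed in 2 seconds of compute.
example = throughput_factor(audio_seconds=3600.0, compute_seconds=2.0)
print(example)
```

A higher value means more audio is processed per unit of compute, which is why the metric suits real-time or large-volume workloads.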
The paper behind Granary will be presented at Interspeech, a language processing conference taking place in the Netherlands, Aug. 17-21. The dataset, as well as the new Canary and Parakeet models, is now available on Hugging Face.
How Granary Addresses Data Scarcity
To develop the Granary dataset, the NVIDIA speech AI team collaborated with researchers from Carnegie Mellon University and Fondazione Bruno Kessler. The team passed unlabeled audio through an innovative processing pipeline, powered by the NVIDIA NeMo Speech Data Processor toolkit, that turned it into structured, high-quality data.
This pipeline allowed the researchers to convert public speech data into a format usable for AI training, without the need for resource-intensive human annotation. It is available as open source on GitHub.
With Granary's clean, ready-to-use data, developers can get a head start building models that tackle transcription and translation tasks in nearly all of the European Union's 24 official languages, plus Russian and Ukrainian.
For European languages underrepresented in human-annotated datasets, Granary provides a critical resource to develop more inclusive speech technologies that better reflect the linguistic diversity of the continent - all while using less training data.
The team demonstrated in their Interspeech paper that, compared to other popular datasets, it takes around half as much Granary training data to achieve a target accuracy level for automatic speech recognition (ASR) and automatic speech translation (AST).
Tapping NVIDIA NeMo to Turbocharge Transcription
The new Canary and Parakeet models offer examples of the kinds of models developers can build with Granary, customized to their target applications. Canary-1b-v2 is optimized for accuracy on complex tasks, while Parakeet-tdt-0.6b-v3 is designed for high-speed, low-latency tasks.
By sharing the methodology behind the Granary dataset and these two models, NVIDIA is enabling the global speech AI developer community to adapt this data processing workflow to other ASR or AST models or additional languages, accelerating speech AI innovation.
Canary-1b-v2, available under a permissive license, expands the Canary family's supported languages from four to 25. It offers transcription and translation quality comparable to models 3x larger while running inference up to 10x faster.
https://blogs.nvidia.com/wp-content/uploads/2025/08/Canary-demo.mp4
NVIDIA NeMo, a modular software suite for managing the AI agent lifecycle, accelerated speech AI model development. NeMo Curator, part of the software suite, enabled the team to filter out synthetic examples from the source data so that only high-quality samples were used for model training. The team also harnessed the NeMo Speech Data Processor toolkit for tasks like aligning transcripts with audio files and converting data into the required formats.
Parakeet-tdt-0.6b-v3 prioritizes high throughput and is capable of transcribing 24-minute audio segments in a single inference pass. The model automatically detects the input audio language and transcribes without additional prompting steps.
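For recordings longer than the 24-minute segments mentioned above, an application would need to split the audio before transcription. The following is a minimal sketch of that planning step, assuming simple back-to-back cuts; the function name is hypothetical, and the article does not say whether 24 minutes is a hard limit or how the model's own tooling handles longer input. A real pipeline would more likely cut at pauses in speech.

```python
MAX_SEGMENT_SECONDS = 24 * 60  # 24-minute figure taken from the article


def plan_segments(total_seconds: float,
                  max_segment_seconds: float = MAX_SEGMENT_SECONDS):
    """Return (start, end) offsets in seconds covering the whole recording.

    Illustrative helper only: segments are contiguous and non-overlapping,
    and each is at most max_segment_seconds long.
    """
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    if max_segment_seconds <= 0:
        raise ValueError("max_segment_seconds must be positive")

    total = float(total_seconds)
    segments = []
    start = 0.0
    while start < total:
        end = min(start + max_segment_seconds, total)
        segments.append((start, end))
        start = end
    return segments


# Hypothetical example: a one-hour recording.
for start, end in plan_segments(3600):
    print(f"{start:.0f}s to {end:.0f}s")
```

Each planned segment could then be passed to the model in its own inference pass, with the resulting transcripts concatenated in order.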
Both Canary and Parakeet models provide accurate punctuation, capitalization and word-level timestamps in their outputs.
Read more on GitHub and get started with Granary on Hugging Face.