
Speech AI Expands Global Reach With Telugu Language Breakthrough

02/12/2022

More than 75 million people speak Telugu, predominantly in India's southern regions, making it one of the most widely spoken languages in the country.

Despite such prevalence, Telugu is considered a low-resource language when it comes to speech AI. This means there aren't enough hours' worth of speech datasets to easily and accurately create AI models for automatic speech recognition (ASR) in Telugu.

And that means billions of people are left out of using ASR to improve transcription, translation and additional speech AI applications in Telugu and other low-resource languages.

To build an ASR model for Telugu, the NVIDIA speech AI team turned to the NVIDIA NeMo framework for developing and training state-of-the-art conversational AI models. The model won first place in a competition conducted in October by IIIT-Hyderabad, one of India's most prestigious institutes for research and higher education.

NVIDIA placed first in accuracy for both tracks of the Telugu ASR Challenge, which was held in collaboration with the Technology Development for Indian Languages program and India's Ministry of Electronics and Information Technology as a part of its National Language Translation Mission.

For the closed track, participants had to use around 2,000 hours of a Telugu-only training dataset provided by the competition organizers. And for the open track, participants could use any datasets and pretrained AI models to build the Telugu ASR model.

NVIDIA NeMo-powered models topped the leaderboards with a word error rate of approximately 13% and 12% for the closed and open tracks, respectively, outperforming by a large margin all models built on popular ASR frameworks like ESPnet, Kaldi, SpeechBrain and others.
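For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a model's transcript and the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; the function name is hypothetical, and it is not the competition's scoring script, whose text normalization rules are not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a word error rate of roughly 12% means about 12 word-level edits (substitutions, deletions or insertions) for every 100 reference words.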

"What sets NVIDIA NeMo apart is that we open source all of the models we have - so people can easily fine-tune the models and do transfer learning on them for their use cases," said Nithin Koluguri, a senior research scientist on the conversational AI team at NVIDIA. "NeMo is also one of the only toolkits that supports scaling training to multi-GPU systems and multi-node clusters."

Building the Telugu ASR Model

The first step in creating the award-winning model, Koluguri said, was to preprocess the data.

Koluguri and his colleague Megh Makwana, an applied deep learning solution architect manager at NVIDIA, removed invalid letters and punctuation marks from the speech dataset that was provided for the closed track of the competition.

"Our biggest challenge was dealing with the noisy data," Koluguri said. "This is when the audio and the transcript don't match - in this case you cannot guarantee the accuracy of the ground-truth transcript you're training on."

The team cleaned up the dataset by cutting audio clips to less than 20 seconds, dropping clips shorter than 1 second and removing sentences with a character rate greater than 30, where character rate measures characters spoken per second.
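The duration and character-rate thresholds can be sketched as a simple filter. This is a minimal illustration, not the team's actual pipeline: the `Utterance` record, its field names and the helper functions are hypothetical, character rate is assumed to count every character in the transcript (including spaces), and clips of 20 seconds or more are simply dropped here rather than cut into shorter segments.

```python
from dataclasses import dataclass

MAX_DURATION_S = 20.0  # clips must be shorter than this
MIN_DURATION_S = 1.0   # clips shorter than this are dropped
MAX_CHAR_RATE = 30.0   # characters per second; higher suggests a mismatched transcript


@dataclass
class Utterance:
    # Hypothetical record layout, used only for this illustration.
    audio_path: str
    duration_s: float
    text: str


def char_rate(utt: Utterance) -> float:
    """Characters in the transcript per second of audio."""
    return len(utt.text) / utt.duration_s


def keep(utt: Utterance) -> bool:
    """Apply the duration and character-rate thresholds to one utterance."""
    if utt.duration_s < MIN_DURATION_S or utt.duration_s >= MAX_DURATION_S:
        return False
    return char_rate(utt) <= MAX_CHAR_RATE


def filter_utterances(utterances: list[Utterance]) -> list[Utterance]:
    """Return only the utterances that pass every threshold."""
    return [utt for utt in utterances if keep(utt)]
```

A transcript with far more characters than could plausibly be spoken in the clip's duration is a cheap signal that audio and text do not match, which is the kind of noisy data Koluguri describes.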

Makwana then used NeMo to train the ASR model, which had 120 million parameters, for 160 epochs, or full cycles through the dataset.

For the competition's open track, the team used models pretrained with 36,000 hours of data on all 40 languages spoken in India. Fine-tuning this model for the Telugu language took around three days using an NVIDIA DGX system, according to Makwana.

Inference test results were then shared with the competition organizers. NVIDIA won with around 2% better word error rates than the second-place participant. This is a huge margin for speech AI, according to Koluguri.

"The impact of ASR model development is very high, especially for low-resource languages," he added. "If a company comes forward and sets a baseline model, as we did for this competition, people can build on top of it with the NeMo toolkit to make transcription, translation and other ASR applications more accessible for languages where speech AI is not yet prevalent."

NVIDIA Expands Speech AI for Low-Resource Languages

"ASR is gaining a lot of momentum in India majorly because it will allow digital platforms to onboard and engage with billions of citizens through voice-assistance services," Makwana said.

And the process for building the Telugu model, as outlined above, is a technique that can be replicated for any language.

Of around 7,000 world languages, 90% are considered to be low resource for speech AI - representing 3 billion speakers. This doesn't include dialects, pidgins and accents.

Open sourcing all of its models on the NeMo toolkit is one way NVIDIA is improving linguistic inclusion in the field of speech AI.

In addition, pretrained models for speech AI, as part of the NVIDIA Riva software development kit, are now available in 10 languages - with many additions planned for the future.

And NVIDIA last month hosted its inaugural Speech AI Summit, featuring speakers from Google, Meta, Mozilla Common Voice and more. Learn more about Unlocking Speech AI Technology for Global Language Users by watching the presentation on demand.

Get started building and training state-of-the-art conversational AI models with NVIDIA NeMo.
LINK: https://blogs.nvidia.com/blog/2022/12/02/speech-ai-telugu-language-bre...