Speech AI Expands Global Reach With Telugu Language Breakthrough

02/12/2022

More than 75 million people speak Telugu, predominantly in India's southern regions, making it one of the most widely spoken languages in the country.

Despite such prevalence, Telugu is considered a low-resource language when it comes to speech AI. This means there aren't enough hours' worth of speech datasets to easily and accurately create AI models for automatic speech recognition (ASR) in Telugu.

And that means billions of people are left out of using ASR to improve transcription, translation and additional speech AI applications in Telugu and other low-resource languages.

To build an ASR model for Telugu, the NVIDIA speech AI team turned to the NVIDIA NeMo framework for developing and training state-of-the-art conversational AI models. The model won first place in a competition conducted in October by IIIT-Hyderabad, one of India's most prestigious institutes for research and higher education.

NVIDIA placed first in accuracy for both tracks of the Telugu ASR Challenge, which was held in collaboration with the Technology Development for Indian Languages program and India's Ministry of Electronics and Information Technology as a part of its National Language Translation Mission.

For the closed track, participants had to use around 2,000 hours of a Telugu-only training dataset provided by the competition organizers. And for the open track, participants could use any datasets and pretrained AI models to build the Telugu ASR model.

NVIDIA NeMo-powered models topped the leaderboards with a word error rate of approximately 13% and 12% for the closed and open tracks, respectively, outperforming by a large margin all models built on popular ASR frameworks like ESPnet, Kaldi, SpeechBrain and others.
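For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that textbook definition, not the competition's scoring script, whose details the article does not give. The function name `word_error_rate` and whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace; the article does not
    describe how the competition tokenized or normalized transcripts.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one missing word against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a 13% word error rate means roughly 13 word-level errors per 100 reference words. Scores over a whole test set are usually computed by summing errors and reference words across utterances rather than averaging per-utterance rates.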

"What sets NVIDIA NeMo apart is that we open source all of the models we have - so people can easily fine-tune the models and do transfer learning on them for their use cases," said Nithin Koluguri, a senior research scientist on the conversational AI team at NVIDIA. "NeMo is also one of the only toolkits that supports scaling training to multi-GPU systems and multi-node clusters."

Building the Telugu ASR Model

The first step in creating the award-winning model, Koluguri said, was to preprocess the data.

Koluguri and his colleague Megh Makwana, an applied deep learning solution architect manager at NVIDIA, removed invalid letters and punctuation marks from the speech dataset that was provided for the closed track of the competition.
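The article does not say which characters the team treated as invalid. As a rough sketch of this kind of transcript cleanup, the example below keeps only code points from the Telugu Unicode block (U+0C00 to U+0C7F) and whitespace. That rule, and the helper name `clean_transcript`, are assumptions made for illustration, not the team's actual procedure.

```python
import re
import unicodedata

# Assumption: anything outside the Telugu Unicode block (U+0C00-U+0C7F)
# that is not whitespace counts as an invalid character. This covers
# punctuation, Latin letters and ASCII digits.
_NOT_TELUGU = re.compile(r"[^\u0C00-\u0C7F\s]")
_WHITESPACE = re.compile(r"\s+")


def clean_transcript(text: str) -> str:
    """Return the transcript with non-Telugu characters removed."""
    # Normalize first so equivalent character sequences compare equal.
    text = unicodedata.normalize("NFC", text)
    # Replace each invalid character with a space so that words joined
    # only by punctuation do not get merged together.
    text = _NOT_TELUGU.sub(" ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    return _WHITESPACE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_transcript("తెలుగు, భాష!"))
```

A production pipeline would need more care than this. For example, zero-width joiners that sometimes appear inside Telugu words fall outside the block and would be turned into spaces here.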

"Our biggest challenge was dealing with the noisy data," Koluguri said. "This is when the audio and the transcript don't match - in this case you cannot guarantee the accuracy of the ground-truth transcript you're training on."

The team cleaned up the audio by cutting clips to less than 20 seconds, discarding clips shorter than 1 second and removing sentences with a character rate, which measures characters spoken per second, greater than 30.
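The thresholds above can be expressed as a simple filter. The sketch below is illustrative only: the `Clip` record and the helper names are hypothetical, characters are counted as Unicode code points (the article does not say how they were counted), and clips of 20 seconds or longer are dropped rather than cut into shorter segments as the team did.

```python
from dataclasses import dataclass

MIN_DURATION_S = 1.0   # clips shorter than this are discarded
MAX_DURATION_S = 20.0  # clips must be shorter than this
MAX_CHAR_RATE = 30.0   # maximum characters spoken per second


@dataclass
class Clip:
    """Hypothetical record for one training example."""
    audio_path: str
    duration: float  # length of the audio in seconds
    text: str        # transcript for the clip


def char_rate(clip: Clip) -> float:
    """Characters per second, counting Unicode code points (an assumption)."""
    return len(clip.text) / clip.duration


def keep_clip(clip: Clip) -> bool:
    """Apply the duration and character-rate thresholds from the article."""
    if clip.duration < MIN_DURATION_S:
        return False
    if clip.duration >= MAX_DURATION_S:
        # Simplification: the team cut long clips down; this sketch drops them.
        return False
    return char_rate(clip) <= MAX_CHAR_RATE


if __name__ == "__main__":
    clips = [
        Clip("a.wav", 5.0, "x" * 100),  # 20 characters per second
        Clip("b.wav", 0.5, "x"),        # too short
        Clip("c.wav", 2.0, "x" * 100),  # 50 characters per second
    ]
    print([c.audio_path for c in clips if keep_clip(c)])
```

A very high character rate usually signals the kind of audio and transcript mismatch Koluguri describes, since the transcript contains more text than could plausibly be spoken in the clip's duration.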

Makwana then used NeMo to train the ASR model, which had 120 million parameters, for 160 epochs, or full cycles through the dataset.

For the competition's open track, the team used models pretrained with 36,000 hours of data on all 40 languages spoken in India. Fine-tuning this model for the Telugu language took around three days using an NVIDIA DGX system, according to Makwana.

Inference test results were then shared with the competition organizers. NVIDIA won with around 2% better word error rates than the second-place participant. This is a huge margin for speech AI, according to Koluguri.

"The impact of ASR model development is very high, especially for low-resource languages," he added. "If a company comes forward and sets a baseline model, as we did for this competition, people can build on top of it with the NeMo toolkit to make transcription, translation and other ASR applications more accessible for languages where speech AI is not yet prevalent."

NVIDIA Expands Speech AI for Low-Resource Languages

"ASR is gaining a lot of momentum in India majorly because it will allow digital platforms to onboard and engage with billions of citizens through voice-assistance services," Makwana said.

And the process for building the Telugu model, as outlined above, is a technique that can be replicated for any language.

Of around 7,000 world languages, 90% are considered to be low resource for speech AI - representing 3 billion speakers. This doesn't include dialects, pidgins and accents.

Open sourcing all of its models on the NeMo toolkit is one way NVIDIA is improving linguistic inclusion in the field of speech AI.

In addition, pretrained models for speech AI, as part of the NVIDIA Riva software development kit, are now available in 10 languages - with many additions planned for the future.

And NVIDIA last month hosted its inaugural Speech AI Summit, featuring speakers from Google, Meta, Mozilla Common Voice and more. Learn more about Unlocking Speech AI Technology for Global Language Users by watching the presentation on demand.

Get started building and training state-of-the-art conversational AI models with NVIDIA NeMo.
LINK: https://blogs.nvidia.com/blog/2022/12/02/speech-ai-telugu-language-bre...