How Deep Learning Is Aiding Preservation of Seneca and Other Endangered Languages

03/01/2019

Linguists estimate that at least half of the world's roughly 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.

Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.

Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.

Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to improve ASR for such languages. While its focus is on Seneca, the project's broader vision is to help preserve languages globally, and with them an important part of our shared cultural history.

"Knowing about different languages teaches us a lot about how our brain works," said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. "When you document a language, you're preserving information not only about that language but also about how humans use language in general."

It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca Nation are part of the effort, a direct connection that is rare in research of this type, she said.

Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.

"There's a big effort by the leaders of the tribe to preserve and promote our language," said Jimerson. "I was looking for an opportunity to contribute."

Using GANs to Create More Language Samples

Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.

He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.

The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.

The resulting data is fed into a deep learning model that in turn expands upon the ASR model's accuracy.
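To make the "streams of characters" step more concrete, here is a minimal Python sketch of one common way a speech model's per-frame character probabilities can be collapsed into text: greedy CTC-style decoding. The article does not say which decoding method the team uses, so the technique, the `greedy_decode` function, the `BLANK` symbol, and the toy alphabet are all assumptions made for illustration. The "making corrections" step, which would typically involve a language model, is not shown.

```python
# Illustrative only: a greedy CTC-style decoder, not the team's actual method.
BLANK = "_"  # hypothetical "no character" symbol assumed for this sketch


def greedy_decode(frame_probs, alphabet):
    """Pick the most probable symbol per audio frame, merge repeats, drop blanks."""
    # Most probable symbol for each frame of audio.
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]

    chars = []
    previous = None
    for symbol in best:
        # A symbol repeated across consecutive frames counts once;
        # a blank between two identical symbols keeps them distinct.
        if symbol != previous and symbol != BLANK:
            chars.append(symbol)
        previous = symbol
    return "".join(chars)


# Toy example with a three-symbol alphabet and five audio frames.
alphabet = [BLANK, "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.7, 0.2],    # "a" again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.1, 0.7],    # "b"
    [0.1, 0.2, 0.7],    # "b" again, merged
]
decoded = greedy_decode(frames, alphabet)
```

In this toy input, the repeated frames collapse so that five frames yield the two-character string "ab". A real system would work over a much larger character set and far longer frame sequences.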

The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.

"The computer engineering cluster is for all students in the computer engineering department, and so they have to 'compete' for these resources," said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.

With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.

Achieving Better Accuracy

So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.

The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
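For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below shows that standard calculation. The function name `word_error_rate` and the placeholder sentences are illustrative assumptions, not the team's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution (b -> x) and one deletion (d)
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

By this measure, a 56 percent error rate means that errors amount to a little more than half the number of words in the reference transcript. Because insertions count as errors, the rate can exceed 100 percent.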

The team's work is expected to help with language preservation efforts around the world.

Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.

Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.

The team describes its work in more detail in two research papers.

Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
LINK: https://blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-senec...