How Deep Learning Is Aiding Preservation of Seneca and Other Endangered Languages

03/01/2019

Linguists estimate that at least half of the world's estimated 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.

Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.

Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.

Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to bolster the ability of ASR. And while its focus is on Seneca, the project's vision encompasses the preservation of languages globally as well as an important part of our shared cultural history.

"Knowing about different languages teaches us a lot about how our brain works," said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. "When you document a language, you're preserving information not only about that language but also about how humans use language in general."

It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca nation are part of the effort - a direct connection that is rare in research of this type, she said.

Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.

"There's a big effort by the leaders of the tribe to preserve and promote our language," said Jimerson. "I was looking for an opportunity to contribute."

Using GANs to Create More Language Samples

Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.

He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.

The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.

The resulting data is fed into a deep learning model that in turn expands upon the ASR model's accuracy.
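The step in which the model turns audio into a stream of characters can be illustrated with a small sketch. The article does not say how the team's model decodes its output, so what follows is an assumption: a greedy, CTC-style decoder, a common choice in character-level ASR. The `BLANK` symbol, the `greedy_decode` function, and the toy alphabet are hypothetical names used only for illustration.

```python
# Illustrative only: greedy CTC-style decoding, NOT the team's documented method.
# Each audio frame gets a probability distribution over characters plus a
# hypothetical "blank" symbol that means "no character emitted here".

BLANK = "_"  # hypothetical blank symbol


def greedy_decode(frame_probs, alphabet):
    """Pick the most probable symbol per frame, then collapse repeats and blanks."""
    # Most probable symbol for each frame.
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]

    chars = []
    previous = None
    for symbol in best:
        # Emit a character only when it differs from the previous frame's
        # symbol and is not the blank.
        if symbol != previous and symbol != BLANK:
            chars.append(symbol)
        previous = symbol
    return "".join(chars)


# Toy example with a three-symbol alphabet and five frames.
alphabet = [BLANK, "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.7, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
decoded = greedy_decode(frames, alphabet)
print(decoded)
```

In a full system, the per-frame probabilities would come from the acoustic model, and a language model would typically be used to correct the raw character stream.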

The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.

"The computer engineering cluster is for all students in the computer engineering department, and so they have to 'compete' for these resources," said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.

With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.

Achieving Better Accuracy

So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.

The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
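For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard edit-distance table. The function name `word_error_rate` and the English sample sentences are illustrative assumptions, not taken from the team's code or data.

```python
# Illustrative only: the conventional word error rate (WER) calculation.
# WER = (substitutions + deletions + insertions) / words in the reference.


def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One word missing out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{wer:.1%}")
```

Under this definition, a rate of 56 percent means roughly 56 word-level errors for every 100 words in the reference transcripts.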

The team's work is expected to help with language preservation efforts around the world.

Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.

Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.

Read more about the team's work in their research papers.

Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
LINK: https://blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-senec...