How Deep Learning Is Aiding Preservation of Seneca and Other Endangered Languages

03/01/2019

Linguists estimate that at least half of the world's estimated 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.

Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.

Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.

Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to improve ASR for languages with little data. And while its focus is on Seneca, the project's vision encompasses the preservation of languages globally as well as an important part of our shared cultural history.

"Knowing about different languages teaches us a lot about how our brain works," said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. "When you document a language, you're preserving information not only about that language but also about how humans use language in general."

It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca Nation are part of the effort, a direct connection that is rare in research of this type, she said.

Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.

"There's a big effort by the leaders of the tribe to preserve and promote our language," said Jimerson. "I was looking for an opportunity to contribute."

Using GANs to Create More Language Samples

Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.

He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.

The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.

The resulting data is fed into a deep learning model that in turn expands upon the ASR model's accuracy.
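The article does not say how the model converts per-frame probabilities into characters. As a rough illustration only, the sketch below assumes a CTC-style output, a common choice for character-level ASR but not confirmed for this project. The function name `greedy_decode`, the blank symbol `_`, and the toy alphabet are all hypothetical.

```python
from itertools import groupby

BLANK = "_"  # hypothetical "no character" symbol used by CTC-style models


def greedy_decode(frame_probs, alphabet):
    """Turn per-frame character probabilities into a string.

    frame_probs: one list of probabilities per audio frame, aligned with alphabet.
    alphabet: list of output symbols, including BLANK.
    """
    # Pick the most probable symbol index for every frame.
    best = [max(range(len(alphabet)), key=frame.__getitem__) for frame in frame_probs]
    # Collapse runs of the same symbol, then drop blanks.
    collapsed = [idx for idx, _ in groupby(best)]
    return "".join(alphabet[i] for i in collapsed if alphabet[i] != BLANK)


# Toy input: five frames over a three-symbol alphabet (assumed values).
alphabet = [BLANK, "a", "b"]
frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
decoded = greedy_decode(frames, alphabet)
```

The blank between the two runs of "a" is what lets a repeated character survive the collapse step. Real systems typically replace this greedy pass with a beam search and a language model, which is one place the "corrections" mentioned above could happen.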

The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.

"The computer engineering cluster is for all students in the computer engineering department, and so they have to 'compete' for these resources," said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.

With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.

Achieving Better Accuracy

So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.

The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
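For readers unfamiliar with the metric, word error rate is usually defined as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the team's evaluation code. The function names and the English example sentences are placeholders.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error that was removed."""
    return (before - after) / before


# Placeholder sentences: one substitution and one deletion against six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# The figures reported in the article: 70 percent down to 56 percent.
gain = relative_reduction(0.70, 0.56)
```

By this measure, the reported drop from 70 to 56 percent is a relative reduction of 20 percent, and reaching the 25 percent goal would mean removing roughly a further 55 percent of the remaining errors.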

The team's work is expected to help with language preservation efforts around the world.

Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.

Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.

Read more about the team's work in their research papers here and here.

Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
LINK: https://blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-senec...