
How Deep Learning Is Aiding Preservation of Seneca and Other Endangered Languages

03/01/2019

Linguists estimate that at least half of the world's estimated 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.

Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.

Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.

Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to bolster the ability of ASR. And while its focus is on Seneca, the project's vision encompasses the preservation of languages globally as well as an important part of our shared cultural history.

"Knowing about different languages teaches us a lot about how our brain works," said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. "When you document a language, you're preserving information not only about that language but also about how humans use language in general."

It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca nation are part of the effort - a direct connection that is rare in research of this type, she said.

Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.

"There's a big effort by the leaders of the tribe to preserve and promote our language," said Jimerson. "I was looking for an opportunity to contribute."

Using GANs to Create More Language Samples

Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.

He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.

The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.
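To make the "wave files into streams of characters" step concrete, here is a minimal sketch of one common way an ASR system can turn per-frame character probabilities into text: greedy decoding in the style of connectionist temporal classification (CTC). The article does not say which decoding method the team uses, so the technique, the function name `greedy_decode`, the blank symbol, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = "_"  # hypothetical "no character" symbol, an assumption of this sketch


def greedy_decode(frame_probs: Sequence[Sequence[float]],
                  alphabet: Sequence[str]) -> str:
    """Pick the most probable symbol per audio frame, then collapse the result.

    frame_probs[t][k] is the (toy) probability of alphabet[k] at frame t.
    Consecutive repeats are merged and blank symbols are dropped.
    """
    # Index of the highest-probability symbol for each frame.
    best: List[int] = [
        max(range(len(probs)), key=probs.__getitem__) for probs in frame_probs
    ]

    chars: List[str] = []
    previous = None
    for index in best:
        # Emit a character only when it differs from the previous frame's
        # choice and is not the blank symbol.
        if index != previous and alphabet[index] != BLANK:
            chars.append(alphabet[index])
        previous = index
    return "".join(chars)


# Toy example: five frames over a three-symbol alphabet (made-up numbers).
toy_alphabet = [BLANK, "a", "b"]
toy_frames = [
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.7, 0.2],    # "a" again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank, separates the two "a" characters
    [0.2, 0.7, 0.1],    # "a"
    [0.1, 0.2, 0.7],    # "b"
]
decoded = greedy_decode(toy_frames, toy_alphabet)
print(decoded)
```

Production systems typically replace the greedy choice with a beam search that also consults a language model, which is one place the "making corrections" step mentioned above could happen.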

The resulting data is fed into a deep learning model that in turn improves the ASR model's accuracy.

The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.

"The computer engineering cluster is for all students in the computer engineering department, and so they have to 'compete' for these resources," said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.

With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.

Achieving Better Accuracy

So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.

The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
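For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard edit-distance table. The function name `word_error_rate` and the English toy sentences are illustrative assumptions, not the team's evaluation code or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


# Toy example: one substituted word and one dropped word out of six.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.1%}")
```

On this scale, the team's reported move from 70 percent to 56 percent is a drop of 14 percentage points, or a 20 percent relative reduction in errors.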

The team's work is expected to help with language preservation efforts around the world.

Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.

Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.

Read more about the team's work in their research papers here and here.

Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
LINK: https://blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-senec...