How Deep Learning Is Aiding Preservation of Seneca and Other Endangered Languages

03/01/2019

Linguists estimate that at least half of the world's roughly 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.

Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.

Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.

Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to bolster the ability of ASR. And while its focus is on Seneca, the project's vision encompasses the preservation of languages globally as well as an important part of our shared cultural history.

"Knowing about different languages teaches us a lot about how our brain works," said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. "When you document a language, you're preserving information not only about that language but also about how humans use language in general."

It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca Nation are part of the effort, a direct connection that is rare in research of this type, she said.

Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.

"There's a big effort by the leaders of the tribe to preserve and promote our language," said Jimerson. "I was looking for an opportunity to contribute."

Using GANs to Create More Language Samples

Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.

He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.

The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.
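The article does not say how the model converts audio into characters. One common approach, assumed here purely for illustration, is to have the network output a probability for each character at each short audio frame, pick the most likely character per frame, and then collapse repeats and drop a special "blank" symbol. The alphabet, the blank symbol and the probabilities below are all made up for the sketch and do not come from the team's system.

```python
from itertools import groupby

BLANK = "_"  # hypothetical "no character" symbol, an assumption of this sketch


def greedy_decode(frame_probs, alphabet):
    """Turn per-frame character probabilities into a character string.

    frame_probs: list of lists, one probability per alphabet entry per frame.
    alphabet:    list of symbols, including BLANK.
    """
    # Pick the most probable symbol in every frame.
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]
    # Collapse runs of the same symbol, then drop the blanks.
    collapsed = [symbol for symbol, _ in groupby(best)]
    return "".join(symbol for symbol in collapsed if symbol != BLANK)


# Toy input: five audio frames over a three-symbol alphabet.
alphabet = [BLANK, "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank separates the two a's
    [0.20, 0.70, 0.10],  # a
    [0.10, 0.10, 0.80],  # b
]
decoded = greedy_decode(frames, alphabet)
print(decoded)
```

The blank symbol is what lets a decoder of this kind output a doubled letter: without the blank frame in the middle, the two runs of "a" would merge into one.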

The resulting data is fed into a deep learning model that in turn improves the ASR model's accuracy.

The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.

"The computer engineering cluster is for all students in the computer engineering department, and so they have to 'compete' for these resources," said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.

With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.

Achieving Better Accuracy

So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.

The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
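For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the correct one, divided by the number of words in the correct transcript. The sketch below is a minimal version of that standard calculation, not the team's evaluation code; the English sentences are invented placeholders, and `relative_reduction` is a helper name introduced here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate that was eliminated."""
    return (before - after) / before


# Placeholder sentences: one substitution ("sit") and one deletion ("the").
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.2f}")

# The figures reported in the article: 70 percent down to 56 percent.
print(f"Relative reduction: {relative_reduction(70, 56):.0%}")
```

Read this way, the drop from 70 to 56 percent is 14 percentage points, or a fifth of the original errors. Because insertions count as errors, the rate can exceed 100 percent when a transcript contains many extra words.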

The team's work is expected to help with language preservation efforts around the world.

Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.

Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.

Read more about the team's work in their research papers here and here.

Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
LINK: https://blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-senec...