
Linguists estimate that at least half of the world's estimated 7,000 spoken languages will become extinct by the century's end, due to forces ranging from globalization to cultural assimilation.
Part of the challenge of documenting and revitalizing endangered languages is a lack of texts and speech recordings to work with. Seneca, a language of one of the six Iroquois Nations in North America, has only about 100 first-language speakers and several hundred more second-language learners.
Automatic speech recognition (ASR) technology is widely used to transcribe languages with millions or billions of speakers, like English and Mandarin. But it has only scratched the surface with languages like Seneca, which have vastly fewer speakers and significantly less data to work with.
Now a team of researchers at the Rochester Institute of Technology in New York, along with colleagues from the University at Buffalo, is tapping deep learning to bolster the ability of ASR. And while its focus is on Seneca, the project's vision encompasses the preservation of languages globally as well as an important part of our shared cultural history.
Knowing about different languages teaches us a lot about how our brain works, said Emily Prud'hommeaux, an assistant professor of computer science at Boston College and a research faculty member at RIT. When you document a language, you're preserving information not only about that language but also about how humans use language in general.
It's no coincidence that Prud'hommeaux and her team started with the Seneca language. Three members of the Seneca nation are part of the effort - a direct connection that is rare in research of this type, she said.
Leading the charge is Robbie Jimerson, a Ph.D. student in RIT's Golisano College of Computing and Information Science. He is a member of the Seneca Nation of Indians and is passionate about ensuring the survival of the Seneca language.
There's a big effort by the leaders of the tribe to preserve and promote our language, said Jimerson. I was looking for an opportunity to contribute.
Using GANs to Create More Language Samples Now in its third year, the project has had challenges when it comes to accumulating language data. Jimerson said the Seneca community can be guarded about what it shares with other people, so there wasn't an abundance of recordings of the language being spoken. He set out to change that.
He started by recording friends and elders who speak the language and asking them to record their friends. He found out whenever someone was speaking Seneca in public. He asked for family recordings of elders telling stories handed down from previous generations. And he grabbed any publicly available videos or recordings he could find online.
The team has fine-tuned an ASR model for Seneca, running it through generative adversarial networks to create more samples out of the limited number of recordings. The model turns wave files of the spoken language into streams of characters, while computing probability and making corrections.
The resulting data is fed into a deep learning model that in turn expands upon the ASR model's accuracy.
The team's networks run in two compute settings: on a nine-server machine learning lab running a variety of NVIDIA Tesla GPUs, and on a university cluster of large servers, each running 10 NVIDIA Tesla P4 GPUs. Each cluster runs a range of deep learning frameworks such as TensorFlow and Caffe.
The computer engineering cluster is for all students in the computer engineering department, and so they have to compete' for these resources, said Ray Ptucha, assistant professor of computer engineering at RIT, another collaborator on this project.
With access to these clusters at a premium, Jimerson tests code and checks the stability of models on a local machine running an NVIDIA TITAN X rather than inconvenience other students by running a model that might crash.
Achieving Better Accuracy So far, the team's efforts have brought the word error rate of its ASR model from 70 percent down to 56 percent. The goal, said Prud'hommeaux, is to get that rate down to 25 percent, which is where ASR systems were in processing English several years ago.
The more samples of spoken and written Seneca the team can accumulate, the more the error rate will decrease. (Today, English ASR models can achieve word error rates as low as 5 percent.)
The team's work is expected to help with language preservation efforts around the world.
Prud'hommeaux said the team has an agreement with an archiving institution that's a condition of a grant the project received from the National Science Foundation. The resulting language archiving database will be made available as a resource for other efforts seeking to document threatened languages.
Additionally, Prud'hommeaux said the team's work could prove helpful for any deep learning effort that has to make do with limited amounts of data.
Read more about the team's work in their research papers here and here.
Feature image: The Haudenosaunee (Iroquois Confederacy) flag, via Wikimedia Commons.
Most recent headlines
09/11/2025
Dalet today announced a transformative leap forward for media operations: Agentic Artificial Intelligence (AI) that unifies the Dalet ecosystem under one natura...
25/10/2025
LOS ANGELES As the popularity of short-for vertical videos from mobile devices continues to soar, vgames, Pitango and a group of Hollywood executives and celebr...
25/10/2025
LONDON The AI-powered VFX toolkit Slapshot has launched a professional-grade AI camera tracking tool the company said is designed to deliver precise camera sol...
25/10/2025
NEW YORK NAB Show New York said its 2025 edition wrapped up its program on Oct. 23 with 11,500 registered attendees from 95 countries, reinforcing its status as...
25/10/2025
NEW YORK Vimeo said it rolled out new AI-powered features and creative tools that it said will make professional video production faster, smarter and more rewar...
25/10/2025
HOUSTON Regional sports network Space City Home Network has upgraded its audio control room with a Solid State Logic System T S300-32 audio console as part of t...
25/10/2025
BAFTA-nominated cinematographer Annemarie Lean-Vercoe ( Breeders , Call the Midwife , Murder in Provence ) was just the DoP to set the look on sophisticated a...
25/10/2025
OpenDrives, Inc, a leading provider of software-defined data storage and data services, today announced a new distribution partnership with Versatile Distributi...
25/10/2025
Chaos today announces the release of Chaos Vantage 3, the first major update to its real-time visualization platform in more than two years. With Vantage, AEC p...
25/10/2025
Frequency, the engine behind many of the world's leading streaming television channels, today announced that it powered the first-ever delayed-live broadcas...
25/10/2025
LTN is accelerating the digital evolution of US local broadcasters with innovations that enable stations to launch streaming channels faster, deliver live news ...
25/10/2025
Rise WIB and Rise AV, advocacy groups championing gender diversity and professional development in the broadcast and AV sectors, have announced key leadership u...
25/10/2025
European technology developer Profuz Digital is proud to announce its partnership with the Atlantic Club of Bulgaria as a Special Technical Partner. To overcome...
25/10/2025
European cultural broadcaster ARTE has strengthened its long-standing relationship with Grass Valley, selecting the company's LDX 135 cameras and Creative G...
25/10/2025
IBC today announced the official call for challenge submissions to the Accelerator Media Innovation Programme 2026, inviting forward-thinking organisations from...
25/10/2025
ASB GlassFloor, the Germany-based global leader in high-performance sports flooring, announces the official launch of ASB Arena and Event Services AG (AES), a s...
25/10/2025
Bitmovin, leading provider of video streaming solutions, has announced that its internal playback stream testing system for the Bitmovin Player now leverages AI...
25/10/2025
Envoi, multi-cloud data management and data protection solutions provider, has launched a new solution, Envoi Express Lane, for managing the demands of distribu...
25/10/2025
Accedo, a global provider of video streaming software and services, has supported FloSports, a leading sports media company, to expand its service to Samsung an...
25/10/2025
Spanish-Language Music Production Course Debuts at Berklee Online New 12-week online course expands access to Berklee's renowned music curriculum for Span...
24/10/2025
NEP CEO Martin Stewart on $700M Investment, Restructuring, and the Challenges Fa...
24/10/2025
FOX Sports Debuts Next-Gen Graphics, Celebrates Career of Lead Producer Pete Mac...
24/10/2025
GROUP MEDIAPRO Chairman and CEO Tatxo Benet Steps DownBy Ken Kerschbaumer, Editorial Director
Friday, October 24, 2025 - 2:37 pm
Print This Story | Subscri...
24/10/2025
NBA Tip-Off: Amazon Prime Video Debuts Cutting-Edge Studio, Mobile Units, Global...
24/10/2025
(L-R) Director Justin Lin with his cast and producers at Eccles Theatre for the premiere of Last Days in Park City. (Photo by George Pimentel/Shutterstock for...
24/10/2025
As global connectivity demands continue to grow, non-terrestrial networks (NTNs) are emerging as a transformative force in telecommunications. By extending cove...
24/10/2025
Warsaw - Poland, October 20, 2025 - Nielsen, a global leader in audience measurement, data and analytics, has published its latest All Screens Video Landscape r...
24/10/2025
Springsteen: Deliver Me from Nowhere Filmed at Berklee NYCs Power Station The biopic, starring Jeremy Allen White as the Boss, focuses on the period when Spri...
24/10/2025
TORONTO Sometimes in sports, as in life, it's the little things that matter, and that aphorism will be on full display tonight when the Toronto Blue Jays ta...
24/10/2025
NEW YORK Charters Spectrum Reach has announced that its clients have used Waymark's AI-driven ad creation platform to create more than 15,000 ads since Spec...
24/10/2025
BURLINGTON, Mass. Avid has today announced the release of Pro Tools 2025.10, a feature-rich update that the company said offers notable advances in immersive mu...
24/10/2025
NEW YORK In a major change for the ad industry, Comcast Advertising will unveil technology that enables agencies and brands to buy targetable, biddable ads on l...
24/10/2025
WASHINGTON The ATSC broadcast standards group has outlined a growing list of international activities that the group said is expanding its influence and solidif...
24/10/2025
FIRST PLACE AND 5,000 LYNDA McCARTHY FOR WITNESS'
SECOND PLACE AND 4,000 ANGELA FINN FOR A SPECTRUM OF SORROW'
THIRD PLACE AND 3,000 IAN FE...
24/10/2025
24 Oct 2025
VEON to Release 3Q25 Earnings Update on November 10, 2025 Dubai, October 24, 2025 - VEON Ltd. (NASDAQ: VEON), a global digital operator, today conf...
24/10/2025
One-off special from the team behind BAFTA award-winning Libby, Are You Home Yet...
24/10/2025
The review examined how the model is developed, managed, and delivered against the requirements set out in the Origin framework.
Simon Redlich, Chief Executive...
24/10/2025
Countdown to GTC Washington, DC: What to Watch Next Week Next week, Washington, D.C., becomes the center of gravity for artificial intelligence. NVIDIA GTC W...
24/10/2025
RT will provide extensive coverage of the results of the Presidential Election across television, radio and online on Saturday, 25 October 2025.
Throughout th...
24/10/2025
New Coaches, New Families and New Challenges Set for Ireland's Fittest Famil...
24/10/2025
Westlife, Imelda May and Ben Elton among the guests on this week's Late Late...
23/10/2025
Unlocking character: Sportcast on executing the Bundesliga and Bundesliga 2 new ...
23/10/2025
Clear coordination: Juggling the new Bundesliga rights cycle requirements and pu...
23/10/2025
Analysis: Is piracy just the cost of doing business? By Callum McCarthy, Editor-at-Large
Tuesday, October 21, 2025 - 09:58
Print This Story
It's high ...
23/10/2025
ESPN's Adam Whitlock on Driving Real-World Innovation Across the Video-Trans...
23/10/2025
SVG TranSPORT 2025 Unites 300+ Industry Leaders in New York for Deep Dive Into L...
23/10/2025
NBA Tip-Off: League Starts Season With Two New Broadcast Partners, In-House NBA ...
23/10/2025
NFL Deepens Business Partnership with EA Sports; More Madden Casts to Come?EA Sports will remain the exclusive producer and distributor of Madden NFL video game...
23/10/2025
NFL Moves Pro Bowl Games Indoors and to Super Bowl Week; Leans Into a Made-for-T...
23/10/2025
By Alan Dominguez
Recently I have been thinking about the intersection of two e...