Tongues Untied: Dataset Starts Global Dialogue in Conversational AI

09/07/2021

A startup in East Africa is harnessing conversational AI to get the word out about a third wave of COVID-19 passing through the region. It hopes its Mbaza AI Chatbot will lead to partnerships that use the technology to tackle other concerns across the continent's many languages.

"COVID is here to stay, unfortunately, and it's a volatile topic with measures that tighten and loosen from week to week, so it's important for people to have access to the latest information," said Audace Niyonkuru, founder and CEO of Digital Umuganda, the startup developing the software.

Based in Rwanda's capital of Kigali, his team aims to deploy a basic voice service in August. It will follow up with a version by year's end that can interpret and respond to spoken questions.

Conversational AI Gets the Word Out

"Ours is a more oral culture where there are still barriers to access because it's easier for people to talk than write," Niyonkuru said of the primarily rural country where three-quarters of the 12 million population are literate.

It's a challenge shared widely across Africa, home to more than 2,000 languages and dialects. But Niyonkuru, a lifelong entrepreneur, prefers to see the glass as half full.

"There's a huge opportunity globally because conversational AI is a bridge over barriers to access - people can use their phones to get all sorts of medical or legal information," he said.

Giving AI a Common Voice

To train a conversational AI model, you need an extremely large dataset of voice samples, something that takes lots of time to build or lots of money to buy. The startup trained its models on Mozilla Common Voice, a free and publicly available multilingual platform and dataset created by Mozilla and supported by NVIDIA. The Common Voice dataset was built through contributions from thousands of volunteers across the world.

Digital Umuganda is Africa's largest contributor to the platform. To date, it's organized contributors to create 2,200 hours of Kinyarwanda, the language spoken by 40 million people in and around Rwanda. It's the largest dataset after English in Common Voice today.

To create the dataset, the startup tapped into Rwanda's tradition where neighbors gather on the last Saturday of each month to work on a community project. The startup embraced and extended the practice, called umuganda.

"The spirit of open source software is embedded in Rwanda's culture, so we just applied it to the digital world and datasets," he said.

Donations Shared with All

Digital Umuganda started collecting data with student gatherings at universities, then went to the countryside to make sure the dataset represented people of all ages.

"The beautiful thing is because it's in the open we see researchers around the world working with it," said Niyonkuru.

Two branches of the Rwandan government have expressed interest in using the startup's technology, and at least one third party has already created a conversational AI model using the dataset.

The COVID project got its start last spring when government call centers were overwhelmed by peaks of more than 10,000 calls for information about the pandemic. The Mbaza chatbot will be deployed on existing government healthcare lines as a 24/7 information service.

It's one example of how Common Voice is democratizing access to conversational AI around the globe, both for companies that develop the technology and consumers who use it.

Giving More Languages a Voice

First launched in 2017, the Common Voice dataset gets an updated release twice a year. It focuses on expanding support in underrepresented languages, filling wide gaps left by commercial voice projects that typically focus on a handful of the most popular American, Asian and European languages.

Common Voice currently packs more than 10,000 hours of recorded voice samples, collected and validated by volunteers. It's a treasure trove for startups, researchers and small- to medium-sized developers who don't have the time or money to collect or purchase datasets of their own.

The next release, coming at the end of July, provides data from 75 languages, 15 of them new to Common Voice. They include Urdu, spoken by 70 million people in South Asia; Hausa, the language of 60 million Africans; as well as Azerbaijani, Armenian, Serbian and Uighur - none of which are supported by major commercial AI services.

It will be the first release since NVIDIA became a partner with Mozilla in April 2021, supporting Common Voice as part of a shared vision of making conversational AI available for everyone.

How You Can Help

We created the NVIDIA Jarvis framework to give developers state-of-the-art pre-trained deep learning models and software tools to create interactive conversational AI services. Now we're helping make this rich, open dataset available, too.

Everyone is invited to join the global effort to make this technology available to all developers in all languages by going to Common Voice and contributing or validating voice samples as part of a dataset anyone can use freely.

Above: Digital Umuganda co-founder Ali Nyiringabo (right) with volunteers at an event in Kigali collecting and validating samples for Common Voice.
LINK: https://blogs.nvidia.com/blog/2021/07/09/common-voice-conversational-a...