
University of Washington researchers have developed new algorithms that can turn audio clips into a realistic, lip-synced video of the person speaking those words.
As detailed in a paper to be presented August 2 at SIGGRAPH 2017 in L.A., the team successfully generated realistic video of former president Barack Obama talking about terrorism, fatherhood, job creation and other topics using audio clips of those speeches and existing weekly video addresses that were originally on a different topic.
"Realistic audio-to-video conversion has practical applications like improving video conferencing for meetings, as well as futuristic ones such as being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio," said Ira Kemelmacher-Shlizerman, an assistant professor at the UW's Paul G. Allen School of Computer Science & Engineering.
In a visual form of lip-syncing, the system converts audio files of an individual's speech into realistic mouth shapes, which are then grafted onto and blended with the head of that person from another existing video.
In the future, video chat tools like Skype or Messenger will enable anyone to collect videos that could be used to train computer models, Kemelmacher-Shlizerman said.
Because streaming audio over the internet takes up far less bandwidth than video, the new system has the potential to end video chats that are constantly timing out from poor connections.
"When you watch Skype or Google Hangouts, often the connection is stuttery and low-resolution and really unpleasant, but often the audio is pretty good," said co-author and Allen School professor Steve Seitz. "So if you could use the audio to produce much higher-quality video, that would be terrific."
By reversing the process, feeding video into the network instead of just audio, the team could also potentially develop algorithms that detect whether a video is real or manufactured.
The new machine learning tool makes significant progress in overcoming what's known as the "uncanny valley" problem, which has dogged efforts to create realistic video from audio. When synthesised human likenesses appear to be almost real but still somehow miss the mark, people find them creepy or off-putting.
"People are particularly sensitive to any areas of your mouth that don't look realistic," said lead author Supasorn Suwajanakorn, a recent doctoral graduate in the Allen School. "If you don't render teeth right or the chin moves at the wrong time, people can spot it right away and it's going to look fake. So you have to render the mouth region perfectly to get beyond the uncanny valley."
A neural network first converts the sounds from an audio file into basic mouth shapes. Then the system grafts and blends those mouth shapes onto an existing target video and adjusts the timing to create a new realistic, lip-synced video.
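As a rough illustration of the "graft and blend" step, the sketch below composites a mouth patch onto a target frame with a soft mask. This is plain alpha blending, swapped in here for illustration only. It is not the researchers' mouth synthesis technique, and the function name `blend_mouth` and its parameters are hypothetical.

```python
import numpy as np


def blend_mouth(frame, mouth_patch, mask, top, left):
    """Alpha-blend a mouth patch into a video frame.

    frame:       (H, W, 3) array, the existing target video frame
    mouth_patch: (h, w, 3) array, the synthesised mouth region
    mask:        (h, w) array in [0, 1]; 1 keeps the patch, 0 keeps the frame
    top, left:   where the patch's upper-left corner lands in the frame
    (All names are illustrative assumptions, not from the paper.)
    """
    h, w = mouth_patch.shape[:2]
    out = frame.astype(np.float64)  # astype returns a copy, frame is untouched
    region = out[top:top + h, left:left + w]
    alpha = mask[..., None]  # add a channel axis so it broadcasts over RGB
    out[top:top + h, left:left + w] = alpha * mouth_patch + (1.0 - alpha) * region
    return out


# Toy data: a black 4x4 frame and a bright 2x2 patch with a soft edge.
frame = np.zeros((4, 4, 3))
patch = np.full((2, 2, 3), 100.0)
mask = np.array([[1.0, 0.5],
                 [0.0, 0.0]])
blended = blend_mouth(frame, patch, mask, top=1, left=1)
```

A mask that fades from 1 to 0 near its border avoids a visible seam where the patch meets the original face.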
Previously, audio-to-video conversion processes have involved filming multiple people in a studio saying the same sentences over and over to try to capture how a particular sound correlates to different mouth shapes, which is expensive, tedious and time-consuming. By contrast, Suwajanakorn developed algorithms that can learn from videos that exist in the wild on the internet or elsewhere.
"There are millions of hours of video that already exist from interviews, video chats, movies, television programs and other sources. And these deep learning algorithms are very data hungry, so it's a good match to do it this way," Suwajanakorn said.
Rather than synthesising the final video directly from audio, the team tackled the problem in two steps. The first involved training a neural network to watch videos of an individual and translate different audio sounds into basic mouth shapes.
By combining previous research from the UW Graphics and Image Laboratory team with a new mouth synthesis technique, they were then able to realistically superimpose and blend those mouth shapes and textures on an existing reference video of that person. Another key insight was to allow a small time shift to enable the neural network to anticipate what the speaker is going to say next.
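The time shift can be pictured as a change in how training examples are paired: the mouth shape for frame t is matched with audio from a slightly later frame, so the model has heard a little of what comes next before it commits to a shape. The sketch below shows only that pairing, under assumed names (`make_delayed_pairs`, `delay`) and toy data. It does not reproduce the team's network or their actual delay value.

```python
import numpy as np


def make_delayed_pairs(audio_feats, mouth_shapes, delay):
    """Pair the mouth shape at frame t with the audio features at frame t + delay.

    audio_feats:  (T, A) array of per-frame audio features
    mouth_shapes: (T, M) array of per-frame mouth-shape coefficients
    delay:        non-negative look-ahead in frames (a hypothetical parameter)
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    if len(audio_feats) != len(mouth_shapes):
        raise ValueError("audio and mouth sequences must have the same length")
    n = len(audio_feats) - delay
    if n <= 0:
        raise ValueError("delay is longer than the sequence")
    inputs = audio_feats[delay:]  # audio at frame t + delay
    targets = mouth_shapes[:n]    # mouth shape at frame t
    return inputs, targets


# Toy sequences: 10 frames, one feature each, so the pairing is easy to read.
audio = np.arange(10, dtype=float).reshape(10, 1)
mouth = (np.arange(10, dtype=float) * 10).reshape(10, 1)
inputs, targets = make_delayed_pairs(audio, mouth, delay=2)
```

With `delay=2`, the first pair matches audio frame 2 with mouth frame 0. A recurrent model would also consume the first `delay` audio frames, just without a target attached to them.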
The new lip-syncing process enabled the researchers to create realistic videos of Obama speaking in the White House, using words he spoke on a television talk show or during an interview decades ago.
Currently, the neural network is designed to learn on one individual at a time, meaning that Obama's voice, speaking words he actually uttered, is the only information used to drive the synthesised video. Future steps, however, include helping the algorithms generalise across situations to recognise a person's voice and speech patterns with less data: only an hour of video to learn from, for instance, instead of 14 hours.
The research was funded by Samsung, Google, Facebook, Intel and the UW Animation Research Labs.