
Microsoft is presenting a research paper this week at Interspeech 2019 in Austria, entitled "Meeting Transcription Using Asynchronous Distant Microphones". The paper shows how meeting participants could use the smartphones, laptops, and tablets they already carry, all of which are equipped with microphones, instead of specially designed mics.
The full details are posted on a blog on the Microsoft website.
The central idea is to take any internet-connected devices, such as the laptops and smartphones that attendees typically bring to meetings, and form an ad hoc microphone array in the cloud. With this approach, teams could use the devices they already have to obtain high-accuracy transcription without special-purpose hardware.
While the idea sounds simple, it requires overcoming many technical challenges to be effective. The audio quality of devices varies significantly. The speech signals captured by different microphones are not aligned with each other. The number of devices and their relative positions are unknown. For these reasons and others, consolidating the information streams from multiple independent devices in a coherent way is much more complicated than it may seem. In fact, although the concept of ad hoc microphone arrays dates back to the beginning of this century, to our knowledge it has not been realized as a product or public prototype so far. Meanwhile, techniques for combining multiple information streams were developed in different research areas. At the same time, general advances in speech recognition, especially via the use of neural network models, have helped bring transcription accuracy closer to usable levels.
The diagram shown above depicts the resulting processing pipeline. It starts by aligning the signals from the different microphones, followed by blind beamforming. The term "blind" refers to the fact that beamforming is performed without any knowledge of the microphones or their locations. This is done using neural networks optimised to recover input features for acoustic models, as we reported previously. The beamformer generates multiple output signals so that the downstream modules (speech recognition and speaker diarisation) can still leverage the acoustic diversity offered by the random microphone placement. After speech recognition and speaker diarisation, the speaker-annotated transcripts from the multiple streams are consolidated by combining confusion networks that encode both word and speaker hypotheses, and the result is sent back to the meeting attendees. After the meeting, the attendees can choose to keep the transcripts available only to themselves or share them with specified people.
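To make the first pipeline step more concrete, the sketch below estimates the offset between two recordings of the same sound by brute-force cross-correlation and then shifts one of them into line with the other. It is an illustration only, not the method used in the paper: the function names `estimate_delay` and `align`, the `max_lag` search window, and the toy sample values are all assumptions made for this example. Real devices also drift relative to each other over time, so a practical system would presumably need to repeat such an estimate over successive windows rather than once.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` lags behind `reference`.
    Hypothetical helper for illustration; brute-force cross-correlation.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += x * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(delayed, lag):
    """Shift `delayed` by `lag` samples so that it lines up with the reference."""
    if lag >= 0:
        return delayed[lag:]          # drop the leading samples
    return [0.0] * (-lag) + delayed   # pad the start with silence


# Toy example: the second device hears the same sound three samples later.
reference = [0.0, 0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 0.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0] + reference[:-3]

lag = estimate_delay(reference, late, max_lag=5)
aligned = align(late, lag)
```

In this toy case the search recovers the three-sample offset, and the shifted signal matches the start of the reference. The brute-force loop is quadratic in signal length, so longer recordings would normally use an FFT-based correlation instead.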
The work published at Interspeech 2019 is part of a longer focused effort, codenamed Project Denmark.