
Why captioning can't be fully automated

31/10/2017

Author: Contributor

Publish date: Aug 3, 2017


You may have heard the headlines - 'We've reached human parity' (Microsoft, 16th October) as they reach an accuracy of over 94 per cent; Google openly planning to compete with Dragon developers Nuance; Amazon attempting to revolutionise access to the internet via Echo and Alexa. It seems like everyone's at the speech recognition game - surely the end is nigh for traditional methods of creating captions?

I think captioners can rest easy for a good while yet - for a few simple reasons. The first is simply the scale of the task that regulators and audiences set the captioner; typically a pre-recorded programme must be captioned 100 per cent accurately, and a live show should hit at least 98 per cent. Taking the pre-recorded example, how hard can that be for a machine? Surely there's all the time in the world to get it right?
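To make those percentages concrete, here is a minimal sketch of one common way to score a transcript: count the word-level edits needed to turn the machine output into a reference transcript, then express that as a fraction of the reference length. This is an assumption for illustration only; the article does not say how regulators measure accuracy, and they may well use a different method. The function names are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Minimum number of word substitutions, insertions and deletions."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Classic dynamic-programming edit distance, kept one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the hypothesis
                current[j - 1] + 1,      # extra word in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy as 1 minus (edit errors / reference word count)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len
```

Under this measure, a 98 per cent target leaves room for only two word errors in every hundred words spoken, and a 100 per cent target leaves room for none.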

Consider what 100 per cent actually means; not only does every word have to be identified and spelled correctly (no mean feat on a show such as Mastermind, where deliberately obscure questions can trigger equally obscure and possibly wrong answers), but the result also has to read sensibly on screen. Imagine writing down every word you utter during any given day; would you go for something akin to the dialogue in a play - accurate with all its 'disfluencies' (those crutch-like 'Ums' and 'Errs' that let your brain change gear whilst letting your mouth free-wheel)? Do you talk in nice, tidy grammatical sentences? Do you pause neatly for mental punctuation? I guessed as much. If you simply transcribe such speech verbatim you'll get a very accurate representation of the words uttered, but that won't make for comprehensible captions and it could well be illegibly fast.
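As a rough illustration of the two problems with verbatim text, the sketch below strips a small set of filler words and estimates how fast a passage would have to be read. The filler list and the function names are hypothetical, and a human editor does far more than this (restructuring sentences, for instance), so treat it as a crude approximation rather than a captioning method.

```python
import re

# Hypothetical filler list; a real style guide would define its own.
FILLERS = {"um", "umm", "uh", "er", "err", "erm"}


def strip_fillers(text: str) -> str:
    """Drop standalone filler words, ignoring attached punctuation."""
    kept = []
    for word in text.split():
        bare = re.sub(r"[^\w']", "", word).lower()
        if bare not in FILLERS:
            kept.append(word)
    return " ".join(kept)


def words_per_minute(text: str, duration_seconds: float) -> float:
    """Reading rate a viewer would need to keep up with the text."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(text.split()) * 60.0 / duration_seconds
```

For example, `strip_fillers("Um, I think, er, we should go")` keeps only the meaningful words, and `words_per_minute` can then flag passages that would flash past too quickly to read.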

Speech recognition also thrives on good quality audio; not just a clear voice, but an absence of echo, background noise, music and so forth. It is possible, with care and a complex workflow, to ensure that the music and the speech remain separate in a recording, but that doesn't help with poor acoustics or a duff recording. Much more research is needed to improve automatic speech recognition (ASR) in complex audio environments, and we're helping a PhD student at the University of Edinburgh to research precisely this.

The automatic insertion of punctuation is in its infancy; some inroads have been made by our research partners at Edinburgh, using techniques more commonly found in Machine Translation. Whilst ASR uses a largely probability-based approach to working out what's been said, punctuation needs something more rule-based. Questions are another matter entirely; cadence can be a good indicator for some speakers (as most languages will let you ignore the formalities of question words) but that's not a universal rule.

Identifying speaker changes is another area that needs more research; for many of our clients we need to be able to accurately identify either a change of speaker (denoted by chevrons or a change in text colour) or the speaker themselves. Whilst automated 'diarisation' reaches good levels of accuracy, it doesn't yet reach the level required for broadcast.

Does this mean we can't use ASR at all? I think not. Not all content is the same; it's not all shouty gameshows, talkshows where each guest cuts across everyone else, and sports output captured in the open, with the roar of the crowd and the rumble of the music bed. Some material is recorded cleanly, with professional speakers speaking at a moderate pace on a subject matter with plenty of background data to assist with the more tricky terms. If we have enough of this kind of data we can train ASR engines to make a pretty good job of transcription. We can utilise the vast archives of media with matching captions to create speech recognition engines, punctuation models and 'caption translation' systems to replicate the kind of output that a human could produce. We can then use 'audio alignment' tools to break this transcription up into readable blocks and time-align them to the original speaker's voice, leading to fully automated captions.
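The final step, breaking a transcript into readable, timed blocks, can be sketched as follows. This assumes an alignment tool has already produced a start and end time for every word; the `Caption` type, the `build_captions` function and the 37-character limit are all hypothetical choices for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (word, start_seconds, end_seconds), assumed to come from an aligner.
TimedWord = Tuple[str, float, float]


@dataclass
class Caption:
    start: float
    end: float
    text: str


def _close(block: List[TimedWord]) -> Caption:
    """Turn a run of timed words into one caption block."""
    return Caption(
        start=block[0][1],
        end=block[-1][2],
        text=" ".join(word for word, _, _ in block),
    )


def build_captions(words: List[TimedWord], max_chars: int = 37) -> List[Caption]:
    """Greedily pack words into blocks no longer than max_chars."""
    captions: List[Caption] = []
    current: List[TimedWord] = []
    for word, start, end in words:
        candidate = " ".join([w for w, _, _ in current] + [word])
        # Start a new block when adding this word would overflow the line.
        if current and len(candidate) > max_chars:
            captions.append(_close(current))
            current = []
        current.append((word, start, end))
    if current:
        captions.append(_close(current))
    return captions
```

Each block inherits the start time of its first word and the end time of its last, so the caption appears and clears in step with the speaker. A production system would also break on sentence boundaries and speaker changes rather than on length alone.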

No doubt if I review this article in ten years' time I'll cringe at the bold assertions made about the progress of automated captioning, but I feel confident that genres such as comedy will remain a bastion of human-generated captioning even in 2027. Comedy is typically based around word play, incongruity and surprise. Speech engines are most comfortable with the opposite of this - they know what they've been trained on, and a new comic turn of phrase will almost certainly bring about an unintentionally comic transcription. I'm pretty sure that a human captioner will be wrestling with the likes of Have I Got News For You for many years to come.

By Matt Simpson, head of product management, access services, Broadcast and Media Services, Ericsson
LINK: https://www.tvbeurope.com/features-2/captioning-cant-fully-automated...