Transcribing the future

11/10/2018

SCREEN AFRICA EXCLUSIVE: A while back I was handed a bunch of very long audio files and tasked with cutting about 60 hours of interviews, sound bites and voice-over down to a 30-minute radio piece. Easy, I thought, as long as I could get the material transcribed quickly and at a reasonable cost. And there my journey of discovery began.

Voice-to-text transcription has long been used in the media, medical and legal industries, traditionally done by teams of human transcribers. It's big business, but turnarounds can be slow and files often need a second check to make sure the content is accurate. The cost of transcription wasn't actually a factor in my case - it was speed that I needed. I needed a machine to plough through my audio files and spit out a transcript so that I could search for key words and edit my story together.

Trawling the internet, I quickly found a few options and uploaded the same test file to all of them (as part of my free trials), but the results were really disappointing, ranging from 25 to about 59 per cent accuracy. Simple words and phrases were interpreted as something completely different, and more complex words like Fakarava Atoll came back as expletives! At first I thought the problem might be that the majority of the interviewees had heavy New Zealand accents, but a snippet from the guest with the best British accent gave me similar results.
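For readers who want to put a number on results like these, here is a minimal sketch of one common way to score a transcript: word accuracy, computed as one minus the word error rate against a hand-corrected reference. This is an assumption about method, not a description of how the figures above were measured, and the function names and sample sentences are purely illustrative.

```python
def word_errors(reference, hypothesis):
    """Count word-level edits (substitutions, insertions, deletions)
    needed to turn the hypothesis into the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Return 1 - WER, clamped at zero. Assumes a non-empty reference."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1.0 - errors / total)


# Illustrative sentences only, not taken from the real interviews.
reference = "we dived at fakarava atoll"
machine = "we dived at atoll"
print(f"{word_accuracy(reference, machine):.0%}")
```

Scoring a short, hand-checked snippet this way makes it possible to compare services on the same test file rather than relying on impressions.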

Through my work in the video world, I am aware that there is a lot of research and development in the transcription arena utilising Artificial Intelligence (AI). Machine learning works best when it is processing large, analysable data sets like text. But most of the data being produced in the world right now is not text; it's the spoken word embedded in video and audio recordings, and so the push among AI developers to produce a reliable voice transcription process has intensified.

Tech companies like Apple, Google, Microsoft and Amazon are all actively involved in this space and have been researching voice recognition since the 90s. That research has only accelerated with the emergence of virtual assistants like Alexa, Cortana, Google Voice and Siri. However, most people who use Siri or Alexa would agree that, while those tools do a reasonable job of understanding you, most of us wouldn't trust them with our lives. I asked Alexa where the Fakarava Atoll was and her response was, "I would rather not answer that question." (Out of interest, it is in French Polynesia and is not a swear word!) A voice assistant like Alexa only needs to work out which of a predetermined list of vocal commands is being asked, whereas a transcription programme needs to listen for and capture every spoken word. This wider variety of possible inputs and outputs makes it a more difficult task for AI.

Whilst stumbling around for my transcription answers I came across an article published by a pair of data scientists and enthusiastic entrepreneurs, Ashutosh Trivedi and Anup Gosavi, who recently founded a company called Spext. Trivedi, based in Bangalore, India, has a deep interest and postgraduate expertise in AI and has published his research in many IEEE journals. Gosavi is based in San Francisco and specialises in Design Thinking and Information Visualisation.

Spext describes its company name as a fusion of the words speech and text, and from the outset it looked like it could offer me exactly what I wanted and more. The service can best be described as a combined voice transcriber and media editor. You upload your audio files and the system automatically converts voice to text and displays the result in an edit window, where it accurately aligns the audio content with the text. That means you can do some amazing things with the resultant files. Not only do you get a full transcription of your work, but you can edit the transcript as you would in a word processor and then export the result as a new audio file. Obviously you can't create new sentences, but the ability to edit and output the existing material as an audio file is a huge plus. It looks like a normal text editor and has familiar actions like copy-paste and cut-paste, and I found editing by transcript on the fly extremely easy. When you are done you can export your work as a Word document, a PDF and/or a new MP3 or WAV file, or even export your project to professional editing tools such as Adobe Audition and Final Cut Pro for fine-tuning.
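The idea of editing audio by editing text can be sketched in a few lines. The article does not describe how Spext does this internally, so the following is only an illustration under one assumption: that each transcribed word carries the start and end time of its audio. The `Word` class, the `keep_segments` function and the sample timings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the audio file
    end: float


def keep_segments(words, deleted):
    """Return the (start, end) audio ranges that survive after the words
    at the given indices are deleted from the transcript."""
    segments = []
    last_kept = None
    for i, word in enumerate(words):
        if i in deleted:
            continue
        if last_kept is not None and last_kept == i - 1:
            # The previous word was kept too, so extend the current range.
            segments[-1][1] = word.end
        else:
            # A deletion came just before this word, so start a new range.
            segments.append([word.start, word.end])
        last_kept = i
    return [tuple(segment) for segment in segments]


# Illustrative timings only.
transcript = [
    Word("the", 0.0, 0.3),
    Word("um", 0.3, 0.6),
    Word("fish", 0.6, 1.0),
    Word("bite", 1.0, 1.4),
]

# Deleting "um" in the text leaves two audio ranges to join together.
print(keep_segments(transcript, deleted={1}))
```

An audio tool would then cut the source file at those ranges and join the pieces, which is also why such an editor can remove or reorder existing speech but cannot create new sentences.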

The most important result is that the files I used to test the other systems uploaded into the Spext system quickly and came back blazingly fast, with an accuracy of 96 per cent in my case. The system had even punctuated the transcription correctly and coped well with proper nouns like the names of fish species and fishing techniques. It too battled with Fakarava (expletive), but at least it recognised the word Atoll. It took no time at all to make the remaining corrections manually. What could have taken weeks in production will now easily get done in a matter of days. Artificial intelligence seems to have finally reached the point where transcription of audio by a machine works efficiently enough to be viable. As researchers and companies improve and refine their algorithms, it seems evident that transcriptions will become even more accurate, and the potential productivity savings of automated transcription will be hard to ignore. Someday soon, we might even be headed towards a world where audio files and text are thought of not as two distinct media types, but as two formats for the same content - as interchangeable and as convertible as an .mp3 and a .wav file, or a text file and a Word document.

The guys at Spext have used a unique combination of intuitive user experience design, to make it easy for the user, and advanced machine learning tha
LINK: http://www.screenafrica.com/2018/10/11/technology/ai-artificial-intell...