SysMedia CEO discusses YouTube captioning

While admirable in intent, the automatic captions on Google-owned video service YouTube should not yet be seen as a cheap substitute for human subtitlers, says SysMedia CEO Andrew Lambourne.

Google’s recent announcement of automatic captions (auto-caps) for YouTube caused a predictable flurry in the world of captioning as people wondered just how good its automatic speech recognition technology would prove to be.

“In Google’s own words, ‘The captions will not always be perfect.’ In practice, they vary from quite impressive to truly awful, and subtitlers understand only too well the reasons why,” Lambourne says.

Automatic speech recognition (ASR) technology has advanced significantly in the last 10 years, and auto-caps should be applauded for aptly demonstrating that point. However, the new service also highlights the challenges inherent in unleashing automated speech processing technology on real-world problems.

ASR has now reached the point where someone speaking clearly and fairly consistently can be recognized to an accuracy of perhaps 90 percent or more, provided that there’s no background noise or other speakers interrupting, nor any particularly unusual vocabulary. In reality, very few media clips are like that, and so useful captioning of professional content will need the human touch for many years to come.

“As subtitlers and their audiences know only too well, errors in subtitles can be very confusing for people who cannot hear or understand the original soundtrack,” Lambourne says. ASR, along with automated translation technology, certainly has its place in the captioning workflow, and SysMedia specializes in products that blend the productivity benefits of these technologies with the more adaptable skills of human subtitlers to achieve quality closed-captioning.

SysMedia is currently developing WinCAPS Quantum, a new subtitle production platform for the broadcast subtitling market, which combines the efficiency of the latest technologies, including automated speech transcription and machine translation, with fast and targeted manual editing.

“Automation technology is continually improving, and as it does so, SysMedia will continue to harness it to make subtitling more efficient. But I suspect that for a good while yet, human subtitlers will still be an essential part of the editorial process,” Lambourne says.