Verbit Launches Speaker Identification for Live ASR Broadcast Captions
New feature improves real-time caption clarity, enhancing accessibility and viewer experience for live news and sports broadcasts

AI voice transcription and captioning platform Verbit has added a new feature to its Captivate ASR solution: the ability to identify specific speakers in automated captioning, which the company is billing as an industry first.
The new feature dramatically improves the quality and clarity of automated captions during live broadcasts by identifying not just speaker changes but the speakers themselves. For the first time in automated broadcast captioning, this allows viewers to see specific speakers identified:
>> JONATHAN WILLIAMS: Let’s look at some of today’s top stories in the news.
>> NICOLE HAINES: Dangerous storms firing off in the southern plains and pushing east.
>> DAN THOMPSON: He pitched a strong game but looks like they’re calling on the bullpen.
Launching with Verbit’s media customers across news, weather and live sports, the speaker ID feature delivers a clearer understanding of fast, overlapping and multispeaker dialogue, as well as a more accessible experience for millions who rely on captions, the company said.
“Live ASR caption viewers deserve the same clarity and context that human captioning has long provided,” said Verbit general manager Doug Karlovits. “Our new speaker identification solution leverages the most advanced and innovative speaker models—far surpassing traditional ASR outputs—to achieve the highest accuracy for speaker IDs.”
Explaining how the feature works, Verbit said its professional Global Prep Team captures voice profiles, or “voice signatures,” from designated speakers, such as anchors, reporters or sportscasters, before a program goes to air. These signatures are labeled, added to Verbit’s trained acoustic and language models and activated during live broadcasts to tag speakers accurately and clearly in real time.
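As a rough illustration only, the enroll-then-match flow described above might be sketched as follows. Verbit has not published how its models work, so everything here is an assumption: the voice signatures are stand-in numeric vectors, matching uses plain cosine similarity with an arbitrary 0.75 threshold, and the names SpeakerRegistry, enroll, identify and caption_line are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SpeakerRegistry:
    """Hypothetical store of labeled voice signatures captured before air."""

    def __init__(self, threshold=0.75):
        # Assumed cutoff: below this similarity, the speaker stays unnamed.
        self.threshold = threshold
        self.signatures = {}

    def enroll(self, name, embedding):
        # Labels are stored in upper case to match the caption style shown above.
        self.signatures[name.upper()] = list(embedding)

    def identify(self, embedding):
        # Return the best-matching enrolled name, or None if nothing clears the threshold.
        best_name, best_score = None, self.threshold
        for name, signature in self.signatures.items():
            score = cosine(signature, embedding)
            if score >= best_score:
                best_name, best_score = name, score
        return best_name


def caption_line(registry, embedding, text):
    # Unknown voices fall back to a bare ">>" speaker-change marker (an assumption).
    name = registry.identify(embedding)
    prefix = f">> {name}: " if name else ">> "
    return prefix + text


if __name__ == "__main__":
    registry = SpeakerRegistry()
    # Made-up three-dimensional signatures; real voice embeddings are far larger.
    registry.enroll("Nicole Haines", [0.9, 0.1, 0.0])
    registry.enroll("Dan Thompson", [0.0, 0.2, 0.9])
    print(caption_line(registry, [0.8, 0.2, 0.1], "Dangerous storms firing off."))
    print(caption_line(registry, [0.1, 0.9, 0.1], "Thanks for having me."))
```

In this sketch, only speakers enrolled ahead of time can be named, which mirrors the company's description of choosing designated speakers before a program airs; a guest with no signature on file would simply receive the speaker-change marker.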
“We work with customers to determine which speakers they want to identify,” said Karlovits. “And as with all our services, we offer a range of customization options for speaker IDs and can tailor formatting and styles to specific customer requests and preferences.”
The new feature also improves analytics capabilities by enabling broadcasters to track and analyze who said what, information that can be very helpful in terms of compliance, editorial decisions and future AI-powered workflows, the company said.
Verbit’s speaker identification is the latest addition to its Captivate ASR solution, which was recently named to Fast Company’s “Next Big Things in Tech” list, and is designed to meet the specific needs of customers by providing accurate, cost-effective captions and transcripts.
George Winslow is the senior content producer for TV Tech. He has written about the television, media and technology industries for nearly 30 years for such publications as Broadcasting & Cable, Multichannel News and TV Tech. Over the years, he has edited a number of magazines, including Multichannel News International and World Screen, and moderated panels at such major industry events as NAB and MIP TV. He has published two books and dozens of encyclopedia articles on such subjects as the media, New York City history and economics.