SMPTE2018: AI Makes Strides In Media Operations

LOS ANGELES—Artificial Intelligence for media has progressed remarkably in one year’s time, not simply in R&D labs but also in the minds of creative and technical media professionals, at least if the SMPTE Annual Technical Conference and Exhibition is any measure.

Last year, AI and machine learning in media was the subject of the day-long SMPTE Symposium, a forum devoted to emerging technologies rather than the day-in-day-out realities of working in television and motion pictures. This year, however, AI was a major tech conference theme, a clear sign the technology is in daily use for media.


Jason Perr, media workflow specialist at Workflow Intelligence Nexus, summed up the state of AI in media applications from the get-go during his presentation Oct. 23, the first in the AI conference track this year.

Jason Perr

“When will AI be here?” Perr asked, rhetorically. “The answer is today.” The time is right to “filter out the noise of marketing speak” and find the right AI tools for specific media workflows. “If you’re not already using it [AI], in a lot of respects, you could be falling behind.”

Focusing on media asset management, Perr explained AI offers better accessibility and visibility across large libraries of stored digital content. “It can be used as a simple process to be able to extract metadata, whereas in the past people have known this to be something that can be very painful…,” he said.

AI is seen as having great potential for MAM applications because it can learn from an existing library and use that information to analyze new content added to the system.

What is required to take advantage of AI for asset management is an open MAM system with “a robust, well-documented, automatically updated API,” which will give users the ability to draw on an extensive lineup of AI tools for use across their sizable content libraries, he said.

AI tools for MAM applications can be built into the system itself, or they can be added from outside sources, said Perr. The former includes tools like facial recognition, automated metadata tagging and search across transcriptions.

However, these built-in AI tools are just the tip of the iceberg, he said. Other tools that can be added perform functions such as phonetic searching, keyword spotting, voice recognition, transcriptions, object recognition and audio fingerprinting to identify speakers.

Not only do these AI tools for MAM make content easier to search, they also unlock the value of older content by helping organizations remember what’s been stored that may have otherwise faded from memory.

Getting AI to perform for a specific organization comes down to finding the right integration and customization partner. That consultant should not only know AI but also understand the media enterprise’s workflow “inside and out,” he said.

Perr also advised his audience to test AI tools being considered in their specific organization’s environment before buying so they get the results they desire.


Norman Hollyn, an editor and professor at the University of Southern California, focused his presentation on how AI tools can assist editors in editing suites.

Norman Hollyn

One use is in the hyperlocalization of content, he said. Drawing on AI’s visual recognition capabilities, it is possible to identify and replace objects in a scene for a specific audience. To illustrate his point, Hollyn showed how AI tools from a new company called RYFF could identify one of several wine glasses on a tray and perform a live replacement of it with a bottle of wine.

“This increases the ability to localize our content to replace labels on bottles for different sponsors or an Arabic version of a Coke bottle instead of English or Spanish,” he said.

Now that AI visual recognition is available at the high end, it won’t be long till the technology becomes available to those with much lower budgets and begins to be used on independent films and low-budget web series, he added.

Other AI tools are assisting editors today and will evolve to become more helpful in the future. Hollyn pointed to AI script analysis, which today is used for shot and scene identification as well as identifying what characters exist in a film. It’s becoming possible to task AI with finding sentiment, gender and many other scene attributes, he said.

“This is huge for me in the final side of things, and it is even more important when we talk about the localization and distribution [of content],” said Hollyn.

There is even an aspect of AI script analysis that can help producers predict the box office value of a movie based on elements in the story, he added.

Hollyn also showed an example of work being done by Stanford University and Adobe on AI in editing. Using idiom-based editing, users can combine different editing idioms to define different editing styles.

A video clip showed how it is possible to use a basic idiom, such as the presence in an edit of performers while they are speaking, to sort clips. The system can then generate an edit based on that idiom, as well as others if desired, across an entire scene. However, the tool, which is in beta today, lacks any recognition of emotion. “It doesn’t really understand performance, at all, at this point,” said Hollyn.
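The idea of combining idioms can be illustrated with a minimal Python sketch. It is not the Stanford and Adobe tool, whose internals were not described in the presentation; the data structures, the names `speaker_visible`, `single_performer` and `rough_cut`, and the sample clips are all hypothetical. Each idiom is treated as a simple yes-or-no test on a clip, and a clip is chosen for a line of dialogue only if it passes every idiom in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One line of dialogue with its speaker and timing (hypothetical format)."""
    speaker: str
    start: float
    end: float


@dataclass(frozen=True)
class Clip:
    """One take or camera angle, tagged with the performers visible in it."""
    name: str
    visible: frozenset


def speaker_visible(clip, line):
    # Idiom: the performer should be on screen while speaking.
    return line.speaker in clip.visible


def single_performer(clip, line):
    # Idiom: prefer a shot that shows only one performer.
    return len(clip.visible) == 1


def rough_cut(lines, clips, idioms):
    """Pick, for each line, the first clip that satisfies every idiom."""
    cut = []
    for line in lines:
        matches = [c for c in clips if all(idiom(c, line) for idiom in idioms)]
        # Assumption: fall back to the first clip when no clip satisfies all idioms.
        chosen = matches[0] if matches else clips[0]
        cut.append((line.start, line.end, chosen.name))
    return cut


# Hypothetical sample scene with two performers and three camera setups.
lines = [Line("ANNA", 0.0, 2.0), Line("BEN", 2.0, 5.0)]
clips = [
    Clip("wide", frozenset({"ANNA", "BEN"})),
    Clip("cu_anna", frozenset({"ANNA"})),
    Clip("cu_ben", frozenset({"BEN"})),
]

for start, end, name in rough_cut(lines, clips, [speaker_visible, single_performer]):
    print(f"{start:4.1f}-{end:4.1f}  {name}")
```

Passing only `speaker_visible` selects the wide shot for both lines, while adding `single_performer` switches the cut to the close-ups, which is the sense in which different combinations of idioms define different editing styles. Nothing in this sketch models emotion or performance, the gap Hollyn noted.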

Once an understanding of emotion is added, though, AI could one day be used to streamline the editing process by producing a rough cut that can then be tweaked and reworked as needed by a human editor. “That’s huge for me,” he said.


Lior Berezinski, director, presales engineer at Prime Focus, devoted his presentation to AI in broadcast operations.

Lior Berezinski

Berezinski discussed the use of AI in a variety of media operations, such as promo and highlight production, localization operations, compliance mastering and VOD operations.

Focusing on metadata, Berezinski pointed out that today AI can be used for content discovery by relying on tools, such as content analysis, to identify objects, faces, brands and profanity. However, Prime Focus is adding its own technology, called “Wisdom,” on top of AI, he said.

“With Wisdom, you can do more meaningful things for the media and entertainment industry,” said Berezinski. For example, Wisdom can detect contextual thumbnails from a piece of content as well as discover clips, meaning not just frames but also scenes, color bars, blacks, slates and credits.

From a workflow point of view, AI with Wisdom can discover clips for a story brief, build a story timeline and share exported clips.

To help localize content appropriately, Wisdom-assisted subtitle production supports speech-to-text and language translation, live transcription and the creation of timecode-driven subtitling to accommodate different reading speeds.
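How reading speed feeds into subtitle timing can be shown with a short Python sketch. It illustrates the general arithmetic only and does not describe how Wisdom works; the function names, the 25 fps non-drop-frame timecode, the one-second minimum and the characters-per-second figures are all assumptions chosen for the example.

```python
def format_timecode(seconds, fps=25):
    """Convert seconds to HH:MM:SS:FF non-drop-frame timecode (fps is assumed)."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


def subtitle_out_time(start, text, chars_per_second, min_duration=1.0):
    """Return the out time in seconds for a subtitle shown from `start`.

    The display time is the character count divided by the target reading
    speed, with a floor of `min_duration` so short lines do not flash by.
    """
    duration = max(min_duration, len(text) / chars_per_second)
    return start + duration


# Hypothetical subtitle and two hypothetical reading speeds.
text = "Welcome to the conference."
for label, cps in (("slower readers", 10), ("faster readers", 13)):
    out = subtitle_out_time(10.0, text, cps)
    print(label, format_timecode(10.0), "->", format_timecode(out))
```

A slower reading speed keeps the same line on screen longer, so the out timecode moves later while the in timecode stays tied to the speech.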

For these applications, as well as compliance mastering and VOD operations, the goal is that one day the people involved with media production and distribution will be focused on quality control and approval, he said.

The 2018 SMPTE conference continues through Thursday, Oct. 25.

Phil Kurz

Phil Kurz is a contributing editor to TV Tech. He has written about TV and video technology for more than 30 years and served as editor of three leading industry magazines. He earned a Bachelor of Journalism and a Master’s Degree in Journalism from the University of Missouri-Columbia School of Journalism.