SMPTE 2017: Q&A—Michelle Munson, Yvonne Thomas

Shortly before the start of the SMPTE 2017 Technical Conference & Exhibition, TV Technology spoke with Michelle Munson, co-inventor of the Aspera FASP transport technology and CEO of Aspera until May 2017, and Yvonne Thomas, product manager of New Media for arvato Systems, about the SMPTE 2017 Symposium.

TV TECHNOLOGY: You co-chaired the SMPTE 2017 Symposium, and the topic is artificial intelligence and machine learning—relatively new concepts for the M&E industry. What are the top two or three areas where you expect to see AI and machine learning helping those in the industry? Why?

MICHELLE MUNSON: Currently, the most impactful area of innovation in AI and ML for the M&E industry is perhaps automatic video, image and audio indexing, which allows features of content to be recognized and tagged at scale, and in real time (live). This broad area has a host of applications that can dramatically help automate and scale the industry in the short/medium term:

- Automation of compliance and quality control in multi-territory distribution

- Automatic (live) editing and creation of screeners/previews and sub-clips

- “Tagging”/metadata enrichment of key elements like particular brands/faces/activities/scenes for the purposes of advertising placement or personalized recommendation; large-scale search for viewers; and content-archiving or repurposing

- Automatic scene change and keyframe detection for advanced compression techniques

- Automatic summarization of video content

- Speech-to-text translation for automating many heavy-lifting tasks, such as live captioning and live language translation
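
As an illustration of the scene-change item in the list above, here is a minimal sketch of one common approach: comparing intensity histograms of consecutive frames and flagging a cut when they differ sharply. This is not a method described in the interview. The function names, the 8-bin histogram, the 0.5 threshold and the toy four-pixel "frames" are all assumptions chosen for readability; a production system would work on decoded video and tune these values.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_scene_changes(frames, threshold=0.5):
    """Return indices of frames whose histogram differs sharply from the previous frame."""
    if not frames:
        return []
    cuts = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # Half the L1 distance between two normalized histograms lies in [0, 1].
        distance = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:  # threshold is an assumed, untuned value
            cuts.append(index)
        previous = current
    return cuts


# Hypothetical toy data: each "frame" is just four 8-bit pixel intensities.
dark = [10, 20, 15, 12]
bright = [240, 250, 245, 230]
frames = [dark, dark, bright, bright, dark]
print(detect_scene_changes(frames))  # → [2, 4]
```

The flagged indices are natural candidates for keyframes, which is where the link to compression comes in.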

To the benefit of the consumer, collaborative filtering techniques are widely used to provide personalized content recommendations based on a user’s past preferences or the preferences of similar users. Deep learning neural networks are now beginning to expand these techniques to allow for dynamic content composition based on user sentiment, creating customized “choose your own story” experiences.
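
To make the collaborative filtering idea concrete, the following is a minimal sketch of the user-based variant: titles a viewer has not seen are scored by the ratings of other viewers, weighted by how similar those viewers' tastes are. The viewer names, titles, ratings and the choice of cosine similarity are all illustrative assumptions, not details from the interview.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two {title: rating} dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=2):
    """Rank unseen titles by similarity-weighted ratings from other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue  # ignore users with no overlapping taste
        for title, rating in other_ratings.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + similarity * rating
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


# Hypothetical viewers and ratings on a 1-5 scale.
ratings = {
    "ana": {"Drama A": 5, "Thriller B": 4},
    "ben": {"Drama A": 5, "Thriller B": 5, "Drama C": 4},
    "cy": {"Comedy D": 5, "Comedy E": 4},
}
print(recommend(ratings, "ana"))  # → ['Drama C']
```

In this toy data, "ana" and "ben" rate the same titles highly, so the one title only "ben" has seen is recommended, while "cy" shares nothing with "ana" and contributes no suggestions.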

Additional major areas of applications are in enhancing content security and detecting piracy, e.g.:

- Anomaly detection in content security and access control systems allows for rapid discovery of potentially deviant behavior in access to critical content systems, based on identification of unusual events or usage patterns, and could flag breach attempts ahead of penetration.

- Deep learning techniques can be applied for highly accurate identification of pirated copies of master content that are low-res or otherwise degraded.
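
For the anomaly detection item above, a minimal sketch of the simplest statistical version might look like the following: access counts are compared against a historical baseline, and any count that lies more than a few standard deviations from the baseline mean is flagged. The baseline figures, the day labels and the threshold of 3 are assumptions for illustration only; real systems would typically use richer features and learned models.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, z_threshold=3.0):
    """Flag labels whose count deviates strongly from the baseline.

    baseline: list of historical counts (at least two values).
    observed: {label: count} for the period being checked.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    flagged = []
    for label, count in observed.items():
        if sigma == 0:
            # A perfectly constant baseline: any deviation is unusual.
            is_anomaly = count != mu
        else:
            is_anomaly = abs(count - mu) / sigma > z_threshold
        if is_anomaly:
            flagged.append(label)
    return flagged


# Hypothetical daily access counts to a content vault.
baseline = [12, 15, 11, 14, 13, 12, 16, 14]
observed = {"mon": 13, "tue": 15, "wed": 95, "thu": 12}
print(flag_anomalies(baseline, observed))  # → ['wed']
```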

Advanced ML optimization techniques also have potential to dramatically optimize resource selection in content storage and distribution, such as compute, bandwidth, storage and internet routing to maximize user streaming and download experience and to reduce costs of content storage, exchange and distribution (see question 3).

YVONNE THOMAS: First of all, AI is a general term, an umbrella for several kinds of technologies. Machine learning is part of AI, and analytics, neural networks, cognitive computing, algorithms, deep learning, big data and linked data are part of the family too.

Metadata is essential to the M&E industry, and analytics services help to enrich metadata automatically and reliably, especially with the huge amounts of data that we humans are very inefficient at processing. However, it's essential that machine learning is part of those systems and that it evolves continuously.

Second, AI can be useful for predictions. Let me describe two examples here:

- A manufacturer (Thyssen Krupp) uses AI to predict when its elevators will break down, so it can repair them before they go out of service and avoid lengthy, costly and disappointing downtime. The same use case can be relevant to M&E technology, especially where systems run 24/7.

- Another example seems pretty impressive. At the 2017 NAB Show, Banjo demonstrated their system, which takes in all kinds of data sources (even police radio) and uses this data for predictions. According to them, they were the first in the world to report on the Paris attack. Especially in news, being first is extremely important.

TVT: Should SMPTE members be concerned they will be losing their jobs to AI and machine learning or happy that these technologies will make their work easier and them more productive?

MUNSON: I don’t believe the AI/ML era is significantly different in this respect from other big leaps in technological innovation (the risks, I think, are more about the ethical ramifications; see question 4). A few points:

- Automation and AI/ML can replace the “undifferentiated heavy lifting” (quoting our speaker from AWS, Konstantin Williams) in areas such as object and scene detection, facial analysis, facial recognition, face comparison, celebrity identification and image moderation.

- The sheer inventive possibility AI opens throughout the media creation and supply chain drives huge new revenue opportunities and consumer experiences.

- Success relies on new multidisciplinary expertise spanning DevOps, software, data science and media, generating new job opportunities of higher value.

- And, research has shown the most accurate models combine human analysis with machine-learned recommendations—meaning that what we are creating can largely improve the work of the human media professional rather than replace it.

THOMAS: No, I don't think they need to be afraid of AI. As I said in answer to question 1, we humans are very inefficient at processing huge amounts of data, and AI can do this for us in a faster, more reliable and more efficient way. It won't take our jobs, but we will concentrate on what we are good at, and we might see a shift in tasks. For example, we will need new skill sets to train those systems (machine learning).

TVT: How do AI and machine learning play into more efficient resource use in media? I was thinking of tasks like closed captioning and subtitle creation, managing network traffic and storage for CDNs and also helping media companies deal with new requirements, such as delivering personalized content or channels to viewers based on their individual interests, preferences, habits, etc.? But maybe I am way off base here.

MUNSON: Many broadcasters’ media management and distribution systems still rely on brute force techniques to meet the demands of scale, which are no longer practical today—because they cost too much, fail to maintain uniform high quality of service in live and dynamic events or simply aren’t built for this era of direct-to-consumer, personalized media. Consequently, they are suffering in the face of extreme competition for OTT audiences dominated by the large Internet infrastructure companies. In content storage and distribution for example, the direct-to-consumer services today are relying largely on cloud storage and CDN architectures that use the same brute force edge storage and caching models introduced 10-20 years ago, which are often expensive and even at top dollar don't always provide deterministic, high qualities of service across all users or for live events. New machine-learning-based techniques can help make possible highly efficient storage and routing solutions that can reduce the costs dramatically, optimize high bandwidth distribution and adapt in real time.

THOMAS: It's analytics services that are relevant for speech-to-text, face recognition, scene and concept recognition, network traffic management, etc. This metadata enrichment is extremely important for M&E to work as efficiently as possible. With regard to recommendation engines, I believe that what we currently use (no AI) is already pretty good, but it is purely based on past data: because I watched movie X, I might be interested in movie Y. Analytics for recommendation engines will become more interesting for predicting the next movie a customer will select.

TVT: What do you see as the top ethical concerns raised by the adoption of AI and machine learning in media?

MUNSON: Alan Turing, often credited as the father of modern computing, famously made the point that because it is not possible to measure the intelligence of a computing program or machine, saying that computers can be made “intelligent” is nonsense. Another way of putting this is that much of today’s deep learning “learns” in a way that imitates human behavior, but still heuristically, meaning that no one can explain exactly how these programs do what they do. In turn, many experts point out that the most advanced learning algorithms can be made neither fail-safe nor immune to manipulation by bad actors.

In media and entertainment the consequences of this are obviously not as grave as a self-driving car that “decides” to collide head-on with another car, but they are still concerning. For example, the very neural networks that drive high-precision face and image recognition or live video editing could be manipulated by bad actors (or simply have software bugs) in ways that cause the display of grotesque or otherwise offensive content, taking advantage of viewers, advertisers or content owners. Additionally, today’s vast GPU/TPU compute power and scalable AI models could be used to manipulate news and create propaganda at scale that is undetectable or moves too fast for human detection; consider “fake news” at scale.

And finally, as an engineer I worry about the “commoditization” of AI tools such as ML programs that themselves create learning programs. These are already available to potentially unskilled users who may over-estimate their capacity or misuse their training accuracy, and could be a source of abuse—particularly when they are used to make decisions that can have clear bias or privacy consequences. Examples in media could be video surveillance and person matching that wrongly targets individuals based on profile, and security exploits that dwarf today’s worst exploits of IT negligence/ignorance.

THOMAS: I believe we must define rules and clear goals for what we would like to achieve with AI and where to use it. Think about the internet: the technology came into our lives pretty fast without any discussion about rules, and it has changed our lives drastically (positively and negatively). As always, technology can be misused, but let's make something positive out of it.