LOS ANGELES—At first glance, swinging robots and ice cream sales may seem like odd topics for a media and entertainment technology symposium.
But that’s exactly what speakers had on their minds Monday morning (Oct. 23) at this year’s kickoff technology symposium, held in Hollywood, Calif., ahead of the official start of the SMPTE 2017 Annual Technical Conference and Exhibition.
The topic of the SMPTE 2017 Symposium was machine learning (ML) and artificial intelligence (AI) in media. But to get the gathering where it needed to go, the morning speakers—Richard Welsh, co-founder and CEO of Sundog Media Toolkit; Jeff Kember, technical director of Media in the CTO Office at Google Cloud; and Jay Yogeshwar, director of media and entertainment at Hitachi Vantara (previously Hitachi Data Systems)—first set the stage by bringing attendees up to speed on an area of technology likely unfamiliar to many.
Welsh, who also serves as SMPTE Education Vice President, laid out the basics during his session “Understanding AI, Analytics and Machine Learning.”
“AI is clearly a bit of a buzzword,” he said. Calling it a marketing term, Welsh explained that the description is useful for the general public but doesn’t really say what it is. More useful is dividing the domain into three areas: machine learning, which relies on algorithms to train a machine to do a task; deep learning, which is similar but regenerative, meaning it learns as it does a task; and general AI, a free-learning artificial intelligence that may actually be conscious and operates without constraints, he said, adding that the latter does not exist.
Driving ML are algorithms, which fall into two basic categories: classification, which aims to tell the difference between things, such as a horse and a dog; and regression, which is basically a decision tree, a set of trees or an entire forest processing a data set, he explained.
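The two families Welsh named can be sketched in a few lines. This is a toy illustration, not material from the talk: a single-split classifier for the horse-versus-dog example and a regression stump for a numeric prediction, with all thresholds and values chosen purely for demonstration.

```python
def classify_animal(height_cm, weight_kg):
    """Toy classification: one decision-tree split telling a horse from a dog."""
    # Hypothetical thresholds, for illustration only.
    if height_cm > 100 and weight_kg > 80:
        return "horse"
    return "dog"

def predict_sales(temperature_c):
    """Toy regression stump: predict a number (ice cream sales) from an input."""
    # Hypothetical split points and leaf values.
    if temperature_c < 15:
        return 20    # cool day: few cones sold
    if temperature_c < 25:
        return 60    # mild day
    return 120       # hot day

print(classify_animal(160, 500))  # horse
print(predict_sales(30))          # 120
```

The distinction is in the output: the classifier returns a category, the regressor a number. Real decision trees learn their split points from data rather than having them hand-coded.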
Welsh illustrated the concept with the example of an ice cream vendor wishing to correlate sales with the ambient temperature. Other data points could be added to the model, such as time of day and location, as well as economic and demographic information about customers.
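One way to make the ice cream example concrete is an ordinary least-squares fit of sales against temperature. The data points below are invented for illustration; this is a minimal sketch of the kind of model Welsh described, not his actual figures.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var            # slope: extra cones sold per extra degree
    b = mean_y - a * mean_x  # intercept
    return a, b

# Hypothetical observations: (temperature in C, cones sold)
temps = [12, 16, 20, 24, 28, 32]
sales = [25, 40, 55, 70, 85, 100]

a, b = fit_line(temps, sales)
print(a, b)  # 3.75 -20.0
```

Extending the model with time of day, location or demographics means adding more input columns, which is where tree-based methods and neural networks take over from a simple line fit.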
“You might think this is really going to be complicated,” said Welsh. “But actually, once you have trained the network, it’s not that difficult to traverse the network.” The tools can be used to make quick decisions, he said. For example, a forest of 90 trees with 80,000 nodes per tree, traversed 4,182 times, takes about 0.02 seconds on a normal machine.
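A back-of-envelope calculation shows why figures like these are plausible. Assuming roughly balanced trees (an assumption, not something stated in the talk), a query touches only one root-to-leaf path per tree, so the work grows with the logarithm of the node count, not the node count itself:

```python
import math

# Welsh's figures: 80,000-node trees, 90 trees, 4,182 traversals.
nodes_per_tree = 80_000
trees = 90
queries = 4_182

# A balanced binary tree with N nodes is about log2(N) levels deep,
# so each query does roughly that many comparisons per tree.
depth = math.ceil(math.log2(nodes_per_tree))  # 17
comparisons = depth * trees * queries
print(depth, comparisons)  # 17 6398460
```

Roughly 6.4 million comparisons is trivial work for a modern CPU, consistent with the 0.02-second figure quoted.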
GETTING THE HANG OF SWINGING
With the AI and ML fundamentals laid out, Hitachi Vantara’s Yogeshwar discussed where their impact is being felt in media workflows. During his presentation “How Digital Transformation Is Changing Media & Entertainment With Machine Learning, Content & Data Intelligence and Heterogeneous Integrated Workflows,” he identified four areas: video compression, nonlinear editing and MAMs, OTT ad insertion and OTT e-commerce.
To illustrate how machine learning will advance media workflows, Yogeshwar showed a video of a Hitachi robot learning to swing, driven by machine learning algorithms running on a laptop. The significance of the example is that the robot develops its own hypotheses about how best to swing. The video showed the robot quickly getting the hang of it; within a minute it had become much better, and 20 minutes into the experiment it was swinging better than a human being, he said.
Running the same algorithms on a different robot suspended from a bar, the robot learns to swing and shortly swings “better than a gymnast,” said Yogeshwar. In video compression, similar machine learning will be able to generate its own hypotheses about how best to compress a clip. “I firmly believe that it is such a big optimization problem and it involves human perception that the next breakthrough is going to come from applying machine learning algorithms to compression,” he said.
The morning also featured a keynote from Google’s Kember, who focused his talk on the problems ML can solve today. Google is taking a measured approach to machine learning, trying to identify what application needs exist now, which problems can already be solved and where the technology could go in the future, he said. Kember charted advancements in machine learning, pointing to AlphaGo, the Google DeepMind system that learned to play the game Go at a championship level and was subsequently outdone by AlphaGo Zero, as well as to the shift in language translation from a phrase-based approach to a neural network method.
In terms of what is possible today for media, Kember said Google has APIs for images, motion, movies, text and audio. The company also has the ML algorithms to allow users to build their own apps, and Tensor Processing Units to power services. In media, Google is at work finding ways to insert machine learning into the entire production cycle: from camera raw to debayering, through transcoding and dailies, to VFX processes like rendering and simulation, and on to final color correction, distribution and archiving.
Doing so will enable a decentralization of the editorial process, which will allow shooting to take place in one location with the directorial staff and all the others involved in the workflow located elsewhere, he said. “What we are doing is taking the entire production pipeline, be it for episodic television, feature film or feature animation, and being able to upload it and make it available globally,” he said.