SMPTE 2019: AI Powers Advancements in Video Compression

Jean-Louis Diascorn described how AI is improving compression at the 2019 SMPTE Tech Conference.

LOS ANGELES—In the 20-plus years since video compression became available, improvements in compression efficiency, CPU utilization and quality of experience have been numerous and steady, but they have required a lot of effort.

However, a new approach to improving compression algorithms that leverages artificial intelligence has emerged and is already producing benefits for terrestrial broadcasters, OTT service providers, IPTV services and satellite and cable TV operators, said Jean-Louis Diascorn, senior product manager for encoders at Harmonic.

Jean-Louis Diascorn

Speaking Oct. 21 on the first day of the 2019 Annual SMPTE Technical Conference & Exhibition at the Westin Bonaventure Hotel in Los Angeles, Diascorn discussed how Harmonic is exploiting AI to achieve these improvements.

During his presentation, “How AI Technology is Drastically Improving Video Compression for Broadcast and OTT Content Delivery,” Diascorn described a two-step process that uses AI to optimize the encoding style used for compression and to tune encoding for frame rate and resolution.

The first step, which can take hours or even days, is an offline learning process, he said. Many test panels are fed into an AI system at this stage to produce a prediction model, which is downloaded into a live encoding system.

“The second step is running the live system… [which] will use the prediction model to produce a better compression for viewing,” said Diascorn.
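The two steps can be pictured with a small Python sketch. It is not Harmonic's method: the nearest-centroid model, the two features (motion and detail scores) and the style labels are all assumptions chosen for illustration, and the "download" into the live system is modeled as a simple JSON round trip.

```python
import json

# Hypothetical training samples: (motion score, detail score) -> best encoding style.
# The feature names and style labels are invented for illustration only.
TRAINING_SAMPLES = [
    ((0.9, 0.3), "sports"),
    ((0.8, 0.4), "sports"),
    ((0.1, 0.8), "talking_head"),
    ((0.2, 0.9), "talking_head"),
    ((0.5, 0.5), "general"),
    ((0.4, 0.6), "general"),
]


def train_offline(samples):
    """Step 1 (offline): average the features seen for each style into a centroid."""
    sums = {}
    for features, style in samples:
        total, count = sums.get(style, ([0.0] * len(features), 0))
        total = [t + f for t, f in zip(total, features)]
        sums[style] = (total, count + 1)
    return {style: [t / count for t in total] for style, (total, count) in sums.items()}


def predict_live(model, features):
    """Step 2 (live): pick the style whose centroid is closest to the analyzed features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda style: squared_distance(model[style]))


# The "download" into the live system is modeled as a JSON round trip.
serialized_model = json.dumps(train_offline(TRAINING_SAMPLES))
live_model = json.loads(serialized_model)
```

In this toy setup, `predict_live(live_model, (0.85, 0.35))` picks the hypothetical "sports" style. The slow work happens once in `train_offline`, while the live call is only a cheap lookup, which mirrors the split Diascorn described.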

The Harmonic encoder expert laid out three predictive models: Dynamic Encoding Style, Dynamic Resolution Encoding and Dynamic Frame Rate Encoding.

Dynamic Encoding Style aims to reduce bitrate and maintain quality. Using the two-step process, test files are loaded into the AI system to develop these encoding styles. “Encoding styles are in fact encoding configurations,” he said. After a lengthy time running AI algorithms on the files, the system produces an encoding style prediction model that is downloaded into the live system. While running on the live system, video analysis provides information to the prediction model, which in turn modifies the encoding core, he explained.
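Since encoding styles are described as encoding configurations, one way to picture a style is as a set of overrides applied to a base encoder configuration. The Python sketch below assumes that reading. The parameter names (`gop_length`, `b_frames`, `deblock_strength`), their values and the style labels are hypothetical and do not come from any real encoder.

```python
# Hypothetical base encoder settings; names and values are illustrative only.
BASE_CONFIG = {"gop_length": 60, "b_frames": 3, "deblock_strength": 0}

# Hypothetical per-style overrides, standing in for what a prediction model might select.
STYLE_OVERRIDES = {
    "sports": {"gop_length": 30, "b_frames": 2},
    "talking_head": {"gop_length": 120, "deblock_strength": 1},
}


def config_for_style(style):
    """Return a new config: base settings with the predicted style's overrides applied."""
    # Unknown styles fall back to the base configuration unchanged.
    return {**BASE_CONFIG, **STYLE_OVERRIDES.get(style, {})}
```

Building a fresh dictionary on every call keeps the base configuration untouched, so the live system could switch styles from one segment to the next without carrying over earlier overrides.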

A Tier One satellite provider has deployed Dynamic Encoding Style-optimized compression and has achieved about a 20% bitrate reduction, said Diascorn.

Dynamic Resolution Encoding relies on the same two-step process, but in this instance, AI leverages the relationship between movement and resolution in developing a predictive model.

In low movement video, the human eye sees detail; however, in high movement video, such as in sports, the eye is unable to recognize the same level of detail, he explained. “So, it can be useful to have the appropriate resolution,” said Diascorn. “This is what Dynamic Resolution Encoding does, selecting with AI the best possible resolution for [a] given video.”
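The relationship between movement and resolution can be sketched in Python as follows. This is a fixed-threshold stand-in, not the AI model described in the talk: the resolution ladder, the thresholds and the normalized motion score are assumptions made for illustration.

```python
# Hypothetical resolution ladder, highest first: (width, height).
RESOLUTION_LADDER = [(1920, 1080), (1280, 720), (960, 540)]

# Hypothetical motion thresholds standing in for a learned prediction model.
# A motion score is assumed to be normalized from 0.0 (static) to 1.0 (very fast).
MOTION_THRESHOLDS = [0.4, 0.7]


def select_resolution(motion_score):
    """Return a lower resolution as motion rises, since fine detail is harder to see."""
    if not 0.0 <= motion_score <= 1.0:
        raise ValueError("motion_score must be between 0.0 and 1.0")
    # Count how many thresholds the score meets or exceeds; that count is the ladder index.
    step = sum(motion_score >= threshold for threshold in MOTION_THRESHOLDS)
    return RESOLUTION_LADDER[step]
```

With these assumed thresholds, a near-static scene stays at 1920x1080 while a fast sports sequence drops to 960x540. A learned model would replace the fixed thresholds with a prediction drawn from training data.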

Dynamic Resolution Encoding is primarily intended for use by OTT services, and the benefits achieved are primarily in CPU savings, he said. “For complex content, we found a 50% CPU savings, and for simple content we found 42% CPU savings,” said Diascorn.

Dynamic Frame Rate Encoding is “a little like the previous one, but the other way around,” explained Diascorn. In video segments with a lot of movement, it is desirable to have a high frame rate to avoid any jitter in the picture. Conversely, in video with little movement, “why spend all the bits?” he asked.

Once again, the two-step method was employed to develop Dynamic Frame Rate Encoding. This time the process was tuned to create a prediction model based on the best frame rate for the amount of movement in video.
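A rough Python sketch of the idea, with a single fixed threshold standing in for the learned model, might look like this. The frame rates, the threshold and the motion scores are assumed values, not figures from the presentation.

```python
FULL_FRAME_RATE = 60      # assumed source frame rate, in frames per second
REDUCED_FRAME_RATE = 30   # assumed reduced rate for low-motion segments
MOTION_THRESHOLD = 0.5    # hypothetical cutoff on a 0.0 to 1.0 motion score


def select_frame_rate(motion_score):
    """Keep the full frame rate for high motion; halve it when little is moving."""
    return FULL_FRAME_RATE if motion_score >= MOTION_THRESHOLD else REDUCED_FRAME_RATE


def frames_saved_fraction(segments):
    """Return the fraction of source frames not encoded.

    segments is a list of (duration_seconds, motion_score) pairs.
    """
    source_frames = sum(duration * FULL_FRAME_RATE for duration, _ in segments)
    if source_frames == 0:
        raise ValueError("segments must cover a non-zero duration")
    encoded_frames = sum(
        duration * select_frame_rate(motion) for duration, motion in segments
    )
    return 1 - encoded_frames / source_frames
```

For example, ten seconds of fast action followed by ten seconds of a static shot, `[(10, 0.9), (10, 0.1)]`, encodes 900 of 1,200 source frames under these assumptions, a saving of 25%.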

Appropriate for satellite, IPTV, OTT, cable and terrestrial broadcasting, Dynamic Frame Rate Encoding saves on average about 30% of frames, which translates into about a 30% savings on CPU usage. “In terms of bitrate savings, we are about 10% for AVC and 5% [for] HEVC,” said Diascorn.
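To put the quoted bitrate percentages in concrete terms, the short Python sketch below applies them to starting bitrates. The starting bitrates are hypothetical examples, not figures from the talk.

```python
# Approximate bitrate savings quoted for Dynamic Frame Rate Encoding, in percent.
BITRATE_SAVINGS_PERCENT = {"AVC": 10, "HEVC": 5}


def bitrate_after_savings(codec, original_kbps):
    """Return the bitrate in kbps after applying the quoted savings for a codec."""
    percent = BITRATE_SAVINGS_PERCENT[codec]
    return original_kbps * (100 - percent) / 100
```

Under these assumptions, a hypothetical 8,000 kbps AVC stream would drop to 7,200 kbps, and a hypothetical 5,000 kbps HEVC stream to 4,750 kbps.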

Diascorn noted that it is possible to combine the AI-optimized predictive encoding models, depending on the application.

“Dynamic Encoding Style and Dynamic Frame Rate Encoding both provide bitrate savings—significant [savings] for Dynamic Encoding Style and [a] better quality of experience is brought by the Dynamic Resolution Encoding,” he said.

“CPU savings is brought by DRE [Dynamic Resolution Encoding] and DFE [Dynamic Frame Rate Encoding], and in terms of interoperability the Dynamic Encoding Style and the Dynamic Frame Rate are for all applications, whereas Dynamic Resolution Encoding is mainly for HD OTT,” he added.

The implementation of AI to enhance the performance of compression technology is in its early stages, he noted. “We have an exciting future ahead of us because we are just starting,” he said.