SMPTE 2014: Doing HDR With HEVC

HOLLYWOOD—Higher resolutions were the initial impetus for High Efficiency Video Coding, or HEVC. Since then, attention has turned to improving pixels rather than just using more of them. So Raul Diaz, CEO of Vanguard Video, set out to explore the integration of HEVC with higher dynamic ranges of luminance.

Luminance dynamic range is about 15 orders of magnitude, from zero photons to the sun, and it takes about 40 bits to represent, Diaz said. The human eye can see about 12 bits of that, though different studies reach different conclusions.
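As a back-of-the-envelope check (the arithmetic below is this writer's illustration, not Diaz's): a linear code spends about log2(10), or roughly 3.32, bits per order of magnitude, which is how figures in the 40 to 50 bit range fall out of a 12 to 15 order span.

    import math

    def linear_bits(orders_of_magnitude):
        """Bits needed to cover a luminance range with a linear code,
        one code value per step at the darkest end."""
        return orders_of_magnitude * math.log2(10)

    print(round(linear_bits(15), 1))  # 49.8 bits for the full 15 orders
    print(round(linear_bits(12), 1))  # 39.9 bits, roughly the 40-bit figure cited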

“Technology used in the past had a range of around 6 to 7 bits of data in terms of the lightness-darkness range,” he said. “SDR digital video is around 7 to 8 bits.” Workflows developed around those figures, he said. Cameras today, however, can capture 14 to 15 bits, and even mid-range cameras can do 12 to 13 bits.

“Display technology also is catching up, in the neighborhood of 9 to 10 bits, and should be able to match human eye dynamic range with equipment we have today. So we need a workflow that can support this.”

HEVC has native support for HDR through its HDR extensions, Diaz said. The catch is that existing gear can neither decode those extensions nor interpret HDR display samples, though SMPTE ST 2084 and ITU-R BT.2020 address signaling for new displays.
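ST 2084 defines the “PQ” perceptual quantizer, which fits absolute luminance up to 10,000 nits into 10 to 12 bits by spending code values where the eye is most sensitive. A minimal sketch of its forward curve, using the constants from the published standard:

    def pq_encode(nits):
        """SMPTE ST 2084 inverse EOTF: absolute luminance in cd/m2
        to a normalized 0..1 PQ signal value."""
        m1 = 2610 / 16384        # 0.1593017578125
        m2 = 2523 / 4096 * 128   # 78.84375
        c1 = 3424 / 4096         # 0.8359375
        c2 = 2413 / 4096 * 32    # 18.8515625
        c3 = 2392 / 4096 * 32    # 18.6875
        y = max(nits, 0.0) / 10000.0   # normalize to the 10,000-nit ceiling
        yp = y ** m1
        return ((c1 + c2 * yp) / (1 + c3 * yp)) ** m2

    print(round(pq_encode(100), 3))   # 100-nit SDR white lands near 0.508 of full scale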

“We can create a backward-compatible HDR approach,” he said.

Diaz described an approach in which SDR and HDR were separated into two layers, with the SDR stream as the base layer and the HDR data serving as the enhancement layer, or EL.

“Managing the enhancement layer has to be invisible to legacy gear,” he said. “The EL may be selectively transmitted to save bandwidth.”

Dual-layer HDR-compatible equipment recombines the layers into HDR data. The recombined HDR samples are sent to gear that can display them, while equipment that cannot simply never sees the enhancement layer.

Dolby Vision has a dual-layer HDR implementation, with HDR data pre-processing, separation into a base layer (BL) and enhancement layer, and metadata signaling, he said.

Equipment that can see the EL will demux it and decode it independently. An EL decoder needs access to the reconstructed BL data: the EL carries the original 12-bit HDR input minus the reconstructed BL.
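In code terms, that subtraction looks roughly like the sketch below (an additive-residual simplification with stand-in encode and decode functions; Dolby Vision's actual BL-to-HDR prediction is metadata-driven and more elaborate):

    import numpy as np

    def encode_bl(hdr_frame):
        """Stand-in for the HEVC BL encode (tone-mapped SDR in practice)."""
        return hdr_frame

    def decode_bl(bl_bitstream):
        """Stand-in for the BL decode; yields the reconstructed frames
        exactly as a downstream decoder would see them."""
        return bl_bitstream

    def split_layers(hdr_frame):
        """Encoder side: EL = original 12-bit HDR minus reconstructed BL."""
        bl = encode_bl(hdr_frame)
        recon_bl = decode_bl(bl)
        el = hdr_frame.astype(np.int32) - recon_bl.astype(np.int32)
        return bl, el

    def combine_layers(recon_bl, el):
        """Decoder side: HDR gear adds the EL back; legacy gear never sees it."""
        return np.clip(recon_bl.astype(np.int32) + el, 0, 4095).astype(np.uint16)  # 12-bit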

This presents the challenge of doubling the input data into the system, Diaz said: two uncompressed video input streams mean two HEVC encoding tasks and more bandwidth in the workflow.

HEVC provides a 30 to 50 percent bitrate improvement over H.264, but it requires about 10 to 40 times more computing, he said.

“The Dolby Vision implementation of dual-layer HDR adds 20 to 30 percent load to bitrate,” he said.
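Putting those two figures together (a rough calculation from the numbers quoted, not one Diaz presented), the combined dual-layer stream can still land below the bitrate of the H.264 SDR service it replaces:

    def dual_layer_mbps(h264_sdr_mbps, hevc_saving=0.40, el_overhead=0.25):
        """Midpoints of the quoted ranges: a 30-50 percent HEVC saving
        on the BL, plus a 20-30 percent EL on top of that BL."""
        bl = h264_sdr_mbps * (1 - hevc_saving)
        el = bl * el_overhead
        return bl, el, bl + el

    print(dual_layer_mbps(10.0))   # (6.0, 1.5, 7.5) -- HDR in less than the old 10 Mbps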

Diaz said three topologies were being tested: balanced, limited-processing and high-processing. In the balanced approach, the HDR data comes in, is split, and each layer goes to its own processor; the BL is processed normally while the EL waits, and the two are later combined. Diaz said it was a good fit for cloud environments connected via Ethernet.
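A schematic of the balanced split, with stand-in encode functions and a local pipe where a cloud deployment would use an Ethernet link:

    from multiprocessing import Process, Pipe

    def encode_and_reconstruct_bl(frame):
        return frame          # stand-in for an HEVC BL encode plus reconstruction

    def encode_el(frame, recon_bl):
        pass                  # stand-in for the EL residual encode

    def bl_worker(frames, conn):
        """Own processor: normal BL encode, shipping reconstructions out."""
        for f in frames:
            conn.send(encode_and_reconstruct_bl(f))
        conn.close()

    def el_worker(frames, conn):
        """Own processor: the EL waits on each reconstructed BL frame."""
        for f in frames:
            encode_el(f, conn.recv())   # blocks until the BL side delivers

    if __name__ == "__main__":
        frames = list(range(10))        # the split HDR input feeds both workers
        a, b = Pipe()
        Process(target=bl_worker, args=(frames, a)).start()
        Process(target=el_worker, args=(frames, b)).start()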

In a limited-processing topology, a shared memory space provides access to decompressed BL data at essentially zero cost. The uncompressed data can be passed between the processor and the decoder, and BL encoding is modified to expose its reconstructed frames in this shared-memory environment, saving on latency.
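The same hand-off in shared memory, where attaching to the reconstructed frame costs essentially nothing (a sketch using Python's multiprocessing.shared_memory; the buffer name and dimensions are illustrative):

    import numpy as np
    from multiprocessing import shared_memory

    SHAPE, DTYPE = (1080, 1920), np.uint16      # one reconstructed 12-bit plane

    # BL encoder side: publish reconstructed samples into shared memory.
    shm = shared_memory.SharedMemory(create=True, name="recon_bl",
                                     size=int(np.prod(SHAPE) * 2))
    recon = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
    recon[:] = 2048                              # the modified BL encoder writes here

    # EL encoder side: attach to the same buffer -- no copy, no network hop.
    shm_el = shared_memory.SharedMemory(name="recon_bl")
    recon_for_el = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm_el.buf)
    # ...compute the EL residual against recon_for_el...

    shm_el.close()
    shm.close()
    shm.unlink()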

High processing adds a redundant partial BL encoder to the EL processing system. Complete, independent HDR input streams are sent to two completely independent processing servers; the EL side does a redundant partial BL encode, generating reconstructed frames within its own system. The approach requires three HEVC encodes, but because they are separate, a traditional SDR workflow can be maintained.
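In sketch form, with stand-in functions for the three encodes:

    def encode_bl(frame):
        return b""            # encode 1: the normal BL, traditional SDR workflow

    def reconstruct_only_bl(frame):
        return frame          # encode 2: redundant partial BL; bitstream discarded

    def encode_el(frame, recon):
        return b""            # encode 3: the EL proper

    def bl_server(frames):
        """Server 1 never knows HDR exists."""
        return [encode_bl(f) for f in frames]

    def el_server(frames):
        """Server 2 regenerates reconstructed BL frames locally,
        so nothing needs to cross between the two servers."""
        return [encode_el(f, reconstruct_only_bl(f)) for f in frames]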

There’s a trade-off among processing, data movement and latency in each of these environments, Diaz said, but quantitative results have not yet been released; they are expected later this year.