H.265 HEVC, The Next Step for MPEG

High Efficiency Video Coding (HEVC), also called H.265, is the next step for MPEG video coding. It currently exists in the form of a preliminary draft standard finished in February and is expected to be completed, approved, and published in 2013. To understand H.265, let's look at a little MPEG history, to see where it came from.

FUNDAMENTAL MPEG ELEMENTS

MPEG-2, Part 2, the digital video coding scheme we are so familiar with in U.S. digital television, is an internationally standardized technology that was developed jointly by the ISO/IEC JTC1/SC29/WG11, also known as the Moving Picture Experts Group (MPEG) and ITU-T SG16 Q.6, also known as the Video Coding Experts Group (VCEG). The MPEG-2 standard is jointly published by the two organizations under the two titles ISO/IEC 13818-2 and ITU-T H.262 .MPEG-2 was developed as a follow-on improvement to MPEG-1, which itself was based on H.261, a teleconferencing codec optimized for low bit rates. H.261 introduced the concept of dealing with macroblocks of pixels, in the case of H.261, a block of 16x16 pixels.

These MPEG standards are "tool kits," from which users may choose whichever tools they find useful. They are developed under two overarching concepts: First, the standard only defines the syntax of an encoded bitstream and describes the method of decoding that bitstream, while the implementation of the coding and decoding functionalities is left to the developer. Second, the decoder modules are deliberately simple and inexpensive to build, while the encoder modules can be complex and costly.

LG Elecronics unveiled the world’s largest Ultra-High Definition TV set at CES in January. The 84-inch screen supports 3840x2160 (4K) resolution. MPEG-2 forms the basis for succeeding MPEG video technologies, and it established the fundamental MPEG elements: prediction, transform, quantization, and entropy coding. Each video frame is coded into an MPEG frame, and a discrete cosine transform is performed, generating a transform coefficient from each pixel. The preponderance of energy contained in an image is concentrated in a subset of the image's transform coefficients, making it easier to compress, as the lower-energy coefficients may be ignored. Coding efficiency is maximized by defining three types of coded frames: An I-Frame, or intra-coded frame, is a real picture, coded only with reference to itself, and not to any other frame. A P-Frame, or predictive frame, is not a real picture, but is rather composed of motion-compensated difference information between itself and a preceding I-Frame or P-Frame. A B-Frame, or bidirectional frame, is also a difference frame, but while a P-Frame only looks backward to previous frames, a B-Frame looks both backward, and forward to future frames. Coded frames are arranged in Groups of Pictures, or GOPs, within which most interframe coding takes place. MPEG is coded into one of several profiles and levels, depending on the requirements of a particular application. For example, MPEG-2 Main Level is SD, while High Level is HD.

The Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG subsequently developed the next generation of MPEG coding, approved in 2003. This standard was also jointly issued, and is known as ISO/IEC 14496-10, MPEG-4, Part 10; and ITU-T Rec. H.264, Advanced Video Coding (AVC). MPEG-4's principal advantage over MPEG-2 is higher coding efficiency, permitting it to deliver video quality equivalent to MPEG-2 using a data rate as low as half that required by MPEG-2.

The current ISO/IEC collaboration is known as the Joint Collaborative Team on Video Coding (JCT-VC), a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG). JCT-VC are now developing the next generation of MPEG coding, which, when completed, will be known as H.265, HEVC. This work began in early 2010, and is expected to result in the publication of a finished standard in 2013, 10 years after the publication of MPEG-4. A principal objective of H.265 is to improve coding efficiency, reducing bit rate requirements to half that required for comparable pictures in MPEG-4, at the expense of increased computational complexity. It is aimed at advanced displays and television systems that use progressive scanning and resolutions from 320x240 pixels up to UHDTV with 8K resolution (7680x4320 pixels), so it is aimed at covering the gamut from handheld mobile screens to theater-grade 8K displays, and the systems to produce and transport images for them, plus improved noise performance, dynamic range, and color fidelity.

IMPROVEMENTS OVER MPEG-2

Like MPEG-2 and -4, H.265 builds on its predecessors. Some improvements include larger block structures with flexible mechanisms of sub-partitioning. In the draft specification, the traditional MPEG macroblock structure is replaced with variable-block-sized coding units, (CUs) which define the sub-partitioning of a picture into rectangular regions. Each CU in turn contains one or more variable-block-sized prediction units (PUs) and transform units (TUs). Each TU is processed by performing a spatial block transform, and quantizing the resulting transform coefficients. Within the prediction loop, an adaptive loop filter (ALF) is applied, then the frame is copied into a reference decoded picture buffer, a process that improves both objective and subjective picture quality. As in H.264, context-adaptive entropy coding schemes are used. (The preceding information was adapted from ISO/IEC JFT1/SC29/WG11/doc. No. N11822, "Description of High Efficiency Video coding (HEVC)", by Jens-Ranier Ohm and Gary Sullivan, January 2011.)

So MPEG-2, MPEG-4 AVC, and the work-in-progress HEVC represent a new and improved video coding standard about every decade. While MPEG-4 was pretty much all about increased coding efficiency, HEVC promises increased efficiency plus adaptation to all the present and future video applications that may be currently envisioned, from mobile phones to UHDTV. What the decade beyond HEVC will produce is anybody's guess.

Randy Hoffner is a veteran of the big three TV networks. He can be reached through TV Technology.