MPEG-4 Advanced Video Coding Emerges

At long last, HDTV is arriving in Europe, with a single HDTV DBS service already operating, and launches in the U.K., Germany, and France announced for the 2005-06 period. As might be expected, there are some differences between U.S. HDTV and European HDTV.

Perhaps the most obvious difference is that the scanning formats in Europe use frame rates based on 50 Hz, rather than 60 Hz.

Interestingly, we learned that Europe has HDTV scanning format wars in progress that bear striking similarity to the ones we had here in the United States, pitting 720p/50 against 1080i/25. (Only the frame rates are changed to protect the innocent.)

A second difference is that the European HDTV services announced to date are pay services, not free-to-air television. Delivery media will include terrestrial broadcast, cable and satellite.

A third major difference is that as of today, all HDTV services in Europe are expected to use MPEG-4, Part 10 Advanced Video Coding, rather than the MPEG-2 coding used for both HDTV and SDTV in the United States, and for standard-definition DTV services in Europe.

CODING FLEXIBILITY

Using advanced coding is somewhat easier in Europe because each pay service requires its own set-top box, and the box can be made capable of decoding both MPEG-2 and MPEG-4.

The one satellite HDTV service operating today, Euro1080, currently uses MPEG-2 coding but will move to MPEG-4 in June 2005. In the United States, DirecTV has announced that it will ultimately broadcast all HDTV using MPEG-4 coding as well.

What is MPEG-4, and why is it better than MPEG-2? First, let's try to sort out the origins of this standard and its various aliases. The first MPEG compression technology toolkit, MPEG-1, was only applicable to progressively scanned, SD images.

The second generation, MPEG-2, added tools for interlaced pictures. A third generation, MPEG-3, was contemplated for HDTV images, but it was discovered in the early days of DTV development in the United States that the tools available in MPEG-2 were adequate for HDTV images, and MPEG-3 was abandoned.

The MPEG-4 development process first produced MPEG-4, Part 2, which has been used for some time in such applications as cell phones and digital still cameras. When we see references to MPEG-4 in today's news, those references are to MPEG-4, Part 10, the most recently standardized MPEG technology toolkit.

In a joint effort by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) and the International Telecommunication Union (ITU), the MPEG-4, Part 10 technology was developed by a group called the Joint Video Team, composed of ISO/IEC JTC1/SC29/WG11, also known as the Moving Picture Experts Group or MPEG, and ITU-T SG16 Q.6, also known as the Video Coding Experts Group or VCEG.

The name that the ISO/IEC group originally used for this new compression technology was Advanced Video Coding (AVC), while the original ITU-T name for it was H.26L.

The ITU document standardizing the technology is ITU-T Recommendation H.264, "Advanced Video Coding for Generic Audiovisual Services," which was approved by the ITU-T in 2003. The ISO/IEC document is International Standard ISO/IEC 14496-10 (MPEG-4 Part 10), "Advanced Video Coding." The technology is typically referred to as H.264/AVC or MPEG-4, Part 10.

INCREASED EFFICIENCY

The title of the ITU-T recommendation references "generic audiovisual services," and this codec syntax is intended to be used in a variety of video transmission and storage applications, ranging from cell phones to HDTV.

Its principal advantage over MPEG-2 is its higher coding efficiency, which permits it to deliver video quality equivalent to MPEG-2 at roughly half the data rate. This naturally makes it attractive to video distributors, because it permits them to maximize the number of services that can be carried in a given amount of bandwidth.
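
To put a rough number on that appeal, here is a back-of-the-envelope sketch in Python. The transponder payload and per-service bit-rates are illustrative assumptions of mine, not figures from any standard or broadcaster; the point is simply how a halved data rate translates into services per channel.

```python
# Back-of-the-envelope look at why a halved data rate matters to distributors.
# The transponder payload and per-service bit-rates below are illustrative
# assumptions, not figures from any standard or broadcaster.

TRANSPONDER_MBPS = 38.0              # assumed usable payload of one satellite transponder
MPEG2_HD_MBPS = 18.0                 # assumed bit-rate for one MPEG-2 HD service
H264_HD_MBPS = MPEG2_HD_MBPS / 2     # the "roughly half the data rate" claim

mpeg2_services = int(TRANSPONDER_MBPS // MPEG2_HD_MBPS)
h264_services = int(TRANSPONDER_MBPS // H264_HD_MBPS)

print(f"MPEG-2: {mpeg2_services} HD services per transponder")
print(f"H.264:  {h264_services} HD services per transponder")
```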

Some extensions to the original standard, known as the Fidelity Range Extensions, were recently completed. These extensions facilitate higher-fidelity video coding by supporting greater bit depths, including 10-bit and 12-bit encoding, and higher color resolution using the YUV 4:2:2 and YUV 4:4:4 sampling structures.
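
What do those bit depths and sampling structures mean in raw terms? The short sketch below works out uncompressed data rates for an example 1920x1080 picture at 25 frames per second; the frame size and rate are merely example values, and the arithmetic only counts samples, saying nothing about compressed rates.

```python
# Uncompressed data rates implied by different chroma samplings and bit depths.
# Per luma sample, 4:2:0 carries one quarter as many samples in each chroma
# component, 4:2:2 carries one half, and 4:4:4 carries a full set.
CHROMA_FRACTION = {"4:2:0": 0.25, "4:2:2": 0.5, "4:4:4": 1.0}

def raw_mbps(width, height, fps, bit_depth, sampling):
    luma = width * height                          # luma samples per frame
    chroma = 2 * CHROMA_FRACTION[sampling] * luma  # two chroma components
    return (luma + chroma) * bit_depth * fps / 1e6

# Example values only: 1920x1080 at 25 frames/sec, a European HDTV rate
for sampling in ("4:2:0", "4:2:2", "4:4:4"):
    for depth in (8, 10, 12):
        rate = raw_mbps(1920, 1080, 25, depth, sampling)
        print(f"{sampling} {depth:2d}-bit: {rate:7.1f} Mbps uncompressed")
```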

ISO/IEC 14496-10/H.264, like previous MPEG standards, does not define a specific encoder and decoder. Instead, it defines the syntax of an encoded bitstream and describes the method of decoding that bitstream. The implementation is left to the developer.

The fundamental elements of the MPEG-4 coding process are the same as those of the earlier MPEG standards: prediction, transform, quantization and entropy coding. H.264 adds a number of new features that permit more efficient video coding than MPEG-2 does. Let's look at some of the high points.
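
For readers who like to see the moving parts, here is a deliberately simplified sketch of those four stages applied to a single 4-by-4 block, using the H.264-style integer core transform. It omits the per-coefficient scaling, the many prediction modes, rate-distortion decisions and real entropy coding of an actual encoder, so treat it as an illustration of the chain, not an implementation of the standard.

```python
import numpy as np

# The four stages on one 4x4 block: prediction, transform, quantization, and
# (stand-in) entropy coding. A real H.264 encoder adds per-coefficient scaling,
# many prediction modes, rate-distortion decisions and CAVLC/CABAC coding,
# all of which are omitted here.

# H.264-style 4x4 integer core transform matrix (exact-match, no floating point)
C = np.array([[1,  1,  1,  1],
              [2,  1, -1, -2],
              [1, -1, -1,  1],
              [1, -2,  2, -1]])

def encode_block(block, prediction, qstep=8):
    residual = block - prediction                    # 1. prediction
    coeffs = C @ residual @ C.T                      # 2. transform
    levels = np.round(coeffs / qstep).astype(int)    # 3. quantization (the lossy step)
    return levels                                    # 4. entropy coding would turn these into bits

def decode_block(levels, prediction, qstep=8):
    coeffs = levels * qstep                          # inverse quantization
    # Simplified inverse: invert the same matrix. The real standard defines a
    # separate integer inverse transform with the scaling folded in.
    residual = np.linalg.inv(C) @ coeffs @ np.linalg.inv(C.T)
    return np.rint(prediction + residual).astype(int)

block = np.array([[52, 55, 61,  66],
                  [63, 59, 55,  90],
                  [62, 59, 68, 113],
                  [63, 58, 71, 122]])
prediction = np.full((4, 4), 60)                     # a flat "predicted" block
levels = encode_block(block, prediction)
print(levels)                                        # what would be entropy coded
print(decode_block(levels, prediction))              # close to, not identical to, the input
```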

Multiple reference picture motion compensation uses previously encoded pictures more flexibly than does MPEG-2. In MPEG-2, a P-frame can use only a single previously coded frame to predict the motion compensation values for an incoming picture, while a B-frame can use only the immediately previous P- or I-frame and the immediately subsequent P- or I-frame.

H.264 permits the use of up to 16 previously coded frames (32 fields) as references, and it supports more flexibility in the selection of motion compensation block sizes and shapes, down to a luma compensation block as small as 4-by-4 pixels. H.264 also supports quarter-sample motion vector accuracy, as opposed to MPEG-2's half-sample accuracy.
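
As a toy illustration of multiple-reference motion estimation, the sketch below exhaustively searches several previously coded frames for the best match to one block, judged by sum of absolute differences (SAD). Quarter-sample refinement and the variable block-size decisions are left out, and the block position and search range are arbitrary example values.

```python
import numpy as np

def best_match(block, block_pos, references, search=8):
    """Exhaustive SAD search for one block over several reference frames.

    A toy version of multiple-reference-picture motion estimation: H.264 lets
    the encoder choose the reference per block, so we simply try them all.
    Quarter-sample refinement and variable block sizes are omitted.
    """
    bh, bw = block.shape
    by, bx = block_pos
    best = None
    for ref_idx, ref in enumerate(references):
        h, w = ref.shape
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if 0 <= y and y + bh <= h and 0 <= x and x + bw <= w:
                    cand = ref[y:y + bh, x:x + bw]
                    sad = int(np.abs(block.astype(int) - cand.astype(int)).sum())
                    if best is None or sad < best[0]:
                        best = (sad, ref_idx, dy, dx)
    return best   # (SAD, reference index, vertical MV, horizontal MV)

rng = np.random.default_rng(0)
refs = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(4)]
block = refs[2][18:22, 13:17].copy()        # a 4x4 block lifted from reference 2
print(best_match(block, (16, 16), refs))    # should report reference 2, dy=+2, dx=-3
```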

These refinements permit more precise segmentation of moving areas within the image, and more precise description of movement. Further, in H.264, the motion-compensated prediction signal may be weighted and offset by the encoder, facilitating significantly improved performance in fades (by now we all know that fades can be problematic for MPEG-2).
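
A quick illustration of why weighting helps on fades: halfway through a fade to black, the current picture is roughly the reference picture scaled by one-half, so a weighted prediction leaves almost no residual where a plain copy leaves a large one. The weight used below is simply chosen to match the synthetic fade; this is arithmetic, not the standard's actual syntax.

```python
import numpy as np

# Weighted prediction illustration. Halfway through a fade to black, the current
# frame is roughly the reference frame times 0.5. A plain copy (all MPEG-2 can
# do) leaves a large residual; a weighted prediction models the fade directly.
# The weight and offset here are simply chosen to match this synthetic fade.

rng = np.random.default_rng(1)
reference = rng.integers(16, 236, (8, 8)).astype(float)
current = 0.5 * reference                      # the faded picture

plain_pred = reference                         # unweighted prediction
weighted_pred = 0.5 * reference + 0.0          # weight w = 0.5, offset o = 0

print("residual energy, plain copy:     ", np.sum((current - plain_pred) ** 2))
print("residual energy, weighted pred.: ", np.sum((current - weighted_pred) ** 2))
```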

DE-BLOCKING FILTER

Also, as we well know by now, block-based coding can generate blocking artifacts in the decoded pictures. In H.264, a de-blocking filter is brought within the motion-compensated prediction loop, so that the filtered pictures, rather than the blocky ones, are used as references for predicting subsequent pictures.
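
To make the idea concrete, here is a much-simplified sketch of a de-blocking operation: it smooths across vertical block boundaries only where the step across the edge is small enough to be a coding artifact rather than real picture detail. The actual H.264 filter adapts its strength to the quantization parameter, coding modes and threshold tables, none of which is reproduced here.

```python
import numpy as np

def deblock_vertical_edges(frame, block=4, threshold=8):
    """Much-simplified de-blocking sketch.

    At each vertical block boundary, if the step across the edge is small
    (likely a quantization artifact rather than real picture detail), nudge
    the two boundary samples toward each other. The real H.264 filter adapts
    its strength to the quantization parameter and coding modes.
    """
    out = frame.astype(float).copy()
    h, w = out.shape
    for x in range(block, w, block):
        left, right = out[:, x - 1].copy(), out[:, x].copy()
        step = right - left
        mask = np.abs(step) < threshold            # filter only weak edges
        out[:, x - 1] = np.where(mask, left + step / 4, left)
        out[:, x] = np.where(mask, right - step / 4, right)
    return out

# Because the filter sits inside the prediction loop, this smoothed frame (not
# the blocky one) is what later pictures are predicted from.
blocky = np.kron(np.arange(0, 64, 4).reshape(4, 4), np.ones((4, 4)))  # flat 4x4 tiles
smoothed = deblock_vertical_edges(blocky)
print(blocky[0, 2:6])      # [0. 0. 4. 4]  -> a hard block edge
print(smoothed[0, 2:6])    # [0. 1. 3. 4]  -> the edge softened
```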

Switching slices have also been incorporated. These permit a decoder to jump between bitstreams, in order to change bit-rates smoothly or to perform stunt modes, without requiring every stream to carry an I-frame at the switch point.

H.264 also uses more advanced entropy coding methods than MPEG-2: context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC).
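
CAVLC and CABAC are too involved to show in a few lines, but H.264 also codes many of its header syntax elements with Exponential-Golomb codes, which are simple enough to demonstrate. The sketch below implements the unsigned ue(v) code and its decoder.

```python
def ue_encode(v):
    """Unsigned Exponential-Golomb code ue(v): write (v + 1) in binary,
    preceded by as many zeros as that binary string has bits, minus one."""
    bits = bin(v + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def ue_decode(bitstring):
    """Decode one ue(v) codeword from the front of a bit string; return
    (value, remaining bits)."""
    zeros = 0
    while bitstring[zeros] == "0":
        zeros += 1
    length = 2 * zeros + 1
    return int(bitstring[:length], 2) - 1, bitstring[length:]

for v in range(6):
    code = ue_encode(v)
    assert ue_decode(code)[0] == v
    print(v, "->", code)      # 0 -> 1, 1 -> 010, 2 -> 011, 3 -> 00100, ...
```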

To summarize, the principal new features of MPEG-4, Part 10, include enhanced motion prediction capabilities; the use of a small block-size, exact-match transform; adaptive in-loop deblocking filter; and enhanced entropy coding methods.

Many more refinements are incorporated into MPEG-4, which is the latest progressive step in MPEG video compression coding development. There is work ongoing in the ATSC to incorporate MPEG-4 into its standards. The folks in Europe waited long enough to delve into HDTV that they can use MPEG-4 virtually from the outset.

Randy Hoffner