MPEG-2, MPEG-4 and JPEG 2000 encoding

MPEG and JPEG encoding of video content have been around for a long time now. It seems hard to believe, but the first product tests for MPEG-2 encoding began in the mid-1990s, with the standard adopted in 1996.

MPEG-2 is sometimes misunderstood in fundamental ways. It is both a compression system and a description of the transport stream that carries the content. MPEG-2 was developed to overcome some of the shortcomings of the MPEG-1 standard, which included audio limited to two channels, no support for (and poor results with) interlaced signals, and support only for 4:2:0 encoding. Overcoming each of these limitations proved critical to enabling professional use and, to a degree, affected deployment of consumer applications as well. MPEG-1 was also developed to deliver narrowband content that would fit in a T-1 link or within the bandwidth of a CD-ROM. We moved beyond those requirements a long time ago, so a review is in order. For a comparison of compression standards, see Table 1.

Review of MPEG

MPEG-1 and MPEG-2 use the discrete cosine transform (DCT) to compress the content. By converting from the spatial to the frequency domain using the DCT, and then normalizing and quantizing the resulting coefficients, the considerable redundancy in video content can be squeezed out to allow lower bit rates. Other techniques, including run-length encoding, are also used to further improve performance.
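The chain described above (transform, quantize, then run-length code) can be illustrated with a small sketch. It is not an MPEG encoder: the single quantizer step size is an assumption made for illustration, where real encoders use quantization matrices and follow the run-length stage with variable-length coding. The function names are hypothetical.

```python
import math

N = 8  # MPEG-1 and MPEG-2 transform 8x8 blocks

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantizer with one illustrative step size (an assumption)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

def zigzag(block):
    """Read the block along anti-diagonals so low frequencies come first."""
    order = sorted(
        ((y, x) for y in range(N) for x in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[y][x] for y, x in order]

def run_length(seq):
    """Encode as (zero_run, value) pairs; trailing zeros collapse to 'EOB'."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs

# A perfectly flat gray block: all of its energy lands in one coefficient.
flat = [[128] * N for _ in range(N)]
print(run_length(zigzag(quantize(dct_2d(flat), 16))))
```

The 64 samples of the flat block reduce to a single coefficient followed by an end-of-block marker, which is the redundancy removal the paragraph describes in its most extreme form. Blocks with real picture detail produce more nonzero coefficients, and a coarser step size discards more of them.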

The most important point is that what is thrown away is, we hope, content we don't miss. The analysis of the content is based on good science that is founded on the psychophysics of human vision. For instance, we don't see detail as well in the presence of motion, so we can eliminate that information. Compression choices are made by hardware designers based on tests done with expert viewers in a controlled environment and, increasingly, with automated test equipment that replicates the results that expert testing returns. The standard defines the bit stream a decoder must be able to decode, but it does not specify how an encoder should produce that bit stream.

The MPEG-2 transport stream provides a method of transmitting single or multiple programs, each with multiple audio streams, closed captions and other data types. The ATSC standard is based strongly on MPEG-2 for compression and transmission, with extensions that customize it for the needs of terrestrial transmission.
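To make the multiplexing idea concrete, here is a minimal sketch of reading transport stream packet headers. It relies on the fixed 188-byte packet, the 0x47 sync byte and the 13-bit packet identifier (PID) that tells a demultiplexer which elementary stream or table a packet belongs to. The helper `make_packet` and the PID value 0x0100 are hypothetical and exist only to build test data; adaptation fields and the program tables are ignored.

```python
from collections import Counter

TS_PACKET_SIZE = 188  # bytes per transport stream packet
SYNC_BYTE = 0x47      # first byte of every packet

def parse_header(packet):
    """Extract a few fields from the 4-byte transport packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not an aligned transport stream packet")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
    }

def count_pids(stream):
    """Count packets per PID in a byte string of back-to-back packets."""
    counts = Counter()
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        header = parse_header(stream[offset:offset + TS_PACKET_SIZE])
        counts[header["pid"]] += 1
    return counts

def make_packet(pid, counter=0):
    """Hypothetical test helper: a payload-only packet filled with 0xFF."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF,
                    0x10 | (counter & 0x0F)])
    return header + bytes([0xFF]) * (TS_PACKET_SIZE - 4)

# PID 0x0000 carries the program association table; 0x1FFF marks null packets.
stream = (make_packet(0x0000) + make_packet(0x0100)
          + make_packet(0x0100, 1) + make_packet(0x1FFF))
print(dict(count_pids(stream)))
```

A real demultiplexer would go on to read the program association and program map tables to learn which PIDs carry the video, audio and caption data for each program.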

It is important to remember what the system was designed to do from the outset: The complexity and expense sit in the encoder so that decoders can be cheap and easy to deploy. Early HD encoders for the ATSC system cost $250,000. Now consumer camcorders have MPEG-2 (or AVC) HD compression built in, costing 25 percent of the price 15 years ago.

High-quality encoding is still more expensive, but keep in mind that the results are nothing short of astounding. We routinely send SD content in less than 2Mb/s and HD in less than 12Mb/s for end-user applications. Equally important, the transport stream developed in the 1990s is just as good today, serving MPEG-4 and JPEG 2000 applications as well. With suitable error correction, we now successfully send MPEG content through noisy RF environments and IP networks, and over fiber, copper and WiFi. Many people entering the production industry are more comfortable with MPEG video than with baseband signals, which seem almost arcane.
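A rough calculation shows what those bit rates imply. The sketch below assumes active picture sizes of 720 × 480 for SD and 1920 × 1080 for HD, a frame rate of 30000/1001 frames per second, and 8-bit 4:2:0 sampling (12 bits per pixel on average). None of these figures is stated above; they are assumptions chosen as typical values, and other rasters give different numbers.

```python
from fractions import Fraction

FRAME_RATE = Fraction(30000, 1001)  # assumed frame rate, about 29.97 frames/s
BITS_PER_PIXEL = 12                 # assumed 8-bit 4:2:0: 8 luma + 4 chroma bits

def uncompressed_bps(width, height):
    """Bit rate of the active picture before compression."""
    return width * height * BITS_PER_PIXEL * FRAME_RATE

def compression_ratio(width, height, coded_bps):
    """Ratio of the uncompressed rate to the coded rate."""
    return uncompressed_bps(width, height) / coded_bps

sd = compression_ratio(720, 480, 2_000_000)      # SD at 2Mb/s
hd = compression_ratio(1920, 1080, 12_000_000)   # HD at 12Mb/s
print(f"SD: {float(uncompressed_bps(720, 480)) / 1e6:.0f}Mb/s source, "
      f"about {float(sd):.0f}:1")
print(f"HD: {float(uncompressed_bps(1920, 1080)) / 1e6:.0f}Mb/s source, "
      f"about {float(hd):.0f}:1")
```

Under these assumptions both cases work out to roughly 62:1, because the HD raster has exactly six times as many pixels as the SD raster and is given six times the bit rate. Since the quoted rates are upper limits, the ratios achieved in practice are at least this high.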

Though not a subject for quick review, it is worthwhile to note that none of this comes for free. The MPEG-LA (Licensing Authority), a patent pool, holds more than 600 patents from more than 20 companies and universities, many of which have expired. The negotiations over the patents took longer than the development of the technology. Today, MPEG-2/4 (H.264/MPEG-4 Part 10/AVC) covers transmission and storage of bit rates from well below 1Mb/s to over 400Mb/s.

Though MPEG-2 can produce great results, the technology is much older than the standard adopted 15 years ago. The MPEG-4 standard, and particularly Part 10 (also known as AVC, for Advanced Video Coding, and as H.264, its ITU standard number), offers all that MPEG-2 does, plus many additional tools that can further improve the coding efficiency and reduce bit rates for equivalent quality.

MPEG-4 Part 10, the part of primary interest for broadcast systems, still uses DCT-based compression, but allows variable block sizes, multiple reference frames for motion prediction, and the ability to code single blocks as intra (I) or predictive (P) data. Other significant improvements developed after MPEG-2 were loaded into the standard, like increased precision in motion prediction (one-quarter picture element, or PEL) and more effective entropy coding using context-adaptive binary arithmetic coding (CABAC) and context-adaptive variable-length coding (CAVLC).
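Quarter-PEL precision means a motion vector can point between the samples of the reference picture, so the decoder has to interpolate. The sketch below is a one-dimensional simplification of the H.264 luma scheme: half-sample values come from the six-tap filter (1, -5, 20, 20, -5, 1) with rounding and a divide by 32, and quarter-sample values are the rounded average of the two nearest full-sample or half-sample values. Working along a single row and clamping indices at the row ends are simplifications assumed here, and the function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # six-tap half-sample filter, gain of 32

def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))

def sample(row, i):
    """Fetch a full-position sample, repeating the edge value outside the row."""
    return row[max(0, min(len(row) - 1, i))]

def half_pel(row, i):
    """Interpolated value halfway between row[i] and row[i + 1]."""
    acc = sum(tap * sample(row, i - 2 + k) for k, tap in enumerate(TAPS))
    return clip8((acc + 16) >> 5)

def quarter_pel(row, position):
    """Value at `position`, given in quarter-sample units (4 * index + fraction)."""
    i, frac = divmod(position, 4)
    full = sample(row, i)
    if frac == 0:
        return full
    half = half_pel(row, i)
    if frac == 2:
        return half
    if frac == 1:
        return (full + half + 1) >> 1
    return (half + sample(row, i + 1) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50, 60, 70]
# Positions 12..16 walk from sample 3 to sample 4 in quarter-sample steps.
print([quarter_pel(ramp, p) for p in range(12, 17)])
```

On a smooth ramp the interpolated values fall between the neighboring samples as expected. The benefit appears on real pictures, where finer vector precision leaves a smaller prediction residual to code.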

In an important way, AVC supports the modern needs of the industry for increased quality at higher bit rates. Support for 4:2:0, 4:2:2 and 4:4:4 chroma subsampling and for increased sample precision, ranging from 8 bits to 14 bits per sample, was added, which has extended the application space to include production for a wide range of uses and archival storage. The results can be astounding, with decoded pictures indistinguishable from original material.

Of course, higher quality is also available at low bit rates, and much of the content we watch on the Internet is H.264 coded as well. Like MPEG-2, AVC has long latency, which is problematic for many live broadcast uses. Latency can be reduced substantially by using higher bit rates and I-frame-only coding, but at the cost of higher transmission bandwidth.

JPEG 2000

At the high end, JPEG 2000 is quite a different beast. It does not use the same tools as any of the MPEG flavors, but instead uses wavelet compression. This offers interesting advantages, but not low bit rates. Wavelets allow lower-resolution outputs to be extracted from a higher bit rate stream without recoding, so a separate system to generate and store thumbnails may not be necessary. Similarly, an HD stream can be used to directly extract an SD output.
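The resolution scalability of wavelets is easy to see in a toy example. The sketch below uses the Haar wavelet, which is not what JPEG 2000 uses (it specifies 5/3 and 9/7 filters), because Haar makes the principle visible in a few lines. One decomposition level splits the picture into four subbands, and the low-pass subband is itself a half-resolution picture that can be used without touching the rest. The function names are hypothetical, and even image dimensions are assumed.

```python
def haar_rows(image):
    """One Haar level along each row: averages first, then differences."""
    out = []
    for row in image:
        low = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
        high = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
        out.append(low + high)
    return out

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def haar_2d(image):
    """One 2-D level: filter the rows, then the columns."""
    return transpose(haar_rows(transpose(haar_rows(image))))

def low_res(coeffs):
    """The top-left (low-pass) subband is a half-size version of the picture."""
    height, width = len(coeffs) // 2, len(coeffs[0]) // 2
    return [row[:width] for row in coeffs[:height]]

image = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [4, 4, 8, 8],
]
print(low_res(haar_2d(image)))
```

Each value in the extracted picture is the average of a 2 × 2 block of the original. Applying the transform again to the low-pass subband yields a quarter-resolution picture, and so on, which is how a thumbnail or a lower-resolution output can be pulled from a single coded stream.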

In general, errors are less visible because by the nature of the compression, they are spread across the image rather than being evident in small areas of the screen. JPEG 2000 offers high bit rates for high-quality applications, as much as 960Mb/s; image maps up to 4096 × 2304; and frame rates to 120Hz. It should be no surprise that it is the basis of the Digital Cinema standards adopted worldwide.

In addition, it can offer excellent quality at moderate bit rates for mezzanine compression use in broadcast facilities. A system using wavelet compression similar to JPEG 2000 was developed by the BBC and is known as Dirac. Other broadcasters have adopted JPEG 2000 in part due to extremely low coding/decoding latency.

MPEG-2 dominated the market for nearly a generation. It might not be reasonable to expect the next dominant compression format to last that long.

John Luff is a broadcast technology consultant.

Table 1. Compression comparison

Format | Compression | Precision | Color coding | Latency | GOP | Motion | Uses
MPEG-2 | DCT | 8 bit | 4:2:0-4:2:2 | Low to long | One to many | Motion-compensated | Broadcast/production; consumer/professional
AVC | DCT | 8-12 bit | 4:2:0-4:4:4 | Long | One to many | Motion-compensated | Mobile ATSC/production; consumer/professional
JPEG 2000 | Wavelet (DWT) | 8-14 bit | 4:2:2 | Low | One | N/A | Transmission/digital cinema/stills; professional
Dirac | Wavelet (DWT) | 10 bit | 4:2:2 | Low | One | N/A | Production; professional

Send questions and comments to: john.luff@penton.com