Video compression

Broadcasters have a range of choices when it comes to signal compression solutions. For advanced, professional-grade compression, MPEG-2, H.264 and JPEG 2000 are all viable options. Ultimately, network infrastructure, bandwidth requirements and budget all help to define the “right” choice for a broadcaster. MPEG-2 and H.264 are strong options for compression for next-generation multimedia applications.

Increasingly, a strong case can be made for JPEG 2000, whose advanced intra-frame encoding provides a degree of flexibility and control not found in other compression schemes. Furthermore, the surge in the number of video transport applications requiring both very low latency and very high visual quality makes JPEG 2000 an optimum solution to meet the demands of a video landscape moving toward HD.

For all broadcasters, regardless of network infrastructure, compression solution and application specifics, the goal is to deliver maximum quality within bandwidth and cost limitations, ultimately to maximize revenue. Keep in mind that video transport is a complete value chain: anything done along the chain affects overall processing, and a misstep can have dire consequences, devaluing the content downstream. Of course, how the chosen compression solution is engineered and managed is also critical to achieving the best performance, regardless of the compression scheme selected.

MPEG-2: A legacy codec

Video compression algorithms such as MPEG-2 and H.264 are Discrete Cosine Transform (DCT)-based codecs that use inter-frame prediction to reduce video data between a series of frames. This involves techniques such as difference coding, where one frame is compared with a reference frame and only pixels that have changed from the reference frame are coded. In this way, the number of pixel values that are coded and sent is reduced. When such an encoded sequence is displayed, the images appear as in the original video sequence.
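The difference coding idea described above can be illustrated with a toy example. This is a minimal sketch, not how MPEG-2 or H.264 actually code pictures: real encoders work on motion-compensated blocks and transform-coded residuals, while this sketch simply treats a frame as a flat list of pixel values. The function names are hypothetical.

```python
def encode_difference(reference, frame):
    """Return (index, new_value) pairs for pixels that differ from the reference."""
    return [(i, new) for i, (old, new) in enumerate(zip(reference, frame)) if old != new]


def decode_difference(reference, changes):
    """Rebuild the frame by applying the coded changes to a copy of the reference."""
    frame = list(reference)
    for index, value in changes:
        frame[index] = value
    return frame


# Toy 8-pixel "frames": a bright object moves one pixel to the right.
reference = [10, 10, 10, 10, 200, 200, 10, 10]
current = [10, 10, 10, 10, 10, 200, 200, 10]

changes = encode_difference(reference, current)
restored = decode_difference(reference, changes)
print(f"{len(changes)} of {len(current)} pixels coded:", changes)
```

Only two of the eight pixel values need to be sent here, and the decoder recovers the current frame exactly from the reference plus the changes.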

MPEG-2 grew out of the need to broadcast formats at higher data rates: SD signals at bit rates from about 3Mb/s to 15Mb/s and HD at 15Mb/s to 30Mb/s. With MPEG-2's motion prediction compression, each encoded frame in a sequence of images is classified as an Intra (I-frame), Predictive (P-frame) or Bidirectionally-predictive (B-frame) picture. By reducing spatial and temporal redundancy, MPEG-2 provides increased compression. However, B-frames must be reordered for coding, and the resulting delay grows with the number of consecutive B-frames used, which can increase latency significantly.
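To see why B-frames add delay, it helps to compare display order with coded order. The sketch below is a simplified illustration with hypothetical function names and frame labels: a B-frame cannot be coded until the later I- or P-frame it refers to has arrived, so the encoder holds it back.

```python
def to_coded_order(display_order):
    """Reorder frames so each I/P anchor precedes the B-frames that depend on it."""
    coded, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)      # wait for the next anchor frame
        else:
            coded.append(frame)          # anchor (I or P) goes out first
            coded.extend(pending_b)      # then the B-frames that were held back
            pending_b = []
    coded.extend(pending_b)              # simplification: trailing B-frames flushed as-is
    return coded


def max_consecutive_b(display_order):
    """Longest run of B-frames, i.e. how many frame periods the encoder must hold."""
    longest = run = 0
    for frame in display_order:
        run = run + 1 if frame.startswith("B") else 0
        longest = max(longest, run)
    return longest


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
coded = to_coded_order(display)
hold = max_consecutive_b(display)
print("coded order:", coded)
print("frames held before coding:", hold)
```

With two consecutive B-frames, the first B-frame waits two frame periods for its anchor. This counts only the reordering component; buffering elsewhere in a real encoder and decoder adds to it.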

MPEG-2 is still widely deployed and continues to be a viable compression choice due to its lower cost of implementation and widely supported 4:2:2 color sampling. There are, however, many limitations imposed by the codec itself and the standards that govern its implementation. For example, MPEG-2 typically requires higher bandwidth than competing codecs to ensure reasonably high video quality. MPEG-2 coded material is also extremely sensitive to errors and any information loss due to its bit stream architecture and packet-based encapsulation scheme. If a single packet is lost or corrupted, there can be significant impact on the decoder, causing dropped frames or pronounced blocking artifacts. Finally, 3G-SDI compression is not supported as a use case as defined by the standards, and that will ultimately cause the demise of MPEG-2 as a viable choice in the professional broadcast space.

H.264: The next-generation codec

H.264, or MPEG-4 Part 10 (Advanced Video Coding), was developed as a successor to MPEG-2, providing gains in efficiency and a comprehensive toolset for delivering high flexibility. As a result, H.264 provides equivalent video quality at substantially lower bit rates compared to MPEG-2. An H.264 encoder can, without compromising image quality, reduce bandwidth requirements by as much as 50 percent in comparison with MPEG-2. It was also developed to use the same asymmetrical architecture as MPEG-2: computational complexity at the decoder was minimized, ensuring enough flexibility for a wide range of applications, including broadcasting, storage and wireless multimedia communications.

H.264's algorithm is similar to MPEG-2 and uses the same underlying principles, including block-based motion compensation and the discrete cosine transform. However, H.264 emphasizes efficiency and reliability. It performs spatial prediction for intra-frame coding and temporal motion estimation for inter-frame coding to improve compression efficiency. In intra-frame coding, each frame is encoded on its own, without using any information from its neighboring frames. In addition, H.264 makes use of preprocessing stages and relies on spatial prediction using neighboring pixels from previously encoded blocks to take advantage of inter-block spatial correlation.

Key features of the standard include compression and transmission efficiency, and a focus on widely used applications of video compression. Its flexibility and scalability are evidenced by the 17 profiles and 16 levels supported today, each targeting specific classes of popular video communication applications.

The limitations posed by H.264 are similar to those faced by MPEG-2 in its infancy. Ultimately, it's the capability of existing technology that's slowed H.264's penetration in the professional broadcast domain. Today, the most technologically advanced and standards-compliant H.264 codec is capable of producing compressed video streams of 80Mb/s limited to 8-bit resolution. Deploying an H.264 link can be costly, as much as four times higher than competing standards both in cost and in power consumption. The architectural asymmetry of the codec has also led to the assumption that high-quality decoders are low-cost devices. Users often end up surprised by the high cost of professional video decoders.

JPEG 2000: A growing compression choice

JPEG 2000's underlying structure is the key to the advantages it offers in today's market. JPEG 2000 is a wavelet-based image compression standard and coding system. Conceived as an image codec rather than a video codec, its intra-frame encoding scheme provides a range of benefits across the broadcasting spectrum: contribution, production, and primary and secondary distribution.
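For readers curious about what "wavelet-based" means in practice, here is a minimal sketch of a single level of the reversible 5/3 integer wavelet in one dimension, the filter JPEG 2000 uses on its lossless path. It assumes an even-length row of integer samples and uses a simplified mirror rule at the edges; a real codec applies the transform in two dimensions over several levels and then entropy codes the result. The function names are hypothetical.

```python
def forward_53(x):
    """One level of the integer 5/3 lifting transform: returns (low_pass, high_pass)."""
    assert len(x) >= 2 and len(x) % 2 == 0, "sketch assumes an even-length signal"
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: each odd sample minus the average of its even neighbours.
    d = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]   # mirror at the right edge
        d.append(odd[n] - (even[n] + right) // 2)
    # Update step: smooth the even samples using the detail values.
    s = []
    for n in range(half):
        left = d[n - 1] if n > 0 else d[0]                 # mirror at the left edge
        s.append(even[n] + (left + d[n] + 2) // 4)
    return s, d


def inverse_53(s, d):
    """Undo forward_53 exactly by reversing the lifting steps."""
    half = len(s)
    even = []
    for n in range(half):
        left = d[n - 1] if n > 0 else d[0]
        even.append(s[n] - (left + d[n] + 2) // 4)
    x = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]
        x.extend([even[n], d[n] + (even[n] + right) // 2])
    return x


row = [10, 12, 14, 200, 202, 198, 50, 52]
low, high = forward_53(row)
print("low-pass :", low)
print("high-pass:", high)
print("round trip exact:", inverse_53(low, high) == row)
```

Because every step uses integer arithmetic that the decoder can reverse exactly, the round trip loses nothing. Smooth regions produce small high-pass values, which is where the compression gain comes from.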

JPEG 2000 is perhaps best known for its superior visual quality in comparison with H.264 and MPEG-2. (See Figure 1 on page 16.) JPEG 2000 allows operation over a complete frame while other compression schemes require the image to be broken up into smaller blocks, causing quality to diminish unevenly and vary within the frame. This creates the visually annoying digital artifact known as blocking. With JPEG 2000, quality loss occurs evenly across the entire frame and is perceived as a softening of edges otherwise known as blurring. Such distortions are visually less disturbing than blocking as blurring occurs naturally with human perception. JPEG 2000 offers the unique ability to deliver pristine content for downstream processing. Gains in upstream quality allow for higher quality in downstream processing. JPEG 2000 remains relatively unaffected by multiple encode/decode cycles delivering high-quality video throughout the chain.

JPEG 2000's low latency, typically 1.5 frames or less for a complete encode/decode cycle, is critical for interactive applications and results from the lack of dependency from one frame to the next. This low latency of 45ms for HD compression compares favorably with that of H.264 and MPEG-2, which can extend into the one- to two-second range.
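The relationship between a latency quoted in frames and one quoted in milliseconds depends on the frame rate. The short sketch below does the conversion; the frame rates listed are assumptions chosen for illustration, since the article does not say which rate its 45ms figure refers to.

```python
def frames_to_ms(frames, frame_rate_hz):
    """Convert a delay expressed in frame periods to milliseconds."""
    return frames * 1000.0 / frame_rate_hz


# Assumed frame rates (frames per second), for illustration only.
ASSUMED_RATES_HZ = {
    "25": 25.0,
    "29.97": 30000 / 1001,
    "50": 50.0,
    "59.94": 60000 / 1001,
}

for label, rate in ASSUMED_RATES_HZ.items():
    print(f"1.5 frames at {label} frames/s = {frames_to_ms(1.5, rate):.2f} ms")
```

At 29.97 frames/s, 1.5 frames works out to roughly 50ms, so a 45ms figure sits within the "1.5 frames or less" range at that rate.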

The high bit rates achieved by JPEG 2000 compression are also critically important. As a standard, JPEG 2000 allows for very high rates, much higher than H.264 when comparing available implementations. For high-quality transport, this is key because the infrastructure type may cap the available bandwidth, yet within that cap bandwidth need not be scarce. For example, HD video at 1.5Gb/s will not fit into Gigabit Ethernet or OC-12 (622Mb/s), but the entire pipe can be dedicated to it. As such, JPEG 2000 can be provisioned to consume all available bandwidth to achieve the highest possible quality through “light” compression to just fit into the pipe or, when appropriate, encoded in mathematically lossless mode for no loss in video information.
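How "light" the compression needs to be follows from simple arithmetic. The sketch below uses the rates quoted above (1.5Gb/s HD, Gigabit Ethernet, 622Mb/s OC-12). The `usable_fraction` parameter is a hypothetical knob for protocol overhead, which is ignored by default; real links always lose some capacity to framing and encapsulation.

```python
def required_compression_ratio(source_mbps, pipe_mbps, usable_fraction=1.0):
    """Ratio by which the source must be compressed to fit the usable pipe capacity."""
    usable_mbps = pipe_mbps * usable_fraction
    return source_mbps / usable_mbps


HD_SOURCE_MBPS = 1500  # uncompressed HD, as quoted in the article

for name, capacity_mbps in (("Gigabit Ethernet", 1000), ("OC-12", 622)):
    ratio = required_compression_ratio(HD_SOURCE_MBPS, capacity_mbps)
    print(f"{name}: about {ratio:.2f}:1 compression needed")
```

Filling a dedicated Gigabit Ethernet link requires only about 1.5:1 compression, and OC-12 about 2.4:1, far lighter than the ratios typical of bandwidth-constrained distribution.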

One of JPEG 2000's most significant advantages is its flexibility. It may be transported over a multitude of network infrastructures: Ethernet/IP, SONET/SDH/PDH and fiber. When packed into ASI, JPEG 2000 can be transported anywhere, anytime and over any distance. JPEG 2000 encodes each field or frame, and the luma and chroma components, independently. Mathematically lossless and lossy compression are also compatible: video can be encoded with mathematically lossless compression and then truncated to a lossy bit stream if insufficient bandwidth is available.
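The idea that a lossless stream can be truncated into a lossy one can be shown with a toy bit-plane example. This is only an analogy under stated assumptions: JPEG 2000 orders wavelet coefficient data by importance inside its own entropy coder, whereas this sketch works directly on 8-bit sample values and uses hypothetical function names.

```python
def to_bit_planes(samples, bit_depth=8):
    """Split samples into bit planes, most significant plane first."""
    return [[(s >> b) & 1 for s in samples] for b in range(bit_depth - 1, -1, -1)]


def from_bit_planes(planes, bit_depth=8):
    """Rebuild samples from however many leading planes were received."""
    samples = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = bit_depth - 1 - k
        samples = [s | (bit << shift) for s, bit in zip(samples, plane)]
    return samples


original = [12, 200, 131, 77]
planes = to_bit_planes(original)

lossless = from_bit_planes(planes)      # all 8 planes received
lossy = from_bit_planes(planes[:4])     # stream cut after the 4 most significant planes

print("lossless:", lossless)
print("lossy   :", lossy)
```

Because the most important information is sent first, cutting the stream short still yields a usable approximation, and no re-encode is needed to go from the lossless version to the lossy one.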

By design, JPEG 2000 has similar complexity in both encode and decode processes. As a symmetrical codec, it can be provisioned as an encoder or as a decoder with the same hardware, while asymmetrical codecs require vastly different hardware implementations, particularly at high bit rates. The relatively low complexity of JPEG 2000 also brings significant capital and operational expenditure advantages, costing less per circuit while consuming less power and space.

The big picture

All codecs discussed here have a role in professional, contribution-quality video transport. H.264/MPEG-4 and MPEG-2 are still relevant in the realm of professional broadcast transport. They provide high-quality solutions for bandwidth-constrained environments, but they are not necessarily the right choice in all applications.

JPEG 2000 provides very high visual quality and low latency with multiple coding cycles. It has proved its worth in a complete video chain and for an environment trending toward IP transport as well as 3G, HD and 3-D technologies.

In addition to quality and network infrastructure, resources and cost must also be considered when evaluating and selecting a compression solution. Generally, MPEG-2 and H.264 compression are expensive, power-consuming, large and complex technologies.

The future of JPEG 2000 is bright as it requires less power, consumes less space and generally delivers greater scalability, flexibility and visual quality than other codecs. An increasing number of service providers and broadcasters are using JPEG 2000 implementations for large, globally significant events, particularly over IP networks. But the landscape is ever changing. An even newer development, the 1080p50/60 video format, is now causing a stir in the broadcast community. Let's all stay tuned.

Dr. Chin Koh is director of product management at Nevion.