Compression technology

Over the past three years, HDTV take-up has been high across multiple platforms. MPEG-4 AVC has largely replaced MPEG-2 in the distribution of HDTV via satellite to the home. IPTV has also used HD as a differentiator in its market entry strategy. In addition, several local channels in the United States are now gathering and transmitting news programming in HD. These factors, together with the increasing international adoption of HD, have led to an increased demand for HD content.

With forecast consumer demand pointing toward even more bandwidth-hungry HD applications, and with screens now predominantly achieving full resolution, the requirement for multiple HD formats will grow. Operators are seeking a solution that enables them to acquire high-definition content across their current networks without a significant change in costs.

MPEG-4 AVC is increasingly seen as an enabling technology that can meet these goals. To date, it has been applied mainly at low bit rates; in the future, it will also penetrate contribution networks.

Technology

It was initially assumed that the major benefits of MPEG-4 AVC were to be found at the low bit rate end of the spectrum. However, recent research has shown that there are significant gains to be found at higher bit rates.

Figure 1 shows the major savings available below 10Mb/s. The region of least gain appears around 20Mb/s, and there are increasing gains above the 20Mb/s operating point.

The coding gain of MPEG-4 AVC can be attributed to a number of new coding tools that are not available in MPEG-2. These include:

nine 4 × 4 intra-prediction modes;
four 16 × 16 intra-prediction modes;
seven inter-prediction modes from 16 × 16 down to 4 × 4 block sizes;
quarter-pel motion compensation;
advanced bidirectional pictures;
motion compensation from outside of the picture;
multiple reference pictures;
integer transformation;
in-loop deblocking filter; and
context-based adaptive binary arithmetic coding (CABAC).

The importance of 4:2:2 video compression

Traditionally, contribution applications have demanded 4:2:2 MPEG-2 video compression. This prevents the need to repeatedly down-sample and up-sample the chroma channels. Mismatches in spatial positioning of the chroma filters and/or soft filter roll-offs can significantly degrade the chroma quality.

In addition, repeated concatenation of video encoding processes can impede compression performance through the later stages in the delivery chain of content reaching the consumer. Therefore, preserving optimum video quality is desirable from the very first stage of content acquisition.

Furthermore, 10-bit 4:2:2 is the most commonly used studio and production format. Therefore, from a compression efficiency perspective, it is important to know at what bit rate the picture quality of 10-bit 4:2:2 would be noticeably better than that of 8-bit 4:2:2.

Levels, profiles and operating points

ISO/IEC 14496-10, otherwise known as MPEG-4 AVC, defines a number of profiles applicable to video conferencing, broadcast and streaming applications:

The Baseline Profile is mainly intended for video conferencing and streaming to mobile devices. Because these applications are comparatively simple, it does not support bidirectional frames, interlace or CABAC entropy encoding.
The Main Profile is most widely used for HD and SD broadcasting applications. It allows bidirectionally predicted frames with two direct modes (spatial and temporal) and weighted predictions. Furthermore, it supports all interlace coding tools including picture-adaptive field/frame coding (PAFF) and macroblock-adaptive field/frame coding (MBAFF) as well as CABAC.
The coding tools of MPEG-4 AVC profiles that go beyond Main Profile are summarized as Fidelity Range Extensions. In particular, the High Profile allows adaptive 8 × 8 integer transforms, intra 8 × 8 prediction modes and scaling lists. The High 10 Profile allows coding of 4:2:0 video signals with 10-bit accuracy, and the High 4:2:2 Profile allows coding of 4:2:2 video signals with 8- or 10-bit accuracy.
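
The profile choices above can be sketched as a small lookup. This is a simplified illustration based only on the chroma format and bit depth discussed here, not the full profile definitions in the standard; the names FREXT_PROFILES and lowest_profile are hypothetical.

```python
# Simplified capability table for the Fidelity Range Extensions profiles
# discussed in the text, ordered from least to most capable.
# Assumption: only chroma format and bit depth are considered.
FREXT_PROFILES = [
    # (profile name, supported chroma formats, maximum bit depth)
    ("High", {"4:2:0"}, 8),
    ("High 10", {"4:2:0"}, 10),
    ("High 4:2:2", {"4:2:0", "4:2:2"}, 10),
]


def lowest_profile(chroma_format, bit_depth):
    """Return the least capable listed profile that can carry the signal."""
    for name, formats, max_depth in FREXT_PROFILES:
        if chroma_format in formats and bit_depth <= max_depth:
            return name
    raise ValueError(f"no listed profile supports {chroma_format} at {bit_depth} bits")


if __name__ == "__main__":
    for chroma, depth in [("4:2:0", 8), ("4:2:0", 10), ("4:2:2", 8), ("4:2:2", 10)]:
        print(chroma, depth, "->", lowest_profile(chroma, depth))
```

Both 8-bit and 10-bit 4:2:2 land in the same profile, which is why the remaining choice for contribution is one of bit depth rather than of profile.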

Here we are concentrating on high-definition and therefore the High 10 Profile and the High 4:2:2 Profile in particular. Assuming that there is no debate over whether 4:2:2 is an absolute requirement for contribution networks, then the focus will shift to the choice between 8- or 10-bit encoding.

Comparison between 8- and 10-bit encoding

An important question when comparing 8-bit with 10-bit video signals is which assessment method to use. Because we are dealing with relatively high picture qualities, objective methods, such as structural similarity and PSNR, are preferable to subjective assessment methods. Subjective assessment tends to become less accurate as picture quality rises because distortions are less obvious.

In order to compare YCbCr PSNR results for 4:2:0 with those for 4:2:2, a method to combine individual PSNR figures for Y, Cb and Cr into a single number has to be defined. Because chrominance distortion is far less visible than luminance distortion, a weighted sum of 0.8Y + 0.1Cr + 0.1Cb is used. The same weighting factors have also been independently proposed, even though it is a relatively arbitrary definition that has not been verified by extensive subjective tests. Therefore, the comparison between 4:2:0 and 4:2:2 has to be interpreted with some caution. Nevertheless, such a definition allows us to investigate the trade-off between chroma resolution and picture distortion on different test sequences.
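
As a minimal sketch of this combination, the following Python computes a per-plane PSNR and then applies the 0.8/0.1/0.1 weighting to the three dB figures. The function names and the flat-list representation of a picture plane are assumptions made for illustration.

```python
import math


def psnr(reference, decoded, bit_depth=8):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(reference) != len(decoded):
        raise ValueError("planes must contain the same number of samples")
    peak = (1 << bit_depth) - 1  # 255 for 8-bit, 1023 for 10-bit
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak * peak / mse)


def weighted_psnr(psnr_y, psnr_cb, psnr_cr):
    """Combine per-plane PSNR figures using the 0.8Y + 0.1Cb + 0.1Cr weighting."""
    return 0.8 * psnr_y + 0.1 * psnr_cb + 0.1 * psnr_cr


if __name__ == "__main__":
    # Hypothetical per-plane results, in dB; the combined figure is about 40.9 dB.
    print(weighted_psnr(40.0, 44.0, 45.0))
```

Because luma carries 80 percent of the weight, a format that improves only chroma moves the combined figure by a relatively small amount.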

Figures 2 and 3 show example PSNR curves for MPEG-4 AVC compression formats at low and high bit rates, respectively. The three formats are 8-bit 4:2:0, 8-bit 4:2:2 and 10-bit 4:2:2. The 10-bit 4:2:0 format was not included in this analysis because it is assumed that 4:2:2 coding will be considered more important for contribution applications than 10-bit coding.
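
To put the three formats in context, the sketch below counts the uncompressed bits in one frame of each. The 1920 × 1080 frame size and the function name are assumptions for illustration; the sample ratios follow from the definitions of 4:2:0 and 4:2:2.

```python
# Combined Cb + Cr sample count expressed as luma_samples // divisor:
# 4:2:0 carries half as many chroma samples as luma samples in total,
# 4:2:2 carries as many chroma samples as luma samples.
CHROMA_DIVISOR = {"4:2:0": 2, "4:2:2": 1}


def bits_per_frame(width, height, chroma_format, bit_depth):
    """Uncompressed bits in one frame (active picture only)."""
    luma_samples = width * height
    chroma_samples = luma_samples // CHROMA_DIVISOR[chroma_format]
    return (luma_samples + chroma_samples) * bit_depth


if __name__ == "__main__":
    for chroma, depth in [("4:2:0", 8), ("4:2:2", 8), ("4:2:2", 10)]:
        print(chroma, depth, bits_per_frame(1920, 1080, chroma, depth))
```

At a given compressed bit rate, the encoder therefore has about a third more source data to represent in 8-bit 4:2:2 than in 8-bit 4:2:0, and a further quarter more in 10-bit 4:2:2.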

It can be seen that 4:2:2 coding improves picture quality compared with 4:2:0 at bit rates above 4Mb/s, whereas, on average, more than 20Mb/s is required before 10-bit 4:2:2 coding pays off. However, for pictures with low spatial complexity, 10-bit coding could improve picture quality at lower bit rates as well. Interestingly, there is no crossover point between 8-bit and 10-bit coding; i.e., 10-bit video codes as efficiently as 8-bit video at low bit rates. Similarly, the crossover point between 4:2:0 and 4:2:2 is at very low bit rates, as has been observed in MPEG-2 systems.

MPEG-4 AVC artifacts

A key difference between MPEG-2 and MPEG-4 AVC is the type of artifact introduced in the compression process. The industry is familiar with MPEG-2 macroblocking effects, which are not present in MPEG-4 AVC because of the use of in-loop de-blocking filters. However, in MPEG-4 AVC, there is a tendency toward posterization or contouring on large plain areas of the same color.

This effect is masked in MPEG-2 by noise introduced by DCT inaccuracy and the mismatch control introduced to compensate for DCT inaccuracy. This has the effect of introducing dither into the picture, which appears as a small amount of noise on the plain areas. The dither in effect masks the boundaries between macroblocks to which human eyes are extremely sensitive.

Posterization is caused by the fact that there is only a luma level difference of one or two (in 8-bit terms) between neighboring blocks, but because the entire macroblock is so plain, the eye is very sensitive to this change. When this data is presented to the in-loop de-blocking filter, there is nothing the filter can do to hide this boundary. However, in the 10-bit case, the in-loop de-blocking filter has two extra bits of precision. Therefore, intermediate levels can be created and hence soften the macroblock boundary. This indicates that certainly for contribution applications, 10-bit encoding will provide considerable benefits in masking the contouring artifacts.
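
The effect of the two extra bits can be illustrated with a simple ramp across a block boundary. This is not the AVC in-loop de-blocking filter; it is a simplified sketch, with hypothetical function names, that only shows which code values are available at each bit depth.

```python
def ramp(left, right, steps):
    """Linear ramp from left to right, rounded to integer code values."""
    span = steps - 1
    return [(left * (span - i) + right * i + span // 2) // span for i in range(steps)]


def largest_step(values):
    """Largest jump between neighboring samples."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    # Two flat blocks one 8-bit level apart: levels 100 and 101.
    ramp_8bit = ramp(100, 101, 5)
    # The same two levels expressed with 10-bit precision: 400 and 404.
    ramp_10bit = ramp(400, 404, 5)
    print(ramp_8bit)   # [100, 100, 101, 101, 101]
    print(ramp_10bit)  # [400, 401, 402, 403, 404]
    # Compare the largest jump in common (10-bit) units.
    print(largest_step(ramp_8bit) * 4, largest_step(ramp_10bit))  # 4 1
```

With 8-bit samples the boundary remains a single abrupt step wherever it is placed, whereas 10-bit samples allow it to be spread over several smaller steps.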

This effect is even more apparent in contribution networks, as the source material is unlikely to have significant levels of noise that could serve to mask the contouring effect.

These results are further confirmed in Figure 4, which summarizes the considerable improvements of MPEG-4 AVC 8- and 10-bit encoding over MPEG-2 for a particular scene that has large areas of a single color.

1080p broadcasting and contribution networks

To date, no broadcaster has announced plans to broadcast in 1080p. There are considerable barriers to doing this, particularly in the management of the uncompressed video. Here it would require studio and networking infrastructure capable of operating in the bandwidth range of 3Gb/s per video channel. Camera and video editing equipment would also need to be suitably upgraded.
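
The 3Gb/s figure can be checked with simple arithmetic. The sketch below assumes 60 frames per second, 10-bit 4:2:2 sampling (20 bits per pixel) and a total raster of 2200 samples by 1125 lines including blanking; none of these parameters is stated above, and the function name is hypothetical.

```python
def uncompressed_rate_bps(samples_per_line, lines, frame_rate, bits_per_pixel):
    """Uncompressed video bit rate in bits per second."""
    return samples_per_line * lines * frame_rate * bits_per_pixel


# 10-bit 4:2:2: one luma sample and, on average, one chroma sample per pixel.
BITS_PER_PIXEL = 20

if __name__ == "__main__":
    # Active picture only (assumed 1920 x 1080 at 60 frames/s).
    active = uncompressed_rate_bps(1920, 1080, 60, BITS_PER_PIXEL)
    # Full raster including blanking (assumed 2200 x 1125 at 60 frames/s).
    total = uncompressed_rate_bps(2200, 1125, 60, BITS_PER_PIXEL)
    print(active / 1e9)  # 2.48832
    print(total / 1e9)   # 2.97
```

Under these assumptions the interface rate is roughly double that of a 1.5Gb/s interlaced HD signal, which is why studio routing and cabling would need upgrading.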

In terms of carrying 1080p compressed video across either direct-to-home (DTH) or contribution networks, implementation is likely to be delayed until the requisite video decoding technology becomes widely available, economically viable and implemented in decoder devices. It is also most likely that it will permeate the DTH market before being used for contribution and distribution, as was the case for MPEG-4 AVC. There are a number of choices to be made prior to broadcasting in 1080p format, including whether to use MPEG-4 scalable video coding (SVC).

The SD case

There are already several thousand SD MPEG-4 AVC channels in operation worldwide commonly using bandwidths from 1.5Mb/s through 3Mb/s. Very few SD MPEG-4 AVC broadcast encoders are optimized to operate above 5Mb/s.

Figure 5 shows that above the 6Mb/s operating point, there are additional gains in bandwidth that can be achieved. Again, there appears to be a point of minimum efficiency for SD at around 6Mb/s.

The important point to take into account here is that this is not because of any particular encoder optimization. Rather, it is a result of the fact that the gain in PSNR is constant, and the gradient of the rate-distortion curve reduces toward higher bit rates. Therefore, constant PSNR gains translate to higher percentage bit rate savings at higher bit rates. This is why MPEG-4 AVC has significant advantages for contribution and distribution applications.
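
This relationship can be illustrated numerically. The sketch below uses a hypothetical saturating rate-distortion curve and an assumed constant 1dB gain; the curve shape, its constants and all function names are invented for illustration and are not measured encoder data.

```python
def psnr_mpeg2(rate_mbps):
    """Hypothetical MPEG-2 rate-distortion curve (illustration only)."""
    return 50.0 - 20.0 / rate_mbps ** 0.5


def psnr_avc(rate_mbps, gain_db=1.0):
    """Assumed AVC curve: the MPEG-2 curve shifted up by a constant gain."""
    return psnr_mpeg2(rate_mbps) + gain_db


def rate_for_psnr(curve, target_db, low=0.01, high=1000.0):
    """Bisection search for the rate at which a rising curve reaches target_db."""
    for _ in range(100):
        mid = (low + high) / 2.0
        if curve(mid) < target_db:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def saving_percent(rate_mbps):
    """Bit rate saved by AVC at the quality MPEG-2 delivers at rate_mbps."""
    target_db = psnr_mpeg2(rate_mbps)
    avc_rate = rate_for_psnr(psnr_avc, target_db)
    return 100.0 * (1.0 - avc_rate / rate_mbps)


if __name__ == "__main__":
    for rate in (4, 16, 64):
        print(rate, round(saving_percent(rate), 1))
```

With this assumed curve, the same 1dB gain is worth roughly 17 percent of the bit rate at 4Mb/s, 31 percent at 16Mb/s and 49 percent at 64Mb/s, because the curve flattens and each decibel costs more bits at the top end.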

It has been shown in the past that MPEG-4 AVC has significant gains over MPEG-2 at lower bit rates, and this analysis shows that the gains continue at higher rates as well. Figure 6 shows that comparative bandwidth gains continue to be available as the bandwidth increases, as indicated in previous charts. It also confirms that the maximum percentage gains are in the very low bandwidth area, that the smallest gains are around 6Mb/s for SD video, and that the absolute bandwidth savings increase at higher bandwidths. Here only standard definition is used to illustrate the principle, although it is similar for high definition.

Summary

From the above information, it can be seen that MPEG-4 AVC is a technology that can be applied to low-bandwidth applications such as DSNG today. In the future, it will provide worthwhile bandwidth savings in higher bit rate contribution applications. Moving to 10-bit encoding for HD applications will also provide significant improvements to video quality at these higher bit rates by removing posterization artifacts. Smaller gains are forecast from the adoption of MPEG-4 AVC 4:2:2, although the gains in chroma quality may still make its use advantageous.

Carl Furgusson is vice president of compression product management at TANDBERG Television.