Encoding technology

The last decade has seen an estimated 40-fold increase in processing power in commercial, off-the-shelf silicon. Approaches to coding MPEG-2 once considered economically unviable are now within the realms of possibility. After years of marginal gains, the digital TV industry could see a revolution in MPEG-2 compression efficiency, providing a much-needed economic enabler for new service delivery.

MPEG-4 AVC (H.264) was brought to market as the successor to MPEG-2, and, given its ability to revolutionize satellite HDTV business models and get telcos into the TV business, the new codec has enjoyed center stage since 2005. The last three years delivered a total efficiency improvement of around 5 percent in MPEG-2 as encoder vendors worked on techniques such as better noise reduction and coding optimizations. With diminishing efficiency returns for MPEG-2 development, work on this codec has tailed off, even though the number of MPEG-2-based SD services and viewers has continued to increase.

Given the number of MPEG-2 set-top boxes in the field, it has become apparent during this three-year hiatus that MPEG-4 AVC will continue to complement rather than replace MPEG-2. We can expect at least another decade of coexistence. As an alternative to increasing transmission OPEX, or to costly attempts to recover valuable network bandwidth, digital TV pioneers with large MPEG-2 set-top box populations need another dramatic shift in MPEG-2 compression efficiency to expand their service offerings within their existing network capacity. Service expansion could mean adding more SD channels or squeezing SD bit rates to make room for HDTV, on-demand services or faster broadband. The digital TV industry would welcome a new generation of MPEG-2 if it meant offering new services within today’s transmission OPEX.

By contrast with the roughly 5 percent gained over the last three years, the next three years could deliver a 15-30 percent efficiency boost. This achievement relies on three main factors: a quantum leap in silicon processing power per dollar, making advanced techniques affordable within MPEG-2 SD business models; transferring lessons learned in MPEG-4 AVC into MPEG-2; and fresh approaches to MPEG-2 coding implementations.

Rate optimization

Rate optimization (RO) and rate-distortion optimization (RDO) will be at the heart of next-generation MPEG-2. RO/RDO has its roots in the early 1990s, but the concept was developed further during the evolution of MPEG-4 AVC as a practical way to handle that codec’s much larger set of compression modes within the cost-constrained processing power available. The technique has never been applied seriously to MPEG-2 because of its high processing demands.

In all real-time video encoding, encoders have a choice of tools they can use to compress the content, and each approach will deliver a different bit rate and visual result for any given macroblock. Today’s MPEG-2 encoders have limited processing resources available, so for real-time processing, the encoders have to make intelligent predictions about which tools will deliver the best compression result for any macroblock. These “mode decisions” are part of the “secret sauce” that makes one vendor’s implementation different from another.

In the future, it will be possible to encode each macroblock using every encoding tool combination in parallel. Encoders will then determine which method worked best, either in terms of bit rate cost (an estimative approach found in rate optimization) or by measuring bit rate combined with total visual distortion from the source video signal (an exhaustive approach found in rate-distortion optimization). The encoder then selects the best of the parallel encodes and uses it for the macroblock. This process is repeated for every macroblock.

Exhaustive macroblock-level RDO allows the encoder to look deeper into the MPEG-2 compression algorithm than RO, so it knows exactly what the macroblock will look like after encoding without any degree of estimation error. In effect, the encoder pretends that it is going to take each of the parallel encode results as the final encode. The built-in decoder decodes the macroblock for every parallel encode, in each case providing a true picture of what the video will look like, including its visual qualities as well as bit rate.
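The selection step described above can be illustrated with a minimal sketch. It assumes that each parallel trial encode reports the bits it spent and the distortion measured after the built-in decode, and that the two are combined in a Lagrangian cost (distortion plus a weight, here called lam, times bits). That cost function is a common textbook choice, not one the article specifies, and the mode names and numbers are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrialEncode:
    mode: str          # hypothetical label for a tool combination
    bits: int          # bits spent by this trial encode
    distortion: float  # error vs. the source after decoding (e.g. squared error)


def pick_mode_ro(trials):
    """Rate optimization: keep the trial with the lowest bit cost."""
    return min(trials, key=lambda t: t.bits)


def pick_mode_rdo(trials, lam):
    """Rate-distortion optimization: keep the trial with the lowest
    combined cost, distortion + lam * bits (assumed Lagrangian form)."""
    return min(trials, key=lambda t: t.distortion + lam * t.bits)


# Made-up results for one macroblock, one entry per parallel trial encode.
trials = [
    TrialEncode("intra", bits=420, distortion=90.0),
    TrialEncode("inter_frame_pred", bits=150, distortion=260.0),
    TrialEncode("inter_field_pred", bits=210, distortion=140.0),
]

if __name__ == "__main__":
    print("RO choice:", pick_mode_ro(trials).mode)
    for lam in (0.2, 1.0, 5.0):
        print("RDO choice at lam =", lam, ":", pick_mode_rdo(trials, lam).mode)
```

In this sketch, a small lam favors the lowest-distortion trial and a large lam favors the cheapest one, which is one way to picture the quality versus bit rate balance the encoder must strike for every macroblock.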

Even with rate-distortion optimization, there has to be a trade-off between visual quality and bit rate for each macroblock. A good encoder will need to find the optimum balance and make the right judgment consistently, macroblock after macroblock. Simply throwing processing power at MPEG-2 will not deliver the maximum potential gains that digital TV providers need to justify encoder swap-outs.

Other enhancements that could now be used to develop a new generation of MPEG-2 compression include:

More powerful look-ahead functionality. Increasing processing power will support multiple encodes at the look-ahead stage, the point at which incoming video is analyzed to optimize the final encode. Early look-ahead implementations only provided a bit rate demand calculation, but with more information available for the final encode, any new generation of MPEG-2 encoders will be able to make better predictions. This will result in improved rate control and will help determine exactly what quantization is needed to achieve optimum picture quality for each frame.
Two-stage motion estimation. Improved look-ahead would provide a first pass of motion estimation, and the resulting motion information could then be refined in time for a second-stage motion estimation during the final encode.
Preprocessing and forward analysis scene cut detection. Any new generation of MPEG-2 encoders should also deliver improved flash detection and fade detection, such as fade to black, so that they handle significant color changes.
Adaptive GOP structures and adaptive GOP length. Techniques commonly used on MPEG-4 AVC encoders, adaptive GOP structure and length give the encoder more freedom to match the frame type to the content, helping to reduce bit rate requirements without sacrificing quality.
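As a rough illustration of how scene cut detection and adaptive GOP length can work together, the sketch below starts a new GOP whenever the average luma difference between consecutive frames exceeds a threshold, or when the GOP reaches a maximum length. The difference metric, the threshold, the maximum length and the function names are all assumptions made for the example; B-frames and flash and fade handling are left out for brevity, and a real encoder would use far richer look-ahead analysis.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute luma difference between two equally sized frames,
    each given as a flat list of luma samples."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def assign_frame_types(frames, cut_threshold=40.0, max_gop=15):
    """Return 'I' or 'P' for each frame.

    An I-frame opens a new GOP at the first frame, at a detected scene cut,
    or once the current GOP already holds max_gop frames.
    """
    types = []
    frames_in_gop = 0
    for i, frame in enumerate(frames):
        is_cut = i > 0 and mean_abs_diff(frame, frames[i - 1]) > cut_threshold
        if i == 0 or is_cut or frames_in_gop >= max_gop:
            types.append("I")
            frames_in_gop = 1
        else:
            types.append("P")
            frames_in_gop += 1
    return types


if __name__ == "__main__":
    dark = [10] * 4     # tiny stand-in for a dark frame
    bright = [200] * 4  # tiny stand-in for a bright frame after a cut
    print("".join(assign_frame_types([dark, dark, dark, bright, bright])))
```

With a fixed GOP length the cut would often land mid-GOP and be coded as an expensive predicted frame; letting the GOP boundary follow the content is the freedom the adaptive approach is meant to provide.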

These various techniques, combined with rate-distortion optimization, provide the basis for a complete re-evaluation of MPEG-2 compression. Tests using simulation models show that a 15 percent improvement in MPEG-2 compression efficiency is realistic. All of these techniques are compatible with the MPEG-2 specification, guaranteeing interoperability with legacy SD MPEG-2 set-top boxes.

Figure 1 shows PSNR results for broadcast content, comparing today’s typical MPEG-2 performance with the potential performance using the new techniques described. At 8Mb/s, a scenario typical for public broadcast channels in Europe, a bit rate reduction of 15 percent can be seen for equivalent picture quality. At lower operating points such as 3.5Mb/s, a scenario typical for DTH statistically multiplexed channels, savings of 20 percent can be seen.
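To put those percentages in capacity terms, the short calculation below applies the quoted savings to the two operating points and counts how many channels would fit in a fixed multiplex. The 38Mb/s capacity is an illustrative assumption, not a figure from the tests, and treating every channel as a fixed-rate stream ignores the way statistical multiplexing shares bandwidth in practice.

```python
import math


def reduced_rate(rate_mbps, saving):
    """Bit rate after applying a fractional saving (0.15 means 15 percent)."""
    return rate_mbps * (1.0 - saving)


def channels_that_fit(capacity_mbps, rate_mbps):
    """Whole number of fixed-rate channels that fit in the given capacity."""
    return math.floor(capacity_mbps / rate_mbps)


CAPACITY_MBPS = 38.0  # assumed multiplex capacity, for illustration only

if __name__ == "__main__":
    for label, rate, saving in [("8Mb/s service", 8.0, 0.15),
                                ("3.5Mb/s service", 3.5, 0.20)]:
        new_rate = reduced_rate(rate, saving)
        print(label,
              "| new rate (Mb/s):", round(new_rate, 2),
              "| channels before:", channels_that_fit(CAPACITY_MBPS, rate),
              "| channels after:", channels_that_fit(CAPACITY_MBPS, new_rate))
```

Under these simplified assumptions, the 3.5Mb/s case drops to 2.8Mb/s per channel, and the hypothetical multiplex goes from 10 channels to 13.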

These tests show that the digital TV industry can now look forward confidently to a change in MPEG-2 performance. At the very least, platform operators can expect more SD channels in the same bandwidth without reducing picture quality on existing services.

Improved MPEG-2 encoding will provide the means to harvest bandwidth from SD services to enable “bandwidth-free” introduction of HDTV on satellite; HD, VOD and faster broadband over cable; or expanded SD (and possibly HD) on digital terrestrial.

Carl Furgusson is VP of product management at TANDBERG Television.