The eye of the beholder: Wavelet JPEG 2000 vs. DCT MPEG-2

Beauty, the expression goes, is in the eye of the beholder. Perception of quality is relative. Consumers have a set of references about quality. They are forgiving of poor image quality on user-generated content viewed on the Web. They are much less forgiving of poor image quality on dramas, sports or documentaries viewed on subscription SD or HD channels, or on DVD and Blu-ray.

That's why compression image quality is a key business consideration. How much compression should we use? Which compression technology is best? The answer, of course, depends on what's important to you. One current battleground is mastering. When the mastering process moves away from videotape and into the file domain, the choice of compression is hotly debated.

The importance of image quality

Like all businesses, mastering is about producing acceptable quality outputs for an acceptable cost to the business. In the mastering process, the input is source content, which may arrive on rolls of film, compressed videotape, or uncompressed or compressed files. The outputs are typically compressed files like Windows Media, QuickTime, 3GPP or IMX MPEG-2.

Between the inputs and the outputs lies the mastering process. This can include top and tail editing, closed captioning, audio track relaying and other changes to the content to create a master. This master is then used to create the output deliverables. In file-based mastering, it's common to have an intermediate mastering file format. The intermediate format creates a high-quality common storage format between the input formats and the output formats. Using a common storage format means that the mastering process can be independent of the multiple possible input formats.

It's important that the intermediate mastering format can deliver the right levels of quality at an economic data rate. Data costs money. It ties up storage, impacts processing times, clogs up networks and eats distribution bandwidth. So, in many parts of the broadcast industry, the temptation is to compress as much as possible. Digital compression delivers huge time and cost savings, and has allowed newer delivery formats like HDTV, IPTV and mobile to grow at phenomenal rates. But the tipping point with compression is difficult to calculate. Using too much compression starts to lose money, not save it. Too much compression in a high-quality broadcast channel can lead to viewer dissatisfaction with the risk of lost market share. Too much compression in mastering can lead to end client dissatisfaction and a loss of business.

So, which compression should you choose and how much should you use? In file-based mastering today, various flavors of MPEG-2, a discrete cosine transform (DCT)-based compression scheme, are widely used as an intermediate mastering format.

DCT exploits the redundancy inherent in most pictures. It works well if it can predict the value of a pixel from its neighbors. A great deal of redundant information can be removed in that way. Note that DCT usually uses an 8×8 pixel block-based structure, and the calculations take place within each block.
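The block-based idea can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II applied to a single 8×8 block and uses only the Python standard library; the function names (dct_1d, dct_2d, low_frequency_energy) and the sample blocks are hypothetical and are not taken from any MPEG-2 encoder. A real encoder also quantizes and entropy-codes the coefficients, which is not shown here.

```python
import math

N = 8  # block size assumed here, matching the 8x8 blocks described above


def dct_1d(values):
    """Orthonormal DCT-II of a short sequence (illustrative, not optimized)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            values[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # back to [vertical][horizontal]


def low_frequency_energy(coeffs, size=2):
    """Fraction of total energy held by the top-left size x size coefficients."""
    total = sum(c * c for row in coeffs for c in row)
    low = sum(coeffs[u][v] ** 2 for u in range(size) for v in range(size))
    return low / total


# A perfectly flat block: every pixel is predictable from its neighbors.
flat = [[128] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)

# A smooth gradient: still highly redundant, so energy clusters at low frequencies.
gradient = [[100 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
gradient_coeffs = dct_2d(gradient)

print("flat block DC coefficient:", round(flat_coeffs[0][0], 3))
print("gradient low-frequency energy share:", low_frequency_energy(gradient_coeffs))
```

In this sketch, the flat block collapses to a single non-zero (DC) coefficient, and the gradient block keeps almost all of its energy in a handful of low-frequency coefficients. That concentration is what lets a DCT coder discard or coarsely quantize the remaining coefficients.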

Compare the original 1080i HD image in Figure 1 with the same image compressed using MPEG-2 at 20Mb/s, long-GOP encoding. The compressed image is still viewable, but blocking starts to be visible in the inset.

MPEG-2 can work well in many applications and at higher bit rates produces a good result. However, like any other compression scheme, it can be pushed hard enough to show objectionable artifacts once the data rate falls sufficiently low. It is also typically used at 8 bits (not the 10 bits stored on many tape formats and transferred over SDI).

When you increase the compression with MPEG-2, blocks start to appear, and the multigeneration performance deteriorates rapidly. The problem is, blocks can become annoying. Consumers don't like the result, even if they can't articulate why. A recent independent study by Contentinople found that video image quality ranks highly in media companies' consideration of live streaming and download technology. If all else is equal, consumers prefer better quality images, especially if they have invested in an HD television, HD set-top box or Blu-ray player. During live events, sponsors and advertisers want their messages to be legible. Changing between channels or selecting different Web feeds takes no time. So, the tipping point occurs where increasing the amount of compression starts to lose you money. At that point, you lose audience share or advertising revenue.

Alternative compression schemes

There are alternatives to DCT-based MPEG-2 that offer higher quality, such as H.264/AVC-Intra. However, in mastering, wavelet-based JPEG 2000 is gaining ground. Wavelet compression is much less widely used in broadcast and post than DCT, but there are notable exceptions, for example, JPEG 2000 in digital cinema delivery.

Unlike MPEG-2, wavelet compression schemes decompose images into multiple resolution representations by a process that's analogous to filtering and then downsampling. The default condition for HD is to have five levels of JPEG 2000 wavelet transform, each corresponding to a different resolution. The input image is divided up into areas called tiles. Then, the image tiles are decomposed into high and low subbands using the discrete wavelet transform (DWT). The transform is performed spatially in the vertical and horizontal axes with high-pass and low-pass filtering. (See Figure 2.)
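The filter-and-downsample structure can be sketched in a few lines. This is an illustration only: it substitutes the simple Haar filter pair (pairwise averages and differences) for the longer wavelet filters that JPEG 2000 actually specifies, and the function names (haar_1d, haar_2d, decompose) and the 32×32 test image are hypothetical. It assumes image dimensions that stay even at every level.

```python
def haar_1d(values):
    """Low-pass (pair averages) and high-pass (pair differences), each downsampled by 2."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return low, high


def haar_2d(image):
    """One decomposition level: horizontal pass, then vertical pass on each half."""
    low_rows, high_rows = [], []
    for row in image:
        low, high = haar_1d(row)
        low_rows.append(low)
        high_rows.append(high)

    def vertical(half):
        cols = [haar_1d(list(col)) for col in zip(*half)]
        low = [list(r) for r in zip(*[c[0] for c in cols])]
        high = [list(r) for r in zip(*[c[1] for c in cols])]
        return low, high

    ll, lh = vertical(low_rows)   # horizontally low: vertical low / vertical high
    hl, hh = vertical(high_rows)  # horizontally high: vertical low / vertical high
    return ll, hl, lh, hh


def decompose(image, levels):
    """Repeatedly split the LL subband, collecting the detail subbands per level."""
    pyramid = []
    current = image
    for _ in range(levels):
        current, hl, lh, hh = haar_2d(current)
        pyramid.append((hl, lh, hh))
    return current, pyramid


# Hypothetical 32x32 test image, small enough for five levels to reach 1x1.
image = [[(x * y) % 17 for x in range(32)] for y in range(32)]
ll, pyramid = decompose(image, 5)

for level, (hl, lh, hh) in enumerate(pyramid, start=1):
    print("level", level, "detail subband size:", len(hl), "x", len(hl[0]))
print("coarsest LL value:", ll[0][0])
```

Each level halves the resolution of the low-pass (LL) image and sets aside three detail subbands. With this averaging form of Haar, the final 1×1 LL value is simply the mean of the whole test image, and each intermediate LL image is a lower-resolution version of the picture.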

In lossless compression, all the different resolution levels are included in the output JPEG 2000 file. In lossy compression, as the amount of compression increases, some of the resolution levels are discarded. Rather than showing clearly defined DCT blocks, the image progressively softens. The differences are marked. (See Figure 3.)
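The softening effect can be seen in a one-dimensional toy example. The sketch below assumes a single-level Haar split (pairwise averages and differences) rather than the actual JPEG 2000 filters, and it drops the whole detail band, which is a much blunter choice than a real encoder's rate control would make. The helper names and the sample row of pixels are hypothetical.

```python
def haar_split(values):
    """Pairwise averages (low band) and pairwise half-differences (detail band)."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return low, high


def haar_merge(low, high):
    """Inverse of haar_split: rebuild each pair from its average and difference."""
    out = []
    for a, d in zip(low, high):
        out.extend([a + d, a - d])
    return out


# A row of pixels with one hard edge between the third and fourth samples.
row = [0, 0, 0, 255, 255, 255, 255, 255]

low, high = haar_split(row)

lossless = haar_merge(low, high)              # keep everything
softened = haar_merge(low, [0.0] * len(high))  # discard the detail band

print("original :", row)
print("lossless :", lossless)
print("softened :", softened)
```

With the detail band kept, the row is rebuilt exactly. With it discarded, the hard edge becomes a two-sample step through a mid-gray value: the picture loses sharpness, but no block boundaries are introduced.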

While the JPEG 2000 picture is close to the original (some slight softening has occurred), the MPEG-2 picture is clearly showing blocking. If we zoom into the same pictures, the differences are striking.

So, why isn't everyone using JPEG 2000? One potential downside of JPEG 2000 is that it is computationally more complex than MPEG-2. However, it is less computationally complex than some of the more sophisticated and high-quality versions of DCT like AVC-Intra. If you are choosing between MPEG-2 and JPEG 2000 for mastering applications, JPEG 2000 allows big savings in data. As the amount of compression increases, at some point starting around a 20:1 compression ratio, JPEG 2000 will start to deliver much better subjective quality to the viewer, especially because blocking occurs much later than with MPEG-2.
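To put a 20:1 ratio in concrete terms, the arithmetic can be sketched as follows. The source parameters are assumptions chosen for illustration and do not come from the article: a 1920×1080 active picture, 4:2:2 sampling (an average of two samples per pixel), 10 bits per sample and 25 frames per second, with blanking and audio ignored. The function names are hypothetical.

```python
def uncompressed_bits_per_second(width, height, samples_per_pixel, bit_depth, fps):
    """Active-picture video data rate before compression."""
    return width * height * samples_per_pixel * bit_depth * fps


def compressed_bits_per_second(uncompressed_bps, ratio):
    """Data rate after applying an N:1 compression ratio."""
    return uncompressed_bps / ratio


def gigabytes_per_hour(bits_per_second):
    """Storage needed for one hour of material, in decimal gigabytes."""
    return bits_per_second * 3600 / 8 / 1e9


# Assumed source format: 1920x1080, 4:2:2, 10-bit, 25 frames per second.
source_bps = uncompressed_bits_per_second(1920, 1080, 2, 10, 25)
mezzanine_bps = compressed_bits_per_second(source_bps, 20)

print("uncompressed Mb/s:", source_bps / 1e6)
print("20:1 compressed Mb/s:", mezzanine_bps / 1e6)
print("20:1 storage, GB per hour:", gigabytes_per_hour(mezzanine_bps))
```

Under these assumptions, the uncompressed picture runs at about 1037Mb/s, so a 20:1 ratio corresponds to roughly 52Mb/s, or a little over 23GB per hour of material. Different frame rates, bit depths or sampling structures would shift these figures.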

Conclusion

JPEG 2000 is a clear winner over MPEG-2 for many mastering uses, but there are other high-quality compression alternatives that are not wavelet-based, which can also outperform MPEG-2. AVC-Intra is an example. The current debate between JPEG 2000 and AVC-Intra is complex.

Broadly speaking, both easily outperform MPEG-2, but each has its own characteristics. There are arguments about which performs better subjectively, depending on the compression levels used and the kind of subject material. For example, at medium bit rates, do you prefer less detail with no blocks (JPEG 2000), or more detail with some blocking appearing as bit rates drop further (AVC-Intra)?

Whatever your viewpoint, supporters of both JPEG 2000 and AVC-Intra have argued that there are quality benefits to using them as mezzanine (intermediate) formats, especially if your final delivery format is also DCT-based. In mastering applications, there are clear benefits of using JPEG 2000 compared with DCT-based MPEG-2. If quality is your business driver, JPEG 2000 offers more bang for less bucks because of less bits and less blocks.

Bruce Devlin is chief technology officer and Mark Horton is product marketing manager for AmberFin.