We all know that compression causes artifacts, but how much do you know about what causes them and how to reduce them? Although artifacts may seem to occur at random, they are directly linked to specific parts of the compression process.
Transform coding consists of several cascaded processes. The two critical elements leading to artifacts are the discrete cosine transform (DCT) and quantization. (See Figure 1.)
The DCT takes the amplitude values of the 2-D, spatially ordered pixels (usually 64 in an MPEG block) and applies a mathematical algorithm that transforms the array into one that represents the frequency components of the block. The DCT can be considered an ordered set of weighted basis functions, or elementary functions, which, when taken in combination, can replicate any image. (See Figure 2.)
These DCT coefficients are quantized, i.e., converted to a lower precision value, and then transmitted in the compressed bit stream. This process lowers the number of bits needed to reproduce the image. It is this same process that causes many compression artifacts.
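The transform and quantization steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any real encoder: it uses an orthonormal DCT-II on one 8 x 8 block, a single uniform quantizer step for every coefficient (real MPEG encoders use a quantization matrix and a scale factor), and a made-up gradient as the test block. The names `basis`, `dct2`, `idct2` and `quantize` are hypothetical.

```python
import math

N = 8  # block edge length; 8 x 8 = 64 pixels, as in an MPEG block


def basis(k, n):
    """Value of the k-th 1-D cosine basis function at sample n (orthonormal DCT-II)."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT: spatial pixels -> frequency coefficients."""
    return [[sum(block[x][y] * basis(u, x) * basis(v, y)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2-D DCT: a weighted sum of basis functions -> spatial pixels."""
    return [[sum(coeffs[u][v] * basis(u, x) * basis(v, y)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Uniform quantization: round each coefficient to the nearest multiple of step."""
    return [[step * round(c / step) for c in row] for row in coeffs]


def max_error(a, b):
    """Largest absolute pixel difference between two blocks."""
    return max(abs(a[x][y] - b[x][y]) for x in range(N) for y in range(N))


# Hypothetical test block: a gentle diagonal gradient
block = [[100 + 2 * x + 3 * y for y in range(N)] for x in range(N)]
coeffs = dct2(block)

exact = idct2(coeffs)                 # no quantization: the round trip is lossless
lossy = idct2(quantize(coeffs, 24))   # coarse quantization (assumed step of 24)

max_err_exact = max_error(exact, block)   # only floating-point rounding noise
max_err_lossy = max_error(lossy, block)   # real pixel errors caused by quantization
```

Without quantization the block is recovered to within rounding noise; the loss appears only once the coefficients are rounded, which is the point the article makes about where artifacts originate.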
When the coarseness of the quantization is high enough, the normally invisible edges of the blocks will become noticeable. This is due to an inherent property of the DCT. Each coefficient affects the entire block when it is converted back into spatial pixels.
Thus, the error created by quantizing a given DCT coefficient is dispersed over the entire block. If a neighboring block with similar image components has a different error, the adjoining edges of the two blocks will show a level transition that was not present in the original image. Figure 3 shows a heavily compressed image to exaggerate the effects.
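The block-edge mechanism described above can be reproduced with a small experiment. The sketch below is a simplified illustration, not an actual codec: it works on one row of samples instead of a 2-D block, uses a single uniform quantizer step, and the ramp values and the step of 40 are assumed purely for demonstration. The function names are hypothetical.

```python
import math

N = 8  # samples per block (one row of an 8 x 8 block, for simplicity)


def basis(k, n):
    """Value of the k-th orthonormal DCT-II basis function at sample n."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def codec(samples, step):
    """DCT -> uniform quantization -> inverse DCT for one block of N samples."""
    coeffs = [sum(samples[n] * basis(k, n) for n in range(N)) for k in range(N)]
    quant = [step * round(c / step) for c in coeffs]
    return [sum(quant[k] * basis(k, n) for k in range(N)) for n in range(N)]


# Hypothetical source: a smooth ramp spanning two adjacent blocks
ramp = [100 + 1.5 * i for i in range(2 * N)]
left, right = ramp[:N], ramp[N:]

STEP = 40  # deliberately coarse quantizer step (an assumed value)
decoded = codec(left, STEP) + codec(right, STEP)  # each block coded independently

# Level change across the block boundary, before and after coding
original_jump = ramp[N] - ramp[N - 1]        # 1.5: just part of the smooth ramp
decoded_jump = decoded[N] - decoded[N - 1]   # roughly 14: a visible block edge

# Largest level change between neighbors inside the left block after coding
inside_left = max(abs(decoded[i + 1] - decoded[i]) for i in range(N - 1))
```

With this step size every AC coefficient rounds to zero, so each decoded block collapses to a flat level. The two blocks land on different levels, and the gentle slope of the source turns into a single abrupt transition at the boundary.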
Once an artifact is created, attempts to remove it with a conventional filter at the decoder will blur the picture and reduce sharpness. However, sophisticated deblocking filters can improve the image with less loss of detail. MPEG-4/AVC includes a signal-adaptive deblocking filter that operates within the encoding error loop and provides considerable improvement in blockiness reduction.
In addition to block-edge visibility, a coarsely quantized block can actually cause the basis functions to appear. A highly detailed part of the image will tend to mask errors in coding. This is a normal part of the psychovisual process. Areas with flat contrast show errors more readily than those with more detail.
However, when the quantization is coarse enough, the array of DCT coefficients will not be reproduced faithfully. Thus, there can be instances where a particular basis function is stronger than that from the image source. In such a case, you can actually see the basis functions. (See Figure 4.)
Bugs in the picture?
Another artifact caused by the DCT is mosquito noise. Named for its appearance, it arises when a complex part of a block looks normal because the busyness of the image masks the errors, while a less busy part of the same block shows the basis functions more visibly. (See Figure 4.)
Because images are inherently complex, encoders cannot always choose the best quantization for every element of a picture. Thus, a block that contains different kinds of detail may include some elements that were reproduced well and some that were not, hence the mosquito effect.
Another consequence of heavy compression is contouring. This is where normally smooth gradations in luminance or chrominance levels take on a step-like appearance.
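The step-like appearance can be imitated with a very simple experiment. Note the simplification: a real codec quantizes transform coefficients, not pixel levels, so the sketch below is only an analogy that quantizes an 8-bit ramp directly. The function name and the step of 32 are assumptions chosen for illustration.

```python
def quantize_levels(values, step):
    """Map each value to the bottom of its quantization interval."""
    return [step * (v // step) for v in values]


gradient = list(range(256))             # smooth 8-bit ramp: 0, 1, 2, ... 255
coarse = quantize_levels(gradient, 32)  # assumed step of 32 levels

levels = sorted(set(coarse))            # the distinct levels that survive

# Largest change between neighboring samples, before and after quantization
largest_step_before = max(b - a for a, b in zip(gradient, gradient[1:]))
largest_step_after = max(b - a for a, b in zip(coarse, coarse[1:]))
```

The 256-level ramp is reduced to eight flat bands, and the largest change between neighboring samples grows from 1 level to 32. On a display, those jumps are the contour lines.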
Finally, MPEG and similar codecs all use predictive coding, where some frames are predicted from less-frequent anchor frames (I-frames). Any strong artifacts in the I-frames are usually propagated in time and can change on a GOP-by-GOP basis, causing a noticeable pumping in the video.
Audio compression's limits
Lossy audio coding produces artifacts as well. The MPEG, Dolby and other codecs rely on psychoacoustic masking. The production of artifacts is modeled at the encoder, and the amount of quantization is chosen to minimize the likelihood that those artifacts will be heard. This model relies on certain characteristics of the human auditory system, whereby quiet sounds are masked by louder sounds, especially when those sounds are within critical bands.
In order to analyze the audio in these critical bands, audio codecs use either the modified discrete cosine transform (MDCT) or sub-band filters. Quantization of the compressed audio samples will then give rise to various artifacts.
However, because of the complementary synthesis transform (or sub-band filter) used in the decoder, artifacts tend to remain within the critical bands. Both systems also must rely on sliding time windows in which to perform the various compression functions. These windows can be responsible for a certain amount of artifact production, as transient phenomena get smeared in time.
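The time smearing of transients can be demonstrated with a short sketch. It rests on several simplifying assumptions: a plain DCT-II over one 16-sample window stands in for the MDCT with overlapping windows, a single uniform step is used instead of a psychoacoustically driven bit allocation, and the signal is an idealized click. The window size, click position, amplitude and step are all invented for the example.

```python
import math

N = 16  # samples in one transform window (assumed size)


def basis(k, n):
    """Value of the k-th orthonormal DCT-II basis function at sample n."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def forward(samples):
    """Time samples -> transform coefficients."""
    return [sum(samples[n] * basis(k, n) for n in range(N)) for k in range(N)]


def inverse(coeffs):
    """Transform coefficients -> time samples."""
    return [sum(coeffs[k] * basis(k, n) for k in range(N)) for n in range(N)]


# Idealized transient: silence, then a single click late in the window
ATTACK = 12
signal = [0.0] * N
signal[ATTACK] = 1000.0

STEP = 200.0  # coarse quantizer step (an assumed value)
coeffs = forward(signal)
quant = [STEP * round(c / STEP) for c in coeffs]
decoded = inverse(quant)

# The source is silent before the click; the decoded signal is not
pre_echo = max(abs(s) for s in decoded[:ATTACK])

# Quantization error energy, measured in both domains
error_time = sum((d - s) ** 2 for d, s in zip(decoded, signal))
error_freq = sum((q - c) ** 2 for q, c in zip(quant, coeffs))
```

Because every basis function spans the whole window, the quantization error is spread over all 16 samples, including those that precede the click. That leakage ahead of the attack is what is heard as pre-echo, and the two error energies agree because the transform is orthonormal.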
Among the various artifacts that have been ascribed to audio compression are ringing or pre-echo, warbling, metallic or underwater sounds, transient dropouts, and smearing or sizzling. Perhaps most revealing are sounds with percussive attacks and decays, such as those produced by castanets, where the onset of the sound can lose its sharp attack. Another example is the atonal sound produced by cymbals, where the harmonics take on a swishy sound. Simple sounds approximating pure tones are also useful as critical listening material.
Quality is key
So how can you minimize these artifacts? There are two fundamental rules. First, use the best encoder you can afford. Second, use the highest practical bit rate. Some characteristics of the encoder may be adjustable, too.
Always base your analysis on both typical and critical video; the latter is readily available on test disks. Now that you know what to look (or listen) for, comparisons should be easier to make.
Aldo Cugnini is a consultant in the digital television industry.
Send questions and comments to: firstname.lastname@example.org