Successor to the Compact Disc?

It has now been more than 25 years since the compact disc was introduced to the world. We might remember that there were, and still are, those who disparaged the sound of the CD, preferring the venerable vinyl disc, which has become a collector’s item. The CD was revolutionary, of course, because it carries a digital audio recording. Although digital audio recorders had previously been used professionally, the CD was the first digital medium available to the general public.


The technical characteristics of the CD reflect the state of the technology at the time it was developed. The audio data on the CD is in the form of PCM, or pulse code modulation. In PCM, the amplitude of the analog audio waveform is sampled periodically, and the samples are quantized: each sample is assigned one of a number of specified values, or quantization steps. The number of quantization steps available depends on the bit depth, which is 16 bits in the case of the CD.

Each sample has to be represented by a specific quantization step, but any given sample’s value may not fall precisely upon a step, in which case the step closest to the sample’s value is assigned. The distance between the sample’s actual value and the nearest quantization step is called quantization error, and the aggregate of quantization errors is known as quantization noise.

It can be seen that quantization errors are really signal-related distortion components, not uncorrelated noise. The greater the number of quantization steps available, the closer any given sample is likely to be to a quantization step, so quantization noise falls as bit depth rises: the more bits available, the lower the quantization noise. The bit depth of CD audio is 16 bits, which means that there are 2^16, or 65,536, discrete quantization steps available.
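
The relationship between bit depth and quantization error can be illustrated with a minimal Python sketch. It assumes a uniform quantizer over a normalized range of -1.0 to 1.0; the function names quantize and max_error are hypothetical helpers chosen for illustration and are not taken from any CD specification.

```python
import math

def quantize(sample, bits=16):
    """Round a sample in [-1.0, 1.0) to the nearest of 2**bits uniform steps.

    Returns (step index, reconstructed value, quantization error).
    Assumption: a plain uniform quantizer, used only for illustration.
    """
    levels = 2 ** bits                      # 65,536 steps when bits == 16
    step = 2.0 / levels                     # step size across the full range
    code = round(sample / step)             # index of the nearest step
    code = max(-levels // 2, min(levels // 2 - 1, code))  # clip to valid codes
    value = code * step                     # value the decoder would rebuild
    return code, value, sample - value

def max_error(bits, n=1000):
    """Largest error magnitude over one cycle of a 0.9 full-scale sine wave."""
    return max(
        abs(quantize(0.9 * math.sin(2 * math.pi * k / n), bits)[2])
        for k in range(n)
    )

if __name__ == "__main__":
    print(2 ** 16)                          # 65536 quantization steps
    for bits in (8, 16):
        print(bits, max_error(bits))
```

In this sketch the error never exceeds half a step, and each added bit halves the step size, which is why the 16-bit error comes out far smaller than the 8-bit error.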


PCM audio is subject to the Sampling Theorem, which states that in order to perfectly reconstruct a sampled waveform, more than two samples must be taken per cycle of its highest frequency component. This means that no frequency higher than half the sample frequency may be accurately represented in the reconstructed analog output. In the case of the CD, the sample frequency is 44.1 kHz. Half of 44.1 kHz is 22.05 kHz, and in order to avoid aliases, signals input to the sampler/quantizer must be limited to this frequency or lower.
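
To see why the input must be band-limited, the following short Python sketch computes the frequency at which an unfiltered tone would reappear after sampling. The function name alias_frequency is a hypothetical helper, and the calculation assumes an ideal sampler with no anti-alias filter in front of it.

```python
def alias_frequency(f_in, fs=44_100.0):
    """Apparent frequency (Hz) of a sine at f_in after ideal sampling at fs.

    Assumption: no anti-alias filter, so out-of-band tones fold back.
    """
    f = f_in % fs              # sampling cannot tell f apart from f + k*fs
    return min(f, fs - f)      # fold the result into the range 0 .. fs/2

NYQUIST_HZ = 44_100.0 / 2      # 22,050 Hz for the CD

if __name__ == "__main__":
    print(alias_frequency(10_000.0))   # in band: unchanged
    print(alias_frequency(30_000.0))   # out of band: folds down into the audio band
```

A 30 kHz tone sampled at 44.1 kHz would fold down to 14.1 kHz, well inside the audible range, which is the kind of alias the input filter is there to prevent.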

If all the foregoing conditions are met, the reconstructed analog output waveform of such a system will be a perfect representation of the original analog waveform with some quantization noise added. In addition to the anti-alias filtering required before the analog signals are sampled and quantized, there must be reconstruction filters on the digital-to-analog conversion side to remove the images, which are replicas of the audio spectrum centered on multiples of the sampling frequency.

We note here that the above principles apply to any waveform that is sampled and digitized using PCM, including audio and video. A notable example of PCM-coded video is the ITU 601 component digital SD video with which we are well familiar. In many cases, including HD video, some form of data compression is used, as PCM coding generates an impractical quantity of data.


The developers of the original Compact Disc have now developed the Super Audio Compact Disc, which, instead of PCM, uses a coding technology called Direct Stream Digital or bitstream coding. DSD makes use of sigma-delta 1-bit conversion. Delta modulation has been with us for some time, having been developed in the 1950s. As its name implies, it takes account of the signal’s amplitude delta—the amplitude change between a sample and the sample that precedes it. In essence, the sigma-delta A/D converter is a simple device. It uses a negative feedback loop to accumulate the analog audio waveform over each sample period.

If the amplitude of the accumulated waveform over a given sample period is greater than that of the previous sample period, the converter outputs a “1.” If the amplitude of the accumulated waveform over a given sample period is less than that of the previous sample period, the converter outputs a “0.” This is a kind of pulse density modulation, as positive-going waveforms will produce many 1s in sequence, while negative-going waveforms will produce many 0s in sequence. The sigma-delta decoder performs the inverse operation.
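
The feedback loop can be sketched in a few lines of Python. This is a common textbook first-order sigma-delta modulator, offered as an assumption-laden illustration; it may differ in detail from the description above and is certainly far simpler than any converter used for SACD. The names sigma_delta_encode and decode_mean are hypothetical, and the "decoder" is just an average standing in for a proper low-pass filter.

```python
def sigma_delta_encode(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1] -> list of 0/1 bits.

    Assumption: textbook loop with an ideal integrator and 1-bit quantizer.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus fed-back output
        bit = 1 if integrator >= 0 else 0   # 1-bit quantizer
        feedback = 1.0 if bit else -1.0     # 1-bit DAC in the feedback path
        bits.append(bit)
    return bits

def decode_mean(bits):
    """Crude decoder: average the bitstream and map it back to [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0

if __name__ == "__main__":
    stream = sigma_delta_encode([0.5] * 1000)
    print(sum(stream), decode_mean(stream))
```

In this simplified loop the density of 1s in the bitstream follows the input level, so averaging the bits over a long enough window recovers an approximation of the input.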

It is apparent that if we have but a single bit to express the amplitude change between samples, as opposed to 16 bits for traditional CDs, many more samples must be taken in a given period of time in order to accumulate the requisite information density. In the case of SACD, the sample rate is 64 times the 44.1 kHz sample rate used by CDs, or about 2.822 million samples per second (2.8224 MHz). This yields an analog frequency response up to 100 kHz, and a signal-to-noise ratio of about 120 dB.
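
The arithmetic behind these figures is easy to check. The Python sketch below uses only the numbers quoted above; the variable names are illustrative, and the per-channel bit rates ignore any error correction or packaging overhead on the disc.

```python
CD_RATE_HZ = 44_100        # CD sample rate
CD_BITS = 16               # CD bit depth
DSD_MULTIPLIER = 64        # SACD rate as a multiple of the CD rate

dsd_rate_hz = DSD_MULTIPLIER * CD_RATE_HZ       # one-bit samples per second
cd_bits_per_sec = CD_BITS * CD_RATE_HZ          # raw PCM bits per second, per channel
dsd_bits_per_sec = 1 * dsd_rate_hz              # raw DSD bits per second, per channel
ratio = dsd_bits_per_sec / cd_bits_per_sec      # DSD raw data rate relative to CD

if __name__ == "__main__":
    print(dsd_rate_hz)          # 2822400
    print(cd_bits_per_sec)      # 705600
    print(ratio)                # 4.0
```

So although each DSD sample carries only one bit, the raw data rate per channel works out to four times that of CD audio.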


There has been a general trend in recent years, in consumer equipment of all quality levels, away from the traditional resistor-ladder type of D/A converter and toward the sigma-delta type, not least because it can be made smaller and at lower cost. Many of these are 1-bit sigma-delta converters, which first interpolate and up-sample the 16-bit, 44.1 kHz CD data to 1-bit, 2.8224 MHz data, then use a 1-bit sigma-delta DAC to convert the data back to analog audio.

The advantages cited for these converters, and for the entire DSD A/D-D/A system, are the elimination of the requirement for anti-alias and reconstruction filters, wide frequency response and high signal-to-noise ratio. Nothing is free, however: the penalty exacted by the 1-bit sigma-delta process is a high noise level, although the perceptible noise is reduced by aggressive noise shaping, in which the noise is pushed up above the audible spectrum. It must also be said that newer high-end sigma-delta DACs are increasingly of the multibit variety, frequently in the range of 3 bits.

SACDs, of course, use high-density recording techniques. Conventional CDs contain about 750 MB of data, while SACDs, like their high-density relatives, DVDs, can contain about 4.7 GB of data. SACDs on the market today are typically hybrid discs. They contain a layer of conventional CD data and also a layer of high-density data, so that they may be played on conventional CD players as well as SACD players. The SACD signals can contain up to six audio channels, as opposed to the two-channel capability of CD signals.

There are those who ecstatically praise the sound of SACD audio as a tremendous advance over CD audio. For a different technical opinion, the reader is referred to Audio Engineering Society Convention paper 5395, “Why 1-Bit Sigma-Delta Conversion is Unsuitable for High-Quality Applications,” by Stanley Lipshitz and John Vanderkooy. It was presented in 2001 at the 110th AES Convention in Amsterdam.

It is well known that in the digital process, the addition of a small amount of uncorrelated noise, called dither, can reduce or eliminate correlated quantization distortion components from the signal, leaving a benign floor of uncorrelated noise. Lipshitz and Vanderkooy’s contention is that 1-bit sigma-delta converters are in principle not perfectible, because when they are properly dithered they are operating in constant overload. Multibit sigma-delta converters, by contrast, are infinitely perfectible, as they do not have the 1-bit headroom constraint.
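
The general effect of dither on a multibit quantizer can be shown with a small Python experiment. This is only a sketch under stated assumptions: the quantizer step is normalized to 1.0, the dither is triangular (the sum of two uniform random values, a common textbook choice), and the names quantize_to_step and average_output are hypothetical. It illustrates the principle of dither in general and does not model a 1-bit sigma-delta converter or reproduce the paper's analysis.

```python
import random

def quantize_to_step(x):
    """Round to the nearest integer step (one step = 1.0)."""
    return float(round(x))

def average_output(level, n=20000, dither=False, seed=0):
    """Average quantizer output for a constant input of `level` steps.

    Assumption: triangular dither spanning +/- 1 step, generated as the
    sum of two uniform values; the seed is fixed so runs are repeatable.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
        total += quantize_to_step(level + d)
    return total / n

if __name__ == "__main__":
    print(average_output(0.3, dither=False))   # 0.0: the signal is lost entirely
    print(average_output(0.3, dither=True))    # close to 0.3 on average
```

Without dither, an input of 0.3 of a step always rounds to zero, an error that depends entirely on the signal. With dither, individual outputs are noisy but their average tracks the input, which is the trade of correlated distortion for an uncorrelated noise floor described above.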