Demystifying AES3 Digital Audio

PART 1
We casually speak of digital audio, often referring to it as AES audio, perhaps AES3 (for the number of the AES standard), or by an older term AES/EBU. But what exactly is this AES3 signal?

Here’s the summary, with explanations following: AES3 is a serial digital bit stream containing audio that has been digitized using the technique of linear pulse code modulation. Each digital audio sample word is represented in linear two’s complement binary form. Two audio samples are time-division multiplexed to make up a frame, with each audio sample preceded and followed by specialty bits of data. The digital data in the frames (except the preamble bits) are represented in a bi-phase mark format.

Now for the specifics. These may be basic to some, a review for others, or an introduction, but a good understanding will help in designing properly functioning digital audio systems, in performing digital audio measurements, and in troubleshooting when things go wrong.

Let’s start with pulse code modulation (PCM). PCM is what you might think digitization is all about. With PCM, the amplitude of an analog audio signal is sampled at a fixed rate, that is, at regular intervals of time. For TV and video applications, the most common sampling rate is 48 kHz. Here samples are taken every 1/48,000 of a second, or about every 20.8 µs.

Other common sampling rates for digital audio in general are 32 kHz, 44.1 kHz (used for compact discs), 96 kHz and 192 kHz.
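As a quick check of the arithmetic, the interval between samples is simply the reciprocal of the sampling rate. The short Python sketch below works it out for the rates just mentioned; the function name is made up for illustration.

```python
def sample_period_us(sample_rate_hz):
    """Return the time between successive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz

# Sampling rates named in the text, in Hz.
for rate in (32_000, 44_100, 48_000, 96_000, 192_000):
    print(f"{rate / 1000:g} kHz -> one sample every {sample_period_us(rate):.2f} us")
```

At 48 kHz this gives the figure quoted above, roughly 20.8 µs per sample.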

Each sample is assigned a discrete binary number (digital word) whose value is the closest available match to the sampled analog value. How many discrete values are available is determined by how many bits are used to describe the signal. This number of bits is also called the digital word length or size of the digital word. At this point, we’re not going to delve further into the inner workings of analog-to-digital converters, but rather save that for future discussions.

The greater the number of bits, the more quantization levels (discrete amplitude levels) are defined. The step between each level is smaller, allowing a better approximation of the actual sampled signal and reducing the error between the actual level and the quantized level. This error manifests itself as noise, and with a longer word length the digital noise floor is lower, allowing greater dynamic range and better digital representation of low-level signals.
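To put rough numbers on that, here is a minimal Python sketch. It assumes the full-scale signal is normalized to run from -1.0 to +1.0 (a peak-to-peak span of 2.0) and that the converter rounds to the nearest level; both are illustrative assumptions, and the function name is hypothetical.

```python
def quantization_step(bits, full_scale=2.0):
    """Size of one quantization step for a signal spanning `full_scale` peak to peak."""
    return full_scale / (2 ** bits)

for bits in (16, 20, 24):
    step = quantization_step(bits)
    # With rounding to the nearest level, the error is at most half a step.
    print(f"{bits} bits: step = {step:.3e}, worst-case error = {step / 2:.3e}")
```

Each added bit halves the step size, and with it the worst-case rounding error.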

MASTERING DIGITAL

Common word lengths for digital audio are 16, 20, and 24 bits. Compact discs use 16 bits. Many ENG-type digital camcorders with digital audio also use 16 bits. Mastering digital VTRs tend to use 20 bits or even 24 bits for digital audio.

For mastering audio-only recorders, 24 bits is the norm, but even for portable digital audio recorders, 24-bit capability is becoming much more prevalent, with provisions for the user to choose a shorter word length if it suits the recording or storage requirements. Many portables even offer top sampling rates of 96 kHz or 192 kHz. As with anything digital, a higher sampling rate and a greater number of bits mean more data that has to be processed or stored.

Table 1: Quantization levels, two’s complement range, and maximum theoretical signal to noise ratio for given digital audio word lengths.

Table 1 shows the total number of quantization levels for 16, 20 and 24 bits, the range for positive and negative values when expressed in two’s complement, and the maximum theoretical signal to noise ratio for each word length.
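Figures of the kind Table 1 lists can be worked out directly from the word length. The Python sketch below is one way to do it. The signal to noise line assumes the widely used approximation of 6.02N + 1.76 dB for a full-scale sine wave, which may not be the exact formula behind the published table, and the function name is hypothetical.

```python
def word_length_summary(bits):
    """Return (levels, most negative code, most positive code, approx. SNR in dB)."""
    levels = 2 ** bits                       # total number of quantization levels
    most_negative = -(2 ** (bits - 1))       # two's complement lower limit
    most_positive = 2 ** (bits - 1) - 1      # two's complement upper limit
    snr_db = 6.02 * bits + 1.76              # assumed full-scale sine approximation
    return levels, most_negative, most_positive, snr_db

for bits in (16, 20, 24):
    levels, lo, hi, snr = word_length_summary(bits)
    print(f"{bits} bits: {levels:,} levels, range {lo:,} to {hi:,}, about {snr:.1f} dB")
```

For 16 bits this gives 65,536 levels and a range of -32,768 to 32,767.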

If you’re familiar with digital video, you’ll notice that digital audio requires many more bits to produce an accurate, high-quality representation of the analog audio signal. Digital video, by comparison, generally requires only 8 bits or, now more often, 10 bits.

When we say that AES3 uses linear PCM, it means that the digital values correspond directly to the actual amplitudes of the analog signal and not, for instance, to their logarithms. Linear PCM is considered a lossless encoding scheme, in that it uses no bit-rate reduction techniques such as those in MP3 or Advanced Audio Coding (AAC). Linear PCM is commonly used to represent digital audio in computer WAV files.

In summary, pulse code modulation involves sampling the analog signal, quantizing the value of the sample, and creating a fixed bit length digital word for each digitized sample.
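Those three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a function of time with values between -1.0 and +1.0, full scale is mapped to the largest positive code, and the names `pcm_encode` and `tone` are invented for the example. Real converters differ in how they scale, round and dither.

```python
import math

def pcm_encode(signal, sample_rate_hz, duration_s, bits):
    """Sample `signal(t)` (values in -1.0..+1.0) and quantize to signed integer codes."""
    max_code = 2 ** (bits - 1) - 1
    min_code = -(2 ** (bits - 1))
    n_samples = round(sample_rate_hz * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # sampling: regular time intervals
        code = round(signal(t) * max_code)        # quantizing: nearest integer level
        codes.append(max(min_code, min(max_code, code)))  # keep within the word's range
    return codes

def tone(t):
    """A 1 kHz sine wave at half of full scale."""
    return 0.5 * math.sin(2 * math.pi * 1000 * t)

# One millisecond of audio at 48 kHz, 16-bit words.
codes = pcm_encode(tone, 48_000, 0.001, 16)
print(len(codes), "samples; first few codes:", codes[:4])
```

Each entry in `codes` is one fixed-length digital word, ready to be written out in binary.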

IT’S IN THE CODING

The digital word is made up of a series of ones and zeroes that together add up (in base-2 arithmetic) to the particular quantized value.

For example, the 4-bit digital number 0101 equals the decimal number 5: [(0 × 2^3) + (1 × 2^2) + (0 × 2^1) + (1 × 2^0)] or [(0 × 8) + (1 × 4) + (0 × 2) + (1 × 1)] = 5.
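The same sum of weighted bits can be written as a short Python function. The name `unsigned_value` is just for illustration; Python's built-in `int(text, 2)` does the same job.

```python
def unsigned_value(bit_string):
    """Add up the power-of-two weights of the bits that are set."""
    total = 0
    # The right-most bit has weight 2^0, the next 2^1, and so on.
    for position, bit in enumerate(reversed(bit_string)):
        total += int(bit) * 2 ** position
    return total

print(unsigned_value("0101"))  # 5
```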

But with only ones and zeroes to work with, how can you represent negative numbers in digital form and in a form that makes it easy to perform binary arithmetic computations? That’s where two’s complement binary comes in.

Using this coding, half of the available code values are used for non-negative numbers (zero is included in this group) and the other half for negative numbers. The negative numbers in digital form all start with 1, while zero and the positive numbers start with 0. For an N-bit word, the range of integers represented in two’s complement is -2^(N-1) to 2^(N-1) - 1.

Conceptually, here’s the procedure to change from a positive to a negative binary number. For an example, let’s use the 4-bit number 0101 (decimal 5).

First invert the bits; in other words, change ones to zeros and zeros to ones. This now gives us 1010. Next add one to this number using binary arithmetic. The final result is 1011, which equals -5 in decimal. Notice that the leading 1 (the left-most, or most significant, bit) indicates that this is a negative number. Checking the computation of the decimal value: [-(1 × 2^3) + (0 × 2^2) + (1 × 2^1) + (1 × 2^0)] or [-(1 × 8) + (0 × 4) + (1 × 2) + (1 × 1)] = -5.
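The invert-and-add-one procedure, and the check with a negative weight on the most significant bit, can be sketched in Python as follows. Both function names are hypothetical.

```python
def negate(bit_string):
    """Two's complement negation: invert every bit, then add one."""
    bits = len(bit_string)
    mask = (1 << bits) - 1                    # e.g. 1111 for a 4-bit word
    inverted = int(bit_string, 2) ^ mask      # step 1: flip ones and zeros
    result = (inverted + 1) & mask            # step 2: add one, drop any carry out
    return format(result, f"0{bits}b")

def signed_value(bit_string):
    """Decode a two's complement word; the left-most bit carries a negative weight."""
    value = int(bit_string, 2)
    if bit_string[0] == "1":
        value -= 1 << len(bit_string)         # same as weighting the top bit by -2^(N-1)
    return value

print(negate("0101"), signed_value(negate("0101")))  # 1011 -5
```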

You could do this process again to get back the positive number. The only time the negative to positive conversion doesn’t work is for the most negative number. (If you go through the procedure, the most negative number comes back as itself, because its positive counterpart falls just outside the representable range.)
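A short Python check shows both the round trip and the exception. It uses a compact, illustrative helper (`twos_negate`, not a standard function) and searches every 4-bit pattern for ones that negate to themselves.

```python
def twos_negate(bit_string):
    """Invert the bits and add one, keeping the original word length."""
    bits = len(bit_string)
    mask = (1 << bits) - 1
    return format(((int(bit_string, 2) ^ mask) + 1) & mask, f"0{bits}b")

print(twos_negate("1011"))  # 0101: -5 goes back to +5
print(twos_negate("1000"))  # 1000: -8 has no +8 to go to in 4 bits

# Which 4-bit patterns are unchanged by negation?
unchanged = [format(n, "04b") for n in range(16)
             if twos_negate(format(n, "04b")) == format(n, "04b")]
print(unchanged)  # ['0000', '1000']
```

Zero also maps to itself, which is correct because negative zero is zero. Only the most negative number, 1000 (decimal -8), is a true exception.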

We’ve now covered pulse code modulation, linear PCM, and two’s complement binary arithmetic, in this brief presentation. We’ll consider more aspects of AES3 next time.