Multichannel audio Part 2

The last tutorial covered some of the history of multichannel sound and recording 5.1 audio, as well as embedded audio and MADI transport systems for multichannel audio. This tutorial will continue with AES50 and Dolby E. (See Figure 1.)


AES50 is another standard for transporting multichannel audio around a facility. AES50 uses IP packets and Cat 5 cabling, and each channel is assigned 1Mb/s to assure a very high level of quality. AES50 will handle up to 16 channels of 192Kb/s digital audio, or more channels at lower bit rates, and is meant as a point-to-point system even though it uses Ethernet cable and IP packets. Trying to pass it through a network can cause a high degree of latency, so it has not found widespread acceptance or use.

Dolby E

Dolby E is a proprietary system developed by Dolby Labs to overcome many of the issues that engineers face when trying to use any of the aforementioned systems for broadcasting multichannel audio. Dolby E is an encode/decode system for transporting up to eight channels of digital audio over any two-channel digital link of 16-bit depth for 5.1 audio and 20 bits for the full eight channels. Besides the audio channels, it also carries metadata and time code.

Once a program is finished, its surround-sound 5.1 mix is encoded to Dolby E along with metadata about the audio tracks, including dialnorm, dynamic range control and downmixing. With a complete broadcast audio chain, this metadata is passed through to the audio processor and on to the home DTV receiver. If every program is encoded with Dolby E, each one has its own audio parameters applied in the audio chain whenever it plays on-air.

Once encoded, the Dolby E signal passes through the complete distribution chain, from videotape or server to satellite uplink and downlink and through the broadcast facility, including the master control switcher. It is then decoded back to its original 5.1 channels on three AES3 outputs, which are connected to the on-air audio processor.

Dolby E is meant only as a transport system for finished product and not for production use. Another advantage of Dolby E is that its data frames are synchronized with the video frame rate, which makes it much easier to keep audio and video in sync. Still, there is a one-frame delay in both the encode and the decode process, which has to be kept in mind when installing a system.
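As a rough illustration of what that delay means in practice, the sketch below converts frames of delay into milliseconds. It assumes the text means one frame for the encode and one frame for the decode, and it uses 29.97 and 25 frames per second as example frame rates; the function name and the table of rates are made up for this example and are not part of any Dolby tool.

```python
from fractions import Fraction

# Example video frame rates, held as exact fractions (assumed values
# chosen for illustration; use the rate of your own plant).
FRAME_RATES = {
    "29.97": Fraction(30000, 1001),
    "25": Fraction(25),
}


def dolby_e_delay_ms(frame_rate, encode_frames=1, decode_frames=1):
    """Return the audio delay, in milliseconds, of one encode/decode pass.

    Assumes one frame of delay per encode and per decode, which is how
    this sketch reads the article; adjust the arguments if your
    equipment documents different figures.
    """
    total_frames = encode_frames + decode_frames
    return float(total_frames * 1000 / frame_rate)


for name, rate in FRAME_RATES.items():
    print(f"{name} fps: video must be delayed about {dolby_e_delay_ms(rate):.1f} ms")
```

Under these assumptions, the video path would need a matching delay of roughly that many milliseconds for each encode/decode pass so that picture and sound stay aligned.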

Many equipment manufacturers, including those of routers and master control switchers, have added Dolby E pass-through capabilities to their equipment to accommodate the signal.


Monitoring 5.1 audio can be problematic within a control room environment, where two speakers and an amplifier will not suffice. Special 5.1-audio monitoring consoles have been developed to address this issue. Being able to solo the front or rear (surround) as well as the center and LFE channels is very important to being able to isolate problems in the audio, as is the ability to perform a two-channel stereo downmix. (See Figure 2.) 5.1 headphones are also available and can come in handy in a noisy control room environment. As the number of audio channels increases, so should the level of automatic monitoring and alarms.

Most engineers monitor 5.1 audio by watching level meters or, better yet, a graphic display of the surround-sound environment. Several manufacturers make instruments that display a roughly circular pattern showing that all channels are present with no phase problems; six bar graph meters usually accompany this display. Monitoring in this way is essential when working with 5.1 audio in a TV facility.


Normally, all six channels are transmitted, received and played back on the viewer’s surround-sound audio system. But there are many cases where the viewer has only two stereo speakers to listen to, so what happens then?

All ATSC receivers have the capability to downmix 5.1 surround sound to two channels. They do this with two techniques: L/R total and L/R only. For L/R total, the two channels are shown as left (Lt) and right (Rt), and each is derived in this way: Lt = L + (Center at -3dB) + ([Ls + Rs] at -3dB) and Rt = R + (Center at -3dB) + ([Ls + Rs] at -3dB). The Ls/Rs are also phase-shifted 90 degrees. For L/R only, the two channels are shown as left (Lo) and right (Ro), and each is derived as follows: Lo = L + (Center at -3dB) + [Ls may or may not be added with higher attenuation] and Ro = R + (Center at -3dB) + [Rs may or may not be added with higher attenuation].
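The downmix formulas above can be sketched on a single sample per channel as follows. This is only an illustration of the arithmetic as written in this article: the 90-degree phase shift on the surrounds is left out, the LFE channel is ignored because the formulas do not mention it, and the function names and the `surround_gain` parameter are hypothetical. Real decoders may use different coefficients or signs.

```python
# -3 dB expressed as a linear gain factor (about 0.708).
ATT_3DB = 10 ** (-3 / 20)


def downmix_lt_rt(l, r, c, ls, rs):
    """Lt/Rt downmix following the formulas in the text.

    The 90-degree phase shift applied to Ls/Rs is NOT modeled here.
    """
    surround = ATT_3DB * (ls + rs)
    lt = l + ATT_3DB * c + surround
    rt = r + ATT_3DB * c + surround
    return lt, rt


def downmix_lo_ro(l, r, c, ls, rs, surround_gain=0.0):
    """Lo/Ro downmix following the formulas in the text.

    surround_gain is a hypothetical linear gain for the optional
    surround contribution; 0.0 drops the surrounds entirely.
    """
    lo = l + ATT_3DB * c + surround_gain * ls
    ro = r + ATT_3DB * c + surround_gain * rs
    return lo, ro


# A center-only signal lands equally in both downmix channels.
print(downmix_lt_rt(0.0, 0.0, 1.0, 0.0, 0.0))
print(downmix_lo_ro(0.0, 0.0, 1.0, 0.5, 0.5, surround_gain=0.5))
```

Because several channels are summed, the result can exceed full scale, so a practical downmix would also need some form of level management, which this sketch does not attempt.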



If most of your programming is in 5.1 surround sound, then switching to spots and other programming that is in stereo only may be jarring to viewers. In that case, upmixing is necessary; this is where a stereo audio feed is converted to 5.1 surround. Several processors on the market will do this, and these can be added to the audio chain. In addition, some master control switchers can be programmed to switch in the 5.1 converter as needed for stereo-only sources. (See Figure 3.)

Some devices use phase shifting and/or reverb to make up the surround channels, but others pull information out of the stereo audio and use it to fill in the rear channels. These processors use digital signal processing (DSP) to achieve the surround effect.

Another approach is to convert all archived stereo files to 5.1 audio ahead of time. Software can convert stereo to a surround-sound mix, which allows greater control over the conversion, but converting an entire library is very time-consuming unless it is automated.

If, after conversion, the audio is then encoded as Dolby E, the six channels will only use up two audio channels in the server and in routing around the plant. Again, this is a great deal of work to perform on a station’s library, but may well be worth it in the long run.

As a side note, there is another six-channel to two-channel system available from DTS called Neural Surround DownMix and UpMix. This system, similar to Dolby E, encodes three AES streams (six channels) down to one AES stream (just two channels). But because it does not lock to a video source as the Dolby system does, it is better suited to radio applications.

Broadcasting 5.1

When broadcasting DTV, the audio takes up very little of the bit rate compared to the video, especially if broadcasting HD, but it does use some of the bits. In a normal two-channel broadcast of Dolby AC-3 audio, the data rate is about 176kb/s, or about 1 percent of the total bit rate. This is in contrast to a 5.1 audio broadcast, which can occupy about 425kb/s, or 2.2 percent, of the 19.4Mb/s bit rate. (See Figure 3.)
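The percentages quoted above follow directly from the bit rates. A short sketch of the arithmetic, using only the figures given in this article (the function and constant names are made up for the example):

```python
# Total ATSC payload quoted in the text: 19.4 Mb/s, expressed in kb/s.
TOTAL_KBPS = 19_400


def share_percent(audio_kbps, total_kbps=TOTAL_KBPS):
    """Return the audio bit rate as a percentage of the total bit rate."""
    return 100 * audio_kbps / total_kbps


# Audio bit rates quoted in the text for stereo and 5.1 AC-3.
for label, kbps in (("stereo", 176), ("5.1", 425)):
    print(f"{label}: {share_percent(kbps):.1f}% of the stream")
```

The same function can be reused to see how much of the stream a different audio bit rate would occupy.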

By comparison, HD Radio has only 100kb/s-151kb/s available for all of the channels it broadcasts. Some radio stations are actually broadcasting 5.1 audio at these much lower bit rates.

The AC-3 audio codec used within the ATSC specification adds several other parameters to the audio signal, including dialnorm and dynamic range compression, both artistic and heavy. Adoption of these parameters has been gradual, but an increasing number of stations are starting to implement them.

Even the new ATSC M/H standard for mobile DTV allows for 5.1 surround sound. Because many of today’s cars have multichannel sound systems, it only makes sense to incorporate surround sound into the mobile standard.


Multichannel sound is here to stay, and it is more widely available than ever in the home, and soon will be in cars as well. Surround sound has also been available for some time on Internet streaming music sites, with support from both Windows Media and QuickTime. As multichannel sound becomes more ubiquitous, it’s important for engineers to understand and monitor it. Given the right tools and training, all stations can move up to 5.1 surround sound for their local productions, broadcasts and even Web streaming.