
Multichannel audio, part 1

At first, broadcast television audio was mono. Then, in 1984, MTS (Multichannel Television Sound) was approved by the FCC. Stereo audio and SAP (Second Audio Program) became available and were slowly adopted by most television stations across the country. With the advent of digital broadcasting, television stations now broadcast many primetime shows in surround sound, or 5.1 audio.

The most common audio for non-network DTV is still two-channel stereo with some stations adding another stereo channel for a second language. Local programming is still primarily stereo as many stations only have production facilities for two channels. But this is changing as consumers expect more from TV audio than ever before.

History of multichannel audio

The progression of multichannel sound reproduction starts with stereo, first demonstrated in France in 1881 and later introduced in the U.S. in the 1930s. Stereo has been the mainstay of sound reproduction for the last 80 years. Quadraphonic sound was introduced in the 1970s, but it had a short life span. After that, stereo reigned supreme until surround sound came to the home in 1995 with the introduction of Dolby Digital on Laserdiscs.

There have been several multichannel consumer audio formats in the past, including quadraphonic, but quadraphonic was never intended to be used as a soundtrack for video programs. Surround sound tries to live up to its name by supplying sound that surrounds the listener to make the entertainment experience more realistic, just as 3-D video tries to make the picture more realistic.

Surround sound became popular when the first DVD players hit the market. DVD players connect to the audio amplifier by using the S/PDIF (Sony/Philips Digital Interconnect Format) interface. S/PDIF is the consumer version of AES digital audio and provides a means for interconnecting multiple channels of audio over a single coax or fiber-optic cable. Even HDMI, which is primarily intended to connect HD component video from the source to the monitor, can now carry multiple audio channels. See Fig. 1.

Over the years, consumers have come to expect surround sound from their entertainment sources such as movies and now television shows. Toward this end, the ATSC standard incorporated multichannel sound into the DTV specification. Most ATSC MPEG-2 video encoders come with a two-channel audio encoder, but to add multichannel surround sound, all that is required is a separate audio encoder.

Recently, eight-channel audio has been introduced to the home as 7.1 audio, which adds two more channels and speakers. THX is the driving force behind 7.1 audio, which so far has been limited to DVD movies released using its system.

Another much larger multichannel system comes from NHK in Japan. This system is composed of 24 audio channels in total and is intended to be used in conjunction with its Ultra High Definition TV system. NHK calls it the 22.2 sound system, and it surrounds you not only horizontally but also vertically. There are three vertical layers of sound to give a much more realistic sound environment. Of course, this 24-channel system is intended for movie theaters and not the home, but it demonstrates how far multichannel sound can go.

Recording 5.1 audio

Recording actual 5.1 sound in the field can be quite challenging until you break it down into what is actually needed from location recording. In most programs and movies, 5.1 adds ambience to a scene: the door creaks behind you to the left, the light switch is flicked on in the front right, and so on. In this setup, only the actors' voices are recorded on the set; all other sounds, such as the music and sound effects, are added during post production. When recording music, you either want to feel you are in the middle of the band or out in the audience, and this can be accomplished with a six-microphone setup for live events or a multitrack recording that is mixed down to 5.1 audio.

Out in the field, many videographers just record two-channel stereo sound, sometimes recording two more channels to catch the ambient sound. Back in post, they mix a 5.1 surround output for the final cut.

There are several 5.1 microphones that will supply you with six channels of sound, but most of today's camcorders come with two to four audio channels. To really record six channels of audio, you would need a professional multichannel audio recorder with SMPTE time code locked to your camcorder. As you can see, it becomes a challenge to record 5.1 audio in the field. See Fig. 1.

5.1 audio

Surround sound, or 5.1 audio as it's more commonly called professionally, consists of six separate audio channels: left (L), right (R), center (C), left surround (LS), right surround (RS), and the low-frequency effects (LFE) channel. These six audio channels are used for DVD, cable, satellite, and broadcast DTV audio.

The basic way to transport these six audio signals requires three AES3 digital audio feeds (each AES3 feed handles two channels). When transported over AES3, they are grouped in this fashion: L/R, LS/RS, and C/LFE.
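The channel-to-feed grouping described above can be expressed as a small lookup table. This is an illustrative sketch only; the names and function are not from any standard API:

```python
# Illustrative mapping of the six 5.1 channels onto three AES3 feeds,
# following the pairing described in the text: L/R, LS/RS, and C/LFE.
# The names AES3_PAIRING and aes3_feed_for are assumptions for this sketch.

AES3_PAIRING = [
    ("L", "R"),    # AES3 feed 1: front left / front right
    ("LS", "RS"),  # AES3 feed 2: left surround / right surround
    ("C", "LFE"),  # AES3 feed 3: center / low-frequency effects
]

def aes3_feed_for(channel):
    """Return which AES3 feed (1-3) carries a given 5.1 channel."""
    for i, pair in enumerate(AES3_PAIRING, start=1):
        if channel in pair:
            return i
    raise ValueError(f"unknown channel: {channel}")
```

For example, `aes3_feed_for("C")` returns 3, since center travels with LFE on the third feed.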

One of the problems with using three separate AES3 cables to transport 5.1 audio is keeping them all synchronized with each other and with the video, as well as routing all three around the plant. Because AES3's data frames are not synchronized with video, it's easy enough for one feed to slip out of sync, let alone three. Today, there are better options, including MADI (Multichannel Audio Digital Interface), Dolby E, and embedded audio within SDI.
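One way to see why keeping AES3 locked to video is awkward: at the NTSC-family frame rate of 29.97 fps, a 48 kHz audio clock does not divide evenly into one video frame; the sample counts only line up over a five-frame sequence. A quick check, sketched with exact rational arithmetic:

```python
# Why audio/video sync is awkward at NTSC rates: 48 kHz does not divide
# evenly into a 29.97 fps frame. Using exact fractions avoids rounding.
from fractions import Fraction

SAMPLE_RATE = 48_000
NTSC_FRAME_RATE = Fraction(30_000, 1_001)  # exactly 29.97... fps

samples_per_frame = Fraction(SAMPLE_RATE) / NTSC_FRAME_RATE
# 1601.6 samples per frame -- not a whole number

samples_per_5_frames = samples_per_frame * 5
# over five frames the count comes out even: 8008 samples
```

The non-integer per-frame count is why audio and video clocks must be managed together rather than free-running.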

Embedded audio

SDI video can handle up to 16 channels of digital audio, or eight AES3 feeds, so 5.1 audio can easily be accommodated by embedding it within the associated SDI video. For SD-SDI, the standard is SMPTE 272M; for HD-SDI, it is SMPTE 299M. The audio data is transmitted during the horizontal blanking period, which is otherwise unused.

The advantage is that only one signal needs to be routed: the SDI video with embedded audio. Plus, the audio stays locked to the video as long as it remains embedded. But you cannot pass it through a frame sync or other device that changes its timing, as that will corrupt the audio. You have to de-embed the audio, pass the signal through the device, and then re-embed the audio, which is not easy.
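Given the 16-channel embedded-audio limit mentioned above, a quick capacity check tells you whether a mix of programs fits in one SDI stream. A minimal sketch; the function and program names are hypothetical:

```python
# Rough capacity check against the 16-channel SDI embedded-audio limit
# (eight AES3 pairs) described in the text. fits_in_sdi is a hypothetical
# helper, not part of any real SDI toolkit.

SDI_EMBEDDED_CHANNELS = 16

def fits_in_sdi(programs):
    """programs: dict mapping a program name to its channel count.
    Returns True if the total fits in one SDI stream's embedded audio."""
    return sum(programs.values()) <= SDI_EMBEDDED_CHANNELS

# A 5.1 main program plus a stereo second-language track uses 8 of the
# 16 channels, so it fits with room to spare:
# fits_in_sdi({"main 5.1": 6, "SAP stereo": 2})  -> True
```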

Many satellite IRDs (integrated receiver decoders), video servers and other equipment now offer embedded audio as an option. This worked well as long as only two audio channels were being transmitted, but as more and more programs use 5.1 surround sound, making sure you are receiving all the correct channels and routing them to the correct locations within your facility can be a problem. See Fig. 2.

MADI

Multichannel audio digital interface (MADI) is capable of transporting up to 64 separate digital audio channels over a single 75Ω coax or fiber-optic cable. MADI is basically a high-speed AES3 transport: where AES3 handles just two channels, MADI carries 64. Its official designation is AES10-2003. While MADI has mostly been found in recording studios, connecting mixing consoles to multitrack recorders, it has now found use in television production studios and broadcast facilities.

With multichannel audio now standard on most network programs, along with supplemental audio channels, routing all of it around the plant became a problem. By using MADI, up to 64 channels can be routed by switching a single coax cable. Several of today's master control switchers offer 16-channel audio switching and mixing along with MADI interfaces for each video input.

For long runs, MADI requires a separate clock signal carried on a separate coax between the transmitter and receiver to reduce clock jitter. For shorter runs, most equipment can derive the clock from the data on the MADI line. You might see MADI referred to as carrying 56 channels of audio instead of 64; the standard's first incarnation handled only 56 channels but was later updated to 64.
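The 56-versus-64-channel difference is easy to sanity-check from the payload each configuration requires: every channel occupies a 32-bit subframe per sample. A back-of-the-envelope sketch, assuming 48 kHz operation (the helper name is made up for illustration):

```python
# Back-of-the-envelope MADI payload: each audio channel is carried in a
# 32-bit subframe at the sampling rate. madi_payload_mbps is a
# hypothetical helper for this calculation, not a real library function.

SUBFRAME_BITS = 32
SAMPLE_RATE_HZ = 48_000

def madi_payload_mbps(channels):
    """Audio payload in Mbit/s for a given MADI channel count at 48 kHz."""
    return channels * SAMPLE_RATE_HZ * SUBFRAME_BITS / 1e6

# Original 56-channel AES10: 86.016 Mbit/s
# Updated 64-channel AES10:  98.304 Mbit/s
```

Both figures fit within MADI's roughly 100 Mbit/s payload, which is why the later revision could raise the channel count without changing the link rate.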

In part 2

In part 2, we'll examine more multichannel transmission systems as well as monitoring, downmixing, upmixing, and broadcasting multichannel sound.