Following Digital Audio To the Home

We will wrap up our look at how to monitor almost every type of audio signal in a DTV plant by briefly exploring how to deal with baseband audio and separate metadata. Then it's on to the home where we finally see how viewers will be presented with our audio signal.


At least two major terrestrial broadcast networks are distributing PCM audio around their plants, and as such, the accompanying audio metadata is not embedded within the audio streams. Critical to the success or failure of audio metadata is that it be applied to the audio during monitoring. The result of not doing so could be the following: Program A, a loud music show, has a dialogue level of -18. At the top of the hour, the network transitions to program B, a quiet movie, with a dialogue level of -31. An operator who simply metered the PCM audio would see a 13 dB drop in audio and might try to compensate for it by bringing the main level up in master control. Three things are likely to happen: the movie will now be too loud for its dialogue level metadata setting of -31, the first commercial to hit will be unbelievably loud, and the little red phone will start ringing.
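The arithmetic behind this scenario can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the usual Dolby Digital (AC-3) behavior in which the decoder attenuates the audio by the difference between the dialogue level value and -31, so that dialogue always ends up at -31.

```python
# Illustrative sketch only; the names here are hypothetical and are not
# taken from any monitoring product.
REFERENCE_DIALOGUE_LEVEL = -31  # level to which the decoder normalizes dialogue


def decoder_attenuation_db(dialogue_level: int) -> int:
    """Return the attenuation (in dB) implied by a dialogue level setting."""
    if not -31 <= dialogue_level <= -1:
        raise ValueError("dialogue level must be between -31 and -1")
    return dialogue_level - REFERENCE_DIALOGUE_LEVEL


def level_after_metadata(metered_dialogue_db: float, dialogue_level: int) -> float:
    """Dialogue level the viewer hears once the metadata has been applied."""
    return metered_dialogue_db - decoder_attenuation_db(dialogue_level)


# Program A: loud music show, dialogue metered at -18, metadata set to -18.
program_a = level_after_metadata(-18.0, -18)
# Program B: quiet movie, dialogue metered at -31, metadata set to -31.
program_b = level_after_metadata(-31.0, -31)

# What an operator sees on a plain PCM meter at the transition.
raw_drop = -18.0 - (-31.0)

print(f"Drop seen on a PCM-only meter: {raw_drop} dB")
print(f"Program A after metadata: {program_a}, program B after metadata: {program_b}")
```

With the metadata applied, both programs land at the same dialogue level, so there is nothing for the operator to correct. The drop exists only on a meter that ignores the metadata.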

Although Dolby E avoids this problem by keeping the audio metadata in the same stream, there are some very good system reasons why certain broadcasters have chosen to distribute uncompressed audio and separate metadata. Timing effects due to multiple passes through control rooms or edit suites, access to individual audio channels in real time, and master control switchers that cannot yet handle compressed audio are just a few. Cost-effective monitoring and metering products that accept PCM audio and metadata should start to appear soon to support this approach.


Now that our Dolby Digital (AC-3) audio stream (or streams) has been combined with MPEG video and other types of data to form a transport stream that is then either 8-VSB- or COFDM-modulated for terrestrial transmission or QAM-modulated for cable distribution, it is amplified and sent on to the consumer. Once the channel is tuned, the signal is demodulated and then demultiplexed back into the individual audio, video and data streams, sometimes referred to as PES or Packetized Elementary Streams.
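For readers who like to see the mechanics, here is a minimal Python sketch of the demultiplexing step. It relies only on two well-known properties of an MPEG-2 transport stream: packets are 188 bytes long and begin with the sync byte 0x47, and each carries a 13-bit packet identifier (PID) that says which stream it belongs to. The PID values, the `make_packet` helper and the payload bytes are made up for the demonstration, and a real receiver would first read the program tables to learn which PIDs carry audio and video.

```python
TS_PACKET_SIZE = 188  # bytes per MPEG-2 transport stream packet
SYNC_BYTE = 0x47      # first byte of every transport stream packet


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from the header of one transport packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demultiplex(stream: bytes) -> dict:
    """Group whole packets by PID, preserving their order of arrival."""
    by_pid = {}
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        by_pid.setdefault(packet_pid(packet), []).append(packet)
    return by_pid


def make_packet(pid: int, fill: int) -> bytes:
    """Hypothetical helper: build a synthetic packet for this demonstration."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes([fill]) * (TS_PACKET_SIZE - len(header))


# Made-up PIDs standing in for one video and one audio stream.
VIDEO_PID, AUDIO_PID = 0x31, 0x34
stream = (make_packet(VIDEO_PID, 0xAA)
          + make_packet(AUDIO_PID, 0xBB)
          + make_packet(VIDEO_PID, 0xCC))

streams = demultiplex(stream)
for pid, packets in streams.items():
    print(f"PID 0x{pid:04X}: {len(packets)} packet(s)")
```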

Looking at Fig. 1, you can see that the Dolby Digital (AC-3) bitstream goes to two places. It is decoded to baseband PCM audio, and it is also formatted and sent out of the receiver, in either its compressed form or its decoded PCM form, via a so-called S/PDIF (Sony/Philips Digital Interface, pronounced "spee-diff" or "S-P diff") port, which we will explore a bit later. For now, we will concentrate on what happens to the decoded audio.

Fig. 1
Fig. 1 shows the internal two-channel Dolby Digital (AC-3) decoder providing a PCM output that is then converted to analog. This decoder provides a two-channel downmixed output from any Dolby Digital (AC-3) bitstream, regardless of how many channels were encoded; if it was originally 5.1 channels, a stereo or LtRt version is automatically created under the direction of the metadata settings. The left and right analog signals feed the standard RCA jacks found on most receivers or set-top boxes. They are simultaneously combined to mono, attenuated by 3 dB and fed to the RF remodulator that creates the "channel 3/4" output.
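A rough Python sketch of the stereo downmix and the mono feed follows. It is an illustration under stated assumptions rather than the decoder's actual algorithm: the center and surround mix levels are assumed to be -3 dB (in practice the metadata selects them), the LFE channel is simply left out, LtRt matrix encoding is not shown, and the overall scaling a real decoder applies to prevent overload is omitted. The function names are my own.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def downmix_to_stereo(l, r, c, ls, rs,
                      center_mix_db=-3.0, surround_mix_db=-3.0):
    """Fold one sample of five main channels into a left/right pair.

    The mix levels are assumed defaults; the bitstream metadata would
    normally choose them. The LFE channel is not included.
    """
    c_gain = db_to_gain(center_mix_db)
    s_gain = db_to_gain(surround_mix_db)
    left = l + c_gain * c + s_gain * ls
    right = r + c_gain * c + s_gain * rs
    return left, right


def mono_for_rf(left: float, right: float) -> float:
    """Sum left and right, then attenuate by 3 dB for the RF remodulator."""
    return (left + right) * db_to_gain(-3.0)


# A sample that contains only center-channel dialogue at full amplitude.
left, right = downmix_to_stereo(l=0.0, r=0.0, c=1.0, ls=0.0, rs=0.0)
mono = mono_for_rf(left, right)

print(f"left={left:.3f} right={right:.3f} mono={mono:.3f}")
```

Dialogue that was only in the center channel ends up equally in left and right at about 0.708 of its original amplitude, and the mono sum comes back to roughly the original level.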

The S/PDIF connection is also known as IEC 60958 and can be either an RCA jack specified at 75 ohms, 0.5 volt peak-to-peak, or an optical connector developed by Toshiba called TOSLINK. Originally developed to allow uncompressed PCM to pass out of a digital consumer device, this useful connection was expanded via IEC 61937, which provides a path for PCM or compressed audio data. For digital broadcast, cable and satellite services, this enables the use of an external Dolby Digital (AC-3) decoder, which matters because in many cases set-top boxes and television receivers only have a two-channel internal decoder.
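As a rough illustration of how a device on the receiving end can tell the two kinds of payload apart, the Python sketch below scans a list of 16-bit words for the IEC 61937 burst preamble. It assumes the commonly documented preamble layout: sync words 0xF872 and 0x4E1F, then a burst-info word whose low five bits give the data type (1 for AC-3), then a length word. The function names and the sample data are hypothetical, and a real detector would also check channel status bits and require the preamble to repeat at the expected interval, since PCM audio could contain those two words by coincidence.

```python
PA, PB = 0xF872, 0x4E1F  # IEC 61937 burst preamble sync words (assumed layout)
AC3_DATA_TYPE = 0x01     # data type code for AC-3 in the burst-info word


def find_bursts(words):
    """Return (index, data_type, length_word) for each preamble found."""
    bursts = []
    for i in range(len(words) - 3):
        if words[i] == PA and words[i + 1] == PB:
            data_type = words[i + 2] & 0x1F  # low five bits of burst-info
            length_word = words[i + 3]       # payload length code
            bursts.append((i, data_type, length_word))
    return bursts


def classify(words):
    """Label a block of 16-bit words as PCM, AC-3 or other compressed data."""
    bursts = find_bursts(words)
    if not bursts:
        return "pcm"
    if all(data_type == AC3_DATA_TYPE for _, data_type, _ in bursts):
        return "ac3"
    return "other-compressed"


# Made-up sample data: plain PCM-like words, and a block carrying one burst.
pcm_like = [0, 100, 65436, 0, 250, 65286, 0, 0]
ac3_like = [0, 0, 0, 0, PA, PB, 0x0001, 0x3000] + [0] * 8

print(classify(pcm_like))
print(classify(ac3_like))
```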

You can see a number of switches that may or may not exist, depending on the type of receiver and its sophistication. For example, one switch allows the S/PDIF output to alternate between PCM and Dolby Digital (AC-3). Some older boxes do not have an internal analog-to-digital converter and only provide a digital output on the digital channels. There are many variations, and this is where a Dolby DM100 Bitstream Analyzer is a handy and inexpensive tool to have around: it will show you exactly what the receiver is capable of doing.

One other area of interest is that many set-top boxes and receivers are beginning to include very advanced multimedia features. One example is the electronic program guide that actually produces sounds as you scroll through the menus. If you have the S/PDIF output connected to an external Dolby Digital (AC-3) decoder, how can you get these sounds out of the box? Surprisingly, some set-top boxes can actually perform an onboard Dolby Digital (AC-3) encode of any non-encoded audio to allow a continuous stream to be output regardless of the source. This is very intriguing as other advanced multimedia applications, which might use a format capable of multichannel audio, such as Windows Media 9, have the very nice side benefit of being able to supply this audio to the millions of existing external Dolby Digital (AC-3) decoders.


What happens to the S/PDIF output of the set-top box or receiver? First, this stream is output slightly ahead of the picture to compensate for the minimal latency of the external Dolby Digital (AC-3) decoder, so there will be no timing difference between the analog outputs of a set-top box or receiver and the speaker outputs of an external decoder. Next, unlike many built-in multichannel decoders, the external decoders have useful features such as bass management to allow almost any size speaker or combination of speakers to be supported. If at all possible, a center channel speaker and a subwoofer are must-haves. If it is simply not possible, the external decoder can be set to create a "phantom center" by attenuating the center channel audio by 3 dB and feeding it to both the left and right speakers. Likewise, lack of a subwoofer can be partially compensated for by rerouting the low-frequency information to the left and right speakers. You probably cannot get away with four tiny speakers and no subwoofer, but a small upgrade of the left and right speakers or the addition of a subwoofer and a center channel will make dramatic improvements. Luckily, today there are many inexpensive options that sound simply fantastic. In one room, I use a Klipsch ProMedia 5.1 system picked up at a local computer store that does the job remarkably well, and at only a few hundred dollars it did not break the bank.
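Why 3 dB for the phantom center? A short Python sketch makes the reasoning visible: a 3 dB cut is an amplitude factor of about 0.708, so when the same signal comes from two speakers the combined acoustic power is about what a single center speaker would have delivered. The function names are hypothetical, and the sketch assumes the two speakers sum in power, which is the usual simplification for a listening room.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def phantom_center(left, right, center, attenuation_db=3.0):
    """Fold the center channel into left and right after attenuating it."""
    gain = db_to_gain(-attenuation_db)
    return left + gain * center, right + gain * center


# Center-only dialogue at unit amplitude, nothing in left or right.
left_out, right_out = phantom_center(left=0.0, right=0.0, center=1.0)

# Power is proportional to amplitude squared; add the two speakers' power.
combined_power = left_out ** 2 + right_out ** 2

print(f"each speaker: {left_out:.3f}, combined power: {combined_power:.3f}")
```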

Don't forget that multichannel decoders also contain matrix decoding such as Pro Logic or Pro Logic II for handling two-channel sources. As Dolby Digital (AC-3) can pass mono to 5.1 channels of audio, two-channel sources would only be reproduced from the left and right speakers if it were not for a matrix decoder. A very nice benefit of the matrix decoding process is that the dialogue contained in a two-channel program will be reproduced from the center channel just like 5.1 channel sources, and this works with surround encoded, stereo or mono programs. That is even more reason to make sure you are monitoring as accurately as possible.


Recently I have received a number of questions regarding the ability of cable operators to modify audio or change metadata values, so I did some investigating. I found that services such as HBO, Showtime, Cinemax and others, which are pre-encoded before distribution to digital cable operators, cannot have their Dolby Digital (AC-3) streams modified, and this includes metadata. Apart from decoding and re-encoding the audio (which there is no good reason to do), once metadata is set correctly, it will stay that way. When I spoke with cable audio and loudness metering guru Jeffrey Riedmiller, we agreed that neither of us knew of a case where this was happening, or of how it would even be possible given the technology and the networks prohibiting decoding and re-encoding. Services that are encoded by a cable operator are an exception, but the tools to make the settings correct are abundantly available.

If it were possible to make the terrestrial metadata path as short and contained as it is in cable, it would be afforded similar protection. The only case in which this takes place terrestrially is in the PBS model where a finished transport stream is sent to the local stations. If it is not decoded and re-encoded for local branding, the metadata cannot be changed and it passes directly on to the consumer.

The next Audio Notes column will be in January. We will begin a two- or three-part series on exactly how audio data compression systems such as Dolby Digital (AC-3), AAC, MP3, dts, WMA, PAC, etc. actually work. I will have plenty of input from the experts who have designed these systems and it promises to be both interesting and understandable. Until then, have a safe and happy holiday season.