Preparing Audio for Distribution

After our look back over the last 20 years of television audio last time, we return to the final stretch of our journey from post production to the consumer. Before jumping directly into monitoring as promised last time, we will ease into it by describing the local station and discussing some of the real-world issues they will have with getting our multichannel audio program over the air. This will reveal yet another set of signals, and will provide an excellent basis for trying to figure out how to monitor everything we have created.


As we have discussed, the various broadcast networks are distributing the audio portion of their programming to local stations in different ways. Two are using Dolby E, one is using high-rate Dolby Digital (AC-3) and one is sending two-channel audio via the MPEG audio system built into its distribution system (but plans to switch to Dolby E someday...). Regardless of how the audio gets to the local station, it will eventually be Dolby Digital (AC-3)-encoded for transmission to consumers. Although some of the encoders that I know of will accept analog audio, all will accept PCM via AES inputs. The output of a Dolby E decoder is PCM, the output of a professional Dolby Digital (AC-3) decoder is PCM, and the PCM output of the MPEG IRD is, of course, PCM. See the trend? Thankfully, the lowest common denominator is still PCM, so everything can be interconnected with relative ease. Further, the audio can be treated just like any other audio source. It is even possible for the audio to be converted back to analog, as long as it ends up as PCM to get into the Dolby Digital (AC-3) encoder. This would mean quite a bit of wiring, but it will certainly work.

The drawing in Audio Notes from the July 23, 2003 issue of TV Technology (p. 30) shows the audio signal flow for a generic local station. Only minor modifications to this drawing are required to adapt it to handle any of the different network flavors.


Every local station will have to choose exactly how it wants to handle all the audio channels plus the metadata. Should it be baseband or embedded? Should it be PCM or compressed? Should metadata be a separate router layer, embedded within the vertical ancillary space of the SDI signal, or handled by an audio compression system like Dolby E or Leitch Diamond? These are almost exactly the same set of decisions that the network operation centers had to make. You must make choices that make sense for your local operation. However, depending on how much of a change you need to make, it may be simpler than you think.

I have visited or corresponded with quite a few stations in many different markets and one thing is very clear. Overwhelmingly, DTV is confined to just a couple of racks of equipment at the station. The network feed is received, decoded and fed to a local/network switch, then on to the ATSC audio and video encoders, sometimes first passing through an audio processor to control loudness. The local feed is upconverted and fed to the same switch. A tally output from master control commands the local/network switch, and shazaam! The DTV station is on the air. This is oversimplifying it a little but is not too far from reality. It also means that the transition can happen in stages, and that the choices over embedded versus baseband, as well as compressed versus PCM audio can wait until the entire plant will transition. If you are only wiring between three or four racks of equipment, the choice to stay baseband PCM might be a wise one, both from a cost and a complexity point of view. Remember, you will likely be the one who has to train the staff on how to use it and troubleshoot it if there are ever problems.

What if you plan to pass your audio through master control? You will unfortunately still run into the same problem of what to do with the metadata, and how to handle the uncommon but possible need to crossfade between a two-channel and a 5.1 channel program. Cuts-only switchers can be made to work, but they limit production choices to "where-to" transition, not "how-to" transition.


Have you ever looked at all the settings in a Dolby Digital (AC-3) encoder? Scary, isn't it? There are quite a few settings, even if you are just sending stereo. I have to claim responsibility for at least helping to create some of the complexity when I was at Dolby, but much of it is very necessary. In order to repent, I have been spending quite a bit of time with the ATSC Systems Evaluation Working Group (sewage for short) where we are almost ready to give birth to a document that has been in progress for more than two years. This document covers many audio topics, but focuses on the "why" and "how" of all the encoder settings and even has some tables of recommended settings. Much of it will look familiar as we have discussed many of the topics in this column, but it gets into the complexities in much more detail than we are able to do here. Anyone who wishes to have a copy, please drop me an e-mail and I will make sure to forward it to you when completed.

Fig. 1. Typical monitoring points are shown above, along with some suggested measuring equipment (in yellow). Note the consumer setup in the upper-right corner; arguably the most important of all, this can be purchased quite inexpensively and can create a showplace in the station lobby or other public area.

Now that I have taken the candy away from the baby, I will summarize probably the most important fact about Dolby Digital (AC-3) encoder setup: the factory-default settings present in Dolby's DP-569 encoder are very close to being exactly right. These settings are also listed in Appendix A of the DP-569 manual, which is available for download. Remember that two of the settings, dialogue level and audio coding mode, should be set to match the current program. This requires one of three things: metadata flowing through the entire chain, operator intervention to change the settings, or running the encoder at its defaults and making the programs match the settings.
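As a rough illustration of those three options, here is a minimal Python sketch of how a station's control logic might pick the two program-dependent settings. Everything in it is hypothetical: the names, the priority order (metadata first, then an operator entry, then a house default) and the default values are assumptions for illustration, not anything taken from the DP-569 or its manual.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProgramSettings:
    dialnorm_db: int   # dialogue level in dB, valid range -31 to -1
    coding_mode: str   # AC-3 audio coding mode, e.g. "2/0" (stereo) or "3/2"

    def __post_init__(self) -> None:
        if not -31 <= self.dialnorm_db <= -1:
            raise ValueError("dialnorm must be between -31 and -1 dB")


# Hypothetical house default; NOT taken from the DP-569 manual.
HOUSE_DEFAULT = ProgramSettings(dialnorm_db=-27, coding_mode="2/0")


def settings_for_program(
    metadata: Optional[ProgramSettings],
    operator_entry: Optional[ProgramSettings],
    default: ProgramSettings = HOUSE_DEFAULT,
) -> Tuple[ProgramSettings, str]:
    """Return the settings to use and where they came from."""
    if metadata is not None:
        # Metadata arrived with the program: follow it.
        return metadata, "metadata"
    if operator_entry is not None:
        # No metadata, but an operator keyed in the values.
        return operator_entry, "operator"
    # Neither is available: the program must be made to match the default.
    return default, "default"


network_51 = ProgramSettings(dialnorm_db=-24, coding_mode="3/2")
print(settings_for_program(network_51, None))
print(settings_for_program(None, None))
```

The useful part is the returned source label: when it reads "default", the burden shifts to making the program itself match the encoder, which is the third option described above.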

Also, if loudness issues have not yet hit your radar screen, they should. As you are planning your new facility, you must consider the fact that while metadata is designed to allow the loudness of programs to be matched, there is a bit of overhead to make it work. As we have discussed, program loudness can be tagged via the dialnorm (dialogue level) parameter and then normalized in decoders, but this requires that the tags be accurately created and passed all the way through to the Dolby Digital (AC-3) encoder. What if the tags are missing or incorrect (accidentally or not)? Two things can help. First, the LM100 from Dolby will allow you to measure any program's loudness and will display the correct dialnorm value as well as the one that might already be there. If they do not match or one is missing, the value can be corrected in the encoder. The other thing that can help is an audio processor, and processors are now available that support 5.1 channel audio and metadata (see Audio Notes in the Sept. 18, 2002 issue of TV Technology). Loudness issues may not triple when going from stereo to 5.1 channels, but they will certainly not go away.
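The comparison described above, measured dialogue level against the tag already on the program, can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not of how the LM100 measures loudness: the measured value is assumed to come from a meter, and the function name is hypothetical. The one fixed fact is that dialnorm is carried as a whole number of dB from -1 to -31.

```python
from typing import Optional, Tuple


def check_dialnorm(measured_db: float, tagged_db: Optional[int]) -> Tuple[int, bool]:
    """Return (correct dialnorm, whether the existing tag matches it).

    measured_db: dialogue loudness reported by a meter, in dB (assumed given).
    tagged_db:   dialnorm value already on the program, or None if missing.
    """
    # Dialnorm is an integer from -31 to -1, so round and clamp the reading.
    correct = max(-31, min(-1, round(measured_db)))
    return correct, tagged_db == correct


# A program measuring about -24 dB but tagged -27: the tag needs correcting.
print(check_dialnorm(-23.6, -27))
# A program with no tag at all: a value must be entered in the encoder.
print(check_dialnorm(-24.0, None))
```

A missing tag and a wrong tag are treated the same way here, because in both cases someone has to set the value in the encoder.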


We now have the possibility of encountering the following physical interfaces that carry audio: SDI (SD or HD embedded audio), baseband AES (75 or 110 Ohm), RS-485 (metadata), SMPTE 310M, DVB-ASI, and of course analog. Within these physical interconnects we can have uncompressed PCM, compressed Dolby E or Dolby Digital (AC-3), all of which can be carrying mono, stereo, matrix surround encoded, or discrete 5.1 channel surround programs, and sometimes a combination of a few types. How do you monitor all of these signals? Fig. 1 shows a typical local station with critical monitoring points marked. Luckily a few expert companies have developed some very useful audio monitors. For in-rack monitoring, the first name that comes to my mind is Wohler. Will Wohler, Carl Dempsey and their team have developed a comprehensive line of products that can handle baseband, embedded, Dolby E and Dolby Digital (AC-3). In fact, one of their products can display (and reproduce through those amazing Wohler built-in speakers) embedded or baseband PCM, Dolby E, and even analog and supports up to eight channels. Videotek also has a monitoring product that uses a VGA monitor to display comprehensive level and phase information for up to eight channels of embedded PCM or Dolby E/Dolby Digital (AC-3) audio.

One thing that no one has yet tackled is how to monitor baseband audio and apply the metadata information that makes the audio reproduce properly. Remember that metadata is typically not applied until a Dolby Digital (AC-3) stream is received and decoded. However, if not applied during monitoring, program level differences that will eventually be corrected for by dialnorm will show up as level shifts on meters and through speakers. This "gotcha" has embarrassingly gotten many of us "audio experts" and will certainly bite an operator or two. We will have to wait until next time to reveal some potential solutions.
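To see why the meters can mislead, it helps to work the arithmetic. A Dolby Digital (AC-3) decoder attenuates each program by the difference between its dialnorm value and -31 dB, so correctly tagged dialogue always ends up at the same level. The Python sketch below is only that arithmetic, with hypothetical function names and example levels; it ignores dynamic range control and every other decoder behavior.

```python
REFERENCE_DB = -31  # level at which decoders place correctly tagged dialogue


def decoder_attenuation_db(dialnorm_db: int) -> int:
    """Attenuation in dB that a decoder applies for a given dialnorm."""
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dB")
    return dialnorm_db - REFERENCE_DB  # e.g. -27 - (-31) = 4 dB


def level_after_decoder(baseband_db: float, dialnorm_db: int) -> float:
    """Dialogue level the viewer hears, given the baseband level and tag."""
    return baseband_db - decoder_attenuation_db(dialnorm_db)


# Two correctly tagged programs (example values) with different dialogue levels.
program_a = (-24.0, -24)  # (baseband dialogue level, dialnorm)
program_b = (-20.0, -20)

# On baseband meters and speakers the programs sit 4 dB apart...
print(program_b[0] - program_a[0])
# ...but after the decoder applies dialnorm, both land at the same level.
print(level_after_decoder(*program_a), level_after_decoder(*program_b))
```

In other words, a level shift seen at a baseband monitoring point is not necessarily a fault; it may simply be a difference that dialnorm has not yet been given the chance to remove.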

Next we will finish our discussion of monitoring and move on to the destination of our journey: the consumer. We will investigate the many ways our multichannel program may be reproduced, as well as offer tips to help answer common viewer questions. Until then, please keep the excellent questions and suggestions coming.