Audio Monitoring Systems

The old maxim that tells us to watch what we ask for applies to the audio post community in spades these days. Complete mixes, including dialog, effects and music may have sounded hideous coming out of a single three-inch speaker, but engineers and producers knew exactly what the listener would get. Audio mixers wanted more though, and a stereo field wasn't enough. Those who hungered for full surround audio have seen their wishes come true. More homes have theater systems than ever before, and the numbers are growing. The ability to create full-range mixes that convey the creative ideas of content providers is now part of the audio post process on many, if not most, commercial assignments being executed today.

But this bonanza has created thorny new problems that audio post engineers have to contend with. Monitoring a mix that will sound good in all of the environments it will be played back in, including mono, stereo and 5.1, is perhaps the greatest challenge facing audio engineers. For starters, surround sound packages vary widely in price and quality. Spending $300 on a six-speaker system will allow a listener to hear sound effects pan across the entire field, but how will the frequencies that are reproduced compare with the performance of speakers that cost far more?

Speaker placement is another monitoring problem for engineers and content providers whose material is designed primarily for the home market, or will eventually make its way there. The rigorous standards for speaker placement and a host of other issues that movie theaters must meet to earn THX approval are absent in the home. How does an audio post facility set up a surround sound mixing room that will let an engineer effectively monitor for the greatest number of homes that will play back the material? Should this kind of democratic concern even factor into the equation? Perhaps mixers should create their work to the highest standard, and let the home theater owner who places Aunt Bessie's ceramic vase in front of the left rear speaker suffer the consequences. The decisions that mix facilities are making today regarding how to monitor audio will have a critical effect on how well they fare in the competitive audio post market.

When broadcasters first began dealing with audio-for-picture issues back in the 1940s, downmixing (the need to monitor surround sound mixes in stereo and mono while working with all six speakers) was not an issue. When stereo programming was introduced, the need to reference these mixes in mono became apparent. Today, many viewers still listen to the audio portion of programming on a single speaker built into the set.

There has been a steady escalation in the sheer amount of audio material used in broadcast production. How does this increase affect monitoring? Digital audio workstations allow for an almost unlimited number of tracks, many of them virtual spaces on a hard disk that can be routed to a smaller number of discrete, physical outputs during mixdown. Today, mixes that contain 100 or more channels of audio are not uncommon. As a result, phase cancellation becomes more of a problem, especially when stereo mixes, for example, need to be downmixed to mono. Careful monitoring is required to ensure that two elements at the same frequency, but out of phase with each other, do not cancel when a mono mix is executed. The combination of synthesizers, acoustic instruments and complex sound effects all but ensures that phase problems will arise, and they must be carefully addressed while monitoring a mix.

Question: You're producing the movie of the week for a network. In your audio mix, which you've decided to optimize for stereo (the primary audience), your audio engineer has created a stereo field that matches the film's climactic scene, a car chase through a crowded city, frame for frame. When the engineer references this scene in mono, you find that you're losing several metal-against-metal effects that your sound designer has created. Do you live with this, or rework the stereo mix so these effects are more audible in mono, even if it means compromising the stereo feed itself? Most producers are choosing to optimize their mixes for the preferred distribution method.
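The mono-compatibility problem described above is easy to demonstrate numerically. The following is a minimal Python sketch, not a measurement tool: it sums two test tones to mono, once in phase and once with the right channel inverted. The function names, the 1 kHz test frequency and the 48 kHz sample rate are illustrative assumptions and do not come from any product mentioned in this article.

```python
import math

def tone(freq_hz, phase_rad, n=480, rate=48000):
    """Generate n samples of a unit-amplitude sine wave (hypothetical helper)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate + phase_rad)
            for i in range(n)]

def mono_sum(left, right):
    """Fold a stereo pair to mono by averaging the two channels."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

left = tone(1000, 0.0)
in_phase = mono_sum(left, tone(1000, 0.0))          # identical channels
inverted = mono_sum(left, tone(1000, math.pi))      # right channel flipped

print("stereo channel level:", round(rms(left), 4))
print("mono sum, in phase:  ", round(rms(in_phase), 4))
print("mono sum, inverted:  ", round(rms(inverted), 4))
```

In a real mix the cancellation is rarely total, since elements are only partly out of phase and only at some frequencies, but the same arithmetic explains why an effect that is clearly audible in stereo can thin out or vanish when the mix is referenced in mono.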

In fact, the lure of creating the best mixes is enticing producers - and the audio post professionals who service them - to monitor under ideal conditions. Many audio post houses are setting up two monitoring environments, one to replicate a theater and a second that serves as a high-end home theater model (See "Simple," p. 102).

Monitoring issues multiply as we move from stereo to 5.1 (7.1 may hit the home one day, but it is not a practical reality at this time). Now we have five single-point-source speakers, not just two (the subwoofer [.1] is omnidirectional). The audio engineer must understand that the mix he or she is creating will live in properly tuned, high-end home theater environments and in less-than-ideal home theater applications, and will also be played back in stereo and mono. Referencing for all of these different applications requires tools that complement the ears of the engineer without interfering with the creative process by adding cumbersome technology to the equation.

For starters, mixing desks, digital or analog, must be equipped to handle surround mixing. As recently as several years ago some console manufacturers were touting their boards as surround sound capable simply because you could plug six or more speakers into them. A true surround console must be able to pan all source material and effects across the multichannel speaker field.

Dolby Laboratories manufactures a piece of equipment that helps engineers keep track of the way mixes sound as they move from surround down to stereo and mono. To appreciate the functionality that the DP570 Multichannel Audio tool offers, it's critical to understand the concept behind Dolby Digital and the term metadata. Metadata is essentially "data about data," sent in a single Dolby Digital bitstream to help deliver audio to a variety of different consumer listening environments. One decoder will know, for example, that it's installed in a stereo-only environment and will execute the appropriate downmix from the feed - assuming that its owner has properly set up the device. Another decoder, used in a full-blown 5.1 home theater environment, will look at the same Dolby Digital stream and select the six-channel mix for playback.

What control do engineers and content providers have over the way decoders will execute downmixes? Quite a bit, actually. In the monitoring process, engineers can, for example, choose a preferred stereo downmix. This preferred mix could be a standard left/right stereo mix or a Pro Logic-encoded Lt/Rt stereo-compatible mix. Pro Logic, the forerunner to the true, discrete digital sound now enjoyed in many home theater environments, is a four-channel technology (LCRS) that derives the subwoofer channel from the left/right mix and sends only a mono signal to both surround speakers.
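To make the idea of a matrix-encoded stereo mix more concrete, here is a deliberately simplified Python sketch of folding four channels (LCRS) into a two-channel Lt/Rt pair and passively recovering center and surround from it. It is an illustration under stated assumptions, not Dolby's encoder or decoder: a real surround encoder also phase-shifts and band-limits the surround channel, and a real Pro Logic decoder adds active steering, all of which are omitted here. The function names and the sign convention chosen for the surround channel are assumptions made for this example.

```python
import math

G = 1 / math.sqrt(2)  # -3 dB gain applied to the shared channels

def encode_ltrt(l, c, r, s):
    """Fold one LCRS sample into a two-channel Lt/Rt pair (simplified).

    Center goes equally and in phase to both sides; surround goes to
    both sides with opposite polarity so a decoder can separate them.
    The 90-degree phase shift used in practice is left out.
    """
    lt = l + G * c - G * s
    rt = r + G * c + G * s
    return lt, rt

def passive_decode(lt, rt):
    """Recover LCRS estimates: center from the sum, surround from the difference."""
    return {
        "L": lt,
        "R": rt,
        "C": G * (lt + rt),
        "S": G * (rt - lt),
    }

# A center-only sample comes back as center, with nothing in the surround.
center_only = passive_decode(*encode_ltrt(0.0, 1.0, 0.0, 0.0))

# A surround-only sample comes back as surround, with nothing in the center.
surround_only = passive_decode(*encode_ltrt(0.0, 0.0, 0.0, 1.0))

print(center_only)
print(surround_only)
```

Note that the decoded surround is a single signal, which is why both rear speakers receive the same mono feed, and that a left-only or right-only source leaks into the decoded center and surround. That crosstalk is the main limitation the discrete 5.1 formats remove.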

Another metadata parameter lets content providers tell decoders how they want the channels to be summed when it's necessary for them to do so. If, for example, a decoder needs to sum a 5.1 audio program into standard stereo, how loud should the center channel material appear in the two-channel mix? Again, engineers and content providers, monitoring the metadata through a DP570 in the studio, will be able to encode their decisions into the final metadata stream to be acted upon by the home decoder. A similar parameter tells the decoder how to handle the downmix of the surround information.
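The effect of these two parameters can be sketched in a few lines of Python. This is a hedged illustration of a plain left/right stereo downmix, not the decoder's actual implementation: the function names and the dictionary layout are invented for the example, the level choices shown (-3, -4.5 and -6 dB for the center; -3 dB, -6 dB or off for the surrounds) are the options generally cited for Dolby Digital and should be checked against the current specification, and the output scaling a decoder applies to avoid overload is omitted. The LFE channel is simply left out of the downmix here, which is also an assumption of this sketch.

```python
CENTER_MIX_DB = (-3.0, -4.5, -6.0)        # assumed center downmix options
SURROUND_MIX_DB = (-3.0, -6.0, None)      # None means surrounds are discarded

def db_to_gain(level_db):
    """Convert a level in dB to a linear gain; None means muted."""
    return 0.0 if level_db is None else 10 ** (level_db / 20)

def downmix_to_stereo(frame, center_db=-3.0, surround_db=-3.0):
    """Fold one 5.1 sample frame to left/right stereo.

    frame is a dict with keys L, R, C, Ls, Rs (and optionally LFE,
    which this sketch ignores).
    """
    c = db_to_gain(center_db)
    s = db_to_gain(surround_db)
    lo = frame["L"] + c * frame["C"] + s * frame["Ls"]
    ro = frame["R"] + c * frame["C"] + s * frame["Rs"]
    return lo, ro

# Dialog sitting only in the center channel, folded down two ways.
dialog = {"L": 0.0, "R": 0.0, "C": 1.0, "Ls": 0.0, "Rs": 0.0, "LFE": 0.0}
for level in CENTER_MIX_DB:
    lo, ro = downmix_to_stereo(dialog, center_db=level)
    print(f"center at {level} dB -> Lo={lo:.3f}, Ro={ro:.3f}")
```

Auditioning the same passage at each setting is essentially what the studio monitoring step described above allows: the engineer hears what the home decoder will do with each choice before committing one to the metadata.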

The price of solid-performance six-channel speaker systems, digital receivers and DVD players has dropped substantially in the last year or so. Although digital television sets remain expensive, more and more homes are being equipped with gear that can handle multichannel digital audio. In the future, the need to monitor broadcast audio so that material designed for surround systems also plays back well on stereo and mono systems will diminish in importance. Until that day, audio post facilities will have to provide their clients with monitoring environments that ensure the highest-quality downmixes possible.

C5 Editorial is a New York City-based audio post facility that specializes in audio for film. Skip Lievsay, a principal at C5, was working on a surround sound remix of the Coen Brothers' early film, "Blood Simple," when we spoke. Lievsay executed the original mix all the way back in 1982. At that time, the producers couldn't afford to pay for a Dolby license, and the film was released in mono. Things have changed for the Coen Brothers, who felt that the time had come to remix the film in surround. "Carter Burwell went back and re-recorded his original score, and we re-recorded all of the sound effects. Plus, the Coens originally wanted to use some songs in the film, but couldn't afford to pay rights fees on them, which they can do at this point.

"Targeting the market you're mixing for is crucial when it comes down to the environment you monitor audio in," says Lievsay. "What many of my colleagues are doing, and us as well, is to create two separate monitoring areas. We need to test our mixes in a full-sized theatrical environment, but that won't tell us exactly how things will sound in the home, so we also set up a smaller room that resembles a well-designed home theater environment."

If you're 16 years old you've probably never cringed over the volume of the audio tracks played down at the local multiplex. More mature moviegoers might wish for a more subtle attenuation of levels. Engineers must monitor audio at levels that accurately reflect the level at which it will be played back. Mixes played back in the home are generally listened to at softer levels than in the theaters, and this must be accounted for. "The balances change when you listen back at softer levels, and some of the lower level information can get lost. As a result, mixers have to keep a lid on the high level material and raise the lower level audio."

Although Lievsay has never used compression on a project intended for home theater release, dynamic compression is used at times to raise the level of an entire track. In fact, dynamic range compression (DRC) is a very important monitoring issue. Mike Babbitt, Field Services Engineer at Dolby Laboratories, says that DRC is a greatly misunderstood part of metadata.

In the recording industry, compression is used to boost the overall level of a track, adding punch to lower-end frequency material in particular. DRC sets out to achieve a different goal - to allow the consumer to tailor the audio in a way that maximizes it for a particular listening environment. For example, some listening rooms force the programming material to compete with a barking dog, the apartment next door or noises coming from a crowded street. Typically, the softer portions of the audio track are more difficult to hear than the louder ones under these kinds of conditions. Raising the level of the entire audio will, however, make the louder sounds uncomfortable to the ear.

A series of dynamic range profiles is embedded within the Dolby metadata stream. When the consumer calls upon them, these profiles perform different functions on the audio, raising, for example, the level of the softer portions of the program and lowering the level of the louder portions.
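The behavior described above, boosting soft passages and trimming loud ones while leaving the middle alone, can be sketched as a simple gain curve. The Python below is only an illustration of the idea: every threshold, ratio and limit in it is a hypothetical value chosen for the example, not a figure from any actual Dolby profile, and a real system computes and smooths gain over time rather than per level reading.

```python
def drc_gain_db(level_db,
                null_low=-31.0, null_high=-26.0,   # hypothetical untouched band
                boost_ratio=2.0, cut_ratio=2.0,    # hypothetical ratios
                max_boost=6.0, max_cut=12.0):      # hypothetical limits, in dB
    """Return the gain (in dB) to apply to program material at level_db.

    Material inside the null band passes unchanged, quieter material
    is raised, louder material is lowered.
    """
    if level_db < null_low:
        boost = (null_low - level_db) * (1 - 1 / boost_ratio)
        return min(boost, max_boost)
    if level_db > null_high:
        cut = (level_db - null_high) * (1 - 1 / cut_ratio)
        return -min(cut, max_cut)
    return 0.0

# Whispered dialog, normal dialog and an explosion, as rough level readings.
for name, level in (("whisper", -41.0), ("dialog", -28.0), ("explosion", -10.0)):
    print(f"{name:9s} at {level:6.1f} dB -> gain {drc_gain_db(level):+.1f} dB")
```

A profile intended for late-night listening would, in this picture, use stronger ratios and larger limits than one intended for a quiet, dedicated room. The listener picks the behavior at playback, which is why the mix itself does not have to be compressed for the worst-case environment.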

Sophisticated home theater equipment places burdens on the audio post mixer, the content's producer and the consumer. All must be aware of the fact that the audio portion of a program will be heard in a variety of environments. Engineers have to take all of these different environments into consideration. The consumer who wants to enjoy audio properly must also learn the capabilities of the decoder, and set it up to take full advantage of the metadata stream provided by Dolby Laboratories.