Producing Audio With A Center Channel

In my last column, I wrote about my experiences with the sound of the typical broadcast audio center channel as perceived when using a dramatically improved critical listening environment. I discussed how important the center channel is, the problems it faces, and how much I learned about the center channel in broadcast audio as revealed by a really good center channel speaker (a Bang & Olufsen BeoLab 5). This month I’d like to spend a little more time discussing the center channel, concentrating on some of the production issues we have to deal with.

A VARIETY OF PATHS

When we generate and transmit a typical garden-variety mono voice signal, it gets to the end users through a variety of transmission paths, and it’s that variety that gets us. These paths include stereo, Dolby Pro Logic and discrete 5.1, plus a variety of receiver and set-top box efforts to downmix (or upmix) these variants. We have no control over which path will be chosen (they are simultaneous and in parallel), so in production we need to make sure that our all-important mono voice track will sound at least OK in all paths.

I’m focusing on the mono voice track here because it is the most important signal in televisionland. It is also, technically, the simplest. We’d like it to be perceived as coming from the talking head on the screen.

In reality, the mono voice track may be sent (in stereo) to both left and right loudspeakers. The theory here is that it will be perceived as a phantom image between the two speakers, hopefully in the middle of the screen along with the talking head.

It may also be sent to just a center speaker (in 5.1) or to left, center and right (also in 5.1, when the mixer decides it would be good to limit the divergence or the percentage of a signal sent to the center speaker).

Finally, it may be “steered” to the center channel, in Dolby Pro Logic. What happens here (approximately) is that the processing algorithm extracts “sum” and “difference” components from the stereo signal. Our mono voice track is almost always a sum signal, because it is sent equally to left and right and is phase-locked. Whenever its magnitude reaches a certain threshold, it is routed to the center speaker. Clever, eh? It works pretty well, so long as there is plenty of discrete left and right, plus some difference signal, to mask the steering action.
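To make the sum and difference idea concrete, here is a minimal sketch in Python. It is not Dolby's decoder, just the arithmetic behind the description above; the function name and the scaling by one half are my own assumptions for illustration.

```python
def sum_and_difference(left, right):
    """Return (sum, difference) signals for a stereo pair of equal length.

    Illustrative only: the 1/2 scaling is an assumption, chosen so a
    signal sent equally to both channels comes back at its own level.
    """
    total = [(l + r) / 2 for l, r in zip(left, right)]
    diff = [(l - r) / 2 for l, r in zip(left, right)]
    return total, diff


# A mono voice sent equally, and in phase, to left and right.
voice = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
voice_sum, voice_diff = sum_and_difference(voice, voice)

# A signal panned hard left: nothing at all in the right channel.
silence = [0.0] * len(voice)
left_sum, left_diff = sum_and_difference(voice, silence)
```

The mono voice ends up entirely in the sum signal with a difference of zero, which is what makes it a candidate for steering to the center. The hard-left signal shows up equally in sum and difference, so it is not treated as center material.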

You would think it would be pretty hard to mess this up, but in fact, serious problems emerge, particularly given the great range of competence to incompetence in the setup of end user systems and the dismal quality of audio knowledge out there in end user land.

Fig. 1: A Digidesign Pro Tools 5.0 panning tool, showing a signal assigned to center. Note the divergence and center percent settings. The blue outline on the screen denotes the actual limits within which panning happens.

For example, I’ve received a significant number of letters complaining about how the announcer’s voice is sometimes or often faint or inaudible. This will occur, of course, when no center speaker is hooked up and the receiver is configured to use either Dolby Pro Logic or Dolby Digital (5.1). If the voice track is sent only to the center channel in such a case, it will simply be inaudible. If there is some divergence or center percent, the voice may be audible but weak in the mix.

Some TV channels suffer from just the opposite problem. They seem to be determined to never lose the voice, so they send it, in equal amounts, to both left and right and to center. This results in a 6 dB buildup (unless you don’t have a center speaker), causing the voice track to be unreasonably loud, just about masking any stereo material that might be present. Nasty!

Also, for some complex reasons we will not discuss here, phantom images sound different than discrete sources. The timbre of a sound from a single speaker is different from the timbre of the same sound perceived as a phantom image derived from two speakers. Sometimes that difference is quite significant and can be annoying.

Further, when three speakers are emitting the same sound from approximately the same region in space (oh, say, the front of the room), if we aren’t precisely on the median plane we may hear some moderately unpleasant comb-filtering artifacts that are a consequence of hearing multiple versions of the same sound from slightly different points (and times) in space. Yikes!
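For a rough feel of where those comb-filter notches land, here is a small sketch. It assumes the simplest case, two equal-level copies of the same sound arriving over paths of slightly different length, and a speed of sound of 343 m/s; the function name and the 10 cm example are hypothetical. Three speakers and room reflections make the real picture messier.

```python
SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def comb_notches(path_difference_m, count=3):
    """First few notch frequencies (Hz) for two equal copies of one sound.

    The later copy arrives path_difference_m / SPEED_OF_SOUND seconds
    after the first. Cancellation occurs wherever that delay equals an
    odd number of half periods: f = (2k + 1) / (2 * delay).
    """
    delay = path_difference_m / SPEED_OF_SOUND
    return [(2 * k + 1) / (2 * delay) for k in range(count)]


# A listener sitting 10 cm closer to one speaker than to the other.
notches = comb_notches(0.10)
```

With a 10 cm path difference the first notches fall at roughly 1.7, 5.1 and 8.6 kHz, squarely in the range that matters for speech clarity.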

WHAT DOES THIS MEAN IN PRODUCTION?

So what do we do? How do we handle this in production? I’m going to lapse into Pro Tools-speak here, to make some concrete suggestions. First, assume a variety of tracks or stems going into a surround mix that will also be downmixed for stereo transmission and encoded into Dolby Pro Logic. Assume 5.1 surround bus assignments (see Fig. 1 and Fig. 2).

Take a look at the divergence and “center percent” settings. Divergence governs how much of a signal spreads to channels other than the one it is assigned to, and it is available across the front, across the back, or between front and back. Note that the scale runs opposite to what the name suggests: at 100 percent divergence the signal is sent only to the assigned channel, and lower settings send progressively more of it to the neighboring channels.

Center percent refers to the amount of left/right summed signal that is sent to the center channel. One hundred percent means that unity gain from both left and right is sent to the center speaker, resulting in a 6 dB buildup.
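The 6 dB figure is easy to check. The sketch below assumes the copies are identical and in phase, so their amplitudes simply add; the function name is hypothetical.

```python
import math


def buildup_db(*gains):
    """Level in dB of the in-phase sum of several copies of one signal.

    Each gain is a linear amplitude factor, and the result is relative
    to a single unity-gain copy. Assumes perfectly coherent summation.
    """
    return 20 * math.log10(sum(gains))


# Left and right, both at unity gain, summed into the center channel.
center_buildup = buildup_db(1.0, 1.0)
```

Two unity-gain copies sum to twice the amplitude, which works out to just over 6 dB.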

Fig. 2: The relative output levels that occur with the center channel pan assignment in Fig. 1.

So, these features permit a user-specified amount of crosstalk between channels. Said crosstalk tends to (a) enhance envelopment and (b) ease center-channel problems when there is no center speaker (or a poor quality one) in playback. On the downside, it constrains hard assignments to discrete speakers, making images a little less distinct.

The precise chosen value of divergence or center percent really depends on the specific material and context, of course. However, here are some useful default starting points for you to consider.

For divergence, I suggest 90 percent across the front (–20 dB crosstalk) and 80 percent across the back (–14 dB), the latter for a little better envelopment. To my ears, front to back should have no crosstalk at all, to keep the surround ambience and positions as strong as possible.

For center percent, I find I keep settling on 40 percent (about –4.5 dB crosstalk), which to me yields the most benign compromise between the phantom and the discrete image, while protecting against a missing or deficient center channel.
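The crosstalk figures quoted for these settings can be reproduced with a simple rule of thumb: treat whatever the percentage leaves over as the linear amplitude that leaks into the other channels. That mapping is my assumption, chosen because it matches the numbers above; it is not taken from the Pro Tools documentation, and the function name is hypothetical.

```python
import math


def crosstalk_db(setting_percent):
    """Crosstalk level in dB for a divergence or center percent setting.

    Assumed mapping (not from any vendor documentation): the leaked
    amplitude is 1 - setting/100, so 100 percent means no crosstalk.
    """
    leak = 1.0 - setting_percent / 100.0
    if leak <= 0.0:
        return float("-inf")  # nothing leaks at 100 percent
    return 20 * math.log10(leak)


front = crosstalk_db(90)   # front divergence
back = crosstalk_db(80)    # rear divergence
center = crosstalk_db(40)  # center percent
```

Under that assumption the three settings come out at about -20 dB, -14 dB and -4.4 dB respectively.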

As a general rule, I would suggest similar panning constraints for all tracks and stems in the mix, except when there is a specific effect that needs to be hard panned to a discrete speaker.

THE BEST APPROACH

Careful, conservative crosstalk is probably the best approach here. The multiplicity of playback situations our work will encounter makes it advisable to ensure that our production effort works well across the general range of playbacks, including some commonly defective situations such as the missing center speaker. Toward that end, some judicious tweaking of the crosstalk and level relationships between the center and other channels can help the all-important center voice channel survive out there in listener/viewer land.

Next month, we’ll consider the surround channels.

Thanks for listening.