Why Not Just Set the Level Once and for All?

I got an e-mail a little while ago from Jimmy Chivers, a recently hired chief engineer for KVHP-TV in Lake Charles, La. Among other concerns, he mentioned that his boss was on his case because the commercials were coming out much louder than the program audio. Naturally, I suggested that he check out what the sponsors' ad agencies were actually giving the station, and that he be aware that ad agencies consider it good sport to compress the bejeezus out of their audio to get a leg up on the program audio and on their competition. This correspondence has led to an ongoing intermittent phone conversation covering a fair range of topics (right now, I'm "it").

Meanwhile, I recently hung out in Denmark with a guy named Uffe Kjems Hansen from TC Electronic. He showed me a gadget called the P2 Level Pilot, which might actually take care of Jimmy's levels dilemma.

This all got me to thinking about the "levels" problem. It really is a problem, and the obvious solution of using a compressor to regulate levels doesn't really solve it, for reasons I'll discuss in a minute. I run across it all the time as a couch potato watching and listening to my TV, and also while driving around in my car. So do you. And we all are annoyed by it.

To get an idea of the real magnitude of the problem, I conducted an unscientific study of it as it exists on my television. I have a "digital" cable feed, plus a VCR (reading an "analog" cable signal), plus a DVD player, all standard stuff. I used Channel 2 (a local PBS channel) as a reference, and set the TV audio at "60" (an arbitrary, medium-loud setting with voice, approximately 70 dBA SPL at about 7 feet). I observed "all" the other channels, plus a comparison of the VCR feed and the DVD feed, estimating the difference in voice level as I compared them with Channel 2. I made a few perfunctory notes about the "type" of channel, the presence of "obvious" heavy limiting, etc. I made no distinction between commercials and program, although I think if we separated those things out, we would get an additional significant family of results.

The results weren't pretty. The estimated range of levels across the 110 channels I observed (I just used my "favorite" list, plus some local access and shopping channels) was a whopping 18 dB! Worse, the standard deviation between channels was approximately 5 dB, which is to say that roughly one channel in three sits 5 dB or more away from the average level! What this in turn means is there is at least a one-in-three chance you will feel the need to change the level when you change the channel.
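For anyone who wants to check the one-in-three arithmetic, here is a minimal sketch. It assumes the channel levels are roughly normally distributed and independent of each other, which my couch survey certainly never verified. The function name and the variable names are mine, purely for illustration.

```python
import math

def prob_beyond(threshold_db, sigma_db):
    """Two-sided chance that a zero-mean normal value with standard
    deviation sigma_db lands more than threshold_db away from zero."""
    return 1.0 - math.erf(threshold_db / (sigma_db * math.sqrt(2.0)))

SIGMA_DB = 5.0  # approximate standard deviation between channels

# Chance that a single channel sits 5 dB or more from the average level.
p_from_average = prob_beyond(5.0, SIGMA_DB)

# Difference between two independent channels: the variances add,
# so the standard deviation of the difference is sigma * sqrt(2).
p_between_channels = prob_beyond(5.0, SIGMA_DB * math.sqrt(2.0))

print(f"5 dB or more from average:        {p_from_average:.2f}")
print(f"5 dB or more between two channels: {p_between_channels:.2f}")
```

Under those assumptions the first figure comes out near one in three, and the channel-to-channel figure comes out closer to one in two, so one in three is, if anything, on the kind side.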

Why is this? Partly, it's pure lunacy and sloth. For instance, two channels both transmitting the ABC network feed differed by about 4 dB! There is no valid reason for this. Similarly, the variance in level between the "digital" cable feed and the "analog VCR" feed (which is at unity gain for its throughput signal) is nonsensical. It can be argued, of course, that the "digital" feed is subject to metadata regulation such as dialnorm and the "analog" feed is not. However, it can be equally argued that this is exactly what metadata was intended to prevent, not cause!

Some tendencies stood out. Home shopping channels were louder (but not always). Movie channel bundles (e.g. HBO and Cinemax) tended to run soft, and the music channel bundle ran really soft (this does not include VH1, MTV or their operational cousins, which tended to be on the loud side). Such practices afford a wider dynamic range and more headroom for the movies and background listening music, while the shopping channels enthusiastically and shrilly shill their wares, just like we'd expect.

CAN'T WE JUST COMPRESS AND FORGET?

The tried-and-true solution that broadcasters have used for years to maintain stable levels is fairly massive compression and/or limiting prior to transmission. But two problems exist with such processing.

First, there is no standard for transmitted audio signal magnitude, so regardless of how carefully any given station maintains its levels, its nominal level may be (and probably is) different from other stations' nominal levels.

Second, a compressor depends on the nature of the program material over time in order to work: it varies its output as a function of the short-term state of the signal. It makes loud moments soft and soft moments loud, regardless of what this does to the "meaning" of the program material carried by the signal. You can easily hear this, and it is often both annoying and inappropriate, definitely in the realm of "bad audio," even if we do it all the time.
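To put numbers on "loud moments soft and soft moments loud," here is a minimal sketch of the static curve of a generic downward compressor with make-up gain. The threshold, ratio and make-up values are made up for illustration and don't describe any particular broadcast processor. A real unit also has attack and release timing, which is left out here.

```python
def compressor_output_db(level_db, threshold_db=-30.0, ratio=4.0, makeup_db=9.0):
    """Static curve of a simple downward compressor (illustrative values).

    Anything above the threshold is scaled down by `ratio`;
    make-up gain then lifts the whole signal.
    """
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

shout = compressor_output_db(-6.0)     # a loud moment going in
whisper = compressor_output_db(-36.0)  # a soft moment going in

print(f"shout:   -6.0 dB in, {shout:.1f} dB out")
print(f"whisper: -36.0 dB in, {whisper:.1f} dB out")
```

With these settings the shout comes out 9 dB lower and the whisper 9 dB higher, so a 30 dB difference the producer intended shrinks to 12 dB. The compressor has no idea whether that difference mattered.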

A NEW GADGET

This is why the new TC Electronic product is so interesting to me. It functions with multiple time bases, as well as some multiband compression. First, it takes a look at the long-term level of the program material and slowly moves that level toward a user-defined target level (at, say, 1 dB every 2 seconds; you get to pick the rate). Meanwhile, it uses multiband compression and expansion to keep a handle on the short-term deviations from the target level. And finally, in order to head off long-term pumping during periods of silence or very low levels, it has a user-defined "window" of levels to regulate. When the signal goes above or below that window, it doesn't change the long-term level (and I think it skips the multiband compression/expansion as well), working on the assumption that such levels are a legitimate part of the program.
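The slow-steering idea can be sketched in a few lines. This is only my reading of the behavior described above, not the P2's actual algorithm: the function name, the parameter names and every number are hypothetical, and the multiband compression/expansion stage is left out entirely.

```python
def steer_gain(levels_db, target_db=-20.0, rate_db_per_s=0.5,
               window_db=(-40.0, -5.0), step_s=1.0):
    """Return the gain (dB) in effect after each long-term level reading.

    levels_db holds one long-term level reading per step of step_s seconds.
    The gain creeps toward the target at no more than rate_db_per_s, and is
    held whenever the reading falls outside the regulation window.
    """
    low_db, high_db = window_db
    max_step_db = rate_db_per_s * step_s
    gain_db = 0.0
    gains = []
    for level_db in levels_db:
        if low_db <= level_db <= high_db:
            error_db = target_db - (level_db + gain_db)
            # Move toward the target, but never faster than the chosen rate.
            gain_db += max(-max_step_db, min(max_step_db, error_db))
        # Outside the window: leave the gain alone so silence doesn't pump.
        gains.append(gain_db)
    return gains

# Six seconds of quiet program, three seconds of near silence, six more of program.
readings = [-26.0] * 6 + [-60.0] * 3 + [-26.0] * 6
gains = steer_gain(readings)
print(gains)
```

In this run the gain climbs half a dB per second (1 dB every 2 seconds), sits still through the near silence instead of cranking up the noise floor, then finishes the remaining climb to the 6 dB the program needs.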

During the half hour we used the P2, we cobbled up a broad range of signals and signal feeds, including CDs, radio and television. Happily, the P2 never lost its composure, and the various signal levels remained reasonably stable across something like the 18 dB range I observed on my television today, while at the same time maintaining their own integrity in terms of dynamic range.

WHAT ABOUT STANDARDS?

A sophisticated processor such as the Level Pilot isn't really the way to think about this, however useful it may be. The real issue here has to do with our professional integrity regarding levels management in production and transmission. It might be useful to standardize levels, as well as to work a little harder toward a world where the viewer/listener experiences a consistency of audio levels roughly equivalent to the consistency that exists in terms of color and brightness between channels. Further, it would be really helpful for stations, from the local operations like Jimmy Chivers' KVHP to the big urban stations and cable providers, to ensure that the levels from their various sources match.

Dialnorm, from Dolby's metadata toolbox, is supposed to provide just such a level. And while I have some reservations about how they measure a "normal dialogue" level, they do in fact specify what that level should be (-31 dBFS, A-weighted, for dialogue).
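The arithmetic behind dialnorm is simple enough to sketch. As I understand it, the metadata carries the measured dialogue level of the program, and the decoder turns the program down by the difference between that value and the -31 dBFS reference. The function below is my own illustration of that subtraction, not Dolby's code, and its name is hypothetical.

```python
REFERENCE_DIALOGUE_DBFS = -31  # the dialogue level named above

def dialnorm_attenuation_db(dialnorm_dbfs):
    """Attenuation (dB) that brings dialogue at dialnorm_dbfs down to the reference.

    Assumes the dialnorm value is a whole number between -31 and -1 dBFS.
    """
    if not -31 <= dialnorm_dbfs <= -1:
        raise ValueError("dialnorm is expected to lie between -31 and -1 dBFS")
    return dialnorm_dbfs - REFERENCE_DIALOGUE_DBFS

# A program whose dialogue was measured at -27 dBFS gets turned down 4 dB.
loud_program = dialnorm_attenuation_db(-27)
# A program already at the reference is left alone.
reference_program = dialnorm_attenuation_db(-31)
print(loud_program, reference_program)
```

The catch is that the scheme only works if the value in the metadata matches the program. A station that leaves dialnorm at some default regardless of content gets the wrong correction.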

Not a bad idea. Consistent audio. Good audio. For a change. We should try it. We might even like it!