Audio Levels In a Small Listening Room

I started this series last November by describing my adventures teaching about Critical Listening to the employees of iZotope, a high-tech software firm in Cambridge, Mass. As you may recall, I described the trouble they had hearing much of what I was trying to demonstrate, that is until I got them out of their conference room and into my studio, where, all of a sudden, it all became quickly and blessedly obvious to them.

In that article ("Learning What's Involved in Critical Listening," Nov. 3), I also confessed that I had forgotten the universal truth about the importance of room acoustics, having been happily situated in my studio for nearly 20 years, and that this sloth on my part had led me into Bad Habits and Pedagogical Perversions pertaining to my Teaching Strategies and Efforts regarding Critical Listening. (Critical Listening is, of course, the listening we do to evaluate other people's audio work, for the purpose of making it sound better, for money.)

One of the things I did not mention in that article is how the students described their surprise, even shock, at how loud my listening levels were in comparison to their geeky headphone and computer-monitor listening experiences. And that's what we are going to talk about today: Listening Levels, Dynamic Range and, of course, the Dreaded Noise Floor!

ABOUT MY LISTENING LEVELS

My listening levels in the studio are calibrated, sort of. What that means is that a known signal (Pink Noise @ –14 dBFS RMS) sent to any one of my speakers yields 85 dB SPL (slow detection) at the client's position when my monitor controller level is set at "85." Cool!

It is important to note that these levels are not A-weighted. An A-weighted measurement (which is fairly typical) would yield a measured level 6–9 dB less than my measurement, or approximately 76–79 dB SPL, varying as a function of the spectrum of the signal.
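For readers who want to see why an A-weighted reading comes out lower, here is a minimal Python sketch of the standard analog A-weighting curve. The function name `a_weighting_db` and the sample frequencies are my own choices for illustration. The sketch only shows how much the curve discounts each frequency; it does not reproduce the 6–9 dB difference quoted above, which depends on the actual spectrum arriving at the meter.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """Approximate A-weighting gain in dB at freq_hz (standard analog curve)."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


# Illustrative frequencies only: low frequencies are discounted heavily,
# which is why bass-rich signals read much lower on an A-weighted meter.
for freq in (50, 100, 1000, 4000):
    print(f"{freq:>5} Hz: {a_weighting_db(freq):+6.1f} dB")
```

The more low-frequency energy a signal carries, the larger the gap between the unweighted and A-weighted readings.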

Unfortunately, the broadcast feed to my console is not calibrated and also varies widely as a function of channel selection (as I've described on numerous occasions). I've arbitrarily selected one major network channel (where a former audio student of mine is in charge of things) and set the console levels so that under normal primetime news broadcast conditions, the measured accumulated level (Leq) at the client position (all channels playing typical program material) is (or was, last time I calibrated) approximately 78 dB SPL when my monitor controller level is set at "85."

This means that:

(a) for almost all other broadcast channels, the level will be something other than 78 dB SPL, varying over a range of something like 15 dB;
(b) for surround playback of a DVD (a different, calibrated feed), the –14 dBFS level will yield approximately 91–92 dB SPL;
(c) for playback of a modern CD with hypercompressed levels, the playback level will be between 90 and 94 dB SPL, and I've made it a practice to drop the monitor controller level to "80."
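
The level arithmetic behind items (b) and (c) can be sketched in a few lines of Python. Two assumptions here are mine, not measurements: that the surround feed drives five main speakers with uncorrelated signals (so their powers add), and that the monitor controller is marked in 1 dB steps, so a setting of "80" sits 5 dB below "85." The helper names are hypothetical.

```python
import math


def sum_uncorrelated_db(levels_db):
    """Combine SPL values for uncorrelated sources by adding their powers."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


def spl_at_setting(spl_at_85: float, setting: float) -> float:
    """Shift a level measured at setting "85" to another setting.

    Assumption: the controller scale is in dB, one unit per dB.
    """
    return spl_at_85 + (setting - 85.0)


PER_SPEAKER_SPL = 85.0   # from the column: one speaker at -14 dBFS RMS
ASSUMED_CHANNELS = 5     # assumption: five main channels, subwoofer ignored

surround_spl = sum_uncorrelated_db([PER_SPEAKER_SPL] * ASSUMED_CHANNELS)
print(f"All channels at -14 dBFS: about {surround_spl:.1f} dB SPL")

# Hypercompressed CD playback, 90-94 dB SPL at "85", after dropping to "80".
cd_range_at_80 = tuple(spl_at_setting(spl, 80) for spl in (90.0, 94.0))
print(f"CD level at setting 80: {cd_range_at_80[0]:.0f} to {cd_range_at_80[1]:.0f} dB SPL")
```

Under those assumptions the surround figure lands in the 91–92 dB SPL range quoted in (b), and dropping to "80" brings the hypercompressed CD back to the mid-to-upper 80s.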

Fig. 1 The Audio Window, with possible signal levels and noise floor (not to scale; proportions not correct)

THE AUDIO WINDOW

Let's consider this in a broader perspective. Take a look at Fig. 1, "The Audio Window."

That yellowish blob in the middle of the graphic is approximately "all the stuff we can hear," placed within ranges of amplitude (dB SPL) and frequency (Hertz). Note that if we've "calibrated" our system so that one speaker, driven by a –14 dBFS signal, yields 85 dB SPL, then our "nominal level" in stereo will be about 90 dB SPL and the maximum level is probably a little less than 105 dB SPL (this will be slightly less than zero dBFS going to each of two speakers). Assuming that our speakers each require 1 Watt of power to generate 88 dB SPL at our listening position (typical), each speaker is going to require 16 Watts of power to play back 0 dBFS. This is actually pretty close (3–6 dB) to the upper limits for any small or medium-sized studio monitor.

So, in any reasonably normal listening room or control room situation, our "nominal" listening levels are pretty close to the upper limits of what the system can deliver. Those levels are also subjectively quite "loud." And that is probably why my iZotope students had such a reaction of "Wow, this is really loud!" They are very much used to listening at much lower levels while working at their desks and workstations, and they have almost no critical understanding of either the "meaning" or the implications of softer or louder levels.

At the same time, look at the "nominal" noise floor at ca. 50 dB SPL. Because of it, we only have 55 dB of dynamic range, and a 41 dB range of signal-to-noise. For any critical production work, we need that dynamic range in order to hear low-level details. If we reduce our nominal level, it won't change our nominal noise floor, which is made up of mostly acoustical noises (that we can't get rid of) which are unrelated to the monitor level setting.
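
The Audio Window numbers in this section can be checked with a short Python sketch. It is a back-of-envelope calculation, not a measurement. I assume the two stereo speakers carry identical (fully correlated) signals, which adds about 6 dB and is the best case; uncorrelated program material would add only about 3 dB. The constant and function names are mine.

```python
import math

CAL_SPL = 85.0        # dB SPL from one speaker at the calibration level
CAL_DBFS = -14.0      # calibration signal level, dBFS RMS
NOISE_FLOOR = 50.0    # nominal room noise floor, dB SPL
SENSITIVITY = 88.0    # dB SPL at the listening position for 1 watt (typical)


def combine_identical(level_db: float, count: int, coherent: bool) -> float:
    """Level of `count` equal sources: 20*log10 if coherent, else 10*log10."""
    factor = 20.0 if coherent else 10.0
    return level_db + factor * math.log10(count)


peak_per_speaker = CAL_SPL - CAL_DBFS                     # SPL at 0 dBFS
nominal_stereo = combine_identical(CAL_SPL, 2, coherent=True)
max_stereo = combine_identical(peak_per_speaker, 2, coherent=True)

dynamic_range = max_stereo - NOISE_FLOOR
signal_to_noise = nominal_stereo - NOISE_FLOOR

# Amplifier power per speaker to reach 0 dBFS: 10 dB more level costs 10x power.
watts_at_full_scale = 10 ** ((peak_per_speaker - SENSITIVITY) / 10)

print(f"Nominal stereo level: {nominal_stereo:.0f} dB SPL")
print(f"Maximum stereo level: {max_stereo:.0f} dB SPL")
print(f"Dynamic range:        {dynamic_range:.0f} dB")
print(f"Signal-to-noise:      {signal_to_noise:.0f} dB")
print(f"Power per speaker:    {watts_at_full_scale:.1f} W")
```

With these inputs the sketch gives the 55 dB dynamic range and 41 dB signal-to-noise figures quoted above. The per-speaker power comes out a little under the 16 watts mentioned in the text, which corresponds to a round 12 dB above the 1 watt reference. Either way, the conclusion holds: nominal levels sit close to what small monitors can deliver.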

WHY BE CONCERNED?

When I normally listen to live broadcast video (news and sports) at my calibrated setting, I find the voice-overs in the center channel to be below my defined nominal level, ca. 70–75 dB SPL. Music, FX and related stuff (and commercials) often rise above this, but as a general rule, it's all pretty reasonable, and generally pleasant to listen to. However, it can actually get fairly loud on occasion (to a point where people in an adjacent room might complain). Further, those pesky commercials will often rise by 3–6 dB, while maintaining a similar peak level due to compression. If the channel is a loud one, this whole thing can become quite annoying. If we want to turn up the level to get into the spirit of a sporting event, for instance, commercials can become extremely annoying! This is, of course, why the CALM Act came into being. In such cases, I drop the level by 5 dB, just as I've learned to do for hypercompressed CD playback.

This makes for a vibrant and satisfying audio playback that integrates well with film and video, and is certainly equivalent to a very good commercial theatre. It is also generally realistic, even though dialogue levels are louder than they would be in real life (keep in mind that the on-screen images are also larger—the dialogue becomes noticeably unconvincing when reduced to an "authentic" 65–70 dB SPL).

But, there is not much headroom left and, in production, it all gets harsh and shrill when we go above these levels. Peak levels over 92 or 93 dB SPL actually begin to hurt, and sustained RMS or Leq levels above 85 dB SPL begin to be wearing. Further, the monitors will be running out of power (beginning to distort) unless they really have excess power-handling capacity, and the resulting distortion artifacts can also be quite annoying. Meanwhile, as I mentioned earlier, if you reduce the monitor level you begin to conceal low-level detail as well as emotional impact.

Therefore, you need to get a handle on the actual measured levels at which you are working. You really do need a sound level meter to check your levels until you've got those levels in your ears and can estimate the actual Sound Pressure Level within a couple of dB by ear alone. (You can become totally awesome by being able to do this with A-weighted measurements as well!) We cannot mix by pain thresholds alone!

Now that the CALM Act is law and the FCC has only a year to get organized on the enforcement of it, this is a great time to get good at this. Getting comprehensive understanding and control of your production working audio levels, in your good room with your good loudspeakers, is a huge part of mastering your audio production craft, and of learning to stay within the law.

Next column, we'll address the "Myth of Awfultones," wherein you think you've got to mix on bad speakers because that's what people listen on! Should be fun! Thanks for listening.

Dave Moulton is terrific at giving other people advice and dozing by the fire. Other than those things, you can complain to him about anything at his website, www.moultonlabs.com.