Perceived loudness, particularly of commercials, has been a problem and a complaint as long as commercial television has been with us.
I mentioned in my last column that it isn't just commercial television either, as my local PBS station specializes in blasting me off my couch with their promos and commercials, too.
Previous efforts to address commercial loudness dead-ended for a variety of reasons, but we are well into the digital television age, and we are told that we now have the tools to deal with this loudness problem. This may be true, but only time will tell whether the relevant parties at last have the requisite will to deal with the problem.
THE APPROVED RP
The ATSC made a major effort to address television audio loudness when, on Nov. 4, it approved the "ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television." A PDF of the guidelines can be accessed at www.atsc.org. The RP was approved the same day a seminar addressing the loudness topic was held in Washington, D.C.
This document—the result of more than two years of work by the drafting committee—is based on work that goes back more than 10 years in ATSC committees, notably in the now-dormant Implementation Subcommittee and its subgroup, the Systems Engineering Working Group.
Mark Richer (L), president of the ATSC, and Pat Waddell, technical marketing manager at Harmonic, both speakers at the Nov. 4 ATSC Seminar on Audio Loudness. Photo by James E. O'Neal
The RP is an ambitious document, amounting to about 65 pages total, with about 13 pages comprising the RP itself. The table of contents is two-and-a-half pages long. The RP begins with a brief description of the AC-3 multichannel audio system, followed by a section on loudness measurement, a relatively new concept to the television engineering community, who have been accustomed to making instantaneous audio level measurements. This section begins with a description of the loudness measurement method and algorithm described in ITU-R BS.1770. This is the method that yields loudness levels in units of dB LKFS, which are dB relative to Full Scale using K-weighting, the measurement weighting algorithm described in BS.1770.
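For readers who like to see the arithmetic, here is a minimal Python sketch of a K-weighted loudness reading. It is a simplification under stated assumptions, not the RP's procedure: it handles a single front channel sampled at 48 kHz, uses the two filter stages and the –0.691 dB offset published in BS.1770, and applies no gating. The function names are mine, chosen for illustration.

```python
import math

# K-weighting biquad coefficients published in ITU-R BS.1770 for 48 kHz sampling.
# Stage 1 is a high-shelf "pre-filter"; stage 2 is the "RLB" high-pass.
PRE_B = (1.53512485958697, -2.69169618940638, 1.19839281085285)
PRE_A = (1.0, -1.69065929318241, 0.73248077421585)
RLB_B = (1.0, -2.0, 1.0)
RLB_A = (1.0, -1.99004745483398, 0.99007225036621)


def biquad(samples, b, a):
    """Direct-form I second-order IIR filter (a[0] is assumed to be 1)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def loudness_lkfs(samples):
    """Ungated loudness of one front channel, in LKFS (illustrative helper)."""
    weighted = biquad(biquad(samples, PRE_B, PRE_A), RLB_B, RLB_A)
    mean_square = sum(v * v for v in weighted) / len(weighted)
    return -0.691 + 10.0 * math.log10(mean_square)


# One second of a full-scale 1 kHz sine wave at 48 kHz.
FS = 48000
tone = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(FS)]
tone_lkfs = loudness_lkfs(tone)
print(round(tone_lkfs, 1))
```

A full-scale sine near 1 kHz in one channel should read close to –3 LKFS, which is a handy sanity check for any meter. A real measurement would sum all channels with their BS.1770 weights and integrate over the whole program or commercial segment.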
Loudness measurements are addressed in a variety of situations, including measurement during production or post production, measurement during live events, measurement of finished long-form content and short-form content, and file-based loudness measurement.
There is a section describing audio monitoring setups that addresses the acoustic characteristics of rooms and production spaces, and how to perform reference-level calibration. The test signals recommended for calibration are described. These include a 440 Hz sine wave tone, which is a new test frequency to old television hands who have traditionally used a 400 Hz tone. We know that 440 Hz is A above middle C, also known as Concert A, the traditional orchestral tuning note.
The RP notes that in addition to its musicality, the 440 Hz tone was selected because it has no harmonic relationship to any digital audio sample frequency, and it is in the flattest portion of the K-weighting curve. The other recommended test signal is band-limited pink noise.
A chapter is devoted to "Methods to Effectively Control Program-to-Interstitial Loudness," i.e., how to avoid blowing the viewer off the couch when a transition is made from program to commercial break. This chapter describes methods to use when employing either a fixed dialnorm system or an agile dialnorm system, both of which are described in the following chapter on audio metadata management.
In a fixed dialnorm system, a television station's AC-3 encoder is set to a fixed dialnorm number, which can range from 31 to 1. In the ATSC receiver, the AC-3 decoder will attenuate the incoming level by the difference between the received dialnorm number and –31 dBFS. If the incoming dialnorm number is 31 (the negative sign is not used), the incoming audio will not be attenuated at all. If the incoming dialnorm number is 24, for example, the incoming audio will be attenuated by 31 – 24 = 7 dB. (The actual equation used here is –31 dBFS + 24 dialnorm = –7 dB volume adjustment, but the previous equation is easier to understand.) For a given incoming audio level, the higher the dialnorm number is, the less the decoder attenuates and the louder the audio delivered to the viewer.
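The decoder arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the subtraction; the function names are hypothetical and do not come from the RP or from any AC-3 decoder.

```python
def dialnorm_attenuation_db(dialnorm):
    """Attenuation, in dB, that the receiver applies for a given dialnorm number."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm must be between 1 and 31")
    return 31 - dialnorm


def level_after_dialnorm(level_db, dialnorm):
    """Level delivered to the viewer after the decoder applies dialnorm."""
    return level_db - dialnorm_attenuation_db(dialnorm)


print(dialnorm_attenuation_db(31))   # no attenuation
print(dialnorm_attenuation_db(24))   # the 7 dB example from the text
print(level_after_dialnorm(-24, 24))
```

Material whose loudness matches its dialnorm number always lands at the same place after the decoder: audio at –24 with a dialnorm of 24 comes out at –31.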
The RP also describes the concept of "Target Loudness." The loudness of a program or commercial segment is measured over some time period using the K-weighting algorithm, yielding a loudness figure in LKFS. The audio material is mixed or adjusted so that the absolute value of its loudness number equals the fixed dialnorm number used by the ATSC encoder. That is to say, if the encoder's dialnorm value is set to 24, Target Loudness is –24 LKFS. In a fixed dialnorm system set at 24, the loudness of all program and commercial elements is measured and conformed to –24 LKFS. If this is done, the embedded metadata in all such material received in the home forces the "dialnorm volume control" to attenuate all incoming audio by 7 dB, and the loudness perceived by the viewer remains relatively stable.
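As a rough illustration of conforming material to Target Loudness in a fixed dialnorm plant, the Python sketch below computes the gain each segment needs and the loudness that reaches the viewer. The segment names and measured loudness values are invented for the example.

```python
FIXED_DIALNORM = 24
TARGET_LOUDNESS = -FIXED_DIALNORM  # Target Loudness for an encoder set to 24


def conform_gain_db(measured_loudness, target=TARGET_LOUDNESS):
    """Gain, in dB, that brings a measured segment to the Target Loudness."""
    return target - measured_loudness


# Hypothetical measured loudness of three segments before conforming.
measured = {"drama": -27.0, "promo": -19.5, "commercial": -21.0}

# Apply the conforming gain to each segment.
conformed = {name: lk + conform_gain_db(lk) for name, lk in measured.items()}

# The receiver then attenuates everything by the same 31 - 24 = 7 dB.
delivered = {name: lk - (31 - FIXED_DIALNORM) for name, lk in conformed.items()}

for name in measured:
    print(name, conform_gain_db(measured[name]), conformed[name], delivered[name])
```

The quiet drama is raised by 3 dB, the hot promo is pulled down by 4.5 dB, and every segment reaches the viewer at the same loudness.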
In an agile dialnorm system, all program and commercial material, each segment of which may have a different loudness number, also has accurate dialnorm metadata embedded in it. The AC-3 encoder used in this scheme is capable of receiving the embedded dialnorm number of incoming material, and dynamically adjusting the encoder to transmit that same dialnorm number to the receiver, which dynamically adjusts the attenuation of the incoming audio material.
This is analogous to an automatic hand continuously turning the receiver's volume control up and down, as appropriate to compensate for loudness fluctuations, and ensuring relatively stable loudness for the viewer. For such a system to work, all program and commercial elements must have accurate dialnorm metadata embedded, so that the encoder can dynamically adjust the transmitted dialnorm number appropriately. Both the fixed and agile dialnorm approaches accomplish the objective of delivering stable loudness to the viewer.
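A small Python sketch may help show why the agile approach depends on accurate metadata. The segments and their loudness figures are hypothetical, and the `Segment` class is my own illustration rather than any real encoder interface.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    loudness: float   # measured loudness of the segment as delivered to the encoder
    dialnorm: int     # dialnorm number embedded in the segment's metadata


def delivered_loudness(segment):
    """Loudness after the receiver attenuates by 31 minus the dialnorm number."""
    return segment.loudness - (31 - segment.dialnorm)


# Segments with different loudness, each carrying accurate metadata.
playlist = [
    Segment("movie", -27.0, 27),
    Segment("commercial", -18.0, 18),
    Segment("news", -24.0, 24),
]
accurate = [delivered_loudness(s) for s in playlist]

# The same loud commercial carrying the movie's dialnorm number by mistake.
mislabeled = Segment("loud commercial", -18.0, 27)
jump_db = delivered_loudness(mislabeled) - accurate[0]

print(accurate)
print(jump_db)
```

With accurate metadata, every segment lands at the same loudness in the home. The mislabeled commercial comes through 9 dB hotter than the movie it interrupts, which is exactly the couch-blasting effect the scheme is meant to prevent.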
Following the chapter on metadata management is a chapter on Dynamic Range Management. This chapter addresses the Dynamic Range Control (DRC) feature of the AC-3 system, in which dynamic range control (dynamic range compression) words may be generated and transmitted, and then either applied or not applied in the receiver, according to the wishes of the viewer. If they are applied, the program audio dynamic range is compressed accordingly. If they are not applied, no compression is perceived by the listener.
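The apply-or-ignore choice can be pictured with a simplified Python sketch. It assumes the control words reduce to one gain value in dB per audio block; the levels and gain values are invented for illustration and do not reflect the actual AC-3 word format.

```python
def render(block_levels_db, drc_gains_db, apply_drc):
    """Return block levels as heard, with or without the transmitted DRC gains."""
    if not apply_drc:
        return list(block_levels_db)
    return [level + gain for level, gain in zip(block_levels_db, drc_gains_db)]


def dynamic_range_db(levels_db):
    """Span between the loudest and quietest block, in dB."""
    return max(levels_db) - min(levels_db)


# Hypothetical blocks: a whisper, normal dialog, an explosion.
levels = [-40.0, -31.0, -12.0]
# Hypothetical transmitted gains: boost the quiet block, cut the loud one.
gains = [6.0, 0.0, -8.0]

uncompressed = render(levels, gains, apply_drc=False)
compressed = render(levels, gains, apply_drc=True)

print(dynamic_range_db(uncompressed))
print(dynamic_range_db(compressed))
```

The same transmitted audio yields a 28 dB span for the viewer who declines compression and a 14 dB span for the viewer who accepts it, with dialog left where it was.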
Following the RP are a number of annexes, covering program loudness and true peak audio measurements; room acoustics and loudspeaker placement; room correction; reference monitor setup for television; loudness ranges; AC-3 DRC details; and AC-3 metadata parameters. There are also two quick reference guides.
If the new ATSC Recommended Practice on Loudness is voluntarily adopted by broadcasters and their program and commercial suppliers, or if it is mandated by law, it should improve the DTV loudness situation, provided its recommendations are understood and properly implemented.
Randy Hoffner, a veteran of the big three TV networks, is a senior consulting engineer with AZCAR. He can be reached through TV Technology.