As you all probably know, we now have a "loudness law" on the books: the CALM Act. The Commercial Advertisement Loudness Mitigation Act is intended to reduce the annoyance (and, more importantly, the resulting complaints to Congress) that television viewers experience from the abrupt loudness changes that sometimes occur when a program goes to a commercial break or when a new channel is selected.
Fortunately, the law isn't at all bad. In its final form, it restricts itself to "incorporating by reference and making mandatory... the 'Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television' (A/85), and any successor thereto..." This Recommended Practice, known as ATSC A/85, was written by industry practitioners entirely independently of the government, and before the law was drafted.
So we've been ordered by the government to obey our own internally developed recommended practices for the management of loudness in television broadcast and distribution. Could be worse, as Mark Antony is alleged to have said when he first headed for Cleopatra's tent.
INTRODUCING –24 LKFS
At the heart of ATSC A/85 is a "new" amplitude reference level: –24 LKFS. I'd like to use this column to introduce you to it. It's going to be important for us.
LKFS is derived from dBFS, which is decibels referenced in the digital realm to the largest amplitude the signal can represent, hence Full Scale. LKFS is in fact an amplitude level. However, it is intended to stand in for the subjective quality we call "loudness." To make that at least reasonably plausible, LKFS modifies the simple amplitude level (as expressed in dB) to more closely resemble what our ears hear under normal television viewing conditions.
There are four aspects to that derivation:
First, we boost the high frequencies (by approximately 4 dB above 2 kHz) of all five main channels of the 5.1 mix (we ignore the LFE channel in all of this);
Second, we roll off the low frequencies of those channels (below 200 Hz, from –2 dB at 100 Hz to –13 dB at 20 Hz);
Third, we calculate the average power (the mean-square value) of each channel over the entire segment of interest (this is known as an "Leq" measurement);
Finally, we sum the powers of those five channels (after boosting the surrounds by 1.5 dB each).
The resulting number, converted to and expressed in dB, is our digital LKFS level. It focuses on high frequencies, discounts low frequencies, and tends to hover closer to the higher levels than the lower ones over time, thanks to the power averaging.
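The third and fourth steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the samples are taken to be floating point in the range –1.0 to 1.0 and already passed through the two frequency-weighting steps, the function and channel names are hypothetical, and the small calibration constant that BS.1770 adds to the result is left out, so the output is not a compliant LKFS reading.

```python
import math

# Surround boost from the text: 1.5 dB, applied here as a power gain.
SURROUND_POWER_GAIN = 10 ** (1.5 / 10)


def mean_square(samples):
    """Average power of one channel over the whole segment (the 'Leq' step)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def summed_level_db(channels):
    """Sum the channel powers and express the total in dB re full scale.

    `channels` maps a (hypothetical) channel name such as "L", "R", "C",
    "Ls", "Rs" or "LFE" to a list of samples that are ASSUMED to be
    frequency-weighted already. The LFE channel is ignored.
    """
    total_power = 0.0
    for name, samples in channels.items():
        if name == "LFE":
            continue  # the LFE channel takes no part in the measurement
        gain = SURROUND_POWER_GAIN if name in ("Ls", "Rs") else 1.0
        total_power += gain * mean_square(samples)
    if total_power == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 10 * math.log10(total_power)


# One full cycle of a full-scale sine wave, used as a test signal.
sine = [math.sin(2 * math.pi * k / 48) for k in range(48)]
front_only = summed_level_db({"L": sine})
surround_only = summed_level_db({"Ls": sine})
```

With this test signal, the single front channel comes out about 3 dB below full scale, and the same signal in a surround channel reads 1.5 dB higher, which is the surround boost at work.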
Twenty-four dB below Full Scale (Full Scale being zero LKFS) is the new mandated nominal, or "target," level for audio broadcast (unless you can measure and use Dialnorm). Hence: –24 LKFS.
Unfortunately, this requires an appropriate meter or equivalent algorithm, as noted in A/85: "Although the mixer balances a mix using his or her ears, an objective loudness measurement helps to maintain consistent loudness within and between programs."
The familiar VU and PPM meters measure neither the loudness nor the true peak levels of the signal. The characteristics of many of the common "electronic" meters available are unknown, and contribute to the inconsistent and confusing situation found in practice today.
"This [Recommended Practice] provides guidance that, if followed, will result in consistency in loudness and avoidance of signal clipping. The specified measurement techniques are based on the loudness and true peak measurements defined by ITU-R Recommendation BS.1770."
So, there is no reasonable way to estimate LKFS from VU, peak or PPM meters without a lot of practice at audio level estimation. You are going to need new metering, if you haven't already got it. Look for meters that use the "ITU-R BS.1770-1 algorithm" or something equivalent. Dolby and TC Electronics, among others, have such products.
Where it gets a little fuzzier is with the so-called anchor element and the target loudness. The anchor element is defined in the recommended practice as "the perceptual loudness reference point or element around which other elements are balanced in producing the final mix of the content . . ." Basically, this element refers to dialog whenever it is present. The fuzziness grows out of the fact that much of the time there is more than just dialog happening, and a "simple" measurement of LKFS is going to yield readings that are greater than the LKFS of just the dialog or anchor element.
TC Electronics’ TM9/TM7 TouchMonitor supports the ITU-R BS.1770-1 algorithm for loudness.
In their LM100 meter, for instance, Dolby uses some DSP to detect and extract only the level of dialog. This process (called Dialog Intelligence) has been a cornerstone of the dialnorm process from the beginning of metadata usage in set-top boxes—they've called it dialog "Leq(A)." But when you're working in real time and/or don't have something like an LM100 meter, you are going to have to estimate the level of the anchor element.
Meanwhile, the target loudness is, simply, –24 LKFS for the anchor element. Given such a target, I would expect to see levels of up to –16 LKFS (maybe even higher) for the entire mixed program. Exactly how much higher is unclear unless you have the meter and the post-production time to do all the measurements and corrections. More importantly, there is potential for error to creep in at any point in the broadcast chain. We need to ensure that the original signal is at an appropriate LKFS level (approximately –16 to –20 LKFS for a –24 LKFS anchor element level). We then need to maintain that level at approximately unity gain throughout signal distribution and transmission.
Finally, at the set-top box, we need the approximately correct level and dialnorm metadata, so that the viewer can set the level to their satisfaction and then leave it alone, with the audio continuing to sound good! This dispels their urge to complain.
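The correction itself is simple arithmetic once you have a measurement. As a minimal sketch (the function names are hypothetical, and it assumes the anchor-element level has already been measured with a BS.1770-style meter), the gain change is the target minus the measured level, and the matching linear multiplier follows from the usual 20·log10 amplitude relationship:

```python
TARGET_LKFS = -24.0  # anchor-element target from the text


def correction_db(measured_anchor_lkfs, target_lkfs=TARGET_LKFS):
    """Gain change in dB that moves the measured anchor level to the target."""
    return target_lkfs - measured_anchor_lkfs


def db_to_linear(gain_db):
    """Amplitude multiplier equivalent to a gain expressed in dB."""
    return 10 ** (gain_db / 20)


# Hypothetical reading: dialog measured at -20 LKFS needs 4 dB of attenuation.
offset = correction_db(-20.0)
multiplier = db_to_linear(offset)
```

A constant offset like this shifts the whole program, so the mix around the anchor element moves with it; it does not change the balance between dialog and everything else.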
MANAGING TRUE LOUDNESS
Finally, keep in mind that, in fact, loudness is subjective. It is a primary carrier of emotional intensity and nuance. My own take, after a career in mixing and mastering music, is that first we need to get the levels under control and then we need to manage that sensation of loudness, through our careful handling of all the audio elements in our mix over time. Our goal is to achieve the best and most convincing emotional rendering of our audio content. This means we need to manage the louds and the softs, getting our mix to breathe, to have feeling and to sound authentic. This goes way beyond the notion of LKFS, which must be viewed as a starting point for excellent production, not as the ultimate outcome.
Now it's the law! You gotta keep the loudness consistent and non-irritating. Get used to –24 LKFS for spoken voice over time (yielding approximately 65 to 75 dB SPL for end-users). If you get so you can do that well, your GM is not going to go to jail for excessive loudness and your station owner is not gonna be massively fined for same. Overall, that's got to be a good thing.
Next month, we'll review dialnorm for those of you who have either forgotten or never understood it in the first place. In the meantime, thanks for listening.