Normalizing dialog

Is there a broadcast engineer who hasn’t heard the clichéd accusation that TV stations turn up commercials? Ever heard of a broadcast TV station where that actually happened? Me neither. Pre-automation master control audio operators were viewers who got annoyed like everyone else. If something sounded too loud, most good master control audio operators ignored the VU meters and didn't reach for the monitor speaker volume knob. Good operators potted down the source gain at the console, and people still complained.

Typically, downstream from the master control audio console were various audio processing devices designed to prevent over-modulation at the input of the STL and transmitter. Those devices generally fought with changes in levels and dynamic range. It was a problem that kept getting worse.

Audio-follow MC switchers, broadcast automation, crazier producers and more powerful audio compression techniques made the loud commercial problem so loud the United States Congress felt the need to take action. Can digital television technology fill in for the missing broadcast engineer with a critical ear on the demod and a hand on the gain knob? The FCC and the industry have set out to do exactly that for a number of good reasons.

Insert CALM cliché here

The CALM Act became law last year, and the resulting FCC regulations go into effect on Dec. 13, 2012. Its primary purpose is to reduce the sound level variations between commercials and programs. Its secondary purpose, which the FCC encourages and I completely agree with, is to eliminate AGCs, compressors, limiters and other downstream audio processing gear that reduces or alters the dynamic range from what was originally intended. It also gives producers a standardized definition of loudness to use as an absolute reference. It does not, however, eliminate the need for golden ears.

How the CALM Act affects your facility depends on where it fits in the program creation and distribution chain. The beauty of the CALM Act, and similar laws in other countries, is that it recognizes worldwide groups of TV audio experts and gives them an opportunity to scientifically define repeatable loudness measurements and to incorporate them into industry standards and the law. In its simplest terms, the CALM Act is about making viewers happy. How it’s accomplished is where it gets interesting.

Loud commercials aren’t unique to North America. Most of the world thinks TV commercials are too loud. Brazil passed a loudness law in the 1990s. In the U.S., the CALM Act itself simply states the problem and directs the FCC to implement rules to fix it. The FCC regulates only broadcasters with call letters and multichannel video programming distributors (MVPDs) such as large cable, satellite and IPTV providers. It does not regulate cable or broadcast networks. The FCC developed a brilliant strategy to enforce the rules on those that it can regulate by providing an incentive for networks, producers and providers to certify that content complies with the CALM Act.

Larger cable and satellite MVPDs don’t have the time, equipment or people to examine and certify all content on all channels before it is distributed to homes. In less than a month, each must ensure its own inserted commercials are in compliance and push for compliance of all programs and commercials that it carries. Neither the law nor the FCC Report and Order requires certification. The market will.

Not that loudness

If you look up loudness in your trusty, dusty Audiocyclopedia, you’ll find something that has absolutely nothing to do with 21st century loudness. So forget what you thought you knew about loudness. The only way to calculate digital loudness with scientific accuracy and repeatability is to monitor and record the digital levels of each packet containing audio, and apply an algorithm to determine loudness. While some might argue with the algorithm formulas, measurements made with the same algorithm will at the least be scientifically consistent.

The ITU-R recognized the international need and began work on a comprehensive, scientific, statistically repeatable definition of loudness. In October 2007, the ITU-R introduced BS.1770-1, which was the original digital algorithm designed to measure loudness and true peak. In May 2011, ITU-R adopted BS.1770-2, which added level gating. Level gating is an important component of loudness in long-form content, but more about the technical details of gating later. A couple of months ago, the ITU-R adopted BS.1770-3, which mostly clarified true peak. Defining loudness is truly a work in progress. Fortunately, as definitions are upgraded and updated, most digital test gear can be upgraded with the latest definitions.
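As a rough illustration of why gating matters for long-form content, the sketch below applies two-stage gating to a list of short-term block loudness values. It assumes the blocks have already been measured in LKFS (BS.1770-2 uses overlapping 400ms blocks), and it uses the -70 LKFS absolute gate and the -10 LU relative gate from that revision. The function names are hypothetical, and this is a simplified sketch rather than a compliant meter.

```python
import math

ABSOLUTE_GATE_LKFS = -70.0  # blocks at or below this level are always ignored
RELATIVE_GATE_LU = -10.0    # second gate, relative to the absolute-gated average


def power_average(block_loudness):
    """Average block loudness values in the power domain; result in LKFS."""
    mean_power = sum(10 ** (l / 10) for l in block_loudness) / len(block_loudness)
    return 10 * math.log10(mean_power)


def gated_loudness(block_loudness):
    """Two-stage gated loudness of pre-measured blocks (simplified sketch)."""
    # Stage 1: drop blocks that are effectively silence.
    audible = [l for l in block_loudness if l > ABSOLUTE_GATE_LKFS]
    if not audible:
        return float("-inf")
    # Stage 2: drop blocks more than 10 LU below the stage-1 average.
    threshold = power_average(audible) + RELATIVE_GATE_LU
    kept = [l for l in audible if l > threshold]
    return power_average(kept)
```

With blocks of -23, -23, -80 and -50 LKFS, the quiet blocks are gated out, so the result tracks the foreground level instead of being pulled down by silence and low-level background.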

The basic unit of measurement of loudness is “Loudness, K-weighted, relative to Full Scale” (LKFS), and it is similar to dB Full Scale (dBFS). Zero dB plus headroom equals zero dBFS, which roughly equals zero LKFS, in the most general of terms. In fact, dBFS is derived from the most significant bit of the audio signal. LKFS is dBFS after it has been altered to model the response (not necessarily the frequency response) of the human ear at typical TV listening levels. This is the point at which the measurement slides from objective to slightly subjective. To measure LKFS levels, frequencies above 2kHz on all 5.1 channels are pre-emphasized by 4dB, while the low end is de-emphasized, starting at 2dB down at 200Hz and reaching 13dB down at 20Hz. The average (mean-square) power of each channel is then calculated for the complete audio file, the channel powers are summed, and the result is expressed in dB as an LKFS number. LKFS is interchangeable with LUFS.
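The measurement chain described above can be sketched in a few lines of Python. This is a minimal, ungated illustration, not a compliant meter. It assumes 48kHz floating-point samples scaled so that full scale is ±1.0, uses the two K-weighting filter stages with the 48kHz coefficients published in BS.1770 (worth checking against the current text of the standard), and applies the standard's channel weights, which boost the surrounds slightly and leave the LFE channel out of the sum. The function names and the channel labels are hypothetical.

```python
import math

# K-weighting for 48 kHz audio: ((b0, b1, b2), (a1, a2)) per filter stage.
SHELF = ((1.53512485958697, -2.69169618940638, 1.19839281085285),
         (-1.69065929318241, 0.73248077421585))   # high-frequency boost
HIGHPASS = ((1.0, -2.0, 1.0),
            (-1.99004745483398, 0.99007225036621))  # low-end roll-off

# Channel weights; a channel not listed here (such as LFE) is not measured.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def biquad(samples, coeffs):
    """Run one second-order IIR filter stage over a list of samples."""
    (b0, b1, b2), (a1, a2) = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def k_weighted_mean_square(samples):
    """Mean-square power of one channel after K-weighting."""
    weighted = biquad(biquad(samples, SHELF), HIGHPASS)
    return sum(v * v for v in weighted) / len(weighted)


def ungated_lkfs(channels):
    """Ungated loudness of a dict mapping channel label to sample list."""
    total = 0.0
    for label, samples in channels.items():
        weight = CHANNEL_WEIGHTS.get(label)
        if weight is None or not samples:
            continue  # skip LFE, unknown labels and empty channels
        total += weight * k_weighted_mean_square(samples)
    if total <= 0.0:
        return float("-inf")
    return -0.691 + 10 * math.log10(total)
```

A useful sanity check is the calibration case given in BS.1770: a full-scale 997Hz sine on the left channel alone should read close to -3.01 LKFS.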

LKFS is the basic unit of measurement of nearly all loudness measurements, and it is critical to ITU-R BS.1770 and its subsequent iterations and crucial to the establishment of DIALNORM. DIALNORM stands for dialog normalization, and it’s the reference for all loudness measurements. It is a metadata number that controls gain within the Dolby AC-3 system used in ATSC transmission. It is a number from 1 to 31, with 31 being unity gain. DIALNORM, if properly set, will result in assets being played out at a level of -31 LKFS, regardless of their loudness. In general, people are migrating to a DIALNORM of 24 to ease confusion downstream.

The DIALNORM value represents the measured dialog level of the signal. It is the standard by which all audio is supposed to be mixed. Typically, dialog might not be the loudest sound, so its average level is measured in LKFS, and its value is set by the operator. Typical TV DIALNORM levels run at -24 LKFS, and that seems to be the unofficial common value most producers are migrating to.
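The arithmetic behind DIALNORM is simple enough to sketch. The example below assumes the usual AC-3 decoder behavior, in which the decoder attenuates by the difference between 31 and the DIALNORM value so that correctly tagged dialog lands at -31 LKFS. The function names are hypothetical and exist only for illustration.

```python
def decoder_attenuation_db(dialnorm):
    """Attenuation in dB that a decoder applies for a DIALNORM value of 1 to 31."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("DIALNORM must be between 1 and 31")
    return 31 - dialnorm  # 31 means unity gain, so no attenuation


def playback_dialog_level(measured_lkfs, dialnorm):
    """Dialog level in LKFS at the decoder output."""
    return measured_lkfs - decoder_attenuation_db(dialnorm)
```

Content measured at -24 LKFS and tagged with a DIALNORM of 24 is attenuated 7dB and plays out at -31 LKFS. A commercial mixed at -20 LKFS but tagged with a DIALNORM of 24 is attenuated by the same 7dB and plays out at -27 LKFS, which is 4dB louder than the program around it. That mismatch between measured loudness and metadata is what the loudness rules are meant to eliminate.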

The next "Transition to Digital" DTV tutorial will explain ATSC A/85 Recommended Practices, which was revised to become the implementation document envisioned by the CALM Act.

The author wishes to thank Andrew Sachs with Volicon and Steve Smith with Broadcast Technology Consultants for the assistance and information they provided.