Speaking Volumes

Sixty souls vested in digital audio met at the University of Southern California’s Aug. 1 Loudness Summit to discuss the annoying bursts and muting of sound imposed on listeners as their DTVs transition between programs, channels and commercials.

The audio discrepancies are being investigated by the Advanced Television Systems Committee, which set the parameters for audio in its A/52 and A/53 Digital Television standards. The standards body represents the broadcast, broadcast equipment, cable, satellite, motion picture, computer, consumer electronics, and semiconductor industries.

Summit attendees represented production sound mixers, post-production supervisors, network people, set-top box manufacturers and a television set manufacturer. A representative from the American Association of Advertising Agencies was invited but unable to attend.


The ATSC’s dialnorm (dialogue normalization) clause got the most attention. The parameter requires that broadcasters transmit a code that tells a receiver to yank the level of, for example, a news reader, to the dialogue level in a movie.

Dialnorm is based on the premise that face-to-face speech in a normal movie scene (where there is no whispering or shouting) is at the same level, whether the movie is “Driving Miss Daisy” or “The Terminator.”

“In theory, when you go from channel to channel, you get the dialogue at a constant level—all normalized to 31dB below full scale,” said summit host Tomlinson Holman, professor of sound for USC’s School of Cinema-Television and principal investigator, Integrated Media Systems.
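The arithmetic behind that normalization can be sketched in a few lines. This is only an illustration: it assumes the transmitted dialnorm value states the program's average dialogue level in dB relative to full scale, within a range of -31 to -1, and that the receiver attenuates by the difference from the -31 dB reference Holman cites. The function names are hypothetical and do not come from any ATSC document.

```python
TARGET_DBFS = -31  # reference dialogue level cited in the article


def dialnorm_attenuation_db(dialnorm_dbfs: int) -> int:
    """Attenuation (in dB) a receiver would apply for a given dialnorm value.

    Assumption: dialnorm is the program's average dialogue level in dBFS,
    limited to the range -31..-1.
    """
    if not -31 <= dialnorm_dbfs <= -1:
        raise ValueError("dialnorm is assumed to lie between -31 and -1 dBFS")
    return dialnorm_dbfs - TARGET_DBFS


def normalized_dialogue_level(dialogue_dbfs: float, dialnorm_dbfs: int) -> float:
    """Dialogue level at the receiver output after the attenuation is applied."""
    return dialogue_dbfs - dialnorm_attenuation_db(dialnorm_dbfs)


# Dialogue mixed at -24 dBFS and labeled correctly lands on the target.
print(normalized_dialogue_level(-24, -24))  # -31
# Dialogue mixed at -20 dBFS but labeled -27 ends up 7 dB above the target.
print(normalized_dialogue_level(-20, -27))  # -24
```

Under these assumptions the scheme only works when the transmitted value matches the real dialogue level, which is why the measurement method matters so much to the presenters.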

Holman is best known for inventing the Lucasfilm THX sound system (THX stands for Tomlinson Holman eXperiment) and his key contributions to TV’s 5.1 surround sound format. He was also a prime contributor to the audio subcommittee that added the dialnorm clause to the ATSC’s A/52 and A/53 standards in 1992. He and the other presenters were united in their belief that the industry must assess (and possibly amend) the current specs and abide by them.

“The number one priority is to get consensus on dialnorm guidelines for measurement and practice,” said Jim Starzynski, principal audio architect for NBC Universal Advanced Engineering, who also heads the ATSC S/6-3 working group for loudness issues. “Our group is looking at ways programs and commercials transition with regard to dialnorm, measurement and subjective response. A best practices approach may be taken; we certainly intend to do our part to engage in an industry-wide dialogue in appropriate venues, [such as] SMPTE, HPA and AES.”

Starzynski’s presentation focused on making DTV professionals aware of the ATSC dialnorm guidelines and his company’s methods of compliance. In late 2006, NBC Universal codified techniques for its 10 owned-and-operated stations and the NBC DTV network.

“For the network, a brief on carriage of metadata from the program provider to the home was highlighted for tape deliverables and remote production,” said Starzynski. “A very simple and effective method of averaging local content upconverts could be used by any DTV station to set up its audio encoder to ATSC levels.”

Implementation was completed in April 2007. In addition, the North American Broadcasters Association adopted a recommended practice based on this procedure.


ATSC Standard A/53 also refers to a method of measuring the level of spoken dialogue based on the Leq(A) (A-weighted equivalent continuous sound level) standard originated by the American National Standards Institute.

“[But] not all program material has dialogue—Leq(A) does not produce a particularly accurate level in those [dialogue absent] circumstances,” said Graham Jones, director of communications engineering for NAB.
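To see why an Leq-style figure says nothing about dialogue as such, it helps to look at what the measure computes: an average of signal energy over time, expressed in dB. The sketch below is a simplified illustration, not the procedure in A/53. It deliberately omits the A-weighting filter, assumes samples scaled so that full scale is 1.0, and uses a hypothetical function name.

```python
import math


def leq_dbfs(samples):
    """Unweighted equivalent level of a block of samples, in dB re full scale.

    Assumption: samples are floats with full scale at 1.0. A real Leq(A)
    measurement would pass the signal through an A-weighting filter first;
    that step is left out here.
    """
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


# One second of a full-scale 1 kHz sine sampled at 48 kHz.
sine = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
print(round(leq_dbfs(sine), 2))
```

The function returns a number for any input, whether it is speech, music or crowd noise, so on a program with no dialogue the result is still a level, just not a dialogue level.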

In May, the S/6-3 working group asked ATSC members to consider an amendment to A/53 that would change the method of measuring dialnorm from Leq(A)’s specs for sound level meters to the BS.1770 method recommended in 2006 by the United Nations’ International Telecommunications Union (see “Loudness Metering: New News About an Old Topic,” Technology Corner, p. 48).

“It is important to have a measurement technique that has been vetted, tested and agreed [upon] on an international basis [as content flows around the world],” said Dr. Craig Todd, senior vice president and chief technology officer for Dolby Laboratories in San Francisco. “While Leq(A) was quite a usable technique, it was not designed for this purpose, and could be criticized as an inappropriate method.”

Todd noted that the transition would be relatively seamless, with Dolby providing software upgrades of its LM100 encoders and decoders to its customers. Moreover, its DP600 Program Optimizer already includes the BS.1770 feature, as well as the benefits of file-based audio.

“The DP600 can look at a file—it’s on a server—and analyze the audio in the file,” said NAB’s Graham Jones. “The LM100 has to take a stream of audio, so it works as the audio is actually streaming.”

Holman also cited a discrepancy of “about 10dB” between local and national news channels (the locals were louder) that could be traced to another shortcoming in ATSC’s dialnorm specs: not properly accounting for today’s compression realities.

“Network and local broadcasters were setting the dialnorm correctly,” he said. “[But] when you take five channels and you try to crunch them down to two channels, you automatically have a problem: five channels can play louder than two channels. In the standard it tells you to turn down the five channels when you put them out as two channels to play them back on a two-channel setup. That’s actually a mistake.”
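A rough sketch shows how such a turn-down can arise. It assumes a common style of two-channel downmix in which the center and surround channels are each mixed in at about -3 dB (a gain of 0.707) and the whole sum is scaled so it cannot exceed full scale. The coefficients and function names are illustrative assumptions, not values quoted at the summit.

```python
import math


def downmix_scale(center_gain=0.707, surround_gain=0.707):
    """Scale factor that keeps L + c*C + s*Ls at or below 1.0
    when all three inputs sit at full scale (assumed coefficients)."""
    return 1.0 / (1.0 + center_gain + surround_gain)


def downmix_left(left, center, left_surround,
                 center_gain=0.707, surround_gain=0.707):
    """Left output of the assumed two-channel downmix."""
    scale = downmix_scale(center_gain, surround_gain)
    return scale * (left + center_gain * center + surround_gain * left_surround)


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


# Overall attenuation applied to guard against overload.
print(round(to_db(downmix_scale()), 2))
# Level of center-only dialogue in each output channel.
print(round(to_db(downmix_left(0.0, 1.0, 0.0)), 2))
```

With these assumed coefficients every downmixed program is turned down by several dB whether or not it would ever have overloaded, while a program that starts out in two channels is not. That is one way to read Holman's point about level differences appearing even when dialnorm is set correctly.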

By his assessment, the attendees were determined to fix the problem, though how they will do so has yet to be decided.

“There may be a change in that part of the ATSC standards,” said Holman. “There may be a workaround that the manufacturers can do: for the pros immediately, for the consumers, down the road.”