DTV Audio: Viewer's Choice, Finally

Ever notice how we plow right into new video technologies with gusto? Digital, servers, streaming, graphics, effects, HD, SD, 1080i, 720p, 480i or p, 4 x 3, 16 x 9: all a piece of cake. Nothing is too complex or without merit.

But audio? Now that's another matter. Headroom, +8 dBm (or is that 0 dBu? It depends), AES/EBU, twisted-pair or coax, stereo, surround, AC-3, Dolby Digital, Dolby E. Nothing is seemingly more complex than audio.

Nothing is more important to the television program than audio. Today most DTV stations either don't process the audio at all or take it from the regular NTSC processor. Neither one is right for DTV. We had just mastered monophonic, FM modulation limits, and loud commercials, and had started in on stereo and SAP, when digital came along with its complexities. But AC-3 audio, DYNRNG, and DIALNORM scare the tar out of broadcasters, and confusion reigns. Dolby has tried for years to explain these terms to broadcasters. The movie industry has used them for years.

Actually, DYNRNG and DIALNORM are something broadcasters and the audience have been wanting for a long time: 1) wider dynamic range (DYNRNG) for those who can appreciate it, and 2) moderation of the loudness of different program segments (NORMalizing the DIALog).

The AC-3 audio system developed for DTV provides: 1) wide dynamic range for home theater users (no compression), 2) less dynamic range for standard stereo (light compression), and 3) reduced dynamic range (heavy compression) for late night viewing. How about the receivers? Well, some have mode selectors and some don't. But simple DTV receivers with analog audio outputs have modes 2 and 3 built in.

With DYNRNG, the broadcaster does not apply compression because dynamic range is the viewer's choice to make. DYNRNG runs continuously in the background on a frame-by-frame basis, sending signals to the receiver to tell the dynamic range circuitry what to do. The action is a combination of compression and expansion. So DYNRNG lets viewers choose either to be blasted with sound effects or to have the sound compressed to keep the neighbors from complaining, without missing any of the dialog.
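
The receiver side of this can be sketched in a few lines of Python. This is only an illustration, not Dolby's decoder: the function name `apply_dynrng`, the per-frame numbers, and the single 0-to-1 `scale` value standing in for the viewer's choice are all assumptions made for the example.

```python
def apply_dynrng(levels_db, gain_words_db, scale):
    """Apply broadcaster-sent gain words to per-frame levels.

    levels_db      -- program level for each frame, in dB (illustrative values)
    gain_words_db  -- gain requested for each frame: negative values cut
                      loud passages, positive values boost quiet ones
    scale          -- hypothetical viewer setting: 0.0 ignores the gain words
                      (full dynamic range), 1.0 applies them in full
    """
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must be between 0.0 and 1.0")
    if len(levels_db) != len(gain_words_db):
        raise ValueError("need one gain word per frame")
    return [level + scale * gain for level, gain in zip(levels_db, gain_words_db)]


# Three frames: a whisper, normal dialog, and an explosion (made-up numbers).
levels = [-40.0, -31.0, -5.0]
gains = [6.0, 0.0, -10.0]

home_theater = apply_dynrng(levels, gains, 0.0)  # gain words ignored
late_night = apply_dynrng(levels, gains, 1.0)    # whisper up, explosion down
```

In both cases the dialog frame gets a gain of zero, so only the extremes move. That is the sense in which the viewer picks the dynamic range without losing the dialog.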

Now DIALNORM. Dialog normalization has been a vexatious problem since the audio operator disappeared from the master control room and was replaced by an audio processor. DIALNORM is set by the broadcaster according to the headroom needed by music and sound effects for a specific program. Once a constant loudness is established for all dialog, no matter what the program content, music and sound effects may be set at any level desired by the producer. Since there are limits to the headroom of any system, digital or not, the solution is to send a signal to the receiver to tell it, for example, that the dialog of this program will be lower to allow for much louder sound effects. In the receiver, the dialog is brought back to normal volume, which leaves the effects free to be very loud or compressed according to the viewer's DYNRNG setting. But the dialog loudness stays the same from program to program.
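
A little arithmetic makes the idea concrete. The sketch below assumes the usual AC-3 convention, which this column does not spell out: DIALNORM is a number from 1 to 31 meaning the dialog sits that many dB below digital full scale, and the receiver turns each program down so dialog lands at -31 dBFS. The function names and the two example programs are hypothetical.

```python
TARGET_DIALOG_DBFS = -31  # assumed common playback level for dialog


def dialog_attenuation_db(dialnorm):
    """Return how many dB the receiver turns the whole program down."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm is assumed to run from 1 to 31")
    program_dialog_dbfs = -dialnorm
    return program_dialog_dbfs - TARGET_DIALOG_DBFS  # same as 31 - dialnorm


def normalized_level_dbfs(level_dbfs, dialnorm):
    """Level of any sound in the program after the receiver's adjustment."""
    return level_dbfs - dialog_attenuation_db(dialnorm)


# Hypothetical movie: dialog at -27 dBFS, leaving 27 dB for the effects.
movie_dialog = normalized_level_dbfs(-27, 27)
movie_explosion = normalized_level_dbfs(0, 27)

# Hypothetical talk show: dialog at -15 dBFS, little headroom needed.
talk_dialog = normalized_level_dbfs(-15, 15)
```

Both programs end up with dialog at the same level, while the movie's explosion still towers 27 dB above it. The viewer's DYNRNG choice then decides how much of that 27 dB actually reaches the loudspeakers.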

What happens if the broadcaster forgets to set the DIALNORM? Well, it's not serious, but the dynamic range may not be optimized for that specific program. The combination of dialog normalization and dynamic range settings provides the station and viewer with simple controls to maximize the effect of program content but with defaults that will maintain reasonable audio performance for those not interested in the controls.

Broadcasters control the dynamic range parameter by their choice of the compression profile, whether it be music, talk, movies, or something else. For each profile, a signal is sent to the receiver to tell it how much compression and/or expansion to use for each of the three reception modes described above. The amount of compression (and expansion for low-level sounds) was predetermined by Dolby engineers after years of subjective testing and experience.
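
One way to picture a profile is as a gain curve centered on the dialog level: sounds near dialog are left alone, louder sounds are cut, and quieter sounds are boosted. The sketch below is a generic compressor curve under that assumption. The `Profile` fields and every number in `HYPOTHETICAL_MOVIE` are invented for illustration and are not Dolby's actual profile values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    null_band_db: float  # half-width of the untouched zone around dialog level
    cut_ratio: float     # compression ratio above the zone (2.0 means 2:1)
    boost_ratio: float   # compression ratio below the zone
    max_cut_db: float    # limit on how far loud sounds are turned down
    max_boost_db: float  # limit on how far quiet sounds are turned up


def gain_for_level(level_rel_dialog_db, profile):
    """Gain in dB for a sound at the given level relative to dialog."""
    if abs(level_rel_dialog_db) <= profile.null_band_db:
        return 0.0
    if level_rel_dialog_db > 0:
        excess = level_rel_dialog_db - profile.null_band_db
        cut = excess * (1.0 - 1.0 / profile.cut_ratio)
        return -min(cut, profile.max_cut_db)
    deficit = -level_rel_dialog_db - profile.null_band_db
    boost = deficit * (1.0 - 1.0 / profile.boost_ratio)
    return min(boost, profile.max_boost_db)


# Invented numbers, for illustration only.
HYPOTHETICAL_MOVIE = Profile("movie (hypothetical)", 5.0, 2.0, 2.0, 24.0, 6.0)

near_dialog = gain_for_level(3.0, HYPOTHETICAL_MOVIE)    # inside the null band
loud_effect = gain_for_level(15.0, HYPOTHETICAL_MOVIE)   # gets cut
quiet_murmur = gain_for_level(-25.0, HYPOTHETICAL_MOVIE) # boosted, up to the limit
```

A different profile would simply carry different numbers, which is why choosing the profile is all the broadcaster has to do.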

Bottom line: The broadcaster sets DIALNORM and selects the program profile, and the viewer (or equipment configuration) selects the dynamic range (DYNRNG) to match the viewer's environment. The result is TV audio as we want it, finally.

Dolby describes these two important audio system controls in papers on its website (see "For More..." above). In particular, read "All About Audio Metadata." Stay tuned.