Figure 1. The typical levels in analog voltage measurement.
Maintaining the accuracy of the original acoustic energy is critical to the recording and transmitting of audio signals. As this energy is transformed electrically and sampled digitally, the goal is to maintain the integrity and exactness of the audio in all of the forms it may take as it moves within a modern TV infrastructure.
Passive or active, analog or digital, today's audio devices all have a finite capability to handle the maximum amplitude of the audio signal. If this capacity is exceeded, the resulting audio — due to a failure of the electronics, current limitations of the circuit or exhaustion of available bits — will not match the original audio and will be distorted.
Establishing and maintaining the proper relationship of the signal amplitude to the minimum and maximum capabilities of the equipment and storage medium in every step of the signal chain is essential for quality audio. Maintaining accurate levels during A-to-D conversions and always tracking perceived loudness (and how it is affected by level changes) must be done to safeguard signal accuracy. Identifying the critical steps in the signal chain and employing proper signal measurement practices to mix and distribute program audio is key. Problems can arise when analog signals are used in a digital plant if the differences in level characteristics and scaling are misunderstood and too little care is taken in handling both types of audio signals.
To master the challenges of today's hybrid analog-digital world, let's review the terms and explore the practices that are fundamental to maintaining a modern TV facility's audio.
The reference level establishes a nominal amount of signal at an understood calibration point that is well above the noise floor of the signal and well below the overload point. This is the comfort zone. This level and the amplitudes just below it are where the majority of the audio signal should be mixed, transmitted or recorded.
The area above the reference level but below the maximum amplitude or overload level is referred to as headroom and is needed to maintain undistorted recording or transmission of the fast peaks of the audio material.
The entire area from the noise floor to the recording or transmission overload level encompasses the dynamic range of the medium or electronics. Dynamic range is measured as a ratio in dB, and its span varies from system to system.
Contemporary analog voltage levels are measured in dBu. dBu is used when an audio voltage is connected to another circuit that does not reduce the voltage of the original signal. These bridging circuits are the opposite of older matching circuits that required a signal to be loaded down by the destination devices for proper level. Matching circuitry was given up for the most part in the '70s and '80s, giving way to the simplicity of bridging circuits that have low source impedances around 60Ω or less and high load impedances that are typically 10,000Ω or more. Keeping the load impedance at least 10 times the source impedance bridges the circuit rather than loading it down.
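The difference between bridging and matching can be put in numbers with a simple voltage divider. The sketch below is illustrative only: it treats source and load impedances as pure resistances, and the function name loading_loss_db is hypothetical. It uses the 60Ω and 10,000Ω figures quoted above, and a 600Ω-into-600Ω case to stand in for a matched circuit.

```python
import math

def loading_loss_db(source_ohms: float, load_ohms: float) -> float:
    """Level at the load, in dB, relative to the unloaded source voltage.

    Assumes purely resistive impedances (a simple voltage divider).
    """
    return 20 * math.log10(load_ohms / (source_ohms + load_ohms))

# Bridging: low source impedance into a high load impedance.
bridging = loading_loss_db(60, 10_000)
# Matching: source and load impedances equal (600 ohms assumed here).
matched = loading_loss_db(600, 600)

print(f"bridging loss: {bridging:.2f} dB")  # -0.05 dB
print(f"matched loss:  {matched:.2f} dB")   # -6.02 dB
```

Under these assumptions the bridged connection loses about a twentieth of a dB, while the matched connection drops the source voltage by roughly 6dB.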
dBu represents the level compared with 0.775 volt RMS with an unloaded, open circuit source. (The “u” in dBu stands for unloaded or unterminated. It is a voltage that is not related to power by an impedance.)
Analog operating levels and the VU meter
dBm represents the power level compared with 1mW. A level of 0dBm corresponds to 0.775 volt RMS across a 600Ω load impedance.
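The dBu and dBm definitions reduce to two logarithmic formulas: 20·log10 of a voltage ratio and 10·log10 of a power ratio. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative, not taken from any standard library.

```python
import math

DBU_REF_VOLTS = 0.775  # 0 dBu reference voltage (RMS)
DBM_REF_WATTS = 0.001  # 0 dBm reference power (1 mW)

def volts_to_dbu(v_rms: float) -> float:
    """Voltage level in dBu for an RMS voltage."""
    return 20 * math.log10(v_rms / DBU_REF_VOLTS)

def dbu_to_volts(level_dbu: float) -> float:
    """RMS voltage for a level given in dBu."""
    return DBU_REF_VOLTS * 10 ** (level_dbu / 20)

def watts_to_dbm(power_w: float) -> float:
    """Power level in dBm for a power in watts."""
    return 10 * math.log10(power_w / DBM_REF_WATTS)

# The +4 dBu operating level expressed as a voltage.
plus4_volts = dbu_to_volts(4.0)
# Power dissipated when 0.775 V RMS is applied across 600 ohms.
power_600 = DBU_REF_VOLTS ** 2 / 600
level_dbm = watts_to_dbm(power_600)

print(f"+4 dBu = {plus4_volts:.3f} V RMS")           # 1.228 V RMS
print(f"0.775 V into 600 ohms = {level_dbm:.2f} dBm")  # 0.00 dBm
```

The second result shows why the two units share the 0.775 volt figure: into 600Ω that voltage dissipates almost exactly 1mW, so 0dBu and 0dBm coincide only when a 600Ω load is present.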
A typical noise floor measurement for a studio analog mixing console with a single channel routed to program out is about -82dBu or 86dB below the +4dBu operating level. We frequently use 0VU on the meter scale to actually mean +4dBu. (See Figure 1.)
John Woram's “Recording Studio Handbook” explains this as “A carry-over from the earlier days when meter construction was difficult and additional resistance was necessary for accuracy. This resistance loaded down the meter by 4dBm and therefore 0VU was actually +4dBm. Rather than redefine the zero reference level, it was considered expedient to leave 0dBm at 1mw across 600Ω, with the understanding that 4dBm above this value would correspond to a zero meter reading.”
Many modern VU meters are adjustable but may read 0VU with a signal of +4dBu applied.
Broadcasters still use this convention today for analog circuits, including mixing consoles, distribution amps and many signals that interface with telephone company circuitry, even as carriers move from analog to digital. Analog inputs connecting to digital multiplexing devices frequently operate with a +4dBu reference level.
Headroom's at the top
This audio signal, combined with video, feeds an encoder with A-D conversion, establishing an audio-video digital bitstream for carriage to the destination. This type of signal path is frequently used for studio-to-transmitter links as land-based fiber continues to be chosen over microwave for getting the studio's signal to the transmitter site.
Figure 2. Analog operating levels and their relationship to the VU meter and digital signal levels.
Headroom, as mentioned earlier, is defined as the amount of signal handling capacity above the reference level but below the specified level that causes overload distortion of the equipment or medium. Frequently, 20dB of available headroom is desirable for all equipment or storage media to handle the fast peaks an operator could not be expected to control. This provides a safety margin before the onset of distortion. Therefore, the maximum output level of a piece of broadcast gear using this example is +24dBu (the +4dBu operating level plus an additional 20dB). Frequently, broadcast analog circuits have a greater margin and exceed this example by several dB.
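Because these quantities are all in dB, headroom and dynamic range are plain additions and subtractions. The sketch below combines the article's two separate examples, the -82dBu console noise floor and the +4dBu reference with 20dB of headroom, into one hypothetical device purely for illustration; the function names are assumptions.

```python
def max_level_dbu(reference_dbu: float, headroom_db: float) -> float:
    """Overload point: the reference level plus the available headroom."""
    return reference_dbu + headroom_db

def span_db(lower_dbu: float, upper_dbu: float) -> float:
    """Distance in dB between two levels."""
    return upper_dbu - lower_dbu

reference = 4.0      # +4 dBu operating level
headroom = 20.0      # desired headroom above reference
noise_floor = -82.0  # example console noise floor

max_out = max_level_dbu(reference, headroom)
below_reference = span_db(noise_floor, reference)
dynamic_range = span_db(noise_floor, max_out)

print(f"maximum output: +{max_out:.0f} dBu")                # +24 dBu
print(f"noise floor below reference: {below_reference:.0f} dB")  # 86 dB
print(f"dynamic range: {dynamic_range:.0f} dB")              # 106 dB
```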
Carrying over to digital
It's easy to understand a correlation of the analog signal level to the digital signal level by going number-by-number or level-by-level. See Figure 2 for a typical audio plant with a -20dBFS operating level calibrated to +4dBu analog.
Decibel full-scale (dBFS) is a unit of measure for the amplitude of digital audio signals. It is critical to understand that though digital and analog signals have similarities, their characteristics differ significantly. 0dBFS occurs when all the binary digits (bits) making up the digital signal are on, or read as 1s and not 0s in computer talk.
All of the bits available to make up the signal have been used at this finite point and no additional headroom exists. Trying to increase the level simply doesn't work and causes immediate distortion.
The digital scale reads in negative numbers with louder, higher amplitude signals moving from a negative number closer to zero. This may be both confusing and helpful. However, if you think about the concept of a signal's headroom and apply a digital level reading to it, you can instantly determine the amount of headroom that's available for your measured digital signal. In fact, you are creating it. An example would be a digital level signal of -20dBFS. It has 20dB of headroom because digital overload occurs at 0dBFS, a signal 20dB louder than -20dBFS, and so on for -18dBFS, -16dBFS, -12dBFS, etc.
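The level-by-level correlation between the analog and digital scales is a fixed offset once a calibration point is chosen. The following sketch assumes the plant calibration used in this article, +4dBu equal to -20dBFS, as default parameters; the function names are illustrative, and a facility with a different reference would pass its own values.

```python
def dbu_to_dbfs(level_dbu: float, ref_dbu: float = 4.0, ref_dbfs: float = -20.0) -> float:
    """Map an analog level to the digital scale for a given calibration point."""
    return level_dbu - ref_dbu + ref_dbfs

def dbfs_to_dbu(level_dbfs: float, ref_dbu: float = 4.0, ref_dbfs: float = -20.0) -> float:
    """Map a digital level back to the analog scale."""
    return level_dbfs - ref_dbfs + ref_dbu

def headroom_db(level_dbfs: float) -> float:
    """Headroom remaining before digital overload at 0 dBFS."""
    return 0.0 - level_dbfs

print(dbu_to_dbfs(4.0))    # -20.0
print(dbu_to_dbfs(24.0))   # 0.0
print(dbfs_to_dbu(0.0))    # 24.0
print(headroom_db(-18.0))  # 18.0
```

With this calibration, an analog peak of +24dBu lands exactly on 0dBFS, which is why 20dB of analog headroom and a -20dBFS digital reference line up.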
Figure 3: Top to bottom:
1. Original 1kHz source tone at -20dBFS/+4dBu reference level.
2. Same tone running through analog equipment attempting a gain increase of 3dB over the clipping point. Notice minor flattening of the waveform.
3. Same tone running through analog equipment attempting a gain increase of 6dB over the clipping point. Notice more severe flattening of the waveform.
4. Same tone running through digital equipment attempting a gain increase of 3dB over the clipping point. Notice more severe flattening of the waveform than the analog signal at this level.
5. Same tone running through digital equipment attempting a gain increase of 6dB over the clipping point. Notice more severe flattening of the waveform.
Please note: These waveforms are intended for illustration purposes only.
Another major difference that should be noted is at the point of digital signal overload at the 0dBFS level. Should an analog signal reach its maximum amplitude capacity (as illustrated in Figure 3), distortion will creep into the signal, most likely at a gradual and perhaps indistinguishable rate, until the distortion is audible after initial overload is reached and then exceeded. This process is different and more forgiving to the ear than reaching digital overload at 0dBFS.
When an unprocessed digital signal clips at this level, distortion is immediately apparent. Based on the 1kHz tone used in Figure 3, we can see an immediate distortion of the waveform that produced severe ringing of the audible signal. As we raised the digital signal in our tests even higher, it began to oscillate erratically as the circuit continued to fail. This may have been due to the specific characteristics of our digital workstation trying to operate at this level.
Digital signal processing circuitry may include analog saturation modeling, which is a form of processing that simulates the familiar analog clipping sound and is intended as a safeguard against ruinous digital clipping.
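The contrast between a hard digital limit and a gentler saturation curve can be sketched in a few lines. This is an illustration only: samples are assumed to be normalized so that ±1.0 is 0dBFS, and the tanh curve is a common textbook soft clipper chosen here as a stand-in, not the algorithm of any particular product.

```python
import math

def sine(n: int, amplitude: float = 1.0) -> list[float]:
    """One cycle of a sine wave sampled at n points."""
    return [amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]

def hard_clip(x: float, limit: float = 1.0) -> float:
    """Digital-style clipping: anything beyond full scale is cut flat."""
    return max(-limit, min(limit, x))

def soft_clip(x: float, limit: float = 1.0) -> float:
    """Saturation-style clipping (tanh is an assumed, illustrative curve)."""
    return limit * math.tanh(x / limit)

gain = 10 ** (6 / 20)  # attempt to push a full-scale tone 6 dB higher
tone = sine(48)

hard = [hard_clip(gain * s) for s in tone]
soft = [soft_clip(gain * s) for s in tone]

flat_samples = sum(1 for s in hard if abs(s) == 1.0)
print(f"hard clip: {flat_samples} of {len(hard)} samples pinned at full scale")
print(f"soft clip: peak magnitude {max(abs(s) for s in soft):.3f}")
```

The hard-clipped tone has runs of samples stuck at exactly full scale, the flat tops seen in a clipped waveform, while the soft-clipped tone rounds over and never quite reaches the limit.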
Although it varies from facility to facility, it's helpful to use digital scales when mixing and recording digital signals. This practice introduces and constantly reminds the engineer of a different digital — not analog — measurement standard and the different digital signal characteristic that accompanies it.
Just as critical as the digital scale is the proper meter that can quickly react to peaks and alert the operator to where the signal is in relation to 0dBFS. (See Figure 4.)
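At its simplest, a peak reading in dBFS is the largest sample magnitude in a block expressed relative to full scale. The sketch below assumes samples normalized so that ±1.0 is 0dBFS and uses a hypothetical function name; real meters add ballistics, hold times and oversampling that are not modeled here.

```python
import math

def peak_dbfs(samples: list[float]) -> float:
    """Peak level of a block of samples, in dBFS (full scale = 1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return -math.inf  # digital silence has no finite dB value
    return 20 * math.log10(peak)

block = [0.0, 0.05, -0.1, 0.02]   # hypothetical sample block
level = peak_dbfs(block)
headroom = 0.0 - level            # distance to 0 dBFS

print(f"peak: {level:.1f} dBFS, headroom: {headroom:.1f} dB")  # peak: -20.0 dBFS, headroom: 20.0 dB
```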
U.S. broadcasters typically use a digital reference level of -20dBFS as a plant standard, a direct carryover from the analog reference level of +4dBu. This continues the practice of providing 20dB of headroom for the circuit, with the reference 20dB below the system maximum, as stated in SMPTE Recommended Practice RP 155-1997.
However, this may vary from facility to facility and especially so from country to country. European EBU Technical Recommendation R68-2000 states that an alignment level of -18dBFS is to be used for reference, a level 18dB below the maximum possible coding level.
Figure 4: An audio signal as seen on a Tektronix 764 Digital Audio Monitor. Note the signal applied is peaking well into the headroom, and channel 3 is peaking at 0dBFS.
Using reference tones facilitates the accurate transition from one standard to another. This practice enables setup of standard house levels and headroom in a plant — no matter what the origin of the signal or where the facility receiving the signal may be.
Most important is calibration all the way through the signal chain and the need to adhere to the identified reference, headroom and overload points and their indicators. Remember, the brick wall happens at 0dBFS no matter what, and distortion isn't pretty.
Understanding dialog normalization (dialnorm) and how it works is critical for completing an explanation of digital audio levels in today's world of DTV and DVD.
Enter DTV, DVD and Dolby Digital
Dialnorm is one of 29 metadata parameters that are part of the Dolby Digital bit stream required for all ATSC DTV streams and worldwide DVD audio. Dolby AC-3 is the term identifying the codec for this technology.
Metadata is steering information that is encoded with the audio when it's bit-reduced by Dolby Digital. Metadata is essential data that controls a home decoder to do many things, such as turn on the proper number of channels (e.g., 5.1 or two-channel), let the consumer pick a dynamic range from choices provided by the show's producer (e.g., wide, normal or narrow) and, for dialnorm, automatically adjust the perceived loudness from show to show, commercial to show and so on.
Using a device such as the Dolby LM-100 loudness meter, show audio is measured and dialnorm readings are created using the LEQ-A audio measurement practice that's part of Dolby's Dialog Intelligence. This provides a long-term A-weighted summation in real time, yielding a figure for the perceived loudness of the audio within minutes. This practice (in conjunction with the proper and complete distribution of the metadata) has the capability to maintain the same perceived loudness for all show, commercial and promotional audio during all parts of a station's DTV broadcast day. Dialnorm is intended to eliminate the NTSC issue of inconsistent loudness during broadcasts.
How it works: The audio engineer labels the program or commercial audio with a measured numeric value between -1 and -31 as indicated by an LM-100. This figure is entered as part of the overall metadata stream with a Dolby 570 authoring tool and muxed directly into the VANC of the HD-SDI stream or into the Dolby E metadata bit stream. Dolby E is used to pass multichannel audio and metadata through broadcast gear with limited audio channel capacity.
DTV stations can pass this compatible metadata to home decoders using their Dolby 569 AC-3 encoders, set up to receive the signal on their external metadata input. The audience's set-top boxes or home theater receivers extract the dialnorm metadata from the Dolby Digital bit stream transmitted from the station.
This information is used to dynamically adjust the perceived loudness for all audio to the same -31 level no matter the actual audio amplitude a show or commercial may have. With dialnorm, no limiting or compression is used and the audio maintains all of its dynamic range.
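The decoder-side arithmetic implied here is a single subtraction: the attenuation applied is the distance between the program's dialnorm value and the -31 target. The sketch below illustrates that relationship only; the function name is hypothetical and it does not model an actual Dolby Digital decoder.

```python
DECODER_TARGET = -31  # common loudness target after dialnorm is applied

def decoder_attenuation_db(dialnorm: int) -> int:
    """Attenuation, in dB, a decoder applies for a given dialnorm value."""
    if not -31 <= dialnorm <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    return dialnorm - DECODER_TARGET

# Hypothetical programs with different measured dialog loudness.
for name, dialnorm in [("drama", -27), ("commercial", -21), ("film", -31)]:
    print(f"{name}: dialnorm {dialnorm}, attenuate {decoder_attenuation_db(dialnorm)} dB")
# drama: dialnorm -27, attenuate 4 dB
# commercial: dialnorm -21, attenuate 10 dB
# film: dialnorm -31, attenuate 0 dB
```

A louder program carries a dialnorm value closer to -1 and is turned down more, which is how material with different measured loudness ends up at the same level without any compression.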
Moving from acoustic energy to electricity, analog to digital, digital to dialnorm, the modern mixing engineer has an arsenal of tools and established practices. Proper setup, use of headroom and measurement of perceived loudness all help to ensure the accuracy and consistency of the audio from its origin to delivery. The broadcaster's awareness and use of these practices will ensure that today's digital-savvy home theater audiences receive the extraordinary sonic experience they now demand.
Jim Starzynski is principal engineer in advanced technology for NBC-Universal.