Monitoring loudness levels

How loud is your television? As loud as you want it or louder? The whole issue of loudness is fast becoming a hot topic amongst broadcasters.

Viewers at home are watching a program with the sound turned up to a comfortable level so as not to miss anything. Then along comes a commercial break, and suddenly the viewers are deafened because the sound is so loud. How did that happen?

Over recent years, there has been a lot of talk about the differences in the audible level of broadcast sound, especially with broadcasters increasingly turning their attention towards 5.1 surround sound. Loudness levels are not just an issue between various broadcasters. Often the issue extends to programs on the same channel, indicating that loudness levels are not being monitored or being given the attention they deserve. Trailers for forthcoming programs can be even louder — and, therefore, more annoying — than the commercial breaks.

The fact that this happens is something we can all agree on. Broadcasters need to come up with a loudness level with which most people are comfortable, and to have some way of measuring it.

This presents a challenge because the preferred listening level actually depends on several hard-to-measure parameters. The age and sex of listeners, the time of day they are listening and the type of job they do are just a few of them. Others include the impact the audio is supposed to make in relation to the picture. In some cases, it needs to be loud to create atmosphere. The goal is to find a good balance that leaves viewers free to enjoy the program without having to take control of the sound. When we toggle through the channels of our TV sets, we soon notice that the audio levels vary from one channel to the next. Some attempts have been made to solve this. One approach is automated level control, in which there is no human interpretation at all.

In the U.S., where the HDTV standard includes Dolby 5.1, metadata incorporated into the datastream includes a value for the Dialogue Normalization (DIALNORM) parameter, which should give a trouble-free volume level setting in viewers’ homes. The theory is that set-top boxes in the home should interpret the DIALNORM value and automatically adjust the volume to a preferred setting for the type of material being broadcast. In practice, broadcasters often forget to set the DIALNORM parameters. Another drawback is that in order to get a DIALNORM value to put into the datastream, the program has to be played all the way through. This makes it difficult to include DIALNORM values in live programming.
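The following is a minimal sketch of how a decoder might act on the DIALNORM value. It assumes the behavior described in the AC-3 specification (ATSC A/52), where the value runs from 1 to 31 and states the average dialogue level as -1 to -31 dB relative to digital full scale, with the decoder attenuating so that dialogue lands at -31 dB. The function name, program items and values are hypothetical and chosen only for illustration.

```python
# Assumed decoder target: dialogue is normalized to -31 dB re full scale.
TARGET_DIALOGUE_LEVEL_DB = -31


def dialnorm_attenuation_db(dialnorm: int) -> int:
    """Return the attenuation in dB applied for a given DIALNORM value."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm must be between 1 and 31")
    dialogue_level_db = -dialnorm  # e.g. 27 means dialogue at -27 dBFS
    return dialogue_level_db - TARGET_DIALOGUE_LEVEL_DB


# Hypothetical program items and their DIALNORM values.
items = {"drama": 27, "commercial": 18, "trailer": 15}
for name, dialnorm in items.items():
    cut = dialnorm_attenuation_db(dialnorm)
    print(f"{name}: dialogue at -{dialnorm} dBFS, attenuated by {cut} dB")
```

With correct metadata, the louder items are turned down further and all dialogue arrives at the same level. If the value is left at a default that does not match the material, the correction is wrong by the same amount.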

Peak-reading meters

Another approach involves using level meters that are designed to give good correlation between the reading on a scale and what is actually being heard. Attempts are being made to create such meters, but at present, most people are relying on standard peak program meter (PPM) technology for information about loudness levels, which sadly doesn’t provide all the information they need.

Traditional PPM level meters were designed when everyone worked in analog. They were a tool to drive the AM and FM transmitters as far as possible without overmodulation. PPM meters don’t offer any measure of how loud a listener will perceive a specific piece of program material. All they offer are numbers on a scale, which an operator must interpret in order to estimate how loud the material is.
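A rough sketch of why a peak reading says little about loudness: the two test signals below reach the same peak, yet their energy differs by about 6dB. The sketch makes several simplifying assumptions. It uses a plain sample peak (a real PPM is a quasi-peak device with defined attack and decay times), a 48kHz sample rate, and RMS level as a crude stand-in for signal energy rather than a true loudness measure.

```python
import math


def peak_db(samples):
    """Highest absolute sample value, in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))


def rms_db(samples):
    """Root-mean-square level, in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


N = 48000  # one second at an assumed 48 kHz sample rate
# Continuous 1 kHz tone at full scale.
sine = [math.sin(2 * math.pi * 1000 * n / N) for n in range(N)]
# The same tone in 25 ms bursts, silent for three quarters of the time.
bursts = [s if (n // 1200) % 4 == 0 else 0.0 for n, s in enumerate(sine)]

for name, signal in (("continuous", sine), ("bursts", bursts)):
    print(f"{name}: peak {peak_db(signal):.1f} dB, RMS {rms_db(signal):.1f} dB")
```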

When an engineer uses his ears and combines this information with the reading from the PPM meters, it is entirely possible to create good control of loudness. Many national broadcasters such as the BBC, Danish Radio and YLE used to educate their sound engineers in the fine art of balancing audio using the methods at their disposal, namely PPM meters and their own ears.

But today, it’s not just sound engineers who are responsible for audio post-production. For example, journalists often do their own audio editing. Because they are not sound engineers, they don’t understand why volume levels change and intelligibility suffers if they switch speakers or incorporate telephone interviews. We need to make it sufficiently simple so operators can mix programs without having to think too much about the technical issues. They shouldn’t have to learn mixing skills like recording speech with a higher meter reading than a piece of recorded music.

Special skills are needed because the PPM meter does not give an impression of how loud the program sounds. The audio engineer must, therefore, interpret the PPM reading and evaluate it together with the output of the speakers in order to judge the loudness. There is only one standard way of measuring perceived loudness, and this is ISO standard 532-1975.

Figure 1. Shown here is a simple “loudness” meter. Illustrations from Audio Metering by Eddy Bøgh Brixen.

Measuring loudness

ISO 532-1975 is built around the work of Zwicker, a physicist who was conducting his research into sound during the late 1960s and 1970s. His basic idea was to use the corresponding value between 1/3 octave bands and printed curves in order to give a value to perceived loudness. The trouble is that only equipment developed for acoustical measurement has this true loudness scale implemented. Most meter manufacturers, including some who claim to offer a ‘loudness’ meter, use instead some kind of frequency weighting curve. (See Figure 1.) This measurement is based on theory and practice from straightforward noise measurement. In the best cases, the weighting curves follow the standardized A, B, C or CCIR 468 parameters. (See Figure 2.)

But in many cases, the curve is actually invented by the manufacturer. This gives rise to serious concern because there is no standardization between manufacturers, so one meter might be very different from another even though both claim to do the same thing. Apart from that, the whole idea of using a frequency weighting filter in front of the meter could well prove to be wrong, according to the latest research.

Figure 2. The weighting curves follow the standardized A, B, C or CCIR 468 parameters.

There is another problem with PPM metering: There are different standards. For instance, in the UK, IEC type IIa is the standard, while in Germany, it’s the DIN scale. In the U.S., the VU is often used. The mismatch or just simply the misunderstanding of the full-scale relation in the digital domain can easily offset the sound levels — for example, +18dB, +20dB, +24dB. With global program interchange, there is a need for either a standard scale or, at the very least, the ability to mix in any of the various standards.
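To see how such a mismatch offsets levels, the sketch below reads the figures +18dB, +20dB and +24dB as the headroom between a facility's reference level and digital full scale. That reading is an assumption made for illustration, and the function names are hypothetical.

```python
def reference_dbfs(headroom_db: float) -> float:
    """Digital level of the reference tone when full scale is headroom_db above it."""
    return -headroom_db


def level_error_db(source_headroom_db: float, assumed_headroom_db: float) -> float:
    """Error when material aligned to one convention is read under another.

    A positive result means the material plays louder than the receiver expects.
    """
    return reference_dbfs(source_headroom_db) - reference_dbfs(assumed_headroom_db)


for source in (18, 20, 24):
    for assumed in (18, 20, 24):
        if source != assumed:
            error = level_error_db(source, assumed)
            print(f"aligned for +{source} dB, read as +{assumed} dB: {error:+.0f} dB")
```

Material aligned with 18dB of headroom but received by a facility that expects 24dB therefore arrives 6dB hot, before any difference in mixing style is taken into account.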

Loudness standards

A better solution would be to implement a standardized loudness scale, one that is agreed on in all territories and accepted worldwide. However, as things stand, there isn’t a norm for broadcasters, although work is being carried out by the ITU, whose SRG-3 group is trying to establish a common standard. At a recent SRG-3 meeting, many suggestions based on loudness tests carried out on approximately 100 people were evaluated. Many of the suggestions were based on the frequency weighting theory, but further investigation seemed to indicate that measuring the energy content of the signals, rather than relying on pure filtering, gave a higher correlation between the meter readout and perceived loudness levels.

Figure 3. When looking at the phon curves of our preferred listening level at home (here stated to be 70dB to 90dB SPL), we see that the curves for the three lines 70, 80 and 90 are almost identical.

In practice, the best results seemed to be achieved when the RMS value of a complex signal was measured. One can use well-accepted phon curves to argue this point of view. When looking at the phon curves of our preferred listening level at home (here stated to be 70dB to 90dB SPL), we see that the curves for the three lines 70, 80 and 90 are almost identical. (See Figure 3.)

This means that within this area, we can judge the ear’s frequency dependency to be linear. This, of course, is based on the assumption that during the recording session, the engineer used his ears to compensate for the natural nonlinearity of the human ear! If we look at higher dynamic ranges and compare the 40 phon curve with the 70 phon curve, the linear principle doesn’t work. But as the program is normally mixed within a 15dB to 20dB dynamic range, it is easy to conclude that we will have no need for a frequency weighting curve in front of our measuring device.

It is easy to place a filter in front of a meter and just as easy to take an RMS measurement, but the outcome of the two approaches is very different. This is why it is important for engineers to question loudness meter manufacturers about their test methods and practice, because the concept on which the meter is built might not prove to be the best in practice.
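The difference between the two approaches can be sketched with two sine tones of equal amplitude. The example uses the standard A-weighting response (the analog formula given in IEC 61672) as one representative weighting curve. It is not any particular manufacturer's meter, and applying the weighting as a simple gain is a shortcut that holds only for pure tones.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """A-weighting gain in dB at freq_hz (analog response from IEC 61672)."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(numerator / denominator) + 2.00


def tone_rms_db(amplitude: float) -> float:
    """Unweighted RMS level of a sine tone, in dB relative to amplitude 1.0."""
    return 20 * math.log10(amplitude / math.sqrt(2))


for freq in (100, 1000):
    plain = tone_rms_db(1.0)
    weighted = plain + a_weighting_db(freq)
    print(f"{freq} Hz: RMS {plain:.1f} dB, A-weighted {weighted:.1f} dB")
```

The unweighted RMS reading is the same for both tones, while the A-weighted reading of the 100Hz tone comes out roughly 19dB lower than that of the 1kHz tone. Two meters built on these two concepts would therefore disagree sharply on bass-heavy material.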

The subject of loudness has been under discussion for a long time, and many attempts have been made to design a meter that gives a reading that correlates to our hearing. Many broadcast stations have had their own laboratories trying to make such a meter, but as yet no one has succeeded. It is indeed no easy task.

But at least the work of the ITU SRG-3 group is going a long way towards setting a practical standard, and that’s something we all should welcome. With luck and cooperation, this group should be ready to recommend a new standard by the autumn of 2005. Let us hope that the industry accepts its recommendations; otherwise, we will just have to keep reaching for the volume control.

René Moerch is technical director for DK-Technologies