Over the decades, audio professionals in the production and broadcast industries have attempted to control audio so their audiences hear programs that are both intelligible and easy on the ear.
Initially, audio professionals did this by simply listening to the audio content. In theory, it made perfect sense because audio is produced to be perceived by the human ear. But in practice, there were reasons why simply listening to what was running on the audio track did not achieve the desired results. It quickly became apparent that the industry needed a means of measuring the audio.
Why an industry standard is necessary
Audio is an electrical signal with a limited bandwidth, so it is possible to measure its voltage. Level meters use logarithmic scaling, and their display characteristics and ballistics differ from one instrument to another. A standard audio level meter does not follow the peaks of the audio voltage precisely, but that is not an issue because many peak transients carry little acoustical energy. In other words, they are mostly inaudible.
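The effect of meter ballistics on short peaks can be illustrated with a rough sketch. It assumes a simple one-pole integrator with a 10 ms time constant, which is an illustrative choice and not any standardized meter characteristic; the function names and the test signal are hypothetical.

```python
import math

def to_db(value, reference=1.0):
    """Convert a linear amplitude to decibels relative to `reference`."""
    return 20.0 * math.log10(max(abs(value), 1e-12) / reference)

def meter_reading(samples, sample_rate, integration_ms=10.0):
    """Highest reading of a simplified meter (assumed one-pole ballistics).

    The signal is rectified and smoothed with a single time constant,
    so very short peaks are only partly registered.
    """
    coeff = math.exp(-1.0 / (sample_rate * integration_ms / 1000.0))
    state = 0.0
    highest = 0.0
    for s in samples:
        state = coeff * state + (1.0 - coeff) * abs(s)
        highest = max(highest, state)
    return highest

sample_rate = 48000
# Hypothetical test signal: one second of silence with a 1 ms full-scale transient.
signal = [0.0] * sample_rate
for n in range(1000, 1048):
    signal[n] = 1.0

sample_peak = max(abs(s) for s in signal)
print(f"sample peak:   {to_db(sample_peak):6.1f} dB")
print(f"meter reading: {to_db(meter_reading(signal, sample_rate)):6.1f} dB")
```

With these assumptions the meter shows the transient far below its actual peak, which is the behavior described above: the display follows energy over time rather than instantaneous voltage.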
One might think that combining measurement with monitoring (listening) makes it possible to control the audio content. But as engineers have discovered, two pieces of audio with the same audio level can sound very different. What's more, they can leave a different impression of loudness. There are many reasons for this, not least the impact that spectral composition, equalization and dynamic range compression have on the audio content.
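A small sketch makes the gap between level and energy concrete. It compares two hypothetical signals with the same peak level: a continuous 1 kHz tone and the same tone sounding for only a tenth of the time. Mean-square energy is used here only as a crude stand-in for loudness; it ignores spectral composition, which also matters.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale."""
    return 20.0 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """Mean-square (energy) level in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

sample_rate = 48000
# One second of a full-scale 1 kHz tone.
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(sample_rate)]
# The same tone, but sounding only during the first 10% of the second.
burst = [s if n < sample_rate // 10 else 0.0 for n, s in enumerate(tone)]

for name, samples in (("tone", tone), ("burst", burst)):
    print(f"{name:5s} peak {peak_db(samples):6.2f} dB   energy {rms_db(samples):6.2f} dB")
```

Both signals reach the same peak, yet their energy differs by about 10 dB, so a peak-oriented level meter alone cannot tell them apart.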
These issues have been the subject of debate among experts for many years. The broadcast industry has recognized the need to agree on and adopt a common rule for measuring loudness, and both the ATSC in the United States and the EBU in Europe are currently working on projects to set a norm that all broadcasters can follow.
In July 2006, the International Telecommunication Union's Radiocommunication Sector (ITU-R) published a new recommended algorithm for estimating program loudness levels in digital broadcasting (ITU-R Rec. BS.1770, Algorithms to Measure Audio Programme Loudness and True-Peak Audio Level). On average, Leq(A) and the new ITU-R Rec. BS.1770 measurement method yield the same results.
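The final step of the BS.1770 measurement, a weighted sum of per-channel energy converted to a logarithmic value, can be sketched as follows. This is a partial illustration only: it assumes the channels have already passed through the K-weighting filter that the recommendation specifies, which is omitted here, and it follows the ungated form of the 2006 text. The function and variable names are hypothetical.

```python
import math

# Channel weights from BS.1770 for a 5.1 layout; the LFE channel is not measured.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}

def mean_square(samples):
    """Average energy of one channel."""
    return sum(s * s for s in samples) / len(samples)

def program_loudness(weighted_channels):
    """Loudness in LKFS from channels that are ASSUMED to be K-weighted already.

    `weighted_channels` maps a channel name to its list of samples.
    """
    total = sum(CHANNEL_WEIGHTS[name] * mean_square(samples)
                for name, samples in weighted_channels.items())
    # Guard against log10(0) for digital silence.
    return -0.691 + 10.0 * math.log10(max(total, 1e-12))

sample_rate = 48000
# 100 ms of a full-scale 1 kHz tone, used here without any K-weighting.
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(4800)]

print(f"front left only:    {program_loudness({'L': tone}):.2f}")
print(f"left surround only: {program_loudness({'Ls': tone}):.2f}")
```

The same tone reads about 1.5 dB higher in a surround channel than in a front channel, reflecting the extra weight the recommendation gives to sound arriving from behind the listener.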
Hurdles audio engineers must overcome
Currently, loudness is understood in two main ways: sensory and perceptual. Sensory loudness is directly related to the neural activity of the human inner ear. It is possible to model this and build a sensory loudness meter. In contrast, perceptual loudness is related to how interested the listener is in the sound. This is not something one can readily model because it is a learned response that varies according to the personal involvement of the listener. Because people differ in how they use their senses, there are enormous variations in the information needed to form a perceived reality. This becomes interesting for people working with sound and moving pictures because if the source can be seen, then the mind will prefer the visual evidence to the sound.
Nonetheless, the standard published by the ITU describes a proven method of measuring the audio and getting a result related to loudness. As the production and broadcast industries start to embrace loudness control, they will face some important challenges. They must get used to something they have no experience with. The industry knows how to deal with level control because it has been doing it for years, but as already stated, two different pieces of audio with the same level characteristic can be different when it comes to their loudness impression.
The industry should learn to control audio material with loudness measurement techniques instead of level measurements, because when it comes to complying with ITU-R BS.1770, what the level meter shows is no longer important. Engineers need to understand that loudness and level are two different things. If the audio material is aligned to equivalent loudness, the level might vary wildly, and that is going to confuse those who have been trained to look for proper leveling.
The way we control audio is going to be very different in the future. The audio engineer still needs to trust his or her ears, but the visual reference instrument of the future will be the loudness meter. Audio monitoring conditions will have to be standardized and aligned in the same way; otherwise, it will be difficult to produce equivalent loudness from different locations and venues.
Another problem with the practical use of ITU-based loudness control is the lack of references. There are standards for audio levels. These may vary from region to region, but at least people know which alignment level they need to control. The ATSC has given the first recommendation for loudness control, while in Europe, the P/LOUD group hosted by the EBU is discussing its recommendation. Once the references are published, it should be easy to follow them. There is no evidence that loudness perception differs between regions of the world; therefore, it should be possible to evaluate and set an international reference. However, given the politics involved, one does wonder if this will ever be the case!
The ideal solution
In the end, the industry may need to adopt the right technical solution. There are already options available, the most basic of which is to run an automated fader driven by the control signal out of the loudness detection.
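A minimal sketch of such an automated fader is shown below, under several stated assumptions: the detector is a plain per-block mean-square level rather than a BS.1770 meter, the target of -24 dB is an illustrative value, and the smoothing factor, gate threshold and gain limit are hypothetical parameters chosen for the example.

```python
import math

def block_level_db(block):
    """Mean-square level of one block in dB (stand-in for a loudness detector)."""
    mean_square = sum(s * s for s in block) / len(block)
    return 10.0 * math.log10(max(mean_square, 1e-12))

def automated_fader(samples, block_size, target_db=-24.0,
                    smoothing=0.9, gate_db=-60.0, max_gain_db=20.0):
    """Apply a slowly moving gain that steers each block toward target_db."""
    gain_db = 0.0
    output = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        level_db = block_level_db(block)
        if level_db > gate_db:
            # Gain the detector asks for, limited to a safe range.
            wanted_db = target_db - level_db
            wanted_db = max(-max_gain_db, min(max_gain_db, wanted_db))
            # Move the fader only part of the way per block to avoid jumps.
            gain_db = smoothing * gain_db + (1.0 - smoothing) * wanted_db
        # Below the gate the fader holds its position instead of boosting silence.
        gain = 10.0 ** (gain_db / 20.0)
        output.extend(s * gain for s in block)
    return output

sample_rate = 8000
block_size = 800  # 100 ms blocks
# Ten seconds of a steady 1 kHz tone at half of full scale.
source = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate)
          for n in range(100 * block_size)]
faded = automated_fader(source, block_size)

print(f"input level:       {block_level_db(source[-block_size:]):6.2f} dB")
print(f"output, last block: {block_level_db(faded[-block_size:]):6.2f} dB")
```

A fader like this only corrects the average level. It reacts slowly by design and does nothing about short-term dynamics or peaks.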
But well-balanced loudness control isn't that easy. Ideally, the industry needs a system that gives short-term dynamic control as well as average level control (compliant with ITU-R BS.1770) in one box. Such a system should give continuous control, regardless of the source and without altering the sound of the audio material. There should be no breathing, pumping or spectral changes, just well-controlled loudness. The process should automatically handle loudness jumps between feeds and between program parts, and give listeners the results they want to hear.
This type of system also needs to restore dynamic structure and offer inaudible gain control. Transients and peaks should be precisely controlled, even if they are not reflected in the loudness measurement.
The process should be easy to set up and should adapt to the program material. All of this should be performed with a short latency, even though the human ear needs about 200 ms to perceive loudness. Latency must be kept to a minimum to avoid harmful time base differences between audio and picture.
We are moving toward a situation where broadcast ingest, playout and distribution facilities will face mandatory requirements to implement loudness-based audio control solutions. However, achieving consistent loudness output would be a lot easier if international references were given and if production facilities began adopting loudness control.
Peter Poers is managing director for Jünger Audio.