Audio formats

Since the early days of cinema and broadcast, audio capture, production and post production have received less attention, less manpower and smaller budgets than video, which has also consumed the lion's share of funding for infrastructure upgrades over the decades. While television in developed countries moved to color in the late 1960s and early 1970s, regular transmissions did not upgrade from mono to stereo audio until the late 1980s and 1990s. And stereo is as far as many broadcasters have come to this day, for reasons this article will explore.

Competing formats

Forward-thinking broadcasters in Europe, the Far East and the Americas made the leap into some form of HD broadcasting in the early years of the 21st century. But just as with the previous major transition from monochrome to color, audio lagged behind. For several years, there was uncertainty among broadcasters over which multichannel audio format (i.e. with more than two channels) to adopt to accompany HD transmissions. This was in contrast to the world of film, where 5.1 surround had emerged as a widespread standard for cinema and home theater use by the 1990s.

Essentially, 5.1 audio comprises five full-bandwidth channels for speakers positioned front left, center, front right, rear left and rear right, plus a bandwidth-limited “dot one” low-frequency effects (LFE) channel to handle the bass frequencies usually found only in special effects or explosions. However, 5.1 is far from the only surround-sound format. There are many, each using a different number and spatial arrangement of speakers, and each with its own pros and cons. Among others, audio standards are defined for 7.1, 9.1, 10.2 and experimental 22.2 formats that include height information. Figure 1 shows a diagrammatic representation of 5.1 speaker positions, with an indication of the positioning of the additional side speakers in a 7.1 setup. (Note: Precise placement of the LFE speaker is not so important, as the human hearing system has difficulty localizing low frequencies. It is shown positioned to one side here for the sake of completeness.) It took until the late 2000s for 5.1 to achieve acceptance among broadcasters, encouraged by the reasonably widespread use of the format in home theater systems in the developed world.

The ongoing move to 5.1

In the UK, the BBC's early 21st century experiments with surround sound, including Dolby-matrixed OB test transmissions from the annual Wimbledon tennis tournament, used the 4.0 format (front left and right, rear left and right), with no front center channel or bandwidth-limited LFE channel. For some years, there was a debate within the BBC as to whether material traditionally mixed in the center of a stereo signal — such as voice-overs, presenter dialog or narration — should be mixed equally in the front left and front right channels of surround material (the so-called “phantom center” approach) or routed to its own discrete center channel, as in cinema sound. Transmissions in the 4.0 format used the phantom center approach and omitted the LFE channel altogether. Following the permanent launch of its dedicated HD channel in 2007, the BBC settled on the use of 5.1 as its multichannel audio standard for HD transmissions, but as of early 2011, stereo and 5.1 soundtracks are both still deemed acceptable by BBC HD program delivery guidelines, and the audio accompanying much HD content is still stereo.
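The two routing options debated at the BBC can be sketched in a few lines of Python. This is an illustration only, not a description of any BBC system: the function name `route_dialog` is hypothetical, and the -3 dB gain applied to each side in the phantom-center case is an assumption (a common constant-power choice); the article says only that the material is mixed equally into front left and front right.

```python
import math

def route_dialog(dialog, discrete_center=False):
    """Route a mono dialog signal to the three front channels.

    discrete_center=False: "phantom center", equal level in L and R.
    discrete_center=True:  dialog goes to its own center channel.
    """
    silence = [0.0] * len(dialog)
    if discrete_center:
        return {"L": list(silence), "R": list(silence), "C": list(dialog)}
    # Assumed -3 dB per side so the summed acoustic power stays roughly constant.
    gain = 1 / math.sqrt(2)
    side = [gain * s for s in dialog]
    return {"L": side, "R": list(side), "C": list(silence)}

phantom = route_dialog([1.0, -0.5])
discrete = route_dialog([1.0, -0.5], discrete_center=True)
```

In the phantom case the center channel stays silent, which is why a 4.0 transmission can carry such a mix without a center feed.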

In the United States, the situation is even more fragmented. The major broadcasters, including ABC, NBC, CBS, ESPN and Fox, now offer 5.1 surround sound on some (by no means all) of their HD programming. However, the surround audio does not always reach viewers; not all local television providers have the infrastructure to transmit 5.1, even where it is broadcast.

The situation is more advanced and consistent in Japan, where NHK undertook its first HD transmission with 5.1 audio in 2001; regular transmissions began in 2006. In Europe, when satellite broadcaster BSkyB began planning its HD channels in 2003, it specified infrastructure that would be capable of producing and outputting surround sound from the beginning. Although Sky has not offered 5.1 audio on every program since beginning HD transmissions in 2006, a wide range of its HD (and latterly 3-D) output has carried 5.1, initially only Sky Sports coverage, but now drama and light entertainment transmissions as well.

Infrastructure upgrades and product development

To move from stereo production to a true surround-sound workflow, where audio is captured and mixed in 5.1 as part of the original production of the program material, rather than in post production, completely new equipment and infrastructure are required. BSkyB's move to 5.1-capable audio systems from 2004 to 2006 necessitated the purchase of new surround microphones; mixing consoles with much improved channel counts and DSP capabilities; multispeaker arrays for monitoring; enhanced multichannel editing and post-production tools; and faster, more powerful networking and interconnection infrastructure. That is to say nothing of the completely new HD OB vehicles Sky had to commission and the new servers it needed to archive audio in the new formats.

One of the side effects of such a far-reaching transition has been to drive product development in Europe in all of these fields, as much of the broadcast technology now used by BSkyB did not exist when it began planning its HD channels. Audio companies have developed new processing platforms to meet the requirements of 5.1 audio productions.

Compromises

Unsurprisingly, the outlay required for this kind of far-reaching upgrade has hindered the uptake of surround sound for broadcast across the globe. When you consider the breathtaking cost of upgrading just the video side of a production center to HD, it's no wonder few broadcasters have chosen to undertake a ground-up retooling of their entire audio production chain at the same time, especially in the current economic climate. Compromises and part-upgrades have been widespread, with many smaller broadcasters choosing instead to make new studio facilities and OB vehicles HD-ready by putting in HD-capable cabling and networking infrastructure, but retaining existing key SD hardware for the present. The theory is that the main items of hardware required for 5.1 operation (such as audio consoles and loudspeakers) can then be plumbed into the HD-ready connections during a future financial year, when equipment budgets are in better shape.

Another common approach is to eschew the expensive option of a true surround-sound workflow and instead continue to produce everything in stereo. The completed two-channel mix is then processed at the post-production stage to create a multichannel surround-sound version for broadcast. This approach, known as upmixing, has the obvious financial benefit that no new infrastructure is required except the upmix processor and surround-capable playout hardware. However, the viability of the practice depends on how good the upmixer is, and some produce far more convincing results than others.

Upmixing

Software-based stereo to 5.1 upmix processors have been available for over a decade, but many of the early attempts created center channel information by combining the stereo left and right channels at a lower level while placing a processed version of the left and right channels into the left and right surround speakers. Artificial reverberation was often added to these surround feeds, or they were treated with phase shifts, to create slightly different information for the surround channels. Such approaches work reasonably well on some types of program audio (for example, crowd noise at sports fixtures, where reverb on the surround channels doesn't seem inappropriate even if it wasn't present in the original signal) but reveal their weaknesses when presented with dry, close-miked material, such as unplugged-style music coverage. With such material, the added processing in the surround channels can seem at odds with the original acoustic.
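The early approach described above can be sketched roughly as follows. This is a deliberately naive illustration, not the algorithm of any particular product: the function name `naive_upmix`, the gain values and the use of a short delay as a stand-in for the reverb or phase-shift processing are all assumptions made for the example.

```python
def naive_upmix(left, right, center_gain=0.5, surround_gain=0.5, delay_samples=2):
    """Naive stereo-to-5.1 upmix in the style of early processors (sketch only)."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    n = len(left)

    # Center: left and right combined at a lower level.
    center = [center_gain * (l + r) for l, r in zip(left, right)]

    def processed(channel):
        # Stand-in for reverb/phase-shift processing (assumption):
        # an attenuated copy delayed by a few samples.
        padded = [0.0] * delay_samples + list(channel)
        return [surround_gain * s for s in padded[:n]]

    return {
        "L": list(left),
        "R": list(right),
        "C": center,
        "LFE": [0.0] * n,          # no LFE content is synthesized here
        "Ls": processed(left),
        "Rs": processed(right),
    }

surround = naive_upmix([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
```

Because every output channel is just a scaled or delayed copy of the input, nothing in the surrounds reflects the real acoustic of the recording, which is the weakness that dry, close-miked material exposes.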

What's more, older upmix processors often create multichannel surround that isn't stereo-compatible. Since multichannel audio is often automatically downmixed to stereo or even mono en route to the listener (particularly in the developing world, or to create stereo mixes for SD transmission), this is not desirable and is another argument for implementing a true 5.1 audio workflow, where such incompatibilities can be more easily avoided.
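A simple way to see the compatibility problem is to fold a 5.1 signal back down to stereo and compare it with the original two-channel mix. The sketch below assumes the commonly used downmix in which the center and surround channels are each added to the fronts at about -3 dB and the LFE channel is dropped; the article does not specify coefficients, so these values and the helper names `downmix_to_stereo` and `max_deviation` are assumptions for illustration.

```python
import math

MINUS_3_DB = 1 / math.sqrt(2)  # about 0.707

def downmix_to_stereo(ch, center_gain=MINUS_3_DB, surround_gain=MINUS_3_DB):
    """Fold L, R, C, Ls, Rs down to two channels (LFE omitted by assumption)."""
    lo = [l + center_gain * c + surround_gain * s
          for l, c, s in zip(ch["L"], ch["C"], ch["Ls"])]
    ro = [r + center_gain * c + surround_gain * s
          for r, c, s in zip(ch["R"], ch["C"], ch["Rs"])]
    return lo, ro

def max_deviation(a, b):
    """Largest sample-by-sample difference between two signals."""
    return max(abs(x - y) for x, y in zip(a, b))

# Original stereo material and a 5.1 version whose center was derived
# by summing left and right at half level, as older upmixers did.
stereo_l = [1.0, 0.0, 0.0]
stereo_r = [0.0, 1.0, 0.0]
silence = [0.0, 0.0, 0.0]
derived_center = [0.5 * (l + r) for l, r in zip(stereo_l, stereo_r)]

upmixed = {"L": stereo_l, "R": stereo_r, "C": derived_center,
           "Ls": silence, "Rs": silence}
lo, ro = downmix_to_stereo(upmixed)
deviation = max_deviation(lo, stereo_l)
```

A non-zero `deviation` means the downmix no longer matches the stereo original: the summed center content has leaked right-channel material into the left output and raised its level.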

Fortunately, the past two to three years have seen the release of a number of more advanced broadcast-quality hardware upmix processors. These units produce downmix-compatible 5.1 surround from two-channel stereo and achieve it without adding gratuitous processing to the original audio, although the quality of their output is still program-dependent. Even those broadcasters that capture, mix and output in 5.1 are finding a use for upmixing technology, as it will be several years before even these companies can remove all stereo-only infrastructure from their workflows. While stereo-only OB vehicles, effects libraries and edit suites remain in service, upmixing will have its place.

A twin-track future?

From a quality perspective, it would be nice to think that the future will see broadcasters gradually upgrade so that the majority of HD and 3-D content is accompanied by audio captured and mixed in surround from the outset, as in the film industry.

However, given the high costs of implementing surround workflows in full, and the lower priority that audio infrastructure upgrades have historically been accorded in television circles, surround for broadcast will likely continue to be produced according to a twin-track approach for the foreseeable future. This means full surround workflows for the better-resourced companies and upmixing technology overlaid on existing stereo workflows to a greater or lesser extent by smaller broadcasters.

Matt Bell is a freelance technical author in the fields of pro audio and audio for broadcast.