Surround sound

The recent news that Dolby Laboratories is in the process of buying high-tech compression developer Coding Technologies emphasizes the crucial role 5.1 surround sound will play in future broadcasting systems. Broadcast technologists had spoken somewhat glibly of 5.1 being the natural companion for HDTV, but its role will be far more than just providing big, wrap-around sound to go with the sharper pictures.

Audio codecs

Coding Technologies had been working with Dolby's rival, DTS, to develop surround systems that deliver good quality at low bandwidths. The two companies first announced the collaboration at IBC2005, when the focus seemed to be more on broadband and other IPTV applications. A year later, Coding and DTS had expanded the MPEG-4 aacPlus technology they had been developing to make it suitable for HDTV services.

Demonstrations have been given of the aacPlus system running an audio transport stream of 128kb/s for 720p HD pictures. As broadcasters demand lower bit rates, any savings that can be made in freeing up space could result in enough room for an additional television channel, a major consideration in the increasingly competitive play-out provision sector.

Last year, Dolby introduced its own low bit-rate system, Dolby Digital Plus, described as an extension of the established Dolby Digital (AC-3) surround format. It operates at about 200kb/s, with the ability to scale up as necessary. This system has now been included by the DVB Project as an option in the latest version of the ETSI TS 102 005 specification for IPTV, which has featured high-efficiency AAC (in other words, aacPlus) since the inception of the transmission standard.
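To put the two quoted bit rates in perspective, here is a minimal arithmetic sketch of how per-service audio savings add up across a multiplex. The 128 and 200 figures are the ones quoted above; the number of services is a hypothetical value chosen only for illustration, and the function name is not taken from any real tool. The sketch compares bit rates only and says nothing about audio quality at those rates.

```python
def audio_saving_kbps(services: int, high_kbps: float, low_kbps: float) -> float:
    """Total bit rate (kb/s) freed if every service drops from high_kbps to low_kbps."""
    if services < 0:
        raise ValueError("services must be non-negative")
    return services * (high_kbps - low_kbps)


# Figures quoted in the article.
AACPLUS_KBPS = 128   # aacPlus demonstration rate
DDPLUS_KBPS = 200    # approximate Dolby Digital Plus rate

# Hypothetical: number of surround services sharing one multiplex.
SERVICES = 10

saving = audio_saving_kbps(SERVICES, DDPLUS_KBPS, AACPLUS_KBPS)
print(f"{saving:.0f} kb/s freed across {SERVICES} services")
```

Whether the freed capacity is enough for an extra channel depends on the video bit rate that channel needs, which the figures above do not specify.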

Finalization of the Dolby/Coding deal still depends on approval from the governmental and business regulatory bodies in the United States and Sweden, but there is no reason to suppose that will not be forthcoming. There is no indication at this time what will happen to Coding Technologies' proposed certification scheme, intended to guarantee interoperability along the entire broadcast chain, from encoders and head-end equipment to integrated TVs and domestic set-top boxes.

Of even more importance, and the subject of intense speculation, is where all this leaves Coding Technologies' recent partner, DTS. The aacPlus surround format that the two companies jointly developed has been selected for three major broadcast services, two of which are in Europe. The first to adopt the system was Belgian HD pioneer Euro1080, which earlier this year announced it would be standardizing on MPEG-4 for both video and audio.

In the middle of last year, the consortium behind Norway's MPEG-4 DVB-T platform made a similar announcement, selecting aacPlus/DTS transcoding for the sound component of HD services. In Brazil, a new terrestrial digital TV platform was launched in December 2007, using ISDB-T for the video and aacPlus for the audio. The aacPlus codec will be used for both SD and HD services in transmission as well as production.

In the consumer market, Coding Technologies and DTS gave the first demonstrations of aacPlus on domestic set-top boxes at CES 2007 in the United States last January, with a basic S/PDIF link between the codec and a DTS digital surround-equipped receiver. During the launch, the two manufacturers stated that aacPlus MPEG-4 5.1 surround could be decoded by virtually all set-top boxes.

Surround-sound broadcasts in Europe

Dolby Digital is already established in broadcasting across Europe, having been used for SD movie and drama channels since the mid to late 1990s. The format and its derivatives are now featured on new HD services, with early adopters including BSkyB in the UK, Premiere in Germany and RTL across Europe. The feeling is that the market is now considering the next move, but the latest trend is for international broadcasters to implement 5.1. Broadcasters that have done this include Discovery; the American digital satellite network Voom, which is expanding into the Benelux countries and Eastern Europe; and National Geographic, which has services in Italy, France, the UK (through BSkyB) and Scandinavia, through Canal+.

National Geographic has most recently launched in Poland, a country that is witnessing a proliferation of TV channels, with Discovery also active there and MGM introducing an HD channel. New SD services are also being launched with 5.1 audio, showing that surround sound is a general consideration now and not just something to go along with HD.

Broadcasters have been selecting which 5.1 format to use by fitting the technology to needs and national characteristics, such as terrain and regional size. And the chance of a universal surround-sound format is now even slimmer because the EBU and the European Information, Communications and Consumer Electronics Technology Industry Associations (EICTA) concluded that a single European HDTV audio standard will not be possible.

The EBU stance

There will be a straight choice between the Dolby family and the AAC/DTS group, with recommendations as to which is most suitable for each form of broadcast delivery. The EBU's evaluation involved 10 of its laboratories, about 30 codecs, more than 100 human assessors and more than 1000 encoding sequences. Among the criteria considered are technical performance (quality vs. bit rate), intellectual property rights (IPR), equipment cost, availability of devices and interoperability with existing equipment and metadata.

The EBU does not intend to recommend what audio system should be used with HDTV. It hopes that, based on the technical evidence available, it will be able to advise members on the available and preferred options for different platforms (satellite, cable, terrestrial and IPTV). The EBU stresses that its members are autonomous in their technical choices, and the organization does not intend to interfere with internal decisions.

The overall trend in TV broadcasting is toward surround sound. Movie channels now routinely transmit 5.1 soundtracks, while most high-end dramas are mixed in 5.1, if only to have it available at a later date when surround sound is the norm. Live events are also now more likely to feature more than two audio channels. This year's Eurovision Song Contest was mixed in 5.1.

The most enthusiastic adopter of the technology has been sports broadcasting. The 2006 Football World Cup in Germany offered a choice of mixes, including 5.1, but few broadcasters took the surround option. The story was different for the 2007 Rugby Union World Cup, held in France. Dolby Digital 5.1 was available for all games and was mixed in the audio areas of the OB trucks at the stadiums, rather than in suites at the international broadcast centre, as happened during the Football World Cup.

Surround sound is the primary use for multiple audio channels, but this overlooks the now increasing need to accommodate audio description and alternative language tracks. Linear Acoustic, which has been working with DTS on encoding multiple channels to be carried over two-channel infrastructures, is seeing its StreamStacker encoder being used in a number of countries around the world to accommodate different languages. These include the United States and China, where there is a requirement to broadcast the different dialects found in each region of such a massive country. There is similar potential in Europe, particularly around the Benelux countries, where more than one language is spoken.

OBs in surround

Surround-sound production and post-production are well established for drama, documentary and most other forms of programming, but the real development and innovation is happening in the live OB sector, with sport being the most obvious example. The number of OB trucks in Europe has grown considerably in the past five years, with new vehicles being either fully HD or HD/SD. The majority of these trucks include audio areas equipped with 5.1-capable digital mixing consoles, supported by the necessary encoding, decoding and processing gear.

Modern digital audio consoles are as much routers and processing engines as they are mixing tools. With so much to be done in terms of encoding, decoding and switching, engineers and operators are now demanding a solution that can handle the bulk of the patching and routing of inputs and outputs within its internal system. This includes being able to integrate with as many encoders, decoders and SDI de-embed units as possible.

Broadcasters working on live OBs have developed their own methods of processing for surround sound, and this operational difference extends to how the signals are captured in the first place. Some, such as BSkyB, favor the SoundField technique, while others, notably the BBC, lean toward the MMS method.

There may still be differences in opinion and ways of working, and the Dolby/Coding Technologies situation will have an effect on the business, but that all proves the growing importance of surround sound. With the viewing public becoming ever more aware of what 5.1 can do, the technology is on course to achieve what was hitherto thought impossible for audio: to be considered as important as the pictures.

Kevin Hilton writes about emerging technologies for digital audio.

Capturing surround sound

Many techniques exist for capturing surround sound, each with its own exponents. For coincident placing, two implementations are used: MMS, and SoundField, which uses the ambisonics principles developed by recording engineer and mathematician Michael Gerzon.

Mid-mid-side, designated as double M/S, MMS or MSM, evolved from the mid-side (M/S) method for recording stereo. In M/S, a main microphone is aimed at the primary axis of a sound source and captures the mid signal, while a second microphone positioned laterally records the directional side information. MMS adds a second mid microphone to record the rear left and right sounds. Both mids provide their respective left and right material in matrixed form. (See Figure 1.)
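The matrixing mentioned above can be illustrated with the basic sum-and-difference decode commonly used for M/S: left is mid plus side, right is mid minus side. The sketch below applies the same step to the front and rear mids, sharing the one side signal. It is a simplified illustration, not the decoder used by any particular broadcaster or product; the function names are hypothetical, gain scaling is omitted, and practical double M/S decoders typically apply additional weighting and may derive a centre channel.

```python
from typing import List, Sequence, Tuple


def ms_decode(mid: Sequence[float], side: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Basic M/S decode: left = mid + side, right = mid - side (no gain scaling)."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def double_ms_decode(
    front_mid: Sequence[float],
    side: Sequence[float],
    rear_mid: Sequence[float],
) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Decode front and rear pairs from two mids that share one side signal.

    Returns (front_left, front_right, rear_left, rear_right).
    Assumption: a plain sum/difference matrix with equal weights.
    """
    front_left, front_right = ms_decode(front_mid, side)
    rear_left, rear_right = ms_decode(rear_mid, side)
    return front_left, front_right, rear_left, rear_right


# Two example samples per signal (arbitrary values).
fl, fr, rl, rr = double_ms_decode([1.0, 0.5], [0.25, -0.5], [0.25, 0.0])
print(fl, fr, rl, rr)
```

Because the side signal is shared, three microphone channels are enough to derive four surround feeds, which is part of the appeal of the technique for location work.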

SoundField's method works with the three dimensions found in natural sound by recording front/back information (X) for depth, left/right (Y) for width and up/down (Z) for height, with absolute sound pressure (W) as the central reference point. The microphones consist of the capsule unit, which is the equivalent of the conventional microphone body, and a controller. A digital control unit, the DSF1, was developed for the 2006 Football World Cup. In 2007, SoundField introduced the SPS200 mic, in which all processing is carried out in software on a laptop. That processing includes transforming the basic A-format information relating to positions of sounds into the more workable B-format, as well as surround sound decoding.
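The A-format to B-format step can be sketched with the sum-and-difference matrix usually given for a four-capsule tetrahedral microphone. The capsule names (left-front-up, right-front-down, left-back-down, right-back-up), the 0.5 gain and the function name are assumptions made for this illustration and are not taken from SoundField's software. Real converters also apply frequency-dependent correction for capsule spacing, which is left out here.

```python
from typing import NamedTuple


class BFormat(NamedTuple):
    w: float  # omnidirectional pressure reference
    x: float  # front/back
    y: float  # left/right
    z: float  # up/down


def a_to_b(lfu: float, rfd: float, lbd: float, rbu: float, gain: float = 0.5) -> BFormat:
    """Convert one sample from four tetrahedral capsules (A-format) to B-format.

    Assumed capsule layout: lfu = left-front-up, rfd = right-front-down,
    lbd = left-back-down, rbu = right-back-up. The gain is an arbitrary
    normalisation chosen for this sketch.
    """
    return BFormat(
        w=gain * (lfu + rfd + lbd + rbu),  # sum of all capsules
        x=gain * (lfu + rfd - lbd - rbu),  # front minus back
        y=gain * (lfu - rfd + lbd - rbu),  # left minus right
        z=gain * (lfu - rfd - lbd + rbu),  # up minus down
    )


# Equal signal on all four capsules yields pressure only, with no direction.
print(a_to_b(1.0, 1.0, 1.0, 1.0))
```

A signal present only on the two front capsules produces positive X with zero Y and Z, which matches the front/back reading of the X component described above.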