Television audio has progressed in about 20 years from monophonic analog transmissions using FM to digital surround sound carried in the MPEG transport stream. In addition, new features such as audio description and control metadata are improving the service for listeners.
Worldwide broadcast technology is evolving in both the state and private sectors as broadcasters migrate from analog to digital transmissions. Across Europe, companies are launching HD terrestrial services as part of the switchover. This year, the FIFA World Cup is being used as a starting point from which broadcasters will roll out new HDTV 5.1 services.
DVB SD services have generally carried stereo audio using MPEG-1 Layer II (ISO/IEC 11172-3) and MPEG-2 Layer II (ISO/IEC 13818-3) encoding. Some broadcasters have used surround sound as part of their premium services for sporting events and movies.
As broadcasters move from SD to HD services, part of the more immersive viewing experience is to provide surround sound. The existing DVD market, plus growing penetration of HDTV, Blu-ray and 5.1 home theater systems, adds to viewers' expectations for enhanced audio services with broadcast TV.
New platforms can also deliver multichannel audio, including online services to complement digital TV channels. Also, the online aggregators are engaging with game console makers as well as set-top box (STB) and TV receiver manufacturers to deliver services into the home. In Europe, surround services are available via more than 300 channels, and that number is growing.
Efficient bandwidth use is always an issue, whatever the platform, because it enables service providers to offer higher quality or more channels. There are always limits to compression efficiency, and those limits are set not only by the development of new algorithms but also by the cost of implementing the algorithms in silicon. As semiconductor technology advances, what can be achieved in a chip increases yearly.
Compression schemes like MPEG-2, Layers II and III, were all that was possible in the 1990s. Ten years later, much more is possible at the low prices demanded for the components used in STBs and integrated receivers.
This evolution from mono to multichannel must be carefully managed. The public expects a reasonable life from their receivers and will resist rapid obsolescence through constant changes to transmission standards; likewise, cable operators have a large investment in their STBs.
Broadcasters also need to amortize audio infrastructure, so, again, expect a standard to run for some years before change. They also have to be mindful of changes to production techniques and workflow in the migration from stereo to multichannel audio.
This means that changes to delivery standards must be carefully considered and well researched. In the past, early adopters have been left with less-than-optimal systems that are out of step with international standards. It is in the interest of both viewers and CE manufacturers to agree to standards across a wider territory to increase market size and lower costs. Organizations such as the DVB Forum, the European Information and Communication Technology Association (EICTA) and the Consumer Electronics Association (CEA) have created a framework for the delivery of multichannel audio that constrains delivery standards. Many national governments have derived standards from these frameworks to support their local markets, taking account of any multilingual requirements within their territory. (See Table 1.)
Dolby Digital Plus (E-AC-3) and HE-AAC (an amendment to MPEG-4 Part 3, ISO/IEC 14496-3) are the two audio codecs that have been chosen as next-generation broadcast audio codecs. They offer high-quality audio, control, consistency, compression efficiency and surround sound (5.1). Both codecs use spectral extension techniques to improve compression efficiency, together with improvements to the coding algorithms that increased processing power now makes possible.
Dolby Digital Plus
Dolby Digital Plus, or Enhanced AC-3 (E-AC-3), adds extensions to the AC-3 compression algorithm used by the ATSC HDTV system in the United States, among other applications. New coding tools include spectral extension, enhanced channel coupling, transient prenoise processing, and an improved filter bank and quantization. As a result, some applications can use lower data rates than AC-3, although the enhancements also add support for higher data rates, up to 6.144Mb/s. A typical 5.1 surround-sound bit rate can be about 256kb/s.
Spectral extension replaces higher-frequency coefficients with lower-frequency spectral segments translated up in frequency. Much high-frequency content is the harmonics of lower-frequency sound.
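The idea can be illustrated with a deliberately simplified sketch. It is not the E-AC-3 algorithm: the function name, the single `gain` value and the wrap-around copy are assumptions made for illustration, and a real decoder works on transform coefficients in bands with transmitted shaping data.

```python
def spectral_extend(low_band, total_bins, gain=0.5):
    """Fill the bins above the coded band with scaled copies of low-band
    coefficients (hypothetical helper, illustration only)."""
    if not low_band:
        raise ValueError("low_band must contain at least one coefficient")
    spectrum = list(low_band)          # coded low-frequency coefficients
    source = 0
    while len(spectrum) < total_bins:
        # Translate a low-frequency coefficient up in frequency, scaled by
        # an assumed gain standing in for the transmitted side information.
        spectrum.append(gain * low_band[source])
        source = (source + 1) % len(low_band)
    return spectrum


coded = [1.0, 0.8, 0.6, 0.4]           # only these bins are coded
full = spectral_extend(coded, total_bins=8)
print(full)
```

Only the four low-band values and the gain would need to be sent in this toy case; the upper four bins are regenerated in the decoder.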
New features include support for channel configurations beyond 5.1, such as 7.1 (as used in Blu-ray) and 9.1 (for height perception), with room for at least 13.1 in future applications. The standard also supports multiple program streams through substreams, all multiplexed into a single bit stream. This could be used for audio description, commentaries and the like.
HE-AAC
High-Efficiency Advanced Audio Coding (HE-AAC) is a development of the MPEG-4 AAC audio codec. Specifically, it adds spectral band replication (SBR) and, in the Version 2 profile, parametric stereo (PS).
If a codec like MP3 is used at low data rates, the audio bandwidth has to be limited to about 7kHz to avoid audible quantizing noise. Much like the spectral extension of E-AC-3, HE-AAC uses SBR to reconstruct higher frequencies from the lower frequency range. The reconstruction is aided by the SBR signal, which is multiplexed with the lower-frequency data. In this way, the core codec can work with a limited audio bandwidth while the decoder still delivers the full output audio bandwidth.
Parametric stereo relies on the strong correlation between left and right signals. A composite mono signal is derived, and separate parameters to describe the stereo signals are derived (at a fraction of the rate of composite mono data), and used by the decoder to reconstruct the original stereo signal. The stereo signal is described as a panorama parameter (L/R differences) and an ambience parameter. The Dolby implementation of HE-AAC v2, which adds Dolby metadata and full 5.1 surround sound, is called Dolby Pulse.
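A minimal sketch of the principle follows, assuming a single level-difference (panorama) parameter per frame and ignoring the ambience parameter entirely. The function names and the energy-ratio definition of `pan` are illustrative assumptions, not the HE-AAC v2 bit-stream syntax.

```python
import math


def ps_encode(left, right):
    """Return a mono downmix plus one panorama parameter for the frame."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    total = e_left + e_right
    # pan = share of energy in the right channel (0 = hard left, 1 = hard right)
    pan = 0.5 if total == 0 else e_right / total
    return mono, pan


def ps_decode(mono, pan):
    """Rebuild left/right from the mono signal and the panorama parameter."""
    g_left = math.sqrt(1 - pan)
    g_right = math.sqrt(pan)
    scale = 2 / (g_left + g_right)     # undoes the (L + R) / 2 downmix
    left = [m * scale * g_left for m in mono]
    right = [m * scale * g_right for m in mono]
    return left, right


source = [1.0, -0.5, 0.25]
left_in = [0.8 * s for s in source]    # same sound, panned toward the left
right_in = [0.2 * s for s in source]
mono, pan = ps_encode(left_in, right_in)
left_out, right_out = ps_decode(mono, pan)
```

For a source that is simply panned between the two channels, as here, the pair is recovered from one channel of audio plus a single number. Real stereo content is not fully correlated, which is where the ambience parameter comes in.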
One key difference among current audio codecs is the inclusion of metadata in the bit stream. This allows broadcasters more control over delivery to a wide range of receivers — from a mono portable receiver to a home theater.
A hot topic is loudness management, and the inclusion of metadata goes a long way in helping to adhere to quickly developing regulations regarding item-to-item loudness. The primary Dolby audio metadata parameters are dynamic range control (DRC), downmix and dialog normalization, plus channel configuration. Dialog level, or the dialnorm parameter, represents the long-term average level of the dialog within a presentation. The receiver uses this value to normalize the average audio output to a preset level.
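The normalization step is simple arithmetic, sketched below. The -31dBFS target is an assumption here (it is the reference level commonly quoted for Dolby Digital decoders), and the function names are hypothetical.

```python
def dialnorm_gain_db(dialnorm_db, target_db=-31.0):
    """Gain (in dB) that moves the signaled dialog level to the target level."""
    return target_db - dialnorm_db


def apply_gain(samples, gain_db):
    """Scale linear PCM samples by a gain expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


# A program whose dialog averages -24dBFS is attenuated by 7dB;
# one already at -31dBFS passes through unchanged.
movie_gain = dialnorm_gain_db(-24.0)
news_gain = dialnorm_gain_db(-31.0)
print(movie_gain, news_gain)
```

Because every program is shifted to the same dialog level, the listener should not need to reach for the volume control at each program junction.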
DRC allows broadcasters to deliver a wider dynamic range to viewers who may be listening through a home theater system, rather than compressing the program to suit the poorest listening environment — as is the case with conventional audio. Listeners can then select no dynamic range compression, or light or heavy compression to suit their listening environment. The sound mixer can choose from profiles for movies, music and speech.
Downmix allows a multichannel production to be reproduced in the home through fewer speakers, stereo or mono. The downmix metadata parameters enable the sound mixer to select how the stereo downmix is constructed.
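As a rough sketch, a stereo downmix of one 5.1 sample frame might be formed as shown below. The -3dB default gains, the omission of the LFE channel and the dictionary layout are assumptions for illustration; in practice the gains are the values the mixer signals in the downmix metadata.

```python
import math

MINUS_3_DB = 1 / math.sqrt(2)          # about 0.707


def downmix_to_stereo(frame, center_gain=MINUS_3_DB, surround_gain=MINUS_3_DB):
    """Fold one 5.1 sample frame into a left/right pair (LFE is dropped)."""
    lo = frame["L"] + center_gain * frame["C"] + surround_gain * frame["Ls"]
    ro = frame["R"] + center_gain * frame["C"] + surround_gain * frame["Rs"]
    return lo, ro


def downmix_to_mono(lo, ro):
    """Fold the stereo pair into a single channel."""
    return (lo + ro) / 2


frame = {"L": 0.1, "R": 0.2, "C": 0.4, "Ls": 0.0, "Rs": 0.0, "LFE": 0.9}
lo, ro = downmix_to_stereo(frame)
mono = downmix_to_mono(lo, ro)
```

Lowering `center_gain` or `surround_gain` changes how prominent the dialog and the ambience are in the two-channel result, which is the choice the metadata leaves to the sound mixer. The sketch applies no overall scaling, so a real implementation would also need to guard against clipping.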
Production to delivery
Maintaining the metadata throughout the production processes enables content providers to deliver to viewers a closer representation of their artistic intentions, rather than a heavily compressed version. It delivers to listeners a more consistent experience, no matter what type of platform they may be using to receive the program: over-the-air TV, IPTV, PC via the Internet or a mobile device.
It also allows better control over levels and dynamic range and enables the producer to create surround sound that can be enjoyed equally in stereo or mono.
To ensure that new transmissions are compatible with legacy home theater systems, the latest HD receiver standards describe transcoding from the latest transmission codecs (Dolby Digital Plus and HE-AAC) to Dolby Digital 5.1, decoding to stereo or downmixing to PCM stereo, and pass-through of DTS encoding. This ensures compatibility with existing consumer equipment. (See Figure 1.)
TV audio is evolving to meet the needs of broadcasters, who want control over the quality of the delivered product, freedom to trade data rates and audio quality that meets the bandwidth limitations of different delivery platforms. They also want the ability to deliver surround, stereo and mono in a flexible manner, as well as simple ways to deliver additional audio channels for ancillary services like audio description.
Listeners have differing needs, from high-quality surround sound with a wide dynamic range for the home theater to clear audio with a compressed dynamic range for the small, portable receiver. The installed base of analog and MPEG Layer II receivers limits instant migration, but a managed change to the new codecs (Dolby Digital Plus and HE-AAC) gives broadcasters and other service providers a wide choice of delivery channel configurations and bit rates to provide their listeners with a compelling audio experience to accompany HD pictures.
Table 1. Audio formats for European HD terrestrial broadcasting (DTT)
|HD digital terrestrial specification (authority) ||Dolby Digital Plus mandated ||HE-AAC with transcoder mandated |
|EBU (Tech 3333) ||✓ ||✓ |
|EICTA (now known as DigitalEurope) ||✓ ||✓ |
|France HD DTT (French HD Forum) ||✓ ||✓ |
|Spain HD DTT (Spanish HD Forum) ||✓ ||✓ |
|UK HD DTT (Digital TV Group) ||✓ ||✓ (transcoder not required) |
|Ireland HD DTT (RTE) ||✓ ||✓ |
|Italy HD DTT (HD Forum Italia) ||✓ ||✓ |
|Poland HD DTT (KIGEiT) ||✓ || |
|Slovenia HD DTT (APEK) ||✓ ||✓ |
|Norway, Finland, Denmark, Sweden, Iceland HD DTT (Nordig specification) ||✓ ||✓ |
|RiksTV (Norwegian terrestrial broadcast) || ||✓ |
James Castelton is broadcast marketing manager for Dolby Laboratories.