What Is Dolby E?

After last time's brief introduction to Dolby E, I figured it would be a good idea to devote an entire column to the technology, plus some of the difficulties in making it work in a typical plant, if such a thing as a typical plant exists. Before we dive in, however, I want to mention an excellent resource introduced to me by Bruce Jacobs of Twin Cities Public Television in Minneapolis. It is called the ETV Cookbook and can be found online at http://etvcookbook.org. It is clear and concise, has nice drawings and, while geared toward PBS affiliates, is broad enough to be useful for any DTV station or network. I am very impressed and highly recommend stopping by. On to Dolby E...


There is little sense in having an emission coding system such as Dolby Digital (AC-3) that can carry 5.1 channels of audio if you can't get 5.1 channels to the encoder. In 1996, all commonly used VTRs had only four audio channels; servers could in theory do more but were not generally configured that way, and digital plants rarely had more than two AES pairs available for routing. Once the Dolby Digital (AC-3) system was in place as part of the ATSC standard, Craig Todd, Louis Fielder, Kent Terry and others at Dolby began to investigate ways to efficiently distribute the multiple channels of audio. They foresaw issues that were still a few years away from becoming a really big problem.

So what is Dolby E? Contrary to some rumors, it is not high-rate Dolby Digital (AC-3). That approach was considered, but there were too many benefits to be had from starting over with a different set of goals. The goal of the Dolby Digital (AC-3) system is to deliver up to 5.1 channels of audio to consumers using the fewest bits possible while still preserving excellent audio quality. As we will see, this is not the goal of Dolby E. To meet its goal, the Dolby Digital (AC-3) encoder is rather complex and takes about 187 milliseconds from the time it receives audio until the time it produces a Dolby Digital (AC-3) output. This is analogous to the video encoding process: high quality at a low bit rate means the encoder is going to need processing time. Although this encoding latency is small (far less than video encoding latency) and is taken into account in the multiplexer, this much delay is difficult to deal with in production and distribution.

Fig. 1
Also, while the audio quality of Dolby Digital (AC-3) is very good, it would not be appropriate to use it for multiple encode/decode cycles. Doing so might allow coding artifacts to become audible. I say "might" because the artifacts are unpredictable: sometimes you might hear them with certain material, sometimes you might not. High-rate Dolby Digital (AC-3) minimizes the chance of this occurring, but it could still happen.

Another drawback of using Dolby Digital (AC-3) for distribution is that although its data is packetized into frames, these frames do not regularly line up with video frames (see Fig. 1).

You might be thinking, "PCM audio is packetized into AES frames that do not line up exactly with video frames either, so what is the problem?" Good point, but Dolby Digital (AC-3) frames carry bit-rate reduced (i.e., compressed) audio, not baseband audio. Although a video edit would cause little problem for baseband audio, cutting in the middle of a Dolby Digital (AC-3) frame will cause major problems. After decoding, the results will be audio mutes if you are lucky, clicks and pops if you are not. Dolby Digital (AC-3) is simply not intended to be used this way.
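To put rough numbers on that misalignment, here is a minimal Python sketch. It assumes a Dolby Digital (AC-3) frame carries 1,536 samples per channel (32 ms at 48 kHz); that figure comes from the AC-3 specification rather than from this column, and the function name is made up for illustration.

```python
from fractions import Fraction

SAMPLE_RATE = 48_000       # Hz, the AES sample rate used throughout this column
AC3_FRAME_SAMPLES = 1_536  # assumption: samples per AC-3 frame (not stated in this column)

NTSC = Fraction(30_000, 1_001)  # 29.97 fps
PAL = Fraction(25)              # 25 fps

def frames_until_realigned(fps):
    """Return (video_frames, ac3_frames) until both frame boundaries coincide again."""
    video_frame_samples = Fraction(SAMPLE_RATE) / fps
    # Length of one video frame, measured in AC-3 frames, in lowest terms.
    ratio = video_frame_samples / AC3_FRAME_SAMPLES
    # n video frames span n * ratio AC-3 frames; the smallest n that makes
    # this a whole number is the denominator of the reduced ratio.
    return ratio.denominator, ratio.numerator

for name, fps in (("NTSC", NTSC), ("PAL", PAL)):
    video, ac3 = frames_until_realigned(fps)
    print(f"{name}: boundaries coincide every {video} video frames ({ac3} AC-3 frames)")
```

Under that assumption the boundaries coincide only once every four video frames at 25 fps, and once every 960 video frames (about 32 seconds) at 29.97 fps, so an arbitrary video edit will almost always land in the middle of a compressed audio frame.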

This did not stop some early adopters, however, and at least one major DBS provider used Dolby Digital (AC-3) recorded on one of the AES pairs of a Digital Betacam recorder to carry the 5.1 channel audio of movies. Did it work? Absolutely, and even when the Digital Betacam tapes were not long enough to hold an entire film and the Dolby Digital (AC-3) stream had to be switched mid-movie, it hardly ever caused a glitch. They were lucky! It can be done, but it is not easy and is not recommended. Clearly, it was time for a new system designed specifically for the task.


Some of the original goals for this new system were that it had to be video frame-bound so that it could be easily edited or switched, had to be able to handle multiple encode/decode cycles (at least 10) while causing no audible degradation, had to carry eight channels of audio and metadata, had to fit into a standard size AES pair of channels and had to do its encoding and decoding in less than a video frame.

Dolby E satisfies this tall order. The system will accept up to eight channels of baseband PCM audio and metadata and fit them onto a single 20-bit, 48kHz AES pair (i.e., 1.92 Mbps), or it will fit six channels plus metadata into a 16-bit, 48kHz AES pair (i.e., 1.536 Mbps). After decode, PCM audio and metadata are output to feed a Dolby Digital (AC-3) encoder.
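The two bit rates quoted above follow directly from the size of the AES container. The short Python sketch below reproduces the arithmetic and compares it with the raw PCM rate of the channels being carried. The function names are illustrative, and the comparison assumes the input PCM uses the same word length as the container, which this column does not specify.

```python
def aes_pair_bit_rate(bits_per_sample, sample_rate_hz=48_000, channels=2):
    """Payload capacity of one AES pair in bits per second."""
    return bits_per_sample * sample_rate_hz * channels

def pcm_bit_rate(num_channels, bits_per_sample, sample_rate_hz=48_000):
    """Raw bit rate of uncompressed PCM audio in bits per second."""
    return num_channels * bits_per_sample * sample_rate_hz

# (container word length, audio channels carried), as described in the text
for bits, carried in ((20, 8), (16, 6)):
    container = aes_pair_bit_rate(bits)
    # Assumption: the source PCM has the same word length as the container.
    source = pcm_bit_rate(carried, bits)
    print(f"{bits}-bit pair: {container / 1e6} Mbps container, "
          f"{carried} channels of PCM would need {source / 1e6} Mbps")
```

Since the metadata and framing also have to fit into the same container, the reduction actually applied to the audio is somewhat greater than the simple ratio of these two numbers.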

Fig. 2
In Fig. 2 you can see how Dolby E frames match video frames. Although only NTSC and PAL rates are shown, the system will also work with 23.976 and 24 fps material.
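Because each Dolby E frame has to fit within one video frame, the frame rate sets how many bits of the AES pair are available per frame. The Python sketch below works out that upper bound for the four rates mentioned. It is only a ceiling under stated assumptions: the real Dolby E frame is smaller because of framing overhead, whose size is not given here, and the function name is illustrative.

```python
from fractions import Fraction

FRAME_RATES = {
    "23.976": Fraction(24_000, 1_001),
    "24": Fraction(24),
    "25 (PAL)": Fraction(25),
    "29.97 (NTSC)": Fraction(30_000, 1_001),
}

def bits_per_video_frame(fps, bits_per_sample=20, sample_rate_hz=48_000):
    """Upper bound on AES-pair payload bits passing during one video frame."""
    samples_per_frame = Fraction(sample_rate_hz) / fps  # per channel, may be fractional
    return 2 * bits_per_sample * samples_per_frame      # two channels in an AES pair

for name, fps in FRAME_RATES.items():
    samples = float(Fraction(48_000) / fps)
    print(f"{name} fps: {samples} samples per frame, "
          f"at most {int(bits_per_video_frame(fps))} bits in a 20-bit pair")
```

Note that 29.97 fps gives a non-integer 1,601.6 samples per video frame, which is one reason frame-bound audio at NTSC rates needs careful handling.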

Notice the small gap between the Dolby E frames. This is called the Guard Band and is a measurable gap in the bitstream. Switching here allows a seamless splice of the audio signals. In fact, upon decode, the audio is actually crossfaded at the switch point, which is a remarkable feat for compressed audio.

As the goal for the Dolby E bit-rate reduction algorithm was to enable multiple encode/decode cycles, or concatenations, the audio quality is maintained for a minimum of 10 generations. This does not mean that at 11 generations the audio falls apart, but rather that the absence of any artifacts is no longer guaranteed. I was one of the listening test participants at Dolby during the development of the "E" algorithm. I spent two rather unpleasant afternoons listening to all kinds of audio samples both before and after 10 generations of Dolby E. I consider myself a critical listener, but those were some of the hardest listening tests I have ever participated in. This was not like comparing apples and oranges; it was more like comparing two perfect apples, one ever so slightly different from the other. In a word: Maddening!


Dolby E, like Dolby Digital (AC-3), is carried on a standard AES pair. It can be recorded, routed and switched just like standard PCM audio in an AES pair (see Fig. 3).

Fig. 3
However, there are some strict requirements. The path for the AES pair must be bit-for-bit accurate. This means that there can be no level changes, sample rate converters, channel swaps or error concealment in the path. Remember that although the Dolby E data is in an AES pair, it is not audio until it is eventually decoded. Any processing that causes changes in the data will destroy the information. These "Gotchas" are hidden everywhere, especially sample rate converters, so be prepared to really evaluate your facility. An invaluable resource is the Dolby E partner program, run by Jeff Nelson at Dolby. You can find a bunch of very useful information at www.dolby.com/tech/dolbyE_prtnr_prgrm.html. Manufacturers and individual products that have been tested to pass Dolby E are listed here.
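One way to picture the bit-for-bit requirement is to compare the words that went into a path with the words that came out. The Python sketch below is purely illustrative: the payload values are made up, and the helper functions are hypothetical stand-ins for the kinds of processing named above, not part of any Dolby tool.

```python
def first_mismatch(sent, received):
    """Return the index of the first differing word, or None if bit-for-bit identical."""
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return i
    if len(sent) != len(received):
        return min(len(sent), len(received))  # one stream ended early
    return None

def apply_gain(words, gain):
    """What a level control does: treat every word as an audio sample and scale it."""
    return [int(round(w * gain)) for w in words]

def swap_channels(words):
    """Swap the two channels of an interleaved pair (expects an even number of words)."""
    out = list(words)
    out[0::2], out[1::2] = words[1::2], words[0::2]
    return out

# Made-up 20-bit data words standing in for a Dolby E payload (not real Dolby E data).
payload = [0x12345, 0x0ABCD, 0x7FFFF, 0x00001]

print(first_mismatch(payload, list(payload)))            # transparent path
print(first_mismatch(payload, apply_gain(payload, 0.5)))  # level change
print(first_mismatch(payload, swap_channels(payload)))    # channel swap
```

Only the first path returns None. A level change or a channel swap that would be harmless to PCM audio alters the words, and to a decoder altered words are simply corrupted data.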

Where carrying baseband audio is not possible, Dolby E has become the de facto standard. Since it began shipping in September 1999, more than 1,000 encoders and decoders have been sold, and the system is the source for virtually all 5.1 channel Dolby Digital (AC-3) broadcasts here and abroad.

One last point. The question I was probably asked most often was: "Why is it called Dolby E?" Simple: "E" comes after "D." So, as Steve Lyman likes to say, "Dolby E is for Distribution and Dolby D (i.e., AC-3) is for Emission."

We ran short on space (or I was long-winded) this time, so we will wrap up Dolby E, describe some of its competitors and get to the SMPTE standards next time. These standards are essential for carrying compressed audio of any kind because if it is not done in a standard manner, interoperability becomes an impossibility. In the meantime, if you have access to them, I suggest taking a quick look at the series of SMPTE standards from 337M to 339M. Please keep the excellent questions and suggestions coming.