Alternatives to Dolby E

Last time we discussed the details of how Dolby E works. We examined how it has some pretty strict usage requirements that you need to keep in mind while designing and installing systems that will use Dolby E. While its use as a distribution format has become widespread very quickly, it is not the only solution for moving multichannel audio around a plant. This time we will investigate some other techniques as well as give a brief overview of the SMPTE standards that help to organize all this data.

BASEBAND

At least one major network has decided to take a baseband approach: All video, audio, and metadata (audio metadata, closed captioning, PSIP, etc.) will be combined into a single SMPTE 292M (HDSDI) serial HD signal, which will be routed throughout the network plant. Having all of the data in one serial stream will help minimize lip sync and other timing problems, and it brings obvious savings in cabling and routing management. This network will be accepting content in a variety of formats, including formats with Dolby E encoded audio, but its preference will be baseband audio, video, and metadata.

Take a look at Figure 1. This topology can work quite well. There are video recorders, such as the Panasonic AJ-HD3700, that can accept this HDSDI signal, record it, and play it back. Embedding and de-embedding equipment for HDSDI is becoming very common, as are routing and monitoring products.

Fig 1. Baseband routing of multichannel audio and metadata. Note that the area in the white dashed box can be omitted if embedded audio/video/metadata is not desired for use in the local plant. Also, relevant SMPTE specifications are noted where applicable and can be consulted for further information.

Due to RF constraints, the network will use Dolby E to compress the audio for distribution over satellite to its affiliates. However, once the signal is received by the affiliate, it will be immediately decoded back to baseband PCM and metadata. The affiliate can then adopt the network approach and re-embed all of this back into a SMPTE 292M signal, route, switch, and/or store it, then finally send it to the emission encoder for transmission to the consumer. A major advantage of the network choosing this approach is that affiliates can implement some, all, or none of it; it depends solely on what makes sense for them. As the signals can be brought back to baseband very easily, they can stay that way and be handled outside of the embedded domain.

A few potential problems come to mind. The old issue of master control switchers and audio metadata still remains (see TV Technology, April 23, 2003 issue), but is no worse than with any other approach. There is a new issue with regard to monitoring the audio of an embedded baseband system. As we all know, audio metadata is only applied to the audio inside the Dolby Digital (AC-3) decoder. How do you monitor it when the audio is baseband PCM and the metadata is RS-485? You could use a Dolby DP570 plus an in-rack monitor from Wohler, but this gets rather expensive. You could use a Dolby Digital (AC-3) encoder and a different in-rack monitor from Wohler, but this gets even more expensive, and the audio will now also be delayed by at least 187 milliseconds due to the Dolby Digital (AC-3) encoding latency. I have great faith that manufacturers will rise to the occasion and produce products to fill this void. And finally, yes, it is true that all of your proverbial eggs are in one basket, but who really wants to carry more baskets than they need to?

OTHERS IN THE MEZZANINE LEVEL

In addition to Dolby E and baseband, another approach to distributing discrete multichannel audio and metadata was co-developed by Leitch and Audio Processing Technology (APT). The system is called Diamond Audio and uses an enhanced version of the APT-X ADPCM (Adaptive Differential Pulse Code Modulation) compression system. Incidentally, the original APT-X system is used worldwide for distributing audio in radio, especially in Europe, and is the same system used by DTS (Digital Theater Systems) in its digital film system.

Enhanced APT-X provides for a 4:1 reduction in audio data rate, allowing for eight audio channels to fit into a single AES pair, just like Dolby E. In fact, many of the features of this system are quite similar to Dolby E. Like the Dolby E system, the Diamond Audio format is capable of replacing a single channel of audio in the compressed domain. This is useful for multi-language versions of programs where a complete re-mix is not possible. The alternate dialogue track can simply replace the existing dialogue track without losing a compression cycle. Upon decode, the program will have the new dialogue already in place.
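
The channel arithmetic behind the 4:1 figure can be checked with a rough sketch. It assumes 48 kHz sampling and the same word length for the source audio and the AES payload, neither of which is stated above, and the function names are purely illustrative:

```python
# Rough check: does 4:1 compression fit eight channels into one AES pair?
# Assumptions (not from the column): 48 kHz sampling, the same word length
# on both sides, and no allowance for framing or metadata overhead.

SAMPLE_RATE_HZ = 48_000  # assumed sampling rate


def pcm_rate_bps(channels: int, bits_per_sample: int) -> int:
    """Raw PCM data rate in bits per second."""
    return channels * SAMPLE_RATE_HZ * bits_per_sample


def fits_in_aes_pair(channels: int, bits_per_sample: int, ratio: int) -> bool:
    """True if the compressed stream is no larger than a two-channel payload."""
    compressed = pcm_rate_bps(channels, bits_per_sample) / ratio
    return compressed <= pcm_rate_bps(2, bits_per_sample)


for bits in (16, 20, 24):
    source = pcm_rate_bps(8, bits)
    print(f"{bits}-bit: 8 ch = {source / 1e6:.3f} Mb/s, "
          f"after 4:1 = {source / 4 / 1e6:.3f} Mb/s, "
          f"fits: {fits_in_aes_pair(8, bits, 4)}")
```

In this simplified model the compressed stream exactly fills the pair, so the sketch ignores whatever framing or metadata overhead a real system needs.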

One major difference is that while the output of the Diamond Audio system can be video synchronous, it is not frame-bound like Dolby E. This does not mean that the signal cannot be switched or edited, however. Due to the nature of ADPCM compression, switching and editing are possible without the system needing to be video frame-bound. This eases its use with different video frame rates and frequencies, as audio frame size is not tied directly to video frame size.
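
To see why a frame-bound format has to account for each video rate, it helps to count how many audio samples fall within one video frame. This is only a minimal sketch: the 48 kHz sampling rate and the list of frame rates are assumptions chosen for illustration, not figures from the Diamond Audio or Dolby E documentation.

```python
# Audio samples per video frame at several frame rates.
# Assumed values: 48 kHz audio and the frame rates listed below.
from fractions import Fraction

SAMPLE_RATE_HZ = 48_000  # assumed sampling rate

FRAME_RATES = {
    "24": Fraction(24),
    "25": Fraction(25),
    "29.97": Fraction(30000, 1001),
    "59.94": Fraction(60000, 1001),
}


def samples_per_video_frame(frame_rate: Fraction) -> Fraction:
    """Exact number of audio samples spanning one video frame."""
    return Fraction(SAMPLE_RATE_HZ) / frame_rate


for name, rate in FRAME_RATES.items():
    n = samples_per_video_frame(rate)
    kind = "whole" if n.denominator == 1 else "fractional"
    print(f"{name} fps: {float(n):.1f} samples per frame ({kind})")
```

At the fractional rates the sample count per frame is not a whole number, which is the kind of bookkeeping a format that is not frame-bound does not have to carry.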

A possible advantage of the Diamond Audio system is that it has a very low latency. The encode takes only 4 milliseconds and the decode takes another 4 milliseconds. When you consider the cost of delaying a high-definition signal, the less needed the better. The only minor issue is that 4 milliseconds does add up, so after an encode and a decode cycle, you have 8 milliseconds of delay. There are a few products that will allow for correction of a 4 or 8 millisecond delay, but I am not sure everyone would do the right thing and correct for it. I am not pointing out a system flaw but giving a warning to system designers that just like with Dolby E, the delays must be compensated for. More information can be found by searching for diamond audio at www.leitch.com.
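
For system designers budgeting this delay, a small sketch can convert the codec latency into audio samples and video frames. The 4 millisecond encode and decode figures come from the text above; the 48 kHz sampling rate, the 29.97 fps frame rate, and the idea of several cascaded passes are assumptions for illustration only.

```python
# Delay budget: codec latency expressed in audio samples and video frames.
# From the text: 4 ms encode, 4 ms decode.
# Assumed: 48 kHz audio, 29.97 fps video, and multiple cascaded passes.
from fractions import Fraction

SAMPLE_RATE_HZ = 48_000              # assumed sampling rate
FRAME_RATE = Fraction(30000, 1001)   # assumed video frame rate (29.97 fps)


def total_delay_ms(stages_ms):
    """Sum the latency of every stage in the chain, in milliseconds."""
    return sum(stages_ms)


def delay_in_samples(delay_ms):
    """Delay expressed in audio samples."""
    return delay_ms * SAMPLE_RATE_HZ / 1000


def delay_in_frames(delay_ms):
    """Delay expressed in video frames."""
    return float(Fraction(delay_ms) * FRAME_RATE / 1000)


one_pass = [4, 4]  # encode + decode, in milliseconds
for passes in (1, 2, 3):
    d = total_delay_ms(one_pass * passes)
    print(f"{passes} encode/decode pass(es): {d} ms = "
          f"{delay_in_samples(d):.0f} samples = {delay_in_frames(d):.2f} frames")
```

A single pass stays well under one video frame in this model, so a frame-based video delay cannot compensate for it exactly; the correction has to be made with a finer-grained audio or video delay.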

SMPTE SPECS

SMPTE standards 337M, 338M, and 339M were created to help define how and where data can be mapped into an AES digital audio interface. 337M is the foundation, 338M defines the data types, and 339M defines the formatting for the data. This series of standards is intended to help manufacturers design equipment that can handle not only baseband PCM audio, but also compressed formats such as Dolby E, Dolby Digital (AC-3), APT-X, and others, and do so in a standard and interoperable way. For example, if you were to specify the use of a distribution system that created a transport stream that could be decoded only by that manufacturer's decoder, it would be far less useful than a system that allowed any standard decoder to do the job. MPEG helps the video side, but only loosely defines how audio (other than MPEG audio) fits into the system. SMPTE 337M-339M takes care of the audio issues. My strong advice when shopping for distribution equipment is to make sure that the system you choose supports these standards. Don't fall for the "we will add it later" line because, as I have seen, this is not always possible and to my knowledge has never been successfully done!
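
As a loose illustration of what "mapping data into an AES interface" means in practice, the sketch below scans 16-bit payload words for a burst preamble and reads the data type and length that follow it. The sync word values, the five-bit data type field, and the two data type codes are my understanding of the 16-bit mode and are labeled as assumptions; they should be verified against the standards themselves before being relied on. The function name is hypothetical.

```python
# Sketch: locate a SMPTE 337M-style burst preamble in 16-bit AES payload words.
# ASSUMPTIONS to verify against the standards: the two sync word values, the
# data type living in the low five bits of the third preamble word, the fourth
# word carrying the burst length in bits, and the data type codes listed here.

PA_16 = 0xF872  # assumed first sync word (16-bit mode)
PB_16 = 0x4E1F  # assumed second sync word (16-bit mode)

DATA_TYPES = {  # partial and assumed; not a complete table
    1: "Dolby Digital (AC-3)",
    28: "Dolby E",
}


def find_bursts(words):
    """Yield (index, data_type, length_bits) for each preamble found."""
    i = 0
    while i + 3 < len(words):
        if words[i] == PA_16 and words[i + 1] == PB_16:
            burst_info, length_bits = words[i + 2], words[i + 3]
            yield i, burst_info & 0x1F, length_bits
            i += 4  # skip past the four preamble words
        else:
            i += 1


# Hypothetical payload: two idle words, a preamble, then two payload words.
stream = [0x0000, 0x0000, 0xF872, 0x4E1F, 0x001C, 0x0800, 0x1234, 0x5678]
for index, data_type, length_bits in find_bursts(stream):
    name = DATA_TYPES.get(data_type, "unknown")
    print(f"preamble at word {index}: type {data_type} ({name}), "
          f"{length_bits} bits")
```

The point of the example is the interoperability argument above: when every vendor signals the payload type in the same place, any compliant decoder can tell what it has been handed before trying to decode it.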

I suppose the moral of the past two columns is that there is more than one way to "skin the distribution cat." It is really up to the individual networks and affiliates to choose which option or combination of options works best for them. No one knows better than those who are in the thick of it! Next time we will investigate a fascinating new technology called "consumer-side dynamic range processing." This technology may truly allow consumers to handle dynamic range, either by ignoring it and letting it be controlled for them, or by actively deciding what is appropriate for them. Sound familiar? You might be surprised by some of the details.