Tim Carroll / 09.22.2004
Of Dolby, DVRs and Distant Signals
Let's start with some good news before we jump into the difficulties and complexities of television audio. I am happy to report that in August, Dolby Laboratories learned it had won a technical Emmy award for its LM100 Broadcast Loudness Meter.
This is a definite sign that the rest of the world is noticing that there is a real problem here, and that the first step toward solving it is being able to measure it.
So congratulations to everyone at Dolby, specifically Jeffrey Riedmiller (the well-known father of the LM100), Matthew Robinson, Marvin Pribadi, Charlie Robinson, Mark Vinton, Steve Lyman and of course Ken Gundry (who wrote the best set of articles explaining noise reduction I have ever read, and which I have re-read many times).
Also, congrats to the folks at Dolby who bring the products to life: from the PAS Engineering group to the gang in manufacturing down in Brisbane (the one slightly down from San Francisco; that is, not the Brisbane way, way down in Australia).
STORE AND FORWARD
Believe it or not, TiVo (and TiVo-like devices) and video-on-demand (VOD) are very closely related. In fact, I would go so far as to say that TiVo is the consumer version of the massive server systems used for VOD. Both have the same goal of delivering content when a consumer wishes to receive it, and the only major difference is the ability of VOD to serve more than one viewer. Still, they are very close.
These services are also a close cousin to the PBS and Fox transport/distribution model, in that all the encoding is done before the signal is distributed and stored. This model is particularly useful and cost-effective for VOD, because the whole transport stream is stored, then forwarded (on-demand); the MPEG video and Dolby Digital (AC-3) audio encoding only needs to be performed once. This is a gigantic cost-savings.
Think about a newly released film entering the VOD lineup on a typical Saturday night on a typical cable system. If the feature is popular, many, many different streams will be requested and delivered. This functions similarly to how people can visit a Web site at different times but see the same content.
If the entire transport stream were not available, each separate delivery would require an MPEG and Dolby Digital (AC-3) encoder. Not only would this make the system cost-prohibitive for the cable companies, it would very likely take so much rackspace, power and HVAC for the equipment that it just would not be physically possible.
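The economics above can be sketched with a toy model. All figures below are made-up assumptions for illustration, not real equipment prices; the point is only the shape of the two cost curves:

```python
# Toy cost model for VOD delivery: encode-once-and-store vs. encode-per-request.
# ENCODER_COST and STORAGE_COST are illustrative assumptions, not real prices.

ENCODER_COST = 50_000  # assumed cost of one real-time MPEG + AC-3 encoder pair
STORAGE_COST = 2_000   # assumed cost to store one encoded transport stream

def encode_per_request(simultaneous_streams: int) -> int:
    """Each simultaneous delivery needs its own real-time encoder pair."""
    return simultaneous_streams * ENCODER_COST

def store_and_forward(simultaneous_streams: int) -> int:
    """Encode once, store the transport stream, forward copies on demand."""
    return ENCODER_COST + STORAGE_COST  # flat, independent of viewer count

streams = 500  # a popular title on a typical Saturday night
print(encode_per_request(streams))  # grows linearly with viewers
print(store_and_forward(streams))   # same cost for 1 viewer or 10,000
```

The per-request figure also ignores the rack space, power and HVAC the article mentions, so if anything the toy model understates the gap.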
Transport stream technology to the rescue. Yes, there are certain complexities to this approach. As we have discussed, because the transport stream is carrying compressed audio and video signals, you cannot easily do "baseband" things to it like crossfades and voice-overs.
However, with VOD, there is little need for these types of operations. Simply playing out the stream and splicing it into other programming as gracefully as possible allows many simultaneous versions of the same program to be delivered, all with the capability for 5.1 channel audio.
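The splicing idea can be pictured with a toy model. The `Packet` class and the hard-cut `splice` below are illustrative only, not real MPEG-2 transport stream syntax; real splicers must also respect video GOP boundaries and audio frame boundaries:

```python
# Sketch: splicing two "transport streams" at packet boundaries.
# Packets carry opaque compressed payloads; we never decode them, so baseband
# operations like crossfades are impossible -- we can only cut between packets.

from dataclasses import dataclass

@dataclass
class Packet:
    program: str    # which program this packet belongs to
    payload: bytes  # opaque compressed audio/video data

def splice(interstitial: list[Packet], feature: list[Packet]) -> list[Packet]:
    """Play the interstitial to completion, then cut to the feature at a
    packet boundary. A hard cut: no decoding, no crossfade, no voice-over."""
    return interstitial + feature

promo = [Packet("promo", b"\x00") for _ in range(3)]
movie = [Packet("movie", b"\x01") for _ in range(5)]
out = splice(promo, movie)
print(len(out))                        # total packets delivered
print(out[2].program, out[3].program)  # the splice point
```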
I remember the first TiVo boxes that became available when DirecTV began carrying programming with Dolby Digital (AC-3) 5.1 channel audio. They would simply record the stereo downmixed audio and the video; certainly acceptable, but not the best capture method for a program.
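For reference, the stereo downmix an AC-3 decoder derives from a 5.1 channel program is a simple weighted sum. The sketch below assumes the common -3 dB center and surround mix levels; a real decoder takes these coefficients (cmixlev/surmixlev) from the bitstream metadata, and the LFE channel is conventionally dropped from the two-channel downmix:

```python
import math

# Sketch of an Lo/Ro stereo downmix from 5.1, as an AC-3 decoder might derive
# it. The -3 dB (1/sqrt(2)) center and surround levels are common defaults;
# actual values come from the cmixlev/surmixlev metadata in the bitstream.

CLEV = 1 / math.sqrt(2)  # center mix level, -3 dB
SLEV = 1 / math.sqrt(2)  # surround mix level, -3 dB

def downmix(L, R, C, LFE, Ls, Rs):
    lo = L + CLEV * C + SLEV * Ls  # LFE is omitted by convention
    ro = R + CLEV * C + SLEV * Rs
    return lo, ro

lo, ro = downmix(L=1.0, R=0.0, C=1.0, LFE=1.0, Ls=0.0, Rs=1.0)
print(round(lo, 3), round(ro, 3))
```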
The problem was the lack of access to the nondecoded bitstreams, which required the external TiVo-type recorders to recompress the incoming audio and video streams (bad on many levels). The solution appeared with the integrated receiver and recorder, as now the transport stream could be recorded and decoded upon playback. This simplified things immensely because no real-time encoder was necessary for the satellite channels, and because the whole transport stream was stored, Dolby Digital (AC-3) could come along for the ride.
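The architectural difference can be sketched as follows. The classes, the stand-in decoder and the "gen_loss" marker are illustrative, not actual TiVo internals; the point is where decoding happens in each design:

```python
# Sketch contrasting the two DVR architectures described above.

def decode(stream: str) -> str:
    return stream  # stand-in for a real MPEG/AC-3 decoder

def reencode(baseband: str) -> str:
    return baseband + "+gen_loss"  # lossy re-compression adds a generation

class ExternalRecorder:
    """Early external box: must decode and re-compress the incoming signal."""
    def record(self, ts: str) -> None:
        self.disk = reencode(decode(ts))
    def play(self) -> str:
        return decode(self.disk)

class IntegratedRecorder:
    """Integrated receiver/recorder: stores the transport stream verbatim,
    so the Dolby Digital (AC-3) track comes along for the ride untouched."""
    def record(self, ts: str) -> None:
        self.disk = ts  # no real-time re-encode needed
    def play(self) -> str:
        return decode(self.disk)  # decoding happens only at playback

ts = "mpeg_video+ac3_5.1"
ext, integ = ExternalRecorder(), IntegratedRecorder()
ext.record(ts); integ.record(ts)
print(ext.play())    # an extra compression generation
print(integ.play())  # the original transport stream back out
```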
One additional point becomes apparent. These services are further proof that encoding content into a transport stream as early as possible is a good idea and one that brings with it great efficiencies. While this could introduce additional complexities, I am confident that with all the benefits of this model, these will be worked through quickly.
DISTANT SIGNALS
Did you happen to catch the Olympics broadcast on NBC? In my opinion, this was one fantastic job. High-definition video and multichannel audio are hard enough to do locally, but to successfully pull off a production of this magnitude that far away is a testament to the many visionary professionals involved.
Yes, there were some minor glitches, but that happens regularly with NTSC, a format that has been around for decades, so let's be a bit more patient with a format approved only in 1995. It was also nice to see NBC finally go from very little HD content to HD plus 5.1 channel audio for days at a time. This hopefully bodes well for the new primetime season, but we will have to wait and see. As I keep telling Jim Starzynski, I just cannot wait to hear "Saturday Night Live" in 5.1! If nothing else, the whole Olympics production is a good response to the shrinking group of naysayers who just cannot bring themselves to accept that HD and 5.1 channel audio go hand-in-hand and are what consumers are demanding.
ATSC AUDIO DOCUMENT
We started with a happy topic, so why not end with one? The ATSC has signed off on the document officially called "IS/318 Multichannel Audio Program Delivery and Metadata Considerations (Pre-Emission)." Composed by the Audio Section of the Systems Evaluation Working Group (SEWG), this document began almost three years ago and covers multichannel audio and metadata issues from production through storage, routing and distribution, right up to the input terminals of the emission encoder (i.e., the Dolby Digital AC-3 encoder). I have mentioned this document a few times in past articles, and it is finally public.
As of this writing, the exact download location was not yet known, but please drop me a note and I will happily forward a copy to you. I will report the address next month. On the ATSC topic, it seems that activity is winding down somewhat, which is expected, but a little disappointing. I suppose it is time to get down to making all this stuff actually perform in the real world. There are one or two more audio-related documents in process with the SEWG, dealing specifically with loudness and lip-sync, that may yet see the light of day. I'll keep you posted.
Next time, we will look at how audio is being handled in the new fall season. There are lots of new infrastructures with all the requisite capabilities, and programs being produced with proper audio and hopefully metadata, but how does it sound in the real world? I will also share some feedback from DTV stations located between the two coasts to see if there is a difference in other markets.
We will also investigate the interesting and somewhat controversial audio technology called upmixing (i.e., two-channel to 5.1 channel synthesis) and explore the upsides and the caveats.
Until then, please keep the great comments coming, and any suggestions for specific areas that you think should be covered (or re-covered) would be more than welcomed. As always, thanks for your time!