
5.1 Takes More Than Serendipity

The Beijing Olympics provided a lavish laboratory for creating 5.1 surround sound. The Games ultimately demonstrated that uniform 5.1 audio could be achieved across multiple venues and transmitted around the world.

“There were a few bumps upfront, but we got it working,” said Tim Carroll, president of Linear Acoustic in Lancaster, Pa. He was in Beijing for most of August, when the Games were held. “The ATSC system works as advertised. It proved that the tools are there.”

The architecture and workflow of audio for an event like the Olympics is massive. More than 2,500 microphones were employed throughout 12 venues, some with several simultaneous activities. The uptake was mixed on flypacked Calrec and DiGiCo consoles at the venues, transmitted in HD-SDI to the International Broadcast Center in Beijing for more processing, then sent via satellite in an MPEG stream to NBC’s New York operations center at 30 Rockefeller Center, which in turn distributed it as downmixed stereo or 5.1 to affiliate stations that delivered it to home receivers or cable headends.


Each link in that workflow chain introduces the potential for error, but what made the Olympics a 5.1 phenomenon begins with the technical design of digital television itself.

With analog signals, audio and video are essentially transmitted as one. They are split apart in the digital format, creating an inherent complication for lip sync: matching the audio to the video at the receiver. When audio is late, a show resembles an old Japanese Godzilla movie dubbed into English. When audio is early, the result is slightly mind-bending, chasing viewers away even more than the Godzilla effect. Either problem can plague digital stereo or 5.1 equally.

Bill Chase is a producer at HBO who worked on that network’s first live event done in 5.1 surround sound—a Britney Spears concert 10 years ago. “That was one of the issues with the encryption gear,” he said. “When it meets up at the other end, it’s got to be in sync. We had to come up with encryption equipment that could handle 5.1.”


Lip sync is an issue at the station level as well. Morris Pollock is chief engineer at WSFA-TV, a Raycom-owned NBC affiliate in Montgomery, Ala.

“Since there is no standard to keep the two together through a signal path we, like every other station, have our struggles with lip-sync problems,” he said. “There are better measurement products now available to measure audio delay and help determine exactly where the problems are, and Raycom is evaluating several of these. It’s something we have to keep watching for.”

At least one DTV expert believes the lip-sync problem lies within the MPEG transport stream specification, or rather in what the specification leaves out.

“There’s no requirement for the decoder to pay attention to what’s in the systems layer,” he said. “People have assumed the stream will stay in sync and never jump.”
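For readers who want to see what the systems layer offers, here is a minimal Python sketch of how a lip-sync offset could be read from presentation time stamps (PTS). MPEG-2 systems time stamps count ticks of a 90 kHz clock in a 33-bit field that wraps around. The function name and the sample values are hypothetical, and the sketch assumes the two time stamps belong to audio and video captured at the same instant, such as a clap-and-flash test pattern.

```python
PTS_CLOCK_HZ = 90_000      # MPEG-2 systems PTS tick rate
PTS_MODULUS = 1 << 33      # PTS is a 33-bit counter that wraps around


def av_offset_ms(audio_pts: int, video_pts: int) -> float:
    """Signed audio-minus-video offset in milliseconds.

    Assumes both time stamps mark material captured at the same instant.
    Positive means the audio would be presented late, negative means early.
    """
    diff = (audio_pts - video_pts) % PTS_MODULUS
    # Map the wrapped difference into the signed range around zero.
    if diff >= PTS_MODULUS // 2:
        diff -= PTS_MODULUS
    return diff * 1000 / PTS_CLOCK_HZ


if __name__ == "__main__":
    print(av_offset_ms(9000, 0))   # audio 9,000 ticks behind video
    print(av_offset_ms(0, 4500))   # audio 4,500 ticks ahead of video
```

A decoder that ignores these values has no way to notice when the two streams drift or jump, which is the gap the expert describes.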

Splitting the audio into six channels for 5.1 doesn’t necessarily complicate lip sync, but both raise questions about what home receivers do with the signal. Once the audio reaches a home, there’s no way to know what’s decoding it, Chase said.

“Even if John Doe has a home theater, who knows if it’s set up right?” he said. “It’s difficult to know what type of equipment they’re using.”

Another issue for doing live 5.1 is quality control, Chase said.

“Are you getting quality control right, so it’s leaving us OK, and who’s listening on that end to make sure it’s working,” he said. “The biggest danger... is you don’t want to bury your center channel. That’s always a big problem, because you don’t want to phase, or have your side channels overpower. You could lose dialogue.”


The center-channel issue appeared to come up during the opening ceremonies of the Olympics, when the commentators’ voices seemed to get lost in the mix. Carroll said it was more a matter of dynamic range, where the ads came in much louder than the program. Variations in loudness between programs and commercials can be extreme. Producers tend to mix for dramatic effect; advertisers, to get attention. So when a whispered death scene bumps into a caped crusader hawking used cars, there’s generally a difference of a few decibels.

Photo caption: Linear Acoustic President Tim Carroll, soaking wet in the Bird’s Nest after heavy rains hit during preparation for the Olympics.

The Federal Communications Commission requires broadcasters to even out audio peaks and valleys, a concept known as “dialogue normalization,” or dialnorm. The FCC does not set a specific value for dialnorm, but has directed broadcasters to figure it out. They are trying.

“You can do it one of two ways,” Carroll said. “You can author metadata that matches the programming, or do what NBC does with house metadata.” NBC has created an operations center that can handle metadata. That way, instead of constantly adjusting content loudness at every step of the workflow, the network relies on decoders to do it.
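The decoder-side arithmetic can be sketched in a few lines of Python. In Dolby Digital (AC-3), the audio format used by ATSC, a dialnorm value from 1 to 31 signals an average dialogue level of -1 to -31 dBFS, and the decoder attenuates by 31 minus that value so dialogue lands at a common level. The function names and sample values below are hypothetical and illustrate the arithmetic only, not NBC's actual settings.

```python
def dialnorm_attenuation_db(dialnorm: int) -> int:
    """Attenuation in dB that a decoder applies for an AC-3 dialnorm value.

    A value of N (1..31) declares average dialogue at -N dBFS; the decoder
    turns the program down so that dialogue ends up at -31 dBFS.
    """
    if not 1 <= dialnorm <= 31:
        raise ValueError("this sketch only accepts dialnorm values 1..31")
    return 31 - dialnorm


def apply_dialnorm(samples, dialnorm: int):
    """Scale linear PCM samples by the attenuation implied by dialnorm."""
    gain = 10 ** (-dialnorm_attenuation_db(dialnorm) / 20)
    return [s * gain for s in samples]


if __name__ == "__main__":
    # Hypothetical values: a quiet drama tagged 27 and a loud ad tagged 21.
    print(dialnorm_attenuation_db(27))  # drama is turned down less
    print(dialnorm_attenuation_db(21))  # ad is turned down more
```

With house metadata, the tag is fixed and the content is mixed to match it; with authored metadata, the tag is measured per program. Either way the number only helps if it matches what is in the audio.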

“At the Olympics,” Carroll said, “we didn’t have to worry about metadata, but just stuck to mixing audio.”

Other networks ride the audio. Loudness meters help, but silent periods need to be gated out of the mix so the volume isn’t ratcheted down too far. Cable carriage adds another fly in the audio ointment for TV stations.
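Why gating matters is easy to show with a simplified Python sketch. It averages signal power over fixed blocks and optionally ignores blocks that fall below a threshold. This is an illustration only: it applies no frequency weighting, it is not the ITU-R BS.1770 algorithm, and the function names, block size and -60 dB gate are hypothetical choices.

```python
import math


def block_powers(samples, block_size):
    """Mean-square power of each complete block of samples."""
    return [
        sum(s * s for s in samples[i:i + block_size]) / block_size
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]


def to_db(power):
    """Convert a mean-square power to dB relative to full scale."""
    return 10 * math.log10(power) if power > 0 else float("-inf")


def average_level_db(samples, block_size, gate_db=None):
    """Average level in dB, skipping blocks quieter than gate_db if given."""
    powers = block_powers(samples, block_size)
    if gate_db is not None:
        powers = [p for p in powers if to_db(p) >= gate_db]
    if not powers:
        return float("-inf")
    return to_db(sum(powers) / len(powers))


if __name__ == "__main__":
    # Hypothetical program: a short stretch of steady sound, then silence.
    program = [0.1] * 400 + [0.0] * 1200
    print(average_level_db(program, 100))               # silence drags the reading down
    print(average_level_db(program, 100, gate_db=-60))  # gated reading reflects the sound itself
```

In this example the ungated reading comes out about 6 dB below the gated one, even though the audible portion never changed.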

Rick Grinstead of KVUE-TV, Belo’s ABC affiliate in Austin, Texas, describes how cable distribution affects his station’s audio: “Frankly we are having issues with some cable boxes that we are pretty sure is caused by dialnorm,” he said. “A lot of cable boxes have dynamic range limiters built in them to help with levelizing the audio levels between channels. The boxes usually have the ability to change from wide, normal, and narrow, with narrow being the most compressed. We have discovered that the cable boxes in our market are defaulted to narrow. We are running an audio levelizer, basically a limiter/compressor, on our DTV channel, which appears to be fighting the levelizer in the cable boxes, resulting in the audio level constantly fluctuating on our HD cable channel. It’s been a real pain to troubleshoot this problem.”

POST 2/17/09

The cable problem could potentially worsen come Feb. 17, 2009, when analog broadcast signals end and the sole feed is digital.

“They’ll take the DTV signal and downconvert it, and the audio will be a downmixed output,” Carroll said. “That signal that stations haven’t been watching as closely as the cash cow signal... that DTV signal will be it. Any station not taking the same care of their DTV audio signal as their NTSC is going to get hit.”
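What a downmixed output means can be sketched in Python. The example folds one frame of 5.1 audio into two channels using the widely used -3 dB weighting for the center and surround channels and discarding the LFE channel. The coefficients are an assumption for illustration, since real decoders take them from the stream's metadata and may also scale the result to prevent overload, and the function name is hypothetical.

```python
import math

# Assumed -3 dB mix levels; real decoders read these from stream metadata.
CENTER_GAIN = 1 / math.sqrt(2)
SURROUND_GAIN = 1 / math.sqrt(2)


def downmix_to_stereo(l, r, c, ls, rs, lfe=0.0):
    """Fold one frame of 5.1 samples into a (left, right) stereo pair.

    The LFE channel is accepted but deliberately left out of the mix.
    """
    left = l + CENTER_GAIN * c + SURROUND_GAIN * ls
    right = r + CENTER_GAIN * c + SURROUND_GAIN * rs
    return left, right


if __name__ == "__main__":
    # Dialogue only in the center channel ends up equally in both outputs.
    print(downmix_to_stereo(0.0, 0.0, 1.0, 0.0, 0.0))
```

Because the center channel is shared between both outputs at reduced level, loud side or surround content competes directly with dialogue once everything is folded into two channels.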

Many stations are passing through network 5.1 and upmixing their own content to match, something Bob Orban warns against. Orban is the founder of an eponymous audio equipment maker in Tempe, Ariz. He chose not to make an upmixer, citing an Audio Engineering Society white paper titled “Perceptual Evaluation of Algorithms for Blind Upmixes.”

“For listeners in the center stereo ‘sweet spot,’ none of the upmixing schemes was rated significantly better than the original two-channel stereo input, and most of the schemes were rated worse,” Orban said.

“Upmixes were only preferred if listeners were located away from the sweet spot. Because the program material was music, the results may not apply to typical television programming,” he said. “Nevertheless, we have heard some extremely weird and unpleasant-sounding upmixes on the air. Certainly, dialog emerging from what sounds like 16-foot wide heads wrapped halfway around the living room should concern any broadcaster who does not want to give viewers an excuse to hit the ‘mute’ button or to fast-forward the program material.”


Plenty of issues remain to be ironed out in DTV, but at least the Olympics proved that transmitting synched-up, dialnormed 5.1 surround sound can be done. “From the mixing console, all the way back to New York and to the consumers... was calibrated,” Carroll said. “The loudness meter in basketball correlated with the meters in Beijing, at 30 Rock, and at the affiliate stations. We had a return path from WNBC-TV in New York, so we saw exactly what consumers were seeing. It was mostly belt-and-suspenders, but it proved, with proper calibration and a common measurement method, where it’s putting out not a meter reading but an actual number, we could make it consistent.”