I'm Mad as Hell About Bad A/V Sync

Something like a year ago I went off in this column about out-of-sync audio and video. It seemed to me that the price of going digital was that many on-camera speaking appearances looked like badly dubbed foreign films. I thought that diatribe had sufficiently relieved the pressure in my spleen, even if it hadn't done any good.

That was until I went to this year's Consumer Electronics Show.

I attended a press preview for one of the world's largest consumer electronics companies, which also happens to be one of the largest broadcast equipment makers. There were so many press people at the massive booth that many of us were seated in separate areas, out of eyesight of the podium and presenters. But that's okay, because we had a big screen video feed and loudspeakers.

As I watched this live presentation, it was obvious that the audio was about 10 frames ahead of the video. (I started my career shooting and editing film, and I know 10 frames when I see it.)

This wasn't a complicated video chain running through the routing switcher a number of times. They weren't doing complicated effects that require the video to be delayed and delayed. This was a simple presentation. And it was simply wrong.

What made me maddest was that the sponsors of the event didn't care. At least they didn't care enough. This problem was fixable, and they didn't have it fixed.


I've timed my eruption for this issue for a purpose. NAB is just around the corner, and the answers to A/V sync are there.

Let me review the problem. When you watch the person next to you talk, both the sound and the movement of their mouth making the sound happen at the same time. When they make the sound of the letter "p," their lips purse at the same time the "puh" sound is made. When they make the sound of the letter "t," their tongue leaves their palate at the same time the "tuh" sound is made.

This doesn't just happen with speech. When we watch someone chop a log, at the same time the axe hits the wood, we hear the impact.

Watching and listening, we're trained from birth to expect what we see and what we hear to be in sync. To the extent our real-world experience ever lets those two get out of sync, the audio lags the video because sound travels far more slowly through air than light does. Move that wood chopper a quarter mile away, and you'll see the axe hit the log about a second before you hear the sound.
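For a sense of scale, the wood-chopper lag can be worked out with simple arithmetic. The sketch below rests on assumptions of mine: sound moving at roughly 343 meters per second (dry air at about 20 degrees C) and a 29.97 frames-per-second video rate for the frame conversion. Light's travel time is ignored because it is negligible over a quarter mile.

```python
# Rough arithmetic only. The speed of sound and the frame rate are
# assumed values chosen for illustration.
SPEED_OF_SOUND_M_PER_S = 343.0   # dry air at about 20 C (assumption)
METERS_PER_MILE = 1609.344
NTSC_FPS = 30000 / 1001          # about 29.97 frames per second (assumption)


def acoustic_lag_seconds(distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Time sound takes to cover distance_m; light's travel time is ignored."""
    return distance_m / speed


def seconds_to_frames(seconds, fps=NTSC_FPS):
    """Express a duration as a number of video frames."""
    return seconds * fps


def distance_for_frames(frames, fps=NTSC_FPS, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance at which natural acoustic lag equals the given frame count."""
    return frames / fps * speed


quarter_mile_m = 0.25 * METERS_PER_MILE
lag_s = acoustic_lag_seconds(quarter_mile_m)
lag_frames = seconds_to_frames(lag_s)

print(f"Quarter mile: {lag_s:.2f} s of lag, about {lag_frames:.0f} frames")
print(f"One frame of lag occurs at about {distance_for_frames(1):.1f} m")
```

Under these assumptions, a single frame of natural audio lag corresponds to a listener standing only a few meters from the source, which is an everyday situation.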

This would explain some research I've read that found a video or film viewer perceives lip sync to be correct when audio is exactly in sync with video, or where the audio slightly (a frame or two) lags video.

However, the nature of the sync problems TV is having is just the opposite: The video lags the audio. That's because of digital processing.

There's a long and a short reason for this; you'll be happy to know that I'm only able to recite the short one: The process of digitally encoding and decoding video is more complicated than for audio, and so it takes longer. Run a signal through a number of encoding and decoding processes, where it always takes video longer to make the trip than audio, and the problem gets worse.
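To show how that accumulation works, here is a minimal sketch. Every stage name and latency figure in it is hypothetical, invented purely for illustration and not measured from any real equipment. The only point is that when each stage delays video more than audio, the differences add up.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One processing step in the chain. All values here are hypothetical."""
    name: str
    video_ms: float  # delay the stage adds to video
    audio_ms: float  # delay the stage adds to audio


def cumulative_skew(stages):
    """Return (stage name, running skew in ms) after each stage.

    A positive skew means video is running behind audio.
    """
    skew = 0.0
    result = []
    for stage in stages:
        skew += stage.video_ms - stage.audio_ms
        result.append((stage.name, skew))
    return result


# Made-up example chain; none of these numbers come from real gear.
chain = [
    Stage("field encoder", 120, 20),
    Stage("studio decoder", 80, 10),
    Stage("plant re-encode", 150, 20),
    Stage("viewer's decoder", 90, 10),
]

for name, skew in cumulative_skew(chain):
    print(f"after {name}: video trails audio by {skew:.0f} ms")
```

With these invented numbers, no single stage looks alarming, but the total at the end of the chain is several times larger than any one stage's contribution.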


I remember watching a major local news story on TV a year or so ago, where the cable news channel was taking the feed directly from the local TV station that was also running it. The two operations are located in the same building. Lip sync on the TV station was correct; on the cable channel it was badly out of sync.

I'm surprised that news directors aren't screaming about this problem, because the end effect of bad lip sync is that it gets in the way of communication. If viewers are watching a badly out-of-sync field report and thinking in the back of their minds "hey, this isn't right," that's taking away from their understanding of the report itself.

Several years ago, Tektronix offered its AVDC100 audio-to-video delay corrector, which placed a watermark in the video that was keyed to the audio but visually imperceptible. If the audio and video were in sync at the time of the watermarking, similar devices placed downstream in the plant would delay the audio to bring the two back in sync.
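The corrective step itself, delaying the audio by the amount the video has fallen behind, is simple once the offset is known. The sketch below is not how the Tektronix unit works internally. It assumes the offset has already been measured somehow, assumes 48 kHz audio, and treats the audio as a plain list of samples. The function names are my own.

```python
def ms_to_samples(delay_ms, sample_rate_hz=48000):
    """Convert a delay in milliseconds to a whole number of audio samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def delay_audio(samples, delay_ms, sample_rate_hz=48000):
    """Delay a clip by prepending silence (zeros).

    delay_ms is the measured amount by which video trails audio. Measuring
    it is the hard part and is assumed to be done elsewhere.
    """
    pad = ms_to_samples(delay_ms, sample_rate_hz)
    if pad < 0:
        raise ValueError("audio can only be delayed, not advanced")
    return [0] * pad + list(samples)


# Hypothetical case: video arrives 100 ms behind audio at 48 kHz.
print(ms_to_samples(100))  # number of samples of delay to insert
```

Delaying the audio is the natural direction for the fix, since a signal that has already arrived can be held back but one that is late cannot be hurried along.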

I've read through product listings for NAB2003 and found offerings from Leitch, Television Equipment Associates/BAL Broadcast and Vistek that talk about solving the A/V sync problem.

I'm the first to admit I'm not technologically smart enough to tell you how to put this together, and I'm not building a facility right now. But some of you are or will be, and the problem won't get solved unless you insist that it be.

Declare war on lip sync problems and get them fixed.