What Do We Really Mean by Hi-def?

A couple of months ago, I delivered a public rant about the present-day deficiencies of HD television at a professional audio conference (the Parsons Audio Expo) here in Boston. I ranted about the same things I've been writing about in TV Technology for the last six months or so. As I prepared for that rant, it occurred to me that part of the problem we face has to do with the gap between what we production types mean by “high definition” and what our beloved audience expects.

It occurred to me to try, for the purpose of the talk, to come up with a general definition or standard that would work for both parties, so that if we adhered to that standard in our production and transmission efforts, our audiences would have their expectations reasonably fulfilled.

I've already taken you through the sad litany of deficiencies that we experience with the system. I've also noted that we're all contributors to the problem. We program producers, TV broadcast networks, cable and satellite service providers, and viewers all share in enabling the frustrating failure to make HD television viable today.

The gap between the HD television we deliver and the HD television that our viewers perceive lies in the difference in the way we and they think about it. We production types consider HD to be an objective production definition of a certain level of picture resolution (1,080 lines) and audio (5.1 channels of AAC-treated digital audio).

At the same time, our audience's subjective definition of HD, based in large part on how we've promoted it, is that it is a seamless, engaging, exciting and you-are-there sensory experience equivalent to what they can experience in good movie theaters.

The trick to bridging that gap, I think, is to come up with an overarching objective standard that reasonably encompasses our current objective standards while also giving our viewers what they subjectively desire and expect. After some thought, I came up with a specific minimum resolution standard that I think fills the bill pretty nicely, if we apply it to all aspects of our production. That standard turns out to be surpassingly simple, even modest: all magnitude aspects of production shall be accurate to a resolution of at least 60 dB, or 1,000:1.

Decibels are simply a logarithmic way of expressing ratios. A 1 dB increase in the magnitude of any normal quantity is an increase of about 12 percent. An increase of 60 dB is a one-thousandfold increase of magnitude (I know I'm skipping some important background stuff about dB here, so humor me, OK?).

A high-definition picture with 1,080 lines slightly exceeds that 60 dB specification. An audio channel with a frequency response of 20 Hz to 20 kHz (1,000:1) and a signal-to-noise ratio of more than 60 dB also meets that specification.
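For readers who want to check the decibel arithmetic above, here is a minimal sketch. It assumes the amplitude convention of 20 times the base-10 logarithm of the ratio, which is the one that matches the figures quoted here (about 12 percent per dB, 1,000:1 at 60 dB). The function names are made up for illustration.

```python
import math

def ratio_to_db(ratio):
    """Convert a magnitude ratio to decibels (assumed 20*log10 convention)."""
    return 20 * math.log10(ratio)

def db_to_ratio(db):
    """Convert decibels back to a magnitude ratio (same assumed convention)."""
    return 10 ** (db / 20)

one_db_ratio = db_to_ratio(1)      # roughly 1.12, i.e. about a 12 percent increase
sixty_db_ratio = db_to_ratio(60)   # a one-thousandfold increase
lines_db = ratio_to_db(1080)       # 1,080 lines expressed in dB, a little over 60
audio_band_ratio = 20_000 / 20     # 20 Hz to 20 kHz as a ratio

print(f"1 dB   -> ratio {one_db_ratio:.3f}")
print(f"60 dB  -> ratio {sixty_db_ratio:.0f}")
print(f"1,080 lines -> {lines_db:.2f} dB")
print(f"20 Hz to 20 kHz -> {audio_band_ratio:.0f}:1")
```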

So, we can say that our current visual and sonic HD specifications, when we meet them, satisfy my 60 dB standard. The problem is, we don't meet them very consistently, and that's where the audience's expectation is thwarted. So let's consider the implication of a 60 dB or 1,000:1 or 99.9 percent resolution for the magnitude of time. What would it mean?

ALL THE TIME

It would mean that we would have to meet our picture and audio standards 99.9 percent of the time, or for approximately 59 minutes and 56 seconds out of each hour. We could have a total time error of no more than 3.6 seconds (one-thousandth of an hour). At present, we're miles away from that level of consistency over time, especially in broadcasting.
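The hourly budget works out as follows. This is only a sketch of the arithmetic in the paragraph above; the function name and the 20*log10 reading of "60 dB" are assumptions for illustration.

```python
def error_budget_seconds(duration_s, resolution_db=60):
    """Seconds of allowed error in a program of the given length.

    Assumes the allowed error fraction is the inverse of the dB ratio,
    so 60 dB means 1 part in 1,000.
    """
    allowed_fraction = 1 / 10 ** (resolution_db / 20)
    return duration_s * allowed_fraction

HOUR_S = 3600
budget_s = error_budget_seconds(HOUR_S)       # one-thousandth of an hour
good_minutes, good_seconds = divmod(HOUR_S - budget_s, 60)

print(f"Error budget per hour: {budget_s:.1f} s")
print(f"Required good time: {int(good_minutes)} min {good_seconds:.1f} s")
```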

Consider the problem of lip-sync error. To meet our 3.6 second-per-hour error budget, we could have only about 10 seconds' worth of 10-frame (ca. 330 ms) lip-sync error, if everything else was perfect. We could be off by one frame for about 100 seconds, or a little over a minute and a half, in every hour.

Right now, my guess is that the average lip-sync error, full time, is greater than one frame (which we regard as unperceivable, and that's the point, after all). One frame is about 33 ms, or approximately 3.3 percent of every second, which is 33 times the 60 dB standard of 0.1 percent.
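The lip-sync figures can be reproduced with a short calculation. This sketch assumes a nominal 30 frames per second (so one frame is about 33 ms) and charges each second of program with the offset as a fraction of that second, which is how the numbers above appear to be derived. The names are hypothetical. Under those assumptions the unrounded results are a little higher than the round numbers in the text (about 10.8 and 108 seconds rather than 10 and 100).

```python
FRAME_RATE = 30.0            # assumed nominal frame rate, frames per second
FRAME_S = 1 / FRAME_RATE     # duration of one frame, about 33 ms
BUDGET_S = 3.6               # allowed error per hour at 0.1 percent
STANDARD_FRACTION = 0.001    # the 60 dB standard, 1 part in 1,000

def allowed_seconds(offset_frames):
    """Seconds per hour that may carry a lip-sync offset of this many frames.

    Each second of program is charged offset_frames * FRAME_S seconds
    of error against the hourly budget.
    """
    return BUDGET_S / (offset_frames * FRAME_S)

ten_frame_allowance = allowed_seconds(10)   # seconds allowed at a 10-frame offset
one_frame_allowance = allowed_seconds(1)    # seconds allowed at a 1-frame offset

# A constant one-frame offset, expressed as a fraction of every second,
# compared against the 0.1 percent standard.
full_time_fraction = FRAME_S
times_over_standard = full_time_fraction / STANDARD_FRACTION

print(f"10-frame offset allowed for {ten_frame_allowance:.1f} s per hour")
print(f"1-frame offset allowed for {one_frame_allowance:.1f} s per hour")
print(f"Full-time 1-frame offset: {full_time_fraction:.1%}, "
      f"{times_over_standard:.0f} times the standard")
```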

So, just the current lip-sync error would use up all of our time-error budget and then some, just for a trivial audio problem. That means we would have to have zero time tolerance for picture tiling or other signal resolution artifacts/errors.

And that, campers, is what high resolution really means: very few errors, very little of the time.

Unfortunately, the transmission chain and the various cable/satellite service providers add to the problem. DirecTV, for instance, takes about five seconds to change channels (with particularly unpleasant audio artifacts during the change), which definitely isn't HD. It uses up all of our time-error budget just to select a new signal.

Also, there are similar, ongoing audible time errors and corrections during program transitions on several network feeds. These just make it worse. As I said, we are a long way away from being able to sustain HD for any reasonable length of time, certainly not a length of time that could qualify as HD time.

As I described earlier, we have led our audiences to expect an extremely life-like and realistic sensory experience. I called it seamless, engaging, exciting—you-are-there. I think it's the seamless aspect that gets us here. Every time there is a perceivable error, it distracts us from that experience and reminds our audience that this is only an illusion, definitely not the real thing. It is not enough to have high resolution some of the time—we need to sustain it for the length of the production.

Hence my suggested 60 dB standard. No error shall be greater than 0.1 percent or last for more than 0.1 percent of the time. If and when we can deliver this to our viewers (as we mostly do with the DVDs that we rent or sell to them), they will almost certainly find the experience to be seamless and exciting. They also may very well feel drawn into the production to a point where, well, they are there. Ah, truth in advertising. It's a beautiful thing!

Thanks for listening.