Do the Right HD Thing: Drop the Numerology

You might not have noticed that, for technologists, we’re awfully superstitious. For instance, there ain’t any D-4 videotape format on account of the number four being associated with death in parts of Asia that buy lots of TV technology.

So SMPTE, the international, accredited, due process, engineering standards organization, skipped D-4 to accommodate superstitions. I ain’t making this up. I kind of wish I was.

Maybe you’ve seen Fujinon’s 88x8.8 lens. Now then, part of the reason they went to those numbers from 87x9.3 is just progress, like going from 22x7.8 to 23x7.6. But another reason is that, unlike four, eight is a lucky number, and 88 is extra lucky. It ain’t by accident the Olympic Games in Beijing are starting at 08:08:08 on 08/08/08.

LUCKY NUMBERS

OK, so, if a big enough customer wants you to dress up in a clown suit and sing falsetto, I guess that’s what you do. That explains the absence of D-4 and the presence of 88x8.8, but what’s with the worship of numbers like 16x9, 1080, 1920, (SMPTE take note) MPEG-4, and, especially, 1280?

Allow me a moment or two on that last number for openers. It’s the number of active samples per line in 720p HDTV, which is the flavor favored by ABC, ESPN, Fox, and maybe the European Broadcasting Union, better known as the EBU (which sounds like a large, exotic bird to me).

Now then, there ain’t anything wrong with 720p as an HD format. It’s progressive, which I don’t think anyone in their right mind still denies is a good thing. And some EBU experiments show that, at low enough data rates after bit-rate reduction, 720p looks better than even the similarly progressive 1080p.

Those experiments were performed using cameras with 1920x1080 imagers, which meant the 720p signals were oversampled. Perchance you have heard of oversampling. RCA’s CCD-1 might have been about the last standard-definition video camera not to use oversampling (and it was the first broadcast chip camera).

If you ambled through a camera booth at any convention for the last 20 years or so, you probably saw signs saying that their imagers had some 520,000 pixels—maybe more. If a U.S. standard definition production aperture has a maximum of 486 active lines, you can do the math and come up with around 1,100 pixels per line.
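The math in question is a single division. Here is a minimal sketch of it in Python, assuming (as the paragraph does) that all of the advertised pixels sit in the 486 active lines. The function name is made up for illustration, and the comparison against 720 samples per line uses the width of the standard-definition digital production aperture.

```python
def photosites_per_line(total_pixels, active_lines):
    """Rough photosites per line, assuming every pixel lies in an active line."""
    return total_pixels / active_lines

total_pixels = 520_000   # the figure on the booth signs
active_lines = 486       # U.S. SD production aperture

per_line = photosites_per_line(total_pixels, active_lines)

# Oversampling factor relative to a 720-sample-per-line digital aperture.
oversampling = per_line / 720

print(f"about {per_line:.0f} photosites per line")
print(f"about {oversampling:.2f}x the 720 samples per line of the digital format")
```

The result lands a bit under 1,100, which is close enough for a trade-show sign.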

That’s for standard definition. So along comes HDTV, and a 1080-line camera has 1920x1080 chips. OK, so maybe it’s hard to do more. But, if you can do 1920x1080, why would a manufacturer put 1280x720 chips into a 720p camera?

Yes, the format is 1280x720, but the format for NTSC is 440x483, and even the digital production aperture is just 720x486. But standard-definition cameras have imagers with around 1,100 active photosites per line on account of oversampling makes pictures sharper. So we go to HD and we drop oversampling? Yeesh!

For the details of why oversampling makes sharper pictures, I refer you to the mathematical sinc function, also known as (sin x)/x, which contributes to the shape of the good old modulation transfer function curve. Sigh. Remember the good old days when every broadcast lens came with MTF curves? But I digress. The important thing is that a bunch of folks say visual sharpness is proportional to the square of the area under an MTF curve, so its shape does matter.
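For anyone who wants to poke at the sinc function directly, here is a toy sketch in Python. It models only the photosite aperture, assuming each photosite is as wide as the photosite pitch, and it ignores the lens, any optical low-pass filter, aliasing, and the downconversion filter, all of which shape a real MTF curve. The function names are hypothetical. It compares the area under that toy curve, out to the 640 cycles per picture width a 1280-sample format can carry, for a 1280-photosite imager and a 1920-photosite one.

```python
import math

def aperture_mtf(f, photosites):
    """Toy aperture MTF, |sin(x)/x| with x = pi * f / photosites.

    f is spatial frequency in cycles per picture width. Assumes a
    photosite as wide as the pitch; real imagers differ.
    """
    x = math.pi * f / photosites
    return 1.0 if x == 0 else abs(math.sin(x) / x)

def area_under_mtf(photosites, f_max, steps=10_000):
    """Trapezoid-rule area under the toy MTF from 0 to f_max."""
    df = f_max / steps
    total = 0.0
    for i in range(steps):
        a = aperture_mtf(i * df, photosites)
        b = aperture_mtf((i + 1) * df, photosites)
        total += 0.5 * (a + b) * df
    return total

# A 1280-sample line can carry at most 640 cycles per picture width.
native = area_under_mtf(1280, 640)       # 1280 photosites feeding 1280 samples
oversampled = area_under_mtf(1920, 640)  # 1920 photosites feeding 1280 samples

gain = oversampled / native
print(f"area ratio, oversampled to native: {gain:.3f}")
print(f"same ratio squared (the rule of thumb quoted above): {gain ** 2:.3f}")
```

Even in this stripped-down model, the oversampled imager holds its response up better across the frequencies the format can actually carry, which is the whole argument for oversampling.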

TWICE AS EFFICIENT?

So, a foolish consistency being the hobgoblin and all that, having just pointed out that more imager pixels are important, I’m now going to argue that fewer in the format ain’t terrible. It’s that same MTF curve. More pixels in the imager makes more area under the curve, which makes sharper pictures. But lopping off the tail of the curve, where the finest detail is, doesn’t lose you much area or sharpness.

That’s what Sony did in HDCAM, chopping luminance from 1920 to 1440 and chrominance down to 480. Then Panasonic did about the same in DVCPRO HD: 1280 and 640. You do lose some area, so the difference between HDCAM and HDCAM SR is perceptible, but it ain’t a lot.

Anyhow, that’s just resolution. I hear tell that MPEG-4 (which is used in Asia without anyone dropping dead from the curse of the numeral) is “twice as efficient” as MPEG-2.

Well, now, I’ll grant you that four is twice as big as two, but I have a really hard time translating that into efficiency. At 500 Mbps, am I going to see a difference in picture quality between color bars encoded as MPEG-2 versus some in MPEG-4? Methinks not. And, at the other end of the scale, I kind of doubt that an HDTV basketball game will be watchable at 10 kbps regardless of whether it was encoded as MPEG-2 or MPEG-4. So, just how do you prove that “twice as efficient” claim?

If you say I should measure something like PSNR, I paraphrase Hamlet and say, “Numbers, numbers, numbers.” We’re drowning in numbers.
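To see why a lone PSNR figure doesn't settle much, here is a small Python sketch. It uses the usual definition for 8-bit samples (10 times the base-10 log of peak squared over mean squared error) on a made-up flat gray "picture" with two made-up error patterns. None of it comes from any real codec test.

```python
import math

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

reference = [128] * 10_000   # hypothetical flat gray picture, 8-bit samples

# Error A: every sample is off by 2 codes.
spread_out = [s + 2 for s in reference]

# Error B: one sample in 100 is off by 20 codes; the rest are untouched.
concentrated = [s + 20 if i % 100 == 0 else s for i, s in enumerate(reference)]

psnr_a = psnr(reference, spread_out)
psnr_b = psnr(reference, concentrated)
print(f"spread-out error:   {psnr_a:.2f} dB")
print(f"concentrated error: {psnr_b:.2f} dB")
```

Both patterns have the same mean squared error, so they score the same PSNR, even though a slight overall shift and a sprinkling of bright specks are not the same thing to look at.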

WHAT REALLY COUNTS

Noise figures can’t be compared directly. Is it HD or SD? What is the spectral content of the noise? What do the pictures look like?

How about minimum illumination? I really like that on a camera’s spec sheet. They offer just a number in lux, like 0.001 lx. Now then, what lens was on the camera? What were its focal length and iris settings? What color was the light? What was the signal-to-noise ratio of the output? And just what did that minimum illumination do? Did it raise the video signal one quantum above black, or did it achieve full-scale video?

Then you’ve got your holy 16x9 aspect ratio, the most perfect imaginable. Too bad the guy who introduced it to HD acquisition gave a little-noticed presentation 10 years later explaining how he’d goofed.

Remember: You ain’t in the telephone book business. Numbers don’t count; pictures and sounds do.