HD conversion products: The big picture on small pixels

The picture is not improved by upconverting a 525 signal to HDTV for DTV transmission.

For many years consultants, equipment designers, production professionals and system planners have mused about how to effectively manage the transition from our low-resolution 525/625 television systems. In conversion matters, the image format issues seem to garner the most interest. People would rather discuss conversion of line rates and pixel counts than more esoteric issues like colorimetry. My guess is that much of the arcane but critical information one must consider in conversion is not well understood by most professionals in the industry.

There are a number of technical parameters that are routinely modified in conversions between standard-definition and high-definition video formats. They include issues of technical format (analog or digital), captured image format (frame rate, interlace or progressive scanning, active picture sampling, aspect ratio), and ancillary information like embedded audio. Any geometric conversion must also take into account the effect, in the spatial and temporal domains, on the artistic intent of the original image maker.

Conversion between pixel sampling maps has become rather routine. For clarity, those include conversions from SD to HD, the inverse of that conversion, and conversions between the various HD formats. The sampling and filtering science involved first showed up in professional products over two decades ago, with standards conversion for 525 and 625 applications and digital video effects. In the HD domain the conversion might be between 1080p and 720p pictures. That would entail a filter and re-sampling operation on a picture that originally was 1920×1080 samples to create one that is 1280×720 samples. If static, the conversion is not particularly difficult. However, if one considers frame rate, the murk increases.
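
As a rough illustration of the 1080-to-720 re-sampling geometry described above, the sketch below works out where each output sample falls on the source grid. The function names and the centre-aligned grid convention are assumptions made for the example, not something a particular converter is known to use.

```python
from fractions import Fraction

def scale_factors(src, dst):
    """Return (horizontal, vertical) source samples per output sample."""
    return Fraction(src[0], dst[0]), Fraction(src[1], dst[1])

def source_positions(n_out, step):
    """Position on the source grid of each output sample.

    Assumes the centres of the two grids are aligned (an illustrative choice).
    """
    half = Fraction(1, 2)
    return [(i + half) * step - half for i in range(n_out)]

# 1920x1080 down to 1280x720: the ratio is 3:2 in both directions.
h_step, v_step = scale_factors((1920, 1080), (1280, 720))

# Where the first four output lines land between source lines.
first_rows = source_positions(4, v_step)

# Fractional part of each position: the filter "phase" for that output line.
phases = [p % 1 for p in first_rows]

print(h_step, v_step)   # 3/2 3/2
print(first_rows)       # positions 1/4, 7/4, 13/4, 19/4
print(phases)           # phases alternate 1/4, 3/4
```

Because only two phases repeat, a filter for this particular conversion needs just two sets of coefficients per axis. A real converter would use a much longer filter than this position arithmetic implies, in order to limit aliasing before decimation.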

Most professionals recognize the issues. The most familiar case is 525-to-625 conversion. The dropped frames create a judder not unlike that seen when using film and video together. But consider the judder when 1080i is used in an NTSC world. That requires a frame rate of 30/1.001, or 29.97fps (59.94 fields per second). If we were to operate with cameras at 60Hz, we would have to drop a little less than two frames per minute, or about one field every 17 seconds. The other way to deal with it is to continuously interpolate the display frame, with the consequence that considerable memory and complexity are added to the process. Conversion done with motion interpolation has the effect of dramatically softening the picture, which might be better than missing frames in some cases, such as live sports.
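
The drop-rate arithmetic for a 60Hz camera feeding a 59.94-field system can be checked in a few lines. This is only a sketch of the calculation; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

camera_field_rate = Fraction(60)           # fields per second from the camera
ntsc_field_rate = Fraction(60000, 1001)    # 60/1.001, about 59.94 fields per second

# Fields produced each second that the NTSC-rate system has no slot for.
surplus_fields_per_second = camera_field_rate - ntsc_field_rate

# Time between dropped fields, and dropped frames (two fields each) per minute.
seconds_per_dropped_field = 1 / surplus_fields_per_second
dropped_frames_per_minute = surplus_fields_per_second * 60 / 2

print(float(seconds_per_dropped_field))    # about 16.68 seconds
print(float(dropped_frames_per_minute))    # about 1.8 frames per minute
```

The result is one dropped field roughly every 16.7 seconds, which is a little under two frames per minute.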

In general it is true that spatial interpolations, i.e. changing one sampling grid to another, are more successful when converting higher pixel counts into lower ones. It seems intuitive that decimating a 1920×1080 picture to one that is 720×480 will produce a much better result than the inverse conversion. This speaks directly to the practice of many broadcasters today of upconverting their 525 signal to HDTV for DTV transmission. The simplest way to say it is that the picture is not improved by doing the conversion. The line structure will disappear and not all of the detail will be preserved. If a high-end converter is used, the net effect may be quite pleasing, but use extreme caution with such conversions. You may find that the picture displayed in the home is worse due to the necessity of applying bits to create an HD signal, when the same number of bits would produce outstanding native 525.
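
To put a number on the bit-budget argument, the sketch below spreads the same bit rate over a 1920×1080 raster and a 720×480 raster. The 19.39Mb/s figure and the 29.97 frame rate are assumptions chosen for illustration; the calculation ignores audio, overhead and the behavior of any real encoder.

```python
def bits_per_pixel(bitrate_bps, width, height, frames_per_second):
    """Average compressed bits available per pixel per frame."""
    return bitrate_bps / (width * height * frames_per_second)

BITRATE = 19.39e6           # assumed channel payload in bits per second (illustrative)
FRAME_RATE = 30000 / 1001   # 29.97 frames per second

hd = bits_per_pixel(BITRATE, 1920, 1080, FRAME_RATE)
sd = bits_per_pixel(BITRATE, 720, 480, FRAME_RATE)
ratio = sd / hd

print(round(hd, 2), round(sd, 2), round(ratio, 2))
```

Whatever bit rate is assumed, the 720×480 picture gets six times as many bits per pixel, because the HD raster has exactly six times as many samples. Spending those bits on an upconverted picture buys no detail that was not in the 525 source.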

Transmitting interlace allows fewer lines total to be transmitted, while preserving the impression that the resolution of the signal has not been reduced. The “integration” the human visual system adds makes this trick possible. When the current standard was designed, interlace allowed a significant increase in picture quality, given the state of the technical art. But few would argue that, if technology allows, a progressive signal will produce a superior result. Given sufficient vertical resolution the difference is hard to spot. Compare for instance 1080i and 1080p at the same frame rate. For a more interesting comparison look at the same scene with 720p and 1080i. It is hard for many observers to see any difference in vertical resolution, and indeed mathematically there may be little difference. The most obvious comparison is 480i (525 NTSC) and 480p, where the progressive signal is always picked by professional and non-professional observers as superior. It contains twice the vertical information, and thus is truly superior.

The thing most obvious to many people about HDTV is the aspect ratio of the display. It is hard to find a happy medium when converting back and forth between 4:3 (like NTSC and PAL), and all widescreen 16:9 formats. 16:9 (or 1.778:1) provides 33 percent more width than 4:3 (or 1.33:1). It is arguably a better match to the human binocular visual system, and certainly closer to the aspect ratio of motion pictures. The most complicated and intractable issue in conversion is aspect ratio. Though it can be measured with numbers, the decisions are artistic rather than technical. A properly designed upconverter can leave side panels beside the 4:3 frame or truncate content top and bottom to fill the frame. Downconversion can leave “letterbox” bars top and bottom, or truncate the sides of the HD image. Variations include nonlinear methods of stretching the picture to fill the horizontal frame in the 16:9 window or panning the HD image across the 4:3 frame. All of those choices are “non-technical.”
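
The geometry behind the two basic choices (show the whole source with bars, or fill the frame and lose content) is simple to work out. The sketch below uses hypothetical function names and assumes both aspect ratios describe the displayed picture shape; it does not cover nonlinear stretch or pan-and-scan.

```python
from fractions import Fraction

AR_4_3 = Fraction(4, 3)
AR_16_9 = Fraction(16, 9)

def fit_inside(src_ar, dst_ar):
    """Fraction of destination (width, height) used when the whole source is shown."""
    if src_ar < dst_ar:                  # narrower source: side panels
        return src_ar / dst_ar, Fraction(1)
    return Fraction(1), dst_ar / src_ar  # wider source: letterbox bars

def crop_to_fill(src_ar, dst_ar):
    """Fraction of source (width, height) kept when the destination is filled."""
    if src_ar < dst_ar:                  # narrower source: lose top and bottom
        return Fraction(1), src_ar / dst_ar
    return dst_ar / src_ar, Fraction(1)  # wider source: lose the sides

# Upconversion: 4:3 source into a 16:9 frame.
print(fit_inside(AR_4_3, AR_16_9))     # picture fills 3/4 of the width
print(crop_to_fill(AR_4_3, AR_16_9))   # 3/4 of the source height survives

# Downconversion: 16:9 source into a 4:3 frame.
print(fit_inside(AR_16_9, AR_4_3))     # letterbox uses 3/4 of the height
print(crop_to_fill(AR_16_9, AR_4_3))   # 3/4 of the source width survives

# Side-panel width, in samples, on a 1920-wide raster.
panel_width = (1 - fit_inside(AR_4_3, AR_16_9)[0]) * 1920 / 2
print(panel_width)                     # 240
```

In every case a quarter of something is given up: either a quarter of the screen goes to bars, or a quarter of the source picture is discarded. Which loss is acceptable depends on the program material.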

The form of the pixel sampling format is essentially fixed. All studio devices, SD and HD, use 4:2:2 sampling for interconnection (SMPTE 259M and SMPTE 292M), but outside of the studio formats a plethora of sampling and coding formats exist in both the DV and MPEG domains. Systems using 4:2:2, 4:2:0 and 4:1:1 all work well, but be aware that concatenating pictures compressed using varying schemes will leave complicated footprints in the picture which may well not be compatible with other image transformations. For instance, a DVE expansion of an image that has been through multiple processes may well show filter artifacts that resemble aliasing.
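
One way to see why mixed sampling schemes leave footprints is to compare the size of the color-difference planes each one carries. The sketch below assumes a 720×480 luma raster and uses the conventional meaning of each scheme; the table and function names are illustrative.

```python
# (horizontal divisor, vertical divisor) applied to each color-difference plane.
SCHEMES = {
    "4:2:2": (2, 1),   # half horizontal chroma resolution, full vertical
    "4:2:0": (2, 2),   # half horizontal, half vertical
    "4:1:1": (4, 1),   # quarter horizontal, full vertical
}

def chroma_plane(width, height, scheme):
    """Size of one color-difference plane for a given luma raster."""
    h_div, v_div = SCHEMES[scheme]
    return width // h_div, height // v_div

sizes = {name: chroma_plane(720, 480, name) for name in SCHEMES}
for name, (w, h) in sizes.items():
    print(name, w, "x", h)
```

4:2:0 and 4:1:1 carry the same number of chroma samples, but on different grids (360×240 versus 180×480). Passing a picture through one and then the other filters the chroma in both directions, which is one reason a cascade can look worse than either scheme alone.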

Colorimetry is not often considered outside the design lab. With HDTV the standards organizations had an opportunity to pick color primaries and coding that optimizes colorimetry. However, when converting an image from NTSC to HDTV it is important to apply the most appropriate colorimetry transformation so that the color of the original image as chosen by the artist is properly replicated in the new colorimetry of HDTV.
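
A small part of the colorimetry difference can be shown with the luma equation alone. The sketch below compares the luma coefficients published in ITU-R BT.601 (used for SD) and ITU-R BT.709 (used for HD); naming those two standards here is an addition for illustration, and a full conversion would also have to account for the color primaries, which this example does not attempt.

```python
# Luma weights for gamma-corrected (R', G', B'), each in the range 0..1.
BT601 = (0.299, 0.587, 0.114)       # SD coefficients
BT709 = (0.2126, 0.7152, 0.0722)    # HD coefficients

def luma(rgb, coeffs):
    """Weighted sum of R', G', B' using the given luma coefficients."""
    return sum(c * k for c, k in zip(rgb, coeffs))

green = (0.0, 1.0, 0.0)
white = (1.0, 1.0, 1.0)

print(luma(green, BT601), luma(green, BT709))   # 0.587 versus 0.7152
print(luma(white, BT601), luma(white, BT709))   # both very close to 1.0
```

White and gray are unaffected, but a saturated green is coded with a noticeably different luma value under the two sets of coefficients. A component signal decoded with the wrong matrix will therefore show color and brightness errors on saturated colors, even though a gray scale looks correct.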

Conversion from analog to digital (or the inverse) requires careful consideration as well. Pick a converter that has good filtering so that the signal will conform to the standards. Not paying careful attention to a manufacturer's spec sheets could lead to lower performance. If they do not offer a statement that they conform to an accepted and published standard, ask them for full data. Inexpensive converters usually produce a visually pleasing picture, but may well leave unwanted energy in the signal that will cause other problems, or filter too heavily to prevent aliasing, leading to reduced performance.

Finally, consider other data embedded in the signal you are converting. What happens to closed caption data, VITS and embedded audio may well be as important as the active picture in some applications. Data tends to be stripped off and at best reinserted after processing, and at worst not reinserted at all. For instance, upconverters in general do not pass closed captioning to the output, either embedded or as a discrete signal.

John Luff is vice president of business development for AZCAR.