Format conversion

Author Michael Robin explains the complex process of converting 4:3 images to 16:9 and back. The trade-offs that accompany the conversion are also discussed.

Motion pictures that were made between the late 1920s, the advent of sound on pictures, and the early 1950s, the advent of widescreen pictures, share two characteristics:

  • Horizontal vs. vertical picture dimension ratio (aspect ratio). The aspect ratio is 1.33:1 (4:3).
  • Number of acquired pictures per second (picture frequency). The chosen picture frequency is 24 images per second. This satisfies the eye requirements with respect to recreating the illusion of movement. To satisfy a related eye requirement, critical flicker, each stationary picture of the sequence is projected twice, resulting in a “refresh rate” of 48 cycles per second.

Table 1. Some contemporary film formats.

When the basics of practical television were developed in the 1930s, the chosen aspect ratio was 1.33:1 (4:3) to match the contemporary film aspect ratio. On both sides of the Atlantic, the need was felt to relate the picture repetition frequency (refresh rate) to the power line frequency. For historical reasons, this was 60Hz in North America and 50Hz in Europe. Transmission bandwidth constraints dictated the use of interlaced scanning, and the result was 30 frames per second (60 interlaced fields per second) in North America and 25 frames per second (50 interlaced fields per second) in Europe. With the advent of color television, North America felt the need to alter the frame repetition frequency to 29.97 frames per second (59.94 interlaced fields per second) to reduce the visibility of the color subcarrier. PAL does not use this method.

Transferring film to video

With PAL or SECAM, featuring 25 frames per second (50 interlaced fields per second), transferring film to video is usually achieved by running the film at a slightly increased speed (25/24 = 1.04166…). This results in shortening the duration of the projected movie, which is relatively acceptable, and raising the reproduced sound pitch, which is mildly annoying.
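The speed-up arithmetic can be illustrated with a short Python sketch. The function name `pal_speedup` and the 120-minute running time are illustrative assumptions, not taken from any standard, and the pitch figure assumes the sound is simply played faster with no pitch correction.

```python
import math
from fractions import Fraction

FILM_FPS = Fraction(24)  # film picture frequency
PAL_FPS = Fraction(25)   # PAL/SECAM frame rate

def pal_speedup(runtime_minutes):
    """Return (speed factor, shortened running time in minutes, pitch rise in semitones)."""
    speed = PAL_FPS / FILM_FPS                      # 25/24 = 1.04166...
    new_runtime = runtime_minutes / speed           # the film finishes sooner
    pitch_semitones = 12 * math.log2(float(speed))  # uncorrected pitch rise
    return speed, new_runtime, pitch_semitones

speed, runtime, pitch = pal_speedup(120)
print(f"speed factor {float(speed):.5f}, running time {float(runtime):.1f} min, "
      f"pitch up {pitch:.2f} semitones")
```

Under these assumptions, a two-hour film loses just under five minutes and the sound rises by roughly 0.7 of a semitone.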

NTSC video required a different approach. It is evident that it would be totally unacceptable to run film at 30 (or 29.97) frames per second. The solution adopted is based on the fact that 30 (television frames per second) and 24 (film frames per second) share a common factor of six. Essentially, four frames projected at a speed of 24 frames per second take the same amount of time (4/24 = 1/6 sec) as five television scanning frames at 30 frames per second (5/30 = 1/6 sec). Thus, if the image is scanned completely five times while four film frames are passing through the projector, then the two systems maintain synchronism.

This relationship is maintained if one film frame is scanned with two television fields (2/60 sec), the next film frame with three fields (3/60 sec), and so on in alternation, so that every four film frames occupy ten fields, or five television frames. This method is called the 2/3 pull-down. While this solution was adopted before the advent of NTSC color with its modified scanning rates (29.97 television frames per second), it works equally well with the slightly reduced frame rate.
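The field cadence can be sketched in a few lines of Python. This is a minimal illustration only: the function names `pulldown_2_3` and `pair_fields` and the frame labels A to D are assumptions made for the example, and field parity is ignored.

```python
def pulldown_2_3(film_frames):
    """Repeat film frames alternately for two and three television fields."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeats = 2 if index % 2 == 0 else 3  # 2, 3, 2, 3, ...
        fields.extend([frame] * repeats)
    return fields

def pair_fields(fields):
    """Group consecutive fields into interlaced television frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]

fields = pulldown_2_3(["A", "B", "C", "D"])
print(fields)               # ten fields from four film frames
print(pair_fields(fields))  # five television frames
```

Four film frames yield ten fields, hence five television frames, two of which (B/C and C/D) combine fields from different film frames.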

The methods described above worked well until the early 1950s. By then, there were about 15 million television receivers in use in North America. This created apathy among potential moviegoers, who preferred to stay home and watch television. The movie industry reacted by enhancing the movie-watching experience visually, with various widescreen formats and color, and aurally, with multichannel sound. This resulted in a variety of aspect ratios requiring the widening of the screen. Table 1 shows some of the formats. Transmitting a widescreen picture required either shrinking it horizontally and vertically to fit the screen, resulting in black bars at the top and the bottom (letterbox), or cropping it horizontally.

ATSC implications

The ATSC/DTV standard specifies two picture aspect ratios, 4:3 and 16:9. The 16:9 aspect ratio is a compromise. The 4:3 aspect ratio formats have luminance sampling grids of 720×480 and 640×480. The 16:9 aspect ratio formats have luminance sampling grids of 1920×1080, 1280×720 and 720×480. This results in a large number of possible formats that may be encountered. ATSC also specifies a colorimetry standard (ITU-R BT.709) different from that used by SDTV formats.

Figure 1. The three methods used for downconversion are the horizontal cropping method, letterbox method and anamorphic distortion method.

A transition period, scheduled to last until the end of 2006, is expected to allow the gradual implementation of 16:9 aspect ratio DTV transmissions. During this period, TV stations are expected to simulcast 4:3 aspect ratio NTSC analog signals on the currently allocated channels and 16:9 aspect ratio DTV transmissions on separate, newly allocated terrestrial transmission channels. Simulcasting is planned to stop in 2006, when all analog NTSC transmissions are expected to end and the related transmission channels are to be reassigned to other uses. During the transition period, a great deal of format conversion will occur.


Downconversion applies to methods that reduce the full 16:9 aspect ratio HDTV sampling grid (1920×1080 or 1280×720) to a 4:3 aspect ratio SDTV 720×480 sampling grid. The process is best referred to as resizing and changes the original size of the image, i.e. the number of Y,CB,CR pixels used to represent the image to the selected target size. Figure 1 depicts the typical methods used.

  • The horizontal cropping method. In the edge crop mode, a central window is extracted, which fits into a 4:3 raster. In the “pan-and-scan” mode, the operator moves the central window in the horizontal direction to follow the main action. This is the most often used approach in North America. By necessity, some details of the picture will be dropped, so there will be a definite loss of picture information. On the other hand, the screen will be completely filled.
  • The letterbox method. The 16:9 aspect ratio picture is reduced vertically and horizontally to fit inside a 4:3 aspect ratio window. The process generates black bars at the top and the bottom of the picture. The thickness of the black bars depends on the aspect ratio of the film. Letterboxing reduces the vertical resolution because the black bars reduce the number of active scanning lines. This method is generally used in France and Germany but is shunned in the UK.
  • The anamorphic distortion method. The 16:9 aspect ratio picture is squeezed horizontally to fit inside a 4:3 aspect ratio raster. This method results in anamorphically distorted shapes. In North America, this method is used in the beginning and end of the movie to allow showing all the credits, many of which would be masked by the pan-and-scan process.
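
The geometry of the three downconversion methods can be worked out with a small Python sketch. It is an illustration under stated assumptions: the source is taken to have square pixels, the target is a 480-line 4:3 raster, and the function names are hypothetical.

```python
from fractions import Fraction

HD_AR = Fraction(16, 9)  # source picture aspect ratio
SD_AR = Fraction(4, 3)   # target raster aspect ratio

def crop_window_width(src_height):
    """Width, in source pixels, of the 4:3 window used for edge crop or pan-and-scan.
    Assumes square source pixels."""
    return int(src_height * SD_AR)

def letterbox_lines(raster_lines, picture_ar=HD_AR):
    """Active picture lines and the height of each black bar in a 4:3 raster."""
    active = int(raster_lines * SD_AR / picture_ar)
    bar = (raster_lines - active) // 2
    return active, bar

def anamorphic_squeeze():
    """Horizontal squeeze factor applied when 16:9 is forced into a 4:3 raster."""
    return SD_AR / HD_AR

print(crop_window_width(1080))     # 1440 of 1920 source pixels are kept
print(letterbox_lines(480))        # (360, 60)
print(float(anamorphic_squeeze())) # 0.75
```

In each case the same 3:4 ratio appears: cropping keeps three quarters of the width, letterboxing uses three quarters of the lines, and the anamorphic method narrows shapes to three quarters of their correct width.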

Figure 2. The typical methods used for upconversion are the side panel mode, the tilt scan mode and the anamorphic distortion mode.

NTSC stations will use downconverters to feed the NTSC transmitter with 16:9 originated signals. Also, given the large number of analog 4:3-format NTSC receivers in use, it is unrealistic to expect that all these receivers will be discarded in 2006. In all likelihood, a large number of set-top converters/decoders will be used to convert the 16:9 DTV transmissions to feed the 4:3 NTSC analog receivers. These set-top decoders will use some type of format downconversion.


Upconversion applies to methods that increase the 4:3 aspect ratio SDTV 720×480 sampling grid to a full 16:9 aspect ratio sampling grid (1920×1080 or 1280×720). The process is best referred to as resizing, and it changes the original size of the image, i.e. the number of Y,CB,CR pixels used to represent the image to the selected target size. Figure 2 depicts the typical methods used.

  • The side panel mode. The original 4:3 aspect ratio picture is inserted in a 16:9 window, which results in black side panels.
  • The tilt scan mode. The 4:3 picture is stretched in the horizontal and vertical direction to fill a 16:9 aspect ratio screen, resulting in a 25 percent loss of vertical resolution. The “viewing window” can be preset, or a “tilt and scan” approach can be used. Here the operator moves the window in the vertical direction to follow the action.
  • The anamorphic distortion mode. Figure 2 shows the manner in which a 4:3 aspect ratio picture is stretched horizontally to fill a 16:9 aspect ratio screen, which results in anamorphic distortion.
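
The corresponding upconversion geometry can be sketched the same way. Again, this is a minimal illustration: square pixels are assumed on the 16:9 side, the 4:3 source is taken to have 480 lines, and the function names are hypothetical.

```python
from fractions import Fraction

SD_AR = Fraction(4, 3)   # source picture aspect ratio
HD_AR = Fraction(16, 9)  # target raster aspect ratio

def side_panel_layout(hd_width, hd_height):
    """Width of the 4:3 picture and of each black side panel in a 16:9 raster.
    Assumes square pixels in the 16:9 raster."""
    picture_width = int(hd_height * SD_AR)
    panel_width = (hd_width - picture_width) // 2
    return picture_width, panel_width

def tilt_scan_window(sd_lines):
    """Source lines kept by the tilt scan window and the fraction of lines lost."""
    kept = int(sd_lines * SD_AR / HD_AR)
    lost_fraction = 1 - SD_AR / HD_AR
    return kept, lost_fraction

def anamorphic_stretch():
    """Horizontal stretch factor when 4:3 is made to fill a 16:9 screen."""
    return HD_AR / SD_AR

print(side_panel_layout(1920, 1080))  # (1440, 240)
print(tilt_scan_window(480))          # (360, Fraction(1, 4))
print(float(anamorphic_stretch()))    # 1.333...
```

The quarter of the lines discarded by the tilt scan window corresponds to the 25 percent loss of vertical resolution noted above.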

None of the three methods offers an ideal solution. Experiments indicate that a 5 percent anamorphic distortion is undetectable and that a 7 percent anamorphic distortion is not objectionable. The current trend is to combine the three methods to obtain a picture that is subjectively pleasant to the viewer. In the long run, upconversion will be used to convert 50 years of SDTV legacy programming to DTV formats.

Michael Robin, a fellow of the SMPTE and former engineer with the Canadian Broadcasting Corp.'s engineering headquarters, is an independent broadcast consultant located in Montreal, Canada. He is co-author of Digital Television Fundamentals, published by McGraw-Hill, and recently translated into Chinese and Japanese.
