HD formats and perception

Many formats are presented as high definition. Yet in the ATSC standard, there are only two HD raster sizes: 1920 x 1080 and 1280 x 720. Both have a 16:9 aspect ratio. Scanning is either progressive or interlaced. What effect do frame rates (60, 30, 24, 59.94, 29.97 or 23.98) have on image quality? In color space, do bit depth and sampling affect the perceived image quality?

Here are some of the choices used to construct an HD video format:

  • Raster: 1920, 1440, 1280, 960 horizontal; 1080, 720 vertical
  • Scanning: interlaced or progressive
  • Color space: RGB or Y, R-Y, B-Y
  • Color bit depth: 12, 10, 8
  • Sampling: 4:4:4, 4:2:2, 4:2:0, 4:1:1
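
The sampling and bit depth choices translate directly into how much data each pixel carries. As a rough sketch (the function names are illustrative and not taken from any standard), the J:a:b notation can be read as the number of chroma samples kept in a block J pixels wide and two rows high:

```python
def samples_per_pixel(scheme: str) -> float:
    """Average samples per pixel for a J:a:b scheme (one luma plus two chroma channels)."""
    j, a, b = (int(part) for part in scheme.split(":"))
    # A block J pixels wide and 2 rows high holds 2*J luma samples and
    # (a + b) samples for each of the two chroma channels.
    return (2 * j + 2 * (a + b)) / (2 * j)


def bits_per_pixel(scheme: str, bit_depth: int) -> float:
    """Average bits per pixel before any compression is applied."""
    return samples_per_pixel(scheme) * bit_depth


for scheme in ("4:4:4", "4:2:2", "4:2:0", "4:1:1"):
    for depth in (12, 10, 8):
        print(f"{scheme} at {depth} bits: {bits_per_pixel(scheme, depth):g} bits per pixel")
```

Note that 4:2:0 and 4:1:1 both average 1.5 samples per pixel; they differ in where the chroma samples sit, not in how many there are.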

Is there anything to be gained perceptually with the use of 1080 progressive scanning in the viewing experience? Conversely, are 1440 horizontal pixels perceptually inferior to 1920? Or for that matter better than 1280? Is 3:2 pulldown visually acceptable for all types of content? Does the subconscious brain influence a subjective best impression?

Regardless of the numbers, formats are all about perception. How the human visual system (HVS) and brain process stimuli, and what a person thinks about the images, determine satisfaction with a viewing experience in any format.

Formats and the media lifecycle

Content undergoes various transformations in format through the media lifecycle. For example:

  • Creation: The camera interpolates from the image sensor to 1080i/p or 720p. The 4:4:4 RGB signal is converted to 4:2:2 Y, R-Y, B-Y.
  • Assembly: Content will be ingested and compressed to a high bit rate MPEG-2, AVC, VC-1, DV, HDV or other format. All are perceptual coders that discard information, and all tolerate only a limited number of encode/decode generations before artifacts become noticeable.
  • Distribution: An ATSC compliant MPEG transport stream at 19.39Mb/s will contain HD content that has now been compressed up to 50:1 and color sampling reduced to 4:2:0.
  • Consumption: The consumer display will process the ATSC elementary stream to conform to its native resolution. Who knows how 1366 horizontal pixels are produced from 1920 or 1280 and what this does to image quality?

In every case the original baseband HD video essence has been sliced, diced, transformed, transcoded, decoded and reconstructed. Again, the key question is whether the resulting quality degradation is small enough to escape our perception.
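
To put a number on how much is discarded by the distribution stage, here is a minimal sketch of the arithmetic. It rests on assumptions that are not in the article: only active picture is counted, the source frame rates and bits per pixel are illustrative, and the whole 19.39Mb/s transport stream is treated as if it carried video alone (in practice audio and data share it, which makes the real ratio higher).

```python
ATSC_TRANSPORT_MBPS = 19.39  # transport stream rate quoted in the text


def uncompressed_mbps(width, height, frames_per_second, bits_per_pixel):
    """Active-picture data rate in Mb/s (blanking, audio and ancillary data ignored)."""
    return width * height * frames_per_second * bits_per_pixel / 1e6


def compression_ratio(source_mbps, channel_mbps=ATSC_TRANSPORT_MBPS):
    """How many times larger the source is than the channel that must carry it."""
    return source_mbps / channel_mbps


# (width, height, frames per second, bits per pixel); illustrative sources only.
SOURCES = {
    "1080i, 4:2:2, 10-bit": (1920, 1080, 30000 / 1001, 20),
    "1080i, 4:2:2, 8-bit": (1920, 1080, 30000 / 1001, 16),
    "720p, 4:2:2, 10-bit": (1280, 720, 60000 / 1001, 20),
}

for name, params in SOURCES.items():
    rate = uncompressed_mbps(*params)
    print(f"{name}: {rate:.0f} Mb/s uncompressed, about {compression_ratio(rate):.0f}:1")
```

Under these assumptions the ratios fall between roughly 51:1 and 64:1, the same order as the 50:1 figure quoted for distribution.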

Perception

Extensive psychovisual subjective testing has been done to determine what is acceptable in picture quality. At what point does a difference become just noticeable, and at what point does it become annoying?

There are two fundamental factors in analyzing visual responses to images. Visual acuity is a measure of the resolving power of the eye (about 1 minute of arc). Color difference sensitivity describes how our eyes respond to stimulation by various wavelengths of light. For example, the NTSC color system employed a form of compression based on eye sensitivity, limiting the bandwidth of the chrominance signals while giving luminance the maximum bandwidth. This exploitation of a fundamental physiological characteristic has been carried over into subsequent DTV systems.

Sidebar: perception resources

  • Color perception: a top-level description of brain processing
  • Color perception fundamentals
  • Two 2004 papers published in the Journal of Vision by J. McDermott and E.H. Adelson of the Department of Brain and Cognitive Sciences at MIT, which get into deeper issues
  • Visual acuity
  • Video image quality assessment methodology

Intense study has gone into how the brain interprets visual data (see sidebar). A better understanding of how the brain perceives spatial and temporal aspects of visual stimuli could lead to new design methodologies, better performing perceptual codecs and lower data rates for a given video quality.

The eye is limited in its capability to resolve detail. In television this ability (visual acuity) can be analyzed based on viewing distance and aperture that determines the smallest picture element (PEL) separation that can be perceived.

It would be a dull world if monochromatic black and white vision were all that the eye and brain were capable of perceiving. Yet color perception is not a straightforward process. There are still many mysteries to be unraveled.

Quality matters — Objective testing

Universities, corporate R&D departments and standards bodies have been active in standardizing video quality definitions, measurement and testing methods.

The MSU Perceptual Video Quality tool supports subjective video quality evaluation and can be used to conduct psychovisual testing. Its Web site is also an excellent place to gain an understanding of perceptual testing of compression techniques.

Video quality measurement programs are underway at the Institute for Telecommunication Sciences, a branch of the NTIA (a Department of Commerce agency). An overview can be found at http://www.its.bldrdoc.gov/tpr/2004/its_t/video.pdf.

Equipment manufacturers such as K-WILL and Tektronix have incorporated some of these methods in their video test equipment.

Format conversion techniques

With any type of conversion implementation it is important to consider the quality of the resultant image. Conversion chains can introduce artifacts, both obvious and subtle, that will not appear until further down the production chain or, in the worst-case scenario, at the point of consumption. This will not make for happy viewers. Watching football players blur together at the line of scrimmage does not make for good PR for HD.

An NAB2004 conference paper titled "Digital Video Format Conversion" and an informative brochure from Futureware describe fundamental issues and methods of format conversion.

Format conversion equipment

Format conversion equipment has come a long way since the design and commissioning of the Grand Alliance scan converter. As part of ATTC testing, conversion between 1080i and 720p was to be analyzed. Since the capability did not exist, the GA members collaborated and built a real-time system.

Conversion equipment currently available includes the Miranda Imaging Series XVP-811; the Helios software based conversion platform from Snell & Wilcox; and Evertz's 7710UC-KF HD up-, down-, cross converter.

By the numbers?

When analyzing the numerical parameters of HD formats it is important to consider the geometry of the viewing environment. Will it be big-screen, passive viewing at distances greater than 3ft? Might it be for PC consumption at less than 3ft, or mini-screens on cell phones and video iPods? For all formats, perception is directly dependent on viewing distance, screen size and raster format.

An optimum viewing distance calculator illustrates this point for home theatre systems. For a 16 x 9 display measuring 42in horizontally, the optimum viewing distance is about 6ft (6.5ft for SMPTE, 5.4ft for THX). The maximum viewing distance based on visual acuity is 6.3ft; beyond that, the eye can no longer resolve 1920 x 1080 interlace detail. For a 17in diagonal, 2.2ft is optimal. A mini 2.5in diagonal HD screen (if one should ever exist) is ideally viewed at 0.3ft; for comparison, the figure for NTSC is 0.9ft.
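
The acuity-based distances can be approximated with simple geometry. The calculator's own method is not given here, so the following is only a sketch under stated assumptions: visual acuity of 1 minute of arc, a distance limit reached when one pixel spacing subtends that angle, and (for the NTSC comparison) a 4:3 screen with 640 square-pixel-equivalent samples per line. The function names are illustrative.

```python
import math


def screen_width_from_diagonal(diagonal_in, aspect_w=16, aspect_h=9):
    """Screen width in inches for a given diagonal and aspect ratio."""
    return diagonal_in * aspect_w / math.hypot(aspect_w, aspect_h)


def acuity_distance_ft(screen_width_in, horizontal_pixels, acuity_arcmin=1.0):
    """Distance in feet at which one pixel spacing subtends the acuity angle."""
    pixel_pitch_in = screen_width_in / horizontal_pixels
    distance_in = pixel_pitch_in / math.tan(math.radians(acuity_arcmin / 60))
    return distance_in / 12


cases = {
    "42in wide, 1920 pixels": acuity_distance_ft(42, 1920),
    "17in diagonal, 1920 pixels": acuity_distance_ft(screen_width_from_diagonal(17), 1920),
    "2.5in diagonal, 1920 pixels": acuity_distance_ft(screen_width_from_diagonal(2.5), 1920),
    # Assumed NTSC raster: 4:3 screen, 640 samples per line.
    "2.5in diagonal, NTSC (assumed 640 pixels)": acuity_distance_ft(
        screen_width_from_diagonal(2.5, 4, 3), 640
    ),
}

for label, feet in cases.items():
    print(f"{label}: {feet:.1f}ft")
```

With these assumptions the results come out close to the 6.3ft, 2.2ft, 0.3ft and 0.9ft figures quoted above, which suggests the calculator uses a similar one-arc-minute criterion.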

What’s detail worth?

Broadcasting is a business, and although it may be challenging to produce the highest quality images using advanced technology, the expense incurred may not generate a sufficient return on investment. In fact, viewers may not notice a difference and if they do, they may not care!

Musical instruments are identified by their unique spectral characteristics. Although the ear can only hear up to 20kHz, higher frequency components nevertheless influence our perception of a sound. This is why even the best CD recordings still do not sound real. Is there a parallel in the visual domain? Does our brain somehow sense imperceptible visual detail?

Viewing location directly determines the appropriate display. Cell phone football viewing is acceptable on a train, but given the option, a large screen would be preferred. Each offers unique challenges and business opportunity for broadcasters. Welcome to the multiformat, multiple consumption platform of DTV.

If this is the case, maybe 1080i, 720p or 1080 60p does not offer enough detail. Theatrical productions are using 2K, 4K and even 8K for cinematic presentation. Maybe with the emergence of free downloadable content, the only viable business model for broadcasters will be to supply something a viewer cannot get at home: a cinematic, immersive experience.

A session at IBC last year featured a glimpse into the future. 3-D presentations of clips from Star Wars and other movies were stunning. Viewing the recently fully restored Wizard of Oz, after having seen it on the tube all my life, was an amazing sensory experience. Maybe the future of HD is taking us to NHK’s Super Hi-Vision ... after which all other formats will pale in comparison.