One-Chip Sensors -- The Bayer Pattern

Traditionally, most professional color television cameras have been three-sensor devices that produce red, green and blue signals that are electrical analogs of the light that enters the lens and falls on the sensors. Today, these signals are typically sampled and digitized before being output. In the tube camera days, this took the form of three pickup tubes--one each for red, green and blue light. In the current solid-state camera sensor era, three solid-state sensors--one each for R, G and B--are used. Over the years, there have been some notable exceptions, but this has been the rule. Film also uses an RGB approach, with separate red-, green- and blue-sensitive emulsion layers.

In an RGB camera, light enters the lens, is focused, then passed through an array of filters and prisms that separate the red, green and blue components and steer each to its respective sensor. It bears noting here that the R, G and B filters used are spectrally wide. For example, the color-passing characteristics of the green filter overlap those of the red filter at its upper end and those of the blue filter at its lower end. The filters overlap to prevent spectral gaps, which would result in some colors not being registered at all.

The sensor itself--typically a charge-coupled device, or CCD, in today's video cameras--is colorblind, sensing only the intensity of the light falling on it, not the color. The raw light-intensity information generated by the three sensors becomes the red, green and blue electrical signals respectively. The signals that come off the sensors occupy a linear color space--they are linear representations of the instantaneous amplitude of the light that generated them. This space is often called "linear light" in the video business. Before being transmitted or recorded, the linear signals are gamma-corrected--converted into a nonlinear color space by a transfer function that compensates for the way the human visual system perceives light.
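As a rough illustration, here is a minimal sketch of gamma correction using the Rec. 709 transfer function--one of several such curves; actual cameras apply their own, and the function name here is our own:

```python
import numpy as np

def rec709_oetf(linear):
    """Gamma-encode linear-light values (normalized 0.0-1.0) per Rec. 709.

    Below a small threshold the curve is a straight line; above it,
    a 0.45 power law with an offset applies.
    """
    linear = np.asarray(linear, dtype=np.float64)
    return np.where(linear < 0.018,
                    4.5 * linear,
                    1.099 * linear ** 0.45 - 0.099)

print(rec709_oetf(0.18))  # mid-gray in linear light encodes to roughly 0.41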

There are some potential drawbacks to the three-sensor approach to video or still photographic capture. These include the fringing effects that can be caused by prisms, and the engineering problem of where within the camera to locate the bulky prism and filter apparatus.

Some video cameras and all digital still cameras use a single-sensor array rather than three sensors. Single-chip video cameras have traditionally been lower-cost consumer devices, but some professional video cameras, notably some designed to replace motion-picture film cameras, now use single-chip sensors. And all digital still cameras, including high-end ones, use single sensors. None of these single-chip cameras, save for a few using sensors made by one company, uses a straightforward RGB approach.

HOW IT WORKS

As always, there are some exceptions, but the majority of single-chip video cameras, and almost all digital still cameras, use so-called Bayer pattern sensors. These sensors, particularly in some still cameras, may be CMOS rather than CCD devices, but the principles are the same. The Bayer pattern approach, patented in 1976, was invented at Eastman Kodak by Bryce E. Bayer.

When a single sensor is used, light falls on a sensor array in which each element, or pixel, is covered by a colored filter. One obvious way to arrange the filters would be in the pattern R, G, B, R, G, B, etc., progressing across each horizontal row of pixels. Another is to arrange the colors in diagonal rather than horizontal and vertical patterns. These and other arrangements have been tried, but the most popular for video cameras, and virtually the only one used in still cameras, is the Bayer pattern. Bayer used some characteristics of the human visual system as the basis for his arrangement.

In the Bayer pattern sensor array, colored filters are arranged in repetitions of a two-row pattern. Across the top row of pixels, for example, the colors might be red, green, red, green, etc.; in the row below, the arrangement is green, blue, green, blue, etc. These two rows repeat in alternation for the remainder of the array. Filter sequences vary by manufacturer: the top row typically alternates red and green, but Kodak sensors, for example, begin that row with green, while Sony's begin with red.
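To make the geometry concrete, here is a minimal sketch of how such an array samples a scene, assuming the red-first ordering described above (the function and layout are illustrative, not any manufacturer's actual design):

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample a full-color (H, W, 3) image through an RGGB filter array.

    Each output pixel keeps only the one color its filter passes:
    red and green alternate on even rows, green and blue on odd rows.
    """
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w))
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites on red rows
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites on blue rows
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return mosaic
```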

The first thing that might be noticed is that there are twice as many green pixels as there are either red or blue pixels. The human eye is most sensitive to light in the green portion of the spectrum, and the eye obtains most of its "sharpness" information from green light. Thus, the green channel may be thought of as a luminance channel.
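The familiar Rec. 601 luma equation, Y' = 0.299R' + 0.587G' + 0.114B', makes the point: green contributes nearly 60 percent of the luminance signal, so doubling the density of green samples is a sensible trade.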

The next thing that is apparent is that none of the color channels is completely sampled. Only half the image is sampled by the green channel, and the red and blue channels each sample only one-quarter of the image. In order to form a complete image, the missing portions must be derived by interpolation.

DEMOSAICIZING

Because the individual red, green and blue images somewhat resemble mosaics, the process is often referred to by such tortured non-words as demosaicizing, demosaicing or demosaicking.

The demosaicizing, or Bayer pattern decoding, process can be done either by hardware or software. When video images are involved, speed is of the essence, so faster, more portable hardware decoders are used. There is, naturally, a tradeoff between processing speed and image quality. Digital still cameras also have built-in hardware demosaicizers, but there, decoding speed is less of a priority.
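The simplest decoding method is bilinear interpolation, which fills in each missing sample by averaging its nearest same-color neighbors. Here is a minimal software sketch, assuming the RGGB layout from the earlier example; production demosaicizers use far more elaborate, edge-aware algorithms:

```python
import numpy as np
from scipy.signal import convolve2d

def bilinear_demosaic(mosaic):
    """Rebuild an RGB image from an RGGB mosaic by averaging neighbors."""
    h, w = mosaic.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask

    # At a missing green site, the four cross-wise neighbors are green;
    # missing red/blue samples come from two or four nearby same-color sites.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    r = convolve2d(mosaic * r_mask, k_rb, mode='same')
    g = convolve2d(mosaic * g_mask, k_g, mode='same')
    b = convolve2d(mosaic * b_mask, k_rb, mode='same')
    return np.dstack([r, g, b])
```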

There is another approach to demosaicizing and processing digital still images. In addition to applying Bayer pattern decoding, white balance, etc., and outputting a finished image, digital SLRs and some of the higher-end digicams can output raw, linear pixel data and associated metadata. The raw data is simply the sets of grayscale values for the red, green and blue components of the image, along with metadata, in a proprietary format specific to the camera manufacturer.

To view an image, a raw file must be processed with a software program that knows the file's syntax and format. In addition to demosaicizing, the raw converter converts the data from linear space to a gamma color space and applies all the recorded parameters: color temperature, color balance, exposure and aperture values, and so on.
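In outline, a raw converter's job might be sketched like this; the ordering is typical, but the helper names and metadata fields are hypothetical, and real converters are proprietary and considerably more involved:

```python
def develop_raw(raw_mosaic, meta):
    """Hypothetical outline of a raw conversion pipeline."""
    linear = raw_mosaic - meta['black_level']     # remove the sensor's dark offset
    linear = apply_white_balance(linear, meta)    # placeholder: scale R and B for color temperature
    rgb = bilinear_demosaic(linear)               # fill in the missing samples
    rgb = rgb @ meta['camera_to_display'].T       # camera color space to display primaries
    return rec709_oetf(rgb.clip(0, 1))            # leave linear space last of all
```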

Besides permitting a highly sophisticated software demosaicizer, this approach makes it possible to adjust the image's parameters in the linear space. Doing so causes less damage than applying the same corrections in gamma space, where calculation errors would be greater.
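A quick numeric illustration, reusing the rec709_oetf sketch from earlier: one stop more exposure is a doubling of linear light, a simple multiply in linear space, but no single multiplier on the gamma-encoded values reproduces it.

```python
linear = 0.10
correct = rec709_oetf(2 * linear)        # adjust in linear space, then encode: ~0.434
naive = 2 ** 0.45 * rec709_oetf(linear)  # the same multiply attempted in gamma space: ~0.397
```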

So when we see high-quality still pictures or video images, it is possible that we are not in fact seeing the whole image, but rather a partial image with the missing portions interpolated. But thanks to Mr. Bayer, we would never know it by looking at it.

Randy Hoffner