Understanding digital video

Digital video is a method of representing the infinite variations of video signal levels inside a specified limit (e.g. 0mV to 700mV) as a limited number of binary numbers. Figure 1 shows a simplified block diagram of a typical "black-box" digital device or system. The input is a conventional analog video signal. The signal is bandlimited by a low-pass (antialiasing) filter and fed to an A/D converter where the analog signal is converted into digital form.

The digital signal is fed to a medium. The medium may be a signal distribution or transmission link, a videotape or a digital processor such as a frame synchronizer or a digital video effects (DVE) unit. The digital signal has to be formatted by a channel encoder to meet the requirements of the medium. Depending on the medium, the channel encoder can be as simple as a parallel-to-serial converter or as complex as an MPEG compressor. Any medium adds noise, linear distortions and amplitude losses to the digital waveform. These distortions should not affect the signal to the point where the zeroes and ones cannot be recognized by the channel decoder. If they do, you experience the well-known cliff effect.
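The cliff effect can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not a model of any real channel decoder: bits are sent as levels 0 and 1, Gaussian noise of a chosen standard deviation is added, and a hypothetical decoder slices at a 0.5 threshold. The function name and all parameters are invented for the example.

```python
import random


def bit_error_rate(noise_sigma, n_bits=20_000, seed=1):
    """Fraction of wrongly decoded bits for a given noise level (toy model)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        received = bit + rng.gauss(0.0, noise_sigma)  # medium adds noise
        decoded = 1 if received > 0.5 else 0          # threshold decision
        errors += decoded != bit
    return errors / n_bits


if __name__ == "__main__":
    for sigma in (0.05, 0.1, 0.2, 0.3, 0.5, 1.0):
        print(f"noise sigma {sigma:4.2f} -> bit error rate {bit_error_rate(sigma):.4f}")
```

In this model the error rate stays at or near zero over a wide range of noise levels and then rises steeply, which is the behavior the term "cliff effect" describes.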

After passing through the medium the digital signal is returned to its original analog format by passing consecutively through a channel decoder, a D/A converter and a reconstruction filter, which removes the higher frequency spectral components.

Quality limits

The A/D conversion involves two major steps that determine the resulting video signal quality:

The sampling of the video signal. Sampling is the digital equivalent of amplitude modulation. The digital carrier (repetitive sampling pulses) is amplitude modulated by the video signal, resulting in a wide modulation spectrum with carriers at $f_s$ (the sampling frequency) and its multiples ($2f_s$, $3f_s$, ..., $nf_s$). The process is similar to analog amplitude modulation except for the wider modulation spectrum, a result of the rectangular sampling pulse shape. The video sampling frequency has to meet several conditions:

- It must be at least twice the maximum analog video frequency (the Nyquist rule).

- It has to be high enough to allow for the design of practically realizable and cost-effective low-pass (antialiasing) filters with minimum ripple and group delay.

- The sampling frequency must be a multiple of a basic video frequency like $f_h$ (horizontal scanning frequency) or $f_{sc}$ (color subcarrier frequency).

The sampling rate of video signals has evolved throughout the years. Today, analog composite video signals are sampled at a multiple of $f_{sc}$. Analog component video signals are sampled at a multiple of $f_h$.
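The "multiple of a basic video frequency" condition can be checked with exact arithmetic. The sketch below assumes the nominal NTSC and PAL subcarrier and line frequencies, which are standard values not stated in this article; the helper names are hypothetical.

```python
from fractions import Fraction

# Nominal frequencies in Hz (assumed standard values, not taken from the text above).
NTSC_FSC = Fraction(315_000_000, 88)   # NTSC color subcarrier, about 3.579545 MHz
PAL_FSC = Fraction(17_734_475, 4)      # PAL color subcarrier, 4.43361875 MHz
FH_525 = Fraction(4_500_000, 286)      # 525-line horizontal frequency, about 15,734.27 Hz
FH_625 = Fraction(15_625)              # 625-line horizontal frequency


def samples_per_line(fs_hz, fh_hz):
    """Return the exact number of sampling periods in one horizontal line."""
    return Fraction(fs_hz) / Fraction(fh_hz)


def is_integer_multiple(fs_hz, base_hz):
    """True if fs is an exact integer multiple of the base frequency."""
    return (Fraction(fs_hz) / Fraction(base_hz)).denominator == 1


ntsc_4fsc = 4 * NTSC_FSC   # about 14.318 MHz
pal_4fsc = 4 * PAL_FSC     # about 17.734 MHz

if __name__ == "__main__":
    print(f"NTSC 4fsc = {float(ntsc_4fsc) / 1e6:.3f} MHz, "
          f"{samples_per_line(ntsc_4fsc, FH_525)} samples per line")
    print(f"PAL 4fsc = {float(pal_4fsc) / 1e6:.3f} MHz, "
          f"{float(samples_per_line(pal_4fsc, FH_625)):.4f} samples per line")
```

With these assumed values, a sampling frequency of four times the NTSC subcarrier is also an exact multiple of the line frequency, while four times the PAL subcarrier is a multiple of the subcarrier but not of the line frequency.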

The sampling frequency and the characteristics of the antialiasing filters determine the maximum video frequency that the device or system can handle. SDTV and HDTV digital video specifications narrowly define the frequency and group-delay characteristics of the low-pass filters.
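The link between sampling frequency, filter design and maximum video frequency can be made concrete with a little arithmetic. The sketch below is illustrative only: the 13.5 MHz sampling frequency and 5.75 MHz bandwidth in the usage lines are assumed example values, and the function names are hypothetical.

```python
def satisfies_nyquist(fs_hz, f_max_hz):
    """True if the sampling frequency is at least twice the highest video frequency."""
    return fs_hz >= 2 * f_max_hz


def sideband_edges(fs_hz, f_max_hz, harmonics=3):
    """Lower and upper edges of the sidebands around fs, 2fs, 3fs, and so on."""
    return [(k * fs_hz - f_max_hz, k * fs_hz + f_max_hz)
            for k in range(1, harmonics + 1)]


def filter_transition_band(fs_hz, f_max_hz):
    """Gap between the top of the baseband and the first lower sideband.

    A small gap demands a steep (costly) filter; a negative gap means aliasing.
    """
    return (fs_hz - f_max_hz) - f_max_hz


def alias_frequency(f_hz, fs_hz):
    """Frequency at which an input tone appears after sampling at fs."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)


if __name__ == "__main__":
    fs, f_max = 13_500_000, 5_750_000   # assumed example values
    print("Nyquist satisfied:", satisfies_nyquist(fs, f_max))
    print("Transition band (Hz):", filter_transition_band(fs, f_max))
    print("8 MHz input aliases to (Hz):", alias_frequency(8_000_000, fs))
```

The transition band is the room available to the antialiasing and reconstruction filters, which is why a sampling frequency only just above the Nyquist limit is rarely practical.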

The quantizing of the sample values. The sampling process results in a progression of samples repeated at regular intervals whose amplitudes reflect the instantaneous baseband signal amplitudes at the sampling instants. Ideally the sampling pulse should have a very short duration. In reality it has a duration of $T = 1/f_s$, resulting in a sample-and-hold process in which each sample amplitude is held in memory until the next sample arrives. This results in a high-frequency loss, which is compensated for in the reconstruction filter.
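The size of this high-frequency loss can be estimated. The sketch below assumes an ideal hold lasting exactly one sampling period, whose amplitude response follows the sin(x)/x shape with x = pi * f / f_s; real converters may differ. The function name and the example frequencies are assumptions for illustration.

```python
import math


def hold_loss_db(f_hz, fs_hz):
    """Amplitude response (dB) of an ideal hold of duration 1/fs at frequency f."""
    x = math.pi * f_hz / fs_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))


if __name__ == "__main__":
    fs = 14_318_180                     # roughly 4 x the NTSC subcarrier
    for f in (1_000_000, 2_000_000, 4_200_000, fs / 2):
        print(f"{f / 1e6:6.3f} MHz: {hold_loss_db(f, fs):6.2f} dB")
```

Under this assumption the loss grows smoothly with frequency and reaches about 3.9 dB at half the sampling frequency, which is the droop the reconstruction filter has to equalize.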

In the quantizing process, binary values are assigned based on the sampled pulse amplitude values. The word length of the sampled value depends on the number of bits per sample that the system provides. As a consequence the amplitude levels of a continuously varying analog signal are converted to a finite number of discrete digital levels Q according to the expression $Q = 2^n$, where n is the number of bits per sample. The resulting digital signal is therefore an approximation of the original signal.

Early video equipment used eight bits per sample, while today's high-end studio equipment uses 10 bits per sample. This limits the accuracy of the digital representation of the original analog signal since only a limited number of values ($2^8 = 256$ or $2^{10} = 1024$) are recognized by the system.
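As a rough worked example, the number of levels and the resulting step size can be tabulated. The sketch below assumes, purely for illustration, that all codes are spread evenly across a 0 mV to 700 mV range; the helper names are hypothetical, and real standards reserve part of the code range.

```python
def quantizing_levels(bits_per_sample):
    """Number of discrete levels available with n bits per sample (2 to the n)."""
    return 2 ** bits_per_sample


def step_size_mv(bits_per_sample, span_mv=700.0):
    """Spacing between adjacent levels, assuming every code covers the span."""
    return span_mv / (quantizing_levels(bits_per_sample) - 1)


if __name__ == "__main__":
    for bits in (6, 8, 10):
        print(f"{bits:2d} bits: {quantizing_levels(bits):5d} levels, "
              f"step about {step_size_mv(bits):.3f} mV")
```

Each added bit doubles the number of levels and roughly halves the step between them.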

In addition to limiting the number of signal amplitude values that the system recognizes, the various digital standards limit the excursion of the video signal, or the "quantizing range," to less than the total number of possible values. This maintains acceptable headroom for varying analog signal levels and reserves values for special timing reference signals (TRS), which are unambiguously defined as distinct from the samples carrying video information.

The limited number of binary digital values representing the video information results in "quantizing errors" ($Q_e$), i.e. the erroneous representation of video levels. With eight or more bits per sample the quantizing errors often appear as wideband noise. With fewer than eight bits per sample a system may exhibit "contouring effects": the inability to correctly represent gently sloping wide-area picture brightness or chrominance values. All sampled amplitudes falling within specific bounds are assigned a single value that is one of the Q levels (level k, k+1, k+2, etc.). The quantized value may thus contain an error not exceeding half of one quantizing step. Figure 2 shows the effect of an insufficient number of bits per sample on the reconstructed analog signal.

The approximate signal-to-noise ratio (SNR) of a digital system is given by the formula SNR (dB) = 6n + 6, where n is the number of bits per sample. The formula takes into consideration the headroom as well as $f_s$ vs. $f_{max}$ considerations. Consequently a 10-bit system will have an SNR of approximately 66 dB.
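The half-step error bound, the contouring effect and the SNR rule of thumb can be tried out numerically. The sketch below is a simplified illustration: it quantizes a normalized 0 to 1 ramp over the full code range, ignoring the headroom and reserved values described above, and all function names are hypothetical.

```python
def approx_snr_db(bits_per_sample):
    """Rule-of-thumb SNR quoted in the text: 6n + 6 dB."""
    return 6 * bits_per_sample + 6


def quantize(x, bits_per_sample):
    """Round x (clamped to 0..1) to the nearest level and return the rebuilt value."""
    top_code = 2 ** bits_per_sample - 1
    code = min(top_code, max(0, round(x * top_code)))
    return code / top_code


def ramp_report(bits_per_sample, samples=1000):
    """Quantize a slow 0..1 ramp; return (step, worst error, distinct output levels)."""
    step = 1 / (2 ** bits_per_sample - 1)
    ramp = [i / (samples - 1) for i in range(samples)]
    rebuilt = [quantize(x, bits_per_sample) for x in ramp]
    worst = max(abs(a - b) for a, b in zip(ramp, rebuilt))
    return step, worst, len(set(rebuilt))


if __name__ == "__main__":
    for bits in (3, 8, 10):
        step, worst, levels = ramp_report(bits)
        print(f"{bits:2d} bits: SNR about {approx_snr_db(bits)} dB, "
              f"worst error {worst:.5f} (half step {step / 2:.5f}), "
              f"{levels} distinct levels in the ramp")
```

With only a few bits the smooth ramp collapses into a handful of flat steps, which is the contouring effect; with more bits the worst-case error shrinks along with the step size.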

The digital video formats

Two different digital video formats have evolved:

- The composite digital video concept. This format constitutes a stepping stone from the analog composite (NTSC or PAL) world to the all-digital teleproduction center. Two $4f_{sc}$ sampling frequencies have resulted: 14.3 MHz for NTSC and 17.7 MHz for PAL. In North America there was an initial interest in $4f_{sc}$ composite digital videotape recorders. This had to do with the need to replace aging analog composite two-inch and one-inch VTRs with digital VTRs featuring analog composite input/output ports. A number of manufacturers developed such products, identified as D2 and D3 VTRs. A wide range of compatible $4f_{sc}$ digital video studio-quality equipment appeared on the market. In Europe there was limited interest in $4f_{sc}$ VTRs because they cannot handle SECAM.

- The component digital video concept. Component digital video is based on the use of one luminance signal (Y) and two color-difference signals (B-Y and R-Y). The document describing the concept is known as ITU-R BT.601 (formerly known as CCIR 601). It defines the manner in which three component analog video signals are sampled, quantized and time-division multiplexed into a single bearer for distribution, processing or recording. The development of competitively priced component digital video equipment, as well as the trend towards in-plant, land-line, on-air or satellite component digital video signal distribution at full or reduced bit rates, were major factors behind the whole DTV revolution.

The sampling frequencies of the three component signals are common to both SDTV formats (525/60 and 625/50) and each is a multiple of $f_h$ as well as of 3.375 MHz, resulting in the well-known 4:4:4, 4:2:2 and 4:1:1 sampling strategies. Figure 3 shows a simplified block diagram of a 4:2:2 component digital black box. The 4:2:2 sampled and quantized component signals are time-division multiplexed, resulting in a bit-parallel data rate of 27 Mwords/s. The bit-serial data rate of the 4:2:2 digital signal is 270 Mb/s. Figure 4 shows the sampling grids of the three component digital formats. Figure 5 shows the quantizing range of the component video signals.
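The word and bit rates quoted above follow from simple arithmetic on the 3.375 MHz base frequency. The sketch below reads each digit of a scheme such as 4:2:2 as a multiplier for Y, B-Y and R-Y respectively, and assumes 10-bit words, which is the word length consistent with the 270 Mb/s figure. The function names are hypothetical.

```python
BASE_HZ = 3_375_000  # common base sampling frequency named in the text


def component_rates_hz(scheme):
    """Sampling frequencies (Y, B-Y, R-Y) for a scheme written like '4:2:2'."""
    return tuple(int(part) * BASE_HZ for part in scheme.split(":"))


def word_rate(scheme):
    """Words per second after time-division multiplexing the three components."""
    return sum(component_rates_hz(scheme))


def serial_bit_rate(scheme, bits_per_word=10):
    """Bit-serial rate, assuming every multiplexed word has the same length."""
    return word_rate(scheme) * bits_per_word


if __name__ == "__main__":
    for scheme in ("4:4:4", "4:2:2", "4:1:1"):
        print(f"{scheme}: {word_rate(scheme) / 1e6:.2f} Mwords/s, "
              f"{serial_bit_rate(scheme) / 1e6:.1f} Mb/s at 10 bits")
```

For 4:2:2 this gives a 13.5 MHz luminance sampling frequency and 6.75 MHz for each color-difference signal, which add up to the 27 Mwords/s multiplex.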

Among the advantages of using the component digital approach are:

- Wider chrominance bandwidth

- No NTSC/PAL artifacts

- Improved SNR

Pros and cons of digital video

The advantages of digital video can be summed up as follows:

- Single-pass analog-type impairments are non-cumulative if the signal stays digital. However a concatenation of digital black boxes using analog interfaces can lead to cumulative analog signal degradations and should be avoided.

- Reduced sensitivity to noise and interference.

- Digital equipment can perform efficiently and economically tasks that are difficult or impossible to perform using analog technology.

- Digital signals are amenable to techniques, such as compression, that retain only the essential information.

The disadvantages of digital video can be summed up as follows:

- Analog-type distortions as well as unique digital distortions related to sampling and quantizing may result in a variety of visible impairments.

- Digital video may require wide bandwidth for recording, distribution and transmission. Sophisticated bit rate reduction and compression schemes may be needed to achieve manageable bandwidths.

- Unlike analog signals, digital signals do not degrade gracefully and are subject to a cliff effect.

Digital video technology is sweeping the broadcasting industry. Increased picture processing, transmission and recording capabilities are available at competitive costs. With the accelerating implementation of DTV, video signals will be kept in digital form all the way to the viewer's home in order to provide a higher quality product with more user-friendly features.