Pixel grids, bit rate and compression ratio

Audio, video and data distribution can be reduced to one issue — channel capacity. The function of compression is to reduce digitized audio and video to data rates that can be supported by a channel; hence, the less data there is to compress, the easier it is for a compression engine to encode content at an appropriate bit rate for the target transmission technology.

Smaller pixel grids and lower audio sampling rates require lower compression ratios. A lower compression ratio means that less of the original content is discarded during compression, resulting in a reconstruction of the content with the highest quality audio or video that can be delivered over a limited bandwidth channel.

A single 20-bit PCM audio channel sampled at 48kHz has a data rate that is less than 1Mb/s, while SD-SDI is at 270Mb/s and HD-SDI is at 1.5Gb/s. Because video requires a significantly larger amount of data than audio, this tutorial will focus particularly on video.
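The audio figure quoted above is simply word length multiplied by sampling rate. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from any standard.

```python
def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int, channels: int = 1) -> int:
    """Return the uncompressed PCM data rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


audio_bps = pcm_bit_rate(20, 48_000)  # one 20-bit channel at 48kHz
print(f"20-bit/48kHz PCM: {audio_bps / 1e6:.2f}Mb/s")
print(f"SD-SDI (270Mb/s) is about {270e6 / audio_bps:.0f} times larger")
```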

A numbers game

Video formats vary widely and must suit the target delivery channel and reception device. Table 1 compares total pixels for common pixel grid dimensions. With respect to the three-screen scenario, HD and SD pixel grids are appropriate for broadcast DTV. VGA, CIF and QVGA are suited for broadband Internet delivery, while QVGA and QCIF can be used for mobile video services.

Resolution   Horizontal Pixels   Vertical Lines   Total Pixels
HD           1920                1080             2073600
HD           1280                720              921600
SD           720                 480              345600
VGA          640                 480              307200
CIF          352                 288              101376
QVGA         320                 240              76800
QCIF         176                 144              25344

Table 1: Picture element comparison for common display grid dimensions.

Frame rate is another factor that influences the amount of compression necessary to fit a given pixel grid into a distribution channel. A 60Hz refresh rate, for example, requires twice as much data throughput as 30Hz and, for a fixed channel, doubles the compression ratio necessary.

Color depth is an additional factor that influences bit rate. MPEG limits luminance and chrominance samples to 8 bits (1B) each, but in professional applications and when video is delivered over HDMI, luma and chroma samples can be words that are 10 or 12 bits long. A 12-bit word carries 50 percent more data than an 8-bit word over a given time and increases the amount of compression required.
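Frame rate and word length scale the raw data rate multiplicatively. The sketch below expresses both as a factor relative to a 30Hz, 8-bit reference; the function name and the choice of reference are assumptions made for illustration.

```python
def relative_data(fps: float, bits: int, ref_fps: float = 30.0, ref_bits: int = 8) -> float:
    """Raw data rate relative to a reference frame rate and sample word length."""
    return (fps / ref_fps) * (bits / ref_bits)


for fps, bits in [(60, 8), (30, 12), (60, 12)]:
    print(f"{fps}Hz, {bits}-bit: {relative_data(fps, bits):.2f}x the 30Hz, 8-bit rate")
```

For a fixed channel, the required compression ratio grows by the same factor.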

Channel capacity

Channel capacities for DTV, Internet and cellular service transmission channels vary as well. DTV is defined by the ATSC MPEG-2 transport stream data rate constraint of 19.39Mb/s. For Internet broadband delivery, data rates are 768Kb/s, as used by DSL, and 1.5Mb/s, commonly referred to as T-1. For delivery to 3G cell phones, there is a 2Mb/s maximum data rate indoors and 384Kb/s outdoors.

The following examples will use YUV color space (luminance and color difference signals R-Y, B-Y) and 8-bit color word depth as specified in the MPEG standard.

Table 2 presents the various display formats and their associated data requirements. The YUV B/frame column allots 2B to each pixel. It is important to note the number of bits per frame in order to understand how widely data rate varies. Moving from SD to HD display resolutions increases the number of pixels by a factor of roughly 2.7 (1280x720) to six (1920x1080), and, as the table illustrates, frame refresh rate (along with scan method) also affects bit rate.

Format  Pixel Grid  Pixels   YUV B/frame  bits/frame  15Hz Mb/s  30Hz Mb/s    60Hz Mb/s
HD      1920x1080   2073600  4147200      33177600    498        995 (1080i)  1991 (1080p)
HD      1280x720    921600   1843200      14745600    221        442          885 (720p)
SD      720x480     345600   691200       5529600     83         166 (480i)   332 (480p)
VGA     640x480     307200   614400       4915200     74         147          295
CIF     352x288     101376   202752       1622016     24         49           97
QVGA    320x240     76800    153600       1228800     18         37           74
QCIF    176x144     25344    50688        405504      6          12           24

Table 2: Bit rates for various pixel grids and refresh rates.
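The entries in Table 2 can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the 2B-per-pixel figure implied by the YUV B/frame column; the function name is illustrative, and only three of the formats are shown.

```python
def raw_bit_rate(width: int, height: int, fps: int, bytes_per_pixel: int = 2) -> int:
    """Uncompressed active-picture bit rate in bits per second."""
    bits_per_frame = width * height * bytes_per_pixel * 8
    return bits_per_frame * fps


formats = [("HD", 1920, 1080), ("SD", 720, 480), ("QCIF", 176, 144)]
for name, width, height in formats:
    # Convert each rate to Mb/s for comparison with the table columns.
    rates = [raw_bit_rate(width, height, fps) / 1e6 for fps in (15, 30, 60)]
    print(f"{name:5s}{width}x{height}: " + ", ".join(f"{r:.0f}Mb/s" for r in rates))
```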

Compression ratios and channel capacity

The amount of compression applied to raw content is expressed as a ratio. The most familiar compression ratio is the 50:1 (or greater) figure that roughly describes how MPEG-2 fits approximately 1Gb/s of 1080i HD into a 20Mb/s MPEG-2 transport stream.

Because there is no explicit blanking interval in the compressed domain, only the active pixels need be considered. In the following analysis, however, keep in mind that the numbers are approximate and there will be some additional data, such as packet headers, checksums and error correction and concealment information, that can have a significant impact on payload data capacity.

For example, consider a 188B MPEG-2 transport stream packet appended with 20B of FEC. Of the resulting 208B packet, only 184B are payload, because the transport stream packet itself carries a 4B header. If this packet were transferred over a 10Mb/s channel, it would be 1664b of packet containing 1504b of transport stream data. Conceptually speaking, the transport stream data rate would be reduced by nearly 10 percent to 9.04Mb/s, and the payload rate, counting only the 184B (1472b), would fall to about 8.85Mb/s.
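The overhead in this example can be worked through as follows. The sketch assumes the standard 188B transport stream packet with its 4B header, plus the 20B of FEC from the example; the function and constant names are illustrative.

```python
TS_PACKET_BYTES = 188  # MPEG-2 transport stream packet
TS_HEADER_BYTES = 4    # header inside each transport stream packet
FEC_BYTES = 20         # forward error correction appended per packet


def rates_after_overhead(channel_bps: float) -> tuple[float, float]:
    """Return (transport stream rate, payload rate) in bits per second."""
    total_bytes = TS_PACKET_BYTES + FEC_BYTES
    ts_rate = channel_bps * TS_PACKET_BYTES / total_bytes
    payload_rate = channel_bps * (TS_PACKET_BYTES - TS_HEADER_BYTES) / total_bytes
    return ts_rate, payload_rate


ts_rate, payload_rate = rates_after_overhead(10e6)
print(f"Transport stream: {ts_rate / 1e6:.2f}Mb/s, payload: {payload_rate / 1e6:.2f}Mb/s")
```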

Applying the 50:1 compression ratio to SD 720x480 @ 30Hz (480i) at 166Mb/s, the result is a bit rate of 3.3Mb/s. It would take a hefty broadband Internet connection and an appropriately sized buffer to enable SD video streaming to a PC. Even VGA resolution compressed 50:1 requires close to 3Mb/s. To get VGA resolution into a 768Kb/s DSL pipe, the video compression ratio must exceed 192:1. Reducing the frame refresh rate to 15Hz halves that requirement to 96:1, which is still well beyond 50:1.

Continuing to downsize the pixel grid to QVGA, the data rate is reduced proportionally. For this quarter-VGA image at 30Hz, the 50:1 compressed data rate is 737Kb/s. This is barely within the capability of a DSL connection and leaves little room for header and checksum data.

The QVGA option, however, opens many possibilities as to how to use the available display real estate. Program-associated information can be presented with the video, and, more significantly, commercial announcements supporting Web-based transactions could generate revenue for broadcasters and advertisers. The technology needed to support this is already in place on the Web, but DTV service providers have not yet deployed it. The implementation of T-commerce over DTV is still in the future.
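The two calculations used here, the compressed rate for a given ratio and the ratio needed for a given channel, can be sketched as below. The helper names are illustrative, and the raw rates assume 2B per pixel as in Table 2.

```python
def raw_bit_rate(width: int, height: int, fps: int, bytes_per_pixel: int = 2) -> int:
    """Uncompressed active-picture bit rate in bits per second."""
    return width * height * bytes_per_pixel * 8 * fps


def compressed_rate(raw_bps: float, ratio: float) -> float:
    """Bit rate after applying a ratio:1 compression."""
    return raw_bps / ratio


def required_ratio(raw_bps: float, channel_bps: float) -> float:
    """Minimum compression ratio needed to fit a raw stream into a channel."""
    return raw_bps / channel_bps


sd_30 = raw_bit_rate(720, 480, 30)
vga_30 = raw_bit_rate(640, 480, 30)
print(f"SD at 50:1: {compressed_rate(sd_30, 50) / 1e6:.1f}Mb/s")
print(f"VGA at 50:1: {compressed_rate(vga_30, 50) / 1e6:.2f}Mb/s")
print(f"VGA into 768Kb/s needs {required_ratio(vga_30, 768_000):.0f}:1")
```

These figures ignore packet and error-correction overhead, so a real channel would need a somewhat higher ratio.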

For mobile video, if the data channel supports 384Kb/s, then application of the 50:1 compression ratio to a QCIF display @ 30Hz will produce a 243Kb/s data stream. For a handheld device, depending on the quality of the display, a 15Hz refresh rate may be adequate, and the data rate would be halved.
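As a rough check of the mobile case, the sketch below computes the compressed QCIF rate at both refresh rates and the headroom left in a 384Kb/s channel. It assumes 2B per pixel as in Table 2 and ignores packet overhead; the function name is illustrative.

```python
def compressed_kbps(width: int, height: int, fps: int, ratio: float,
                    bytes_per_pixel: int = 2) -> float:
    """Compressed bit rate in Kb/s for a pixel grid, frame rate and ratio."""
    raw_bps = width * height * bytes_per_pixel * 8 * fps
    return raw_bps / ratio / 1000


CHANNEL_KBPS = 384  # outdoor 3G data rate quoted in the text

for fps in (30, 15):
    rate = compressed_kbps(176, 144, fps, ratio=50)
    print(f"QCIF @ {fps}Hz at 50:1: {rate:.0f}Kb/s, headroom {CHANNEL_KBPS - rate:.0f}Kb/s")
```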

Looking at these figures, it is clear that both advanced video codecs and increased channel capacity will continue to be developed. AVC and VC-1 have roughly doubled compression ratios for equal video quality, broadband connections tout 4Mb/s, 6Mb/s and 8Mb/s downstream data rates, and telcos are implementing ADSL2+ with 24Mb/s and greater capabilities.

Production workflows

A brute-force approach to production for multichannel distribution uses independent workflows for each distribution scenario. This is not the most efficient method. Eliminating redundant tasks and automating repetitive processes increase production efficiency, reduce time to air, enable management of multiple versions of content and formats, and maximize the use of infrastructure and personnel resources.

Management of multiple versions of content in various formats could be handled by an application accessible to all production personnel. A scheduling and tracking system can be a great aid in implementing parallel production. In this electronic workflow, production personnel are assigned tasks, progress is monitored and content is checked and approved with the aid of a single asset management application over an Intranet GUI. Multiple versions of the same content are catalogued, and metadata is shared. The use of AAF and MXF enables search functionality and a content history to be maintained. Efficient parallel production can be accomplished by centralizing these production tasks.

Folder-drop automated conversion workflows can be leveraged to assist in content conversion, thereby removing redundant tasks. For example, suppose an editor must produce a segment for DTV, Web and mobile distribution (a three-screen scenario). The high-bit-rate ingested file is dragged and dropped into a folder designated for that scenario. The automated compression process then kicks in and produces appropriately formatted content (i.e., pixel grid, refresh rate, compression codec and bit rate) for each distribution channel.
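The planning half of such a folder-drop workflow might be sketched as follows. Everything here is hypothetical: the profile values, the .mxf source extension, the output naming and the function name are assumptions for illustration, and the transcoding step itself is left out.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Profile:
    name: str
    width: int
    height: int
    fps: int
    codec: str
    bit_rate_bps: int


# Hypothetical three-screen targets; real values depend on the facility.
PROFILES = [
    Profile("dtv", 1920, 1080, 30, "mpeg2", 18_000_000),
    Profile("web", 320, 240, 30, "avc", 700_000),
    Profile("mobile", 176, 144, 15, "avc", 120_000),
]


def plan_jobs(drop_folder: Path, out_root: Path) -> list[tuple[Path, Path, Profile]]:
    """Pair every source file in the drop folder with each target profile."""
    jobs = []
    for src in sorted(drop_folder.glob("*.mxf")):
        for profile in PROFILES:
            dst_name = f"{src.stem}_{profile.width}x{profile.height}.{profile.codec}"
            jobs.append((src, out_root / profile.name / dst_name, profile))
    return jobs
```

Each returned job would then be handed to whatever encoder the facility uses, so a single dropped file yields one output per distribution channel.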

Maintaining quality through conversion

Quality after conversion is always an issue. Because the compression used for distribution is lossy and based on perception, the question is how much information can be discarded while still enabling reconstruction of an acceptable, artifact-free moving video image at a reception device.

If the production process begins with native 1920x1080 video, and the compression engine is limited to a 50:1 ratio, some form of image processing (pixel decimation) must often be performed. The 1920x1080 format can be reduced to VGA (640x480) by 3:1 horizontal pixel decimation. In the vertical dimension, every 2.25 lines must be reduced to one line to produce 480 lines. Both processes will result in a reduction of high-frequency information and a loss of image detail.

Upconversion of display resolutions presents a difficult challenge. No technique can recreate lost details. Intelligent use of creative techniques, windows and split screens can help maintain perceived upconverted image quality or eliminate the conversion completely. When converting between HD and SD, however, the difference in pixel shape (HD pixels are square, while 720x480 SD pixels are not) rules out one-to-one pixel mapping between display grids.
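The decimation ratios, and the loss of detail they cause, can be illustrated with a short sketch. The box-average filter below is an assumption chosen for simplicity; production scalers typically use more sophisticated filters, and the function names are illustrative.

```python
from fractions import Fraction


def decimation_ratio(src: int, dst: int) -> Fraction:
    """Number of source samples that map to one destination sample."""
    return Fraction(src, dst)


def decimate_line_3to1(line: list[int]) -> list[int]:
    """Average each group of three pixels into one (simple box filter)."""
    usable = len(line) - len(line) % 3  # drop any incomplete trailing group
    return [sum(line[i:i + 3]) // 3 for i in range(0, usable, 3)]


print("Horizontal:", decimation_ratio(1920, 640))  # 3 source pixels per output pixel
print("Vertical:", decimation_ratio(1080, 480))    # 9/4, i.e. 2.25 lines per output line

# Alternating black/white pixels, the finest possible horizontal detail.
fine_detail = [0, 255, 0, 255, 0, 255]
print(decimate_line_3to1(fine_detail))  # averaged toward gray, so the detail is lost
```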

Concatenation of compression conversions can lead to unacceptable quality. Each codec may remove information that is important to maintaining quality through the production and distribution chain. The best workflow is a single compression workflow where one process converts uncompressed source material to the desired codec and channel-specific bit rate.

In the future

The quest for higher channel capacity and more powerful (higher ratio) compression engines is being pursued today by researchers, standards organizations and equipment manufacturers.

One day, Internet2 will probably expand beyond the Abilene backbone and become a consumer commodity. Networking at 10Gb/s will be commonplace in the digital home. Until then, limited channel capacity and audio/video compression will remain a digital broadcasting fact of life.