MPEG-2 basic training

One of the advantages of digital video is that it can be compressed and transported in an MPEG stream across an IP network. MPEG-compressed digital video requires new test tools and troubleshooting skills. Bit stream monitoring and testing are used to accurately identify problems or, preferably, to recognize potential problems before they occur.

The MPEG-2 standard is defined by ISO/IEC 13818 as "the generic coding of moving pictures and associated audio information." It combines lossy video compression and lossy audio data compression to fulfill bandwidth requirements. All MPEG compression systems are fundamentally asymmetric: the encoder is more sophisticated than the decoder.
MPEG encoders are always algorithmic. Some are also adaptive, using a feedback path. MPEG decoders are not adaptive and perform a fixed function. This works well for applications like broadcasting, where the number of expensive complex encoders is few and the number of simple inexpensive decoders is huge.

The MPEG standards provide little information about encoder process and operation. Rather, they specifically define how a decoder interprets metadata in a bit stream. MPEG metadata tells the decoder the rate at which the video was encoded, and it defines the audio coding, channels and other vital stream information.

A decoder that successfully deciphers MPEG streams is called compliant. The genius of MPEG is that it allows different encoder designs to evolve simultaneously. Generic low-cost and proprietary high-performance encoders and encoding schemes all work because they are all designed to talk to compliant decoders.

Before SDI
Asynchronous Serial Interface (ASI) is a serial interface signal where a start bit is sent before each byte, and a stop signal is sent after each byte. This type of start-stop communication without the use of synchronized fixed time intervals was patented in 1916 and was the key technology that made teletype machines possible. Today, an ASI signal is often the final product of MPEG video compression, ready for transmission to a transmitter, microwave or fiber. Unlike uncompressed SDI, an ASI signal can carry one or multiple compressed SD, HD or audio streams. ASI transmission speeds are variable and depend on the user's requirements.

There are two transmission formats used by the ASI interface, a 188-byte format and a 204-byte format. The 188-byte format is the more common. If Reed-Solomon error correction data is included, the packet grows by 16 bytes to 204 bytes total.
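The relationship between the two packet formats is simple arithmetic. The following is a minimal Python sketch of it, not part of any ASI tool; the function names and the idea of expressing the overhead as a share of the link rate are assumptions made for illustration.

```python
TS_PACKET_BYTES = 188   # packet size without error correction data
RS_PARITY_BYTES = 16    # Reed-Solomon bytes appended in the 204-byte format


def asi_packet_size(with_reed_solomon: bool) -> int:
    """Return the packet size in bytes for the chosen ASI format."""
    return TS_PACKET_BYTES + (RS_PARITY_BYTES if with_reed_solomon else 0)


def packet_data_rate(total_rate_bps: float, with_reed_solomon: bool) -> float:
    """Share of the total rate left for the 188-byte packets themselves.

    Hypothetical helper: total_rate_bps is whatever rate the packets,
    including any Reed-Solomon bytes, occupy on the link.
    """
    return total_rate_bps * TS_PACKET_BYTES / asi_packet_size(with_reed_solomon)
```

In the 204-byte format, 16 of every 204 bytes (a little under 8 percent) carry error correction data rather than the packet itself.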

What’s the purpose of a general-purpose oscilloscope (GPO) in troubleshooting MPEG? Not much. Specialized technology demands specialized test gear. MPEG streams are complicated, and MPEG-2 streams are more so. Examining MPEG-2 streams is reminiscent of measuring the front porch or counting the number of sync pulse serrations to manually validate an analog video sync pulse. Well, kind of, anyway.

Making MPEG-2
An MPEG-2 stream can be an elementary stream (ES), a packetized elementary stream (PES) or a transport stream (TS). The ES and PES are files.

Starting with analog video and audio content, individual ESs are created by applying MPEG-2 compression algorithms to the source content in the MPEG-2 encoder. This process is typically called ingest. The encoder creates an individual compressed ES for each audio and video stream. An optimally functioning encoder will look transparent when decoded in a set-top box and displayed on a professional video monitor for technical inspection.

A good ES depends on several factors, such as the quality of the original source material, and the care used in monitoring and controlling audio and video variables upon ingest. The better the baseband signal, the better the quality of the digital file. Also influencing ES quality is the encoded stream bit rate, and how well the encoder applies its MPEG-2 compression algorithms within the allowable bit rate.

MPEG-2 has two main compression components: intraframe spatial compression and interframe motion compression. Encoders use various techniques, some proprietary, to stay within the maximum allowed bit rate while allocating bits to both compression components. This balancing act is not always successful. It is a tradeoff between allocating bits for detail in a single frame and bits to represent the changes (motion) from frame to frame. Which is more important?

Researchers are currently investigating what constitutes a good picture. Presently, there is no direct correlation between the data in the ES and subjective picture quality. For now, the only way of checking encoding quality is with the human eye, after decoding.

The packetized elementary stream
Individual ESs are essentially endless because an ES is as long as the program itself. Each ES is broken into variable-length packets to create a PES, and each PES packet contains a header and payload bytes.

The PES header carries data about the encoding process that the MPEG decoder needs to successfully decompress the ES. Each individual ES results in an individual PES. At this point, audio and video information still reside in separate PESs. The PES is primarily a logical construct and is not really intended to be used for interchange, transport and interoperability. The PES also serves as a common conversion point between TSs and program streams (PSs) (covered below).
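As an illustration of the header-plus-payload structure, the following Python sketch reads the first six bytes of a PES packet. It assumes the layout from ISO/IEC 13818-1 (a 3-byte start code prefix, a 1-byte stream ID and a 2-byte packet length); the function names are hypothetical and the optional header fields that follow are not decoded.

```python
from typing import Tuple

PES_START_CODE_PREFIX = b"\x00\x00\x01"


def parse_pes_prefix(data: bytes) -> Tuple[int, int]:
    """Return (stream_id, packet_length) from the start of a PES packet."""
    if len(data) < 6 or data[:3] != PES_START_CODE_PREFIX:
        raise ValueError("data does not begin with a PES start code prefix")
    stream_id = data[3]
    # The length counts the bytes that follow this field. A value of 0 is
    # permitted for video carried in a transport stream and means the
    # packet length is not specified.
    packet_length = int.from_bytes(data[4:6], "big")
    return stream_id, packet_length


def stream_kind(stream_id: int) -> str:
    """Classify a stream ID using the audio and video ranges in the standard."""
    if 0xC0 <= stream_id <= 0xDF:
        return "audio"
    if 0xE0 <= stream_id <= 0xEF:
        return "video"
    return "other"
```

Because audio and video still live in separate PESs at this stage, the stream ID alone is enough to tell which kind of ES a packet belongs to.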

Transport streams
Both the TS and PS are formed by packetizing PES files. During the formation of the TS, additional packets containing tables needed to demultiplex the TS are inserted. These tables are collectively called program-specific information (PSI) and will be addressed in detail in a moment. Null packets, containing a dummy payload, may also be inserted to fill the intervals between information-bearing packets. Some packets contain timing information for their associated program, called the program clock reference (PCR). The PCR is inserted into one of the optional header fields of the TS packet. Recovery of the PCR allows the decoder to synchronize its clock to the rate of the original encoder clock.
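To show where the PCR sits, here is a minimal Python sketch that pulls it out of a single TS packet. It assumes the adaptation field layout from ISO/IEC 13818-1: a 33-bit base counted at 90 kHz, 6 reserved bits and a 9-bit extension counted at 27 MHz. The function names are hypothetical, and a real analyzer would also track PCR intervals and jitter.

```python
from typing import Optional

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def extract_pcr(packet: bytes) -> Optional[int]:
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    # Bit 0x20 of the fourth header byte signals an adaptation field.
    if not packet[3] & 0x20:
        return None
    adaptation_length = packet[4]
    # The flags byte (1) plus the PCR (6) must fit in the adaptation field,
    # and the PCR flag (0x10) must be set.
    if adaptation_length < 7 or not packet[5] & 0x10:
        return None
    raw = int.from_bytes(packet[6:12], "big")  # 48-bit PCR field
    base = raw >> 15          # upper 33 bits, 90 kHz units
    extension = raw & 0x1FF   # lower 9 bits, 27 MHz units
    return base * 300 + extension


def pcr_to_seconds(pcr: int) -> float:
    """Convert 27 MHz PCR ticks to seconds."""
    return pcr / 27_000_000
```

A decoder compares successive PCR values with its own 27 MHz counter and adjusts its clock until the two advance at the same rate.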

TS packets are fixed in length at 188 bytes with a minimum 4-byte header and a maximum 184-byte payload. The structure of the TS header is shown in Figure 1. The key fields in the minimum 4-byte header are the sync byte and the packet ID (PID). The sync byte's function is indicated by its name. It is a fixed 8-bit value (0x47) used for delineating the beginning of a TS packet.
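The minimum 4-byte header can be decoded with a few bit masks. The sketch below is a minimal Python example that assumes the field layout from ISO/IEC 13818-1; the `TsHeader` type and `parse_ts_header` function are hypothetical names, and the transport priority bit is skipped for brevity.

```python
from typing import NamedTuple

SYNC_BYTE = 0x47
TS_PACKET_SIZE = 188


class TsHeader(NamedTuple):
    transport_error: bool
    payload_unit_start: bool
    pid: int
    scrambling_control: int
    adaptation_field_control: int
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the 4-byte header of one 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte 0x47 not found at start of packet")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        transport_error=bool(b1 & 0x80),
        payload_unit_start=bool(b1 & 0x40),
        pid=((b1 & 0x1F) << 8) | b2,          # 13-bit packet ID
        scrambling_control=(b3 >> 6) & 0x03,
        adaptation_field_control=(b3 >> 4) & 0x03,
        continuity_counter=b3 & 0x0F,          # 4-bit counter per PID
    )
```

A quick check on captured data is to confirm that 0x47 appears every 188 bytes; if it does not, the capture is misaligned or the stream is damaged.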

The PID is a unique address identifier. Every video and audio stream, as well as each PSI table, needs to have a unique PID. The PID value is provisioned in the MPEG multiplexing equipment. Certain PID values are reserved. Important reserved PID values are indicated in the table below. Other reserved PID values are specified by organizations such as the Digital Video Broadcasting Group (DVB) and the Advanced Television Systems Committee (ATSC) for electronic program guides and other tables.
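A simple way to see which PIDs a stream actually uses is to count packets per PID. The following Python sketch does that for a buffer of aligned TS packets. The reserved values listed are the ones defined in ISO/IEC 13818-1 and do not include DVB or ATSC assignments; the function names are hypothetical.

```python
from collections import Counter

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

# Reserved PIDs from ISO/IEC 13818-1 (not a complete DVB/ATSC list).
RESERVED_PIDS = {
    0x0000: "program association table (PAT)",
    0x0001: "conditional access table (CAT)",
    0x1FFF: "null packet",
}


def pid_histogram(stream: bytes) -> Counter:
    """Count packets per PID in a buffer of back-to-back 188-byte packets."""
    counts: Counter = Counter()
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {offset}")
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        counts[pid] += 1
    return counts


def describe_pid(pid: int) -> str:
    """Return a label for reserved PIDs, or a generic label otherwise."""
    return RESERVED_PIDS.get(pid, "provisioned in the multiplexer")
```

Comparing the PIDs found this way with the PIDs announced in the PSI tables is a first check on PID assignment.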

In order to reconstruct a program from all its video, audio and table components, it is necessary to ensure that the PID assignment is done correctly and that there is consistency between PSI table contents and the associated video and audio streams. This is one of the main testing issues in MPEG and will be the focus of the next “Transition to Digital” newsletter tutorial.

Note: The author would like to thank Les Zoltan at DVEO Pro Broadcast Division for his help in the preparation of this tutorial.

Recommended reading