In spite of the obvious appeal of IP networks for many video transport applications, some broadcasters are not completely comfortable with the technology. Many of their concerns derive from experiences with the public Internet, which admittedly is not a perfect channel for mission-critical, time-sensitive material. However, when private networks are employed, IP technology can deliver signal quality and reliability that significantly exceed those available with the traditional time-division multiplex (TDM) technologies employed in standard SONET/SDH networks.
In any real network, errors are going to happen. Data bits can be corrupted, IP packets lost and fibers cut, in addition to many other faults. In this environment, it is important to design a video transport system that tolerates these faults and recovers from them.
Bit errors can be introduced from a number of sources. Optical components will occasionally misfire and change values. Electronic devices will also experience errors. These errors are typically transient and often last for only a few bits. These types of errors are typically corrected using forward error correction (FEC) with specially designed redundancy codes.
Lost packets can result from longer sequences of bit errors, congestion and other faults. Some network equipment deliberately discards packets when faced with network congestion. This situation can often be avoided for video streams by increasing the priority of the packets. Lost packets are also caused by random bit errors that corrupt the header or other portions of a packet. Packets with invalid headers (as indicated by a bad checksum value) are normally discarded by networking equipment. To correct these types of errors, FEC codes can be used, or packet retransmission schemes can be employed.
A network interruption occurs when more than one packet in a row is lost. Because such outages can last anywhere from a few milliseconds to many days, the only totally effective way to correct for these failures is to build a redundant network.
Interestingly, IP technology is uniquely suited for handling all of these different error types. Reed-Solomon and row-column FEC handle minor bit errors and occasional lost packets. In transmission systems with even higher error rates, packet doubling and automatic packet re-sending replace lost data. And finally, hitless protection switching, which uses packets to compensate for network delay differences and to precisely select an appropriate switchover point, corrects even long-term network interruptions.
Reed-Solomon error detection
Reed-Solomon coding is a popular method for detecting and correcting errors, particularly burst errors that occur in data transmission. In Reed-Solomon systems, the data stream is broken into discrete blocks, typically a number of bytes (n) that is less than 256. Within the data block, some of the bytes are designated as information bytes (k), and the remaining bytes (n-k) are calculated values that form the entire data block into a polynomial. These (n-k) values are calculated so as to add redundancy to the information bytes (k) to form a total data block (n). It can be shown that as long as no more than [(n-k)/2] of the bytes are corrupted in the transmission (either in the original information bytes or in the calculated bytes) then the original data block can be recovered perfectly.
As an illustration, consider a 240-byte overall data block (n=240) with a 224-byte information payload (k=224). Using standard Reed-Solomon calculations, 16 bytes of error correction data would be added to each payload (n-k=240-224=16) to form a data block prior to transmission. At the receiving end of the circuit, calculations would be performed that could recover the original block as long as no more than 8 bytes were corrupted along the way [(n-k)/2=16/2=8]. This gives a robust block transportation scheme, in which all errors that occur in a transmitted block can be repaired, so long as no more than 8 bytes are corrupted in any 240-byte block. This corresponds to a byte error rate of 3.33 percent (8/240), which would be exceedingly high for any modern communication link, such as SONET or IP.
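The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the parameter calculation, not a Reed-Solomon encoder; the function name rs_parameters and the dictionary keys are made up for illustration.

```python
def rs_parameters(n: int, k: int) -> dict:
    """Summarize a byte-oriented Reed-Solomon (n, k) block code.

    n = total bytes per block, k = information bytes per block.
    (Hypothetical helper; it only does the bookkeeping, no coding.)
    """
    if not 0 < k < n < 256:
        raise ValueError("expected 0 < k < n < 256")
    parity = n - k                 # calculated redundancy bytes
    correctable = parity // 2      # corrupted bytes that can be repaired
    return {
        "parity_bytes": parity,
        "correctable_bytes": correctable,
        "overhead_pct": 100.0 * parity / k,
        "max_byte_error_rate_pct": 100.0 * correctable / n,
    }

params = rs_parameters(240, 224)
print(params)
```

For the (240, 224) example this reports 16 parity bytes, 8 correctable bytes and a tolerable byte error rate of about 3.33 percent.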
This same Reed-Solomon scheme could be creatively applied to correct for complete lost packets instead of simple bit or byte errors. In this scheme, each 240-byte block would be spread across 40 consecutive IP packets, with 6 bytes from each block included in each packet, as shown in Figure 1. Other blocks would be added to this group of packets so that each of the 40 IP packets is a reasonable size for transmission. Using a 1500-byte total packet size (including IP headers, etc.), it works out that 240 of these data blocks could be spread across the 40 IP packets. Because 6 bytes from each block are in each packet, the total payload size of each IP packet is 1440 bytes (6 × 240), leaving plenty of room for the required IP headers. These IP packets could then be transmitted across a network even in the presence of errors that cause packets to be lost: a single lost packet removes only 6 bytes from each block, which is within the 8 bytes that the code can repair.
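The interleaving described here can be sketched as follows. The code only shuffles bytes between blocks and packets; it assumes the blocks have already been Reed-Solomon encoded, and the function names and the zero-filling of missing bytes are illustrative choices rather than part of any standard.

```python
BLOCK_LEN = 240                              # n: bytes per encoded block
NUM_PACKETS = 40                             # packets each block is spread over
BYTES_PER_PKT = BLOCK_LEN // NUM_PACKETS     # 6 bytes of each block per packet
NUM_BLOCKS = 240                             # blocks sharing one packet group

def interleave(blocks):
    """Spread every block across all packets, 6 bytes per packet."""
    packets = [bytearray() for _ in range(NUM_PACKETS)]
    for block in blocks:
        for p in range(NUM_PACKETS):
            start = p * BYTES_PER_PKT
            packets[p] += block[start:start + BYTES_PER_PKT]
    return [bytes(p) for p in packets]

def deinterleave(packets):
    """Rebuild the blocks; a lost packet (None) leaves 6 zero bytes per block."""
    blocks = [bytearray(BLOCK_LEN) for _ in range(NUM_BLOCKS)]
    for p, payload in enumerate(packets):
        if payload is None:
            continue                         # packet lost in the network
        for b in range(NUM_BLOCKS):
            src = b * BYTES_PER_PKT
            dst = p * BYTES_PER_PKT
            blocks[b][dst:dst + BYTES_PER_PKT] = payload[src:src + BYTES_PER_PKT]
    return [bytes(b) for b in blocks]

# Demonstration with synthetic block contents (stand-ins for encoded data).
blocks = [bytes((b + i) % 256 for i in range(BLOCK_LEN)) for b in range(NUM_BLOCKS)]
packets = interleave(blocks)

received = list(packets)
received[7] = None                           # simulate one lost IP packet
rebuilt = deinterleave(received)

# Count how many bytes of each block differ after the loss.
damaged = [sum(x != y for x, y in zip(orig, got))
           for orig, got in zip(blocks, rebuilt)]
print(len(packets[0]), max(damaged))
```

Each packet carries 1440 payload bytes, and after the simulated loss no block has more than 6 damaged bytes, so a Reed-Solomon decoder with an 8-byte correction limit would have enough margin to repair every block.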
Forward error correction
Row-column FEC operates by arranging groups of packets in rows and columns and then adding a FEC packet to each row and each column. For example, a 10-row, five-column FEC scheme would add 10 row FEC packets and five column FEC packets to every 50 data packets. In the new SMPTE 2022 standard, these FEC packets are calculated using the exclusive-OR (XOR) function. If any one packet in a row is lost, its value can be recalculated by performing an XOR of all the other packets in the row, including the row's FEC packet. The same applies to columns.
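The XOR recovery property is easy to demonstrate. The sketch below works on bare payload bytes only; real FEC packets also carry headers, which are ignored here, and the tiny 8-byte payloads and the helper name xor_packets are assumptions made for illustration.

```python
from functools import reduce

def xor_packets(packets):
    """Byte-wise XOR of a list of equal-length packets."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*packets))

ROWS, COLS = 10, 5
PKT_LEN = 8   # tiny payloads, for illustration only

# 50 synthetic data packets arranged as 10 rows by 5 columns.
data = [[bytes((r * COLS + c + i) % 256 for i in range(PKT_LEN))
         for c in range(COLS)]
        for r in range(ROWS)]

row_fec = [xor_packets(row) for row in data]                  # 10 row FEC packets
col_fec = [xor_packets([data[r][c] for r in range(ROWS)])     # 5 column FEC packets
           for c in range(COLS)]

# Suppose the packet at row 3, column 2 is lost.
lost_r, lost_c = 3, 2

# Recover it from the rest of its row plus the row FEC packet...
row_survivors = [data[lost_r][c] for c in range(COLS) if c != lost_c]
recovered_from_row = xor_packets(row_survivors + [row_fec[lost_r]])

# ...or from the rest of its column plus the column FEC packet.
col_survivors = [data[r][lost_c] for r in range(ROWS) if r != lost_r]
recovered_from_col = xor_packets(col_survivors + [col_fec[lost_c]])

print(recovered_from_row == data[lost_r][lost_c],
      recovered_from_col == data[lost_r][lost_c])
```

Because XOR is its own inverse, either the row or the column is enough to rebuild a single missing packet; having both lets the receiver repair some patterns with more than one loss.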
With this extra data, it becomes possible to correct burst errors of up to five packets in length and even replace packets that are completely missing. However, there are limits to what can be corrected.
Of course, the added row/column FEC data is fairly costly in terms of bandwidth, adding 15 extra packets for every 50 transmitted, for 30 percent overhead. End-to-end delay of the system is also increased, because the receiver needs to buffer the block of 65 incoming packets in order to check the FECs and correct any errors in the data block. (See Figure 1.)
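The bandwidth and delay costs can be estimated with a short calculation. This is a rough sketch: the 1316-byte payload (seven 188-byte MPEG transport stream packets per IP packet) and the two bit rates are assumptions chosen for illustration, and the delay figure counts only the media time spanned by one block, not network or processing delay.

```python
def fec_block_stats(rows, cols, payload_bytes, media_bps):
    """Return (packets per block, overhead percent, seconds of media per block).

    Hypothetical helper for a row-column FEC block of rows x cols data packets.
    """
    data_packets = rows * cols
    fec_packets = rows + cols              # one FEC packet per row and per column
    overhead_pct = 100.0 * fec_packets / data_packets
    # Media time carried by the data packets of one block; the receiver
    # has to hold roughly this much before it can repair the block.
    block_seconds = data_packets * payload_bytes * 8 / media_bps
    return data_packets + fec_packets, overhead_pct, block_seconds

PAYLOAD = 1316   # assumption: 7 x 188-byte transport stream packets

video = fec_block_stats(10, 5, PAYLOAD, 10_000_000)   # assumed 10 Mb/s video
audio = fec_block_stats(10, 5, PAYLOAD, 128_000)      # assumed 128 kb/s audio
print(video, audio)
```

Under these assumptions the block spans about 53 ms of a 10 Mb/s video stream but more than 4 seconds of a 128 kb/s audio stream, which shows why block-based FEC delay grows as the bit rate falls.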
Packet doubling is a simple error correction scheme that transmits each packet twice from the signal source. These packets do not necessarily have to be transmitted one immediately after the other. The duplicate packets can be sent after a specified amount of delay that depends on the implementation. At the receiver, each incoming packet is examined, and any duplicate packets are discarded. In this way, the signal can get through even in a fairly high loss environment.
While this method does seem wasteful, with an overhead of 100 percent, it has a big advantage over other methods in terms of delay for low bit-rate signals, such as live, compressed audio feeds.
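A minimal sketch of packet doubling follows. It assumes each packet carries a sequence number, that the duplicate is sent a fixed number of packet slots after the original, and that the receiver can remember every sequence number it has seen; a real receiver would bound that memory. The function names and the delay of three slots are illustrative.

```python
def double_packets(payloads, delay=3):
    """Return the transmit order: each packet, then its copy `delay` slots later."""
    stream = []
    for seq, payload in enumerate(payloads):
        stream.append((seq, payload))                      # original
        if seq >= delay:
            stream.append((seq - delay, payloads[seq - delay]))   # delayed copy
    # Flush the copies that were still waiting when the input ended.
    for seq in range(max(0, len(payloads) - delay), len(payloads)):
        stream.append((seq, payloads[seq]))
    return stream

def receive(stream):
    """Keep the first copy of each sequence number and discard duplicates."""
    seen = {}
    for seq, payload in stream:
        if seq not in seen:
            seen[seq] = payload
    return [seen[seq] for seq in sorted(seen)]

payloads = [f"p{i}".encode() for i in range(10)]
stream = double_packets(payloads)

# Simulate a burst loss of three consecutive transmissions.
lossy = stream[:3] + stream[6:]
print(receive(lossy) == payloads)
```

Separating the two copies in time means a short burst loss is unlikely to remove both, which is why the duplicate is usually not sent immediately after the original.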
Automatic packet resending and TCP
Automatic packet resending uses a higher-level protocol to retransmit packets that were lost during transmission. This is extremely beneficial for data transport and is required when no errors can be tolerated, as in the case of a banking transaction or a downloaded software program, where even a single bit error causes major problems.
Packet resending is part of the Transmission Control Protocol (TCP), which is a standard, highly reliable method to transfer data files across a network that doesn't have service assurance guarantees. Unfortunately, TCP leaves much to be desired for transferring video streams, because the throughput of TCP goes down as the round-trip packet delay increases and as the packet loss rate increases. This is caused by two mechanisms used by TCP.
The first mechanism ensures that TCP can handle transmission errors, particularly lost packets. TCP counts and keeps track of each byte of data that flows across a connection, using a field in the header of each packet called the sequence number. As shown in Figure 2, the receiver sends an acknowledgement that indicates the sequence number of the next byte that it is ready to receive. If a packet is lost or arrives out of order, the sequence number in the next packet to arrive will not match the count that the receiver has made of all the previously received bytes. When this happens, the receiver sends an acknowledgement to the sender that indicates the last correct byte received, which obligates the sender to retransmit the missing data.
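The acknowledgement behavior described above can be sketched with a toy receiver. This is a deliberate simplification and not a TCP implementation: it drops out-of-order segments instead of buffering them, and the class and method names are hypothetical.

```python
class CumulativeAckReceiver:
    """Toy receiver that acknowledges the next byte it expects."""

    def __init__(self, initial_seq=0):
        self.next_expected = initial_seq   # running count of bytes received in order
        self.delivered = bytearray()

    def on_segment(self, seq, data):
        """Process one segment and return the acknowledgement number."""
        if seq == self.next_expected:
            self.delivered += data
            self.next_expected += len(data)
        # On a mismatch the segment is ignored (simplification) and the
        # same acknowledgement is repeated, signalling the gap to the sender.
        return self.next_expected

rx = CumulativeAckReceiver()
acks = [
    rx.on_segment(0, b"AAAA"),   # arrives in order
    rx.on_segment(8, b"CCCC"),   # bytes 4..7 were lost, so this does not match
    rx.on_segment(4, b"BBBB"),   # sender retransmits the missing data
    rx.on_segment(8, b"CCCC"),   # and then the data that followed it
]
print(acks, bytes(rx.delivered))
```

The repeated acknowledgement number after the gap is what tells the sender which data to retransmit, and every retransmission costs at least one extra round trip.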
The second mechanism allows TCP to control the flow of data across a connection. The receiver tells the sender how large a buffer it is using, and the sender must not send more data than will fit into that buffer. Whenever the sender determines that the receiver's buffer is full, it delays transmission of new data until the receiver acknowledges that it has processed the data already sent. Whenever the receiver quickly acknowledges the receipt of new data, the sender can gradually increase the flow of data.
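Flow control puts a simple upper bound on throughput: the sender can have at most one buffer's worth of unacknowledged data in flight per round trip. The sketch below computes that bound. It ignores packet loss and every other limit, and the 65,535-byte window (the largest value the basic TCP header can advertise without window scaling) and the round-trip times are assumptions chosen for illustration.

```python
def window_limited_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds

WINDOW = 65_535   # assumption: largest window without the window scale option

short_path = window_limited_bps(WINDOW, 0.010)   # assumed 10 ms round trip
long_path = window_limited_bps(WINDOW, 0.100)    # assumed 100 ms round trip
print(short_path, long_path)
```

With these numbers the ceiling falls from about 52 Mb/s at a 10 ms round trip to about 5 Mb/s at 100 ms, which illustrates why TCP throughput drops as round-trip delay increases.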