Video IP resilience

Nearly every bit error creates a noticeable disturbance in picture and sound. These disturbances typically span several video frames, and can sometimes last several seconds. In IP networks, the key to minimizing these disturbances is implementing effective schemes for IP resilience.

The basis of IP networks

IP networks were developed around two main transport protocols: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). TCP can adapt to the available channel throughput and ensure the delivery of every bit of information by retransmitting packets that did not arrive at the destination. However, TCP is limited to point-to-point transmission and is unsuited to streaming applications where timing properties, such as jitter and latency, must be tightly bounded.

UDP better serves point-to-multipoint applications while enabling more predictable delivery timing and, therefore, is much more suited to the delivery of video over IP on infrastructure networks. Unfortunately, over the years, IP networks developed a large dependency on the resiliency of TCP, and then video arrived.

In some equipment implementations, UDP's failure to support retransmission automatically labeled any UDP traffic as low priority. Furthermore, the conventional approach in deploying IP switches and routers was to use small queues and deal with temporary congestion by dropping packets. Thus, during early video-over-IP trials and deployments, much of the work involved changing the approach from “drop packets upon any slight problem” to “pass the video packets at all costs.”

To handle video, IP switches typically require a much larger buffer capacity, as well as QoS implementation. Although video packets may no longer be dropped because of switching decisions, some are still lost through simple bit errors occurring in physical links between devices.

Traditionally, optical links were designed for bit error rates of 10^-12, which was considered acceptable for time-division multiplexing (TDM) systems, where a single bit error remains just one bit error. On Ethernet links, however, a single bit error causes the loss of a complete Ethernet frame. This may be acceptable when the TCP layer corrects the problem by retransmitting the information, but in a video-over-IP implementation each lost frame typically takes seven MPEG transport stream (TS) packets with it. Because such a frame is roughly 11,000 bits long, a bit error rate of 10^-12 translates into a packet loss rate of about 10^-8. A video-over-IP infrastructure may have multiple links between the source and destination devices, raising the packet drop rate of even a well-designed system to about 10^-7.
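The arithmetic behind these loss rates can be checked with a short calculation. The sketch below is illustrative only: the header overhead (plain Ethernet, IPv4 and UDP, with no RTP or VLAN tag) and the path length of 10 links are assumptions chosen for the example, not figures from the article.

```python
import math

TS_PACKET_BYTES = 188      # MPEG transport stream packet size
TS_PACKETS_PER_FRAME = 7   # TS packets carried in one Ethernet frame
# Assumed overhead: Ethernet header (14) + FCS (4) + IPv4 (20) + UDP (8)
OVERHEAD_BYTES = 46

def loss_rate(ber: float, bits: int) -> float:
    """Probability that at least one of `bits` bits is corrupted."""
    # expm1/log1p keep precision when ber is tiny
    return -math.expm1(bits * math.log1p(-ber))

def path_loss_rate(per_link: float, links: int) -> float:
    """Probability that a frame is lost on at least one of `links` links."""
    return -math.expm1(links * math.log1p(-per_link))

FRAME_BITS = (TS_PACKETS_PER_FRAME * TS_PACKET_BYTES + OVERHEAD_BYTES) * 8
per_link = loss_rate(1e-12, FRAME_BITS)      # on the order of 1e-8
end_to_end = path_loss_rate(per_link, 10)    # 10 links is an assumed path length
```

With these assumptions the per-link frame loss rate lands on the order of one in 10^8, and a ten-link path on the order of one in 10^7, in line with the orders of magnitude quoted above.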

Typical bit error rates on DSL links are several orders of magnitude worse than on optical links, and packet loss rates are correspondingly higher. While IPTV is generally implemented using multicast, the DSL link connecting each subscriber to the system is a point-to-point link, which enables the implementation of unique resiliency techniques.

Forward error correction

Resilience issues have not gone unnoticed by the international community. Efforts to standardize forward error correction (FEC) for transmission of MPEG over IP recently ended with SMPTE's adoption of the Pro-MPEG COP #3 spec as SMPTE 2022-2007. This standard enables re-creation of lost MPEG packets while tuning the trade-off among rate overhead, latency and level of protection.
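The idea of re-creating a lost packet from parity data can be illustrated with a minimal sketch. It uses a single XOR parity packet per group of media packets, which is the general principle behind this family of FEC schemes; it does not reproduce the standard's matrix layout or packet formats, and the function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet protecting a group of equal-length packets."""
    return reduce(xor_bytes, packets)

def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing packet (marked None) in the group."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one XOR parity packet can repair only one loss")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with all surviving packets yields the lost one
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired
```

The trade-off mentioned above shows up directly in the group size: a larger group lowers the rate overhead (one parity packet per group) but lengthens the wait before a loss can be repaired and weakens protection, since only one loss per group is recoverable in this simplified form.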

Because implementation of SMPTE 2022 FEC is rather expensive in terms of hardware resources, it is first emerging in applications requiring high quality and reliability in video-over-IP links, such as in point-to-point broadcast contribution. While nothing prevents implementation of FEC on the massive video infrastructure of telco and cable companies, FEC is limited in that it only addresses random and burst errors, not equipment failure or configuration problems.

Stream redundancy

In the one-way streaming world, recovery is usually based on redundant streaming and reception.

Figure 1 illustrates redundant transmission, or “hot-hot” transmission, in which the same video channel is encapsulated over IP several times. In cable, for example, each headend may stream each video channel twice, using two parallel routes to protect against network failure. To protect against complete headend failure, two headends will stream in parallel, sending two identical streams per video channel to the edge, as indicated in Figure 2.

At the edge, in the case of redundant transmission, video processing devices must be redundancy-aware, automatically detecting failures and switching among video-over-IP streams. The most common implementation is called socket redundancy, in which the edge device detects the lost connection and immediately moves to a redundant stream. Although this mechanism supports automatic recovery from upstream network problems, it still creates visual and audible artifacts lasting from one to a few seconds.

The packet switching solution enables a seamless transition from the failed video (network) socket to a viable socket. Today, most edge devices receive and buffer only the active socket. Once a socket fails, the device tunes to the backup, requests it (in the case of multicast), buffers it and plays it. For a seamless transition, the transmitter device must encapsulate identical video packets and stream them on two separate sockets. Thus, a given video TS packet and its continuity counter are transmitted twice, with both versions arriving at the edge device at about the same time.

The edge device continuously stores the two socket inputs. When it detects a failure, it identifies the matching TS packet location in the buffer, at a point ahead in time of the failure, and continues playout seamlessly. It is worth checking whether the system allows the use of Real-time Transport Protocol (RTP): the RTP sequence number makes this matching more robust against jitter, latency and burst losses.

Figure 3 illustrates a packet switching scenario in which encoder “A” streams two copies of the same service on two separate sockets. Those sockets are sent over IP to the receiver, which stores them in their respective FIFO buffers. The two sockets contain the same video TS packets, but shifted relative to each other by time-variable delays in the IP network. The receiver tracks the primary socket for failures and seamlessly switches to the second socket when it detects errors. This method provides a seamless user experience in the one-way streaming environment, even in the case of network failure.
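A receiver of this kind can be sketched in a few lines, assuming each packet carries an RTP-style sequence number. The sketch below is a simplified variant, not the mechanism of any particular product: rather than switching sockets on failure, it accepts whichever copy of each packet arrives first and discards the duplicate. It ignores 16-bit sequence wraparound and playout timing, and the class and method names are hypothetical.

```python
class SeamlessReceiver:
    """Merge two copies of one stream into a single in-order output."""

    def __init__(self, first_seq: int):
        self.next_seq = first_seq   # next sequence number due for playout
        self.buffer = {}            # seq -> payload, filled from both sockets

    def push(self, seq: int, payload: bytes) -> None:
        """Accept a packet from either socket."""
        # Drop packets already played out or already held from the other socket
        if seq < self.next_seq or seq in self.buffer:
            return
        self.buffer[seq] = payload

    def pop_ready(self) -> list:
        """Return the payloads that can be played out in order right now."""
        out = []
        while self.next_seq in self.buffer:
            out.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return out
```

If the primary socket delivers sequence numbers 100 and 101 and then fails, while the backup socket delivers 100 through 104, `pop_ready()` returns all five payloads in order with no gap and no repeats, provided the buffer is deep enough to absorb the delay difference between the two paths.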

Video retransmission

When the network is inherently flawed in terms of a relatively high bit error ratio (BER), as in DSL, operators can deploy retransmission protocols to ensure resilience. Retransmission is performed with protocols over TCP/IP and works well for end-user devices (such as STBs) that have a large enough buffer and are less sensitive to latency, as in linear services and live TV. Because every loss the STB reports is resent, retransmission protects the stream end to end. The transmitter maintains a per-STB TCP or TCP-like connection and retransmits lost packets at the STB's request. The STB buffer must be large enough to cover the round-trip delay, from detection of a missing packet to delivery of the retransmitted packet. Such implementations exist today in IPTV as standalone servers or as part of edge routers.
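The buffer requirement can be estimated with simple arithmetic: the STB must hold at least as much video as arrives during the time it takes to notice a gap and receive the retransmitted packet. The sketch below assumes a constant-bitrate stream; the function name and the example figures (4 Mb/s, 50 ms round trip, 20 ms detection time) are illustrative assumptions, not values from the article.

```python
def min_buffer_bytes(bitrate_bps: int, rtt_ms: int, detect_ms: int) -> int:
    """Smallest buffer, in bytes, that covers loss detection plus one round trip."""
    # bits arriving during the recovery window, scaled by 1000 to stay in integers
    bit_ms = bitrate_bps * (rtt_ms + detect_ms)
    # divide by 8 bits/byte and 1000 ms/s, rounding up
    return -(-bit_ms // 8000)

# Assumed example: 4 Mb/s stream, 50 ms round trip, 20 ms to detect the gap
example = min_buffer_bytes(4_000_000, 50, 20)
```

A practical design would add margin for jitter and for a second retransmission attempt, which is one reason this approach suits devices that can tolerate added latency.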

Bringing it all together

Generally, FEC is the best approach for long-reach point-to-point links (broadcast contribution) that require low latency and can withstand some additional equipment cost. In point-to-point applications where a relatively high packet drop rate can occur and higher latency can be tolerated, a retransmission approach is preferred.

Point-to-multipoint video infrastructure networks require the resiliency afforded by equipment and link redundancies. Many of these networks already support port-level redundancy, and a select few implement socket-level redundancy. In the near future, packet-based redundancy will be implemented in such networks, finally overcoming the remaining problem of random bit errors.

Adi Bonen is chief technology officer and Gal Garniek is associate vice president, marketing, for Scopus Video Networks.

Challenges in video-over-IP resilience

  • Unprotected, one-way traffic has become the standard. Most video-over-IP traffic erroneously assumes that the network and devices are lossless. Once packet losses do occur, recovery is difficult, as few video-over-IP systems feature built-in recovery mechanisms.
  • The burden of recovery rests on the network perimeter. Network components are not built to identify video-over-IP packet drop. Switches and routers are passive devices in that they multiplex/demultiplex IP packets, but they can and do drop packets. Even when a video flow is defined as the router's highest priority, a packet dropped due to temporary congestion may be recorded but not actively recovered.
  • Large packet bursts drive packet loss and jitter artifacts. Due to the nature of IP streaming, which combines large packets and high rates, large queues form in an arbitrary fashion, resulting in potential buffer overflow in IP switches unfit for video-over-IP streaming. Overflow of the switch's internal FIFO can cause packet loss.
  • The nature of encapsulation results in visual/audio artifacts in the case of packet drop. In all video-over-IP encapsulation schemes, the IP packet size enables encapsulation of multiple MPEG packets, so the dropping of a single IP packet necessarily creates an artifact.