Video over IP

Delivering high-quality video content is one of the latest and most demanding challenges faced by the Internet Protocol (IP). The combination of time-sensitive delivery, low-loss transport and wide-ranging bandwidth requirements makes video over IP seem like an intractable problem for a connectionless, best-effort protocol.

IP and its networks have evolved to support more than basic data transfers. The first foray into supporting time-sensitive communications was with the deployment of voice over IP (VoIP) services, and this was very successful. At the same time, the rapid evolution of underlying technologies provided large amounts of bandwidth in all areas of IP infrastructures, from 10Gb/s Ethernet links in the core to multiple megabits per second in the broadband access layer. These steps created the conditions for the consolidation of communications needs, the so-called triple-play (data, voice and video), over the IP networks. IP is now pushing the service paradigm into a fourth dimension: mobility. Quad-play opens the door to new means of delivering video content as well.

Packaging video with voice and data services provides benefits beyond the cost reduction of using a common infrastructure. New features and capabilities can be implemented through the interaction between these three services. Ads customized for the user or region can be inserted in a subscriber's preferred video content. Incoming voice call information can be delivered on-screen, while the video transport capabilities of the environment can be leveraged for video telephony.

IP has already been a catalyst for new ways of delivering video content. Over-the-top video would not exist without the Internet and IP. While it is not entertainment-grade, over-the-top video has rapidly grown in popularity. In the end, however, regardless of service models and delivery methods, video represents a new and demanding playing field with specific requirements. Video over IP brings both opportunities and a wide range of technical and business challenges for IP networks and for service providers. Next-generation networks (NGNs), whether enterprise or service provider, must be designed to support video services over IP alongside data and voice.

General concepts and service requirements

There are several service types and service models for delivering video content over IP. It can be delivered on demand (VOD) and unicast to the subscriber, or it can be a broadcast program multicast to a group of subscribers. It can be sent by service providers over their own infrastructure, with full control of the networking resources, or it can be delivered over the Internet, crossing multiple administrative domains.

In all these service models, users demand a high quality of experience (QoE), which is a multifaceted metric of the service quality. The QoE is related to the quality of the image, which depends on encoding, delay variations and packet loss during transport. It is also connected to the impact of failures and the speed of changing channels. Thus, it is important to understand some of the common factors influencing the QoE.

Regardless of the service model, a common challenge for video content delivery is the large amount of information being delivered, which translates into significant bandwidth needs. To make this challenge more manageable, several encoding mechanisms have been developed, primarily within the ISO/IEC Moving Picture Experts Group (MPEG) and ITU-T Video Coding Experts Group (VCEG), to reduce the bandwidth requirements and to deal with data loss. Different encoding mechanisms are best suited for each service model. (See Table 1.)

The most popular encoding mechanisms today are MPEG-2 for managed video services and MPEG-4 Part 10/H.264 for over-the-top and business services. Even with the help of encoding, video services consume significant network resources, up to almost 20Mb/s for an MPEG-2-encoded HDTV channel.

While high compression reduces bandwidth use, it makes the packet loss requirements more stringent. Dropping a single IP packet carrying an MPEG I-frame can lead to pixelation, macro-blocking or loss of a picture frame, thus significantly degrading the viewing experience. For this reason, video service providers demand a maximum loss of one packet in a million from their networks.

The encoded video content can be transported over IP directly via User Datagram Protocol (UDP) or via Real-time Transport Protocol (RTP). Single programs can be transported in a single-program transport stream (SPTS) or, for efficiency reasons, multiple programs can be multiplexed into a multiple-program transport stream (MPTS). For a good viewing experience, these streams should experience delay variations of no more than 150ms, above which packets are considered dropped. The control plane protocols leveraged in the delivery of the video content, either unicast- or multicast-based, should also experience minimal delays, and the channel setup time should be less than 500ms.
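The RTP option described above can be sketched in a few lines of Python. This is a minimal illustration, not a production packetizer: it assumes the common practice of carrying seven 188-byte transport stream packets per datagram (1,316 bytes, which fits a 1,500-byte Ethernet MTU), uses the static RTP payload type 33 assigned to MPEG-2 transport streams, and keeps the timestamp fixed for simplicity. The SSRC value and the function names are hypothetical.

```python
import struct

TS_PACKET_SIZE = 188   # bytes per MPEG-2 transport stream packet
TS_SYNC_BYTE = 0x47    # first byte of every transport stream packet
TS_PER_DATAGRAM = 7    # assumption: 7 x 188 = 1316 bytes fits a 1500-byte MTU
RTP_PT_MP2T = 33       # static RTP payload type for MPEG-2 transport streams


def rtp_header(seq, timestamp, ssrc, payload_type=RTP_PT_MP2T):
    """Build the 12-byte fixed RTP header (version 2, no padding, extension or CSRC)."""
    first_byte = 2 << 6                 # version=2, P=0, X=0, CC=0
    second_byte = payload_type & 0x7F   # marker bit left at 0
    return struct.pack("!BBHII", first_byte, second_byte,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def packetize(ts_bytes, ssrc=0x1234ABCD, start_seq=0, timestamp=0):
    """Split a transport stream into RTP packets of up to seven TS packets each.

    The timestamp is held constant here; a real sender would derive it
    from a 90 kHz clock. The default SSRC is an arbitrary illustrative value.
    """
    if len(ts_bytes) % TS_PACKET_SIZE != 0:
        raise ValueError("transport stream length must be a multiple of 188")
    chunk = TS_PACKET_SIZE * TS_PER_DATAGRAM
    datagrams = []
    for i, offset in enumerate(range(0, len(ts_bytes), chunk)):
        payload = ts_bytes[offset:offset + chunk]
        # Every TS packet in the payload must start with the sync byte.
        if any(payload[j] != TS_SYNC_BYTE
               for j in range(0, len(payload), TS_PACKET_SIZE)):
            raise ValueError("transport stream sync byte missing")
        datagrams.append(rtp_header(start_seq + i, timestamp, ssrc) + payload)
    return datagrams
```

Each returned item would be handed to a UDP socket as one datagram. Sending the same payloads without the 12-byte header corresponds to the plain UDP option; the RTP sequence number is what lets a receiver detect loss and reordering.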

Quantitatively, the goal for a good viewing experience is to have subscribers experience less than one artifact over a two-hour-long movie. To achieve this goal, it is important to understand all the factors affecting QoE as well as their probability of occurrence in real networks. For example, some claim that an important requirement in designing networks for video services is to have a recovery time of less than 50ms on link failure. While recovery time is indeed a factor affecting QoE, experience shows that in real networks, link failures cause only 7 percent of the service outages. Designing IP networks for video services requires a comprehensive analysis of the impact and weight of all factors affecting QoE.
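To relate the one-artifact-per-movie goal to a packet loss ratio, a rough calculation helps. The sketch below rests on two simplifying assumptions that are not from the measurements cited here: each IP packet carries seven 188-byte transport stream packets, and every lost IP packet produces a visible artifact (a worst case, since decoders and error concealment often hide some losses). The function names are illustrative.

```python
TS_PACKET_BITS = 188 * 8   # bits in one MPEG-2 transport stream packet
TS_PER_IP_PACKET = 7       # assumption: seven TS packets per IP packet


def ip_packets_per_second(stream_bps):
    """IP packets per second needed to carry a stream of the given bit rate."""
    return stream_bps / (TS_PER_IP_PACKET * TS_PACKET_BITS)


def expected_losses(stream_bps, duration_s, loss_ratio):
    """Expected number of lost IP packets over the duration."""
    return ip_packets_per_second(stream_bps) * duration_s * loss_ratio


def max_loss_ratio(stream_bps, duration_s, max_losses=1.0):
    """Highest loss ratio that keeps expected losses at or below max_losses."""
    return max_losses / (ip_packets_per_second(stream_bps) * duration_s)


if __name__ == "__main__":
    two_hours = 2 * 60 * 60
    for rate in (3e6, 20e6):   # the MPEG-2 range quoted in Table 1
        print(rate, expected_losses(rate, two_hours, 1e-6),
              max_loss_ratio(rate, two_hours))
```

Under these assumptions, a 20Mb/s stream carries roughly 13.7 million IP packets in two hours, so even a one-in-a-million loss ratio leaves several lost packets per movie unless something repairs or conceals them. This is one reason the network loss target is usually paired with loss-recovery and concealment mechanisms.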

IP network architectures for delivering video service

From an IP network design perspective, the most interesting case is that of video services offered by a service provider over its own infrastructure. In this situation, all the elements participating in the service delivery can be designed and deployed in a way that optimally supports video content distribution. By contrast, in the case of over-the-top service, the means available to offer the best service across multiple administrative domains are limited. For this reason, we will focus on the end-to-end network design considerations for the infrastructure of a service provider offering video to its subscribers.

As highlighted earlier, out of all the service types in triple-play, video is the most demanding on a network's resources. For this reason, new networks supporting video content delivery are designed based on the requirements of this service. This typically ensures readiness for the additional voice and data services. The future service provider network must support flexible, interactive, content-rich video services across a wide range of access technologies with a superior quality of experience. Moreover, in the context of quad-play, service providers must be able to extend the service to mobile devices and support roaming.

The key components of the video over IP architecture include:

  • Super headend (SHE): Live video feeds and real-time encoding of video broadcasts typically originate from one or two SHEs, where asset distribution systems are also located for on-demand services along with back-end systems.
  • Video hub office (VHO): Operators typically maintain a few dozen regional VHOs, usually in metropolitan areas, serving 100,000 to 500,000 homes. VHOs often contain real-time encoders for local television stations, as well as the network routers that connect the distribution network to the network core. They also typically house most content servers used for VOD services.
  • Central office and video switching office (VSO): Central offices and VSOs house aggregation routers that combine traffic to and from subscriber homes.

A schematic representation of this generic architecture is shown in Figure 1 on page 62. This architecture can be implemented in both a centralized and a distributed fashion to address scalability concerns.

The implementation of this architecture requires, among other things:

  • thorough bandwidth budgeting;
  • high capacity, high forwarding performance routers handling both unicast and multicast traffic;
  • fast converging IP unicast routing design for VOD;
  • fast converging source-specific multicast-based (SSM-based) design for broadcast video service with special design considerations for channel zapping;
  • QoS design based on rigorous differentiated services (DiffServ) or integrated services (IntServ);
  • redundancy and channel setup time improvements, typically achieved by sending the video content over two different paths to a point as close as possible to the subscriber; and
  • a set of tools and systems that facilitate the process of operating and monitoring the services.
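The first item in this list, bandwidth budgeting, comes down to simple arithmetic once the difference between multicast and unicast delivery is taken into account. The sketch below is a simplified model with hypothetical function names and example figures: a broadcast channel crosses a link once no matter how many subscribers watch it, while every VOD session needs its own copy.

```python
def broadcast_bps(channels_on_link, channel_bps):
    """Multicast load: each distinct channel crosses the link once."""
    return channels_on_link * channel_bps


def vod_bps(concurrent_sessions, session_bps):
    """Unicast load: every VOD session carries its own copy of the stream."""
    return concurrent_sessions * session_bps


def link_budget(capacity_bps, channels_on_link, channel_bps,
                vod_sessions, vod_session_bps, other_bps=0.0):
    """Compare the offered load on one link with its capacity.

    other_bps stands for voice and data traffic sharing the link.
    """
    used = (broadcast_bps(channels_on_link, channel_bps)
            + vod_bps(vod_sessions, vod_session_bps)
            + other_bps)
    return {
        "used_bps": used,
        "headroom_bps": capacity_bps - used,
        "fits": used <= capacity_bps,
    }


if __name__ == "__main__":
    # Hypothetical example: a 10Gb/s link carrying 200 broadcast channels
    # and 250 concurrent VOD sessions, all at 20Mb/s.
    print(link_budget(10e9, 200, 20e6, 250, 20e6))
```

The model makes the scaling behavior visible: broadcast load is bounded by the size of the channel lineup, whereas VOD load grows with the number of simultaneous viewers, which is why VOD content servers are typically placed close to subscribers.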

The details of the implementation reflect the characteristics of the access layer technologies, the topology type (ring vs. multipoint) and the bandwidth availability at various points in the network. Nevertheless, in the end, service challenges still occur from things such as the bandwidth bottlenecks in the access layer or scalability limitations. Additional tweaks can be made based on the traffic profiles. IP networks can be tailored to the asymmetry between the upstream and downstream traffic of triple-play users and have more resources allocated to the downstream traffic.

In the case of bandwidth-demanding video services, it is easy to oversubscribe the access layer. A subscriber browsing through multiple broadcast HDTV channels can quickly draw enough multicast streams into the access layer to saturate the bandwidth available to a group of aggregated subscribers. For this reason, call admission control (CAC) is an important tool in IP infrastructures that are shared by multiple subscribers and that multiplex several other services besides video.
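The admission decision itself can be illustrated with a small bookkeeping model. The following is a sketch under stated assumptions, not a description of any vendor's CAC feature: the class name, the channel identifiers and the capacity figure are hypothetical, and the model tracks only multicast video on a single shared access link.

```python
class AccessLinkCAC:
    """Admit multicast channels onto a shared link only while capacity remains."""

    def __init__(self, capacity_bps):
        self.capacity_bps = capacity_bps
        self.viewers = {}   # channel id -> number of subscribers watching
        self.rates = {}     # channel id -> stream rate in b/s

    def used_bps(self):
        """Bandwidth consumed by the channels currently on the link."""
        return sum(self.rates.values())

    def join(self, channel, rate_bps):
        """Return True if the join is admitted, False if it is rejected."""
        if channel in self.viewers:
            # The stream is already on the link, so another viewer is free.
            self.viewers[channel] += 1
            return True
        if self.used_bps() + rate_bps > self.capacity_bps:
            return False
        self.viewers[channel] = 1
        self.rates[channel] = rate_bps
        return True

    def leave(self, channel):
        """Release the channel's bandwidth when its last viewer leaves."""
        if channel not in self.viewers:
            return
        self.viewers[channel] -= 1
        if self.viewers[channel] == 0:
            del self.viewers[channel]
            del self.rates[channel]
```

For example, on a hypothetical 50Mb/s shared link, two 20Mb/s HDTV channels are admitted and a third distinct channel is rejected, while a second viewer of an already-admitted channel is accepted at no extra cost. Rejecting the join is preferable to admitting it, because an oversubscribed link degrades every stream on it rather than just the new one.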

Finally, the expansion of the IP-based environment into quad-play services creates the challenge of offering video content to mobile devices. Regardless of how well-tuned the network is to support the service, new operational circumstances simply require new concepts and mechanisms for content delivery. How will the viewing experience be affected when a mobile device receiving a broadcast program briefly goes through a tunnel? In these new situations, the IP network might enlist the help of new delivery concepts, such as those where a small amount of redundant information can hide the packets dropped while the device is in a tunnel without signal.
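One simple form of such redundancy is forward error correction with an XOR parity packet. The sketch below is offered only as an illustration of the idea, not as the specific mechanism used by any mobile video system: the sender adds one parity packet to each group of equal-length packets, and the receiver can rebuild any single packet missing from that group. The function names are hypothetical.

```python
def xor_parity(packets):
    """Return the byte-wise XOR of a group of equal-length packets."""
    if not packets:
        raise ValueError("need at least one packet")
    size = len(packets[0])
    if any(len(p) != size for p in packets):
        raise ValueError("all packets must have the same length")
    parity = bytearray(size)
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover(received, parity):
    """Rebuild a group in which exactly one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one loss per group")
    survivors = [p for p in received if p is not None]
    repaired = list(received)
    # XOR of the survivors and the parity packet equals the lost packet.
    repaired[missing[0]] = xor_parity(survivors + [parity])
    return repaired
```

A single parity packet repairs only one loss per group, so it would not by itself cover a long outage such as a tunnel. Schemes aimed at burst loss generally spread each group over a longer time span through interleaving or use stronger codes, trading extra delay and bandwidth for resilience.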


Video delivery over IP presents multiple business and service opportunities. Its deployment, however, is not a trivial matter. Of the three services in triple-play, video demands the most from the networks, so it greatly affects the design of NGNs. With well-designed IP networks and the help of appropriate service management tools, several service providers have successfully deployed and now operate large-scale video over IP services offering HDTV content. With an ever-improving viewing experience, over-the-top content providers are reaching the growing Internet population not only as viewers but also as content producers. And all these services are running over IP.

Ciprian Popoviciu, PhD, CCIE, is a technical leader within the Networked Solutions Integration Test Engineering group at Cisco Systems. He is also a senior member of the IEEE.

Table 1. Video encoding mechanisms

Encoding            Bandwidth          Quality   Use
MPEG-1              0.5Mb/s-1.5Mb/s    VCR       Business services
MPEG-2              3Mb/s-20Mb/s       Studio    Broadcast/HDTV services
H.261/H.263         64kb/s-2Mb/s       Video     VOD services
MPEG-4(p10)/H.264   <64kb/s-4Mb/s      Video     Internet and business services