The many facets of IPTV

Over the past few months, there has been increased interest in the use of IP networks to deliver broadcast-quality TV. This article will deal with the provision of service to IP-based set-top boxes within a closed network owned by an operator.

We are not considering streaming video over the public Internet or delivering TV to PCs, although both of these are important topics. We will look at the logical architecture of the applications, as well as the network architecture required to support them. We will concentrate on the sort of architectures typically found in current deployments, but will also consider likely developments.

Why use IP?

Why use 30-year-old technology designed for slow, unreliable, unicast communications to transport video? The answer is that IP is a good example of success breeding success; it has become totally pervasive.

The second motivation is that IP can offer more functionality than traditional TV. The best example of this at the moment is VOD. However, the open, extensible nature of IP holds the promise of many more services in the future. We are already seeing the emergence of peer-to-peer AV file swapping and integrated home entertainment systems. IPTV infrastructure also can be leveraged to provide video telephony and remote monitoring, as well as new uses of video within applications that haven't been thought of yet.

The final factor that is emerging is cost savings. For example, headend manufacturers are replacing serial digital interfaces (SDI) with IP-based gigabit Ethernet because volume production makes the technology less expensive.

Examples of current live deployments are FastWeb, which has built a fibre-to-the-home infrastructure in Italy, as well as Kingston Interactive TV (KIT) from Kingston Communications in the UK, which is providing TV over DSL. British Telecom has announced a hybrid solution in which the STBs will use IP delivery for VOD and digital terrestrial TV for linear channels. Microsoft recently announced eight partnerships worldwide to develop IPTV service provider operations using its proprietary Windows Media Video platform and DRM system.

Application architecture

Figure 1. Application architecture of an IPTV system.

Figure 1 shows the application architecture for a typical IPTV system. (The network layer is not shown in this diagram.) Starting at the top, turn-around channels are received from satellite and terrestrial transmissions in the usual way. The MPEG transport streams are then fed into IP streamers, which send them onto the IP network as Single Programme Transport Streams (SPTS) in IP multicast groups. In principle, it may be necessary to transcode from MPEG into one of the new low bit-rate standards. (This is discussed later.)
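As a rough illustration of what an IP streamer handles, the sketch below reads the fixed 4-byte header of an MPEG-2 transport stream packet (188 bytes, starting with the sync byte 0x47) and extracts the PID and continuity counter. The function names and the sample PID are assumptions made for this example and do not come from any particular product.

```python
TS_PACKET_SIZE = 188   # every MPEG-2 transport stream packet is 188 bytes
SYNC_BYTE = 0x47       # first byte of every transport stream packet


def parse_ts_header(packet: bytes) -> dict:
    """Decode the fixed 4-byte header of one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet must be 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        # The PID is 13 bits: low 5 bits of byte 1 plus all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # The 4-bit continuity counter increments per packet on each PID.
        "continuity_counter": packet[3] & 0x0F,
    }


def make_ts_packet(pid: int, counter: int) -> bytes:
    """Hypothetical helper: build a payload-only packet filled with zeros."""
    header = bytes([
        SYNC_BYTE,
        (pid >> 8) & 0x1F,
        pid & 0xFF,
        0x10 | (counter & 0x0F),  # adaptation_field_control = 01 (payload only)
    ])
    return header + bytes(TS_PACKET_SIZE - len(header))


sample = make_ts_packet(pid=0x100, counter=5)  # 0x100 is an arbitrary example PID
print(parse_ts_header(sample))
```

A streamer producing an SPTS would keep only the packets whose PIDs belong to one programme before sending them to that programme's multicast group.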

Most IP STBs use an HTML browser as the user interface, so the screens, pop-ups and on-screen displays (OSDs) that the viewer sees are generated by an HTTP server at the headend, which is often called the portal. This native HTML capability of the STB, together with an always-on IP network, allows the STB to access walled garden sites offering interactive services.

The portal includes EPG-type functionality and so requires a source of schedule data. This can be taken from an external source or from the schedule data in the incoming streams. The portal also can control STB behavior on a per-unit basis, turning services and channels on and off. Note that this control is not secure in itself, as the portal does not encrypt the services.

How do IP networks for video work?

Some deployments will have a conditional-access system, although the terms “content security” and DRM also are used in this context. VOD servers also may be present. They send video to the STB on unicast connections and receive “VCR control” commands from the STB using Real Time Streaming Protocol (RTSP). In addition, the billing system needs to interface with the portal, the content security system and the VOD servers, as all of these systems manage billable services.
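The “VCR control” commands are plain-text RTSP requests. The sketch below formats a few of them following the RTSP 1.0 message layout (request line, CSeq header, optional Session header, blank line). The server URL and session identifier are hypothetical, and a real client would first issue SETUP to obtain the session.

```python
def rtsp_request(method, url, cseq, session=None, extra_headers=None):
    """Format one RTSP 1.0 request as bytes ready to send to the VOD server."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        lines.append(f"Session: {session}")
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Headers end with CRLF, and an empty line terminates the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Hypothetical asset URL and session identifier, for illustration only.
URL = "rtsp://vod.example.net/movie123"
SESSION = "12345678"

play = rtsp_request("PLAY", URL, 3, SESSION, {"Range": "npt=0-"})
pause = rtsp_request("PAUSE", URL, 4, SESSION)
fast_forward = rtsp_request("PLAY", URL, 5, SESSION, {"Scale": "2.0"})

print(play.decode("ascii"))
```

The video itself does not travel on this control connection. It arrives separately over UDP, as described later in the article.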

Operational support systems (OSS) may also be present in an IP-based system. OSS is really a telecoms concept and refers to the management of services and devices, including provisioning, network management, inventory and workflow. Many current IPTV deployments run without interfacing to the OSS system, but as IPTV becomes more tightly integrated into a telco infrastructure, we would expect closer integration with OSS systems.

We will assume that the reader has some familiarity with the TCP/IP protocol suite and will concentrate on the particular way that IP is used to deliver video. TCP/IP network technology can be thought of as a protocol stack, with each layer doing a particular job. (See Table 1.) Each job needs to be achieved before the next one can be completed:

Link and physical layers: These are responsible for getting the raw data, 1's and 0's, over a physical link, e.g. an Ethernet cable, fibre or wireless link.
Routing layer: The next layer is the internetworking layer, also known as the IP layer. This layer is responsible for getting data to their destinations over several “hops.” It does this by being aware of the topology of the wider network. The IP address is the unique number of a computer (or device) used to identify it on the network, so packets of data can be sent to it.
Transport layer: The transport layer is responsible for reliable, in-sequence delivery of packets.

Although many traditional Internet applications use the TCP protocol for the transport layer, it is not inherently suitable for video delivery and certainly not for multicast. It handles lost packets by re-transmission. This causes latency, which implies buffering at the receiver, and is not tolerable for real-time applications. It doesn't scale well for a multicast environment because the video source would have to maintain two-way sessions with all the receivers and re-transmit different packets to each of them.

All the current IPTV deployments use the User Datagram Protocol (UDP), which is the “lightweight” alternative to TCP. Applications that use UDP must manage lost and out-of-sequence packets themselves.

Unicast vs. multicast

Early on in the development of real-time applications, it became apparent that most applications (voice, video, etc.) would require similar capabilities over and above those provided by UDP, such as time stamping and sequence numbering. These have been standardized into the Real-time Transport Protocol (RTP).

At present, most IPTV deployments rely on networks that have been scaled to ensure that there is always adequate bandwidth, so few packets are lost under normal operating conditions. Therefore, they often don't use RTP because there is no need to check sequence numbers. Neither do they use any form of forward error correction, although we could envisage that this and RTP would be needed in a less controlled environment.
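For a less controlled network, the sketch below shows what checking RTP sequence numbers involves: it unpacks the 12-byte fixed RTP header and counts gaps in the 16-bit sequence number, allowing for wrap-around. The function names are illustrative, and the loss counter is deliberately simple: late packets are ignored rather than subtracted from the count.

```python
import struct

# Fixed RTP header: flags, marker/payload type, sequence, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")


def parse_rtp(packet: bytes) -> dict:
    """Decode the fixed part of an RTP header."""
    if len(packet) < RTP_HEADER.size:
        raise ValueError("packet shorter than the 12-byte RTP header")
    flags, marker_pt, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if flags >> 6 != 2:
        raise ValueError("not RTP version 2")
    return {
        "payload_type": marker_pt & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def count_missing(sequence_numbers) -> int:
    """Count packets skipped in a series of 16-bit sequence numbers."""
    missing = 0
    previous = None
    for sequence in sequence_numbers:
        if previous is not None:
            gap = (sequence - previous) % 65536
            if gap == 0 or gap >= 32768:
                continue          # duplicate or late packet: ignore it
            missing += gap - 1    # gap of 1 means nothing was lost
        previous = sequence
    return missing


# Payload type 33 is the static RTP type for MPEG-2 transport streams.
packet = struct.pack("!BBHII", 0x80, 33, 7, 90000, 0x1234) + bytes(1316)
print(parse_rtp(packet))
print(count_missing([65534, 65535, 0, 1, 4]))   # sequences 2 and 3 never arrived
```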

Most applications on IP networks are currently unicast; a data connection is set up between two hosts and only two hosts. This means that IP networks are well-suited to supporting VOD applications because each session is only directed to one host.

For VOD, the video data is sent to the host (e.g. an STB) over UDP, and the “VCR control” data is sent to the server using RTSP. However, operators typically want to offer traditional broadcast services, or “linear” channels, as well. Except for very small trial systems, this cannot be done using unicast methods, as the duplication of data would overwhelm the network. To deliver traditional TV channels effectively, the IP multicast method is used. This has been part of the IP protocol suite for many years, but was not widely used by applications until video delivery became a requirement.
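A quick calculation shows why unicast does not scale for linear channels. The viewer count, channel count and per-stream bit rate below are assumed figures chosen only to illustrate the arithmetic.

```python
def unicast_load_mbps(viewers: int, stream_mbps: float) -> float:
    """Every viewer gets a private copy of the stream."""
    return viewers * stream_mbps


def multicast_load_mbps(channels_watched: int, stream_mbps: float) -> float:
    """Each channel crosses a given link once, however many viewers it has."""
    return channels_watched * stream_mbps


# Assumed figures for illustration only.
VIEWERS = 10_000
CHANNELS = 100
STREAM_MBPS = 4.0

print(unicast_load_mbps(VIEWERS, STREAM_MBPS))     # 40000.0 Mb/s, i.e. 40Gb/s
print(multicast_load_mbps(CHANNELS, STREAM_MBPS))  # 400.0 Mb/s
```

Under these assumptions the load on a shared link falls by a factor of 100, and the multicast figure stays the same no matter how many more viewers are added.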

IP multicast allows a host to send packets to a “virtual” address that is not directed at a particular host. If an application on a host is interested in that data, it can listen in by requesting the network software to find that data. The multicast address is used as a logical identifier for the content.

Access network technologies

In principle, all the multicast groups (as they are called) are available across the entire network, and any host can consume data from one or more of them at will. In practice, this would not be an efficient use of bandwidth. In the case of a DSL connection, it would vastly exceed the available bandwidth on the link. To manage multicast groups efficiently, hosts must tell their nearest router that they want to listen to a certain group. This is done by using the Internet Group Management Protocol (IGMP). If necessary, the nearest router will then signal further back along the chain to find the required multicast group. Many switches also can “sniff” IGMP (a technique known as IGMP snooping) to switch multicast packets only to those ports that require them.
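On most operating systems an application does not build IGMP messages itself. It asks the IP stack to join a group through a socket option, and the stack sends the IGMP membership report. The sketch below shows this with Python's standard socket module. The group address and port in the usage note are hypothetical examples, and the functions that touch the network are defined but not called here.

```python
import ipaddress
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure: group address, then local interface."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_channel(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join the multicast group carrying a channel."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # This call makes the IP stack send an IGMP membership report.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def leave_channel(sock: socket.socket, group: str) -> None:
    """Leave the group so the network can stop forwarding it to this host."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group))
    sock.close()


print(len(membership_request("239.1.1.1")))   # size of the ip_mreq structure
```

A channel change then amounts to `leave_channel(old_sock, "239.1.1.1")` followed by `join_channel("239.1.1.2", 5000)`, which is why the speed of IGMP processing in the network matters to the viewer.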

Switching and/or routing multicast groups takes a significant time (from 100ms up to seconds). Making this happen fast enough for channel-hopping viewers is a major challenge in the design of IPTV networks, even for viewers accustomed to the delay on an MPEG-2-based digital TV network. Other factors also can slow this process down, such as the time for the video decoder to buffer data and, if conditional access (CA) is being used, the time for the first control word to be generated.

There are three common broadband access network technologies: DSL, cable modem and fibre/Ethernet to the home.

DSL is well-suited to deliver TV services, particularly VOD. It's sometimes easy to forget in this Internet-dominated age that DSL was originally designed to deliver VOD, so this should be no surprise! DSL works well for VOD services as all the bandwidth on the wire is dedicated to one subscriber, although this bandwidth is quite limited and can usually only support one MPEG-2 SD service (normally around 1Mb/s to 2Mb/s).

Cable modems, on the other hand, were not designed to carry TV services, but rather as modems for PC access to the Internet. As the world moves to delivering TV over IP, their architecture will need to change. Manufacturers already are starting to think about how this might happen. The problem with cable modems in their current architecture, e.g. Data Over Cable Service Interface Specification (DOCSIS), is that the downstream capacity of 27Mb/s is shared between all the subscribers on the cable segment, which is usually at least 500, and only 10 or so MPEG-2 video streams can be supported on one DOCSIS downstream.
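The DOCSIS figures above can be checked with a few lines of arithmetic. The 27Mb/s capacity and 500 subscribers come from the text. The 2.5Mb/s per-stream rate is an assumption chosen to be consistent with “10 or so” streams.

```python
def streams_per_downstream(downstream_mbps: float, stream_mbps: float) -> int:
    """Whole video streams that fit into one shared downstream channel."""
    return int(downstream_mbps // stream_mbps)


def share_per_subscriber_mbps(downstream_mbps: float, subscribers: int) -> float:
    """Average capacity per subscriber if everyone is active at once."""
    return downstream_mbps / subscribers


DOWNSTREAM_MBPS = 27.0   # from the text
SUBSCRIBERS = 500        # from the text ("at least 500")
STREAM_MBPS = 2.5        # assumed MPEG-2 stream rate

streams = streams_per_downstream(DOWNSTREAM_MBPS, STREAM_MBPS)
print(streams)                                                   # 10
print(share_per_subscriber_mbps(DOWNSTREAM_MBPS, SUBSCRIBERS))   # 0.054
print(100 * streams / SUBSCRIBERS)                               # 2.0
```

In other words, only about 2 percent of the subscribers on a segment could watch separate unicast streams at the same time, and that would leave nothing for Internet access.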

The obvious evolution of the standard is to allow many downstream channels on one segment and support for cable modem-based hosts to be switched between downstream channels. This capability would need to be integrated with the IP multicast protocols, supporting IGMP requirements from consumer devices. In the meantime, most cable VOD deployments use IP over the cable return channel for delivery of RTSP messages to the server. The video is sent from the server to the modulators over an IP network, and then it is sent down the cable network (typically HFC) using the local video standard (for example, DVB-C or OpenCable).

Figure 2. Fibre to the home network topology.

Fibre or Ethernet to the home provides a dedicated link to each subscriber, so it is also well-suited for IPTV delivery. The links typically have more bandwidth than DSL, ranging from 10 to hundreds of megabits per second. This obviously makes it easier to support multiple STBs in a house.

Figure 2 shows a general architecture of a fibre to the home network. A DSL network would have the access network rings replaced with DSL access modules (DSLAMs) at the telco central offices. A three-tier network architecture is shown here. The access network runs at the neighborhood level. The distribution network transports the traffic within a city, and the core network connects cities. Many current deployments are quite small, and this architecture may be collapsed, or at least have elements co-located. VOD is typically injected at the distribution layer to save bandwidth on the core links, but the IP streaming is usually injected from a single headend. This means that most, if not all, of the multicast groups are present on all the core links all of the time; there is usually at least one viewer for any TV channel somewhere in a city!

Video coding standards

When links or devices fail in an IP network, the remaining devices will search for a new path. A switch will use the spanning tree algorithm to find a new link to the target device. A router will use a routing protocol, typically Open Shortest Path First (OSPF), to discover a new route. With good design, networks can “converge” on a new path in seconds.

The “next-generation codec debate” (MPEG-4 AVC vs. Windows Media Video and the proposed draft SMPTE VC-1 standard) is outside the scope of this article. What is worth exploring is the choice that operators have to make between the mature, established MPEG-2 and the new low bit-rate techniques (MPEG-4 and Windows Media Video). This can be a trade-off between increased headend complexity and network bandwidth. Up to now, most IPTV deployments have used MPEG-2, although this will be changing over the course of 2005.

For turn-around channels that are acquired off-air from digital satellite or terrestrial transmissions, transcoding from MPEG-2 is needed if a different distribution format is used over the operator's network. At the moment, most operators are content to leave turn-arounds in MPEG-2 format as the extra network bandwidth is cheaper. However, this could change as the next-generation codec technologies mature.

Table 1. The TCP/IP protocol stack as used for video delivery.

At present, the new standards present a cost premium at the headend and, more significantly, in the STB. Currently, DVB has approved new guidelines for MPEG-4 AVC video (and high-efficiency AAC audio) codecs, but debate continues over whether the proposed licensing schemes are commercially attractive, technically mature and sufficiently risk-free.

Hardware-based decoders are not expected in volume until spring 2005 for AVC and fall 2005 for WMV, so all the STBs that have an MPEG-4 or WMV capability are still using software decoders, which add about €50 to the cost of each STB. This is a significant cost when multiplied by the size of the deployment. Our prediction is that MPEG-2 will be used on IP networks until there are hardware decoders for the new standards or operators hit a “hard” bandwidth limit. A good example of a hard limit would be the need to carry more than one TV channel over DSL due to competitive pressure.

Quality of service

By quality of service (QoS), we mean reserving resources on the network so that a video flow can travel from the source to the destination without suffering significant degradation.

As mentioned above, all the current deployments use networks that are scaled so that there is no contention for bandwidth and so that no QoS mechanism is required, although some networks use mechanisms to prioritize video packets over other traffic.

In the future, one could imagine a scenario in which local operators might provide access to content owners' servers over third-party networks. This could require an explicit QoS mechanism and probably a mechanism to bill someone for the bandwidth. The provision of end-to-end QoS over complex networks is a huge subject, which we won't tackle here, other than to say that the Internet Engineering Task Force (IETF) has a standard known as the differentiated services (DiffServ) architecture, which is intended to address this issue.
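Differentiated services works by marking each packet's IP header with a six-bit code point (DSCP) that routers can use to select a forwarding class. The sketch below shows how a sender might request such a marking on a UDP socket. The AF41 class is picked only as an example, and whether the network honors or rewrites the marking depends entirely on operator policy.

```python
import socket

DSCP_AF41 = 34   # assured forwarding class 4, low drop precedence (example choice)


def tos_byte(dscp: int) -> int:
    """Place the 6-bit DSCP in the upper bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_marked_socket(dscp: int = DSCP_AF41) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TOS is honored on Linux and most Unix systems; other platforms may ignore it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp))
    return sock


print(hex(tos_byte(DSCP_AF41)))   # 0x88
```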

Security

There are two aspects to security in this context. The first is content security, which is little more than a new name for CA. All the traditional CA vendors have or claim to have solutions for delivering content securely over IP networks.

There is also a new breed of vendors that sell digital rights management (DRM) tools. Although DRM does a similar job to a CA system, it is designed with future applications in mind, many of which do not yet exist for operators starting out in IPTV. DRM technology is often drawn from the IT world, and the solutions are intended to be cross-platform, working on PCs, PDAs and mobile phones as well.

But the main difference between CA, which we know, and DRM, which we are not sure about yet, is that CA protects the pipe through which the program arrives. From there, once decoded, it is not protected. DRM systems protect programs, in chunks or right down to individual MPEG packets, according to rules defined by the operator (in accordance with rights agreements with content providers). The file can be set to no copy or copy once, for example.

Many content security solutions in the IPTV space do not use smart cards and rely on the fact that the headend security servers can maintain a “dialogue” with the security software in the STB. Some deployments do not actually encrypt content, on the basis that video only gets fed down links that the operator allows and, therefore, is not freely available. We predict that content owners will not allow this method to continue for long, as they are becoming increasingly concerned about the potential “Napsterization” of IPTV-delivered video. Such video could be stored on hard disk, encoded in a PVR file, “detached” from the protected environment, transcoded and uploaded to file-sharing services, and then downloaded for free.

The second aspect to security is network security. Whenever you connect an IP network to anyone (even within an organization), it will get attacked, and connecting an IP network to millions of subscribers' homes is a sure way to attract attacks. There is plenty of literature on network security. We can't cover the topic in any depth here, but two aspects of TV over IP networks are worth mentioning in this context.

First, customers expect high availability from a TV service, and service providers must take the effect of likely attacks into account when planning the availability. Second, it is important to remember that the open nature of IP means that subscribers can and will try to spoof the network into supplying service to other devices that attempt to emulate STBs, something that is quite difficult to do on a traditional CATV network. The servers supplying the service must be designed to check the integrity of requests coming in from clients to mitigate this.
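One possible way for a server to check that a request really comes from a provisioned STB is a keyed message authentication code. The sketch below is a minimal illustration using HMAC-SHA256 from Python's standard library. It assumes each STB holds a secret key provisioned by the operator, and the STB identifier, key and request string are hypothetical. It is not a description of how any deployed system works, and it omits replay protection such as a nonce or timestamp.

```python
import hashlib
import hmac


def sign_request(key: bytes, stb_id: str, request: str) -> str:
    """Compute the tag the STB would attach to its request."""
    message = f"{stb_id}|{request}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_request(key: bytes, stb_id: str, request: str, tag: str) -> bool:
    """Server side: recompute the tag and compare in constant time."""
    expected = sign_request(key, stb_id, request)
    return hmac.compare_digest(expected, tag)


# Hypothetical values for illustration only.
KEY = b"per-device secret provisioned by the operator"
STB_ID = "stb-0001"

tag = sign_request(KEY, STB_ID, "PLAY movie123")
print(verify_request(KEY, STB_ID, "PLAY movie123", tag))   # True
print(verify_request(KEY, STB_ID, "PLAY movie999", tag))   # False
```

A device that emulates an STB without the key cannot produce a valid tag, so the server can reject its requests before any service is supplied.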

Conclusion

The existing deployments show that IPTV is technically viable, and renewed interest in the technology promises to make 2005 an interesting year. The challenges for the IPTV industry remain twofold:

To prove that service providers can scale up to large deployments while maintaining the quality that customers have come to expect from satellite DTH services.
To integrate IPTV systems with a telecom-type infrastructure to provide an integrated “triple-play” offering effectively and efficiently.

Of course, the rest of the TV industry has not been standing still, and IPTV will have to meet the challenge of providing HD and interactive services if it is to really become successful.

David Short is a technical architect working on the design of new DTV systems. He also is a member of BroadcastProjects.com, an alliance of independent consultants. For more information, visit www.broadcastprojects.com.
