Delivering Web video isn’t as easy as 1-2-3. Content must be unique and high quality.
There are very few markets in entertainment and media that can boast double-digit growth for 2008, but it looks certain that IPTV will be one of them. According to a report in September by market researcher Gartner, by the end of 2008 there will be 19.6 million global IPTV subscriptions, a 64 percent increase from 2007. This represents about 1.1 percent of households. By the end of 2012, Gartner forecasts that worldwide household penetration of IPTV will reach 2.8 percent, generating total revenues of €14.7 billion. Further, many large cable system operators make no secret of their plans to move content delivery to IP over the next few years.
In addition to the growth of DSL-based IPTV, the use of IP technology to deliver video to the PC or other devices is growing rapidly. There are two principal drivers for the growth in Web video.
The first is the explosion in use of video portals such as YouTube and Joost as well as social networks like Facebook and MySpace, which provide access to video/audio content outside of the mainstream channels offered by conventional TV. This is complemented by the huge success of the BBC iPlayer in the UK and globally by services such as Apple TV.
The other key driver is cultural. Consumers now expect to watch whatever they want, on the most convenient device, whenever and wherever they want. This expectation has been nurtured by the availability of free on-demand content, which accounts for more than 70 percent of VOD content.
Telco or cable-based IPTV services can further enable this basic consumer need, and IP-based content is an excellent fit, as it's naturally good for offerings such as on-demand services that require robust two-way interactivity. Furthermore, IP-based content can be transmitted to, and experienced on, a variety of IP-compatible devices, either on a fixed or mobile basis.
Despite the bullishness of Gartner's figures, key issues must be addressed before such growth is attained. Telco IPTV offerings need to become more innovative so providers can differentiate themselves not only from the traditional competition of cable and satellite service providers, but also from other IPTV operators.
Most importantly, the content that new IPTV services provide must be generated and delivered to high-quality standards. If Web video or telco IPTV services cannot guarantee content delivery of the highest visual and aural quality, another IPTV operator, satellite or cable company certainly will. The quality issues are even more critical when moving from small to large screens. What looks OK on the Web may look terrible on a 42in plasma screen.
The fundamental quality issues span both video and audio. Media compression must be artifact-free, and A/V synchronization has to ensure correct lip sync. The video encoding must be high quality, as must any transcoding, rescaling or reformatting of content for redistribution. Jerkiness is not acceptable, nor are compression artifacts. The IPTV operator must focus on broadcast-quality delivery to that large-screen display (and associated home theater audio system) and not simply streaming to computer screens. (See Figure 1.) To understand the requirements of IPTV, it is first necessary to understand how to size network bandwidth.
IP infrastructures were originally designed for the delivery of computer data that could be retransmitted (TCP) as well as for voice traffic. In data applications, and to a lesser extent voice, a dropped packet or two is not perceived as a problem. However, when an IP network delivers pictures and synchronized sound, if there is a sudden packet loss or packet reordering, there will be a noticeable service interruption at the set-top box (STB). Operators need to constantly be aware that trying to add the transport of real-time A/V content onto a network already loaded with a lot of computer data or voice data may result in a less than optimal viewing experience for their IPTV customers. Traffic grooming becomes an important aspect of maintaining high quality.
Most IPTV systems today use AVC video with either Dolby Digital or HE AAC stereo (and multichannel) audio. The A/V streams are transported using either MPEG-2 TS over UDP, or increasingly MPEG-2 TS over RTP along with FEC (standardized in SMPTE 2022, which was based on the Pro-MPEG COP3 FEC). While FEC adds some overhead, it allows the receiver to reconstruct the A/V without error after packet reordering or burst loss, provided the damage stays within what the FEC scheme can correct.
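To illustrate how FEC can repair a burst of lost packets, here is a minimal Python sketch. It assumes the simplest column-only arrangement, in which media packets are written row by row into a matrix of L columns and D rows and one XOR parity packet protects each column. The function names are hypothetical, and the RTP and FEC headers that real SMPTE 2022 packets carry are omitted, so treat it as a sketch of the principle rather than of the standard itself.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_column_fec(packets, L, D):
    """Return one XOR parity packet per column of an L x D matrix.

    Packets are numbered row by row, so column c holds packets
    c, c + L, c + 2L, ... That spacing is what lets column FEC
    repair a burst of up to L consecutive lost packets.
    """
    assert len(packets) == L * D
    return [reduce(xor_bytes, packets[c::L]) for c in range(L)]


def recover(received, fec, L):
    """Fill in lost packets (None) in every column with exactly one loss."""
    repaired = list(received)
    for c in range(L):
        column = range(c, len(received), L)
        missing = [i for i in column if received[i] is None]
        if len(missing) == 1:
            survivors = [received[i] for i in column if received[i] is not None]
            repaired[missing[0]] = reduce(xor_bytes, survivors, fec[c])
    return repaired


def fec_overhead(L, D, row_fec=False):
    """Approximate added bandwidth as a fraction of the media rate."""
    return 1 / D + (1 / L if row_fec else 0)


# Assumed payload: 7 TS packets of 188 bytes per datagram (a common choice).
L, D = 5, 4
media = [bytes([i]) * (7 * 188) for i in range(L * D)]
fec = make_column_fec(media, L, D)

damaged = list(media)
for i in range(6, 6 + L):          # a burst of L consecutive losses
    damaged[i] = None

restored = recover(damaged, fec, L)
print("burst repaired:", restored == media)
print("column-only overhead:", fec_overhead(L, D))
```

With L = 5 and D = 4, the parity packets add roughly 25 percent to the media bit rate, and the burst of five consecutive losses is repaired because each lost packet falls in a different column. Two losses in the same column would not be recoverable with column parity alone.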
In addition to their own network considerations, IPTV operators (as relatively new entrants into the content delivery arena) may not have a close relationship with their content providers, and the incoming signals may not always be of the highest quality. By their intrinsic nature, IPTV systems are usually at the end of a potentially long chain of compressed feeds, and operators have to ensure that incoming video and audio quality are not compromised. For example, incoming A/V synchronization may not be good or may vary over time. The IPTV operator who foresees this issue might install A/V retiming equipment between incoming signal reception gear and compression gear to permit proper A/V sync delivery to its customers.
A/V synchronization, more commonly known as lip sync, is a nontrivial issue that has affected the entire broadcast production and delivery system since digital equipment arrived in the 1970s. Both equipment manufacturers and users were initially slow to realize it was a problem, which means that it is still an issue today, made more noticeable by the increased video quality of HDTV. Many production systems do not provide for keeping or restoring A/V sync, so errors arise when content crosses system boundaries. On closer inspection, the problem spans multiple standards development organizations and is codec-agnostic. This has slowed systemic response to the issue, although progress is being made.
Program production requirements that need to be considered include whether or not the video production switchers change video delay depending on the effects selected by operators. Newer switchers maintain a uniform delay, but there are many older switchers still in use that do not. For live events, wireless camera systems often have much more video delay than wired cameras. Media servers almost uniformly separate the video and audio for storage and recombine on playback. Few of them provide a reliable method of ensuring correct synchronization at that point. Even though IPTV operators may not have any control over these factors, they need to be aware of them and be prepared to possibly correct for them in their own facility. (See Figure 2.)
Also, IPTV operators need to ensure that incoming signals maintain their A/V timing. They should also understand that there is neither vertical blanking information nor a vertical ancillary channel in MPEG systems, and the MPEG signal is no longer in the time domain. Some program providers routinely transmit A/V timing test signals at off hours, but many do not.
Measuring A/V offsets
In-service measurements are difficult to carry out, and, further, standards are not in place to help. For example, when video is part of the measurement, the position of the reference point within the image raster makes a big difference. Because a frame takes 40ms in 25Hz systems and about 33.3ms in 30Hz systems, two measurement systems, one measuring at the top of the raster and another in the middle of the raster, could by definition be 20ms or about 16.7ms apart. Also, without suitable standards for expression of these metrics, users cannot reliably compare device specifications, which can have an effect on ultimate system design quality.
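The raster arithmetic is easy to check. The following Python sketch assumes a simple top-to-bottom scan in which time advances linearly with vertical position and blanking is ignored. The function names and the 0.0-to-1.0 position scale are illustrative assumptions, not part of any measurement standard.

```python
def frame_period_ms(frame_rate_hz: float) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def raster_ambiguity_ms(frame_rate_hz: float, pos_a: float, pos_b: float) -> float:
    """Time between two reference points in the raster.

    Positions run from 0.0 (top of raster) to 1.0 (bottom of raster).
    """
    return abs(pos_a - pos_b) * frame_period_ms(frame_rate_hz)


for rate in (25.0, 30.0):
    gap = raster_ambiguity_ms(rate, 0.0, 0.5)   # top vs. middle of raster
    print(f"{rate:g}Hz: frame {frame_period_ms(rate):.1f}ms, "
          f"top-to-middle difference {gap:.1f}ms")
```

Two instruments that both report "zero offset" can therefore disagree by half a frame simply because they reference different points in the picture.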
Existing standards, like ITU-R BT.1359, specify a window of -185ms to +90ms that is now considered loose; they were developed for SD images on a CRT display. (See Figure 3.)
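As a rough illustration of how such a window might be applied to logged measurements, here is a short Python sketch. The sign convention (negative means audio lags video), the sample measurements and the tighter in-house window are all assumptions made for the example; only the -185ms and 90ms limits come from the text above.

```python
# Window quoted above for ITU-R BT.1359, in milliseconds.
# Assumed sign convention: negative = audio lags video.
BT1359_WINDOW_MS = (-185.0, 90.0)


def within_window(offset_ms, window=BT1359_WINDOW_MS):
    """True if a measured A/V offset falls inside the given window."""
    low, high = window
    return low <= offset_ms <= high


def summarize(offsets_ms, window=BT1359_WINDOW_MS):
    """Return (count inside, count outside) for a list of measured offsets."""
    inside = sum(1 for o in offsets_ms if within_window(o, window))
    return inside, len(offsets_ms) - inside


measurements = [-200.0, -120.0, -30.0, 0.0, 45.0, 95.0]  # hypothetical data
tighter = (-60.0, 30.0)  # hypothetical in-house target, not from any standard

print("loose window :", summarize(measurements))
print("tight window :", summarize(measurements, tighter))
```

The same set of measurements can look acceptable or poor depending on the window chosen, which is why the looseness of the existing figures matters.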
Beyond measurement issues, there are basic unintended consequences related to the MPEG Committee's approach of only standardizing the bit stream and not requiring a consistent use of the clocks provided by that bit stream. Because of this lack of specification, practical implementations have synchronized audio and video in diverse ways. Buffer handling varies widely, and overflows and underflows caused by transmission errors are treated differently by decoders. Further, the video and audio decoders are often in physically different chips and may rarely communicate. This means that clock samples may only be shared at startup or at channel change.
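To make the clock issue concrete, the sketch below shows one way a sync error could be computed if both decoders reported when they actually presented a unit. It relies on two facts from MPEG-2 Systems: presentation time stamps (PTS) count a 90kHz clock, and they are 33 bits wide and therefore wrap. Everything else, including the function and parameter names and the idea that both decoders expose their presentation instants against a shared system time clock (STC), is a hypothetical arrangement for illustration.

```python
PTS_CLOCK_HZ = 90_000     # PTS tick rate in MPEG-2 Systems
PTS_WRAP = 1 << 33        # PTS values are 33 bits wide


def pts_diff(a: int, b: int) -> int:
    """Signed difference a - b in ticks, allowing for 33-bit wraparound."""
    d = (a - b) % PTS_WRAP
    return d - PTS_WRAP if d >= PTS_WRAP // 2 else d


def ticks_to_ms(ticks: int) -> float:
    return ticks * 1000 / PTS_CLOCK_HZ


def av_offset_ms(audio_pts, audio_stc, video_pts, video_stc):
    """A/V offset from when each unit was due and when it was presented.

    audio_stc / video_stc: clock value at the moment the unit was
    actually presented. Positive result = audio ahead of video.
    """
    audio_late = pts_diff(audio_stc, audio_pts)
    video_late = pts_diff(video_stc, video_pts)
    return ticks_to_ms(video_late - audio_late)


# Audio presented on time, video presented 3,600 ticks (one 25Hz frame) late.
print(av_offset_ms(1000, 1000, 1000, 4600), "ms")
```

If the two decoders exchange clock samples only at startup or channel change, any drift that develops between them afterward goes unseen by a comparison like this.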
It's not just about technology issues; system management is also vital. New entrants to the IPTV market need to realize why their competitors have significant management and operations staff. Some IPTV operators don't employ any operations staff other than a call center to handle complaints. Others believe that system monitoring only consists of having a large plasma screen and an STB in an office and just watching what is transmitted, rather than installing the necessary test equipment such as real-time MPEG and IP analysis and monitoring equipment. The leading IPTV players have addressed these needs both in terms of management and technology.
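Dedicated analyzers perform many checks, but one of the most basic can be sketched in a few lines of Python. Every MPEG-2 transport stream packet is 188 bytes long, starts with the sync byte 0x47 and carries a 4-bit continuity counter that increments per PID on packets with payload, so a gap in that counter points to lost packets. The sketch below is a simplification: it ignores legitimate duplicate packets and signaled discontinuities, and the function names and test stream are hypothetical.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF


def count_cc_errors(data: bytes) -> dict:
    """Count continuity-counter gaps per PID in a buffer of TS packets."""
    last_cc = {}
    errors = {}
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            raise ValueError(f"sync byte missing at offset {off}")
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        has_payload = bool(pkt[3] & 0x10)
        cc = pkt[3] & 0x0F
        # The counter only advances on packets that carry payload,
        # and it is not meaningful on null packets.
        if pid == NULL_PID or not has_payload:
            continue
        if pid in last_cc and cc != (last_cc[pid] + 1) % 16:
            errors[pid] = errors.get(pid, 0) + 1
        last_cc[pid] = cc
    return errors


def make_packet(pid: int, cc: int) -> bytes:
    """Build a payload-only TS packet with an all-zero payload (test helper)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - 4)


# Hypothetical stream on PID 0x100 with one packet (counter value 3) missing.
stream = b"".join(make_packet(0x100, cc) for cc in (0, 1, 2, 4, 5))
print(count_cc_errors(stream))
```

A check like this runs continuously and unattended, which is something a plasma screen in an office cannot do.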
Another area that exemplifies the difference in operator insight is encoding. Smart operators realize that they simply get what they pay for, and those who have bought the cheapest encoders will then wonder why the end picture looks so bad. Cheap, single-pass encoders will not satisfy a broadcast audience; two-pass encoding is a must. Capable, adaptive noise reduction filters are essential to delivery of high video quality, as operators do not typically have time to dynamically tune for content. All signals carry noise, even HD, and the bottom line is that noise in the picture will take bits away from clean picture coding and detract from the overall quality of the picture.
IP systems need to be able to serve different-sized pictures such as full-screen images as well as thumbnails for the program guide, so operators also need to look at transcoding, rescaling and reformatting of their content for each application. Some regard a transcoder as a low-cost requantization device where video quality may be secondary, and as such transcoding has developed a poor industry reputation.
Furthermore, classic compressed-domain transcoding between MPEG-2 and AVC has proven disappointing in terms of picture quality and bit rate savings. This is attributable to the differences between MPEG-2 and AVC compression technologies. Currently, the best method of achieving high quality is to decode to baseband video, resize or rescale in baseband and compress the result with the new encoder.
State-of-the-art integrated decode and re-encode solutions can deliver the desired bit rate savings at high picture quality. Integrated solutions may also permit reuse of picture coding metrics and do not require additional devices occupying rack space, meaning that installations take up less space and ultimately use less power.
And that in the end is what it is all about. Some firms have looked at the robust growth of IPTV and calculated that all they need in order to take a slice of revenue is to build a headend, turn it on, walk away and collect money. Nothing could be further from the truth. Delivering a successful deployment involves investing appropriately in system design as well as in network design. It also requires the necessary management and operational staff to make sure that the viewers receive a high-quality service capable of competing against the satellite and cable companies.
Patrick Waddell is manager, standards and regulatory, for Harmonic.