IPTV's key building blocks

Telcos are competing with cable companies to deliver video content to consumers. However, most telco infrastructures lack the bandwidth that cable companies have for high-quality video distribution. So, instead of distributing content with a digital modulation technique such as QAM and MPEG-2 video encoding, telcos are employing IP networks and newer encoding schemes such as MPEG-4 Part 10 (also called H.264/AVC).

This creates a huge opportunity for equipment manufacturers to supply telcos with this new type of encoder and decoder. This article investigates the technology that is fueling this new IPTV environment. The first portion will look at the video encoding method, and the second portion will focus on the video-over-IP (VoIP) network design that is being used for IPTV.


Telcos are on the offensive to gain a large share of the video market from cable TV providers. Cable multiple service operators (MSOs) have been delivering a triple play of voice, video and data services to consumers for several years. Now telcos are responding in a big way to provide the same triple play by offering not only voice and data, but also high-quality digital TV video via IPTV. IPTV is an emerging technology that allows consumers to watch high-quality digital TV over the Internet via an IPTV set-top box or a PC. Traditional cable companies use an RF signal to carry the digital video by means of QAM.

Technology advancements have made it possible for telcos to deliver the same quality of video via the Internet. The key building blocks on the transmission side are advanced video encoding and VoIP. Advanced video encoding is the most critical building block. The availability of HD content along with SD has created a challenge for telcos, because they still rely on bandwidth-limited twisted-pair copper wiring and usually do not have the luxury of cable's broadband capacity.

A typical HD channel requires 20Mb/s, and an SD channel requires 4Mb/s. Therefore, a bandwidth-efficient video transport mechanism is needed. H.264 (MPEG-4 Part 10) and Microsoft's VC-1 encoders can offer a 2.5X to 3X improvement in bandwidth efficiency over MPEG-2 encoding. Most broadcasters are adopting the H.264 standard rather than the VC-1 standard. The other building block of IPTV transmission is VoIP, which maps or bridges the encoded video data onto the IP network for delivery.
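
The arithmetic behind these figures can be sketched in a few lines of Python. This is a rough illustration only: it assumes the 20Mb/s and 4Mb/s figures are MPEG-2 rates, the 25Mb/s access link is a hypothetical value chosen for the example, and the function names are invented here. IP and transport overhead is ignored.

```python
import math

# Per-channel rates quoted in the article, read here as MPEG-2 figures (assumption).
MPEG2_RATE_MBPS = {"HD": 20.0, "SD": 4.0}
# Efficiency gain range quoted for H.264 and VC-1 over MPEG-2.
GAIN_RANGE = (2.5, 3.0)
# Hypothetical access-link capacity, chosen only for illustration.
EXAMPLE_LINK_MBPS = 25.0


def h264_rate_range(mpeg2_rate_mbps):
    """Return (best_case, worst_case) channel rate in Mb/s after the gain."""
    low_gain, high_gain = GAIN_RANGE
    return (mpeg2_rate_mbps / high_gain, mpeg2_rate_mbps / low_gain)


def channels_per_link(link_mbps, channel_mbps):
    """Whole channels that fit in a link, ignoring IP and other overhead."""
    return math.floor(link_mbps / channel_mbps)


if __name__ == "__main__":
    for name, rate in MPEG2_RATE_MBPS.items():
        best, worst = h264_rate_range(rate)
        print(f"{name}: {rate:.1f} Mb/s MPEG-2 -> {best:.2f} to {worst:.2f} Mb/s")
        print(f"  channels on a {EXAMPLE_LINK_MBPS:.0f} Mb/s link: "
              f"{channels_per_link(EXAMPLE_LINK_MBPS, rate)} (MPEG-2), "
              f"{channels_per_link(EXAMPLE_LINK_MBPS, worst)} (worst-case gain)")
```

Under these assumptions, an HD channel drops from 20Mb/s to roughly 6.7Mb/s to 8Mb/s, and an SD channel from 4Mb/s to roughly 1.3Mb/s to 1.6Mb/s.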

H.264 encoder

H.264 is also known as MPEG-4 Part 10 (ISO/IEC 14496-10) or MPEG-4 AVC. The standard was co-developed by the Joint Video Team (JVT), a group composed of ISO/IEC MPEG members and ITU-T VCEG members. Three profiles (baseline, main and high) have been defined, each with several levels. The main profile is required for broadcast video quality, while the baseline profile is typically used for mobile and video conferencing applications. The H.264 encoder system block diagram includes two dataflow paths: a forward path and a reconstruction, or feedback, path. (See Figure 1.)

H.264 encoding is much more complex than MPEG-2 encoding. For the motion estimation/compensation section, H.264 employs blocks of different sizes and shapes, multiple reference frame selection, and multiple bi-directional mode selection.
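
To give a feel for why variable block sizes add work, the sketch below runs an exhaustive block-matching search using the sum of absolute differences (SAD) as the cost. It is a simplified illustration, not the encoder's actual method: the standard does not mandate a search strategy, real encoders use faster searches and sub-pixel refinement, and the frame data, function names and search range here are invented for the example. The partition sizes listed are the luma partition sizes H.264 allows.

```python
# Luma partition sizes (width, height) allowed by H.264 motion compensation.
PARTITION_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def sad(cur, ref, cx, cy, rx, ry, w, h):
    """Sum of absolute differences between a block in cur and one in ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(h) for i in range(w))


def full_search(cur, ref, cx, cy, w, h, search_range):
    """Try every displacement in the window; return (cost, dx, dy) of the best."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + w > width or ry + h > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, w, h)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 16x16 reference frame (hypothetical test pattern).
SIZE = 16
ref = [[(7 * x + 13 * y) % 256 for x in range(SIZE)] for y in range(SIZE)]
# Current frame: the reference shifted by (2, 1), clamped at the edges.
cur = [[ref[min(y + 1, SIZE - 1)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]

if __name__ == "__main__":
    # The encoder repeats a search like this for each candidate partition size.
    for w, h in [(8, 8), (4, 4)]:
        print((w, h), full_search(cur, ref, 4, 4, w, h, 3))
```

Every additional partition shape and reference frame multiplies the number of such searches, which is one reason the motion estimation stage dominates encoder complexity.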

For the transform section, H.264 uses an integer-based transform that roughly approximates the discrete cosine transform (DCT) used in previous MPEG standards, but avoids the mismatch problem in the inverse transform. Entropy coding can be performed using either a single universal variable-length code (UVLC) table combined with context-adaptive variable-length coding (CAVLC) for the transform coefficients, or context-based adaptive binary arithmetic coding (CABAC).
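
The integer transform can be illustrated with the 4x4 core matrix used by H.264. The sketch below applies the forward transform with integer arithmetic only, then inverts it exactly with rational arithmetic to show that no rounding mismatch arises. The exact rational inverse is an illustration device, not what a decoder does: the standard instead defines an integer inverse transform and folds the scaling factors into quantization. The function names are invented for this example.

```python
from fractions import Fraction

# Core matrix of the H.264 4x4 forward integer transform.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Y = CF * X * CF^T, computed with integers only."""
    return matmul(matmul(CF, block), transpose(CF))


# The rows of CF are orthogonal, so CF * CF^T = diag(4, 10, 4, 10) and the
# exact inverse is CF^T with each column divided by the matching row norm.
ROW_NORMS = [sum(v * v for v in row) for row in CF]
CF_INV = [[Fraction(CF[j][i], ROW_NORMS[j]) for j in range(4)] for i in range(4)]


def inverse_transform(coeffs):
    """X = CF^-1 * Y * (CF^-1)^T, exact because Fractions never round."""
    x = matmul(matmul(CF_INV, coeffs), transpose(CF_INV))
    assert all(v.denominator == 1 for row in x for v in row)
    return [[int(v) for v in row] for row in x]


if __name__ == "__main__":
    # Hypothetical 4x4 block of residual samples.
    block = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    coeffs = forward_transform(block)
    print(coeffs)
    print(inverse_transform(coeffs) == block)
```

Because every forward operation is an integer add, subtract or multiply by 2, the transform also maps well onto simple hardware.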

The H.264 design is complex and compute-intensive, and it requires parallel processing. A general-purpose processor is limited by its internal architecture: If it has eight internal multipliers, it can perform at most eight multiplications per cycle. A programmable logic device (PLD) is flexible and highly scalable: If an algorithm needs 100 multiplications per cycle, the PLD can be programmed with enough parallel multipliers to perform the required task.
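
The multiplier comparison reduces to a simple ceiling division, sketched below. This is a back-of-the-envelope model only: it ignores pipelining, memory bandwidth and clock-rate differences between devices, and the function name is invented for the example.

```python
import math


def cycles_needed(mults_required, multipliers_available):
    """Clock cycles to issue a batch of multiplications, one per multiplier per cycle."""
    return math.ceil(mults_required / multipliers_available)


if __name__ == "__main__":
    # Figures from the text: 100 multiplications, 8 fixed multipliers.
    print("processor with 8 multipliers:", cycles_needed(100, 8), "cycles")
    # A PLD configured with 100 parallel multipliers (illustrative).
    print("PLD with 100 multipliers:", cycles_needed(100, 100), "cycle")
```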


VoIP is the transmission of encoded video transport stream (TS) data over IP-based networks. It bridges between one or more encoded video streams and IP packets carried over 100Mb/s or 1Gb/s Ethernet. The VoIP bridge accepts TS data and encapsulates it for transmission over Ethernet. Various standards define VoIP: real-time transport protocol (RTP), RTP payload format for MPEG video, UDP/IP, Pro-MPEG Code of Practice #3 and DVB-IPI.
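
The encapsulation step can be sketched as follows. The example groups 188-byte TS packets into RTP packets using the fixed 12-byte RTP header and payload type 33 (MPEG-2 TS). Carrying seven TS packets per datagram is a common convention that keeps the packet under a 1,500-byte Ethernet MTU, and is treated here as an assumption rather than something the article specifies. The function names, SSRC value and constant timestamp are hypothetical; a real bridge would derive timestamps from a 90kHz clock and then hand each packet to a UDP socket.

```python
import struct

TS_PACKET_SIZE = 188          # bytes per MPEG transport stream packet
TS_SYNC_BYTE = 0x47           # first byte of every TS packet
TS_PACKETS_PER_RTP = 7        # assumed grouping: 7 * 188 = 1316 payload bytes
RTP_PAYLOAD_TYPE_MP2T = 33    # static RTP payload type for MPEG-2 TS
RTP_VERSION_BYTE = 0x80       # version 2, no padding, no extension, no CSRC


def build_rtp_packet(ts_packets, sequence, timestamp, ssrc):
    """Prefix a group of TS packets with a 12-byte RTP header."""
    for pkt in ts_packets:
        if len(pkt) != TS_PACKET_SIZE or pkt[0] != TS_SYNC_BYTE:
            raise ValueError("not a valid 188-byte TS packet")
    header = struct.pack(
        "!BBHII",
        RTP_VERSION_BYTE,
        RTP_PAYLOAD_TYPE_MP2T,        # marker bit left at 0
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + b"".join(ts_packets)


def packetize(ts_packets, ssrc, first_sequence=0, timestamp=0):
    """Split a list of TS packets into RTP packets with increasing sequence numbers."""
    rtp_packets = []
    for n, start in enumerate(range(0, len(ts_packets), TS_PACKETS_PER_RTP)):
        group = ts_packets[start:start + TS_PACKETS_PER_RTP]
        # Simplification: every packet reuses the same timestamp value.
        rtp_packets.append(build_rtp_packet(group, first_sequence + n, timestamp, ssrc))
    return rtp_packets


if __name__ == "__main__":
    # Fourteen dummy TS packets: sync byte followed by zero padding.
    dummy = [bytes([TS_SYNC_BYTE]) + bytes(TS_PACKET_SIZE - 1) for _ in range(14)]
    out = packetize(dummy, ssrc=0x12345678, first_sequence=100)
    print(len(out), "RTP packets of", len(out[0]), "bytes each")
```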

The input to the VoIP bridge is either a DVB-ASI TS or uncompressed SDI video data, which is mapped onto the Ethernet protocol layer. Figure 2 shows a VoIP reference design block diagram that receives a DVB-ASI TS and converts it to IP. The design includes the following main blocks: TS input logic, frame buffer, queue system, Ethernet-receive DMA, encapsulator, transmit channel information, receive channel information, timestamp, media access control (MAC) interface and host processor interface.

IPTV system summary

In summary, to provide quality VoIP, the latest H.264 video encoding technology is used to conserve delivery bandwidth. Figure 3 shows the overall IPTV transmission system block diagram. The video content can be SD or HD, and it can arrive as uncompressed video or as a previously encoded MPEG-2 TS.

All of these formats are converted to the H.264 video format before transmission. All the key pieces can be implemented efficiently using PLDs for system upgradeability and flexibility.

Further information

Specification details of QAM can be obtained from the International Telecommunication Union (ITU) J.83 Recommendation: www.itu.int/rec/recommendation.asp?type=items&lang=E&parent=T-REC-J.83-199704-I.

Tam Do is the senior technical marketing manager at Altera Broadcast and Consumer BU.