Video over IP

Pay attention to this technology. It will likely be a key part of your future video transport infrastructure.

Video over Internet Protocol (IP) affects just about all video-related industries — broadcast, post, news and live sports. The promise of inexpensive transmission of video, audio and metadata over packetized networks — almost always employing IP — has driven an explosion in popularity of this transmission medium.

Engineers need to be familiar with this technology because it is likely to become a key part of broadcasters’ video transport infrastructure in the near future.

Defining video over IP

In this context, when I use the term video over IP, I mean video that is compressed and transmitted over a packetized network. The video is most often either for contribution or distribution.

Contribution video originates at a remote location, such as at a football stadium. The video is then transported back to a central broadcast network facility for manipulation, branding and packaging.

Distribution video is sent from a central broadcast network facility to other broadcast facilities or cable systems for ultimate transmission to the end user. (Throughout this tutorial, video over IP includes video, audio and data such as subtitles, in addition to metadata.)

Video over IP has been a hot topic lately, and broadcasters are not the only ones who have noticed. Several large carriers, including AT&T and Verizon, have deployed video over IP technology for the transmission of signals into the home. The environment in which this technology operates, and therefore the user requirements for these deployments, are quite different from those of the broadcaster.

In this tutorial, I am not talking about delivery of video content to the home over packetized networks. While this is an interesting topic, and while there are some challenges that have been overcome in creative ways in this area, nothing I say in this tutorial should be construed to apply in the domain of content distribution to the home.

Steps in transmission

You may be familiar with analog video transport over TV-1 lines. TV-1 is an old AT&T term for the transmission of analog NTSC, 525-line video and two audio channels over a terrestrial network. The original system consisted of microwave links and balanced, two-conductor coaxial cable. It used modulators, diplexers, clampers and line amplifiers to maintain signal quality during transmission. These early systems took in analog video and audio, modulated them onto a carrier and transmitted them either over microwave frequencies or down a coax cable.

The steps required for transmission of video over IP are quite different. (See Figure 1.) Typically, serial digital video with embedded audio is presented to a compression device. If the user requires it, forward error correction (FEC) is applied to the compressed data. The data is then encapsulated into packets, and addressing information is added. Then the data is transmitted over the packetized network. The data is received at the destination, addresses on the packets are analyzed, and the data is directed to the appropriate device for de-encapsulation. Once the data has been de-encapsulated, error correction may be applied, and the video and audio data is then decompressed and converted back to a serial digital interface (SDI) stream. Let’s examine each step in some detail.

Figure 1. Steps involved in video over IP transmission


Compression

The compression device removes redundant information, thereby reducing the amount of information that has to be transmitted. Today, the most commonly employed compression methods are MPEG-2 and JPEG 2000. The choice of specific compression settings is critical in transport applications. In MPEG-2, for example, one would likely want to configure video transport compression parameters so that compression efficiency is high. In other words, it’s a good thing if you can get the same quality with a lower number of transmitted bits.

One way to achieve this is to use a large group of pictures (GOP). The GOP setting defines the number of P- and B-frames sent between I-frames. I-frames take up much more bandwidth than the other types of frames, so sending I-frames only occasionally seems like a good idea, and it is, as long as you do not encounter an error in transmission. But using a very large GOP size means that if you lose an I-frame during transmission, you must wait until you receive another I-frame, plus the time it takes the decoder to resynchronize, before a usable video signal is restored. This can result in an outage of a second or more in some configurations. That said, if the transmission error occurs during a P-frame or B-frame, the damage is far more limited and may well go unnoticed.

So the impact of an error may be extreme or inconsequential, depending on where the error occurs in the compressed bit stream. This has caused more than a little frustration for transport equipment designers. The choice of compression parameters, such as GOP size, is a balancing act between reduction of transmitted bandwidth and the impact of errors on the usable video.
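To put rough numbers on this trade-off, here is a minimal sketch in Python. It assumes one I-frame per GOP and treats the worst case as losing that I-frame, so the picture stays unusable for one full GOP plus the decoder's resynchronization time. The function name, the frame rate and the 0.2-second resync figure are illustrative assumptions, not values from any particular encoder.

```python
def worst_case_outage_s(gop_frames: int, frame_rate: float, resync_s: float = 0.0) -> float:
    """Upper bound on the outage, in seconds, when the I-frame of a GOP is lost.

    Assumes exactly one I-frame per GOP, so the next I-frame arrives one
    GOP duration later, and then adds the decoder resync time.
    """
    if gop_frames <= 0 or frame_rate <= 0:
        raise ValueError("gop_frames and frame_rate must be positive")
    return gop_frames / frame_rate + resync_s


NTSC_FRAME_RATE = 30000 / 1001  # about 29.97 frames per second

for gop in (6, 15, 30, 60):
    # resync_s=0.2 is a hypothetical decoder figure, used only for illustration
    outage = worst_case_outage_s(gop, NTSC_FRAME_RATE, resync_s=0.2)
    print(f"GOP of {gop:2d} frames: worst-case outage about {outage:.2f} s")
```

Doubling the GOP size doubles the GOP-duration term, which is why very long GOPs can push the worst-case outage past a second.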

Forward error correction

Next month’s article will focus on real-world challenges, and FEC will be part of that discussion, so I will not spend much time discussing FEC here. Suffice it to say that, if you have a transmission network that is prone to errors, one way to deal with the problem is to send error correction information along with the video and audio data. This information can be used at the receiving end to recreate lost data.
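As a minimal illustration of the idea, the Python sketch below uses a single XOR parity packet per block of equal-length data packets. This is one of the simplest possible schemes and is only an assumption for illustration; deployed FEC schemes are more elaborate. The function names are hypothetical.

```python
from typing import List, Optional


def xor_parity(packets: List[bytes]) -> bytes:
    """Return one parity packet: the byte-wise XOR of equal-length packets."""
    size = len(packets[0])
    if any(len(p) != size for p in packets):
        raise ValueError("all packets in a block must have the same length")
    parity = bytearray(size)
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing packet (marked None) from the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair only one lost packet")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one
        repaired[missing[0]] = xor_parity(present + [parity])
    return repaired


block = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(block)                           # sent along with the data
damaged = [block[0], None, block[2]]                 # second packet lost in transit
print(recover(damaged, parity) == block)             # True
```

The cost is one extra packet per block, and the scheme fails as soon as two packets in the same block are lost.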


Encapsulation

Encapsulation is the process of putting the compressed video and audio data into packets so that it can be sent over a packetized network. (See Figure 2.) Packetized networks rely on a layered approach to networking, in which the functions of each layer of encapsulation are separated from those of the layers above and below it. Encapsulating the data in the various layers provides flexibility by allowing the user to transmit this video over a variety of networks in a standardized way. For a complete discussion of encapsulation, please refer to the Broadcast Engineering June 2007 Computers & Networks article.
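The layering can be sketched in a few lines of Python. The example assumes a common arrangement that this article does not prescribe: seven 188-byte MPEG-2 transport stream packets wrapped in an RTP header and then a UDP header. The port numbers, SSRC value and function names are placeholders chosen for illustration.

```python
import struct

TS_PACKET = 188        # bytes in one MPEG-2 transport stream packet
TS_PER_DATAGRAM = 7    # assumed grouping: 7 x 188 = 1316 bytes


def rtp_wrap(payload: bytes, seq: int, timestamp: int, ssrc: int) -> bytes:
    """Prepend a 12-byte RTP fixed header (version 2, no CSRC, no extension)."""
    # 0x80 = version 2; payload type 33 is the static type for MPEG-2 TS
    header = struct.pack("!BBHII", 0x80, 33, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def udp_wrap(segment: bytes, src_port: int, dst_port: int) -> bytes:
    """Prepend an 8-byte UDP header; checksum 0 means 'not computed' over IPv4."""
    length = 8 + len(segment)
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + segment


# Dummy transport stream packets: sync byte 0x47 followed by zero padding
ts_payload = (b"\x47" + bytes(TS_PACKET - 1)) * TS_PER_DATAGRAM
rtp_packet = rtp_wrap(ts_payload, seq=1, timestamp=0, ssrc=0x1234ABCD)
udp_datagram = udp_wrap(rtp_packet, src_port=5004, dst_port=5004)

print(len(ts_payload), len(rtp_packet), len(udp_datagram))  # 1316 1328 1336
```

Each function knows only about its own header and treats everything handed to it as opaque payload, which is the separation between layers described above.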


Transmission

Transmission in the video over IP world means transferring the IP packets from one location to another using IP. Internet Protocol is a self-routing protocol containing both the source and destination address in each packet. Each packet is transported across the network individually. In IP, there is no concept that these packets are somehow related; this association is made in the higher-level protocols or, in this case, the application.
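To make the point about addressing concrete, the following Python sketch builds a bare 20-byte IPv4 header. Every packet carries its own source and destination address, and nothing in the header links one packet to the next. The addresses come from a range reserved for documentation, and the TTL, flags and function names are illustrative assumptions.

```python
import ipaddress
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"   # 20-byte header without options


def ones_complement_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum used by the IPv4 header (even length input)."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ident: int = 0) -> bytes:
    """Build an IPv4 header carrying UDP (protocol 17) with the DF flag set."""
    fields = [
        0x45,                                # version 4, header length 5 words
        0,                                   # type of service / DSCP
        20 + payload_len,                    # total length
        ident,                               # identification
        0x4000,                              # flags: don't fragment
        64,                                  # time to live (assumed value)
        17,                                  # protocol: UDP
        0,                                   # checksum placeholder
        ipaddress.IPv4Address(src).packed,   # source address
        ipaddress.IPv4Address(dst).packed,   # destination address
    ]
    fields[7] = ones_complement_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


header = ipv4_header("192.0.2.10", "192.0.2.20", payload_len=1336)
print(len(header))                                # 20
print(ipaddress.IPv4Address(header[12:16]))       # 192.0.2.10
print(ipaddress.IPv4Address(header[16:20]))       # 192.0.2.20
```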

It is the network’s job to transport packets from one place to another. Generally, the network has no notion of a nailed-up path — a path from one point to another that is fixed so that all packets flow the same way through the network. Due to network congestion and many other factors, packets may arrive out of order. They may also be lost, of course, and in some cases, packets may even be duplicated in the network. If you are using a dedicated network that has been specifically designed for carrying video and audio, the network has been crafted to eliminate many of these issues by using QoS parameters.
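Receivers commonly cope with reordering, duplication and loss by numbering packets at the sender and resequencing them on arrival. The Python sketch below is one simple way to do that, not a description of any particular product. It assumes each packet carries an integer sequence number, ignores sequence-number wraparound, and uses a hypothetical window parameter to decide when a gap should be treated as a loss.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def resequence(arrivals: Iterable[Tuple[int, bytes]], first_seq: int = 0,
               window: int = 3) -> Iterator[Tuple[int, Optional[bytes]]]:
    """Yield (seq, payload) in order; a payload of None marks a lost packet."""
    pending: Dict[int, bytes] = {}
    next_seq = first_seq
    for seq, payload in arrivals:
        if seq < next_seq or seq in pending:
            continue                      # duplicate, or too late to be useful
        pending[seq] = payload
        while pending:
            if next_seq in pending:
                yield next_seq, pending.pop(next_seq)
            elif max(pending) - next_seq >= window:
                yield next_seq, None      # gap has persisted: declare a loss
            else:
                break                     # keep waiting for the missing packet
            next_seq += 1
    while pending:                        # end of stream: flush what is left
        yield next_seq, pending.pop(next_seq, None)
        next_seq += 1


# Packet 1 arrives late and twice; packet 4 never arrives.
arrivals = [(0, b"a"), (2, b"c"), (1, b"b"), (1, b"b"), (3, b"d"),
            (5, b"f"), (6, b"g"), (7, b"h")]
for seq, payload in resequence(arrivals):
    print(seq, payload)
```

A larger window tolerates more reordering but adds delay, because in-order output has to wait longer before a gap is declared lost.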


De-encapsulation

It is the receiving equipment’s job to detect packets that are addressed to it and to begin the de-encapsulation process. Each layer in the protocol stack unwraps its payload, performs specific tasks on that payload if required and then passes the data up to the next layer.
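The unwrapping can be sketched in Python as the mirror image of encapsulation. The example assumes the payload was carried in RTP over UDP, which is a common arrangement but an assumption here, and it ignores RTP header extensions and padding for brevity. The function names and sample values are illustrative.

```python
import struct
from typing import Tuple


def udp_unwrap(datagram: bytes) -> Tuple[int, bytes]:
    """Strip the 8-byte UDP header; return (destination port, payload)."""
    _src_port, dst_port, length, _checksum = struct.unpack("!HHHH", datagram[:8])
    return dst_port, datagram[8:length]


def rtp_unwrap(packet: bytes) -> Tuple[int, int, bytes]:
    """Strip the RTP header; return (sequence number, payload type, payload)."""
    first, second, seq, _timestamp, _ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = first & 0x0F            # optional 4-byte CSRC entries follow
    payload_type = second & 0x7F         # low 7 bits; top bit is the marker
    return seq, payload_type, packet[12 + 4 * csrc_count:]


# Build a sample datagram so the example is self-contained.
media = b"\x47" + bytes(187)             # one dummy transport stream packet
rtp = struct.pack("!BBHII", 0x80, 33, 7, 90000, 0x1234ABCD) + media
datagram = struct.pack("!HHHH", 5004, 5004, 8 + len(rtp), 0) + rtp

dst_port, rtp_packet = udp_unwrap(datagram)        # UDP layer
seq, payload_type, payload = rtp_unwrap(rtp_packet)  # RTP layer
print(dst_port, seq, payload_type, payload == media)  # 5004 7 33 True
```

Each function inspects only its own header, acts on it and hands the remaining bytes to the layer above.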

Error correction

Error correction recovers data that has been lost during transmission. Extra bits sent along with the payload allow error correction algorithms to recreate missing data. Which error correction algorithm you choose, or whether you decide to use error correction at all, depends largely on the loss characteristics of the transmission network. You should work closely with your equipment vendors and your transport providers to select the appropriate error correction technology for your network.
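A back-of-the-envelope calculation shows why the loss characteristics matter. The Python sketch below assumes the simplest case: one parity packet protecting a block of data packets, with each packet lost independently at a fixed rate. Real networks often lose packets in bursts, so treat the results as optimistic. The function name and loss rates are illustrative.

```python
def p_block_recoverable(n_data: int, loss_rate: float) -> float:
    """Probability that a block of n_data packets plus one parity packet
    arrives with at most one packet missing (independent losses assumed)."""
    n = n_data + 1                                    # data packets plus parity
    none_lost = (1 - loss_rate) ** n
    one_lost = n * loss_rate * (1 - loss_rate) ** (n - 1)
    return none_lost + one_lost


for n_data in (5, 10, 20):
    overhead = 1 / n_data                             # extra bandwidth for parity
    for loss_rate in (0.001, 0.01):
        p = p_block_recoverable(n_data, loss_rate)
        print(f"block={n_data:2d} overhead={overhead:.0%} "
              f"loss={loss_rate:.1%} recoverable={p:.6f}")
```

Smaller blocks recover more often but cost more bandwidth, which is the kind of trade-off to work through with vendors and transport providers.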


Decompression

Decompression reverses the process of compression described earlier, usually recreating the SDI video stream that was presented to the encoder at the beginning of the process.

Looking forward

In the first installment of this two-part tutorial, I introduced the concept of video over IP, defining it and discussing the steps in the process of moving video and audio between facilities. In next month’s tutorial, I will discuss some of the challenges of transmitting video over IP and how the industry is addressing these challenges.

Brad Gilmer is president of Gilmer & Associates, executive director of the Advanced Media Workflow Association and executive director of the Video Services Forum.