
Calculating IP Video Signal Bandwidths for the Studio

Figuring out the amount of bandwidth a video signal requires on an IP network isn’t terribly hard, but it does require some familiarity with the underlying technologies and packet formats. Getting the correct answer is important for tasks such as estimating the number of videos that can be carried over a given network connection or for calculating the costs of a long-haul connection to carry signals between two facilities.

In this column, we’ll look at how much bandwidth will be consumed for a 1080p59.94 video signal when it is transported over two popular formats for uncompressed IP video transport that are available today.

The first format is SMPTE ST 2022-6, which was originally designed for moving uncompressed signals over a long-haul network, including all of the embedded audio and any other signals contained in the HANC and VANC spaces. This format is still popular today because it is very easy to take an SDI signal source (SD, HD or 3G), convert it into ST 2022-6 for transport over an IP connection, and get back exactly the same SDI signal at the destination without changing a single bit. It has been widely implemented by a number of equipment suppliers, and interoperability has been proven at several industry events, including VidTrans, which is hosted by the VSF (Video Services Forum).

The other format, VSF TR-03, is newer and transports each type of media essence (video, audio, etc.) as a separate IP packet stream. This approach eliminates the need to embed and de-embed audio and other signals into SDI streams for transport and, as shown in the following calculations, reduces the amount of IP network bandwidth needed for video transport. TR-03 is based on RFC 4175, which takes groups of pixels and directly maps them into RTP packets. This recommendation is expected to evolve soon into SMPTE ST 2110-20, which will use a similar packet format.


The first step in calculating signal bandwidth is to figure out how much media essence can be transported in each packet. For ST 2022-6, this step is easy: each packet carries a fixed payload of 1,376 bytes. For VSF TR-03, several alternatives are possible, so to simplify calculations, each video line consisting of 1,920 pixels will be divided into four equal parts of 480 pixels each. With 4:2:2 sampling, each pair of pixels requires four 10-bit samples (two luma and two chroma), which equates to 40 bits or 5 bytes. Thus, 480 pixels will occupy (480/2) × 5 = 1,200 bytes.
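The per-packet payload arithmetic above can be sketched in a few lines of Python. The function name and its parameters are illustrative choices rather than anything defined by TR-03 or RFC 4175, and the sketch assumes 4:2:2 sampling with a pixel run that packs into whole bytes.

```python
def payload_bytes_422(pixels: int, bits_per_sample: int = 10) -> int:
    """Bytes of 4:2:2 video essence in a run of pixels (hypothetical helper)."""
    if pixels % 2:
        raise ValueError("4:2:2 sampling groups pixels in pairs")
    # Each pixel pair carries two luma and two chroma samples.
    bits = (pixels // 2) * 4 * bits_per_sample
    if bits % 8:
        raise ValueError("pixel run does not fill a whole number of bytes")
    return bits // 8


ST2022_6_PAYLOAD = 1376                      # fixed payload size, per the text
TR03_PAYLOAD = payload_bytes_422(1920 // 4)  # one quarter of a 1,920-pixel line

print(ST2022_6_PAYLOAD, TR03_PAYLOAD)  # 1376 1200
```

Changing the divisor from 4 to another value is one way to explore the other packetization alternatives the text mentions, as long as the resulting packet still fits the network's maximum frame size.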


As shown in the calculations in Fig. 1, at each layer as packets move through the protocol stack, new headers are added. For ST 2022-6, a High Bitrate Media Header is added. For TR-03 (and soon ST 2110-20) an 8-byte payload header is used. Then, the 12-byte RTP header and 8-byte UDP header are applied, followed by an IPv4 header of 20 bytes. At the Ethernet layer, the standard Ethernet header of 14 bytes is often extended by a 4-byte VLAN tag, and the required 4-byte Frame Check Sequence is appended to the packet, for a total of 22 bytes of overhead. When transmitted over a standard Ethernet link, each frame is preceded by an 8-byte preamble and followed by an inter-frame gap equal in duration to 12 bytes, for a total overhead equal to 20 bytes in duration.
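The header stack described above can be tallied as follows. This is only a sketch: the dictionary keys and function name are made up for illustration, and because the text does not state the size of the ST 2022-6 High Bitrate Media Header, the 8-byte value used here is an assumption (the header without any optional fields). Adjust it if your equipment uses a longer header.

```python
# Overhead shared by both formats, in bytes, as listed in the text.
SHARED_OVERHEAD = {
    "rtp_header": 12,
    "udp_header": 8,
    "ipv4_header": 20,
    "ethernet_header_vlan_fcs": 14 + 4 + 4,
    "preamble_and_interframe_gap": 8 + 12,
}


def wire_bytes(payload: int, payload_header: int) -> int:
    """Bytes of link time consumed by one packet, including all overhead."""
    return payload + payload_header + sum(SHARED_OVERHEAD.values())


HBRM_HEADER_ASSUMED = 8   # ASSUMPTION: size not given in the text
TR03_PAYLOAD_HEADER = 8   # stated in the text

st2022_6_wire = wire_bytes(1376, HBRM_HEADER_ASSUMED)
tr03_wire = wire_bytes(1200, TR03_PAYLOAD_HEADER)
print(st2022_6_wire, tr03_wire)  # 1466 1290
```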


Once the size of each packet is known, the other factor needed to calculate a signal’s bandwidth is the number of packets per second. This needs to be calculated using the original signal rates. 

For ST 2022-6, since the entire 1080p59.94 payload is transported, this calculation must be based on the full video frame. With 2,200 samples per line, 1,125 lines per frame, and 20 bits per sample (in 4:2:2 10-bit sampling), the total number of bytes per frame is 6,187,500. With 1,376 bytes per packet, this translates to 4,497 packets per video frame once the result is rounded up to a whole packet. At 59.94 frames per second, the total packet rate is 269,550 packets per second.
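The ST 2022-6 packet-rate figures can be reproduced with a short script. The constant names are illustrative, and the frame rate is taken as 59.94 as in the text rather than the exact 60000/1001.

```python
import math

SAMPLES_PER_LINE = 2200   # full line, including horizontal blanking
LINES_PER_FRAME = 1125    # full frame, including vertical blanking
BITS_PER_SAMPLE = 20      # 10-bit luma plus 10-bit chroma
PAYLOAD_BYTES = 1376      # fixed ST 2022-6 payload per packet
FRAME_RATE = 59.94

bytes_per_frame = SAMPLES_PER_LINE * LINES_PER_FRAME * BITS_PER_SAMPLE // 8
# A partly filled final packet still counts as a packet, so round up.
packets_per_frame = math.ceil(bytes_per_frame / PAYLOAD_BYTES)
packets_per_second = round(packets_per_frame * FRAME_RATE)

print(bytes_per_frame, packets_per_frame, packets_per_second)
# 6187500 4497 269550
```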

For TR-03/ST 2110-20, only the active video area is transported. Since each packet carries one-quarter of a video line, one full video frame will require 4 × 1,080 = 4,320 packets. At 59.94 frames per second, the stream will consume 258,941 packets per second.
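For the active-video-only case, the packet rate follows directly from the line count and the chosen packets per line. The helper below is a hypothetical name used for illustration, and it again uses 59.94 frames per second as in the text.

```python
def active_video_packet_rate(active_lines: int,
                             packets_per_line: int,
                             frame_rate: float) -> float:
    """Packets per second when only active lines are carried."""
    return active_lines * packets_per_line * frame_rate


packets_per_frame = 1080 * 4
rate = active_video_packet_rate(1080, 4, 59.94)
print(packets_per_frame, round(rate))  # 4320 258941
```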

To get the total bit rate, all that remains is to multiply the packet rate by the size of the packet in bits. As the bottom row of Fig. 1 shows, the bandwidth of a 2022-6 signal is about 200 Mbps higher than the nominal 2.97 Gbps of the SDI signal that carries 1080p video, whereas the TR-03/ST 2110-20 stream is almost 300 Mbps less than the raw SDI.
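Putting the pieces together, the final multiplication might look like the sketch below. It reuses the packet sizes and rates derived in the text, with one assumption flagged in the code: the ST 2022-6 High Bitrate Media Header is taken as 8 bytes, which the text does not specify. A longer header would raise the ST 2022-6 result slightly.

```python
def bit_rate_bps(wire_bytes_per_packet: int, packets_per_second: int) -> int:
    """Link bandwidth consumed, in bits per second."""
    return wire_bytes_per_packet * 8 * packets_per_second


SDI_NOMINAL_BPS = 2_970_000_000
SHARED_OVERHEAD = 12 + 8 + 20 + 22 + 20   # RTP, UDP, IPv4, Ethernet, preamble/gap

# ASSUMPTION: 8-byte High Bitrate Media Header (size not given in the text).
st2022_6_bps = bit_rate_bps(1376 + 8 + SHARED_OVERHEAD, 269_550)
tr03_bps = bit_rate_bps(1200 + 8 + SHARED_OVERHEAD, 258_941)

for name, bps in (("ST 2022-6", st2022_6_bps), ("TR-03", tr03_bps)):
    delta_mbps = (bps - SDI_NOMINAL_BPS) / 1e6
    print(f"{name}: {bps / 1e9:.3f} Gbps ({delta_mbps:+.0f} Mbps vs nominal SDI)")
# ST 2022-6: 3.161 Gbps (+191 Mbps vs nominal SDI)
# TR-03: 2.672 Gbps (-298 Mbps vs nominal SDI)
```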


The only other high-bandwidth signals that are commonly found in a modern production facility are audio signals. In ST 2022-6, the audio signals are carried inside the SDI payload, so there is no extra bandwidth required for audio (provided the number of audio channels is less than what the SDI can carry).

In TR-03/SMPTE ST 2110, more bandwidth will need to be allocated for audio, although, with a 48 kHz, 24-bit stereo signal occupying less than 3 Mbps, audio streams are generally not a major burden on a gigabit-class network.
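A rough check of the audio figure is sketched below. The text gives only the sample rate, bit depth and channel count, so the packetization is an assumption: 1,000 packets per second (1 ms of audio per packet), no extra payload header, and the same RTP, UDP, IPv4 and Ethernet overhead used for video.

```python
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 24
CHANNELS = 2

PACKETS_PER_SECOND = 1000                  # ASSUMPTION: 1 ms of audio per packet
OVERHEAD_BYTES = 12 + 8 + 20 + 22 + 20     # same stack as the video packets

# Raw audio essence, before any packet overhead.
essence_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

samples_per_packet = SAMPLE_RATE_HZ // PACKETS_PER_SECOND
payload_bytes = samples_per_packet * CHANNELS * BITS_PER_SAMPLE // 8
wire_bps = (payload_bytes + OVERHEAD_BYTES) * 8 * PACKETS_PER_SECOND

print(essence_bps, payload_bytes, wire_bps)  # 2304000 288 2960000
```

Under these assumptions the stream stays just under 3 Mbps on the wire. Shorter packet times would push the overhead share up, since each packet carries the same fixed headers.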


The actual amount of bandwidth allocated (i.e. the CIR or Committed Information Rate) in any network connection that carries an IP video signal needs to be greater than the raw bit rate calculated in this article. In particular, due in part to the bursty nature of video (blocks of pixels with gaps where the VANC would be), additional bandwidth should be provisioned through each network hop above and beyond the amounts calculated in this article. Since the recommended amount of added bandwidth is currently being studied by the SMPTE committee, this will have to be the subject of a future column.

Wes Simpson is active in standards development and technology training. Please visit telecompro.tv for more information.