
Taming Ethernet for Video Frame Accuracy


Using Ethernet to move AV files is old news. What is news is using it to transport and switch real-time video streams in a broadcast, venue or post facility. Especially cutting edge is switching Ethernet streams frame accurately, as with an SDI router. True, today there are no standardized ways to do this with off-the-shelf IP (or L2 MAC level) switches. This is about to change.

Some vendors will show frame-accurate stream switching at the NAB Show and each solution will likely use a different non-standardized approach. So cross-product interoperability will be a dream in early 2014. This is expected at this budding stage of the streaming media ecosystem. Vendors feel a need to “get something out” and offer v1.0 products, even if interoperability is lacking. On the bright side, most of the major players have contributed proposals in this space to the “Joint Task Force on Networked Media,” which compiled a report that summarizes the proposals. You can access this at

So which technique will be the “chosen one” to switch AV streams frame accurately as with SDI? This is a very interesting problem, since one goal is to use commodity IP/MAC switches without customization to meet video’s needs. The problem is a bit of an elephant, with different approaches depending on how one looks at it. Remember that lossless, point-to-multipoint, extremely low latency (


In general, there are three methods to stitch or splice (A -> B) two real-time packetized video streams (A, B). These methods are broadly classified as:

1. Source Timed control method
• Frame-accurate effect, but not using the SDI switching method.

2. Switch Timed control method
• As with SDI switching, a frame-accurate stream switch (during the VBI).

3. Destination Timed control method
• Frame-accurate effect, but not using the SDI switching method.

Fig. 1: Examples of Source Timed, Switch Timed and Destination Timed control methods

Incidentally, SDI switching systems use centralized management of resources and guarantee a frame-accurate response, whether triggered by a human touch or by a command from another device. The same is expected when using packetized switches.

Fig. 1 will help you understand the three methods. First, nodes A, B and C are time and sync aware. Second, splice/route commands are sent to nodes or the switch, depending on the method. The small diagram in the upper right shows a frame-accurate splice of video sources A to B as received at C. The splice is directed by the switch control module. All three methods produce the same A to B splice as viewed at C. The video links are marked “Video/IP” but this could also be a Video/MAC (layer 2) mapping.

Let’s consider the Switch Timed control method first, since it is the most familiar and the most similar to SDI routing. One tactic uses the switch API to tell the switch when to update its internal routing table and hence initiate an A/B splice. For example, the controller, using the API, would update the forwarding table of the switch to take effect at some future time Tswitch. This time would fall along video line 7 of the VBI, the same location where an SDI router may switch streams. Another tactic uses a switch-internal programmable FPGA to A/B splice during the VBI. Arista Networks and Thomas Edwards of Fox Network TV have publicly demonstrated this style of switching.
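To make the timing concrete, here is a minimal Python sketch of how a controller might pick Tswitch. It assumes a 1080i/59.94 raster (1125 total lines per frame at 30000/1001 frames per second) and frames that start at a known epoch on the shared clock. The raster values and the function name `next_switch_time` are illustrative assumptions, not part of any switch API.

```python
from fractions import Fraction
import math

# Assumed raster (illustrative, not from the article): 1125 total lines per
# frame at 30000/1001 frames per second, as in 1080i/59.94.
FRAME_RATE = Fraction(30000, 1001)
TOTAL_LINES = 1125
SWITCH_LINE = 7  # the switch point within the VBI named in the text

FRAME_PERIOD = 1 / FRAME_RATE               # seconds per frame
LINE_PERIOD = FRAME_PERIOD / TOTAL_LINES    # seconds per video line


def next_switch_time(now, epoch=Fraction(0)):
    """Return the first start-of-line-7 instant at or after `now`.

    `now` and `epoch` are seconds on the shared clock. Frames are assumed
    to start at epoch + n * FRAME_PERIOD (a hypothetical alignment).
    """
    offset = (SWITCH_LINE - 1) * LINE_PERIOD  # line 1 starts at offset 0
    elapsed = Fraction(now) - epoch - offset
    n = max(0, math.ceil(elapsed / FRAME_PERIOD))
    return epoch + n * FRAME_PERIOD + offset


# Example: a route change requested 100 ms after the epoch.
t_switch = next_switch_time(Fraction(1, 10))
print(float(t_switch))
```

Exact fractions are used so that the 1001 divisor does not accumulate rounding error over many frames. The controller would then hand this time to the switch along with the new forwarding entry, in whatever form the vendor's API accepts.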

Next, let’s consider the Source Timed approach. The basic idea is for the sources to control the state of their exit streams (A, B) so that the destination (C) receives the effect of an A/B splice. Here is one way to do this using multicast IP Groups:

• Assume that two IP multicast groups exist: G1 and G2;

• Destination C subscribes to G1 and G2 simultaneously;

• Source A is outputting its video payload on group G1. Source B is not outputting any video on group G2;

• At time Tswitch source A stops sending any video on group G1 and source B initiates sending its video payload packets on group G2;

• Destination C receives source A (on group G1), then at Tswitch receives source B (on group G2). So the effect is for A then B to arrive in series with no video frame overlap or discontinuity.
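The steps above can be sketched as a small Python simulation. It involves no real networking: frames are counted on a shared time base, and the `Packet` class and the helper functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    group: str   # multicast group the frame was sent to
    source: str  # source that produced the frame
    frame: int   # frame index on the shared time base


def source_output(name, group, frame, active_from, active_until):
    """Emit one frame packet if the source is active for this frame."""
    if active_from <= frame < active_until:
        return [Packet(group, name, frame)]
    return []


def simulate(t_switch, total_frames):
    """Return what destination C receives, frame by frame."""
    subscriptions = {"G1", "G2"}  # C joins both groups up front
    received = []
    for frame in range(total_frames):
        # A sends on G1 until Tswitch; B sends on G2 from Tswitch onward.
        sent = (
            source_output("A", "G1", frame, 0, t_switch)
            + source_output("B", "G2", frame, t_switch, total_frames)
        )
        received.extend(p for p in sent if p.group in subscriptions)
    return received


stream_at_c = simulate(t_switch=3, total_frames=6)
print([f"{p.source}{p.frame}" for p in stream_at_c])
# prints ['A0', 'A1', 'A2', 'B3', 'B4', 'B5']
```

Because only one source is active in any frame period, C never carries more than one stream's worth of traffic, and the switch itself needs no video-specific behavior.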

There are other ways to accomplish a Source Timed approach. Bottom line: The effect of a frame-accurate A/B switch occurs at C with no proprietary central switch design.

Finally, the Destination Timed approach relies on the destination to do the splicing. This may seem odd at first. Fig. 1 outlines the sequence of actions for C to splice A/B. Assume C is receiving stream A only. Then, using out-of-band control, source B is also routed to C. At time Tswitch the destination splices A to B. Then, using out-of-band control, source A is no longer routed to C. This is simple, with one huge disadvantage: There is a 2x bit rate increase at C and through the central switch while A and B are both received at C. The A/B overlap time may be as large as tens of milliseconds. So, the network may need to be overprovisioned by a factor of two for the worst-case condition of all destinations splicing streams at once.
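The cost of the overlap is easy to put in numbers. The sketch below assumes a round 1.5 Gb/s per stream and a 40 ms overlap. Both figures are illustrative assumptions chosen to match "tens of milliseconds", and the function names are hypothetical.

```python
def destination_timed_load(stream_bps, overlap_ms):
    """Peak ingress rate at C and the extra bits carried during the overlap."""
    peak_bps = 2 * stream_bps                     # A and B arrive together
    extra_bits = stream_bps * overlap_ms // 1000  # traffic beyond one stream
    return peak_bps, extra_bits


def required_capacity_bps(stream_bps, destinations, splicing_fraction=1.0):
    """Aggregate capacity when a fraction of destinations splice at once."""
    splicing = destinations * splicing_fraction
    return stream_bps * (destinations + splicing)


# Assumed values: 1.5 Gb/s stream, 40 ms overlap, 10 destinations.
peak, extra = destination_timed_load(stream_bps=1_500_000_000, overlap_ms=40)
worst_case = required_capacity_bps(1_500_000_000, destinations=10)
print(peak, extra, worst_case)
```

With `splicing_fraction=1.0` the result is exactly twice the steady-state load, which is the factor-of-two worst case described above. A facility that can bound how many destinations splice at the same instant could provision for a smaller fraction.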

This method has merit when splicing streams inside a public cloud. There, Ethernet bandwidth peaks are not a big concern, and “pulling streams” from video sources is easy even across non-video-friendly networks. The only strictly timed portion is the destination splice time, Tswitch. There are many ways for a destination to pull video from sources, and using an HTTP GET request is just one.

This is just the tip of the iceberg. The intent is to outline the basics. Time will tell which method(s) is chosen for interoperable media systems.

Al Kovalick is the founder of Media Systems consulting in Silicon Valley. He is the author of “Video Systems in an IT Environment (2nd ed).” He is a frequent speaker at industry events and a SMPTE Fellow. For a complete bio and contact information,