New Dimensions in Media Streaming

Al Kovalick

Audio/video streaming is the heart and soul of the media facility—think Serial Digital Interface, AES3 for audio, HDMI and other transports that move audio and video from point to point in a timed, continuous fashion.

As previously mentioned in this column, IT methods and especially Ethernet are destined to replace/augment SDI over time. There is much industry activity to create the best practices and standards that will enable interoperable AV connectivity using Ethernet/IP. The new methods will be cloud-friendly in ways that SDI could never match.

We’ve all gone shopping to replace some worn-out item. One choice is to find a replacement that is nearly a clone of the original. Or, enticed by the new, we look for an item with the latest bells and whistles. We are now standing at the door of opportunity for creating an “Ethernet/IP AV transport” ecosystem. Should it be plain vanilla or a souped-up version?

The new networked transport version has many novel features lacking in SDI (and AES3 audio). See Fig. 1; is this the perfect set of characteristics? Should they be different? Let’s not debate it here. Based on my involvement in industry forums (especially the Joint Task Force on Networked Media), this set covers the main enhancements that many want to see. There is no room here to describe each new element in detail, so the approach is to review them now and expand in future columns.

At 12 o’clock in the figure, see “New stacks.” This relates to the layer 2 (MAC) and layer 3 (IP) data layering over Ethernet transport. One requirement is to layer SDI payload data over RTP/UDP/IP. This has been done using SMPTE standard ST 2022-6.

Fig. 1. A future view of professional networked streaming
In ST 2022-6, each payload unit is time-stamped and filled with 1,376 bytes of image data. ST 2022-6 was implemented in several NAB Show product demos as a replacement for SDI. That said, the industry may need to tweak the specs slightly so that it fits into a bigger, flow-harmonious framework. The flows may include video, audio, intercom, control, metadata, AV proxies and others, so the total IP payload will be more than ST 2022-6 supports today. Also, the IEEE work on AVB is an alternate way to view the stack and other important QoS aspects.
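To make the packetization idea concrete, here is a minimal Python sketch that cuts one frame of data into 1,376-byte payload units and puts a standard 12-byte RTP header in front of each. It is a simplification, not the ST 2022-6 format itself: the standard's own payload header and timestamp rules are omitted, and the payload type value, the function name and the zero-padding of the last unit are assumptions made for illustration.

```python
import struct

PAYLOAD_BYTES = 1376   # media bytes per payload unit, as cited in the column
RTP_VERSION = 2
PAYLOAD_TYPE = 98      # hypothetical dynamic payload type, chosen for illustration


def packetize(frame: bytes, timestamp: int, ssrc: int, first_seq: int = 0):
    """Split one frame into RTP-style packets carrying PAYLOAD_BYTES each."""
    chunks = [frame[i:i + PAYLOAD_BYTES]
              for i in range(0, len(frame), PAYLOAD_BYTES)]
    packets = []
    for n, chunk in enumerate(chunks):
        last = n == len(chunks) - 1
        chunk = chunk.ljust(PAYLOAD_BYTES, b"\x00")   # pad the final unit (assumption)
        byte0 = RTP_VERSION << 6                      # V=2, no padding/extension/CSRC
        byte1 = (int(last) << 7) | PAYLOAD_TYPE       # marker bit flags end of frame
        seq = (first_seq + n) & 0xFFFF                # 16-bit sequence number
        header = struct.pack("!BBHII", byte0, byte1, seq,
                             timestamp & 0xFFFFFFFF, ssrc)
        packets.append(header + chunk)
    return packets
```

Each packet would then be handed to a UDP socket. In this sketch, a 3,000-byte test frame yields three packets of 1,388 bytes each, with the marker bit set only on the last one.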

Now see “Compressed essence” at 2 o’clock. SDI is uncompressed today. When a facility needs to stream UHD/4K images, applying light compression will ease transport requirements. For example, 4:2:2, 10-bit UHD requires a future 12 Gbps SDI link or a 40G Ethernet link. By applying 6:1 visually lossless, low-latency compression, the flow may leverage 3G SDI transport or 10G Ethernet.
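The arithmetic behind that example can be sketched in a few lines of Python. The 60 fps frame rate and the function name are assumptions (the column does not state a frame rate), and the figure covers active picture only; an SDI link also carries blanking and ancillary data, which is one reason the link rate is higher than the number computed here.

```python
def active_video_gbps(width: int, height: int, bits_per_pixel: int, fps: float) -> float:
    """Raw bit rate of the active picture, in Gbps (no blanking, no packet headers)."""
    return width * height * bits_per_pixel * fps / 1e9


# 4:2:2 sampling at 10 bits averages 20 bits per pixel.
uhd = active_video_gbps(3840, 2160, 20, 60)   # 60 fps is an assumption
compressed = uhd / 6                          # 6:1 light compression, per the column

print(f"uncompressed: {uhd:.2f} Gbps, compressed 6:1: {compressed:.2f} Gbps")
```

Under these assumptions the uncompressed picture is just under 10 Gbps, which leaves no headroom for packet overhead on a 10G Ethernet link, while the 6:1 result of roughly 1.7 Gbps sits comfortably inside both 3G SDI and 10G Ethernet.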

Several NAB Show vendors showed compressed video over IP. Note that Ethernet is bidirectional whereas SDI is unidirectional. Also, more than one video payload can be packed into an Ethernet link, up to its rate limit. So, it seems reasonable that the new data formats should permit compressed streams, but not demand them. Selecting interoperable compression format(s) is a future effort.

At 3 o’clock is “Interop/bridging.” This relates to creating a future networked ecosystem friendly to the SDI world. This requires carrying SDI payload as one option using Ethernet/IP. When spanning domains, dropping data elements or transcoding AV formats looks like a “bridge over troubled waters.” So, efforts will be made to align data types for smooth conversions between SDI and Ethernet/IP flows.

From 3:30 to about 7 o’clock in the figure are four related aspects (Range to New Physical layers). This is a list of networked features not available with SDI. Range-wise, IP flows can be device-to-device in the same room or across the world. Of course, long-distance transport takes its toll in the form of more loss, more latency and reduced rates. These negatives can be managed and still achieve acceptable performance. In 2014, 100 Gbps Ethernet is a reality and can transport 4:4:4, 12-bit, UHD2/8K flows at 96 Gbps (assumes Ethernet jumbo frames).
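The jumbo-frame caveat can be illustrated with a rough Python estimate of how much of a link is left for media payload once per-packet overhead is subtracted. This is a sketch under stated assumptions: IPv4 without options, RTP over UDP with no header extensions, no VLAN tag, and a 9,000-byte MTU standing in for "jumbo" (actual jumbo sizes vary by equipment). The function names are illustrative.

```python
# Per-packet overhead in bytes (assumptions: IPv4, no VLAN tag, no RTP extensions).
ETH_WIRE_OVERHEAD = 8 + 14 + 4 + 12   # preamble/SFD + MAC header + FCS + inter-frame gap
IP_UDP_RTP_HEADERS = 20 + 8 + 12      # IPv4 + UDP + RTP headers


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-wire bytes that are media payload for a given IP MTU."""
    payload = mtu - IP_UDP_RTP_HEADERS
    on_wire = mtu + ETH_WIRE_OVERHEAD
    return payload / on_wire


def usable_gbps(link_gbps: float, mtu: int) -> float:
    """Payload capacity of a link, in Gbps, at the given MTU."""
    return link_gbps * payload_efficiency(mtu)


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {usable_gbps(100, mtu):.1f} Gbps of payload on a 100G link")
```

With these numbers a 100G link carries about 94.9 Gbps of payload at a 1,500-byte MTU and about 99.1 Gbps at 9,000 bytes, so a 96 Gbps flow would fit only in the jumbo-frame case.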

Finally, there is a wide range of standardized Ethernet physical link types to choose from, including Cat 6a copper with RJ45 connectors for 10G (to about 100 m, same as SDI over coax), multimode fiber (to about 300 m) and single-mode fiber (to about 80 km).

At 8 o’clock is “Push/pull” support. What does this mean? SDI streams are pushed to a receiver from a sender. The receiver needs to drink the continuous data stream; there is no stopping it. On the other hand, HTTP (which uses TCP for reliability) is a pulled transport. So, imagine a client pulling video one frame at a time from a server.

The average performance looks like a streamed flow even if the flow is delivered in uneven chunks; receiver buffering smooths out any irregularities. Note that the latency per node could be a full frame of video. Pull has a pro/con list just as push does. It is likely that pull will find a home in the new world and especially in the public cloud. Pulled point-to-multipoint is problematic for a number of reasons not discussed here.
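A toy Python simulation shows how receiver buffering turns uneven delivery into a steady playout, and what it costs in latency. Everything here is hypothetical: `arrivals` stands in for the number of frames a pull client happens to receive in each frame period, and `prefill` is the number of frames the receiver waits for before it starts playing.

```python
from collections import deque


def play_out(arrivals, prefill):
    """Return the frame played in each period once playout starts (None = underrun)."""
    buffer = deque()
    next_frame = 0
    started = False
    played = []
    for count in arrivals:
        for _ in range(count):           # frames pulled during this period
            buffer.append(next_frame)
            next_frame += 1
        if not started and len(buffer) >= prefill:
            started = True               # enough frames buffered to begin playout
        if started:
            played.append(buffer.popleft() if buffer else None)
    return played


bursty = [1, 0, 3, 0, 0, 2]              # uneven delivery, one frame per period on average
print(play_out(bursty, prefill=1))       # starts at once but underruns in the second period
print(play_out(bursty, prefill=2))       # starts two periods later and never underruns
```

The deeper prefill removes the gap at the price of added delay, which is the trade-off behind the per-node latency noted above.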

Last is the 9 o’clock item “Stream splicing.” There is no getting around it; it’s easier to frame-accurately splice two (A, B) video streams in an SDI router than with Ethernet/IP switching. That said, there are numerous ways to do this and several NAB Show vendors demonstrated frame-accurate video/IP stream splicing. Problem solved. However, our industry still needs to select a preferred method for interop.
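One of the many possible approaches can be sketched in Python: keep forwarding stream A until the last packet of its current frame, then forward stream B from the next frame on. This is not any vendor's method or a standardized one. It assumes the two streams are frame-synchronous with the same number of packets per frame, and the `Packet` type, its `last_in_frame` flag and the function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Packet(NamedTuple):
    source: str           # "A" or "B"
    frame: int            # frame number the packet belongs to
    last_in_frame: bool   # True on the final packet of a frame


def make_stream(source: str, frames: int, packets_per_frame: int) -> List[Packet]:
    """Build a synthetic stream for the demonstration."""
    return [Packet(source, f, p == packets_per_frame - 1)
            for f in range(frames) for p in range(packets_per_frame)]


def splice(pairs: List[Tuple[Packet, Packet]], switch_at: int) -> List[Packet]:
    """Forward A, then B, deferring the switch to the next frame boundary.

    pairs holds time-aligned (A, B) packets; switch_at is the packet index
    at which the switch request arrives, possibly in the middle of a frame.
    """
    out = []
    on_b = False
    for i, (a, b) in enumerate(pairs):
        out.append(b if on_b else a)
        if not on_b and i >= switch_at and a.last_in_frame:
            on_b = True   # take effect only after A's frame is complete
    return out


a = make_stream("A", frames=3, packets_per_frame=4)
b = make_stream("B", frames=3, packets_per_frame=4)
result = splice(list(zip(a, b)), switch_at=5)   # request lands mid-frame
print([p.source for p in result])
```

Although the request arrives in the middle of the second frame, the output carries two complete frames from A followed by one complete frame from B, so no frame mixes packets from both sources.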

Bottom line, the new streaming world is upon us. Stay tuned for exciting advances over the next year and be sure to ask for demos at the 2015 NAB Show.

Al Kovalick is the founder of Media Systems consulting in Silicon Valley. He is the author of “Video Systems in an IT Environment (2nd ed).” He is a frequent speaker at industry events and a SMPTE Fellow. For a complete bio and contact information,
