Video over Ethernet

Last month, we explored the merging of IP with SDI. This month, we will continue to explore this topic by looking at recent work on video over Ethernet.

This column has dealt extensively with the transport of real-time A/V over long-haul IP networks. By now, the SMPTE ST 2022 family of standards has been deployed in tens of thousands of units throughout the world. This family of standards works by encapsulating MPEG-2 transport streams in Real-time Transport Protocol (RTP) and User Datagram Protocol (UDP) packets. The UDP packets are then carried in IP packets and Ethernet frames. From there, carriers typically map the Ethernet onto long-haul SONET. Users can employ an optional FEC mechanism to correct errors introduced during transport.
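As a rough sketch of how those layers nest, the Python fragment below groups seven 188-byte MPEG-2 transport stream packets into a single RTP payload and hands the result to a UDP socket; the operating system supplies the IP and Ethernet layers underneath. The seven-packet grouping, the multicast address and port, and the timestamp increment are illustrative choices for this example, not requirements taken from ST 2022.

    import socket
    import struct

    RTP_VERSION = 2
    PAYLOAD_TYPE = 33          # static RTP payload type for MPEG-2 transport streams
    TS_PACKET_SIZE = 188
    TS_PER_DATAGRAM = 7        # common grouping; illustrative here

    def build_rtp_packet(ts_packets, seq, timestamp, ssrc):
        """Prepend a minimal 12-byte RTP header to a group of TS packets."""
        header = struct.pack(
            "!BBHII",
            RTP_VERSION << 6,          # V=2, no padding, no extension, no CSRCs
            PAYLOAD_TYPE,              # marker bit clear
            seq & 0xFFFF,
            timestamp & 0xFFFFFFFF,
            ssrc,
        )
        return header + b"".join(ts_packets)

    def send_stream(ts_source, dest=("239.1.1.1", 5004)):
        """Group TS packets, wrap them in RTP and hand them to UDP/IP."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq, timestamp, ssrc = 0, 0, 0x12345678
        group = []
        for ts_packet in ts_source:                 # each item: 188 bytes
            group.append(ts_packet)
            if len(group) == TS_PER_DATAGRAM:
                sock.sendto(build_rtp_packet(group, seq, timestamp, ssrc), dest)
                seq += 1
                timestamp += 2700                   # placeholder 90kHz increment
                group = []

Everything below the UDP socket call mirrors the stack described above: the host wraps the datagram in IP and Ethernet, and the carrier maps those Ethernet frames onto SONET for the long haul.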

Comfort level

This protocol stack offers a number of advantages in long-haul transport. The first is that all of the protocols are well-known and understood. RTP provides the required level of synchronization, UDP allows for efficient transport of large blocks of data, and both IP and Ethernet are ubiquitous. SONET provides ultra-reliable transport and precise timing over long distances. The optional FEC mechanism provides an interoperable, standardized solution for the sorts of errors frequently encountered when sending video over carrier networks. For all their strengths, however, these standards may not be appropriate for all A/V transport scenarios.

Recently, I attended a SMPTE users group meeting. A number of people spoke about the standards they would like to see developed over the next three to five years. It was interesting that almost every speaker mentioned video over IP or video over Ethernet. Some may have been speaking about live, long-haul streaming of A/V over IP networks, but they were also definitely talking about A/V transport within their facilities. Frankly, the ST 2022 standards were not intended for this application. There are several reasons why ST 2022 is not ideal for local transport of A/V content.

When talking about moving video (and audio) inside a facility, there are several critical user requirements:

  • Delay must be minimal.
  • Delay must be deterministic.
  • Synchronization must be maintained between different streams.
  • Performance must be predictable.

Unfortunately, some of the underlying technology used in ST 2022 makes it difficult to meet these requirements, particularly in the area of delay. When the optional XOR FEC mechanism defined in ST 2022 is used, these systems inherently incur delay on the order of several hundred milliseconds. The exact delay depends upon the video format and a number of configuration choices made by the manufacturer. Also, ST 2022 does nothing on its own to ensure synchronization or network performance. The standard relies on mechanisms in MPEG-2 for synchronization of audio and video streams, and it allows the implementer to employ a number of available QoS schemes to lock down network performance. It is fine to require MPEG-2 and QoS infrastructures in a long-haul environment. But is it really practical to employ this inside a facility with hundreds if not thousands of devices? Definitely not. Besides, even if using MPEG-2 did resolve the synchronization and QoS issues, the FEC delay would still have to be dealt with. Clearly, the user requirements dictate a different solution.
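The latency penalty is easier to see with a little arithmetic. The XOR FEC arranges media packets in a matrix of L columns by D rows and computes parity across it, so a receiver must buffer on the order of one full matrix before it can repair a loss. The sketch below estimates that buffering time; the bit rate and the L and D values are illustrative assumptions, and a real implementation adds further buffering of its own.

    TS_PACKET_SIZE = 188        # bytes
    TS_PER_DATAGRAM = 7         # TS packets per RTP datagram (typical choice)
    PAYLOAD_BITS = TS_PACKET_SIZE * TS_PER_DATAGRAM * 8

    def fec_buffer_delay_ms(ts_bitrate_bps, L, D):
        """Rough delay from buffering one L x D FEC matrix of media packets."""
        packets_per_second = ts_bitrate_bps / PAYLOAD_BITS
        return (L * D) / packets_per_second * 1000.0

    # Illustrative only: a 20 Mb/s transport stream with a 20 x 10 matrix
    print(fec_buffer_delay_ms(20_000_000, L=20, D=10))   # roughly 105 ms

Lower bit rates or larger matrices push this figure well past 100ms, and encoder and decoder buffering add more on top, which is how end-to-end delays of several hundred milliseconds arise.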

As the professional media industry looks for a solution for deploying video over IP inside our facilities, it makes sense to see what other industries are doing. The Institute of Electrical and Electronics Engineers (IEEE) has an effort called Audio Video Bridging (AVB). The IEEE is working to provide A/V transport over local networks with low latency, accurate synchronization of streams and guaranteed performance. While it remains to be seen if this effort will meet user requirements, it is a substantial body of work being done by the people who invented Ethernet. Therefore, it would behoove us to pay attention.

The other side

There are a few key things to be aware of regarding AVB. It is trying to solve a specific problem: the provision of hundreds of time-synchronized video and audio streams across a LAN at latencies below 2ms. Designers want to be able to reserve bandwidth to ensure that once a stream is started, it does not fail even as other devices come online. They also want to be able to support a large number of independent clocks on the network and allow a wide variety of simultaneously used sync sources. To that end, the AVB specifications allow hundreds of audio channels to be delivered with a latency below 2ms over seven hops on a 100Mb/s Ethernet network, all synchronized within nanoseconds. This is an impressive design, and all indications are that it will be achieved.
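A bit of simple arithmetic shows how tight that budget is. Dividing the 2ms target across seven hops leaves less than 300µs per bridge, while a single maximum-length Ethernet frame already occupies a 100Mb/s link for more than 120µs. The numbers below are back-of-the-envelope figures, not values taken from the AVB specifications, but they make clear why reservation and traffic shaping, rather than best-effort queuing, are required.

    LATENCY_BUDGET_S = 0.002      # 2 ms end-to-end target
    HOPS = 7
    LINK_BPS = 100_000_000        # 100 Mb/s Ethernet
    MAX_FRAME_BITS = 1522 * 8     # maximum-length VLAN-tagged frame, no preamble/gap

    per_hop_budget_us = LATENCY_BUDGET_S / HOPS * 1e6    # about 286 us per hop
    frame_time_us = MAX_FRAME_BITS / LINK_BPS * 1e6      # about 122 us per frame

    print(per_hop_budget_us, frame_time_us)

One untimely maximum-length frame sitting ahead of a media packet in a queue consumes nearly half of a hop's budget, which is exactly the situation AVB's shaping and reservation mechanisms are designed to prevent.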

Something else to consider is that AVB involves video over Ethernet, not over IP. Specifically, this is about AVB on IEEE 802.3 unshielded twisted pair networks. (Support for IEEE 802.11 wireless networks is anticipated in the future.) AVB concentrates on providing high-performance transport at Layer 2, not Layer 3. If you're a little unsure what this means, remember that Ethernet uses MAC addresses and is limited to a LAN. To access the Internet, or to operate across a large campus environment, Layer 3 is needed, where IP addresses allow traffic to be routed between separate networks. While it is true that Ethernet bridges allow you to direct packets across separate physical wiring groups, AVB, in its current form, will not traverse Layer 3 network switches.
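To make the Layer 2 point concrete, the sketch below assembles a bare Ethernet frame: the only addresses it carries are two 48-bit MACs plus a 16-bit EtherType, with no IP header for a router to act on. The destination MAC and the EtherType used here are placeholder values for illustration, not addresses defined by the AVB standards.

    import struct

    def build_l2_frame(dst_mac, src_mac, ethertype, payload):
        """Assemble an Ethernet II frame: destination MAC, source MAC, EtherType, payload."""
        return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

    # Placeholder addressing: a locally administered multicast destination
    # (multicast and local bits set in the first octet) and an experimental
    # EtherType standing in for a real stream identifier.
    dst = bytes.fromhex("030000000001")
    src = bytes.fromhex("020000000002")
    frame = build_l2_frame(dst, src, 0x88B5, b"\x00" * 46)

Bridges forward a frame like this by looking only at the destination MAC. There is nothing in it for a Layer 3 device to route on, which is why AVB traffic stays within the bridged LAN.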

Given that AVB provides guaranteed high performance, the fact that it works at Layer 2 isn't surprising. Parameters like jitter and wander can be controlled more tightly there. Also, it's difficult to ensure the performance specified by AVB across physical distances usually covered by a WAN.

AVB will require AVB-aware devices — bridges and end points — in order to operate correctly. This means that if you want to use AVB, you will need to carefully qualify devices on your network to ensure they meet AVB standards. The introduction of a non-AVB bridge or a non-AVB end point to an AVB-compliant cloud means that AVB performance cannot be guaranteed. (See Figure 1.) However, Figure 1 shows that professional-quality performance is possible within an AVB cloud as part of a larger LAN, which raises interesting possibilities. Is it possible to create various AVB islands, for example, within a facility's audio production area? Is it possible to establish a guaranteed AVB path between sports production and audio? It appears so.

There is no question that crossbar routers, SDI and coax will be in media facilities for a long time to come. But, there is also no question that companies will add Ethernet and IP infrastructures that coexist with conventional video over coax. In fact, every facility already has both infrastructures in place. The question is whether AVB provides a pathway for migration of some A/V applications to the Ethernet domain. Given that one driver behind AVB is enhancing A/V capabilities of consumer home networks, if AVB takes off, costs for core networking components are bound to drop, providing an interesting new technology to media facility designers.

AVB standards

The following is a list of standards developed by the IEEE for Audio Video Bridging:

  • IEEE 802.1AS — Timing and Synchronization: This standard specifies the protocol and procedures used to ensure that synchronization requirements are met for time-sensitive applications, such as audio and video, across Bridged and Virtual Bridged LANs.
  • IEEE 802.1Qav — Forwarding and Queuing Enhancements for Time-Sensitive Streams: This standard allows bridges to provide guarantees for time-sensitive, loss-sensitive, real-time, audio/video data transmission.
  • IEEE 802.1Qat — Stream Reservation Protocol: This standard specifies protocols, procedures and managed objects, usable by existing, higher-layer mechanisms, that allow network resources to be reserved for specific traffic streams traversing a bridged LAN.
  • IEEE 802.1BA — Audio Video Bridging (AVB) Systems: This standard defines profiles that select features, options, configurations, defaults, protocols and procedures of bridges, stations and LANs that are necessary to build networks that are capable of transporting time-sensitive audio and/or video data streams.

Brad Gilmer is president of Gilmer & Associates, executive director of the Video Services Forum and executive director of the Advanced Media Workflow Association.

Send questions and comments to: brad.gilmer@penton.com