Clearing the Air on Video Over IP

Differing terms rely on the same core technologies, but on different delivery systems

Video over Internet Protocol (IP), not to be confused with VoIP (voice over IP), has become a common term across a still-evolving set of applications for delivering multimedia content. Yet there remains more than a bit of confusion among the terms “video over IP,” “Internet TV” and “IPTV” (Internet Protocol Television). IPTV and Internet TV both rely on the same core set of technologies, but their delivery methods differ. Internet TV relies on the public Internet to deliver a “best effort” video experience to the end user. IPTV relies on private, managed networks whose infrastructure is owned and operated by a particular service provider.

Generally, video over IP is an open system whereby content is delivered on demand through a player application embedded in a browser, which pulls selected content over a public network. Emanating from streaming media and Webcasting, video over IP can be found in a variety of network-oriented, streaming media architectures. For example, the technologies underlying AT&T’s U-Verse (as IPTV), RealNetworks’ RealSystem, Microsoft’s Windows Media Technologies, most cable systems’ video on demand service, and other real-time and non-real-time video delivery formats all use some of the principles summarized in this article, each with its own distinctions.


On the transmit side, encoded video and audio, carried as an MPEG-2 transport stream, are encapsulated for transmission over an IP-based network employing User Datagram Protocol (UDP) and Real-time Transport Protocol (RTP) principles. The receive side accepts the IP packets, demultiplexes the transport stream and decodes it to baseband video. Basically, the video over IP workflow begins with encoding, either for live delivery or for later release. Video content, stored as files, is migrated to either a streaming server or another delivery server’s platform. But here is where the similarities often end. IPTV systems, which are essentially closed systems, may encode live video “on the fly” directly into the network rather than store pre-encoded files on servers for later playout. Video on Demand (VOD) content is pre-encoded with metadata (e.g., CableLabs’ VOD and SVOD specifications) to enable VOD features to operate over the network under control of user devices, such as set top boxes.
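The encapsulation step on the transmit side can be illustrated with a minimal Python sketch. It is not taken from any particular product: the function names, the SSRC value and the fixed timestamp are assumptions made for illustration, and a real sender would derive the timestamp from a 90 kHz media clock. The sketch groups up to seven 188-byte transport stream packets behind a 12-byte RTP header, a common grouping that keeps the datagram under a 1500-byte Ethernet MTU.

```python
import struct

TS_PACKET_SIZE = 188         # fixed MPEG-2 transport stream packet length
TS_SYNC_BYTE = 0x47          # first byte of every transport stream packet
TS_PER_DATAGRAM = 7          # 7 * 188 = 1316 payload bytes per datagram
RTP_PAYLOAD_TYPE_MP2T = 33   # static RTP payload type for MPEG-2 TS


def build_rtp_packet(ts_packets, seq, timestamp, ssrc):
    """Wrap whole TS packets in a 12-byte RTP header (no CSRC, no extension)."""
    for pkt in ts_packets:
        if len(pkt) != TS_PACKET_SIZE or pkt[0] != TS_SYNC_BYTE:
            raise ValueError("not a valid 188-byte transport stream packet")
    first_byte = 2 << 6                   # version 2; padding, extension, CSRC count all 0
    second_byte = RTP_PAYLOAD_TYPE_MP2T   # marker bit left clear
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + b"".join(ts_packets)


def packetize(ts_bytes, ssrc=0x1234ABCD, start_seq=0, timestamp=0):
    """Split a transport stream into RTP packets of up to seven TS packets.

    Assumption: the same timestamp is reused for every packet to keep the
    sketch short; a real sender advances it with the media clock.
    """
    ts_packets = [ts_bytes[i:i + TS_PACKET_SIZE]
                  for i in range(0, len(ts_bytes), TS_PACKET_SIZE)]
    rtp_packets = []
    for n, i in enumerate(range(0, len(ts_packets), TS_PER_DATAGRAM)):
        group = ts_packets[i:i + TS_PER_DATAGRAM]
        rtp_packets.append(build_rtp_packet(group, start_seq + n, timestamp, ssrc))
    return rtp_packets
```

Each element returned by packetize() would then be handed to a UDP socket as one datagram.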

Servers manage the stream delivery rates so as to produce real-time playback through a network to user playout devices implemented in software (e.g., on a PC platform) or in hardware (e.g., a decoding set top box). Formats and protocols distinguish video over IP from Web-based delivery, with the encoding and file structures for streaming media differing from those used in conventional HTML Web pages. Streaming media allows for features previously unavailable when using conventional HTTP to deliver Web pages, such as real-time flow control, stream switching and interactive navigation.
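As a rough illustration of how a server might manage the delivery rate, the sketch below computes how far apart datagrams should be spaced so that the payload arrives at the stream's bit rate. The function names and the example figures (1316-byte payloads at 4 Mb/s) are assumptions chosen for illustration, and the calculation ignores header overhead and variable bit rate encoding.

```python
def packet_interval_seconds(payload_bytes, stream_bitrate_bps):
    """Time between datagrams so the payload arrives at the stream's bit rate."""
    if stream_bitrate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return payload_bytes * 8 / stream_bitrate_bps


def send_schedule(num_packets, payload_bytes, stream_bitrate_bps, start=0.0):
    """Return the ideal send time, in seconds, of each datagram."""
    interval = packet_interval_seconds(payload_bytes, stream_bitrate_bps)
    return [start + n * interval for n in range(num_packets)]


# Hypothetical example: 1316-byte payloads on a 4 Mb/s stream
interval = packet_interval_seconds(1316, 4_000_000)   # about 2.6 ms per datagram
```

A server that sends faster than this schedule risks overflowing the player's buffer, and one that sends slower starves the decoder.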

Video over IP can be incorporated into both streaming media and Webcasting. Streaming is usually referred to in a continuous linear context, that is, the content is delivered to the user from the source in near real time. Webcasting is more like television—it can be pre-recorded or live and may incorporate both streaming and file downloads. Both variations can be scheduled or delivered on-demand, and either form can incorporate conditional access (CA) or encryption, like those found in subscription television.


Streaming media is sometimes called “IP Broadcasting” because it consists of files delivered over the Web via the Internet, using Internet protocols (IP), to personal computers. Because IP delivery is bidirectional and usually takes the form of a point-to-point unicast stream, it allows for interactivity with “start-stop-resume-replay” features. When a stream is scheduled for delivery, multicasting may be used instead: the source sends only a single stream into a properly enabled network, which then delivers it to many users simultaneously. In this multicast mode, stunt features and interactivity, such as starting or resuming on demand, are curtailed.
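From the receiver's point of view, tuning in to a multicast stream means joining a group address rather than opening a connection to a server. The following Python sketch shows one way this might be done with the standard socket module; the function names are hypothetical, and the group address and port in the usage comment are placeholders, not values from any real service.

```python
import ipaddress
import socket


def membership_request(group_ip, interface_ip="0.0.0.0"):
    """Build the 8-byte join request: group address, then local interface."""
    if not ipaddress.ip_address(group_ip).is_multicast:
        raise ValueError(f"{group_ip} is not a multicast address")
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)


def open_multicast_receiver(group_ip, port, interface_ip="0.0.0.0"):
    """Open a UDP socket and ask the network to deliver the group's stream."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group_ip, interface_ip))
    return sock


# Hypothetical usage, with a placeholder group and port:
# sock = open_multicast_receiver("239.1.1.1", 5004)
# datagram, sender = sock.recvfrom(2048)
```

Note that the receiver never sends anything to the source. That is why features such as pause and resume, which depend on a per-viewer conversation with the server, are not available in this mode.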

As a communications protocol, IP (layer 3) is an unreliable delivery system. It must deal with variable network latencies and with out-of-order or lost packets. Higher layer protocols, such as Transmission Control Protocol (TCP) at the transport layer, can correct these IP-layer issues. Combine the two and you have TCP/IP, which collectively manages, reorders and acknowledges the receipt of packets, or requests retransmission until all the proper packets are received.
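The reordering and acknowledgment behavior can be sketched in a few lines of Python. This is a deliberately simplified model, not TCP itself: the class name is hypothetical, and it numbers whole segments from zero, whereas real TCP numbers individual bytes and adds windows, timers and much more.

```python
class InOrderReceiver:
    """Toy model of reliable delivery: reorder segments, acknowledge cumulatively."""

    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet delivered
        self.pending = {}        # segments that arrived ahead of a gap
        self.delivered = []      # data handed to the application, in order

    def receive(self, seq, data):
        """Accept one segment and return the next sequence number still needed."""
        if seq >= self.next_expected:
            self.pending.setdefault(seq, data)   # duplicates are ignored
        # Release every segment that is now contiguous with what was delivered.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        return self.next_expected


receiver = InOrderReceiver()
receiver.receive(0, "a")   # acknowledges 1
receiver.receive(2, "c")   # still acknowledges 1, since segment 1 is missing
receiver.receive(1, "b")   # gap filled, acknowledges 3
```

A sender that keeps seeing the same acknowledgment number knows which segment to retransmit. The cost is delay: nothing after the gap reaches the application until the missing segment arrives.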

Streaming, particularly multicast streaming media, must be able to ignore data errors—since there is no means to request the retransmission of packets once they leave the source. Thus, streaming uses a different protocol called User Datagram Protocol (UDP). Given that UDP has no flow control or error correction capabilities, a higher layer application must handle these kinds of functions. UDP is associated with applications such as streaming media, voice over IP, routing protocols and network management.
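Because UDP itself reports nothing about missing datagrams, the application has to notice them. One common approach, sketched below in Python, is for the sender to number each datagram and for the receiver to watch for gaps. The 16-bit wrapping counter is an assumption here (it matches the sequence number size used by RTP), the class name is hypothetical, and the sketch overstates loss if a packet counted as missing later arrives out of order.

```python
SEQ_MODULUS = 1 << 16   # assumed 16-bit sequence counter that wraps to zero


class LossCounter:
    """Count datagrams that never arrived, based on gaps in sequence numbers."""

    def __init__(self):
        self.last_seq = None
        self.received = 0
        self.lost = 0

    def observe(self, seq):
        self.received += 1
        if self.last_seq is None:
            self.last_seq = seq
            return
        gap = (seq - self.last_seq) % SEQ_MODULUS
        if 1 <= gap < SEQ_MODULUS // 2:
            self.lost += gap - 1    # a gap of 1 means nothing was skipped
            self.last_seq = seq
        # Otherwise the packet is a duplicate or arrived late; last_seq is kept.


counter = LossCounter()
for seq in (65534, 65535, 1, 2):   # sequence number 0 never arrives
    counter.observe(seq)
```

A player using this kind of counter does not ask for the missing data. It conceals the error as best it can and reports the loss figures upstream.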

The nature of streaming required that a Real-time Transport Protocol (RTP) be developed so that consistency could be maintained when delivering streaming video to players through an IP network. Multimedia sessions may utilize one or more of a set of protocols developed by the industry and published as RFCs, such as Resource Reservation Protocol (RSVP, as RFC 2205, 2208 and 2209), RTP Control Protocol (RTCP, as RFC 1889), Real Time Streaming Protocol (RTSP, as RFC 2326), and RTP itself (also in RFC 1889). Essentially, these real-time protocols add extra data not found in TCP, such as media timestamps, which allows players to reconstruct the video stream and throttle it to the appropriate combination of frame rate and data rate.
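To make the "extra data" concrete, the Python sketch below reads the fixed 12-byte RTP header and converts its timestamp into elapsed media time. The names are hypothetical, and the 90 kHz clock rate is an assumption that holds for MPEG video payloads but not for every RTP payload type.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(datagram):
    """Decode the fixed 12-byte RTP header at the start of a datagram."""
    if len(datagram) < 12:
        raise ValueError("datagram too short to hold an RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", datagram[:12])
    return RtpHeader(version=b0 >> 6, marker=bool(b1 & 0x80),
                     payload_type=b1 & 0x7F, sequence=seq,
                     timestamp=timestamp, ssrc=ssrc)


def media_time_seconds(timestamp, first_timestamp, clock_rate=90000):
    """Elapsed media time since the first packet, allowing for 32-bit wraparound."""
    return ((timestamp - first_timestamp) & 0xFFFFFFFF) / clock_rate
```

A player can compare this media time with its own wall clock to decide when each frame should be presented, regardless of how unevenly the packets arrived.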

For high-performance consumer or professional video over IP networks to work properly, stringent network controls are required. This is precisely why fee-based IPTV video is delivered over private or specially constrained networks under the full control of the service provider. The resultant streams, usually numbering from the hundreds to the tens of thousands, are managed and delivered to dedicated set-top boxes designed for the provider’s own applications, even though many of the mechanisms utilized may be quite similar to those found in the public network models.

Private networks allow service providers to improve the Quality of Experience (QoE) compared with PC-based public network delivery of video over IP content. Application-specific video over IP systems fine-tune elements such as real-time flow control and intelligent stream switching (splicing). These systems then incorporate sophisticated electronic program guides (EPG) with previews, picture-in-picture clips, and search functions that are interconnected with video-on-demand and pay-per-view.

In summary, we are reminded that Internet TV is not IPTV, even though both may utilize some of the same video over IP technologies for presentation or delivery. IPTV implies a closed system carried on provider-specified managed networks. Video over IP may be construed as many things, but is primarily realized as an open system transport mechanism that can carry almost any form of video, including resolutions upwards of 4K.

Karl Paulsen is chief technology officer for AZCAR Technologies and a SMPTE Fellow and an SBE Life Certified Professional Broadcast Engineer. Contact him at