
Reality in our industry is all in the presentation of approximations of reality. The essence of video and audio compression is the selective discarding of information that the consumer of the content will not miss. We do our job best not when we send the most pristine and perfect images and sound, but rather when we push them through the narrowest funnel we can manage without making the loss obvious to the consumer's untrained eyes.

Changes in video compression technology

This began decades ago as an area of research called bit-rate reduction, now, of course, called video compression. In the 1970s, it was generally thought that 525/625 video needed well in excess of 25Mb/s for adequate digital representation. In fact, the EBU once published a paper purporting to mathematically “prove” that the job could not be done in less than 34Mb/s. Jumping forward a couple of decades, it is obvious that the calculation could not account for research done much later. At that time, there were plenty of questions about how one might even approach perceptual coding of images (and sound). There were also many papers about run-length encoding, the Nyquist limit and other topics worthy of serious reading.

Part of what changed is the development of effective pixel motion estimation, which, in fairness to the scientists of the 1970s, was largely not possible in real time on affordable hardware. Today, we think nothing of using the hardware codecs in cell phones to transmit news stories in 720p at effective rates below 2Mb/s, and SD content is delivered over DTV transmission at sub-1Mb/s rates. The compression ratios are mind-boggling. If the entire DTV bit stream is used to deliver one 1080i29.97 signal, the compression ratio is most descriptively given as just over 0.3 bits per pixel of the display. But audio and PSIP eat into the available bandwidth, making the number seem even more absurd.
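The 0.3 bits-per-pixel figure is easy to verify with back-of-envelope arithmetic. The sketch below assumes the full ATSC 8-VSB transport payload of about 19.39Mb/s and a 1920×1080 raster at 29.97 frames/s; as the article notes, audio and PSIP would shrink the video share further.

```python
# Illustrative check of the "just over 0.3 bits per pixel" figure.
# Assumptions: full ATSC payload carries one 1080i29.97 service.
ATSC_PAYLOAD_BPS = 19_392_658   # ATSC 8-VSB transport payload, bits/s
WIDTH, HEIGHT = 1920, 1080      # 1080-line raster
FRAME_RATE = 30000 / 1001       # 29.97 frames per second

pixels_per_second = WIDTH * HEIGHT * FRAME_RATE
bits_per_pixel = ATSC_PAYLOAD_BPS / pixels_per_second
print(f"{bits_per_pixel:.2f} bits per displayed pixel")
```

Running this yields roughly 0.31 bits per displayed pixel, consistent with the figure quoted above.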

In the more modern era, the period when cost-effective, real-time systems began wide deployment, MPEG-2 compression has clearly dominated the market. Over the last few years, H.264 (MPEG-4 Part 10, or AVC) has begun to replace MPEG-2 for many uses. Its more efficient algorithms allow equivalent quality at lower bit rates, or higher quality at the same economic cost in bandwidth. Though we tend to dwell on the technical aspects, it is the economic benefit that drives technological change today. Would anyone doubt that, without the savings in transmission bandwidth or storage cost, we would have no reason to deploy AVC? I suspect the answer to such a rhetorical question is academic, for the cost of developing new compression tools would hardly be supportable unless there were a demonstrable benefit to the companies investing in new hardware.

Frankly, we are lucky that compression has become a key component of technological advances we rely on, both as consumers of content and creators of content. DTV, Internet distribution of content, video chat, 3G/4G newsgathering, digital archives, wireless home networking of content, personal music and video players, DVD and Blu-ray players, and digital still and movie cameras are but a few innovations that would not be possible without compression. A ballot for the most essential invention in the media industry in our lifetimes would have to at least include compression.

But there is no free lunch. To use compression as more than a point-to-point solution, that is to say at the two ends of a closed-loop system, we need effective standards, which of necessity stifle innovation in the process of technology self-regulation. We might have much more effective compression by now if the marketplace were able to innovate without the need to interoperate. And part of the innovation continuum seems to be the increase in complexity that often comes along with it.

An example is the death of video switching. I do not mean to imply it is already dead, but I see the handwriting on the wall. There is a lot of “baseband” switching all over the fabric of our industry, but increasingly we see “switching” of compressed content streams. That process is more accurately described as a splice that joins two time-independent streams of content into one stream with perfect continuity in syntax. Baseband switching is far less complex, but as we inexorably move toward a mostly IT infrastructure carrying mostly compressed content, I see an increase in system complexity. The reasons are simple enough to understand.

To switch between two video signals, one need only break the electrical connection to one source and establish the connection to the second source. In an ideal world, you ensure the signals are synchronized, though with the exception of a short glitch, cutting between two unsynchronized sources is often acceptable.

But with compressed signals, one must do much more. In any case, assuming the available bandwidth would support either source flexibly, you still need to align the syntax in the signals so that the decoder will not lose its place in the bit stream. In addition, it is critical to establish the group of pictures (GOP) cadence on both sides of the switch, better termed a splice. This is not terribly hard to do, but requires buffering to allow for matching up two inherently asynchronous signals.

Many years ago, SMPTE and others began work on standards establishing how such switching might be signaled to downstream devices, making it possible for a device listening in on the transmitted sequence to know when an appropriate splice point would be arriving. This work produced a SMPTE standard, which is the basis of the SCTE splicing standards used for commercial insertion worldwide. Networks like FOX have adopted splicing as a critical technology for distributing content to affiliates; FOX received a Technical Emmy for this work in 2008-2009.

A great reference for questions about splicing in considerable detail was written by Norm Hurst and Katie Cornog and can be found at

John Luff is a broadcast technology consultant.

Send questions and comments to: