Format wars

For just about as long as video has existed, there have been video format wars. The lack of compatibility between competing proprietary systems has been a constant source of frustration for broadcasters. As the industry migrates from a video-centric world to a data-centric one, all these problems will finally fade away, right?

The truth is, competing video formats may pose less of a problem than competing data formats. The situation is even more complicated than it was when there were two or three video formats to worry about. Data may just be a series of ones and zeroes, but there are many ways to compress and wrap it.

Why are there so many options when converting content to data? One reason is the number of proprietary formats. Sony and Panasonic used to champion rival tape formats, but Apple's QuickTime and Microsoft's Windows Media have now been added to the mix.

More important now is that some codecs are better suited to specific tasks and environments than others. For example, high-quality acquisition should ideally be I-frame only to minimize the problems in post production, whereas digital broadcasting can only be efficiently achieved using long-GOP compression.
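The post-production cost of long-GOP material can be illustrated with a small Python sketch. It assumes a deliberately simplified model that is not taken from any particular codec: each group of pictures (GOP) starts with an I-frame and is followed only by P-frames, each of which depends on the frame before it. The function name and the GOP length of 15 are illustrative choices.

```python
def frames_to_decode(target_index, gop_length):
    """Count the frames a decoder must process to display one frame.

    Simplified model (an assumption, not a real codec): every GOP starts
    with an I-frame and contains only P-frames after it, so decoding must
    begin at the most recent I-frame and run forward to the target.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    # Position of the target inside its GOP, counting the I-frame itself.
    return target_index % gop_length + 1


# I-frame only: every frame stands alone.
i_frame_only_cost = frames_to_decode(44, gop_length=1)

# Long GOP (hypothetical length of 15): the last frame of a GOP
# needs every earlier frame in that GOP decoded first.
long_gop_cost = frames_to_decode(44, gop_length=15)

print(i_frame_only_cost, long_gop_cost)
```

Under this model, an I-frame-only stream lets an editor cut on any frame for the cost of a single decode, while a 15-frame GOP may require decoding up to 15 frames to reach one picture.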

Convergence or divergence

Ten years ago, all the talk in our industry was of convergence. Today, broadcasters understand that convergence in practice means divergence.

An evening's worth of television from a major broadcaster may contain content captured on everything from a digital cinematography camera to a mobile phone. Today, viewers are much less likely to sit and watch television all evening. Instead, they may consume content delivered over the Internet and to mobile devices. Forecasts suggest that this will become the norm.

Each acquisition format, each post-production and content management system, and each delivery platform has different requirements for data, including different capacities, bandwidths and resolutions. It is not unreasonable that each should have a different codec.

Further, they will have to coexist transparently to the user. Whether an editor is cutting a news story together or a viewer is watching a story online or off-air, content producers and users can't worry about underlying codec complexity. The editor has to be able to drop whatever content the story needs onto the timeline. The viewer needs to have the capability to select content without thinking about whether the chain of data formats lines up correctly. This is a reasonable requirement, but the biggest challenge is developing an efficient workflow.

Acquisition and production

For conventional acquisition, most manufacturers use a variant of either the DV group of codecs or something based on MPEG-2. Both use the discrete cosine transform (DCT), a mathematical process that expresses picture data as frequency-dependent coefficients, most of which can then be coarsely quantized or discarded to reduce the amount of data. The mathematics of this is well understood, as are the tradeoffs between quality and bit budgets.
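A minimal Python sketch can show why expressing picture data as frequency coefficients helps. It applies a textbook one-dimensional orthonormal DCT-II to a single row of eight sample values; real codecs work on two-dimensional blocks and add quantization and entropy coding, none of which is modeled here. The sample values and function names are assumptions chosen for illustration.

```python
import math


def _scale(k, n):
    # Orthonormal scaling so that the inverse is the exact mirror image.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Orthonormal DCT-II: samples in, frequency coefficients out."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
              for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse of dct_1d: frequency coefficients in, samples out."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


# A smooth ramp of eight hypothetical luma values.
row = [100, 102, 104, 106, 108, 110, 112, 114]
coeffs = dct_1d(row)

# Keep only the two lowest-frequency coefficients and zero the rest.
kept = coeffs[:2] + [0.0] * (len(coeffs) - 2)
approx = idct_1d(kept)

# Largest difference between the original row and the reconstruction.
max_error = max(abs(a - b) for a, b in zip(row, approx))
print([round(c, 2) for c in coeffs])
print(round(max_error, 3))
```

For smooth picture content like this ramp, nearly all of the signal ends up in the first couple of coefficients, so the rest can be dropped with little visible change. Detailed or noisy content spreads across more coefficients, which is where the quality versus bit budget tradeoff comes from.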

Incidentally, it should be remembered that MPEG-2 was originally conceived as a delivery standard, so it is asymmetric, meaning it is complicated to encode but relatively simple to decode. This is fine in the broadcast model, which has a single compression engine and millions of decoding set-top boxes. It doesn't work so well when that compression engine has to be powered by a battery in a camcorder. But that is a different issue.

MPEG-4 benefits from improved algorithms to achieve dramatically increased coding efficiency — more than 50 percent reduction in bit rates for a given image quality. The downside, of course, is that it takes more processing power to gain the advantage of these new algorithms.

Looking at the production workflow from acquisition to transmission, these types of compression might be encountered:

top-end acquisition equipment using no compression or I-frame only, modest ratio MPEG-2;
typical EFP and ENG using MPEG-2 or DV codecs;
JPEG2000;
user-generated content from mobile devices using 3GPP or MPEG-4; and
proprietary coding schemes.

Delivery options

There is pressure to move to MPEG-4, particularly if it fulfils the promise of delivering acceptable HD in the same bit budget as today's MPEG-2 SD. However, it is computationally complex, and some doubt that it can be achieved without custom processing. On the other hand, it is an open industry standard, which is always appealing.

H.264, also known as AVC, is another codec within the MPEG-4 standard. Its excellent compression efficiency makes it attractive for several applications. The video iPod uses H.264, as does the Sony PSP — although they are different in detail. Some of today's mobile phones use H.264, and if mobile broadcasting systems take off, it is likely that they will be H.264-based.

When it comes to delivering content on the Web there are four contenders:

QuickTime: QuickTime is not a video format, rather just a wrapper that can contain essence in any of a range of formats. Typically, this might be H.264, but the benefit of the QuickTime format is that the wrapper tells the consumer's computer how to decode the essence.
Windows Media 9: Originally a proprietary format, Windows Media 9 has now been released by SMPTE as the VC-1 standard. The benefit of VC-1 is that it provides equivalent quality to H.264 but requires less encoding computational power.
Real Media: Along with Windows Media, Real Media is a popular choice for Web delivery.
Flash: The new version of Flash requires only a Web browser. Flash 8 is a new codec capable of HD quality. It has the advantage of platform independence.

There is no space in this article to discuss digital rights management (DRM), but it is worth noting that Flash 8 does not include any DRM, relying solely on the fact that its .swf files cannot be downloaded, only streamed. Screen capture software will get around that restriction, just as it will get around the protection on any format that does have DRM.

Repurposing

Compression tools make it feasible — technically and commercially — to repurpose content across platforms. (See Figure 1.) Increasingly, audiences are expecting this. Viewers may choose to buy the next episode of a favorite TV series to watch on an iPod, and they may check out a broadcaster's news Web site to watch stories that interest them.

Repurposing for different platforms means looking at much more than just recalculating resolutions. A picture that makes sense at 1920 × 1080 on a large plasma screen won't look good on a 176 × 144 mobile phone display.
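The two resolutions above do not even share a shape: 1920 × 1080 is 16:9, while 176 × 144 is 11:9. A short Python sketch, using a hypothetical helper named fit_within, works out what a simple aspect-preserving downscale with letterboxing would leave on the phone. It assumes square pixels and ignores any codec constraints on frame dimensions.

```python
def fit_within(src_w, src_h, dst_w, dst_h):
    """Scale a source picture to fit inside a display, keeping its shape.

    Returns (scaled_width, scaled_height, pad_x, pad_y), where the pads
    are the bars on each side (rounded down). Assumes square pixels.
    """
    if src_w * dst_h >= src_h * dst_w:
        # Source is relatively wider than the display: width is the limit.
        out_w = dst_w
        out_h = src_h * dst_w // src_w
    else:
        # Source is relatively taller than the display: height is the limit.
        out_h = dst_h
        out_w = src_w * dst_h // src_h
    return out_w, out_h, (dst_w - out_w) // 2, (dst_h - out_h) // 2


scaled_w, scaled_h, pad_x, pad_y = fit_within(1920, 1080, 176, 144)
print(scaled_w, scaled_h, pad_x, pad_y)
```

With these inputs the HD picture occupies only 99 of the phone's 144 lines, which gives a sense of why captions and graphics designed for a large screen usually have to be rebuilt rather than merely shrunk.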

The graphics, at the very least, will have to be replaced. It is likely that advertisements will need to be added, stripped or replaced.

For broadcasters, the reason for adding all these complications is to develop new revenue streams. Splicing local advertising into national content is one way of making video on mobile devices relevant.

Solutions

So what conclusions can be drawn? First, acknowledge that there are several codecs available. Each application has to balance the requirements for quality, bit rate, processing power, latency, resolution and security.

Second, acknowledge that the days of television as we know it will soon be over. Broadcasters need to create systems that will collect any content from anywhere, process it and deliver it to the platform of the consumer's choice, when and where they want it. That content may be either professionally produced programs made to the highest standards, or user-generated content that is timely and relevant.

Third, while all this involves multiple formats, resolutions and compression schemes, they should be invisible to the user at every stage of the production and delivery process.

To achieve this, production systems should be based on data, not video. And a central part of those systems must be software that manages formats, providing transparent conversions and compressions, which get the best possible quality out of any given set of parameters. Implementing this processing in software, rather than in fixed hardware, gives it the flexibility to respond to new developments in acquisition and delivery formats.
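As a rough illustration of format-managing software, the Python sketch below describes each source and delivery target as data and derives the conversion steps from the difference between them. Everything in it is an assumption made for the example: the Profile class, the TARGETS table, the plan function and the two targets are hypothetical and do not describe any actual product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    codec: str
    width: int
    height: int


# Illustrative delivery targets only; a real system would define its own.
TARGETS = {
    "hd_broadcast": Profile("MPEG-2 long-GOP", 1920, 1080),
    "mobile": Profile("H.264", 176, 144),
}


def plan(source, target_name):
    """List the conversion steps needed to turn a source into a target."""
    target = TARGETS[target_name]
    steps = []
    if (source.width, source.height) != (target.width, target.height):
        steps.append(f"scale to {target.width}x{target.height}")
    if source.codec != target.codec:
        steps.append(f"transcode to {target.codec}")
    return steps


# A hypothetical DV news clip headed for a mobile phone.
dv_clip = Profile("DV", 720, 480)
print(plan(dv_clip, "mobile"))
```

Because the targets live in a table rather than in the processing logic, supporting a new delivery format in this sketch means adding one entry, which is the kind of flexibility a software-based approach is meant to provide.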

There is no doubt that formats will continue to evolve and expand, from ingest through production to distribution. Not all will flourish, but it is notoriously difficult to say which will capture the audience's enthusiasm. Automating the ancillary steps of graphics resizing, commercial replacement and metadata delivery is key to broadcasters repurposing content.

Shawn Carnahan is chief technical officer for Telestream.