MPEG standards

While you may have thought you were familiar with all the most relevant MPEG standards, such as MPEG-2 and MPEG-4, the MPEG committee has been busily developing a new set of international standards, called MPEG-A, MPEG-B and so forth. But rather than defining new compression tools, most of these new standards package together multiple existing MPEG technologies into a collective solution for a particular application. MPEG-A and the other derivative standards deliver normative specifications that achieve interoperability between applications and open up opportunities to use the standards in ways not originally foreseen.

Historically, MPEG has supported wide-ranging solutions by defining profiles. A profile in MPEG is a subset of the tools from one part of an MPEG standard (a subset of the syntax), chosen to trade off functionality against complexity for a relevant class of applications. Thus, while each numbered MPEG standard uses various profiles that, taken together, form a video codec with particular features, the new lettered standards group together different technologies that can be used for different applications. In fact, in an effort to make MPEG truly universal, some of these technologies even come from outside the existing realm of MPEG standards. This practice of combining technologies from different standards already exists in many applications, including ATSC and DVD.
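As a loose illustration of the profile idea (the tool and profile names below are invented for this sketch, not drawn from any MPEG specification), a profile can be pictured as the subset of coding tools a conforming decoder must implement:

```python
# Toy illustration of profiles as tool subsets; the tool and profile names
# are invented for this sketch, not taken from any MPEG specification.
FULL_TOOLSET = {"8x8_transform", "interlace_coding", "b_frames", "high_bit_depth"}

PROFILES = {
    "simple": {"8x8_transform"},
    "main":   {"8x8_transform", "interlace_coding", "b_frames"},
    "high":   FULL_TOOLSET,
}

def decoder_supports(profile, decoder_tools):
    # A decoder conforms to a profile if it implements every required tool.
    return PROFILES[profile] <= decoder_tools

print(decoder_supports("main", {"8x8_transform", "interlace_coding", "b_frames"}))  # True
print(decoder_supports("high", {"8x8_transform", "b_frames"}))                      # False
```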

MPEG-A

MPEG-A is the Multimedia Application Format (MAF) standard, which describes a number of applications, such as the Professional Archival Application Format. As an example, consider the MPEG-A Music Player MAF and Photo Player MAF, shown in Figure 1. Not only do these specifications include elements from a number of MPEG standards, but elements of other standards, such as JPEG, are referenced as well.
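As a rough, non-normative sketch of that packaging idea, a music-player application format can be pictured as one container bundling a coded audio track, descriptive metadata and JPEG cover art. The class and field names below are invented; the actual MAF specifications define precise file-format bindings.

```python
# Non-normative sketch of the MAF packaging idea: one container bundling
# technologies from several standards. Class and field names are invented;
# the actual MAF specifications define precise file-format bindings.
from dataclasses import dataclass, field

@dataclass
class MusicPlayerPackage:
    audio_track: bytes              # coded audio (e.g., an MPEG audio stream)
    metadata_xml: str               # descriptive metadata (e.g., MPEG-7 style)
    cover_art_jpeg: bytes = b""     # JPEG image referenced from outside MPEG
    extras: dict = field(default_factory=dict)

package = MusicPlayerPackage(
    audio_track=b"\x00" * 16,       # placeholder payloads for illustration
    metadata_xml="<Song><Title>Example</Title></Song>",
    cover_art_jpeg=b"\xff\xd8\xff\xd9")
print(len(package.audio_track), "audio bytes,", len(package.cover_art_jpeg), "JPEG bytes")
```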

The MPEG committee expects that new Application Formats can be developed continually and added to the standards, keeping them current with new technologies. A great value of this process is that the work needed to develop and validate new products and services can be reduced significantly, since the MPEG “letter” standards come with reference software implementations that support rapid development of the corresponding products and services.

MPEG-B

MPEG-B systems technologies define a number of coding tools, including Reconfigurable Video Coding (RVC), the binary MPEG format for XML, the Bitstream Syntax Description Language (BSDL) and Dynamic Adaptive Streaming over HTTP (DASH). The RVC framework comprises two specifications, Codec Configuration Representation and Video Tool Library, completed in 2009. The Video Tool Library specifies a set of Functional Units (FUs) that describe video decoder processes such as block transforms, motion compensation and entropy decoding.

A Decoder Description Language (DDL) further defines the structure of a video decoder, and the format of the coded bit stream is defined using the Bitstream Syntax Description Language (BSDL); both are specified in the Codec Configuration Representation standard. In effect, RVC not only allows codecs to be designed from different building blocks, but it also allows the interconnections of those blocks to be specified arbitrarily.
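The sketch below illustrates that composition idea in miniature: a toy decoder is assembled by wiring stand-in Functional Units together according to a simple decoder description, so swapping or reordering entries reconfigures the decoder without rewriting the units. It is illustrative Python, not the normative RVC description languages or tool library.

```python
# Hypothetical sketch (not the normative RVC tooling): composing a toy decoder
# from interchangeable Functional Units, wired per a simple decoder
# description, in the spirit of MPEG-B RVC.

def entropy_decode(bitstream):
    # Stand-in FU: pretend the bitstream is already a list of coefficients.
    return list(bitstream)

def inverse_transform(coefficients):
    # Stand-in FU: a trivial "inverse transform" that just scales values.
    return [c * 2 for c in coefficients]

def motion_compensate(residual):
    # Stand-in FU: add a constant "prediction" to each residual sample.
    return [r + 128 for r in residual]

# A decoder description: an ordered interconnection of FUs. Swapping or
# reordering entries reconfigures the decoder without rewriting the FUs.
DECODER_DESCRIPTION = [entropy_decode, inverse_transform, motion_compensate]

def run_decoder(description, bitstream):
    data = bitstream
    for functional_unit in description:
        data = functional_unit(data)
    return data

print(run_decoder(DECODER_DESCRIPTION, [1, 2, 3]))  # [130, 132, 134]
```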

With this reconfiguration flexibility, one might be tempted to believe that a decoder realized completely in software could be more versatile than a hardware decoder. While such an implementation may offer value for PCs and similar devices, dedicated silicon (or high-speed signal processors) often provides a more effective solution, because fixed low-level structures can be optimized more thoroughly. Nevertheless, RVC does provide advantages for semi-custom silicon development, such as for Field-Programmable Gate Arrays (FPGAs), which can be developed readily and inexpensively using RVC tools.

MPEG-C

MPEG-C (2006) covers various elements, including an accuracy specification for implementations of integer-output inverse discrete cosine transforms (IDCTs) and application requirements for stereoscopic video (SSV). The first, while sounding technically ominous, is simply a way of providing a known quality bound for block-based transform coding. Recall that the DCT is one of the elementary tools used in the most common video compression codecs; a higher level of image fidelity (fewer block artifacts) can be achieved when the accuracy of the DCT/inverse-DCT cascade used in the coding-decoding process complies with this specification.
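The small numerical sketch below shows the kind of quantity such an accuracy bound constrains: a toy fixed-point 8x8 IDCT is compared against a double-precision reference over random blocks, and the worst per-sample difference is reported. The test data and fixed-point design are illustrative assumptions; the standard defines its own procedure and error thresholds.

```python
# Illustrative sketch only: measuring how closely a toy fixed-point 8x8 IDCT
# tracks a double-precision reference. The standard's actual test procedure,
# input ranges and error thresholds are not reproduced here.
import numpy as np

N = 8

def dct_basis(n=N):
    # Orthonormal DCT-II basis matrix (row k holds basis function k).
    k = np.arange(n)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0, :] /= np.sqrt(2.0)
    return basis

C = dct_basis()

def idct2_reference(coeffs):
    # Double-precision 2-D inverse DCT: x = C^T X C.
    return C.T @ coeffs @ C

def idct2_fixed_point(coeffs, frac_bits=12):
    # Toy integer-output IDCT: basis quantized to frac_bits fractional bits,
    # arithmetic done in integers, result rescaled and rounded to integers.
    scale = 1 << frac_bits
    Ci = np.rint(C * scale).astype(np.int64)
    X = np.rint(coeffs).astype(np.int64)
    return np.rint((Ci.T @ X @ Ci) / float(scale * scale)).astype(int)

rng = np.random.default_rng(0)
worst = 0
for _ in range(1000):
    block = rng.integers(-255, 256, size=(N, N)).astype(float)
    coeffs = C @ block @ C.T                 # forward DCT of a random block
    reference = np.rint(idct2_reference(coeffs)).astype(int)
    worst = max(worst, int(np.abs(idct2_fixed_point(coeffs) - reference).max()))
print("Worst per-sample difference over 1000 random blocks:", worst)
```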

Of perhaps greater interest to prospective MPEG-C end users is the new SSV specification. To maximize interoperability between content providers, broadcasters and display manufacturers, MPEG-C SSV defines a standard format for compressing 2D+depth video. The specification includes requirements for applications that encode depth maps, and mobile displays are also considered. Its features include low overhead, backward compatibility with and re-use of existing MPEG and other standards (including MPEG-2 and AVC), flexibility with respect to compression scheme, timely availability, simplicity, and display independence.
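As an informal sketch of the 2D+depth concept (not the normative SSV syntax), a depth-enhanced frame can be pictured as an ordinary picture accompanied by a per-pixel 8-bit depth map plus near/far depth parameters. The structure, field names and depth mapping below are assumptions made for illustration.

```python
# A minimal sketch, not the normative MPEG-C SSV syntax: pairing a 2-D frame
# with a per-pixel 8-bit depth map and mapping depth codes to metric depth
# between hypothetical z_near/z_far bounds chosen for illustration.
from dataclasses import dataclass
import numpy as np

@dataclass
class DepthEnhancedFrame:
    texture: np.ndarray   # H x W x 3, the ordinary 2-D picture (e.g., AVC-coded)
    depth: np.ndarray     # H x W, 8-bit depth map carried as auxiliary video
    z_near: float         # nearest represented depth, in meters (illustrative)
    z_far: float          # farthest represented depth, in meters (illustrative)

    def metric_depth(self):
        # Map 8-bit codes to depth; 255 = nearest, 0 = farthest (a common
        # convention, assumed here rather than quoted from the standard).
        d = self.depth.astype(float) / 255.0
        return 1.0 / (d / self.z_near + (1.0 - d) / self.z_far)

frame = DepthEnhancedFrame(
    texture=np.zeros((720, 1280, 3), dtype=np.uint8),
    depth=np.full((720, 1280), 128, dtype=np.uint8),
    z_near=0.5, z_far=50.0)
print(frame.metric_depth()[0, 0])   # depth of one pixel, in meters
```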

MPEG-D, MPEG-E and MPEG-H

MPEG-D covers various audio technologies, including surround sound, spatial coding and unified speech/audio coding. MPEG-E, also known as MPEG Multimedia Middleware (M3W), is a new standard that supports the download and execution of multimedia applications. MPEG-H includes the new High Efficiency Video Coding (HEVC) codec, which provides significantly increased video compression performance; the goal for HEVC is to cut the bit rate in half relative to codecs such as AVC, targeting a broad set of applications.

MPEG-M

The MPEG-M eXtensible Middleware (MXM) standard is being developed to promote the extended use of digital media content, facilitating the production of multimedia applications and devices. MXM provides a standard specification for middleware architectures and technologies, middleware APIs (application programming interfaces) and inter-middleware protocols. While similar middleware has been developed outside of MPEG (e.g., the open-source Android platform), the emphasis of MXM is on audio-video media and its consistent handling.

The developers of MXM believe that it can provide a rapid and cost-effective path to innovative business models because all parts of the value chain are based on the same set of technologies. Although this sounds rigid and all-encompassing, MPEG-M users can pick and choose the parts of the MXM standard that are relevant to their particular application. (Think profiles, again.)
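A hypothetical sketch of that pick-and-choose idea follows: an application assembles a middleware instance from only the media-handling modules it needs, all reached through one common interface. The module names and API are invented for illustration and are not the actual MXM API.

```python
# Hypothetical sketch of the "pick and choose" idea: an application installs
# only the middleware modules it needs, accessed through one common interface.
# Module names and the API are invented, not the actual MXM API.
class MiddlewareModule:
    name = "base"
    def handle(self, request):
        raise NotImplementedError

class MetadataModule(MiddlewareModule):
    name = "metadata"
    def handle(self, request):
        return {"title": request.get("asset", "unknown"), "duration_s": 0}

class StreamingModule(MiddlewareModule):
    name = "streaming"
    def handle(self, request):
        return f"streaming '{request.get('asset')}' to {request.get('device')}"

class Middleware:
    def __init__(self, modules):
        # Only the modules the application chose are installed.
        self.modules = {m.name: m for m in modules}
    def call(self, module_name, request):
        return self.modules[module_name].handle(request)

mxm_like = Middleware([MetadataModule(), StreamingModule()])   # no rights module, by choice
print(mxm_like.call("streaming", {"asset": "concert.mp4", "device": "tablet"}))
```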

MPEG-U

The MPEG-U Rich Media User Interface specification provides a standard protocol for building user interfaces, including widgets, as well as the interfaces between widgets and widget managers. One benefit of this standard is interoperability of widgets from different service providers; personalized user interfaces are also possible.
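As an informal sketch (the interfaces below are invented for illustration and are not the MPEG-U syntax), widgets from different providers can be registered with a single widget manager that routes messages between them:

```python
# Hypothetical sketch of widgets registered with a widget manager that routes
# messages between them; the interfaces are invented, not the MPEG-U syntax.
class Widget:
    def __init__(self, name, provider):
        self.name, self.provider, self.inbox = name, provider, []
    def receive(self, message):
        self.inbox.append(message)

class WidgetManager:
    def __init__(self):
        self.widgets = {}
    def register(self, widget):
        # Widgets from different service providers share one manager.
        self.widgets[widget.name] = widget
    def send(self, target_name, message):
        self.widgets[target_name].receive(message)

manager = WidgetManager()
manager.register(Widget("weather", provider="serviceA.example"))
manager.register(Widget("news", provider="serviceB.example"))
manager.send("weather", {"action": "show", "city": "Berlin"})
print(manager.widgets["weather"].inbox)
```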

MPEG-V

MPEG-V closes out the new alphabet of standards, defining formats and protocols for “Information Exchange with Virtual Worlds,” and covering data representations between virtual worlds and between virtual worlds and the physical world. The human interface is also considered, with particular attention to sensory information and data formats for interaction devices.
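For illustration only, a sensory-effect message passed from a virtual world to a real-world device might look like the sketch below; the effect types and fields are hypothetical stand-ins rather than MPEG-V data formats.

```python
# Illustration only: a sensory-effect message passed from a virtual world to a
# real-world device. Effect types and fields are hypothetical stand-ins, not
# the MPEG-V data formats.
from dataclasses import dataclass

@dataclass
class SensoryEffect:
    effect_type: str      # e.g., "wind", "vibration", "light" (illustrative)
    intensity: float      # 0.0 .. 1.0, fraction of the device's maximum
    duration_ms: int      # how long the actuator should run

def dispatch(effect, device):
    # A stand-in for the device-side adaptation layer.
    print(f"{device}: {effect.effect_type} at {effect.intensity:.0%} "
          f"for {effect.duration_ms} ms")

dispatch(SensoryEffect("vibration", 0.6, 250), device="game-controller")
```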

Aldo Cugnini is a consultant in the digital television industry.

Send questions and comments to: aldo.cugnini@penton.com