File Interchange, Part II


The adoption of digital media has opened the door to myriad opportunities for the exchange of information. When last we discussed the principles of file interchange, we touched on the user requirements for interchange and the definitions of metadata sets that are included in the material body. This installment continues with the principles of encoding parameters adopted as standards for the exchange of data between devices.

Two general methods are available when interchanging media between devices at a file level. One method is to use proprietary or native file structures and present them to a translator or converter that acts as a gateway between manufacturer-specific devices.

The second alternative is the generation of a file that – from the onset – provides for universal exchange between devices. For the television media industry, the latter appears to be the preferred method as it offers a more universal opportunity for the exchange of files between existing and future devices.

The nirvana of media file interchange and exchange would be to have files generated on system A be universally accessible and usable on system N – where "N" is any other system – and vice versa. Obviously, this universal exchange is not yet reality, but committees consisting of manufacturers, industry integration experts and end users are hard at work creating the next best solution.

COMPLEX ADDITIONS

As the acceptance of digital media continues to grow, so does the opportunity for complexity. Just when we thought DV and MPEG-2 were settling down – with systems being implemented that take advantage of the continual improvements in compression technology – things are stepping up a notch once again.

On the horizon for actual implementation is MPEG-4, with the related MPEG-7 and MPEG-21 standards not far behind. Beyond these MPEG technologies, a number of professional file exchange methodologies are well along in development and, in several cases, are already being accepted and implemented by video server manufacturers, editing software systems and videotape transport manufacturers.

We have seen that most of the major manufacturers – thankfully – are on the MPEG bandwagon, demonstrating their commitments to a common compression technology for professional applications.

Although we hear a great deal about the MPEG families, far less information is being presented regarding the efforts made toward a universal harmonization of data, data structures and operational interchanges for media-centric content.

Looking forward from July 1998 – when the EBU and SMPTE completed and published their work from the joint task force for the harmonization of standards for exchange of program material as bitstreams – we see a progressive series of standards that have set the foundational structure for all future work related to the exchange of data between systems and devices.

Some of the extensions of the task force's work will be introduced in the following brief highlights – and this writing covers only a small portion of that development. Interested readers are encouraged to delve deeper into the subject matter utilizing the resources of the SMPTE, EBU, AAF and other organizations.

SMPTE has, for several years, been working on a universal approach to an encoding protocol definition called KLV (key-length-value). To briefly explain, each series of bitstream data is presented to a device as a triplet of data, called KLV.

I GOT A BRAND-NEW KEY

The key indicates what kind or type of data will be presented in the payload. The length describes how many bytes are expected in this set of data; the value yields the actual payload of the length previously described.
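To make the triplet concrete, here is a minimal Python sketch of building one. It rests on assumptions not spelled out above: the key is a 16-byte field, and the length is written in BER short or long form, which is how this writer understands the SMPTE 336M length field. The names `ber_length`, `encode_klv` and `DEMO_KEY` are hypothetical, and the key bytes are a placeholder, not a registered label.

```python
KEY_SIZE = 16  # assumed fixed key size, in octets

# Placeholder key for illustration only; not a registered label.
DEMO_KEY = bytes.fromhex("060e2b34") + bytes(12)


def ber_length(n: int) -> bytes:
    """Encode a payload length in BER short form (< 128) or long form."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        return bytes([n])  # short form: a single octet holds the length
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Long form: first octet is 0x80 plus the count of length octets that follow.
    return bytes([0x80 | len(body)]) + body


def encode_klv(key: bytes, value: bytes) -> bytes:
    """Concatenate key, encoded length and value into one triplet."""
    if len(key) != KEY_SIZE:
        raise ValueError("key must be exactly 16 octets")
    return key + ber_length(len(value)) + value


triplet = encode_klv(DEMO_KEY, b"abc")
```

In this sketch a 3-byte payload yields a 20-byte triplet: 16 octets of key, one octet of length and three octets of value.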

A decoder or translator that is presented with the data reacts first to the key, which resolves the question of "Is this something that this decoder knows how to handle?" This is followed by the length, which indicates – in part – how much effort and for how long the decoder will need to work in order to process the data.

If the decoder/translator does not recognize the key, then the device can ignore the data or – depending on what the capabilities of the device are – may elect to attempt a process of decoding or translating, using an alternative set of methodologies that may be suitable for the application-specific functions.
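The decoder behavior just described can be sketched as a small loop: read the key, read the length, then either hand the value to a handler or step over it. This is only an illustration under the same assumptions as before (16-byte keys, BER-encoded lengths); `read_length`, `iter_klv` and `process` are hypothetical names, and a real device might attempt an alternative decode rather than simply skipping.

```python
from typing import Callable, Dict, Iterator, List, Tuple

KEY_SIZE = 16  # assumed fixed key size, in octets


def read_length(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a BER length at pos; return (length, position after it)."""
    if pos >= len(buf):
        raise ValueError("truncated length field")
    first = buf[pos]
    pos += 1
    if first < 0x80:
        return first, pos  # short form
    count = first & 0x7F  # long form: number of length octets
    if count == 0 or pos + count > len(buf):
        raise ValueError("unsupported or truncated length field")
    return int.from_bytes(buf[pos:pos + count], "big"), pos + count


def iter_klv(buf: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) for each triplet in the buffer."""
    pos = 0
    while pos < len(buf):
        if pos + KEY_SIZE > len(buf):
            raise ValueError("truncated key")
        key = buf[pos:pos + KEY_SIZE]
        length, pos = read_length(buf, pos + KEY_SIZE)
        if pos + length > len(buf):
            raise ValueError("truncated value")
        yield key, buf[pos:pos + length]
        pos += length


def process(buf: bytes,
            handlers: Dict[bytes, Callable[[bytes], object]]) -> Tuple[List[object], int]:
    """Dispatch known keys to handlers; count and skip unknown ones."""
    results: List[object] = []
    skipped = 0
    for key, value in iter_klv(buf):
        handler = handlers.get(key)
        if handler is None:
            skipped += 1  # unrecognized key: the length lets us step past it
            continue
        results.append(handler(value))
    return results, skipped
```

Because the length always precedes the value, a decoder that does not recognize a key still knows exactly how many bytes to pass over before the next triplet begins.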

KLV is working its way toward becoming a major part of the data infrastructure for the encoding and processing of media content. As defined in SMPTE 336M, the KLV octet-level data encoding protocol is useful in representing data items and data groups.

Data can be coded in either full form – universal sets – or in one of four other groups referred to as global sets, local sets, variable-length packs or fixed-length packs. For clarification, note that an octet is an 8-bit word, equivalent to the commonly used "byte"; a triplet represents three blocks of data working together to form a single function.

The KLV structure and protocol have considerable depth, which in turn permits a wide variety of methods for encoding and identification.

UNIVERSAL LABEL

One of the more fundamental concepts in KLV is that of the universal label. The universal label (UL), as described in SMPTE 298M, is a universal labeling mechanism that aids in describing the type and encoding protocol of data within a general-purpose bitstream.

These labels, which are attached to the data and travel together through communications channels, can be used by any organization in a manner that is unambiguous, globally unique and traceable to the authoring organization.

SMPTE administers a universal label that is of fixed length (16 bytes) and is defined by SMPTE 298M. An additional variable-length label is administered internationally by ISO/ITU, or its constituent organizations. The body that administers the labels described in SMPTE 298M is known as the SMPTE Registration Authority. This entity issues the UL keys and keeps records, including other reference data.

In the key structure of KLV, a universal label has associated with it an identification key (UL key). The universal label follows the basic encoding rules (BER) for encoding an object identifier as specified in ISO/IEC 8825-1.

Each word in the UL key can be represented by a single octet. The full UL key consists of a 16-octet field with an object ID (OID) and the universal label size. This is followed by a UL code (1 octet long) and a series of other designators – called subidentifiers, which define the UL designators.
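A short Python sketch shows how the leading octets of such a key might be separated and sanity-checked. The specific values tested here are assumptions drawn from general BER rules rather than from the text above: 0x06 is the BER tag for an object identifier, the size octet counts the octets that follow it, and 0x2B is the BER encoding of the ISO arc 1.3. The names `ULHeader`, `split_ul` and `looks_like_ul` are hypothetical.

```python
from typing import NamedTuple


class ULHeader(NamedTuple):
    oid: int             # octet 0: object identifier tag
    size: int            # octet 1: number of octets that follow
    code: int            # octet 2: UL code
    designators: bytes   # remaining octets: subidentifiers


def split_ul(key: bytes) -> ULHeader:
    """Split a 16-octet key into its leading fields and designators."""
    if len(key) != 16:
        raise ValueError("a UL key is expected to be 16 octets")
    return ULHeader(key[0], key[1], key[2], key[3:])


def looks_like_ul(key: bytes) -> bool:
    """Check the leading octets against the assumed BER values."""
    head = split_ul(key)
    return head.oid == 0x06 and head.size == len(key) - 2 and head.code == 0x2B
```

A check like this only confirms the framing of a label; deciding what the label means still requires the registry that issued it.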

SMPTE 336M further defines a universal set as one accompanying a standard or recommended practice that includes a structure designator and an accompanying universal set registry with a version number. Each data element in a universal set must apply the KLV protocol, including the full UL key value. The value of this universal set is itself a series of KLV-encoded elements, whose total length is given in the set's length field.

In addition to the full-form, or universal, set, there are four further groupings: two kinds of sets and two kinds of packs. Global sets, like full-form sets, must adhere to the KLV protocol, except that they are allowed to use a shortened "global tag" value that replaces the UL key used in the full-form universal set.

Local sets allow elements, any of which may be present or absent, to appear in any order within the set. The structure of the designator is defined by a local set registry and standard or practice. Tagging is the process of attaching a UI (unique identifier) to the essence to enable reliable access and tracking at appropriate levels of granularity.
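Because the elements of a local set may arrive in any order, a decoder naturally collects them by tag. The Python sketch below assumes 1-byte local tags and 1-byte lengths purely for illustration; the actual sizes would come from the applicable registry and standard. The function name `parse_local_set` and the tag numbers in the usage lines are hypothetical.

```python
from typing import Dict


def parse_local_set(value: bytes, tag_size: int = 1, len_size: int = 1) -> Dict[int, bytes]:
    """Collect the tagged elements of a local set into a dict keyed by tag."""
    items: Dict[int, bytes] = {}
    head = tag_size + len_size
    pos = 0
    while pos < len(value):
        if pos + head > len(value):
            raise ValueError("truncated tag or length")
        tag = int.from_bytes(value[pos:pos + tag_size], "big")
        length = int.from_bytes(value[pos + tag_size:pos + head], "big")
        pos += head
        if pos + length > len(value):
            raise ValueError("truncated element value")
        items[tag] = value[pos:pos + length]
        pos += length
    return items


# Two orderings of the same elements decode to the same result.
first = parse_local_set(bytes([1, 2]) + b"hi" + bytes([7, 1]) + b"x")
second = parse_local_set(bytes([7, 1]) + b"x" + bytes([1, 2]) + b"hi")
```

An absent element simply never appears in the result, so the caller can test for a tag before using it.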

Packs, in either variable-length or fixed-length form, are similar to local sets but do not have local tags. In the variable-length pack, elements must appear in a defined order. For the simplest of structures, the fixed-length pack, each element consists of only a value field.
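With no tags and no per-element lengths, a fixed-length pack can only be decoded against a layout agreed upon in advance. The following Python sketch illustrates the idea; the field names and sizes in `DEMO_LAYOUT` are invented for this example and do not come from any SMPTE registry, and `parse_fixed_pack` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, size in octets), in the defined order.
DEMO_LAYOUT: List[Tuple[str, int]] = [
    ("frame_count", 4),
    ("channel", 1),
    ("flags", 2),
]


def parse_fixed_pack(value: bytes, layout: List[Tuple[str, int]]) -> Dict[str, bytes]:
    """Slice a pack's value into fields using only the known layout."""
    expected = sum(size for _, size in layout)
    if len(value) != expected:
        raise ValueError(f"expected {expected} octets, got {len(value)}")
    fields: Dict[str, bytes] = {}
    pos = 0
    for name, size in layout:
        fields[name] = value[pos:pos + size]  # position alone identifies the field
        pos += size
    return fields


pack = parse_fixed_pack(bytes([0, 0, 0, 25, 2, 0, 1]), DEMO_LAYOUT)
```

The trade-off is compactness against flexibility: the pack carries no overhead per element, but both ends must share the same layout, and no element can be omitted or reordered.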

FINE DETAILS

Although the fine details of the KLV encoding protocols are not essential to a fundamental understanding of material exchange architectures, it is valuable to know that – with the complexities of digital media that we must deal with – a structure is in place that will carry forward into the future.

Additional concepts – such as the UMID (unique material identifier) for production and broadcast environments detailed in the recommended practices (SMPTE RP 205-2000) – are also in place and are directly tied to the SMPTE metadata dictionary that defines how metadata will be named, catalogued and implemented for media-centric and other audio/visual activities.

Those interested in the full features of these protocols and conventions should refer to the specific SMPTE documents mentioned in this article.

As data-centric and media-centric material becomes more harmonized, the barriers between data exchanges will diminish. Several companies, such as Grass Valley, which drove much of the GXF development, and organizations such as SMPTE, EBU, ISO/MPEG and the ITU are striving toward a universal interchange while still maintaining their individuality in performance for file and streaming functionality.

This author is grateful to SMPTE, AAF and the Grass Valley Group for providing portions of the information contained herein.