Metadata: The keys to the kingdom

To be able to assemble program elements for broadcast you have to be able to find them. In a tapeless environment there are no labels on the spines of tapes. Clips and graphic elements reside on a disk. A simple hierarchical directory structure may suffice in the short term, but how do you find a clip or graphics element from last year?

Disk storage is expensive, so moving content off to a tape archive is prudent business practice. But you can’t manually search through a robotic tape archive by loading each tape and checking the volume table of contents via a directory or folder listing.

It gets worse. Let’s suppose you want every presidential speech in the last decade that mentioned unemployment. How do you find these clips when they are scattered over numerous archived tapes? Even if the content is on disks, this is still a virtually impossible task.

Now let’s add a constraint. These speeches cannot be State of the Union speeches and must have been delivered at luncheons. It would be like searching for needles in a number of haystacks.
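With descriptive metadata attached to every clip, a search like this becomes a filter over catalog records rather than a hunt through tapes. The following is a minimal sketch in Python. The `ClipRecord` type, its field names and the sample catalog are all invented for illustration and do not correspond to any real catalog schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ClipRecord:
    """Hypothetical catalog entry; every field name here is illustrative."""
    clip_id: str
    speaker_role: str
    event_type: str            # e.g. "luncheon" or "state_of_the_union"
    recorded: date
    topics: set = field(default_factory=set)
    location: str = ""         # e.g. a tape barcode or a disk path


def find_clips(catalog, since):
    """Presidential luncheon speeches since `since` that mention unemployment.

    Requiring event_type == "luncheon" also excludes State of the Union speeches.
    """
    return [
        clip for clip in catalog
        if clip.speaker_role == "president"
        and clip.event_type == "luncheon"
        and "unemployment" in clip.topics
        and clip.recorded >= since
    ]


# Invented sample data.
catalog = [
    ClipRecord("A001", "president", "luncheon", date(2003, 5, 14),
               {"unemployment", "trade"}, "tape:0412"),
    ClipRecord("A002", "president", "state_of_the_union", date(2004, 1, 20),
               {"unemployment"}, "tape:0533"),
    ClipRecord("A003", "senator", "luncheon", date(2002, 9, 3),
               {"unemployment"}, "disk:/news/2002"),
    ClipRecord("A004", "president", "luncheon", date(1989, 3, 1),
               {"unemployment"}, "tape:0007"),
]

hits = find_clips(catalog, since=date(1996, 1, 1))
print([(clip.clip_id, clip.location) for clip in hits])
```

The query itself is trivial. The hard part, and the subject of the rest of this article, is getting consistent values into those fields in the first place.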

On your mark

Broadcasters and the media industry are not the first to be confronted with this dilemma. In the pre-PC age, libraries kept track of their books by using the Dewey Decimal System and an extensive card catalog that listed title, author, subject and publication date, among other information. In the 1960s the Library of Congress initiated the Machine-Readable Cataloging (MARC) project. Under the direction of Henriette Avram, the MARC Project computerized the entire LoC book catalog.

But the project did not stop there. It was an international effort and became an ISO standard. Libraries all around the world were networked and able to browse the catalogs of any member library. Until the advent of the Internet, this was the largest public freely distributed information database in existence. And the amazing thing is that the MARC system has been augmented and is still in use today.

Babel

Compare this accomplishment to the state of metadata standards in broadcasting today. During the transition to digital production, many metadata implementations have come into being. Most are proprietary, so once a system is installed, you are often locked into only that vendor’s solutions.

Compatibility and interoperability of various metadata implementations are now becoming a reality. Efforts such as the Advanced Authoring Format (AAF) and the Material eXchange Format (MXF) are gaining widespread industry acceptance.

AAF is intended for use during production and editing. It includes a rich set of technical, descriptive and rights metadata. AAF files can reference and incorporate other AAF files. In this way a log of the creative history of the piece or clip is maintained.

MXF is a subset of AAF and is intended for distribution and archive processes. Essence files are wrapped with MXF metadata. The structure of an MXF file is independent of the format of the essence. MXF files are self-contained and do not reference other external MXF files. The MXF commonality allows transfer of the file to any MXF-capable device. Yet to open and use the content, an appropriate decoder must be accessible.

The SMPTE MXF Implementers Working Group is a subgroup of the SMPTE W25 Wrappers and Metadata Committee. The objectives of the Working Group include:

Promote interoperability between MXF implementations.
Provide a platform for users and industry to pose questions and requests for guidance and best practices on MXF implementation.
Identify areas where new MXF standards are required and provide adequate SMPTE due process specifications.
Provide advisory notes and reports that aid the MXF implementation community.

Open source MXF tools are available.

Unique Material Identifiers (UMIDs), defined in SMPTE 330-2000, are another potential link through the content lifecycle. Recommended use of UMIDs is described in RP 205.

The Version International Standard Audiovisual Number (V-ISAN), referenced in ATSC A/57, identifies programs, episodes and versions of content delivered in MPEG transport streams.

In the consumer environment, various metadata schemes are proliferating. Compatibility between systems is an issue. Of paramount concern at this point in the content lifecycle is the application of rights management. But perhaps a bigger problem is long-term archival storage on media whose lifetimes are uncertain.

A rose is a rose

Key to storage and retrieval is metadata. Standardization of metadata terminology is the means to interoperability among systems that use content, such as MAMs, editing applications and eventually home digital networks.

To attain this ideal, SMPTE has implemented a Metadata Registry. The EBU/SMPTE Task Force, whose final report was published in 1998, introduced the concept of content = essence + metadata, which is now central to the production profession and infrastructure. Today, W25 carries on the work.

The SMPTE Registration Authority (http://www.smpte-ra.org/) provides a public repository for several labeling standards, including the SMPTE Metadata Registry, SMPTE Unique Material Identifiers (UMIDs), and MPEG Format Identifiers.

Communication

eXtensible Markup Language (XML) is the format used to convey metadata among applications. XML is rapidly becoming the metadata exchange standard of the document industry, and XML text files are both human- and machine-readable.
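As a rough illustration of how one application might hand a metadata record to another as XML, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element names (`clip`, `descriptive`, `technical`, `rights` and their children) are assumptions made up for this example and are not taken from any SMPTE, MPEG or other published schema.

```python
import xml.etree.ElementTree as ET


def build_record(clip_id, title, rights_holder, duration_frames):
    """Serialize a clip's metadata to XML text (hypothetical element names)."""
    root = ET.Element("clip", attrib={"id": clip_id})
    descriptive = ET.SubElement(root, "descriptive")
    ET.SubElement(descriptive, "title").text = title
    technical = ET.SubElement(root, "technical")
    ET.SubElement(technical, "durationFrames").text = str(duration_frames)
    rights = ET.SubElement(root, "rights")
    ET.SubElement(rights, "holder").text = rights_holder
    return ET.tostring(root, encoding="unicode")


def read_record(xml_text):
    """Parse the XML text back into a plain dictionary on the receiving side."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "title": root.findtext("descriptive/title"),
        "duration_frames": int(root.findtext("technical/durationFrames")),
        "rights_holder": root.findtext("rights/holder"),
    }


# Invented sample values.
xml_text = build_record("A001", "Luncheon address", "Example Network", 5400)
record = read_record(xml_text)
print(xml_text)
print(record)
```

The same text can be read by a person in an editor and parsed by a machine, which is the property the paragraph above describes. Interoperability still depends on both sides agreeing on what the element names mean.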

Key Length Value (KLV) syntax, defined in SMPTE 336M, can now be expressed in XML. Self-documenting XML files communicate technical, descriptive, administrative and rights information between applications and equipment.
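For readers unfamiliar with KLV, the binary layout is a key, then a length, then the value bytes. The sketch below is a simplified Python illustration, assuming 16-byte keys and BER-style length fields (one byte for lengths under 128, otherwise a count byte followed by the length in big-endian order). The keys used here are made-up ASCII placeholders, not registered SMPTE Universal Labels, and the sketch does not attempt to cover the full standard.

```python
def encode_ber_length(n):
    """Encode a length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_klv(key, value):
    """Build one key-length-value triplet."""
    if len(key) != 16:
        raise ValueError("this sketch assumes 16-byte keys")
    return key + encode_ber_length(len(value)) + value


def decode_klv(buf):
    """Yield (key, value) pairs from back-to-back KLV triplets."""
    pos = 0
    while pos < len(buf):
        key = buf[pos:pos + 16]
        pos += 16
        first = buf[pos]
        pos += 1
        if first < 0x80:
            length = first                      # short form
        else:
            count = first & 0x7F                # long form: number of length bytes
            length = int.from_bytes(buf[pos:pos + count], "big")
            pos += count
        yield key, buf[pos:pos + length]
        pos += length


# Placeholder keys (NOT real registry entries), each exactly 16 bytes.
TITLE_KEY = b"EXAMPLE-KEY-0001"
NOTES_KEY = b"EXAMPLE-KEY-0002"

packet = encode_klv(TITLE_KEY, b"Luncheon address") + encode_klv(NOTES_KEY, bytes(200))
decoded = list(decode_klv(packet))
print([(key, len(value)) for key, value in decoded])
```

A decoder that does not recognize a key can still read the length and skip over the value, which is what lets devices pass along metadata they do not understand.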

Publicly accessible metadata registries are necessary to facilitate interoperability of metadata dictionaries. By including dictionary references, access information and registry location, a cataloging application can find relevant metadata and the sought-after material.

For libraries and long-term preservation of digital assets, the Open Archival Information System (OAIS) is developing a reference model to enable exchange of metadata among registries. If MPEG-7 and MPEG-21 attain the global acceptance and longevity that MARC has, the issue of metadata scheme compatibility may vanish.

Traversing the chain

How is all this information linked through the media chain? How does a download purchase of content by a consumer propagate metadata and transaction information up the chain such that rights information in the original AAF file generates a payment to the copyright holder?

Consider a consumer watching his favorite situation comedy. For the moment, let us assume that a means to convey information to the program originator exists. We will also assume that iTV T-commerce exists. The consumer loves a song that is played in the show. Using iTV features, a window with a "purchase song" button is displayed. At this moment, the receiver must know, via metadata in the delivered content, 1) what song will be downloaded and 2) what program contains the song. The song info can be time code indexed in program metadata. The program can be identified by its V-ISAN.

In this example, the V-ISAN metadata field is the link from the consumer through the distribution channel, which carries the V-ISAN. The MXF file that wrapped the transmitted essence contains the V-ISAN in its metadata and is linked to the ISO ISAN. Moving back up the content lifecycle chain, the ISAN points to an AAF file that contains information necessary to locate any audio or video element contained in the program. Ultimately, the song in the program is reached.

Program (V-ISAN) and song (Title) information have been communicated from the consumer to the program originator via the backchannel. The content provider must now correlate the V-ISAN with the ISAN and then locate the AAF file that contains (or points to) the song essence and its metadata.
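The lookup chain just described can be sketched as three table lookups. This Python example is purely illustrative: the identifier strings, file names, table layouts and the `resolve_song` function are all invented, and real V-ISAN and ISAN values, AAF files and rights databases look nothing like these simple dictionaries.

```python
# All identifiers and table contents below are invented for illustration.

# Version identifier carried in the transport stream -> work-level identifier.
VISAN_TO_ISAN = {"VISAN-EX-0001-V2": "ISAN-EX-0001"}

# Work-level identifier -> production file holding the program's composition.
ISAN_TO_COMPOSITION = {"ISAN-EX-0001": "sitcom_ep12.aaf"}

# Stand-in for the metadata a production file would hold about its elements.
COMPOSITIONS = {
    "sitcom_ep12.aaf": {
        "Opening Theme": {"essence": "audio/theme.wav", "rights_holder": "Publisher A"},
        "Cafe Song": {"essence": "audio/cafe_song.wav", "rights_holder": "Publisher B"},
    },
}


def resolve_song(v_isan, song_title):
    """Follow V-ISAN -> ISAN -> composition file -> song entry.

    Returns None if any link in the chain is missing.
    """
    isan = VISAN_TO_ISAN.get(v_isan)
    composition = ISAN_TO_COMPOSITION.get(isan)
    song = COMPOSITIONS.get(composition, {}).get(song_title)
    if song is None:
        return None
    return {"isan": isan, "composition": composition, "title": song_title, **song}


# The backchannel delivers the program version and the song title.
result = resolve_song("VISAN-EX-0001-V2", "Cafe Song")
print(result)
```

If any one of the three tables is missing an entry, the request dead-ends and no payment can be routed, which is why each identifier has to survive every hand-off in the chain.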

Rights information and usage fee payment directions facilitate automated royalty payment to the songwriter and publisher. The consumer completes the transaction in a Web-like transaction session. The song is then downloaded.

One language

Without integrated, consistent, interoperable metadata, ubiquitous, transparent digital audio and video content distribution and consumption may never become a reality for broadcasters. With all its flaws, the Internet does at least provide a means for finding and consuming information.

For large collections, automated generation of metadata will ease the tedious task of manually cataloging volumes of material. Systems that use speech recognition and closed captioning analysis are emerging. Somewhere in the future, video feature extraction will mature to the point of commercial implementation. Intelligent recommender systems will use this metadata to track user preferences and suggest the evening’s entertainment.

Metadata must be persistent. It must not be lost during any phase of the content’s lifecycle. Interoperability is mandatory, since there is no one format that covers all phases. Lose the metadata and you’ve lost the content. Who will remember where something is by its location on a drive in a file structure in five years?

The protection of rights must be maintained for content, but balanced with the ability to locate and preview. What good is rock-solid content protection if no one can find your content to download?

Additional reading

1) Metadata Systems Architecture, O. Morgan, Metaglue

2) Issues in DTV Broadcast-Related Metadata, Richard Chernock & Frank Schaffa, IBM Whitepaper

3) Integrating Metadata Schema Registries with Digital Preservation Systems to support Interoperability: A Proposal, Michael Day, UKOLN, University of Bath, UK

4) Digital Video Archives: Managing Through Metadata, Howard D. Wactlar and Michael G. Christel, Carnegie Mellon University