Many years ago, SMPTE and EBU joined forces to harmonize the content-related issues associated with interchanging digital bit streams. Though the resulting 1998 report was not the first use of the term “metadata,” it clarified terms related to interchanging content in the emerging digital video ecosystem. The preface to the report introduces “a new class of program-related data called metadata. Metadata may be defined as the descriptive and supporting data that is connected to the program or the program elements.”
Metadata is one half of the media; the other half is the essence, i.e., the actual content itself. Though metadata may be necessary to use the essence, it can be processed and maintained on a path totally separate from the essence. This creates the potential for one class to be corrupted, or to disappear, while the other still exists.
Metadata falls into two major classifications. Structural metadata provides critical information that may be necessary to process the associated content. It includes things like the scan format, the compression system used, the number of audio tracks and their technical parameters, closed captioning, and the unique media identifier (UMID) (SMPTE 330M-2004). Descriptive metadata provides other useful information about the content, such as program title, length, copyright holder and even scripts.
SMPTE 335M-2001 describes in great detail 15 classes of metadata and how they are represented. The overarching description is based on key-length-value (KLV) encoding. The key is a reference permanently tying the elements together. The length comes next and specifies the number of bytes of data in the value, which follows it. This simple structure allows for almost infinite variation and simple parsing of the information.
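The key-length-value layout described above can be sketched in a few lines of Python. This is a simplified illustration, not the standard's full encoding: it assumes a fixed 16-byte key and a BER-style length field (one byte for lengths under 128, otherwise a count byte followed by the length itself). The function names and the sample key are hypothetical and do not come from any SMPTE dictionary.

```python
KEY_SIZE = 16  # assumption: fixed-size 16-byte keys


def encode_ber_length(n: int) -> bytes:
    """Encode a length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_klv(key: bytes, value: bytes) -> bytes:
    """Build one packet: key, then length, then value."""
    if len(key) != KEY_SIZE:
        raise ValueError("key must be exactly 16 bytes")
    return key + encode_ber_length(len(value)) + value


def decode_klv(buf: bytes) -> list:
    """Walk a buffer of back-to-back packets and return (key, value) pairs."""
    items = []
    pos = 0
    while pos < len(buf):
        if pos + KEY_SIZE + 1 > len(buf):
            raise ValueError("truncated key or length")
        key = buf[pos:pos + KEY_SIZE]
        pos += KEY_SIZE
        first = buf[pos]
        pos += 1
        if first < 0x80:
            length = first  # short form: the byte is the length
        else:
            count = first & 0x7F  # long form: number of length bytes
            length = int.from_bytes(buf[pos:pos + count], "big")
            pos += count
        value = buf[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated value")
        pos += length
        items.append((key, value))
    return items


# Hypothetical key for a "program title" element (made up for this example).
TITLE_KEY = bytes(range(16))
packet = encode_klv(TITLE_KEY, b"Evening News")
decoded = decode_klv(packet)
```

Because every packet announces its own length, a parser that does not recognize a key can still skip the value and move on to the next packet, which is what makes the structure easy to extend.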
Of course, it is often inconvenient to keep metadata and essence separate. Managing the two independently, in the case of, say, a SMPTE 292 stream of HD content, would be difficult. As a result, the concept of a wrapper must be introduced. A wrapper gathers the elements (video and audio essence, and metadata) together and presents one unified delivery mechanism. (See Figure 1.) SMPTE 292 is itself a wrapper, as it carries audio, video and metadata (including UMID) on a 1.485Gb/s link. In the file domain, a QuickTime file is, in fact, a wrapper, as is MXF.
Using a wrapper allows the media and all of the metadata to be delivered in a unified structure. But there are times when metadata must be separated from the content. For example, in a TV station, the metadata about programs and spots is distributed to traffic, automation, and perhaps asset management and archive databases. Each has a function to perform, and each must be aware of aspects of the content, but none needs access to the content itself. Making the full wrapper available to such applications would make managing their data vastly more complicated and expensive. Traffic gets some of the metadata from a syndicator or programming database. It passes only the bare essentials to automation, perhaps no more than the house number and start of message/end of message (SOM/EOM), along with the time the event is to play.
Automation sends additional metadata about the content back to traffic for reconciliation after air. An archive system might have a slightly richer database to allow some searching, and a full media asset management (MAM) system would have a rich set of metadata fields on which searches can be performed.
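One way to picture this division of labor is as a set of per-application views onto a single record. The sketch below is purely illustrative: the field names, the sample values and the `view_for` helper are all hypothetical, and the choice of which fields each system receives is an assumption loosely based on the description above.

```python
# Hypothetical full metadata record for one spot (all values are made up).
FULL_RECORD = {
    "house_number": "NW1234",
    "title": "Evening News Promo",
    "som": "01:00:00:00",
    "eom": "01:00:29:29",
    "air_time": "18:59:30",
    "copyright_holder": "Example Station",
    "keywords": ["news", "promo"],
}

# Assumed field lists: automation gets the bare essentials, the archive
# gets a little more for searching, and MAM sees every field.
FIELDS_BY_SYSTEM = {
    "automation": ("house_number", "som", "eom", "air_time"),
    "archive": ("house_number", "title", "keywords"),
    "mam": tuple(FULL_RECORD),
}


def view_for(system: str, record: dict) -> dict:
    """Return only the fields the named system is meant to receive."""
    wanted = FIELDS_BY_SYSTEM[system]
    return {name: record[name] for name in wanted if name in record}


automation_view = view_for("automation", FULL_RECORD)
```

None of the views includes the essence itself, which reflects the point made above: these applications need to know about the content without having to open the wrapper.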
Keeping data current
A major issue is keeping all data about each piece of content current. Knowing which system holds the most current data is an interesting challenge. For example, traffic might have the SOM/EOM values that were provided by the vendor, but the times might have been trimmed to slightly different values when the media was ingested in master control. MAM would have no knowledge that they had changed either. Propagating those changes is often a complex process with multiple applications communicating, perhaps in real time.
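The SOM/EOM example can be made concrete with a small sketch. It assumes something the scenario above does not guarantee: that every system stamps its copy with a trustworthy last-modified time, so the newest copy can be treated as authoritative. The `TimingCopy` class, the `reconcile` function and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimingCopy:
    """One system's copy of the timing metadata for a piece of content."""
    system: str
    som: str
    eom: str
    updated_at: datetime  # assumption: clocks are comparable across systems


def reconcile(copies: list) -> tuple:
    """Return the newest copy and the names of systems that disagree with it."""
    newest = max(copies, key=lambda c: c.updated_at)
    stale = [
        c.system
        for c in copies
        if (c.som, c.eom) != (newest.som, newest.eom)
    ]
    return newest, stale


copies = [
    # Vendor-supplied values still held by traffic and MAM.
    TimingCopy("traffic", "01:00:00:00", "01:00:30:00", datetime(2024, 1, 1, 9, 0)),
    TimingCopy("mam", "01:00:00:00", "01:00:30:00", datetime(2024, 1, 1, 9, 5)),
    # Values trimmed at ingest in master control.
    TimingCopy("automation", "01:00:00:02", "01:00:29:28", datetime(2024, 1, 2, 14, 0)),
]
newest, stale = reconcile(copies)
```

Finding the stale copies is the easy part. Actually pushing the trimmed values back to traffic and MAM is where the multi-application communication described above comes in.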
If the metadata inside the wrapper with the content contains the same information, the complications become deeper. Do you really want to permanently modify the metadata associated with the content? To do so implies that the applications have access to the inside of the wrapper. What happens if the metadata becomes corrupted? Or worse, if it is an MPEG file, there is a chance that the syntax of the transport itself can be corrupted. The result could be corrupted essence.
There are other places where the linkage between remote metadata and essence can be broken, with dire consequences. When a system is told to delete content (both essence and metadata), often a file system entry is simply expunged, leaving the blocks where the content is stored free for future recording. But if the record in the archive database or the video server's file system is trashed inappropriately, it may prevent access to content that is still valid and present. Without the metadata that records the location of the essence, it is easy to conclude that the content “doesn't exist.”
As time goes on, the set of “normal” metadata becomes richer and more complex. In a recent development, YouTube announced that it will be fingerprinting all content for the purpose of making take-down decisions more transparent. YouTube has provided the tools to broadcasters, through third-party vendors, to create the fingerprint. But in the end, that adds a piece of metadata that must be stored with each item. When a program airs with changed supers, a new fingerprint may be needed, which means more metadata.
Digital rights management information also creates more metadata. A few years ago, one broadcaster sent out a request for proposal asking that every use of each piece of content be tracked, so that if news footage had single-use rights, it could be flagged as unavailable after the first use. Because this required every process in the production sequence to be monitored, and the metadata to be made available to all applications and everyone involved, it became horribly complex. So maybe it's the bits about the bits that are becoming more complex as the bits themselves become more ubiquitous.
John Luff is a broadcast technology consultant.
Send questions and comments to: firstname.lastname@example.org