Metadata and MXF, part 2

Last month in this column, we looked at the importance of metadata to the professional media industry. We talked about different types of metadata (static and dynamic, technical and descriptive), and discussed the importance of the Unique Media Identifier (UMID) and how it links metadata and content.

This month, we will focus on why metadata is so important, how rekeying of metadata is a common problem in file-based workflows, and how metadata contained in MXF files can reduce rekeying errors. We will also talk about application specifications produced by the Advanced Media Workflow Association (AMWA) that improve MXF metadata interoperability in file-based workflows.

Critical metadata

There are many reasons why metadata is critical to file-based workflows for professional media organizations.

The first, and perhaps most compelling, reason is that files simply won't play properly without the appropriate technical metadata. Files cannot be reliably identified without it, and in post-production it may be difficult or impossible to determine reliably how video was manipulated relative to the original source content. Overcranking (changing the frame rate of the source material in post-production) is one example. Linking the content to information held in databases, business systems and other media applications is also impossible without some form of metadata.

Fast fail, the ability to determine quickly that a file cannot be used before committing time and bandwidth to it, is important in professional media applications. Without metadata, an application must do a “deep dive” into the media, meaning that it must actually begin opening the media file to determine whether it can decode and play back the content. The problem is exacerbated if the content is located remotely. Failing to implement fast fail might mean that an entire two-hour movie has to be transferred over a network (a time-consuming process), only to discover once the transfer is complete that the application cannot open the file.
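As a rough illustration of fast fail, the sketch below compares a file's technical metadata with what an application can decode before any transfer starts. The class names, field names and capability values are all hypothetical; a real system would read the equivalent information from the file header rather than build it by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TechnicalMetadata:
    """Hypothetical summary of the technical metadata carried in a file header."""
    container: str        # e.g. "MXF"
    video_codec: str      # e.g. "MPEG-2"
    bit_rate_mbps: float  # essence bit rate in Mb/s
    duration_s: float     # program duration in seconds


@dataclass(frozen=True)
class PlayerCapabilities:
    """Hypothetical description of what one application can decode."""
    containers: frozenset
    video_codecs: frozenset
    max_bit_rate_mbps: float


def fast_fail_reasons(meta, caps):
    """Return the reasons the file cannot be played; an empty list means proceed."""
    reasons = []
    if meta.container not in caps.containers:
        reasons.append(f"unsupported container: {meta.container}")
    if meta.video_codec not in caps.video_codecs:
        reasons.append(f"unsupported video codec: {meta.video_codec}")
    if meta.bit_rate_mbps > caps.max_bit_rate_mbps:
        reasons.append(
            f"bit rate {meta.bit_rate_mbps} Mb/s exceeds {caps.max_bit_rate_mbps} Mb/s"
        )
    return reasons


def transfer_time_s(meta, link_mbps):
    """Rough time to move the whole file over a link of the given speed."""
    return meta.bit_rate_mbps * meta.duration_s / link_mbps


# Illustrative values only: a two-hour movie and a player limited to 25 Mb/s.
movie = TechnicalMetadata("MXF", "MPEG-2", 50.0, 7200.0)
player = PlayerCapabilities(frozenset({"MXF"}), frozenset({"MPEG-2"}), 25.0)

problems = fast_fail_reasons(movie, player)
if problems:
    saved = transfer_time_s(movie, link_mbps=100.0)
    print(f"Rejected before transfer ({problems[0]}); avoided about {saved:.0f} s of copying")
```

With these made-up numbers, the check rejects the file immediately instead of after an hour-long transfer over a 100 Mb/s link.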

Indexing to a specific point, or even knowing where the first frame of video is in a file, cannot be accomplished without metadata. Partial restore is an important concept in professional media. Because media files may be large, retrieving even a short section of video from a movie can take a long time if the whole file must be fetched first. Indexing metadata lets applications move quickly to a specific part of the content and begin retrieving it from that point.
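A minimal sketch of partial restore follows, assuming a hypothetical index that stores one byte offset per frame plus a final end-of-essence offset. Real MXF index tables are more involved; the function names and the toy data here are illustrative only.

```python
import io


def timecode_to_frame(timecode, fps):
    """Convert non-drop-frame 'HH:MM:SS:FF' timecode to a zero-based frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def byte_range(frame_offsets, first_frame, last_frame):
    """Return (start, end) byte offsets covering first_frame..last_frame inclusive.

    frame_offsets holds one entry per frame plus a final end-of-essence entry.
    """
    frame_count = len(frame_offsets) - 1
    if not 0 <= first_frame <= last_frame < frame_count:
        raise ValueError("requested frames are outside the indexed range")
    return frame_offsets[first_frame], frame_offsets[last_frame + 1]


def partial_restore(stream, frame_offsets, first_frame, last_frame):
    """Seek straight to the wanted frames and read only those bytes."""
    start, end = byte_range(frame_offsets, first_frame, last_frame)
    stream.seek(start)
    return stream.read(end - start)


# Toy essence: four "frames" of 3, 5, 2 and 4 bytes.
essence = io.BytesIO(b"AAABBBBBCCDDDD")
offsets = [0, 3, 8, 10, 14]

clip = partial_restore(essence, offsets, first_frame=1, last_frame=2)
print(clip)
```

Only the bytes for frames 1 and 2 are read; the rest of the file is never touched, which is the point of carrying index information with the content.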

Even something as basic as the expected structure of a file is conveyed with metadata. For example, is the file a simple one containing a single piece of essence (only video without its associated audio)? Is the file similar to a tape, containing a single video track and two or four channels of associated, synchronous audio? Does the file have a complex structure, including multiple pieces of video and audio with different timelines? The differences in these file types can make a huge difference to an application. Remember fast fail? It may be that the application is unable to handle anything but the case where the file looks like a video tape.
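The three cases above can be sketched as a simple classification over a track list. The `Track` fields and the category labels are hypothetical stand-ins for whatever structural metadata a file actually carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    kind: str      # "video" or "audio"
    channels: int  # audio channel count; 0 for video
    timeline: int  # hypothetical ID; tracks sharing a timeline are synchronous


def classify_structure(tracks):
    """Label a file as 'single-essence', 'tape-like' or 'complex'."""
    if not tracks:
        raise ValueError("a file with no tracks cannot be classified")
    if len(tracks) == 1:
        return "single-essence"
    videos = [t for t in tracks if t.kind == "video"]
    audio_channels = sum(t.channels for t in tracks if t.kind == "audio")
    timelines = {t.timeline for t in tracks}
    # Tape-like: one video track, two or four audio channels, one shared timeline.
    if len(videos) == 1 and len(timelines) == 1 and audio_channels in (2, 4):
        return "tape-like"
    return "complex"


tape_like = [Track("video", 0, 1), Track("audio", 2, 1)]
print(classify_structure(tape_like))
```

An application that can handle only the tape-like case could run a check of this kind first and refuse anything else without opening the essence.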

I know some of you are thinking of the many ways the issues raised above can be overcome without using metadata embedded in the file. Of course, you are correct. But many of the solutions you may be thinking of are either proprietary or involve overarching media management systems.

I acknowledge that both proprietary solutions and media management systems are commonplace in professional media facilities. However, many users are looking for open, standards-based metadata solutions. Also, many media companies have implemented media management systems that do not fully leverage the benefits of file-based workflow metadata.

Some issues

Figure 1. This shows a hypothetical workflow involving manual intervention. A contract is created with a media supplier to air a program (A). The information from the contract is manually entered into a program planning system (B). Program information is viewed and updated in traffic (C). Additional information comes to traffic from the commercial order area (D) to be manually entered. The on-air automation system (E) receives information from traffic, but it is further manually edited in integration and QC (F). Sometimes, information is manually edited again as web distribution information is added (G).

One of the biggest issues facing media companies is how to efficiently move content to consumers, whether they are watching a TV, using a smartphone or some as-yet unknown device. All of these systems require metadata along with the content in order to deliver a complete viewing experience. Unfortunately, the pressure on media companies to provide services has outstripped many organizations' ability to build, test and commission systems that populate metadata fields in media applications correctly. This has had some unfortunate results:

Human rekeying of metadata — In this case, metadata for a piece of content already exists in a system, but operators have to manually rekey it because the company has not had time to engineer an automated solution.
Manual metadata entry — Sometimes, metadata exists in a computer-unfriendly form, or in a form that cannot easily be translated from one computer system to another (information coming in by facsimile, for example). In this case, metadata is manually entered from paper or read from one computer terminal on one system and entered into a field on a different computer system.

In both of these cases, which are extremely common, there are three fundamental problems. The first is that these workflows, without question, will introduce errors. The second issue is that the workflow employs humans in a task they are not good at. The third problem is that these manual operations frequently duplicate effort that already was expended elsewhere within the organization.

An additional problem is that because we have humans doing something they are intrinsically not good at (manually copying numbers from one screen to another, for example), errors will continue to happen for as long as the workflow is in place, regardless of a worker's competence or training level. These errors are costly to media organizations, both because they are visible to the general public and because they result in lost revenue. (See Figure 1.)
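To see why such errors persist, consider a back-of-the-envelope model. The per-field error rate, field count and number of rekeying steps below are invented for illustration and are not measurements from any facility; the model also assumes that each keystroke error is independent of the others.

```python
def chance_of_any_error(per_field_error_rate, fields, rekeying_steps):
    """Probability that at least one field is wrong after every manual step.

    Assumes each field at each step is mistyped independently with the same rate.
    """
    entries = fields * rekeying_steps
    return 1.0 - (1.0 - per_field_error_rate) ** entries


# Hypothetical figures: 1 percent error per field, 20 fields, 3 manual steps.
risk = chance_of_any_error(0.01, fields=20, rekeying_steps=3)
print(f"Chance that a record contains at least one error: {risk:.0%}")
```

Under these assumed numbers, close to half of all records would pick up at least one error, even though each individual entry is right 99 percent of the time. Removing a manual step reduces the exponent, which is why importing metadata from the file helps more than additional training does.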

Improving reliability

Given the critical nature of metadata to our workflows, and given the serious difficulties manual systems create, user organizations are passionate about addressing these issues. Unfortunately, the reality is that it is often easier to hire another person and train him or her to do the manual operation than to undertake what usually appears to be a huge, multiyear, multimillion-dollar media system project. That said, users and manufacturers can take several steps to improve their systems now:

Use MXF — MXF has become the standard for open, interoperable professional media interchange. The entire industry will be helped as more companies continue to move to a standardized file format for media interchange.
Use application specifications — The AMWA has created application specifications specifically designed to improve file-based interoperability in particular use cases. Using these specifications will increase the likelihood of successful integrations, and will reduce the variability found in “wild west” MXF files.
Use shims — Many AMWA Application Specifications include the concept of a shim. Think of a shim as a tape delivery specification. In the days of tape, many organizations specified the tape formats they would expect, the video format (e.g. NTSC), the audio format (stereo, or mono on both channels, for example), and how and where timecode should be coded. A shim in the file-based world performs a similar function. Many application specifications allow variability within a range — for example, the application specification AS-03 for finished SD program delivery allows MPEG-2 compression at rates between 5Mb/s and 50Mb/s. But, it may be that you only want to accept 15Mb/s at your facility. A user-specified shim allows you to set your own specifics, as well as other “shim-able” AS-03 parameters.
Use UMIDs properly — Last month, I talked about the UMID, and how it should be used. Proper use of UMIDs allows implementers to build systems that reliably deliver correct content efficiently.
Look for MXF metadata import opportunities — Look carefully at workflows and determine where metadata embedded in a file can reduce rekeying and eliminate errors.
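The relationship between an application specification and a shim can be sketched as two nested constraints. The example below uses only the AS-03 bit-rate range quoted above; the class and function names are hypothetical, and the actual specification constrains far more than bit rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitRateRule:
    min_mbps: float
    max_mbps: float

    def allows(self, bit_rate_mbps):
        return self.min_mbps <= bit_rate_mbps <= self.max_mbps


# Range quoted in this column for AS-03; everything else in the spec is omitted.
AS03_BIT_RATE = BitRateRule(5.0, 50.0)


def make_shim(spec, min_mbps, max_mbps):
    """Build a facility shim; it may narrow the specification but never widen it."""
    if min_mbps > max_mbps or not (spec.allows(min_mbps) and spec.allows(max_mbps)):
        raise ValueError("a shim must fall inside the application specification")
    return BitRateRule(min_mbps, max_mbps)


def check_delivery(bit_rate_mbps, spec, shim):
    """Check an incoming file first against the specification, then the shim."""
    if not spec.allows(bit_rate_mbps):
        return "rejected: outside application specification"
    if not shim.allows(bit_rate_mbps):
        return "rejected: outside facility shim"
    return "accepted"


# A facility that accepts only 15 Mb/s, as in the example above.
facility_shim = make_shim(AS03_BIT_RATE, 15.0, 15.0)
for rate in (15.0, 25.0, 60.0):
    print(rate, check_delivery(rate, AS03_BIT_RATE, facility_shim))
```

Keeping the two checks separate shows a supplier whether a file violates the shared specification or only one facility's narrower delivery requirements.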

Summary

Last month, we talked about different types of metadata and the importance of unique identifiers in file-based workflows. This month, we have examined issues surrounding metadata re-entry and introduction of errors, and we have looked at ways to reduce errors and increase interoperability using open, standards-based formats. We also talked about the business drivers that are spurring media companies to move forward with file-based workflows. Metadata is the key to unlocking the true power and savings of this new technology.

Brad Gilmer is executive director of the Advanced Media Workflow Association, president of Gilmer & Associates, and executive director of the Video Services Forum.

Send questions and comments to: brad.gilmer@penton.com