Professional video and film file formats


Figure 1. Essence and metadata are contained in a wrapper. The things you see and hear, along with information about those things, are contained in a file.

It is clear to many people in the industry that file transfer will be the dominant technology used to move video and audio between systems in the future. Already, completed programs and compositions are being moved around facilities as files. It is important for us to start thinking about file-based workflow and to anticipate how file-based production will change the facilities we build.

As Figure 1 shows, a typical file contains essence (the things you see and hear on the air) contained in a wrapper (a file format). One of the first file formats to see regular use was Digital Picture Exchange (DPX). Although it has been in the field for some time, DPX is still commonly used for film transfers. Film goes into a scanner, the scanner digitizes each frame of the film, and DPX files come out. A DPX file typically contains one film still-frame. As you can imagine, DPX file collections can become rather large.

This may seem like an unusual way to construct a file format, but it really makes sense in this application. Film work is frequently done on a frame basis; color correction, scratch removal and many other operations are all frame-based (or scene-based) functions.

Also, given past limitations on storage and memory, it made a lot of sense to load in a sequence of files representing the area of interest in a project without having to load the entire project. DPX allows you to do this by only loading the files of interest.
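As a rough illustration of loading only the frames of interest, the Python sketch below picks a range of frames out of a directory listing. The naming pattern (a base name, an underscore, a frame number and a .dpx extension) and the function name frames_of_interest are assumptions made for this example; DPX itself does not dictate how files are named.

```python
import re

# Assumed naming convention (not part of DPX): <name>_<frame number>.dpx
FRAME_PATTERN = re.compile(r"^(?P<name>.+)_(?P<frame>\d+)\.dpx$", re.IGNORECASE)


def frames_of_interest(filenames, first, last):
    """Return the DPX file names whose frame number lies in [first, last], in frame order."""
    selected = []
    for filename in filenames:
        match = FRAME_PATTERN.match(filename)
        if match is None:
            continue  # not a DPX frame under the assumed convention
        frame = int(match.group("frame"))
        if first <= frame <= last:
            selected.append((frame, filename))
    return [filename for _, filename in sorted(selected)]


# A hypothetical directory listing; only frames 2 and 3 are needed for this scene.
listing = ["reel1_0003.dpx", "reel1_0001.dpx", "notes.txt", "reel1_0002.dpx", "reel1_0004.dpx"]
scene = frames_of_interest(listing, 2, 3)
```

Because each frame is its own file, the selection never has to open or parse the frames that fall outside the requested range.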

The first file format commonly used for the exchange of content between servers in the broadcast environment was the General Exchange Format (GXF). GXF is still in wide use in the industry and is supported by several manufacturers. The designers of GXF recognized early on that it was extremely important to move metadata in addition to video and audio when doing a transfer. They also placed video and audio on a timeline, and they interleaved video and audio content so that the material could be played out continuously as soon as reception of the file had begun.

While GXF is still in use, it now appears that the dominant, standards-based file format in the broadcast space will be the Material eXchange Format (MXF). MXF is a powerful enabling technology. It allows manufacturers to transfer video and audio (essence), along with information about the video and audio (metadata). MXF has a particularly well-defined set of essence descriptors. Essence descriptors describe the technical aspects of the video or audio in extreme detail. This high level of detail is required so that the receiving piece of equipment can determine at the outset whether it can play back the video or audio contained in the file. This information may also be useful later during post-production, where acquisition settings such as frame rate, overcrank, etc., are important. MXF also defines Descriptive Metadata Scheme 1 (DMS-1), which provides basic descriptive metadata. You can think of this as information that used to be contained on the tape label and on the cue sheet inside the cassette case.
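The idea of checking a descriptor before attempting playback can be sketched in a few lines of Python. The class names, the handful of fields and the can_play function are hypothetical simplifications chosen for this example; real MXF essence descriptors carry far more detail and are not laid out this way.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class EssenceDescriptor:
    # Simplified, hypothetical subset of what a descriptor might state.
    codec: str
    width: int
    height: int
    frame_rate: Fraction


@dataclass(frozen=True)
class PlayerCapabilities:
    # Hypothetical description of what a receiving device supports.
    codecs: frozenset
    max_width: int
    max_height: int
    frame_rates: frozenset


def can_play(descriptor, player):
    """Decide from the descriptor alone whether the player can handle the essence."""
    return (
        descriptor.codec in player.codecs
        and descriptor.width <= player.max_width
        and descriptor.height <= player.max_height
        and descriptor.frame_rate in player.frame_rates
    )


server = PlayerCapabilities(
    codecs=frozenset({"MPEG-2", "DV"}),
    max_width=1920,
    max_height=1080,
    frame_rates=frozenset({Fraction(25), Fraction(30000, 1001)}),
)
clip = EssenceDescriptor("MPEG-2", 1920, 1080, Fraction(30000, 1001))
other = EssenceDescriptor("JPEG 2000", 1920, 1080, Fraction(25))
playable = can_play(clip, server)
unplayable = can_play(other, server)
```

The point of the sketch is that the decision uses only the metadata; no essence has to be decoded to find out that the second clip would fail.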


Figure 2. There are subtle differences between MXF OP-1A files and OP-Atom files. OP-1A files are interleaved and contain multiple essence types. OP-Atom files contain only one essence type.

Two MXF Operational Patterns (OPs) are in common use: OP-Atom and OP-1A. The difference between the two is subtle but important. (See Figure 2.) Applications that create OP-Atom files create a separate file for each “track”: for example, one video file, one audio file (or more), one data essence file and so on. Each file contains only one essence type. OP-1A files contain video, audio and data essence all in one file. The essence types are interleaved so that they can be played out in a streaming fashion like conventional videotape.
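A toy Python model may make the difference in layout easier to picture. It treats each track as a list of per-frame chunks and ignores everything else a real MXF file contains (header metadata, partitions, index tables); the function names and the ".mxf" file names it generates are assumptions for this example only.

```python
def interleave_op1a(tracks):
    """Toy OP-1A layout: one sequence in which each frame's chunks from every track sit together."""
    frame_count = min(len(frames) for frames in tracks.values())
    single_file = []
    for index in range(frame_count):
        for track_name, frames in tracks.items():
            single_file.append((track_name, frames[index]))
    return single_file


def split_op_atom(tracks):
    """Toy OP-Atom layout: one 'file' per track, each holding a single essence type."""
    return {f"{track_name}.mxf": list(frames) for track_name, frames in tracks.items()}


# Two frames of video and the matching audio, as placeholder strings.
tracks = {"video": ["V0", "V1"], "audio": ["A0", "A1"]}
op1a = interleave_op1a(tracks)
atom = split_op_atom(tracks)
```

In the interleaved result, a player that has received the first two chunks already holds everything it needs for frame 0, which is why this layout suits play-out while the file is still arriving.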

When it comes to post-production, the Advanced Authoring Format (AAF) dominates in the editing, compositing and archiving environment. AAF is quite similar to MXF. AAF and MXF share the same object model; in simple terms, AAF and MXF applications use the same names for things. AAF and MXF applications also put things in the same place within the object model. As you can imagine, this is extremely handy when it comes to transferring content between broadcast and post. Where AAF and MXF differ is that AAF has a rich language that allows applications to describe, in detail, how a composition is built — how various video and audio clips, layers of graphics and so on were put together to yield the finished piece. One important distinction between AAF and MXF is that MXF supports streaming while AAF does not.

When thinking about file formats and workflow, it may help to look at a couple of use cases. A camera operator out in the field uses an MXF camera to capture a number of shots, which are to be edited into a news story. When the camera operator returns from the field, he plugs his camera into a network and downloads the files off the camera onto a central server. Once the content is on the server, the editor opens the files and begins to edit. After the story is edited and approved, it is transferred to a playback server so that it can be included in the evening newscast. Finally, the MXF file is ingested into an archive.

In another use case, MXF cameras are used to acquire content for a television show. When shooting is over, the content is downloaded to a server. Editors now access the content on the server and begin working on the show. Their AAF-enabled equipment reads in MXF files, and a project is begun. The project is stored in AAF files, which contain all of the information about the compositions created from the MXF files. The AAF files contain all of this metadata, but they contain none of the actual images or audio. The source video and audio are stored in MXF files, which are referenced (pointed to) by the AAF files. Thus, the AAF files contain complex editing information, but the MXF files still remain relatively simple essence containers. When the editing session is finished, the final composition is rendered to an MXF file for transmission. The AAF file and associated MXF files are archived, preserving both the essence and composition information.
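The relationship between a composition and its source files can be sketched as a small Python data model. This is not the AAF object model; the class names, fields and file names are hypothetical and exist only to show a composition that points at essence rather than containing it.

```python
from dataclasses import dataclass, field


@dataclass
class ClipReference:
    # Points at essence stored elsewhere; holds no picture or sound itself.
    mxf_path: str
    start_frame: int
    length: int


@dataclass
class Composition:
    name: str
    clips: list = field(default_factory=list)

    def duration(self):
        """Total length of the composition in frames."""
        return sum(clip.length for clip in self.clips)

    def referenced_files(self):
        """Unique source files to archive with the composition, in order of first use."""
        seen = []
        for clip in self.clips:
            if clip.mxf_path not in seen:
                seen.append(clip.mxf_path)
        return seen


episode = Composition(
    "episode_01",
    [
        ClipReference("cam_a.mxf", start_frame=100, length=50),
        ClipReference("cam_b.mxf", start_frame=0, length=75),
        ClipReference("cam_a.mxf", start_frame=400, length=25),
    ],
)
total_frames = episode.duration()
to_archive = episode.referenced_files()
```

Here the same camera file is used twice without being copied, and the list of referenced files is what an archive would need to store next to the composition.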

Building file-based facilities opens up exciting new opportunities for improvements in workflow. People can access the same content simultaneously. Management tools can be designed that will notify users when a particular production process is completed. Archive tools that automatically parse incoming AAF and MXF files can be created, aiding in later retrieval of this content. Expect exciting new developments as manufacturers and users begin to take full advantage of networked infrastructures.

Brad Gilmer is executive director of the AAF Association, executive director of the Video Services forum, and president of Gilmer & Associates, a consulting firm.

Send questions and comments to: brad_gilmer@primediabusiness.com