We are officially in a file-based world. That is not a shock; we have been moving that way for years. Now even cameras are spoken of as recording files, so I guess we have crossed the equivalent of the Rubicon. When Caesar crossed that celebrated river, he broke a Roman law. Progress in several technologies helped our industry to make this transition more than 20 years after nonlinear editors began working with files in earnest.
This first installment of a two-part series will look at the underpinnings of what is generally called file-based workflow, defining terms and setting the reference points in the discussion. Make no mistake: using files instead of physical media is a radical change in how we work. British writer Sydney Smith said, “To do anything in this world worth doing, we must not stand back shivering and thinking of the cold and danger, but jump in, and scramble through as well as we can.”
Key concepts of digital content
The change mentioned above is brought on by embracing the digital nature of content, which is well described in the book “Being Digital” by Nicholas Negroponte.
There are three overarching concepts, spelled out more than a decade ago by the EBU/SMPTE Task Force on Harmonized Standards for the Exchange of Television Program Material as Bit Streams. First is essence. Essence is the content of a program: video and audio, and perhaps data for interactivity or other applications. Second is metadata, which most simply put is bits about the essence. It describes the content in ways that will allow a user to decode and use the essence. It might, for instance, describe the encoding used (MPEG-2, H.264, JPEG 2000), the number of audio tracks, program rights information and even GPS coordinates where it was shot. Third is the idea of a wrapper. The wrapper surrounds the essence and metadata. It lets the essence and metadata be parsed and handed to decoders efficiently, without requiring a multiplicity of loosely coupled files that are difficult to manage. Several wrappers are already familiar to us, including MXF (Material Exchange Format, standardized by SMPTE), AVI and QuickTime.
It might be useful to look at an analogy. SDI and HD-SDI are similar to baseband wrappers. They carry video and audio, as well as metadata. There are important differences, however. SDI (and HD-SDI) does not announce the presence of services. Audio is always in the same place in the signal, so if a decoder is looking for audio, it knows where to look. Metadata is a bit less obvious, but as with audio, it does not announce its presence. One must know what to look for and where. With wrappers in file-based operations, the wrapper announces what is being carried and points to where it is. Thus, each file is self-referencing. As soon as it is received, an application can inventory the contents of the wrapper and distribute information to other applications as needed. (See Figure 1.)
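The self-referencing idea can be sketched in a few lines. This is a toy model and not MXF: the Wrapper and Track classes, their field names and the sample values are all hypothetical, chosen only to show how an application could inventory a file from the wrapper alone.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    kind: str      # "video", "audio" or "data"
    coding: str    # e.g. "MPEG-2", "PCM"
    offset: int    # byte offset of the essence inside the file
    length: int    # number of essence bytes


@dataclass
class Wrapper:
    metadata: dict = field(default_factory=dict)
    tracks: list = field(default_factory=list)

    def inventory(self):
        """List what the file carries and where, without reading the essence."""
        return [(t.kind, t.coding, t.offset) for t in self.tracks]


# Hypothetical clip: one video track followed by one audio track.
clip = Wrapper(
    metadata={"title": "alice", "rights": "hypothetical"},
    tracks=[
        Track("video", "MPEG-2", 4096, 1_000_000),
        Track("audio", "PCM", 1_004_096, 200_000),
    ],
)
```

Calling clip.inventory() returns one entry per track, which is the file-based counterpart of knowing in advance where audio sits in an SDI signal.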
Another key concept is that dubbing content to make copies is no longer appropriate. When you copy a file, you create another instance that's totally indistinguishable from the original. If a file is altered, perhaps with closed captions, or maybe the content is edited, you create a new version. Managing the multiplicity of instances and versions can be one of the more difficult aspects of a file-based facility. It requires a good asset management system to fully track versions and instances (and instances of versions). Later, I'll discuss some tools that allow complex descriptions of content to be part of the wrapper in ways that facilitate standardized program interchange and flexible creation of versions from a set of related essence elements.
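One simple way to tell an instance from a version is to compare content hashes. The sketch below assumes that approach; the function names and sample bytes are hypothetical. Note that a hash only says whether two files are bit-identical. Knowing that two different files are versions of the same program still requires the lineage kept by an asset management system.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash: identical bytes always give an identical fingerprint."""
    return hashlib.sha256(data).hexdigest()


def classify(original: bytes, candidate: bytes) -> str:
    """Return 'instance' for a bit-exact copy, 'version' for altered content."""
    if fingerprint(original) == fingerprint(candidate):
        return "instance"
    return "version"


master = b"program essence"
copy_on_second_server = bytes(master)            # a plain file copy
captioned = master + b" + closed captions"       # the content was altered
```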
Another important, though obvious, point is that all file-based facilities are network-based, IT-centric installations. That said, some aspects of a file-based world still need to bridge to the linear analog world. Ingest in news is an obvious example of a hybrid workflow where file-based and linear worlds must merge seamlessly into a whole. Most release-to-air facilities similarly are in the hybrid camp. It is critical to understand that a file-based workflow is inherently not synchronous and time-dependent; it is asynchronous and loosely, if at all, coupled to time. There are methods of treating files as time-dependent entities; streaming is the clearest example.
Generally, networks deliver best-effort movement of content, which is not a great way to run master control. To get beyond that, one must carefully design the applications and control the architecture of the network on which the system lives. When this is done well, it works transparently.
The issue of synchronicity is central to many discussions about file-based workflow. Master control must deliver content reliably concatenated and locked to a delivery time. The goal of file-based master control is to provide the illusion that the content is seamless and continuous. To do that in the DTV world means that metadata must be delivered with essence. Even embedded metadata, such as dialnorm and DynRng, is tricky to deliver in a complete system.
Consider the case of a fade between two sources. Picture and sound are easily handled, but how does one transition metadata? Most implementations today pick a point in the essence transition and simply chop to the new source of metadata, a decidedly inelegant answer to a growing problem. The same problem exists in AFD and other classes of metadata.
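The asymmetry is easy to see in code. In this minimal sketch, essence values are mixed smoothly while metadata is simply cut over at a chosen point in the fade. The function names, the 0.5 switch point and the sample metadata values are assumptions for illustration, not taken from any particular product.

```python
def crossfade(a: float, b: float, t: float) -> float:
    """Linear mix of two essence sample values; t runs from 0.0 (all A) to 1.0 (all B)."""
    return (1.0 - t) * a + t * b


def metadata_during_fade(meta_a: dict, meta_b: dict, t: float,
                         switch_at: float = 0.5) -> dict:
    """Metadata is not mixed here: use source A until switch_at, then source B."""
    return meta_a if t < switch_at else meta_b


# Illustrative values only.
src_a = {"dialnorm": -24, "afd": "16:9"}
src_b = {"dialnorm": -27, "afd": "4:3"}
```

A quarter of the way through the fade, the picture is already 25 percent source B, yet the decoder is still told to treat the audio and aspect ratio entirely as source A.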
There is a second problem with metadata management that is not immediately obvious. There are times when metadata needs to be separately processed, such as when rights information is updated. That means bytes need to be copied out of a container, transmitted to a program where they are modified, and later replaced in the original file, or perhaps a new version. It is possible that two operations are happening at the same time, leaving a complicated resyncing of metadata to be done. In the last few months, work has begun on a standard way to communicate that metadata, likely based on the SMPTE BXF standard. The metadata is expressed in XML, and BXF is itself built on XML, so this appears to be a logical approach.
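To make the resyncing problem concrete, here is a minimal sketch of a three-way merge, assuming the metadata can be treated as simple key-value pairs. It is not BXF and not any standardized method; the function name and the sample fields are hypothetical.

```python
def merge_metadata(base: dict, edit_a: dict, edit_b: dict):
    """Three-way merge of two edits made against the same base metadata.

    Returns (merged, conflicts). A key conflicts when both edits changed it
    to different values; the base value is kept until someone resolves it.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(edit_a) | set(edit_b)):
        b, x, y = base.get(key), edit_a.get(key), edit_b.get(key)
        if x == y:
            value = x            # both edits agree (or neither changed it)
        elif x == b:
            value = y            # only edit_b changed this key
        elif y == b:
            value = x            # only edit_a changed this key
        else:
            value = b            # both changed it differently
            conflicts.append(key)
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"rights": "US", "title": "alice"}
rights_update = {"rights": "US+CA", "title": "alice"}
caption_update = {"rights": "US", "title": "alice", "captions": "en"}
merged, conflicts = merge_metadata(base, rights_update, caption_update)
```

Here the two operations touched different fields, so the merge is clean. Had both changed the rights field, the key would be reported as a conflict for a person or a rules engine to resolve.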
Structures for managing digital content
Earlier I mentioned structures for managing files and metadata. The good news is that MXF provides the framework for carrying a multitude of content types. The bad news is that there are a ton of options, including 10 different operational patterns. The standard encompasses hundreds of pages.
Some have claimed that MXF compliance is a recipe for client confusion and lack of interchange. While overstated, it is true that narrower standards force more controllable interchange. The Advanced Media Workflow Association (AMWA), whose predecessor was AAF, is an industry interest group made up of users and manufacturers that works out application specifications and related “shims,” which describe the constraints on content and the MXF implementation to be used. This allows a narrowly defined implementation of MXF to accommodate a specific application. Two worth mentioning are AS-02 Versioning (sometimes called the MXF Mastering Format) and AS-03 MXF Program Delivery, which targets file delivery of programming to stations. The first user of AS-03 is PBS, which will be using it (with its specific shim) to deliver programs to edge servers ready for use at PBS stations. Testing is in progress now.
By specifying AS-03, the content and container are tightly defined, including video and audio encoding choices like bit rate and coding structure, number of tracks and the metadata structure, which carries the details about the content. A server manufacturer that can play an AS-03/PBS file must be capable of dealing with the details of the essence. Simply saying a server is MXF-compliant leaves the question of bit rate, coding standard and structure, number of audio tracks, etc., undefined. AMWA has carried it to the next level, making applications for the real world work.
AS-02 solves a different problem. In linear workflow, a version is a generation down from the master, and if many versions are necessary, many copies must be created and kept in an inventory. Using AS-02, a new version with German translation and subtitles, for example, could be created by adding tracks to a master, which can then be played out with only the new tracks varying between different air versions.
In 2007, at the NAB Show, a nascent AMWA led by Turner Entertainment demonstrated the concept at work, with multiple manufacturers showing interchange of essence and metadata seamlessly. At last fall's SMPTE Annual Technical Conference, AMWA put on an even more impressive demonstration. Figure 2 shows two .mxf files in the “alice” directory, each of which would specify a construction with various tracks needed to fulfill program requirements. To add another language to the library, a new track is recorded and dropped into the folder (for example, alice_a2.mxf). To play it, you might create alice_v3.mxf, specifying alice_v0.mxf and alice_a2.mxf.
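The idea of a version as a set of references can be modeled very simply. The sketch below is not the AS-02 file structure; it only treats each version as a list of the files it specifies. Only alice_v3.mxf and its two references come from the example above. The second entry and the function name are invented for illustration.

```python
from collections import Counter

# Each version lists the files it plays.
versions = {
    "alice_v3.mxf": ["alice_v0.mxf", "alice_a2.mxf"],
    "alice_example.mxf": ["alice_v0.mxf", "alice_a1.mxf"],   # hypothetical entry
}


def shared_essence(versions: dict) -> list:
    """Files referenced by more than one version, and so stored only once."""
    counts = Counter(f for refs in versions.values() for f in set(refs))
    return sorted(f for f, n in counts.items() if n > 1)
```

Both versions point at the same alice_v0.mxf, so adding a language costs one new track file rather than a full copy of the program.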
File-based workflow depends on access to content, but it does not necessarily require instant access to any content. Hierarchical storage is almost always part of a complete file-based workflow. John Watkinson, author of “The MPEG Handbook,” once told a SMPTE conference that it was time the industry realized that we don't care where the content is stored; we only care that when we ask to see it, it shows up. So it is less important to know what directory an item is stored in than to be able to use a software system to gain access under reasonable requirements. If a file is needed for air in a week, it does not need to be in online storage, but perhaps it is time to make a request to the queue for the archive to retrieve it within a day of air.
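The scheduling logic behind that request can be sketched as follows, assuming a fixed one-day lead before air. The function names, the default lead and the sample date are hypothetical; a real archive manager would also account for queue depth and transfer time.

```python
from datetime import datetime, timedelta


def restore_request_time(air_time: datetime,
                         lead: timedelta = timedelta(days=1)) -> datetime:
    """Moment by which the restore must be done so the file is online `lead` before air."""
    return air_time - lead


def needs_restore_now(now: datetime, air_time: datetime,
                      lead: timedelta = timedelta(days=1)) -> bool:
    """True once the restore deadline for this item has been reached."""
    return now >= restore_request_time(air_time, lead)


air = datetime(2010, 3, 8, 20, 0)   # arbitrary sample air time
```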
By structuring storage in ways that support the intended system utilization, significant expansion in capacity and simultaneous reduction in cost are often achieved. Video servers require high-speed, high-availability storage to keep the decoder queue filled at all times. That requirement for high performance creates the most expensive storage. A compromise would be to have nearline spinning disk available with lower performance, but much more capacity at a lower cost. Behind that often resides a robotic library of tape or DVD storage, which provides long-term backup and the lowest cost, but slow access for both read and write. Systems like this are not possible without good management of the assets with MAM or archive management software.
There are interesting dynamics at play in the choice of storage of media. Disk prices continue to fall, and even solid-state disks have reached practical economics. One video server manufacturer offers a green solution with entirely solid-state storage. The argument for using tape instead of huge farms of spinning disk is based on several factors, including cost per gigabyte, power consumption and cooling, MTBF, and the need to have backup copies stored off-site. All of these factors except for power consumption decline in impact over time. By the end of this decade, HDD storage should cost less than 1 percent of current costs, or a capacity 100-fold larger will cost the same as today. Tape storage costs will also decline and access speed will increase, but likely not in direct sync with disk space costs. (See Figure 3.)
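It is worth checking what a 100-fold drop implies per year. The sketch below assumes, purely for illustration, that the decline is spread evenly over 10 years; the text does not state the horizon, and the function name is hypothetical. Under that assumption the price would have to fall by roughly 37 percent every year.

```python
def annual_decline(total_factor: float, years: float) -> float:
    """Yearly fractional price drop that compounds to a total_factor reduction."""
    return 1.0 - total_factor ** (-1.0 / years)


# Assumed 10-year horizon: 100-fold cheaper overall.
rate = annual_decline(100, 10)

# Sanity check: compounding the yearly drop recovers 1 percent of the starting cost.
remaining = (1.0 - rate) ** 10
```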
Choosing storage options is also affected by the type of workflow planned. If post-production processes will need rapid access to several different layers during rendering, solid-state storage may improve throughput and performance. These are not video issues, but rather relate to good design practices for IT systems supporting video processes. The next article will discuss the skills needed in our industry in this new age.
Implementing an archive system requires attention to the rules engine that defines how content is handled. Some critical content might be recorded to nearline and archive at the time of ingest. Other short-lived content might have only a copy on spinning disk. Promos that play only once might live only in the high-performance online storage.
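A rules engine of this kind can be as simple as an ordered list of tests. The sketch below assumes three tiers and treats "spinning disk" as the nearline tier. The asset fields, the rule order and the default placement are all hypothetical, and a real archive manager would expose far richer policies.

```python
ONLINE, NEARLINE, ARCHIVE = "online", "nearline", "archive"

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    (lambda a: a["critical"], {NEARLINE, ARCHIVE}),   # copy to both at ingest
    (lambda a: a["plays"] <= 1, {ONLINE}),            # single-play promos
    (lambda a: a["short_lived"], {NEARLINE}),         # spinning disk only
]
DEFAULT = {ONLINE, NEARLINE}                          # assumed fallback


def placement(asset: dict) -> set:
    """Return the set of storage tiers that should hold a copy of the asset."""
    for matches, tiers in RULES:
        if matches(asset):
            return tiers
    return DEFAULT


feature = {"critical": True, "plays": 5, "short_lived": False}
promo = {"critical": False, "plays": 1, "short_lived": True}
```

Because the first matching rule wins, the order of the list is itself part of the policy: the promo above is also short-lived, but the single-play rule catches it first and keeps it in online storage only.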
John Luff is a broadcast technology consultant.