File Interchange, Part I

Video server users have long sought a means to exchange files between differing video and media server platforms. As server and editing system development evolved, so did a diverse and disparate set of file structures, compression schemes and internal architectures. Each manufacturer established its own set of features and performance characteristics aimed at providing the best solution for the target markets it addressed. Yet even though the input/output structures remain bound to accepted standards – primarily analog and/or digital video and audio – that is where the commonality of video server media structures ends.

With the arrival of sophisticated motion-JPEG algorithms, users of desktop editing platforms were the first to confront the closed, independent formats that separated one editorial platform from another. With only a few possible exceptions (certain graphics- and animation-based applications), the interchange of moving media file formats would remain isolated until a set of interchange tools and standards could be established.

One driving force behind file interchange development was the goal of streamlining workflow processes. With dozens of options available in both video server and editorial platforms, the selection of a particular product was often locked in by a previous – usually first-time – purchase of a sole-source editing system.

These comprehensive video products, especially those with a high-level user interface, focus on performance and uniqueness, narrowing the probability that one manufacturer's products will integrate easily with another manufacturer's file sets. This strategy is beginning to change as owner/operators realize the need to integrate multiple platforms into the facility. Users state that paramount consideration must be given to workflow processes that are not encumbered by the continual encoding/decoding steps otherwise necessary for interchange between devices.

THE EARLY DAYS

Early on, graphic designers created a set of software tools that allowed file formats to be exchanged through a conversion process operating on a frame-by-frame basis. Interchange among .gif, .tif, .eps and similar files offered flexibility to the graphic designer, even though the tool sets compromised resolution, colorimetry and scalability to some degree.

As graphics moved away from optical processing to digital processing – and as processors moved from minicomputers to desktops – interchange and transformation procedures demanded a more universal set of tools that could easily be transported between platforms and applications.

Moving media content (audio, video, graphics and text) appears to be taking the same course as graphics design: it’s also adding feature sets for overall media management.

Desktop processing at the file level is rapidly overtaking its predecessor tool sets, whose environments were centered around multiple videotape transports, large-scale "super edit" suites or dedicated singular black boxes with application-specific capabilities.

With compressed digital media, files need to be interchanged between a myriad of devices, platforms and physical locations. As standards and techniques for the inclusion of metadata grow from wish lists to reality, new avenues to distribution, storage and interchange also surface.

In professional applications, harmony in capabilities is leading to a set of user requirements for file interchange. User groups have narrowed these requirements for file exchange into a set of broadly defined categories (designated in Table 1).
Table 1: User Requirements for File Interchange

Professional Application     Definition of Use
Publication                  Emission, transmission, store and forward
Content Repository           Online, nearline, archive of media data
Finished Interchange         CD, DVD, linear tape
Authoring Interchange        Multimedia editorial, interactive

Next, to satisfy the transfer of program material between professional equipment in the broadcast and media content environment, a set of definitions must be created for both file transfers and stream transfers, which differ in various aspects.

File transfers are exchanged over a network with high reliability in a packet-based structure. File transfers are generally nonsynchronous – that is, they are not referenced to an external clock during the transfer. File transfers are nearly always acknowledged, meaning that packets are exchanged to signal the source that the destination received the files properly. File transfer is generally point-to-point or, for limited file sizes, point-to-multipoint.
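
To make these properties concrete, below is a minimal Python sketch of an acknowledged, packet-based transfer. The framing (a 4-byte length prefix per chunk, a short "ACK" token, a zero-length chunk as end-of-file) and all function names are illustrative assumptions, not drawn from any broadcast file-transfer standard.

```python
import socket

CHUNK = 4096   # illustrative chunk size
ACK = b"ACK"   # illustrative acknowledgment token

def recv_exact(conn, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        part = conn.recv(n - len(buf))
        if not part:
            raise ConnectionError("connection closed early")
        buf += part
    return buf

def send_file(path, host, port):
    """Source side: push the file chunk by chunk, waiting for an ACK each time."""
    with socket.create_connection((host, port)) as sock, open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                sock.sendall((0).to_bytes(4, "big"))  # zero length marks end of file
                break
            sock.sendall(len(chunk).to_bytes(4, "big") + chunk)
            if recv_exact(sock, len(ACK)) != ACK:     # destination confirms receipt
                raise IOError("chunk was not acknowledged")

def receive_file(conn, path):
    """Destination side: write each chunk, then acknowledge it to the source."""
    with open(path, "wb") as f:
        while True:
            size = int.from_bytes(recv_exact(conn, 4), "big")
            if size == 0:
                break
            f.write(recv_exact(conn, size))
            conn.sendall(ACK)
```

Note that nothing here is tied to an external clock; the transfer simply proceeds as fast as the network and the acknowledgments allow.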

Stream transfers use a datastreaming interconnect in a point-to-multipoint or broadcast mode. With a specified transfer rate that is normally synchronized to a clock (or is asynchronous), streams are generally open-ended and initiated without a predetermined start point or end point. This follows the analogy of a camera recording a live program directly to a receiving device (tape, disk or viewer).
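
The contrast with the acknowledged transfer above can be sketched the same way: this hypothetical sender pushes frames to a multicast group at a fixed rate and never waits for any receiver to confirm delivery. The group address, port and frame rate are assumptions chosen purely for illustration.

```python
import socket
import time

MCAST_GROUP = ("239.1.1.1", 5004)  # illustrative multicast group and port
FRAME_RATE = 30.0                  # the clock the stream is paced against

def stream_frames(frames):
    """Push frames to every listener on the group: no acks, no end marker."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    for frame in frames:                 # open-ended: runs as long as frames arrive
        sock.sendto(frame, MCAST_GROUP)  # point-to-multipoint, fire and forget
        time.sleep(1.0 / FRAME_RATE)     # hold the specified transfer rate
```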

For file transfers, the file format structure allows access to essence data at random or widely distributed byte positions. This means that the data may be arranged in various locations, generally requiring that the entire file transfer be completed before the data can be used. Conversely, for stream transfers, decoding is always sequential, with a file format structure that allows access to essence data only at sequential byte positions. Essence is the media content itself – the video, audio or graphic content – and is differentiated from metadata, which is the detail about the bits in the essence (file date, source, ownership, version and the like).
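
The two access patterns reduce to a simple difference in how a reader may move through the bytes, as this sketch suggests (the frame_size parameter and function names are illustrative):

```python
def read_essence_random(path, offset, length):
    """File transfer case: essence may sit at any byte position, so the reader
    seeks to it, which is only possible once the whole file has arrived."""
    with open(path, "rb") as f:
        f.seek(offset)            # jump to a widely distributed byte position
        return f.read(length)

def decode_stream_sequential(stream, frame_size):
    """Stream transfer case: bytes can only be consumed in arrival order,
    but each frame is usable as soon as it lands."""
    while True:
        frame = stream.read(frame_size)
        if len(frame) < frame_size:
            break                 # stream ended or was torn down
        yield frame
```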

DEFINING METADATA

Beyond the essence data that defines a file or stream transfer, metadata sets are generally included in the material body. At a high level, metadata is divided into two broad categories, referred to as structured and descriptive.

Structured metadata defines, for example, what sources were used in the generation of the essence as well as how the essence was edited. It also generally defines the set of data called the "essence structure." Descriptive metadata is a set of information that describes, catalogues or defines content parameters. Examples include ownership, licensing rights or episode numbers.
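
As a rough illustration of the distinction, the two categories might be modeled as follows; every field name here is a hypothetical example echoing the text, not a defined standard attribute.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class StructuredMetadata:
    """Defines how the essence was built: its sources and how it was edited."""
    sources: list[str] = field(default_factory=list)         # e.g., source clip IDs
    edit_decisions: list[str] = field(default_factory=list)  # e.g., cut records

@dataclass
class DescriptiveMetadata:
    """Describes, catalogues or defines the content itself."""
    ownership: str = ""
    licensing_rights: str = ""
    episode_number: int | None = None
```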

Metadata is further defined as either static – data that applies over the duration of the program, segment or clip – or dynamic – data that changes over time, such as timecode or keycode. Within a whole program, a variety of metadata can be applied. Fig. 1 depicts a hierarchy of subdivisions to which static metadata can be applied. Note that there is metadata for each element of essence, be it the lowest level (a track) or the highest level (the whole program). The amount of metadata included with each element of essence is user-selectable and can vary from a single unit to several units.

Static metadata applies to the whole set of program material and is generally set once at the outset, usually within the header of the program. Static metadata can be applied to a segment of the program – that is, when one or more segments are included in the program as a whole. Within segments, metadata can be applied to the duration of a scene or event. Metadata in a scene or event can refer to a timeline event or be applied to the duration between two timeline markers.
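
One way to picture the hierarchy that Fig. 1 describes is as a tree in which every node, from the whole program down to an individual track, can carry its own static metadata. The levels follow the text; the sample values are illustrative assumptions only:

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Element:
    """One node of the hierarchy; any level can carry its own static metadata."""
    name: str
    static_metadata: dict[str, str] = field(default_factory=dict)
    children: list[Element] = field(default_factory=list)

# Whole program -> segment -> scene/event -> track, metadata selectable at each level
program = Element("Program", {"title": "Example Program"}, [
    Element("Segment 1", {"sponsor": "Example Sponsor"}, [
        Element("Scene 1", {"setting": "studio"}, [
            Element("Video track", {"aspect": "16x9"}),
            Element("Audio track", {"language": "en"}),
        ]),
    ]),
])
```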

Dynamic metadata is data that continues to change over the timeline and is best handled by embedding it with the respective essence data as the metadata is created. Information such as pan-and-scan vectors for tracking the 4 x 3 window in a 16 x 9 frame is a good example of metadata that must be set specific to the essence and the time it occurs.
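
One simple way to realize that embedding, sketched in Python with hypothetical names and fields, is to yield each frame's pan-and-scan vector together with the frame it belongs to, so the pairing survives the transfer intact:

```python
from dataclasses import dataclass

@dataclass
class PanScanVector:
    """Per-frame position of the 4 x 3 window inside a 16 x 9 frame."""
    frame_number: int   # the timeline position this vector applies to
    x_offset: int       # horizontal placement of the window, in pixels

def embed_dynamic_metadata(essence_frames, vectors):
    """Interleave each essence frame with the metadata created for it,
    keeping the two time-coincident through any transfer."""
    for frame, vector in zip(essence_frames, vectors):
        yield vector, frame   # the metadata travels with its frame
```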

Metadata – particularly dynamic metadata – must be carried within the file exchange, be it file transfer or streaming transfer. The metadata must be easily extractable in order for it to be of value. While the embedding of dynamic metadata (such as timecode) is a most obvious method of transport, an alternative might be to attach a stream of metadata that precedes or directly accompanies the essence package and would allow a storage device to properly handle the metadata in time coincidence.

We have covered some important elements directly relevant to the subject of file interchange between devices. Next month we will cover the derivation of, and progress being made in, standards aimed at interchange – such as MXF, AAF and GXF – which will also aid workflow efficiencies as they are adopted and implemented.

Karl Paulsen

Karl Paulsen is the CTO for Diversified, the global leader in media-related technologies, innovations and systems integration. Karl provides subject matter expertise and innovative visionary futures related to advanced networking and IP technologies, workflow design and assessment, media asset management, and storage technologies. Karl is a SMPTE Life Fellow, an SBE Life Member and Certified Professional Broadcast Engineer, and the author of hundreds of articles focused on industry advances in cloud, storage, workflow, and media technologies. For over 25 years he has continually featured topics in TV Tech magazine, penning the magazine’s Storage and Media Technologies and its Cloudspotter’s Journal columns.