Media-aware storage

Storage from an IT perspective could be viewed as a bucket of bits in which one set of bits is treated the same as any other. This works out fairly well for the traditional IT professional, to whom one set of bits is essentially indistinguishable from any other. However, whereas the IT professional is concerned with data, and with getting that data to the users or processes that need it in an effective way, the media professional is concerned with content. For media, it is not just the communication paths, bandwidth or access that are important, but also the quality, timeliness and meaning behind the bits.

Because a media facility does not have the same goals as a traditional IT data center, there may be aspects of traditional IT storage that are not ideal for media. The main issues could be enumerated as follows:

  • Media files are much larger than other types of files.
  • Bandwidth into and out of storage systems needs to be greater when dealing with media content.
  • Media activity is often not transactional in the same way that most IT activity is. A single media file on a shared-storage network may need to be accessed simultaneously by multiple parties, so the kind of file locking applied to shared business documents cannot be used here.
  • The movement of media is a much more significant issue, due mostly to larger file sizes but also to the time-based nature of the content.
  • Media files have significant amounts of metadata associated with them. This metadata is used in the daily operations of a media facility in a way to which managers of traditional file systems are unaccustomed.
  • Security and access control of media is always a challenge, and traditional IT security setups may hinder as much as they enable.

Media-aware storage

Traditional IT storage is based on the assumption that data is heterogeneous: that it conforms to no particular pattern other than being segmented into individual files. Media-specific storage, however, doesn't need to work from this same assumption. Media files are a subset of all types of files, and so storage that is intended to handle only media can make additional assumptions about the data that it stores, specifically that it is time-based in nature. Different assumptions mean a potentially different product, one that could improve a file-based workflow.

While so-called “media-oriented storage” is widely available in the industry, “media-aware storage” as defined above has only just started to impact media workflows. Certain vendors have begun to experiment with the idea of extending IT storage concepts into the world of media to find efficiencies in dealing with the eccentricities of media files. Traditional IT technology has only recently truly stabilized its presence in the media industry, so this is a new area for media technologists.

Approaches and models

The concept of media-aware storage can mean different things when applied to different applications. There are many aspects to media management, just as there are many aspects to digital storage. Software that is used to manage storage systems can be “aware” of media in different ways. The following applications of the concept each have different repercussions on system design.

Specialized hardware and software stack

Storage file systems are application-layer constructs supported by many layers underneath — hardware, I/O connectivity, operating systems, presentation software, security, etc. (See Figure 1.) Efficiencies can be found at any of these layers by limiting bandwidth or by removing processor-intensive generic storage qualities, access prioritization or proprietary configurations.

For example, in media applications, providing data in a timely manner may be more important than providing data without error. If a storage system comes across data that has been corrupted or cannot be easily read, a traditional disk subsystem response would be to scan the data several times in an attempt to read it with all errors corrected. The media need may, in fact, be to deliver the corrupted data quickly without rescanning it (perhaps using error concealment techniques) just to keep up a consistent bandwidth.
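The trade-off between timeliness and correctness can be sketched in a few lines. The following is a minimal illustration, not a description of any real disk subsystem: the function name, the `read_once` callback and the repeat-last-frame concealment are all assumptions made for the example. A failed read is retried only while the delivery deadline allows, after which the previous good frame is delivered instead.

```python
import time
from typing import Callable, Optional, Tuple


def read_frame_with_deadline(
    read_once: Callable[[], Optional[bytes]],  # hypothetical: returns None on a read error
    last_good_frame: bytes,
    deadline_s: float,
    max_attempts: int = 3,
) -> Tuple[bytes, bool]:
    """Return (frame, concealed). Retry a failed read only while time remains."""
    start = time.monotonic()
    for _ in range(max_attempts):
        frame = read_once()
        if frame is not None:
            return frame, False
        # Stop retrying once the delivery deadline has passed.
        if time.monotonic() - start >= deadline_s:
            break
    # Conceal the error by repeating the previous good frame rather than stalling.
    return last_good_frame, True
```

A general-purpose subsystem would typically keep retrying until the data is read cleanly; here the deadline, not the error, decides when to stop.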

Partial file restoration

Media files offer a limited number of file types for a storage vendor to consider. If a vendor knows that its storage is for media only, it knows which specific file extensions (wrappers) and formats (codecs) must be supported.

Partial file restoration leverages this to find workflow efficiencies. When there is little online storage space compared to the available nearline or offline storage space, restoring only a portion of a large media file can ensure that the most valuable space is used as effectively as possible.
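As a rough sketch of the idea, suppose the storage system holds a per-frame byte-offset index for each file (an assumption for this example, as are the function and parameter names). Restoring a clip then reduces to computing one byte range to pull back from nearline storage. The sketch also assumes every frame is independently decodable; a long-GOP file would need the range extended back to the preceding key frame.

```python
from typing import List, Tuple


def byte_range_for_clip(
    frame_offsets: List[int],  # byte offset at which each frame starts
    file_size: int,            # total size of the archived file in bytes
    first_frame: int,
    last_frame: int,
) -> Tuple[int, int]:
    """Return (start, length) in bytes covering first_frame..last_frame inclusive."""
    if not 0 <= first_frame <= last_frame < len(frame_offsets):
        raise ValueError("frame range outside the indexed file")
    start = frame_offsets[first_frame]
    # The clip ends where the next frame begins, or at end of file.
    if last_frame + 1 < len(frame_offsets):
        end = frame_offsets[last_frame + 1]
    else:
        end = file_size
    return start, end - start
```

Only the returned range needs to occupy online storage; the rest of the file can stay on the cheaper tier.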

Automatic media processing

A logical extension to partial file restoration is partial file storage, or storing only pieces of whole files that can be automatically assembled or otherwise processed when they are requested by the user.

There are challenges here, such as buffer management or dealing with GOP structures, but the benefit of storage that actively manages, assembles and disassembles clips for delivery to an application is undeniable. For example, this storage could be used to package an EDL automatically, below the application layer.

This type of system would be especially useful in facilities that have spent years perfecting a tape-based workflow and depend on the linear nature of tapes and certain video servers to prepare their media. This would also allow a facility to capture portions of a single asset over several sessions, or allow an editor to save just the changed portion of an asset.
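One way to picture assembly below the application layer: an asset is stored as several pieces, and each EDL event is resolved to the pieces that cover it. The sketch below is illustrative only. The `Segment` record and `resolve_event` function are hypothetical names, stored segments are assumed not to overlap, and GOP boundaries and buffering are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    segment_id: str
    first_frame: int  # inclusive, in the asset's own frame numbering
    last_frame: int   # inclusive


def resolve_event(
    segments: List[Segment], src_in: int, src_out: int
) -> List[Tuple[str, int, int]]:
    """Map one EDL event (src_in..src_out inclusive) to the stored pieces covering it."""
    pieces = []
    for seg in sorted(segments, key=lambda s: s.first_frame):
        lo = max(src_in, seg.first_frame)
        hi = min(src_out, seg.last_frame)
        if lo <= hi:
            pieces.append((seg.segment_id, lo, hi))
    # Every requested frame must be covered exactly once.
    covered = sum(hi - lo + 1 for _, lo, hi in pieces)
    if covered != src_out - src_in + 1:
        raise LookupError("stored segments do not cover the requested frames exactly")
    return pieces
```

The same lookup would support capturing an asset over several sessions, since each session simply adds another segment to the list.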

Automatic cataloging

A storage system can also make assumptions about which metadata is useful to catalog. This would allow the metadata that can be automatically generated to collect at the storage layer instead of at the application or media asset management (MAM) layer. Media-aware storage could automatically catalog duration, format, source, frame rate, resolution, color depth or any other data that lives in the essence wrapper.

Storing this data in the file system provides faster recall than a database-style MAM system, which would store data in a separate file that must be kept consistent with the file system. Databases have become more efficient in the past 30 years, but cataloging asset metadata with the storage system can simplify the design of media management applications and prevent the need for a separate database.

Video processing components can be used at the storage layer to gather more metadata automatically, and other wrappers or VANC data can be leveraged as well. Rights data that is associated with the media asset could automatically inform the access control of the storage system by mapping the rights in the media asset to the users/groups of the system.
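A minimal sketch of what cataloging at ingest might look like follows. It assumes the wrapper header has already been parsed into a plain dictionary; the field names, the `CatalogEntry` record and the `RIGHTS_TO_GROUPS` table are all hypothetical, chosen only to show technical metadata and rights-derived access groups being recorded together.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Any, Dict, Set

# Hypothetical mapping from rights tags carried with an asset to storage groups.
RIGHTS_TO_GROUPS: Dict[str, Set[str]] = {
    "broadcast": {"playout", "editorial"},
    "online": {"web-publishing"},
}


@dataclass
class CatalogEntry:
    wrapper: str
    frame_rate: Fraction
    duration_frames: int
    width: int
    height: int
    allowed_groups: Set[str] = field(default_factory=set)

    @property
    def duration_seconds(self) -> float:
        return float(self.duration_frames / self.frame_rate)


def catalog_on_ingest(header: Dict[str, Any], rights_tags: Set[str]) -> CatalogEntry:
    """Build a catalog entry from already-parsed wrapper header fields."""
    groups: Set[str] = set()
    for tag in rights_tags:
        groups |= RIGHTS_TO_GROUPS.get(tag, set())  # unknown tags grant nothing
    return CatalogEntry(
        wrapper=header["wrapper"],
        frame_rate=Fraction(header["frame_rate_num"], header["frame_rate_den"]),
        duration_frames=header["duration_frames"],
        width=header["width"],
        height=header["height"],
        allowed_groups=groups,
    )
```

Keeping the frame rate as an exact fraction avoids rounding drift for rates such as 30000/1001 when durations are converted to seconds.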

Proxy workflow

The concept of a proxy workflow is already well understood in the media industry. By expanding the concept to include more than simply a low-resolution and a full-resolution version, the concept becomes less about saving bandwidth and processor power and more about providing the right version of the asset to the right user or application. Business rules at the storage level could be used to present different-sized assets to different applications.

By combining a storage system that manages proxies with a rich security infrastructure, a facility would be able to prevent unauthorized users/applications from accessing or even knowing about other available resolutions or formats of an asset.
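The combination of proxy management and access control could be sketched as follows. This is an assumed, simplified rule set, not any vendor's implementation: each rendition names the single group allowed to see it, and the requesting application states the largest picture height it can use. The `Rendition` record and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int          # picture height in lines
    required_group: str  # group that may see this rendition


def visible_renditions(renditions: List[Rendition], user_groups: Set[str]) -> List[Rendition]:
    """Renditions outside the user's groups are not listed at all."""
    return [r for r in renditions if r.required_group in user_groups]


def pick_rendition(
    renditions: List[Rendition], user_groups: Set[str], max_height: int
) -> Optional[Rendition]:
    """Return the largest visible rendition that fits the application's limit."""
    candidates = [
        r for r in visible_renditions(renditions, user_groups) if r.height <= max_height
    ]
    return max(candidates, key=lambda r: r.height, default=None)
```

Because filtering happens before selection, a user without the right group never learns that a higher-resolution version exists.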

Format independence

The ultimate goal is this: Different applications requesting the same media asset get different versions of that asset, depending on the resolution and format that the application is expecting, and the storage system is responsible for generating and managing that versioning.

Format independence like this is especially useful for the exploding world of new media distribution. Contemporary production houses must generate tens or hundreds of differently formatted media assets as they hypersyndicate their content. Distribution requires not only SD and HD versions of TV material, but also streaming and electronic sell-through versions for online distribution. If the storage system knows how these assets are derived from one another, it can present the appropriate format to the appropriate distribution channel with minimal involvement from production staff.
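If the storage system records how each format is derived, serving a request for a format that is not yet stored becomes a matter of walking that record back to the nearest stored version. The sketch below assumes each format has exactly one source format; the function name and the format labels are illustrative only, and the actual transcoding is left out.

```python
from typing import Dict, List, Set, Tuple


def derivation_plan(
    target: str,
    derived_from: Dict[str, str],  # format -> the format it is made from
    stored: Set[str],              # formats that already exist in storage
) -> List[Tuple[str, str]]:
    """Return the (source, result) transcode steps needed to produce target, in order."""
    steps: List[Tuple[str, str]] = []
    seen: Set[str] = set()
    current = target
    while current not in stored:
        if current in seen or current not in derived_from:
            raise LookupError(f"no stored source leads to {target!r}")
        seen.add(current)
        source = derived_from[current]
        steps.append((source, current))
        current = source
    steps.reverse()  # run the step closest to the stored version first
    return steps
```

For example, with `derived_from = {"hd": "master", "sd": "hd", "web": "sd"}` and only the master and HD versions stored, a request for the web version yields two steps: HD to SD, then SD to web.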

Service-oriented storage

As the concepts of media-aware storage are explored, a trend begins to emerge: The best way to service the complex storage needs of media facilities is to combine streamlined IT storage with value-added media services. “Services” in this context could describe an actor in a service-oriented architecture (SOA), wherein every component of the system has a purpose at the business level and acts independently of the other components. Transcoding to provide format independence, for example, is a service to the storage system, because it adds value to the media workflow and provides a basic, business-oriented building block for media facilities.

A key aspect to SOA is the technology-independent interoperability of services. A higher-level storage service composed of multiple infrastructure services providing capabilities such as basic metadata logging, media movement and transcoding should, according to the precepts of SOA, expose all of these interfaces programmatically to external systems in addition to leveraging them internally. Based on this exploration of media-aware storage, the services that could fall under the storage “umbrella” are as follows:

  • archive;
  • media movement;
  • cataloging;
  • transcoding and format conversion;
  • metadata management;
  • rights and clearances;
  • versioning; and
  • editorial (to a limited extent — mainly the storage and processing of EDLs).

This group of functionality could be termed the “media service.”

Future of media-aware storage

Moving forward, some of these models will, no doubt, have greater “sticking power” than others. In 10 or even five years, the status of media-aware storage in the industry will look very different than it does today. It is almost guaranteed that some of the technology trends surfacing this year will be out of vogue by that time, while others will be around and integrated into the industry's storage concept in a much more permanent way.

The concept of a media service is becoming increasingly necessary in a world where complex, file-based facilities are being constructed in less time for less cost. There is a strong need here, however, for standardization. Initially, every vendor will have its own definition of service-oriented storage, so the various solutions in this space will in no way be comparable.

The media equation is a complex one, so the generally accepted IT principles about storage (including recent trends toward service orientation) do not necessarily mesh with the needs of the media world. Therefore, the best solution for the industry is a vertically integrated one (a media service) that can, itself, interact in the horizontally integrated world of the modern media facility.

Joey Faust is a consultant with National Teleconsultants and co-author of the 2008 book “The Service-Oriented Media Enterprise.” Faust presented a longer version of this article at the 2008 SMPTE convention. A copy of that version is available from www.smpte.org.