Archive systems

I remember well the archive system in the first television station I worked at. The news department stored reels of film in metal cans on metal shelves, with file cards denoting the can's contents. I am sure someone had a record of what was spliced onto each reel, but I would also bet that the records were accurate only part of the time.

In our slightly more modern age, archive systems take on a distinctly different connotation. Though much of the television business's long-term archive is still stored in film cans and on quad tape reels kept in deep, cold storage mines, the industry is interested in an all-electronic data archive of media content. By borrowing the technology used for decades in the mainframe computer industry, we are embracing a mature technology for a “new” application.

There are several things an archive must do to be useful. It must store the contents of the program with high reliability and for long periods of time. It must also allow the content to be retrieved with a minimum of fuss. It must provide sufficient write and read speed to work for the intended purpose. And the capacity must either be sized to hold a known mass of content or at least be expandable to achieve long-term usefulness.

This article will look only at long-term robotic archives. It is important to think of storage in terms of deep archive, nearline storage or archive, and online media. In this sense, nearline hard disk storage is a form of archive, but it is outside the intent of this article. Think of storage in terms of multiple locations, all linked by a media asset management system that tracks all instances of the content.

Linear data tape vs. DVD archive

The most critical need is the stability of the medium on which the data is stored. Two primary types of media are used in our industry: linear data tape and DVD archive. Each has strong and weak points. The total storage density in a DVD archive is high, but the write speed is somewhat limited. This can be overcome by using more transports to achieve sufficient throughput.

Data tape can store content faster, but because it must be searched linearly, it is slower to retrieve short pieces of content. A tape archive also requires a physically larger machine, and its volumetric efficiency (bytes per cubic furlong) is lower than that of a DVD archive because the DVD medium is so compact.

DVD archives tend to be used more often for short-form content (commercials) because they suffer no serious penalty when storing short items. This is in part due to the short time needed to load a disk, seek the right track and write a short burst of data. Tape archives are slower at this kind of operation because the setup time to seek linearly down a tape is much longer, and the amount of time saved in writing is insignificant with commercials. As a result, tape archives tend to be used for long-form content. As bit rates continue to fall (as they will with H.264 codecs) and write speeds on DVD improve, DVD will likely penetrate further into the turf of tape archives.
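The trade-off described above comes down to a fixed setup cost plus a transfer time that scales with the size of the item. Here is a minimal sketch of that reasoning in Python. The `Medium` class, the setup times and the transfer rates are all hypothetical placeholders chosen only to show the shape of the comparison; they are not measurements of any real tape or DVD system.

```python
from dataclasses import dataclass


@dataclass
class Medium:
    """A storage medium reduced to two assumed, illustrative parameters."""
    name: str
    setup_seconds: float   # load + seek before any data moves (assumed)
    transfer_mbps: float   # sustained transfer rate, megabits/s (assumed)

    def job_seconds(self, duration_s: float, content_mbps: float) -> float:
        """Total time to move one item: fixed setup plus transfer time."""
        size_megabits = duration_s * content_mbps
        return self.setup_seconds + size_megabits / self.transfer_mbps


# Placeholder figures, not vendor data: the disc has a short setup but a
# slow transfer; the tape has a long setup but a fast transfer.
disc = Medium("disc", setup_seconds=10.0, transfer_mbps=40.0)
tape = Medium("tape", setup_seconds=60.0, transfer_mbps=200.0)

if __name__ == "__main__":
    for label, duration_s in (("30 s spot", 30), ("1 h program", 3600)):
        for medium in (disc, tape):
            t = medium.job_seconds(duration_s, content_mbps=8.0)
            print(f"{label:12s} {medium.name}: {t:7.1f} s")
```

With these assumed numbers, setup time dominates for the 30-second spot, so the disc finishes first. For the hour-long program, transfer time dominates and the tape wins. Change the parameters to match real equipment and the crossover point moves accordingly.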

Software management

Both types of archive must be managed by software. In some cases, it might be a full-blown asset management system, and in other cases simply an archive manager that uses expert rules to determine where content is kept. For instance, if the traffic system does not show a need for a commercial for weeks, the archive manager might be programmed to scavenge the spot from the online storage and move it to archive until it is close to the time it is next needed. If it is needed in three days and never again, the content might be held in a nearline disk storage subsystem, where it can be retrieved quickly without tying up space on an archive from which it will soon be purged.
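A rule of this kind might be sketched as follows. This is only an illustration of the idea, not any vendor's archive manager: the function name `choose_tier`, the 14-day threshold standing in for "weeks," and the handling of cases the text does not mention (nothing scheduled, or an imminent airing followed by later ones) are all assumptions.

```python
from typing import List

ARCHIVE_AFTER_DAYS = 14  # assumed stand-in for "not needed for weeks"


def choose_tier(days_until_airings: List[int]) -> str:
    """Pick a storage tier from a spot's scheduled airings.

    `days_until_airings` holds the number of days from today to each
    airing the traffic system knows about (hypothetical input format).
    """
    upcoming = sorted(d for d in days_until_airings if d >= 0)
    if not upcoming:
        # Assumption: with nothing scheduled, the spot leaves online storage.
        return "archive"
    if upcoming[0] >= ARCHIVE_AFTER_DAYS:
        # Not needed for weeks: scavenge from online and move to archive.
        return "archive"
    if len(upcoming) == 1:
        # Needed soon and never again: hold on nearline disk, skip the archive.
        return "nearline"
    # Assumption: needed soon and again later, so keep it online for now.
    return "online"
```

For example, `choose_tier([21, 40])` returns `"archive"` and `choose_tier([3])` returns `"nearline"`. A production rule set would also weigh item size, available space and the cost of each move.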

A full-blown asset management system would contain the same ability to move content to the most appropriate location based on knowledge of its use, but it also would have more information (metadata) about the content and its usage. It might use the same archive manager as the “data mover,” while retaining control over the decisions on where to “put” the content. It would also have a full record of interrelated content, such as versions of a program with different interstitials or the editorial decisions necessary for use in other release media.

Capacity is sometimes thought to be expandable infinitely. Envision a tape archive with cabinets or silos that can move physical media from multiple storage locations to a set of I/O drives. One might be able to add cabinets to some arbitrary size in the future. However, that clearly is only possible as long as the specific model is still in production. It is thus wise to think of the most likely final size of a nearline or deep archive at the time of initial implementation to avoid the possibility of technology precluding expansion later.

The decision tree

There are some advantages to expanding with little disruption to the system. Adding storage later may require some downtime while mechanical systems are connected and software is updated to allow access to new storage locations.

Not so obvious is the “infinitely expandable” storage system. By that I mean a deep archive kept outside the electronically accessible system, for example on shelves. As long as the content is registered in the system, most software will allow media to be moved out of the robot. When the content is needed, the software will call for the correct media, which a human can fetch for the machine making the decisions. I mentioned at the beginning of the article that content must be retrieved with a minimum of “fuss.” Part of the planning process includes a decision about using human labor and metal shelving in an “automated” library. This is just part of the decision tree, but it's a decision that might take a system with a planned 5-year life and extend it far into the future.
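The bookkeeping behind shelf storage can be pictured with a small sketch. Everything here is hypothetical, including the `ArchiveRegistry` class, its method names and the message strings. The point is only that a piece of media stays registered after it leaves the robot, so a request can be routed either to the robot or to a person.

```python
from typing import Dict

ROBOT, SHELF = "robot", "shelf"  # hypothetical location labels


class ArchiveRegistry:
    """Tracks where each registered piece of media currently lives."""

    def __init__(self) -> None:
        self._location: Dict[str, str] = {}

    def register(self, media_id: str, location: str = ROBOT) -> None:
        self._location[media_id] = location

    def export_to_shelf(self, media_id: str) -> None:
        """Move media out of the robot while keeping it registered."""
        if media_id not in self._location:
            raise KeyError(f"{media_id} is not registered")
        self._location[media_id] = SHELF

    def request(self, media_id: str) -> str:
        """Return the action needed to make the media readable again."""
        location = self._location[media_id]  # KeyError if never registered
        if location == SHELF:
            return f"operator: fetch {media_id} from shelf"
        return f"robot: mount {media_id}"
```

After `register("TAPE0042")` and `export_to_shelf("TAPE0042")`, a call to `request("TAPE0042")` yields an operator instruction rather than a robot command. How much of that manual "fuss" is acceptable is the planning decision.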

Both types of archive mechanics are suitable for specific applications. Before listening to sales pitches, make sure you list everything you know about your intended use. Consider the number and length of items, required access times, number of items added and purged in any unit of time, bit rates and write speeds needed, and size of archive in items and bytes. Once you have a clear picture, vendors will be happy to explain how they can fit into your intended usage.
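Several items on that list reduce to simple arithmetic, which is worth doing before talking to a vendor. The sketch below is one way to do it. The function names, the decimal units (1 megabit = 1,000,000 bits) and every figure in the usage note are assumptions for illustration, not recommendations.

```python
import math


def archive_bytes(item_count: int, avg_duration_s: float,
                  bit_rate_mbps: float) -> float:
    """Total archive size in bytes for items of a given average length."""
    total_megabits = item_count * avg_duration_s * bit_rate_mbps
    return total_megabits * 1_000_000 / 8


def required_write_mbps(items_added: int, avg_duration_s: float,
                        bit_rate_mbps: float, window_s: float) -> float:
    """Average write rate needed to store new items within a time window."""
    return items_added * avg_duration_s * bit_rate_mbps / window_s


def transports_needed(required_mbps: float, per_transport_mbps: float) -> int:
    """Number of drives needed if throughput scales with drive count."""
    return math.ceil(required_mbps / per_transport_mbps)
```

As a worked example with made-up numbers, 10,000 thirty-second spots at 8 Mb/s come to `archive_bytes(10_000, 30, 8.0)`, or 3e11 bytes (300 GB). Storing 200 such spots within a one-hour window needs `required_write_mbps(200, 30, 8.0, 3600)`, about 13.3 Mb/s, which is two transports if each is assumed to sustain 10 Mb/s. The simple division ignores load and seek overhead, so treat the result as a lower bound.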

John Luff is senior vice president of business development for AZCAR.

Send questions and comments to: john_luff@primediabusiness.com.