Broadcast archives

With the advent of file-based production, a broadcaster's archive has become more of an issue than ever. When programs were supplied on videotape, they were kept for the duration of the rights window and then returned to the production company or removed from the transmission area. After transmission, commissioned material was sent to a climate-controlled warehouse to sit on a shelf in perpetuity.

The management of the warehouse ranges from a card index of tape and shelf numbers to a comprehensive solution with bar codes and a database. Tapes are checked out if a series is to be aired again, or if the content is needed as archive material in a new program.

When the case is made for the investment in an archive for files, a number of questions are raised. What is an archive for? Is it a repository of assets to be mined in the future, or is it for disaster recovery? What should be archived, and what should be trashed? What file format should be used? And finally: What storage medium should be used — data tape, spinning disks or just outsourcing the storage?

The simple answer is that you should balance the cost of buying and running the storage against the value of the assets.

The archive

An archive can serve several functions. For the newsroom, it's an essential pool of material for creating background stories to explain current events. However, news ages very quickly, and a skilled news archivist will ensure that only what is essential is archived. In addition, most news clips are short and do not require the same amount of storage as long-form programs. It makes sense to manage a news archive as a stand-alone system, separate from the program archive.

As program production moves to tapeless formats, a file archive will replace the videotape library as the broadcaster's main repository of program assets. For a small station, it can be a backup for the disk storage. For larger broadcasters commissioning their own programming, it is a permanent repository of their assets. For any broadcaster, it can form part of their disaster recovery (DR) strategy. Data tapes can be shipped out to a remote site, or for those with deep pockets, a second tape library can be installed at the DR location. The issue here is not technological; it's a business decision.

A typical archive consists of a large RAID array for nearline storage, backed by a tape library system. The disk subsystem stores work in progress — in post or waiting for transmission. The post and transmission departments can pull files from the nearline to high-performance storage for editing and playout.

The technology

A file-based archive is just part of a larger system and could be considered a service to the media asset management (MAM) system. (See Figure 1 on page 10.) The file archive sits at the bottom of the storage hierarchy and provides the lowest cost per byte at the expense of performance (time to restore an asset). The archive management application sits between the media storage and the digital asset management (DAM or MAM) system.

Archive manager functions

In simple terms, the archive manager can be thought of as managing the data tape library. To get the optimum performance from a tape drive, a large managed buffer is required so data can be streamed to the tape at maximum write speed. Most libraries have multiple drives, and the archive manager can prioritize read/write operations to best serve broadcast operations. For example, a late schedule change may mean a file must be restored urgently for playout.
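The prioritization described above can be pictured as a priority queue in front of the tape drives. The following is a minimal sketch, not a description of any vendor's archive manager; the priority levels, class names and file names are all hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical priority levels: the lower the number, the sooner it is served.
URGENT_RESTORE = 0    # e.g. a late schedule change, file needed for playout
ROUTINE_RESTORE = 1
ARCHIVE_WRITE = 2


@dataclass(order=True)
class TapeJob:
    priority: int
    sequence: int                           # tie-breaker: first in, first out
    filename: str = field(compare=False)
    operation: str = field(compare=False)   # "read" or "write"


class DriveQueue:
    """Orders pending tape jobs by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, filename, operation, priority):
        job = TapeJob(priority, next(self._counter), filename, operation)
        heapq.heappush(self._heap, job)

    def next_job(self):
        return heapq.heappop(self._heap) if self._heap else None


queue = DriveQueue()
queue.submit("series_ep12.mxf", "write", ARCHIVE_WRITE)
queue.submit("promo_0412.mxf", "read", ROUTINE_RESTORE)
queue.submit("movie_replacement.mxf", "read", URGENT_RESTORE)

order = []
while (job := queue.next_job()) is not None:
    order.append(job.filename)
print(order)  # the urgent restore comes out first
```

The urgent restore is served first even though it was submitted last. A real archive manager would also weigh which tape is already mounted and how many drives are free.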

An index maps files to tapes in a directory. The manager can group content files to suit the operations. One group of tapes could be used for spots, one for series and one for movies. Content can be grouped to single tapes, so they can be removed from the robot to store on a shelf.
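As a rough illustration of such an index, the sketch below keeps two lookups: file to tape, and content group to tapes. The class, the bar codes and the file names are invented for the example and do not represent a real product's schema.

```python
from collections import defaultdict


class TapeIndex:
    """Toy index: which tape holds a file, and which tapes form a group."""

    def __init__(self):
        self._file_to_tape = {}
        self._group_to_tapes = defaultdict(set)

    def record(self, filename, tape_barcode, group):
        self._file_to_tape[filename] = tape_barcode
        self._group_to_tapes[group].add(tape_barcode)

    def locate(self, filename):
        # Returns None when the file is not in the archive.
        return self._file_to_tape.get(filename)

    def tapes_in_group(self, group):
        return sorted(self._group_to_tapes.get(group, ()))


index = TapeIndex()
index.record("spot_cola_30s.mxf", "A00001", "spots")
index.record("drama_s01e01.mxf", "B00001", "series")
index.record("drama_s01e02.mxf", "B00002", "series")

print(index.locate("spot_cola_30s.mxf"))
print(index.tapes_in_group("series"))
```

Because every tape in the "series" group holds nothing else, the whole group can be exported from the robot to a shelf without stranding unrelated content.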

Beyond the basic tasks related to storage and retrieval, the archive manager can perform background checks on the integrity of drives and data, and preemptively migrate content to current drive formats and fresh media.
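One common way to check data integrity is to record a checksum when a file is archived and recompute it later. The article does not say how any particular archive manager does this, so the sketch below is an assumption: it uses SHA-256, and the function names are hypothetical.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, stored_checksum):
    """True when the file still matches the checksum recorded earlier."""
    return file_checksum(path) == stored_checksum


# Demonstration on a throwaway file standing in for an archived asset.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"essence data")
    demo_path = tmp.name

stored = file_checksum(demo_path)            # recorded at archive time
intact_before = is_intact(demo_path, stored)

with open(demo_path, "ab") as handle:        # simulate corruption
    handle.write(b"\x00")
intact_after = is_intact(demo_path, stored)

os.remove(demo_path)
print(intact_before, intact_after)
```

A file that fails the check could then be restored from a second copy and rewritten to fresh media.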

DAM

DAM means that most production processes can use a low-resolution proxy of the broadcast asset. The proxy is stored on a regular RAID array using generic low-cost IT storage. Today's IT networks can easily handle the demands for proxy viewing with a properly designed switched infrastructure.

A typical process flow is shown in Figure 2. Original content is ingested and stored in the archive in the highest resolution chosen by the broadcaster. From this, versions can be made for transmission. These may be edited and segmented. Legacy material may need processing — including scratch removal and noise reduction, deinterlacing and color correction — to clean up the picture. This processing can be applied to a copy, leaving the preservation master untouched for improved image rescue techniques in the future. The processed copy becomes the transmission master.

The key components of any archive are two processes: data movement and transcoding. The data movers receive commands from the DAM and broadcast automation to copy files from tape to nearline, or move files from nearline to tape.

No single file format fits all applications. The archive copy must be the highest quality, but for editing, a format like Avid DNxHD or Apple ProRes may be more suitable. For the playout servers, a lower bit rate, long-GOP format is more appropriate. Using the right format for the job maximizes visual quality while optimizing hardware costs. To service this need for different formats, the archive management must include transcoding. This may be integral to the archive management, or it may call on a transcoding service.

Any transcode will introduce artifacts, so the workflow should be designed so that transcoding upward in bit rate and resolution is avoided. Each videotape dub dropped a generation in quality, and 10 dubs from shooting to playout were not unreasonable. The number of transcode steps in a typical file workflow can be minimal compared with videotape, so quality stands to improve. (This assumes that too much compression is not used.)

Partial restore

Partial restore is often cited as a differentiator between video storage and generic IT systems. First-generation LTO had a transfer rate around 15MB/s. A one-hour program stored as 50Mb/s MPEG is about 20GB. Restoring such a file from a tape archive takes about 20 minutes. If you needed a three-minute clip for a promo, tying up a tape drive for 20 minutes was not efficient. With LTO-4 offering a 120MB/s transfer rate, it's not such an issue. The entire program can be restored to disk. If a promo is being made, the program will be aired in the near future anyway and will have to be restored from tape in its entirety.
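The arithmetic behind these round figures can be checked in a few lines. This sketch counts video essence only and assumes the drive streams at its full rated speed; it ignores audio, wrapper overhead, and tape load and seek times, so real restores will take longer.

```python
def file_size_gb(bit_rate_mbps, duration_min):
    """Size of a constant-bit-rate file in gigabytes (1GB = 1000MB)."""
    megabits = bit_rate_mbps * duration_min * 60
    return megabits / 8 / 1000


def restore_minutes(size_gb, tape_rate_mb_per_s):
    """Streaming time from tape, ignoring load and seek time."""
    return size_gb * 1000 / tape_rate_mb_per_s / 60


size = file_size_gb(50, 60)              # one hour at 50Mb/s -> 22.5
first_gen = restore_minutes(size, 15)    # first-generation LTO -> 25.0
lto4 = restore_minutes(size, 120)        # LTO-4 -> 3.125

print(size, first_gen, lto4)
```

The unrounded results are 22.5GB and 25 minutes, which the text rounds to about 20. At the LTO-4 rate the same file streams in a little over three minutes.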

Partial restore is still a valid concept for large HD files. These may be stored at 200Mb/s or higher, and a 120-minute movie is a large file to store even for two weeks on the disk array. A similar reasoning can be applied to HD sports content. Once the game has been aired, all that may be required in the future are highlights; partial restore is ideal for such applications. The move to 3Gb/s, even with mezzanine compression, only exacerbates the issue.
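To see why partial restore pays off, consider how little of a large file a highlight occupies. The sketch below makes a strong simplifying assumption: a constant-bit-rate file with no wrapper overhead, so time maps linearly to byte offset. Real systems rely on the file's index and must respect frame or GOP boundaries; the function name is hypothetical.

```python
def clip_byte_range(bit_rate_mbps, start_s, end_s):
    """Approximate byte range of a clip in a constant-bit-rate file."""
    bytes_per_second = bit_rate_mbps * 1_000_000 // 8
    return start_s * bytes_per_second, end_s * bytes_per_second


# A three-minute highlight starting 45 minutes into a 120-minute,
# 200Mb/s program.
start, end = clip_byte_range(200, 45 * 60, 48 * 60)
clip_gb = (end - start) / 1e9
_, whole_file_bytes = clip_byte_range(200, 0, 120 * 60)
whole_gb = whole_file_bytes / 1e9

print(clip_gb, whole_gb)  # 4.5 180.0
```

Under these assumptions, the clip is 4.5GB out of a 180GB file, so a partial restore moves 2.5 percent of the data.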

The medium

Although many technologies, from the mundane to the esoteric, have promised to replace magnetic storage, they remain just around the corner. Optical storage is still limited to 50GB per disc, so the conventional disk drive and data tape remain at the heart of any storage subsystem. Solid-state drives are finding applications in acquisition and playout, but it is still early days for their use in mass storage.

Tape libraries come in all sizes, from an auto-loader with a capacity of 10 tapes up to enterprise libraries with capacities of tens of thousands of slots.

The front end of an archive today is a disk array. This technology has evolved from parallel-connected disks, SCSI and IDE/ATA to the current serial technology, SAS and SATA. A RAID subsystem can provide backup against disk failure. These arrays provide cost-effective storage for work in progress and smaller archives.

For longer-term and low-cost storage, data tape is the most popular option. Again, the technology is constantly changing and improving. Technologies like DLT have been replaced with LTO. The next-generation LTO-5 stores 1.6TB, about 70 hours of 50Mb/s video. The LTO cartridge is about two-thirds the volume of a 3.5in hard drive, so its storage density is about double that of a 1TB drive. Hard drives need more space around them for cooling and disk controllers, so with current technology, the tape store will take up less floor space.
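The hours-per-cartridge figure follows directly from capacity and bit rate. This sketch uses the 1.6TB capacity quoted above, counts video essence only, and assumes no compression by the drive; the 200Mb/s case is added for comparison.

```python
def hours_per_cartridge(capacity_tb, bit_rate_mbps):
    """Hours of constant-bit-rate video that fit on one cartridge."""
    capacity_megabits = capacity_tb * 1_000_000 * 8   # TB -> MB -> Mb
    return capacity_megabits / bit_rate_mbps / 3600


hours_sd = hours_per_cartridge(1.6, 50)     # roughly 71 hours
hours_hd = hours_per_cartridge(1.6, 200)    # roughly 18 hours

print(round(hours_sd, 1), round(hours_hd, 1))
```

At 200Mb/s the same cartridge holds a quarter as much, which is why HD and higher-resolution formats push up the slot count of a library so quickly.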

Energy efficiency

Energy use has always been a cost factor with archives. Videotape and film both need climate control. For a long-term tape archive, the U.S. Library of Congress recommends 10 degrees Celsius and 20 percent to 30 percent relative humidity (RH). For film, it recommends 3 degrees Celsius and 20 percent to 30 percent RH.

Spinning disks draw power, so they need additional cooling. For a deep archive, why keep the disks spinning when they are not in use? This is the principle behind a massive array of idle disks (MAID). Unused disks spin down until they are needed. In a typical MAID array, only 25 percent of the disks are spinning. Not only does this reduce power consumption, but it also prolongs the life of the drives.
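The effect of spinning down most of the array can be estimated with simple arithmetic. Only the 25 percent figure comes from the text. The per-disk wattages and the array size below are placeholder assumptions chosen for illustration, not measurements; substitute figures from the drive data sheets.

```python
def array_power_watts(n_disks, spinning_fraction, watts_spinning, watts_idle):
    """Total disk power for an array with some disks spun down."""
    spinning = n_disks * spinning_fraction
    idle = n_disks - spinning
    return spinning * watts_spinning + idle * watts_idle


# Placeholder assumptions: 100 disks, 8W spinning, 1W spun down.
N_DISKS = 100
WATTS_SPINNING = 8.0
WATTS_IDLE = 1.0

always_on = array_power_watts(N_DISKS, 1.0, WATTS_SPINNING, WATTS_IDLE)
maid = array_power_watts(N_DISKS, 0.25, WATTS_SPINNING, WATTS_IDLE)
saving = 1 - maid / always_on

print(always_on, maid, round(saving, 3))
```

With these placeholder values, the MAID array draws about a third of the power of the always-on array. The calculation excludes controllers, fans and the cooling plant.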

Data tape libraries use power for the robots and drives, but far less than an equivalent-capacity disk array. Data tape has the same environmental requirements as videotape: not too hot and not too humid.

You can prove anything with statistics, but the generally held view is that tape is the lowest energy user.

Summary

An archive can serve several functions. It can form part of the backup strategy, it can be used for DR, it can simply be a program repository, or it can be all three.

The archive may be thought of as a permanent store, but the underlying technology is anything but permanent. There is no right answer to archive design. No current storage technology has a long life. Drives become obsolete, and the recording media decays. The current wisdom is to replace drives every three to five years, and to migrate data from tapes after 10 to 15 years. These factors must be considered when calculating the operating costs of the archive. Automated monitoring of drive and media condition, and automated migration to protect the archive, are just not possible with videotape.
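A migration policy along these lines can be reduced to a simple age check. The sketch below flags an item once it reaches the lower end of each range quoted above (three years for drives, 10 years for tapes); choosing the lower bound is an assumption, and the function and constant names are hypothetical.

```python
from datetime import date

# Lower ends of the ranges quoted in the text.
DRIVE_REPLACE_YEARS = 3
TAPE_MIGRATE_YEARS = 10


def years_in_service(commissioned, today):
    return (today - commissioned).days / 365.25


def due_for_action(kind, commissioned, today):
    """True when a drive is due for replacement or a tape for migration."""
    if kind == "drive":
        limit = DRIVE_REPLACE_YEARS
    elif kind == "tape":
        limit = TAPE_MIGRATE_YEARS
    else:
        raise ValueError(f"unknown kind: {kind}")
    return years_in_service(commissioned, today) >= limit


today = date(2010, 1, 1)
old_drive = due_for_action("drive", date(2005, 1, 1), today)   # about 5 years
new_drive = due_for_action("drive", date(2008, 6, 1), today)   # under 2 years
old_tape = due_for_action("tape", date(1998, 1, 1), today)     # 12 years
young_tape = due_for_action("tape", date(2005, 1, 1), today)   # about 5 years

print(old_drive, new_drive, old_tape, young_tape)
```

An archive manager would combine an age rule like this with measured error rates, so that a tape showing early signs of decay is migrated before its scheduled date.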

Archive management, which handles data movement, transcoding and tape management, has become a service to an overall DAM/MAM system. For the broadcaster, this can deliver large cost savings. The traditional broadcaster uses manual processes to move content from production to transmission. The move to file-based workflows can eliminate the bulk of those manual processes.

There are choices for the broadcaster: The complexity of managing the archive can be outsourced to a data center. This could be within the corporate firewall or at a remote shared data center.

Each broadcaster has a different set of technical requirements and will place its own value on its assets. Storage technology is getting cheaper, but production creates ever more content. New formats like HD, 2K and 4K, plus UHDTV in the future, just increase the file sizes for a given program duration. Today's optimum solution will be wrong in two years. So choose something, and expect to migrate!