Backing Up Media Content

It goes without saying that data is one of the enterprise's most valuable assets, so protecting it is of paramount importance; yet for many, it is also one of the biggest headaches. As the volume of media content delivered as digital data grows, protecting storage mediums involves decisions that weigh the revenue-generating value of that content against the negative impact of not having it accessible when scheduled.

How an enterprise handles its content, and the effort it puts into preserving that content, is multifaceted. For broadcasters, content that lives for a day or two would generally be handled differently from content intended to endure a decade of reproductions.

Content media is frequently divided between "active" and "nonactive." Active content is media used in the very near term, say within a day or two. Nonactive content is media that would remain idle for weeks or months, and would be pushed onto data tape or DVD, or stored in an archive either locally or off-site. Varying levels and methods of protecting media content are now at hand, each with different costs and complexities. The principal categories include:

Protection through data redundancy (RAID) on a single drive array;
Protection through duplicated (mirrored) copies on separate drive arrays;
Copies of data at an isolated site (disaster protection);
Offloading to ancillary randomly accessible mediums (DVD or hard disks);
Offloading and cataloging of data onto linear magnetic mediums (data or videotape);
Or combinations of the above.

In addition to the data protection methods, the types and scale of storage vary, depending upon the current and long-term uses of the content. Fundamental metrics for digital media content storage are detailed in Fig. 1.

One key to content protection lies in developing a workflow and content management scheme for these assets, which starts at delivery to the enterprise and extends beyond transmission. From a process perspective, content arrives either in physical form (tape) or as file-transmitted media, generally via satellite or over fiber (e.g., private carrier or public service). At the point of entry, the physical media is inspected, logged into the system, and scheduled for a screening/quality-control process. At an appropriate time, the media is then cached as baseband video/audio to a video media server. Software applications mark files with a SOM (start of message) and a DUR (duration), and metadata is entered into a database that links the content and its metadata to a scheduled transmission (air or satellite distribution) (see Fig. 2).

Fig. 2
This process yields a natural protection scheme in the master videotape, provided the tape media remains in the facility's possession. In the event of a problem, and with the proper indexing and database interface between the server media and master videotape, the content could, in an emergency, be broadcast from the original videotape.
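As a rough illustration of the SOM/DUR marking described above, the sketch below models one database record and derives the end-of-message point by adding DUR to SOM. Everything specific here is an assumption for illustration only: the 30 fps non-drop-frame timecode, the `ClipRecord` fields, and the sample clip ID and tape barcode are hypothetical and do not come from any particular automation system.

```python
from dataclasses import dataclass

FPS = 30  # assumption: 30 fps non-drop-frame timecode, chosen for simplicity


def tc_to_frames(tc: str) -> int:
    """Convert an 'HH:MM:SS:FF' timecode string to a frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * FPS + ff


def frames_to_tc(frames: int) -> str:
    """Convert a frame count back to an 'HH:MM:SS:FF' timecode string."""
    ss, ff = divmod(frames, FPS)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


@dataclass
class ClipRecord:
    clip_id: str      # hypothetical house ID
    som: str          # start of message
    dur: str          # duration
    master_tape: str  # hypothetical barcode of the master videotape

    def eom(self) -> str:
        """End of message, computed as SOM + DUR."""
        return frames_to_tc(tc_to_frames(self.som) + tc_to_frames(self.dur))


clip = ClipRecord("SPOT-0001", som="01:00:00:00", dur="00:00:30:00",
                  master_tape="T-1234")
print(clip.eom())  # 01:00:30:00
```

Keeping the master tape reference in the same record as the SOM and DUR is what would let an operator cue the original videotape to the same point if the server copy were lost.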

However, when the content is received via a real-time or non-real-time delivery platform, a different scenario presents itself. In the mid-1980s, when real-time delivery over satellite was augmented by CycleSat's service of pushing short-form commercial content over satellite to local VTRs, the concept of unattended receiving of media was officially born. Today, thousands of content elements are being delivered over several services as digits, but that content is not necessarily in a format or structure that the facility can manage directly for playback or broadcast. So again, a cache method must be inserted before the content can be transmitted or aired.

FROM CATCH TO CACHE

Content digits may be streamed, sent via FTP, or dribbled into the receive point. In most applications, data is collected in a catch server (e.g., delivery platforms such as DG, VyVx or Pathfire), then later migrated to a local video server platform (or to videotape). Local video servers configure the content for the specific standards and practices necessary for that particular transmission or broadcast. In many cases, moving this data from catch to cache is done manually, much like ingesting content from VTRs to servers, and the same processes of screening, QC and timing are still required. However, some services now provide timing and identification information via a separate file, which aids in converting the files to the video server's native format when the data is moved between catch and cache servers.

Recently, some delivery service providers have teamed with video server manufacturers to enable a background transfer scheme from catch server to video server. This harmonization of delivery and playout platforms has emerged for various reasons, both technical and marketing:

The content may be contractually obligated for broadcast only for a pre-specified time frame;
The catch server's content may be time-sensitive or unavailable after a certain time frame (e.g., catch disk space is limited, or the delivery software employs an auto-purge feature whereby the end user is expected to transfer the catch disk content to a secondary medium within a prescribed time frame);
The catch server was not intended to interface with the facility's automation system, creating the need to offload a copy of the content and protect it as necessary.
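To make the auto-purge concern concrete, the sketch below lists catch-server items that have not yet been moved to the video server, ordered by how little time remains before they would be purged. It is a minimal illustration under stated assumptions: the seven-day retention window, the `CatchItem` fields, and the item names are all hypothetical, and no real delivery platform's interface is being modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CatchItem:
    name: str                  # hypothetical file name on the catch server
    received: datetime         # when the delivery completed
    transferred: bool = False  # True once copied to the video server


def pending_transfers(items, retention, now):
    """Return (time_left, name) for untransferred items, most urgent first."""
    pending = [
        (item.received + retention - now, item.name)
        for item in items
        if not item.transferred
    ]
    return sorted(pending)


retention = timedelta(days=7)  # assumed auto-purge window
now = datetime(2024, 1, 10, 12, 0)
items = [
    CatchItem("promo_a", datetime(2024, 1, 4, 12, 0)),
    CatchItem("spot_b", datetime(2024, 1, 9, 12, 0)),
    CatchItem("show_c", datetime(2024, 1, 5, 12, 0), transferred=True),
]

for time_left, name in pending_transfers(items, retention, now):
    print(name, time_left)
```

A background transfer scheme of the kind described would effectively work through such a list automatically, rather than relying on an operator to notice the deadline.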

Content delivered to a privately placed catch server may still require a secondary backup scheme. For broadcasters and content delivery service providers, the current and more common choices for protection include mirrored video servers, DVD-RAM backup, and tape (video and/or data). Mirrored servers provide the most rapid access to content, but are pricey. DVD-RAM offers a flexible, high-density medium, but has slow write times. Tape offers the highest density and capacity, but is the least-accessible method of protecting media content assets.

Storing data offline for protection raises the issue of restoring that data to the servers, should an unrecoverable loss occur on the main and/or backup disk arrays. Restoring content to unprotected servers when data is lost or corrupted is time-consuming and stressful. As users depend more heavily on their digital assets, the need for planned restoration methods rises, offset in part by the demand for shorter restoration windows and reduced storage expenditures.

Traditionally, asset backup has relied on data tape, principally for its cost-effectiveness, rapid backup time, and the ability to move data to an off-site location for protection. However, data tape restoration is slow and, some say, unreliable for complete restores, which are the main reason backups are needed in the first place. One alternative is disk backup, which offers a much shorter restoration period than data tape of similar volume.

THE 'SHOESHINE EFFECT'

With the rapid decrease in cost per unit of storage, there now may be advantages to adopting a full disk-based backup strategy, with concepts drawn from the data-centric world, where servers are routinely backed up both incrementally and completely, traditionally only to data tape. One of the problems with utilizing tape drives for data backup relates to linear tape's operational inefficiencies. Tape drives must be "tuned" to avoid the effect of starting, stopping and repositioning the tape, sometimes referred to as the "shoeshine effect." The data world avoids the problem by a process called multiplexing, whereby several concurrent backups are streamed to the tape drives, minimizing the shoeshine effect. However, it takes extra time to read images and handle multiple incoming backup sources, and multiplexing is not readily suited to the types of contiguous data sets that comprise compressed digital data for MPEG-2 file structures.
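The trade-off in multiplexing can be sketched in a few lines. The example below interleaves blocks from several backup sources onto one simulated tape in round-robin order, then measures how much tape must pass the head to recover a single source. It is a simplified model under stated assumptions: the round-robin policy, the block labels, and the source names are hypothetical, and real backup software schedules blocks in more sophisticated ways.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking a stream that has run out of blocks


def multiplex(streams):
    """Interleave blocks from several backup streams, round-robin, onto one tape."""
    tape = []
    for row in zip_longest(*streams.values(), fillvalue=_MISSING):
        for source, block in zip(streams.keys(), row):
            if block is not _MISSING:
                tape.append((source, block))
    return tape


def restore(tape, source):
    """Recover one stream and report how many tape blocks its data spans."""
    positions = [i for i, (tag, _) in enumerate(tape) if tag == source]
    blocks = [tape[i][1] for i in positions]
    span = positions[-1] - positions[0] + 1 if positions else 0
    return blocks, span


streams = {
    "news": ["n0", "n1", "n2"],
    "spots": ["s0", "s1"],
    "promo": ["p0"],
}
tape = multiplex(streams)
blocks, span = restore(tape, "news")
print(blocks, span)  # ['n0', 'n1', 'n2'] 6
```

In this toy case, three blocks of the "news" stream are spread across six tape blocks, which is why a contiguous media file written this way takes longer to read back than one written in a single pass.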

Conversely, disk arrays do not need a steady stream of data, so there is no shoeshine effect even for small incremental backups. Other advantages to disk-based protective storage include:

Technologies such as RAID allow for exceptional recovery at a reasonable cost;
Disk drives are more reliable than a pure tape medium. Discovering a bad tape might cause an entire restore operation to fail, yet RAID protection continues through a complete restore even if one of the disks fails;
For media applications, most restorations from the backup or archive medium are single files. Disk storage, being a random-access medium, is ideal for recovering single files or random groups of individual files (e.g., promotional interstitials, segments of programs, and commercial content).
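The claim that RAID can carry on through a disk failure rests on parity. As a hedged illustration, the sketch below uses a single XOR parity block, in the style of RAID levels that keep one parity block per stripe, to rebuild the contents of a lost disk. The article does not name a RAID level, so the single-parity scheme and the sample block contents are assumptions made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"MPEG", b"clip", b"data"]  # blocks held on three data disks
parity = xor_blocks(data)           # block held on the parity disk

# Simulate losing the second disk and rebuilding it from the survivors
# plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'clip'
```

Because XOR-ing the surviving blocks with the parity cancels everything except the missing block, reads can continue while the failed disk is replaced, which a single bad tape cannot offer.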

The final element, which will only be mentioned in passing, is the Archive Management Application. These software applications and their associated server hardware deal with backups for both disk and tape mediums. The physical interfaces between video servers and nearline disk arrays vary from provider to provider, and there is always the option of backing up the disk-based media to a subsequent medium, such as DVD-RAM or even data tape (e.g., DLT or LTO).