Media Server Technology: Karl Paulsen
Backing Up Media Content
It goes without saying that data is one of the enterprise's most
valuable assets - so protecting it is of paramount importance; yet
for many, it's also one of the biggest headaches.
In the face of the growing volumes of content media delivered as
digital data, protecting storage mediums involves decisions that
weigh heavily upon the revenue-generating value of that content
versus the negative impact of not having it accessible when scheduled.
How an enterprise handles its content, and the effort it devotes to
preserving it, is multifaceted. For broadcasters, content that lives
for a day or two would generally be handled differently from content
intended to endure a decade of reproductions.
Content media is frequently divided between "active" and "nonactive."
Active content is media used in the very near term, say within a
day or two. Nonactive content is media that would remain idle for
weeks or months, and would be pushed onto data tape, DVD, or stored
in an archive either locally or off-site. Varying levels and methods
of protecting media content are now at hand, each with different
costs and complexities associated with them. The principal categories
include:
- Protection through data redundancy (RAID) on a single drive
array;
- Protection through duplicated (mirrored) copies on separate
drive arrays;
- Copies of data at an isolated site (disaster protection);
- Offloading to ancillary randomly accessible mediums (DVD or
hard disks);
- Offloading and cataloging of data onto linear magnetic mediums
(data or videotape);
- Or combinations of the above.
In addition to the data protection methods, the types and scale
of storage vary, depending upon the current and long term uses of
the content. Fundamental metrics for digital media content storage
are detailed in Fig. 1.
| Relative Level of Compression | Applications and format examples |
| Full bandwidth | Long term preservation via direct conversion to bits without compression (SD video at 270 Mbps, HD video at 1.485 Gbps) |
| Intermediary or mezzanine | Near and long term preservation, contribution level distribution, or direct/previously compressed content in a prescribed "native" format (e.g., DV, HDCAM, etc.) |
| Modestly compressed | For transmission uses, editorial reviewing, and near term applications on video servers (e.g., MPEG2 MP@ML at between 8 and 15 Mbps) |
| Highly compressed | For browse servers or thumbnail recognition (e.g., motion JPEG, MPEG1 or MPEG4) |
| Streaming media | Streaming servers for Internet, Web or desktop applications (e.g., Windows Media, Real, Quicktime) |
Fig. 1
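To give a feel for how the compression levels in Fig. 1 translate into storage, the following minimal sketch converts a constant bit rate into storage consumed per hour. The bit rates are the examples quoted in Fig. 1; the use of decimal gigabytes (10^9 bytes) and the function name are assumptions made for illustration, and real files would add audio, metadata and container overhead.

```python
# Storage consumed by one hour of content at a constant bit rate.
# Assumption: decimal units (1 GB = 10**9 bytes), no container overhead.

def gigabytes_per_hour(bit_rate_bps: float) -> float:
    """Return decimal gigabytes used by one hour of content."""
    bytes_per_second = bit_rate_bps / 8
    return bytes_per_second * 3600 / 1e9

# Example rates taken from Fig. 1.
EXAMPLE_RATES_BPS = {
    "SD full bandwidth (270 Mbps)": 270e6,
    "HD full bandwidth (1.485 Gbps)": 1.485e9,
    "MPEG-2 MP@ML, low end (8 Mbps)": 8e6,
    "MPEG-2 MP@ML, high end (15 Mbps)": 15e6,
}

if __name__ == "__main__":
    for label, rate in EXAMPLE_RATES_BPS.items():
        print(f"{label}: {gigabytes_per_hour(rate):.2f} GB/hour")
```

Under these assumptions, an hour of uncompressed SD comes to roughly 121.5 GB, while the same hour at 8 Mbps is about 3.6 GB, which is why the choice of level weighs so heavily on backup cost.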
One key to content protection lies in developing a workflow and
content management scheme for these assets, which starts at delivery
to the enterprise and extends beyond transmission. From a process
perspective, content arrives either in physical form (tape) or as
file-based media transmitted, generally, via satellite or over fiber
(e.g., private carrier or public service). At the point of entry,
the physical media is inspected, logged into the system, and scheduled
for a screening/quality-control process. At an appropriate time,
media is then cached as baseband video/audio to a video media server.
Software applications mark files with a SOM (start of message) and
a DUR (duration) timing value, and metadata is entered into a database
that links the content and its metadata to a scheduled transmission
(air or satellite distribution) (see Fig. 2).
Fig. 2
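The SOM and DUR values described above can be pictured as fields in a simple database record. The sketch below is illustrative only: the field names, the `ClipRecord` class, and the assumption of non-drop-frame timecode counted at 30 frames per second are all hypothetical, not taken from any particular automation system.

```python
from dataclasses import dataclass

FPS = 30  # assumption: non-drop-frame counting at 30 frames per second


def timecode_to_frames(tc: str) -> int:
    """Convert 'HH:MM:SS:FF' into a frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * FPS + ff


def frames_to_timecode(frames: int) -> str:
    """Convert a frame count back into 'HH:MM:SS:FF'."""
    ff = frames % FPS
    total_seconds = frames // FPS
    ss = total_seconds % 60
    mm = (total_seconds // 60) % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


@dataclass
class ClipRecord:
    """Hypothetical metadata row linking server media to its source tape."""
    clip_id: str
    som: str          # start of message, as timecode
    dur: str          # duration, as timecode
    source_tape: str  # label of the master videotape, if one exists

    def end_of_message(self) -> str:
        """SOM plus DUR gives the timecode where the clip ends."""
        total = timecode_to_frames(self.som) + timecode_to_frames(self.dur)
        return frames_to_timecode(total)


if __name__ == "__main__":
    clip = ClipRecord("SPOT-0001", "01:00:00:00", "00:00:30:00", "TAPE-A12")
    print(clip.clip_id, "ends at", clip.end_of_message())
```

Keeping a source tape reference alongside SOM and DUR is one way to support the index between server media and master videotape that the fallback scenario below depends on.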
This process yields a natural protection scheme in the master videotape,
provided the tape media remains in the facility's possession. In the event
of a problem, and with the proper indexing and database interface
between the server media and master videotape, the content could,
in an emergency, be broadcast from the original videotape.
However, when the content is received via a real-time or non-real-time
delivery platform, a different scenario presents itself. In the
mid-1980s, when real-time delivery over satellite was augmented
by CycleSat's service of pushing short-form commercial content over
satellite to local VTRs, the concept of unattended receiving of
media was officially born. Today, thousands of content elements
are being delivered over several services as digits; but that content
is not necessarily in a format or structure that the facility can
manage directly for playback or broadcast, so again, a cache method
must be inserted before the content can be transmitted or aired.
FROM CATCH TO CACHE
Content digits may be streamed, sent via FTP, or dribbled into the receive
point. In most applications, data is collected in a catch server
(e.g., delivery platforms such as DG, VyVx or Pathfire), then later
migrated to a local video server platform (or to videotape). Local
video servers configure the content for the specific standards and
practices necessary for that particular transmission or broadcast.
In many cases moving this data from catch to cache is done manually,
similar to ingesting content from VTRs to servers. Nonetheless,
similar processes of screening, QC and timing are required. However,
some services now provide timing and identification information
via another file, which aids in the conversion of the files to the
video server's native format at the time the data is moved between
catch and cache servers.
Recently, some delivery service providers have teamed with video
server manufacturers to enable a background transfer scheme from
catch to video server. This harmonization of delivery and playout
platforms has emerged for various reasons, both technical and marketing:
- The content may be contractually obligated for broadcast only for
a pre-specified time frame;
- The catch server's content may be time-sensitive or unavailable
after a certain time frame; i.e., catch disk space is limited, or
the delivery software employs an auto-purge feature whereby the end
user is expected to transfer the catch disk content to a secondary
medium in a prescribed time frame;
- The catch server was not intended to interface with the facility's
automation system, reinforcing the need to offload a copy of the
content and protect it as necessary.
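Because a catch server may purge content after a set period, a background transfer scheme has to know which items are most at risk. The following is a minimal sketch of that ordering, not any vendor's actual transfer logic: the `CatchItem` record, the single fixed retention period, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CatchItem:
    """Hypothetical record of one delivered item on the catch server."""
    name: str
    received: datetime
    transferred: bool = False


def purge_deadline(item: CatchItem, retention: timedelta) -> datetime:
    """Time at which the catch server is assumed to auto-purge the item."""
    return item.received + retention


def transfer_queue(items, now: datetime, retention: timedelta):
    """Items still awaiting transfer and not yet purged, most urgent first."""
    pending = [
        item for item in items
        if not item.transferred and purge_deadline(item, retention) > now
    ]
    return sorted(pending, key=lambda item: purge_deadline(item, retention))


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    retention = timedelta(days=7)  # assumed purge window
    items = [
        CatchItem("spot_a", datetime(2024, 1, 8)),
        CatchItem("spot_b", datetime(2024, 1, 4)),
        CatchItem("spot_c", datetime(2024, 1, 1)),  # already past its window
        CatchItem("spot_d", datetime(2024, 1, 5), transferred=True),
    ]
    for item in transfer_queue(items, now, retention):
        print(item.name, "must move before", purge_deadline(item, retention))
```

In this sketch an item past its window simply drops out of the queue; a real system would presumably flag it for redelivery instead.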
Content delivered to a privately placed catch server may still
require a secondary backup scheme. For broadcasters and content
delivery service providers, the current and more common choices
for protection include mirrored video servers, DVD-RAM backup, and
tape (video and/or data). Mirrored servers provide the most rapid
access to content, but are pricey. DVD-RAM offers a flexible high-density
medium, but has slow write times. Tape offers the highest density, highest
capacity, but least-accessible methodology for the protection of
media content assets.
The offline storing of data for protection raises the issue of
restoring the data to servers, should an unrecoverable loss on the
main and/or backup disk arrays arise. The restoring of content to
unprotected servers when data is lost or corrupted is time-consuming
and stressful. As users depend more heavily on their digital assets,
the need for planned restoration methods rises, offset in part by
the demand for shorter restoration windows and reduced storage expenditures.
Traditionally, asset backup has relied on data tape, principally
for its cost-effectiveness, rapid backup time, and the ability to
move data to an off-site location for protection. However, data
tape restoration is slow and, some say, unreliable for complete
restores, which are the main reason that backups are needed in the
first place. One alternative is disk backup, which offers a much
shorter restoration period when compared to data tape of similar
volume.
THE 'SHOESHINE EFFECT'
With the rapid decrease in cost per unit of storage, there now
may be advantages to adopting a full disk-based backup strategy,
with concepts drawn from the data-centric world where servers are
routinely backed up both incrementally and completely, traditionally
only to data tape. One of the problems with utilizing tape drives
for data backup relates to linear tape's operational inefficiencies.
Tape drives must be "tuned" to avoid the effect of starting, stopping
and repositioning the tape, sometimes referred to as the "shoeshine
effect." The data world avoids the problem by a process called multiplexing,
whereby several concurrent backups are streamed to the tape drives,
minimizing the shoeshine effect. However, it takes extra time to
read images and handle multiple incoming backup sources, and it
is not readily suited to the types of contiguous data sets that
comprise compressed digital data for MPEG-2 file structures.
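The multiplexing idea can be sketched in a few lines. This is a simplified model, not the behavior of any specific backup product: blocks from several backup streams are interleaved round-robin onto one "tape" (here just a list), each tagged with its source so it can be separated again. The function names and block contents are hypothetical.

```python
from itertools import zip_longest


def multiplex(streams):
    """Interleave blocks from several backup streams, round-robin.

    `streams` maps a stream name to its list of data blocks. Each block
    written to the tape is tagged with the name of the stream it came from.
    """
    names = list(streams)
    tape = []
    for row in zip_longest(*(streams[name] for name in names)):
        for name, block in zip(names, row):
            if block is not None:  # shorter streams run out early
                tape.append((name, block))
    return tape


def restore(tape, name):
    """Rebuild one stream by scanning the whole tape for its blocks."""
    return b"".join(block for tag, block in tape if tag == name)


if __name__ == "__main__":
    streams = {
        "server_a": [b"a1", b"a2", b"a3"],
        "server_b": [b"b1"],
        "server_c": [b"c1", b"c2"],
    }
    tape = multiplex(streams)
    print([tag for tag, _ in tape])
    print(restore(tape, "server_a"))
```

The sketch also shows the cost noted above: restoring one stream means reading past every other stream's blocks, and a large contiguous media file ends up scattered along the tape rather than stored in one run.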
Conversely, disk arrays do not need a steady stream of data, so
there is no shoeshine effect even for small incremental backups.
Other advantages to disk-based protective storage include:
- Technologies, such as RAID, allow for exceptional recovery,
at a reasonable cost;
- Reliability of disk drives is better than that of a pure tape medium.
A single bad tape can cause an entire restore operation to fail,
whereas RAID protection continues through a complete restore
even if one of the disks fails;
- For media applications, most restorations from the backup or
archive medium are single files. Disk storage is an efficient
single-file recovery method, and being a random-access medium,
makes disks ideal for recovery of single or random groups of individual
files (e.g., promotional interstitials, segments of programs,
and commercial content).
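The claim that RAID can carry on through a disk failure rests on redundancy stored alongside the data. The article does not say which RAID level is meant, so the sketch below assumes the single-parity approach (as used in RAID 5-style layouts), where a parity block is the byte-wise XOR of the data blocks. The block contents are made-up placeholders.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


# Three equal-length data blocks, one per (hypothetical) data disk.
data = [b"PROMO_01", b"SPOT_002", b"SEGMENT3"]

# The parity block is stored on a further disk.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity
# block to regenerate the missing data.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

if __name__ == "__main__":
    print("rebuilt block matches original:", rebuilt == data[1])
```

Single parity of this kind tolerates the loss of one disk per group; a second failure before the rebuild completes would not be recoverable under this assumption.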
The final element, which will only be mentioned in passing, is
the Archive Management Application. These software applications
and associated server hardware deal with backups for both disk and
tape mediums. The physical interfaces between video servers and
nearline disk arrays vary from provider to provider; and there is
always the option of backing the disk-based media to a subsequent
medium, such as DVD-RAM or even data tape (e.g., DLT or LTO).
Karl Paulsen is vice president of engineering at AZCAR (www.azcar.com).