Principles in Archive Management, Part II


Last month we introduced you to the archive, one of the less understood components of the enterprise video server solution. This month we look at the features and functions to consider when evaluating how archive components affect performance.

The archive manager is the core device responsible for issuing instructions to the ATL (automated tape library) according to the needs of the video server and automation systems. It controls how data is retrieved and stored, and how that data is transferred between the tape drives and the video server.

The archive manager can provide both the software and the electrical signal links between tape drives and the video server. Thus, the proper match between the video server and the archive depends upon several factors – some we’ve discussed previously and others we will look at in this article.

DATA RATE

One element in the archive equation is the rate at which data can be moved from device to device. In today’s technology, the term native transfer rate specifies just how fast data can be accepted by, or dispensed from, a data tape drive or optical storage subsystem. The physical drive’s (tape, optical or otherwise) transfer rate is a significant factor in how an enterprise-based video server system performs, and it must be considered in both read and write modes to give a complete picture.
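
To put a native rate in perspective, here is a minimal back-of-envelope sketch in Python; the encoding rate and drive rate are assumptions for illustration, not any vendor’s figures:

    # Rough transfer-time estimate for moving a one-hour program at a
    # drive's native rate. All figures here are illustrative assumptions.
    program_minutes = 60
    encode_rate_mbps = 24        # assumed source encoding rate, megabits/sec
    native_rate_MBps = 20        # assumed native drive rate, megabytes/sec

    file_size_MB = program_minutes * 60 * encode_rate_mbps / 8
    transfer_seconds = file_size_MB / native_rate_MBps
    print(f"File size: {file_size_MB:.0f} MB")
    print(f"Transfer at native rate: {transfer_seconds / 60:.1f} minutes")

Under these assumed numbers, a one-hour program is roughly 10,800 MB and moves in about nine minutes at the drive’s native rate – before any system overhead is counted.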

Archive systems and their tape drives are not new. The major players in tape drive solutions (Ampex, StorageTek, Exabyte, Quantum, Fujitsu, Sony) already deploy their drives – coupled with either third-party or their own robotics systems – in thousands of data storage environments outside broadcast media. This experience yields a thorough understanding of data management that can be applied to the video server domain with only minor hardware or software changes to the physical devices.

Devices from these select few manufacturers are therefore picked for their reliability, size, performance, ability to handle continuous operation and price.

PRODUCT EXPERIENCE

Keep in mind that the broadcast marketplace commands only a small fraction of the nearline and offline tape archive business. Vendors serving broadcast applications already have a good deal of product experience and are, in the case of television operations, marrying their existing products to application-specific archive manager software, adding feature sets aimed at meeting today’s needs for media storage on digital tape.

Archive media, such as tape drives and now optical DVD-RAM drives, operate in various modes and therefore carry different and sometimes incompatible application specifications. Coupling these drives to secondary software systems then becomes the key ingredient of the nearline archive solution, with some solutions well suited to certain applications and less favorable for others.

For example, while a DVD-RAM drive can seek and begin data playback much faster than a linear tape drive, writing data to a DVD-RAM device is slower than writing to DLT- or DST-type tape systems.

The physical connection between archive and server plays an important part in the overall solution. Generally, archive storage devices are direct-connected to the video server’s disk storage arrays. While the native data rates of tape transfer devices are generally high (10, 20 or even 24 MBps), the true effect of that rate may be encumbered by other influences. The video server has a built-in governor that controls just how much data it can absorb and in what priority it doles that data out.

OTHER FACTORS

A disk drive storage system’s ability to move data from the archive to the server’s disks depends upon other factors. Overhead factors include the amount of activity on the Fibre Channel network, the number of encode or decode streams being processed and even the inherent internal architecture of the video server itself. Collectively, these factors may degrade the specified 10 to 24 MBps transfer rate from the archive device to a number much lower than the rate advertised for the drives.
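
A minimal sketch of that derating, with purely assumed overhead fractions standing in for the influences above:

    # Illustrative derating of an advertised archive transfer rate.
    # The overhead fractions below are assumptions, not measured values.
    advertised_MBps = 24.0
    overheads = {
        "Fibre Channel contention": 0.15,  # share lost to other network traffic
        "encode/decode streams":    0.20,  # share consumed by active streams
        "server internal overhead": 0.10,  # costs of the internal architecture
    }

    effective_MBps = advertised_MBps
    for cause, fraction in overheads.items():
        effective_MBps *= (1.0 - fraction)
    print(f"Advertised: {advertised_MBps:.1f} MBps")
    print(f"Effective under load: {effective_MBps:.1f} MBps")  # about 14.7 MBps

Even modest overheads compound; here a 24 MBps drive effectively delivers under 15 MBps once the system is busy.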

Furthermore, the cycling of data and the amount of usage a system gets add overhead to the performance number. The metric used, called exchanges or duty cycles, may be stated very high in the specifications; yet when the actual operating environment is considered, this metric may have little influence on the system compared with the additional cost a user must pay upfront for that capability.

An exchange may be described as a single pick, mount, dismount and replace. This action almost always involves a robotics library system and is stated in exchanges per hour. Sometimes an exchange specification looks more like a duty cycle spec and may include the entire operation from pick through data read/write to replace.
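
To see why a high exchange rating can mean little in practice, compare the bare exchange spec with a full pick-to-replace duty cycle; every figure below is an assumption for illustration:

    # Bare exchange spec vs. a full duty cycle that includes seek and read.
    # All figures are assumptions for illustration.
    exchange_seconds = 12      # pick + mount + dismount + replace
    seek_seconds = 45          # locating the data on a linear tape
    file_size_MB = 10800       # the one-hour program from the earlier sketch
    read_rate_MBps = 15        # assumed effective read rate

    rated_per_hour = 3600 / exchange_seconds
    full_cycle = exchange_seconds + seek_seconds + file_size_MB / read_rate_MBps
    print(f"Rated: {rated_per_hour:.0f} exchanges/hour")
    print(f"With seek and read included: {3600 / full_cycle:.1f} cycles/hour")

A robot rated at 300 exchanges per hour completes fewer than five full cycles per hour once seeking and reading the data are counted – which is why paying a premium for the headline number deserves scrutiny.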

BATCH OPERATIONS

The archive works in sets, or batches, of operations. A single operation may be as simple as a command to move data from the server to the archive. A batch operation is then a set of commands that requires many activities in order to complete the transfer.

Batching generally includes multiple tape mounts and dismounts, locating the data on the tape, moving data files to/from the archive, and processing a priority level to determine which files get moved when and to where. The volume of both data files and commands – singular or batched – per given period of time is yet another factor in defining overall system performance.
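
A minimal sketch of how an archive manager might order such a batch; the commands and the priority scheme are hypothetical:

    import heapq

    # Hypothetical batch of archive commands as (priority, description);
    # a lower number means more urgent. Values are illustrative only.
    batch = [
        (2, "restore promo_047 to the ingest server"),
        (1, "restore news_open to the air server"),   # needed for air first
        (3, "migrate yesterday's ingest material to tape"),
    ]

    heapq.heapify(batch)             # order the batch by priority
    while batch:
        priority, command = heapq.heappop(batch)
        print(f"priority {priority}: {command}")

The point of the sketch is only the ordering: material needed for air is pulled ahead of housekeeping moves, regardless of the order in which commands arrived.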

When viewing performance from the video server’s perspective, how data gets from data tape to server disks also varies from manufacturer to manufacturer. We should all remember, however, that fundamental changes are ongoing in video server architectures. All video servers have established flexible, expandable storage models, and what is stated today may be headed for change in the not too distant future.

Still, in many cases, data transfers from online storage drives to decoders are constrained by some finite requirements, starting with the fundamental source encoding rate and including the delivery rate to the video transport. In cases where 270 Mbps SMPTE 259M is the transport medium, encoders are optimized so that the user knows with reasonable certainty that the server will deliver x streams at y data rate without interruption.
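
The arithmetic behind that certainty is simple enough to sketch; the bandwidth and per-stream figures below are assumptions:

    # How many decode streams fit within a server's delivery budget.
    # Both figures are assumptions for illustration.
    delivery_budget_MBps = 40    # assumed usable output bandwidth, megabytes/sec
    stream_rate_mbps = 24        # assumed per-stream encoding rate, megabits/sec

    stream_rate_MBps = stream_rate_mbps / 8
    max_streams = int(delivery_budget_MBps // stream_rate_MBps)
    print(f"Each stream needs {stream_rate_MBps:.1f} MBps")
    print(f"Streams deliverable without interruption: {max_streams}")

Once the per-stream rate and the usable delivery budget are fixed, the guaranteed stream count follows directly.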

What is changing the complexion of the server environment is how architectures will be expanded with an aim toward improving data transfer rates between devices other than conventional video. Changes in these devices include interfacing to nearline archives, compressed media devices, wide area networks and now the Internet.

FIBRE CHANNEL

Fibre Channel is now universally accepted as the transport between large arrays of videodisk storage. Today, however, it is not the universal method for moving data from disk drive arrays to other mediums, such as archives, WANs or the Internet. Fibre Channel, while a magnificent improvement over conventional parallel SCSI interfaces, is merely the transport that carries the data in what is still a mostly SCSI-based protocol environment.

Even though SCSI controllers and Fibre Channel host interfaces have broken down several barriers for data transfers, in some video servers the SCSI protocol still dominates the data flow internally and hence becomes a limiting factor in data transfer to other devices.

In multiple server systems consisting of library, ingest, air and protection (mirrored) servers, the interface to the archive system can vary in performance depending upon the current development of that video server’s architecture. For example, in the course of the broadcast day, archived media is routinely transferred from the tape archive to one or more servers. When the server architecture is SCSI-protocol based, the connection between tape and server is probably a one-to-one or "direct" connection. Data moves from archive tape, via a dedicated SCSI port on the tape drive, directly to one and only one server’s SCSI port.

To get the protection data from the air server to the protect (mirror) server, a Fibre Channel transfer between servers must be initiated. Some servers allow this transfer to begin shortly – within a few seconds – after the initial tape-to-server transfer begins. Other servers may require the entire transfer to complete before the Fibre Channel transfer begins: essentially two distinct serial transfers, sometimes with two different transfer rates (one from tape to server, and another from server to server).
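
The difference between the two behaviors is easy to quantify. A sketch with assumed file size, rates and overlap lag:

    # Serial vs. overlapped tape-to-server and server-to-mirror transfers.
    # File size, rates and the overlap lag are assumptions.
    file_MB = 10800
    tape_to_server_MBps = 15
    server_to_mirror_MBps = 25
    overlap_lag_seconds = 5      # mirror copy starts a few seconds in

    t1 = file_MB / tape_to_server_MBps
    t2 = file_MB / server_to_mirror_MBps
    serial_total = t1 + t2                                # second copy waits
    overlapped_total = max(t1, overlap_lag_seconds + t2)  # copies run together
    print(f"Serial: {serial_total / 60:.1f} min")
    print(f"Overlapped: {overlapped_total / 60:.1f} min")

With these assumed numbers, overlapping the two copies brings the protected material to air readiness in 12 minutes rather than 19.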

CHANGING PLAYING FIELD

The playing field, however, is changing. There are newer concepts in tape-to-server transfers in the works and in operation. These new solutions involve intermediary devices, called "gateways," that essentially buffer higher-speed tape data and in turn distribute the data in a one-to-many server configuration. The gateway is controlled via instructions received from automation and routed to the archive manager’s gateway server.

The gateway is essentially a server-like workstation, with sufficient bandwidth and disk cache to manage the instructions from the automation system and handle the higher data rates coming direct from tape devices. The gateway prioritizes and distributes data based upon the overall system requirements. Gateways eliminate the need to place supplemental software (DLLs) on the video server, relieving the video server of certain CPU processing functions and in turn opening up bandwidth between external data devices and tape drives.

Although data tape functions are application-specific and these drives move data exceptionally fast, they are still highly mechanical, linear devices that basically get a single chance to correctly output or input data. If they miss the mark, they must be commanded to stop, recue and redo the operation.

In the simplest terms, placing a disk-based cache between the tape and the external video server destinations reduces system overhead and allows misread data to be corrected rapidly while the tape drive carries on with other tasks.
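
In sketch form, the point is that one linear tape pass fills the cache, and every server delivery – including any retry – is satisfied from disk rather than by recuing the tape. The function and clip names below are hypothetical:

    # Hypothetical gateway flow: one tape read staged to a disk cache,
    # many server deliveries (and any retries) served from that cache.
    cache = {}

    def stage_from_tape(clip_id):
        """A single linear pass from the tape drive into the cache."""
        cache[clip_id] = f"<data for {clip_id}>"   # stands in for the transfer

    def deliver(clip_id, server):
        """Deliver from cache; a retry never touches the tape drive."""
        payload = cache[clip_id]                   # served from disk, not tape
        print(f"sent {clip_id} to {server}: {payload}")

    stage_from_tape("news_open")
    for server in ("air", "protect", "ingest"):    # one-to-many distribution
        deliver("news_open", server)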

PLATFORM TO PLATFORM

Getting data from the gateway cache to the servers is another of those architecture-dependent solutions that vary from platform to platform. One method might employ a distributed architecture whereby multiple gigabit Ethernet connections run direct from the archive gateway to every one of the disk storage arrays (or nodes) on the server system. Under this model, the aggregate bandwidth is opened up and the gateway can perform more as it was intended. Notice the gigabit Ethernet – a trend surfacing in video servers only a few years after it took hold in computer networking.
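
With assumed figures, the aggregate-bandwidth arithmetic shows why this model helps:

    # Aggregate bandwidth of per-node gigabit Ethernet links from the gateway.
    # Node count, usable link rate and tape rate are all assumptions.
    nodes = 4
    usable_gbe_MBps = 100        # assumed usable payload per gigabit link
    tape_rate_MBps = 24

    aggregate_MBps = nodes * usable_gbe_MBps
    concurrent_drives = aggregate_MBps // tape_rate_MBps
    print(f"Aggregate to storage nodes: {aggregate_MBps} MBps")
    print(f"Tape drives serviceable concurrently: {concurrent_drives:.0f}")

Four nodes, each on its own link, give the gateway several times the bandwidth of any single tape drive – the headroom it needs to buffer and distribute in parallel.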

Keep in perspective that server manufacturers’ internal architectures vary, and what may be possible for the brand-x server may not be possible in brand-y. In planning for a high-volume environment with a heavy duty cycle – such as one where program material and interstitials are both stored on the same server – it is important to understand the total solution, taking into account all the elements: servers, tape drives, archive manager and automation.

In large- or small-scale systems, each potential operating environment needs to be engineered. When media content constantly varies in length, performance is affected by where the data sits in the library, what data is stored on which tape and the various software paths from source to destination. This is the concept of data management – one of the two remaining factors to be explored as we continue next month in this series on the least-understood elements of enterprise video server solutions.

Karl Paulsen

Karl Paulsen is the CTO for Diversified, the global leader in media-related technologies, innovations and systems integration. Karl provides subject matter expertise and innovative visionary futures related to advanced networking and IP technologies, workflow design and assessment, media asset management, and storage technologies. Karl is a SMPTE Life Fellow, an SBE Life Member and Certified Professional Broadcast Engineer, and the author of hundreds of articles focused on industry advances in cloud, storage, workflow, and media technologies. For over 25 years he has continually featured topics in TV Tech magazine – penning the magazine’s Storage and Media Technologies and its Cloudspotter’s Journal columns.