Storage management

As videotape gradually disappears from the broadcast plant, digital storage becomes more important. In most cases, this is a combination of spinning disk and removable media, either data tape or DVD. With that comes the management of material as it moves from video server, to nearline, to archive and back again. The software products that perform this function are generally called hierarchical storage management (HSM) systems. However, this is actually a misnomer, as HSM refers to a specific type of storage management rather than a product category.

Storage management actually consists of four broad subcategories of products, each of which has advantages and disadvantages as it applies to on-air, production and news environments. The subcategories are shared file systems, HSM systems, disk extenders and data movers.

Shared file systems

Shared file systems were developed in response to the introduction of storage area network (SAN) systems into a heterogeneous computer environment. In the IT world, SANs allow disparate operating systems to share files seamlessly, enabling common storage to be used for multiple applications and computer types. This does not necessarily translate to a broadcast workflow, where video file formats and real-time processing require a closed environment. Shared file systems are more reasonable to implement in either a homogeneous (i.e., single-vendor) environment or a production environment with less of a real-time requirement.

Shared file systems offer several advantages over discrete storage. First, they allow file sharing between different applications. This is particularly important in a production environment where one set of applications may be used for effects and another set for finishing. Rather than cutting a tape to move material, the material is passed as a file.

Second, bandwidth is greatly enhanced for the applications sharing files. Instead of moving files across a network, the material appears as direct attached storage to the application.

Third, redundant file storage is eliminated, because each file is stored just once. This is often useful in multichannel facilities that share spots or program material between networks, where two servers may need the same file at different times.

Finally, file (i.e. media) management is greatly simplified when the material is stored in one place. If material needs to be purged, there is one place (possibly two if a backup is employed) to go to administer this function.

Shared file systems are no panacea, however. There are no common file-sharing constructs within broadcast, although MXF holds much promise. MXF is a great tool, but it is only a descriptive wrapper and does not guarantee common essence implementations between applications. Assuming these hurdles can and will be overcome, a secondary issue is control. Sharing files raises the question of who moves material from production to on-air and under what conditions.

HSM systems

HSM systems are designed to migrate material between different levels of storage based on rules defined by the user. The system presents storage to the application as a single unit, but it is, in fact, a hierarchy that moves from high-performance, high-cost media to progressively less expensive and more remote media. The HSM software tracks files wherever they are in the hierarchy. (See Figure 1.)

The most common migration rule in the IT world is called “least used.” As the name implies, the files that have been used the least, or not at all, in a user-set time period are candidates for migration to a lower level of storage. Reverse migration, which moves files back up the hierarchy, may also be applied.
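As a rough illustration of the least-used rule, the sketch below picks migration candidates by last access time. The `StoredFile` record, the file names and the 30-day idle period are assumptions made up for the example, not part of any particular HSM product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredFile:
    name: str
    last_access: datetime  # last time any application read the file


def least_used_candidates(files, now, idle_period):
    """Return files not accessed within idle_period, longest-idle first."""
    cutoff = now - idle_period
    idle = [f for f in files if f.last_access < cutoff]
    return sorted(idle, key=lambda f: f.last_access)


# Hypothetical inventory on the top storage level.
now = datetime(2005, 6, 1, 12, 0)
files = [
    StoredFile("promo_a.mxf", now - timedelta(days=45)),
    StoredFile("news_pkg.mxf", now - timedelta(days=2)),
    StoredFile("spot_b.mxf", now - timedelta(days=90)),
]

candidates = least_used_candidates(files, now, timedelta(days=30))
print([f.name for f in candidates])  # ['spot_b.mxf', 'promo_a.mxf']
```

Note that the rule looks only backward at past use, which is the limitation for on-air operations.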

In business, information becomes less relevant over time. And to a certain extent, HSM storage mirrors news operations in the relevancy of stories over time. However, it has limited relation to on-air operations, because previous use of a program or spot may or may not have any relevancy to when that material will be needed again.

If we look at the on-air video server as the top of the hierarchy and removable media as the bottom, a broadcaster does not want the spot scheduled to play in five minutes migrated to removable media. The movement of media needs to be more explicit.
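One way to picture the difference is a check that looks forward at the playlist rather than backward at past use. The sketch below is only an assumption about how such a guard might look; `safe_to_migrate`, the 24-hour protection window and the times are hypothetical.

```python
from datetime import datetime, timedelta


def safe_to_migrate(next_air_time, now, protect_window):
    """True if the clip has no scheduled air time inside the protection window.

    next_air_time is the clip's next scheduled play time, or None if the
    clip is not on any upcoming playlist (hypothetical convention).
    """
    if next_air_time is None:
        return True
    return next_air_time - now > protect_window


now = datetime(2005, 6, 1, 18, 0)
window = timedelta(hours=24)  # assumed: keep anything airing within a day

print(safe_to_migrate(now + timedelta(minutes=5), now, window))  # False
print(safe_to_migrate(now + timedelta(days=3), now, window))     # True
print(safe_to_migrate(None, now, window))                        # True
```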

Disk extenders

Disk extenders operate under the premise of making removable media look like a part of the spinning disk system. A disk subsystem in the front end provides the first-level cache. The disk extender keeps a “stub” of the file on disk and moves the majority of the file off to cheaper removable media. To applications, the robotic system appears as a drive letter with all files stored on the same disk. When an application requests a file, the data blocks on removable media are restored to the cache and then transferred to the application. Figure 2 illustrates the data flow.

The migration associated with a disk extender product is initiated by a high watermark: when the disk cache fills past a set threshold, files are moved off to removable media. The concept is similar to HSM; the difference is in implementation. HSM systems are usually used in self-contained systems, whereas disk extenders often integrate with third-party applications.
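A minimal sketch of watermark-driven migration follows. It assumes the extender also has a low watermark at which migration stops, and that it picks the least recently accessed files first; both are illustrative assumptions, as are the function name and the thresholds.

```python
def files_to_stub(cache, capacity_gb, high=0.9, low=0.7):
    """Pick files to move to removable media once the cache passes `high`.

    cache maps file name -> (size_gb, last_access), where last_access is
    any sortable timestamp. Returns names to stub, oldest access first,
    until usage falls to the assumed `low` watermark.
    """
    used = sum(size for size, _ in cache.values())
    if used <= high * capacity_gb:
        return []  # high watermark not reached, nothing to do

    victims = []
    by_age = sorted(cache.items(), key=lambda item: item[1][1])
    for name, (size, _) in by_age:
        if used <= low * capacity_gb:
            break
        victims.append(name)
        used -= size
    return victims


# Hypothetical 100 GB cache holding 95 GB of clips.
cache = {"a.mxf": (40, 1), "b.mxf": (30, 2), "c.mxf": (25, 3)}
print(files_to_stub(cache, capacity_gb=100))  # ['a.mxf']
```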

The biggest problem with disk extenders is that they treat the removable media robot as a giant disk, which means that media cannot be removed from the library. It also means that if a video file is larger than a single piece of media, it cannot span to a second piece of media and, therefore, cannot be archived.

Another drawback is latency on restoring video clips. With HSM or data mover applications, migrated material is restored directly back to the target. With disk extenders, however, material is migrated from the disk cache to removable media, so the restore path is back to the disk cache instead of back to the target requested by the application. This means that the video file must be restored to the disk cache before being transferred back to the local storage on the application server (on-air or editing server).

Data movers

Data movers operate as explicit migration tools. In other words, data movers wait to move files from one level to another until they are told to do so by a controlling application. Controlling applications include automation, editing and asset management. (See Figure 3.)

Explicit migration of video files is in line with common operations within a broadcast plant that tend to be synchronous. Whether the controlling application is automation or news editing, the application knows what material is needed where and when, and can control where the file needs to be.

A typical operation on the part of the controlling application would be to move a video clip from a video server to an archive through the data mover API. File movement can be between different levels of storage or storage locations of the same type.
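The sketch below shows the general shape of such a call. `DataMover` and its `copy`, `delete` and `move` methods are hypothetical names invented for illustration, not any vendor's actual API; the class only tracks which storage holds each clip.

```python
from dataclasses import dataclass, field


@dataclass
class DataMover:
    """Hypothetical data mover: records which storage holds each clip."""
    locations: dict = field(default_factory=dict)  # clip id -> set of storage names

    def copy(self, clip_id, source, target):
        holders = self.locations.get(clip_id, set())
        if source not in holders:
            raise LookupError(f"{clip_id} is not on {source}")
        holders.add(target)

    def delete(self, clip_id, storage):
        self.locations[clip_id].discard(storage)

    def move(self, clip_id, source, target):
        # A move is a copy followed by removal from the source.
        self.copy(clip_id, source, target)
        self.delete(clip_id, source)


mover = DataMover()
mover.locations["SPOT1234"] = {"video_server"}

# The controlling application (e.g. automation) archives the clip...
mover.move("SPOT1234", "video_server", "archive")
# ...and later restores a copy ahead of its next air time.
mover.copy("SPOT1234", "archive", "video_server")
print(sorted(mover.locations["SPOT1234"]))  # ['archive', 'video_server']
```

Nothing moves unless the controlling application asks for it, which is the defining trait of explicit migration.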

The advantages of data movers are also their disadvantages. The movement of data is explicit, and data movers have no inherent intelligence for moving data to the correct location. For that reason, data mover providers are incorporating more HSM-type functionality into their products.

A standard feature of data movers is the ability to offer multiple levels of storage as a single entity to the application, with rule-based migration. However, migration rules for broadcast go beyond the concept of least used. They can include migration based on time as well as type of file, and targets that include levels of storage and other equipment, such as transcoders.
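To make the idea concrete, here is a small sketch of rules keyed on file type and age, with targets that are either storage levels or equipment. The `Rule` fields, the file types, the ages and the first-match ordering are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Rule:
    file_type: str      # e.g. "news", "program" (hypothetical categories)
    min_age: timedelta  # time since ingest before the rule applies
    target: str         # a storage level or a device such as a transcoder


def pick_target(file_type, age, rules):
    """Return the target of the first matching rule, or None if none match."""
    for rule in rules:
        if rule.file_type == file_type and age >= rule.min_age:
            return rule.target
    return None


# Rules are checked in order, so the longest age for a type comes first.
rules = [
    Rule("news", timedelta(days=30), "archive"),
    Rule("news", timedelta(days=7), "nearline"),
    Rule("program", timedelta(days=1), "transcoder"),
]

print(pick_target("news", timedelta(days=45), rules))    # archive
print(pick_target("news", timedelta(days=10), rules))    # nearline
print(pick_target("program", timedelta(days=2), rules))  # transcoder
print(pick_target("spot", timedelta(days=100), rules))   # None
```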

One other point to note about data movers is that they are specific to their industries, in this case broadcast video. Data movers work well with the automation, video server and editing systems in the market, which have specific interfaces. Broadcasters need the storage software to work in their environment, and they need vendors that are responsive to their needs. This means that the installed base numbers in the hundreds rather than the thousands, which makes the data mover application more expensive because the development and marketing costs are spread over a smaller customer base.

In the future, the standard architecture will be a data mover/HSM hybrid offering the best elements of both approaches. And data mover providers have made significant progress in providing the migration tools needed to effectively manage storage. But there is more to be done. Common file formats will help make shared storage for all departments easier to implement.

Backing up

Disturbingly, backup strategies are rarely considered from a data management standpoint within the broadcast community. Broadcasters' healthy paranoia, which causes them to duplicate almost everything, has not been applied to the media files themselves. Inevitably, it will take someone losing everything before the industry as a whole recognizes this as an issue and demands better tools.

Steve Atkinson is vice president of sales for SGL.