Storage performance considerations

Access to stored content is a function of many technology variables. Disk access speed alone does not determine the rate of file transfers. Server performance and network topology also influence the speed at which files move through the production workflow.

Regardless of whether production supports single- or multiplatform distribution, storage system performance issues have a large impact on the workflow. Hierarchical storage levels are conceptual, can be classified in a variety of tiers and broadly divided into online, offline and disk- and tape-based systems.

Each conceptual category employs a different storage technology, but all require an interface device to manage network communication and file transfer. Each computer/storage platform is a media server. Groups of servers are aggregated into tiers, and tiers, with the associated software, constitute a media asset management (MAM) system.

MAM specifications include the number of concurrent reads/writes, definable write/read priorities, partial file restore capability, wrapper support such as MXF, custom metadata and offline storage location/ID.

Bandwidth issues

Disk performance is measured in bytes per second, while network data rates are specified in bits per second; when analyzing overall system performance, convert between the two by dividing the bit rate by eight.
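As a minimal illustration of that conversion (the function name is my own, not from any standard library):

```python
def mbps_to_mbytes(megabits_per_second: float) -> float:
    """Convert a network data rate in Mb/s to a disk rate in MB/s."""
    return megabits_per_second / 8

# A 100Mb/s video stream requires 12.5MB/s of sustained disk throughput.
print(mbps_to_mbytes(100))  # 12.5
```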

File compression formats have a direct impact on storage requirements. Files compressed at 100Mb/s occupy four times as many bytes as files compressed at 25Mb/s. Additionally, 100Mb/s files consume four times the interface bandwidth. Therefore, selection of a compression format greatly affects the requirements of a storage system.
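The effect on capacity planning can be sketched with a quick per-hour calculation (decimal units assumed throughout):

```python
def gigabytes_per_hour(megabits_per_second: float) -> float:
    """Storage consumed by one hour of content at the given bit rate."""
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return bytes_per_second * 3600 / 1_000_000_000

print(gigabytes_per_hour(25))   # 11.25 GB per hour
print(gigabytes_per_hour(100))  # 45.0 GB per hour -- four times as much
```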

Performance also varies with the storage interface technology. For example, Fibre Channel and iSCSI are two popular high-performance storage interface technologies. Comparison of the two reveals that although Fibre Channel outperforms iSCSI today, the reduced cost of iSCSI may warrant its use for a given workflow.

Similarly, a comparison of the performance of three types of data tapes — SDLT 600, LTO3 and SAIT-1 — used in robotic systems reveals subtle performance issues. Capacity, transfer rate, file access time and media load time, with cost considered, must be balanced. Overall performance metrics must be carefully analyzed in order to predict real-world system suitability and performance.
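One way to balance those tape metrics is to estimate total restore time as load time plus access time plus transfer time. The figures below are purely illustrative, not vendor specifications:

```python
def restore_time_s(file_gb: float, load_s: float,
                   access_s: float, mb_per_s: float) -> float:
    """Total seconds to restore a file from data tape:
    cartridge load + file access (seek) + data transfer."""
    return load_s + access_s + file_gb * 1_000 / mb_per_s

# Hypothetical drive: 60s load, 70s average access, 80MB/s transfer.
print(restore_time_s(50, 60, 70, 80))  # 755.0 seconds for a 50GB file
```

Note that for small files the fixed load and access times dominate, while for large files the transfer rate dominates; this is why no single metric predicts real-world suitability.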

High-performance storage

High Performance High Availability (HPHA) storage systems provide great performance, but with the trade-off of a much higher cost. Even so, two production processes where HPHA may be especially applicable are video compositing and graphics animation. The speed of both of these rendering processes is extremely dependent on storage throughput.

HPHA bandwidth can approach sustained data rates of 5Gb/s to 10Gb/s. This often requires multiple 4Gb/s Fibre Channel drives with high-speed Ethernet interfaces. Such arrays permit compositing to be done at faster-than-real-time speeds, and animation rendering times can be dramatically shortened.

Before committing to the purchase of HPHA technology, however, you need to first determine if the application justifies the expense. If the cost of labor in person-hours saved exceeds the price of the storage, then HPHA may be a good solution.
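That break-even test can be reduced to simple arithmetic. The dollar figures below are hypothetical placeholders, not quoted prices:

```python
def hpha_breakeven_hours(storage_cost: float, hourly_labor_rate: float) -> float:
    """Person-hours of rendering/compositing time that must be saved
    before an HPHA array pays for itself."""
    return storage_cost / hourly_labor_rate

# Hypothetical figures: a $120,000 array against $75/hour labor.
print(hpha_breakeven_hours(120_000, 75))  # 1600.0 hours
```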

Online storage

Throughput performance requirements for online storage applications are often less demanding. In order to sustain real-time ingest or playout of uncompressed HD content, disk read/write capabilities must surpass 187.5MB/s. Compressed 100Mb/s HD is less demanding, but still requires 12.5MB/s throughput.

Today, an individual drive cannot attain the 187.5MB/s needed for uncompressed HD. By using parallel read/write techniques, however, it is possible to achieve this bandwidth. In these cases, a single disk need only maintain a 12.5MB/s read/write rate.
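The required degree of parallelism follows directly from the two rates; a quick sketch:

```python
import math

def drives_needed(required_mbytes_per_s: float,
                  per_drive_mbytes_per_s: float) -> int:
    """Number of parallel drives needed to sustain a target throughput."""
    return math.ceil(required_mbytes_per_s / per_drive_mbytes_per_s)

# Uncompressed HD at 187.5MB/s striped across drives that each
# sustain only 12.5MB/s:
print(drives_needed(187.5, 12.5))  # 15
```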

Multichannel audio further increases required array performance. An AES3 uncompressed audio signal adds 1.152Mb/s, or 144KB/s per channel, to the required storage bandwidth.
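The per-channel figure follows from the sample rate and word length; a sketch assuming the 48kHz, 24-bit payload the article's numbers imply:

```python
def aes3_channel_rate(sample_rate_hz: int = 48_000,
                      bits_per_sample: int = 24) -> tuple:
    """Per-channel audio payload bandwidth as (Mb/s, KB/s)."""
    bits_per_second = sample_rate_hz * bits_per_sample
    return bits_per_second / 1_000_000, bits_per_second / 8 / 1_000

mbps, kbps = aes3_channel_rate()
print(mbps, kbps)  # 1.152 Mb/s, 144.0 KB/s per channel
```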

This means that it may be wise to provide some system overhead by using 200MB/s-capable arrays for uncompressed HD and 15MB/s arrays for 100Mb/s video.

Where tape fits in

It is interesting to note that the early dream of the total elimination of tape has not come to fruition. In some scenarios, workflows that use some form of tape storage, whether data tape for long-term archives or the industry-workhorse D-5 VTR for uncompressed surround audio and HD video, are still the most cost-effective.

Broadcasters should evaluate the need for tape-based storage of uncompressed content before being too quick to eliminate it. Notably, there are repurposing scenarios where tape storage will enable content, especially HD, to maintain the highest attainable quality when repurposed.

Perhaps rerunning classic sporting events in 4K format movie theaters will become a fad in 10 years. It would be unacceptable to discover that HD events were compressed to minimal bit rates and that transcoding to 4K produced annoying artifacts. Recent surveys indicate that viewers will seek content elsewhere if video content is plagued with artifacts. This is why keeping a copy in the original format on tape may prove to be a wise precaution.

Another concern is legacy material stored on tape. Unless a project is undertaken to digitize and ingest all content in the current tape library, many historic events and important programs will exist only on tape.

Data integrity and backup

To assure data integrity, content must exist in duplicate copies in diverse geographic areas. Yet having to fall back on sneakernet or van-shuttle to retrieve content after a disaster is not an acceptable scenario; it would defeat the purpose of a tapeless, file-based production workflow.

This leads to the need for data integrity measures. Two techniques are commonly used: data redundancy and error correction and concealment.

Disk mirroring is the act of concurrently writing files (and other information) to a live, online disk and an identical backup disk. In principle, if the online disk fails, the system will transparently switch over to using the mirrored disk.
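The mirroring principle can be sketched in a few lines. This is a toy file-level model, not how a real RAID-1 controller operates at the block level; the function names are my own:

```python
import tempfile
from pathlib import Path

def mirrored_write(data: bytes, primary: Path, mirror: Path) -> None:
    """Write the same payload to the online volume and its mirror."""
    primary.write_bytes(data)
    mirror.write_bytes(data)

def resilient_read(primary: Path, mirror: Path) -> bytes:
    """Read from the online volume; fall back to the mirror on failure."""
    try:
        return primary.read_bytes()
    except OSError:
        return mirror.read_bytes()

# Demo on a scratch directory:
with tempfile.TemporaryDirectory() as d:
    online, backup = Path(d) / "online.mxf", Path(d) / "mirror.mxf"
    mirrored_write(b"essence", online, backup)
    online.unlink()                    # simulate an online-disk failure
    print(resilient_read(online, backup))  # b'essence'
```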

Another approach to maintaining data integrity is to append error correction and concealment data to the file. Protection methodologies can verify file integrity or correct bad data; however, the addition of error detection and correction data makes the file larger.
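A minimal sketch of the detection half of this idea, using a CRC-32 appended to the payload (real archive systems use stronger forward-error-correction codes that can also repair data; a CRC only detects corruption):

```python
import zlib

def protect(payload: bytes) -> bytes:
    """Append a CRC-32 so later corruption can be detected.
    Note the protected file is 4 bytes larger than the original."""
    crc = zlib.crc32(payload).to_bytes(4, "big")
    return payload + crc

def verify(protected: bytes) -> bool:
    """Recompute the CRC-32 and compare it to the stored value."""
    payload, crc = protected[:-4], protected[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc

blob = protect(b"media essence ...")
print(verify(blob))  # True
corrupted = bytes([blob[0] ^ 0xFF]) + blob[1:]
print(verify(corrupted))  # False -- the flipped byte is detected
```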

Multiplatform program distribution

Unless unlimited funds are available for infinite storage, decisions will have to be made as to what content should be kept on what medium and what content can safely be discarded. Can events be edited down to key action? Does every minute of a three-hour sporting event need to be stored, or will just the game action be sufficient for future repurposing?

Cutting the dead time between plays in a football game reduces three hours of coverage to roughly 45 minutes of actual action that can be stored on online and nearline servers. Yet to tell the story compellingly, other moments should be retained as well. The same is true for a news event.
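The storage payoff of that editorial decision is easy to quantify; a sketch assuming 100Mb/s content and decimal units:

```python
def archive_gigabytes(minutes: float, megabits_per_second: float) -> float:
    """Storage consumed by a program segment at a given bit rate."""
    return minutes * 60 * megabits_per_second / 8 / 1_000

# A three-hour game at 100Mb/s versus the 45 minutes of actual action:
print(archive_gigabytes(180, 100))  # 135.0 GB for the full game
print(archive_gigabytes(45, 100))   # 33.75 GB for the edited action
```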

As challenging as it is to design a storage system for single-platform production and distribution, when multiplatform distribution is a system requirement, the complexity increases exponentially. Workflow and infrastructure have a symbiotic relationship and must work as one unit. Selection of a storage technology and the design of a MAM system is as much a business decision as a workflow consideration. Enough dramatic content must be available in a timely manner to support production that tells a compelling story on any distribution platform, and appropriate data recovery capabilities must not be compromised.