Storage for parallel production

Hierarchical storage technology has been implemented in many file-based production infrastructures. This storage technology accommodates single platform production workflows, but modifications to file migration policies may be necessary to efficiently support parallel production workflows and multiplatform distribution.

As production has transitioned to file-based nonlinear workflows, a hierarchical topology is frequently used in storage system design. The design philosophy espouses using tiers of different storage technologies and media based on content access frequency. The tiers are roughly divided into online, near-line, offline and archive, with decreasing cost traded for longer access and transfer times.

Rules-based file prioritization determines where content is stored; file transfers are automated and occur transparently to the user. Content that is frequently used is kept in relatively expensive, high-performance, high-availability (HPHA) disk storage. As content is accessed less frequently, it is moved to less expensive, but slower, near-line and offline storage locations. Eventually, it will reside in a content archive.
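A rules-based policy of this kind can be sketched in a few lines of Python. This is only an illustration: the thresholds below (days since last access) are assumptions chosen for the example, and a real hierarchical storage manager would apply site-specific rules.

```python
# Hypothetical thresholds: maximum days since last access for each tier.
# Real policies are site-specific; these values are only for illustration.
TIER_RULES = [
    (7, "online"),
    (30, "near-line"),
    (180, "offline"),
]

def select_tier(days_since_access: int) -> str:
    """Return the tier for content last accessed the given number of days ago."""
    for max_days, tier in TIER_RULES:
        if days_since_access <= max_days:
            return tier
    # Anything older than the last threshold resides in the archive.
    return "archive"

for days in (0, 10, 90, 400):
    print(days, select_tier(days))
```

Keeping the rules in a table rather than in nested conditionals makes it easier to adjust the thresholds as production needs change.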

Parallel production

Hierarchical storage methodologies work well when producing content for single platform distribution. Content is created, ingested and edited, and graphics are added. The show is assembled in a program control room and commercials, bugs, crawls and closed-captioning are inserted in the master control room. Transmission compresses and airs the program. In most cases, only key clips and some graphics elements need to be saved for future use, and the entire program can be moved to an archive. Valuable online and near-line storage capacity is freed for current production needs.

With content repurposing, however, linear migration of content through the hierarchical storage topology does not always support the most efficient use of resources. Workflows that support repurposing content, both now and in the future, must be carefully considered. No one can foresee what type of content delivered over which channel will generate revenue in the future. It is not always clear exactly what can be moved to offline and archive storage and what will be needed online to produce programming for additional platforms.

Multiplatform distribution production issues

To simplify system analysis and design, storage can be divided into two categories: production or finished program. Content can be divided into audio, video, graphics and data.

Various production, assembly and distribution scenarios can be created to handle each of these content categories, but hierarchical storage rules may have to be modified to address access needs. In fact, content migration between the tiers may have to be done manually. At times, it may be most efficient to move content directly from online to archive. Editorial decisions ultimately determine future access priorities and where in the storage hierarchy content will be kept.
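One way to picture a modified rule set is a migration function that steps content down one tier by default but accepts an editorial override that can skip tiers. The sketch below is hypothetical: the function name, the `editorial_target` parameter and the restriction to downward moves are assumptions made for the example, not features of any particular storage manager.

```python
from typing import Optional

TIERS = ["online", "near-line", "offline", "archive"]

def next_tier(current: str, editorial_target: Optional[str] = None) -> str:
    """Return the tier that content should occupy after the next migration.

    Without an editorial decision, content steps down one tier at a time.
    An editorial target may hold content in place or skip tiers
    (for example, online directly to archive).
    """
    position = TIERS.index(current)
    if editorial_target is not None:
        target_position = TIERS.index(editorial_target)
        if target_position < position:
            # Recalling content to a faster tier is outside this sketch.
            raise ValueError("this sketch only models downward migration")
        return editorial_target
    # Default rule: one step down, remaining in archive once there.
    return TIERS[min(position + 1, len(TIERS) - 1)]

print(next_tier("online"))             # default step: near-line
print(next_tier("online", "archive"))  # editorial override skips two tiers
```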

A hypothetical multiplatform workflow

Considering a hypothetical workflow will shed light on the design process. First, basic content formats for each distribution channel need to be established. In this case, the house DTV format will be 1080i. Content repurposed for the Internet will be produced and delivered as quarter screen 320 x 240 at 60 progressive fps; this pixel grid format enables the user to scale content and maintain acceptable video quality in variable size windows. Cell phone content will use the QCIF format, 176 x 144 at 30fps.

Consider the parallel production workflow and storage requirements for a multiplatform package of sporting events. Source content consists of 10 events that are three hours each. Two-minute segments of highlights from each game will be produced for a news show. The full games will be available for a week streaming over the Web beginning the day after the game. Two days after game day, a 15-minute version of each game will be available on on-demand cable systems. The games will also be simulcast to cell phone users in their local DMAs.

The parallel production workflows can be divided into platform-specific requirements. Each of the 10 broadcast games will be produced on site, packaged at the network operations center (NOC) and ingested in its entirety.

Cell phone distribution only requires transcoding and platform-appropriate handling of graphics and commercials.

Production will support three packages for each of the 10 games: two-minute highlight segments, 15-minute on-demand cable packages and Web packages. One production workflow will produce the two-minute highlight packages from the raw material on game day. A parallel production workflow will produce the 15-minute versions for on-demand cable. This requires the full game content to be stored for up to two days until the on-demand versions are ready to air. After this, the native-resolution games can be archived.

Web streaming will use the low-resolution proxies created by the asset management system at ingest. Production will create additional Web content such as stats, descriptions and links. The finished Web versions will be stored for one week on Web servers, after which they can be moved to an archive.
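The retention periods in this workflow can be turned into a simple schedule. The sketch below takes the two-day hold for native games and the one-week Web window (starting the day after the game) from the description above; the function name, dictionary keys and example date are illustrative assumptions.

```python
from datetime import date, timedelta

# Retention periods taken from the workflow described above.
# The dictionary keys are illustrative labels, not names from any real MAM system.
NATIVE_HOLD_DAYS = 2        # full games stay until the on-demand cut is ready
WEB_START_OFFSET_DAYS = 1   # Web streaming begins the day after the game
WEB_WINDOW_DAYS = 7         # Web versions are available for one week

def archive_eligibility(game_day: date) -> dict:
    """Return the first date on which each asset could be moved to archive."""
    return {
        "native_full_game": game_day + timedelta(days=NATIVE_HOLD_DAYS),
        "web_package": game_day
        + timedelta(days=WEB_START_OFFSET_DAYS + WEB_WINDOW_DAYS),
    }

# Example with an arbitrary game day.
for asset, eligible_on in archive_eligibility(date(2024, 9, 1)).items():
    print(asset, eligible_on.isoformat())
```

Because the two assets become eligible on different days, a single age-based rule applied to the whole game would either archive the Web package too early or hold the native game too long.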

Storage requirements are dynamic. The design philosophy is to transfer content downward in the storage hierarchy until it is archived. But if tiers are skipped to enable the most efficient use of storage with respect to production workflows, then content/file migration will no longer be truly hierarchical.

Storage requirements

Games are broadcast in 1080i but are backhauled from the site as 100Mb/s MPEG-2 compressed video. Eight tracks of audio compressed as Dolby E are carried as an AES pair. The three hour per-game audio and video storage requirement is 135GB for video and a generous 1GB for audio. The total storage requirement for all five games the network has rights to is 681GB.

The other five games are taken off the air. After demodulation and decoding, the native format can be considered the MPEG-2 transport stream data rate. This maxes out at just less than 20Mb/s, or about 27GB per three-hour game. For five games, 135GB are required, which includes both audio and video.
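The per-game figures follow directly from data rate multiplied by duration. A minimal sketch of the arithmetic, assuming constant bit rates, decimal units (1GB = 1000MB) and the rounded inputs quoted in the text:

```python
def storage_gb(rate_mbps: float, hours: float) -> float:
    """Storage in decimal gigabytes for a constant-rate stream.

    rate_mbps is in megabits per second; 8 bits per byte, 1000MB per GB.
    """
    seconds = hours * 3600
    return rate_mbps / 8 * seconds / 1000

GAME_HOURS = 3
AUDIO_GB_PER_GAME = 1  # the generous audio allowance from the text

backhaul_video_gb = storage_gb(100, GAME_HOURS)  # 135.0
off_air_gb = storage_gb(20, GAME_HOURS)          # 27.0, audio and video together

backhaul_total_gb = 5 * (backhaul_video_gb + AUDIO_GB_PER_GAME)
off_air_total_gb = 5 * off_air_gb

print(backhaul_video_gb, off_air_gb, backhaul_total_gb, off_air_total_gb)
```

With these rounded inputs, the five backhauled games come to 680GB, within a gigabyte of the total quoted above, and the five off-air games come to 135GB.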

All 816GB of game content (681GB backhauled plus 135GB off-air) will have to remain in online storage for two days of post production for on-demand highlights and Web distribution.

At ingest, the media asset management (MAM) system produces 768Kb/s MPEG-4 video proxies. These video proxies equate to a data rate of 96KB/s and about 1.04GB of storage for a three-hour game. At three hours per game, 30 hours of raw video will be captured, which will require slightly less than 10.4GB of total proxy storage.

These proxies will be used for Web streaming and must be stored for a week. Because locating content is dependent on proxies, it may be a good idea to back up proxies by implementing mirrored storage; this doubles the requirement to about 20.7GB.

The 15-minute edited version of each game will be stored as an MPEG-2 transport stream. At 20Mb/s, this requires 2.25GB per game and 22.5GB for all 10 games. So, after two days, the 816GB of raw games can be migrated down the storage hierarchy, or directly to archive. Only the 22.5GB of on-demand content needs to be stored online.
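The proxy and on-demand requirements can be checked with the same rate-times-duration arithmetic. This sketch assumes constant bit rates, decimal units and a 20Mb/s rate for the on-demand transport streams (the rate implied by 2.25GB per 15 minutes); the variable names are illustrative.

```python
GAMES = 10

def proxy_gb(rate_kbps: float, hours: float) -> float:
    """Storage in decimal gigabytes for a proxy at rate_kbps (kilobits per second)."""
    kilobytes = rate_kbps / 8 * hours * 3600
    return kilobytes / 1_000_000

def on_demand_gb(rate_mbps: float, minutes: float) -> float:
    """Storage in decimal gigabytes for an edited package at rate_mbps (megabits per second)."""
    megabytes = rate_mbps / 8 * minutes * 60
    return megabytes / 1000

proxy_per_game = proxy_gb(768, 3)           # about 1.04GB for a three-hour game
proxy_total = GAMES * proxy_per_game        # about 10.4GB for 30 hours
proxy_mirrored = 2 * proxy_total            # about 20.7GB with mirroring

on_demand_per_game = on_demand_gb(20, 15)   # 2.25GB
on_demand_total = GAMES * on_demand_per_game

print(round(proxy_per_game, 3), round(proxy_total, 2), round(proxy_mirrored, 2))
print(on_demand_per_game, on_demand_total)
```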

This example illustrates the nuances that must be considered when designing a production infrastructure that can support parallel production. Distribution platforms and scheduling have an impact on the use of storage. The hierarchical model can be adapted to support these requirements.

Content management and storage topology

It would be difficult and expensive to store all of this content in each format in online storage indefinitely. The potential exists for the workflow design problem to become more complex because content production and distribution are dispersed over time. Games will go live to air, and highlights will air during the game and during pre- and post-game shows.

In the example provided here, the problem has been simplified by the fact that production is spread over two days, proxies are used for Web distribution and cell phone content can be transcoded on the fly.

Regardless, a large amount of online and near-line storage will be in use during the parallel production processes.

Storage topology must be designed to support the production workflows necessary for all distribution channels, now and in the future. As with all engineering, design is a function of requirements, available technology and available funds.