The evolution of digital storage

The broadcast industry is reaching a milestone in the evolution of digital technology. First, there were individual digital applications, usually at the high end of post production. From there, broadcasters used digital acquisition, editing and manipulation, albeit often using proprietary formats and interconnecting through SDI — a dedicated video-style interface. Now, the industry is finally recognizing digital video and audio files for what they are — data.

As a succession of ones and zeros, they are identical to any other data in any other computer application. We need to treat them as such. Think about how it would affect your work if you applied the same exclusionary attitude to all of your data. What if your spreadsheets needed to run on separate computers, networks and servers from your e-mail, or your documents in Microsoft Word needed to be decoded to raw ASCII before they could be passed on to someone else? This would clearly be unacceptable.

But it happens in most broadcast environments today. If, for example, you have a file on a Grass Valley server and you want to get it into an Avid editor, you have to decode the file to baseband video, pass it over a real-time SDI connection and then re-encode it into the new format. It is only now that the barriers are being broken down and open standards allow ready interconnection.

The purpose of this article is not to discuss those open standards and file interchanges, but to look at what happens when open transfers become possible. Once you have a facility-wide system that can exchange digital files as data, with each broadcast application able to share common content, the logical step is to provide a central store for that content.

Layered storage

The traditional IT approach to centralized storage is to create a hierarchical view of the system that includes:

  • online — expensive spinning disks with high throughput for immediate delivery of data;
  • nearline — less expensive spinning disks that can move content to the online server quickly when required;
  • archive — a tape or optical disk library system that takes content from the nearline disks when capacity is an issue; and
  • offline — shelved storage of tapes or optical media when the archive system is full.

From top to bottom, those four levels decrease in convenience and access speed but also decrease in cost. (See Figure 1.) The art of the system designer is to achieve the right capacity at each stage to meet the service level requirements at the minimum cost.
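
Purely as an illustrative model (the relative costs and access times below are assumptions, not vendor figures), the hierarchy can be thought of as an ordered list of tiers, each trading access speed against cost:

    # Illustrative model of the four storage tiers; the cost and access
    # figures are placeholder assumptions for comparison only.
    from dataclasses import dataclass

    @dataclass
    class Tier:
        name: str
        relative_cost: float   # cost per TB, normalized to online = 1.0
        typical_access: str    # how quickly content can be delivered

    HIERARCHY = [
        Tier("online",   1.00, "immediate"),
        Tier("nearline", 0.40, "seconds to minutes"),
        Tier("archive",  0.10, "minutes (robotic tape/optical library)"),
        Tier("offline",  0.02, "hours (operator retrieves shelved media)"),
    ]

    # The designer's task: size each tier so the required service levels
    # are met at the lowest total cost.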

Broadly speaking, this is a good way to visualize the overall structure of a video content storage management system. That is not to say, however, that a standard IT storage management system will meet broadcast requirements. Television has specific demands to consider.

First, some applications have higher priorities than others. Most obviously, playout has to be at the top. If the content is not available on the playout server at its scheduled time, it's not an inconvenience; it is a disaster.

Second, the storage management system has to be transparent to the users it supports. (See Figure 2.) Whether it is an editor in an Avid suite or a scheduler preparing the final playout rundown, users simply need to ask for content and be confident that it will be delivered, where they need it, when they need it. Users should not need to worry about where their content is within the central storage or about the mechanics of how it gets to where they want it.
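
Purely as an illustration of that transparency requirement (the function and catalog below are invented, not any product's API), the caller names the content and the destination, and the storage management system works out where the essence currently lives:

    # Hypothetical sketch: the user or broadcast application asks for content
    # by ID and destination; it never specifies which tier the content is on.
    CATALOG = {"news-0417": "archive", "promo-221": "online"}   # assumed example data

    def request_content(content_id: str, destination: str) -> dict:
        """Return a delivery job; the caller never sees which tier held the content."""
        tier = CATALOG.get(content_id, "offline")
        return {"content": content_id, "from": tier, "to": destination, "status": "queued"}

    job = request_content("news-0417", "edit-suite-2")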

Third, the rules concerning how and when content is moved between levels of storage are much more complicated than simply tracking how recently it has been used. For example, every broadcaster has content it needs to be able to air instantly in an emergency, yet this content is rarely used on a regular basis. Conversely, content that was used only recently may be ready for archiving (a news story that has reached its conclusion, for example).
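
A sketch of how such a rule differs from a simple least-recently-used policy (the field names and thresholds are assumptions for illustration): emergency material stays online regardless of age, while a concluded news story is archived even if it aired yesterday.

    from datetime import datetime, timedelta, timezone

    def target_tier(item: dict) -> str:
        """Decide placement from business rules, not just last-access time."""
        now = datetime.now(timezone.utc)
        if item.get("emergency_pin"):                 # obituaries, disaster reels, etc.
            return "online"
        if item.get("story_closed"):                  # concluded news story
            return "archive"
        if now - item["last_used"] < timedelta(days=7):
            return "online"
        return "nearline"

    clip = {"emergency_pin": False, "story_closed": True,
            "last_used": datetime.now(timezone.utc)}
    print(target_tier(clip))                          # "archive" despite recent use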

The critical issue is to design a system that is based on the business and the service levels you require. A guarantee of five nines (99.999 percent) availability means designing resilience, bandwidth and throughput into the system. Through this, you can determine your service level agreement and the consequent return on investment.
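
The arithmetic behind that figure is worth spelling out, because it drives the whole design: five nines leaves only a few minutes of permissible downtime per year.

    # Allowed downtime for a given availability target.
    availability = 0.99999                       # "five nines"
    minutes_per_year = 365.25 * 24 * 60          # ~525,960 minutes
    downtime = minutes_per_year * (1 - availability)
    print(f"{downtime:.1f} minutes per year")    # roughly 5.3 minutes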

As already noted, a central storage system will normally be spread across multiple layers. The broadcast application itself (editor or playout server, for example) will have its own local storage. In some products, this will be a buffer store (the local disk on a nonlinear editor, for example). Other broadcast applications have sufficient capacity and their own network capabilities to provide what is, in effect, an online server.

System architecture

Because the storage has to look like a single system to the applications, the temptation is to design the storage management around a single server. When it comes to servers, broadcasters have three key needs:

  • the ability to expand the content storage network, both in terms of connectivity to delivery applications, from broadcast to Web and mobile, and in terms of storage capacity itself;
  • a guarantee that the data throughput will meet current requirements and grow as the network expands; and
  • security, in that the system will be highly resilient to failure and that the data itself will be protected by redundancy.

It is only by understanding these issues and the need to meet them that you can make sensible decisions on systems architecture. In particular, I believe that this rules out a single server architecture, as there are distinct limitations under each of these three key headings that severely restrict the ability of the system to meet real-world requirements. At worst, it is a single point of failure; at best — with mirrored single servers — it demands operator response to initiate manual processes in the event of failure. A clustered architecture provides significant benefits in operational flexibility, resilience and, ultimately, cost of ownership.

Cluster for resilience

In a clustered solution (see Figure 3), every server is physically identical (typically HP DL380 servers or equivalent), and each runs Windows and Microsoft SQL Server. Identical hardware and operating software make for simple maintenance. A typical installation consists of a database (running across two or more servers) and a number of servers that form the cluster (also referred to as nodes). Certain essential services run on only one node at a time, though they are installed on all servers.

Because every server is identical, it makes sense for each to be loaded with exactly the same set of software. That gives the potential for any physical piece of hardware to instantly take up the task of any logical device or service. (See Figure 4.)
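
A toy sketch of the principle (the node and service names are assumptions, and a real cluster manager is far more sophisticated): because every node carries the full software load, a failed node's services can simply be restarted on any survivor.

    # Toy failover: every node can run every service, so reassignment is trivial.
    services = {"database": "node1", "data-mover": "node2", "scheduler": "node3"}
    nodes = ["node1", "node2", "node3", "node4"]

    def fail_over(failed_node: str) -> None:
        nodes.remove(failed_node)
        for service, host in services.items():
            if host == failed_node:
                services[service] = nodes[0]      # any surviving node will do
                print(f"{service} restarted on {services[service]}")

    fail_over("node2")                            # data-mover moves to node1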

Content prioritization

Another critical element for the success of the system is prioritization of content. Again, there are specific requirements that make this more than just working through a simple list. Some tasks will take longer than others: moving a two-hour movie will obviously take longer than moving a 30-second commercial. But a well-designed storage management system should also allow for partial restore.

For example, a story could be cut using browse-resolution copies, with only the selects and handles from the EDL transferred to the online conformer. If the material is on tape, it will still take time to spool to each clip.
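
As a rough illustration of partial restore (the frame rate, timecode handling and handle length below are assumed for the example), only the selected range plus handles needs to come back from the archive, not the whole source clip:

    # Work out which span of a source clip to restore for one EDL event.
    FPS = 25                                       # assumed frame rate

    def tc_to_frames(tc: str) -> int:
        h, m, s, f = (int(x) for x in tc.split(":"))
        return ((h * 60 + m) * 60 + s) * FPS + f

    def restore_range(src_in: str, src_out: str, handle_frames: int = 2 * FPS):
        """Return (first_frame, last_frame) to restore: the select plus handles."""
        start = max(0, tc_to_frames(src_in) - handle_frames)
        end = tc_to_frames(src_out) + handle_frames
        return start, end

    print(restore_range("01:02:10:00", "01:02:18:12"))   # ~12.5 s restored, not the whole clip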

Priority setting is also an issue. Some tasks will be regarded as more important than others; loading material into the playout server for immediate transmission is an obvious example, and requests for archive material in the news editing environment are often urgent. There is also the issue of managing resources to best effect. By its nature, tape storage works best when reading or writing at its maximum data rate, so transfers should be scheduled to keep drives streaming rather than constantly repositioning.
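
One hedged way to express that prioritization in code (the priority values and task names are illustrative assumptions, not a real scheduler): urgent playout loads jump the queue, while lower-priority archive work waits until a drive can stream it efficiently.

    import heapq

    # Lower number = higher priority (playout first, housekeeping last).
    PRIORITY = {"playout": 0, "news-edit": 1, "scheduled-archive": 2, "housekeeping": 3}

    queue = []
    def submit(task: str, kind: str) -> None:
        heapq.heappush(queue, (PRIORITY[kind], task))

    submit("restore promo-221 to playout server", "playout")
    submit("archive finished story news-0417", "scheduled-archive")
    submit("restore rushes for bulletin", "news-edit")

    while queue:
        _, task = heapq.heappop(queue)
        print(task)        # playout first, then news edit, then archiving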

Content lifecycle management

It is worth emphasizing once again that a well-designed central storage system should be invisible to the day-to-day user by complementing the broadcast application. With valuable content now routinely delivered to broadcasters as files over IP circuits rather than on videotape, secure storage of that content is increasingly important. A common request that can be set up using lifecycle rules is to store the content on nearline disks, even if it is to be played soon, and to make two tape backups. (See Figure 5.) One of the tapes is retained in the archive, and the other is delivered to secure storage.
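
Such a rule might be captured declaratively. The sketch below is an assumption about how a policy could be expressed, not any product's actual configuration syntax:

    # Hypothetical lifecycle policy for newly ingested file deliveries.
    INGEST_POLICY = {
        "keep_on_nearline": True,        # even if the item is scheduled to play soon
        "tape_copies": 2,
        "copy_destinations": ["library-archive", "offsite-secure-storage"],
        "verify_after_write": True,
    }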

The largest broadcasters and playout facilities have disaster recovery sites. These alternative playout centers are at remote locations and can take over if an event takes the main facility out of action. The content storage system should have the ability to communicate with the disaster recovery site to minimize time off-air should there be a need to go to the backup.

Deleting content

Another important area governed by the rules of content lifecycle management is how you handle deleted content. Material might be deleted for a number of reasons. Bought-in programming will be licensed for a certain number of transmissions, and the contract with the producer may even specify that the recording must be wiped at the end of the license period.

In news, there may be a requirement to keep all raw footage for a short while. After that, you might want to archive only the cut stories, or the selected takes, EDL and voiceover as separate linked files.
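
One way to picture those retention rules (the periods and categories below are invented for illustration):

    from datetime import timedelta

    # Assumed retention rules: raw footage is short-lived, finished work is kept.
    RETENTION = {
        "raw_footage":    {"keep_for": timedelta(days=30), "then": "delete"},
        "cut_story":      {"keep_for": None,               "then": "archive"},
        "selected_takes": {"keep_for": None,               "then": "archive"},  # with EDL + voiceover as linked files
    }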

The automation or asset management system may delete the content from its database, but the content only truly disappears when it is wiped from the central storage system. Wiping leaves empty space that can be reused for new content. A new piece of content is unlikely to match the size of the hole left by a deleted one, so the space needs to be managed. Today's disk systems have sufficient power to cope with fragmented files and will manage their storage accordingly.

Content preservation

One new application is content preservation. Major broadcasters and content producers have a large back catalog of programs, which exist as tape or film on shelves. The tapes may be in formats that are hard to replay. The content may well be physically deteriorating.

The owner needs to secure the content for future use but may not be in a position to implement a full-scale asset management system at this stage, for logistical or cost reasons. A simple way of storing the content and capturing the basic information is required.

Data exchange

The ultimate benefit of digital television is that all content is handled as data, readily available and exchanged between multiple content delivery systems without the restrictions of a limited number of real-time video paths. Data can be moved faster than real time across Gigabit Ethernet or Fibre Channel networks, or it can be handed off to remote disaster recovery sites at slower than real time. An organization can create a storage and archive infrastructure that precisely meets its operational and business needs.
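
To put "faster than real time" into rough numbers (the bit rate and network utilization below are assumptions, not measurements), a two-hour program at 50 Mb/s crosses a Gigabit Ethernet link in well under 10 minutes:

    # Rough file-transfer arithmetic; all figures are assumptions.
    duration_s = 2 * 3600                            # two-hour program
    video_rate_mbps = 50                             # assumed compressed bit rate
    file_size_mb = duration_s * video_rate_mbps      # 360,000 Mb (~45 GB)

    link_rate_mbps = 1000 * 0.8                      # Gigabit Ethernet at ~80% utilization
    transfer_s = file_size_mb / link_rate_mbps
    print(f"{transfer_s/60:.1f} minutes, ~{duration_s/transfer_s:.0f}x real time")
    # roughly 7.5 minutes, about 16x faster than real time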

The goal is to make all this possible without the bottlenecks in workflow that compromise content delivery to critical on-air applications or put valuable content at risk. A central storage management system that meets these specific requirements cannot be based on a conventional IT approach alone.

Broadcasters typically cannot tolerate any downtime and its consequent loss of revenue. It is, therefore, vital that the central storage system have a failover architecture that is fully automated, fast, integrated with the management software and designed in from day one. That can only be achieved with a clustered architecture, which also allows for planned maintenance and system expansion with the same availability as far as broadcast applications are concerned.

Bernie Walsh is the marketing director for SGL.