UNCOMMON PERFORMANCE from COMMON SOLUTIONS

The goal of grid-based storage is to create one virtual computer or storage system out of many networked IT resources. The use of distributed resources in an intelligent networked environment allows users to handle different types of workloads and applications easily and, in many cases, with dramatically improved efficiency. The distributed nature of grid-based systems not only facilitates resource sharing, but also lends itself to greater redundancy and security of stored data.

Most of today's broadcasters and media groups are under pressure to minimize operating expenses while ramping up new revenue-generating services. That pressure, together with the efficiency and redundancy that distributed resources offer, makes grid processing and storage an appealing option.

Virtually every broadcaster has moved to digital operations and now handles media files, including large HD files, within regular production and transmission workflows. These digital media files can be accessed and manipulated with much greater flexibility than tape-based media. However, this capability is limited in its effectiveness without the appropriate storage and network infrastructure in place.

Grid-based technologies, which have been evolving for more than a decade now, can be used to integrate multiple functions and streamline operations, thereby delivering the cost savings sought after by broadcast and production facilities. These systems enable simultaneous access to media files, reduce the bottlenecks typical of peaks in the facility's network traffic or usage, and provide scalability in terms of network bandwidth, storage capacity and computing power.

A grid-based storage architecture can be small, with just a few terabytes of storage capacity, or large, with thousands of terabytes. Bandwidth scales in the same way: aggregate throughput can be expanded to rates as high as hundreds of gigabits per second.

The storage system reflects the adage of the whole being greater than the sum of its parts. Combining networked resources leverages all available processing power and storage capacity to make media available to multiple users or applications, quickly and concurrently. The computational or processing power of networked systems can be directed toward different applications, or subsets of those applications, to keep key workflows running smoothly.

Grid-based systems combine storage, processing power, bandwidth capacity, redundancy and security to deliver better and more balanced resource use within a media-oriented organization. Here's how it works.

Object-based storage

Grid-based storage systems rely on grids of interconnected machines, or servers, each of which contributes a set amount of storage along with a share of the overall system's bandwidth and processing power. Media or data files are striped across these devices or stored in intelligent slices that together make up the larger media files.

Often, the metadata accompanying sliced file data is stored in a central database within the storage system's scheduling or management system, which is typically mirrored by a second system. Every slice incorporates a header and cyclic redundancy check (CRC) that facilitate data verification and replication.
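As a rough illustration of a slice that carries its own header and CRC, consider the minimal sketch below. The header layout (file ID, slice index, payload length, CRC-32) and the function names are assumptions made for this example, not the format of any particular product.

```python
import struct
import zlib

# Hypothetical header layout: file id, slice index, payload length, CRC-32.
HEADER = struct.Struct(">IIII")

def make_slices(file_id: int, data: bytes, slice_size: int) -> list[bytes]:
    """Split data into fixed-size slices, each prefixed with a header."""
    slices = []
    for index, start in enumerate(range(0, len(data), slice_size)):
        payload = data[start:start + slice_size]
        header = HEADER.pack(file_id, index, len(payload), zlib.crc32(payload))
        slices.append(header + payload)
    return slices

def verify_slice(blob: bytes) -> bool:
    """Return True if the payload still matches the CRC in its header."""
    if len(blob) < HEADER.size:
        return False
    _file_id, _index, length, crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    return len(payload) == length and zlib.crc32(payload) == crc
```

Because each slice is self-describing, any server holding a copy can check it without consulting the central metadata database.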

These slices are, in turn, duplicated to add redundancy and offer more flexible retrieval of data, depending on network traffic. The number of preset replicas per slice — for individual files, whole directories or the entire system — depends on the degree of redundancy and speed of retrieval required. When stored material is actually requested, it is rapidly retrieved from the most available replica and compiled from all its constituent slices located across the storage grid. Through this process of slicing and replication, multiple clients can access any slice of any file from any available server. (See Figure 1.)
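The retrieval step can be sketched in a few lines. In this simplified model, which is an assumption for illustration only, "most available" means the server with the fewest active requests, and the `Server` class and `read_file` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    active_requests: int   # stand-in for how busy this server is
    slices: dict           # (file_id, slice_index) -> bytes

def read_file(file_id, slice_count, servers):
    """Reassemble a file, reading each slice from the least busy replica."""
    parts = []
    for index in range(slice_count):
        key = (file_id, index)
        holders = [s for s in servers if key in s.slices]
        if not holders:
            raise LookupError(f"no replica available for slice {index}")
        best = min(holders, key=lambda s: s.active_requests)
        best.active_requests += 1   # account for the read just issued
        parts.append(best.slices[key])
    return b"".join(parts)
```

A production system would also weigh network proximity and would release the request count when each read completes; both are left out here to keep the idea visible.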

This style of object-based storage provides consistent data, which are protected by CRCs throughout the course of their lifetime. Thus, any random data corruption or loss is identified and corrected automatically, often before the file is accessed. In this sense, grid-based storage offers a more proactive approach to data protection than RAID-based storage, which detects compromised data only when users attempt to access them.

The grid architecture also relies on intelligent storage to ensure that the data most likely to be requested will be allocated to high-availability nearline storage. For files accessed in a regular pattern, the distributed system can transfer data much more rapidly than a single drive could.

Although the power of a grid-based storage system stems from its multiple networked servers, the utility of the system comes from its unified appearance to users. Expansion of the system's storage capacity is straightforward, generally requiring the addition of another machine to the server network. For the user, there is no notable difference, as media continues to appear and behave as though it resides on a single file server.

Smart processing power

Networked IT-based systems can generate enormous processing power. When processing-intense applications are broken into subsets, tasks can be executed simultaneously, with the results combined to yield the solution or end product of that application.

Theoretically, a process split into five subsets and using five processors could be completed five times faster than on a single processor working through the subsets one after another. In reality, no application is perfectly scalable, because some portion of the work always has to run in sequence, but the implications for processing media files are still significant.
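The gap between the theoretical and the real speedup is commonly estimated with Amdahl's law, a standard rule of thumb that the article itself does not name. The short sketch below applies it; the function name and the 90 percent figure in the note that follows are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup when only part of a job can be spread across processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

If the whole job can be split (`parallel_fraction` of 1.0), five processors give the full fivefold gain. If only 90 percent of it can be split, the same five processors deliver a speedup of roughly 3.6.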

Management of available processing power and bandwidth is critical in an effective grid-based storage system. The only way to maximize networked server resources is to ensure prioritization within the broadcast workflow. Consider, for example, material that must be prepped for air, as opposed to material that's ready for archiving. The management element of the system must be able to differentiate between these tasks and ensure that the more time-sensitive task is completed first.

Each server within the storage grid is equipped with internal applications that help to manage and optimize its resources. In addition to handling remote procedure call (RPC) requests and replies, these applications manage the status of the server within the larger network; its connection to other servers or machines on the network; the operations, such as read and write commands, that relate directly to data slices stored on the system; and functions including logging, security and memory use.

Intelligent task management

Two or more central management or scheduling systems oversee the storage grid, dictating where file slices are stored, initiating and terminating file access, and assigning tasks to the different server systems that make up the grid. Tasks can be assigned in a number of ways. Assignment depends on server availability, proximity to the data required for that task and advance scheduling of a specific, known task to a certain machine.

In advanced grid storage systems, machines report their status as either busy or idle. This allows the scheduling mechanism to put idle machines to work on queued-up tasks, either until the job is complete or until a more important task takes precedence.

Preset rules governing priority assignments help to maximize resources without disrupting the most important processing operations. Other rules, such as time-of-day constraints, can be used to ensure that necessary, but low-priority, tasks are completed during the least busy work windows.
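Priority rules and time-of-day constraints of this kind can be modeled with an ordinary priority queue. The sketch below is one possible arrangement, not a description of any vendor's scheduler; the `Scheduler` class, the numeric priorities and the hour-based rule are all assumptions for illustration.

```python
import heapq
import itertools

class Scheduler:
    """Hands the most urgent eligible task to whichever server asks next."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()   # tie-breaker keeps FIFO order

    def submit(self, name, priority, allowed_hours=range(24)):
        # Lower number = more urgent. allowed_hours is a time-of-day rule.
        heapq.heappush(
            self._queue, (priority, next(self._order), name, allowed_hours)
        )

    def next_task(self, hour):
        """Pop the most urgent task allowed to run at this hour, or None."""
        deferred, chosen = [], None
        while self._queue:
            entry = heapq.heappop(self._queue)
            if hour in entry[3]:
                chosen = entry[2]
                break
            deferred.append(entry)
        for entry in deferred:            # put back tasks that must wait
            heapq.heappush(self._queue, entry)
        return chosen
```

An idle server would call `next_task` with the current hour. Material being prepped for air, submitted with priority 1, is handed out before an archive job with priority 5, and an archive job restricted to `range(0, 6)` simply waits in the queue until the overnight window.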

The versatility of the grid system lies in the scheduling system's ability to react immediately to new tasks added to the job load and to changes in the use of servers within the network. SNMP control enables configuration, monitoring and management of the system, enhanced by reporting and alarm capability. By identifying all the servers and switches within the grid, the management system gains an awareness of the load being handled by the storage grid. As a result, the scheduling or management system also can help inform the user about how the grid is being used and if there may be a need for additional resources.

Servers within the grid-based system also take advantage of idle periods to handle critical housekeeping functions, such as data verification and processing of deleted slices. The verification process ensures the validity of stored content or, if content has been compromised, initiates a re-replication of that slice from an identical piece of data stored elsewhere on the network. All of these factors bring greater reliability to operations at a broadcast or production facility by ensuring that the media files called for are available without damage or delay.
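The verification pass might look something like the following sketch, which checks every replica of a single slice against its expected CRC and overwrites any bad copy from a good one. The `scrub` function and its in-memory dictionary of replicas are simplifying assumptions; a real grid would perform the copy over the network.

```python
import zlib

def scrub(replicas: dict, expected_crc: int) -> list:
    """Check every replica of one slice and repair the ones that fail.

    replicas maps a server name to the bytes it holds for this slice.
    Returns the names of the servers that were repaired.
    """
    good = [d for d in replicas.values() if zlib.crc32(d) == expected_crc]
    if not good:
        raise RuntimeError("no intact replica left to copy from")
    repaired = []
    for server, data in replicas.items():
        if zlib.crc32(data) != expected_crc:
            replicas[server] = good[0]    # re-replicate from an intact copy
            repaired.append(server)
    return repaired
```

Run during idle periods, a pass like this finds a damaged copy before any client asks for it.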

Unlimited scalability

Grid-based storage can be expanded by extending the network topology, adding server systems to the storage grid or increasing the number of clients accessing the system. If the grid is built on a robust GigE network with intelligence built into each server system, it is designed to keep growing without degrading its performance or its ability to maximize available resources. The addition of new clients can even enhance the overall quality of the storage grid, because every new addition brings greater computing and caching capacity to the system.

Larger systems often group servers together in volumes, which in turn make up the grid. These groups of servers are often collocated within a single rack or enclosure. For that reason, the system can be configured to store the replicas of each slice across different groups whenever possible, so that the failure of an entire rack will not wipe out the slices required to reconstruct a media file.
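One simple way to spread replicas across groups is a round-robin pass that takes one server from each group before reusing any group. The sketch below assumes that approach; the `place_replicas` function and the rack names are hypothetical.

```python
def place_replicas(servers_by_group: dict, replica_count: int) -> list:
    """Pick servers for one slice, spreading replicas across groups first.

    servers_by_group maps a group (for example, a rack) to its server names.
    """
    pools = [list(servers) for servers in servers_by_group.values()]
    chosen = []
    # Round-robin over groups: one server per group before reusing a group.
    while len(chosen) < replica_count and any(pools):
        for pool in pools:
            if pool and len(chosen) < replica_count:
                chosen.append(pool.pop(0))
    return chosen
```

With two replicas and three racks, no two copies of a slice land in the same rack, so losing any single rack still leaves one copy intact.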

Distribution of large media files over multiple servers effectively eliminates any size limitations. It also allows for expansion of the overall storage infrastructure without interruption of critical ongoing operations. What's more, the distribution of media storage among multiple nodes offers improved protection of data, as well as improved hardware and software fault tolerance and recovery.

Security and data protection

Data protection is as important in a grid-based system as it is in any other storage platform. Consequently, grid storage systems incorporate user validation schemes and limit file access based on user groupings. Administrators and other users are given permission to perform different types of operations, so the risk of data loss is minimized without constraining users' ability to work efficiently.
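Group-based access control of this sort reduces to a lookup from group to permitted operations. The groups and operations in the sketch below are invented for illustration; an actual system defines its own.

```python
# Hypothetical groups and operations; a real system defines its own.
GROUP_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}

def is_allowed(user_groups, operation):
    """True if any of the user's groups grants the requested operation."""
    return any(
        operation in GROUP_PERMISSIONS.get(group, set())
        for group in user_groups
    )
```

An editor can write but not delete, which keeps accidental data loss out of reach of most users without slowing down routine work.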

The distributed grid architecture not only makes system expansion easier, but also eliminates any single point of failure, whether the cause is a hardware failure or a data failure. The system's ability to rebuild data quickly ensures minimal impact on overall system performance. As the system identifies faulty slices, it automatically replicates the slices, limiting the possibility of data failure. Even in the case of drive failure, data can be re-supplied quickly by other servers on the network.

In theory, every node of the system could be dedicated to rebuilding the lost data, providing a recovery rate dramatically faster than recovery of a RAID-based storage system. (See Figure 2.) In reality, the system allows client requests to take precedence over the rebuilding process so that key tasks proceed smoothly, unhindered by any loss of bandwidth. In this scenario, the system manager or scheduler balances workflow priorities against the need to replicate data within the system, so peaks in client traffic can be handled seamlessly, and the integrity of redundant data storage is maintained.
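A back-of-the-envelope estimate shows why spreading the rebuild across nodes matters, and what yielding to client traffic costs. The function below is a rough model under stated assumptions: every node contributes equally, and `rebuild_share` is a hypothetical knob for the fraction of each node's bandwidth left over for rebuilding.

```python
def rebuild_hours(lost_tb, nodes, per_node_mb_s, rebuild_share=1.0):
    """Estimate hours to re-replicate lost data across the grid.

    rebuild_share is the fraction of each node's bandwidth left for
    rebuilding after client requests are served (1.0 = fully dedicated).
    """
    throughput_mb_s = nodes * per_node_mb_s * rebuild_share
    return lost_tb * 1_000_000 / throughput_mb_s / 3600
```

Using illustrative figures only, rebuilding 1TB at 100MB/s takes about 2.8 hours on a single node and about 17 minutes across 10 nodes. If client requests leave only half of each node's bandwidth for rebuilding, the 10-node figure doubles to about 33 minutes.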

Smart storage

Grid-based storage architectures use fairly common IT-based systems and network infrastructure to enable a greater flexibility in handling large digital media files. They present users and administrators with a simple and straightforward interface, through which all data appears as if in a single file system. The systems are scalable and smart enough to let operators know when they're being under- or overutilized. Regardless of their size, grid-based systems maintain that intelligence, helping media-oriented companies make the most of resources, expand operations and lower overall expenses.

Geoff Stedman is vice president of worldwide marketing for Omneon.