Omneon's MediaGrid

As television broadcast and production facilities look for ways to cut costs and to grow revenue through new services or repurposed content, they need a storage solution that offers exceptional day-to-day performance while also handling the different workloads and applications found in a single facility.

The real challenge for broadcasters is in bringing online a system capable of enabling increased workflow integration using high-performance, high-availability storage platforms. Such platforms must be capable of scaling easily to suit the needs of the facility; meeting the facility's requirements in terms of capacity, performance and data availability; and providing the necessary bandwidth, storage capacity and computing power to ensure optimal and uninterrupted access to media.

Rethinking the traditional storage scheme

Omneon designed the architecture of the MediaGrid active storage system to meet the changing needs of today's broadcast and production facilities, particularly with regard to large-file storage and simultaneous access to media. By scaling bandwidth, capacity and processing power, the system ensures end-user accessibility and constant availability of media files. It provides storage capacities from just a few terabytes to multiple petabytes, all within a single file system, and can scale data-access bandwidth to many hundreds of gigabits per second of aggregate throughput.

The system's architecture protects media while simultaneously providing maximum access to it. As data is transferred to the system, it is divided, replicated and distributed across the entire array of disks. This arrangement provides built-in data integrity by maintaining replicas of files stored in different locations of the storage pool. It also optimizes file availability by taking advantage of any file replica for data access. Additionally, users can dynamically increase the number of copies for high-demand content.

The system achieves this flexibility through “object-based storage,” which uses “slices” as the smallest unit of storage. Each file is broken into slices, each a minimum of 8MB. A slice is a smart object whose behavior adds to the intelligence of the overall system with information including lifetime CRC, active consistency monitoring and metadata redundancy.

The use of lifetime CRCs allows for constant checks to ensure that any random data corruption or loss is identified and automatically corrected — in most cases, before that file has even been accessed. Because slices carry information indicating to which file and where in that file they belong, the system can perform “bottom-up” validation of file system consistency not possible in traditional storage systems.
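The slice idea can be illustrated with a minimal Python sketch. It is not Omneon's implementation: the `Slice` fields, the use of CRC-32 from `zlib`, and the helper names are all assumptions made for illustration. It shows a payload being cut into slices that each carry their file identity, position and checksum, and a file map being rebuilt "bottom-up" from the slices alone.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass

SLICE_SIZE = 8 * 1024 * 1024  # 8MB, the minimum slice size quoted in the article


@dataclass
class Slice:
    file_id: str  # which file the slice belongs to (hypothetical field name)
    index: int    # where in that file the slice belongs
    data: bytes
    crc: int      # checksum recorded when the slice is written


def make_slices(file_id: str, payload: bytes, slice_size: int = SLICE_SIZE) -> list[Slice]:
    """Split a payload into slices, each carrying its own CRC."""
    slices = []
    for index, offset in enumerate(range(0, len(payload), slice_size)):
        chunk = payload[offset:offset + slice_size]
        slices.append(Slice(file_id, index, chunk, zlib.crc32(chunk)))
    return slices


def is_valid(s: Slice) -> bool:
    """Recompute the CRC and compare it with the stored value."""
    return zlib.crc32(s.data) == s.crc


def rebuild_index(slices: list[Slice]) -> dict[str, list[int]]:
    """Bottom-up view: derive each file's slice list from the valid slices alone."""
    index: dict[str, list[int]] = {}
    for s in slices:
        if is_valid(s):
            index.setdefault(s.file_id, []).append(s.index)
    return {file_id: sorted(positions) for file_id, positions in index.items()}
```

Because every slice is self-describing, a checker of this kind needs no central table to notice that a slice is missing or corrupt; a gap in the rebuilt index is enough.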

Media storage access and control

ContentDirectors and ContentServers are the two major operational components of MediaGrid, the former providing overall file system and data management and the latter providing access to data and raw storage. In general, clients interact with the ContentDirector to request control-type services (e.g., file open), and it responds by referring the client to one or more ContentServers for data access (e.g., file read). Once a client has a list of servers to contact, interactions occur directly between the client and the servers.
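The split between control requests and data requests might look like the following Python sketch. The class and method names (`open`, `read`, `read_file`) are hypothetical and do not reflect the actual MediaGrid client API; the point is only that the director is consulted once, after which the client reads from the servers directly.

```python
class ContentServer:
    """Holds slice data and serves reads (data path)."""

    def __init__(self, name):
        self.name = name
        self.slices = {}  # (path, slice_index) -> bytes

    def read(self, path, slice_index):
        return self.slices[(path, slice_index)]


class ContentDirector:
    """Knows where slices live and answers control requests (control path)."""

    def __init__(self):
        self.locations = {}  # path -> list, one entry per slice, of servers holding it

    def open(self, path):
        # Refer the client to the servers for each slice; no file data passes through here.
        return self.locations[path]


def read_file(director, path):
    slice_map = director.open(path)  # single control request
    # Every read below goes straight to a ContentServer.
    return b"".join(
        servers[0].read(path, index) for index, servers in enumerate(slice_map)
    )
```

Keeping the director out of the data path is what lets aggregate bandwidth grow with the number of servers rather than being capped by one controller.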

The servers make up the bulk of the system. They house and monitor its storage, provide available CPU computing power to grid-based applications, and maintain constant communication with the ContentDirectors to keep them updated on drive status, stored file slices and the like.

Slice allocation is based on server availability, system load, available capacity and server grouping. The actual disk space available is managed as volumes, or file systems. Volumes are made up of one or more groups, and a group is made up of one or more servers. Each slice of any particular file is stored on a different server, and a replica of each slice is placed on another server. Depending on specific needs for file availability or file protection, the user can specify the number of replicas of any file. Slice replication is dynamic and immediate, and the process allows multiple clients to access any slice of any file from any available content server, thereby reducing resource contention across the system and enhancing its performance.
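As a rough illustration of slice placement, the Python sketch below puts each copy of a slice on a different server and always picks the least-loaded servers first. It is a deliberate simplification: the real allocator is described as also weighing availability, capacity and server grouping, and the function name and load metric here are assumptions.

```python
def place_slices(num_slices, servers, copies=2):
    """Return, for each slice, the list of distinct servers that hold a copy."""
    if copies > len(servers):
        raise ValueError("need at least as many servers as copies")
    load = {name: 0 for name in servers}  # slices currently assigned to each server
    placement = []
    for _ in range(num_slices):
        # Choose the least-loaded servers; ties are broken by name for determinism.
        chosen = sorted(load, key=lambda name: (load[name], name))[:copies]
        for name in chosen:
            load[name] += 1
        placement.append(chosen)
    return placement


layout = place_slices(3, ["cs1", "cs2", "cs3", "cs4"], copies=2)
```

With four servers and two copies, consecutive slices land on different server pairs until the pool is exhausted, after which placement wraps around. Raising `copies` for a given file mirrors the article's point that users can request more replicas of high-demand content.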

An “idle time” management process handles functions ranging from data verification to processing of deleted slices. During idle time, the system checks every slice to confirm that its data is still accessible (readable) and valid (CRC check). If either check fails, the slice is marked as invalid, and a “re-replication” of the slice data is launched.
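The idle-time check described above could be sketched as follows. This is an assumption-laden model rather than the product's logic: `Replica`, `idle_scrub` and the `spare_servers` argument are hypothetical, an unreadable slice is modeled as `data=None`, and CRC-32 from `zlib` stands in for whatever checksum the system uses.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    server: str
    data: Optional[bytes]  # None models a slice that can no longer be read
    crc: int               # checksum recorded when the slice was written
    valid: bool = True


def check(replica):
    """A replica passes if it is readable and its CRC still matches."""
    readable = replica.data is not None
    return readable and zlib.crc32(replica.data) == replica.crc


def idle_scrub(replicas, spare_servers):
    """Mark failing replicas invalid and re-replicate from a good copy."""
    good = [r for r in replicas if check(r)]
    spares = list(spare_servers)
    for r in list(replicas):  # iterate over a copy because the list may grow
        if r.valid and not check(r):
            r.valid = False
            if good and spares:
                source = good[0]
                replicas.append(Replica(spares.pop(0), source.data, source.crc))
    return replicas
```

If no good copy or no spare server is available, the sketch simply leaves the slice marked invalid; how the real system handles that case is not described in the article.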

ContentDirectors run very specific applications to maintain and monitor the flow of content onto and off of the servers. Typically, a MediaGrid system will have two or three ContentDirectors and some number of ContentServers, ranging from about 12 to as many as several hundred, depending on the storage capacity required.

Asset movement and processing

The greater integration of broadcasters' workflow processes is most evident in MediaGrid's ability to serve as a parallel computing platform for media processing applications that run directly within the system. Processes such as quality control, transcoding, closed caption embedding and audio track tagging can now all be executed within the grid, eliminating the need to move the data to separate digital islands and then back into storage.

Because the storage system is based on an all-Ethernet interconnect, no translation is required when moving data between storage and clients, which avoids the attendant bottlenecks and failure modes. Omneon's SystemManager handles discovery, configuration, monitoring, alarms and reporting for the system.

Data protection

Data is protected in a number of ways. The distributed design of the storage system means that there is no single point of failure — either for any hardware component or for the data itself. Data stored using advanced data replication techniques is always available to clients, even in the event of a disk failure, because the system automatically manages all aspects of protecting and recovering from data loss. The system is also designed to allow for expansion of storage, clients and bandwidth without shutting down or interrupting operations.

Data rebuild in the event of a hard-drive failure is extremely rapid, without any noticeable degradation in overall system performance. By the time a failed drive has been replaced, the data that had been on the failed unit will, in all likelihood, already have been re-replicated somewhere else. Because re-replication occurs across the system, using every node for data rebuild operations, recovery is an order of magnitude faster than in typical RAID-based systems. And because client requests have priority over replication traffic, peaks in client traffic are handled seamlessly, without interrupting client operations.
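One simple way to picture client requests taking priority over replication traffic is a priority queue, as in the Python sketch below. The article does not say how MediaGrid schedules work internally, so the `RequestQueue` class and the two priority levels are purely illustrative assumptions.

```python
import heapq
import itertools

CLIENT, REPLICATION = 0, 1  # lower value is served first


class RequestQueue:
    """Serves client requests before replication work, FIFO within each class."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_request(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = RequestQueue()
queue.submit(REPLICATION, "rebuild slice 7")
queue.submit(CLIENT, "read clip A")
queue.submit(REPLICATION, "rebuild slice 8")
queue.submit(CLIENT, "read clip B")
served = [queue.next_request() for _ in range(len(queue))]
# served: read clip A, read clip B, rebuild slice 7, rebuild slice 8
```

Under this model, rebuild work fills whatever capacity client traffic leaves free, which is consistent with the claim that client operations are not interrupted during recovery.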

Client interfaces to the storage system are provided from a variety of operating systems. Each client interface is capable of recognizing a loss of connection to the server architecture and automatically connecting to another ContentServer that has a replica of the requested data. Of course, behind the scenes, the storage system will also have identified the error and started a re-replication of the data if required. When a ContentDirector is lost, typically the impact is minimal. When the replacement ContentDirector boots, it connects to the cluster, identifies itself and requests an update for the MediaGrid file system. The update comes in the form of a synchronization operation that makes the new ContentDirector current.
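The client-side failover behavior can be sketched in a few lines of Python. The `Server` class and `read_with_failover` function are hypothetical stand-ins, not the actual client interface: the client walks the list of servers known to hold a replica and moves on whenever a connection fails.

```python
class Server:
    """Stand-in for a ContentServer; `online=False` simulates a lost connection."""

    def __init__(self, name, slices, online=True):
        self.name = name
        self.slices = slices  # slice_id -> bytes
        self.online = online

    def read(self, slice_id):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.slices[slice_id]


def read_with_failover(replica_servers, slice_id):
    """Try each server holding a replica until one answers."""
    last_error = None
    for server in replica_servers:
        try:
            return server.read(slice_id)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next replica
    raise ConnectionError("no replica reachable") from last_error
```

A read fails only if every server holding a replica is unreachable, which is why adding replicas improves availability as well as throughput.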

The system accommodates the broadest range of media processes and data management tasks, all while minimizing complexity for system administrators. Regardless of the size of the system, its distributed intelligence and processing capability ensure that it can deliver the highest-quality media-access performance. Any broadcaster seeking to streamline workflows and integrate multiple applications can rely on this innovative new solution to deliver scalable capacity and bandwidth for fast and reliable media storage and management.

Geoff Stedman is vice president of marketing for Omneon.