Dynamic storage tiering

Data-intensive broadcast projects can require petabytes of storage, with several hundred terabytes often actively in production at any given time. Keeping content that is “in the can” on lower-cost archive systems while holding frequently accessed, mission-critical data on more costly, higher-performance media can be a precarious balancing act for even the most seasoned IT administrator.

Employing an integrated tiered storage architecture that dynamically stores information on lower-cost SATA drives when it is not actively in use, while leveraging the performance of SSDs for active digital workloads, is a better way to gain the performance, simplicity and high availability required in a production environment while dramatically reducing acquisition, deployment and operating costs. Because storage media differ by orders of magnitude in both cost and performance, data storage systems that feature SSD tiering offer broadcasters a viable alternative to traditional NAS environments.

Storage performance is reaching its limits in traditional NAS architectures as hard disk drive technology approaches the limits imposed by the nature of a spinning disk. Even though the capacity of HDDs continues to increase, NAS servers have struggled to keep up with the number of read/write requests sent to ever-denser disk drives. While capacity has increased, disk I/O rates have remained relatively constant, resulting in a decrease in the number of operations available per stored byte. To improve performance, IT managers have turned to overprovisioning Fibre Channel drives, which leaves a significant percentage of their raw capacity underutilized, uses limited data center space inefficiently and wastes power. Even high-end NAS servers cannot overcome the fundamental limitation of increased access times on denser SATA drives while attempting to deliver high-bandwidth file access to clients.
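As a rough illustration of this density problem, consider how the random I/O available per terabyte shrinks as drive capacity grows; the drive figures below are representative assumptions for illustration, not measurements of any particular product:

```python
# Illustrative only: a 7,200 RPM drive sustains on the order of 100-200 random
# IOPS regardless of its capacity, so the operations available per stored byte
# fall as drives grow denser.

def iops_per_terabyte(drive_iops: float, capacity_tb: float) -> float:
    """Random I/O operations per second available per terabyte stored."""
    return drive_iops / capacity_tb

# Assumed figures: the same ~150 IOPS mechanism behind ten times the capacity.
print(iops_per_terabyte(150, 1))   # 1 TB drive  -> 150.0 IOPS per TB
print(iops_per_terabyte(150, 10))  # 10 TB drive -> 15.0 IOPS per TB
```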

SSDs are a better option for overcoming the performance limits of HDDs, but they have limitations of their own. Volatility can be an issue with DRAM, and flash memory can only be written a finite number of times. Introducing new media into a data center can also be a hardship for storage administrators, who must learn the characteristics of SSDs and how they are best used by both broadcast and corporate applications.

To gain the performance benefits of SSDs within broadcast environments while minimizing the disruption of introducing new storage media into the infrastructure, a new approach using an appliance-based automated tiering system may be the answer. By automatically placing data on the storage medium best suited to its current access patterns, active files can be kept on fast-access media while files that have not been accessed recently reside on a mass storage server. Because data is moved to the most appropriate tier automatically, little specialized management is required.

The appliance, which features multiple tiers of fast-access storage and operating system software, is deployed between a currently installed mass storage NAS server and the client and application servers. (See Figure 1.) This allows an organization to use its existing infrastructure without disrupting data access. Inside the appliance are different types of storage media, including solid-state storage and serial-attached SCSI (SAS) HDDs. The OS software analyzes how files are being accessed and places the files internally on the most appropriate storage medium for the fastest possible access. This approach benefits write loads as well as read-only data. For optimal performance, changes made to data by client and application servers are stored locally within high-speed storage tiers on the tiering server, which writes all changed data back to the mass storage server at an interval specified by the administrator.
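The write-back behavior described above can be sketched in a few lines of code. The class, tier layout and flush_interval_s parameter below are hypothetical illustrations of the idea, not the appliance's actual software interface, and the mass_storage object is simply assumed to expose read and write calls to the existing NAS:

```python
import time

class TieringAppliance:
    """Sketch: sits between clients and a mass storage NAS, absorbs changes
    into fast local tiers and flushes them back on an admin-defined schedule."""

    def __init__(self, mass_storage, flush_interval_s=30.0):
        self.mass_storage = mass_storage          # existing NAS server behind the appliance
        self.flush_interval_s = flush_interval_s  # administrator-specified write-back interval
        self.dirty = {}                           # path -> changed data held in the fast tiers
        self.last_flush = time.monotonic()

    def write(self, path, data):
        # Changes are stored locally in the high-speed tiers first.
        self.dirty[path] = data
        self._maybe_flush()

    def read(self, path):
        # Serve from the local tiers when possible; otherwise go to the NAS.
        if path in self.dirty:
            return self.dirty[path]
        return self.mass_storage.read(path)

    def _maybe_flush(self):
        # Once the configured interval has elapsed, write all changed data
        # back to the mass storage server.
        if time.monotonic() - self.last_flush >= self.flush_interval_s:
            for path, data in self.dirty.items():
                self.mass_storage.write(path, data)
            self.dirty.clear()
            self.last_flush = time.monotonic()
```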

In much the same way that additional media can be added to a mass storage server to increase capacity, new nodes can be added to the tiered storage system to form a scalable cluster of combined resources that increases overall performance. By separating the processing of file requests in the NAS infrastructure from data retention on the mass storage server, organizations will realize high-performance data delivery while freeing processor cycles on the storage server for tasks such as data mirroring, de-duplication and backup operations.

Implementing a tiered storage approach to broadcast engineering provides several key benefits, including:
  • Optimizing current NAS servers to enable performance increases for the most demanding active applications;
  • Preserving the current investment in existing NAS infrastructure by dramatically improving its performance and extending its useful lifespan;
  • Enabling the use of less-expensive NAS servers and lower-cost, higher-capacity SATA drives as primary storage to expand the capacity of a NAS infrastructure without sacrificing performance;
  • Saving on operational expenses, including cost per terabyte, power, cooling and rack space, by decreasing the number of expensive NAS servers and disk shelves within the data center;
  • Avoiding the expense of overprovisioning by allowing companies to pay only for the performance they need, with the option of scaling performance in the future by adding more automated tiering nodes to the cluster.

In the broadcast environment

Tiered NAS appliances are used in broadcast environments to separate data delivery tasks from data retention and to handle both more efficiently. The data that a tiered file system stores on the cluster is called the working set. As clients and application servers request new files, such as images, audio, video and documents, the cluster retrieves them from the mass storage server and adds them to the working set. Active data remains available on the internal SSD and HDD media within a cluster of high-performance tiered appliances. As files become less active, the file system moves them to slower storage tiers and eventually removes them from the working set, at which point they reside only on the mass storage server.
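A simplified model of this working-set behavior is sketched below; the tier names and the periodic aging pass are assumptions made for illustration and do not describe the product's actual algorithms:

```python
TIERS = ["dram", "flash_ssd", "sas_hdd"]  # fastest to slowest internal media (illustrative)

class WorkingSet:
    """Sketch: files join the working set when first requested and drift toward
    slower tiers as accesses taper off, until only the copy on the NAS remains."""

    def __init__(self, mass_storage):
        self.mass_storage = mass_storage
        self.tier_of = {}     # path -> index into TIERS
        self.touched = set()  # paths accessed since the last aging pass

    def access(self, path):
        if path not in self.tier_of:
            self.mass_storage.read(path)  # miss: retrieve from the mass storage server
        self.tier_of[path] = 0            # active data sits on the fastest tier
        self.touched.add(path)
        return TIERS[0]

    def age(self):
        # Periodic pass: files not touched since the last pass move one tier
        # down; files that fall past the slowest tier leave the working set
        # and exist only on the mass storage server.
        for path in list(self.tier_of):
            if path not in self.touched:
                self.tier_of[path] += 1
                if self.tier_of[path] >= len(TIERS):
                    del self.tier_of[path]
        self.touched.clear()
```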

Rather than designing the entire system around the most active data, a tiered system can dynamically move data among tiers as the need for access changes. For example, if the system receives only a few random read-only requests for a file, it places the file in DRAM and eventually writes it to disk storage. However, if the cluster then sees multiple random reads for the file from many clients, it moves some blocks from DRAM to flash SSD, retaining the hottest data in the highest-performance storage medium. If the file is modified with write operations, the cluster also writes the changes back to the mass storage server within the time period specified by the maximum write-back delay setting.
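In code, such a placement decision might look something like the policy below; the thresholds and medium names are purely illustrative assumptions, not published behavior of any particular system:

```python
def choose_medium(recent_reads, distinct_clients):
    """Illustrative block-placement policy loosely following the example above."""
    if recent_reads >= 100 and distinct_clients >= 8:
        return "dram"        # hottest blocks stay in the highest-performance medium
    if recent_reads >= 10:
        return "flash_ssd"   # warm blocks spill from DRAM onto flash SSD
    return "sas_hdd"         # lightly read data eventually settles on spinning disk
```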

In all cases, the contents of the file are distributed across the pooled resources of all of the nodes in the cluster, preventing data from becoming bottlenecked on a single node. The file system serves the file's data as blocks and permits different clients to access and update different parts of the file. This is particularly useful for very large files that are accessed by multiple clients or threads simultaneously; for example, client A can write one part of a file while client B simultaneously writes a different part of the same file. Additionally, if the access patterns indicate the need, the file system can place read-only copies of the file on multiple nodes in the cluster.
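The block-level distribution can be pictured as a simple mapping from (file, block) pairs to cluster nodes; the hashing scheme, 1 MiB block size and file name below are illustrative assumptions, not the actual placement algorithm:

```python
BLOCK_SIZE = 1 << 20  # assume 1 MiB blocks for illustration

def node_for_block(path, block_index, num_nodes):
    """Map each block of a file to a cluster node so no single node holds the whole file."""
    return hash((path, block_index)) % num_nodes

def nodes_for_write(path, offset, length, num_nodes):
    """Nodes touched by a byte-range write; clients updating disjoint ranges of
    the same file can proceed in parallel on different nodes."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return {node_for_block(path, b, num_nodes) for b in range(first, last + 1)}

# Client A and client B writing different regions of the same large media file:
print(nodes_for_write("program_master.mxf", 0, 4 * BLOCK_SIZE, num_nodes=4))
print(nodes_for_write("program_master.mxf", 600 * BLOCK_SIZE, 4 * BLOCK_SIZE, num_nodes=4))
```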

With the OS continuously monitoring data access patterns and self-adjusting to increase performance, tiered storage systems distribute the workload across the cluster and minimize accesses to the mass storage system. Rather than adding HDDs to increase application performance, moving data among multiple storage tiers based on how frequently it is used and accessed provides better overall performance at a lower cost. The total equipment deployment cost of a tiered storage appliance can be as little as one-third that of a traditional NAS deployment, in addition to providing dramatic savings in operational costs.

Conclusion

A dynamically tiered NAS infrastructure meets the needs of broadcast engineering by allowing high production values and cutting-edge technology to come together to deliver digital content efficiently while minimizing the costs of management, equipment, power, cooling and rack space. By combining multiple storage tiers in a single appliance with integrated software that automatically organizes data for maximum performance, broadcast organizations are better positioned to ensure that their most mission-critical information is readily available while dealing proactively with the economic realities of today's challenging business environment. Implementing a tiered storage architecture that pairs high-performance SSDs with lower-cost, higher-capacity SATA HDDs is an ideal way for broadcasters to ensure they remain on-air without interruption.

Ron Bianchini is president and CEO of Avere Systems.