Anticipating Storage Management Needs

Storage is now the name of the game, and the wake-up call to address it has already sounded. Postponing decisions about long-term digital media storage is simply not an option.

By its very nature, storage must and will evolve to meet the needs of continuous data growth. Expectations for increased storage and performance are tied to a continual rise in service-level expectations. As a result, the supply will never be sufficient and the job of managing storage will never be finished.

Data growth, particularly in the media industry, is a constant. Consider today's storage requirements, and it becomes a given that next year's will be greater, possibly by an order of magnitude or more. There are several suggested methods for controlling the need for more storage, some practical and others not.

The first is to create less content, something that is highly unlikely. The second is to add more local storage, but that, too, has consequences: finite floor space, the performance cost of retrieving the data, limits on capital budgets and so on. A third method is putting the same data into less space, which is where improvements in compression help. A fourth is to manage the storage differently, such as using off-line storage, data migration or archive management, and the controlled purging of expired media.

When a media enterprise takes into account storing material from all sources that generate data, many elements must be considered. E-mails, database records, documents, videos and other digital objects all become information components that must be stored somewhere and somehow. So what does it take to keep up and is there any answer to "how much storage will be enough?"

These are difficult and delicate questions that need to be answered.

STORAGE UPON STORAGE

In the past, each individual server had dedicated storage, sometimes internal to the device and sometimes external in a local storage array. Scaling was accomplished by adding or replacing individual drives, upgrading servers or adding external arrays. It wasn't long before the media servers became a set of disconnected storage islands.

Nonlinear editing platforms had their own storage. Graphics images were kept on small footprint local drive arrays. Video servers had a very limited amount of high-performance storage capacity on small hard drives. Once the video server reached capacity, users were forced to purge valuable content and often had to re-ingest it for programming or interstitials on any given day.

All these woes pointed to poor use of capacity, which in turn resulted in many operational headaches.

The lines between media and data content are getting fuzzier all the time. Regulatory compliance with record-keeping will certainly be extended beyond spreadsheets, accounting databases and e-mail. For a content-generating organization such as a local TV news department, it is only a matter of time before some entity forces the retention of all content to "protect the risks and liabilities" of the stakeholders. And what will this mean if every raw image, spoken word, script, graphic and completed broadcast story must be retained?

The material will need to be kept electronically; and at least for today, that means the data will need to be stored on a magnetic or optical platform!

The challenges of dealing with a continual need for more storage include: not spending too much; controlling growth while containing staffing requirements; retention, backup, protection and security of the data assets; disaster recovery, including failover and failback; and improving performance throughout the network and the application space.

POOLING STORAGE

One method for managing growth from a physical perspective is centralization, or the creation of a storage pool. Consolidation of long- and near-term media assets is one of the fundamental concepts for addressing growth.

The platform that provides immediate access, high bandwidth and high throughput does not have to be the one that stores the assets during off-peak periods. A common storage system, potentially a hybrid of spinning disks and optical or tape-based off-line devices, can be used for lower priority assets.


Such a centralized storage pool can be arranged in any of three basic structures. (See Fig. 1.)

The earliest structure deployed for media servers was direct attached storage (DAS), which provides dedicated block storage directly attached to individual servers.

A storage area network (SAN), whereby multiple servers are connected to shared storage arrays over a network, is the newest. Last, a network attached storage (NAS) system is one in which clients and servers access files over a network using standard file-sharing protocols such as the Common Internet File System (CIFS) or Network File System (NFS).

Due to its limited ability to scale storage, DAS is used mainly for small applications where only modest growth is anticipated. NAS platforms are relatively easy to deploy. Built on specialized servers dedicated to file serving, a NAS will generally come with integrated storage, which can be added to the network with relative ease. Most SANs use higher performance Fibre Channel connectivity because it provides robustness, high bandwidth, fast throughput and low latency.

Gateways provide further connectivity between NAS and SAN hybrid storage systems. Storage pools need not necessarily be confined strictly to spinning magnetic disk drives--they can include DVD or tape systems.
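
For readers who think in code, the difference between file-level (NAS) and block-level (DAS/SAN) access can be sketched in a few lines of Python. The mount point and device path below are hypothetical examples, not references to any particular product.

```python
# A minimal sketch, assuming a hypothetical NAS share mounted at /mnt/nas_media
# (via NFS or CIFS) and a hypothetical DAS/SAN LUN presented to the host as /dev/sdb.

NAS_FILE = "/mnt/nas_media/promo_0457.mxf"   # file-level access through the NAS head
BLOCK_DEVICE = "/dev/sdb"                    # block-level access to a local or SAN-attached LUN

def read_from_nas(path: str, size: int = 1024 * 1024) -> bytes:
    """File-level I/O: the NAS owns the file system; clients simply open files."""
    with open(path, "rb") as f:
        return f.read(size)

def read_from_block_device(device: str, size: int = 1024 * 1024) -> bytes:
    """Block-level I/O: the host owns the file system (or reads raw blocks) on the LUN."""
    with open(device, "rb") as dev:
        return dev.read(size)

if __name__ == "__main__":
    # The code looks nearly identical; the architectural difference is where the
    # file system lives and how the storage is shared among servers.
    print(len(read_from_nas(NAS_FILE)))
    print(len(read_from_block_device(BLOCK_DEVICE)))
```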

For managing the data, automation is often applied to handle migration, retention and retrieval. Metadata, the descriptive information about the data, is used to categorize and validate the movement of data between storage tiers according to work-flow requirements. Tiered storage systems often comprise disk, tape and optical storage devices. For applications focused on data retention, tiered storage is one method that helps lower hardware costs while retaining pools of storage for different service levels.
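
As a rough illustration of how metadata can drive tier placement, the Python sketch below assigns an asset to a tier using simple work-flow rules; the field names, tier labels and thresholds are assumptions made for the example, not part of any vendor's system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical three-tier pool: fast disk, near-line (DVD or low-cost disk), tape archive.

@dataclass
class AssetMetadata:
    """Descriptive information used to categorize and validate movement between tiers."""
    asset_id: str
    last_accessed: datetime
    retention_class: str   # e.g., "news", "interstitial", "compliance"

def choose_tier(meta: AssetMetadata, now: datetime) -> str:
    """Pick a storage tier from simple work-flow rules (illustrative thresholds only)."""
    idle = now - meta.last_accessed
    if meta.retention_class == "compliance":
        return "tape_archive"            # long-term retention requirement
    if idle < timedelta(days=30):
        return "online_disk"             # recently used; keep on fast disk
    if idle < timedelta(days=180):
        return "nearline"                # semi-active; cheaper near-line storage
    return "tape_archive"                # cold content moves to the archive

if __name__ == "__main__":
    meta = AssetMetadata("promo_0457", datetime(2006, 1, 15), "interstitial")
    print(choose_tier(meta, datetime.now()))
```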

Work-flow policies dictate how and when data migration takes place. For example, files that have not been accessed in the past six months are migrated to secondary storage, which could be either near-term DVD or large sets of inexpensive spinning disks. Where long-term storage is required, a tape-based archive is often employed. The archive's data mover application links the asset-management application to the migration process. Coupled with browse and asset-management routines, the archive lets users search and access the data even though it might be semi-offline.
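
Such a policy can be sketched as a simple sweep. The example below walks a primary storage tree and moves any file untouched for six months to a secondary path standing in for the near-term DVD, low-cost disk or tape target; the directory names and the 180-day threshold are assumptions for illustration.

```python
import os
import shutil
import time

# Hypothetical mount points for the primary pool and the secondary (migration) target.
PRIMARY = "/storage/primary"
SECONDARY = "/storage/secondary"
SIX_MONTHS = 180 * 24 * 60 * 60   # policy threshold, in seconds

def migrate_stale_files(primary: str = PRIMARY, secondary: str = SECONDARY) -> None:
    """Move files not accessed in the past six months to secondary storage."""
    cutoff = time.time() - SIX_MONTHS
    for root, _dirs, files in os.walk(primary):
        for name in files:
            src = os.path.join(root, name)
            if os.stat(src).st_atime < cutoff:   # last access falls outside the policy window
                dst_dir = os.path.join(secondary, os.path.relpath(root, primary))
                os.makedirs(dst_dir, exist_ok=True)
                shutil.move(src, os.path.join(dst_dir, name))
                # A production data mover would also record the new location in the
                # asset-management catalog so content stays searchable while semi-offline.

if __name__ == "__main__":
    migrate_stale_files()
```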

Storage and application performance is governed by technology and driven by user needs. Beyond data migration, another solution for improving efficiency is to deploy higher performance storage arrays. Attributes affecting performance include RAID type, number and type of storage processors, amount of cache, number of host ports, internal bandwidth and architecture, number and type of disk drives, and the workload profile.

Installing faster and higher performance arrays unfortunately has a direct relationship to price, so selecting the proper mix of attributes is an important part of storage network system design.

Even with faster arrays, other bottlenecks may still reduce performance. When connectivity between servers and storage is the issue, increasing the number of connections or employing a faster networking technology (e.g., 4 Gb Fibre Channel) may solve the problem. Likewise, decreasing the number of "hops" or interswitch links will increase throughput and provide better connectivity.
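
A quick back-of-envelope check helps decide whether the link itself is the bottleneck. The sketch below estimates how many simultaneous streams a connection can carry; the 4 Gb/s Fibre Channel rate and 50 Mb/s per-stream bit rate are example figures, and the 80 percent efficiency factor is an assumed allowance for protocol overhead.

```python
def max_streams(link_gbps: float, stream_mbps: float, efficiency: float = 0.8) -> int:
    """Estimate how many concurrent streams a link can sustain, allowing for overhead."""
    usable_mbps = link_gbps * 1000 * efficiency   # convert Gb/s to Mb/s, derate for overhead
    return int(usable_mbps // stream_mbps)

if __name__ == "__main__":
    # Example: a 4 Gb/s Fibre Channel link carrying 50 Mb/s video streams
    # supports roughly 64 simultaneous streams at 80 percent usable bandwidth.
    print(max_streams(4.0, 50))
```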

Every user's need for storage may require a customized solution to the overall storage-pool concept. Products with application-specific requirements, such as news editorial systems or production-content rendering, are optimized around dedicated storage solutions.

These applications require not only large amounts of storage, but fast and wide bandwidth delivery performance. The manufacturers of storage systems and content media servers have literally spent thousands of man-hours perfecting their particular solutions.

Those in the manufacturing sector recognize that for long-term homogeneous solutions, a common storage pool is essential. This is one reason why the products offered by dozens of vendors are being accepted and deployed in operations worldwide. However, it is a two-way street. For media asset management solutions to be valuable, the balance with other hardware must be considered.

Karl Paulsen

Karl Paulsen is the CTO for Diversified, the global leader in media-related technologies, innovations and systems integration. Karl provides subject matter expertise and innovative visionary futures related to advanced networking and IP technologies, workflow design and assessment, media asset management, and storage technologies. Karl is a SMPTE Life Fellow, an SBE Life Member and Certified Professional Broadcast Engineer, and the author of hundreds of articles focused on industry advances in cloud, storage, workflow, and media technologies. For over 25 years he has continually featured topics in TV Tech magazine, penning the magazine's Storage and Media Technologies and Cloudspotter's Journal columns.