The Virtualization Of Storage
March 18, 2009
The term "bit bucket" refers to agnostic physical storage, usually on disk drives, that can be used as a repository for all sorts of data in almost any range of formats. The data bits need not be configured for a particular server's native storage architecture; the bucket is simply the place where bits are stored until they are used for something.
Bit buckets vary widely in size. In a large centralized storage system, the bucket might be composed of an array of 1 to 2 terabytes of high-performance drives used for mission-critical or online applications. When the bucket combines online and nearline storage, the disk arrays might range from a few terabytes to several hundred terabytes. On the grandest of storage scales, one that includes local tape-based or "deep-archive" storage plus spinning nearline disks, the store could range from less than a petabyte to many petabytes of usable storage space for data assets.
As these levels of storage continue to expand, as digital asset management systems begin to categorize and systemize the myriad assets in their domains, and as the ability of humans to manage these huge volumes of assets diminishes, the requirements for new forms of storage management and systemization grow. One of the means used to cope with this proliferation of storage compartments is to abstract logical storage from physical storage. We use the term "storage virtualization" to describe this abstraction, which may be applied at any layer in the storage software and hardware stack.
For our discussion of storage virtualization, we first need to define some base terminology. In a computer or media asset storage system, logical drives are those elements that provide an area of usable storage capacity. These areas usually span multiple physical disk drive components. On such a storage system, a logical drive could be thought of as a partition, a logical volume or a virtual disk. The term "logical" is used because the storage drive does not actually exist as a single physical entity in its own right. Instead, as mentioned, it is an "abstraction" of the physical storage entity. The physical drives, by contrast, are the actual media (the disks, tape storage and optical storage) on which the bits are stored.
Abstraction is the process of removing characteristics from an element or set of elements so as to reduce it to a set of essential characteristics. It can be thought of as a generalization obtained by reducing the information content of a concept or an observable phenomenon.
For example, the My Documents folder on your Windows desktop could be the abstraction of all the information kept in your storage system (i.e., your local, offline or network attached storage), logically arranged so it can be virtually accessed from a central location, regardless of where it is actually found on physical storage.
The hardware and/or software storage stacks are those sets of subsystems or components that are necessary to deliver a fully functional solution. Storage subsystems, usually a form of RAID, present logical disks (also known as partitions or volumes) to the storage area network (SAN). Those logical disks are carved from the RAID arrays themselves, which in turn contain the actual physical disks.
Storage virtualization is an outgrowth of conventional partitioning schemes that have become too complicated to manage or grow as the physical requirements for storage continue to expand. As stores grow, allocating that space becomes a burden that erodes their effectiveness, and all sorts of issues begin to surface.
One of the more flexible methods used for allocating storage space on mass storage devices is referred to as logical volume management (LVM). The volume manager handles the concatenation, striping or merging of partitions into much larger virtual stores (volumes) that can be moved or resized, sometimes while in operation. LVM is an abstraction layer that resides over the hard drives. It allows the operating system kernel to access the filesystems normally, even though each filesystem may span, or be composed of, multiple hard drives.
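The concatenation case can be illustrated with a short sketch. This is a toy model written for this discussion, not the code of any actual volume manager; the class name, the drive names and the block counts are all assumptions chosen for illustration. It shows how a logical block number might be translated into a drive and an offset on that drive, and how a volume might grow when a drive is added.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatenatedVolume:
    """Toy logical volume that concatenates several physical drives.

    Hypothetical illustration only; real volume managers track extents,
    metadata and redundancy that are omitted here.
    """

    def __init__(self, drive_sizes):
        # drive_sizes maps a (made-up) drive name to its capacity in blocks.
        self.names = list(drive_sizes)
        # Running totals mark the logical block at which each drive ends.
        self.ends = list(accumulate(drive_sizes[n] for n in self.names))

    @property
    def size(self):
        """Total capacity of the logical volume, in blocks."""
        return self.ends[-1] if self.ends else 0

    def locate(self, logical_block):
        """Translate a logical block number to (drive name, block on drive)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block outside the volume")
        # Find the first drive whose end boundary lies past the block.
        index = bisect_right(self.ends, logical_block)
        start = self.ends[index - 1] if index else 0
        return self.names[index], logical_block - start

    def extend(self, name, blocks):
        """Grow the volume by appending another drive to the end."""
        new_end = self.size + blocks
        self.names.append(name)
        self.ends.append(new_end)


volume = ConcatenatedVolume({"disk0": 100, "disk1": 200})
print(volume.locate(150))   # a block that falls on the second drive
volume.extend("disk2", 50)  # the volume grows; existing mappings are unchanged
print(volume.size)
```

The caller only ever sees one block range, from zero to the volume size. Which drive actually holds a given block is a detail kept inside the abstraction layer.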
When enterprises need to add storage, they must address many factors. A partial list includes adding drives to existing RAID chassis; adding more RAID chassis; consolidating onto a SAN the user workstation data that previously resided only on local hard drives; and harmonizing the filesystem.
The concept of storage virtualization is not new to the data storage industry, but it has taken on new prominence recently. Storage virtualization includes those processes for handling the consolidation, or "pooling," of physical storage, which may be composed of multiple network storage devices, such that all of the storage appears to be a single storage entity. These stores are then managed from a central control point, i.e., a console.
File virtualization is the extension of storage virtualization. Defined as an abstraction layer that lies between the file servers and the clients, file virtualization comprises those processes that access the file servers on the client's behalf, allowing a user to access a file without having to know where that file is stored.
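A minimal sketch of that layer might look like the following. It assumes a simple lookup table sitting between clients and file servers; the class name, the server names and the paths are hypothetical and are not drawn from any particular product. The client asks for a logical path, and the layer answers with the server and physical path where the file currently lives.

```python
class FileNamespace:
    """Toy file virtualization layer: logical paths map to physical locations.

    Hypothetical illustration only; a real product would also move the data,
    handle locking and cope with server failures.
    """

    def __init__(self):
        # logical path -> (server name, physical path on that server)
        self._locations = {}

    def publish(self, logical_path, server, physical_path):
        """Make a file visible to clients under a stable logical path."""
        self._locations[logical_path] = (server, physical_path)

    def resolve(self, logical_path):
        """Return where the file currently lives; clients never store this."""
        return self._locations[logical_path]

    def migrate(self, logical_path, new_server, new_physical_path):
        """Record that a file has moved; the logical path stays the same."""
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = (new_server, new_physical_path)


namespace = FileNamespace()
namespace.publish("/projects/promo.mov", "nas-a", "/vol1/0001.mov")
print(namespace.resolve("/projects/promo.mov"))

# Retiring the (made-up) server "nas-a": only the mapping changes.
namespace.migrate("/projects/promo.mov", "nas-b", "/vol7/0001.mov")
print(namespace.resolve("/projects/promo.mov"))
```

Because clients hold only the logical path, the file can be relocated to another server without any change on the client side.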
The principal values in employing storage virtualization for SANs include the benefits of a single point of administration and the nondisruptive migration of data from one storage platform to another, such as when adding new storage or retiring legacy storage.
Additional features include control over information lifecycle management, which assures users that their information is placed on the tier of storage best suited to the application using it, and efficiency of allocation, that is, managing the best use of the varying classes of storage.
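As a rough illustration of tier placement, the sketch below assigns an asset to online, nearline or archive storage. The rule it uses (days since last access plus a mission-critical flag) and the 30-day and 365-day thresholds are assumptions made up for this example; an actual lifecycle policy would be set by the enterprise and could weigh many other factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical description of a stored asset."""
    name: str
    days_since_access: int
    mission_critical: bool = False


def choose_tier(asset, online_days=30, nearline_days=365):
    """Pick a storage tier for an asset.

    The thresholds are illustrative assumptions, not recommended values.
    """
    # Mission-critical or recently used assets stay on the fastest tier.
    if asset.mission_critical or asset.days_since_access <= online_days:
        return "online"
    # Assets used within the past year move to slower spinning disk.
    if asset.days_since_access <= nearline_days:
        return "nearline"
    # Everything else goes to tape or other deep-archive storage.
    return "archive"


for asset in (
    Asset("playout-schedule", days_since_access=400, mission_critical=True),
    Asset("last-week-promo", days_since_access=7),
    Asset("spring-campaign", days_since_access=120),
    Asset("1998-newsreel", days_since_access=2000),
):
    print(asset.name, "->", choose_tier(asset))
```

Keeping the policy in one function reflects the single point of control that virtualization provides: changing a threshold changes placement across every storage system behind the abstraction layer.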
Storage virtualization helps to mitigate the challenges associated with disaster recovery operations, particularly when the enterprise desires to maintain agnosticism among storage vendors, disk arrays and even discrete disk drive systems themselves. Having this abstraction layer residing between various storage systems helps to lower the total cost of ownership by providing a single control element as opposed to multiple sets of systems all being controlled individually.
Storage virtualization will become increasingly evident as we advance efforts toward workflow collaboration or the consolidation of asset services on a global, regional, local or even internal basis. Storage virtualization can be implemented as a software suite, or it can be one of many elements that are transparent to the user while being integrated into an enterprise-level asset management system.
Karl Paulsen is chief technology officer for AZCAR Technologies. Karl is a SMPTE Fellow and an SBE Life Certified Professional Broadcast Engineer. Contact him at firstname.lastname@example.org.