RAIDers of the lost archive

Hard disk storage has been around since 1956 with the advent of the IBM 350, which used fifty 24in platters to store 5 million 6-bit characters (3.75MB of data), providing much faster access to data than was possible by loading punch cards. While we now store 1 million times that much data on a 3.5in drive, the basic issues of data storage are the same as they were 60 years ago: capacity, speed, reliability and recoverability.

One of the common features now found in disk storage is RAID. RAID comes in several flavors — RAID 0, RAID 1, RAID 10, RAID 5, RAID 6 — each of which takes a different approach to the issues of speed and recoverability. This article describes the types of RAID; the newest method, RAID 6; and some hybrids.

What is RAID?

In 1987, with disk prices falling to less than $20/MB and capacity increasing into the hundreds of MB, David Patterson, Garth Gibson and Randy Katz of UC Berkeley’s Computer Science Division published a paper, “A Case for Redundant Arrays of Inexpensive Disks (RAID)” (www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf), which laid out five methods for improving reliability and speed by storing data across multiple disks. Although the word “inexpensive” has since been replaced with “independent,” the numbering system for levels of RAID described in the paper remains in use.

As disk capacity rapidly increased, read/write speeds did not keep up, an issue that persists to this day. To improve I/O, RAID stripes the data in a file across multiple disks, allowing several heads to simultaneously read or write portions of the data. Aggregate read and write speeds then scale with the number of disks used, rather than being limited to the speed of a single head.
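As a rough illustration of striping, the sketch below deals fixed-size blocks round-robin onto a set of disks. The function names `locate_block` and `stripe`, the block size and the in-memory lists standing in for disks are all assumptions made for the example; real controllers choose their own stripe sizes and layouts.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must not be negative")
    return logical_block % num_disks, logical_block // num_disks


def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and deal them round-robin onto the disks."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _offset = locate_block(start // block_size, num_disks)
        disks[disk].append(data[start:start + block_size])
    return disks


if __name__ == "__main__":
    # Six 2-byte blocks spread over four hypothetical disks.
    for disk, blocks in enumerate(stripe(b"ABCDEFGHIJKL", num_disks=4, block_size=2)):
        print(disk, blocks)
```

Because consecutive blocks land on different disks, a large sequential read can keep every disk busy at once, which is where the speed gain comes from.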

Striping the data across several disks, however, increases the chances of data loss. When a file is stored on a single disk, the data is lost when that one disk fails. When the data is striped across four disks, the data is lost if any of those four disks fails. RAID addresses this in two ways: by storing multiple copies of a file on multiple disks, or by calculating parity data, stored either on a dedicated disk or distributed across the array, that can be used to reconstruct data lost to a disk failure. RAID is designed for recovery from a disk failure, not for recovering individual files, and so does not replace backup. However, it does operate in real time, so the data is current as of the point of failure.
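Single-parity schemes are commonly built on XOR: the parity block is the XOR of the data blocks in a stripe, so any one missing block equals the XOR of everything that survives. The sketch below shows that idea with hypothetical helper names (`xor_blocks`, `rebuild_missing`) and toy 4-byte blocks. It is not any vendor's implementation.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


if __name__ == "__main__":
    data = [b"RAID", b"disk", b"fail"]   # one stripe across three data disks
    parity = xor_blocks(data)            # stored on a fourth disk
    # Pretend the second disk died and rebuild its block.
    print(rebuild_missing([data[0], data[2]], parity))  # b'disk'
```

The same arithmetic explains the limit of single parity: with two blocks missing from a stripe, one XOR equation cannot recover both.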

Vendors take different approaches to RAID, implementing it in a storage array, in software or in the storage controller. No standards body defines the RAID levels or certifies vendors’ implementations.

Types of RAID

In addition to the original five RAID levels described by Patterson et al., developers have created several others over the years, including some proprietary versions such as RAID-S, which EMC developed for its Symmetrix storage systems.

The RAID levels commonly in use today are:

  • RAID 0. Designed purely for speed, RAID 0 splits the data evenly across multiple disks without any type of parity information or data redundancy. For applications such as file serving or video streaming where a backup copy exists elsewhere, RAID 0 provides quicker access and higher data rates than a single disk. Do not use it for storing the only copy of a file. (See Figure 1.)

    Figure 1. RAID 0 splits data evenly across multiple disks without parity information or redundancy.

  • RAID 1. RAID 1 takes the opposite approach to RAID 0: redundancy, not speed. RAID 1 doesn’t stripe the data across disks, but creates complete mirrored copies on separate disks. If one disk fails, the other disk takes over. It is useful for applications where redundancy is paramount, and a single disk, perhaps supplemented by a cache, can write fast enough. Read speeds, however, can be faster than those of a single disk, since both disks can serve reads simultaneously. (See Figure 2.)

    Figure 2. RAID 1 mirrors data across multiple disks, emphasizing redundancy over speed.

  • RAID 3. RAID 3 requires a minimum of three disks. It stripes the data at the byte level across two or more of the disks and uses a separate disk to store the parity information generated by the controller. RAID 3 is seldom used for applications with a lot of small data requests, but it performs well for large sequential data transfers, such as editing uncompressed video files.

  • RAID 5. The most common RAID type, RAID 5 combines block-level disk striping with parity. Unlike RAID 3, which stores the parity data on a separate disk, RAID 5 stripes the parity information with the actual data. Under normal operations, the array doesn’t read the parity data. However, when there is a read failure, the array uses the parity information to reconstruct the missing information. (See Figure 3.)

    Figure 3. RAID 5 combines block-level disk striping with parity data and can survive the loss of one disk.

  • RAID 6. RAID 5 can withstand the loss of a single disk. RAID 6 uses two parity blocks per stripe instead of just one, allowing it to survive the loss of two disks. Even if a second disk goes down during a rebuild, no data is lost. (See Figure 4.)

    Figure 4. RAID 6 uses two parity blocks per stripe, so it can survive the loss of two disks.

  • Hybrid RAID. RAID methods can also be combined or nested. The two most common hybrid RAID levels are RAID 10 (or 1+0) and RAID 01 (or 0+1). RAID 10 requires a minimum of four disks paired in two sets. The controller stripes the data across the disk pairs, but each disk in the pair is a mirror of the other. RAID 01 takes the opposite approach, striping the data across one set of disks, which is then mirrored to a second set. (See Figure 5.)

    Figure 5. RAID 10 combines RAID 1 and 0 by striping data across two pairs of disks, where each disk in a pair is a mirror of the other.
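Returning to the RAID 5 entry in the list above, the sketch below shows one way parity can be striped along with the data: the parity block moves to a different disk on each stripe, so no single disk holds all of it. The rotation rule and the name `raid5_layout` are assumptions chosen for illustration; actual implementations use several different rotation schemes.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; each cell is 'P' (parity) or a data label like 'D3'."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    rows = []
    next_block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{next_block}")
                next_block += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(num_disks=4, num_stripes=4):
        print("  ".join(f"{cell:>3}" for cell in row))
```

With four disks and four stripes, each disk ends up holding exactly one parity block, which spreads parity writes across the array instead of concentrating them on one disk as RAID 3 does.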

RAID 6 for broadcast

Each of the RAID methods has its trade-offs in terms of available vs. total disk space, read/write speeds, reliability and processing overhead for computing parity. Banks and broadcasters don’t need the same level of RAID. Even different applications within the broadcast environment can use different RAID levels, such as RAID 0 for video streaming and RAID 3 for editing.
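The space side of that trade-off is simple arithmetic, sketched below for the levels covered in this article. The function name `raid_summary` is hypothetical, and the figures assume equal-size disks, RAID 1 in which every disk mirrors the others, and RAID 10 built from two-disk mirrored pairs. The "failures survived" value is the guaranteed minimum, not the best case.

```python
def raid_summary(level: str, num_disks: int, disk_tb: float) -> dict:
    """Usable capacity and guaranteed disk failures survived for one array."""
    # level -> (minimum disks, disks' worth of usable data, failures survived)
    rules = {
        "0": (2, num_disks, 0),           # striping only, no redundancy
        "1": (2, 1, num_disks - 1),       # every disk holds a full copy
        "3": (3, num_disks - 1, 1),       # one dedicated parity disk
        "5": (3, num_disks - 1, 1),       # one disk's worth of rotating parity
        "6": (4, num_disks - 2, 2),       # two parity blocks per stripe
        "10": (4, num_disks // 2, 1),     # striped two-disk mirrors
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    min_disks, data_disks, survives = rules[level]
    if num_disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    if level == "10" and num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return {
        "usable_tb": data_disks * disk_tb,
        "efficiency": data_disks / num_disks,
        "failures_survived": survives,
    }


if __name__ == "__main__":
    for level in ("0", "5", "6", "10"):
        print(level, raid_summary(level, num_disks=8, disk_tb=4.0))
```

For eight 4TB disks, moving from RAID 5 to RAID 6 under these assumptions costs one more disk's worth of capacity (28TB usable down to 24TB) in exchange for surviving a second failure.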

For simplicity, however, particularly when using shared storage, companies prefer to settle on a single technology to support. The standard has been RAID 5, but anyone who uses RAID 5 should consider RAID 6 at the next upgrade.

The problem with RAID is that disk capacity has far outstripped I/O and network speeds. The Patterson paper discusses a 100MB personal computer disk and a 7.5GB IBM mainframe disk array that filled 24 cubic feet. When RAID came out, even with the slower disks and networks, disks could still be rebuilt in minutes. With the multiterabyte drives in use today, restoration can take hours.
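A back-of-the-envelope estimate shows why. A rebuild has to write the replacement disk's full capacity, so the time can be no shorter than capacity divided by sustained throughput. In the sketch below, the function name `rebuild_seconds` and the throughput figures (1MB/s for the older disk, 150MB/s for the modern one) are illustrative assumptions, not measurements from the paper or from any particular drive.

```python
def rebuild_seconds(capacity_bytes: float, throughput_bytes_per_sec: float) -> float:
    """Lower bound on rebuild time: every byte of the new disk must be written."""
    if capacity_bytes < 0 or throughput_bytes_per_sec <= 0:
        raise ValueError("capacity must be >= 0 and throughput must be > 0")
    return capacity_bytes / throughput_bytes_per_sec


if __name__ == "__main__":
    MB, TB = 1e6, 1e12

    # Assumed: a 100MB disk rebuilt at a sustained 1MB/s.
    old = rebuild_seconds(100 * MB, 1 * MB)
    print(f"100MB disk: {old / 60:.1f} minutes")

    # Assumed: a 4TB disk rebuilt at a sustained 150MB/s.
    new = rebuild_seconds(4 * TB, 150 * MB)
    print(f"4TB disk:   {new / 3600:.1f} hours")
```

Real rebuilds usually take longer than this bound, because the array is also serving normal traffic while it reconstructs the lost data.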

RAID 6 offers greater protection in the event of a failure, providing full service and allowing a rebuild even if a second disk goes down.

Drew Robb is a freelance writer covering engineering and technology. He is the author of the book “Server Management of Windows System,” published by CRC Press.