Storage area networking

Storage area networking or SAN is a server technology that allows the user to separate storage from processing and I/O. SAN frequently connects hard disks, tape drives and other peripherals to a host server. It also allows users to connect more than one server to the same storage peripheral. SAN software can provide elaborate monitoring, backup and load-balancing functions.

One can also use SAN to create a pool of virtual storage that a server treats as if it were local. SAN can comprise local storage on a number of machines, centralized storage or a combination of both.

Unlike a traditional network, a SAN does not involve file transfer, nor does it involve connecting to a remote drive on a server. Instead, a SCSI channel is mapped across the network to the remote device, making the server think that the storage peripheral is directly attached. For this reason, the server treats the storage just as if it were hard-wired to its own peripheral interface. A SAN typically operates separately from a local-area network (LAN) so storage-related functions do not slow LAN traffic.

SAN basics

A SAN consists of three basic components: an interface, interconnects and a protocol. The interface can be the small-computer-systems interface (SCSI), the enterprise system connection (ESCON) or Fibre Channel. The interconnects can be switches, gateways, routers or hubs. The protocol, such as IP or SCSI, controls traffic over the access paths that connect the nodes. These three components plus the attached storage devices and servers form the storage-area network. While a SAN supports a number of interfaces, Fibre Channel (both Fibre Channel Arbitrated Loop, or FC-AL, and Fibre Channel fabrics) dominates SAN implementations due to its flexibility, high throughput (up to 2Gb/s) and inherent fault-tolerant paths.

Figure 1. A SAN separates computing and I/O functions from the storage itself.

One way to think of a SAN is as a high-performance network on the “other side” of a server (see Figure 1). Many networks provide connectivity between a server and remote workstations. A SAN provides connectivity between servers and storage. The purpose of a SAN is to separate computing and I/O functions from the storage itself. Once the storage is separate from the processor, multiple processors or servers can access a pool of common storage, and additional disk storage can be added without having to add processors.

A large SAN system allows many workstations using multiple processors to have access to the same data at (almost) the same time. This lets users improve workflow and efficiency. In a news environment, multiple editors can access the same raw footage to create different packages. In a broadcast playout application, the same content can play out of multiple servers to multiple channels.

Layer by layer

Figure 2 shows a simplified SAN solution employing Fibre Channel. This example illustrates the layers involved in a typical video-server application.

Generally, the application is not aware of the SAN. The application makes storage requests of the operating system and the operating system handles the details. When an application makes a storage-related request, the operating system communicates with the RAID controller through a Fibre Channel-switched network, typically referred to as Fibre Channel fabric, using standard SCSI commands. The SCSI drivers shown in Figure 2 are the drivers responsible for generating SCSI software commands, not SCSI physical connections. This is an important distinction. SCSI commands are still sent across the network. However, using Fibre Channel-switched fabric eliminates the limitations of SCSI hardware.
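To make the distinction between SCSI commands and SCSI cabling concrete, here is a minimal Python sketch that builds the 10-byte command block for a standard SCSI READ(10) request. The function name and the example values are illustrative assumptions, not part of any vendor's driver. The point is that the command is simply a short sequence of bytes, so it can be carried over a Fibre Channel fabric just as easily as over a parallel SCSI cable.

```python
import struct

READ_10_OPCODE = 0x28  # operation code for the SCSI READ(10) command


def build_read10_cdb(lba, blocks):
    """Return the 10-byte command descriptor block for a READ(10).

    lba    -- logical block address to start reading from (32-bit)
    blocks -- number of blocks to transfer (16-bit)
    """
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) carries a 32-bit block address")
    if not 0 <= blocks <= 0xFFFF:
        raise ValueError("READ(10) carries a 16-bit transfer length")
    # Big-endian layout: opcode, flags, LBA, group number,
    # transfer length, control byte.
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, blocks, 0)


# Hypothetical request: read 8 blocks starting at block 2048.
cdb = build_read10_cdb(lba=2048, blocks=8)
print(cdb.hex())
```

Nothing in these bytes describes the physical connection. In this sketch, the job of handing the command to the fabric would belong to the lower layers shown in Figure 2.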

The gigabit linking unit (GLU), Fibre Channel switch and Fibre Channel RAID controller make up the SAN. The GLU is similar to a network interface card (NIC) in an Ethernet system. It provides the physical and electrical interface to the Fibre Channel fabric. Once the SCSI commands reach the RAID controller, the controller saves or retrieves the data from the storage system based upon the configuration of the controller itself. From this point on, communication between the controller and the physical drives is typically SCSI. Since the controller is usually co-located with the disk drives, SCSI limitations are generally not a problem.

Figure 2. This simplified SAN solution employing Fibre Channel illustrates the layers involved in a typical video-server application.

As with any multi-user system accessing shared storage, conflicts can arise when two users request to write to the same record at the same time. Locking systems resolve these conflicts by allowing one user access to the data while temporarily locking out other users. These systems typically do not lock an entire file, but rather lock only the particular record, row or range of bytes that is being modified. Once the write operation is finished, the lock is removed.
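The locking behavior described above can be sketched in a few lines of Python. This is a simplified, in-memory illustration only; the class name, method names and user names are hypothetical, and a real SAN file system would coordinate locks across servers rather than inside one process.

```python
class RangeLockTable:
    """Tracks which byte ranges of a shared file are locked, and by whom."""

    def __init__(self):
        self._locks = []  # entries are (owner, start, end), end exclusive

    def acquire(self, owner, start, length):
        """Lock [start, start + length) for owner; return False on conflict."""
        end = start + length
        for holder, held_start, held_end in self._locks:
            overlaps = start < held_end and held_start < end
            if holder != owner and overlaps:
                return False  # another user is modifying part of this range
        self._locks.append((owner, start, end))
        return True

    def release(self, owner, start, length):
        """Remove a lock once the write is finished.

        Raises ValueError if the owner does not hold that exact range.
        """
        self._locks.remove((owner, start, start + length))


locks = RangeLockTable()
a_got_lock = locks.acquire("editor_a", 0, 100)           # bytes 0-99
b_blocked = not locks.acquire("editor_b", 50, 100)       # overlaps bytes 50-99
b_got_other_range = locks.acquire("editor_b", 100, 100)  # bytes 100-199, no overlap
locks.release("editor_a", 0, 100)
b_got_lock_later = locks.acquire("editor_b", 50, 50)     # bytes 50-99 are now free
```

Because only the overlapping range is refused, the second editor can keep working elsewhere in the same file while the first editor's write is in progress.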

In large SAN systems, redundancy becomes an issue since all of the material is stored in one large system. There are a number of strategies for dealing with the risk, but the most common approach is to provide two SCSI storage systems. This is relatively easy to implement since almost all Fibre Channel SCSI devices are dual port.

There is one important note about SAN hardware you should know. If you purchase a SAN solution, you might be surprised to learn that your installation does not use fiber-optic cable. The Fibre Channel specifications allow networks to be built with copper or fiber. Non-optical Fibre Channel (non-OFC) implementations are fully supported using coax as well.

While FC-AL remains popular for SAN, new technology is available that allows SAN traffic to travel over IP networks. This technology encapsulates the Fibre Channel traffic in IP so that TCP/IP networks can carry it. This approach may deliver lower performance than SAN over Fibre Channel, but it provides a solution for those with large TCP/IP deployments where some users need access to SAN peripherals but do not have ready access to Fibre Channel fabric.

Design issues

SANs raise some interesting design issues. For example, a SAN allows a designer to specify that data be striped across multiple drives, or even multiple locations. If SAN data is not stored in a particular location and on a particular drive, how can it be adequately backed up? The answer is software. Some SAN solutions come with software that automatically creates two copies of any newly stored material. The system makes sure that the two copies are not stored in the same location. Other SAN solutions stripe the data across multiple systems. If one server's local storage becomes unavailable, the SAN recreates the data using well-understood parity algorithms. Be aware, however, that some of these software solutions are vendor-specific, so you may not be able to get these features if you start mixing different vendors' products.
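As an illustration of how parity can recreate missing data, the following Python sketch uses simple XOR parity, the scheme commonly associated with RAID levels such as RAID 5. It is an assumption that any given SAN product works this way; vendors may use other algorithms. The block contents and the function name are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks striped across three systems, plus one parity block.
data = [b"NEWS", b"CLIP", b"0001"]
parity = xor_blocks(data)

# Suppose the system holding the second block becomes unavailable.
survivors = [data[0], data[2], parity]

# XORing everything that is left yields the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'CLIP'
```

This works because XORing a value with itself cancels it out, so the surviving data blocks cancel their own contribution to the parity and leave only the missing block. Simple parity of this kind can recover from the loss of one block per stripe, not two.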

Another issue in SAN design is bandwidth. SAN designers have two simple ways to ensure that the SAN does not fall over if all of its users request data simultaneously. First, they can design the SAN with extra bandwidth capacity, so that it keeps functioning even in times of extremely high demand. Some might argue that this is wasteful and drives up cost. The fact is that high-speed network hardware is falling in price, and is now such a small part of the total system price that this is no longer a consideration. Second, they can insist on careful control of connections to the SAN. If you grow a SAN in an unplanned way, peak-use conditions can exceed its overall bandwidth.
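A rough way to check both points is to compare worst-case demand against link capacity. The Python sketch below is back-of-the-envelope arithmetic only. The 50Mb/s stream rate and the 30 percent headroom are assumed figures chosen for illustration, and the 2000Mb/s capacity is the nominal 2Gb/s Fibre Channel rate mentioned earlier; usable throughput on a real link will be lower than the nominal figure.

```python
def peak_utilization(stream_count, mbps_per_stream, link_capacity_mbps):
    """Fraction of the link used if every stream is active at once."""
    return stream_count * mbps_per_stream / link_capacity_mbps


def max_streams(mbps_per_stream, link_capacity_mbps, headroom_percent):
    """Largest number of simultaneous streams that still leaves the headroom."""
    usable_mbps = link_capacity_mbps * (100 - headroom_percent) // 100
    return usable_mbps // mbps_per_stream


# Assumed figures: 50 Mb/s video streams on a nominal 2000 Mb/s link.
utilization = peak_utilization(20, 50, 2000)
limit = max_streams(50, 2000, 30)

print(utilization)  # 0.5
print(limit)        # 28
```

With these assumed numbers, 20 simultaneous streams use half the link, and the design limit with 30 percent held in reserve is 28 streams. Connecting a 29th stream without revisiting the calculation is the kind of unplanned growth that erodes the reserve.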

Should you use one?

If SAN is so great, why doesn't everyone use it? First, interoperability issues still exist between vendors. Additionally, SAN may turn out to be more expensive when you are looking for a server system that has a low number of I/O channels, but lots of storage. Finally, SAN may not be the way to go if you are looking for a small system. Generally, stand-alone systems are less expensive.

So, where do SANs make the most sense? SANs are best used in larger systems where users want many I/O channels and want to access the same content. As storage prices fall, building one server with a huge amount of storage is not a problem. However, I/O still requires bandwidth inside the server. There are two common strategies for dealing with large I/O requirements. One is to build a large server with what amounts to a router inside it. The other is to connect a number of smaller I/O devices to a network. That is what SAN does. It allows you to break the connection between storage and I/O so that you can increase either of these as needed.

Brad Gilmer is executive director of the AAF Association, executive director of the Video Services Forum, and president of Gilmer & Associates, a technology consulting company.

Send questions and comments to: brad_gilmer@primediabusiness.com