Intelligent storage - TvTechnology

Intelligent storage

IT technology can offer improved performance at a lower cost for large storage networks.

The migration to HD and the adoption of file-based workflows have created new demands for high-capacity storage. Transporting HD files also requires high performance from the data networks. The media business is not unique in the demands it places upon data storage systems; one only has to look at oil and gas surveying to find large data sets. What is different is that the media business has high demands but, in enterprise terms, relatively small budgets. The desire to contain cost means that broadcasters must look to commodity IT components to meet their needs for large storage networks.

News and reality shows are typical examples where production demands a large pool of online storage for collaborative editing. Leveraging the latest IT technologies can bring improved performance at lower cost for such storage needs.

Desktop video

Broadcasters now want everything available at the desktop. This speeds workflow for general review and approval processes. Operations like captioning can be streamlined. Viewing proxies can be made available to all from online disk arrays. Editing-quality content can be stored in data tape libraries.

Before the introduction of file-based production, the main applications of IT-based systems were in the islands of video editing and playout. Elsewhere video was transported as real-time streams and videotape. As files replace streams, from acquisition through to playout, the door is open to use IT systems.

The workhorse for video storage has been the SCSI drive, with Fibre Channel drives used where the highest performance was needed. Recently there have been several developments that promise to improve performance while lowering cost — just what is needed for HD workflows.

One is the move from parallel to serial disk interfaces; the other has been the ubiquity of Ethernet and IP, with the attendant cost reductions from volume manufacture. Coupled to these hardware developments is the move from dumb to intelligent storage systems.

Data rates

Even the latest technologies are stretched by HD data rates. The Ultra SCSI disk interface has a transfer rate of 160MB/s or 320MB/s (1.3Gb/s and 2.6Gb/s), with a single drive able to sustain data rates of more than 100MB/s. Fibre Channel is moving from 2Gb/s to 4Gb/s, with 8Gb/s promised soon. The serial PCI Express bus can transmit 200MB/s per lane. A 16-lane slot can support a transmit rate of 3.2GB/s, more than twice the rate of the earlier PCI-X 533 standard.
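The byte and bit figures quoted for these interfaces are related by a factor of eight, and the PCI Express slot rate is simply the per-lane rate multiplied by the lane count. The short Python sketch below reproduces that arithmetic. It assumes decimal units (1GB = 1000MB) and takes the per-lane figure as quoted in the text; the function and variable names are illustrative only.

```python
def mbytes_to_gbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mbytes_per_s * 8 / 1000


# Parallel SCSI transfer rates quoted in the text, in MB/s.
ultra_scsi = {"Ultra 160": 160, "Ultra 320": 320}
for name, rate in ultra_scsi.items():
    print(f"{name}: {rate} MB/s = {mbytes_to_gbits(rate):.2f} Gb/s")

# PCI Express aggregate rate, using the per-lane figure quoted in the text.
PCIE_LANE_MB_S = 200
LANES = 16
pcie_x16_gb_s = PCIE_LANE_MB_S * LANES / 1000  # gigabytes per second
print(f"PCI Express x{LANES}: {pcie_x16_gb_s:.1f} GB/s")
```

The results round to the 1.3Gb/s and 2.6Gb/s quoted for Ultra 160 and Ultra 320, and to 3.2GB/s for the 16-lane slot.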

Compare those to HD bit rates for 1080i video sampled at 10-bit, 4:2:2. At a 50Hz field rate, the data rate is 130MB/s, and at 60Hz, it is 156MB/s. Editing a multilayered HD sequence in real time still pushes the limits of the technology. And waiting in the wings are the 1080p/50 and 1080p/60 standards, with a data rate of 3Gb/s.
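The 1080i figures can be checked from first principles. The sketch below is a minimal calculation, assuming an active picture of 1920 x 1080 pixels with blanking and audio ignored, nominal frame rates of 25 and 30 frames per second (half the field rate), and an average of two 10-bit samples per pixel for 4:2:2 sampling. The function name is illustrative.

```python
def active_video_rate(width, height, frames_per_s,
                      bits_per_sample=10, samples_per_pixel=2):
    """Return the active-picture data rate in bits per second.

    4:2:2 sampling averages two samples per pixel: one luma sample,
    plus Cb and Cr each sampled at half the horizontal rate.
    """
    return width * height * frames_per_s * bits_per_sample * samples_per_pixel


# Interlaced video carries two fields per frame, so the frame rate
# is half the field rate.
for label, frames in (("1080i, 50Hz field rate", 25),
                      ("1080i, 60Hz field rate", 30)):
    bits = active_video_rate(1920, 1080, frames)
    print(f"{label}: {bits / 8e6:.0f} MB/s ({bits / 1e9:.2f} Gb/s)")
```

Under these assumptions the calculation gives about 130MB/s and 156MB/s, matching the figures in the text.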

Interconnects

The first digital video systems used parallel interconnections. They were replaced by the serial 270Mb/s interconnection that we still use today, now joined by HD-SDI at 1.5Gb/s. The old parallel interconnections suffered from several disadvantages:

  • Data skew limited cable length.
  • The 25-pin connectors had a low packing density that made devices such as routers very large.
  • The cost of cables and cable termination was high.

Hard drive interfaces are going through a similar evolution. For many years, parallel SCSI and IDE/ATA have been the most popular disk interfaces. SCSI is used for enterprise applications with a high duty cycle, and ATA is used for desktop computers. (Video applications use the more robust SCSI drives.) Both systems support a daisy-chained bus, where many disks share the same bus adaptor to the host processor. This architecture has a disadvantage: the shared bus represents a data bottleneck. Parallel data transfer is also limited by factors such as cable skew and crosstalk, and the Ultra 640 interface has been supplanted by serial standards.

SCSI

The small computer system interface (SCSI) is a family of standards that describes command sets and physical interfaces for the interconnection of storage devices such as tape and hard drives. The term SCSI interface usually refers to a SCSI parallel interface carrying SCSI commands. Ultra 160 and 320 are the current 16-bit parallel standards. Recent developments based on the serial ATA (SATA) interface have created a new standard, Serial Attached SCSI or SAS.

The standard removes the many disadvantages of the parallel SCSI interface and provides higher performance. The low cost of high-speed serial transceivers means it is now cost-effective to replace parallel SCSI with simpler serial interconnections. SAS has a roadmap to 12Gb/s with the initial product rated at 3Gb/s. Compare that with the upper limit of parallel SCSI of 2.6Gb/s (Ultra 320). The serial connectors and cables are more compact than the parallel interconnections, a great advantage for building high-density disk arrays.

Serial attached SCSI is compatible with SATA, which allows common components to be used in devices in order to lower manufacturing cost. SAS connectors carry two ports, allowing for fail-over system design.

SAS replaces the manually set device identifier (ID), the bane of video editors when moving jobs around on SCSI drives, with a globally unique ID, so no user interaction is required when attaching drives. A discrete signal path is used for each drive rather than a daisy chain, so the user does not have to worry about terminators (as with parallel SCSI). SAS disks can be hot plugged, whereas parallel drives cannot be added or removed while the bus is active. (See Figure 1.)

Ethernet

GigE or 1Gb/s Ethernet is standard for current networks, but for many storage applications, it lacks speed. The next step up is 10GigE, which has been implemented in copper and fiber versions. For short runs, the copper standard 10GBase-CX4 offers cost-savings. With a range of 15m, it is sufficient for cabling within a rack. The fiber version can use single- or multimode cable, depending on the range required.
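To see why GigE falls short for uncompressed HD, the following sketch estimates how many uncompressed 1080i streams fit on each link. It is a rough upper bound only: it assumes the raw line rate with no protocol overhead, and it uses the 130MB/s and 156MB/s stream rates quoted earlier in the article. The names are illustrative.

```python
LINKS_GBIT_S = {"GigE": 1, "10GigE": 10}
STREAMS_MBYTE_S = {"1080i/50": 130, "1080i/60": 156}


def streams_per_link(link_gbit_s, stream_mbyte_s):
    """Whole uncompressed streams that fit in the raw link rate."""
    link_mbyte_s = link_gbit_s * 1000 / 8  # Gb/s to MB/s, decimal units
    return int(link_mbyte_s // stream_mbyte_s)


for link, gbit in LINKS_GBIT_S.items():
    for fmt, rate in STREAMS_MBYTE_S.items():
        print(f"{link} carries {streams_per_link(gbit, rate)} x {fmt}")
```

A 1Gb/s link offers 125MB/s at best, which is below the rate of a single uncompressed 1080i stream, while 10GigE has room for several.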

The high capacity of 10GBase-CX4 suits it to links within storage systems. It and InfiniBand find applications interlinking storage nodes. InfiniBand is a serial interconnection that can provide low-latency connections between storage and processing clusters.

The SAN and NAS

In enterprise computing systems, the clients are generally all running a single OS, namely Windows. In creative applications, there is more often a mix of Windows, Mac OS X and Linux. Most storage area network (SAN) products will only support one OS per head, and a mix of OSs complicates the design.

The pressures on media businesses to reduce overheads can militate against Fibre Channel SANs. Supporting them needs specialist knowledge beyond that of the average IT technician.

Network attached storage (NAS) is a popular alternative. The support requirement is similar to that of a basic file server. A NAS cannot, however, be considered a replacement for a SAN, because its data transfer rate is constrained by the network interface. For large collaborative projects, several NAS appliances may be needed to serve the workflow. Unfortunately, the files must then be split across the several appliances. This adds a management overhead, as jobs may have to be transferred from one NAS to another to free space or consolidate. In contrast, a SAN pools all the storage for common access.

The basic difference between a SAN and a NAS is access to the data, block or file, respectively. The special demands of the media business — large file sizes and high data rates for real time transfer — have led to the search for an alternative that meets the needs of collaborative workflow with cross-platform support, yet at reasonable cost. As products evolved, hybrid solutions have been developed that feature the advantages of both file and block-based storage. As a consequence, the boundaries between the two architectures have become somewhat blurred.

Intelligent storage

The NAS and SAN architectures use interconnected dumb drives. With the low cost of processing power today, the limits of the SAN and NAS can be overcome by distributing intelligence through the storage system. One or more disks, a CPU and network interface can be packaged as a single storage device. Many of these units are then linked to form a large and scalable storage system. With intelligence in the storage array, more sophisticated data redundancy than RAID can be constructed. (See Figure 2.)

Some intelligent storage products use a separate metadata server to manage file requests and direct the client to the requisite storage unit to access the media file (much like a SAN). Other products distribute the file system across a cluster of storage units, with each able to handle read/write requests. In both cases, file locking for multiple write access and file replication for redundancy are handled by the management layer.
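The metadata-server arrangement can be pictured with a toy model. The sketch below is not any vendor's design; the class names, the placement rule (pick the unit holding the fewest files) and the file paths are all hypothetical. It shows only the division of labor: the metadata server answers "where is the file?", and the client then transfers the data directly to or from the storage unit.

```python
class StorageUnit:
    """One intelligent storage unit: disks, CPU and network port in one box."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class MetadataServer:
    """Tracks which unit holds each file; it never carries the media itself."""

    def __init__(self, units):
        self.units = units
        self.location = {}  # file path -> StorageUnit

    def place(self, path):
        # Hypothetical placement rule: choose the unit with the fewest files.
        unit = min(self.units, key=lambda u: len(u.files))
        self.location[path] = unit
        return unit

    def locate(self, path):
        return self.location[path]


units = [StorageUnit("unit-a"), StorageUnit("unit-b")]
server = MetadataServer(units)

# The client asks the server where to write, then talks to the unit directly.
for path in ("news/clip1.mxf", "news/clip2.mxf", "news/clip3.mxf"):
    server.place(path).write(path, b"media essence for " + path.encode())

# Reading follows the same two-step pattern: locate, then direct access.
unit = server.locate("news/clip2.mxf")
print(unit.name, unit.read("news/clip2.mxf"))
```

Because the media bytes never pass through the metadata server, adding storage units adds bandwidth as well as capacity.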

Intelligent storage can incorporate load balancing to handle client applications, and automatic file migration to redistribute files across the storage as older files are deleted or migrated to archive storage.

Object-based storage devices (OSD)

Object-based storage is a generic term for clustered storage with distributed intelligence. Files are split into several storage objects, which are then distributed across an array of object storage devices. Tasks such as block allocation are now managed by the storage device. This relieves the metadata server of those low-level operations, which represent a bottleneck in very large storage systems. The metadata server is left with the task of mapping files to objects and ensuring the redundancy of objects. The traditional method of redundancy is RAID.
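As an illustration of how a file maps to objects, the sketch below splits a file into fixed-size objects and spreads them round-robin across three devices. It is a simplified model under stated assumptions, not the T10 OSD command set: the class and function names, the object size and the round-robin rule are all hypothetical. Each device chooses its own object IDs, standing in for the block allocation that the device manages internally.

```python
class ObjectStorageDevice:
    """A device that stores whole objects and allocates space itself."""

    def __init__(self, name):
        self.name = name
        self._objects = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        """Store an object and return the ID chosen by the device."""
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: int) -> bytes:
        return self._objects[object_id]

    def object_count(self) -> int:
        return len(self._objects)


def write_file(data: bytes, devices, object_size: int):
    """Split data into objects, spread them round-robin, return the file map."""
    file_map = []  # ordered list of (device, object_id): the metadata
    for index, start in enumerate(range(0, len(data), object_size)):
        device = devices[index % len(devices)]
        file_map.append((device, device.put(data[start:start + object_size])))
    return file_map


def read_file(file_map) -> bytes:
    """Reassemble a file by fetching its objects in order."""
    return b"".join(device.get(object_id) for device, object_id in file_map)


devices = [ObjectStorageDevice(f"osd-{n}") for n in range(3)]
original = bytes(range(256)) * 4           # 1024 bytes of test data
file_map = write_file(original, devices, object_size=100)

print([d.object_count() for d in devices])  # objects held by each device
print(read_file(file_map) == original)      # file reassembles intact
```

The metadata here is only the ordered list of device and object ID pairs; where each object sits on disk is the device's own business.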

Object-based storage devices (OSD) can provide redundancy by replicating objects across more than one device. RAID systems have always been vulnerable to failure during a drive rebuild, often necessitating the restoration of a file from archive tapes. A smart management system knows where copies of the object are located and can rebuild a copy very quickly; RAID rebuilds can take many hours.
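The rebuild advantage can be sketched in a few lines. The model below is hypothetical throughout: device names, the two-copy policy and the placement rule are assumptions for illustration, and a real system would also weigh capacity and load. The point it shows is that after a failure only the objects that lost a replica need copying, and the copies come from several surviving devices rather than from one rebuilt drive.

```python
def place_replicas(object_ids, devices, copies=2):
    """Map each object to `copies` distinct devices, rotating the start."""
    placement = {}
    for n, object_id in enumerate(object_ids):
        placement[object_id] = [devices[(n + k) % len(devices)]
                                for k in range(copies)]
    return placement


def rebuild(placement, failed, devices):
    """Restore the replica count after a device fails.

    Returns the (object_id, source, target) copies that were needed.
    Assumes at least one surviving device does not already hold the object.
    """
    survivors = [d for d in devices if d != failed]
    copies_made = []
    for object_id, holders in placement.items():
        if failed not in holders:
            continue  # this object still has all of its replicas
        holders.remove(failed)
        source = holders[0]  # a surviving copy to read from
        target = next(d for d in survivors if d not in holders)
        holders.append(target)
        copies_made.append((object_id, source, target))
    return copies_made


devices = ["osd-0", "osd-1", "osd-2", "osd-3"]
placement = place_replicas(range(8), devices)

copies_made = rebuild(placement, failed="osd-1", devices=devices)
for object_id, source, target in copies_made:
    print(f"object {object_id}: copy {source} -> {target}")
```

In this toy run, half of the eight objects are untouched by the failure, and the remaining copies are read from more than one surviving device.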

The intelligent storage devices can serve files directly to the initiating applications, avoiding the network interface bottlenecks of a NAS.

The T10 committee of INCITS (International Committee for Information Technology Standards) is responsible for the SCSI standards. It has defined formal standards for object-based storage devices.

Advantages

Intelligent storage can offer many advantages to broadcasters. The systems scale to the large sizes that are needed for HD production. Smart storage systems allow a 100TB system to appear as a single drive to applications and support multiple operating systems. The maintenance overheads of Fibre Channel SANs are avoided, with the move to Ethernet and IP networks. The driver is the low cost of CPU power, which means it can be distributed throughout the storage rather than concentrated in a metadata server.

Commodity disks have 500GB capacity (1 hour of uncompressed HD), with 1TB around the corner. Serial interconnections promise lower costs. They also need less rear-panel real estate to connect ever-larger disk arrays for the creation of random-access content libraries.
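The one-hour figure follows from the 1080i data rate quoted earlier. A minimal check, assuming decimal units (1GB = 1000MB), the 130MB/s rate for a 50Hz field rate, and no file system overhead; the function name is illustrative:

```python
def recording_hours(capacity_gb: float, rate_mb_s: float) -> float:
    """Hours of video that fit on a disk at a constant data rate."""
    seconds = capacity_gb * 1000 / rate_mb_s
    return seconds / 3600


for capacity_gb in (500, 1000):
    hours = recording_hours(capacity_gb, 130)
    print(f"{capacity_gb}GB holds about {hours:.1f} hours at 130MB/s")
```

At the 156MB/s rate for a 60Hz field rate, the same 500GB disk holds somewhat less than an hour.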

It is no longer a dream for a production staff to have the content library available to view from the desktop, and with instant access. This can only be achieved by a move away from traditional video hardware and the adoption of the latest IT hardware and systems.

The second edition of David Austerberry's book “Digital Asset Management” is available from several booksellers. It also can be ordered directly from the publisher at www.focalpress.com.