By Arvind Subramanian | Published: 01 Jan 2007 | Last updated: 01 Jan 2007

RAID is the abbreviation for "Redundant Array Of Independent Disks." It refers to the technology of storing data with a higher degree of protection and/or performance than regular storage.

There are many different "levels" of RAID; typical examples are RAID 0, RAID 1, RAID 1+0, RAID 0+1, and RAID 3+0, among many others. Each level takes a different approach to the storage of data.

RAID works on the concepts of Data Mirroring, Data Striping, and Parity Checks. That's where the levels come in: RAID 0 uses only striping, RAID 1 uses only mirroring, and further levels include parity checks and/or a combination of these. You get many permutations as a result, but RAID always uses at least two hard disks that work as a single unit. 

The Basic Concepts
Data Mirroring
Simply put, mirroring means making simultaneous writes of the same block of data on multiple disks. That's where the "Redundant" in "RAID" comes from. It is expensive but far from redundant in transaction- and data-intensive applications such as accounts, finance, and banking. In the case of a hard disk crash, data can be restored from a "mirrored" disk. 
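
As a rough illustration of mirroring, here is a minimal Python sketch. The MirroredArray class and its in-memory "disks" are hypothetical stand-ins, not how any real controller is implemented; they only show that every write lands on all disks, so a read can still succeed after one disk fails.

```python
class MirroredArray:
    """Toy model of mirroring: every block is written to all disks."""

    def __init__(self, num_disks=2):
        # Each "disk" is a dict mapping block number -> bytes (an assumption
        # made for illustration; real disks are block devices).
        self.disks = [dict() for _ in range(num_disks)]

    def write(self, block_no, data):
        # The same block goes to every disk that is still alive.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def fail(self, disk_no):
        # Simulate a crash by dropping the disk entirely.
        self.disks[disk_no] = None

    def read(self, block_no):
        # Serve the read from the first surviving disk that holds the block.
        for disk in self.disks:
            if disk is not None and block_no in disk:
                return disk[block_no]
        raise IOError("block unavailable on every disk")


array = MirroredArray(num_disks=2)
array.write(0, b"ledger entry")
array.fail(0)                 # first disk crashes
recovered = array.read(0)     # data is still served from the mirror
```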

Data Striping
Here, a stream of data is divided into blocks, and each block is written to a different disk in the array. The writing of the blocks takes place concurrently after the division, thus increasing the speed of the write operation. Concurrent reads, too, are faster: if you need 10 KB of data, 5 KB can come from each disk at the same time, so theoretically, you get double the data rate. This, however, is a broad generalisation. 
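
The division into blocks can be sketched in a few lines of Python. The function names stripe and unstripe, the 1 KB block size, and the list-of-lists "disks" are assumptions chosen for illustration; a real array does this in the controller or driver and issues the per-disk operations concurrently.

```python
def stripe(data, num_disks, block_size):
    """Split data into blocks and deal them round-robin onto num_disks lists."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        disks[(offset // block_size) % num_disks].append(block)
    return disks


def unstripe(disks):
    """Reassemble the original stream by reading blocks in round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


payload = bytes(range(10)) * 1024          # 10 KB of sample data
disks = stripe(payload, num_disks=2, block_size=1024)
per_disk = [sum(len(block) for block in disk) for disk in disks]
```

With two disks, per_disk shows 5 KB held on each, which is the 10 KB example from the paragraph above.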

Parity Checks
A "parity check code" is generated during the write of every block of data and stored on one of the disks in the array. In the common RAID levels it is the bitwise XOR of the corresponding blocks on the data disks, so it does more than flag errors: if one disk fails, each missing block can be recomputed from the surviving blocks and the parity. This increases the reliability of the data, and thus that of the entire database in use.
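
One common way to compute such a parity code is a bitwise XOR across the data blocks. The sketch below assumes three data disks and one parity block; the helper name xor_blocks and the sample blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bitwise XOR of equal-length byte blocks, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (sample values, purely illustrative).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR the survivors with the parity block
# to recompute the missing data.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, rebuilt equals the lost block b"BBBB". The same operation works whichever single block is missing.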

What RAID Can Do
RAID levels 1, 0+1, 1+0, 5, 6, 5+0, and 5+1 allow a single hard disk to fail while keeping the data on the system accessible to users. Users in general would not realise that a disk has failed, and would continue to work normally. RAID allows the failed disk to be replaced, and the data restored and updated, without interrupting work: data from the working disk can be mirrored back on to the new disk. By contrast, backups taken on tape or on a separate hard drive are not "online" or in real time; there will necessarily be a delay before the system can be restored.

Striping can also work on units smaller than a block. With byte-level striping (RAID 3), each byte of the stream is stored on a different disk; with bit-level striping (RAID 2), data is broken down into bits that are stored on different disks. This is very useful in applications that deal with large image files, as well as video editing applications, which demand a good amount of speed.
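
Byte-level distribution can be sketched with Python slicing. This is a toy model that ignores the parity disk; the function names stripe_bytes and merge_bytes are hypothetical.

```python
def stripe_bytes(data, num_disks):
    """Send byte 0 to disk 0, byte 1 to disk 1, and so on, wrapping around."""
    return [data[k::num_disks] for k in range(num_disks)]


def merge_bytes(disks):
    """Interleave the per-disk byte runs back into the original stream."""
    out = bytearray(sum(len(disk) for disk in disks))
    for k, disk in enumerate(disks):
        out[k::len(disks)] = disk
    return bytes(out)


pieces = stripe_bytes(b"ABCDEFG", 3)
restored = merge_bytes(pieces)
```

Here pieces holds b"ADG", b"BE", and b"CF", so any long run of consecutive bytes is spread evenly over all three disks.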

What RAID Cannot Do
RAID can help in the recovery and restoration of data, but it cannot protect it. It cannot, for example, stop viruses or malicious code from attacking data. Any such attack will cause simultaneous and equal harm to all the disks in the system.

RAID does not simplify disaster recovery. Restoration from tape backups is still much simpler than recovery from a disk attached to the array in question.

RAID also cannot provide a performance boost to all running applications. Increasing the data transfer rate does little to help desktop users, since most files that are accessed are typically very small. Disk striping using RAID 0 increases the performance of a sequential read/write operation (such as one for a single, large video file), but does hardly anything for a random seek. For users with high performance as a goal, it is better to buy a faster and bigger single disk with a higher rpm and a larger buffer than to run two smaller drives under RAID 0.
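
A back-of-the-envelope model makes the point concrete. The figures below (a 9 ms seek, 60 MB/s per disk, a 4 KB file and a 700 MB file) are assumptions picked for illustration, not measurements, and the model ignores caching and controller overhead.

```python
def read_time_ms(size_kb, seek_ms, rate_mb_s, num_disks=1):
    """Rough model: one seek, then a transfer split evenly across num_disks."""
    transfer_ms = size_kb / 1024 / (rate_mb_s * num_disks) * 1000
    return seek_ms + transfer_ms


# Assumed figures for illustration only.
SEEK_MS = 9
RATE_MB_S = 60

small_single = read_time_ms(4, SEEK_MS, RATE_MB_S)
small_raid0 = read_time_ms(4, SEEK_MS, RATE_MB_S, num_disks=2)
large_single = read_time_ms(700 * 1024, SEEK_MS, RATE_MB_S)
large_raid0 = read_time_ms(700 * 1024, SEEK_MS, RATE_MB_S, num_disks=2)

small_gain = small_single / small_raid0   # speed-up for the small file
large_gain = large_single / large_raid0   # speed-up for the large file
```

Under these assumptions the large file reads almost twice as fast on two striped disks, while the small file gains well under one percent, because its read time is dominated by the seek.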

Problem Areas
Unlike single disks, RAID systems are not easily interchangeable. The RAID BIOS, which controls the reads and writes of the data to the disks in the array, must be made available to the operating system. Moreover, RAID controllers make use of different formats depending on the level used. Even moving the controllers to a new system might result in a degradation in performance.

Proprietary RAID
Different vendors have their own adaptations of RAID:
DVRAID by ATTO Technology: This was developed for digital audio and video applications, which typically require a lot of speed. RAID 0 provides speed but not fault tolerance. ATTO used the striped parity concept, where a parity check code is generated, and the code itself is striped and stored across disks.

RAID S or Parity RAID by EMC Corporation: A single disk is used to store the parity check code, the difference being that it is distributed across different volumes. 

Matrix RAID by Intel (ICH6R RAID BIOS): Nothing new, except for the fact that each disk in the array can hold part of a RAID 0 volume as well as part of a RAID 1 volume.