RAID
Redundant Array of Independent (or Inexpensive) Disks

 

*** Should you use RAID?  No !!  (see "RAID - not such a Clever Idea for the Home" - there are actually tons of articles about the problems with RAID)

*** instead - invest your money in a Fast drive, with a good backup drive.  Don't mirror - instead run overnight backups.

There are two main types of RAID that home users will buy - both are problematic :

*** see also http://www.acnc.com/04_00.html for an excellent overview

Spanning (Non-RAID) - some vendors will tell you that they have two types of RAID 0 - Spanning and Striping.  But Spanning is not RAID at all.  It is simply a method of storing data across two hard drives: the data "spans" both drives, and the system views them as a single drive.  In that respect, it looks to the user like RAID 0.  Spanning combines the drives so that you appear to have one large drive (Drive A + Drive B), so 2 x 500GB drives would appear as one 1000GB drive on your computer.

DO NOT USE SPANNING !!  It has all of the disadvantages of striping, but none of the benefits.
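To see why spanning gets none of the speed benefit, here is a rough sketch of where each logical block lands under spanning versus striping. The function names and the one-block stripe size are assumptions made for illustration; real controllers use larger, configurable stripe sizes.

```python
def span_location(block, blocks_per_drive):
    """Spanning: fill drive 0 completely, then continue on drive 1, and so on.

    Returns (drive index, block offset on that drive).
    """
    return block // blocks_per_drive, block % blocks_per_drive


def stripe_location(block, num_drives):
    """Striping: consecutive blocks rotate across the drives.

    Returns (drive index, block offset on that drive).
    """
    return block % num_drives, block // num_drives


# Which drive holds each of the first four logical blocks?
# Assumed setup: 2 drives with 1000 blocks each.
span_drives = [span_location(b, 1000)[0] for b in range(4)]
stripe_drives = [stripe_location(b, 2)[0] for b in range(4)]
print(span_drives)    # [0, 0, 0, 0]
print(stripe_drives)  # [0, 1, 0, 1]
```

With spanning, neighbouring blocks sit on the same drive, so a large read still waits on one disk. With striping they alternate, so both disks can work at once. In both cases, losing one drive leaves you with a broken volume.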

RAID is a configuration of 2 or more hard drives, that uses various algorithms that are based on two concepts:

Here we discuss RAID for use with PC's and Servers -  and the technical aspects of RAID (all the various RAID Levels).  RAID has been in use with servers for many years, but recently has become popular with PC's as well, as the capacity demands have increased, and RAID prices have come down.  Some people call these "RAID Arrays" but this is a redundant use of the term, since the acronym has the word "array" in it.

The five most Common RAID Levels

Other less common RAID Levels

 

All RAID Levels

There are a number of different ways of doing this, and each method has been assigned a RAID Level, such as RAID Level 0, RAID Level 1, etc.  These levels are often called simply RAID-0, RAID-1, etc.

  • Level 0 -- Striped Disk Array without Fault Tolerance: Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails then all data in the array is lost.
  • Level 1 -- Mirroring and Duplexing: Provides disk mirroring, and therefore twice the read transaction rate of single disks and the same write transaction rate as single disks.
  • Level 0+1 -- A Mirror of Stripes: Not one of the original RAID levels. Two RAID 0 stripes are created, and a RAID 1 mirror is created over them. Used for both replicating and sharing data among disks.
  • Level 2 -- Error-Correcting Coding: Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.
  • Level 3 -- Bit-Interleaved Parity: Provides byte-level striping with a dedicated parity disk. Level 3, which cannot service simultaneous multiple requests, also is rarely used.
  • Level 4 -- Dedicated Parity Drive: A commonly used implementation of RAID, Level 4 provides block-level striping (like Level 0) with a parity disk. If a data disk fails, the parity data is used to create a replacement disk. A disadvantage to Level 4 is that the parity disk can create write bottlenecks.
  • Level 5 -- Block Interleaved Distributed Parity: Provides data striping at the block level, with the parity (error correction) information also striped across all the disks. This results in excellent performance and good fault tolerance. Level 5 is one of the most popular implementations of RAID.
  • Level 6 -- Independent Data Disks with Double Parity: Provides block-level striping with two independent sets of parity data distributed across all disks, so the array can survive the failure of any two drives.
  • Level 7: A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.
  • Level 10 -- A Stripe of Mirrors: Not one of the original RAID levels. Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over these.
  • Level 50 - a stripe across distributed parity RAID systems
  • Level 51 - a mirror striped set with distributed parity (some manufacturers label this as RAID 53)
  • Level 100 - a stripe of a stripe of mirrors
  • RAID S: EMC Corporation's proprietary striped parity RAID system used in its Symmetrix storage systems.
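The trade-offs in the list above are easier to compare with numbers. This is a minimal sketch of usable capacity for a few of the common levels. It assumes equal-size drives and two-way mirrors, and the function name `usable_capacity_gb` is made up for this example.

```python
def usable_capacity_gb(level, num_drives, drive_gb):
    """Usable capacity in GB for a few common RAID levels.

    Assumes all drives are the same size and mirrors are two-way.
    """
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum_drives:
        raise ValueError(f"level {level} is not covered by this sketch")
    if num_drives < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")

    if level == 0:
        return num_drives * drive_gb          # no redundancy, all space usable
    if level == 1:
        return drive_gb                       # every drive holds the same data
    if level == 5:
        return (num_drives - 1) * drive_gb    # one drive's worth of parity
    if level == 6:
        return (num_drives - 2) * drive_gb    # two drives' worth of parity
    # Level 10: half the drives are mirror copies.
    if num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    return (num_drives // 2) * drive_gb


# Four 500 GB drives under different levels.
for level in (0, 5, 6, 10):
    print(level, usable_capacity_gb(level, 4, 500))
# 0 2000
# 5 1500
# 6 1000
# 10 1000
```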

    The Hard Drive Bottleneck - RAID to the Rescue

    In 1987, UC Berkeley researchers David Patterson, Garth Gibson, and Randy Katz warned the world of an impending I/O crisis because server hard drives simply could not keep up with all the advancements in the speed of the other components (CPU, RAM, etc).  They realized the large mainframe hard drives were the bottleneck of the entire system (actually - to this day - hard drives are still the bottleneck). The team assessed the various advantages/disadvantages of these huge "mega-drives":

    At that time, large disk drives for mainframes were incredibly expensive ($35,000 was not unusual), but they noticed that the common PC drives had become fast and cheap, especially SCSI drives.  

    Idea 1 - Combine Multiple Drives - they came up with a key concept: lash 75 PC disk drives together, and create a controller that combines their storage into one big virtual drive.  They then assessed the advantages/disadvantages of this solution and compared them with the current mainframe drive :

    The solution seemed great - except the reliability was poor, because now there were 75 cheap assembly-line devices to worry about breaking down, instead of 1 expensive high-quality device.
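The reliability problem can be put into rough numbers. The sketch below assumes drives fail independently at a constant rate, in which case the mean time to failure (MTTF) of an unprotected array is the single-drive MTTF divided by the number of drives. The 30,000-hour figure is an illustrative assumption, not a measurement.

```python
def array_mttf_hours(drive_mttf_hours, num_drives):
    """MTTF of an array with no redundancy, where any one drive failure is fatal.

    Assumes independent drive failures at a constant rate.
    """
    if num_drives < 1:
        raise ValueError("need at least one drive")
    return drive_mttf_hours / num_drives


# Illustrative: one drive rated at 30,000 hours versus 75 of them lashed together.
print(array_mttf_hours(30_000, 1))   # 30000.0
print(array_mttf_hours(30_000, 75))  # 400.0
```

Under these assumptions the combined array would be expected to lose a drive after a few hundred hours, which is why the idea was not usable without some form of redundancy.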

    Idea 2 - Redundant Drives - to tackle this problem, they came up with the second key concept in RAID technology.  They suggested the use of extra "check" disks, containing redundant information that could be used to completely recover data in the event of a disk failure. Once a failed disk was replaced, either by a human operator or by electronic switching, data would be reconstructed onto it automatically.
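As an illustration of how a "check" disk can rebuild a lost drive, here is a small sketch using XOR parity, which is one common scheme for a single check disk. The block contents and names are invented for the example, and some RAID levels use other error-correcting codes instead.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one 4-byte block (made-up contents).
data_disks = [b"RAID", b"disk", b"demo"]

# The check disk stores the XOR of all the data blocks.
check_disk = xor_blocks(data_disks)

# Simulate losing disk 1, then rebuild it from the survivors plus the check disk.
survivors = [data_disks[0], data_disks[2], check_disk]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'disk'
```

This works because XOR-ing a value with itself cancels out, so combining the check block with every surviving block leaves only the missing one. It can recover from one lost disk per parity group, not two.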

    Their team then proposed this solution in a paper, and coined the acronym, RAID.  The paper was titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)." 

    Within a couple of years, Intel-based products like the Compaq Systempro (released in 1990) made RAID an expected ingredient in every midrange and high-end server.

    The RAID Acronym Fiasco - it is a rare occurrence when vendors change the meaning of a standardized acronym - but in this case, they did.  The word "Inexpensive" was true for the first test model of the combined 75 PC drives.  But as you can expect, vendors don't make $$$ on inexpensive components - so the RAID arrays (initially SCSI, then both SCSI & IDE were available) on the market were actually very pricey.  The Vendors temporarily decided that the "I" stood for "Independent" instead.  Now everyone wonders why there are two meanings for the acronym.

     

    RAID with PC's

    RAID arrays were expensive, and only used with mainframes and servers for many years.  But recently they have come down in price, and although still a niche market - they are coming on quickly and are available by default with new PC's - or as upgrades to existing PC's.  Businesses need the added reliability for their power user workstations, and home users need the added speed for games, video, etc.

    RAID Controller - Integrated MOBO vs Add-On Card - most new motherboards now have RAID support built-in (integrated RAID).  For existing systems, you can buy a RAID controller card (the most popular are from Promise Technology).

    Matched Drives - while it is not mandatory to have identical hard disks with RAID - it is very highly recommended !!  Actually, if you can - get identical drives - Capacity, Make, and model.

    The 3 Types of Hard Drives -  make sure that the RAID controller you use will work with the type of hard disk you have.  Most will work with all three primary drive types (IDE, SATA, SCSI), but there are some that do not.

    NOTE on SATA Drives (Serial ATA Drives) - the 40-pin parallel ATA IDE and EIDE drive interface has been the standard for years, with SCSI running far back in second place.  In 2002, Serial ATA (SATA) was introduced as the next step in ATA technology. SATA provides greater scalability, simpler installation, thinner cabling, and faster performance (up to 1.5 Gb/s). SATA maintains backward compatibility with the Parallel ATA software drivers, and is planned for a speed increase of up to 6Gbps in coming years.

    Be aware that when you install a serial drive, by default it will show up in Windows as a SCSI drive !!  I am not sure why, but it may be that Windows developers did not envision SATA drives, and therefore do not have that text in their list of all possible devices.

    *** see also www.serialata.org

     

    Tips:

     

    RAID 0 (Striping) - Don't do It !!!!!!

    Striped Disk Array without Fault Tolerance

    RAID 0 substantially increases your risk of data loss: every file is split across both disks, so if either disk fails, everything on both is lost !!  With two drives you have roughly twice the chance of a failure taking out your data.  Worst of all, almost no software exists that uses the simultaneous data access in an efficient manner - so although it does offer much faster data access - it has very little real speed impact on the applications because they continue to operate on one piece of data at a time.

    RAID 0 is an oddity  .  .  .  the "0" means that there is zero RAID (no redundancy).  So, oddly - RAID 0 is not RAID !! 

    RAID 0 simply stripes data across the drives - so if you have two 100 GB drives, for example, you have one, big, 200 GB virtual drive.  Since there is no backup of the data, RAID 0 - or RAID at Level 0 - is the only Level that does not offer data redundancy.

    Sequential transfer rate (STR), which is what striping improves, simply does not significantly impact performance of typical desktop applications.

    There are certain uncommon situations where RAID 0 can significantly improve system performance. For example, editing of large audio or video files is sometimes limited by the maximum sequential transfer rate of the hard drives, but it is far more common for the processor to be the limiting factor.

     

    So why not just use the two drives separately then ??   Well, RAID 0 designed systems use hard drive controllers that can Read and Write data faster if they alternate between drives.  The bytes can actually be written to the two drives simultaneously, but keep in mind that a single CPU machine can only work with one byte at a time.  Nevertheless, the CPU is much faster than hard drives, so it supplies the drive buffers with as much data as they can possibly work with.  So bottom line - you get faster data transfers. 

    BUT if one of the drives fails, all the data on the array is lost.
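To put a rough number on that risk, the sketch below estimates the chance of losing a RAID 0 array when any single drive failure is fatal. It assumes drives fail independently, and the 3% per-drive yearly failure rate is a hypothetical figure chosen only for illustration.

```python
def raid0_loss_probability(p_drive, num_drives):
    """Chance that at least one of the drives fails, which loses the whole array.

    Assumes each drive fails independently with probability p_drive.
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be between 0 and 1")
    return 1 - (1 - p_drive) ** num_drives


# Hypothetical 3% yearly failure rate per drive.
for drives in (1, 2, 4):
    print(drives, round(raid0_loss_probability(0.03, drives), 4))
# 1 0.03
# 2 0.0591
# 4 0.1147
```

For small failure rates the risk grows almost in proportion to the number of drives, which is why a regular backup matters more, not less, once you stripe.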

    Disadvantages

     

    RAID 1 - Mirroring

    Disadvantages