
Occ Raid Guide


Verran


The Official OCC Raid Guide, Version 2.0

 

This guide discusses the pros and cons of various RAID types and explains how these different RAID types work on a hardware level. It is NOT intended to teach people how to set up RAID on their system. Every motherboard or RAID card is different, so that would be quite unreasonable.

 

If you see anything in this guide that A) needs more explanation, B) is incorrect, C) could use rewording, please post here or PM me. I welcome any help you can provide to keep this guide as accurate as possible. This thread is not a discussion, and is not the place for you to ask specific questions about your hardware. Any posts that are not related to the guide should be instead posted in another thread, as they will be deleted here.

 

Finally, you will notice that all illustrations here are done with MS Paint. Among other reasons, this is done as an homage to LoArmistead, whose Paint skills are unmatched, though sadly missing from the forums these days.

----------------------------------------------------------------------------------------------------------

 

Terminology

 

Before I start to explain RAID technologies, there are a few terms that a reader needs to understand, as they will be used regularly throughout the guide.

 

Redundancy:

Redundancy is a general term for keeping extra copies of data, or enough extra information to recreate it. Most RAID types offer redundancy in some form in order to improve the chances of retaining data when a disk is lost. With redundancy, one or more disks in a set can be lost (fail) without any data loss.

 

Physical Disk:

This is quite simple on its own. A physical disk is a hard drive you can hold in your hand. Physical disks or drives are all the hard drives you have installed in your RAID array. This is used in comparison with Logical Disks.

 

Logical Disk:

A logical disk is a storage space that you see and have access to in Windows (or any other OS). Normally, when you install an 80GB physical drive, you see an 80GB logical drive in Windows. However, with RAID, this is not always the case. It is important to differentiate the storage space of the physical hard drives from the storage space that is ultimately available in Windows.

 

Overhead:

As mentioned above, the capacity of all the physical drives in a RAID set may not be the same as the logical storage space it offers. This is because of RAID overhead. The price of data redundancy is overhead. In some cases, you may use two 80GB hard drives, but your RAID set may only offer a total of 80GB of storage. This "lost" space is called overhead.
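To put numbers on overhead, here is a rough Python sketch that turns a list of physical disk sizes into a logical size, using the disk counts and overhead figures listed for each RAID type later in this guide. The function name `usable_capacity` is made up for illustration, and the sketch assumes the controller treats every disk as if it were the size of the smallest one (common, but not guaranteed for your hardware).

```python
from typing import Sequence


def usable_capacity(level: str, disk_sizes_gb: Sequence[int]) -> int:
    """Rough logical capacity in GB for a set of physical disks.

    Assumption: every disk is treated as the size of the smallest member
    (except JBOD, which simply adds the disks together).
    """
    n = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)
    if level == "RAID-0":
        return n * smallest          # no overhead
    if level == "RAID-1":
        return smallest              # every disk holds the same copy
    if level == "RAID-5":
        return (n - 1) * smallest    # one disk's worth of parity
    if level == "RAID-6":
        return (n - 2) * smallest    # two disks' worth of parity
    if level == "RAID-10":
        return (n // 2) * smallest   # half the disks are mirrors
    if level == "JBOD":
        return sum(disk_sizes_gb)    # plain concatenation
    raise ValueError(f"unknown level: {level}")


# The example from the text: two 80GB drives mirrored give 80GB.
physical = [80, 80]
logical = usable_capacity("RAID-1", physical)
overhead = sum(physical) - logical
print(logical, overhead)
```

The "overhead" in the text is just the difference between the sum of the physical sizes and the value this returns.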

 

Parity:

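Parity is a way to create data redundancy without making a full extra copy of the data. For every pair of bits (each a 1 or a 0), a parity bit is calculated and stored. The parity bit is 1 when exactly one of the two bits is 1, and 0 otherwise (this is XOR, or "exclusive or", logic). See the table below: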

post-13138-1199888504.png

Based on that table, it is easy to see that if you knew only two of the values in a row, you could calculate the third. This is how parity works in RAID. Treat each column in the table like a physical disk. If any one column is lost, you can still calculate its contents from the other two. By that same rationale, parity can be used to create redundancy so data on a lost drive can be "rebuilt".
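The "know two, calculate the third" idea can be tried out in a few lines of Python. This is only a sketch: it assumes the table is XOR (as the replies further down confirm), and the names `parity`, `disk_a`, `disk_b` and `disk_p` are invented for the example. It works on whole bytes as well as single bits, since XOR is applied bit by bit.

```python
def parity(a: int, b: int) -> int:
    """Parity of two bits (or two whole bytes) using XOR."""
    return a ^ b


# Print the table: two data bits followed by their parity bit.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, parity(a, b))

# Treat each column as a disk holding one byte.
disk_a = 0b10110010
disk_b = 0b01101100
disk_p = parity(disk_a, disk_b)      # the parity "disk"

# Pretend disk_a died: XOR the two survivors to get it back.
rebuilt_a = parity(disk_b, disk_p)
print(rebuilt_a == disk_a)
```

The same trick works no matter which of the three columns is the one that goes missing.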

----------------------------------------------------------------------------------------------------------

 

Types of RAID

 

RAID-0

Pros: Drastically increased data throughput speeds

Cons: Increased risk of data loss

Disks Used: 2 or more

Overhead: None

 

raid0.jpg

 

RAID-0, also called "striping", is used primarily to increase sustained data throughput. A 2-disk RAID-0 set can read and write data roughly twice as fast as a single disk of the same speed, and a 3-disk set roughly 3 times as fast, but only during large sustained data transfers. This is accomplished by splitting all read and write activity evenly between the drives: each file is divided into equal pieces that are written and read in parallel. However, while RAID-0 increases sustained transfer speeds substantially, it does not benefit short reads and writes much at all. This can make the benefits of RAID-0 misleading, because while it can drastically increase disk speeds, it doesn't always increase speeds where they make the most difference. Since RAID-0 offers no redundancy, and because a piece of every file is on each disk, the failure of any single disk effectively destroys all the data on all the disks. For this reason, RAID-0 is best used for data that is replaceable, like OS and game installs.
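As a rough illustration of striping, the Python sketch below deals fixed-size pieces of a file out to the disks in turn and then reads them back. The function names and the tiny 4-byte chunk size are made up for the example; real controllers use much larger stripe sizes and work on disk blocks rather than files.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4):
    """Deal chunks of data out to the disks round-robin, RAID-0 style."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_no = start // chunk_size
        disks[chunk_no % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks) -> bytes:
    """Read the chunks back in the order they were written."""
    chunks = []
    rows = max(len(disk) for disk in disks)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJKLMNOPQRSTUVWX"
disks = stripe(data, num_disks=3)
for number, disk in enumerate(disks):
    print(number, disk)
print(unstripe(disks) == data)
```

Every disk ends up holding pieces from all over the file, which is why losing any one of them leaves nothing usable on the others.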

 

 

RAID-1

Pros: Data redundancy

Cons: High overhead and slower write times

Disks Used: 2

Overhead: 50%

 

post-13138-1199888610.jpg

 

RAID-1, also called "mirroring", is used primarily for data redundancy. Effectively, RAID-1 makes two full copies of everything written to it. RAID-1 has 50% overhead for redundancy, meaning that two physical 80GB drives make one 80GB logical drive when mirrored. Each hard drive holds an exact copy of the other, so if either fails, the entire set of data is still on the other.
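Mirroring is simple enough to model in a few lines. The `Mirror` class below is a toy written for this guide, not how any real controller is implemented: each "disk" is just a dictionary of block number to data.

```python
class Mirror:
    """Toy two-disk RAID-1 set: every write goes to both disks."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, block_no: int, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:          # skip a failed disk
                disk[block_no] = data

    def fail(self, disk_no: int) -> None:
        self.disks[disk_no] = None        # simulate a dead drive

    def read(self, block_no: int) -> bytes:
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("both copies are gone")


mirror = Mirror()
mirror.write(0, b"important data")
mirror.fail(0)
print(mirror.read(0))
```

Note that every write has to land on both disks, which fits the "slower write times" listed under Cons.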

 

 

RAID-5

Pros: Increased data access speeds, data redundancy

Cons: Some overhead (one disk), more disks required

Disks Used: 3 or more

Overhead: 1 Disk

 

raid5-1.jpg

 

RAID-5 combines the fast data access of RAID-0 with redundancy similar to RAID-1. But instead of making a full extra copy of every file, RAID-5 uses parity to create redundancy. In a RAID-5 set with three physical drives, every time data is written, two drives each receive one block of data just like a RAID-0 set, while the third holds parity information. Because of this, any one of the three drives can be lost and all of the data is still recoverable. Unlike RAID-3 and RAID-4 (which are rarely used anymore), RAID-5 does not always write parity data to the same disk. Instead, parity data is spread across all of the drives in the array. In a three-disk set, the RAID-5 logical drive will be the size of just two disks, creating an overhead the size of a whole disk. In a four-disk set, the overhead is still just one disk, so the overhead shrinks as a percentage as the number of disks increases. Access speeds also increase as the number of disks in the set grows. However, the more disks in the set, the more likely it is that two will fail at the same time, making data recovery impossible.
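Here is a small Python sketch of the three-disk case described above: each write puts two data blocks and one parity block across the disks, and the parity block moves to a different disk each time. The rotation order used here is an assumption for illustration (real controllers differ), and the function names are invented. Blocks are assumed to be equal in size and to fill whole stripes.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-sized blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_layout(blocks, num_disks: int = 3):
    """Place data blocks on the disks with one rotating parity block per stripe."""
    per_stripe = num_disks - 1
    disks = [[] for _ in range(num_disks)]
    for stripe_no in range(len(blocks) // per_stripe):
        data = blocks[stripe_no * per_stripe:(stripe_no + 1) * per_stripe]
        parity = reduce(xor_blocks, data)
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - stripe_no) % num_disks
        remaining = iter(data)
        for disk_no in range(num_disks):
            if disk_no == parity_disk:
                disks[disk_no].append(parity)
            else:
                disks[disk_no].append(next(remaining))
    return disks


def rebuild_disk(disks, lost: int):
    """Recreate every block of the lost disk by XORing the survivors."""
    survivors = [disk for number, disk in enumerate(disks) if number != lost]
    return [reduce(xor_blocks, row) for row in zip(*survivors)]


disks = raid5_layout([b"AAAA", b"BBBB", b"CCCC", b"DDDD"])
for lost in range(3):
    print(lost, rebuild_disk(disks, lost) == disks[lost])
```

The rebuild does not care whether a missing block was data or parity: XORing the surviving blocks in a stripe always gives back the missing one.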

 

 

RAID-6

Pros: Increased data access speeds, very high redundancy (double-fault tolerant)

Cons: Substantial overhead, more disks required

Disks Used: 4 or more

Overhead: 2 Disks

 

raid6ev7.jpg

 

RAID-6 is a logical extension of RAID-5. Where RAID-5 uses one parity block per write, RAID-6 uses two. The benefit is that while RAID-5 can survive the loss of one physical disk, RAID-6 can survive the loss of two. So if one physical drive fails, and then another fails before the first can be replaced, RAID-6 will still maintain data integrity. The downside of RAID-6 compared to RAID-5 is the extra overhead: RAID-6 requires more disks to provide the same logical storage space. As with RAID-5, RAID-6 does not always write parity blocks to the same disks, but rather spreads them out across all disks in the set.

 

 

RAID-10

Pros: Increased data access speeds, high redundancy (survives any single disk failure, and some double failures)

Cons: Substantial overhead, more disks required

Disks Used: 4 or more

Overhead: 50%

 

raid10mw4.jpg

 

RAID-10 (also known as RAID 1+0) is one of many types of "nested RAIDs". Since RAID allows us to treat multiple physical disks as a single logical disk, there is nothing stopping us from using one RAID structure inside of another RAID structure. In the case of RAID-10, we take two RAID-1 sets (mirrors) and put them in a RAID-0 set (stripe). By putting a RAID inside a RAID, we gain the benefits (and drawbacks) of both configurations in one single logical drive. RAID-10 offers the read and write speeds of RAID-0 while providing the data redundancy of RAID-1. RAID-10 can survive the loss of a single disk without losing data under any circumstance. On top of that, RAID-10 can survive the loss of two disks in some situations. As long as the two failed disks are not from the same mirror set, data integrity is maintained. If both disks in either mirror set are lost, however, the data is lost. The cost of this is that RAID-10 requires more disks and has a fairly high overhead. It is also a more complicated configuration that can be tricky to keep track of.
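To see what "some situations" means for a four-disk set, the Python sketch below lists every possible pair of failed disks and checks which ones the array survives. The disk numbering (disks 0 and 1 form one mirror, disks 2 and 3 the other) is an assumption made up for the example.

```python
from itertools import combinations

# Assumed layout: two mirror sets, striped together.
MIRRORS = [(0, 1), (2, 3)]


def survives(failed) -> bool:
    """The array survives as long as no mirror set has lost both disks."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in MIRRORS)


two_disk_failures = list(combinations(range(4), 2))
survivable = [pair for pair in two_disk_failures if survives(pair)]

print(len(two_disk_failures), len(survivable))
for pair in two_disk_failures:
    print(pair, "OK" if survives(pair) else "DATA LOST")
```

With this layout, only the two pairs that wipe out a whole mirror set are fatal. Every other pair of failures, and every single failure, leaves at least one good copy of each half of the stripe.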

 

 

JBOD

Pros: Consolidated workspace for files

Cons: No data redundancy, no speed increase

Disks Used: 2 or more

Overhead: None

 

JBOD stands for "Just a Bunch Of Disks" and is not technically a type of RAID, but the ability to use it is almost always provided by a RAID controller, and for that reason it is generally lumped into the RAID category. JBOD is like RAID-0, only without the striping (or splitting) of data. JBOD offers no data redundancy and no access speed increases. Instead, JBOD simply combines two or more physical drives into one single logical drive. With JBOD, it is possible to make a 60GB drive and an 80GB drive appear as one single 140GB workspace in Windows (or any OS). When a disk in a JBOD set fails, whichever files reside on that disk are lost. All files on the other disks in the set remain unharmed.
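The 60GB plus 80GB example can be sketched in Python as a simple lookup from a position on the logical drive to the physical disk that holds it. This assumes the disks are simply joined end to end in order, and the function name `locate` is made up for the example.

```python
def locate(offset_gb: int, disk_sizes_gb):
    """Map a position on the logical drive to (disk number, position on that disk)."""
    for disk_no, size in enumerate(disk_sizes_gb):
        if offset_gb < size:
            return disk_no, offset_gb
        offset_gb -= size            # skip past this disk and keep looking
    raise ValueError("offset is beyond the end of the logical drive")


sizes = [60, 80]
print(sum(sizes))          # size of the combined workspace in GB
print(locate(10, sizes))   # lands on the first disk
print(locate(100, sizes))  # lands on the second disk
```

Because each position lives on exactly one disk, a failure only takes out whatever was stored in that disk's range.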

 


Edited by Verran


the parity table is wrong... the parity is calculated to be true (1) if an odd number of bits are true, so the parity column should be { 0, 1, 1, 0 }

 

this is the XOR, exclusive or, logic

 

also, parity is actually arranged in blocks (bytes, or possibly larger) rather than bits, iirc


CRAP! That's twice you've bested me today! You're right, that's an OR table, and it should be XOR. That's what I get for making the table by cutting and pasting 1's and 0's on an image, and not in text. I fixed it :)

 

And yes, the parity can be done at bit, byte, or block level. I think that's actually the difference between RAID 3, 4, and 5. I just did the bit level stuff for ease of explanation.

 

EDIT###

Yeah, as a little appendix for curious minds...

 

RAID-3 is byte-level parity on a dedicated parity disk, RAID-4 is block-level parity on a dedicated parity disk, and RAID-5 is block-level parity with parity spread over all disks. Chances are you will never see or use RAID-3 or RAID-4, but it's good info for comparison's sake.


Yeah, you're definitely right nrg. I wanted to leave the RAID-5 explanation as simple as possible for the sake of understanding, but I agree it's best to make sure the data's technically right if it's going to be an "official" guide. I updated all the images, and changed the RAID-5 description a bit. Let me know what you think.


  • 1 year later...
  • 8 months later...

Hi, nice guide, but I have a few questions. I know nothing about RAID, so bear with me.

 

Do I need to use the same hard drive, or one similar for a raid setup?

 

Your guide says that there is an increased risk of data loss in Raid 0. Now, my question is, is there a higher chance that one of the disks will fail in Raid 0? I suspect the answer is no. So the only reason there is an increased risk of data loss is because there's twice the chance that your disk will fail because you have two disks, or three, rather than one, right?


 

It's better to use identical drives. You're right about RAID 0: since it splits your files across x drives, you only need 1 drive failure to lose all your data.

 

RAID5 is a pretty good compromise, it's fast and as long as you only lose 1 drive, your data is fine and the array can be rebuilt.
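On the "twice the chance" part of the RAID 0 question, here is a quick back-of-the-envelope sketch in Python. It assumes each drive fails independently with the same probability over some period, and the 5% figure is invented purely for illustration, not a real failure rate.

```python
def raid0_loss_probability(p_single: float, num_disks: int) -> float:
    """Chance that at least one disk fails, which for RAID 0 means losing everything.

    Assumes every disk fails independently with probability p_single.
    """
    return 1 - (1 - p_single) ** num_disks


p = 0.05  # hypothetical chance that one drive fails in a given period
for n in (1, 2, 3):
    print(n, round(raid0_loss_probability(p, n), 4))
```

Under those assumptions the risk for two drives comes out at a little under double the single-drive risk, so "twice the chance" is a fair rule of thumb when the per-drive failure rate is small.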


  • 1 month later...

So, if I already have one hard drive, and I get another drive to put in a raid 0 set up, is that easy to do? Or is it best to start a raid setup when you first install the operating system. It would seem to me that if I tried to start a raid setup now I would have to wipe my drive and start over with the two drives in raid in order for the data to be split amongst the two. Is this correct? Or when you add a second drive and make a raid setup will it automatically transfer data over to the second drive?
