
The Concept of RAID in Databases

By
Junaid Ali Siddiqui
Some Key terms
 Concatenated array
 This is an array where multiple disk
drives or arrays are logically
connected together, end-to-end.
 Data Drive
 A data drive is a disk drive that is
dedicated to storing data, as
opposed to parity, Hamming code, or
a hot standby.
 Logical Disk
 This is what a RAID array is. Although
the RAID array is multiple disks, it
appears to the Operating System as
a single disk.
 Physical Disk
 A physical disk is an actual hardware disk
drive. The term is used to distinguish it
from a logical disk.
FIGURE 1-1 Logical Drive Including
Multiple Physical Drives
 Segment size
 This is the number of blocks (sometimes
expressed in bytes) that are written to one disk
drive, before moving on to the next disk drive in
the array.
 Stripe size
 This is similar to Segment size, except that it is
only valid for RAID-0 arrays. Many manufacturers
use this term when they mean Segment size.
 Stripe width
 This is the number of blocks that must be written
to the array, so that every data drive has had a
complete segment written.
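The relationship between segment size and stripe width can be shown with a small calculation. This is only a sketch based on the definitions above; the function name `stripe_width` and the sample figures (128-block segments, 4 data drives) are illustrative assumptions, not values from any particular product.

```python
def stripe_width(segment_blocks: int, data_drives: int) -> int:
    """Blocks that must be written so every data drive holds one full segment.

    Hypothetical helper: follows the definitions on this slide, where the
    segment size is the number of blocks written to one drive before moving on.
    """
    if segment_blocks <= 0 or data_drives <= 0:
        raise ValueError("segment size and drive count must be positive")
    return segment_blocks * data_drives


# Example figures are assumptions chosen for illustration.
print(stripe_width(segment_blocks=128, data_drives=4))
```

Parity drives and hot standbys are not counted, because only data drives receive segments.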
What is RAID?
 Redundant Array of Inexpensive Disks (RAID) is a
storage technology used to improve the
processing capability of storage systems. This
technology is designed to provide reliability in
disk array systems and to take advantage of the
performance gains offered by an array of multiple
disks over single-disk storage.
 RAID's two primary underlying concepts are:
 Distributing data over multiple hard drives
improves performance.
 Using multiple drives properly allows for any one
drive to fail without loss of data and without
system downtime.
Types or levels of RAID
 RAID 0
 RAID 1
 RAID 2
 RAID 3
 RAID 4
 RAID 5
 Compound RAID levels
RAID 0

 In RAID Level 0 (also called striping), each segment is
written to a different disk, until all drives in the array
have been written to.
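The striping rule above can be sketched as an address calculation. This is a minimal illustration, assuming segments are handed out to drives in simple round-robin order; the function name `locate_block` and its parameters are hypothetical and real controllers may lay data out differently.

```python
from typing import Tuple


def locate_block(logical_block: int, segment_blocks: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping: segment 0 goes to disk 0, segment 1 to
    disk 1, and so on, wrapping back to disk 0 after the last disk.
    """
    if logical_block < 0 or segment_blocks <= 0 or num_disks <= 0:
        raise ValueError("invalid striping parameters")
    # Which segment the block falls in, and where inside that segment.
    segment_index, offset_in_segment = divmod(logical_block, segment_blocks)
    # Which pass over the drives (stripe) and which drive within that pass.
    stripe_index, disk = divmod(segment_index, num_disks)
    return disk, stripe_index * segment_blocks + offset_in_segment


# With 4-block segments on 3 disks, block 12 starts the second stripe on disk 0.
print(locate_block(12, segment_blocks=4, num_disks=3))
```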
Using RAID 0
 Advantages
 The I/O performance of a RAID-0 array is
significantly better than a single disk. This
is true on small I/O requests, as several
can be processed simultaneously, and for
large requests, as multiple disk drives can
become involved in the operation.
 Disadvantages
 This level of RAID is the only one with no
redundancy. If one disk in the array fails,
data is lost.
RAID 1
 In RAID Level 1 (also called mirroring), each disk is an exact
duplicate of all other disks in the array. When a write is
performed, it is sent to all disks in the array. When a read is
performed, it is only sent to one disk. This is the least space
efficient of the RAID levels.
 Advantages
RAID-1 arrays with multiple mirrors are often used to improve
performance when the data on the disks is being read by
multiple programs at the same time. Because reads can be served
from several mirrors at once, data throughput increases. The most
common use of RAID-1 with multiple mirrors is to improve the
performance of databases.
The read performance of RAID-1 will be no worse than the read
performance of a single drive. If the RAID controller is intelligent
enough to send read requests to alternate disk drives, RAID-1 can
significantly improve read performance.
RAID-1 is also described as a 'mirrored set without parity' or simply
'mirroring'. It provides fault tolerance against disk errors and
against the failure of all but one of the drives: two (or more) disks
each store exactly the same data at all times, so data is not lost as
long as one disk survives.
 Disadvantages
This is the least space efficient of the RAID levels: the
total capacity of the array is simply the capacity of one disk.
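The mirroring behaviour described above (writes go to every disk, reads go to one disk, alternating between mirrors) can be modelled in a few lines. This is a toy in-memory sketch, not a real driver; the class `MirrorArray` and its methods are hypothetical names, and the round-robin read policy is an assumption about what an "intelligent" controller might do.

```python
class MirrorArray:
    """Toy RAID-1 model: each disk is a dict mapping block number to bytes."""

    def __init__(self, num_mirrors: int) -> None:
        if num_mirrors < 2:
            raise ValueError("RAID-1 needs at least two disks")
        self.disks = [dict() for _ in range(num_mirrors)]
        self.failed = set()      # indexes of disks marked as failed
        self._next_read = 0      # counter used to alternate reads

    def fail(self, disk_index: int) -> None:
        """Mark one disk as failed; it is no longer read or written."""
        self.failed.add(disk_index)

    def write(self, block: int, data: bytes) -> None:
        """A write is sent to every surviving disk in the array."""
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block: int) -> bytes:
        """A read is sent to only one disk, alternating between survivors."""
        healthy = [i for i in range(len(self.disks)) if i not in self.failed]
        if not healthy:
            raise IOError("all mirrors have failed")
        chosen = healthy[self._next_read % len(healthy)]
        self._next_read += 1
        return self.disks[chosen][block]


array = MirrorArray(num_mirrors=3)
array.write(0, b"customer row")
array.fail(0)
array.fail(1)
print(array.read(0))  # data is still readable from the last surviving mirror
```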
RAID 2
 RAID Level 2 is an intellectual curiosity, and has never been
widely used. It is more space efficient than RAID-1, but less
space efficient than other RAID levels.
 Instead of using simple parity to validate the data, it uses
a much more complex algorithm, called a Hamming code.
 Advantages
A Hamming code is larger than a parity, so it takes up more
disk space, but, with proper code design, is capable of
recovering from multiple drives being lost. Apart from RAID-1
with multiple mirrors, RAID-2 is the only simple RAID level
that can retain data when multiple drives fail.
 Disadvantages
The primary problem with this RAID level is that the amount
of CPU power required to generate the Hamming code is
much higher than is required to generate parity.
In general, all data blocks in the stripe modified by the
write must be read in and used to generate new Hamming
code data. Also, on large writes, the CPU time to generate
the Hamming code is much higher than to generate parity,
thus possibly slowing down even large writes.
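To give a feel for why a Hamming code costs more than simple parity, here is a small sketch of the classic Hamming(7,4) code. The slides do not say which code a RAID-2 array would use, so the choice of Hamming(7,4) is an assumption made for illustration. Note that this particular code stores three check bits for every four data bits, and as written it corrects only a single flipped bit per codeword.

```python
from typing import List


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Check bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    if len(data) != 4 or any(bit not in (0, 1) for bit in data):
        raise ValueError("expected four bits")
    c = [0] * 8  # index 0 unused so indexes match bit positions
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    return c[1:]


def hamming74_decode(codeword: List[int]) -> List[int]:
    """Return the 4 data bits, correcting at most one flipped bit."""
    if len(codeword) != 7:
        raise ValueError("expected seven bits")
    c = [0] + list(codeword)
    # Each syndrome bit re-checks one parity group; together they give
    # the position of the flipped bit (0 means no error detected).
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    error_position = s1 + 2 * s2 + 4 * s4
    if error_position:
        c[error_position] ^= 1
    return [c[3], c[5], c[6], c[7]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate one corrupted bit
print(hamming74_decode(word))
```

Compare this with parity, which needs one XOR per data bit and no syndrome decoding.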
RAID 3
 RAID Level 3 is defined as bytewise (or bitwise)
striping with parity. Every I/O to the array will
access all drives in the array, regardless of the
type of access (read/write) or the size of the I/O
request.
 During a write, RAID-3 stores a portion of each
block on each data disk. It also computes the
parity for the data, and writes it to the parity
drive.
 In some implementations, when the data is read
back in, the parity is also read, and compared to
a newly computed parity, to ensure that there
were no errors.
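The write and read-back check described above can be sketched as follows. This is a simplified model, assuming bytewise interleaving and XOR parity; the function names `raid3_write` and `raid3_verify` are hypothetical.

```python
from functools import reduce
from typing import List, Tuple


def xor_bytes(chunks: List[bytes]) -> bytes:
    """XOR equally sized byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def raid3_write(block: bytes, data_drives: int) -> Tuple[List[bytes], bytes]:
    """Split one block bytewise across the data drives and compute parity."""
    if data_drives <= 0 or len(block) % data_drives:
        raise ValueError("block length must be a multiple of the drive count")
    # Byte i goes to drive i % data_drives, so every drive gets a portion.
    portions = [block[i::data_drives] for i in range(data_drives)]
    return portions, xor_bytes(portions)


def raid3_verify(portions: List[bytes], parity: bytes) -> bool:
    """Recompute parity on read and compare it with the stored parity."""
    return xor_bytes(portions) == parity


portions, parity = raid3_write(b"ABCDEFGH", data_drives=4)
print(portions, raid3_verify(portions, parity))
```

Because each portion comes from the same block, every drive takes part in every read and write.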
 Advantages
RAID-3 provides a similar level of reliability to RAID-4 and
RAID-5.
RAID-3 is also described as a striped set with dedicated parity,
bit-interleaved parity, or byte-level parity. This mechanism provides
improved performance and fault tolerance similar to RAID 5, but with
a dedicated parity disk.
One minor benefit is that, with a dedicated parity disk, the
parity drive can fail and operation will continue without parity
and without a performance penalty.
 Disadvantages
RAID-3 has configuration limitations: the number of
data drives in a RAID-3 configuration must be a power of
two. The most common configurations have four or eight
data drives.
Unfortunately, it is not possible to have multiple operations
being performed on the array at the same time, due to the
fact that all drives are involved in every operation.
RAID 4
 RAID Level 4 is defined as blockwise striping with parity.
The parity is always written to the same disk drive. This can
create a great deal of contention for the parity drive during
write operations.
 Advantages
For reads, and large writes, RAID-4 performance will be similar to a RAID-0
array containing an equal number of data disks.
Error detection is achieved through dedicated parity, which is stored on a
separate, single disk.
 Disadvantages
For small writes, the performance will decrease considerably. To
understand the cause for this, a one-block write will be used as an
example.
 A write request for one block is issued by a program.
 The RAID software determines which disks contain the data, and parity,
and which block they are in.
 The disk controller reads the data block from disk.
 The disk controller reads the corresponding parity block from disk.
 The old data block just read is XORed with the parity block just read, removing its contribution from the parity.
 The new data block to be written is XORed with that result, giving the updated parity block.
 The data block and the updated parity block are both written to disk.
 It can be seen from the above example that a one-block write will result in
two blocks being read from disk and two blocks being written to disk.
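The one-block write steps above can be sketched in code. This is a minimal illustration assuming XOR parity; the function names `xor_blocks` and `small_write_parity` are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two blocks of equal length."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Compute the updated parity for a one-block write.

    Only the old data block and the old parity block are read; the other
    data blocks in the stripe are never touched.
    """
    without_old = xor_blocks(old_data, old_parity)   # remove old data's contribution
    return xor_blocks(new_data, without_old)         # add new data's contribution


# A three-drive stripe plus parity, then a rewrite of the middle block.
d0, d1, d2 = b"\x0f\x0f", b"\xf0\x01", b"\x33\x44"
parity = xor_blocks(xor_blocks(d0, d1), d2)
new_d1 = b"\xaa\x55"
new_parity = small_write_parity(d1, parity, new_d1)
print(new_parity == xor_blocks(xor_blocks(d0, new_d1), d2))
```

The result matches the parity recomputed from the whole stripe, which is why two reads and two writes are enough.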
RAID 5
 RAID Level 5 is defined as blockwise striping with parity. It
differs from RAID-4 in that the parity data is not always
written to the same disk drive. Instead, parity is distributed
across all drives in the array.
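One way to picture parity that moves from drive to drive is a simple rotation. This is only a sketch: the rotation rule below (parity starts on the last drive and moves one drive to the left on each stripe) is an assumption chosen for illustration, and actual RAID-5 implementations use several different layouts. The function names are hypothetical.

```python
from typing import List


def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk index that holds parity for a given stripe (assumed rotation)."""
    if stripe < 0 or num_disks < 3:
        raise ValueError("need a non-negative stripe and at least three disks")
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> List[str]:
    """Label each disk in a stripe as parity ('P') or a data segment ('D0', ...)."""
    parity = parity_disk(stripe, num_disks)
    labels = []
    data_index = 0
    for disk in range(num_disks):
        if disk == parity:
            labels.append("P")
        else:
            labels.append("D%d" % data_index)
            data_index += 1
    return labels


for stripe in range(4):
    print(stripe, stripe_layout(stripe, num_disks=4))
```

Over four stripes on four disks, each disk holds parity exactly once, so no single drive becomes the bottleneck for parity writes.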
 Advantages
RAID-5 has all the performance issues and benefits that RAID-4
has, except as follows:
 Since there is no dedicated parity drive, there is no single point
where contention will be created. This will speed up multiple small
writes.
 Multiple small reads are slightly faster. This is because data
resides on all drives in the array. It is possible to get all drives
involved in the read operation.
 Distributed parity requires all drives but one to be present to
operate; drive failure requires replacement, but the array is not
destroyed by a single drive failure.
 Disadvantages
The array will have data loss in the event of a second drive failure
and is vulnerable until the data that was on the failed drive is
rebuilt onto a replacement drive. A single drive failure in the set
will result in reduced performance of the entire set until the failed
drive has been replaced and rebuilt.
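Rebuilding onto a replacement drive relies on the fact that, with XOR parity, any one missing block in a stripe equals the XOR of all the surviving blocks. A minimal sketch follows, with hypothetical function names and sample data.

```python
from functools import reduce
from typing import List


def xor_all(blocks: List[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: List[bytes]) -> bytes:
    """Recover the one lost block of a stripe from the blocks that survive.

    Works whether the lost block held data or parity, but only when
    exactly one block of the stripe is missing.
    """
    if not surviving_blocks:
        raise ValueError("no surviving blocks to rebuild from")
    return xor_all(surviving_blocks)


data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stripe = data + [xor_all(data)]          # three data blocks plus parity
lost = 1                                  # pretend the second drive failed
survivors = stripe[:lost] + stripe[lost + 1:]
print(rebuild_missing(survivors) == stripe[lost])
```

Every surviving drive must be read to rebuild each stripe, which is why performance drops until the rebuild finishes, and why a second failure during that window loses data.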
Compound RAID levels
 There are times when more than one
type of RAID must be combined in
order to achieve the desired effect.
In general, this consists of
RAID-0 combined with another RAID
level (often RAID-1, RAID-3, or
RAID-5).
 The primary reason for combining
multiple RAID architectures would be
to get either a very large, or a very
fast, logical disk.
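As an illustration of combining levels, the sketch below stripes (RAID-0) across sets of mirrored disks (RAID-1). The slides do not describe a specific layout, so the disk numbering, the default of two mirrors per set, and the function name `locate_striped_mirror` are all assumptions.

```python
from typing import List, Tuple


def locate_striped_mirror(
    logical_block: int,
    segment_blocks: int,
    num_mirror_sets: int,
    mirrors_per_set: int = 2,
) -> Tuple[List[int], int]:
    """Map a logical block to (disks holding a copy, block offset on each).

    Assumed layout: segments are striped round-robin over the mirror sets,
    and mirror set k consists of consecutive disks starting at
    k * mirrors_per_set.
    """
    if logical_block < 0 or min(segment_blocks, num_mirror_sets, mirrors_per_set) <= 0:
        raise ValueError("invalid layout parameters")
    segment_index, offset_in_segment = divmod(logical_block, segment_blocks)
    stripe_index, mirror_set = divmod(segment_index, num_mirror_sets)
    first_disk = mirror_set * mirrors_per_set
    disks = list(range(first_disk, first_disk + mirrors_per_set))
    return disks, stripe_index * segment_blocks + offset_in_segment


# Four disks arranged as two mirrored pairs, 4-block segments.
print(locate_striped_mirror(9, segment_blocks=4, num_mirror_sets=2))
```

Striping spreads the load across the sets for speed and size, while each set keeps the redundancy of mirroring.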
Any questions?
Junaid_upesh@yahoo.com
Message from the presenter
We spend our days waiting for the
ideal path to appear in front of us but
what we forget is that paths are
made by walking, not waiting. So
always keep yourself on the right
path.
Thank you for your
attention
References:
http://www.accs.com/p_and_p/RAID/Recovery.html
