This section describes how to create software RAID 6 and 10 devices, using the Multiple Devices Administration (mdadm(8)) tool. You can also use mdadm to create RAID 0, 1, 4, and 5 devices. The mdadm tool provides the functionality of the legacy programs mdtools and raidtools.
RAID 6 is essentially an extension of RAID 5 that allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity). Even if two of the hard disk drives fail during the data recovery process, the system continues to be operational, with no data loss.
RAID 6 provides for extremely high data fault tolerance by sustaining multiple simultaneous drive failures. It handles the loss of any two devices without data loss. Accordingly, it requires N+2 drives to store N drives worth of data. It requires a minimum of 4 devices.
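The N+1 and N+2 rules can be sketched as a small shell function. This is an illustration only: the function name `raid_usable_devices` is hypothetical, and it simply subtracts the parity overhead from the device count while enforcing the minimums stated in this section.

```shell
# raid_usable_devices LEVEL TOTAL_DEVICES
# Prints how many devices' worth of data the array can hold.
# Hypothetical helper; it only encodes the N+1 / N+2 rule from the text.
raid_usable_devices() {
    level=$1
    total=$2
    case "$level" in
        5) parity=1; minimum=3 ;;   # single distributed parity
        6) parity=2; minimum=4 ;;   # dual distributed parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    if [ "$total" -lt "$minimum" ]; then
        echo "RAID $level needs at least $minimum devices" >&2
        return 1
    fi
    echo $((total - parity))
}
```

For example, `raid_usable_devices 6 4` prints `2`: a four-device RAID 6 stores two devices' worth of data.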
The performance for RAID 6 is slightly lower but comparable to RAID 5 in normal mode and single disk failure mode. It is very slow in dual disk failure mode.
| Feature | RAID 5 | RAID 6 |
|---|---|---|
| Number of devices | N+1, minimum of 3 | N+2, minimum of 4 |
| Parity | Distributed, single | Distributed, dual |
| Performance | Medium impact on write and rebuild | More impact on sequential write than RAID 5 |
| Fault-tolerance | Failure of one component device | Failure of two component devices |
The procedure in this section creates a RAID 6 device
/dev/md0 with four devices:
/dev/sda1, /dev/sdb1,
/dev/sdc1, and /dev/sdd1.
Ensure that you modify the procedure to use your actual device nodes.
Open a terminal console, then log in as the
root user or equivalent.
Create a RAID 6 device. At the command prompt, enter
mdadm --create /dev/md0 --run --level=raid6 --chunk=128 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
The default chunk size is 64 KB.
Create a file system on the RAID 6 device
/dev/md0, such as a Reiser file system
(reiserfs). For example, at the command prompt, enter
mkfs.reiserfs /dev/md0
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device
/dev/md0.
Edit the /etc/fstab file to add an entry for the
RAID 6 device /dev/md0.
Reboot the server.
The RAID 6 device is mounted to /local.
(Optional) Add a hot spare to service the RAID array. For example, at the command prompt enter:
mdadm /dev/md0 -a /dev/sde1
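The steps that edit /etc/mdadm.conf and /etc/fstab do not show the entries themselves. The following sketch prints one plausible set of lines for review before you paste them into the files. The function names are hypothetical, and the mount options `defaults 1 2` are an assumption rather than something this procedure prescribes. As an alternative, `mdadm --detail --scan` prints ARRAY lines for the arrays that are currently running.

```shell
# mdadm_conf_entries MD_DEVICE COMPONENT...
# Prints a DEVICE line and an ARRAY line for /etc/mdadm.conf.
# Hypothetical helper; review the output before using it.
mdadm_conf_entries() {
    md=$1
    shift
    # DEVICE lists the component devices, separated by spaces.
    echo "DEVICE $*"
    # ARRAY names the RAID device; devices= takes a comma-separated list.
    echo "ARRAY $md devices=$(echo "$*" | tr ' ' ',')"
}

# fstab_entry MD_DEVICE MOUNT_POINT FS_TYPE
# Prints one /etc/fstab line. "defaults 1 2" is an assumed choice.
fstab_entry() {
    echo "$1 $2 $3 defaults 1 2"
}
```

With the devices from this procedure, `mdadm_conf_entries /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1` and `fstab_entry /dev/md0 /local reiserfs` produce the entries for the RAID 6 device mounted at /local.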
A nested RAID device consists of a RAID array that uses another RAID array as its basic element, instead of using physical disks. The goal of this configuration is to improve the performance and fault tolerance of the RAID.
Linux supports nesting of RAID 1 (mirroring) and RAID 0 (striping) arrays. Generally, this combination is referred to as RAID 10. To distinguish the order of the nesting, this document uses the following terminology:
RAID 1+0: RAID 1 (mirror) arrays are built first, then combined to form a RAID 0 (stripe) array.
RAID 0+1: RAID 0 (stripe) arrays are built first, then combined to form a RAID 1 (mirror) array.
The following table describes the advantages and disadvantages of RAID 10 nesting as 1+0 versus 0+1. It assumes that the storage objects you use reside on different disks, each with a dedicated I/O capability.
| RAID Level | Description | Performance and Fault Tolerance |
|---|---|---|
| 10 (1+0) | RAID 0 (stripe) built with RAID 1 (mirror) arrays | RAID 1+0 provides high levels of I/O performance, data redundancy, and disk fault tolerance. Because each member device in the RAID 0 is mirrored individually, multiple disk failures can be tolerated and data remains available as long as the disks that fail are in different mirrors. You can optionally configure a spare for each underlying mirrored array, or configure a spare to serve a spare group that serves all mirrors. |
| 10 (0+1) | RAID 1 (mirror) built with RAID 0 (stripe) arrays | RAID 0+1 provides high levels of I/O performance and data redundancy, but slightly less fault tolerance than a 1+0. If multiple disks fail on one side of the mirror, the other side remains available. However, if disks are lost concurrently on both sides of the mirror, all data is lost. This solution offers less disk fault tolerance than a 1+0 solution, but if you need to perform maintenance or maintain the mirror on a different site, you can take an entire side of the mirror offline and still have a fully functional storage device. Also, if you lose the connection between the two sites, either site operates independently of the other. That is not true if you stripe the mirrored segments, because the mirrors are managed at a lower level. If a device fails, the mirror on that side fails because RAID 0 is not fault-tolerant. Create a new RAID 0 to replace the failed side, then resynchronize the mirrors. |
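The difference in fault tolerance between the two nestings can be made concrete with a small sketch. It assumes a four-disk setup in which disks 1 and 2 form the first inner array and disks 3 and 4 form the second. The function name `nested_survives` and the disk numbering are hypothetical, and each failed disk is expected to be listed only once.

```shell
# nested_survives LAYOUT FAILED_DISK...
# LAYOUT is "1+0" or "0+1"; disks are numbered 1-4.
# Returns 0 if data is still available, 1 if it is lost, 2 on bad input.
nested_survives() {
    layout=$1
    shift
    a_failed=0   # failed disks in inner array A (disks 1 and 2)
    b_failed=0   # failed disks in inner array B (disks 3 and 4)
    for disk in "$@"; do
        case "$disk" in
            1|2) a_failed=$((a_failed + 1)) ;;
            3|4) b_failed=$((b_failed + 1)) ;;
            *) echo "unknown disk: $disk" >&2; return 2 ;;
        esac
    done
    case "$layout" in
        1+0)
            # Striped mirrors: every inner mirror must keep one disk.
            [ "$a_failed" -lt 2 ] && [ "$b_failed" -lt 2 ]
            ;;
        0+1)
            # Mirrored stripes: one inner stripe must be fully intact.
            [ "$a_failed" -eq 0 ] || [ "$b_failed" -eq 0 ]
            ;;
        *) echo "unknown layout: $layout" >&2; return 2 ;;
    esac
}
```

Losing disks 1 and 3 leaves a 1+0 array available (`nested_survives 1+0 1 3` succeeds), but the same failure breaks both stripes of a 0+1 array (`nested_survives 0+1 1 3` fails).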
A nested RAID 1+0 is built by creating two or more RAID 1 (mirror) devices, then using them as component devices in a RAID 0.
If you need to manage multiple connections to the devices, you must configure multipath I/O before configuring the RAID devices. For information, see Chapter 7, Managing Multipath I/O for Devices.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID 1 (mirror) | RAID 1+0 (striped mirrors) |
|---|---|---|
| /dev/sdb1, /dev/sdc1 | /dev/md0 | /dev/md2 |
| /dev/sdd1, /dev/sde1 | /dev/md1 | /dev/md2 |
Open a terminal console, then log in as the
root user or equivalent.
Create two software RAID 1 devices, using two different devices for each RAID 1 device. At the command prompt, enter these two commands:
mdadm --create /dev/md0 --run --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --run --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1
Create the nested RAID 1+0 device. At the command prompt, enter the following command using the software RAID 1 devices you created in Step 2:
mdadm --create /dev/md2 --run --level=0 --chunk=64 --raid-devices=2 /dev/md0 /dev/md1
The default chunk size is 64 KB.
Create a file system on the RAID 1+0 device
/dev/md2, such as a Reiser file system
(reiserfs). For example, at the command prompt, enter
mkfs.reiserfs /dev/md2
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device
/dev/md2.
Edit the /etc/fstab file to add an entry for the
RAID 1+0 device /dev/md2.
Reboot the server.
The RAID 1+0 device is mounted to /local.
A nested RAID 0+1 is built by creating two to four RAID 0 (striping) devices, then mirroring them as component devices in a RAID 1.
If you need to manage multiple connections to the devices, you must configure multipath I/O before configuring the RAID devices. For information, see Chapter 7, Managing Multipath I/O for Devices.
In this configuration, spare devices cannot be specified for the underlying RAID 0 devices because RAID 0 cannot tolerate a device loss. If a device fails on one side of the mirror, you must create a replacement RAID 0 device, then add it into the mirror.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID 0 (stripe) | RAID 0+1 (mirrored stripes) |
|---|---|---|
| /dev/sdb1, /dev/sdc1 | /dev/md0 | /dev/md2 |
| /dev/sdd1, /dev/sde1 | /dev/md1 | /dev/md2 |
Open a terminal console, then log in as the root user or equivalent.
Create two software RAID 0 devices, using two different devices for each RAID 0 device. At the command prompt, enter these two commands:
mdadm --create /dev/md0 --run --level=0 --chunk=64 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --run --level=0 --chunk=64 --raid-devices=2 /dev/sdd1 /dev/sde1
The default chunk size is 64 KB.
Create the nested RAID 0+1 device. At the command prompt, enter the following command using the software RAID 0 devices you created in Step 2:
mdadm --create /dev/md2 --run --level=1 --raid-devices=2 /dev/md0 /dev/md1
Create a file system on the RAID 0+1 device
/dev/md2, such as a Reiser file system
(reiserfs). For example, at the command prompt, enter
mkfs.reiserfs /dev/md2
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device
/dev/md2.
Edit the /etc/fstab file to add an entry for the
RAID 0+1 device /dev/md2.
Reboot the server.
The RAID 0+1 device is mounted to /local.
In mdadm, the RAID10 level creates a single complex
software RAID that combines features of both RAID 0 (striping) and RAID
1 (mirroring). Multiple copies of all data blocks are arranged on
multiple drives following a striping discipline. Component devices
should be the same size.
The complex RAID 10 is similar in purpose to a nested RAID 10 (1+0), but differs in the following ways:
| Feature | Complex RAID10 | Nested RAID 10 (1+0) |
|---|---|---|
| Number of devices | Allows an even or odd number of component devices | Requires an even number of component devices |
| Component devices | Managed as a single RAID device | Managed as a nested RAID device |
| Striping | Striping occurs in the near or far layout on component devices. The far layout provides sequential read throughput that scales by number of drives, rather than number of RAID 1 pairs. | Striping occurs consecutively across component devices |
| Multiple copies of data | Two or more copies, up to the number of devices in the array | Copies on each mirrored segment |
| Hot spare devices | A single spare can service all component devices | Configure a spare for each underlying mirrored array, or configure a spare to serve a spare group that serves all mirrors. |
When configuring a complex RAID10 array, you must specify the number of replicas of each data block that are required. The default number of replicas is 2, but the value can be anything from 2 up to the number of devices in the array.
You must use at least as many component devices as the number of replicas you specify. However, the number of component devices in a RAID10 array does not need to be a multiple of the number of replicas of each data block. The effective storage size is the number of devices divided by the number of replicas.
For example, if you specify 2 replicas for an array created with 5 component devices, a copy of each block is stored on two different devices. The effective storage size for one copy of all data is 5/2 or 2.5 times the size of a component device.
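The arithmetic can be sketched as a short shell function. The function name `raid10_effective_size` is hypothetical, the component size is taken as a whole number in any unit you choose, and the result is rounded down by integer division.

```shell
# raid10_effective_size DEVICES REPLICAS COMPONENT_SIZE
# Prints devices * component_size / replicas, rounded down.
# Hypothetical helper illustrating the rule in the text.
raid10_effective_size() {
    devices=$1
    replicas=$2
    size=$3
    # The replica count ranges from 2 to the number of devices.
    if [ "$replicas" -lt 2 ] || [ "$replicas" -gt "$devices" ]; then
        echo "replicas must be between 2 and the number of devices" >&2
        return 1
    fi
    echo $((devices * size / replicas))
}
```

For five component devices of 100 GB each and two replicas, `raid10_effective_size 5 2 100` prints `250`, which is 2.5 times the size of one component device.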
With the near layout, copies of a block of data are striped near each other on different component devices. That is, multiple copies of one data block are at similar offsets in different devices. Near is the default layout for RAID10. The offsets are not always identical: for example, if you use an odd number of component devices and two copies of data, some copies are one chunk further into the device.
The near layout for the mdadm RAID10 yields read
and write performance similar to RAID 0 over half the number of
drives.
Near layout with an even number of disks and two replicas:
| sda1 | sdb1 | sdc1 | sde1 |
|---|---|---|---|
| 0 | 0 | 1 | 1 |
| 2 | 2 | 3 | 3 |
| 4 | 4 | 5 | 5 |
| 6 | 6 | 7 | 7 |
| 8 | 8 | 9 | 9 |
Near layout with an odd number of disks and two replicas:
| sda1 | sdb1 | sdc1 | sde1 | sdf1 |
|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 2 |
| 2 | 3 | 3 | 4 | 4 |
| 5 | 5 | 6 | 6 | 7 |
| 7 | 8 | 8 | 9 | 9 |
| 10 | 10 | 11 | 11 | 12 |
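The near layouts shown above can be reproduced with a short sketch. It assumes that copies of each chunk are written to consecutive positions, filling one row of devices before moving to the next row. The function name `near_layout` is hypothetical, and the output is a picture of chunk numbers, not something mdadm itself prints.

```shell
# near_layout DEVICES REPLICAS ROWS
# Prints one line per row; each number is the chunk stored on that device.
near_layout() {
    devices=$1
    replicas=$2
    rows=$3
    pos=0   # running position across all devices, row by row
    row=0
    while [ "$row" -lt "$rows" ]; do
        line=""
        col=0
        while [ "$col" -lt "$devices" ]; do
            # Every group of REPLICAS consecutive positions holds one chunk.
            line="$line$((pos / replicas)) "
            pos=$((pos + 1))
            col=$((col + 1))
        done
        echo "${line% }"   # drop the trailing space
        row=$((row + 1))
    done
}
```

Running `near_layout 5 2 2` prints `0 0 1 1 2` and then `2 3 3 4 4`. The second copy of chunk 2 lands at the start of the next row, which is why copies can sit one chunk further into a device when the device count is odd.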
The far layout stripes data over the early part of all drives, then stripes a second copy of the data over the later part of all drives, making sure that all copies of a block are on different drives. The second set of values starts halfway through the component drives.
With a far layout, the read performance of the
mdadm RAID10 is similar to a RAID 0 over the full
number of drives, but write performance is substantially slower than a
RAID 0 because there is more seeking of the drive heads. It is best
used for read-intensive operations such as for read-only file servers.
The write speed of raid10 is similar to that of other mirrored RAID types, such as raid1 and raid10 using the near layout, because the elevator of the file system schedules the writes in a more optimal way than raw writing. Using raid10 in the far layout is well suited to mirrored writing applications.
Far layout with an even number of disks and two replicas:
| sda1 | sdb1 | sdc1 | sde1 |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| ... | ... | ... | ... |
| 3 | 0 | 1 | 2 |
| 7 | 4 | 5 | 6 |
Far layout with an odd number of disks and two replicas:
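| sda1 | sdb1 | sdc1 | sde1 | sdf1 |
|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 |
| 5 | 6 | 7 | 8 | 9 |
| ... | ... | ... | ... | ... |
| 4 | 0 | 1 | 2 | 3 |
| 9 | 5 | 6 | 7 | 8 |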
The offset layout duplicates stripes so that the multiple copies of a given chunk are laid out on consecutive drives and at consecutive offsets. Effectively, each stripe is duplicated and the copies are offset by one device. This should give similar read characteristics to a far layout if a suitably large chunk size is used, but without as much seeking for writes.
Offset layout with an even number of disks and two replicas:
| sda1 | sdb1 | sdc1 | sde1 |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| 3 | 0 | 1 | 2 |
| 4 | 5 | 6 | 7 |
| 7 | 4 | 5 | 6 |
| 8 | 9 | 10 | 11 |
| 11 | 8 | 9 | 10 |
Offset layout with an odd number of disks and two replicas:
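| sda1 | sdb1 | sdc1 | sde1 | sdf1 |
|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 |
| 4 | 0 | 1 | 2 | 3 |
| 5 | 6 | 7 | 8 | 9 |
| 9 | 5 | 6 | 7 | 8 |
| 10 | 11 | 12 | 13 | 14 |
| 14 | 10 | 11 | 12 | 13 |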
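The offset layouts shown above follow a simple rule that a sketch can reproduce: each stripe is written once, then written again shifted by one device. The function name `offset_layout` is hypothetical, and the sketch covers only the two-replica case used in these examples.

```shell
# offset_layout DEVICES STRIPES
# Prints two lines per stripe: the stripe, then its copy shifted by one device.
# Covers two replicas only.
offset_layout() {
    devices=$1
    stripes=$2
    stripe=0
    while [ "$stripe" -lt "$stripes" ]; do
        first=$((stripe * devices))   # first chunk number in this stripe
        original=""
        shifted=""
        col=0
        while [ "$col" -lt "$devices" ]; do
            original="$original$((first + col)) "
            # The copy sits one device to the right, wrapping around.
            shifted="$shifted$((first + (col + devices - 1) % devices)) "
            col=$((col + 1))
        done
        echo "${original% }"
        echo "${shifted% }"
        stripe=$((stripe + 1))
    done
}
```

Running `offset_layout 4 1` prints `0 1 2 3` followed by `3 0 1 2`, matching the first two rows of the even-numbered example.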
The RAID10 option for mdadm creates a RAID 10 device
without nesting. For information about RAID10, see
Section 10.3.1, “Understanding the Complex RAID10”.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID10 (near or far striping scheme) |
|---|---|
| /dev/sdf1, /dev/sdg1, /dev/sdh1, /dev/sdi1 | /dev/md3 |
In YaST, create a 0xFD Linux RAID partition on the devices you want
to use in the RAID, such as /dev/sdf1,
/dev/sdg1, /dev/sdh1, and
/dev/sdi1.
Open a terminal console, then log in as the root user or equivalent.
Create a RAID 10 device. At the command prompt, enter (all on the same line):
mdadm --create /dev/md3 --run --level=10 --chunk=4 --raid-devices=4 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
Create a Reiser file system on the RAID 10 device
/dev/md3. At the command prompt, enter
mkfs.reiserfs /dev/md3
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device
/dev/md3. For example:
DEVICE /dev/md3
Edit the /etc/fstab file to add an entry for the
RAID 10 device /dev/md3.
Reboot the server.
The RAID10 device is mounted to /raid10.
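The create command in the procedure above does not name a layout, so it uses the default near layout with two replicas. mdadm accepts a `--layout` option for RAID 10 whose value is `n`, `f`, or `o` followed by the number of replicas. The following sketch only prints the command line so that you can review it before running it yourself. The function name `raid10_create_command` is hypothetical, and the chunk size is carried over from the procedure.

```shell
# raid10_create_command MD_DEVICE LAYOUT REPLICAS COMPONENT...
# Prints (does not run) an mdadm command for a complex RAID10.
# LAYOUT is n (near), f (far), or o (offset).
raid10_create_command() {
    md=$1
    layout=$2
    replicas=$3
    shift 3
    case "$layout" in
        n|f|o) ;;
        *) echo "layout must be n, f, or o" >&2; return 1 ;;
    esac
    # $# is now the number of component devices, $* their names.
    echo "mdadm --create $md --run --level=10 --layout=$layout$replicas --chunk=4 --raid-devices=$# $*"
}
```

For example, `raid10_create_command /dev/md3 f 2 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1` prints the command from the procedure with `--layout=f2` added.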
Launch YaST as the root user, then open the
Partitioner.
Select to view the available disks, such as sdb, sdc, sdd, and sde.
For each disk that you will use in the software RAID, create a RAID partition on the device. Each partition should be the same size. For a RAID 10 device, you need
Under , select the device, then select the tab in the right panel.
Click to open the wizard.
Under , select , then click .
For , specify the desired size of the RAID partition on this disk, then click .
Under , select , then select from the drop-down list.
Under , select , then click .
Repeat these steps until you have defined a RAID partition on the disks you want to use in the RAID 10 device.
Create a RAID 10 device:
Select , then select in the right panel to open the wizard.
Under , select .
In the list, select the desired Linux RAID partitions, then click to move them to the list.
(Optional) Click , specify the preferred order of the disks in the RAID array.
For RAID types where the order of added disks matters, you can specify the order in which the devices will be used to ensure that one half of the array resides on one disk subsystem and the other half of the array resides on a different disk subsystem. For example, if one disk subsystem fails, the system keeps running from the second disk subsystem.
Select each disk in turn and click one of the buttons, where X is the letter you want to assign to the disk. Available classes are A, B, C, D and E but for many cases fewer classes are needed (e.g. only A and B). Assign all available RAID disks this way.
You can press the Ctrl or Shift key to select multiple devices. You can also right-click a selected device and choose the appropriate class from the context menu.
Specify the order of the devices by selecting one of the sorting options:
Sorted:
Sorts all devices of class A before all devices of class B and
so on. For example: AABBCC.
Interleaved:
Sorts devices by the first device of class A, then first device
of class B, then all the following classes with assigned
devices. Then the second device of class A, the second device of
class B, and so on follows. All devices without a class are
sorted to the end of devices list. For example,
ABCABC.
Pattern File:
Select an existing file that contains multiple lines, where each
line consists of a regular expression and a class name ("sda.*
A"). All devices that match the regular expression are
assigned to the specified class for that line. The regular
expression is matched against the kernel name
(/dev/sda1), the udev path name
(/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0-part1)
and then the udev ID
(/dev/disk/by-id/ata-ST3500418AS_9VMN8X8L-part1). The first
match made determines the class if a device’s name matches
more than one regular expression.
At the bottom of the dialog box, click to confirm the order.
Click .
Under , specify the C and , then click .
For a RAID 10, the parity options are n (near), f (far), and o (offset). The number indicates how many replicas of each data block are required. Two is the default. For information, see Section 10.3.1, “Understanding the Complex RAID10”.
Add a file system and mount options to the RAID device, then click .
Select , select the newly created RAID device, then click to view its partitions.
Click .
Verify the changes to be made, then click to create the RAID.
A degraded array is one in which some devices are missing. Degraded arrays are supported only for RAID 1, RAID 4, RAID 5, and RAID 6. These RAID types are designed to withstand some missing devices as part of their fault-tolerance features. Typically, degraded arrays occur when a device fails. It is possible to create a degraded array on purpose.
| RAID Type | Allowable Number of Slots Missing |
|---|---|
| RAID 1 | All but one device |
| RAID 4 | One slot |
| RAID 5 | One slot |
| RAID 6 | One or two slots |
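The limits in the table can be expressed as a small lookup, which may be handy in a script that checks a planned configuration. The function name `max_missing_slots` is hypothetical, and it encodes only the four RAID types listed in this section.

```shell
# max_missing_slots LEVEL TOTAL_DEVICES
# Prints the largest number of slots that may be missing in a degraded array.
max_missing_slots() {
    level=$1
    devices=$2
    case "$level" in
        1)   echo $((devices - 1)) ;;   # all but one device
        4|5) echo 1 ;;                  # one slot
        6)   echo 2 ;;                  # one or two slots
        *)   echo "RAID $level is not listed as supporting degraded arrays" >&2
             return 1 ;;
    esac
}
```

For example, `max_missing_slots 1 2` prints `1`: a two-device mirror can run with one slot missing.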
To create a degraded array in which some devices are missing, simply
give the word missing in place of a device name. This
causes mdadm to leave the corresponding slot in the
array empty.
When creating a RAID 5 array, mdadm automatically
creates a degraded array with an extra spare drive. This is because
building the spare into a degraded array is generally faster than
resynchronizing the parity on a non-degraded, but not clean, array. You
can override this feature with the --force option.
Creating a degraded array might be useful if you want to create a RAID, but one of the devices you want to use already has data on it. In that case, you create a degraded array with other devices, copy data from the in-use device to the RAID that is running in degraded mode, add the device into the RAID, then wait while the RAID is rebuilt so that the data is now across all devices. An example of this process is given in the following procedure:
To create a degraded RAID 1 device /dev/md0 using
a single drive /dev/sda1, enter the following at
the command prompt:
mdadm --create /dev/md0 -l 1 -n 2 /dev/sda1 missing
The device should be the same size or larger than the device you plan to add to it.
If the device you want to add to the mirror contains data that you want to move to the RAID array, copy it now to the RAID array while it is running in degraded mode.
Add a device to the mirror. For example, to add
/dev/sdb1 to the RAID, enter the following at the
command prompt:
mdadm /dev/md0 -a /dev/sdb1
You can add only one device at a time. You must wait for the kernel to build the mirror and bring it fully online before you add another device.
Monitor the build progress by entering the following at the command prompt:
cat /proc/mdstat
To watch the rebuild progress with the display refreshed every second, enter
watch -n 1 cat /proc/mdstat
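For scripting, the progress figure can be pulled out of /proc/mdstat instead of being watched by eye. The sketch below assumes the kernel reports progress in a line containing `recovery = NN.N%` or `resync = NN.N%`, which is the usual form but is not described in this section. The function name `rebuild_progress` is hypothetical.

```shell
# rebuild_progress [FILE]
# Prints the percentage from any recovery or resync line.
# Reads /proc/mdstat unless another file is given.
# Prints nothing when no rebuild is in progress.
rebuild_progress() {
    sed -nE 's/.*(recovery|resync) *= *([0-9.]+%).*/\2/p' "${1:-/proc/mdstat}"
}
```

Called without arguments, `rebuild_progress` reads the live status. Passing a file name makes it easy to try the function against saved output.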