Using RAID in Linux
Setting It Up

Alexander Prohorenko
Thursday, August 1, 2002 01:52:49 PM
The easiest way to create a RAID array is to do it from the graphical
installer while installing a new Linux distribution. In Red Hat, the
utility named Disk Druid suits our needs. You can create RAID partitions
as easily as ordinary partitions, then combine them into one array and
set its level. That's all!
However, Disk Druid is sometimes too "clever" and suggests a placement of
partitions on the disks that goes against what any system administrator
would want.
If this happens to you, you can easily create the partitions yourself with
the fdisk command (don't forget to set the type of each RAID partition to
0xfd, "Linux raid autodetect"). Later, you can combine them
into larger arrays with Disk Druid.
If you don't want to reinstall a distro, this may be the best way to
start working with RAID anyway. As with everything else in Linux,
though, you get the most out of RAID by editing its configuration
file.
So, with your favorite text editor, create the file /etc/raidtab and
type something like this for RAID linear mode:
raiddev /dev/md0 # raid device name
raid-level linear # linear mode
nr-raid-disks 2 # number of used disks
chunk-size 32 # has no effect in linear mode
persistent-superblock 1
# the partitions and their positions in the array
device /dev/sdb6 # partition name
raid-disk 0 # disk number in array
device /dev/sdc5 # ... and so on
raid-disk 1
# ...
To create this array, we just need to run:
mkraid /dev/md0
After that, we can check /proc/mdstat to make sure the array is
working. The device is started with the following command:
raidstart /dev/md0
and stopped with this command:
raidstop /dev/md0
Easy, isn't it?
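The /proc/mdstat check can also be scripted. Below is a minimal Python sketch that pulls the array name, state, level, and member partitions out of mdstat-style text. The function names and the sample text are our own illustration, not part of raidtools, and the exact layout of /proc/mdstat differs between kernel versions, so treat the parsing rules as assumptions.

```python
import re

# Illustrative sample only; real /proc/mdstat output differs between kernels.
SAMPLE_MDSTAT = """\
Personalities : [linear] [raid0] [raid1] [raid5]
md0 : active linear sdc5[1] sdb6[0]
      2104320 blocks 32k rounding
unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array name: {"state", "level", "members"}} for each mdN line."""
    arrays = {}
    for line in text.splitlines():
        match = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if not match:
            continue
        name, rest = match.groups()
        tokens = rest.split()
        state = tokens[0] if tokens else "unknown"
        # Assumption: an active array names its level right after the state.
        level = tokens[1] if state == "active" and len(tokens) > 1 else None
        # Member partitions look like "sdb6[0]"; keep only the device name.
        members = [t.split("[")[0] for t in tokens if re.match(r"^\w+\[\d+\]", t)]
        arrays[name] = {"state": state, "level": level, "members": sorted(members)}
    return arrays


def read_mdstat(path="/proc/mdstat"):
    """Parse the live file; only works on a Linux system with md support."""
    with open(path) as handle:
        return parse_mdstat(handle.read())
```

Calling parse_mdstat(SAMPLE_MDSTAT) reports md0 as an active linear array built from sdb6 and sdc5; read_mdstat() does the same for the running system.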
Once the array is created, the device /dev/md0 can hold a file system
as usual, just like any other disk in the system. After a reboot, the
array is started automatically, with no raidstart (or raidstop) needed.
You don't need to fix the initialization scripts; you don't need to
touch anything at all!
For RAID 0, the file /etc/raidtab can look like this:
raiddev /dev/md0 # as above
raid-level 0
nr-raid-disks 2
persistent-superblock 1
chunk-size 4 # here the size matters; see the comments below
# the rest is the same as in the example above
device /dev/sdb6
raid-disk 0
device /dev/sdc5
raid-disk 1
The chunk-size argument in this case is the stripe size in
kilobytes. For best performance (at least in this configuration), the
partitions should be roughly the same size. The default
value is 4KB; however, a higher value, about 32KB, will give better
performance. Ideally it should be close to the size of a disk cylinder.
With the caches in modern hard drives the optimum can vary, and it
sometimes ends up closer to the cache size. Creating, starting, and
stopping the array works exactly as described above.
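To make the role of chunk-size concrete, here is a small Python sketch of how striping spreads consecutive chunks across the disks in turn. It is a simplified model that assumes all member partitions are the same size (the real md driver also copes with unequal sizes), and the function name is our own.

```python
def locate_chunk(offset_bytes, chunk_kb, nr_disks):
    """Map a byte offset in the array to (disk number, byte offset on that disk).

    Simplified RAID 0 model: chunk 0 goes to disk 0, chunk 1 to disk 1,
    and so on, wrapping around after the last disk.
    """
    chunk_bytes = chunk_kb * 1024
    chunk_index = offset_bytes // chunk_bytes      # which chunk of the array
    disk = chunk_index % nr_disks                  # disks are used in rotation
    stripe = chunk_index // nr_disks               # full rows written before this one
    offset_on_disk = stripe * chunk_bytes + offset_bytes % chunk_bytes
    return disk, offset_on_disk


if __name__ == "__main__":
    # Walk the first few chunks of a two-disk array with chunk-size 4.
    for offset in range(0, 5 * 4096, 4096):
        print(offset, locate_chunk(offset, chunk_kb=4, nr_disks=2))
```

With a small chunk-size even a modest read touches both disks; with a larger one, small requests stay on a single disk and only large transfers are split.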
/etc/raidtab for RAID 1 will read:
raiddev /dev/md
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
chunk-size 4 # doesn't matter
persistent-superblock 1
# as usual
device /dev/sdb6
raid-disk 0
device /dev/sdc5
raid-disk 1
# the "hot spare" disks
device /dev/sdd5
spare-disk 0
If a "hot spare" disk is configured and one of the mirror disks
fails, the data is reconstructed in the background from the surviving
disk in the array onto the spare. After that, the spare takes the place
of the broken disk.
Finally, for RAID 5, the /etc/raidtab file will read:
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 1
persistent-superblock 1
parity-algorithm left-symmetric # it should be this way
chunk-size 128 # a good starting value
#
device /dev/sda3
raid-disk 0
device /dev/sdb1
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sdd5 # spare disk
spare-disk 0
The spare disk works the same way as in RAID 1. Array performance
depends on chunk-size, so here you should use a larger value than for
RAID 0; 128-256KB usually gives good results.
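For the curious, the parity-algorithm setting decides which disk holds the parity chunk in each stripe. The Python sketch below prints the layout as we understand the left-symmetric scheme: parity starts on the last disk and moves one disk to the left with every stripe, and the data chunks begin on the disk right after the parity and wrap around. The function name and the D0, D1, P labels are our own illustration.

```python
def left_symmetric_stripe(stripe, nr_disks):
    """Return one stripe as a list of labels, one entry per disk.

    "P" marks the parity chunk; "D<n>" marks the n-th data chunk of the array.
    """
    data_disks = nr_disks - 1
    # Parity sits on the last disk for stripe 0 and shifts left each stripe.
    parity_disk = data_disks - (stripe % nr_disks)
    row = [None] * nr_disks
    row[parity_disk] = "P"
    for d in range(data_disks):
        # Data chunks start just after the parity disk and wrap around.
        disk = (parity_disk + 1 + d) % nr_disks
        row[disk] = "D%d" % (stripe * data_disks + d)
    return row


if __name__ == "__main__":
    # Three active disks, as in the raidtab above.
    for stripe in range(3):
        print(stripe, left_symmetric_stripe(stripe, nr_disks=3))
```

Because the parity chunk rotates, no single disk becomes the bottleneck for parity updates, which is the weakness of RAID 4.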
It is important to remember that the mke2fs command, used to format the
file system, has a special option, stride, which affects how data is
laid out across the disks. Usually, the best value for this option is
chunk size / block size; i.e., with chunk-size = 128 and block size =
4096 bytes, stride = 32.
You specify it this way:
mke2fs -b 4096 -R stride=32 /dev/md0
Use this option only for RAID levels 0, 4, or 5. For linear mode and
RAID 1, it doesn't make any sense.
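The stride arithmetic is easy to get wrong by hand, so here is a short Python sketch of it. It assumes that stride is the chunk size divided by the file system block size, with chunk-size given in kilobytes as in /etc/raidtab. The function names are our own, and the sketch only builds the command line; it does not run mke2fs.

```python
def ext2_stride(chunk_kb, block_bytes=4096):
    """Return the number of file system blocks in one RAID chunk."""
    chunk_bytes = chunk_kb * 1024
    if chunk_bytes % block_bytes != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_bytes


def mke2fs_command(device, chunk_kb, block_bytes=4096):
    """Build (but do not execute) the mke2fs argument list for a striped array."""
    stride = ext2_stride(chunk_kb, block_bytes)
    return ["mke2fs", "-b", str(block_bytes), "-R", "stride=%d" % stride, device]


if __name__ == "__main__":
    # chunk-size 128 with 4096-byte blocks, as in the RAID 5 raidtab above.
    print(" ".join(mke2fs_command("/dev/md0", chunk_kb=128)))
```

If you change chunk-size in /etc/raidtab, recompute stride before formatting; the two values only make sense together.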