Rescuing Linux Systems--Generic and Distribution-Specific Safety Nets
Common Rescue/Recovery Scenarios
Bill von Hagen
Monday, July 8, 2002 11:31:59 AM

The number of ways in which a computer system can break is essentially
infinite. Luckily, the number of common "dead system" scenarios that
you can actually recover from is relatively small, and those scenarios
fall into a few general classes. The following are my favorites, along
with some tips and tricks for recovering from each:

Rescue disks were made for the situation where your system won't
boot because the root filesystem is corrupted, so you can't even
boot to the point where you can run the fsck utility on the system
itself. In this case, it's fairly easy to boot from a rescue disk and
then use the version of fsck that it provides to repair the corrupted
filesystem. You can use standard fsck tricks, such as specifying an
alternate superblock if the filesystem's primary superblock is
damaged. If you actually lost files, you can either copy them from
another system onto removable media and then restore those copies to
your original system, or reinstall them from your original
installation disks if you have access to them.
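
To make the alternate-superblock trick a little more concrete, here is
a minimal shell sketch that computes where the first few backup
superblocks usually live on an ext2 filesystem. It assumes the mke2fs
defaults (8 x block size blocks per group) and a filesystem large
enough to have at least ten block groups; the function name
backup_superblocks and the device /dev/hda1 are illustrative only. On a
real system, "mke2fs -n" (which only prints what it would do) or
dumpe2fs reports the actual locations.

```shell
#!/bin/sh
# backup_superblocks BLOCK_SIZE: print likely backup superblock locations
# for an ext2 filesystem created with default settings (an assumption).
backup_superblocks() {
    block_size=$1
    # One bitmap block tracks 8 * block_size blocks, the default group size.
    blocks_per_group=$((block_size * 8))
    # With 1 KB blocks, block 0 holds the boot block, so data starts at block 1.
    if [ "$block_size" -eq 1024 ]; then
        first_data_block=1
    else
        first_data_block=0
    fi
    # Groups 1, 3, 5, 7, and 9 hold backups with or without sparse_super.
    for group in 1 3 5 7 9; do
        echo $((group * blocks_per_group + first_data_block))
    done
}

# Example (hypothetical device; run only on an unmounted filesystem):
#   backup_superblocks 1024        # first line printed is 8193
#   e2fsck -b 8193 /dev/hda1
```

If the first backup is damaged too, the next number in the list is
worth trying before giving up on the filesystem.
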
If filesystem misconfiguration is the problem, you can boot from the
rescue disk and then repair the filesystem configuration file
(/etc/fstab), or use utilities such as tune2fs and debugfs to correct
misconfigured settings in the filesystem header (the superblock).
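
Before hand-editing /etc/fstab from a rescue disk, it can help to spot
the obviously malformed entries first. The following is only a sketch:
the function name check_fstab is made up for this example, and it
checks nothing more than the field count of each entry. It assumes the
damaged system's root filesystem is already mounted somewhere such as
/mnt.

```shell
#!/bin/sh
# check_fstab FILE: report fstab entries with an implausible field count.
# An entry needs a device, mount point, type, and options; the dump and
# fsck-pass fields are optional, so 4 to 6 fields are accepted.
check_fstab() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4 to 6 fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Examples (mount point and device are hypothetical):
#   check_fstab /mnt/etc/fstab
#   tune2fs -l /dev/hda1       # list the current superblock settings
```

A clean exit status only means that every entry has a sensible shape;
device names and mount options still need a human eye.
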
If you are having boot loader problems, you can boot from a rescue
disk, correct the boot loader configuration files if necessary, and
then reinstall some or all of the boot loader, for example by
rerunning lilo if LILO is your boot loader.
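
For LILO, one common cause of boot failures is a configuration file
that names a kernel image that is no longer there. The sketch below
checks for that; check_lilo_images is a hypothetical helper written for
this example, and it assumes the installed system's root filesystem is
mounted under the directory you pass in (with /boot mounted beneath it
if /boot is a separate partition).

```shell
#!/bin/sh
# check_lilo_images ROOT: list kernel images named in ROOT/etc/lilo.conf
# that do not exist under ROOT.
check_lilo_images() {
    root=$1
    missing=0
    if [ ! -r "$root/etc/lilo.conf" ]; then
        echo "no readable lilo.conf under $root"
        return 2
    fi
    # Pull the path out of every "image=/path/to/kernel" line.
    for image in $(awk -F= '/^[ \t]*image[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2 }' "$root/etc/lilo.conf"); do
        if [ ! -f "$root$image" ]; then
            echo "missing kernel image: $image"
            missing=1
        fi
    done
    return "$missing"
}

# Once the configuration is right, LILO can be rerun from the rescue
# environment; its -r option makes it chroot to the given directory first.
# (/mnt is an assumed mount point.)
#   check_lilo_images /mnt && lilo -r /mnt
```
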
If you lost the kernel on your system, you can boot from a rescue
disk, mount the root partition of the actual Linux system, and then
rebuild the kernel. This generally involves first using the chroot
command to change the system's notion of the root filesystem so that
you can then rebuild or simply reinstall the correct kernel in the
correct place. If you're lucky, you can then reboot from your
original system, and voila!
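
The mount-then-chroot sequence described above might look something
like the following sketch. The names run and enter_installed_system,
the DRY_RUN variable, the device /dev/hda1, and the mount point /mnt
are all assumptions made for illustration. By default the script only
prints the commands, so you can review them before letting anything
touch the disk.

```shell
#!/bin/sh
# Commands are only printed unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# enter_installed_system ROOT_DEVICE [MOUNT_POINT]
enter_installed_system() {
    root_dev=$1
    mnt=${2:-/mnt}
    run mount "$root_dev" "$mnt"          # the real root filesystem
    run mount -t proc proc "$mnt/proc"    # many tools expect /proc in the chroot
    run chroot "$mnt" /bin/sh             # "/" now means the installed system
}

# Example (hypothetical device):
#   enter_installed_system /dev/hda1          # print the plan
#   DRY_RUN=0 enter_installed_system /dev/hda1
```

Inside the chroot shell, paths such as /boot and /lib/modules refer to
the installed system rather than the rescue disk, so a kernel that you
rebuild or copy into place lands where the boot loader expects it.
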
In the worst case, you may find that your filesystems are so damaged
that it is easier to reinstall your system in its entirety. In this
case, you can boot from the rescue disk and then use backup utilities
to save whatever files are still readable to supported removable
media or over the network, if the rescue disk that you're using
supports that, before you reinstall.
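
As one possible way to do that last-ditch backup, here is a small
sketch built around tar. It assumes that the rescue disk's tar
supports the -z and -C options, and that an ssh client is available
for the network variant, which is not true of every rescue disk. The
function name backup_tree, the paths, and the host name are
hypothetical.

```shell
#!/bin/sh
# backup_tree ROOT ARCHIVE DIR...: archive the named directories, given
# relative to ROOT, into a gzip-compressed tar file.
backup_tree() {
    root=$1
    archive=$2
    shift 2
    tar -czf "$archive" -C "$root" "$@"
}

# Examples (mount points, directories, and host name are hypothetical):
#   backup_tree /mnt /mnt/zip/rescue.tar.gz etc home
#   tar -czf - -C /mnt etc home | ssh user@backuphost 'cat > rescue.tar.gz'
```

Listing the archive afterward with "tar -tzf" is a cheap way to confirm
that the files you care about actually made it in before you wipe the
disk.
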
The next few sections discuss a variety of Linux rescue mechanisms,
the rescue mechanisms that are provided with many common Linux
distributions, and a number of distribution-independent rescue disks
that are designed to provide the tools needed to get almost any Linux
system up and running, regardless of the vendor who provided the
distribution on which it is based.