An In-Depth Look at Reiserfs - page 2
Included in the Linux kernel
Here's an example of how this happens: Suppose you have a word processing document, which you open in the application and to which you add some content. If the machine crashes before you save the file, you have lost all your changes but your original file will still be okay. If the machine crashes after you save the file, then you really haven't lost anything except the time it takes to reboot and reload the program. But what happens if the machine crashes during the exact moment when the disk is being written?
The answer is, "Things get very ugly." Since the new version of the file is physically overwriting all or part of the old version, the file on disk can contain some of each at the moment the drive stops writing. You end up with a file that you can't open because the internal format of its data is inconsistent with what the application expects.
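To make the "some of each" case concrete, here is a minimal Python sketch that simulates an in-place overwrite that stops partway through. The function name torn_overwrite and the sample document strings are invented for illustration; a real drive works in sectors rather than single bytes, so treat this as a simplified model and not a description of any particular hardware.

```python
def torn_overwrite(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Return what is left on disk if an in-place overwrite of `old`
    with `new` stops after `bytes_written` bytes (hypothetical model)."""
    written = new[:bytes_written]
    # Everything the drive did not reach still holds the old contents.
    return written + old[len(written):]


old = b"<doc>first draft of the report</doc>"
new = b"<doc>second, longer draft of the report</doc>"

# Simulate a crash after only 12 bytes of the new version reached the disk.
on_disk = torn_overwrite(old, new, 12)
print(on_disk)
```

The result starts with the beginning of the new version and ends with the tail of the old one, so it matches neither file that the application ever wrote.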
This gets even worse if the drive was writing the metadata areas, such as the directory itself. Now instead of one corrupted file, you have one corrupted filesystem -- in other words, you can lose an entire directory or all the data on an entire disk partition. On a large system this can mean hundreds of thousands of files. Even if you have a backup, restoring such a large amount of data can take a long time.
Most PC operating systems have no good way to prevent the loss of a single file that was being written during a system failure. Modern systems such as Linux, OS/2, and Windows NT do, however, make an attempt to prevent and recover from the horrible metadata corruption case. To accomplish this, the system performs an extensive filesystem analysis during bootup. Well-designed filesystems often incorporate redundant copies of critical metadata, so that it is extremely unlikely for that data to be completely lost. The system figures out where the corrupt metadata is, and then either repairs the damage by copying from the redundant version or simply deletes the file or files whose metadata is affected. Losing files this way is bad, but it is much better than losing the whole partition.
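The repair-or-delete decision described above can be sketched in a few lines of Python. This is only an illustration: the MetaCopy class, the make_copy and check_and_repair helpers, and the use of a CRC-32 checksum to detect damage are all assumptions made for the example. Real filesystem checkers rely on their own structural consistency tests, which are far more involved.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MetaCopy:
    """One stored copy of a metadata record (hypothetical layout)."""
    payload: bytes
    checksum: int

    def is_valid(self) -> bool:
        # Assumption: a checksum stands in for the real consistency tests.
        return zlib.crc32(self.payload) == self.checksum


def make_copy(payload: bytes) -> MetaCopy:
    return MetaCopy(payload, zlib.crc32(payload))


def check_and_repair(primary: MetaCopy,
                     backup: MetaCopy) -> Tuple[Optional[MetaCopy], str]:
    """Return the metadata to keep and what was done to obtain it."""
    if primary.is_valid():
        return primary, "ok"
    if backup.is_valid():
        # Repair the damage by copying from the redundant version.
        return MetaCopy(backup.payload, backup.checksum), "repaired"
    # Neither copy is usable: the affected file has to be given up.
    return None, "dropped"


good = make_copy(b"inode 7: size=4096")
damaged = MetaCopy(b"inode 7: size=409\x00", good.checksum)

print(check_and_repair(damaged, good)[1])
print(check_and_repair(damaged, damaged)[1])
```

With one intact copy the record is restored; with both copies damaged the only remaining option is to drop the file, which is still far cheaper than losing the partition.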
Unfortunately, such an extensive diagnostic analysis requires a great deal of time. Even on a very fast PC, a large and heavily-used partition can require several minutes to check. Most of the time, however, the check is not really needed because the system was shut down normally, without a sudden crash. To prevent unnecessary delays, the operating system's normal shutdown process puts a status flag on the filesystem as it is unmounted, marking it as a "clean" filesystem. If a crash happens, the system never gets the chance to mark the filesystem as "clean", so the bootup process knows that it needs to run the extensive filesystem tests just to be safe. A filesystem that has not been shut down cleanly is called, appropriately enough, a "dirty" filesystem.
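The clean/dirty bookkeeping can be modeled with a small Python sketch. The Volume class and its method names are hypothetical, and the sketch assumes that the flag is cleared when the filesystem is mounted, so that a crash at any later point leaves it marked dirty. Actual on-disk formats store this state in their own ways.

```python
class Volume:
    """Toy model of a filesystem with a clean/dirty status flag."""

    def __init__(self) -> None:
        self.clean = True      # stored on disk, so it survives a crash
        self.mounted = False   # in-memory state, lost in a crash
        self.full_checks = 0   # how many slow checks were needed

    def full_check(self) -> None:
        # Stand-in for the extensive, time-consuming analysis.
        self.full_checks += 1

    def mount(self) -> None:
        if not self.clean:
            self.full_check()  # only a dirty filesystem pays this cost
        self.clean = False     # assumption: in use means not yet clean
        self.mounted = True

    def unmount(self) -> None:
        self.mounted = False
        self.clean = True      # normal shutdown marks the filesystem clean

    def crash(self) -> None:
        self.mounted = False   # the on-disk flag is never updated


vol = Volume()
vol.mount()
vol.unmount()   # normal shutdown
vol.mount()     # boots quickly, no check needed
vol.crash()     # sudden failure, flag stays dirty
vol.mount()     # this boot runs the full check
print(vol.full_checks)
```

Only the boot that follows the crash triggers the slow check; every boot after a normal shutdown skips it.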