ext3 or ReiserFS? Hans Reiser Says Red Hat's Move Is Understandable
Red Hat's Decision is Conservative, Not Radical

Dennis E. Powell
Friday, August 24, 2001 02:28:44 AM
Red Hat's decision to employ ext3 as the default filesystem
in its upcoming release has sparked considerable interest among
technically savvy Linux users. But ext3 is not the only journaling
filesystem available to users of modern Linux kernels, nor in many
ways the best. Yet it has attributes that make it an attractive
first step for a large distribution, chief among them backward
compatibility.
A brief and incomplete explanation for those who have not
followed journaling filesystem development:
The traditional Linux filesystem, ext2, is ideally suited for
fairly small files on fairly small drives. As drives and the files
on them have grown, performance has suffered. Some of the loss
comes in gaining access to data on the drive, as wasted space --
"slack" -- and fragmentation have grown. Some comes in the
filesystem's recovery time after a power failure or other improper
shutdown: enduring a filesystem check by e2fsck on a one-gigabyte
drive is easy, but the same check on a 40-gigabyte drive can be
very time-consuming. And some comes from the bitmap method of
keeping track of the filesystem, which is satisfactory for small
drives with few files but inefficient with the large drives
commonly employed today. Hence, journaling filesystems.
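For readers who want the "slack" idea made concrete, here is a small
sketch in Python. It assumes only that a file is stored in whole,
fixed-size blocks, so the unused tail of its last block is wasted. The
function names, block sizes, and file sizes are illustrative
assumptions, not measurements from any real drive or filesystem.

```python
def slack_bytes(file_size, block_size):
    """Bytes allocated but unused when a file occupies whole blocks."""
    # Ceiling division: number of blocks needed to hold the file.
    blocks = -(-file_size // block_size)
    return blocks * block_size - file_size


def total_slack(file_sizes, block_size):
    """Total wasted bytes for a collection of files."""
    return sum(slack_bytes(size, block_size) for size in file_sizes)


# Hypothetical file sizes in bytes, chosen only for illustration.
example_files = [100, 5000, 4096]
```

With these made-up sizes, total_slack(example_files, 1024) comes to
1044 bytes and total_slack(example_files, 4096) to 7188 bytes: the
larger the block, the more space a collection of small files wastes.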
These record changes to the filesystem in a file, called a
journal, so that restarting after an improper shutdown requires
only a reference to that lone file to restore the filesystem's
state, instead of a scan of the entire drive. Additionally,
depending on their design, journaling filesystems make more
efficient use of drive space and make data reads and writes
faster over a wide variety of file sizes. To top it off,
journaling filesystems offer what amounts to dynamic space
allocation, meaning that the system administrator needn't guess
at appropriate partition sizes at the time of installation, and
they offer the potential of spanning drives in a single logical
volume. A journaling filesystem is something that becomes
essential as programs and their data files (and the drives that
hold them) grow huge.
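The recovery idea described above, consulting the journal rather than
scanning the whole drive, can be sketched in a few lines of Python.
This is a toy model under stated assumptions, not how ext3, ReiserFS,
JFS, or XFS actually lay out data on disk: the class name, the record
format, and the notion of a "commit" record are all hypothetical,
chosen only to show why recovery time depends on the size of the
journal and not the size of the drive.

```python
class ToyJournaledStore:
    """Toy model: a dict stands in for the drive, a list for the journal."""

    def __init__(self):
        self.blocks = {}    # hypothetical on-disk state
        self.journal = []   # hypothetical journal file

    def log_write(self, txn_id, block_no, data):
        # Record the intended change in the journal before touching blocks.
        self.journal.append(("write", txn_id, block_no, data))

    def commit(self, txn_id):
        # A commit record marks the transaction as complete.
        self.journal.append(("commit", txn_id))

    def replay(self):
        # Recovery reads only the journal, never the whole "drive".
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block_no, data = rec
                self.blocks[block_no] = data


def simulate_crash_and_recover():
    store = ToyJournaledStore()
    store.log_write(1, 10, "finished update")
    store.commit(1)
    store.log_write(2, 11, "half-finished update")
    # Power fails here: transaction 2 never commits.
    store.replay()
    return store.blocks
```

In this sketch, replaying the journal applies the committed
transaction and discards the half-finished one, so the store comes
back in a consistent state after examining two transactions' worth of
records, however large the simulated drive might be.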
Linux does not have a journaling filesystem. It has
four. Well, three and a half:
- The Reiser filesystem, named after Hans Reiser, is probably the
best known of this quick new class of systems for keeping track of
the contents of hard drives. It has been working and in relatively
wide use for more than a year, and is the filesystem recommended
by the installation program of SuSE 7.1 and 7.2.
- JFS, developed by IBM and made available to the Linux world,
is designed with high throughput in mind. After a series of betas
that began in February 2000, its 1.0 release became available at
the end of June.
- XFS is the Silicon Graphics, Inc., journaling filesystem,
also made available for Linux. It, too, offers all the features
of full-blown journaling filesystems.
- ext3 is the "half" of a journaling filesystem
mentioned above. Why half? It is a layer atop the traditional
ext2 filesystem that does keep a journal file of disk activity so
that recovery from an improper shutdown is much quicker than that
of ext2 alone. But, because it is tied to ext2, it suffers some
of the limitations of the older system and therefore does not
exploit all the potential of the pure journaling
filesystems. This is not entirely bad, though, because it means
that ext3 partitions do not have a file structure different from
ext2, so backing out to the old system (by choice or should the
journal file become corrupted) is extremely
simple.
Red Hat's adoption of ext3 is a first, tentative step toward
a journaling filesystem. When the company's plans became known
with the second beta of its upcoming release,
Michael K. Johnson, chief of the company's kernel hackers, was
quick to provide a rationale.
"Why do you want to migrate from ext2 to ext3? Four main
reasons: availability, data integrity, speed, and easy
transition," he wrote. Availability, he pointed out,
involves quick recovery from a system interruption rather than
enduring e2fsck taking the long way around. The journaling
provided by ext3 makes data corruption less
likely. "Despite writing some data more than once, ext3 is
often faster (higher throughput) than ext2 because ext3's
journaling optimizes hard drive head motion," he wrote.
Perhaps the determining factor, though, was Johnson's fourth
reason.
"It is easy to change from ext2 to ext3 and gain the
benefits of a robust journaling filesystem, without
reformatting," he said. "That's right, no need to do a
long, tedious, and error-prone backup, reformat, restore
operation in order to experience the advantages of
ext3."
Johnson said that Red Hat's choice was not meant to disparage any
of the other new filesystems, but instead was the most sensible
one for the biggest commercial distribution right now. Indeed,
the developers of the various journaling filesystems, too, have
gone to considerable lengths to avoid a holy war of the kind that
erupts frequently among backers of different projects that
perform similar functions.
"I personally think filesystems should be rewritten from
scratch every 5 years, but there are lots of people who think
quite differently on this," said Hans Reiser, for whom the
Reiser filesystem is named, in an email interview yesterday.
"Reiser4 is going to have a completely new core engine, and
quite a lot of people think that we should just make lots of
tweaks to what we have instead. It is extremely expensive,
risky, and just plain hard work, for us to do that core engine
rewrite, and yet I think it just has to be done. I could give
you lots of logical reasons why we are doing it, but those aren't
the real reasons why we rewrite when other filesystems don't.
People just have different styles, and fortunately both styles
work in their way, each with different effects and
benefits."
While pointing to benchmarks that demonstrate a substantial speed
increase when using the Reiser filesystem as opposed to ext3,
Reiser said there's sense in Red Hat's more circumspect
approach.
"ext3 is in its way an excellent filesystem written by
very talented programmers, and the upgrade path is surely easy for
users and distro alike," he said. "The upgrade path
issue really makes it a conservative rather than crazy decision
for RedHat; I can easily understand their decision."
Just as there are many distributions, allowing users to
select one that best suits their needs, the multiplicity of
journaling filesystems is, Reiser said, a sign of health,
allowing users to select one that matches their knowledge and
comfort level. His own development effort aims to push the
technology to the maximum.
"Reiser4 is designed to be highly extensible thanks to
DARPA's funding us to do plugins. There are lots of semantic
enhancements like inheritance and auditing coming down the pipe
in version 4. We want, in our small way, to help make Linux not
just another Unix, but something novel and cutting edge. This is
the main reason users should find Reiser4 of interest. Not
every distro is attracted to pushing past traditions though, and
the beauty of Linux is that users get to choose what distro they
need. I think that Microsoft is going to heat up the race for
semantic innovation in the filesystem namespace in the next few
years. We are going to try to innovate faster than Microsoft in
the filesystem namespace enrichment arena, and I hope you will
wish us luck in it."
There is an enormous body of highly technical literature
explaining not just the superiority but the inevitability of
journaling filesystems. While not entirely a journaling filesystem
in the strictest sense, ext3 provides a painless way for nontechnical
users to enjoy some of the benefits of the new high-power
systems, while keeping one hand on all that is familiar. But it
seems clear that, as storage, code, and user data grow in size,
and as flexibility in storage options grows, today's cutting edge
will be in universal use tomorrow, and ext2 and its derivatives
will take a place in history -- an honored place, to be sure, but
history nonetheless. For now, users have the choice to dive in
head first, dip their toes, or remain entirely ashore.