Hardware Notes: Hard Drive Benchmarking With iozone
Modern machines: clockspeed is only part of the picture

Lou Grinzo
Thursday, February 22, 2001 09:00:16 AM
Try this experiment sometime on an unsuspecting geek friend: Ask your
unwitting subject about the performance of his or her newest computer, and I
bet you'll be treated to a litany of specs: how many megahertz or
gigahertz the processor and graphics chip are, how much memory is on the
motherboard and the graphics board, what kind and size of cache the CPU
uses, etc. But nowhere in this recital will your friend tell you anything
specific about the hard drives in the system--no data transfer rates, no
platter RPM values, no drive cache sizes, etc. For many computer users,
even the hardest of the hard core, disk drives are "just there," and any
detail beyond the drive being EIDE or SCSI, or (possibly) UDMA 66 or UDMA
100 or some particular sub-flavor of SCSI, is a mystery.
This is a shame, and all the more curious, since modern desktop
computers, particularly those in a mainstream setting, are typically more
sensitive to the speed of their disk subsystem than they are to their
processor speed. Except for the people who spend all day generating
fractals or applying graphics filters to huge image files, very few of us
are up against a processor speed limit. Yet several times every day most of
us have to wait for a disk drive. If nothing else, notice how much time it
takes your system to boot or compile a kernel, and how much disk I/O it does
during that process, and ask yourself how much of the delay is due to the
processor and how much of it is the system waiting for disk I/O to complete.
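One rough way to put a number on that question is to compare the wall-clock time a task takes with the CPU time the process actually consumed. The sketch below is only an illustration, not a proper benchmark: the function names (`measure`, `write_and_sync`) and the 16 MB scratch file are assumptions made up for this example. The gap between the two numbers is time spent waiting, which can include other delays besides disk I/O. The temporary directory may also sit on a RAM-backed filesystem on some systems, in which case the disk is not involved at all.

```python
import os
import tempfile
import time


def measure(task):
    """Run task() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def write_and_sync(path, size_mb=16):
    """Hypothetical workload: write size_mb of data and force it to disk."""
    block = b"\0" * (1024 * 1024)
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # block until the OS reports the data is written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "scratch.bin")
        wall, cpu = measure(lambda: write_and_sync(target))
    # Timer resolutions differ, so clamp tiny negative differences to zero.
    waiting = max(wall - cpu, 0.0)
    print(f"wall {wall:.3f}s, cpu {cpu:.3f}s, waiting {waiting:.3f}s")
```

If the wall-clock figure is much larger than the CPU figure, the task spent most of its time waiting rather than computing.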
Modern desktop computers are so fast in all respects that we often have
the luxury of ignoring performance issues. (People like me who started with
a 1MHz processor have no trouble remembering how we lusted after every extra
slice of CPU speed.) But if you're in a position where you need or want to
maximize your computer's speed, all but the most compute-bound tasks will
likely benefit from a faster disk drive system. Which raises the question:
Aside from reading specs (which are really just a first cousin to
statistics in terms of veracity), how do you evaluate the performance of a
disk drive?
My favorite tool in this area is iozone, a relatively small but
full-featured open source program for benchmarking disk systems. iozone was
written by William Norcott and Don Capps, with contributions credited in the
source code from several other people, and it can be built for over two
dozen OSes and versions, including Linux, Windows (32-bit), several BSDs, and Solaris.
One of the annoying but true facts of life in benchmarking is that you
have to decide up front exactly what you want to measure, and then take
great pains to make sure that's what you really measure. Nowhere in
computer system performance work is that more true than in disk drives,
where the response time you see from disk activity is a function of a whole
gaggle of factors. Besides the characteristics of the drive itself (head
seek time, rotational and data transfer speed), you have to contend with the
operating system, too, and its penchant for caching disk contents, and in
some cases rescheduling pending disk I/O to use the disk more efficiently.
(One of the most fascinating results I've seen in this area involved running
two mainframe operating systems, one atop the other, when I worked at IBM.
The host OS was VM, which is so good at scheduling disk I/O that the guest
OS actually ran slightly faster on top of VM than it did on the bare metal.)
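The effect of operating system caching is easy to see for yourself. The following is a minimal sketch, not how iozone works: the helper names (`make_test_file`, `timed_read`) and the 32 MB file size are assumptions chosen for illustration. It reads the same file twice and reports the throughput of each pass. Because the file has just been written, even the first pass may be served largely from the cache. Seeing the drive's real speed generally takes a file larger than physical memory, or some other way of getting the cache out of the picture.

```python
import os
import tempfile
import time


def make_test_file(path, size_mb=32):
    """Create a scratch file holding size_mb mebibytes of random data."""
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))
        f.flush()
        os.fsync(f.fileno())


def timed_read(path, block_size=1024 * 1024):
    """Read the whole file sequentially; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # no Python-level buffering
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache_demo.bin")
        make_test_file(path)
        for label in ("first read", "second read"):
            size, secs = timed_read(path)
            print(f"{label}: {size / secs / 1e6:.0f} MB/s")
```

Throughput figures well beyond what the drive's interface can deliver are a strong hint that you are measuring memory, not the disk.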
So, what do you want to measure? The performance of the bare drive, or
how it works in the context of a real-world setting, including your choice
of Linux distro, applications, and usage pattern, or something in between?
There are arguments to be made for and against each approach, and
like almost all interesting questions, there is no single right answer; it
all boils down to figuring out what kind of data is most useful to you, and
then making sure you get it as accurately as possible.