February 17, 2019

Hardware Notes: Hard Drive Benchmarking With iozone - page 3

Modern machines: clockspeed is only part of the picture

  • February 22, 2001
  • By Lou Grinzo
Whether you're benchmarking a hard drive, a processor, a network connection, a software setting, an operating system, or anything else, it's worth keeping a few things in mind:

  • Don't be afraid to experiment. Nothing will help you get an accurate answer to whatever question you're asking better than playing around with programs like iozone to get a feeling for how they work and how they respond to various configuration details. The iozone documentation has a series of interesting graphs, beginning on page seven, that show some of the "surprises" newcomers can expect to encounter.

  • Don't get obsessed with minor differences in performance numbers. You can drive yourself nuts trying to figure out why a pair of measurements that should have yielded identical results differed by some small margin. There are always small differences due to timing granularity and unseen factors (like where iozone happens to place the test file on your drive), so these curiosities often can't be explained. This is why reported performance numbers are typically an average of three runs, which smooths out those variations. An even bigger mistake is taking these minor differences as absolute truth and basing a large purchasing decision on them. You shouldn't be afraid to say that a performance difference of a few percentage points is negligible compared to saving $20 per unit on a truckload of hardware upgrades, or that tweaking every desktop configuration in the company to improve performance by 5% isn't worth the effort.

  • Do everything you can to make all comparisons "apples to apples". If you're comparing two disk drives, make sure you run them on the same system and OS, with the same connection, and that you're really using the same workload on each. It's distressingly easy to make one minor mistake or let one "meaningless" difference creep into a comparison, only to find out later that it made a significant difference in the results and led you to an incorrect conclusion. The rule of thumb is that you should strive to isolate completely the one thing you're assessing--the way a hard drive is tuned, the exact model of hard drive, the layout of partitions on the drive, the use of a caching disk controller board, whatever--so that the differences in your measurements can be most accurately traced to the factor of interest and not to something else.

  • The further you move away from raw measurements, like pure disk I/O transfer speed, toward more realistic performance, e.g. compiling a Linux kernel or using a system as a web server, the harder it can be to accurately model the results you'll see from the final configuration. First, it can be hard to know exactly what's going on inside your real-world application. For example, what is the pattern of disk I/O in your MySQL database? It's not enough to know that 80% of your accesses are to three specific tables; you'll also need to know the pattern of hits within those tables, how they're interleaved (or not), and so on. Second, very few systems today have a workload that's both easily modeled and constant over any useful period of time. Whether it's a laptop, desktop, or server system, things change, sometimes a lot, and today's optimal configuration can become a problem tomorrow, at least until you do more measurements and figure out how to re-tune the system.
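
As a rough illustration of the advice above about averaging several runs and not over-reading small differences, here is a minimal Python sketch. It is not part of iozone. The throughput figures are made up, the function names are hypothetical, and the 5% threshold is only an assumed rule of thumb. It treats a gap between two drives as meaningful only when the averages differ by more than the threshold and by more than the combined run-to-run spread.

```python
from statistics import mean, stdev


def summarize(runs):
    """Return (mean, sample standard deviation) for a list of throughput numbers."""
    return mean(runs), stdev(runs)


def difference_is_meaningful(runs_a, runs_b, threshold_pct=5.0):
    """Assumed rule of thumb, not a formal statistical test.

    The gap counts as meaningful only if the two averages differ by more
    than threshold_pct percent AND by more than the sum of the two
    run-to-run standard deviations.
    """
    mean_a, sd_a = summarize(runs_a)
    mean_b, sd_b = summarize(runs_b)
    gap = abs(mean_a - mean_b)
    gap_pct = 100.0 * gap / min(mean_a, mean_b)
    return gap_pct > threshold_pct and gap > (sd_a + sd_b)


# Made-up throughput figures in KB/s, three runs per drive.
drive_a = [24100.0, 23800.0, 24350.0]
drive_b = [24600.0, 24200.0, 24900.0]

# The averages differ by roughly 2%, which is under the assumed threshold.
print(difference_is_meaningful(drive_a, drive_b))  # prints False
```

With only three runs per configuration the standard deviation is itself a rough estimate, so a check like this is a sanity filter rather than proof that one drive is faster.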

Coming up:

In upcoming Hardware Notes columns I'll take a look at one of the current 18" LCD flat panel displays (hint: save your pennies, you'll want one), and then return with a more detailed look at iozone, specific benchmarks on various EIDE and SCSI hard drives, and the black art of tuning hard drive performance under Linux.
