January 20, 2019

Get the Most Out of Your Multicore Processor - page 2

Two heads are better than one!

  • September 11, 2009
  • By Akkana Peck

Some Intel processors -- notably the Pentium 4, Atom and the new Core i7 -- appear to have twice as many cores as they actually do, due to "hyperthreading". A dual-core Atom, for example, will appear to Linux as though it has four cores rather than two (Figure 6).
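If you'd like to check this from a terminal rather than a graphical monitor, here is a minimal sketch. It assumes an x86 Linux kernel whose /proc/cpuinfo includes the "siblings" and "cpu cores" fields; the function name ht_status is made up for this example, and on other architectures the fields may be missing, in which case it just prints "unknown".

```shell
# ht_status [FILE]: hypothetical helper that guesses whether hyperthreading
# is active by comparing two fields of a cpuinfo-style file.
# FILE defaults to /proc/cpuinfo.
ht_status() {
    file="${1:-/proc/cpuinfo}"
    # "siblings"  = logical CPUs per physical package
    # "cpu cores" = physical cores per physical package
    siblings=$(awk -F': *' '/^siblings/ {print $2; exit}' "$file")
    cores=$(awk -F': *' '/^cpu cores/ {print $2; exit}' "$file")
    if [ -z "$siblings" ] || [ -z "$cores" ]; then
        echo "unknown"
    elif [ "$siblings" -gt "$cores" ]; then
        echo "hyperthreading: $siblings logical CPUs on $cores physical cores"
    else
        echo "no hyperthreading: $cores physical cores"
    fi
}
```

Run ht_status with no argument to inspect the machine you're on. More logical CPUs than physical cores suggests that hyperthreading is switched on.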

You won't see the same gains from these four virtual cores as you would from a true quad-core system, but on most modern machines you will see a performance boost. However, some people claim that on older Pentium-4 machines, hyperthreading actually caused a performance loss.

If you're not sure and want to find out whether hyperthreading is helping you, the time command is very handy. Just insert it before any command:

$ time make -j4
[ ... lots of make output ]
real 15m16.796s
user 50m55.839s
sys 3m41.130s

The exact format of the output may vary depending on what shell you're using, but "real" or "elapsed" is the field you're looking for: it's the wall-clock time the command took from start to finish.
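The other two fields are worth a glance too. "user" and "sys" count CPU time summed over all cores, which is why "user" above is so much bigger than "real". Dividing the CPU time by the real time gives a rough idea of how many cores were busy on average. Here is a small sketch of that arithmetic using the numbers above; it assumes the bash-style "15m16.796s" format, and the helper names to_seconds and cpu_ratio are invented for this example.

```shell
# to_seconds VALUE: convert a bash-style time such as "15m16.796s" to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# cpu_ratio REAL USER SYS: print (user + sys) / real, i.e. roughly how many
# cores were kept busy on average during the run.
cpu_ratio() {
    real=$(to_seconds "$1")
    user=$(to_seconds "$2")
    sys=$(to_seconds "$3")
    awk -v r="$real" -v u="$user" -v s="$sys" \
        'BEGIN { printf "%.2f\n", (u + s) / r }'
}

cpu_ratio 15m16.796s 50m55.839s 3m41.130s   # prints 3.57
```

A ratio of about 3.6 means that make -j4 kept most of the four (real or virtual) cores busy. It doesn't tell you whether the build finished sooner, though; for that, compare the "real" times of two runs.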

Apps that use multiple CPUs: make -jwhat?

You're running a performance monitor ... and it tells you that your favorite application isn't using all your processors effectively. Oh, no! Is there anything you can do about that?

Generally, not much. Some jobs are hard to run in parallel, because one part of the task has to be finished before the next is begun. You can't start building the walls of your house before the foundation is finished, and you can't start on the roof until the walls are ready. A lot of programs are the same way.

Some apps, though, are designed to run several operations in parallel, usually by creating separate processes or "threads". A few programs even let you control how parallel you want them to be (Figure 7). Others, like ffmpeg with its -threads option, let you specify parallelism from the command line ... though in ffmpeg's case, you won't always see much benefit, since some video formats can't be split into multiple threads very well (each frame is dependent on the frame before it).

One place where parallel processors come in especially handy is in building programs from source. You can get huge gains by telling make to use multiple processors. Just add -j along with the number of processes you want it to run.

What number? You can use the number of processors -- -j2 for a dual-core, -j4 for a quad and so forth. But you'll often get a slight gain from using the number of processors plus one. On my dual-core Pentium, here's how long it took me to build the 2.6.31-rc8 kernel:

make -j1 (or just make) 10:31
make -j2 6:01
make -j3 5:57
make -j4 5:56

As you see, there's only a small gain on a dual-core machine from going above 2, and not enough to matter above 3. But that can vary a bit; on a faster Intel Core 2 Duo, I saw a big jump going from -j2 to -j3.
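If you'd rather not hard-code the number, the "processors plus one" rule of thumb can be computed on the fly. This is a sketch, not a recommendation for every machine: it assumes nproc (from GNU coreutils) is available, falls back to getconf otherwise, and the function name jobs_plus_one is made up for this example. Note that both tools count hyperthreaded virtual cores as processors.

```shell
# jobs_plus_one: hypothetical helper that prints the processor count plus one.
jobs_plus_one() {
    # nproc comes from GNU coreutils; getconf is a fallback for systems without it.
    cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    echo $((cpus + 1))
}

# Show the command this rule of thumb would suggest on the current machine.
echo "make -j$(jobs_plus_one)"
```

On a dual-core machine without hyperthreading, this would suggest make -j3.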

So if you want to get the most out of that multi-core machine, run your own tests, with time, and see for yourself!
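Those tests are easy to script. Here's a minimal sketch that rebuilds from scratch at each -j level and prints the elapsed time. It assumes you run it from the top of a source tree with a working "make clean" target; bench_jobs is a name invented for this example, and the MAKE variable is only there so you can substitute a different make program.

```shell
# bench_jobs MAX: hypothetical helper that times a clean build for -j1 .. -jMAX.
# Run it from the top of a source tree that supports "make clean".
bench_jobs() {
    max="$1"
    make_cmd="${MAKE:-make}"                # set MAKE=... to use another make
    n=1
    while [ "$n" -le "$max" ]; do
        "$make_cmd" clean >/dev/null 2>&1   # start every run from scratch
        start=$(date +%s)
        "$make_cmd" "-j$n" >/dev/null 2>&1
        end=$(date +%s)
        echo "-j$n: $((end - start)) seconds"
        n=$((n + 1))
    done
}
```

For example, bench_jobs 4 would produce one line for each of -j1 through -j4. Timings are whole seconds, which is plenty for builds that take minutes. The first run may look slower than it should because the source files aren't in the disk cache yet, so it's worth doing one throwaway build first.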
