Computing for a Cure
One We'd Like to End
The SUNY Stony Brook laboratory has its own 100-unit Linux cluster to run protein simulations. On that cluster alone, however, it would take too long to run the simulations necessary to observe just 50 nanoseconds of the protease's activities.
"We go through the initial stages of building the molecules and getting everything set up on our own cluster," says Simmerling. "Then, when we are ready for the real production, we move them to the computer centers where we can do the large simulations for a long time."
With the HIV simulation, the first step involved creating the molecule and running enough of the simulation to see that the protein didn't fall apart. Then, it was moved to the Mercury cluster at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. Since NCSA has a faster interconnect than the cluster at Stony Brook--Myrinet rather than gigabit Ethernet--as well as faster processors, the simulations could be done in half the time. To get them to run even faster, Simmerling looked into running the simulations on the NCSA's new Cobalt supercomputer, which is geared toward these types of simulations.
"It is very convenient to use something that is just a single system image," Simmerling says. "Cobalt is really just one computer, so you don't have to deal with issues of queuing systems or multiple nodes, and handling things like MPI [Message Passing Interface] is easier."
The NCSA purchased the Cobalt system in July 2004 and got it online the following year. The SGI Altix system contains 1,024 Intel Itanium 2 processors running the Linux operating system, 3 TB of globally accessible memory, and an SGI InfiniteStorage TP9500 disk array to hold a 370 TB shared file system. It has peak performance in excess of 6 teraflops.
"The Altix has the same CPUs as the other clusters, but it has low-latency, high-bandwidth interconnects," says Simmerling. "Then, having scientific staff at SGI who really understand the machine well enough that they could help us optimize our code made a huge difference."
The NCSA recommends Cobalt for applications that have a moderate to high level of parallelism (32 to 512 processors) and for applications that require more than 250 GB of shared memory. It is intended for large-scale simulations, high-end visualization, large-scale interactive data analysis, and codes that perform better in an SMP environment. For applications that run on a smaller number of processors or run well in a distributed cluster environment, the NCSA recommends one of its other clusters.
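As a rough illustration, that guidance can be written as a simple decision rule. The sketch below is not an NCSA tool. The function name, its parameters, and the returned labels are hypothetical, and treating jobs outside the 32-to-512 range as a poor fit is an assumption, since the guidance does not address them directly.

```python
def suggest_system(processors, shared_memory_gb, runs_well_distributed=False):
    """Hypothetical helper: map a job's needs onto the guidance quoted above."""
    # More than 250 GB of shared memory is one of the stated cases for Cobalt.
    if shared_memory_gb > 250:
        return "Cobalt"
    # Codes that run well on a distributed cluster are pointed elsewhere.
    if runs_well_distributed:
        return "other NCSA cluster"
    # Moderate to high parallelism is given as 32 to 512 processors.
    # Assumption: anything outside that range is treated as not matching.
    if 32 <= processors <= 512:
        return "Cobalt"
    return "other NCSA cluster"


if __name__ == "__main__":
    print(suggest_system(processors=64, shared_memory_gb=40))
    print(suggest_system(processors=8, shared_memory_gb=16))
```

A real allocation decision would also weigh queue load and how well the code scales, which this sketch ignores.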
The project took a total of 20,000 CPU hours to run the necessary simulations. Cobalt accomplished this within a month. The simulation allowed the SUNY Stony Brook team not only to see the protease move into the open state but also to observe the drugs latching on to the protease. Further simulations are planned for HIV, but Simmerling is also working on other projects, such as drug-resistant tuberculosis. He says that the newer computers, like Cobalt, open up whole new areas of research that simply weren't possible before. He recommends that other researchers start thinking bigger in their project plans to take advantage of the processing power now available.
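To put the 20,000 CPU hours in perspective, the short calculation below converts that total into wall-clock time for several processor counts. It is a back-of-the-envelope sketch that assumes ideal scaling and no time spent waiting. The processor counts are illustrative, since the article does not say how many processors the runs actually used.

```python
def wall_clock_hours(cpu_hours, processors):
    """Ideal-scaling estimate: total CPU hours divided evenly across processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return cpu_hours / processors


CPU_HOURS = 20_000  # total reported for the project

if __name__ == "__main__":
    # Illustrative processor counts, not figures from the project.
    for n in (32, 128, 512):
        hours = wall_clock_hours(CPU_HOURS, n)
        print(f"{n:4d} processors: {hours:7.1f} hours ({hours / 24:5.1f} days)")
```

Under these assumptions, even 32 processors running continuously would finish in about 26 days, which is consistent with the work being completed within a month.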
"There are not enough people taking advantage of the national centers," he says. "There is computer time available and it is not that difficult to get if you have a good project."
This article first appeared on ServerWatch, a JupiterWeb site.