Test Plan Charlie Unplugged: An Interview with David Boyes
Meet David Boyes
If it isn't feasible in the real world, why was Test Plan: Charlie important?
We learned a lot about the performance characteristics in that environment. Under VM you have a lot of instrumentation for things like CPU usage and I/O performance. We were also able to do some interesting things with [allocating] different performance levels. For any individual system, you can constrain the performance of the Linux environment by constraining the resources you give it. You can offer tiered services, and you can do it on the fly.
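The on-the-fly tiering Boyes describes maps to VM's CP SET SHARE command, which adjusts a running guest's scheduling weight without restarting it. A minimal sketch, assuming three Linux guests with illustrative names (LINUX01 through LINUX03); exact operand syntax varies by VM release, so check the CP command reference for your level:

```
CP SET SHARE LINUX01 RELATIVE 100
CP SET SHARE LINUX02 RELATIVE 300
CP SET SHARE LINUX03 ABSOLUTE 5%
```

Here LINUX02 gets roughly three times the relative CPU weight of LINUX01 when the machine is busy, while LINUX03 is pegged to about 5% of total processor capacity regardless of load, which is one way to implement the tiered service levels he mentions.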
The customer bought a $2 million mainframe and is currently running over 9,000 customers on the machine. It's cabled to the original test partition, so if there's an outage they can cut back over to the other machine, with only a few seconds of outage for each customer.
Instead of breaking it down into separate hundred-megabit Ethernet links, we're terminating OC-12s (622 megabits per second) directly into the machine. We can keep doing that until we run out of channels. We've skipped the distribution layer and delivered very high-performance networking without spending a lot of money.
They're currently planning for an additional system, probably a reasonably sized z900. This will be about a 200% increase in CPU capacity per system.
All of this stuff is about ten or fifteen years old, such as the ability to run the network stack on one CPU and the rest of the application on another. What if you did this in a geographically distributed world? We're extending the benefits of these three decades of mainframe technology, under the covers, to the Linux world.
Have people overrated Test Plan: Charlie's relevance in the real world?
They get more excited about the numbers than about what's actually happening in the technology. As a proof point that this idea is valid, it's absolutely crucial. Is this something where any sane person would run their production? No. In that sense, it is a little overrated.
It did get IBM's attention. From IBM's perspective, VM has been a difficult system to work with. It doesn't sell big boxes because it makes little boxes run very efficiently. IBM makes a lot of money off of VM. Revenue from PROFS [an IBM groupware system], for example, is still coming in and is very lucrative. It took something like Test Plan: Charlie to wake up IBM management to the power that they had. To that degree, I can't overstate how effective that test has been. It finally made the argument to IBM that virtual machine technology is an absolutely critical part of how they will have to play the marketing strategy in order to survive. In that sense, its importance has not been overstated at all.
It had to happen, and it had to happen in a very public and graphic setting.
What is Test Plan: Omega and how is it different from Test Plan: Charlie?
We found that we could get 41,400 servers in a partition in a G5 [machine]. The idea of Omega is to see what we can do with an entire machine, and preferably the biggest machine we could get our hands on. At the time we ran the test, the biggest machine around was a ZZ7, which has 12 CPUs at around 160 MIPS per CPU. It's a pretty good sized box. It used to be the top of the line -- it's a lot of iron. We were able to get some standalone time from one of the outsourcing vendors. They were willing to give us two hours on the box just for the coolness value of finding out how far we could push this thing. It had 16 gig of RAM.
The design point for VM is 99,999 simultaneous logged-on users. Our test was to take Test Plan: Charlie and just go nuts. We've got a huge amount of RAM, all the I/O horsepower we could ever want. We got 97,943 images.
There are recently introduced z900s that are four times that size. The largest web facility I know of is around 14,000 servers. We're talking about several times the size of the largest Internet facility in a single box. That box takes up about ten feet by ten feet of floor space.