Test Plan Charlie Unplugged: An Interview with David Boyes
Meet David Boyes
Anyone who works with Linux on IBM's System/390 mainframes has certainly heard of David Boyes. He made history early in the project by running no fewer than 41,400 Linux images on a single mainframe, all of them doing real work under simulated load as web servers. More recently, David has been involved in helping application service providers and other companies deploy Linux on System/390 hardware in the real world. David Boyes is 34 years old and lives in Ashburn, Virginia, just a few miles west of Washington, D.C. He is newly married to Margarete, a native of the former East Germany whom he met when she was in the U.S. as an exchange student. Together they run Sine Nomine (Latin for "Without Name") Associates and take care of their nine (!) Persian cats, all rescued from animal shelters.
This interview took place on March 20, 2001, between David Boyes and Internet.com Technical Editor Scott Courtney.
LinuxPlanet: How did you get interested in computers?
Boyes: I started out interested in English and History. I'm not really a computer geek at all, but am a person who gets interested in interesting problems. I started out at University of Oregon in 1983 as a biology major, but that didn't work out for me. I switched to History and English and graduated with an MA in History in 1988. I also have dual bachelor's degrees in English and Roman History. The approach [in college] was to study something interesting and then find out if there was a way to get paid for it. I started my computing career in college as a night operator on an IBM 360/50, running card decks through the machine and batching print back out to users. It was a great job as an English/History major. I got paid to do my homework, get up every hour or two, run some cards, collect some output, go back to homework. When the university got a 370/155, that was a big upgrade, a big deal. Three months later they went to a 4341. IBM suggested that we bring in this thing called VM.
We didn't want to break all the OS/360 stuff we were running. They brought in VM as a migration tool. It was VM/370 Release 6. It was love at first sight. No more coming in at 3 am to test system stuff! I've always had a soft spot for VM as an environment -- a long love/hate relationship with CMS, but CP -- that's beautiful. The sheer idea of being able to simulate whatever hardware you want within the architecture of the machine is an unbelievable idea. I got involved with it as a systems programmer because I had the time to do it. The University went through a period of hard times...and a lot of positions simply disappeared. That was the main reason I left the University, because the job disappeared. That ended up transferring me to Rice University down in Texas. It's very small, right in the heart of downtown Houston. It's a fairly wealthy school, and very much an engineering school. They have a long relationship with IBM. Over the six years that I was there, a lot of the players that are now part of the Linux 390 story had some connection with Rice. Myself, Rick Troth, Adam Thornton, and others. The thing that was particularly interesting at Rice was that the mainframe was relatively central to the Rice community. That was beginning to decline, and our interest was basically to preserve the platform that we were working on. Gopher and other internet tools were survival tools for us. Rick Troth wrote Webshare [the first web server for VM] and I wrote the IP sockets piece of it. It was based on work by us and by Serge Goldstein at Princeton.
So you were involved with VM quite early. What was it like working in that environment?
A lot of this stuff has really evolved as 'Can we do this?' How much can we compress these things? How much of the data can be shared? One of the interesting things about VM is that it really is the first personal computer. We wondered if you could have a virtual machine with everyone running a private copy of the operating system.
Rick had been thinking about what was involved in a personal UNIX workstation. Sun had just come out with the Sun 3. Rick had been playing with what he called VNIX, and the idea was that everyone had a personal virtual machine. Since Rice was an engineering school, the tools ran on UNIX, mostly on Suns. The [mainframe] team dispersed, and I left for Notre Dame. Not much happened there. I was teaching economics at that point. After the experience at Notre Dame, I left that and went to the consulting business. I hooked up with Dimension Enterprises. The company had about four people. We were building fairly large-scale IP infrastructure at that point, taking advantage of different skill sets. I got stuck with the one-off machines that weren't necessarily mainstream. I got to work with a whole lot of weird hardware. You've got this one Sequent in the corner and an experimental DEC machine that DEC never manufactured [in quantity], and an Intel Hypercube. I had to glue that kind of stuff together. You come up with automation tools so you have to do as little work as possible.
From Mainframe to Linux -- A Natural Step
How did this lead to your work with Linux on System/390?
Linux for 390 is really more of the same. Linux itself wasn't the core of the idea for me. What I contribute to the process of creating these kinds of solutions is really the architectural stuff. What I'm into is 'How do we glue these things together?' Linux by itself, as a technology exercise, is an interesting hack. It's a lot of fun to play with. In terms of sheer inventiveness, it's a really neat piece of work. But in terms of real world scenarios, there are going to be places where it's not the right tool. In order for Linux to be successful in the enterprise, it's got to operate in an ecosystem where there are different kinds of tools. It provides a sort of mediation layer, where what it is presenting is a way to move information through an enterprise, and possibly some point solutions. Linux helps you glue all that stuff together so you can present a unified solution. That's what intrigued us about Linux 390. From an enterprise perspective, there is little question that System/390 is the most reliable system in existence. If you look at where enterprises store their critical data, it's not on UNIX systems. It's on the S/390. After 30 years of 'the mainframe is dying', 70% of the data is still on the mainframe. The idea of Linux in this environment, especially when combined with VM, represents a way that you can get enterprise access to this huge amount of data in a way that doesn't involve teaching OS/390 all these tricks. OS/390 is really a batch system with interactive access bolted onto the side, and I mean that in the Frankenstein sense of bolts sticking out of the side of its neck. It's not really what it's made to do. IBM has made it better over the last decade, but it was a fight.
You have often praised IBM's VM operating system as a hypervisor environment for Linux. For readers who aren't mainframe gurus, can you explain the difference between LPAR, VM, CP, and CMS?
The idea with a piece of mainframe hardware is that it's a fairly expensive box. In most cases an organization is going to try to divide that expensive resource so that they can apply pieces to different problems simultaneously. One of the things IBM does to support its OS/390 operating system is to physically divide the system into partitions. You can assign resources like memory or I/O or CPUs into logical partitions, or LPARs. The hardware has the ability to do that. It's fairly static. In 1964, IBM was developing a new machine, the 360/67. The hardware didn't exist yet and they wanted to get a running start on the operating system but didn't have any hardware to test on. They developed a simulator, in a multitasking environment, so that they could have several developers working on it at the same time and they wouldn't have to compete for one development machine. They took a lot of care with this, and the simulation was so good that the first time the hardware appeared it was basically a no-break environment. When they ran on the real hardware, it worked the first time. That simulator was what became VM. The original IBM systems were batch systems. Someone had the idea about what if you put into one of these virtual machines a basic environment for a user -- an editor and so on, so that each person could have their own computer. The idea was to write a tiny, very interactive support environment so that it was friendly for users to work with. It gave them the warm fuzzy tools to edit source code and send files to each other. It became called the Conversational Monitor System, or CMS. CMS became the de-facto environment. It's an operating system that runs in those virtual machines. Up until about version 3 it would run on the bare metal, but it became more sophisticated and no longer runs on the bare hardware. The idea that you create an operating system that runs in a virtual machine is a very powerful idea. 
That can be extended to run a different operating system, which is what we did with Linux. It's a different environment that happens to be running on the same physical machine.
Leaving IBM, but Staying with Linux and S/390
You worked for, or with, IBM for a while. What led you to leave IBM?
I worked for IBM, mostly as a consultant. IBM fascinates me, just because of the sheer volume of good technology that comes out of that organization. Very few people give IBM credit for their contribution to the industry over the last 30 years. For example, the Winchester disk drive was an IBM invention. They consistently produce vastly intelligent solutions that they are consistently unable to market successfully. In that way they are kind of like Xerox. That last step, making it work in that ecosystem environment that I described, is where they consistently fail.
There are actually two ports of Linux to S/390 hardware. One is IBM's officially supported version, and the other is the Bigfoot port done primarily by Linas Vepstas. What are your feelings about the Bigfoot port and its relationship with IBM's port?
I was peripherally involved in the Bigfoot port, in making the case to IBM. IBM came to us before they decided to release Linux, and asked us, 'Why would we want to do this?' We responded that, basically, you are going to do this with us or we are going to do this without you. Bigfoot has some better ideas. For example it was designed to run on any 370 architecture machine. It would support more hardware. It was designed to work specifically under VM, and would take advantage of things that are only available there. One of the things the Bigfoot port did is that the stack grew in a different direction, which made some things more efficient. There were a lot of things in Bigfoot that were wise choices at the time. Politically, it was done in a more typical Linux development environment. It was public, people were hacking on stuff, more like the typical Intel kernel. The IBM port was done more as a very quiet, very skunkworks, project: Don't get the IBM badge holders in trouble.
They were working mostly on their own time without management approval. We did a lot of things in Bigfoot that would have made the IBM port easier to do. If I had to pick one thing that was an unwise choice on IBM's part, it was the use of the relative instruction set so that it would only run on G2 or higher. They could have used our Bigfoot work, but that would have been a very difficult sell inside IBM. IBM is a hardware company, they want to sell hardware. But if I'd done it, I would have done it differently.
Did Linas Vepstas get a raw deal from IBM when they passed over his port and wrote their own?
I think Linas overreacted to a number of things. There is definitely a culture in the Linux community that says, 'I've done something, that gives me some value in the community.' This is something that was very personal to him, and IBM came along with their port and sidelined him in the community. I don't think anybody at IBM actively tried to give him a raw deal. It was a matter of timing and specifically of IBM internal politics. Beyond that, I don't know. I don't think he got a raw deal. The Boeblingen lab people have been forced to be much more cooperative, and some of them have been uncomfortable with it. They want to do things their way. Over time they have become much more cooperative. It's been a learning process from both sides of the community. They've had to learn to work with the development community, and we've learned to work with some of their internal constraints that we can't see.
The Early Days, and Looking to the Future
Were the early versions of Linux for S/390 easy to install and usable after you got them running?
We were almost always installing under VM. If you had the advantages of VM and the tools around VM, then yes, they were always easy to install if you were comfortable with installing an operating system under a virtual machine. It was no different installing Linux than any other virtual operating system. On bare metal, it was a pain in the butt. The designers were always testing under VM and they assumed that things on bare metal would work the same. That's not always the case, such as with shared control units. The first Marist distro was always very easy. The latest SuSE version has some changes that make it a little harder to install. I'm not a big fan of graphical installers because when they break they break spectacularly, but maybe that's just because I'm an old timer.
What applications do you think make sense for L390? What are bad fits?
I'll give an answer for now, for the next three years, and for five years out. Today, what we see Linux on S/390 being most useful for is as an infrastructure server running e-mail, DNS, and the traditional back-office environment. The open source tools are available today, and are vastly superior to some of the commercial tools for the same tasks. It's not uncommon to have a Linux-based [e-mail] environment with tens of thousands of users on one system. That's not something I'd want to try on Lotus Notes or Microsoft Exchange. That's a good use today, a profitable use, for Linux 390. In three years, I see an expanded role of Linux on S/390, and specifically Linux under VM. Right now applications, database, and web servers are three separate pieces. What I see then is that those three pieces become three instances under VM. To roll these applications into a virtual server environment becomes very easy to manage. The architecture of the applications won't change.
You'll still see the three-tier design where you have the application service, the database, and the web front end. A lot of the I/O intensive tasks are something where in the 390 world the cost is much lower. Gigabit Ethernet is cheap, but in between virtual machines I'm getting eight gigabytes for free once I've got the infrastructure in place. In the long term I think distributed systems will be re-centralized. For Beowulf to be an effective technique, the problem has to be partitionable into little pieces that don't interact much with each other. What we're really talking about is how to partition applications. We've had thirty years of performance tuning, chargeback accounting, etc., in the VM environment. I think you're starting to see that discussion in how people are interested in ASPs. Thirty years ago we called them a timesharing organization. You don't have the resources or skill sets to do the job yourself, so you rent the service from someone who does. This means accounting becomes more important. In five years, I think you'll see a lot of people looking at this type of solution. Sun and HP will not sit still, but IBM has a thirty year jump. Been there, done that.
So you think this is the right direction for the future?
I think we can't afford not to go this way, because the floor space of discrete systems is reaching the point of diminishing return. For things like personal productivity applications, I don't think that will fit into this environment much. I think you'll see more of the big applications here. People want control of a personal system. Things where there is a lot of interactivity and customization will stay on the desktop. A good model for this would be a personal computer running Linux, using X11 to run remote applications on the 390. You have the same interface for both. This also requires broad-based availability of high-speed communication, but I think that's happening in most areas.
It's going to be a cooperative effort between what's on your desktop and what runs remotely. What typically people buy PCs for today is probably going to stay on there.
Linux and S/390 at Telia, Test Plan: Charlie
What did you do at Telia? How was their application a good Linux and S/390 fit?
Telia is a perfect example of the infrastructure play. The piece of Telia that's interested in this is their ISP division. They have to look at provisioning and getting the customer online as quickly as possible. The other thing they're looking at is some really I/O intensive applications. They have to collect billing information from their dialup servers. That's multiple hundreds of megabytes per day. They've got Usenet feeds that have to handle hundreds of megabytes per day. That's something that plays very well in this structure. They're lucky in that most of their applications are either open source or they wrote the code themselves. They were easily ported. IBM provided a very scalable platform for those applications to run. The 390 vastly accelerated their ability to react to resource demands. The problem with the 70 Suns [which Telia replaced] is that they could not move resources around. You ended up doing box replacements, or many boxes. Solaris is a good general-purpose solution but it's not optimized for anything. It's like a Swiss army knife. The combination of Linux and VM for Telia was interesting because they got to keep all their UNIX applications. They already knew how to run that; all they did was move them to a new container. What that container gave them was the ability to move resources around, and to have a much bigger container. They were able to start out with a very flexible platform and could move things around without having to go out and touch the physical machines every time.
Are there other companies doing similar things? Are you involved?
Yes. I can't discuss a lot of the details, but there are a lot of companies in the enterprise, telecom, and financial fields that are absolutely beating down our doors about this.
Particularly in non-US areas where floor space is at a premium -- Japan is an example, where floor space is $475 a square foot. We're not talking about companies with a few dozen web servers, but companies with 30 or 40 thousand web servers. If you actually do the math on what an NT server costs, the physical server is cheap but the support environment, management, and backups around it are very expensive. We really have more work than we can do. But as competitive as the marketplace is now, none of these companies want to raise their kimonos. This is a seriously competitive tool. I've gotten calls from as high as Lou Gerstner's office wanting to know who these companies are, and I can't tell them. These are generally very secretive projects because you're messing with the bottom line, and they're playing for keeps.
People were amazed when you announced the results of Test Plan: Charlie, in which over forty thousand Linux images ran simultaneously on one mainframe. Can you explain just exactly what Test Plan: Charlie involved?
Test Plan: Charlie was originally a demonstration project. A customer, a telco, was looking to build an ISP service. They wanted to sell bandwidth. One of the services they wanted to do was a managed router project. These services tend to be everything down to the plug into your LAN. The telco sells you a server, a router, they preconfigure everything and manage it at their office where you don't have to see it. The telco was looking at the back end of this. What they did was go to a consulting firm and paid them a large sum of money to get them an infrastructure to provide this service. They had to provide a pretty stringent service level, with guaranteed uptime. All the versions of Linux and NT that I'm aware of don't provide the ability to control the resources that any given user uses. In the mainframe environment this is something that was solved years ago. When they went to the original consultant, they said, 'Give us a design.'
The consultant gave them a pretty basic Sun design, with two machines for DNS and one machine for a more I/O intensive application. Two UE2s and one UE450, mostly because you needed bunches of disks. This is about $50K worth of hardware and they were going to replicate it for each customer. They projected 250 initial customers, so they would need 750 machines. You also need rack space, and a building to put it in. You need network cables, routers, switches -- all this peripheral infrastructure. If you're held to service level agreements, you need some way to measure that. You need a way to quickly restore a system if you have a hardware failure, backups -- this all takes people. They were looking at a price tag on the order of 50 to 60 million dollars. As everyone knows, the phone company has lots of money and can do this kind of system. We had done some other work for them, though, so they called us and said, 'Can you give us a second opinion?' We realized this was a configuration in which most of the servers were identical. What if we could do this in an environment of virtual servers? This particular customer has a System/390 already in place. They use it to print about 300,000 or 400,000 bills a month, just for one region of their national network. They had a test partition available and were already VM customers. We proposed to them building a similar server "farm" using Linux, VM, and open source tools to provide the services to the customers on their 390 platform. We did the initial study in terms of floor space, power consumption, and assuming they bought a new [System/390] machine to do it. It came out around two to three million dollars. They were uncomfortable with committing to this kind of solution without actually seeing it work. We were uncomfortable too -- this was all pretty leading edge stuff. The initial [project] was to build a working model of day one, with 250 customers.
Over the next week, we built a lot of tools, did a lot of experimenting and crashed Linux about ninety billion times. The process was to build tools that let us create, and duplicate, Linux environments between virtual machines. We had to come up with what could be shared between instances. It's different from a development environment where people are screwing around with stuff. The only thing that had to be different was the actual data that you're working on. We built the web infrastructure, and something to generate the load. They put some pretty serious constraints around it. It had to be a pretty real-world case. We did it in stages. The first was the initial 250 customers. Test Plan: Beta took this to two thousand, and then ten thousand customers, to see if this would work, and what were the boundaries on how far this thing could scale. Test Plan: Charlie was just, 'Let's go for broke.' So we started duplicating the virtual servers. When we hit 41,400 servers, we saw a message that said, 'I can't allocate any more resources.' It didn't crash and burn, it stayed up. This is an insane load! This thing was paging its brains out. [It had only 128 meg of RAM]. I wouldn't recommend this configuration for the real world.
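The figures quoted in this answer lend themselves to a quick back-of-the-envelope check. What follows is only an illustrative sketch in Python, not part of the original study: the function and variable names are made up for this example, the dollar figures are the rough ones Boyes quotes, and the memory calculation assumes 1 MB = 1,024 KB and an even split of the 128 MB of real memory across all images, which VM paging would not literally do.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the interview.
# All names here are hypothetical and exist only for illustration.

MACHINES_PER_CUSTOMER = 3            # two DNS machines plus one I/O-heavy machine
HARDWARE_PER_CUSTOMER_USD = 50_000   # "about $50K worth of hardware" per customer


def discrete_farm(customers):
    """Return (machine count, hardware cost in USD) for the per-customer Sun design."""
    machines = customers * MACHINES_PER_CUSTOMER
    hardware_usd = customers * HARDWARE_PER_CUSTOMER_USD
    return machines, hardware_usd


def cost_ratio_range(discrete_usd, mainframe_usd):
    """Return (lowest, highest) ratio of discrete cost to mainframe cost.

    Both arguments are (low, high) estimate tuples in USD.
    """
    lowest = discrete_usd[0] / mainframe_usd[1]
    highest = discrete_usd[1] / mainframe_usd[0]
    return lowest, highest


def real_memory_per_image_kb(total_mb, images):
    """Average real memory per image in KB, assuming an even split (a simplification)."""
    return total_mb * 1024 / images


if __name__ == "__main__":
    machines, hardware_usd = discrete_farm(250)
    print(f"Day-one Sun farm: {machines} machines, ${hardware_usd:,} in hardware alone")

    low, high = cost_ratio_range((50e6, 60e6), (2e6, 3e6))
    print(f"Quoted totals differ by roughly {low:.0f}x to {high:.0f}x")

    kb = real_memory_per_image_kb(128, 41_400)
    print(f"Test Plan: Charlie averaged about {kb:.1f} KB of real memory per image")
```

On those assumptions the day-one hardware alone comes to $12.5 million, so most of the quoted 50 to 60 million dollars would be the rack space, networking, and people Boyes lists. The average of only a few kilobytes of real memory per image is consistent with his remark that the machine was paging heavily.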
What We Learned from Charlie, Onward to Omega
If it isn't feasible in the real world, why was Test Plan: Charlie important?
We learned a lot about the performance characteristics in that environment. Under VM you have a lot of instrumentation for things like CPU usage and I/O performance. We were also able to do some interesting things with [allocating] different performance levels. For any individual system, you can constrain the performance of the Linux environment by constraining the resources you give it. You can offer tiered services, and you can do it on the fly. The customer bought a $2 million mainframe and is currently running over 9,000 customers on the machine. It's cabled to the original test partition, so if there's an outage they can cut back over to the other machine and there is only a few seconds of outage for each customer. Instead of breaking it down into separate hundred megabit Ethernet, we're terminating OC12s (622 megabits per second) directly into the machine. We can keep doing that until we run out of channels. We've skipped the distribution space, delivered very high-performance networking without spending a lot of money. They're currently planning for an additional system, probably a reasonably-sized z900. This will be about a 200% increase in CPU per system. All of this stuff is about ten or fifteen years old, such as the ability to run the network stack on one CPU and the rest of the application on another. What if you did this in a geographically distributed world? We're extending the benefits of these three decades of mainframe technology, under the covers, to the Linux world.
Have people overrated Test Plan: Charlie's relevance in the real world?
They get more excited about the numbers than about what's actually happening in the technology. In terms of a proof point, this idea is valid, is absolutely crucial. Is this something where any sane person would run their production? No. In that sense, it is a little overrated.
It did get IBM's attention. From IBM's perspective, VM has been a difficult system to work with. It doesn't sell big boxes because it makes little boxes run very efficiently. IBM makes a lot of money off of VM. Revenue from PROFS [an IBM groupware system], for example, is still coming in and is very lucrative. It took something like Test Plan: Charlie to wake up IBM management to the power that they had. To that degree, I can't overestimate how effective that test has been. It finally made the argument to IBM that virtual machine technology is an absolutely critical part of how they will have to play the marketing strategy in order to survive. In that sense, its importance has not been overstated at all. It had to happen, it had to happen in a very public and graphic setting.
What is Test Plan: Omega and how is it different from Test Plan: Charlie?
We found that we could get 41,400 servers in a partition in a G5 [machine]. The idea of Omega is to see what we can do with an entire machine, and preferably the biggest machine we could get our hands on. At the time we ran the test, the biggest machine around was a ZZ7, which has 12 CPUs at around 160 MIPS per CPU. It's a pretty good sized box. It used to be the top of the line -- it's a lot of iron. We were able to get some standalone time from one of the outsourcing vendors. They were willing to give us two hours on the box just for the coolness value of finding out how far we could push this thing. It had 16 gig of RAM. The design point for VM is 99,999 simultaneous logged-on users. Our test was to take Test Plan: Charlie and just go nuts. We've got a huge amount of RAM, all the I/O horsepower we could ever want. We got 97,943 images. There are recently introduced z900s that are four times that size. The largest web facility I know of is around 14,000 servers. We're talking about several times the size of the largest Internet facility in a single box. That box is ten feet by ten feet.
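To put the Omega numbers in proportion, here is a small Python sketch that works only from the figures quoted in this answer. It is an illustration, not a measurement: the names are hypothetical, 1 GB is taken as 1,024 x 1,024 KB, and dividing CPU and memory evenly across images is a simplifying assumption.

```python
# Rough density figures for Test Plan: Omega, from numbers quoted in the interview.
# Names are hypothetical; an even split across images is a simplifying assumption.

CPUS = 12
MIPS_PER_CPU = 160              # "around 160 MIPS per CPU"
RAM_GB = 16                     # "16 gig of RAM"
IMAGES = 97_943                 # Linux images reached in the test
LARGEST_WEB_FACILITY = 14_000   # servers, as quoted by Boyes


def aggregate_mips(cpus, mips_per_cpu):
    """Total nominal MIPS for the whole machine."""
    return cpus * mips_per_cpu


def per_image(total, images):
    """Average share of a resource per image, assuming an even split."""
    return total / images


if __name__ == "__main__":
    total_mips = aggregate_mips(CPUS, MIPS_PER_CPU)
    ram_kb = RAM_GB * 1024 * 1024

    print(f"Aggregate CPU: {total_mips} MIPS")
    print(f"CPU per image: {per_image(total_mips, IMAGES):.4f} MIPS")
    print(f"RAM per image: {per_image(ram_kb, IMAGES):.1f} KB")
    print(f"Images vs. largest web facility: {per_image(IMAGES, LARGEST_WEB_FACILITY):.1f}x")
```

The last line is the ratio behind the "several times the size" remark: the image count divided by the 14,000-server facility Boyes mentions.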
Concluding Remarks
You said earlier that you have a "love/hate" relationship with CMS but that you love CP, both of which are part of VM. Can you explain that remark?
CMS as a development environment...there are things I really like about it. There is a fabulous change control system built into it. The thing I hate about it is that it's architected for an IBM 3270 terminal. Imagine doing all your programming through a web browser. The behavior of some of the utilities is non-intuitive. It is a 30-year-old utility and it's starting to show its age. CP has been kept really clean. For a long time, both VM and CMS have been considered a package. By placing Linux in this environment, you've got a much better interactive environment than with CMS. A lot of the things that are part of CMS were there to solve problems of 1974, that really aren't issues any more. We're carrying a lot of baggage. I like CMS, but is it something we have to preserve for the future? Probably not. Would I be upset if CMS development stopped at this point? Probably not. There are too many advantages for developing solutions on Linux and migrating to the 390. Same thing for scaling down applications to a smaller box as well.
What do you think about the idea of Open Source software?
Open source isn't a new thing -- it's been around in both the mainframe community and the PC community for decades. The earliest example is the Waterloo Mods tape for OS/360 -- 1967, I think. A collection of good stuff shared because it's good for the benefit of all. VM Workshop tape, etc. Linux didn't invent it; it just learned from it.