Reasoning Your Way to Linux
Study Shows Open Source Code Superior
April 7, 2003
All open source programmers believe in their heart of hearts that open source is not only the best way to write software, but that it also produces the best possible software. It's a point that's been argued endlessly, but until recently there hasn't been any hard proof from a non-partisan third party demonstrating that open source code is actually superior to closed source. Now, thanks to research (http://www.reasoning.com/downloads/opensource.html) done by Reasoning, a leading automated software inspection service vendor, objective proof is here: Open source is better.
Or, to be more precise, Reasoning, using their automated C and C++ source code inspection service, Illuma (http://www.reasoning.com/solutions/index.html), found that there were 0.10 defects per thousand lines of source code (D/KLSC) in Linux's 2.4.19 TCP/IP stack compared to an average of 0.55 D/KLSC in five different proprietary TCP/IP implementations. Four of the five proprietary stacks have been on the market for over ten years. In short, Linux triumphed over mature, proven programs.
TCP/IP was chosen as the target for this study because it's the "fundamental protocol that underlies the Internet" and "the functional requirements are well defined and stable, the implementation is non-trivial, and it is a critical component of every computer system and many embedded devices." In addition, by keeping the study's focus on a narrow, well-defined area, Reasoning was trying to avoid the apples and oranges problems of comparing full-scale applications and operating systems.
To put Linux's results in a broader context: in Reasoning's most recent analysis of 200 commercial projects totaling 35 million lines of source code, 33% of the programs had fewer than 0.36 D/KLSC, 33% had between 0.36 and 0.71, and the remaining third had more than 0.71. "Thus, the TCP/IP implementation in the Linux operating system ranks in the upper third, while the composite code quality of the commercial implementations ranks in the middle." In short, Linux's TCP/IP implementation is excellent, and the five commercial implementations, while not awful, are in the middle of the pack.
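As a rough sketch of that ranking, the snippet below sorts a defect density into one of the three bands using the 0.36 and 0.71 boundaries quoted above. The function name, the band labels, and the choice to count a value that lands exactly on a boundary as "middle third" are illustrative assumptions, not part of Reasoning's report.

```python
# Band boundaries (defects per thousand lines, D/KLSC) quoted from
# Reasoning's analysis of 200 commercial projects.
LOWER_BOUNDARY = 0.36
UPPER_BOUNDARY = 0.71


def quality_band(d_klsc: float) -> str:
    """Place a defect density in one of the three bands.

    Hypothetical helper: the labels are ours, and values exactly on a
    boundary are treated as "middle third" by assumption.
    """
    if d_klsc < 0:
        raise ValueError("defect density cannot be negative")
    if d_klsc < LOWER_BOUNDARY:
        return "best third"
    if d_klsc <= UPPER_BOUNDARY:
        return "middle third"
    return "worst third"


# The two figures reported in the study.
print("Linux 2.4.19 TCP/IP:", quality_band(0.10))
print("Proprietary average:", quality_band(0.55))
```

Run as written, this places Linux's 0.10 in the best third and the proprietary average of 0.55 in the middle third, matching the quotation.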
If 0.71 D/KLSC sounds good to you, you're not a programmer. For mature programs, it's downright lousy. And since Illuma is designed to seek out critical coding errors, such as memory leaks, NULL pointer dereferences, out-of-bounds array accesses, and uninitialized variables, these are programming mistakes that come back to haunt first users and then developers. They are the kind of foul-ups that can lead to program lock-ups and even system failures.
Now, Illuma is not perfect. For example, what appeared to be an uninitialized variable in Linux's TCP/IP stack turned out to be a variable that is assigned before use by a tiny built-in interpreter. Still, the problems it finds are ones that any developer worth his salt needs to investigate further, if for no other reason than to document questionable code that a human quality assurance programmer could mistake for an error. And, beyond that, the simple fact remains that the open source code had less than 20% of the proprietary code's average defect density (0.10 versus 0.55 D/KLSC).
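The "less than 20%" figure can be checked from the two densities reported earlier. The sketch below is only that arithmetic; the function and constant names are hypothetical. Note that it compares defect densities, not total defect counts, since the article does not give the size of each code base.

```python
# Defect densities (D/KLSC) reported in the study.
LINUX_D_KLSC = 0.10
PROPRIETARY_AVG_D_KLSC = 0.55


def relative_defect_density(candidate: float, baseline: float) -> float:
    """Return the candidate's density as a fraction of the baseline's."""
    if baseline <= 0:
        raise ValueError("baseline density must be positive")
    return candidate / baseline


share = relative_defect_density(LINUX_D_KLSC, PROPRIETARY_AVG_D_KLSC)
print(f"Linux density is {share:.1%} of the proprietary average")  # 18.2%
```

Put the other way around, the proprietary average is about five and a half times Linux's density.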
Why is that? Reasoning doesn't take a position, but simply lists the usual reasons given by open source advocates. For example, open source users don't just report bugs, but actually track them and fix them. And, with peer source code review, defects tend to be found quickly and only the best code survives. This, in turn, means that programmers will present only their best efforts since they know that the only way to rise to the top of the open source world is to deliver excellent code that can withstand public scrutiny.
The study itself has some problems. Due to confidentiality agreements, for example, we don't know which five proprietary TCP/IP stacks were reviewed.
Automated code testing has also been known to produce many false positives. Reasoning tries to limit them, and the time developers spend dealing with them, by including the location and circumstances of each problem in its error reports and by using statistical analysis and other tools to identify the riskiest parts of the code, so developers can focus their energies on what are potentially the most critical problems.
Still, when all is said and done, the bottom line is quite simple. Open source code is much cleaner than proprietary source code. What more need be said?