Sandia to develop light version of Linux

Problems with Linux prompted Sandia to develop Linux ultralight

Although William Camp, director of Computation, Computers and Mathematics at Sandia National Laboratories, is a strong believer in Linux clusters for high-performance computing, he sees problems with the current version of Linux as an operating system for this purpose. To solve some of those problems, Sandia is developing Linux ultralight.

The problems Linux ultralight will tackle include:

* Linux was not designed for high-performance computing. In fact, Camp said Linux has many characteristics aimed at commercial environments that "are antithetical to the goals and needs of high-performance computing." He said operating systems for traditional high-performance computing systems are made as simple and single-purpose as feasible for efficiency and reliability. For example, the Cougar lightweight kernel OS that Sandia developed for the ASCI Red supercomputer was designed to do very little other than load applications, manage communications and detect problems. He estimates that it has about 1 percent or less of the lines of code in Linux.

* Linux is a moving target, and its motions are not necessarily predictable or favorable to the goals of the high-performance computing niche. Camp said Sandia invested a huge effort in building software around the Linux virtual memory code. In a subsequent release of Linux, the design of that code changed radically, requiring the lab to start over.

* Linux does things you may not like. Camp said that occasionally, "Linux will decide to wake up and launch background process management." In technical computing, the application should own the node, he said, and node management should be left to the application, not the OS. "Linux is trying to manage non-wanted, non-needed and sometimes even non-existent processes, and it takes a long time to do it," he said.

In an attempt to solve those and other problems, Sandia is in the process of porting its Cougar lightweight kernel OS to Linux. That OS will have one major use: routing messages among nodes in the cluster.

The benefits of a lightweight version of Linux have not been proven. "Not everyone in the Linux community agrees with us on this, so you will have to term the project as somewhat of an experiment," Camp said.
