Breaking through bottlenecks

New system provides storage relief for high-performance Linux clusters

Government research and engineering shops are among the biggest users of powerful computer clusters that link inexpensive off-the-shelf PC servers running the open-source Linux operating system.

But one of the headaches with these supercomputer-class systems is storage. More precisely, it's finding an affordable, fast and easy way to pull data off disks and feed it quickly to the tens, hundreds or even thousands of microprocessors chugging away on their little piece of some gigantic computing job.

"There hasn't been a really good [input/output] solution for these open clusters at all," said Gary Grider, leader of the scalable input/output team with the High-Performance Computing Group at Los Alamos National Laboratory.

Officials at one start-up company think they have an answer to the problem, and Los Alamos officials seem to agree. Panasas Inc., based in Fremont, Calif., last month introduced its first product: the Panasas ActiveScale Storage Cluster. The network-attached storage (NAS) system uses a new object-based storage architecture that company officials say enables the system to support increasingly large Linux clusters without suffering the throughput bottlenecks that plague regular storage systems.

"Panasas is unique in its object-based storage system," said Arun Taneja, founder, president and consulting analyst at the Taneja Group. "It is quite drastically different than traditional storage systems. It's what helps them achieve the performance they get."

Los Alamos recently bought and installed 120 terabytes of Panasas' storage. The deal also includes an option to buy up to 500 terabytes more in fiscal 2004. Grider said the storage has met officials' expectation of delivering 1 gigabyte/sec of throughput for every teraflop of computing power on the Linux cluster.

For now, the laboratory has the Panasas storage system connected to three Linux systems: a 256-node cluster, a 1,024-node cluster and a 1,408-node cluster. All three are used in the lab's weapons simulation tests, and each node consists of a two-processor server.

Other options for providing storage to a high-performance cluster fall short on cost or scalability, according to Grider.

Proprietary computing clusters such as those from Hewlett-Packard Co., IBM Corp. and Silicon Graphics Inc. have integrated storage and can deliver the performance needed, but they are also far more expensive than Linux clusters running on commercial equipment, he said. Another option is a general-purpose storage-area network, but such networks are also expensive and can scale only to support "hundreds of processors, not thousands," he said.

Grider said the lab is paying about 2 cents per megabyte for Panasas' storage, a price he said is "pretty comparable to [NAS] in general."
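At that rate, the lab's 120-terabyte purchase pencils out to roughly $2.4 million. A quick back-of-the-envelope check, assuming decimal units (1 terabyte = 1 million megabytes):

```python
# Back-of-the-envelope check of the quoted price; decimal units assumed.
terabytes = 120
megabytes_per_terabyte = 1_000_000
price_per_megabyte = 0.02          # "about 2 cents per megabyte"
total = terabytes * megabytes_per_terabyte * price_per_megabyte
print(f"${total:,.0f}")            # -> $2,400,000
```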

Although the price is in the same neighborhood, the performance is not, he said. "The real magic is being able to scale to multiple gigabytes of throughput with multiple processors accessing a single file," Grider said. "Other companies that bid for this contract can scale multiple processors to multiple files, but not multiple processors to one file."
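The pattern Grider describes amounts to many processes each reading or writing its own byte range of one shared file. The sketch below is a minimal illustration of that access pattern using ordinary POSIX-style seeks in Python; it is not Panasas' client software, and the file name, stripe size and worker count are arbitrary choices for the example.

```python
# Minimal sketch: many workers writing disjoint byte ranges of ONE shared file.
import os
from multiprocessing import Pool

FILE = "shared.dat"
STRIPE = 1024 * 1024  # 1 MB per worker, an arbitrary size for this example

def write_stripe(rank: int) -> None:
    """Each worker seeks to its own offset and writes its stripe,
    so no two workers ever touch the same bytes of the file."""
    with open(FILE, "r+b") as f:
        f.seek(rank * STRIPE)
        f.write(bytes([rank % 256]) * STRIPE)

if __name__ == "__main__":
    workers = 8
    # Pre-size the file so every offset exists before the workers seek to it.
    with open(FILE, "wb") as f:
        f.truncate(workers * STRIPE)
    with Pool(workers) as pool:
        pool.map(write_stripe, range(workers))
```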

Pulling this off is a function of the Panasas system's clustered storage hardware and object-based architecture, according to Paul Gottsegen, the company's vice president of marketing.

The Panasas system takes a file and breaks it into segments, which are then organized into virtual containers called objects. Those objects contain the original file data as well as metadata, attributes describing the data that the Panasas "smart" drives use to decide how to handle it, according to Gottsegen.
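One rough way to picture that object model is a container pairing a segment of file data with descriptive attributes. The Python sketch below is illustrative only; the class name, segment size and metadata fields are assumptions for the example, not Panasas' actual object format.

```python
# Illustrative sketch of file-to-object decomposition; all names and
# sizes here are hypothetical, not Panasas' on-disk format.
from dataclasses import dataclass, field

SEGMENT_SIZE = 64 * 1024  # assumed segment size for this example

@dataclass
class StorageObject:
    data: bytes                                # a segment of the original file
    meta: dict = field(default_factory=dict)   # attributes describing the data

def file_to_objects(path: str) -> list[StorageObject]:
    """Break a file into fixed-size segments and wrap each one in an
    object that carries metadata a storage node could act on."""
    objects = []
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(SEGMENT_SIZE):
            objects.append(StorageObject(
                data=chunk,
                meta={
                    "source_file": path,
                    "segment_index": index,    # position within the file
                    "length": len(chunk),
                },
            ))
            index += 1
    return objects
```

Because each object carries its own attributes, a storage node can make placement and scheduling decisions locally instead of deferring everything to a central file server, which is the intelligence Gottsegen describes below.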

"This gives the system the intelligence to allow the individual [storage] blades to manage a lot of the workflow that's normally associated with a centralized [file server]," he said.

The conventional approach to storage is to use sectors that have no knowledge about the data they hold or the relationships between bits of information, which makes "the storage unable to do any kind of planning based on the way the data is going to be used," said Garth Gibson, Panasas' co-founder and chief technology officer.

Taneja pointed out that for the Panasas system to deliver its top throughput speeds, client software must be loaded onto the Linux machines. He said that is not a big deal for the technical and scientific computing shops that Panasas is initially targeting, where information technology managers are used to doing whatever it takes to get the performance they need. It could be an issue, however, at more mainstream IT shops if and when the company tries to tap into a wider market.