The need for speed

SGI boosts application performance with reconfigurable gate arrays

Silicon Graphics Inc. has released a new rackmounted unit that can increase the speed of some server-based applications by orders of magnitude. SGI has done it using an old technology in a new way.

Using field-programmable gate arrays (FPGAs), hardware devices that act as dedicated processors for specific routines, SGI's Reconfigurable Application-Specific Computing (RASC) technology can accelerate those routines by a factor of 100 or more in some cases. The unit can be easily added to SGI's Intel Itanium 2-based servers and visualization systems.

The "reconfigurable" part of the name refers to programmers' ability to reprogram the FPGAs to work with a desired application. That sets SGI's technology apart from other performance-acceleration products that are intended for a specific application, said Ron Renwick, product manager of reconfigurable computing at SGI.

In addition, users can develop FPGA modules for specific tasks and share them with others.

"FPGAs have been around for decades, but they haven't had the robustness to do high-performance computing," Renwick said. FPGAs have been widely used in embedded devices, he said.

Processors in high-performance computing environments are becoming bottlenecks, he said. Moore's Law, commonly paraphrased as predicting a doubling of processing power every 18 months, is still mostly in effect, but the speed increases do not always extend to the execution of applications.

SGI isn't the only company using FPGAs for high-performance computing, said Jonathan Eunice, president and principal analyst at Illuminata. Cray and other SGI rivals are also using the technology. Cray's XD1 supercomputer cluster, for example, features FPGAs that can be reprogrammed in milliseconds, according to an Illuminata report.

Like SGI's product, Cray's technology makes FPGAs part of the network fabric so that accessing them takes less time than would be necessary using a PCI bus.

That speed advantage is one of the main points Renwick touts for SGI's RASC. Its use of the company's NUMAlink interconnect technology, he argues, sets RASC above acceleration technologies that depend on a PCI bus connection to transmit data.

"PCI bandwidth is no match for NUMAlink," he said. "It's also not a seamless match."

RASC, connected via NUMAlink, does not require programmers to write specific calls within the code to include the accelerator technology, he said. It is simply another part of the overall computing environment available for applications to use.

FPGAs are "not very smart by the standards of a modern microprocessor, but they have the virtue that they can be adjusted and tuned for very specific functions," Eunice said. "They run at hardware, not software, speeds. For the tight loop of key computations, the extra effort required to program specialized hardware assists can be worthwhile."

RASC is intended for data-intensive applications that run a core set of computing routines. On general-purpose processors, such routines are limited by the processors' design and must share processing capacity with other demands on a system's computing cycles. FPGAs act as specialized compute engines for those routines, giving them a place to run in isolation and freeing the general-purpose processors from that load.
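How much of a routine-level gain reaches the whole application depends on how much of the runtime the offloaded routines account for. The sketch below uses Amdahl's law, a general model that SGI does not cite, to estimate that effect. The 100-fold routine speedup is the figure quoted in this article; the runtime fractions and the function name overall_speedup are hypothetical, chosen only for illustration.

```python
def overall_speedup(offloaded_fraction: float, routine_speedup: float) -> float:
    """Estimate whole-application speedup with Amdahl's law.

    offloaded_fraction: share of the original runtime spent in the routines
        moved to the accelerator (hypothetical input, between 0 and 1).
    routine_speedup: how many times faster those routines run once offloaded.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if routine_speedup <= 0.0:
        raise ValueError("routine_speedup must be positive")

    # Time left on the general-purpose processors, as a share of the original run.
    remaining = 1.0 - offloaded_fraction
    # Time the accelerated routines still take after the speedup.
    accelerated = offloaded_fraction / routine_speedup
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Illustrative fractions only; the article does not report these.
    for fraction in (0.50, 0.90, 0.99):
        gain = overall_speedup(fraction, 100.0)
        print(f"{fraction:.0%} of runtime offloaded: about {gain:.1f}x overall")
```

Under this model, an application approaches the full routine-level gain only when nearly all of its runtime sits in the accelerated routines, which is consistent with RASC being aimed at programs dominated by a small set of core computations.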

The innovation SGI is trying to introduce is ease of programmability. Historically, FPGAs have been difficult to program and demanded a high level of expertise, limiting their usefulness. Using programming tools from partner companies, SGI is trying to make reprogramming FPGAs as easy as writing code in C, a programming language that many application developers use daily.

To that end, SGI's RASC toolset includes an FPGA-aware version of GNU Debugger and a RASC application programming interface and core services library.

Although the RASC unit is available now, Renwick said, more work will be done to make programming easier.

"For this to get global acceptance, we need tools, we need an ecosystem so that this method of programming is like C," he said. "A couple of years down the road, it's going to be seamless. For this to be a standard part of anybody's portfolio, it's got to be seamless. Today it's not, but there are some toolmakers out there providing things to make it more so."

One of those toolmakers is Nallatech, based in Glasgow, Scotland. It is working in partnership with SGI on the programming tools.

"We've been successful with this technology in the embedded world," said Allan Cantle, president and chief executive officer of Nallatech. "We saw the high-performance computing industry as an industry that could benefit from the technology."

The two companies began talking in 2004 and worked out their partnership in about 15 months, Cantle said. Nallatech already has a presence in the U.S. government, primarily in government labs, he noted. But its work with SGI could lead to expanded business.

"Nallatech's quite a small company, and if [a customer agency] wanted to scale up to a really big system, they probably wouldn't trust us as a small company with such a big contract," he said. The two companies have a strategic agreement to develop new business opportunities in the high-performance market, both in government and industry.

So far, he added, the partnership has not brought significant new business opportunities. But the companies are pursuing a new market that is only beginning to become aware of the possibilities, Cantle said.

"I guess we're in the early entry stage of adoption of FPGA in the HPC space, so it's starting off small," he said.

RASC also represents SGI's first major milestone toward a goal the company calls "multiparadigm computing." As described in SGI documents, multiparadigm computing enables a single system architecture to serve a wide variety of applications. Using its NUMAflex shared-memory architecture to unite different processing architectures, the company is aiming to create supercomputers capable of supporting a combination of computational approaches.

RASC allows organizations to continue to use existing systems, making it a crucial part of a multiparadigm computing initiative.

In practice, adding the RASC unit significantly boosts the performance of applications running on SGI's Altix servers or Prism visualization systems.

Renwick said the smallest performance gain SGI measured in tests was a 42-fold increase, and many applications ran at more than 100 times their usual speed.