Scientists discover supercomputer alternatives
Many bioinformatics applications need high-powered supercomputer resources to which government and private research organizations have limited access, if any.
Grid computing harnesses the power of multiple processors to fill that void. Grids can make use of small, inexpensive servers that are already installed or extend networked clusters of processors across several divisions or organizations.
Platform Computing Inc., a Canadian company with offices worldwide, including in the United States, supplied grid-computing expertise to the Human Genome Project, an international consortium led by the National Institutes of Health and one of two efforts in the 1990s to map the human genome. The company also helped Celera Genomics Group, a private firm in Rockville, Md., that was pursuing the same goal, said Rob Rick, director of life sciences at Platform.
"We've gone through that era, done a lot of [gene] sequencing, and databases are populated now," he said. "Now we've got this data, so now the question is how can we take this data and turn it into knowledge? That takes much more computational power."
Applications have to be created specifically to run in a parallel mode to make use of grid configurations. A New Haven, Conn., company called TurboWorx Inc. got its start in 2001 with TurboBLAST, a version of the BLAST genomic search software developed by NIH's National Center for Biotechnology Information. BLAST takes the gene or protein a researcher is interested in and compares it to databases of known genes and proteins, looking for matches. TurboBLAST is adapted to run on clustered processors.
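To illustrate what running a search "in a parallel mode" can mean, here is a minimal Python sketch of one common approach: split the database into chunks, have each worker compare the query against its own chunk, then merge the hits. It is not how TurboBLAST or BLAST actually works. The function names, the shared k-mer score, and the use of threads as stand-ins for cluster nodes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score(query, subject, k=3):
    """Toy similarity: count of k-mers both sequences share (not BLAST scoring)."""
    return len(kmers(query, k) & kmers(subject, k))


def search_chunk(query, chunk):
    """Score the query against every (name, sequence) record in one chunk."""
    return [(name, score(query, seq)) for name, seq in chunk]


def split(records, n_chunks):
    """Deal the records into n_chunks roughly equal slices."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def parallel_search(query, records, n_workers=4):
    """Search each chunk on its own worker, then merge and rank the hits.

    Threads stand in for the separate processors of a cluster (assumption).
    """
    chunks = split(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda c: search_chunk(query, c), chunks))
    hits = [hit for part in partial for hit in part]
    # Best score first; ties broken by name so the order is deterministic.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Hypothetical three-record "database" and query, for illustration only.
records = [("a", "ACGTACGT"), ("b", "TTTTTTTT"), ("c", "ACGTTTTT")]
results = parallel_search("ACGTAC", records, n_workers=2)
```

Because each chunk is scored independently, the work can be spread over as many processors as there are chunks. Only the final merge has to happen in one place.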
"As people have gone on, they've increased the amount of computing they want done," said Andrew Sherman, vice president of operations at TurboWorx. "People have started to focus on clusters and clusters of clusters. You don't see many people being successful building desktop environments for bioinformatics."
As biotechnology and pharmaceutical research progresses, the amount of information grows at ever-faster rates, Rick said. In the 1990s, scientists focused on the genes embedded in DNA, a straightforward chain of chemicals that could be expressed as a simple sequence of letters. Now more emphasis falls on the proteins those genes cause cells to create, which are more complex, 3-D structures.
"They talk about the proteomic phase as being 10 to 100 times more [data]," Rick said. "What I see is there's no 'killer application' to take advantage of this. A lot of companies are working on this."