Giving XML more muscle

New XML appliances offer to take a load off overworked agency servers

Virtually every agency's information technology department is a den of pack rats, collecting data by the ton. Still, as pack rats go, they would make Martha Stewart proud of how well they can store and retrieve that data.

But then the Homeland Security Department was created and with it the need to integrate departments, data and a crop of cross-agency e-government initiatives. Storing and retrieving data in-house was manageable, but now agencies have to unlock data from proprietary systems and share it with one another.

This challenge arose as a new data-sharing solution started coming into its own. Extensible Markup Language allows disparate systems to exchange data by tagging it in a common yet highly flexible format. XML has become a powerful workhorse at many agencies, but it also has a downside.

"XML is big, bulky and verbose," said Tom Rhinelander, an analyst at New Rowley Group Inc., a technology market research and analysis firm. XML processing can consume network resources, causing "a slow but steady decline" in network performance, he said.

Coming to the rescue is a cavalry of dedicated XML-accelerator appliances and chips. These solutions — a few are ready for purchase, but most are still in testing — perform one or more essential XML-handling functions, usually data translation and sometimes security.

New challenge

Owen Ambur, chief XML strategist at the Interior Department, said the large volume of XML-based documents presents a "new and growing challenge, which is taking some IT professionals by surprise."

The same data transmitted without XML uses much less bandwidth, but XML provides critical context for the applications that process the data. "Data without context is meaningless, as well as subject to misunderstanding and confusion," Ambur said.
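The trade-off Ambur describes can be seen in a small sketch. The Python below serializes the same records twice, once as bare comma-separated values and once as XML. The records, the field names and the `payments`/`payment` tags are invented for illustration and do not come from any agency system, and the byte counts depend entirely on the data chosen, so they say nothing about the overhead percentages analysts quote.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records, invented purely for illustration.
records = [
    {"id": "1001", "name": "Jane Doe", "amount": "250.00"},
    {"id": "1002", "name": "John Roe", "amount": "75.50"},
]


def to_csv(rows):
    """Serialize rows as bare comma-separated values, with no field names."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow(row.values())
    return buf.getvalue().encode("utf-8")


def to_xml(rows):
    """Serialize the same rows with every value wrapped in a named tag."""
    root = ET.Element("payments")
    for row in rows:
        item = ET.SubElement(root, "payment")
        for field, value in row.items():
            ET.SubElement(item, field).text = value
    return ET.tostring(root, encoding="utf-8")


csv_bytes = to_csv(records)
xml_bytes = to_xml(records)

print(len(csv_bytes), "bytes without tags")
print(len(xml_bytes), "bytes as XML")
```

The XML version is larger because each value carries its own label, but a receiving application can tell an amount from an ID without any prior agreement on column order. That self-description is the context Ambur refers to.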

Analysts estimate that XML bloats data by about 40 percent to 50 percent, compared to marking up the same data in HTML, which is used to present data in a Web browser so users can view it.

Ambur said Interior officials are studying solutions to the XML overhead problem, adding that they see the wisdom in the accelerator approach. "It would seem to make far more sense to apply hardware-based processing," he said.

He said graphics processing has migrated to hardware because specialized processors can better complete the task than software on a general-purpose processor. Ambur thinks XML processing may follow a similar route.

Vendors such as DataPower Technology Inc., Forum Systems Inc. and Sarvega Inc., soon to be joined by Conformative Systems Inc., provide turnkey hardware appliances that speed up the transformation of data from a proprietary format to XML and handle message routing. Many appliances, which include the necessary software, can also perform security tasks such as XML-schema validation, encryption and signature, all of which require extensive processing.

Another approach to the problem is to house the XML processing engine on a chip provided on a PCI card. Manufacturers can build the chip into network switches, routers, storage devices, servers or even PCs. DataPower began shipping chips on PCI cards in May, while Conformative and Intel Corp.'s spinoff Tarari Inc. expect to begin shipping such chips later this year.

Appliances on the market range in price from about $30,000 to $70,000. Vendors say the tools can increase XML processing speeds by 10 to 20 times compared to running the applications only on general-purpose servers that also handle other workloads. The actual effect on a given application can only be determined through testing, they say.

Ambur, who is also co-chairman of the CIO Council's XML Working Group, said he does not know of any agencies buying XML accelerators yet. But he said that as agencies make more extensive use of the language, more officials will be forced to consider alternatives to general-purpose servers for XML processing.

Government organizations that need both XML transformation and security functions will likely adopt accelerators first because the combination can add a heavy load to processing overhead. For example, last July, the Massachusetts Department of Revenue released the first phase of its WebFile service, a Web-based application that allows taxpayers to file tax returns and wage reports, make payments, review status and account history, and maintain accounts. The site relies on various Web services standards, including heavy use of XML.

Building and maintaining the application was a massive programming effort. "We didn't want to give our programmers the additional burden of configuring and maintaining security, which had to be bulletproof," said Jim Sheehy, security specialist at the department.

To simplify the security function, the agency purchased DataPower's XS40 XML Security Gateway, which handles the XML security rules and the translation of XML-based Simple Object Access Protocol messages arriving from the taxpayers' Internet service providers.

Sheehy said the immediate need to contain security administration costs, not performance worries, was the primary reason for buying the product. However, as more taxpayers use WebFile, performance will become an important issue.

"We expect [that] the XML appliance should help dodge any performance hits," he said.

Different architectures

Because Massachusetts revenue department officials use the appliance as a security gateway, Sheehy has configured it to operate in proxy mode. In this setting, the most typical architecture for the appliances, the accelerator receives all incoming traffic and handles only the XML security tasks. It passes everything else to the general-purpose server.

One of the advantages of using the appliance in proxy mode is that it does not require new software or changes to the Web applications.

For officials who want more control of which processes are handled by the appliance vs. the server, another option is to set the appliance as an application co-processor. In this mode, the application server receives all incoming traffic and selectively sends requests for processing to the appliance.
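The difference between the two modes comes down to which box sees the traffic first and which one decides what gets offloaded. The Python sketch below models only that decision. It is not any vendor's configuration or API: the function names, the step labels and the crude `is_xml` check are all assumptions made for illustration, as is the simplification that the server still runs the application logic after the appliance finishes its XML work.

```python
from typing import Callable, List


def is_xml(message: str) -> bool:
    """Crude, hypothetical stand-in for content detection."""
    return message.lstrip().startswith("<")


def proxy_mode(message: str) -> List[str]:
    """Appliance sits in front: it sees everything, touches only XML."""
    steps = ["appliance: receive"]
    if is_xml(message):
        steps.append("appliance: XML security checks")
    # Everything else, and the application work itself, goes to the server.
    steps.append("server: application logic")
    return steps


def coprocessor_mode(message: str,
                     should_offload: Callable[[str], bool]) -> List[str]:
    """Server sits in front: it chooses which requests to hand off."""
    steps = ["server: receive"]
    if is_xml(message) and should_offload(message):
        steps.append("appliance: XML processing")
    steps.append("server: application logic")
    return steps


print(proxy_mode("<order/>"))
print(proxy_mode("GET /index.html"))
# Hypothetical policy: offload only messages longer than 20 characters.
print(coprocessor_mode("<order/>", lambda m: len(m) > 20))
```

In the proxy sketch the application code never needs to know the appliance exists, which matches the point about not changing Web applications. In the co-processor sketch the `should_offload` policy lives in the application server, so someone has to write and maintain it.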

Before XML appliances appeared, the only way to speed XML processing was to add more general-purpose servers to the mix. Some experts question whether this option will be feasible as XML transactions increase during the next several years.

"Many agencies will hit the wall with XML," said Barry Schaeffer, president of X.Systems Inc., a systems integrator and Sarvega partner that has several federal agency customers. Eventually, XML processing may take up as much as 50 percent of server processing power, he said.

By contrast, a single XML appliance costs about $30,000, about the same as a midrange general-purpose server, and it can take on XML processing from many servers with little additional administrative overhead, Schaeffer said.
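Schaeffer's argument is back-of-the-envelope arithmetic, and it can be written out as a short Python sketch. The server count below is hypothetical. The 50 percent share is the upper-end estimate quoted above, not a measurement, and the sketch assumes one appliance could absorb the whole XML load, which vendors say can only be confirmed through testing.

```python
def server_equivalents_freed(num_servers: int, xml_share: float) -> float:
    """Capacity, in whole-server units, currently spent on XML processing."""
    if not 0.0 <= xml_share <= 1.0:
        raise ValueError("xml_share must be between 0 and 1")
    return num_servers * xml_share


NUM_SERVERS = 8         # hypothetical server farm size
XML_SHARE = 0.5         # upper-end estimate quoted in the article
PRICE_PER_BOX = 30_000  # approximate price of an appliance or midrange server

freed = server_equivalents_freed(NUM_SERVERS, XML_SHARE)
print(f"Server capacity tied up in XML: {freed:.1f} servers")
print(f"Equivalent hardware cost: ${freed * PRICE_PER_BOX:,.0f}")
print(f"Cost of one appliance:    ${PRICE_PER_BOX:,.0f}")
```

Under these assumed numbers, the capacity spent on XML is worth several servers while the appliance costs about as much as one. With a smaller farm or a lower XML share the gap narrows quickly.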

In addition, compared to general-purpose servers, the appliances may be more reliable and easier to administer because they have no disks and no moving parts other than fans. They do not need backup or replication software, and they contain no application software that would have to be licensed.

Knowing when it makes sense to offload XML processing to an appliance is not an exact science. Wes Swenson, chief executive officer of Forum Systems, said that volume should probably exceed 25 transactions per second. He warned that at least some of those transactions may be less than obvious. "I think most network managers would be shocked at the volume of XML moving in and out of their networks, in applications, e-mail and messaging," Swenson said.

Still, for those agencies without a pressing need, there are some advantages to putting off buying XML appliances. For one, even though a turnkey appliance reduces the need for trained people to administer it, a learning curve is always associated with unfamiliar products, which could add to the cost.

"In general, it's best to go with what you know," Swenson said. Data centers are familiar with Intel Corp. and Advanced Micro Devices Inc., which make chips for most general-purpose PC servers. "If you can stick with general processes and possibly add software-based XML acceleration, you might be better off, at least in the near term."

Second, a general-purpose server is more flexible. If you use only half of the capacity of an XML device, the remainder is wasted. If you purchase another general-purpose server to handle XML processing, whatever capacity is not needed for XML can be used for other jobs.

Finally, although XML is close to maturity, it is not yet there. "XML and Web services are changing rapidly, which includes the applications, the standards, use cases, new vulnerabilities, and, of course, this all affects the operations to be performed on XML," Swenson said.

There's no doubt that XML-accelerator appliances in many cases represent a cost-effective and relatively easy approach to reducing XML bottlenecks. Even though the technology is still young, an XML appliance would be a good investment for agencies that need the XML horsepower. Agencies with less XML traffic may do well to research the products and postpone buying for six months to a year.

Stevens is a freelance journalist who has written about information technology since 1982.