A virtual conundrum
- By John Moore
- Mar 10, 2002
Sir Winston Churchill once described Russia as a "riddle, wrapped in a mystery, inside an enigma."
Today, one could apply that label to the nebulous area of storage virtualization — specifically, virtualization in a storage-area network (SAN) environment. Virtualization, in various guises, has been around since mainframes ruled, but virtualization across a heterogeneous mix of storage devices and servers in a distributed SAN is a recent twist, and a puzzling one at that.
The basic premise of virtualization is simple enough: create a single pool of storage from multiple devices that can be allocated to servers as needed.
Vendors — typically small companies backed by venture capital — hawking virtualization wares promise to do that today. But there is a catch. The various product makers offer different approaches to virtualization, each claiming to provide the one true path. The noise around virtualization is such that some vendors are now reluctant to use the term.
Here's another source of uncertainty: The storage players with the size and clout to establish de facto standards in virtualization have yet to fully address the subject, according to industry watchers (see "SAN uncertainty"). In the meantime, federal users are dealing with the muddle.
"Most of the products are just coming out and don't have a huge following yet," said Curt Tilmes, a systems engineer at NASA's Goddard Space Flight Center. Tilmes has put a few virtualization products through their paces at a SAN test bed at Goddard. His verdict? "It's too new to know what works and what doesn't yet."
But industry experts insist there is value in virtualization. The approach, they say, will improve disk utilization, simplify SAN management and make it possible to mirror data from expensive storage devices onto cheaper disks.
So should federal customers wait on the sidelines or take the virtualization plunge? Tony Prigmore, a senior analyst with Enterprise Storage Group Inc., urges the latter response. "You have to have a starting point, and now is a good time to be starting," he said. "Technology is available from multiple vendors, and it is all supportable. Start experimenting with the benefits now in a controlled way."
A key aim of virtualization is to address the access limitations of direct-attached storage. Traditionally, information technology managers have purchased a storage device for each server. But binding a storage device to a server prevents other servers from tapping unused capacity on that device.
Virtualization, in tandem with the any-server-to-any-storage-device nature of a SAN, puts all the storage into a pool that can be allocated to servers as they need it, said Dan Watt, senior product marketing manager at Compaq Computer Corp. Improved storage utilization means organizations can reduce their dependence on acquiring additional storage devices.
Another objective of virtualization is to improve the management of networked storage. Dan Tanner, a senior analyst with the Aberdeen Group, said storage management took a "giant step backward with SAN" because the simplicity of a server managing the storage directly attached to it was lost.
Virtualization vendors, on the other hand, claim they are bringing intelligence to SAN to address management issues.
Although the goals of virtualization are straightforward, getting to the promised land of improved storage efficiency is less so. Two vendor camps have formed around network-based virtualization for SANs: in-band vs. out-of-band, or symmetric vs. asymmetric.
The role of any virtualization product is to translate or map logical storage addresses (a file organization scheme that systems administrators create in software) to physical storage addresses (the place where data actually resides on a disk). In-band (or symmetric) solutions typically accomplish this through a specialized virtualization appliance that resides in the data path between storage devices and servers making input/output requests. The metadata regarding the physical location of data and the data itself ride the same path.
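The mapping described above can be pictured as a simple table from logical addresses to physical ones; in an in-band design, every I/O request passes through the appliance that performs this lookup before reaching a disk. A minimal sketch, with all device names and extent sizes hypothetical:

```python
# Hypothetical sketch of in-band virtualization: the appliance sits in the
# data path and translates every logical block address before forwarding I/O.

# Mapping table built by the administrator's pooling software:
# logical extent start -> (physical device, starting physical block)
MAPPING = {
    0: ("array_a", 5000),    # logical blocks 0-999 live on array_a
    1000: ("array_b", 0),    # logical blocks 1000-1999 live on array_b
}
EXTENT_SIZE = 1000

def to_physical(logical_block):
    """Translate a logical block address to (device, physical block)."""
    extent_start = (logical_block // EXTENT_SIZE) * EXTENT_SIZE
    device, base = MAPPING[extent_start]
    return device, base + (logical_block - extent_start)

# In-band: the appliance performs this lookup for every request,
# then issues the I/O to the backing device itself.
print(to_physical(1042))   # ('array_b', 42)
```

Because both the metadata lookup and the data transfer flow through the same box, the appliance sees (and must handle) every byte of traffic — the root of the reliability and scalability objections discussed below.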
In contrast, the out-of-band (or asymmetric) virtualization method provides a separate track for the metadata, via an appliance that sits outside the storage-to-host data path. Paradoxically, some elements of out-of-band aren't out-of-band. The method places virtualization software on each server in the I/O path. Thus equipped, the servers receive the metadata from the appliance and then issue I/O requests directly to the storage device.
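That division of labor can be sketched as follows: host-side virtualization software queries the metadata appliance for a mapping (the out-of-band leg), then issues the actual I/O straight to the storage device. Again, all class and device names here are hypothetical illustrations:

```python
# Hypothetical sketch of out-of-band virtualization: metadata and data
# travel separate paths. The appliance only answers mapping queries;
# the host-side agent issues I/O directly to storage.

class MetadataAppliance:
    """Sits outside the data path; serves logical-to-physical mappings."""
    def __init__(self, mapping, extent_size):
        self.mapping = mapping
        self.extent_size = extent_size

    def lookup(self, logical_block):
        start = (logical_block // self.extent_size) * self.extent_size
        device, base = self.mapping[start]
        return device, base + (logical_block - start)

class HostAgent:
    """Virtualization software installed on each server in the I/O path."""
    def __init__(self, appliance):
        self.appliance = appliance
        self.cache = {}  # cache mappings to avoid repeated metadata trips

    def read(self, logical_block):
        if logical_block not in self.cache:
            self.cache[logical_block] = self.appliance.lookup(logical_block)
        device, block = self.cache[logical_block]
        # Data path: the host talks to the storage device directly,
        # bypassing the appliance entirely.
        return f"read {device}:{block}"

appliance = MetadataAppliance({0: ("array_a", 5000)}, extent_size=1000)
agent = HostAgent(appliance)
print(agent.read(7))   # read array_a:5007
```

The trade-off is visible in the sketch: the appliance drops out of the data path, but every server must now run (and administrators must maintain) the agent software.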
Each approach has advantages and disadvantages. In-band supporters say their method centralizes virtualization, with one appliance supporting multiple servers and storage devices. They say the approach is less complicated than out-of-band and, therefore, easier and less expensive to maintain. Moreover, they contend that IT departments are more comfortable confining virtualization software to a specialized box, rather than piling it on the servers.
"The asymmetrical approach requires that people put software on every single host computer in the network," said Robert Woolery, vice president of corporate development at appliance vendor DataDirect Networks. "But customers don't want to mess with the [application] server."
The arguments against the in-band approach boil down to reliability and scalability. Having a box directly in the data path introduces a single point of failure, out-of-band backers contend. And as the number of devices and servers to be supported increases, so too must the number of appliances — an inelegant and expensive proposition, they say.
Out-of-band, however, keeps the appliance out of the data path. "With out-of-band, you don't have to worry about the [virtualization] server breaking or the server needing to be scaled," said Tom Isakovich, president and chief executive officer of TrueSAN Networks Inc.
Scalability is an issue at Goddard, according to Tilmes. "I'm concerned [about] how it scales," Tilmes said. "They say, 'Just keep adding more and more boxes.' " Although that tactic might be technically feasible, "the price is not cheap," he said.
Some industry observers believe customers will deploy a mix of in-band and out-of-band solutions, gaining the advantages of both. "In the long term, you're going to see a mix of both because they both have strengths and weaknesses," Woolery said.
The division of labor could have out-of-band appliances handling storage resource management, while in-band appliances take on mirroring and replication tasks. But it is anybody's guess whether in-band, out-of-band or some hybridized form will win out.
"It's just too early to call," said Enterprise Storage Group's Prigmore.
In the meantime, the lack of standards is among the key missing links in SAN virtualization.
"Virtualization is great provided there are tools to use it," said Aberdeen's Tanner. "Right now, there are no standards, and each vendor uses different tools." As a consequence, users face multiple interfaces, and programmers deal with multiple application program interfaces, he said.
The standards gap limits the very interoperability vendors are trying to advance. An organization that commits to a virtualization vendor implicitly commits to using only that vendor's storage services, such as backup and remote mirroring, if it wishes to keep management tasks simple, Tanner noted in a 2001 report. Essentially, organizations face the choice of standardizing on one vendor's solution for virtualization and storage services or using a mix of services from the virtualization vendor and the storage vendor. The latter scenario, of course, compels storage managers to maintain multiple tools.
The Storage Networking Industry Association is working on virtualization definitions and a "family tree" comparing and contrasting vendors' varied methods, said Wayne Rickard, chairman of SNIA's technical council.
He believes that SNIA members will begin to submit proposed virtualization standards this year. If the council gives its blessing, the standards-making process can take anywhere from three months to years — depending on the technical complexity and degree of industry consensus involved, he said.
File systems can also be an obstacle, according to Milton Clauser, a principal member of the technical staff at Sandia National Laboratories. The lab uses a mix of SGI machines and Linux clusters. SGI's CXFS shared file system allows several machines to access the same files, but the Linux platform lacks a similar capability, and virtualization is not designed to bridge that difference.
As virtualization matures, industry executives advise customers to take their time. "This is a new technology, and you might want to start out small," said Spencer Sells, manager of product marketing at switch vendor Gadzoox Networks. "Walk before you run."
Moore is a freelance writer based in Chantilly, Va.
SAN uncertainty
Architectures and standards for storage-area network virtualization have yet to solidify. How and when that will happen depends on greater participation by the top storage players, experts say.
Tony Prigmore, a senior analyst with Enterprise Storage Group Inc., said determining a clear winner in virtualization will be difficult until IBM Corp., Compaq Computer Corp. and other large vendors hit the market. "The big guys are just starting to get ready with virtualization," he said, noting that Compaq and IBM are gearing up their VersaStor and StorageTank products, respectively.
For storage giant EMC Corp., virtualization is "one little subset of what we see as a broader challenge in storage management," said Ken Steinhardt, director of technology analysis at EMC. The company addresses virtualization with its Automated Information Storage management strategy and WideSky storage management middleware. Storage Technology Corp., meanwhile, handles tape virtualization with its StorageNet 6000 series and plans to add the capability for disk storage.
Prigmore said the top vendors will be in "mass market education mode" by the end of this year. Wider scale deployment of SAN virtualization presumably will follow.