Solving virtualization's storage problem

Sometimes when you fix one problem, you create a new one. That is often the case when you turn to server or desktop virtualization to reduce hardware, power and administrative costs only to end up with a host of unexpected new headaches — and expenses — due to a now over-burdened data storage infrastructure.

When you virtualize 10 physical servers or 100 desktop computers and run them all as software on a single big server, the storage system connected to that server might not be able to support the unique demands of the new workload, particularly as the density of virtual machines increases.

The resulting bottleneck is a prime culprit in the unpredictable system latency that seven out of 10 federal IT managers cite as a problem when they virtualize servers in their data centers, according to a MeriTalk report published last year titled “Consolidation Conundrum.”

Latency equals poor performance, and poor performance means frustrated and unhappy users. And that’s only one, and not even the most costly, of the storage problems that server and desktop virtualization can create. The traditional solution of adding another server host, or adding more or faster disk storage, is not ideal, said George Crump, president of consulting firm Storage Switzerland.

“The [return on investment] on server virtualization is impressive upfront, but the moment you have to buy another host, you start to eat into that ROI,” Crump said. “And you’re just kicking the real problem down the road.”

The better bet is to identify the precise causes of performance problems and apply more targeted and cost-effective fixes for them, Crump and others say. Solutions will likely involve some amount of storage architecture redesign and possibly one or more of the new storage products designed to improve performance of virtualized server and desktop environments, such as solid-state storage and flash memory, storage virtualization and — the latest of the bunch — storage hypervisors.

That approach will still involve new investments, but if it’s done properly, the end result will be a shareable, more flexible storage infrastructure that is better able to accommodate the demands of virtualization.

Why it matters

The storage challenges associated with virtualization are only going to grow in the next few years. On average, government IT professionals expect virtualized workloads to almost double in the next four years — from 37 percent to 63 percent, according to a MeriTalk report titled “Virtualization Vacuum: The 2012 Government Virtualization Study.”

There are several storage-related problems that crop up in virtual environments and affect performance and costs, experts say.

Input/output bottlenecks. When you put several virtual servers on a single physical server, each will likely have its own pattern for reading and writing data to underlying hard disks, depending on application behavior. Collectively, the resulting, highly random I/O stream funneled through a single server’s storage pipe can wreak havoc on overall performance.

“You can go from 60 mph to 5 mph, with little predictability about when it will happen,” Crump said.
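
The blending effect can be illustrated with a minimal Python sketch. It assumes three hypothetical virtual machines that each read four consecutive blocks, and it uses the total distance between successive block addresses as a rough stand-in for disk head movement. The block addresses and the round-robin interleaving are illustrative assumptions, not measurements from any product.

```python
from itertools import zip_longest


def sequential_stream(start_block, length):
    """Blocks requested by one VM that reads its data in order."""
    return list(range(start_block, start_block + length))


def blend(streams):
    """Round-robin interleave of per-VM streams, as a shared storage path might see them."""
    blended = []
    for group in zip_longest(*streams):
        blended.extend(block for block in group if block is not None)
    return blended


def total_seek_distance(blocks):
    """Sum of address jumps between consecutive requests (a crude proxy for seek cost)."""
    return sum(abs(b - a) for a, b in zip(blocks, blocks[1:]))


# Three hypothetical VMs, each perfectly sequential on its own region of the disk.
vm_streams = [sequential_stream(base, 4) for base in (0, 1000, 2000)]

one_after_another = [block for stream in vm_streams for block in stream]
blended = blend(vm_streams)

print(total_seek_distance(one_after_another))  # 2003
print(total_seek_distance(blended))            # 13997
```

Each stream is orderly on its own, yet in this toy case the interleaved order covers roughly seven times the distance, which is the kind of randomization the paragraph above describes.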

Similar challenges afflict servers that host virtual desktops, with the added problems of coordinating I/O-intensive virus scans and boot storms, which occur when lots of users start up their virtual desktops at the same time — for instance, at the beginning of the workday.
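
One way to coordinate I/O-heavy chores such as virus scans or pre-booting desktops is to stagger them across time slots. The sketch below is only an illustration of that idea. The function name `stagger`, the desktop names and the slot length are hypothetical, and real virtual desktop products expose their own scheduling controls.

```python
def stagger(desktops, max_concurrent, slot_minutes):
    """Assign each desktop a start offset so at most max_concurrent run per slot."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return {
        name: (index // max_concurrent) * slot_minutes
        for index, name in enumerate(desktops)
    }


# Five hypothetical desktops, two allowed to start per 10-minute slot.
schedule = stagger(["d1", "d2", "d3", "d4", "d5"], max_concurrent=2, slot_minutes=10)
print(schedule)  # {'d1': 0, 'd2': 0, 'd3': 10, 'd4': 10, 'd5': 20}
```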

Storage utilization blind spots. A less appreciated but potentially more costly challenge is ensuring the efficient use of expensive storage capacity in a virtual server environment, said Jim Damoulakis, chief technology officer at GlassHouse Technologies, a data center consultant with numerous federal customers.

Before server virtualization started to take off, storage administrators used to keep a close watch on how storage was used in data centers. Now, however, it is increasingly common for IT departments to let virtual server administrators request one big pool of storage capacity from the storage administrator that they can later parcel out at will among their virtual servers, he said.

“If it’s not a priority for the [server] virtualization folks or if they don’t have the skills to monitor and manage it, [you run the risk of] increasing the inefficiencies of the storage utilization,” Damoulakis said. That inefficiency leads to organizations buying excessive storage capacity as insurance against the risk of important applications running out of storage space.

It’s a costly premium to pay when keeping up with actual growth is already so expensive. For example, the Army’s Acquisition, Logistics and Technology Enterprise Systems and Services reported that its data storage capacity increased 800 percent in the past four years. The agency is shopping for a storage virtualization solution to better manage that capacity.

The fundamentals

There are various ways to address the storage bottleneck associated with server and desktop virtualization. Some solutions are new and purpose-built for the task, while others involve more traditional tools enlisted to focus on the specific needs of virtualization.

The product label getting a lot of recent attention is the storage hypervisor. The name is a play on the server hypervisor, that key enabling component of server and desktop virtualization software that abstracts a physical server into a set of resources that are allocated to multiple virtual machines. Likewise, a storage hypervisor abstracts multiple physical storage assets (such as disks and cache) into a single virtual resource that can be allocated in any number of ways for different purposes.

“The term ‘storage hypervisor’ is new and not generally accepted in the industry or agreed upon,” said David Hill, founder of the consulting firm Mesabi Group. “Other terms, such as ‘virtual storage,’ may be used instead with different approaches that yield the same basic capabilities.”

The storage vendor Virsto is one of the chief promoters of the storage hypervisor label and uses it to describe its software product that was specifically designed to improve the efficiency and performance of storage resources in server and desktop virtualization environments. The company uses the term because its software integrates tightly with server hypervisors from VMware and Microsoft.

Virsto improves storage performance in several ways. It transforms virtual machines’ random I/O stream into a more orderly process to reduce latency. It also uses a technique called thin provisioning, which presents each virtual machine with more storage than has actually been physically allocated to it, thereby raising utilization efficiency.
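
Thin provisioning in general can be sketched in a few lines of Python. This is a generic illustration and not Virsto's implementation. The `ThinPool` class, its method names and the block counts are all hypothetical. Physical blocks are handed out only when a virtual block is first written, so the capacity promised to volumes can exceed what the pool physically holds.

```python
class ThinPool:
    """Toy thin-provisioned pool: physical blocks are allocated on first write."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.used = 0
        self.volumes = {}  # name -> (virtual size, {virtual block: physical block})

    def create_volume(self, name, virtual_blocks):
        # Creating a volume consumes no physical space.
        self.volumes[name] = (virtual_blocks, {})

    def write(self, name, virtual_block):
        virtual_blocks, mapping = self.volumes[name]
        if not 0 <= virtual_block < virtual_blocks:
            raise IndexError("block outside the volume's virtual size")
        if virtual_block not in mapping:
            if self.used >= self.physical_blocks:
                raise RuntimeError("pool exhausted: promised more than it can hold")
            mapping[virtual_block] = self.used  # next free physical block
            self.used += 1
        return mapping[virtual_block]

    def promised(self):
        """Total virtual capacity handed out to volumes."""
        return sum(size for size, _ in self.volumes.values())


pool = ThinPool(physical_blocks=100)
for vm in ("vm-a", "vm-b", "vm-c"):
    pool.create_volume(vm, virtual_blocks=100)

pool.write("vm-a", 57)
pool.write("vm-b", 3)
pool.write("vm-a", 57)  # rewriting an existing block allocates nothing new

print(pool.promised(), pool.used)  # 300 2
```

The same sketch shows the risk that comes with the technique: if the volumes eventually write more than the pool physically holds, allocation fails, so actual consumption still has to be monitored.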

Neither technique is entirely new, but Virsto’s degree of integration with the server hypervisor is, Crump said. DataCore Software and IBM are also using the storage hypervisor moniker to describe products that tackle similar issues. Other vendors addressing the storage issue in server or desktop virtualization environments with purpose-built solutions include Atlantis Computing and Nutanix.

Storage tiering is another important technique, working in step with storage virtualization, for coping with the dynamic demands of server virtualization environments, said Clay Cole, business director at the Agriculture Department’s National Information Technology Center.

The center’s storage infrastructure consists of three tiers — from the top-level Tier 1 that provides the highest performance capabilities at the highest cost to Tier 3, which provides less robust performance but at a lower cost. The center manages more than 4,000 servers, approximately 70 percent of which are virtualized.

“Our virtualized storage tiers complement our virtualized server infrastructure well,” Cole said. “We routinely stick with our Tier 2 standard unless an application proves it needs something more. If and when that happens, the virtualization technologies allow us to address those requirements easily, even as the given application is running.”
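
The policy Cole describes, defaulting to Tier 2 and moving up only when an application demonstrates the need, could be expressed roughly as follows. The IOPS threshold and the function name `recommend_tier` are hypothetical placeholders. The center's actual promotion criteria are not given in the article.

```python
DEFAULT_TIER = 2
# Hypothetical ceiling: sustained IOPS that Tier 2 is assumed to serve comfortably.
TIER2_IOPS_CEILING = 5000


def recommend_tier(measured_iops=None):
    """Stay on the default tier unless measured demand exceeds what it can serve."""
    if measured_iops is None:
        return DEFAULT_TIER  # no evidence yet, so use the standard tier
    if measured_iops > TIER2_IOPS_CEILING:
        return 1  # the application has shown it needs the top tier
    return DEFAULT_TIER


print(recommend_tier())      # 2
print(recommend_tier(1000))  # 2
print(recommend_tier(8000))  # 1
```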

Agencies can build tiered storage and storage virtualization capabilities using products from most storage vendors.

Cache memory and, increasingly, solid-state disks (SSDs) are also valuable tools for mitigating server and desktop virtualization bottlenecks, Crump and others say. Those tools use memory chips instead of mechanical spinning disks to store data, so they read and write data extremely fast and, therefore, can relieve intense I/O pressure. An increasingly popular strategy is to integrate SSDs into a tiered storage architecture at what’s typically called Tier 0, an approach supported by many storage vendors.

The hurdles

Although a growing number of storage solutions can help mitigate virtualization’s negative impact, agencies need to be careful that their storage fixes don’t eat into virtualization’s savings. For example, SSDs are far less expensive than they were a few years ago, but they are still 20 to 30 times more costly than traditional hard disks, so they need to be deployed selectively.
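
A quick calculation shows why selective deployment matters. The sketch below takes the 20x to 30x price multiple from the paragraph above and assumes, purely for illustration, that 5 percent of capacity is moved to SSD. It also treats the multiple as a per-unit-of-capacity figure, which the article does not specify.

```python
def relative_cost(ssd_fraction, ssd_multiplier):
    """Hardware cost relative to an all-hard-disk system of the same capacity."""
    if not 0.0 <= ssd_fraction <= 1.0:
        raise ValueError("ssd_fraction must be between 0 and 1")
    return (1.0 - ssd_fraction) + ssd_fraction * ssd_multiplier


# Assumed 5 percent of capacity on SSD, at the low and high ends of the quoted range.
low = relative_cost(0.05, 20)
high = relative_cost(0.05, 30)
print(round(low, 2), round(high, 2))  # 1.95 2.45
```

Under those assumptions, a small SSD tier raises hardware cost to roughly 2 to 2.5 times the all-disk baseline, whereas putting everything on SSD would multiply it 20 to 30 times.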

Likewise, agencies interested in new turnkey storage solutions purpose-built for server and desktop virtualization environments need to weigh how those products mesh with their existing storage gear and upgrade plans.

Nevertheless, some investment in storage modernization is inevitable and desirable, Hill said. “Explosive data growth and tight budgets can no longer keep up with just the falling cost of storage alone,” he said.

Next steps: Cover the basics

Server and desktop virtualization offer compelling benefits, but they can also be disruptive technologies that raise costs for related infrastructure, such as storage, and drag down end-user satisfaction if system performance suffers. Here are some steps that can help mitigate those negative effects without major new spending.

1. Read the manual. All the server and desktop virtualization vendors include guidelines for optimizing systems that will be virtualized and cover issues such as how to configure virus scans and other routine chores that can turn ugly fast on a virtual server. Follow that advice, said Ken Liska, a virtualization specialist in NetApp’s public sector division. “These guidelines will have the biggest effect on storage performance,” he said. “Making a few best-practice maneuvers in your [virtual machines] could reduce the amount of money you spend on disks greatly.”

2. Do your homework. Take the time to evaluate the existing workload and understand which virtual servers will coexist nicely on the same physical host before you deploy, said Donna James, president of Accent Global System Architects, which recently helped the Internal Revenue Service virtualize more than 1,600 servers at 13 data centers. “You can have all the hardware resources in the world and still have contention issues if you have not architected the solution properly,” she said.

3. Use testing and diagnostic tools. A good performance reporting tool is worth its weight in gold for getting to the bottom of virtualization bottlenecks and targeting cost-effective fixes. A new benchmark tool called VDI-IOmark allows agencies to compare how different storage products handle desktop virtualization workloads. Liska said agencies can also use tools such as those from Liquidware Labs to help plan and test virtualization deployments and measure storage impact prior to production.
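
The homework in step 2, working out which virtual servers can share a host, can be approximated as a packing exercise once peak I/O figures have been measured. The sketch below uses a simple first-fit-decreasing heuristic. The workload names, IOPS numbers and per-host budget are hypothetical, and a real assessment would also weigh CPU, memory and the timing of each workload's peaks.

```python
def place(vms, host_iops_budget):
    """First-fit decreasing: pack VMs onto hosts without exceeding each host's IOPS budget."""
    hosts = []  # each host: {"free": remaining IOPS, "vms": [names]}
    for name, iops in sorted(vms.items(), key=lambda item: item[1], reverse=True):
        if iops > host_iops_budget:
            raise ValueError(f"{name} alone exceeds the host budget")
        for host in hosts:
            if host["free"] >= iops:
                host["vms"].append(name)
                host["free"] -= iops
                break
        else:
            # No existing host has room, so open a new one.
            hosts.append({"free": host_iops_budget - iops, "vms": [name]})
    return [host["vms"] for host in hosts]


# Hypothetical measured peak IOPS per virtual server.
peaks = {"db": 600, "mail": 500, "web": 300, "file": 200, "dns": 50}
layout = place(peaks, host_iops_budget=1000)
print(layout)  # [['db', 'web', 'dns'], ['mail', 'file']]
```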


Reader comments

Wed, Jun 27, 2012 Todd Sanders Silver Spring, MD

The article addresses only one or two issues, and the author did not take into consideration the following points, which would have been paramount to the discussion.

Different vendors have been offering these types of solutions for years: IBM 8800 front end array, Hitachi HSV (be aware that the front end array will need to format data in order to take advantage of this feature), FalconStor IPStor and/or Virtualization, DataCore Storage Virtualization (mentioned), HP 3PAR and EVA Cluster, and even open source FreeNAS 8.X. As a subset, replication, HSM, de-duplication, DR and a number of other features can be introduced to the storage equation to help resolve existing problems.

HSM - Hierarchical Storage Management, a technology created by IBM, is used by virtualization vendors to address large amounts of storage. The user can automatically move data from one tier of storage to another based on usage and/or policy. This gives the user improved speed and efficiency.

File System - The only "Storage Virtualization" tool we have found that takes advantage of one of the most powerful file systems is FreeNAS (an open source tool). We have found that ZFS improved speed, performance and transfer rate: we could take JBODs, insert them into a disk array and, by adding them to the collective, increase transfer rates by 40-50% using this file system (the controllers manage this type of file system, so the OS does not even know it exists).

Fabric - By improving fabric connection speed and upgrading the controllers on the back end, the user can improve speed and performance from an I/O standpoint. Brocade has improved its fabric connection speed from 4GB to 16GB.

Network convergence - Allows the user to converge/combine the core with the SAN by creating VSANs and VLANs to allow 100GB connection speed from end to end (Brocade was doing this last year).

CNAs - "Converged Network Adapters" can be placed in the host chassis to take advantage of the improved performance once the fabric has been upgraded.

Backup - As mentioned before, we could use HSM to move the data to slower disks and eventually to tape. That tape in turn can be moved offsite to companies that manage tape rotation. All of this is part of the storage virtualization tools mentioned.

So I think the article was written fairly, but delving into key areas that address specific solution points would have given the reader a broader perspective on the advantages of storage virtualization "hypervisor" solutions. Thank you for updating the public; it is greatly appreciated.

Todd Sanders (Sr. Partner) www.itotsnetworks.com
