Shrinking backup time
- By John Zyskowski
- Aug 09, 2004
Earlier this year, the data backup routine at Oregon's Department of Consumer and Business Services reached its breaking point. After the amount of data the department stored had doubled annually for several years, managers needed all day Saturday and Sunday each week to copy data from the production servers to backup tapes for safekeeping.
The data growth rate was enough of a problem, but a scheduled move of some older mainframe applications and related data to the new servers was the clincher. Either the officials would have to find a way to fit more than 24 hours into a day or they would have to cut the time it took to do backups.
Lacking the option of cracking the space/time continuum, department officials found a way to shrink the backup window in the form of a new disk-based backup system from Overland Storage Inc., one of the numerous vendors that offer such a solution.
Because a spinning random-access disk can write and retrieve data much faster than a tape drive, which reads and writes data sequentially, the new system allowed the department to cut backup time by 75 percent to just 12 hours, said Ellen Murphy, a network analyst and backup administrator for the department.
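The two figures in that claim agree if the old backup window is read as the full 48 hours of Saturday and Sunday. That reading is an assumption, since the department's old window is not stated in hours. A quick check of the arithmetic, with a hypothetical helper name:

```python
def percent_reduction(old_hours: float, new_hours: float) -> float:
    """Return the percentage by which a backup window shrank."""
    if old_hours <= 0:
        raise ValueError("old_hours must be positive")
    return (old_hours - new_hours) / old_hours * 100.0


# Assumption: "all day Saturday and Sunday" means a 48-hour window.
OLD_WINDOW_HOURS = 48.0
NEW_WINDOW_HOURS = 12.0

reduction = percent_reduction(OLD_WINDOW_HOURS, NEW_WINDOW_HOURS)
print(f"Backup window cut by {reduction:.0f} percent")
```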
Disk-based backup is one of several new storage technologies that are redefining how government offices of all sizes can protect data from everyday errors and more catastrophic threats. These new systems usually perform better and often cost less to buy and manage than traditional backup solutions.
Other new technologies in this group include storage network gear that uses the new Internet SCSI (iSCSI) protocol, turnkey storage-area network platforms and data copy solutions that consume far fewer storage resources than before and that can replicate data across long distances to remote disaster recovery sites.
Some of these capabilities are not entirely new, having been available before to big information technology shops with budgets to match and pressing needs to justify the hefty investments. What has changed is the price tag. Even small agencies can afford remote disaster recovery capabilities, speedy backups and system restores that occur in seconds or minutes instead of hours or days.
But it's not only average Joes who benefit from this new crop of storage solutions. Large shops also are using the new gear to cut their data protection costs by creating tiered storage architectures that match service levels based on business requirements with the appropriate technical capabilities and costs.
"It's almost been revolutionary in that we used to say traditionally that we backed up to tape, period, end of story," said Dianne McAdam, senior analyst and partner at the research and advisory firm Data Mobility Group LLC. "Now we have lots of options that we can use in stand-alone mode or in conjunction with tape to streamline or speed up the whole backup and restore process. And because it's getting very competitive out there with so many vendors, pricing has been competitive, too."
Market observers have long imagined a day when the price of disk drives would come down to a level at which IT managers would consider using hard drives to store backup and archival data. This is no knock on tape, which has served managers adequately for about 50 years and is still unmatched on price for the capacity. But tape has its issues.
For one, tape is a serial technology. Just like finding a certain scene in a movie on a videotape, locating a specific piece of data requires forwarding or rewinding the data tape, which takes time and causes wear and tear. And unlike diamonds, tape is not forever. The magnetic charges recording the data degrade, necessitating periodic replacement of old tapes. In fact, you cannot even be sure the original data you have written is sound.
"When you back up to tape, you never know if the data was properly written until you try to use it to restore lost data," said Bob Farkaly, director of worldwide disk product sales at Overland Storage Inc., which sells disk and tape backup products. When the boss accidentally deletes an important report, this is not the time to discover that the file server backup copy was not written properly.
Those who foresaw the rise of disk-based backup are seeing their prediction come true. But many were wrong on at least one important point: They expected the disks to be high-end technologies such as Fibre Channel or SCSI, which have the performance and reliability characteristics assumed to be required for the job. It was just a matter of waiting until their prices fell sufficiently.
As it turns out, the disks that are used in most of the disk-based backup solutions are Advanced Technology Attachment (ATA) or the newer Serial ATA type. ATA disks have been standard in desktop computers for years. They have always been less expensive than Fibre Channel and SCSI disks, at as little as one-third the cost, but they also had a reputation for being less reliable and therefore inappropriate for mission-critical use.
But ATA is proving to be a worthy player in the enterprise space by using techniques such as Redundant Array of Independent Disks (RAID) that guard against data loss from disk failure and by adding new high-end features such as hot-swappable parts for replacing faulty disks without bringing down the whole disk array.
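How RAID guards against a disk failure can be shown with XOR parity, the idea behind parity-based levels such as RAID 5. Which RAID levels the vendors mentioned here actually use is not stated, so treat this as a minimal sketch of the general principle rather than any product's design. The data strings are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 8-byte block.
data_disks = [b"payroll!", b"permits!", b"filings!"]

# The parity disk stores the XOR of all data blocks.
parity = xor_blocks(data_disks)

# Simulate losing disk 1: XOR the survivors with parity to rebuild it.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_disks[1])
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks plus parity. Losing two disks at once would defeat this simple scheme.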
The other point on which pundits were not exactly right is the role that disks would play in backup. Instead of replacing tape, most disk-based backup systems are used as a middle tier, a so-called nearline staging ground for backup data as it moves from online primary disks to its eventual resting place on off-line tape.
The justification for buying a disk-based backup system to play this role is twofold, said McAdam: Backups are done much quicker because disks are faster than tape, and data restores also are quicker, as long as a copy of the data to be restored is still on the disk backup unit. Farkaly said that about 90 percent of Overland's disk-based backup customers use the company's devices in the nearline role in conjunction with tape, rather than as a tape replacement. McAdam said her group advises organizations to buy enough disk capacity to keep at least one full system backup copy on their units at all times.
After backup data is initially written to disk, at some point it is written from the disk-based backup unit directly to tape, an operation that has the added benefit of running in the background and not affecting end users' access to the primary data on servers.
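The nearline role described above can be modeled in a few lines. This is a toy sketch under stated assumptions, not any vendor's software: the class name, the slot count and the backup-set names are all hypothetical, and real products track far more state.

```python
from dataclasses import dataclass, field


@dataclass
class NearlineStore:
    """Toy model of a disk staging tier that sits in front of tape."""
    disk_slots: int                            # backup sets the disk can hold
    disk: dict = field(default_factory=dict)   # name -> data, oldest first
    tape: dict = field(default_factory=dict)   # name -> data

    def __post_init__(self) -> None:
        if self.disk_slots < 1:
            raise ValueError("disk_slots must be at least 1")

    def back_up(self, name: str, data: bytes) -> None:
        """Write a new backup set to disk, evicting the oldest if full."""
        while len(self.disk) >= self.disk_slots:
            oldest = next(iter(self.disk))     # dicts keep insertion order
            # Never drop a set that has not reached tape yet.
            self.tape.setdefault(oldest, self.disk[oldest])
            del self.disk[oldest]
        self.disk[name] = data

    def copy_to_tape(self) -> None:
        """Background step: copy every set on disk to tape."""
        self.tape.update(self.disk)

    def restore(self, name: str):
        """Return (data, tier), preferring the faster disk copy."""
        if name in self.disk:
            return self.disk[name], "disk"
        return self.tape[name], "tape"


store = NearlineStore(disk_slots=2)
for week in ("week1", "week2", "week3"):
    store.back_up(week, week.encode())
    store.copy_to_tape()

print(store.restore("week3")[1])   # recent set, still on disk
print(store.restore("week1")[1])   # older set, only on tape
```

The sketch also suggests why McAdam's sizing advice matters: a restore is fast only while the needed backup set is still on disk.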
In Oregon, Murphy's department makes two tape copies from the disks. One tape cartridge is stored in the office for use when routine data restores are required, and the other is shipped to a remote facility and available for disaster recovery purposes in the event systems are knocked out at the primary facility.
Murphy said officials plan to spend about $40,000 for two of Overland's REO disk-based backup units and one of the vendor's new NEO series tape libraries. "This should meet our plan of getting through the next three years without having to make more changes," she said.
Elsewhere, government entities are combining disk-based backup with other relatively new storage technologies to transform backup and disaster recovery routines. For example, officials at Michigan's Information Technology Department are tapping disk-based backup plus distance replication and new storage networking gear to overhaul data-protection schemes.
Three years ago, Michigan IT staff handled backup by writing data to tape drives located alongside the mainframe and servers housed in the state's two production data centers near Lansing. After the data was written, the tapes would be taken off the drives and hand-delivered about seven miles away to a third data center, which serves as the disaster recovery site. There the tapes would be stored, available to restore lost data or downed systems, operations that could take several hours.
This routine began to change with the deployment of new storage networks and software that enables data copies to be transmitted electronically. The first piece of the puzzle is EMC Corp.'s Symmetrix Remote Data Facility. This software allows officials to simultaneously make two identical copies of data in a process called mirroring, which stores one copy at the production site and one at the disaster recovery site.
If a problem occurs with data at the primary site, the copy from the disaster recovery site can be available immediately for essentially uninterrupted operations.
The seven-mile gap between sites is bridged using a state-owned fiber-optic line and Computer Network Technology Corp.'s Spectrum 2000 dense wavelength-division multiplexing product, which splits the line into independent data-carrying channels. The state stores the mirrored data on four Symmetrix disk arrays, two at the production sites and two at the disaster recovery site.
"We use the mirroring solution for those applications that have [around-the-clock] business requirements, such as state police and law enforcement data, child support data, and treasury data," said Carol Steele Sherman, director of data center operations at the department.
For other applications with less demanding requirements, officials created a separate Tier 2 set of backup and recovery routines using less expensive technologies that still offer more powerful capabilities than tape-only methods.
The first part of this process involves new EMC Clariion ATA-based disk arrays at the two production facilities for nearline, disk-based backup. The Clariions handle full backups of the entire production system and frequent incremental backups of recently changed data. As a result, the state minimizes its exposure to data loss, because the incremental disk-based backups occur far more often than before when only tape was used.
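The difference between a full and an incremental backup comes down to which files get selected. The sketch below assumes selection by modification time, which is one common approach. How the state's backup software actually detects changes is not described, and the file names and timestamps are invented.

```python
from typing import Optional


def select_for_backup(mtimes: dict, last_backup: Optional[float]) -> list:
    """Pick files to copy.

    mtimes maps a file path to its last-modified time (seconds).
    With no previous backup, everything is selected (a full backup).
    Otherwise only files changed since last_backup are selected.
    """
    if last_backup is None:
        return sorted(mtimes)
    return sorted(path for path, mtime in mtimes.items() if mtime > last_backup)


# Hypothetical files and modification times.
files = {"cases.db": 100.0, "memo.doc": 250.0, "audit.log": 300.0}

full = select_for_backup(files, last_backup=None)
incremental = select_for_backup(files, last_backup=200.0)
print(full)
print(incremental)
```

Because the incremental pass copies only what changed, it can run far more often than a full backup, which is what narrows the window of data at risk.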
"Our service-level agreements for restoration of data are significantly improved in that we can restore data 15 times faster now than we could before with tape" by using disk-based backup, said Rick Hoffman, storage manager for the Michigan department. "We're also able to see a [six-fold] decrease in backup times."
The backup data on disks then makes its way to tape. But unlike when tape was written on drives at the production site and then hand-carried to the disaster recovery site, a new Fibre Channel-based tape system at the disaster recovery site handles the tape writing responsibilities remotely.
The state uses a Fibre Channel switch from McDATA Corp. to send the backup data via the state-owned fiber lines to the tape drives. Restore operations also can be conducted via the fiber link.
Another relatively new technology that Michigan officials are using is a method for rapidly copying data to disk known as snapshot or copy on write.
EMC offers a couple of versions of the copy software depending on the storage platform used with it. They all achieve similar results: data copies that are only 30 percent of the size of the original dataset, said Jon Wehse, business continuity practice manager for EMC's federal division.
"Snap copies take less storage consumption, so you can make more of them more often," Wehse said. That decreases the risk of losing data during unplanned system outages.
Snapshot products are also available from many other companies. For example, storage specialist Veritas Software Corp. offers FlashSnap capabilities in its Foundation Suite line, and Network Appliance (NetApp) Inc. offers a family of Snapshot Technology products for use with the company's network-attached and nearline storage devices.
Mark Weber, vice president of NetApp's federal systems division, said using snapshot capabilities in conjunction with nearline disk storage is a popular combination with the company's intelligence agency customers. "They can't afford the tape recovery time," he said.
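The copy-on-write idea behind these snapshot products explains why the copies are so small: a snapshot stores an old block only when that block is about to be overwritten. The following is a minimal sketch of the general technique, not EMC's, Veritas' or NetApp's implementation. The class and method names are hypothetical.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots."""

    def __init__(self, blocks: list):
        self.blocks = list(blocks)
        self.snapshots = []            # one dict per snapshot: index -> old block

    def snapshot(self) -> int:
        """Start a snapshot. It holds no data until blocks change."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data: bytes) -> None:
        """Save the old block into each snapshot that lacks it, then overwrite."""
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int) -> bytes:
        """Read a block as it was when the snapshot was taken."""
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])


vol = Volume([b"A", b"B", b"C", b"D"])
snap = vol.snapshot()
vol.write(1, b"X")                     # only this block gets preserved

view = [vol.read_snapshot(snap, i) for i in range(4)]
print(view)
print(len(vol.snapshots[snap]), "of", len(vol.blocks), "blocks stored")
```

Here the snapshot preserves one block out of four. The 30 percent figure quoted above is the vendor's number and depends on how much data changes, so the sketch does not try to reproduce it.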
Another new direction storage companies such as EMC, FalconStor Software Inc., Hitachi Data Systems, Veritas and others have taken involves doing data replication over long distances, using asynchronous methods. The problem with using synchronous copy methods across long distances is that the small lag or latency that a wide-area network introduces is enough to cause trouble for systems that require split-second response times when handling transactions.
Last year, EMC introduced an asynchronous distance data copy product for the company's Symmetrix arrays called SRDF/A. Using this approach, remote copies are created soon after — even seconds after — the production data is written to the primary storage device.
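A back-of-the-envelope model shows why distance hurts synchronous copying but not the asynchronous kind. It assumes light travels through fiber at roughly 200 kilometers per millisecond (about two-thirds of its speed in a vacuum) and ignores switching and protocol overhead, so real delays are higher. The 1-millisecond local write time and the distances are hypothetical.

```python
FIBER_KM_PER_MS = 200.0   # approximate speed of light in optical fiber


def round_trip_ms(distance_km: float) -> float:
    """Time for a signal to reach the remote site and return."""
    return 2.0 * distance_km / FIBER_KM_PER_MS


def write_latency_ms(local_ms: float, distance_km: float, synchronous: bool) -> float:
    """Delay the application sees for one write.

    Synchronous: wait for the remote site to confirm before acknowledging.
    Asynchronous: acknowledge after the local write; the remote copy lags.
    """
    if synchronous:
        return local_ms + round_trip_ms(distance_km)
    return local_ms


LOCAL_WRITE_MS = 1.0      # hypothetical local disk write time

for km in (11.0, 4000.0):  # about seven miles, and a cross-continent link
    sync = write_latency_ms(LOCAL_WRITE_MS, km, synchronous=True)
    asyn = write_latency_ms(LOCAL_WRITE_MS, km, synchronous=False)
    print(f"{km:>6.0f} km: sync {sync:.2f} ms, async {asyn:.2f} ms")
```

At a few miles the synchronous penalty is negligible. Across a continent it dwarfs the local write time on every transaction, which is the latency problem asynchronous products sidestep.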
In exchange for having backup copies that are out of sync by a matter of minutes, IT administrators get the advantage of being able to situate the backup site across continents or oceans from the primary data center because network latency is no longer an issue, Wehse said. The asynchronous approach also requires significantly less network bandwidth than synchronous mirroring.
"Disaster recovery is a huge trend and replication is a big part of it, but everyone always thinks about terrorism" as the threat, said Paul Smith, vice president of government operations at Veritas. Interruptions to government services that result from more routine occurrences, such as flooding and equipment failures, are far more likely, he said.
Safety in numbers
Other vendors are exploring ways to achieve greater tolerance for equipment failures and other problems using clustered approaches to storage.
For example, Xiotech Corp. introduced its Magnitude 3D storage system last year. It allows IT administrators to create one centrally managed pool of storage and physically separate the system's storage controllers and disk drives. The system can recognize problems such as faulty or knocked-out controllers and shift traffic automatically to another available component, which can be located up to 300 meters away.
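The failover behavior described above amounts to routing each request to the first component that is still healthy. The sketch below illustrates that idea only. It is not Xiotech's software or API, and the class and controller names are hypothetical.

```python
class StoragePool:
    """Toy pool that routes traffic to the first healthy controller."""

    def __init__(self, controllers: list):
        # Preference order follows the order given.
        self.healthy = {name: True for name in controllers}

    def mark_failed(self, name: str) -> None:
        """Record that a controller is faulty or unreachable."""
        self.healthy[name] = False

    def route(self) -> str:
        """Return the controller that should handle the next request."""
        for name, ok in self.healthy.items():
            if ok:
                return name
        raise RuntimeError("no healthy controller available")


# Two controllers, assumed to sit in separate racks.
pool = StoragePool(["controller-rack-a", "controller-rack-b"])
before = pool.route()
pool.mark_failed("controller-rack-a")   # simulate losing the first rack
after = pool.route()
print(before, "->", after)
```

Failover helps only if the surviving controller is not taken out by the same event, which is the argument for physical separation made in the quote that follows.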
"Other vendors also offer a controller failover, but the controllers are housed in the same rack, so if you lose the rack, you lose both controllers, which is no help," said Rob Peglar, vice president for technical solutions at Xiotech.
Officials in the Minnesota Secretary of State's Office have been using a pair of Xiotech's previous-generation Magnitude systems for three years, and they recently purchased the Magnitude 3D.
Officials are not using the 3D's distributed component capabilities yet, but they use Xiotech's Geo-Replication Services software to replicate critical business filings and election data between a Magnitude 3D in one building and an older Magnitude in another, said Tom Lodemeier, the office's network administrator.
"If our main building goes down, servers in the other building automatically grab the [mirrored disks], and everything is back up quickly," Lodemeier said.
Indeed, being able to get systems back online quickly following an outage is the goal driving new storage strategies governmentwide.