Spinning a new tune

Agencies interested in disk-based backup face a slew of options

Agencies have many options to choose from to improve their data protection routines with low-cost, disk-based backup. The market for disk-based backup has become a bowl of alphabet soup: D2D, D2D2T, VTL, CDP and more.

Many factors can influence an agency's choice. Budget, requirements for recovery speed and the currency of the recovered data, willingness to change backup processes, and specific objectives such as disaster recovery or long-term archiving can all be part of the decision.

"There is a clear movement to disk-based backup because of its speed and increasing affordability," said Greg Schulz, senior analyst at the Evaluator Group. "Tape itself is being focused more on long-term [data] retention."

For example, the Washington State Transportation Department's South Central Region Office of Information Technology opted for a disk-to-disk-to-tape solution (D2D2T) as part of its server and storage consolidation effort. The agency manages about 700G of data and was backing up data to tape with nightly incremental backups and a full backup on weekends.

"The previous approach simply was not adequate," said Phil Johnson, network administrator for the region. For starters, the agency had no off-site disaster recovery capability, because the old backup software could create only one backup tape copy. The copy was kept in the office so that it was available when files or systems had to be restored. Also, recovering files from the tape was often time-consuming.

Department officials chose a D2D2T backup appliance from STORServer because it offers the benefits of disk-based backup, but also allows them to use the existing tape library for archival and disaster recovery purposes.

With the new system, the department's STORServer W2000 initially backs up all data to its built-in disks. The appliance then automatically copies the data to two sets of tape, one of which is shipped off-site for disaster recovery purposes.

Meanwhile, if users need a lost file restored, Johnson can quickly recover it from the appliance's disks. The appliance was easy to set up and cost $38,000, including three years of maintenance, he said.

The Stanislaus County Health Services Agency in Modesto, Calif., also added disks to its tape-only backup process, adopting Quantum's DX30 backup appliance.

"We wanted this for the ease of restoring data," said Ken Hoach, IT manager for the agency. "We were doing backups, but we had problems at many points in the process. When it came [time to] restore, it was hard to find the right tape. It took a lot of administrative work, and it wasn't reliable."

Now the organization maintains two weeks of backup data on disk, which is more than enough for recovery purposes. "It is rare that we have to go back more than one or two days," Hoach said. And when a user asks for a file to be restored, "it is so fast, we can do it while the caller is still on the phone."

The DX30, which cost $35,000, came with 4 terabytes of raw storage, 3.25 of which was usable. The agency backs up to the appliance using Computer Associates' BrightStor ARCserve backup software, which illustrates another point about disk-based backup. All the major third-party backup software products now support disk-based backup, so customers usually do not need to modify their existing backup routines when disks are introduced.

The agency plans to increase the DX30's capacity this year but will continue to use tape for disaster recovery purposes. "We still need to send data off-site," Hoach said. "If there was a disaster in our computer room, we'd lose the DX30, too."

For that reason, most disk-based backup options still include either the option to back up to removable media such as tape or to replicate the data to disk at another location.

The disk-based backup market offers three main options, though their functionality overlaps somewhat:

  • D2D refers to any process that copies primary data stored on disk to a backup disk. D2D can take the form of backup datasets; mirroring, in which data is written to the primary and secondary disks simultaneously; or replication, often referred to as snapshots. In the D2D scenario, tape may or may not be used as a final resting place for the backup data.

  • With D2D2T products, the automatic copying of data from the backup disks to tape is built into the process. "D2D2T is the way most organizations are going," said Arun Taneja, principal consulting analyst at the Taneja Group. "We don't recommend getting rid of tape altogether. It is too radical."

  • Virtual tape libraries act like conventional tape libraries, but the data is stored on disk drives. The benefits are greater reliability, because the disk is protected through redundancy, and faster restoration. Storing the data on disk eliminates the delays related to finding and loading the right tape and identifying the correct file.

Other options

Officials in Forsyth County, N.C., which encompasses Winston-Salem and its surrounding areas, use two EMC CLARiiON CX600 disk arrays to store data at a production site and back it up to a disaster recovery site 2 miles away.

"We use the CX600 as virtual tape at our disaster recovery site," said Alisa Phelps, an Internet analyst for the county. The county uses BakBone NetVault backup software to send the data from its primary CX600 array to the CX600 at the disaster recovery site using a fast Fibre Channel connection.

The county opted for EMC's product because it "allows us to mix [Fibre Channel] and [Advanced Technology Attachment] disk in the array," Phelps said. "At the time we bought it, they were the only one."

Like most others, the county stores its disk-based backup data on tape. The entire storage infrastructure, including the two CX600 systems, EMC storage-area network equipment, and new tape and backup software, cost about $500,000.

"We were doing this because we really needed the SAN," Phelps said.

The disk-based backup and disaster recovery amounted to an ancillary expense.

In general, Fibre Channel disk is used to store primary production data, while the less expensive ATA disk, now being replaced by Serial ATA disk, is used as the backup target.

"It is the low cost of ATA and [Serial ATA] disk that makes backup to disk economically feasible," Schulz said.

Continuous data protection (CDP), which also uses disks, represents a completely different approach to data backup. "This is a niche technology for backing up databases," Taneja said.

Rather than back up the whole database, CDP captures the changes database applications make to the data and saves information about those changes in a log file on the disk. It relies on the existence of an initial complete copy of the database, but from that point on, it captures and saves only the changes. Should a problem arise, the system rolls the log back to where the problem started to quickly re-create the database.
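The change-log idea can be illustrated with a toy sketch in Python. It is not how any particular CDP product works: the class name ChangeLogStore, the in-memory dictionary standing in for a database, and the use of None to mark a deletion are all assumptions made for illustration. The sketch also rebuilds the data by replaying the log forward from the initial copy up to the last good change, which is one simple way to reach the same state as rolling the log back.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLogStore:
    """Toy CDP store (hypothetical): an initial full copy plus an ordered change log."""
    base: dict                                # initial complete copy of the "database"
    log: list = field(default_factory=list)   # entries of (sequence_no, key, new_value)

    def record(self, key, value):
        """Capture one change instead of re-copying the whole database."""
        seq = len(self.log) + 1
        self.log.append((seq, key, value))
        return seq

    def restore(self, up_to_seq):
        """Rebuild the data as of change number up_to_seq (inclusive)."""
        state = dict(self.base)
        for seq, key, value in self.log:
            if seq > up_to_seq:
                break
            if value is None:        # None marks a deletion in this sketch
                state.pop(key, None)
            else:
                state[key] = value
        return state


store = ChangeLogStore(base={"acct-1": 100, "acct-2": 250})
store.record("acct-1", 90)            # change 1
good = store.record("acct-3", 40)     # change 2: last known-good point
store.record("acct-2", None)          # change 3: the problem (accidental delete)

recovered = store.restore(good)
print(recovered)  # {'acct-1': 90, 'acct-2': 250, 'acct-3': 40}
```

Because only the changes are stored after the first full copy, the recovery point can be any logged change rather than the time of the last scheduled backup.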

The Air Force Center for Environmental Excellence took yet another approach, called televaulting. The center's problem was determining how to back up servers spread nationwide. Agency officials also wanted to be able to restore entire servers, operating systems and applications in the event of a disaster.

The conventional approach would put tape backup into each location and then continuously ship tapes among the offices to get them off-site. But the agency turned to Asigra, which sells a disk-based televaulting solution by the same name.

"We wanted speed of recovery and we wanted to be able to recover laptops too," said Ralph Miles, the center's network administrator. Asigra covers the center's multiple offices, including desktop and laptop computers, and provides fast recovery, Miles said.

The solution involves loading Asigra software onto a server at each branch office. The server collects items that need to be backed up from every system it can access in that office. The server then filters out redundant data, compresses and encrypts what remains, and sends it across a wide-area network to a disk array at the center's headquarters in Brooks City-Base, Texas. There, employees back up the data to tape, although they say that is not necessary because the data is already stored at each remote location and on the disk array at the center's headquarters.
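The filter-then-compress step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of Asigra's software: the function name prepare_for_transfer, the use of SHA-256 digests to spot identical files, and zlib compression are all choices made for the example, and encryption is left out because it would need a third-party library.

```python
import hashlib
import zlib


def prepare_for_transfer(files, already_sent):
    """Filter out redundant content, then compress what remains (hypothetical helper).

    files: mapping of path -> bytes collected from office machines.
    already_sent: set of SHA-256 digests the central site already holds.
    Returns (payload, manifest): payload maps digest -> compressed bytes,
    manifest maps every path to its digest so files can be rebuilt later.
    """
    payload, manifest = {}, {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        manifest[path] = digest
        if digest in already_sent or digest in payload:
            continue  # identical content is sent only once
        payload[digest] = zlib.compress(data)
    return payload, manifest


office_files = {
    "pc1/report.doc": b"quarterly report " * 500,
    "pc2/report.doc": b"quarterly report " * 500,   # duplicate of pc1's copy
    "laptop/notes.txt": b"site visit notes " * 200,
}
payload, manifest = prepare_for_transfer(office_files, already_sent=set())

raw_bytes = sum(len(d) for d in office_files.values())
sent_bytes = sum(len(c) for c in payload.values())
print(len(payload), "unique items;", raw_bytes, "raw bytes ->", sent_bytes, "bytes to send")
```

The manifest keeps one entry per file even when the content is sent only once, so every file can still be restored at the central site.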

To protect what amounts to more than 1 terabyte of data, the center only has to license Asigra for 600G of data protection, because of the filtering and compression. Miles is pleased so far with the system's performance.

"I was able to completely rebuild a 13G server that had crashed in three hours," he said. "Otherwise, it would have taken a whole day."

Disk-based backup is not yet as cheap as tape-based backup, but its price continues to drop. And because of tape's portability, disk-based backup won't completely eliminate the need for tape until agencies find another way to get their backed-up data off-site.

However, with a growing number of disk-based backup options, agencies can likely find one that can give them the performance they need at a price that fits their budgets.

Radding is a freelance journalist based in Newton, Mass. He can be reached at alan@radding.net.

Two types of disk-based backup

Disk-based backup products typically follow one of two approaches.

  • Disk as tape: The disk system emulates a tape library, which eliminates any need to change existing applications, backup software or backup processes.
  • Disk as disk: This approach takes advantage of the capabilities of disk, including random access, in which data can be retrieved in nonsequential order, and concurrent read/write, in which the disk can simultaneously handle reading and writing.

— Alan Radding

Sizing up your backup needs

Your agency's operational requirements and a backup system's capacity determine two important metrics. Assigning a value to them will help you select the right backup solution.

Recovery time objective: A measure of how long it takes to recover backed-up data, including finding, loading and restoring the data.

Recovery point objective: A measure of how up-to-date the recovered data will be.

— Alan Radding
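As a rough illustration of how the two metrics from the sidebar can be used to screen candidates, the Python sketch below compares a few backup options against a recovery time objective and a recovery point objective. The option names and all the hour figures are made-up examples, not measurements of any product, and treating the backup interval as the worst-case age of recovered data is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class BackupOption:
    name: str
    restore_hours: float          # estimated time to find, load and restore data
    backup_interval_hours: float  # time between backups, taken as worst-case data age


def meets_objectives(option, rto_hours, rpo_hours):
    """True if the option restores fast enough and its data is fresh enough."""
    return (option.restore_hours <= rto_hours
            and option.backup_interval_hours <= rpo_hours)


# Illustrative numbers only.
options = [
    BackupOption("nightly tape", restore_hours=8.0, backup_interval_hours=24.0),
    BackupOption("nightly disk", restore_hours=0.5, backup_interval_hours=24.0),
    BackupOption("hourly disk snapshots", restore_hours=0.5, backup_interval_hours=1.0),
]

suitable = [o.name for o in options
            if meets_objectives(o, rto_hours=1.0, rpo_hours=4.0)]
print(suitable)  # ['hourly disk snapshots']
```

In this example, nightly disk backup meets the recovery time target but misses the recovery point target, which shows why the two values need to be set separately.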
