Tape: Still reeling them in

Innovations up usefulness, cost-effectiveness

To hear some disk array manufacturers tell it, tape-based data storage is dead, a victim of shrinking backup windows and the need for quicker recovery from system crashes. But leaving aside self-serving proclamations from companies that want to sell more disk arrays, a look into most government information technology operations makes it clear that tape remains a viable and important component of data protection and management strategies.

In fact, new technologies such as tape virtualization are helping make tape more useful and cost-effective than ever, while others, such as serverless backup, hold promise but still must be proven in the field.

Tape virtualization actually refers to two distinct technologies. Systems from IBM Corp., Storage Technology Corp. (StorageTek), Neartek Inc. and others provide virtual tape in the form of a disk cache (a fast process) that temporarily holds data until it is written to tape media (a slower process). Vendors say this approach optimizes the speed of disk transfers to tape and the physical use of the tape media itself. Those claims are supported by the experience of at least one government IT department (see Case Studies).
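The disk-cache approach described above can be pictured with a short sketch. It is a toy model, not any vendor's implementation: the class name, the flush threshold and the block labels are all hypothetical. It only shows the general idea of accepting writes on fast disk and streaming them to tape in larger sequential batches.

```python
from collections import deque


class VirtualTapeCache:
    """Toy model of a disk cache that fronts a slower tape drive (hypothetical)."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # staged blocks that trigger a tape write
        self.staged = deque()   # blocks currently held on fast disk
        self.tape = []          # blocks committed to tape, in arrival order
        self.tape_mounts = 0    # number of times the tape drive was engaged

    def write(self, block):
        """Host-facing write: the block lands on disk first."""
        self.staged.append(block)
        if len(self.staged) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Stream everything staged to tape in one sequential pass."""
        if not self.staged:
            return
        self.tape_mounts += 1
        while self.staged:
            self.tape.append(self.staged.popleft())


cache = VirtualTapeCache(flush_threshold=4)
for i in range(10):
    cache.write(f"block-{i}")
cache.flush()  # drain whatever is left at the end of the backup job
```

In this sketch the 10 host writes engage the tape drive three times rather than 10, which is the kind of saving in drive activity that vendors point to.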

Those systems are not to be confused with tape-virtualization strategies such as tape pooling. In those strategies, multiple servers, or hosts, share multiple tape devices, which are represented to the servers as a set of logical devices, each one accessible to store a server's data.

StorageTek provides one example of how this pool is created and managed through an intermediary device, the StorageNet 6000 storage domain manager. The SN6000 provides a single point of cabling for multiple tape devices, which it represents to attached hosts as logical tape devices, regardless of the device type or manufacturer.

The hosts in turn connect to the SN6000 and use the attached tape devices through a single software driver, eliminating the need to store multiple drivers for different tape drives or libraries on each host. The SN6000 provides all of the necessary translations and scheduling to ensure that the libraries are used efficiently.

On the flip side, companies such as Veritas Software Corp. offer tape-pooling solutions that use individually configured, host-based software drivers. Installed on every host using the tape pool, those solutions have the advantage of being integrated with Veritas' backup software product. They do not require an additional hardware component, as the StorageTek solution does.

A drawback of the software-based solution, however, is the often cumbersome manner in which tape-pool scheduling must be managed separately on each server. Scheduling is required so that two servers don't try to write data simultaneously to the same tape drive.

Scott Carson, vice president of storage research at OTG Software Inc., said his company's MediaStor product solves that problem by providing the ability to configure host-based software from a single console.
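The scheduling requirement described above, that no two servers write to the same tape drive at once, amounts to an exclusive reservation on each drive. The following is a minimal sketch of that idea under stated assumptions: the class, the host names and the drive names are hypothetical, and it does not represent how any product named in this article works.

```python
class TapePoolScheduler:
    """Hypothetical central scheduler that hands out exclusive drive reservations."""

    def __init__(self, drives):
        # Map each drive to the host that currently holds it (None = idle).
        self.owner = {drive: None for drive in drives}

    def reserve(self, host):
        """Return an idle drive reserved for host, or None if every drive is busy."""
        for drive, current in self.owner.items():
            if current is None:
                self.owner[drive] = host
                return drive
        return None

    def release(self, host, drive):
        """Free a drive; only the host that reserved it may release it."""
        if self.owner.get(drive) != host:
            raise ValueError(f"{host} does not hold {drive}")
        self.owner[drive] = None


pool = TapePoolScheduler(["drive-0", "drive-1"])
first = pool.reserve("host-a")    # gets the first idle drive
second = pool.reserve("host-b")   # gets the remaining drive
third = pool.reserve("host-c")    # pool exhausted, so host-c must wait
pool.release("host-a", first)
retry = pool.reserve("host-c")    # succeeds once host-a is finished
```

Keeping the reservation table in one place, rather than in separate per-server schedules, is what makes the conflict check simple in this sketch.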

Tape pooling isn't for every government IT shop, however. For some, such as the U.S. Postal Service Data Center in San Mateo, Calif., the sticking point is security. The center uses a number of StorageTek tape silos as backup repositories for some of the center's Unix-based and Microsoft Corp. Windows NT-based applications, but pooling is not viewed as a good fit for now.

The center hosts more than 200 applications in a unique infrastructure — each application is deployed in its own subnetwork — and firewalls keep one subnetwork separate and secure from all others, according to Glenn David, a senior systems analyst responsible for open systems tape at the USPS facility.

"Security is critically important, not only for data stored on disk, but also for data stored on tape," David said. "For this reason, we've looked at shared storage networks, but we have decided we are not interested in them at present. We have looked at tape virtualization and pooling products like StorageTek's SN6000 that would allow tape devices to be shared by multiple applications, but we are not doing anything with it in the short term at least."

Advanced Digital Information Corp. offers another take on tape virtualization. Echoing the views of David, ADIC Marketing Director Steve Whitner said his customers prefer to back up their data to a dedicated device — not to a pool. To enable multiple hosts to share their library product, ADIC segments "the library into smaller virtual units for exclusive access and use by individual servers," Whitner said.
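The partitioning idea Whitner describes can be illustrated roughly as follows. This is only a sketch under assumptions: the slot counts, server names and class are hypothetical, and nothing here reflects ADIC's actual product design. Each server is assigned a fixed set of library slots, and access to any slot outside that set is refused.

```python
class PartitionedLibrary:
    """Hypothetical tape library carved into exclusive per-server partitions."""

    def __init__(self, total_slots):
        self.free_slots = list(range(total_slots))  # slots not yet assigned
        self.partitions = {}                        # server name -> list of slots

    def create_partition(self, server, slot_count):
        """Assign slot_count unassigned slots to server for its exclusive use."""
        if server in self.partitions:
            raise ValueError(f"{server} already has a partition")
        if slot_count > len(self.free_slots):
            raise ValueError("not enough free slots in the library")
        slots = self.free_slots[:slot_count]
        self.free_slots = self.free_slots[slot_count:]
        self.partitions[server] = slots
        return slots

    def can_access(self, server, slot):
        """A server may touch only the slots inside its own partition."""
        return slot in self.partitions.get(server, [])


library = PartitionedLibrary(total_slots=10)
mail_slots = library.create_partition("mail-server", 4)
db_slots = library.create_partition("db-server", 6)
```

Unlike a shared pool, nothing is allocated on demand here. Each server sees what looks like its own smaller library.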

Serverless Backup

Serverless backup is another new tape-related technology that is attracting a lot of interest, although there are few products available today — and still fewer customers. The vendor community is split on the efficacy of the technology, and many describe burgeoning products as solutions in search of a problem.

The notion of serverless backup — moving data from disk to tape without consuming processing cycles on the server that hosts the backup software — is an ideal.

"The industry is always trying to find ways to move the [millions of instructions per second] associated with a backup application to the least- expensive resource on the network," said Bob Bryar, chief technical architect with the storage services division of Comdisco Inc. "Serverless backup involves finding a way to make the [network] switch or the storage array or tape drive — whichever is cheapest — perform all the heavy lifting in the disk-to-tape transfer." Advocates of the technology say it solves a real problem in high-availability application environments, where you don't want to risk a mission-critical server going down or even redirecting a portion of its processing power just to do a routine backup to tape. Its backers concede that, for now, serverless backup is still a very high-end solution.

"This is really advanced technology, not for mom-and-pop jobs, but for the enterprise," said Mike Adams, product marketing manager for Veritas, whose NetBackup ServerFree Agent is among the first serverless backup solutions on the market. "But over time, with current rates of data growth, there will be a lot of movement to this technology."

A contrary view is offered by Fabrice Helliker, vice president of engineering for BakBone Software Inc. According to Helliker, serverless backup is an imprecise term that encompasses many product architectures, including disk array-based, block-level data snapshots, the off-loading of data movement responsibilities to specialized protocols such as Network Data Management Protocol (NDMP) and a number of proprietary vendor schemes that promote device-specific capabilities, such as the Veritas approach. The problem, Helliker said, is that none of those methods is truly serverless.

"Third-party copy, or snapshotting, is the closest to serverless," he said. "The array serves as the server and creates images or snapshots of its data that can be migrated to tape. Very little information about the data is stored so there is no granularity in this approach that enables the recovery of a specific file." Information about data, including how it is formatted and where it is located, is known as metadata.

The NDMP solution primarily applies to network-attached storage (NAS) devices and enables the embedded server on the NAS box to do some of the work in the backup, according to Helliker. However, metadata is still written to the server hosting the backup software, so it's not entirely serverless. That is also true of other solutions that use proprietary, rather than NDMP, data movers, he added.

Helliker noted that improvements in backup products have reduced the load that applications impose on servers. "Getting rid of data-transfer management responsibilities doesn't really free up many server cycles," Helliker said. "The metadata must still be recorded in the backup software database, and that is where the cycles get consumed."

Several vendors said the marketing hype around serverless backup offerings from Veritas and others is creating pressure for this "capability" to be added to their products, even if it is not truly useful.

Toigo is an independent consultant and author specializing in business automation issues. He can be reached via his Web site at www.toigoproductions.com.