Tape: Still reeling them in
To hear some disk array manufacturers tell it, tape-based data storage is
dead, a victim of shrinking backup windows and the need for quicker recovery
from system crashes. But leaving aside self-serving proclamations from companies
that want to sell more disk arrays, a look into most government information
technology operations makes it clear that tape remains a viable and important
component of data protection and management strategies.
In fact, new technologies such as tape virtualization are helping make
tape more useful and cost-effective than ever, while others, such as serverless
backup, hold promise but still must be proven in the field.
Tape virtualization actually refers to two distinctly different technologies.
Systems from IBM Corp., Storage Technology Corp. (StorageTek), Neartek Inc.
and others provide virtual tape in the form of a disk cache (a fast process)
that temporarily holds data until it is written to tape media (a slower
process). Vendors say this approach optimizes the speed of disk transfers
to tape and the physical use of the tape media itself, claims supported
by the experience of at least one government IT department (see Case Studies).
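The disk-cache idea can be illustrated with a toy model. The sketch below is not any vendor's implementation; the class name `VirtualTapeCache`, the `flush_threshold_bytes` parameter and the in-memory "tape" list are all assumptions made for illustration. It shows only the general pattern: small writes land on disk quickly and are then sent to tape in larger, contiguous batches.

```python
from collections import deque


class VirtualTapeCache:
    """Toy model of a disk cache in front of a tape drive (hypothetical)."""

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.staged = deque()   # backup blocks waiting in the disk cache
        self.staged_bytes = 0
        self.tape = []          # each entry stands for one batch streamed to tape

    def write(self, block):
        """Accept a block from a host; this is the fast, disk-side step."""
        self.staged.append(block)
        self.staged_bytes += len(block)
        if self.staged_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self):
        """Write everything staged to tape as one batch; the slower step."""
        if not self.staged:
            return
        self.tape.append(b"".join(self.staged))
        self.staged.clear()
        self.staged_bytes = 0


cache = VirtualTapeCache(flush_threshold_bytes=8)
for block in (b"abc", b"def", b"gh", b"ij"):
    cache.write(block)
cache.flush()  # drain whatever is still staged at the end of the job
```

In this model the four small host writes reach the tape as two batches, which is the effect vendors describe: fewer, larger transfers to the drive and fuller use of the media.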
Those systems are not to be confused with tape-virtualization strategies
such as tape pooling. In those strategies, multiple servers, or hosts, share
multiple tape devices, which are represented to the servers as a set of
logical devices, each one accessible to store a server's data.
StorageTek provides one example of how this pool is created and managed
through an intermediary device, the StorageNet 6000 storage domain manager.
The SN6000 provides a single point of cabling for multiple tape devices,
which it represents to attached hosts as logical tape devices, regardless
of the device type or manufacturer.
The hosts in turn connect to the SN6000 and use the attached tape devices
through a single software driver, eliminating the need to store multiple
drivers for different tape drives or libraries on each host. The SN6000
provides all of the necessary translations and scheduling to ensure that
the libraries are used efficiently.
On the flip side, companies such as Veritas Software Corp. offer tape-pooling
solutions that use individually configured, host-based software
drivers. Installed on every host using the tape pool, those solutions have
the advantage of being integrated with Veritas' backup software product.
They do not require an additional hardware component, as in the StorageTek
solution.
A drawback of the software-based solution, however, is the often cumbersome
manner in which tape-pool scheduling must be managed separately on each
server. Scheduling is required so that two servers don't try to write data
simultaneously to the same tape drive.
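The scheduling requirement amounts to granting each drive to one server at a time. The sketch below is a minimal, hypothetical illustration of that rule, not the logic of any product named here; `TapePoolScheduler`, the host names and the drive names are assumptions.

```python
class TapePoolScheduler:
    """Hypothetical scheduler that gives each tape drive to one host at a time."""

    def __init__(self, drives):
        # Map each drive name to the host currently using it (None means free).
        self.owner = {drive: None for drive in drives}

    def acquire(self, host):
        """Reserve the first free drive for host; return None if all are busy."""
        for drive, owner in self.owner.items():
            if owner is None:
                self.owner[drive] = host
                return drive
        return None

    def release(self, host, drive):
        """Free a drive, but only if the caller is the host that holds it."""
        if self.owner.get(drive) != host:
            raise ValueError(f"{host} does not hold {drive}")
        self.owner[drive] = None


pool = TapePoolScheduler(["drive0", "drive1"])
first = pool.acquire("hostA")    # gets a drive
second = pool.acquire("hostB")   # gets the other drive
third = pool.acquire("hostC")    # pool exhausted, must wait
pool.release("hostA", first)
retry = pool.acquire("hostC")    # succeeds once hostA is done
```

When every host runs its own copy of this bookkeeping, the copies must be kept in agreement, which is the management burden described above. A single point of control holds the table in one place.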
Scott Carson, vice president of storage research at OTG Software Inc.,
said his company's MediaStor product solves that problem by providing the
ability to configure host-based software from a single console.
Tape pooling isn't for every government IT shop, however. For some,
such as the U.S. Postal Service Data Center in San Mateo, Calif., the sticking
point is security. The center uses a number of StorageTek tape silos as
backup repositories for some of the center's Unix-based and Microsoft Corp.
Windows NT-based applications, but pooling is not viewed as a good fit for
now.
The center hosts more than 200 applications in a unique infrastructure: each application is deployed in its own subnetwork, and firewalls keep
one subnetwork separate and secure from all others, according to Glenn David,
a senior systems analyst responsible for open systems tape at the USPS facility.
"Security is critically important, not only for data stored on disk,
but also for data stored on tape," David said. "For this reason, we've looked
at shared storage networks, but we have decided we are not interested in
them at present. We have looked at tape virtualization and pooling products
like StorageTek's SN6000 that would allow tape devices to be shared by multiple
applications, but we are not doing anything with it in the short term at
least."
Advanced Digital Information Corp. offers another take on tape virtualization.
Echoing the views of David, ADIC Marketing Director Steve Whitner said his
customers prefer to back up their data to a dedicated device, not to a
pool. To enable multiple hosts to share their library product, ADIC segments
"the library into smaller virtual units for exclusive access and use by
individual servers," Whitner said.
Serverless Backup
Serverless backup is another new tape-related technology that is attracting
a lot of interest, although there are few products available today and
still fewer customers. The vendor community is split on the efficacy of
the technology, and many describe burgeoning products as solutions in search
of a problem.
The notion of serverless backup, moving data from disk to tape without
consuming processing cycles on the server that hosts the backup software, is an ideal.
"The industry is always trying to find ways to move the [millions of
instructions per second] associated with a backup application to the
least-expensive resource on the network," said Bob Bryar, chief technical architect
with the storage services division of Comdisco Inc. "Serverless backup involves
finding a way to make the [network] switch or the storage array or tape
drive, whichever is cheapest, perform all the heavy lifting in the disk-to-tape
transfer."
Advocates of the technology say it solves a real problem in high-availability
application environments, where you don't want to risk a mission-critical
server going down or even redirecting a portion of its processing power
just to do a routine backup to tape. Its backers concede that, for now,
serverless backup is still a very high-end solution.
"This is really advanced technology, not for mom-and-pop jobs, but for
the enterprise," said Mike Adams, product marketing manager for Veritas,
whose NetBackup ServerFree Agent is among the first serverless backup solutions
on the market. "But over time, with current rates of data growth, there
will be a lot of movement to this technology."
A contrary view is offered by Fabrice Helliker, vice president of engineering
for BakBone Software Inc. According to Helliker, serverless backup is an
imprecise term that encompasses many product architectures, including disk
array-based, block-level data snapshots, the off-loading of data movement
responsibilities to specialized protocols such as Network Data Management
Protocol (NDMP) and a number of proprietary vendor schemes that promote
device-specific capabilities, such as the Veritas approach. The problem,
Helliker said, is that none of those methods is truly serverless.
"Third-party copy, or snapshotting, is the closest to serverless," he
said. "The array serves as the server and creates images or snapshots of
its data that can be migrated to tape. Very little information about the
data is stored so there is no granularity in this approach that enables
the recovery of a specific file." Information about data, including how
it is formatted and where it is located, is known as metadata.
The NDMP solution primarily applies to network-attached storage devices
and enables the embedded server on the NAS box to do some of the work in
the backup, according to Helliker. However, metadata is still written to
the server hosting the backup software, so it's not entirely serverless.
That is also true of other solutions that use proprietary, rather than NDMP,
data movers, he added.
Helliker noted that improvements in backup products have reduced the
load that applications impose on servers. "Getting rid of data-transfer
management responsibilities doesn't really free up many server cycles,"
Helliker said. "The metadata must still be recorded in the backup software
database, and that is where the cycles get consumed."
Several vendors said the marketing hype around serverless backup offerings
from Veritas and others is creating pressure for this "capability" to be
added to their products, even if it is not truly useful.
Toigo is an independent consultant and author specializing in business automation
issues. He can be reached via his Web site at www.toigoproductions.com.