Buying storage in bulk: A new frontier

JPL's storage team delivers economies of scale to co-workers

To call NASA's Jet Propulsion Laboratory's old storage approach "distributed" would be an understatement. As space science projects developed, managers acquired computer gear, including storage. Piecemeal procurements led to a hodgepodge of storage systems. Each technology island had its own cadre of systems administrators. It was difficult, if not impossible, to isolate the cost of servers and storage.

"People were...creating silos of storage all over the lab," said Douglas Hughes, a service engineer with JPL Information Services. "When people did this, they ran into some very typical problems."

Storage procurements tended to emphasize low-cost arrays, not products providing adequate headroom. Storage silos were labor-intensive because JPL lacked sophisticated management software. And with no accurate way to assess storage costs, managers struggled to calculate what they were spending in total.

"They weren't having the experience that they really should with a modern storage system," Hughes said.

JPL Information Services, the organization's IT arm, is trying to change that experience. For the past three years, the group has offered a Storage Service that lets customers plug into a centralized storage infrastructure. The shared service distributes the cost of hardware, management software and labor across customers.

"A small project can't buy the type of infrastructure we have," Hughes said. "It's just taking advantage of economies of scale."

Another advantage of shared storage: Project managers don't have to worry about conducting procurements or maintaining a storage environment. Customers, Hughes said, are simply renting a service.

Arun Taneja, president of storage consulting firm Taneja Group, said the in-house storage service provider model, exemplified by JPL, will become increasingly common. "All large enterprises on the commercial side are moving in that direction right now," he said.

How it works

JPL's service offers storage in tiers. At the high end, Network Appliance's clustered FAS940 and F820 arrays provide primary online Fibre Channel disk storage. For performance that is a small step down, the lab uses Serial Advanced Technology Attachment (ATA) disk storage on NetApp's NearStore R200 and R150 products, "nearline" storage that is a good platform for disk-to-disk data backup.

The installation also includes two robotic tape libraries, IBM's 3584 and 3494 units. The storage devices connect to a storage-area network through two McData switches.

As for storage management software, IBM Tivoli Storage Manager handles the tape side. NetApp's SnapMirror and SnapRestore manage the disk side. Storage automation lets JPL manage the Storage Service with a handful of people. Hughes manages disk storage, while Luke Dahl, also a service engineer at JPL, oversees the tape operation. Hughes and Dahl each have a part-time storage engineer to help maintain the service.

JPL's storage service operates on a fee-for-service basis. "It's really the Costco [Wholesale] model," Hughes said, referring to the warehouse club where shoppers buy in bulk and divvy up their purchases among family and friends.

Customers pay $1,200 a month for each raw terabyte of Fibre Channel storage. Serial ATA storage costs $600 per terabyte. JPL sets aside storage in 1-terabyte blocks. Customers who can't use a full terabyte can take what they need and sublet the remainder, Hughes said. The Storage Service will also bring together customers who cannot fill a 1-terabyte block on their own.
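The pricing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not JPL's billing system: the function names are hypothetical, the $600 Serial ATA figure is assumed to be a monthly rate like the Fibre Channel one, and the pro-rata split between sublet partners is an assumption, since the article does not say how customers divide the cost of a shared block.

```python
import math

# Monthly rates per raw terabyte, in dollars, as quoted above.
# Assumption: the $600 Serial ATA figure is a monthly rate, like the Fibre Channel one.
MONTHLY_RATE_PER_TB = {"fibre_channel": 1200, "serial_ata": 600}

# Assumption: a terabyte is treated as 1,000 gigabytes when a block is split.
GB_PER_TB = 1000


def monthly_charge(tier, terabytes_needed):
    """Return (blocks, dollars per month) when storage is rented in whole 1-terabyte blocks."""
    blocks = math.ceil(terabytes_needed)
    return blocks, blocks * MONTHLY_RATE_PER_TB[tier]


def pro_rata_share(tier, gigabytes_used):
    """Hypothetical split of one block's monthly cost in proportion to gigabytes used."""
    return MONTHLY_RATE_PER_TB[tier] * gigabytes_used / GB_PER_TB


if __name__ == "__main__":
    # A project needing 2.5 terabytes of Fibre Channel storage rents three blocks.
    print(monthly_charge("fibre_channel", 2.5))   # (3, 3600)
    # A customer using 400 gigabytes of a shared Fibre Channel block.
    print(pro_rata_share("fibre_channel", 400))   # 480.0
```

Rounding up to whole blocks is what makes subletting attractive: a customer who needs only part of a terabyte would otherwise pay for the full block.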

The Storage Service also offers backup and recovery. Tape backup is billed on a per-gigabyte, per-month basis. In addition, JPL's service will arrange to house backup tapes off-site at Iron Mountain, a data protection firm.

Mark Weber, who runs NetApp's federal operation, said the Storage Service ranks among the best-run storage facilities he's observed. He said JPL has "a serious pricing model that people can rely on," adding that many organizations struggle to devise a pricing approach.

The service-level standard is straightforward for primary storage: Customers want nonstop access to their data, Hughes said. The backup and recovery service, however, varies more because of different customer requirements, which create different fee levels.

The frequency of backup sessions provides another variable. A customer may want a point-in-time data snapshot every day or every few hours, for example. And then there's the on-site vs. off-site tape option. Backup and recovery policies are the primary customization factor, Hughes explained.
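One way to picture the backup variables described above is as a small policy record per customer. The sketch below is hypothetical: the class, field and function names are invented for illustration, and the per-gigabyte rate is a caller-supplied placeholder because the article does not quote JPL's tape fees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical record of the backup choices a customer makes."""
    snapshot_interval_hours: int  # 24 for a daily snapshot, 4 for one every four hours
    offsite_tapes: bool           # True if backup tapes are also housed off-site

    def snapshots_per_day(self):
        """Number of whole snapshot intervals that fit in a 24-hour day."""
        if self.snapshot_interval_hours <= 0:
            raise ValueError("snapshot interval must be positive")
        return 24 // self.snapshot_interval_hours


def monthly_tape_fee(gigabytes_backed_up, dollars_per_gb_month):
    """Per-gigabyte, per-month tape billing; the rate is a placeholder, not a JPL figure."""
    return gigabytes_backed_up * dollars_per_gb_month
```

A customer wanting a snapshot every six hours with off-site tapes would be `BackupPolicy(6, True)`, which works out to four snapshots a day.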

Shared benefits

The Storage Service offers flexibility in addition to economies of scale. A project that suddenly needs additional storage can rent the space instead of taking the time to conduct a procurement.

A crisis may also bring customers to the JPL Storage Service. In June, a project team lost a storage array and immediately needed 500G, Hughes said. He said the group will continue to rent space in such emergency situations.

But other customers view the service as a strategic move. JPL's Telecom Service, for instance, recently opted to go with the shared service to support the lab's voice-mail storage needs.

The Telecom Service's adoption of Cisco Systems' Unity Unified Messaging for institutionwide voice mail influenced its storage decision. Unity integrates with Microsoft Exchange, but JPL uses Sendmail for e-mail, not Exchange, said Beth Verish, a telecommunications engineer at JPL.

To support Unity, JPL deployed a pair of two-node Exchange server clusters using Exchange 2000 Server and Windows 2000 Advanced Server. Data can't be stored locally in this Exchange environment, so the Telecom Service needed to pursue "off box" storage, Verish said.

In using the existing JPL Storage Service, "we don't have to worry about maintaining a storage system," she said.

Verish cited another benefit: The service's NetApp gear includes Exchange-aware software that provides a snapshot feature.

JPL's goal is to accommodate short-term renters and customers who need long-term shared storage services. "We are trying to provide an infrastructure that people can plug into and use with a lot of freedom," Hughes said.

The Storage Service aims to grow with its customers. To ensure adequate headroom, Hughes said, the storage system carries significant spare capacity. The storage operation uses about 50 terabytes of storage but has about 90 terabytes of extra nearline capacity and about 15 terabytes of extra high-end storage capacity. Only some of this is unused disk; the rest is open space in the system cabinets that can accommodate installation of new disks.
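The headroom figures above reduce to simple arithmetic, sketched here in Python. The function name is hypothetical, and the calculation assumes the 50 terabytes in use is separate from the extra capacity. It also counts empty cabinet slots as capacity, as the article does, even though that space holds no disks yet.

```python
# Approximate figures quoted above, in terabytes.
IN_USE_TB = 50
EXTRA_TB_BY_TIER = {"nearline": 90, "high_end": 15}


def headroom_summary(in_use_tb, extra_tb_by_tier):
    """Summarize spare capacity, assuming in-use and extra figures do not overlap."""
    extra_total = sum(extra_tb_by_tier.values())
    ceiling = in_use_tb + extra_total
    return {
        "extra_total_tb": extra_total,
        "ceiling_tb": ceiling,
        "utilization": in_use_tb / ceiling,         # share of the ceiling already in use
        "headroom_ratio": extra_total / in_use_tb,  # spare capacity per terabyte in use
    }


summary = headroom_summary(IN_USE_TB, EXTRA_TB_BY_TIER)
print(f"{summary['extra_total_tb']} TB spare, {summary['utilization']:.0%} of ceiling in use")
```

On these numbers the service has roughly two terabytes of room to grow for every terabyte in use.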

Customers can also reduce personnel costs because the Storage Service spreads its staffing expense across its customer base. They thus avoid the expense of obtaining a systems administrator of their own.

Hughes said a chunk of JPL's storage is still distributed, meaning that the lab's divisions independently buy and maintain storage. However, enough lab customers use the service to make it self-funding.

But the point of the service, Hughes added, is not to make a profit. The aim, he said, is to offer service at a fair and reasonable price and remain a cost-effective option.

"We have to be competitive," he said.

Looking ahead, Hughes said the Storage Service is on track to replace more of the lab's expensive storage silos. "It seems that people are beginning to look to shared services as a way of increasing their capacity and driving down their costs," he said.

Tips for running your own storage service

Douglas Hughes, a service engineer with the Information Services division of NASA's Jet Propulsion Laboratory, offered some advice to organizations seeking to deploy an internal storage service.

Here are a few tips:
  • Stay on top of the update cycle. A storage service's management chores include monitoring disk arrays and determining when they have hit the "tipping point," Hughes said. Disk failures become more common as storage devices age.
  • Know how and when to upgrade. A service needs to evolve the storage infrastructure underneath the clients, Hughes said. "So you bring new [arrays] online and move their data without interfering with operations," he said.
  • Keep automation in mind. Hughes said he advises future storage services to design systems as low-labor operations. As JPL adds components to its storage lineup, it should consider whether the addition will affect operations and maintenance. JPL has deployed storage management software to reduce the labor requirement. And when selecting management software, Hughes said, JPL looks for maturity. He said unproven, cutting-edge products are "just too risky for us."
  • Recruit vendors to help. "We always look at the degree of direct vendor support we get," Hughes said. He works with IBM, Insight and Network Appliance, which have a presence in the lab. "We see them face-to-face in design meetings," Hughes said.

— John Moore

**********

A condo for storage

The Storage Service of NASA's Jet Propulsion Laboratory isn't just for renters.

The lab's Electronic Library Service uses the Storage Service's high-end clustered storage, nearline storage and tape backup. But the library service has also purchased its own storage "condo" that the Storage Service maintains on the library service's behalf, said Ann Bernath, a JPL service engineer.

The analogy is to a condo that someone can buy even though it is nestled in an apartment building filled with renters.

In JPL's case, the condo consists of some disk drives, about 2 terabytes' worth, which are installed in the Storage Service's Network Appliance FAS940C array. The library service bought the drives and pays the Storage Service for maintenance.

The library service is now trying to move its data off the cluster, where it rents its space, to its own condo disks. As its storage demands grow, the service "can keep adding onto the FAS940C as we need space," Bernath said. Library service officials said they can do better by owning their own disks and paying the Storage Service to maintain them and provide backup services.

"We have enough on our plate maintaining and operating our own service without the additional overhead of managing large storage devices," Bernath said.

— John Moore