A cost of consolidation?

Despite the push to consolidate data centers in Washington, D.C., and at state capitols nationwide, some would-be centralizers might want to recalculate the risks involved after the recent meltdown of state services in Virginia.

The failure of a single storage system Aug. 25 at a data center near Richmond took down 485 of the state’s 4,800 data servers, knocking out services at three state agencies for more than a week and affecting operations at two dozen others.

State Chief Information Officer Sam Nixon told the Washington Post that the crash was caused by the dual failure of a pair of redundant, 3-year-old memory cards, one of which was supposed to back up the other.

"The thing that is never supposed to happen happened," Nixon said.

Virginia officials said EMC, the company that designed and supplied the storage system, told them that this type of failure had never occurred before in 1 billion hours of system use.

The debacle turns a spotlight on the continuity-of-operations risks associated with consolidating multiple IT operations on fewer, larger data processing and storage systems, as Virginia has done through a $2.4 billion contract it awarded to Northrop Grumman in 2003 and renegotiated this past spring.

The incident also calls into question the design and reliability of supposedly fault-tolerant systems when real-world problems trigger them into action. Information Age reported that a situation similar to the Virginia outage occurred earlier this year when e-mail hosting provider Intermedia lost service to many of its customers after a problem on its EMC storage-area network.

Intermedia officials said a backup storage controller took over when the primary one failed because of a system bug, but the backup device had insufficient capacity to shoulder the entire workload. The company said it has taken corrective action to ensure that there is enough spare capacity on the storage-area network to continue operation in case of future failures.
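The capacity shortfall Intermedia described comes down to a simple check: after any one controller fails, can the remaining ones still carry the full workload? The following is a minimal sketch of that check, not a description of Intermedia's or EMC's actual tooling. The function name, the units and the sample figures are all hypothetical.

```python
def survives_single_failure(capacities, load):
    """Return True if the controllers that remain after any single
    failure can still carry `load`.

    `capacities` is a list of per-controller capacities and `load` is
    the total workload, both in the same (hypothetical) unit.
    """
    if len(capacities) < 2:
        # With fewer than two controllers there is nothing to fail over to.
        return False
    # The worst case is losing the largest controller.
    remaining = sum(capacities) - max(capacities)
    return remaining >= load


# Hypothetical figures: a backup sized below the workload cannot take over.
undersized = survives_single_failure([100, 60], load=80)
# A backup sized to match the primary can.
matched = survives_single_failure([100, 100], load=80)
```

Running a check like this against planned growth in workload, not just current usage, is one way to catch a backup that has quietly become too small.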

Virginia CIO Nixon told local TV station NBC12 that he remains committed to centralized IT services despite the recent snafu. But the news will likely have government IT officials elsewhere taking a second look at their system designs and procedures to make sure they can continue operations in case of unexpected problems.

Reader comments

Thu, Sep 23, 2010 Allen

The story is a bit more interesting, as four days of data were lost. Lesson 1 - expect the unexpected, be it the Titanic or computer equipment. Lesson 2 - when things break, ensure all the pieces can be put back together. Here an official source is needed for the Va. incident. Lesson 3 - use your backup system from time to time. Our auditor was amazed, yes amazed, that we use last night's backup to refresh the test system every day. Hence we know, know - not think, the restore works. May all this give voice to good people crying "test and train now." If money is an issue, they lack understanding of the problem, IMO. Kind regards.
