Jerry Lohfink and Gil Hawk on continuity-of-operations planning
National Finance Center Director Jerry Lohfink and NFC CIO Gil Hawk offered their experience and lessons learned in an online forum on government continuity-of-operations planning on Nov. 30.
Moderator:
Welcome to everyone and thank you for participating in GCN's second online forum on continuity-of-operations planning in government. Jerry Lohfink, director of USDA's National Finance Center, and Gil Hawk, NFC CIO, both of whom are usually located in New Orleans, have joined us to answer your questions about how they not only restored but expanded payroll operations in the wake of Katrina. Let's get started.
Bill- Reston, VA:
What was the biggest challenge in pulling off the plan? What part was most difficult when executing versus preparing for the continuity of operations?
Jerry Lohfink:
The biggest challenge was communicating with employees and key stakeholders when the communication infrastructure was basically inoperable. The next challenge dealt with accommodating the needs of employees and their dependents at remote sites. We were very fortunate in that our employees had 800 numbers and alternate contact information for us that facilitated a rapid deployment. And our staff did such a great job in concentrating on the mission in spite of all the personal issues they were facing.
Patrick, Washington, DC:
Could you comment on your experience(s) working with external organizations (i.e. other than your agency) during your continuity planning and also during impacting events? How important and how easy has it been for you to involve these external organizations (banks, vendors, telcos, public health and safety, etc.) and communicate with them during an impacting event?
Thank you.
Jerry Lohfink:
We were very blessed in that our customer agencies worked extremely well with us and were quite understanding of the limitations inherent in those first days of recovery. Our plan contained the priority of service recovery, so there were very few surprises. We also had great support from our contract partners. They rose to the challenges and helped at every turn. In addition, the outpouring of support to our employees and their families in terms of clothing, food, and other necessities was a real morale boost that paid great dividends.
What would you do differently?
Gil Hawk:
We needed better communications during the early stages of recovery and restoral operations. I would have had cell phones from a different area code than the one affected by the disaster and also established internet service provider accounts for e-mail connectivity. Finally, we may need to do some work on staging of the staff for deployment.
Bill- Reston, VA:
What would you do differently based on your past experience? Have you shared your story with many of your peers?
Jerry Lohfink:
We have developed and are working with the Administration's effort on best practices and lessons learned. We are also preparing our own story to offer to interested parties after the first of the year.
Jim, Cary, NC:
What changes do you anticipate making to your existing DR plan, if any, based on the recent hurricane?
Jerry Lohfink:
Obtaining contact numbers located outside the greater New Orleans area to use when locating employees.
JM Drake New Orleans LA:
How can we provide better reliability/redundancy in the commercial phone systems used to provide critical data links in Wide Area Networks?
Gil Hawk:
Good question. We experienced many issues with what we thought was a fault-tolerant telecommunications system. I think we need to work with the telecommunications companies on alternate routing and redundancy.
Vu, Washington DC:
Congrats to you and the NFC staffs for getting the job done under such limited resources.
What is the 1 major "lesson learned" from the experience that contributed to the successful COOP/CP implementation? Also, what is the 1 significant change you would like to improve to prepare NFC?
Thanks in advance.
Thanks, the largest lesson learned deals with acquiring emergency contact numbers located out of the greater New Orleans area for our employees. Too many of us had contact numbers with relatives and friends, in addition to our cell phones, that were also impacted by the storm and were inoperable.
Bob, Washington DC:
Would you continue to recommend a hot site approach to COOP, or are you considering parallel/dispersed operations in the future?
Jerry Lohfink:
I believe the answer rests in the business you operate. If you are an essential services organization with a fairly significant risk exposure, I would recommend an internal hot site where you can control your own destiny. However, for non-essential services, a subscription service is a viable alternative. In terms of dispersed operations, we have been exploring that option for a couple of years and will continue to do so. There are a number of issues that you face with that. We will achieve parallel data center operations within a year. The distributed/parallel work force will take longer.
Mindy Gehrt, Kansas City:
Did you think that the Hot Site Exercises you performed prior to the event prepared you for what happened? If not, what do you think would have helped?
Gil Hawk:
The exercises we did (two each year) proved to be invaluable as we did restoral and recovery operations. We had just completed a drill (practice) two weeks before Katrina. These exercises allow us to actually do a restoral operation, wring out our processes and procedures, and train our personnel. This was key to our success!
Emory Tate - McLean, VA:
Hard to see how out-of-area code cell phones would help in similar circumstances (other than SATCOM phones)...Certainly, once your folks were onsite at Sungard, having a local area code would make them more reachable there by cell, but on the ground at the primary site affected by the disaster, lacking cell service, there'd still be no reachability, right? Thx, E
Gil Hawk:
I was referring to the period after we shut down and dispersed our staff to different locations. We were still relying on cell service in the 504 area code, which had no service. We went and got cell phones for the 202 area code and had much better reliability and service. It would have been better to have these up front.
Jim, Cary, NC:
What did you do for power and how were you able to keep it on/up for the extended amount of time? Also, did you lose any hardware, have replacement hardware available or find physical accommodations a problem? Tell us what we're not asking.
Jerry Lohfink:
Our emergency power plant at NFC stayed in service throughout the storm. It was used by local authorities to power pumps, lights, and other emergency equipment. We then powered the operations of the site recovery teams until the local power company returned online. We lost little in terms of hardware. A couple of fans and one power unit on a server did not come back online. These caused no service disruption. The challenge looking forward is housing for NFC staff. Approximately 542 NFC staff members lost their homes. We are working through FEMA to identify transitional housing opportunities for these folks. What we are not saying is: You must have a plan. You must practice the plan regularly and under alternate scenarios. You must have a limited number of clear goals that folks can associate with and which will drive your decision-making. Your people must understand their role in making the mission happen. You must have understanding leadership that appreciates the importance of immediate decision-making in times of crisis and will support you in making things happen.
Jeffries; Washington, D.C.:
Is there a plan to have the Kansas City Computer Center serve as a hot site for NFC? I worked at USDA almost 10 years ago and that was discussed as an action item to be done "soon"!
Jerry Lohfink:
Our plan is to have KC as the primary site and use a yet-to-be-determined site near New Orleans as the backup/hot site. We adopted these plans some time ago and worked to secure funding last year to make this happen. We are in the implementation process now (it began before Katrina hit).
NFC elected not to take certain archive media with them in the initial deployment to Sungard. Given that decision delayed restoral of all services (e.g., customer access to historical data), would you decide to take all materials as part of a COOP deployment in the future?
Jerry Lohfink:
We made a decision to continue processing the Saturday before Katrina struck in order to complete a timely payroll and disburse key administrative payments. We knew the risk was that we did not have time to back these up and get them offsite. Given the timing of things, I would make the same decision given the same circumstances.
Isabel from Maryland:
Did your disaster recovery provider have all the IT equipment and services you needed or did you have to acquire some of that later?
Jerry Lohfink:
Interesting story which may take me days to completely answer. But, to be brief, no. However, we worked 24x7 to either negotiate, acquire, loan, etc., what was necessary.
Lou, Washington, DC:
Do disaster recovery/continuity of operations plans take into account if IT people have their homes wiped out and how to house them when you move to another work area?
Gil Hawk:
Our plans did not really take into account the loss of housing or the communities where our folks lived. We planned for loss of the facility (work space) but not the homes. This was one area where we had to do real-time planning to accommodate the needs of our employees. Thanks to great support from USDA Headquarters staff, we were able to make this happen.
Emory Tate - McLean, VA:
Certainly having the 202 lines helped - big change from the fast-busies!
Biggest factor in getting our customers rewickered to NFC-North -- and I have to say your folks were absolutely super in helping get this done quickly -- was getting them off SNA and over to IP.
Do you see that approach/architecture as "the" key enabler going forward, or just another item in the "kit bag"?
Just another item in the tool box. One thing I have found in my years in this business is that there is no one answer for all connectivity issues. As we deal with so many customers, we must always be flexible enough to architect a connectivity solution that meets their needs.
Walt, Alexandria, Va.:
Can you please describe how often and how you prepared for this type of disaster?
Did you drill annually or every six months?
Did you work with other agencies?
We did at least desk-top drills each year, several call-tree drills, and two actual DR drills at our subscription site each year. Also, we participated in USDA and Government-wide drills as the occasions arose. We were fairly well prepared for whatever would happen to our facility and Government infrastructure. What we never prepared for was the loss of so many employee homes.
Explain the difficulties faced by NFC when moving to the alternate site, where insufficient space and inadequate equipment were available due to the large influx of systems at the facility in Philadelphia.
Gil Hawk:
While we faced some challenges with equipment availability in Philadelphia, it was expected since that is one of the downsides of using a subscription service. You can't always count on having exactly what you had in your primary data center. However, these were the same challenges we faced on our drills, so the IT staff was well prepared to meet them. The impacts were some slowdown in processing, but you need to expect those when operating in a DR environment.
Mindy Gehrt, Kansas City:
How did you motivate managers to invest resources, both money and people, for Continuity of Operations planning prior to this event? And did you use Strohl's software for any purpose during the event?
Jerry Lohfink:
From my experience, involving the essential folks (managers and employees) in the entire process, helping them understand their importance to the process and that customers were counting on them provide the "self-motivating" tools that focus them on being where they need to be and doing whatever is necessary to be successful. People thrive on accomplishment. People who have a track record of accomplishment take these DR/COOP opportunities as an "ultimate test" of their can-do spirit. If we management folks can supply the infrastructure and tools and just don't get in their way, they will be successful.
We invest a lot of time and effort into DR/COOP. People understand that is part of their regular job, not a sideline or a maybe do this at some point, type of thing. When people understand the importance of what they do and appreciate how much others depend upon them, they motivate themselves to get the job done.
We did use several pieces of this. We did our business impact analysis planning with a Strohl product. Our DR/COOP plans are also in Strohl's products; however, paper copies were used during the event.
How well did the DR vendor work with you?
Gil Hawk:
Just like any private-public relationship, there are things that worked very well and things that were a little bumpy along the way. What is important is that we were able to work through all the issues and make this a success.
Tough question but crucial for those considering contracting an external vendor for replacement equipment when a disaster is declared: Did your DR provider provide everything you contracted for and for which you'd been paying a monthly fee all along?
Gil Hawk:
Our DR provider provided most of what we had contracted for. However, there were some things that were not provided (not much) and we had to work through those. Again, these issues usually pop up during drills, so you learn how to deal with them.
Peggy, New Orleans:
Have you implemented any of your "lessons learned," such as changes to address your communication issues?
Gil Hawk:
We now have cell phones (Blackberries) from a different area code for key staff, which gives voice and data capability. Some of the other lessons learned will wait until we complete our reconstitution. Remember, we are still operating in disaster recovery sites.
Vu - Wash DC:
What can an agency do to ensure the pre-appointed essential personnel will be available when you need them the most? How do you handle situations where personnel are not available or are unable to execute essential functions? Did you have to resort to outsourcing or alternate personnel who normally do not perform these functions?
Jerry Lohfink:
From my experience, involving the essential folks in the entire process, helping them understand their importance to the process and that customers were counting on them provide the "self-motivating" tools that focus them on being where they need to be and doing whatever is necessary to be successful. People thrive on accomplishment. People who have a track record of accomplishment take these DR/COOP opportunities as an "ultimate test" of their can-do spirit. If we management folks can supply the infrastructure and tools and just don't get in their way, they will be successful.
Of course, you always have folks that have circumstances. We had hospitalized folks, people in shelters without transportation, etc. However, their pride and their pre-event coordination with their backups allowed for a very smooth fill-in of the second-tier folks. In addition, you always have heroes who take on their load plus their teammates' load until they can arrive. Folks who don't want to pull their weight are placed in an unpaid status and are dealt with administratively at a later date. As for outsourcing or alternate personnel: no.
Lee, Washington DC:
Based on your experience, what would you do differently if you had to deal with something like this again?Gil Hawk:
Not much. We had great success, so why change now? However, we must remember that each disaster is a new set of circumstances and we must not be lulled into a one-size-fits-all approach.
John, Washington DC:
How much water damage did NFC's building get? Did you have your hardware on raised floors?Gil Hawk:
The NFC building took no damage from flood water. It did sustain minor water damage due to wind and storm (windows, doors, etc.). Our hardware is in fact on raised floor space. No equipment was damaged.
Emory Tate - McLean, VA:
Re the bumpy spots with your DR provider -- did you find you were COOP-ing into the same facility as other organizations with primary facilities taken out by Katrina or other disasters, and did that negatively impact the level of support to you in a significant way?
Gil Hawk:
There was some of that, and there was also some testing going on. It's a big operation, so we expected some issues, which we quickly overcame.
Jerry, you stated earlier: "Our plan is to have KC as the primary site and use a yet to be determined site near New Orleans as the backup/hot site." Does that mean you are considering moving from the current NFC site in New Orleans East?
Jerry Lohfink:
We are looking at all options. The most likely alternative is to move back into the same space. We will also look at other space in the area to see if any of it gives us additional advantages and allows us to reuse the existing space for other purposes. It is the normal due diligence one goes through when examining any decision point.
Our time has expired for today's forum. We'd like to thank Jerry and Gil for sharing their experiences and lessons learned and for all the questions that were submitted. Look for another GCN forum in January.