Strength in numbers

Follow these 5 steps to build an effective and indispensable computer security incident response team

Government technology departments work in a tough neighborhood. Network intruders, unsavory gangs of malware hacks and criminally manipulative social-engineering schemers lurk everywhere, waiting to strike with heartless, ill intent. Sometimes, despite their best efforts, the good guys don’t win.

When that happens, some agencies summon a specialized security unit into action.

That group, the computer incident response team (CIRT) or computer security incident response team (CSIRT), is a rapid deployment force for addressing security breaches. The Federal Information Security Management Act calls for agencies to develop “procedures for detecting, reporting and responding to security incidents.” Response teams can help fulfill that mandate.

Those teams’ purpose is to shut down or contain incidents, minimize organizational disruption and data loss, and avoid damage to reputation. But to realize those potential benefits, agencies must avoid some common team-building pitfalls: insufficiently documented procedures, communication gaps and poor preparation.

Here are some steps that agency executives can take to create a response team from scratch or re-evaluate an existing one to ensure that it remains focused and on track.

Step 1: Compose the team

Technical employees form the nucleus of a response team. Those employees generally come from an organization’s information technology and communications departments. A specialized IT security group or an around-the-clock security operations center, often found in larger organizations, might also supply core response team members.

The core members include analysts who break down a given incident, determine the likely cause and coordinate the response.

“In the ideal scenario, there is a designated CIRT team with individuals who have been trained…to be able to do things like fault isolation,” said Yong-Gon Chon, senior vice president of services at SecureInfo, an information assurance company whose customers include the Air Force, Army and the Homeland Security Department. In fault isolation, security employees identify where an anomaly is occurring and work to minimize its effect on the organization’s infrastructure.

The response team might include platform specialists, said Georgia Killcrece, team leader for CSIRT development at the Software Engineering Institute’s Computer Emergency Readiness Team program. Those specialists address incidents that affect a particular platform, such as an application, operating system or hardware component.

Help-desk employees also contribute to an incident response team, Killcrece added. An organization may set up a help desk dedicated to incident response or draw support from the IT department’s daily help-desk team. In either case, the help desk receives calls and e-mail messages when users report security problems.

Chon said response teams should include employees experienced in protecting a chain of custody. That set of procedures is designed to preserve digital evidence so it can be presented in court.

Some agencies might want to have a forensic investigator on the team, although that depends on their goals for incident response, said Jason Reed, principal consultant at SystemExperts, a network security consulting firm. A team that is only interested in deterring attackers probably doesn’t require forensics expertise. But a team that seeks to catch and prosecute intruders would require those skills.

The response team isn’t limited to the technically skilled, however. Killcrece said incident management functions extend beyond an IT department into legal, human resources and public relations departments.

Members of the team from those departments contribute their specialized expertise. Legal counsel can advise on liability issues, while human resources employees can provide guidance on investigations involving insiders.

In cases that may involve the loss of personal information, an organization’s chief privacy officer can be brought in to support the team, said Bob Post, a vice president at Booz Allen Hamilton. He emphasized the need to bring in people from the business side who can help prioritize incidents.

“The tech guys have different priority schemes,” Post said. “What they go after may be of lower value to the business.”

The response team must also reach out to individual network managers and systems administrators who are in a position to remediate the security issues affecting their zones of responsibility.

“The CIRT folks rarely have control of systems,” said Amit Yoran, chief executive officer at NetWitness and former director of DHS’ National Cyber Security Division. “The CIRT team, itself, can…have a tremendous amount of expertise but can only be successful if it establishes positive working relations with a wide variety of types of expertise across the department or agency.”

Whatever the team’s exact composition, it is imperative for an effective response that the core technical team and the multidisciplinary ad hoc team come together quickly, Post said.

Step 2: Devise a plan

The team must have an incident response plan or policy to anchor its activities.

Many response teams arise in the aftermath of a major incident, but those teams often fail to retrace their steps and document response policies, Killcrece said. Compounding the problem are organizations that rely on the heroism of one or two team members, she added. If the brain trust is away from the office, the CIRT effort may collapse in the absence of a documented plan and contingency planning.

“If a critical member of the team is out of the picture…then all of a sudden the house of cards crumples because there is no solid foundation on which the [CIRT] services and activities are based,” Killcrece said.

By contrast, an incident response plan defines roles and responsibilities for dealing with incidents. It also sets forth procedures for responding to security issues. Chon said agencies generally adopt a high-level response approach designed to address any type of attack. That said, a handful of specific response scenarios may be written into a response plan. The three categories typically highlighted, Chon said, are disruption in service; human threats, including unauthorized access; and malware.
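The structure Chon describes, a general procedure plus a few scenario-specific additions, can be sketched in a few lines of Python. This is only an illustration: the three named categories come from the article, but every step listed and every name used (`GENERIC_STEPS`, `SCENARIO_STEPS`, `response_steps`) is a hypothetical placeholder, not text from any agency's plan.

```python
from enum import Enum


class IncidentCategory(Enum):
    """The three categories Chon highlights, plus a catch-all."""
    SERVICE_DISRUPTION = "disruption in service"
    HUMAN_THREAT = "human threat, including unauthorized access"
    MALWARE = "malware"
    OTHER = "other"


# Hypothetical high-level steps meant to apply to any type of attack.
GENERIC_STEPS = [
    "Notify the response team lead",
    "Contain the affected systems",
    "Document findings for the postmortem",
]

# Hypothetical extra steps for the scenarios written into the plan.
SCENARIO_STEPS = {
    IncidentCategory.SERVICE_DISRUPTION: ["Alert the affected network managers"],
    IncidentCategory.HUMAN_THREAT: ["Preserve the chain of custody for evidence"],
    IncidentCategory.MALWARE: ["Hand a sample to the malware analyst"],
}


def response_steps(category):
    """Return the generic steps followed by any scenario-specific ones."""
    return GENERIC_STEPS + SCENARIO_STEPS.get(category, [])


if __name__ == "__main__":
    for step in response_steps(IncidentCategory.MALWARE):
        print(step)
```

An incident that fits none of the written scenarios still gets the generic steps, which mirrors the high-level approach designed to address any type of attack.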

The plan is the institutional memory that agencies can refer to in times of trouble. “Organizations typically don’t have as mature a plan as we would like to see in place,” said Dick Mackey, vice president of consulting at SystemExperts. “The situation where they have this institutional memory has not cropped up very many times.”

Agencies, however, can get a jump on creating a plan. A number of experts pointed to the National Institute of Standards and Technology’s Special Publication 800-61. It provides a template to help start from scratch.

“The nice thing about using [SP 800-61] as a framework is that it lets organizations tailor their incident response to their own business context,” Reed said.

That document, “The Computer Security Incident Handling Guide,” includes a section on creating incident response policies and procedures and describes different models for building a response team. In addition, the publication offers guidance on how to handle various incidents.

Other information sources include the CERT, which offers online guides for building and measuring the effectiveness of response plans.

Once a plan is in place, the team should create and maintain a call list with contact information for team members, stakeholders within the organization, external entities such as law enforcement agencies, and contractors.
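A call list of that kind can live in something as simple as a spreadsheet, but a short Python sketch shows the shape of the data. The four groups follow the ones named above; the `Contact` fields, the `build_call_list` helper and the sample entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str
    group: str  # "team", "stakeholder", "external" or "contractor"
    phone: str


def build_call_list(contacts):
    """Group contacts by category so responders can find the right person quickly."""
    call_list = defaultdict(list)
    for contact in contacts:
        call_list[contact.group].append(contact)
    return dict(call_list)


if __name__ == "__main__":
    # Placeholder entries only; a real list would hold actual names and numbers.
    sample = [
        Contact("Lead analyst", "team", "555-0100"),
        Contact("Legal counsel", "stakeholder", "555-0101"),
        Contact("Law enforcement liaison", "external", "555-0102"),
        Contact("Retainer hotline", "contractor", "555-0103"),
    ]
    for group, members in build_call_list(sample).items():
        print(group, [m.name for m in members])
```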

Killcrece said communication, both among team members and across the organization, isn’t always as good as it could be, even though a team’s efficiency depends on it.

Step 3: Consider contractor support

Even large agencies might not have all the employees they need to staff a response team. IT security contractors offer a range of services, and government entities have the option of outsourcing some or all of the response team’s functions.

Yoran said public-sector teams range from those staffed entirely by government employees to those composed mostly of contract employees. He said he is wary of the latter option.

“I would be very cautious of a complete outsourcing,” Yoran said. “The security of the IT system today translates directly to the security of data and the security of the business of your agencies or department. I don’t think you can outsource that in its entirety. You need to have government folks intimately involved in the process and decision-making of how incidents are…responded to.”

Michael Brown, director of the Federal Aviation Administration’s Office of Information Systems Security, said his organization runs a composite team that’s divided about half and half between federal workers and contract employees.

Government employees manage the team, while contractors tend to focus on analytical work, Brown said. The analytical skill sets include programming security devices and interpreting the security event data they generate.

Bret Padres, director of the incident response program at security service firm Mandiant, said malware analysis is one response team role that organizations often lack and may choose to outsource. The task of reverse engineering malware to determine exactly what it does has become more critical for resolving incidents, he said.

If a team decides to fill some team roles with contract employees, it should solidify those relationships early on rather than scramble to find contractors after disaster strikes. Mandiant offers its services on a retainer basis. Those contracts obligate the firm to dispatch an on-site response team within 48 hours and provide a two-hour response via the company’s hotline.
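An agency that signs such a retainer may want to check afterward whether the response windows were met. The Python sketch below uses the two figures cited above, 48 hours and two hours. It assumes both clocks start when the incident is reported, which the article does not specify, and the function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Windows taken from the retainer terms described in the article.
HOTLINE_RESPONSE_LIMIT = timedelta(hours=2)
ONSITE_ARRIVAL_LIMIT = timedelta(hours=48)


def retainer_terms_met(reported_at, hotline_answered_at, onsite_arrived_at):
    """Check each window, assuming both are measured from the report time."""
    return {
        "hotline": hotline_answered_at - reported_at <= HOTLINE_RESPONSE_LIMIT,
        "onsite": onsite_arrived_at - reported_at <= ONSITE_ARRIVAL_LIMIT,
    }


if __name__ == "__main__":
    reported = datetime(2024, 1, 1, 9, 0)  # illustrative timestamp
    print(retainer_terms_met(
        reported,
        hotline_answered_at=reported + timedelta(hours=1),
        onsite_arrived_at=reported + timedelta(hours=50),
    ))
```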

Step 4: Test the plan

Security consultants report that many organizations don’t put their response teams and plans through their paces.

“Drill, drill, drill,” Yoran said. “You can’t just passively monitor and expect to perform well in a time of crisis.”

Tabletop exercises provide one testing option. In a tabletop exercise, the response team is given an incident scenario, and the members talk through the response plan, step by step. That approach tests the feasibility of the procedures and can be effective in identifying communication gaps, Post said.

Reed said he can attest to the latter point. He participated in a tabletop test for a customer who had a firewall monitoring contract with a managed security services provider. The exercise started, and the contractor’s representative described the company’s role: calling the customer to notify him of an incident. That statement was met with silence, Reed said. The customer assumed the contractor was on the hook for managing incident response, not just reporting trouble.

“It was a huge disconnect in expectations,” Reed said.

Another type of test involves team members performing their roles, instead of discussing them. In this case, the team reacts to a simulated incident, such as malware entering a network or a denial-of-service attack. Such exercises might use a virtual infrastructure that mirrors what the organization has in its production environment. Alternatively, the production network may be used as the test site during off-hours.

Simulations generally involve more employees, time and expense than tabletop tests. For that reason, the tabletop exercise is the most commonly used test method.

“Most exercise needs are satisfied using desktop or paper,” said John Woodman, principal consultant at Keane Federal Systems. “This approach offers our customers the added benefit of minimizing impact on already strained budgets and other mission priorities.”

Simulated attacks conducted in production environments are generally restricted to critical, high-threat systems that are constantly monitored, he added.

How often should tests take place? Some security experts say tests should be an annual occurrence. Others say tests should happen twice a year and more often in agencies that are subject to frequent attacks.

Step 5: Update the plan regularly

The never-updated plan is a malady related to the untested plan. Chon said one of the biggest mistakes organizations make is to put together a team, sign off on a plan and fail to maintain it. A stale plan can hinder response when an incident occurs.

“When someone pulls out the plan, half the names of the resources are no longer applicable,” Chon said.

Roles, responsibilities and contact information must be constantly updated to reflect personnel changes, consultants said. In addition, teams should conduct a postmortem after tests or incidents. Teams can use those sessions to sort which procedures worked and which could use improvement, Killcrece said.

The lessons learned may be integrated into a more holistic security program, Post said.
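One way to keep the stale-names problem Chon describes from building up is to record when each entry in the plan was last confirmed and flag the old ones. The Python sketch below is one possible approach. The 90-day threshold is an arbitrary assumption, not a figure from any of the consultants quoted, and the function name and sample data are hypothetical.

```python
from datetime import date, timedelta


def stale_entries(last_verified, today, max_age_days=90):
    """Return names whose contact details were last confirmed before the cutoff.

    last_verified maps a name to the date its entry was last checked.
    max_age_days is an assumed review interval; pick one that suits the agency.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, checked in last_verified.items() if checked < cutoff)


if __name__ == "__main__":
    # Placeholder roles and dates for illustration only.
    roster = {
        "Lead analyst": date(2024, 5, 1),
        "Legal counsel": date(2023, 11, 15),
        "Help-desk supervisor": date(2024, 1, 10),
    }
    print(stale_entries(roster, today=date(2024, 6, 1)))
```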
