USGS meets spikes in demand for earthquake info with content network

Content delivery network helps the U.S. Geological Survey keep its sites available even when traffic surges

The magnitude 9.0 Tohoku earthquake that struck March 11 near the northeast coast of Honshu, Japan, generated 6,000 hits per second on the U.S. Geological Survey’s earthquake-related websites, which are delivered via a content delivery network (CDN) service.

The peak Web traffic record achieved by USGS still stands at 52,000 hits per second, set after the 7.2 magnitude earthquake that occurred on Easter Sunday 2010. USGS now collects additional site-traffic metrics, which show that the earthquake sites averaged 400,000 visits and 1.5 million page views per day during the month surrounding the latest Honshu earthquake. The peak following the Japan earthquake and tsunami was 900,000 visits and 2.5 million page views in a single day.

According to Lorna Schmidt, program manager for the USGS Enterprise Web Program, earthquakes that occur in the United States, particularly in densely populated areas, produce significantly larger spikes on USGS websites than those that occur outside the country. Such massive spikes in data and information requests, generated by flash crowds of people who felt the earthquake, generally occur very soon after the event. “Such spikes, or peaks, generally have a precipitous drop off to a more smooth, elevated request rate for some period surrounding the event. This may be due to news coverage, family interest or other means of distributing information in the event,” she said.

USGS relies on a Level 3-supplied CDN to manage the flash crowds of earthquake data requests; the CDN provides cached access to data from globally distributed systems. Access to source data is critical to USGS’ primary mission of providing updated, refreshed content to citizens as well as to internal programs and processes. The CDN also lets USGS get the job done with infrastructure that is leased through a CDN contract rather than owned, which reduces costs.
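
To make the caching idea concrete, the sketch below shows a minimal TTL-based edge cache sitting in front of an origin server. The origin URL, the 60-second lifetime and the fetch function are illustrative assumptions rather than details of USGS’ or Level 3’s actual configuration.

    # Minimal sketch of TTL-based edge caching, as a CDN node might do in
    # front of an origin server. The URL, TTL and fetch routine are assumed
    # for illustration, not taken from USGS' or Level 3's real setup.
    import time

    ORIGIN_URL = "https://earthquake.usgs.gov/feed/latest.json"  # hypothetical path
    TTL_SECONDS = 60   # assumed cache lifetime for frequently refreshed content

    _cache = {}        # url -> (fetched_at, body)

    def fetch_from_origin(url):
        """Stand-in for a real HTTP request to the origin data center."""
        return "<content of %s fetched at %.0f>" % (url, time.time())

    def edge_get(url):
        """Serve from the edge cache while fresh; otherwise refresh from the origin.

        During a flash crowd most requests hit the cached copy, so only one
        origin fetch per TTL window reaches the source systems.
        """
        now = time.time()
        entry = _cache.get(url)
        if entry is not None and now - entry[0] < TTL_SECONDS:
            return entry[1]                 # cache hit: no load on the origin
        body = fetch_from_origin(url)       # cache miss or stale entry: refresh
        _cache[url] = (now, body)
        return body

    if __name__ == "__main__":
        for _ in range(3):
            print(edge_get(ORIGIN_URL))     # only the first call reaches the origin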

During the magnitude 9.0 Japan earthquake and tsunami, the Level 3 CDN performed bursting, replicating requested data across the global network and serving progressively more bandwidth to users until peak demand was satisfied. The CDN then eased off replication as demand dropped back to more typical levels. USGS pays service fees for both DNS and bursting.
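
The bursting behavior can be pictured roughly as a replica count that tracks the request rate upward during a spike and falls back afterward. The per-replica capacity and scaling rule in this sketch are invented for illustration; the article does not describe Level 3’s actual algorithm.

    # Rough sketch of "bursting": add edge replicas while the request rate
    # climbs, then ease off as demand returns to normal. All figures here are
    # illustrative assumptions, not Level 3's real parameters.
    REQUESTS_PER_REPLICA = 5_000   # assumed capacity of one edge replica, hits/sec
    MIN_REPLICAS = 2               # assumed baseline footprint outside of events

    def replicas_needed(hits_per_second):
        """Scale the replica count with demand, never dropping below the baseline."""
        needed = -(-hits_per_second // REQUESTS_PER_REPLICA)   # ceiling division
        return max(MIN_REPLICAS, needed)

    if __name__ == "__main__":
        # Request rates loosely echoing the article: a quiet baseline, the
        # 6,000 hits/sec Tohoku peak, then a gradual return to normal.
        for hps in (500, 2_000, 6_000, 3_000, 800, 500):
            print("%6d hits/sec -> %d replicas" % (hps, replicas_needed(hps)))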

Data coming into USGS, such as earthquake sensor and streamflow data, travels multiple paths from thousands of stations to redundant systems to ensure delivery to the USGS emergency notification system and to the public. USGS Web content hosted within the NatWeb infrastructure, such as local science centers and many National USGS Program sites, is managed in a cloud of file and Web servers at three data centers across the United States. Each center’s servers are configured identically for backup purposes, with data replicated among all servers.  
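
The read-side benefit of that replication can be sketched simply: a request falls over to the next identical copy when one data center is unreachable. The data-center names and simulated failures below are hypothetical, not NatWeb’s real topology.

    # Simplified sketch of reading from identically configured, replicated
    # data centers: try each mirror in turn until one answers. Names and the
    # failure simulation are hypothetical.
    import random

    DATA_CENTERS = ["dc-east", "dc-central", "dc-west"]   # assumed three U.S. sites

    def fetch(datacenter, path):
        """Stand-in for an HTTP request to one data center; randomly 'fails'."""
        if random.random() < 0.3:                          # simulated outage or timeout
            raise ConnectionError(datacenter + " unreachable")
        return path + " served by " + datacenter

    def resilient_get(path):
        """Return the content from the first reachable replica."""
        for dc in DATA_CENTERS:
            try:
                return fetch(dc, path)
            except ConnectionError:
                continue                                   # identical copy exists elsewhere
        raise RuntimeError("all replicas unavailable")

    if __name__ == "__main__":
        print(resilient_get("/science-center/index.html"))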

The current environment supports the agency’s continuity-of-operations plan and allows USGS to continue to provide access to natural resources data, no matter how great the demand. A CDN with a worldwide presence reduces geographic latency, the delay introduced by the distance between the requester and the server, USGS officials explained.
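
One way to picture that latency benefit is routing each request to whichever edge node answers fastest rather than to a single distant origin. The node names and probe timings in this sketch are made up for illustration.

    # Illustrative sketch of why a worldwide CDN reduces geographic latency:
    # serve from the edge node with the lowest measured round-trip time.
    # Node names and timings are hypothetical.
    import random

    EDGE_NODES = ["tokyo", "frankfurt", "virginia", "sao-paulo"]   # hypothetical points of presence

    def probe_rtt_ms(node):
        """Stand-in for measuring round-trip time to an edge node, in milliseconds."""
        return random.uniform(10, 300)

    def nearest_edge():
        """Pick the edge node with the lowest measured latency."""
        return min(EDGE_NODES, key=probe_rtt_ms)

    if __name__ == "__main__":
        print("serving request from", nearest_edge())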

Lessons learned 

Over the years, USGS has learned that testing the system is important. In addition, monitoring websites before, during and after an event can help improve resiliency. Organizations must build in and account for redundancy at every phase of the information life cycle. USGS officials recommend designing any Web infrastructure with performance demands and outages in mind and planning for the necessary redundancy up front. The current server infrastructure, along with the use of a CDN, allows USGS to gather data and keep it available even when demand peaks during emergencies.
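
In that spirit, the sketch below shows the simplest form of the monitoring described above: poll a list of sites on an interval and log status and response time. The URLs, interval and slow-response threshold are illustrative, not USGS’ actual monitoring configuration.

    # Minimal monitoring sketch: periodically check a list of sites and record
    # status and response time. Sites, interval and threshold are assumptions.
    import time
    import urllib.request

    SITES = [
        "https://www.usgs.gov",
        "https://earthquake.usgs.gov",
    ]
    SLOW_THRESHOLD_S = 2.0    # assumed alert threshold for response time
    CHECK_INTERVAL_S = 60     # assumed polling interval

    def check(url):
        """Request one site and print its status and response time."""
        start = time.time()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                elapsed = time.time() - start
                flag = "SLOW" if elapsed > SLOW_THRESHOLD_S else "OK"
                print("%-4s %s %5.2fs %s" % (flag, resp.status, elapsed, url))
        except Exception as exc:                    # outage, DNS failure, timeout
            print("DOWN %s: %s" % (url, exc))

    if __name__ == "__main__":
        while True:
            for site in SITES:
                check(site)
            time.sleep(CHECK_INTERVAL_S)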

As part of life cycle planning, the Enterprise Web Program and USGS will continue to evaluate strategies for delivery of USGS content and information, taking into account a fluctuating budget, security, performance and technology requirements. 

A little background

At the U.S. Geological Survey, the mission is science, including the collection, analysis and distribution of data and information used to help answer an array of complex questions that span multiple disciplines in the realm of natural sciences.  

USGS has 9,000 employees in 400 locations around the world. USGS maintains data centers in the United States, but its technology reach extends to streamgauges, volcano sites and earthquake reporting stations around the world.

In 2010, USGS reported the following from NetFlow statistics and log file analysis:

  • 3.8 petabytes of data transferred from all ports in and out of USGS.
  • A total of 949 terabytes of Web traffic in and out of USGS, not including content delivered by Level 3 to public earthquake-related websites and www.usgs.gov.
  • 127 terabytes of data delivered by NatWeb, which hosts half of USGS' websites, including www.usgs.gov, but not NWISWeb or the earthquake sites.


About the Author

Barbara DePompa is a freelance writer for 1105 Government Information Group’s Content Solutions unit. This Snapshot report was commissioned by the Content Solutions unit, an independent editorial arm of 1105 Government Information Group. Specific topics are chosen in response to interest from the vendor community; however, sponsors are not guaranteed content contribution or review of content before publication. For more information about 1105 Government Information Group Content Solutions, please email us at GIGCustomMedia@1105govinfo.com.