Rising to the challenge of mapping health data

Brand Niemann offers inspiration and tips for analyzing and integrating the reams of health data available online.

Brand Niemann is senior data scientist at Semanticommunity.net and former senior enterprise architect and data scientist at the Environmental Protection Agency.

To build on my recent series of articles on data science, I decided to make HealthData.gov my latest exploration.

Recently, Sonny Bhagowalia, deputy associate administrator of the Office of Citizen Services and Innovative Technologies at the General Services Administration, wrote in a tweet that Data.gov's move to the cloud would yield more and better mashups of government data. Meanwhile, Todd Park, chief technology officer at the Health and Human Services Department, announced the new Health Indicators Warehouse and welcomed us to HealthData.gov. And Health 2.0 announced two new developer challenges: Healthy People 2020 and Go Viral to Improve Health. In addition, George Thomas, an enterprise architect at HHS, is working on Clinical Quality Linked Data on HealthData.gov to help achieve Linked Open Government Data goals.

So there are five major sites now with health data — HealthyPeople.gov, Health2Challenge.org, HealthIndicators.gov, HealthData.gov and Data.Medicare.gov — that can be integrated (i.e., mashed up). I inventoried the resources and datasets at those five sites in several spreadsheets and looked for opportunities to analyze them individually and collectively. I also entered the Healthy People 2020 and Go Viral to Improve Health challenges. Previously, I had built a health data indicators warehouse in the cloud as part of the Health Data Visualization Challenge of 2010, so this data science project was not completely new to me.

I started with Spotfire’s library of U.S. state and county boundaries because I knew I would be doing interactive maps with the spatial data at those five sites. Then I imported the spreadsheet data and created a separate tab in Spotfire for each major site as follows:

  • HealthyPeople.gov: Inventory to understand contents and apply business intelligence and analytics.
  • Health2Challenge.org: Inventory to understand contents and build on previous work.
  • HealthIndicators.gov: New interface to catalog and data to support business intelligence and analytics.
  • HealthData.gov: New data catalog to expedite discovery and download for business intelligence and analytics.
  • Data.Medicare.gov: Inventory of datasets to expedite discovery and download for hospital selection example.
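The inventory step above can be sketched outside Spotfire as well. The following is a minimal illustration in plain Python, not the workflow actually used for the challenge entries: the site names come from the list above, but the dataset descriptions, the `geography` field and the `group_by` helper are hypothetical placeholders. It groups one combined inventory by site (the equivalent of one tab per site) and then by geographic level to suggest which sites might be mashed up on a shared map.

```python
from collections import defaultdict

# Site names are taken from the article. The dataset descriptions and the
# "geography" field are made-up placeholders, not real catalog entries.
INVENTORY = [
    {"site": "HealthyPeople.gov", "dataset": "placeholder objectives list", "geography": "state"},
    {"site": "Health2Challenge.org", "dataset": "placeholder challenge data", "geography": "state"},
    {"site": "HealthIndicators.gov", "dataset": "placeholder indicator table", "geography": "county"},
    {"site": "HealthData.gov", "dataset": "placeholder catalog entry", "geography": "state"},
    {"site": "Data.Medicare.gov", "dataset": "placeholder hospital table", "geography": "county"},
]


def group_by(rows, key):
    """Group inventory rows by the value of one field (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)


# One group per site, comparable to one tab per site.
per_site = group_by(INVENTORY, "site")

# Sites that share a geographic level are candidates for a joint map.
by_geography = group_by(INVENTORY, "geography")

for geography, rows in sorted(by_geography.items()):
    sites = ", ".join(sorted(row["site"] for row in rows))
    print(f"{geography}: {sites}")
```

In practice the placeholder rows would be replaced by the spreadsheet inventories, and the shared field would be whatever state or county identifier the datasets actually carry.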

The focus of the challenge was to extract the goals and objectives from the state-specific Healthy People 2010 and 2020 plans, map them, and integrate them with the databases above.

All that work is documented on the wiki page and its attachments so others can check it and produce their own integrations. My Healthy People 2020 challenge entry was submitted March 7, and the Go Viral to Improve Health challenge entry is due April 27. The Go Viral entry includes work with more community-level data sources, such as the Pellucid Health Care Transparency tables and the data sources in the book “Visualizing Data Patterns with Micromaps” by Dan Carr of George Mason University and Linda Williams Pickle, formerly with the National Cancer Institute. It also links to the recent work to build VIVO, an open-source Semantic Web application, in the cloud for the National Institutes of Health's Workshop on Value Added Services for VIVO.

I hope this article has piqued your interest in taking the challenge to analyze health databases and has made it easier for you to get started.

