DHS updates on data mining

The Department of Homeland Security uses software tools to extract insights from its vast troves of data. Under federal law, DHS must report annually to Congress on its use of data mining, a requirement meant to allay concerns about possible privacy violations.

The latest report, released publicly April 20, said that "no decisions about individuals are made based solely on data mining results" and that DHS investigators "apply their own judgment and expertise to bear in making determinations about individuals initially identified through data mining activities."

The data mining report, the associated privacy reports and record system notices together provide updates on how DHS is integrating data systems across its component agencies and on its progress toward a centralized "data lake" for investigators.

As of October 2016, DHS had wrangled 17 datasets into the DHS Data Framework. Among them are some of the large travel and immigration databases, such as the I-94 system for foreign visitors, the Electronic System for Travel Authorization and the Passenger Name Record system.

The Framework is divided into two related systems -- a data lake called Neptune and a classified query system called Cerberus, which is used for counterterrorism probes. In 2016, according to the report, DHS tapped Cerberus to "facilitate bulk information sharing with U.S. government partners." In this context, "bulk" refers to data that isn't selected based on specific identifiers or other search terms "reasonably likely to exclude any intelligence or information not relevant to the need giving rise to the recipient's request."

DHS also noted that it was looking to replace an interim solution that lets Framework users run classified queries to identify terror suspects linked to ISIS, al-Qaida and their affiliates, a capability meant to address the risk of "foreign fighters" entering the U.S. According to the report, DHS "defined a set of operational requirements that the Data Framework must meet in order to fully replace the interim process."

A key goal of the Framework was to apply the "One DHS" policy to integrate and manage data across all sources. However, familiar issues of interoperability hamper the integration of systems. One planned feature -- keeping the data in the Framework coordinated with the data in the source systems -- had to be postponed. DHS "discovered that the source IT systems are not always able to accommodate" delete notifications from source systems, "due to a number of constraints, such as resources, legacy systems, and disruptions to operational support."

As a result, according to the report, an update to the Framework's data retention policies will be addressed in a forthcoming privacy assessment.

The report also identified two new data mining systems: the Socrates pilot, administered by Customs and Border Protection, and the Fraud Detection and National Security Data System, under the control of U.S. Citizenship and Immigration Services. The Socrates pilot is being operated in conjunction with the Johns Hopkins University Applied Physics Laboratory and involves analyzing large international trade datasets to identify patterns of tariff avoidance, importation of counterfeit merchandise and other illicit trade activity. The longstanding Fraud Detection and National Security Data System, which tracks fraud in immigration applications, has added analytical capacity.

About the Author

Adam Mazmanian is executive editor of FCW.

Before joining the editing team, Mazmanian was an FCW staff writer covering Congress, government-wide technology policy, health IT and the Department of Veterans Affairs. Prior to joining FCW, Mazmanian was technology correspondent for National Journal and served in a variety of editorial roles at B2B news service SmartBrief. Mazmanian started his career as an arts reporter and critic, and has contributed reviews and articles to the Washington Post, the Washington City Paper, Newsday, Architect magazine, and other publications. He was an editorial assistant and staff writer at the now-defunct New York Press and arts editor at the online network in the 1990s, and was a weekly contributor of music and film reviews to the Washington Times from 2007 to 2014.

Connect with him on Twitter at @thisismaz.