DHS updates on data mining

The Department of Homeland Security uses software tools to extract insights from its vast troves of data. Under federal law, DHS must report annually to Congress on its use of data mining, a requirement meant to allay concerns about possible privacy violations.

The latest report, released publicly April 20, said that "no decisions about individuals are made based solely on data mining results" and that DHS investigators "apply their own judgment and expertise to bear in making determinations about individuals initially identified through data mining activities."

The data mining report, the associated privacy reports and record system notices together provide updates on how DHS is integrating its data systems across all its component agencies and its progress on its strategy to create a centralized "data lake" for investigators.

As of October 2016, DHS had wrangled 17 datasets into the DHS Data Framework. These include some of the large travel and immigration databases, such as the I-94 system for foreign visitors, the Electronic System for Travel Authorization and the Passenger Name Record system.

The Framework is divided into two related systems -- a data lake called Neptune and a classified query system called Cerberus, which is used for counterterrorism probes. In 2016, according to the report, DHS tapped Cerberus to "facilitate bulk information sharing with U.S. government partners." In this context, "bulk" refers to data that isn't selected based on specific identifiers or other search terms "reasonably likely to exclude any intelligence or information not relevant to the need giving rise to the recipient's request."

DHS also noted that it wants to replace an interim solution that lets Framework users run classified queries to identify terror suspects linked to ISIS, al-Qaida and their affiliates, a capability meant to address the risk of "foreign fighters" entering the U.S. According to the report, DHS "defined a set of operational requirements that the Data Framework must meet in order to fully replace the interim process."

A key goal of the Framework was to apply the "One DHS" policy to integrate and manage data across all sources. However, familiar issues of interoperability hamper the integration of systems. One planned feature -- keeping the data in the Framework coordinated with the data in the source systems -- had to be postponed. DHS "discovered that the source IT systems are not always able to accommodate" delete notifications from source systems, "due to a number of constraints, such as resources, legacy systems, and disruptions to operational support."

As a result, the report said, an update to the Framework's data retention policies will be addressed in a forthcoming privacy assessment.

The report also identified two new data mining systems: the Socrates pilot, administered by Customs and Border Protection, and the Fraud Detection and National Security Data System, under the control of U.S. Citizenship and Immigration Services. The Socrates pilot is being operated in conjunction with the Johns Hopkins University Applied Physics Laboratory and involves analyzing large international trade datasets to identify patterns of tariff avoidance, importation of counterfeit merchandise and other illicit trade activity. The longstanding Fraud Detection and National Security Data System, which tracks fraud in immigration applications, has added analytical capacity.

About the Author

Adam Mazmanian is executive editor of FCW.

Before joining the editing team, Mazmanian was an FCW staff writer covering Congress, government-wide technology policy, health IT and the Department of Veterans Affairs. Prior to joining FCW, Mr. Mazmanian was technology correspondent for National Journal and served in a variety of editorial roles at B2B news service SmartBrief. Mazmanian started his career as an arts reporter and critic, and has contributed reviews and articles to the Washington Post, the Washington City Paper, Newsday, Architect magazine, and other publications. He was an editorial assistant and staff writer at the now-defunct New York Press and arts editor at the online network in the 1990s, and was a weekly contributor of music and film reviews to the Washington Times from 2007 to 2014.
