One DHS remains a work in progress

The existing architecture of databases across the department's multiple components is "not conducive to effective implementation," report finds.

Data analytics

The ability of Homeland Security employees to search across the department's many databases under a unified policy is advancing, but the DHS Privacy Office says the capabilities are far from ideal.

The Privacy Office's 2013 data mining report, sent to Congress in late February, said that while DHS is developing its scalable data framework program to support the One DHS policy, the existing architecture of databases "is not conducive to effective implementation."

One DHS, the report said, was implemented to give department personnel timely access to relevant and necessary information that can be locked in the stove-piped data systems common among the multiple agency components that make up DHS.

The report noted that DHS is working to smooth data accessibility, conducting three pilots to test different capabilities needed to implement the framework: the Neptune Pilot, the Common Entity Index Prototype (CEI Prototype), and the Cerberus Pilot.

DHS's job is complicated not only by stove-piped systems, said the report, but also by privacy protections that can vary from agency to agency. "Existing information maintained by the department is subject to privacy, civil rights and civil liberties, and other legal and policy protections, and it is collected under different authorities and for various purposes." Access to data across the agency can be "cumbersome, time-intensive, and requires personnel to log on and query separate databases in order to determine what information DHS systems contain about a particular individual."

The goal of the data framework is to give a user the ability to search an amalgamation of data extracted from multiple DHS systems for a specific purpose and to view the information in a clear and accessible format. The framework, the report said, will enable efficient and cost-effective searches across DHS databases in both classified and unclassified domains.

According to the Privacy Office, the framework uses four elements for controlling data: user attributes, data tagging, context and dynamic access. The trials will address all four elements, using test data from Customs and Border Protection's Electronic System for Travel Authorization, Immigration and Customs Enforcement's Student and Exchange Visitor Information System, and the Transportation Security Administration's Alien Flight Student Program.

The report said the framework will incorporate a user-attribute hub being developed through the DHS Office of the Chief Information Officer. The hub will maintain a listing of a system user's attributes for determining access control, including the component in which the individual works, location and job series.
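As a rough illustration of how the four elements could fit together, the Python sketch below checks a request against a user's attributes, the tags on a record and the stated purpose of the search. It is a minimal sketch under stated assumptions, not the department's implementation: the class names, the tag fields, the purpose strings and the job series value are all hypothetical, and the report as described here does not specify the actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Attributes the report says the hub will list for each system user.
    component: str
    location: str
    job_series: str


@dataclass(frozen=True)
class TaggedRecord:
    # Hypothetical tags attached to a record when it is ingested.
    source_system: str
    allowed_components: frozenset
    allowed_purposes: frozenset


def may_access(user: User, record: TaggedRecord, purpose: str) -> bool:
    """Decide at request time whether the user may see the record.

    Combines user attributes, data tags and context (the stated purpose).
    Evaluating the rule on every request, rather than granting standing
    access, stands in for the "dynamic access" element.
    """
    return (
        user.component in record.allowed_components
        and purpose in record.allowed_purposes
    )


if __name__ == "__main__":
    # All values below are made up for illustration.
    analyst = User(component="CBP", location="Washington, DC", job_series="0000")
    record = TaggedRecord(
        source_system="ESTA",
        allowed_components=frozenset({"CBP"}),
        allowed_purposes=frozenset({"border screening"}),
    )
    print(may_access(analyst, record, "border screening"))
    print(may_access(analyst, record, "unrelated inquiry"))
```

In this sketch the same user is allowed or denied depending on the purpose supplied with the request, which is the behavior the framework's context element is meant to capture.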

According to the report, the Neptune Pilot will reside in the Sensitive but Unclassified/For Official Use Only (SBU/FOUO) domain and will test data tagging, ingesting and tagging data in a data repository called "Neptune." Data in the Neptune Pilot will be shared with the CEI Prototype and the Cerberus Pilot, but will not be accessible for other purposes.

The CEI Prototype will also reside on the SBU/FOUO domain. It will receive a subset of the tagged data from the Neptune Pilot and correlate data from across component datasets.

The CEI Prototype will test the utility of the Neptune-tagged data -- specifically, the ability to ensure that only users with certain attributes are able to access data based on defined purposes using the dynamic access control process. This prototype will use data tags to test the third and fourth elements of the DHS data framework, which are context and dynamic access control.
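The correlation step attributed to the CEI Prototype, linking records about the same individual across component datasets, can be pictured with a short Python sketch. This is only an illustration under loud assumptions: the matching key (normalized name plus date of birth), the field names and the function name are hypothetical, and the report as described here does not say how the prototype actually matches records.

```python
from collections import defaultdict


def build_entity_index(datasets):
    """Group record references from several datasets by a shared key.

    `datasets` maps a source name to a list of record dicts. The key used
    here (lowercased name plus date of birth) is an assumption made for
    illustration, not the prototype's real matching logic.
    """
    index = defaultdict(list)
    for source, records in datasets.items():
        for record in records:
            key = (record["name"].strip().lower(), record["date_of_birth"])
            # Store only a pointer back to the source record, not the data.
            index[key].append((source, record["record_id"]))
    return dict(index)


if __name__ == "__main__":
    # Fabricated sample rows, used only to exercise the function.
    sample = {
        "dataset_a": [
            {"record_id": "A1", "name": "Jane Doe", "date_of_birth": "1990-01-01"},
        ],
        "dataset_b": [
            {"record_id": "B7", "name": " jane doe", "date_of_birth": "1990-01-01"},
            {"record_id": "B8", "name": "John Roe", "date_of_birth": "1985-05-05"},
        ],
    }
    for key, refs in build_entity_index(sample).items():
        print(key, refs)
```

Keeping only source-and-identifier pairs in the index is one possible design choice: the underlying records stay in their own datasets, and the access checks described above would still apply when a user follows a reference.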

The Cerberus Pilot, the report said, will reside in the Top Secret/Sensitive Compartmented Information domain. It will receive all of the tagged data from the Neptune Pilot in a separate data repository called Cerberus. Cerberus will test the ability to ensure that only users with certain attributes are able to access data based on defined purposes using the dynamic access control process. Cerberus will leverage the data tags to test the context and dynamic access control elements of the data framework. Cerberus will also test the ability to perform simple and complex searches across different component datasets using different analytical tools.

The report said several different types of search tools and analytical capabilities will be tested during the pilot phase of the data framework. Planned search capabilities include pattern-based searches designed to identify previously unknown individuals who might pose threats to homeland security.

The DHS Privacy Office said it has been "intensively involved in the development of these capabilities and in the DHS data framework as a whole since its inception." It said it will evaluate the need for updated privacy impact assessments and continue to be involved in the development of the governance structure for the framework.