SSA to create dozens of new datasets

The Social Security Administration said it will guard against releasing any personally identifiable information as it extracts new datasets for open government and transparency projects.

The datasets to be released in 2010 and 2011 include information on disability claims, disability appeals, retirement claims and earnings. The data will be extracted from existing sources, such as the agency’s Case Processing Management System and Appeals Review Processing System, and scrubbed of personally identifying material, SSA said.

The Social Security Administration said it will create and release more than three dozen additional datasets drawn from its information systems, after scrubbing them of personally identifiable information.

The agency has already released 22 datasets on Data.gov as part of the White House's open government and transparency program.

In the next phase, SSA plans to create and release additional high-value datasets. The agency outlines its plans in the “Data Inventory and Plan for Releasing High-Value Data” report posted on its website.


“We will extract reports that contain summarized data so that even if the public combines our datasets with other data, no violations of privacy can occur,” the report states.

SSA officials said they also intend to submit the new datasets to an internal review board to ensure they comply with security and privacy stipulations.

Another group of more than two dozen high-value datasets will be released starting in 2012, including annual statistics for field office claims, disability claims filed via the Internet, disability claims accuracy, and the total number of SSA workers on duty.