Big Data

Making big data work


Behind the "big data" cliché is an explosion in the volume of information collected by sensors, cameras, social media, e-commerce, science experiments, weather satellites, logistics and a host of other sources. But to extract valuable insights from the terabytes and petabytes of information, analysts have to know how to use datasets in their systems, and compare data from different sources.

A standards-based approach is one way to facilitate this process, and the National Institute of Standards and Technology is leading an effort to bring the user community some consensus on the logistics, structure and security of data. A draft of the NIST Big Data Interoperability Framework, released April 6, looks to establish a common set of definitions for data science, and common ground, or "reference architecture," for what constitutes usability, portability, analytics, governance and other concepts.

"One of NIST's big data goals was to develop a reference architecture that is vendor-neutral and technology- and infrastructure-agnostic, to enable data scientists to perform analytics processing for their given data sources without worrying about the underlying computing environment," said NIST's Digital Data Advisor Wo Chang.

The framework is less a policy document than an agreed-upon set of questions that need to be answered, and challenges that need to be addressed in order to produce a consensus-based set of global standards for the production, storage, analysis and safeguarding of large, diverse datasets. NIST isn't looking to write specs for operational systems, or rules for information exchange or security. NIST's Big Data Public Working Group, which includes scientists in government, academia and the private sector, has released a seven-volume document designed to "clarify the underlying concepts of big data and data science to enhance communication among big data producers and consumers," per the report.

A set of use cases collected from contributors gets at the challenges facing government, researchers and industry in maintaining the viability and usability of current data, while preparing for the future.

For example, the National Archives and Records Administration faces the problem of processing and managing a huge amount of varied data, structured and unstructured, from different government agencies, that may have to be gathered from different clouds, and tagged to respond to queries, while preserving security and privacy where required by law.

The Census Bureau is exploring the possibility of using non-traditional sources from e-commerce transactions, wireless communications and public-facing social media data to augment or mash up with its survey data to improve statistical estimates, and produce data that is closer to real-time. But that data has to be reliable and maintain confidentiality.

On the security side, the NIST report calls attention to the future – the problem of protecting data that might need to outlast the lifespan and usefulness of the systems that house it, and the security measures that protect it.

Some types of data, including medical imaging data, security video and geospatial imaging, were until relatively recently considered too large to be conveniently analyzed and shared over computer networks, and therefore weren't created with security and privacy in mind – that could be a problem down the road. The Internet of Things and the new troves of sensor data created by connected devices could create previously unconsidered vulnerabilities for devices and data.

NIST is accepting comments on the framework through May 21.

About the Author

Adam Mazmanian is executive editor of FCW.

Before joining the editing team, Mazmanian was an FCW staff writer covering Congress, government-wide technology policy and the Department of Veterans Affairs. Prior to joining FCW, Mazmanian was technology correspondent for National Journal and served in a variety of editorial roles at B2B news service SmartBrief. Mazmanian has contributed reviews and articles to the Washington Post, the Washington City Paper, Newsday, New York Press, Architect Magazine and other publications.

Connect with him on Twitter at @thisismaz.

