DOE's new data-management tool aims to tame LHC data
- By Mark Rockwell
- Jul 09, 2015
The Brookhaven National Laboratory's workload management system is being used to tap the resources of the Titan supercomputer at Oak Ridge National Laboratory. (Image: TITAN supercomputer / BNL)
Department of Energy researchers have developed a data-management tool for the department's supercomputers to help handle the tidal wave of data generated by the recently restarted Large Hadron Collider (LHC) across the Atlantic.
The LHC in Switzerland now operates at nearly twice its former collision energy, according to Brookhaven National Laboratory. That increase has left physicists sifting through a monumental pile of data. A pilot project under the Department of Energy's Office of Science is using a big data management tool developed by physicists at Brookhaven and the University of Texas at Arlington.
According to Brookhaven, a workload management system, dubbed PanDA (for Production and Distributed Analysis), was originally designed by high-energy physicists to handle data analysis jobs for the LHC’s ATLAS collaboration. ATLAS is part of the LHC project that detects particles created by proton collisions.
During the LHC’s first run, from 2010 to 2013, PanDA made ATLAS data available for analysis by 3,000 scientists around the world using the LHC’s worldwide grid of networked computing resources. The latest rendition of the supercomputer workload management system, called Big PanDA, likewise aims to meet big data challenges in many areas of science by maximizing the use of limited supercomputing resources.
Big PanDA schedules jobs opportunistically on the DOE's Titan supercomputer in Oak Ridge, Tenn. According to Brookhaven, Big PanDA's opportunistic "fill in the gap" operation on Titan doesn’t conflict with the supercomputer's ability to schedule traditional, very large computing jobs.
The integration of Big PanDA on Titan, according to Brookhaven, is the first large-scale use of leadership-class supercomputing facilities coupled with a workload management system to assist in the analysis of experimental high-energy physics data. The program will have immediate benefits for ATLAS, lab officials predicted.
As the volume of data increases with the LHC collision energy, so does the need for running simulations that help scientists interpret experimental results. Those simulations, a Brookhaven announcement said, are best handled by supercomputers.
The prototype Big PanDA software has been significantly modified from its original design, lab officials said. It "backfills" simulations of the collisions taking place at the LHC into spaces between typically large supercomputing jobs, putting valuable computer run time that would otherwise sit idle to use.
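The backfill idea described above can be sketched in a few lines. This is a hypothetical illustration only, not PanDA's actual code: the `Gap`, `SimTask`, and `backfill` names are invented for the example. It shows the core logic of slotting small simulation tasks into idle windows between large scheduled jobs.

```python
# Illustrative sketch of backfill scheduling (hypothetical, not PanDA source):
# small simulation tasks are greedily placed into idle gaps left between
# large scheduled jobs, so otherwise-wasted node time does useful work.

from dataclasses import dataclass

@dataclass
class Gap:
    nodes: int      # idle nodes available in this scheduling window
    minutes: int    # how long the window lasts

@dataclass
class SimTask:
    name: str
    nodes: int      # nodes the task needs
    minutes: int    # estimated run time

def backfill(gaps, tasks):
    """Greedily place each task into the first gap it fits in."""
    placed = []
    for task in tasks:
        for gap in gaps:
            if task.nodes <= gap.nodes and task.minutes <= gap.minutes:
                placed.append((task.name, gap))
                gap.nodes -= task.nodes   # consume part of the gap
                break
    return placed

gaps = [Gap(nodes=100, minutes=60), Gap(nodes=20, minutes=30)]
tasks = [SimTask("atlas_sim_1", 50, 45), SimTask("atlas_sim_2", 60, 25)]
for name, gap in backfill(gaps, tasks):
    print(f"{name} placed; {gap.nodes} nodes remain in its window")
```

The key property, as the article notes, is that backfilled work never displaces the large jobs: tasks only run in capacity the main scheduler has already left unused.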
Mark Rockwell is a senior staff writer at FCW, whose beat focuses on acquisition, the Department of Homeland Security and the Department of Energy.
Before joining FCW, Rockwell was Washington correspondent for Government Security News, where he covered all aspects of homeland security from IT to detection dogs and border security. Over the last 25 years in Washington as a reporter, editor and correspondent, he has covered an increasingly wide array of high-tech issues for publications like Communications Week, Internet Week, Fiber Optics News, tele.com magazine and Wireless Week.
Rockwell received a Jesse H. Neal Award for his work covering telecommunications issues, and is a graduate of James Madison University.
Contact him at [email protected] or follow him on Twitter at @MRockwell4.