Why it's time to supercharge data infrastructure funding

Advances in technology have set the stage for pivotal collaborations between computer scientists, statisticians, mathematicians and domain scientists—but government needs to pitch in.

As the nation faces a myriad of issues – from the ongoing residual effects of the pandemic to the growing consequences of climate change – top academic minds are partnering to develop data-driven solutions to better inform federal and state policymakers and the Biden administration. The recently passed FY2023 omnibus spending package reflects this priority, with language directing federal agencies like the National Science Foundation to review their investments in biological research infrastructure and the impact of those investments, and to consider which approaches offer more flexibility to evaluate and maintain critical research infrastructure.

Data are valuable resources that, when made accessible to researchers and leaders, can help cities boost their digital intelligence and bring essential services online, or democratize AI by ensuring communities can take control of their own data and how they are shared. On the flip side, unequal data access had hugely disparate impacts on communities at the height of the COVID-19 pandemic, particularly where access to quality data and computing resources was limited.

The world is full of uncaptured data needed to supercharge research efforts. With new technological advancements expanding possibilities for data collection, researchers can use data to solve the world's most pressing challenges and, in turn, improve our resilience to quickly evolving crises. However, this isn't possible without unfettered access to relevant data, research infrastructure, and data analysis tools. 

As a researcher myself, I know firsthand how powerful these tools can be. In my work as a professor at the University of Arizona, I've witnessed the pivotal collaboration between computer scientists, statisticians, mathematicians and domain scientists in pursuit of leading research and education initiatives around the data sciences.

My passion for advancing the critical work of our academic and industry partners has led to the development of a highly expansive, impactful cyberinfrastructure ecosystem designed to empower academics and researchers nationwide: CyVerse. This open science workspace for collaborative data-driven discovery was designed to study and develop solutions for core societal challenges such as preventing future pandemics, tackling climate change, and sustainably feeding future generations. 

CyVerse's national cyberinfrastructure assists with research across all disciplines and has led to the training of scientists nationwide on how to best leverage it. This is a core example of how providing scientists and researchers with a powerful platform to handle huge datasets and complex real-time analyses can lead to data-driven discovery. 

The creation of effective policies across life science disciplines originates from comprehensive research. Congress and federal agencies like the National Science Foundation have already allocated some funding for data infrastructure to support research; however, solving growing challenges requires a higher and more sustained investment in data-driven infrastructure. Without access to real-time data and the requisite computing infrastructure to analyze them, innovative research is put on hold or, worse, eliminated altogether.

These research infrastructure tools have widespread benefits not only for researchers, but also for those invested in the findings of that research. They allow researchers to spend less time reinventing analysis tools and more time perfecting final research products. Thus, allocating funding for such infrastructure helps bring research findings to Congress and the administration in a timely manner. Consistent data infrastructure provides the continuity that relevant research depends on.

However, data science programs – which have become a vital pillar of innovation research – are at risk. 

While some experts are hopeful that Congress will continue to provide stable budgets for federal research agencies, partisan battles and gridlock may undermine such efforts. It is imperative that the new Congress come together to ensure bipartisan support for funding legislation that will support innovations in science for years to come. 

Guaranteeing consistent and reliable data analysis tools requires the federal government to prioritize adequate funding for these initiatives. That's why research infrastructure must be a key priority for Congress and the Biden administration. While the language included in the omnibus package is a promising start, NSF must follow through on those directives to ensure quality funding for biological infrastructure. 

As time goes on, more sophisticated means of data collection and analysis will be required to keep pace with the volume of information being collected. Data infrastructure providers are prepared to meet these challenges but will be unable to do so without proper funding. If funding is allocated toward research innovation today, researchers can continue to work toward solving the unknown challenges of tomorrow.

Eric Lyons is an associate professor in the School of Plant Sciences at the University of Arizona and co-principal investigator and project director of CyVerse, a $115 million National Science Foundation-funded project to provide cyberinfrastructure for life science research.
