Updated digital forensics database speeds criminal investigations

NIST’s expanded, more searchable database will help law enforcement find incriminating data in electronic media.

To make it easier for forensic investigators to find relevant data on computers, cellphones and other electronic equipment seized in police raids, the National Institute of Standards and Technology has released a major update to its National Software Reference Library.

The first major update to the NSRL in 20 years improved the search capabilities and increased both the number and types of records to reflect the variety of software investigators might encounter. The new features will make it easier for investigators to filter out large quantities of unimportant data so they can focus on finding relevant evidence, NIST said in its announcement.

“There are hardly any major crimes that don’t have connections to digital technology, because criminals use cellphones,” said Doug White, a NIST computer scientist who helps maintain the NSRL. “Only some of the data on a phone or other device might be relevant to an investigation, though. The update should make it easier for police to separate the wheat from the chaff.” 

For example, if forensic investigators are analyzing a home computer in connection with a child pornography case, they first want to weed out the graphics associated with other programs.

“You want to run your investigation as quickly and efficiently as possible, so what you need is a way to get rid of all the video game images. Then you can run your more computationally expensive analysis on the files that remain,” White said.

The expanded NSRL includes hashes, or electronic fingerprints, for more than a billion software records, up from half that number in 2019, which forensic investigators can use to sift through a computer’s data. White said he expects the NSRL’s growth to continue as entries from internet-of-things devices get added.
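
As a rough illustration of the filtering White describes, the sketch below hashes each file on a device and sets aside those whose fingerprints appear in a set of known software hashes. It is a minimal example under stated assumptions, not NIST's tooling: the function names are hypothetical, and SHA-1 is chosen only for illustration, since the announcement does not say which hash algorithm an investigator would use.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_known_unknown(paths, known_hashes):
    """Separate files whose hash is in known_hashes from the rest.

    known_hashes is assumed to hold hex digests; they are lowercased
    here so that upper- and lowercase inputs compare equal.
    """
    normalized = {value.lower() for value in known_hashes}
    known, unknown = [], []
    for path in paths:
        if sha1_of_file(path) in normalized:
            known.append(path)    # matches reference software, can be skipped
        else:
            unknown.append(path)  # left for the more expensive analysis
    return known, unknown
```

Only the files in the second list would then go through the slower, more computationally expensive analysis.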

The update, NIST said, uses the SQLite format, improving users’ ability to create custom filters to find what they need for a particular investigation. It’s also easier for NSRL managers to maintain. They can distribute dataset changes as comparatively small updates “rather than sending out the entire dataset anew, saving time and effort for users,” officials said.
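
To show what a custom filter over a SQLite database can look like, here is a small sketch using Python's built-in sqlite3 module. The table and column names (known_files, sha1, file_name, product_type) and the sample rows are invented for illustration and are not the actual NSRL schema or data; a real query would have to follow the schema NIST publishes.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a hypothetical, simplified table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE known_files (sha1 TEXT, file_name TEXT, product_type TEXT)"
    )
    # Placeholder rows; the hash values are made up for the example.
    conn.executemany(
        "INSERT INTO known_files VALUES (?, ?, ?)",
        [
            ("aaa111", "sprite.png", "game"),
            ("bbb222", "texture.png", "game"),
            ("ccc333", "toolbar.ico", "office"),
        ],
    )
    return conn


def hashes_for_product_type(conn, product_type):
    """Return the set of hashes recorded for one category of software."""
    rows = conn.execute(
        "SELECT DISTINCT sha1 FROM known_files WHERE product_type = ?",
        (product_type,),  # parameter binding avoids building SQL by hand
    )
    return {row[0] for row in rows}


conn = build_demo_database()
game_hashes = hashes_for_product_type(conn, "game")
```

A filter like this would let an investigator pull only the fingerprints relevant to one case, such as video game files, instead of loading the full dataset.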

Meanwhile, the NSRL will be available in its old format for those who may need time to adjust to the changes. 

“We will continue to publish the dataset in both the 2.0 and 3.0 formats through December 2022,” White said. “After that, there is a relatively easy query that users can run to generate the 2.0 dataset if it proves necessary.”