The National Institute of Standards and Technology (NIST) plans to add over 200,000 Android and iOS apps to a software library that helps digital forensics investigators target potential evidence.
NIST, the government-funded organization responsible for driving cybersecurity and other technology standards, on Thursday added the first 23,000 mobile apps to its National Software Reference Library (NSRL), a resource used in national security-related cyber investigations.
A derivative of the NSRL is NIST’s software fingerprint database, the Reference Data Set (RDS), which investigators often use to filter out files on a seized computer that aren’t pertinent to the investigation, so they can focus their efforts on files that might contain evidence.
As of December, the RDS has 15 million files from Android apps and 6 million files from iOS apps. The Android and iOS data sets are each available as 1GB DVD images.
Those file counts should rise significantly in coming months as NIST expects to add 200,000 new mobile apps to the NSRL next year. However, given there are now over two million Android apps in Google Play and the same number of iOS apps in Apple’s App Store, it’s likely going to take many years to capture the full picture.
Indeed, the computer scientists at NIST responsible for the library will probably never actually catch up to the current state, since they face the impossible task of maintaining an “up-to-date” repository of all software.
The NSRL so far contains some 50 million software programs, ranging from word processing applications to old Atari games and potential malware.
“Our goal is to help investigators, so we prioritize the software they are most likely to encounter in the field,” said Doug White, the NIST computer scientist heading up NSRL.
“We also focus on what we consider dual-use software—things that can be used for good or bad,” including keystroke loggers and network scanners.
NIST doesn’t share the software programs themselves. Instead, the RDS contains a hash of each file behind each program. Each file is run through an algorithm to produce a 40-character string. So far the RDS contains about 180 million file hashes.
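To illustrate how a file becomes a 40-character string, here is a minimal Python sketch. It assumes the algorithm is SHA-1, whose hexadecimal digest happens to be 40 characters long; the article does not name the algorithm, and the function name `file_fingerprint` is made up for this example.

```python
import hashlib


def file_fingerprint(path, chunk_size=65536):
    """Return a file's SHA-1 digest as a 40-character hex string.

    Assumption: SHA-1 stands in for the unnamed algorithm in the article.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical contents produce the same string regardless of their names or locations, which is what lets a hash act as a fingerprint.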
Forensics investigators don’t just use the database to filter out irrelevant files. According to White, the FBI tapped NIST to help with its investigation into the disappearance of Malaysia Airlines flight MH370, as it tried to figure out which flight path the plane may have taken. The FBI wanted a hash of files associated with every flight simulator program it had, and White obliged by adding more than 120,000 flight map-related files.
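Both uses of a hash set, ignoring known files and flagging files of interest, can be sketched in a few lines of Python. This is an illustration only: the function `triage`, its parameters, and the idea of a separate "flagged" set are assumptions made for the example, not a description of NIST's or the FBI's tooling.

```python
def triage(file_hashes, known_hashes, flagged_hashes=frozenset()):
    """Sort files into (ignore, interest, unknown) lists of paths.

    file_hashes:    dict mapping a file path to its hex hash string
    known_hashes:   set of lowercase hashes for known software files
    flagged_hashes: set of lowercase hashes the investigator wants surfaced
    All three names are hypothetical and exist only for this sketch.
    """
    ignore, interest, unknown = [], [], []
    for path, digest in sorted(file_hashes.items()):
        key = digest.lower()  # compare case-insensitively
        if key in flagged_hashes:
            interest.append(path)   # matches something being looked for
        elif key in known_hashes:
            ignore.append(path)     # known software, safe to skip
        else:
            unknown.append(path)    # not in either set, needs a human look
    return ignore, interest, unknown
```

Because set lookups are fast, a check like this stays practical even against a reference set with many millions of entries.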