eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

Software signature derivation from sequential digital forensic analysis

Creative Commons 'BY' version 4.0 license
Abstract

Hierarchical storage system namespaces are notorious for their immense size, which significantly hinders any computer inspection. File systems start with tens of thousands of files, and the Registries of Windows computers start with hundreds of thousands of cells. An analysis of a storage system, whether for digital forensics or for locating old data, depends on reducing the namespaces to the features of interest. Typically, such large volumes are seen as an obstacle to identifying relevant content. However, if the origins of files can be identified, particularly by distinguishing software origins from human origins, large counts of files become a boon to profiling how a computer has been used. It becomes possible to identify software that has influenced the computer's state, which gives an important overview of storage system contents that has not been available to date.

In this work, I apply document search to observed changes in one class of forensic artifact, the cell names of the Windows Registry, to identify the effects of software on storage systems. Under the search model, a system's Registry becomes a query that is matched against software signatures. To derive signatures, file system differential analysis is extended from comparing two storage system states to comparing many sequences of states. The workflow that creates these signatures is an example of analytics on data lineage, drawn from branching data histories. The signatures independently indicate past presence or usage of software, based on the consistent creation of measurably distinct artifacts. A signature search engine is demonstrated against a machine with a selected set of applications installed and executed. The search engine configuration that performs best on that machine is then applied to a separate corpus of machines, in which the applications present were identified from several non-Registry forensic artifact sources, including the file systems, memory, and network captures. The signature search engine corroborates those findings, using only the Windows Registry.
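
As a rough illustration of the idea described above, and not the implementation used in the thesis, the following Python sketch derives signatures from before/after Registry snapshots and then uses a system's Registry as a query. It makes several assumptions: each Registry state is modeled as a plain set of cell path strings, "consistent creation" is approximated by intersecting the newly created cells across repeated runs, "measurably distinct" is approximated by discarding cells shared with any other signature, and matching is scored as the fraction of a signature's cells found in the queried Registry. All function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Snapshot = Set[str]  # assumption: a Registry state is a set of cell paths


def created_cells(before: Snapshot, after: Snapshot) -> Set[str]:
    """Differential analysis of two states: cells present only afterwards."""
    return after - before


def derive_signature(runs: List[Tuple[Snapshot, Snapshot]]) -> Set[str]:
    """Keep only cells created in every observed (before, after) run."""
    deltas = [created_cells(before, after) for before, after in runs]
    return set.intersection(*deltas) if deltas else set()


def make_distinct(signatures: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Drop cells that also appear in another application's signature."""
    distinct = {}
    for name, cells in signatures.items():
        others = set().union(
            *(c for other, c in signatures.items() if other != name)
        )
        distinct[name] = cells - others
    return distinct


def search(registry_cells: Snapshot,
           signatures: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Use a Registry as the query; rank signatures by fraction matched."""
    scores = [
        (name, len(cells & registry_cells) / len(cells))
        for name, cells in signatures.items()
        if cells  # an empty signature cannot be matched
    ]
    # Highest score first; ties broken by name for a stable ordering.
    return sorted(scores, key=lambda item: (-item[1], item[0]))
```

A caller would build one signature per application with `derive_signature`, pass the collection through `make_distinct`, and then call `search` with the cell paths extracted from the Registry under examination. Set overlap is the simplest possible scoring choice; a real document search engine would more likely weight terms, for example by how rare a cell name is across signatures.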
