Posted on 2006-01-01, 00:00; authored by David Brumley, James Newsome, Dawn Song, Hao Wang, Somesh Jha
In this paper we explore the problem of creating vulnerability
signatures. A vulnerability signature matches all exploits of a given vulnerability, even polymorphic or metamorphic variants. Our work departs from previous approaches by focusing on the semantics of the program and
vulnerability exercised by a sample exploit instead of the
semantics or syntax of the exploit itself. We show that the semantics of a vulnerability define a language that contains
all and only those inputs that exploit the vulnerability. A
vulnerability signature is a representation (e.g., a regular
expression) of the vulnerability language. Unlike exploit-based signatures, whose error rate can only be empirically
measured for known test cases, the quality of a vulnerability
signature can be formally quantified for all possible inputs.
We provide a formal definition of a vulnerability signature and investigate the computational complexity of creating and matching vulnerability signatures. We also systematically explore the design space of vulnerability signatures.
We identify three central issues in vulnerability-signature
creation: how a vulnerability signature represents the set
of inputs that may exercise a vulnerability, the vulnerability
coverage (i.e., the number of vulnerable program paths) that is
subject to our analysis during signature creation, and how
a vulnerability signature is then created for a given representation and coverage.
We propose a new data-flow analysis and a novel adoption
of existing techniques, such as constraint solving, for automatically generating vulnerability signatures. We have
built a prototype system to test our techniques. Our experiments show that, from a single exploit, we can automatically generate a vulnerability signature of much
higher quality than previous exploit-based signatures. In
addition, our techniques have several other security applications, and thus may be of independent interest.
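
As an informal illustration of the distinction drawn above, the following Python sketch contrasts a vulnerability signature with an exploit-based signature. Everything in it is a hypothetical toy and is not taken from the paper or its prototype: the "GET" request format, the 16-byte buffer (`BUF_SIZE`), the NOP-sled pattern, and all function names are assumptions chosen only to make the idea concrete. The vulnerability language here is the set of requests whose path overflows the buffer, and the regular expression is one possible representation of that language.

```python
import re

# Hypothetical toy program: it copies the path of a "GET <path>\n" request
# into a fixed 16-byte buffer without a bounds check (an assumption made
# purely for illustration).
BUF_SIZE = 16


def is_exploit(data: bytes) -> bool:
    """Ground truth for the toy program: does this input overflow the buffer?"""
    if not data.startswith(b"GET "):
        return False
    path = data[4:].split(b"\n", 1)[0]
    return len(path) > BUF_SIZE


# Vulnerability signature: a regular expression representing the
# vulnerability language (a path of more than BUF_SIZE non-newline bytes).
VULN_SIG = re.compile(rb"GET [^\n]{%d,}" % (BUF_SIZE + 1))


def vuln_sig_matches(data: bytes) -> bool:
    """Match against the semantics of the vulnerability."""
    return VULN_SIG.match(data) is not None


def exploit_sig_matches(data: bytes) -> bool:
    """Exploit-based signature: a byte pattern seen in one sample exploit."""
    return b"\x90" * 8 in data


sample_exploit = b"GET " + b"\x90" * 8 + b"A" * 24 + b"\n"
variant = b"GET " + b"B" * 32 + b"\n"          # same overflow, no NOP sled
benign = b"GET /index.html\n"                  # short path, harmless
benign_with_nops = b"GET " + b"\x90" * 8 + b"\n"  # harmless, contains pattern

for name, data in [
    ("sample_exploit", sample_exploit),
    ("variant", variant),
    ("benign", benign),
    ("benign_with_nops", benign_with_nops),
]:
    print(name, is_exploit(data), vuln_sig_matches(data), exploit_sig_matches(data))
```

In this toy setting the regular expression agrees with the ground truth on all four inputs, while the pattern-based signature misses the variant and flags a harmless request. The agreement holds only because the toy vulnerability has a single, simple condition; real programs would generally need the analyses the abstract describes.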