Posted on 2007-01-01, 00:00. Authored by Juan Caballero, Shobha Venkataraman, Pongsin Poosankam, Min G Kang, Dawn Song, Avrim Blum
Fingerprinting is a widely used technique among the networking and security communities for identifying different
implementations of the same piece of networking software
running on a remote host. A fingerprint is essentially a set of
queries together with a classification function that is applied to
the responses to those queries in order to assign the software
to a class. So far, identifying fingerprints remains largely
an arduous and manual process. This paper proposes a
novel approach for automatic fingerprint generation that
explores a set of candidate queries and applies machine learning techniques to identify the set of valid
queries and to learn an adequate classification function.
Our results show that such an automatic process can generate accurate fingerprints that classify each piece of software
into its proper class and that the search space for query exploration remains largely unexploited, with many new
queries awaiting discovery. Even with a preliminary exploration,
we are able to identify new queries not previously used for
fingerprinting.
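
To make the definition above concrete, the following is a minimal sketch of a fingerprint as a set of queries plus a classification function. It is not the paper's method: the three implementations (`impl_a`, `impl_b`, `impl_c`), the query names, and the responses are hypothetical stand-ins for remote hosts, and a simple lookup table is swapped in where the paper applies machine learning techniques.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a remote host: maps a query to a response.
Host = Callable[[str], str]

# Three made-up implementations of the same software (assumed data).
def impl_a(query: str) -> str:
    return {"q1": "ok", "q2": "err-1", "q3": "reset"}.get(query, "none")

def impl_b(query: str) -> str:
    return {"q1": "ok", "q2": "err-2", "q3": "reset"}.get(query, "none")

def impl_c(query: str) -> str:
    return {"q1": "ok", "q2": "err-2", "q3": "drop"}.get(query, "none")

def select_valid_queries(candidates: List[str],
                         training: Dict[str, Host]) -> List[str]:
    """Keep the candidate queries whose responses differ between classes."""
    valid = []
    for query in candidates:
        responses = {host(query) for host in training.values()}
        if len(responses) > 1:  # identical responses cannot separate classes
            valid.append(query)
    return valid

def learn_classifier(queries: List[str],
                     training: Dict[str, Host]) -> Callable[[Host], str]:
    """Build a lookup table from response tuples to class labels.

    This table is a deliberately simple substitute for a learned
    classification function.
    """
    table: Dict[Tuple[str, ...], str] = {
        tuple(host(q) for q in queries): label
        for label, host in training.items()
    }

    def classify(host: Host) -> str:
        return table.get(tuple(host(q) for q in queries), "unknown")

    return classify

training = {"A": impl_a, "B": impl_b, "C": impl_c}
candidates = ["q1", "q2", "q3"]

valid = select_valid_queries(candidates, training)
classify = learn_classifier(valid, training)
```

In this toy setting, `q1` is discarded because every implementation answers it identically, and the remaining queries together separate the three classes. Neither `q2` nor `q3` does so alone, which is why the classification function operates on the full set of responses.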