Posted on 2005-12-01, 00:00; authored by Shoba Venkataraman, Avrim Blum, Dawn Song, Subhabrata Sen, Oliver Spatscheck
We formulate and address the problem of discovering dynamic malicious regions
on the Internet. We model this problem as one of adaptively pruning a known
decision tree, but with additional challenges: (1) severe space requirements, since
the underlying decision tree has over 4 billion leaves, and (2) a changing target
function, since malicious activity on the Internet is dynamic. We present a novel
algorithm that addresses this problem by combining several different
“experts” algorithms with online paging algorithms. We prove guarantees on our
algorithm’s performance as a function of the best possible pruning of a similar
size, and our experiments show that our algorithm achieves high accuracy on large
real-world data sets, with significant improvements over existing approaches.
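
To make the modeling concrete, the following is a minimal sketch of what a pruning of such a tree could look like. It assumes the 4 billion leaves are the 2^32 IPv4 addresses and that internal nodes are address prefixes; the abstract does not state this explicitly. The prefixes, labels, and the names `classify` and `EXAMPLE_PRUNING` are hypothetical illustrations, not the authors' algorithm or data. The sketch only shows how a small set of labeled prefixes can stand in for the full tree; it does not implement the adaptive "experts" or paging components.

```python
import ipaddress

# Assumption: one leaf per IPv4 address, which gives "over 4 billion leaves".
NUM_LEAVES = 2 ** 32

# Hypothetical pruning: a few labeled prefixes instead of 2**32 labeled leaves.
# A more specific prefix overrides the label of any prefix that contains it.
EXAMPLE_PRUNING = {
    "0.0.0.0/0": "benign",            # root: default label
    "198.51.100.0/24": "malicious",   # a region flagged as malicious
    "198.51.100.128/25": "benign",    # a benign sub-region inside it
}


def classify(pruning, address):
    """Return the label of the longest prefix in `pruning` that covers `address`."""
    addr = ipaddress.ip_address(address)
    best_len, best_label = -1, None
    for prefix, label in pruning.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_label = net.prefixlen, label
    if best_label is None:
        raise ValueError(f"no prefix in the pruning covers {address}")
    return best_label


if __name__ == "__main__":
    for ip in ("198.51.100.7", "198.51.100.200", "203.0.113.5"):
        print(ip, classify(EXAMPLE_PRUNING, ip))
```

The size of the pruning (three prefixes here) is the space budget, which is why guarantees are stated relative to the best possible pruning of a similar size. A changing target function would correspond to these labels and prefixes needing to be revised over time.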