Anomaly Detection Amidst Constant Anomalies: Training IDS On Constantly Attacked Data (CMU-CyLab-08-006)
journal contribution posted on 08.04.2008 by M. Patrick Collins, Michael K. Reiter
Automated attack tools and the presence of a large number of untrained script kiddies have led to popular protocols such as SSH being constantly attacked by clumsy high-failure scans and bot harvesting attempts. These constant attacks result in a dearth of clean, attack-free network traffic logs, making training anomaly detectors for these protocols prohibitively difficult. We introduce a new filtering technique that we term attack reduction; attack reduction reduces the impact of these high-failure attacks on the traffic logs and can be used to extract a statistical model of normal activity without relying on prior assumptions about the volume of normal traffic. We demonstrate that a simple anomaly detection system (counting the number of hosts using SSH) trained on unfiltered data from our monitored network would fail to detect an attack involving 91,000 hosts; in contrast, it can be calibrated to detect attacks involving as few as 370 hosts using our attack reduction methodology. In addition, by using the same statistical model we use for filtering attacks, we estimate the required training time for an IDS and demonstrate that the system will be viable in as little as five hours.
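To make the idea of a host-counting detector concrete, the following is a minimal sketch, not the authors' method. The abstract does not specify the filtering rule or the statistical model, so everything here is an assumption for illustration: the failure-rate filter `reduce_attacks`, its `max_failure_rate` cutoff, the mean-plus-`k`-standard-deviations threshold, and the synthetic traffic produced by `make_window`. It only shows why a threshold trained on scan-polluted logs can end up far too high, while the same detector trained on filtered logs can flag a much smaller attack.

```python
from statistics import mean, stdev


def make_window(n_normal, n_scanners):
    """Build one synthetic observation window (hypothetical data).

    Maps host name -> (connection attempts, failed attempts).
    Normal hosts never fail; scanning hosts always fail.
    """
    window = {f"normal-{i}": (5, 0) for i in range(n_normal)}
    window.update({f"scanner-{i}": (20, 20) for i in range(n_scanners)})
    return window


def reduce_attacks(window, max_failure_rate=0.5):
    """Assumed filter: keep only hosts whose failure rate is at or below the cutoff."""
    return {
        host
        for host, (attempts, failures) in window.items()
        if attempts > 0 and failures / attempts <= max_failure_rate
    }


def train_threshold(host_counts, k=3.0):
    """Assumed model: alarm threshold is mean + k sample standard deviations."""
    return mean(host_counts) + k * stdev(host_counts)


def is_anomalous(host_count, threshold):
    """Flag a window whose host count exceeds the trained threshold."""
    return host_count > threshold


# Twelve training windows: 10 to 12 normal hosts plus 0 to 3000 scanners each.
training = [make_window(10 + i % 3, 1000 * (i % 4)) for i in range(12)]

raw_counts = [len(w) for w in training]
filtered_counts = [len(reduce_attacks(w)) for w in training]

raw_threshold = train_threshold(raw_counts)
filtered_threshold = train_threshold(filtered_counts)

# A hypothetical attack window involving 400 hosts.
attack_size = 400
missed_when_unfiltered = not is_anomalous(attack_size, raw_threshold)
caught_when_filtered = is_anomalous(attack_size, filtered_threshold)
```

In this synthetic setting the unfiltered threshold is dominated by the scanner volume, so the 400-host window passes unnoticed, whereas the filtered threshold sits just above the normal host count. The specific numbers are artifacts of the made-up data and say nothing about the results reported in the paper.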