Posted on 2004-01-01, 00:00. Authored by Kathleen Carley, D M Thomas, Scott L Arouh, James B Kraiman, Jonathan R Davis
The pattern of an emergent disease or bioterrorist attack is likely to differ from that of a naturally occurring epidemic. Detecting such patterns is challenging because most disease surveillance systems process vast amounts of data collected over wide and disparate geographic regions. These databases can reach terabytes in size, which makes meaningful analysis of the data impracticable, expensive, and too slow to respond to immediate situations. Robust, automated, non-template-based, real-time processing techniques capable of monitoring large-scale disease, healthcare, and experimental data sets are needed to discriminate between naturally occurring events and emergent diseases or bioterrorist attacks.
We describe the application of data mining techniques in developing an Automated Anomaly Detection Processor (AADP), which combines the Self-Organizing Map clustering algorithm with a Gaussian Mixture Model and a Bayesian Analyzer probabilistic model to detect anomalous occurrences in health data sets.
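The abstract does not give implementation details, so the following is only a minimal sketch of how the three named components could fit together, not the AADP itself. It assumes a one-dimensional Self-Organizing Map lattice, one spherical Gaussian per map node in place of a full EM-fitted mixture, and a Bayesian step that compares the mixture density against a flat alternative density. All function names, the prior, and the `alt_density` value are hypothetical choices made for illustration.

```python
import math
import random

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_unit(nodes, x):
    """Index of the SOM node whose weight vector is closest to x."""
    return min(range(len(nodes)), key=lambda j: sqdist(nodes[j], x))

def train_som(data, n_nodes=4, epochs=30, lr0=0.5, radius0=1.5, seed=0):
    """Train a 1-D SOM; learning rate and neighborhood radius decay per epoch."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):
            bmu = best_unit(nodes, x)
            for j, w in enumerate(nodes):
                # Gaussian neighborhood on the lattice index, centered on the BMU.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return nodes

def fit_mixture(data, nodes):
    """One spherical Gaussian per non-empty SOM node (a simplification, not EM)."""
    groups = {}
    for x in data:
        groups.setdefault(best_unit(nodes, x), []).append(x)
    dim = len(data[0])
    mixture = []
    for members in groups.values():
        mean = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        var = sum(sqdist(m, mean) for m in members) / (len(members) * dim)
        # Floor the variance so single-record clusters stay well defined.
        mixture.append((len(members) / len(data), mean, max(var, 1e-3)))
    return mixture

def mixture_density(mixture, x):
    """Density of x under the weighted sum of spherical Gaussians."""
    dim = len(x)
    total = 0.0
    for weight, mean, var in mixture:
        norm = (2 * math.pi * var) ** (-dim / 2)
        total += weight * norm * math.exp(-sqdist(x, mean) / (2 * var))
    return total

def anomaly_posterior(mixture, x, prior=0.01, alt_density=4e-4):
    """Bayes' rule: P(anomaly | x) against an assumed flat alternative density."""
    num = prior * alt_density
    return num / (num + (1 - prior) * mixture_density(mixture, x))

# Synthetic baseline records (illustrative only, not real surveillance data).
rng = random.Random(1)
baseline = [(rng.gauss(10, 1), rng.gauss(5, 1)) for _ in range(200)]
som_nodes = train_som(baseline)
baseline_mixture = fit_mixture(baseline, som_nodes)
typical_score = anomaly_posterior(baseline_mixture, (10.0, 5.0))
unusual_score = anomaly_posterior(baseline_mixture, (30.0, 40.0))
```

In this sketch a record close to the baseline cluster receives a posterior near zero, while a record far from every map node receives a posterior near one. A production system would more likely use a two-dimensional map, fit the mixture by expectation-maximization, and calibrate the prior and alternative density from historical data.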