Posted on 2006-07-01, 00:00. Authored by Jingrui He, Jaime G. Carbonell.
In machine learning, the new-class discovery problem
remains an open challenge, especially for emergent rare
classes. However, the challenge is of crucial importance for applications such as detecting new financial
fraud patterns, new viral mutations and new network
malware, most of which "hide" among vast volumes of
normal data and observations. This paper focuses on
a new approach, based on local-topology density estimation, applicable to discovering examples of the rare
classes rapidly, despite non-separability from the majority class(es). The new method, called ALICE, and
its variant, MALICE, are shown to be effective both theoretically and empirically, outperforming other methods
in the literature on both challenging synthetic data
and real data sets.