Rare Class Discovery Based on Active Learning
In machine learning, discovering new classes remains an open challenge, especially when the emerging classes are rare. Yet the problem is crucially important for applications such as detecting new financial fraud patterns, new viral mutations, and new network malware, most of which `hide' among vast volumes of normal data and observations. This paper presents a new approach, based on local-topology density estimation, that rapidly discovers examples of rare classes even when they are not separable from the majority class(es). The new method, called ALICE, and its variant, MALICE, are shown both theoretically and empirically to be effective, outperforming other methods in the literature on challenging synthetic data and on real data sets.