Prior-Free Rare Category Detection

He, Jingrui; Carbonell, Jaime G.

doi:10.1184/R1/6625385.v1

journal contribution

posted on 2009-04-01, 00:00 authored by Jingrui He, Jaime G. Carbonell

Rare category detection is an open challenge in machine learning. It plays a central role in applications such as detecting new financial fraud patterns, detecting new network malware, and scientific discovery. In such cases, rare categories are hidden among huge volumes of normal data and observations. In this paper, we propose a new method for rare category detection named SEDER, which requires no prior information about the data set. It implicitly performs semiparametric density estimation using specially designed exponential families, and then picks the examples for labeling where the neighborhood density changes the most. SEDER can work in cases where the data is not separable. Its unique feature over all existing methods lies in its prior-free nature, i.e., it does not require any prior information about the data set (e.g., the number of classes or the proportions of the different classes). Therefore, it is more suitable for real applications. Experimental results on both synthetic and real data sets demonstrate the superiority of SEDER.
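
The selection idea stated in the abstract (label the examples where the neighborhood density changes the most) can be illustrated with a minimal sketch. This is not SEDER itself: it swaps in a plain Gaussian kernel density estimate for the paper's semiparametric exponential-family estimator, and the function name `density_change_scores`, the `bandwidth` and `k` parameters, and the scoring rule (largest absolute density difference to any of the k nearest neighbors) are all assumptions made for illustration.

```python
import numpy as np


def density_change_scores(X, bandwidth=0.5, k=5):
    """Hypothetical helper: score each point by the largest density
    difference between it and any of its k nearest neighbours.

    Assumes a Gaussian kernel density estimate, not SEDER's estimator.
    Requires k <= len(X) - 1.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Unnormalised Gaussian kernel density estimate at every point.
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)
    # Exclude each point from its own neighbour list.
    masked = sq_dist.copy()
    np.fill_diagonal(masked, np.inf)
    neighbours = np.argsort(masked, axis=1)[:, :k]
    # Largest absolute density change across the neighbourhood.
    return np.max(np.abs(density[:, None] - density[neighbours]), axis=1)


# Illustrative data (assumed): a sparse line of points plus a tight cluster.
X = np.concatenate([np.arange(20.0), [9.48, 9.49, 9.5, 9.51, 9.52]])[:, None]
scores = density_change_scores(X, bandwidth=0.5, k=5)
# Candidates to hand to a labeling oracle, highest density change first.
query_order = np.argsort(-scores)
print(X[query_order[:3], 0])
```

The highest-scoring points under this sketch sit where a dense region meets a sparse one, so they may belong to either the rare or the majority class; in a detection loop, labels would be requested in `query_order` until an example of a new class is found.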

History

Publisher Statement

Copyright SIAM

Date

2009-04-01

Keywords

Software Research

Licence

In Copyright
