Carnegie Mellon University

Scalable and Robust Group Discovery on Large Transactional Data

journal contribution
posted on 2005-12-01, authored by Patrick Pakyan Choi, Andrew W Moore, Jeremy Kubica
The need for time-critical analysis and understanding of the underlying group structure in transactional data has been growing in domains such as law enforcement and customs. Kubica et al. (2003) proposed k-groups, an algorithm based on a probabilistic generative model for discovering underlying groups in data. Even though k-groups is reported to be significantly faster than its predecessor GDA (Kubica et al., 2002), it is still too slow and memory-intensive for large datasets in practice. This paper presents XGDA, a framework for scalable and robust group discovery. An evaluation of XGDA and k-groups shows that XGDA can handle extremely large datasets in reasonable time and yields more robust solutions than k-groups.

History

Publisher Statement

All Rights Reserved

Date

2005-12-01
