Carnegie Mellon University

Transformation-based Probabilistic Clustering with Supervision

journal contribution
posted on 2014-07-01, 00:00 authored by Siddarth Gopal, Yiming Yang

One of the common problems with clustering is that the generated clusters often do not match user expectations. This paper proposes a novel probabilistic framework that exploits supervised information in a discriminative and transferable manner to generate better clustering of unlabeled data. The supervision is provided by revealing the cluster assignments for some subset of the ground truth clusters and is used to learn a transformation of the data such that labeled instances form well-separated clusters with respect to the given clustering objective. This estimated transformation function enables us to fold the remaining unlabeled data into a space where new clusters hopefully match user expectations. While our framework is general, in this paper, we focus on its application to Gaussian and von Mises-Fisher mixture models. Extensive testing on 23 data sets across several application domains revealed substantial improvement in performance over competing methods.
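The overall workflow described in the abstract (learn a transformation from a few labeled clusters, then cluster the unlabeled data in the transformed space) can be illustrated with a small sketch. This is not the paper's method: the abstract does not give the estimation procedure, so the sketch swaps in within-cluster whitening for the learned transformation and plain k-means for the mixture model. All names (`within_cluster_scatter`, `learn_transform`, `kmeans`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def within_cluster_scatter(X, y):
    """Pooled within-cluster covariance of the labeled points."""
    d = X.shape[1]
    S = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        Xc = Xc - Xc.mean(axis=0)
        S += Xc.T @ Xc
    return S / len(X)

def learn_transform(X_lab, y_lab, eps=1e-6):
    """Stand-in transformation (an assumption, not the paper's estimator):
    the symmetric inverse square root of the within-cluster covariance, so
    labeled clusters become roughly spherical after x -> L x."""
    S = within_cluster_scatter(X_lab, y_lab)
    evals, evecs = np.linalg.eigh(S + eps * np.eye(S.shape[0]))
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means, used here in place of a mixture model."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic data: two informative dimensions, two high-variance noise dimensions.
rng = np.random.default_rng(0)
scale = np.array([0.5, 0.5, 5.0, 5.0])

def sample(center, n):
    return np.asarray(center, dtype=float) + rng.normal(size=(n, 4)) * scale

# Labeled points come from two revealed clusters ...
X_lab = np.vstack([sample([0, 0, 0, 0], 100), sample([3, 0, 0, 0], 100)])
y_lab = np.repeat([0, 1], 100)
# ... while the unlabeled points come from two clusters never seen with labels.
X_unl = np.vstack([sample([0, 3, 0, 0], 100), sample([3, 3, 0, 0], 100)])

L = learn_transform(X_lab, y_lab)   # learned from labeled data only
labels = kmeans(X_unl @ L.T, k=2)   # cluster unlabeled data in the new space
```

The point of the sketch is the transfer step: the transformation is estimated from the labeled clusters alone and then applied unchanged to data drawn from different clusters, which here shrinks the noisy dimensions before clustering.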

History

Publisher Statement

Copyright © 2014 by AUAI Press

Date

2014-07-01
