posted on 2012-10-01, 00:00authored byMaria-Florina Balcan, Avrim Blum
In this paper, we initiate a theoretical study of the problem
of clustering data under interactive feedback.We introduce a query-based
model in which users can provide feedback to a clustering algorithm in
a natural way via split and merge requests. We then analyze the “clusterability” of different concept classes in this framework — the ability
to cluster correctly with a bounded number of requests under only the
assumption that each cluster can be described by a concept in the class
— and provide efficient algorithms as well as information-theoretic upper
and lower bounds.