Carnegie Mellon University

Nonparametric Divergence Estimation and its Applications to Machine Learning

Journal contribution, posted on 2011-01-01, authored by Barnabás Póczos, Liang Xiong, Jeff Schneider

Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. Here we consider the setting where each input instance corresponds to a continuous probability distribution. These distributions are unknown to us, but we are given i.i.d. samples from each of them. While most existing machine learning methods operate on points, i.e. finite-dimensional feature vectors, in our setting we study algorithms that operate on groups, i.e. sets of feature vectors. For this purpose, we propose new nonparametric, consistent estimators for a large family of divergences and describe how to apply them to machine learning problems. As important special cases, the estimators can be used to estimate the Rényi, Tsallis, Kullback-Leibler, and L2 divergences, the Hellinger and Bhattacharyya distances, and mutual information. We present empirical results on synthetic data, real-world images, and astronomical data sets.
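To make the idea concrete, the sketch below shows one standard nonparametric divergence estimator of the kind this line of work builds on: the k-nearest-neighbor plug-in estimate of the Kullback-Leibler divergence from two finite samples (in the form of Wang, Kulkarni, and Verdú). It is not necessarily the exact estimator proposed in the paper; the function name `knn_kl_divergence` and the parameter choices are illustrative assumptions.

```python
import numpy as np

def knn_kl_divergence(x, y, k=1):
    """k-NN estimate of KL(P || Q) from samples x ~ P (n x d) and y ~ Q (m x d).

    Uses the classic plug-in form
        D_hat = (d/n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1)),
    where rho_k(i) is the distance from x_i to its k-th nearest neighbor
    among the other x points, and nu_k(i) is the distance from x_i to its
    k-th nearest neighbor among the y points.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    n, d = x.shape
    m = y.shape[0]

    # Pairwise Euclidean distances (brute force; a KD-tree scales better).
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    np.fill_diagonal(dxx, np.inf)  # exclude each point's zero self-distance

    rho = np.sort(dxx, axis=1)[:, k - 1]  # k-th NN distance within x
    nu = np.sort(dxy, axis=1)[:, k - 1]   # k-th NN distance into y

    return d / n * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```

Usage: with samples from the same distribution the estimate concentrates near zero, and it grows as the two sample distributions separate, which is what makes such estimators usable as "distances between groups of feature vectors" for the clustering and anomaly-detection tasks described above.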



