Treelets —An Adaptive Multi-Scale Basis for Sparse Unordered Data

2012-04-01T00:00:00Z (GMT) by Ann B. Lee Boaz Nadler Larry Wasserman
<p>In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered -- with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity; the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper we present treelets -- a novel construction of multi-scale orthonormal bases that extends wavelets to non-smooth signals. Treelets capture the internal structure of the data and can as a dimensionality reduction tool significantly improve inference and prediction. We examine a variety of situations where treelets outperform principal component analysis and some common variable selection methods. The proposed method is illustrated on a linear mixture model, and on two real data sets: internet advertisements and DNA microarray data.</p>

Categories

Keyword(s)

License

In Copyright