Greedy Feature Selection for Subspace Clustering

Dyer, Eva L.; Sankaranarayanan, Aswin C.; Baraniuk, Richard G.

doi:10.1184/R1/6468848.v1

file.pdf (934.3 kB)

Greedy Feature Selection for Subspace Clustering

journal contribution

posted on 2013-03-01, 00:00 authored by Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging due to the fact that segmentation---the identification of points that live in the same subspace---and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection(EFS)---recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with ℓ₁-minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the data set is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.

History

Publisher Statement

Date

2013-03-01

Usage metrics

Keywords

subspace clustering unions of subspaces hybrid linear models sparse approximation structured sparsity nearest neighbors low-rank approximation

Licence

In Copyright

Exports

RefWorks

BibTeX

Ref. manager

Endnote

DataCite

NLM

DC

Greedy Feature Selection for Subspace Clustering

History

Publisher Statement

Date

Usage metrics

Categories

Keywords

Licence

Exports