Column Subset Selection with Missing Data via Active Sampling

journal contribution
posted on 01.02.2015, 00:00 authored by Yining Wang, Aarti Singh

Column subset selection of massive data matrices has found numerous applications in real-world data systems. In this paper, we propose and analyze two sampling-based algorithms for column subset selection that do not require access to the complete input matrix. To our knowledge, these are the first provably correct algorithms for column subset selection with missing data. The proposed methods work for row/column coherent matrices by employing the idea of adaptive sampling. Furthermore, when the input matrix has a noisy low-rank structure, one of the algorithms enjoys a relative error bound.
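For readers unfamiliar with the adaptive sampling idea mentioned in the abstract, the following is a minimal sketch of generic residual-based adaptive column sampling, not the algorithms proposed in the paper. It assumes the full matrix is observed, which is exactly the assumption the paper removes, and the function names `adaptive_column_selection` and `reconstruction_error`, as well as the parameters `n_rounds` and `cols_per_round`, are hypothetical names chosen for illustration.

```python
import numpy as np


def adaptive_column_selection(M, n_rounds, cols_per_round, seed=0):
    """Pick column indices of M by sampling proportionally to residual norms.

    Illustrative sketch only: assumes M is fully observed, unlike the
    missing-data setting studied in the paper.
    """
    rng = np.random.default_rng(seed)
    M = np.asarray(M, dtype=float)
    n_cols = M.shape[1]
    total_energy = np.sum(M ** 2)
    selected = []
    residual = M.copy()
    for _ in range(n_rounds):
        # Squared norm of each column of the current residual.
        col_norms = np.sum(residual ** 2, axis=0)
        total = col_norms.sum()
        # Stop once the selected columns already span (numerically) all of M.
        if total <= 1e-20 * max(total_energy, 1.0):
            break
        probs = col_norms / total
        picks = rng.choice(n_cols, size=cols_per_round, replace=True, p=probs)
        for j in picks:
            if int(j) not in selected:
                selected.append(int(j))
        # Residual of M after least-squares projection onto the selected columns.
        C = M[:, selected]
        coeffs, *_ = np.linalg.lstsq(C, M, rcond=None)
        residual = M - C @ coeffs
    return selected


def reconstruction_error(M, selected):
    """Frobenius norm of M minus its projection onto the selected columns."""
    M = np.asarray(M, dtype=float)
    C = M[:, selected]
    coeffs, *_ = np.linalg.lstsq(C, M, rcond=None)
    return np.linalg.norm(M - C @ coeffs, "fro")
```

In this sketch, columns that are poorly explained by the current selection receive a higher sampling probability in the next round, which is the sense in which the sampling is adaptive. A relative error bound, as referred to in the abstract, compares a quantity like `reconstruction_error` to the error of the best low-rank approximation of the matrix.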


Publisher Statement

Copyright 2015 by the authors


