Algorithms for Learning Latent Models: From Establishing Tractability to Approaching Optimality
Modern machine learning relies on algorithms that fit expressive latent models to large datasets. While such tasks are easy in low dimensions, real-world datasets are inherently high-dimensional, often leading to computational intractability. Additionally, a prerequisite to deploying models in real-world systems is to ensure that their behavior degrades gracefully when the modeling assumptions no longer hold. Therefore, there is a growing need for efficient algorithms that fit reliable and robust models to data and are accompanied by provable guarantees.
In this thesis, we focus on designing such efficient and robust algorithms for learning latent variable models. In particular, we investigate two complementary regimes arising in learning latent models: establishing computational tractability and approaching computational optimality. The first regime considers learning high-dimensional latent models for which no efficient algorithms were known. We resolve several central open questions in this regime by providing the first polynomial-time algorithms for robustly learning a mixture of Gaussians, robust linear regression, and learning two-layer neural networks. The second regime considers models for which polynomial-time algorithms were already well-established. Here, we show that we can obtain algorithms with information-theoretically minimal running time and sample complexity. In particular, we show that for several low-rank models there is no statistical vs. computational trade-off.
- Computer Science
- Doctor of Philosophy (PhD)