Finding and Leveraging Structure in Learning Problems
In the last several years we have witnessed the creation of data at an unprecedented rate, and the size of datasets available in various applications has exploded. This data comes from everywhere: sensors that gather climate information, sky survey telescopes that collect astronomy data, customers who generate purchase records, and microarrays that produce gene expression data, to name a few. Modern machine learning and statistics have focused extensively on solving various inference problems involving these datasets. In this thesis we develop robust estimation procedures with theoretical guarantees for a variety of learning problems using noisy and high-dimensional data.
Learning from noisy and high-dimensional data can be impossible if we do not exploit structure available in the data or the learning task. In this thesis we focus on understanding the statistical and computational aspects of finding and leveraging structure in these datasets and learning problems.
The challenges we address in this thesis broadly fall into three categories: high-dimensional sparse learning, clustering from noisy high-dimensional data, and topological data analysis. In each case our main focus is on developing principled, efficient algorithms that leverage hidden structure and on providing rigorous theoretical analysis of their performance. In several cases we also provide statistical lower bounds that establish the fundamental limits of the problems we consider.
History
Date
2013-08-01
Degree Type
- Dissertation
Thesis Department
- Statistics and Data Science
Degree Name
- Doctor of Philosophy (PhD)