Carnegie Mellon University

Towards Efficient Automated Machine Learning

Thesis authored by Liam Li, posted on 2021-04-14.
Machine learning is widely used across disciplines to develop predictive models for variables of interest. However, building such solutions is a time-consuming and challenging process that requires highly trained data scientists and domain experts. In response, the field of automated machine learning (AutoML) aims to reduce human effort and speed up the development cycle through automation. Because hyperparameters are ubiquitous in machine learning algorithms and a well-tuned hyperparameter configuration can have a large impact on predictive performance, hyperparameter optimization is a core problem in AutoML. More recently, the rise of deep learning has motivated neural architecture search (NAS), a specialized hyperparameter optimization problem focused on automating the design of neural networks. Naive approaches to hyperparameter optimization, such as grid search and random search, are computationally intractable for large-scale tuning problems. Consequently, this thesis focuses on developing efficient and principled methods for hyperparameter optimization and NAS. In particular, we make progress towards answering the following questions, with the aim of developing algorithms for more efficient and effective automated machine learning:
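For concreteness, the following is a minimal sketch of plain random search, one of the naive baselines mentioned above: each sampled configuration is trained and evaluated in full, so total cost grows linearly with the number of trials. The toy search space, the evaluate function, and the trial count are hypothetical placeholders, not code from the thesis.

    import random

    def sample_config():
        # Draw one configuration uniformly at random from a toy search space.
        return {
            "learning_rate": 10 ** random.uniform(-4, -1),
            "batch_size": random.choice([32, 64, 128, 256]),
            "weight_decay": 10 ** random.uniform(-6, -2),
        }

    def random_search(evaluate, num_trials=50):
        # Train and fully evaluate every sampled configuration; the budget
        # spent is num_trials full training runs.
        best_config, best_score = None, float("-inf")
        for _ in range(num_trials):
            config = sample_config()
            score = evaluate(config)  # e.g., validation accuracy after full training
            if score > best_score:
                best_config, best_score = config, score
        return best_config, best_score

Grid search behaves similarly, except that configurations come from a fixed Cartesian product of values, so the number of trials grows exponentially with the number of hyperparameters; in both cases every trial pays for a full training run, which is what makes these baselines intractable at scale.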
1. Hyperparameter Optimization
(a) How can we effectively use early-stopping to speed up hyperparameter optimization?
(b) How can we exploit parallel computing to perform hyperparameter optimization in the same time it takes to train a single model in the sequential setting?
(c) For multi-stage machine learning pipelines, how can we exploit the structure of the search space to reduce total computational cost?
2. Neural Architecture Search
(a) What is the gap in performance between state-of-the-art weight-sharing NAS methods and random search baselines?
(b) How can we develop more principled weight-sharing methods with provably faster convergence rates and improved empirical performance?
(c) Does the weight-sharing paradigm commonly used in NAS have applications to more general hyperparameter optimization problems?
Given these problems, this thesis is organized into two parts. The first part focuses on progress we have made towards efficient hyperparameter optimization by addressing Problems 1a, 1b, and 1c. The second part focuses on progress we have made towards understanding and improving weight-sharing for neural architecture search and beyond by addressing Problems 2a, 2b, and 2c.
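To make the early-stopping idea behind Problem 1a concrete, below is a minimal successive-halving-style sketch: a large pool of configurations is evaluated on a small training budget, and only the most promising fraction is promoted to a larger budget. This is a simplified, generic illustration under assumed interfaces (sample_config and partial_eval are hypothetical, and the default values are arbitrary), not the specific algorithms developed in the thesis.

    def successive_halving(sample_config, partial_eval, num_configs=27, min_budget=1, eta=3):
        # partial_eval(config, budget) is assumed to train `config` for `budget`
        # units (e.g., epochs) and return a validation score (higher is better).
        configs = [sample_config() for _ in range(num_configs)]
        budget = min_budget
        while len(configs) > 1:
            scored = [(partial_eval(c, budget), c) for c in configs]
            scored.sort(key=lambda pair: pair[0], reverse=True)
            # Keep the top 1/eta fraction and give the survivors eta times more budget.
            keep = max(1, len(configs) // eta)
            configs = [c for _, c in scored[:keep]]
            budget *= eta
        return configs[0]

Under these assumed defaults, the total cost is 27 + 27 + 27 = 81 training units (27 configurations at budget 1, 9 at budget 3, and 3 at budget 9), versus 27 × 27 = 729 units if every configuration were trained to the maximum budget of 27.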

History

Date

2020-05-17

Degree Type

  • Dissertation

Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Ameet Talwalkar
