Carnegie Mellon University

Anytime Prediction and Learning for the Balance between Computation and Accuracy

Thesis posted on 2019-06-27, 17:01, authored by Hanzhang Hu
When choosing machine learning algorithms, one often has to balance two opposing factors: the computational speed and the accuracy of predictors. The trade-off at test time is often difficult to balance, because the test-time computational budget may be unknown at training time and may vary during testing. Analogously, given a novel data-set, one often lacks prior knowledge of the appropriate predictor complexity and training computation, and furthermore may want to interrupt or prolong training based on intermediate results. In this work, we address these trade-offs between computation and accuracy via anytime prediction and learning, i.e., algorithms that can be interrupted at any time and still produce valid solutions, and whose solution quality improves with the computation consumed before interruption. With the versatility to adjust to any budget, anytime algorithms automatically utilize an unknown computational budget to the maximum extent.

To address the test-time trade-off, we study anytime predictors, whose prediction computation can be interrupted during testing. We start by developing provably near-optimal anytime linear predictors, and derive a theoretical performance limitation for anytime predictors that are based on ensemble methods. We then develop practical anytime predictions within individual neural networks via multi-objective optimization. Furthermore, leveraging these anytime predictors as weak learners, we circumvent the performance limitation of ensemble-based anytime predictors.

For the train-time trade-off, we consider the neural architecture search problem, in which one seeks the optimal neural network structure for a data-set. We draw a parallel between this bi-level combinatorial optimization problem and the feature selection problem in linear prediction, and develop an iterative network growth algorithm inspired by a forward selection algorithm. We also consider the problem of training on large data-sets, and develop no-regret gradient boosting algorithms for stochastic data streams.
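
To make the anytime-prediction idea concrete, the sketch below is a minimal, assumption-based illustration (not the implementation from this thesis): auxiliary prediction heads are attached to intermediate layers of a single network, all heads are trained jointly by collapsing the multiple objectives into a weighted sum of losses, and an interruption at test time returns the most recently computed head's output. The names AnytimeNet and anytime_loss, and the toy architecture, are hypothetical.

# Minimal sketch of anytime prediction via early-exit heads (PyTorch).
import torch
import torch.nn as nn

class AnytimeNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, num_classes=10, num_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.heads = nn.ModuleList()  # one auxiliary predictor per block
        dim = in_dim
        for _ in range(num_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            self.heads.append(nn.Linear(hidden, num_classes))
            dim = hidden

    def forward(self, x):
        # Produce a prediction after every block; deeper heads see more computation.
        outputs = []
        for block, head in zip(self.blocks, self.heads):
            x = block(x)
            outputs.append(head(x))
        return outputs

def anytime_loss(outputs, target, weights=None):
    # One common way to handle the multi-objective training: a weighted sum
    # of the per-head losses (the weights here are illustrative).
    weights = weights if weights is not None else [1.0] * len(outputs)
    ce = nn.CrossEntropyLoss()
    return sum(w * ce(out, target) for w, out in zip(weights, outputs))

# Example: jointly train all heads; interrupting after block i at test time
# still yields the valid prediction from head i.
model = AnytimeNet()
x = torch.randn(8, 32)
y = torch.randint(0, 10, (8,))
loss = anytime_loss(model(x), y)
loss.backward()

The weighted-sum loss is only one possible treatment of the multiple objectives; the key property it illustrates is that every intermediate head remains a usable predictor, so prediction quality degrades gracefully when the computation budget runs out early.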

History

Date

2019-05-14

Degree Type

  • Dissertation

Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

J. Andrew Bagnell, Martial Hebert
