Carnegie Mellon University
wneiswanger_MachineLearning_2019.pdf (8.06 MB)

Post-inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making

thesis
posted on 2020-02-25, 21:16, authored by William Neiswanger
Probabilistic modeling refers to a set of techniques for modeling data that allow one to specify assumptions about the processes that generate the data, incorporate prior beliefs about models, and infer properties of these models given observed data. Benefits include uncertainty quantification, multiple plausible solutions, reduced overfitting, better performance given small data or large models, and explicit incorporation of a priori knowledge and problem structure. In recent decades, an array of inference algorithms has been developed to estimate these models.
This thesis focuses on post-inference methods: procedures applied after a standard inference algorithm has run, which provide increased efficiency, accuracy, or parallelism when learning probabilistic models of large data sets. These methods also allow for scalable computation in distributed or online settings, incorporation of complex prior information, and better use of inference results in downstream tasks. A few examples include:
• Embarrassingly parallel inference. Large data sets are often distributed over a collection of machines. We first compute an inference result (e.g. with Markov chain Monte Carlo or variational inference) on each machine, in parallel, without communication between machines. Afterwards, we combine the results to yield an inference result for the full data set (see the first sketch following this list).
• Prior swapping. Certain model priors limit the number of applicable inference algorithms or increase their computational cost. We first choose any “convenient prior” (e.g. a conjugate prior, or a prior that allows for computationally cheap inference) and compute an inference result. Afterwards, we use this result to efficiently perform inference with other, more sophisticated priors or regularizers (see the second sketch following this list).
• Sequential decision making and optimization. Model-based sequential decision making and optimization methods use models to define acquisition functions. We compute acquisition functions using the inference result from any probabilistic program or modeling framework, and perform efficient inference in sequential settings (see the third sketch following this list).
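The first sketch below illustrates one standard combination strategy for embarrassingly parallel inference: approximate each machine's subposterior with a Gaussian and form the product of these Gaussians. It is a minimal sketch assuming NumPy, with each machine's samples given as an (n_m, d) array; the function name combine_subposteriors is illustrative, not taken from the thesis.

    import numpy as np

    def combine_subposteriors(sample_sets):
        # Fit a Gaussian N(mu_m, Sigma_m) to each machine's subposterior
        # samples (each element of sample_sets is an (n_m, d) array).
        mus = [s.mean(axis=0) for s in sample_sets]
        precisions = [np.linalg.inv(np.cov(s, rowvar=False))
                      for s in sample_sets]

        # The product of the Gaussian subposterior densities is Gaussian:
        # precisions add, and the combined mean is the precision-weighted
        # average of the subposterior means.
        combined_cov = np.linalg.inv(sum(precisions))
        combined_mean = combined_cov @ sum(P @ mu
                                           for P, mu in zip(precisions, mus))
        return combined_mean, combined_cov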
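The second sketch shows the simplest post-inference prior correction: importance reweighting of samples drawn under a convenient prior so that they target the posterior under a different prior. The likelihood terms cancel in the weights, so they never need to be re-evaluated. The thesis develops more robust prior swapping procedures; this is only an illustrative baseline, with hypothetical function names.

    import numpy as np

    def swap_prior(samples, log_convenient_prior, log_target_prior, rng=None):
        # samples: (n, d) draws from the posterior computed under the
        # convenient prior. Each draw's importance weight is the ratio of
        # target to convenient prior densities, since the likelihood cancels.
        if rng is None:
            rng = np.random.default_rng()
        log_w = np.array([log_target_prior(t) - log_convenient_prior(t)
                          for t in samples])
        w = np.exp(log_w - log_w.max())  # subtract max for numerical stability
        w /= w.sum()
        # Resample with replacement to return unweighted draws that
        # approximately target the posterior under the target prior.
        idx = rng.choice(len(samples), size=len(samples), p=w)
        return samples[idx]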
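The third sketch estimates an expected-improvement acquisition function directly from posterior predictive samples, so that any inference result that yields samples can drive sequential optimization. The function name and array shapes are assumptions for illustration.

    import numpy as np

    def expected_improvement(predictive_samples, best_so_far):
        # predictive_samples: (n_samples, n_candidates) posterior predictive
        # draws of the objective at each candidate point, taken from any
        # inference result. best_so_far: smallest objective value observed.
        improvement = np.maximum(best_so_far - predictive_samples, 0.0)
        # Monte Carlo estimate of E[max(best_so_far - f(x), 0)] per
        # candidate; the candidate maximizing this estimate is queried next.
        return improvement.mean(axis=0)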
We also describe the benefits of combining the above methods, present methodology for applying the embarrassingly parallel procedures when the number of machines is dynamic or unknown at inference time, illustrate how these methods can be applied to spatiotemporal analysis and covariate-dependent models, show ways to optimize these methods by incorporating test functions of interest, and demonstrate how these methods can be implemented in probabilistic programming frameworks for automatic deployment.

History

Date

2019-08-22

Degree Type

  • Dissertation

Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Eric Xing