Statistical Inference for Optimal Transport
Optimal transport is a flexible framework for comparing probability distributions that has recently attracted a surge of interest as a methodological tool in statistics. The aim of this thesis is to develop procedures for performing valid and efficient statistical inference for various objects arising from the optimal transport framework. First, we derive a semiparametric efficient estimator of the quadratic Wasserstein distance between probability measures of arbitrary fixed dimension. Second, we develop a pointwise central limit theorem for the quadratic optimal transport map between multivariate periodic distributions. We also develop nonasymptotic and sequential inferential procedures for various optimal transport divergence functionals. These results provide a step toward the longstanding problem of performing practical inference for optimal transport in arbitrary dimension. Along the way, this thesis studies the related question of performing minimax estimation for optimal transport maps and costs, leading to new minimax upper or lower bounds for these problems in various settings. We close with an application of these ideas to a problem in experimental high energy physics, where we show that optimal transport can be used for data-driven background estimation in the search for new physical phenomena at the Large Hadron Collider.
History
Date
2024-05-10
Degree Type
- Dissertation
Department
- Statistics and Data Science
Degree Name
- Doctor of Philosophy (PhD)