Carnegie Mellon University

Topics in Nonparametric Causal Inference

posted on 2023-05-17, 19:22 authored by Matteo Bonvini

We study several problems related to the identification and efficient estimation of parameters arising in causal inference. In the first part of this thesis, we consider the problem of conducting sensitivity analysis to the no-unmeasured-confounding assumption in observational studies. Roughly speaking, confounders are variables that affect both treatment receipt and the outcome. To estimate causal effects, all such variables must be measured and properly taken into account in the statistical analysis. This is an untestable assumption in the problems considered here because the treatment is not randomly assigned by the experimenter. Therefore, in these settings, gauging the impact of departures from this assumption on the estimated causal effects is of great practical relevance. In one project, we develop a novel framework that bounds the average treatment effect (ATE) as a function of the proportion of units for which the treatment-outcome association is confounded. In other work, we propose and analyze a suite of models for obtaining bounds on certain causal effects when a marginal structural model is assumed.

In the second part of this thesis, we study the efficient estimation of two popular causal parameters: the dose-response function (DRF) and the level sets of the conditional ATE (CATE) curve. The DRF measures the expected outcome if everyone in the population were to take a given treatment level. When the treatment is continuous, this parameter is a curve, viewed as a function of the infinitely many treatment values. We study several procedures to estimate the DRF and derive an estimator that, under certain conditions and to the best of our knowledge, achieves the lowest mean squared error currently known in the literature. In a second paper, we derive the minimax optimal estimator of CATE level sets and provide upper bounds on the risk of other, simpler estimation procedures. CATE level sets are useful in many applications because they identify units with large treatment effects, which is the crucial information needed to optimally allocate the treatment.
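As a purely illustrative aside, the notion of a CATE level set can be sketched on a toy example. The sketch below is not the estimator derived in the thesis; it assumes a single discrete covariate, a binary treatment, and no unmeasured confounding within each stratum, so that the CATE in a stratum can be estimated by a simple difference in means. The function names `cate_by_stratum` and `upper_level_set` and the data are hypothetical.

```python
from collections import defaultdict


def cate_by_stratum(records):
    """Estimate the CATE in each covariate stratum by a difference in means.

    records: iterable of (x, a, y), where x is a hashable stratum label,
    a is the binary treatment (0 or 1), and y is the outcome.
    Assumes (for illustration only) no unmeasured confounding within strata.
    """
    # Per stratum: [sum of y for a=0, count for a=0, sum of y for a=1, count for a=1]
    sums = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for x, a, y in records:
        s = sums[x]
        s[2 * a] += y
        s[2 * a + 1] += 1

    cate = {}
    for x, (sum_y0, n0, sum_y1, n1) in sums.items():
        if n0 > 0 and n1 > 0:  # both arms must be observed in the stratum
            cate[x] = sum_y1 / n1 - sum_y0 / n0
    return cate


def upper_level_set(cate, threshold):
    """Return the strata whose estimated CATE exceeds the threshold."""
    return {x for x, tau in cate.items() if tau > threshold}


# Hypothetical toy data: (stratum, treatment, outcome)
records = [
    ("young", 0, 1.0), ("young", 0, 2.0), ("young", 1, 4.0), ("young", 1, 5.0),
    ("old", 0, 3.0), ("old", 0, 3.0), ("old", 1, 3.0), ("old", 1, 4.0),
]

cate = cate_by_stratum(records)
level_set = upper_level_set(cate, threshold=1.0)
print(cate, level_set)
```

In this toy setting, the level set collects the strata that would be prioritized for treatment under a threshold rule. With continuous covariates, the CATE would have to be estimated by a regression method and the level set would inherit its estimation error, which is the setting the thesis analyzes.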

Finally, in the third part of this thesis, we study the effects of reduced mobility on the number of Covid-19 deaths. We tackle this problem by specifying a marginal structural model motivated by an epidemic model. Our analysis finds that, for many US states at the beginning of the pandemic, a decrease in mobility led to significantly fewer deaths.




Degree Type

  • Dissertation

Department

  • Statistics and Data Science

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

  • Edward H. Kennedy
