Many modern problems in causal inference have non-trivial complications beyond the classical settings of randomized trials, parametric models, and average treatment effects.
Despite their inherent complexities, many recent questions in causal inference are still tackled via overly simplified methods and data structures. My thesis is dedicated to overcoming some of these methodological limitations of classical causal inference, aiming to bridge the gap between methodological development and practice by effectively harnessing advanced machine learning tools. My work can be categorized into the following three sub-topics.
a.) Stochastic interventions for general longitudinal data. We generalize novel "incremental" intervention effects to accommodate subject dropout in longitudinal studies. Our methods do not require positivity or parametric assumptions, and are less sensitive to the curse of dimensionality. We present efficient nonparametric estimators, showing that they converge at √n rates and yield uniform inferential guarantees. Importantly, we argue that incremental effects are much more efficient than conventional deterministic effects in a novel infinite time horizon setting, where the number of timepoints can grow to infinity.
b.) Causal effects based on distributional distances. We propose a novel nonstandard causal effect based on the discrepancy between unobserved counterfactual distributions (specifically, the L1 distance), in order to provide more nuanced and valuable information about treatment effects than simple mean shifts. We consider single- and multi-source randomized studies, as well as observational studies, and analyze error bounds and asymptotic properties of the proposed estimators. Special difficulties arise due to the non-smoothness of the L1 distance functional.
c.) Causal clustering. We give a novel adaptation of unsupervised learning methods for analyzing treatment effect heterogeneity. Specifically, we pursue an efficient way to uncover subgroup structure in conditional treatment effects by leveraging tools from cluster analysis. We find conditions under which k-means, density-based, and hierarchical clustering algorithms can be successfully adopted into our framework. For k-means causal clustering, we develop a novel estimator that attains fast convergence rates and asymptotic normality of the cluster centers, even under weak nonparametric conditions on nuisance function estimation. Unlike previous studies, our framework can be easily extended to outcome-wide studies.
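As a rough illustration of the three targets described above, the display below writes them in standard notation. The symbols are assumptions and are not taken from the text: $A$ is a binary treatment, $X$ the covariates, $Y^a$ the counterfactual outcome under treatment level $a$, $\pi(x) = P(A = 1 \mid X = x)$ the propensity score, $\delta > 0$ a user-chosen odds multiplier, and $p_a$ the density of $Y^a$. The first line is one common single-timepoint formulation of an incremental intervention and may differ from the longitudinal, dropout-adjusted definition used in the thesis.

```latex
\begin{align*}
q_\delta(x) &= \frac{\delta\,\pi(x)}{\delta\,\pi(x) + 1 - \pi(x)}
  && \text{(shifted propensity: treatment odds multiplied by } \delta\text{)} \\
\psi_{L_1} &= \int \lvert p_1(y) - p_0(y) \rvert \, dy
  && \text{($L_1$ distance between counterfactual densities)} \\
\tau(x) &= E\!\left[ Y^1 - Y^0 \mid X = x \right]
  && \text{(conditional effect whose values are clustered)}
\end{align*}
```

Under these assumed definitions, $q_\delta$ stays strictly between 0 and 1 whenever $\pi$ does, which is consistent with the claim that positivity is not required. The absolute value in $\psi_{L_1}$ is the source of the non-smoothness noted in (b).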