Causal Inference with Complex Data Structures and Non-Standard Effects

Kim, Kwangho

doi:10.1184/R1/12789875.v1


thesis

posted on 2020-08-17, 18:02 authored by Kwangho Kim

Many modern problems in causal inference have non-trivial complications beyond the classical settings of randomized trials, parametric models, and average treatment effects.

Despite their inherent complexities, many recent questions in causal inference are still tackled with overly simplified methods and data structures. My thesis is dedicated to overcoming some of these methodological limitations of classical causal inference, aiming to bridge the gap between methodological development and practice by effectively harnessing advanced machine learning tools. My work falls into the following three sub-topics.

a.) Stochastic interventions for general longitudinal data. We generalize novel "incremental" intervention effects to accommodate subject dropout in longitudinal studies. Our methods do not require positivity or parametric assumptions, and are less sensitive to the curse of dimensionality. We present efficient nonparametric estimators, showing that they converge at √n rates and yield uniform inferential guarantees. Importantly, we argue that incremental effects are much more efficient than conventional deterministic effects in a novel infinite time horizon setting, where the number of timepoints can grow to infinity.

b.) Causal effects based on distributional distances. We propose a novel nonstandard causal effect based on the discrepancy between unobserved counterfactual distributions (specifically, their L1 distance), in order to provide more nuanced and valuable information about treatment effects than simple mean shifts. We consider single- and multi-source randomized studies, as well as observational studies, and analyze error bounds and asymptotic properties of the proposed estimators. Special difficulties arise due to the non-smoothness of the L1 distance functional.

c.) Causal clustering. We give a novel adaptation of unsupervised learning methods for analyzing treatment effect heterogeneity. Specifically, we pursue an efficient way to uncover subgroup structure in conditional treatment effects by leveraging tools from cluster analysis. We find conditions under which k-means, density-based, and hierarchical clustering algorithms can be successfully adopted into our framework. For k-means causal clustering, we develop a novel estimator that attains fast convergence rates and asymptotic normality of the cluster centers, even under weak nonparametric conditions on nuisance function estimation. Unlike previous studies, our framework can be easily extended to outcome-wide studies.
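To make the "incremental" interventions of item a.) more concrete, the following is a minimal sketch of one common form of incremental propensity score shift, in which the odds of receiving treatment are multiplied by a user-chosen factor delta. The formula is taken from the broader incremental-effects literature and is an assumption here, since the abstract does not spell out the intervention; the function names `shift_propensity` and `odds` are hypothetical and not from the thesis.

```python
def shift_propensity(pi: float, delta: float) -> float:
    """Return the shifted propensity q = delta*pi / (delta*pi + 1 - pi).

    pi    : observed probability of treatment, in [0, 1]
    delta : odds multiplier, > 0 (delta = 1 leaves pi unchanged)

    Assumed form of the incremental intervention; not quoted from the thesis.
    """
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    return delta * pi / (delta * pi + 1.0 - pi)


def odds(p: float) -> float:
    """Odds p / (1 - p) for a probability strictly below 1."""
    return p / (1.0 - p)


if __name__ == "__main__":
    # Shift a few propensities by doubling the treatment odds.
    for pi in (0.0, 0.2, 0.5, 0.9, 1.0):
        print(pi, shift_propensity(pi, delta=2.0))
```

Under this assumed form, propensities of exactly 0 or 1 are left unchanged by the shift, which gives some intuition for why effects of this kind can be defined without a positivity assumption.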

History

Date

2020-07-01

Degree Type

Dissertation

Department

Statistics

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Edward Kennedy, Larry Wasserman

Keywords

causal inference; statistical machine learning; nonparametric statistics; efficient influence function; incremental effects; positivity; counterfactual density estimation; treatment effect heterogeneity; clustering

Licence

In Copyright
