Robust Inference for Single-Cell CRISPR Screens
CRISPR is a genome engineering technology that has enabled scientists to precisely manipulate and perturb human genomes. Single-cell CRISPR screens combine CRISPR genome engineering and single-cell sequencing to survey the effects of CRISPR perturbations on the molecular phenotypes of individual cells. Single-cell CRISPR screens have generated substantial academic and industrial interest in recent years, promising to accelerate cancer, longevity, and medical genetics research. However, these screens pose considerable statistical challenges that currently limit the reliability of conclusions drawn from single-cell CRISPR screen experiments. The broad objective of this thesis is to develop statistically rigorous and computationally efficient methods for the analysis of single-cell CRISPR screen data.
To this end, we make three main contributions. First, we leverage cluster- and cloud-scale computing to conduct an extensive empirical investigation of a diverse array of single-cell CRISPR screen datasets, identifying the most pressing statistical challenges that the data pose. Next, we develop statistical methods that address these challenges both in theory and in practice. An application of the proposed methods to real data indicates considerably improved performance relative to existing methods. Finally, we implement the proposed methods in efficient and practical software aimed at working biologists. Taken together, these contributions help put single-cell CRISPR screen data analysis onto more solid statistical footing, thereby facilitating the application of single-cell CRISPR screen technology to accelerate biological discovery. The methods that we develop, which primarily focus on assumption- and compute-lean hypothesis testing, may be of independent statistical interest.
History
Date
2023-07-05
Degree Type
- Dissertation
Department
- Statistics and Data Science
Degree Name
- Doctor of Philosophy (PhD)