Advances in Statistical Gene Networks
Gene networks are central to understanding the mechanisms that govern cellular activity and organismal behavior. Because true gene interactions are not directly observable, researchers typically infer gene networks statistically from observable gene expression data. In this thesis, we address three challenges in statistical gene network analysis: 1) benchmarking tools for gene network estimation, 2) nonlinear gene network estimation methods, and 3) the application of gene networks to understanding autism-associated genes.
In Chapter 2, we address the benchmarking of imputation methods for gene co-expression estimation. We develop a new simulation tool that realistically simulates a homogeneous cell group, heterogeneous cell groups, and complex cell-group relationships such as tree and trajectory structures, together with gene co-expression structure. We demonstrate the tool's usefulness by assessing the effect of gene expression denoising methods on downstream gene co-expression estimation.

In Chapter 3, we address the limitation of current gene co-expression estimation methods in capturing nonlinear relationships. We show that averaging cell-specific gene co-expression over a population gives a novel dependence measure that can detect any non-linear, non-monotone, and non-global relationship. We formally establish the consistency and robustness of this measure and demonstrate its advantage over a large family of dependence measures.

In Chapter 4, we explore the application of various types of gene networks in a case study of identifying active genes associated with autism spectrum disorders (ASD). To enable a systematic investigation, we also develop a novel gene group interaction measure, which extends to nonlinear settings an existing idea for handling the case where the true gene groups are unknown. Using a unified network-assisted approach to gene risk modeling, we find that some types of gene networks are evidently more useful than others for this task: they help identify an assortment of unique “active” and “reactive” gene communities that are biologically interesting.
History
Date
2023-07-19
Degree Type
- Dissertation
Department
- Statistics and Data Science
Degree Name
- Doctor of Philosophy (PhD)