Carnegie Mellon University

Probabilistic Single Cell Lineage Tracing

thesis
posted on 2021-01-12, 22:06 authored by Chieh Lin
Cell lineage tracing is a long-standing open problem in biology. To address it, new technologies that can profile single cells have been introduced in the last decade. Current studies attempt to construct lineage relationships either from time-series single-cell RNA sequencing (scRNA-Seq) data or by using artificial mutations to mark cells. The former rely on pseudo-time ordering, which suffers from shortcomings that can reduce accuracy. The latter typically apply phylogeny-based methods, which often yield hundreds of candidate trees. There is currently no method to combine single-cell lineage trees from different individuals of the same organism to reconstruct a single invariant lineage for the same species. In this thesis, we present a set of machine learning models that focus on reconstructing single-cell lineages. We developed a probabilistic model based on the Continuous-State Hidden Markov Model (CSHMM) to reconstruct trajectories and branchings from time-series scRNA-Seq data. The model is then extended by learning the dynamics of regulatory interactions that take place during the process being studied (CSHMM-TF). We next present a method that integrates sequence and expression data. In addition, we developed LinTIMaT, a statistical model for reconstructing single-cell lineage trees using both artificial mutations and scRNA-Seq data and for constructing a general invariant lineage tree from multiple cell lineage trees of the same species. Finally, we apply CSHMM to a new dataset and show that it can reconstruct lineage relationships and provide important novel insights for studying lung development.

History

Date

2020-04-17

Degree Type

  • Dissertation

Department

  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Ziv Bar-Joseph