Carnegie Mellon University
CMU-CB-21-103.pdf (37.97 MB)

Genome-Driven Personalized Medicine of Cancer via Machine Learning and Phylogenetic Models

posted on 2023-06-20, 20:37, authored by Yifeng Tao

Cancer arises from the accumulation of genomic alterations and develops into heterogeneous cell populations through an evolutionary process. The prognoses of cancer patients, such as survival profile, metastasis, and drug response, are therefore encoded in large-volume genomic data. First, we investigate reliable phenotype inference in cancer through well-designed, interpretable machine learning models. Leveraging large-scale genomic data and an external biomedical knowledge base, we use deep learning models to accurately infer cancer phenotypes, including transcriptome expression levels, transcription factor activities, and drug resistance. We address model interpretability through techniques such as attention mechanisms, which identify driver mutations and critical biomarkers. Second, we reveal intra- and inter-tumor heterogeneity and the mechanism of tumor progression via robust deconvolution and phylogenetic algorithms. We formulate the deconvolution of bulk tumor molecular data as a biologically inspired matrix factorization problem, and propose first a neural network and then an improved hybrid optimizer to solve it robustly and accurately. We develop and apply a Minimum Elastic Potential algorithm to reconstruct the evolutionary trajectory from the unmixed clones. Finally, we improve the prognostic prediction of cancer by combining machine learning and evolutionary methods. Clinicians have traditionally relied on pathological features and driver-level genomic profiles to guide treatment. However, it is possible that critical clones, rather than the bulk tumor as a whole, affect the prognoses. We explore this question by integrating evolutionary mutational features, driver-level features, and clinical features to improve the prognostic prediction of cancer. We develop an L0-regularized Cox regression model, and find that evolutionary features account for roughly one third of all the available features, depending on cancer type and sequencing technique.
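
The idea of treating bulk-tumor deconvolution as a matrix factorization problem can be illustrated with a minimal sketch. It is not the neural network or hybrid optimizer developed in the dissertation: it substitutes plain non-negative matrix factorization with Lee-Seung multiplicative updates, and it omits any biological constraints such as mixture fractions summing to one. The function name `unmix_bulk`, the matrix names `B`, `C`, `F`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def unmix_bulk(B, k, n_iter=500, seed=0, eps=1e-12):
    """Approximate a non-negative bulk matrix B (features x samples) as C @ F.

    C (features x k) holds hypothetical per-component profiles and
    F (k x samples) holds non-negative mixing weights. This is generic NMF,
    used here only as a stand-in for the dissertation's optimizers.
    """
    rng = np.random.default_rng(seed)
    m, n = B.shape
    C = rng.random((m, k)) + eps
    F = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep C and F non-negative and do not
        # increase the squared reconstruction error ||B - C F||^2.
        F *= (C.T @ B) / (C.T @ C @ F + eps)
        C *= (B @ F.T) / (C @ F @ F.T + eps)
    return C, F

# Synthetic example: mix 3 hypothetical components into 10 bulk samples.
rng = np.random.default_rng(1)
C_true = rng.random((20, 3))
F_true = rng.random((3, 10))
B = C_true @ F_true

C_hat, F_hat = unmix_bulk(B, k=3)
rel_err = np.linalg.norm(B - C_hat @ F_hat) / np.linalg.norm(B)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The recovered factors are only determined up to scaling and permutation of the components, so in this sketch the reconstruction error is a more meaningful check than a direct comparison of `C_hat` with `C_true`.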




Degree Type

  • Dissertation


  • Computational Biology

Degree Name

  • Doctor of Philosophy (PhD)


Russell Schwartz
