Phylogenetic Reconciliation with Transfers
Correctly inferring the events in the history of a gene family is crucial to relating gene evolution to ecological adaptation, understanding the evolution of gene function, and inferring homology relationships. Information about the evolution of a gene family can be obtained from the incongruence between a gene tree and the associated species tree. Phylogenetic reconciliation algorithms infer gene events through a formal comparison of the gene and species trees. In this thesis, I expand the reconciliation framework to capture more complex evolutionary scenarios. Current reconciliation algorithms are capable of inferring duplication, transfer, and loss events when both the gene tree and the species tree are well resolved. They are not, however, well equipped to infer events with non-binary trees. A non-binary species tree indicates that evolutionary forces are also at play at the population level. A non-binary gene tree reflects uncertainty in the gene tree branching order, due to insufficient information from the sequence data. Reconciliation algorithms that do not account for these processes can infer events incorrectly. To address these problems, I first introduce an algorithm that reconciles a binary gene tree with a non-binary species tree while accounting for gene tree incongruence that could result from population processes rather than gene events. Second, I present an exact algorithm and several fast heuristics that use reconciliation to resolve uncertainty in a non-binary gene tree. In a parsimony optimization framework, reconciliation seeks the solution that minimizes the weighted sum of the inferred events. One major challenge of this approach is how to select event weights that will infer the correct event history. First, I tackle this challenge from a probabilistic perspective by considering how the underlying gene event rates influence the best choice of event weights.
Second, I use a topological approach to identify common tradeoffs between histories with transfers and histories with duplications and losses. By making these tradeoffs explicit, my approach provides a framework for applying biological intuition to the problem of weight selection. This significantly improves the researcher's understanding of how different weights will affect the reconciliation result, leading to better estimation of gene events. I have implemented the algorithms described above in Notung, a publicly available and widely used reconciliation software package. Using this software, I demonstrate the applicability of my theoretical work to concrete biological problems via a phylogenomic analysis in Cyanobacteria. The results reveal a link between gain and loss of photosynthesis genes and niche adaptation.
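The parsimony criterion and the weight tradeoff described above can be illustrated with a minimal sketch. It assumes each candidate history has already been summarized as counts of duplications, transfers, and losses. The names `EventCounts`, `reconciliation_cost`, and `cheapest_history`, the two candidate histories, and the weights are all hypothetical choices made for illustration; none of them is taken from Notung or from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventCounts:
    """Hypothetical summary of one candidate event history."""
    duplications: int = 0
    transfers: int = 0
    losses: int = 0


def reconciliation_cost(counts, w_dup, w_transfer, w_loss):
    """Weighted sum of inferred events, the quantity parsimony minimizes."""
    return (w_dup * counts.duplications
            + w_transfer * counts.transfers
            + w_loss * counts.losses)


def cheapest_history(histories, w_dup, w_transfer, w_loss):
    """Return the name of the candidate history with the lowest weighted cost."""
    return min(
        histories,
        key=lambda name: reconciliation_cost(
            histories[name], w_dup, w_transfer, w_loss
        ),
    )


# Two hypothetical explanations of the same gene tree incongruence:
# a single transfer, or one duplication followed by three losses.
histories = {
    "one transfer": EventCounts(transfers=1),
    "duplication plus losses": EventCounts(duplications=1, losses=3),
}

# Illustrative weights only: raising the transfer weight flips the choice.
low_transfer_weight = cheapest_history(histories, 1.5, 3.0, 1.0)
high_transfer_weight = cheapest_history(histories, 1.5, 5.0, 1.0)
```

With these illustrative numbers the transfer history costs 3.0 against 4.5 for the duplication-loss history, so it is preferred. Raising the transfer weight to 5.0 reverses the preference, which is the kind of tradeoff that weight selection has to resolve.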