Mixed Integer Linear Programming for Maximum-Parsimony Phylogeny.pdf.pdf' (1.15 MB)
Download fileMixed Integer Linear Programming for Maximum-Parsimony Phylogeny Inference
journal contribution
posted on 2011-06-01, 00:00 authored by Srinath Sridhar, Fumei Lam, Guy BlellochGuy Blelloch, Ramamoorthi RaviRamamoorthi Ravi, Russell SchwartzRussell SchwartzReconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods
are available for many variants of this problem, new advances in phylogeny inference will be required if we are to be able to continue
making effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear
programming (ILP) formulations for finding the most parsimonious phylogenetic tree from a set of binary variation data. One method
uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has,
however, proven to be extremely efficient in practice on data sets that are well beyond the reach of the available provably efficient
methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times
competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved
with a polynomial-sized ILP. We further present a Web server that was developed based on the exponential-sized ILP that performs
fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human
genome.