Carnegie Mellon University

Generalized Buneman pruning for inferring the most parsimonious multi-state phylogeny.

journal contribution
posted on 2011-03-01, 00:00 authored by Navodit Misra, Guy E. Blelloch, Ramamoorthi Ravi, Russell Schwartz

Accurate reconstruction of phylogenies remains a key challenge in evolutionary biology. Most biologically plausible formulations of the problem are formally NP-hard, with no known efficient solution. Standard practice relies on fast heuristic methods that are empirically known to work very well in general but can yield results arbitrarily far from optimal. Practical exact methods, which have exponential worst-case running times but generally much better times in practice, provide an important alternative. We report progress in this direction by introducing a provably optimal method for the weighted multi-state maximum parsimony phylogeny problem. The method is based on generalizing the notion of the Buneman graph, a construction key to efficient exact methods for binary sequences, so as to apply to sequences with arbitrary finite numbers of states and arbitrary state transition weights. We implement an integer linear programming (ILP) method for the multi-state problem using this generalized Buneman graph and demonstrate that the resulting method is able to solve data sets that are intractable by prior exact methods, in run times comparable with those of popular heuristics. We further show on a collection of less difficult problem instances that the ILP method leads to large reductions in average-case run times relative to leading heuristics on moderately hard problems. Our work provides the first method for provably optimal maximum parsimony phylogeny inference that is practical for multi-state data sets of more than a few characters.
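
To make the objective in the abstract concrete, the sketch below computes the weighted multi-state parsimony score of a tree whose topology is already fixed, using the classical Sankoff dynamic program. It is not the generalized Buneman pruning or the ILP described in the article, which address the much harder problem of finding the best tree. The function names, the example tree, the sequences, and the transition weights are all illustrative assumptions.

```python
from math import inf

def sankoff_site_cost(tree, root, leaf_state, states, weight):
    """Minimum weighted cost of one character on a fixed rooted tree.

    tree:       dict mapping each internal node to a list of its children
    leaf_state: dict mapping each leaf to its observed state
    weight:     dict mapping (parent_state, child_state) to a transition cost
    """
    def cost_vector(node):
        children = tree.get(node, [])
        if not children:
            # Leaf: only the observed state is feasible.
            return {s: (0 if s == leaf_state[node] else inf) for s in states}
        child_vectors = [cost_vector(c) for c in children]
        # For each candidate state s at this node, every child independently
        # picks the state t that minimizes edge cost plus its subtree cost.
        return {
            s: sum(min(weight[(s, t)] + vec[t] for t in states)
                   for vec in child_vectors)
            for s in states
        }

    return min(cost_vector(root).values())

def parsimony_score(tree, root, leaf_sequences, states, weight):
    """Sum the per-character Sankoff costs over all aligned sites."""
    n_sites = len(next(iter(leaf_sequences.values())))
    return sum(
        sankoff_site_cost(
            tree, root,
            {leaf: seq[i] for leaf, seq in leaf_sequences.items()},
            states, weight,
        )
        for i in range(n_sites)
    )

# Hypothetical example: four taxa, two sites, four states.
STATES = "ACGT"
PURINES = set("AG")

def example_weight(s, t):
    # Assumed weights: 0 for no change, 1 for a transition
    # (purine<->purine or pyrimidine<->pyrimidine), 2 for a transversion.
    if s == t:
        return 0
    return 1 if (s in PURINES) == (t in PURINES) else 2

WEIGHT = {(s, t): example_weight(s, t) for s in STATES for t in STATES}
UNIT_WEIGHT = {(s, t): int(s != t) for s in STATES for t in STATES}

TREE = {"root": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
SEQS = {"a": "AA", "b": "AG", "c": "CA", "d": "CT"}

if __name__ == "__main__":
    print(parsimony_score(TREE, "root", SEQS, STATES, WEIGHT))
```

Scoring a fixed tree this way takes time polynomial in the number of taxa and states. The hardness discussed in the abstract comes from searching over tree topologies and unobserved ancestral sequences, which is the search the generalized Buneman graph is meant to restrict.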

History

Publisher Statement

This is a copy of an article published in the Journal of Computational Biology © 2011 Mary Ann Liebert, Inc.; Journal of Computational Biology is available online at: http://online.liebertpub.com

Date

2011-03-01
