Bioinformatický seminár

Tue 8 Mar. 2011, 17:20
I-9

Title: Mukul et al. Robinson-Foulds supertrees
Speaker: Kubo Kováč

BACKGROUND: Supertree methods synthesize collections of small phylogenetic
trees with incomplete taxon overlap into comprehensive trees, or
supertrees, that include all taxa found in the input trees. Supertree
methods based on the well established Robinson-Foulds (RF) distance have
the potential to build supertrees that retain much information from the
input trees. Specifically, the RF supertree problem seeks a binary
supertree that minimizes the sum of the RF distances from the supertree to
the input trees. Thus, an RF supertree is a supertree that is consistent
with the largest number of clusters (or clades) from the input trees.
RESULTS: We introduce efficient, local search based, hill-climbing
heuristics for the intrinsically hard RF supertree problem on rooted
trees. These heuristics use novel non-trivial algorithms for the SPR and
TBR local search problems which improve on the time complexity of the best
known (naive) solutions by a factor of Theta(n) and Theta(n2) respectively
(where n is the number of taxa, or leaves, in the supertree). We use an
implementation of our new algorithms to examine the performance of the RF
supertree method and compare it to matrix representation with parsimony
(MRP) and the triplet supertree method using four supertree data sets. Not
only did our RF heuristic provide fast estimates of RF supertrees in all
data sets, but the RF supertrees also retained more of the information
from the input trees (based on the RF distance) than the other supertree
methods. CONCLUSIONS: Our heuristics for the RF supertree problem, based
on our new local search algorithms, make it possible for the first time to
estimate large supertrees by directly optimizing the RF distance from
rooted input trees to the supertrees. This provides a new and fast method
to build accurate supertrees. RF supertrees may also be useful for
estimating majority-rule(-) supertrees, which are a generalization of
majority-rule consensus trees.