Emmanuel Noutahi, Magali Semeria, Manuel Lafond, Jonathan Seguin, Bastien Boussau, Laurent Gueguen, Nadia El-Mabrouk, Eric Tannier. Efficient Gene Tree Correction Guided by Genome Evolution. PLoS One, 11(8):e0159559. 2016.

Download from publisher: https://dx.plos.org/10.1371/journal.pone.0159559

MOTIVATIONS: Gene trees inferred solely from multiple alignments of homologous
sequences often contain weakly supported and uncertain branches. Information for 
their full resolution may lie in the dependency between gene families and their
genomic context. Integrative methods, which use species tree information in addition
to sequence information, often rely on a computationally intensive tree space
search, which precludes their application to large genomic databases. RESULTS: We
propose a new method, called ProfileNJ, that takes a gene tree with statistical
supports on its branches, and corrects its weakly supported parts by using a
combination of information from a species tree and a distance matrix. Its low
running time enabled us to use it on the whole Ensembl Compara database, for
which we propose an alternative, arguably more plausible set of gene trees. This 
allowed us to perform a genome-wide analysis of duplication and loss patterns on 
the history of 63 eukaryote species, and predict ancestral gene content and order
for all ancestors along the phylogeny. AVAILABILITY: A web interface called
RefineTree, including ProfileNJ as well as other gene tree correction methods,
which we also test on the Ensembl gene families, is available at:
http://www-ens.iro.umontreal.ca/~adbit/polytomysolver.html. The code of ProfileNJ
as well as the set of gene trees corrected by ProfileNJ from Ensembl Compara
version 73 families are also made available.
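
To make the idea of isolating the "weakly supported parts" of a gene tree concrete, here is a minimal sketch in Python. It is not the ProfileNJ code. It assumes, for illustration only, that weak branches are contracted into multifurcating nodes when their support falls below a user-chosen threshold. The `Node` class, the `contract_weak_branches` function, and the threshold value are all hypothetical names introduced here. The later step described in the abstract, resolving those nodes with a species tree and a distance matrix, is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical minimal tree node; not a structure taken from ProfileNJ.
    name: Optional[str] = None      # gene label for leaves, None for internal nodes
    support: float = 1.0            # statistical support of the branch above this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def contract_weak_branches(node: Node, threshold: float) -> Node:
    """Return a copy of the tree in which every internal branch whose
    support is below `threshold` is contracted (assumed criterion)."""
    if node.is_leaf():
        return Node(node.name, node.support)
    new_children: List[Node] = []
    for child in node.children:
        contracted = contract_weak_branches(child, threshold)
        if not contracted.is_leaf() and contracted.support < threshold:
            # Weak internal branch: attach its children directly to this node.
            new_children.extend(contracted.children)
        else:
            new_children.append(contracted)
    # The branch above `node` itself is judged by the caller, so the root is kept.
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Topology-only Newick string (no supports, no branch lengths)."""
    if node.is_leaf():
        return node.name or ""
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"


# Toy gene tree ((a,b)0.95,(c,(d,e)0.40)0.30) with made-up support values.
example = Node(children=[
    Node(support=0.95, children=[Node("a"), Node("b")]),
    Node(support=0.30, children=[
        Node("c"),
        Node(support=0.40, children=[Node("d"), Node("e")]),
    ]),
])

collapsed = contract_weak_branches(example, threshold=0.5)
print(to_newick(collapsed))
```

With a threshold of 0.5, the two weak branches in the toy tree are contracted while the well-supported (a,b) clade is kept, leaving a multifurcating root that a correction method would then have to resolve.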