2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1), (3)
Winter 2021

Nicholas J. Croucher, Andrew J. Page, Thomas R. Connor, Aidan J. Delaney, Jacqueline A. Keane, Stephen D. Bentley, Julian Parkhill, Simon R. Harris. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Research, 43(3):e15. 2015.


The emergence of new sequencing technologies has facilitated the use of bacterial
whole genome alignments for evolutionary studies and outbreak analyses. These
datasets, of increasing size, often include examples of multiple different
mechanisms of horizontal sequence transfer resulting in substantial alterations
to prokaryotic chromosomes. The impact of these processes demands rapid and
flexible approaches able to account for recombination when reconstructing
isolates' recent diversification. Gubbins is an iterative algorithm that uses
spatial scanning statistics to identify loci containing elevated densities of
base substitutions suggestive of horizontal sequence transfer while concurrently 
constructing a maximum likelihood phylogeny based on the putative point mutations
outside these regions of high sequence diversity. Simulations demonstrate the
algorithm generates highly accurate reconstructions under realistically
parameterized models of bacterial evolution, and achieves convergence in only a
few hours on alignments of hundreds of bacterial genome sequences. Gubbins is
appropriate for reconstructing the recent evolutionary history of a variety of
haploid genotype alignments, as it makes no assumptions about the underlying
mechanism of recombination. The software is freely available for download at
github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on
Linux and Mac OS X.
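
The idea of scanning for loci with elevated substitution density can be illustrated with a much-simplified sketch. This is not the Gubbins implementation: Gubbins works on substitutions reconstructed for each branch of the tree and alternates between masking recombinations and rebuilding the phylogeny, whereas the code below only tests fixed, non-overlapping windows along a single genome against a binomial null model. The function names, the window size, and the Bonferroni-corrected threshold are all assumptions made for illustration.

```python
from bisect import bisect_left


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), assuming 0 <= p < 1."""
    if k <= 0:
        return 1.0
    term = (1.0 - p) ** n  # P(X = 0)
    below = 0.0            # accumulates P(X < k)
    for i in range(k):
        below += term
        # Recurrence from P(X = i) to P(X = i + 1); avoids huge binomial coefficients.
        term *= (n - i) / (i + 1) * p / (1.0 - p)
    return max(0.0, 1.0 - below)


def scan_dense_windows(positions, genome_length, window=1000, alpha=0.05):
    """Flag windows whose substitution count is improbably high.

    Hypothetical helper: the null model assumes substitutions fall
    uniformly along the genome at the observed genome-wide rate.
    Returns a list of (start, end, count) tuples, with end exclusive.
    """
    positions = sorted(positions)
    if not positions:
        return []
    rate = len(positions) / genome_length       # null per-site substitution rate
    n_windows = -(-genome_length // window)     # ceiling division
    threshold = alpha / n_windows               # Bonferroni correction (an assumption)
    flagged = []
    for start in range(0, genome_length, window):
        end = min(start + window, genome_length)
        count = bisect_left(positions, end) - bisect_left(positions, start)
        if count and binomial_tail(count, end - start, rate) < threshold:
            flagged.append((start, end, count))
    return flagged
```

Calling `scan_dense_windows` with a list of substitution coordinates and the genome length returns the windows that would be candidate recombination regions under these assumptions. In a full analysis the substitutions inside such regions would be excluded before the phylogeny is re-estimated, and the scan repeated until the tree stops changing.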