2-AIN-506, 2-AIN-252: Seminar in Bioinformatics (2), (4)
Summer 2023
Abstract

Tuukka Norri, Bastien Cazaux, Saska Dönges, Daniel Valenzuela, Veli Mäkinen. Founder Reconstruction Enables Scalable and Seamless Pangenomic Analysis. Bioinformatics, 37(24):4611-4619. 2021.

Abstract:

MOTIVATION: Variant calling workflows that utilize a single reference sequence
are the de facto standard elementary genomic analysis routine for resequencing
projects. Various ways to enhance the reference with pangenomic information have
been proposed, but scalability combined with seamless integration to existing
workflows remains a challenge.

RESULTS: We present PanVC with founder sequences, a scalable and accurate
variant calling workflow based on a multiple alignment of reference sequences.
Scalability is achieved by removing duplicate parts up to a limit into a founder
multiple alignment, that is then indexed using a hybrid scheme that exploits
general purpose read aligners. Our implemented workflow uses GATK or BCFtools
for variant calling, but the various steps of our workflow (e.g. vcf2multialign
tool, founder reconstruction) can be of independent interest as a basis for
creating novel pangenome analysis workflows beyond variant calling.

AVAILABILITY: Our open access tools and instructions how to reproduce our
experiments are available at the following address:
https://github.com/algbio/panvc-founders.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics
online.