Bioinformatics Seminar

Tue 2 Oct. 2012, 17:20

Title: Son K. Pham et al. (2012) Pathset Graphs: A Novel Approach for Comprehensive Utilization of Paired Reads in Genome Assembly
Speaker: Mic Nánási

Abstract: One of the key advances in genome assembly that has led to a significant
improvement in contig lengths has been improved algorithms for utilization of
paired reads (mate-pairs). While in most assemblers, mate-pair information is
used in a post-processing step, the recently proposed Paired de Bruijn Graph
(PDBG) approach incorporates the mate-pair information directly in the assembly
graph structure. However, the PDBG approach faces difficulties when the variation
in the insert sizes is high. To address this problem, we first transform
mate-pairs into edge-pair histograms that allow one to better estimate the
distance between edges in the assembly graph that represent regions linked by
multiple mate-pairs. Further, we combine the ideas of mate-pair transformation
and PDBGs to construct new data structures for genome assembly: pathsets and
pathset graphs.
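
Illustration (not from the paper): the idea of turning mate-pairs into an
edge-pair histogram can be sketched roughly as follows. This is a minimal toy
version under stated assumptions, not the authors' algorithm. It assumes each
mate-pair is already mapped to two edges with an offset from the start of each
edge, that a single nominal insert size is known, and that the median of the
histogram is an acceptable distance estimate. The names edge_pair_histograms
and estimate_distance, and the tuple layout, are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median


def edge_pair_histograms(mate_pairs, insert_size):
    """Group mate-pairs by the pair of edges they link.

    mate_pairs: iterable of (edge_a, offset_a, edge_b, offset_b), where each
    offset is the read's start position measured from the start of its edge
    (assumed input format). insert_size: nominal distance between the two
    read starts (assumption).
    Returns {(edge_a, edge_b): Counter mapping implied distance -> count}.
    """
    histograms = defaultdict(Counter)
    for edge_a, offset_a, edge_b, offset_b in mate_pairs:
        # If read_b starts insert_size after read_a, then
        # start(edge_b) - start(edge_a) = insert_size + offset_a - offset_b.
        implied = insert_size + offset_a - offset_b
        histograms[(edge_a, edge_b)][implied] += 1
    return histograms


def estimate_distance(histogram):
    """Summarise one edge-pair histogram by its median implied distance."""
    return median(histogram.elements())


# Three mate-pairs linking the same two edges; the implied distances differ
# because real insert sizes deviate from the nominal value.
pairs = [("A", 10, "B", 5), ("A", 20, "B", 14), ("A", 30, "B", 27)]
hists = edge_pair_histograms(pairs, insert_size=100)
estimate = estimate_distance(hists[("A", "B")])
```

Pooling all mate-pairs that link the same two edges is what makes the estimate
more robust to high insert-size variation than any single mate-pair would be.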