Bioinformatics seminar

Tue 24 Jan. 2012, 17:20

Title: Gao et al. Opera: Reconstructing Optimal Genomic Scaffolds with High-Throughput Paired-End Sequences
Speaker: Mic Nánási

Scaffolding, the problem of ordering and orienting contigs,
typically using paired-end reads, is a crucial step in the assembly of
high-quality draft genomes. Even as sequencing technologies and mate-pair
protocols have improved significantly, scaffolding programs still rely on
heuristics, with no guarantees on the quality of the solution. In this
work, we explored the feasibility of an exact solution for scaffolding and
present a first fixed-parameter tractable solution for assembly (Opera).
We also describe a graph contraction procedure that allows the solution to
scale to large scaffolding problems and demonstrate this by scaffolding
several large real and synthetic datasets. In comparisons with existing
scaffolders, Opera simultaneously produced longer and more accurate
scaffolds, demonstrating the utility of an exact approach. Opera also
incorporates an exact quadratic programming formulation to precisely
compute gap sizes (Availability: