2-AIN-505, 2-AIN-251: Seminár z bioinformatiky (1) a (3)
Zima 2015
Abstrakt

Leena Salmela, Kristoffer Sahlin, Veli Mäkinen, Alexandru I. Tomescu:. Gap Filling as Exact Path Length Problem. In RECOMB 2015, pp. 281-292, 2015. Springer.

Download preprint: not available

Download from publisher: http://dx.doi.org/10.1007/978-3-319-16706-0_29

Related web page: not available

Bibliography entry: BibTeX

Abstract:

One of the last steps in a genome assembly project is filling the gaps
between consecutive contigs in the scaffolds. This problem can be naturally 
stated as finding an s-t path in a directed graph whose sum of arc costs 
belongs to a given range (the estimate on the gap length). Here s and t are 
any two contigs flanking a gap. This problem is known to be NP-hard in 
general. Here we derive a simpler dynamic programming solution than already 
known, pseudo-polynomial in the maximum value of the input range. We 
implemented various practical optimizations to it, and compared our exact 
gap filling solution experimentally to popular gap filling tools. Summing 
over all the bacterial assemblies considered in our experiments, we can in 
total fill 28% more gaps than the best previous tool and the gaps filled by 
our method span 80% more sequence. Furthermore, the error level of the newly 
introduced sequence is comparable to that of the previous tools.