2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2020

Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin, Pavel A. Pevzner. Assembly of long, error-prone reads using repeat graphs. Nature biotechnology, 37(5):540-546. 2019.

Download from publisher: https://www.nature.com/articles/s41587-019-0072-8


Accurate genome assembly is hampered by repetitive regions. Although long
single-molecule sequencing reads are better able to resolve genomic repeats than
short-read data, most long-read assembly algorithms do not provide the repeat
characterization necessary for producing optimal assemblies. Here, we present
Flye, a long-read assembly algorithm that generates arbitrary paths in an unknown
repeat graph, called disjointigs, and constructs an accurate repeat graph from
these error-riddled disjointigs. We benchmark Flye against five state-of-the-art 
assemblers and show that it generates better or comparable assemblies, while
being an order of magnitude faster. Flye nearly doubled the contiguity of the
human genome assembly (as measured by the NGA50 assembly quality metric) compared
with existing assemblers.
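
The abstract reports contiguity with NGA50. As a rough illustration of what such a metric measures, the sketch below assumes the definition commonly used by assembly evaluation tools: N50 is the largest length L such that contigs of length at least L cover half of the assembly, NG50 uses the reference genome length in place of the assembly length, and NGA50 applies NG50 to aligned blocks obtained by breaking contigs at misassemblies. The function name `nx50` and all lengths are hypothetical toy values, not taken from the paper, and no alignment step is performed here.

```python
def nx50(lengths, total=None):
    """Largest length L such that blocks of length >= L cover
    at least half of `total` (default: the sum of all blocks)."""
    if total is None:
        total = sum(lengths)
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if 2 * covered >= total:
            return length
    return 0  # the blocks cover less than half of `total`


# Toy assembly: five contigs (hypothetical lengths, arbitrary units).
contigs = [2, 3, 4, 5, 6]
# Assumed reference genome length for the toy example.
genome_length = 30
# Aligned blocks after breaking the contig of length 5 at an assumed
# misassembly into pieces of length 3 and 2.
aligned_blocks = [2, 3, 4, 3, 2, 6]

n50 = nx50(contigs)                          # relative to assembly size
ng50 = nx50(contigs, genome_length)          # relative to genome size
nga50 = nx50(aligned_blocks, genome_length)  # aligned blocks, genome size
print(n50, ng50, nga50)  # prints: 5 4 3
```

In this toy case the value drops from N50 to NG50 to NGA50, which shows why NGA50 is the stricter measure: it penalizes both missing sequence and misassembled contigs.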