2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2015
Abstract

Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James P. Drake, Jane M. Landolin, Adam M. Phillippy. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nature Biotechnology, 33(6):623-630. 2015.

Abstract:

Long-read, single-molecule real-time (SMRT) sequencing is routinely used to
finish microbial genomes, but available assembly methods have not scaled well to 
larger genomes. We introduce the MinHash Alignment Process (MHAP) for overlapping
noisy, long reads using probabilistic, locality-sensitive hashing. Integrating
MHAP with the Celera Assembler enabled reference-grade de novo assemblies of
Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster and a
human hydatidiform mole cell line (CHM1) from SMRT sequencing. The resulting
assemblies are highly continuous, include fully resolved chromosome arms and
close persistent gaps in these reference genomes. Our assembly of D. melanogaster
revealed previously unknown heterochromatic and telomeric transition sequences,
and we assembled low-complexity sequences from CHM1 that fill gaps in the human
GRCh38 reference. Using MHAP and the Celera Assembler, single-molecule sequencing
can produce de novo near-complete eukaryotic assemblies that are 99.99% accurate 
when compared with available reference genomes.
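
The MinHash idea named in the abstract can be illustrated with a minimal sketch. This is not the MHAP implementation: the function names, the choice of BLAKE2b as hash function, and the parameters k and num_hashes below are assumptions made only for illustration. Each read is reduced to a fixed-size sketch holding, for each of several seeded hash functions, the minimum hash value over the read's k-mers. The fraction of sketch positions on which two reads agree then estimates the Jaccard similarity of their k-mer sets, so likely overlaps can be found without aligning every pair of reads.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seeded_hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer under a given seed (illustrative choice)."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(seq, k=8, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(seeded_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of sketch positions that agree; estimates k-mer set Jaccard similarity."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


if __name__ == "__main__":
    # Hypothetical toy reads: the first two share a long common region.
    read_a = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGATTACAGG"
    read_b = "GCCGATAGCTAGGCTTAACGGATCCGATTACAGGTTTACCGGA"
    read_c = "TTTTTTTTTTGGGGGGGGGGTTTTTTTTTTGGGGGGGGGGTTT"
    sa, sb, sc = (minhash_sketch(r) for r in (read_a, read_b, read_c))
    print("a vs b:", estimate_jaccard(sa, sb))
    print("a vs c:", estimate_jaccard(sa, sc))
```

More hash functions give a less noisy estimate at the cost of a larger sketch. A practical overlapper for noisy long reads needs further steps, such as filtering candidate pairs and estimating overlap positions, which this sketch omits.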