2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1), (3)
Winter 2021
Abstract

Dhaivat Joshi, Shunfu Mao, Sreeram Kannan, Suhas Diggavi. QAlign: aligning nanopore reads accurately using current-level modeling. Bioinformatics, 37(5):625-633. 2021.

Abstract:

MOTIVATION: Efficient and accurate alignment of DNA/RNA sequence reads to each
other or to a reference genome/transcriptome is an important problem in genomic
analysis. Nanopore sequencing has emerged as a major sequencing technology and
many long-read aligners have been designed for aligning nanopore reads. However,
the high error rate makes accurate and efficient alignment difficult. Utilizing
the noise and error characteristics inherent in the sequencing process properly
can play a vital role in constructing a robust aligner. In this article, we
design QAlign, a pre-processor that can be used with any long-read aligner for
aligning long reads to a genome/transcriptome or to other long reads. The key
idea in QAlign is to convert the nucleotide reads into discretized current levels
that capture the error modes of the nanopore sequencer before running it through
a sequence aligner.

RESULTS: We show that QAlign is able to improve alignment rates from around 80%
up to 90% with nanopore reads when aligning to the genome. We also show that
QAlign improves the average overlap quality by 9.2, 2.5 and 10.8% in three real
datasets for read-to-read alignment. Read-to-transcriptome alignment rates are
improved from 51.6% to 75.4% and 82.6% to 90% in two real datasets.

AVAILABILITY AND IMPLEMENTATION: https://github.com/joshidhaivat/QAlign.git.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics
online.
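
As an illustration for the seminar discussion, the key idea in the abstract (mapping each k-mer of a read to a current level and then discretizing that level into a small alphabet) might be sketched as follows. This is a toy sketch, not QAlign's implementation: the function names `toy_current_level` and `quantize_read`, the k-mer length, the per-base weights, and the thresholds are all hypothetical stand-ins. A real pre-processor would look up current levels in a measured pore model instead.

```python
from bisect import bisect_right


def toy_current_level(kmer):
    """Hypothetical stand-in for a pore model lookup.

    A real pore model is a table of measured current levels per k-mer;
    here we simply average made-up per-base weights.
    """
    weights = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}  # assumed values
    return sum(weights[base] for base in kmer) / len(kmer)


def quantize_read(read, k=3, thresholds=(1.0, 2.0)):
    """Convert a nucleotide read into a string of discretized levels.

    Each k-mer is mapped to a current level, and the level is binned
    using the sorted thresholds. With two thresholds the output
    alphabet has three symbols: "0", "1" and "2".
    """
    symbols = []
    for i in range(len(read) - k + 1):
        level = toy_current_level(read[i:i + k])
        # Index of the bin the level falls into.
        symbols.append(str(bisect_right(thresholds, level)))
    return "".join(symbols)


if __name__ == "__main__":
    print(quantize_read("ACGTTGCA"))
```

Both the reads and the reference would be converted in the same way, so that an ordinary sequence aligner can then be run on the discretized strings. Two k-mers that produce similar current levels fall into the same bin, which is how such a representation could absorb the sequencer's typical confusions.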