2-AIN-506, 2-AIN-252: Seminar in Bioinformatics (2), (4)
Summer 2024

Vikram S. Shivakumar, Omar Y. Ahmed, Sam Kovaka, Mohsen Zakeri, Ben Langmead. Sigmoni: classification of nanopore signal with a compressed pangenome index. bioRxiv, 2023.


Improvements in nanopore sequencing necessitate efficient classification methods, 
including pre-filtering and adaptive sampling algorithms that enrich for reads of 
interest. Signal-based approaches circumvent the computational bottleneck of 
basecalling. But past methods for signal-based classification do not scale 
efficiently to large, repetitive references like pangenomes, limiting their 
utility to partial references or individual genomes. We introduce Sigmoni: a 
rapid, multiclass classification method based on the r-index that scales to 
references of hundreds of Gbp. Sigmoni quantizes nanopore signal into a discrete
alphabet of picoamp ranges. It performs rapid, approximate matching using 
matching statistics, classifying reads based on distributions of picoamp matching 
statistics and co-linearity statistics. Sigmoni is 10-100x faster than previous 
methods for adaptive sampling in host depletion experiments with improved 
accuracy, and can query reads against large microbial or human pangenomes.
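
The two core ideas in the abstract, quantizing signal into picoamp bins and computing matching statistics against a reference, can be illustrated with a small Python sketch. This is not Sigmoni's implementation: the bin range (60 to 120 pA), the number of bins, and the function names are hypothetical choices made for illustration, and the matching statistics here are computed by brute force rather than with an r-index, so the sketch does not scale to pangenome-sized references.

```python
def quantize(signal_pa, lo=60.0, hi=120.0, n_bins=6):
    """Map each picoamp value to a bin index in 0..n_bins-1.

    lo, hi and n_bins are illustrative assumptions, not Sigmoni's
    actual parameters. Values outside [lo, hi) are clamped to the
    first or last bin.
    """
    width = (hi - lo) / n_bins
    symbols = []
    for x in signal_pa:
        b = int((x - lo) // width)
        symbols.append(min(max(b, 0), n_bins - 1))
    return symbols


def matching_statistics(query, ref):
    """Return MS, where MS[i] is the length of the longest prefix of
    query[i:] that occurs somewhere in ref.

    Brute-force version for illustration only; a compressed index
    such as the r-index is what makes this practical at scale.
    """
    ms = []
    for i in range(len(query)):
        best = 0
        for j in range(len(ref)):
            k = 0
            while (i + k < len(query) and j + k < len(ref)
                   and query[i + k] == ref[j + k]):
                k += 1
            best = max(best, k)
        ms.append(best)
    return ms


if __name__ == "__main__":
    read = quantize([65.0, 75.0, 119.0, 130.0, 10.0])
    print(read)
    print(matching_statistics([1, 2, 3, 9, 2, 3], [0, 1, 2, 3, 4, 2, 3]))
```

Long matching statistics suggest that a read shares long exact stretches with the reference. A classifier in this spirit would compare the distribution of these lengths between candidate classes, for example host versus non-host references.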