2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1), (3)
Winter 2024
Abstract

Dominik Stanojevic, Zhe Li, Sara Bakic, Roger Foo, Mile Sikic. Rockfish: A transformer-based model for accurate 5-methylcytosine prediction from nanopore sequencing. Nat Commun, 15(1):5580. 2024.

Download from publisher: https://www.nature.com/articles/s41467-024-49847-0

Abstract:

DNA methylation plays an important role in various biological processes, 
including cell differentiation, ageing, and cancer development. The most 
important methylation in mammals is 5-methylcytosine mostly occurring in the 
context of CpG dinucleotides. Sequencing methods such as whole-genome bisulfite 
sequencing successfully detect 5-methylcytosine DNA modifications. However, they 
suffer from the serious drawbacks of short read lengths and might introduce an 
amplification bias. Here we present Rockfish, a deep learning algorithm that 
significantly improves read-level 5-methylcytosine detection by using Nanopore 
sequencing. Rockfish is compared with other methods based on Nanopore sequencing 
on R9.4.1 and R10.4.1 datasets. There is an increase in the single-base accuracy 
and the F1 measure of up to 5 percentage points on R9.4.1 datasets, and up to 
0.82 percentage points on R10.4.1 datasets. Moreover, Rockfish shows a high 
correlation with whole-genome bisulfite sequencing, requires lower read depth, 
and achieves higher confidence in biologically important regions such as CpG-rich 
promoters while being computationally efficient. Its superior performance in 
human and mouse samples highlights its versatility for studying 5-methylcytosine 
methylation across varied organisms and diseases. Finally, its adaptable 
architecture ensures compatibility with new versions of pores and chemistry as 
well as modification types.
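
Note for seminar readers: the abstract reports improvements in single-base accuracy and F1 in percentage points. The sketch below shows how these two metrics and a percentage-point difference are commonly computed from binary read-level calls (1 = methylated, 0 = unmethylated). It is a minimal illustration using the standard definitions, not the paper's evaluation code; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for binary calls."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1
        elif p:          # predicted methylated, truly unmethylated
            fp += 1
        elif t:          # predicted unmethylated, truly methylated
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(truth, predicted):
    """Fraction of calls that match the ground-truth label."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f1_score(truth, predicted):
    """Harmonic mean of precision and recall for the methylated class."""
    tp, fp, _, fn = confusion_counts(truth, predicted)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def percentage_point_gain(baseline, improved):
    """Difference between two proportions, expressed in percentage points."""
    return 100.0 * (improved - baseline)


# Toy labels, invented purely for illustration (not data from the paper).
truth = [1, 1, 1, 0, 0, 0, 1, 0]
calls_a = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical baseline caller
calls_b = [1, 1, 1, 0, 1, 0, 1, 0]  # hypothetical improved caller

gain = percentage_point_gain(accuracy(truth, calls_a), accuracy(truth, calls_b))
print(f"accuracy gain: {gain:.1f} percentage points")
```

A gain stated in percentage points is an absolute difference between two rates (for example, 0.875 versus 0.75 is 12.5 percentage points), not a relative improvement.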