Janik Sielemann, Katharina Sielemann, Broňa Brejová, Tomáš Vinař, Cedric Chauve. plASgraph2: using graph neural networks to detect plasmid contigs from an assembly graph. Frontiers in Microbiology, 14:fmicb.2023.1267695. 2023.

Download from publisher: https://doi.org/10.3389/fmicb.2023.1267695

Abstract:

Identification of plasmids from sequencing data is an important and 
challenging problem related to antimicrobial resistance spread and other 
One-Health issues. We provide a new architecture for identifying plasmid 
contigs in fragmented genome assemblies built from short-read data. We 
employ graph neural networks (GNNs) and the assembly graph to propagate the 
information from nearby nodes, which leads to more accurate classification, 
especially for short contigs that are difficult to classify based on 
sequence features or database searches alone. We trained plASgraph2 on a 
data set of samples from the ESKAPEE group of pathogens. plASgraph2 either 
outperforms or performs on par with a wide range of state-of-the-art 
methods on testing sets of independent ESKAPEE samples and samples from 
related pathogens. On the one hand, our study provides a new, accurate, and 
easy-to-use tool for contig classification in bacterial isolates; on the other 
hand, it serves as a proof of concept for the use of GNNs in genomics. Our 
software is available at https://github.com/cchauve/plasgraph2 and the 
training and testing data sets are available at 
https://github.com/fmfi-compbio/plasgraph2-datasets.
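
To make the idea of propagating information from nearby nodes concrete, the following is a minimal sketch of one round of neighbour averaging on a toy assembly graph. It is not the plASgraph2 architecture: a real GNN learns its aggregation weights from training data, whereas this sketch uses a fixed mixing weight. The function name `propagate`, the parameter `self_weight`, the contig names, and the initial scores are all hypothetical and chosen only for illustration.

```python
# Illustrative only: fixed-weight neighbour averaging, NOT the plASgraph2 model.
# All names and numbers below are hypothetical.

def propagate(scores, edges, rounds=1, self_weight=0.5):
    """Blend each contig's score with the mean score of its graph neighbours."""
    # Build an undirected adjacency structure from the edge list.
    neighbours = {node: set() for node in scores}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    current = dict(scores)
    for _ in range(rounds):
        updated = {}
        for node, own in current.items():
            adjacent = neighbours[node]
            if adjacent:
                mean_adj = sum(current[n] for n in adjacent) / len(adjacent)
                updated[node] = self_weight * own + (1 - self_weight) * mean_adj
            else:
                # Isolated contigs keep their own score.
                updated[node] = own
        current = updated
    return current


# Toy graph: a short, ambiguous contig (c2) sits between two contigs that
# already look plasmid-like; c4 is not connected to anything.
initial = {"c1": 0.9, "c2": 0.5, "c3": 0.9, "c4": 0.1}
links = [("c1", "c2"), ("c2", "c3")]
smoothed = propagate(initial, links)
```

In this toy setting the ambiguous contig c2 is pulled towards the scores of its neighbours, which mirrors the intuition in the abstract that graph context helps with short contigs that are hard to classify from sequence features alone.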