Aniket Mane, Mahsa Faizrahnemoon, Tomas Vinar, Brona Brejova, Cedric Chauve. PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning. Bioinformatics, 39(S1):i288-i296. 2023.

Download from publisher: https://dx.doi.org/10.1093/bioinformatics/btad250

Abstract:

MOTIVATION: The analysis of bacterial isolates to detect plasmids is 
important due to their role in the propagation of antimicrobial resistance. 
In short-read sequence assemblies, both plasmids and bacterial chromosomes 
are typically split into several contigs of various lengths, making 
identification of plasmids a challenging problem. In plasmid contig 
binning, the goal is to classify short-read assembly contigs by origin 
as plasmid or chromosomal, and then to sort the plasmid contigs into 
bins, each bin corresponding to a single plasmid. 
Previous works on this problem consist of de novo approaches and reference-
based approaches. De novo methods rely on contig features such as length, 
circularity, read coverage, or GC content. Reference-based approaches 
compare contigs to databases of known plasmids or plasmid markers from 
finished bacterial genomes. 
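
As a rough illustration of the contig features that de novo methods rely on, the sketch below computes length and GC content for a contig and stores them alongside read coverage and circularity. The function names, the dictionary layout, and the choice to ignore ambiguous bases such as N are assumptions made for this example; they are not taken from PlasBin-flow or from any other tool.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored; this is an assumption of
    the sketch, not a convention taken from the paper.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def contig_features(name: str, seq: str, coverage: float, circular: bool = False) -> dict:
    """Collect the per-contig features named in the abstract (hypothetical layout)."""
    return {
        "name": name,
        "length": len(seq),
        "gc": gc_content(seq),
        "coverage": coverage,   # read coverage, supplied by the assembler
        "circular": circular,   # whether the contig closes on itself
    }


# Toy contig; the sequence and coverage value are made up.
features = contig_features("contig_1", "ACGTGGCC", coverage=42.0)
```

Here `features["gc"]` is the share of G and C bases in the toy sequence, and `features["length"]` is its length in bases.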

RESULTS: Recent developments suggest that leveraging information contained 
in the assembly graph improves the accuracy of plasmid binning. We present 
PlasBin-flow, a hybrid method that defines contig bins as subgraphs of the 
assembly graph. PlasBin-flow identifies such plasmid subgraphs through a 
mixed integer linear programming model that relies on the concept of 
network flow to account for sequencing coverage, while also accounting for 
the presence of plasmid genes and the GC content that often distinguishes 
plasmids from chromosomes. We demonstrate the performance of PlasBin-flow 
on a real dataset of bacterial samples. 
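
To make the network-flow idea more concrete, the sketch below checks whether a candidate flow on a toy assembly graph is feasible: flow must be conserved at every contig, and the flow through a contig may not exceed a capacity derived from its read coverage. This is only an illustration of generic flow constraints under those assumptions. It does not reproduce the mixed integer linear program of PlasBin-flow, and the graph, the capacities, and the names `is_feasible_flow`, `S` and `T` are hypothetical.

```python
from collections import defaultdict


def is_feasible_flow(edges, flow, capacity, source="S", sink="T", tol=1e-9):
    """Check non-negativity, conservation and node capacities of a flow.

    edges:    list of directed edges (u, v) between contigs
    flow:     dict mapping an edge to its flow value (missing edges carry 0)
    capacity: dict mapping a contig to the largest flow allowed through it,
              assumed here to come from its read coverage
    """
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for u, v in edges:
        f = flow.get((u, v), 0.0)
        if f < -tol:
            return False  # negative flow is not allowed
        outflow[u] += f
        inflow[v] += f

    for node in set(inflow) | set(outflow):
        if node in (source, sink):
            continue  # conservation is not required at the terminals
        if abs(inflow[node] - outflow[node]) > tol:
            return False  # flow entering a contig must also leave it
        if inflow[node] > capacity.get(node, 0.0) + tol:
            return False  # flow exceeds the coverage-derived capacity
    return True


# Toy graph: S -> a -> b -> T, with made-up capacities.
edges = [("S", "a"), ("a", "b"), ("b", "T")]
capacity = {"a": 10.0, "b": 8.0}

ok = is_feasible_flow(edges, {e: 8.0 for e in edges}, capacity)
too_much = is_feasible_flow(edges, {e: 10.0 for e in edges}, capacity)
```

In this toy example a uniform flow of 8 is feasible, while a flow of 10 is rejected because it exceeds the capacity of contig `b`. An optimisation model would search over such flows rather than only check one.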

AVAILABILITY AND IMPLEMENTATION: https://github.com/cchauve/PlasBin-flow.