2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2020
Abstract

Katelyn McNair, Carol Zhou, Elizabeth A. Dinsdale, Brian Souza, Robert A. Edwards. PHANOTATE: a novel approach to gene identification in phage genomes. Bioinformatics, 35(22):4537-4542. 2019.

Download from publisher: https://doi.org/10.1093/bioinformatics/btz265

Abstract:

MOTIVATION: Currently there are no tools specifically designed for annotating
genes in phages. Several tools are available that have been adapted to run on
phage genomes, but due to their underlying design, they are unable to capture the
full complexity of phage genomes. Phages have adapted their genomes to be
extremely compact, having adjacent genes that overlap and genes completely inside
of other longer genes. This non-delineated genome structure makes it difficult
for gene prediction using the currently available gene annotators. Here we
present PHANOTATE, a novel method for gene calling specifically designed for
phage genomes. Although the compact nature of genes in phages is a problem for
current gene annotators, we exploit this property by treating a phage genome as a
network of paths: where open reading frames are favorable, and overlaps and gaps 
are less favorable, but still possible. We represent this network of connections 
as a weighted graph, and use dynamic programming to find the optimal path.
RESULTS: We compare PHANOTATE to other gene callers by annotating a set of 2133
complete phage genomes from GenBank, using PHANOTATE and the three most popular
gene callers. We found that the four programs agree on 82% of the total predicted
genes, with PHANOTATE predicting more genes than the other three. We searched for
these extra genes in both GenBank's non-redundant protein database and all of the
metagenomes in the sequence read archive, and found that they are present at
levels that suggest that these are functional protein-coding genes.
AVAILABILITY AND IMPLEMENTATION: https://github.com/deprekate/PHANOTATE.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics
online.
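
The graph formulation described in the abstract can be illustrated with a small sketch. This is not PHANOTATE's implementation: the Orf class, the GAP_PENALTY and OVERLAP_PENALTY constants, the per-ORF score, and the best_path function are all hypothetical names and values chosen for illustration. The sketch also assumes a single strand and does not model genes nested entirely inside longer genes. ORFs lower the path cost, gaps and overlaps raise it, and dynamic programming over ORFs sorted by end coordinate finds the cheapest path from the start of the genome to its end.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical penalties chosen for illustration only; not PHANOTATE's weights.
GAP_PENALTY = 0.1      # cost per base left uncovered between consecutive genes
OVERLAP_PENALTY = 0.5  # cost per base shared by consecutive genes


@dataclass(frozen=True)
class Orf:
    start: int    # 0-based, inclusive
    end: int      # exclusive
    score: float  # assumed "gene-likeness"; higher is more favorable


def transition_cost(prev_end: int, next_start: int) -> float:
    """Edge weight for stepping from position prev_end to position next_start."""
    if next_start >= prev_end:
        return GAP_PENALTY * (next_start - prev_end)   # gap between genes
    return OVERLAP_PENALTY * (prev_end - next_start)   # overlapping genes


def best_path(orfs: List[Orf], genome_length: int) -> Tuple[float, List[Orf]]:
    """Return (cost, ORFs) of the minimum-cost path across the genome."""
    # Sorting by end coordinate gives a valid order for the DP, because every
    # allowed predecessor ends strictly before its successor ends.
    ordered = sorted(orfs, key=lambda o: (o.end, o.start))
    cost: List[float] = []
    prev: List[Optional[int]] = []
    for i, orf in enumerate(ordered):
        # Option 1: this ORF is the first gene on the path.
        best = transition_cost(0, orf.start) - orf.score
        best_prev: Optional[int] = None
        # Option 2: extend the cheapest path that ends in an earlier ORF.
        for j in range(i):
            p = ordered[j]
            if p.start < orf.start and p.end < orf.end:
                cand = cost[j] + transition_cost(p.end, orf.start) - orf.score
                if cand < best:
                    best, best_prev = cand, j
        cost.append(best)
        prev.append(best_prev)

    # Close every path at the genome end; the empty path is one long gap.
    total = GAP_PENALTY * genome_length
    last: Optional[int] = None
    for i, orf in enumerate(ordered):
        cand = cost[i] + transition_cost(orf.end, genome_length)
        if cand < total:
            total, last = cand, i

    # Follow the predecessor links back to recover the chosen ORFs.
    path: List[Orf] = []
    while last is not None:
        path.append(ordered[last])
        last = prev[last]
    path.reverse()
    return total, path


if __name__ == "__main__":
    candidates = [
        Orf(0, 30, 5.0),
        Orf(28, 60, 4.0),   # overlaps the first ORF by 2 bases
        Orf(40, 60, 1.0),   # weak alternative that avoids the overlap
        Orf(62, 100, 6.0),
    ]
    total, genes = best_path(candidates, genome_length=100)
    print(total, [(g.start, g.end) for g in genes])
```

On this toy input the cheapest path keeps the first two ORFs despite their 2-base overlap and skips the weak third candidate, which mirrors the idea that overlaps are penalized but still allowed.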