Juraj Mešťánek. Software for Annotation of Protein Coding Genes in Yeast Mitochondrial Genomes. Master's thesis, Comenius University in Bratislava, 2010. Supervised by Broňa Brejová.
Download preprint: 10mestanekth.pdf, 800 KB
Download from publisher: https://stella.uniba.sk/zkp-storage/ddp/dostupne/FM/2010/2010-FM-oxpwYR/
Related web page: http://compbio.fmph.uniba.sk/mtconrad
Bibliography entry: BibTeX
Abstract:
In this thesis, we present a software tool for automated computational prediction of protein-coding genes in yeast mitochondrial genomes. The tool is based on conditional random fields (CRFs) and combines information from several sources to produce an accurate annotation. We use Exonerate to align reference proteins extracted from model organisms to the genome being annotated. We also use RNAWeasel to predict the positions of introns based on their characteristic structural motifs. Finally, we use a multiple alignment of mitochondrial genomic sequences from several yeast species to look for evolutionary signatures typical of protein-coding regions. These three sources of information, together with the nucleotide sequence under study, form the set of observations that our CRF model uses to predict the positions of exons and introns. We have tested the tool on genes from 33 mitochondrial genomes. Currently, it predicts 78% of genes and 70% of exons perfectly. In the future, we plan to make the tool available and easy to use for the life science community.
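The way a linear-chain CRF can combine several evidence tracks into one exon/intron labeling can be illustrated with a toy Viterbi decoder. This is only a minimal sketch and not the thesis implementation: the three binary features (`protein_hit`, `intron_motif`, `conserved`) are hypothetical stand-ins for the Exonerate, RNAWeasel, and multiple-alignment evidence, the label set is simplified, and all weights are hand-picked for illustration rather than trained.

```python
import math

LABELS = ("intergenic", "exon", "intron")

# Hypothetical per-label bias and feature weights (hand-picked, not trained).
BIAS = {"intergenic": 1.5, "exon": 0.0, "intron": 0.0}
WEIGHTS = {
    "exon": {"protein_hit": 2.0, "intron_motif": -1.5, "conserved": 1.0},
    "intron": {"protein_hit": -1.0, "intron_motif": 2.0, "conserved": -0.5},
    "intergenic": {"protein_hit": -1.5, "intron_motif": -1.0, "conserved": -1.0},
}


def emission(label, obs):
    """Score of a label at one position given its observed evidence features."""
    return BIAS[label] + sum(WEIGHTS[label][name] * value for name, value in obs.items())


def transition(prev, cur):
    """Score of moving between labels; in this toy model introns only border exons."""
    if prev == cur:
        return 0.5
    if {prev, cur} == {"intergenic", "intron"}:
        return -math.inf
    return -1.0


def viterbi(observations):
    """Return the highest-scoring label sequence for a list of feature dicts."""
    if not observations:
        return []
    scores = [{lab: emission(lab, observations[0]) for lab in LABELS}]
    backpointers = []
    for obs in observations[1:]:
        prev_scores = scores[-1]
        cur_scores, pointers = {}, {}
        for lab in LABELS:
            best_prev = max(LABELS, key=lambda p: prev_scores[p] + transition(p, lab))
            cur_scores[lab] = (
                prev_scores[best_prev] + transition(best_prev, lab) + emission(lab, obs)
            )
            pointers[lab] = best_prev
        scores.append(cur_scores)
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    none = {}
    coding = {"protein_hit": 1, "conserved": 1}
    motif = {"intron_motif": 1}
    track = [none] * 2 + [coding] * 3 + [motif] * 2 + [coding] * 3 + [none] * 2
    print(viterbi(track))
```

A real gene finder would typically work with richer labels (for example strand and codon position) and learn the weights from annotated genomes; the sketch only shows how per-position evidence and transition scores are combined during decoding.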