Computational Biology @ Comenius University in Bratislava:

Michal Nánási. Decoding of Hidden Markov Models with Applications to Sequence Alignment. PhD thesis, Comenius University in Bratislava, 2014. Supervised by Broňa Brejová.

Download preprint: 14nanasith.pdf, 1398Kb

Download from publisher: http://alis.uniba.sk/storage/dpg/dostupne/FM/2015/2015-FM-17881/

Related web page: not available

Bibliography entry: BibTeX

Abstract:

We study two important problems in computational biology: sequence annotation and
sequence alignment. In the thesis we concentrate on the use of hidden Markov models
(HMMs), well established generative probabilistic models.
In the first part, we study the sequence annotation problem, specifically the two-stage
HMM decoding algorithms and the computational complexity of related problems. In
particular, we demonstrate that two-stage algorithms can be used to increase the accuracy
of decoding, and we prove the NP-hardness for three problems appropriate for the first
stage: the most probable set problem, the most probable restriction problem and the most
probable footprint problem.
The second part of the thesis focuses on alignment of sequences that contain tandem
repeats. Tandem repeats are highly repetitive elements withing genomic sequences that
cause biases in alignments. To address this issue, we introduce a new HMM that mod-
els alignments containing tandem repeats, combine it with existing and new decoding
algorithms, and evaluate our approach experimentally.
In both problems, we use the decoding algorithms to improve the accuracy of HMM
predictions. Decoding algorithms are often neglected, and most of the development is
focused on the structure of an HMM. However, a proper selection of a decoding method
can lead to significant improvements in the predictions.
Keywords: hidden Markov models, decoding, annotation, alignment