Publication details

Rastislav Sramek, Brona Brejova, Tomas Vinar. On-line Viterbi algorithm for analysis of long biological sequences. In Raffaele Giancarlo, Sridhar Hannenhalli, ed., Algorithms in Bioinformatics: 7th International Workshop (WABI), 4645 volume of Lecture Notes in Computer Science, pp. 240-251, Philadelphia, PA, USA, September 2007. Springer.
Preprint, 243Kb | Download from publisher | BibTeX

Abstract

Hidden Markov models (HMMs) are routinely used for analysis of long
genomic sequences to identify various features such as genes, CpG
islands, and conserved elements. A commonly used Viterbi algorithm
requires O(mn) memory to annotate a sequence of length n with an
m-state HMM, which is impractical for analyzing whole chromosomes. In
this paper, we introduce the on-line Viterbi algorithm for decoding
HMMs in much smaller space.  Our analysis shows that our algorithm has
the expected maximum memory Theta(mlog n) on two-state HMMs. We also
experimentally demonstrate that our algorithm significantly reduces
memory of decoding a simple HMM for gene finding on both simulated and
real DNA sequences, without a significant slow-down compared to the
classical Viterbi algorithm.