Brona Brejova, Daniel G. Brown, Tomas Vinar. The most probable labeling problem in HMMs and its application to bioinformatics. Journal of Computer and System Sciences, 73(7):1060-1077. 2007. Early version in WABI 2004.

Download preprint: 07paths.pdf, 335Kb

Download from publisher:

Related web page: not available

Bibliography entry: BibTeX

See also: early version


Hidden Markov models (HMMs) are often used for biological sequence
annotation. Each sequence feature is represented by a collection of
states with the same label. In annotating a new sequence, we seek the
sequence of labels that has highest probability. Computing this most
probable annotation was shown NP-hard by Lyngsoe and Pedersen. We
improve their result by showing that the problem is NP-hard for a
specific HMM, and present efficient algorithms to compute the most
probable annotation for a large class of HMMs, including abstractions
of models previously used for transmembrane protein topology prediction
and coding region detection.  We also present  a small experiment showing
that the  maximum probability annotation is more accurate than the
labeling that results from simpler heuristics.