Publication details
Brona Brejova, Daniel G. Brown, Tomas Vinar.
The most probable annotation problem in HMMs and its application to bioinformatics.
Journal of Computer and System Sciences,
73(7):1060-1077.
2007.
Early version in WABI 2004.
Preprint, 335Kb | Download from publisher | Early version | BibTeX
Abstract
Hidden Markov models (HMMs) are often used for biological sequence annotation. Each sequence feature is represented by a collection of states with the same label. In annotating a new sequence, we seek the sequence of labels that has highest probability. Computing this most probable annotation was shown NP-hard by Lyngsoe and Pedersen. We improve their result by showing that the problem is NP-hard for a specific HMM, and present efficient algorithms to compute the most probable annotation for a large class of HMMs, including abstractions of models previously used for transmembrane protein topology prediction and coding region detection. We also present a small experiment showing that the maximum probability annotation is more accurate than the labeling that results from simpler heuristics.