Marcel Kucharik, Jakub Kovac, Brona Brejova.
Gene finding with complex external information.
In Markéta Lopatková, ed.,
Information Technologies - Applications and Theory (ITAT),
volume 788 of CEUR-WS,
pp. 39-46, Vrátna dolina, Slovakia, September 2011. Best paper award.
The goal of gene finding is to locate genes, which are important segments of DNA encoding proteins. Programs solving this task are based on hidden Markov models (HMMs) capturing statistical features extracted from known genes, but often also incorporate hints about the correct gene structure extracted from experimental data. Existing gene finding programs can use such external information only in a limited way. Typically, they can process only simple hints describing a single part of the gene structure, because these are relatively easy to incorporate into standard HMM algorithms, but they cannot cope with complex hints spanning multiple parts. We have developed an efficient algorithm able to process such complex hints. Our experiments show that this approach slightly increases the accuracy of gene prediction. We also prove that a more general class of hints leads to an NP-hard problem.
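To illustrate why simple hints fit easily into standard HMM algorithms, the sketch below runs ordinary Viterbi decoding and treats a hint as a restriction on which states are allowed at a single sequence position. It is not the algorithm from the paper and does not handle complex hints spanning multiple parts. The two-state model, its probabilities, and all names (viterbi_with_hints, hints, STATES) are hypothetical and chosen only for illustration.

```python
import math

def viterbi_with_hints(obs, states, log_start, log_trans, log_emit, hints=None):
    """Return the most probable state path for obs.

    hints (hypothetical format) maps a 0-based position to the set of
    states allowed there; positions without a hint are unrestricted.
    """
    hints = hints or {}
    neg_inf = float("-inf")

    def allowed(i, s):
        return i not in hints or s in hints[i]

    # score[s]: best log-probability of any path ending in state s
    score = {
        s: log_start[s] + log_emit[s][obs[0]] if allowed(0, s) else neg_inf
        for s in states
    }
    back = []  # back[i-1][s]: best predecessor of state s at position i
    for i in range(1, len(obs)):
        new_score, pointers = {}, {}
        for s in states:
            if not allowed(i, s):
                # A simple hint just forbids this state at this position.
                new_score[s], pointers[s] = neg_inf, None
                continue
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][obs[i]]
            pointers[s] = prev
        back.append(pointers)
        score = new_score

    last = max(states, key=lambda s: score[s])
    if score[last] == neg_inf:
        raise ValueError("hints are inconsistent with the model")

    # Follow back-pointers from the best final state.
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy two-state model (assumed values, not taken from any real gene finder).
STATES = ["exon", "intron"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {
    "exon": {"exon": math.log(0.9), "intron": math.log(0.1)},
    "intron": {"exon": math.log(0.1), "intron": math.log(0.9)},
}
LOG_EMIT = {
    "exon": {c: math.log(0.4 if c in "GC" else 0.1) for c in "ACGT"},
    "intron": {c: math.log(0.4 if c in "AT" else 0.1) for c in "ACGT"},
}

sequence = "GCGCATAT"
unhinted = viterbi_with_hints(sequence, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
# Hint: experimental evidence says position 5 lies in an exon.
hinted = viterbi_with_hints(sequence, STATES, LOG_START, LOG_TRANS, LOG_EMIT,
                            hints={5: {"exon"}})
```

Because the hint only removes states at one position, the dynamic program keeps its usual structure and running time. A hint that ties several positions or gene parts together cannot be expressed as such a per-position filter, which is the limitation the abstract describes.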