Brona Brejova, Daniel G. Brown, Tomas Vinar.
Optimal DNA signal recognition models with a fixed amount of intrasignal dependency.
In G. Benson, R. Page, eds.,
Algorithms in Bioinformatics: 3rd International Workshop (WABI),
volume 2812 of Lecture Notes in Bioinformatics,
pp. 78-94, Budapest, Hungary, September 2003. Springer.
We study new probabilistic models for signals in DNA. Our models allow dependencies between multiple non-adjacent positions, in a generative model we call a higher-order tree. In our context, computing the maximum-likelihood model is equivalent to computing a minimum directed spanning hypergraph, a problem we show is NP-complete. We instead compute good models using simple greedy heuristics. In practice, the advantage of our models over more standard models based on adjacent positions is modest. However, there is a notable improvement in the estimation of the probability that a given position is a signal, which is useful in probabilistic gene finding. We also show that incorporating multiple signals involved in gene structure into a composite signal model in our framework yields little improvement, though again it gives better estimates of the probability that a site is an acceptor site signal.
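As a rough illustration of the kind of model the abstract describes, the following Python sketch builds a dependency structure in which each signal position may depend on up to k other positions, using a Prim-style greedy rule. This is not the algorithm from the paper: the function names (cond_log_lik, greedy_model), the scoring by unsmoothed maximum-likelihood conditional log-likelihood, and the greedy order are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import log


def cond_log_lik(seqs, child, parents):
    """Log-likelihood of column `child` given columns `parents`,
    using unsmoothed maximum-likelihood estimates from `seqs`."""
    context = Counter(tuple(s[p] for p in parents) for s in seqs)
    joint = Counter((tuple(s[p] for p in parents), s[child]) for s in seqs)
    return sum(n * log(n / context[c]) for (c, _), n in joint.items())


def greedy_model(seqs, k):
    """Hypothetical greedy heuristic: repeatedly attach the unplaced
    position whose best parent set (drawn from already placed positions,
    at most k parents) gives the largest likelihood gain.
    Returns {position: tuple_of_parent_positions}; the result is acyclic
    because parents are always chosen among earlier placed positions."""
    length = len(seqs[0])
    placed, model = [], {}
    while len(placed) < length:
        size = min(k, len(placed))
        candidates = (
            (cond_log_lik(seqs, c, ps) - cond_log_lik(seqs, c, ()), c, ps)
            for c in range(length) if c not in model
            for ps in combinations(placed, size)
        )
        # Ties are broken by enumeration order (first candidate wins).
        _, child, parents = max(candidates, key=lambda t: t[0])
        model[child] = parents
        placed.append(child)
    return model


# Toy aligned sites: column 2 copies column 0, column 1 is independent.
sites = ["AAA", "ACA", "CAC", "CCC", "GAG", "GCG", "TAT", "TCT"]
model = greedy_model(sites, k=1)
```

With k = 1 this reduces to a first-order tree over positions; larger k allows several parents per position. Because the estimates are unsmoothed, adding parents never lowers the training likelihood, so a practical version would need pseudocounts or a held-out set to avoid overfitting.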