Publication details

Brona Brejova, Daniel G. Brown, Ming Li, Tomas Vinar. ExonHunter: a comprehensive approach to gene finding. Bioinformatics, 21(S1):i57-i65. 2005. Intelligent Systems for Molecular Biology (ISMB 2005).

Abstract

Motivation: We present ExonHunter, a new and comprehensive gene finder
system that outperforms existing systems, featuring several new ideas
and approaches. Our system combines numerous sources of information
(genomic sequences, ESTs, and protein databases of related species)
into a gene finder based on a hidden Markov model in a novel and
systematic way. In our framework, various sources of information are
expressed as partial probabilistic statements about positions in the
sequence and their annotation. We then combine these into the final
prediction via a quadratic programming method, which we show is an
extension of existing methods. Allowing only partial statements is key
to our transparent handling of missing information and coping with the
heterogeneous character of individual sources of information. As well,
we give a new method for modeling length distribution of intergenic
regions in hidden Markov models.

Results: On a commonly used test set, ExonHunter performs significantly
better than the existing gene finders ROSETTA, SLAM, and TWINSCAN, with more
than two thirds of genes predicted completely correctly.