Brona Brejova, Daniel G. Brown, Ian M. Harrower, Tomas Vinar. New bounds for motif finding in strong instances. In Combinatorial Pattern Matching, 17th Annual Symposium (CPM 2006), 4009 volume of Lecture Notes in Computer Science, pp. 94-105, Barcelona, Spain, July 2006.

Download preprint: 06ptas.pdf, 410Kb

Download from publisher:

Related web page: not available

Bibliography entry: BibTeX


Many algorithms for motif finding that are commonly used in
bioinformatics start by sampling r potential motif occurrences from n
input sequences. The motif is derived from these samples and evaluated
on all sequences.  This approach works extremely well in practice, and
is implemented by several programs.  Li, Ma and Wang have shown that a
simple algorithm of this sort is a polynomial-time approximation
scheme.  However, in 2005, we showed specific instances of the motif
finding problem for which the approximation ratio of a slight
variation of this scheme converges to one very slowly as a function of
the sample size r, which seemingly contradicts the high performance of
sample-based algorithms.  Here, we account for the difference by
showing that, for a variety of different definitions of
\"strong\" binary motifs, the approximation ratio of sample-based
algorithms converges to one exponentially fast in r.  We also
describe \"very strong\" motifs, for which the simple sample-based
approach always identifies the correct motif, even for modest values
of r.