2-AIN-506 and 2-AIN-252: Seminar in Bioinformatics (2) and (4)
Summer 2016

Marina M.-C. Vidovic, Nico Görnitz, Klaus-Robert Müller, Gunnar Rätsch, Marius Kloft. Opening the Black Box: Revealing Interpretable Sequence Motifs in Kernel-Based Learning Algorithms. In Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2015.

Download preprint: not available

Download from publisher: http://link.springer.com/chapter/10.1007/978-3-319-23525-7_9

Related web page: not available



This work is in the context of kernel-based learning algorithms for sequence 
data. We present a probabilistic approach to automatically extract, from the 
output of such string-kernel-based learning algorithms, the subsequences—or 
motifs—truly underlying the machine’s predictions. The proposed framework 
views motifs as free parameters in a probabilistic model, which is solved 
through a global optimization approach. In contrast to prevalent approaches, 
the proposed method can discover even difficult, long motifs, and could be 
combined with any kernel-based learning algorithm that is based on an 
adequate sequence kernel. We show that, by using a discriminative kernel 
machine such as a support vector machine, the approach can reveal 
discriminative motifs underlying the kernel predictor. We demonstrate the 
efficacy of our approach through a series of experiments on synthetic and 
real data, including problems from handwritten digit recognition and a 
large-scale human splice site data set from the domain of computational