Martin Macko, Martin Králik, Broňa Brejová, Tomáš Vinař.
OB-Fold Recognition Combining Sequence and Structural Motifs.
In B. Brejová, ed.,
Information Technologies - Applications and Theory (ITAT),
number 1649 in CEUR-WS,
pp. 18-25, Tatranské Matliare, Slovakia, 2016.
Remote protein homology detection is an important step towards understanding protein function in living organisms. The problem is notoriously difficult; distant homologs can often be detected only by a combination of sequence and structural features. We propose a new framework in which important sequence and structural features are described by the user in the form of a descriptor, and the descriptor is then used to search a database of protein sequences and score potential candidates. We develop the algorithms necessary to support such a search, using support vector machines and discrete optimization methods. We demonstrate our approach on the example of the telomere-binding OB-fold domain, showing not only that we can distinguish between Telo_bind family members and negatives, but also that we can identify proteins from related protein families carrying similar OB-fold domains. A prototype implementation of the descriptor search software is available for the Linux operating system at http://compbio.fmph.uniba.sk/descal/