Bioinformatics seminar

Tue 26 Mar. 2013, 17:20
I-9

Title: Michael I. Sadowski (2013) Prediction of protein domain boundaries from inverse covariances
Speaker: Dávid Repka

It has been known since the days when relatively few structures had been solved
that longer protein chains often contain multiple domains, which may fold separately
and act as reusable functional modules found in many contexts. In many
structural biology tasks, in particular structure prediction, it is of great use 
to be able to identify domains within the structure and analyze these regions
separately. However, when using sequence data alone, this task has proven
exceptionally difficult, with relatively little improvement over the naive method
of choosing boundaries based on size distributions of observed domains. The
recent significant improvement in contact prediction provides a new source of
information for domain prediction. We test several methods for using this
information, including a kernel-smoothing approach and methods based on
building alpha-carbon models, and compare their performance with a length-based
predictor, a homology search method, and four published sequence-based predictors:
DOMCUT, DomPRO, DLP-SVM, and SCOOBY-DOmain. We show that the kernel-smoothing
method is significantly better than the other ab initio predictors when both
single-domain and multidomain targets are considered and is not significantly
different from the homology-based method. Considering only multidomain targets, the
kernel-smoothing method outperforms all of the published methods except DLP-SVM.
The kernel-smoothing method therefore represents a potentially useful improvement
to ab initio domain prediction.