Thu 21 Jul. 2011, 15:00
Title: Towards Empirical Models of the Central Dogma of Molecular Biology
Speaker: Gunnar Raetsch, MPI Tuebingen
In our work we aim at understanding and modeling the processes of gene expression and RNA transcript processing with the aid of machine learning. I will give the main motivations of our work and a non-technical introduction to the basic concepts of machine learning for the analysis of sequence data. Furthermore, I will describe our recent work on algorithms for the analysis of deep sequencing data (RNA-seq), which we use to obtain detailed qualitative and quantitative in silico replicates of transcriptomes necessary for subsequent analysis and modeling. I am going to discuss two projects in which we applied such methods and which are important steps towards understanding the regulation of gene expression and RNA processing: a) the analysis of a spatio-temporal atlas of C. elegans gene expression [1,2] and b) the analysis of genome, proteome and expression variation in 19 strains of Arabidopsis thaliana [3].

[1] Spencer, WC, Zeller, G, Watson, JD, Henz, SR, Watkins, KL, McWhirter, RD, Petersen, S, Sreedharan, VT, Widmer, C, Reinke, V, Petrella, L, Strome, S, Von Stetina, S, Katz, M, Rätsch, G, and Miller III, DM (2010). A Spatial and Temporal Map of C. elegans Gene Expression. Genome Research. Advance access Dec. 22, 2010.
[2] Gerstein, MB, et al., Henz, SR, et al., Rätsch, G, et al., Zeller, G, et al., and Waterston, RH (2010). Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project. Science, 330(6012):1775-1787.
[3] Xiangchao Gan, Oliver Stegle, Jonas Behr, Joshua G. Steffen, Philipp Drewe, Katie L. Hildebrand, Rune Lyngsoe, Sebastian J. Schultheiss, Edward J. Osborne, Vipin T. Sreedharan, Andre Kahles, Regina Bohnert, Géraldine Jean, Paul Derwent, Paul Kersey, Eric Belfield, Nicholas Harberd, Eric Kemen, Paula Kover, Christopher Toomajian, Richard M. Clark, Gunnar Rätsch, Richard Mott (2011). Multiple reference genomes and transcriptomes for Arabidopsis thaliana. "Accepted in principle" for Nature.

Dr. Gunnar Rätsch received the Diplom in computer science from the University of Potsdam (Germany) in 1998, along with the Jacob Jacoby prize for the best student of the faculty of natural sciences. Three years later, he obtained a Ph.D. in natural sciences for his work on Boosting at the Fraunhofer Institute in Berlin, for which he received the Michelson award from the University of Potsdam. Gunnar Rätsch has been a postdoctoral fellow in the Research School of Information Sciences and Engineering of the Australian National University in Canberra (Australia), at the Max Planck Institute for Biological Cybernetics in Tübingen (Germany) and at Fraunhofer FIRST in Berlin (Germany). Currently, he is leading a research group at the Friedrich Miescher Laboratory of the Max Planck Society in Tübingen (Germany). In 2007 he was awarded the Olympus prize from the German Association for Pattern Recognition. He is interested in Machine Learning techniques such as Boosting and Support Vector Machines, as well as methods for the analysis of structured data, and their application in computational biology. In the last few years he has focussed on the analysis of transcriptome data, gene finding, the prediction of alternative splicing and the analysis of genomic and phenotypic variation.