Thu 21 Jul. 2011, 15:00
CH1-222

Title: Towards Empirical Models of the Central Dogma of Molecular Biology
Speaker: Gunnar Raetsch, MPI Tuebingen

In our work we aim at understanding and modeling the processes of
gene expression and RNA transcript processing with the aid of
machine learning. I will give the main motivations of our work
and a non-technical introduction to the basic concepts of
machine learning for the analysis of sequence data. Furthermore,
I will describe our recent work on algorithms for the analysis of
deep sequencing data (RNA-seq) that we use to obtain detailed
qualitative and quantitative in silico replicates of
transcriptomes necessary for subsequent analysis and modeling. I
will discuss two projects in which we applied such methods, both
important steps towards understanding gene expression and RNA
processing regulation: a) the analysis of a spatio-temporal atlas
of C. elegans gene expression [1,2] and b) the analysis of genome,
proteome and expression variation in 19 strains of Arabidopsis
thaliana [3].

[1] Spencer, WC, Zeller, G, Watson, JD, Henz, SR, Watkins, KL,
McWhirter, RD, Petersen, S, Sreedharan, VT, Widmer, C, Reinke, V,
Petrella, L, Strome, S, Von Stetina, S, Katz, M, Rätsch, G, and
Miller III, DM (2010). A Spatial and Temporal Map of C. elegans
Gene Expression. Genome Research. Advance access Dec. 22, 2010.

[2] Gerstein, MB, et al., Henz, SR, et al., Rätsch, G, et al.,
Zeller, G, et al., and Waterston, RH (2010). Integrative
Analysis of the Caenorhabditis elegans Genome by the modENCODE
Project. Science, 330(6012):1775-1787.

[3] Xiangchao Gan, Oliver Stegle, Jonas Behr, Joshua G. Steffen,
Philipp Drewe, Katie L. Hildebrand, Rune Lyngsoe, Sebastian J.
Schultheiss, Edward J. Osborne, Vipin T. Sreedharan, Andre
Kahles, Regina Bohnert, Géraldine Jean, Paul Derwent, Paul
Kersey, Eric Belfield, Nicholas Harberd, Eric Kemen, Paula Kover,
Christopher Toomajian, Richard M. Clark, Gunnar Rätsch, Richard
Mott (2011).  Multiple reference genomes and transcriptomes for
Arabidopsis thaliana. "Accepted in principle" for Nature.


Dr. Gunnar Rätsch received the Diplom in computer science from the
University of Potsdam (Germany) in 1998, along with the Jacob Jacoby
prize for the best student of the faculty of natural sciences. Three
years later, he obtained a Ph.D. in natural sciences with his work on
Boosting at the Fraunhofer Institute in Berlin, for which he received
the Michelson award from the University of Potsdam. Gunnar Rätsch has
been a postdoctoral fellow in the Research School of Information
Sciences and Engineering of the Australian National University in
Canberra (Australia), at the Max Planck Institute for Biological
Cybernetics in Tübingen (Germany) and at Fraunhofer FIRST in Berlin
(Germany). Currently, he is leading a research group at the Friedrich
Miescher Laboratory of the Max Planck Society in Tübingen (Germany).
In 2007 he was awarded the Olympus prize from the German Association
for Pattern Recognition. He is interested in Machine Learning
techniques such as Boosting and Support Vector Machines, as well as
methods for the analysis of structured data and their application in
computational biology. Over the last few years he has focussed on the
analysis of transcriptome data, gene finding, the prediction of
alternative splicing and the analysis of genomic and phenotypic
variation.