2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2015
Abstract

Petr Novak, Pavel Neumann, Jiri Pech, Jaroslav Steinhaisl, Jiri Macas. RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics, 29(6):792-793. 2013.


Abstract:

MOTIVATION: Repetitive DNA makes up large portions of plant and animal nuclear
genomes, yet it remains the least-characterized genome component in most species 
studied so far. Although the recent availability of high-throughput sequencing
data provides necessary resources for in-depth investigation of genomic repeats, 
its utility is hampered by the lack of specialized bioinformatics tools and
appropriate computational resources that would enable large-scale repeat analysis
to be run by biologically oriented researchers. RESULTS: Here we present
RepeatExplorer, a collection of software tools for characterization of repetitive
elements, which is accessible via a web interface. A key component of the server is
the computational pipeline using a graph-based sequence clustering algorithm to
facilitate de novo repeat identification without the need for reference databases
of known elements. Because the algorithm uses short sequences randomly sampled
from the genome as input, it is ideal for analyzing next-generation sequence
reads. Additional tools are provided to aid in classification of identified
repeats, investigate phylogenetic relationships of retroelements and perform
comparative analysis of repeat composition between multiple species. The server
allows analysis of several million sequence reads, which typically results in
identification of most high- and medium-copy repeats in higher plant genomes.