Bioinformatics Seminar

Tue 29 Mar. 2011, 17:20
I-9

Title: Ratan et al. Calling SNPs without a reference sequence
Speaker: Martin Králik

BACKGROUND: The most common application for the next-generation sequencing
technologies is resequencing, where short reads from the genome of an
individual are aligned to a reference genome sequence for the same
species. These mappings can then be used to identify genetic differences
among individuals in a population, and perhaps ultimately to explain
phenotypic variation. Many algorithms capable of aligning short reads to
the reference and determining differences between them have been
reported. Much less has been reported on how to use these technologies to
determine genetic differences among individuals of a species for which a
reference sequence is not available, which drastically limits the number
of species that can easily benefit from these new technologies. RESULTS:
We describe a computational pipeline, called DIAL (De novo Identification
of Alleles), for identifying single-base substitutions between two closely
related genomes without the help of a reference genome. The method works
even when the depth of coverage is insufficient for de novo assembly, and
it can be extended to determine small insertions/deletions. We evaluate
the software's effectiveness using published Roche/454 sequence data from
the genome of Dr. James Watson (to detect heterozygous positions) and
recent Illumina data from orangutan, in each case comparing our results to
those from computational analysis that uses a reference genome assembly.
We also illustrate the use of DIAL to identify nucleotide differences
among transcriptome sequences. CONCLUSIONS: DIAL can be used for
identification of nucleotide differences in species for which no reference
sequence is available. Our main motivation is to use this tool to survey
the genetic diversity of endangered species, as the identified sequence
differences can be used to design genotyping arrays to assist in the
species' management. The DIAL source code is freely available at
http://www.bx.psu.edu/miller_lab/.