2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2017

Adam M. Novak et al. Genome Graphs. Technical Report 101378, bioRxiv, 2017.

Download from publisher: https://doi.org/10.1101/101378


There is increasing recognition that a single, monoploid reference genome 
is a poor universal reference structure for human genetics, because it 
represents only a tiny fraction of human variation. Adding this missing 
variation results in a structure that can be described as a mathematical 
graph: a genome graph. We demonstrate that, in comparison to the existing 
reference genome (GRCh38), genome graphs can substantially improve the 
fractions of reads that map uniquely and perfectly. Furthermore, we show 
that this fundamental simplification of read mapping transforms the variant 
calling problem from one in which many non-reference variants must be 
discovered de novo to one in which the vast majority of variants are simply 
re-identified within the graph. Using standard benchmarks as well as a 
novel reference-free evaluation, we show that a simplistic variant calling 
procedure on a genome graph can already call variants at least as well as, 
and in many cases better than, a state-of-the-art method on the linear 
human reference genome. We anticipate that graph-based references will 
supplant linear references in humans and in other applications where 
cohorts of sequenced individuals are available.