2-AIN-506 and 2-AIN-252: Seminar in Bioinformatics (2) and (4)
Summer 2021

Peter Ebert, Peter A. Audano, Qihui Zhu, Bernardo Rodriguez-Martin, David Porubsky, Marc Jan Bonder, Arvis Sulovari, Jana Ebler, Weichen Zhou, Rebecca Serra Mari, Feyza Yilmaz, Xuefang Zhao, PingHsun Hsieh, Joyce Lee, Sushant Kumar, Jiadong Lin, Tobias Rausch, Yu Chen, Jingwen Ren, Martin Santamarina, Wolfram Hops, Hufsah Ashraf, Nelson T. Chuang, Xiaofei Yang, Katherine M. Munson, Alexandra P. Lewis, Susan Fairley, Luke J. Tallon, Wayne E. Clarke, Anna O. Basile, Marta Byrska-Bishop, Andre Corvelo, Uday S. Evani, Tsung-Yu Lu, Mark J. P. Chaisson, Junjie Chen, Chong Li, Harrison Brand, Aaron M. Wenger, Maryam Ghareghani, William T. Harvey, Benjamin Raeder, Patrick Hasenfeld, Allison A. Regier, Haley J. Abel, Ira M. Hall, Paul Flicek, Oliver Stegle, Mark B. Gerstein, Jose M. C. Tubio, Zepeng Mu, Yang I. Li, Xinghua Shi, Alex R. Hastie, Kai Ye, Zechen Chong, Ashley D. Sanders, Michael C. Zody, Michael E. Talkowski, Ryan E. Mills, Scott E. Devine, Charles Lee, Jan O. Korbel, Tobias Marschall, Evan E. Eichler. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science, 2021.


Long-read and strand-specific sequencing technologies together facilitate the de 
novo assembly of high-quality haplotype-resolved human genomes without
parent-child trio data. We present 64 assembled haplotypes from 32 diverse human 
genomes. These highly contiguous haplotype assemblies (average contig N50: 26
Mbp) integrate all forms of genetic variation even across complex loci. We
identify 107,590 structural variants (SVs), of which 68% are not discovered by
short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich
sequence). We characterize 130 of the most active mobile element source elements 
and find that 63% of all SVs arise by homology-mediated mechanisms. This resource
enables reliable graph-based genotyping from short reads of up to 50,340 SVs,
resulting in the identification of 1,526 expression quantitative trait loci as
well as SV candidates for adaptive selection within the human population.
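
The abstract reports assembly contiguity as an average contig N50. As a reminder of what that metric measures, here is a minimal sketch of the usual N50 definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembled length. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; this is not the evaluation code used in the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

In the toy example the total length is 290, so the threshold is 145; the two longest contigs (80 + 70 = 150) are enough to reach it, which makes the N50 equal to 70.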