2-AIN-505, 2-AIN-251: Seminar in Bioinformatics (1) and (3)
Winter 2017

Christoph Lippert, Riccardo Sabatini, M. Cyrus Maher, Eun Yong Kang, Seunghak Lee, Okan Arikan, Alena Harley, Axel Bernal, Peter Garst, Victor Lavrenko, Ken Yocum, Theodore Wong, Mingfu Zhu, Wen-Yun Yang, Chris Chang, Tim Lu, Charlie W. H. Lee, Barry Hicks, Smriti Ramakrishnan, Haibao Tang, Chao Xie, Jason Piper, Suzanne Brewerton, Yaron Turpaz, Amalio Telenti, Rhonda K. Roby, Franz J. Och, J. Craig Venter. Identification of individuals by trait prediction using whole-genome sequencing data. Proceedings of the National Academy of Sciences of the United States of America, 114(38):10166-10171. 2017.


Prediction of human physical traits and demographic information from genomic data
challenges privacy and data deidentification in personalized medicine. To explore
the current capabilities of phenotype-based genomic identification, we applied
whole-genome sequencing, detailed phenotyping, and statistical modeling to
predict biometric traits in a cohort of 1,061 participants of diverse ancestry.
Individually, for a large fraction of the traits, their predictive accuracy
beyond ancestry and demographic information is limited. However, we have
developed a maximum entropy algorithm that integrates multiple predictions to
determine which genomic samples and phenotype measurements originate from the
same person. Using this algorithm, we have reidentified an average of >8 of 10
held-out individuals in an ethnically mixed cohort and an average of 5 of either 
10 African Americans or 10 Europeans. This work challenges current conceptions of
personal privacy and may have far-reaching ethical and legal implications.