Towards a reference genome that captures global genetic diversity
Journal article   Open access   Peer reviewed

Karen H. Y. Wong, Walfred Ma, Chun-Yu Wei, Erh-Chan Yeh, Wan-Jia Lin, Elin H. F. Wang, Jen-Ping Su, Feng-Jen Hsieh, Hsiao-Jung Kao, Hsiao-Huei Chen, …
Nature Communications, vol. 11(1), article 5482
30 Oct 2020
PMID: 33127893
URL: https://doi.org/10.1038/s41467-020-19311-w
Published, Version of Record (VoR); CC BY 4.0; Open access

Abstract

Computational biology and bioinformatics; Genetic variation; Genomics
The current human reference genome is predominantly derived from a single individual and it does not adequately reflect human genetic diversity. Here, we analyze 338 high-quality human assemblies of genetically divergent human populations to identify missing sequences in the human reference genome with breakpoint resolution. We identify 127,727 recurrent non-reference unique insertions spanning 18,048,877 bp, some of which disrupt exons and known regulatory elements. To improve genome annotations, we linearly integrate these sequences into the chromosomal assemblies and construct a Human Diversity Reference. Leveraging this reference, an average of 402,573 previously unmapped reads can be recovered for a given genome sequenced to ~40X coverage. Transcriptomic diversity among these non-reference sequences can also be directly assessed. We successfully map tens of thousands of previously discarded RNA-Seq reads to this reference and identify transcription evidence in 4781 gene loci, underlining the importance of these non-reference sequences in functional genomics. Our extensive datasets are important advances toward a comprehensive reference representation of global human genetic diversity.

The human reference genome does not fully reflect human genetic diversity. Here, the authors analyse 338 human genome assemblies from diverse populations to identify missing sequences, define non-reference unique insertions and construct a Human Diversity Reference.

Metrics

22 Record Views
32 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Genetics & Heredity