Do databases of exome sequences reliably correlate with the prevalence of individuals with defective DNA repair?
In this molecular epidemiologic study examining 3 large exome sequence databases totaling more than 200 000 alleles, unexpectedly high frequencies were found for 2 mutations associated with xeroderma pigmentosum in DNA repair genes (XPF [ERCC4] p.P379S, 0.4% and XPC p.P334H, 0.3%). These frequencies predict that more than 8000 people in the United States should have xeroderma pigmentosum caused by these mutations, yet only 4 such individuals were clinically identified in this study.
Unsuspected mutations in known genes with a predisposition for skin cancer may be responsible for some of the high frequency of skin cancers in the general population.
Importance
Wide use of genomic sequencing to diagnose disease has raised concern about the extent of genotype-phenotype correlations.
Objective
To correlate disease-associated allele frequencies with expected and reported prevalence of clinical disease.
Design, Setting, and Participants
Xeroderma pigmentosum (XP), a recessive, cancer-prone, neurocutaneous disorder, was used as a model for this study. From January 1, 2017, to May 4, 2018, the Human Gene Mutation Database and a cohort of patients at the National Institutes of Health were searched and screened to identify reported mutations associated with XP. The clinical phenotype of these patients was confirmed from reports in the literature and National Institutes of Health medical records. The genetically predicted prevalence of disease based on frequency of known pathogenic mutations was compared with the prevalence of patients clinically diagnosed with phenotypic XP. Exome sequencing of more than 200 000 alleles from the Genome Aggregation Database, the National Cancer Institute Division of Cancer Epidemiology and Genetics database of healthy controls, and an Inova Hospital Study database was used to investigate the frequencies of these mutations in the general population.
Main Outcomes and Measures
Listing of all reported mutations associated with XP, their frequencies in 3 large exome sequence databases, determination of the number of patients in the United States with XP using modeling equations, and comparison of the observed and reported numbers of patients with XP with specific mutations.
Results
A total of 156 pathogenic missense and nonsense mutations associated with XP were identified in the National Institutes of Health cohort and the Human Gene Mutation Database. The Genome Aggregation Database provided frequency data for 65 of these mutations, with a total allele frequency of 1.13%. The XPF (ERCC4) mutation, p.P379S, had an allele frequency of 0.4%, and the XPC mutation, p.P334H, had an allele frequency of 0.3%. With the Hardy-Weinberg equation, it was determined that there should be more than 8000 patients who are homozygous for these mutations in the United States. In contrast, only 3 patients with XP were reported as having the XPF mutation, and 1 patient was reported as having the XPC mutation.
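The Hardy-Weinberg estimate above can be sketched as a short calculation. This is a minimal illustration, not the authors' modeling code: it assumes a US population of roughly 325 million (a figure not given in the abstract) and treats each variant independently, counting only homozygotes (q² × N). The function and variable names are hypothetical.

```python
# Assumption: US population of roughly 325 million; the abstract does not state
# the population figure used in the study's modeling equations.
US_POPULATION = 325_000_000


def expected_homozygotes(allele_frequency: float, population: int) -> float:
    """Expected homozygous individuals under Hardy-Weinberg equilibrium: q^2 * N."""
    return allele_frequency ** 2 * population


# Allele frequencies reported in the abstract (0.4% and 0.3%).
allele_frequencies = {
    "XPF (ERCC4) p.P379S": 0.004,
    "XPC p.P334H": 0.003,
}

expected = {
    variant: expected_homozygotes(q, US_POPULATION)
    for variant, q in allele_frequencies.items()
}
total_expected = sum(expected.values())

for variant, count in expected.items():
    print(f"{variant}: about {count:,.0f} expected homozygotes")
print(f"Total: about {total_expected:,.0f} expected homozygotes")
```

Under these assumptions the two variants together give an expected count above 8000, which is in line with the figure reported in the abstract and contrasts with the 4 patients actually reported.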
Conclusions and Relevance
The findings from this study suggest that clinicians should approach large genomic databases with caution when trying to correlate genetic variants with their clinical implications and with the prevalence of disease. Unsuspected mutations in known genes with a predisposition for skin cancer may be responsible for some of the high frequency of skin cancers in the general population.
Pugh J, Khan SG, Tamura D, et al. Use of Big Data to Estimate Prevalence of Defective DNA Repair Variants in the US Population. JAMA Dermatol. 2019;155(1):72–78. doi:10.1001/jamadermatol.2018.4473