The 3 schemes correspond to the 3 Gene Ontology categories: A, biological process; B, molecular function; C, cellular component.
Hysi PG, Mahroo OA, Cumberland P, Wojciechowski R, Williams KM, Young TL, Mackey DA, Rahi JS, Hammond CJ. Common Mechanisms Underlying Refractive Error Identified in Functional Analysis of Gene Lists From Genome-Wide Association Study Results in 2 European British Cohorts. JAMA Ophthalmol. 2014;132(1):50-56. doi:10.1001/jamaophthalmol.2013.6022
Copyright 2014 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Importance
To date, relatively few genes responsible for a fraction of heritability have been identified by means of large genetic association studies of refractive error.
Objective
To explore the genetic mechanisms that lead to refractive error in the general population.
Design, Setting, and Participants
Genome-wide association studies were carried out in 2 British population-based independent cohorts (N = 5928 participants) to identify genes moderately associated with refractive error.
Main Outcomes and Measures
Enrichment analyses were used to identify sets of genes overrepresented in both cohorts. Enriched groups of genes were compared between both participating cohorts as a further measure against random noise.
Results
Groups of genes enriched at highly significant statistical levels were remarkably consistent in both cohorts. In particular, these results indicated that plasma membrane (P = 7.64 × 10^−30), cell-cell adhesion (P = 2.42 × 10^−18), synaptic transmission (P = 2.70 × 10^−14), calcium ion binding (P = 3.55 × 10^−15), and cation channel activity (P = 2.77 × 10^−14) were significantly overrepresented in relation to refractive error.
Conclusions and Relevance
These findings provide evidence that development of refractive error in the general population is related to the intensity of photosignal transduced from the retina, which may have implications for future interventions to minimize this disorder. Pathways connected to the procession of the nerve impulse are major mechanisms involved in the development of refractive error in populations of European origin.
Refractive error is the most common ocular disorder.1 It affects about 25% of the adult population in Europe and the United States2 but its prevalence approaches three-quarters among the younger age groups of Southeast Asia.3 Refractive error and in particular one of its manifestations, myopia, is an important risk factor for serious ocular complications and blindness.1,4 Through direct financial costs and indirect costs of loss of productivity, myopia costs societies billions of dollars each year.5
Myopia is by far the most prevalent form of refractive error6 and usually occurs as a result of elongation of the axial length of the eye beyond its focal plane.7 Refraction at birth is almost normally distributed, centered in the hyperopic ranges, but within 5 years of age, the center of the distribution moves away from hyperopia as the standard deviations of that distribution decrease,8 primarily as a result of axial length changes.9,10
The prevalence of refractive error and myopia varies both across geographic regions11 and among different ethnicities living together within the same geographic locations.12,13 This is indicative of both strong environmental influences and genetic predispositions. Higher socioeconomic status,14 education,15 limited outdoor activity, and especially near work16 are recognized risk factors for myopia, and life-course research has recently shown that key prenatal and early childhood factors associated with general growth and development (eg, maternal age, intrauterine growth retardation, smoking in pregnancy, and changing socioeconomic status) are also implicated.17
Yet parental myopia is the strongest predictor for myopia in school-aged children.18 Heritability studies have found that a large portion of the refractive error variation is due to inherited factors.19 Variants at 2 genetic loci, seemingly involved in synaptic transmission and transduction of visual signal,20,21 have been identified through genome-wide association studies (GWAS).22,23
Current knowledge of both environmental and genetic factors predisposing to refractive error is patchy and does not allow us to have an integrated view of the mechanisms that might influence the process of emmetropization. Most complex diseases are polygenic24,25 and because of the relatively low power of real-life GWAS cohorts, true signals can be drowned out by the random noise of false-positive results. However, some of the truly positive associations, albeit at less than genome-wide statistical significance, will generally rank higher in GWAS results because of their inherent role in disease etiology and pathophysiology. Also, by virtue of the biological interactions leading to disease, these genes are likely to share commonalities, such as participation in the same functional classes or known biological classes. Genome-wide association study results of sufficient power would be expected to be enriched for functional gene sets or biological pathways that are of relevance to the phenotype studied.
To further elucidate the potential mechanisms that cause refractive error and myopia in the general population, we carried out a meta-analysis of gene list enrichment analyses of results from 2 separate GWAS of refractive error. To minimize heterogeneity introduced from subtle racial or ethnic differences or exposure to the social environment, we focused on 2 cohorts drawn from the white British general population: the TwinsUK cohort26 and the 1958 British Birth Cohort.27
The TwinsUK data set comprised 4270 individuals (8% male) with both genotypic and phenotypic information. The phenotypic information consisted of spherical equivalent derived from noncycloplegic autorefraction using an ARM-10 autorefractor (Takagi Ltd). The spherical equivalent was defined as the sphere value (in diopters) plus half of the cylinder value of the same eye. The average of measurements from both eyes was used for the analysis. The mean (SD) spherical equivalent was −0.4 (2.73) diopters, with a range of −25.12 to +9.4 diopters, and the mean age was 53 years. Individuals with previous ocular surgery including cataract and refractive surgery were excluded, as were those with other systemic or ocular conditions that might influence refraction. All participants were unaware of the eye studies at the time of enrollment; they gave informed written consent to participate in genetic association studies. The study and biomedical examination protocols were approved by the St Thomas Hospital Local Research Ethics Committee (TwinsUK) and the South East MultiCentre Research Ethics Committee and the Oversight Committee for the 1958 British Birth Cohort. The research adhered to the tenets of the Declaration of Helsinki.
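The phenotype definition above can be written out as a short sketch. The function names and the example readings are illustrative assumptions and are not taken from the study data.

```python
def spherical_equivalent(sphere: float, cylinder: float) -> float:
    """Spherical equivalent of one eye: sphere plus half the cylinder (diopters)."""
    return sphere + cylinder / 2.0


def mean_spherical_equivalent(right: tuple, left: tuple) -> float:
    """Average the spherical equivalents of both eyes.

    Each argument is a (sphere, cylinder) pair in diopters.
    """
    return (spherical_equivalent(*right) + spherical_equivalent(*left)) / 2.0


# Hypothetical autorefraction readings, not data from either cohort.
example = mean_spherical_equivalent(right=(-2.00, -0.50), left=(-1.50, -1.00))
print(example)  # -2.125
```

A negative mean value indicates a myopic refraction and a positive value a hyperopic one.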
Genotyping was carried out using 2 genotyping platforms from Illumina: the HumanHap 300k Duo for part of the TwinsUK Cohort and the HumanHap610 quad array for the rest of the cohort, as described elsewhere.22 Stringent quality control measures were applied. Individuals were included if their genotyping success rate exceeded 95% and they did not show excess or low heterozygosity (defined by the interval of 0.2-0.4). Single-nucleotide polymorphisms (SNPs) were included in the imputation if they had a genotype success rate of at least 0.95 if their minor allele frequency was greater than 0.05 and at least 0.99 if their minor allele frequency was 0.01 to 0.05. Only SNPs that were within Hardy-Weinberg equilibrium (P > 10^−4) and had a minor allele frequency of 0.04 or more were regressed. Imputation of nongenotyped loci was done with IMPUTE28 based on HapMap2 CEU haplotypes. Association between genotypes and phenotypes was computed using the score test as implemented in the program MERLIN,29 as published elsewhere.22
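The quality control rules for TwinsUK can be expressed as simple filters. This is a minimal sketch, not the pipeline used in the study: the class and function names are hypothetical, the 0.95 call-rate rule is assumed to apply to SNPs with a minor allele frequency above 0.05, and SNPs with a frequency below 0.01 are assumed to be excluded from imputation.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    call_rate: float  # genotype success rate, between 0 and 1
    maf: float        # minor allele frequency, between 0 and 0.5
    hwe_p: float      # P value of the Hardy-Weinberg equilibrium test


def individual_passes(call_rate: float, heterozygosity: float) -> bool:
    """Sample-level rule: success rate above 95% and heterozygosity in 0.2-0.4."""
    return call_rate > 0.95 and 0.2 <= heterozygosity <= 0.4


def passes_imputation_qc(snp: SnpStats) -> bool:
    """Call-rate rule that becomes stricter for less common variants."""
    if snp.maf > 0.05:
        return snp.call_rate >= 0.95
    if 0.01 <= snp.maf <= 0.05:
        return snp.call_rate >= 0.99
    return False  # assumption: variants rarer than 0.01 are not used


def passes_association_qc(snp: SnpStats) -> bool:
    """SNPs tested for association: in Hardy-Weinberg equilibrium, MAF >= 0.04."""
    return snp.hwe_p > 1e-4 and snp.maf >= 0.04


# Hypothetical SNPs for illustration only.
common = SnpStats(call_rate=0.96, maf=0.20, hwe_p=0.5)
low_freq = SnpStats(call_rate=0.96, maf=0.03, hwe_p=0.5)
print(passes_imputation_qc(common), passes_imputation_qc(low_freq))  # True False
```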
The second data set was a random subset of 1658 individuals from the 1958 British Birth Cohort, with exclusion of individuals with previous ocular surgery or those with other systemic or ocular conditions that might influence refraction.17 The phenotypic information was obtained through noncycloplegic autorefraction (Nikon Retinomax 2) of both eyes of each participant. Spherical equivalent was calculated as described earlier. Genotyping was done in 7 overlapping batches using the Illumina Human1M-Duo chip (2 batches of 1000 and 313 participants), Illumina Infinium HumanHap550 (2 different batches of 770 and 260 participants), Affymetrix 6.0 (317 participants), Affymetrix 5.0 Human SNP Array (299 participants), and the Metabochip30 (1323 participants). Genotypes were aligned to the same strand, the consistency of which was checked using local patterns of linkage disequilibrium between SNPs as implemented in the routine “flip scan” in PLINK.31 Individuals were checked for genotyping success rate (all exceeded 99%) and for excess or low heterozygosity (all participants were checked and found within the predefined interval of 0.2-0.4). Single-nucleotide polymorphisms were included in the analysis if they had a genotype success rate of at least 0.95, were within Hardy-Weinberg equilibrium (P > 10^−4), and had a minor allele frequency of 0.04 or more. Imputation was done using MaCH132 and genotypes meeting conventional quality control criteria were analyzed genome-wide using PLINK.31
Results from both the GWAS described earlier were annotated using the ENSEMBL database (www.ensembl.org). We considered that the effect of a SNP may be conferred by its direct or indirect impacts over the whole gene unit, including the transcript and its regulatory regions. The latter are not fully understood and several studies have shown that regulatory elements are often at long distances from (or in trans with) the transcript.33 Yet expression quantitative trait loci located within ±100 kilobases (kb) from the transcript are both more frequent and have stronger effects than others.34 Under the assumption that this interval would link the putative effect of a SNP variant with a gene’s function, SNPs within 100 kb of a gene’s transcript were annotated as part of that gene. Single-nucleotide polymorphisms that were within 100 kb of more than 1 gene were annotated with the gene nearest their genomic location or both genes in cases where the 2 transcripts overlapped. For each of the 2 separate gene list enrichment analyses, genes were assigned a significance according to the strongest associated SNP annotated to them in the respective GWAS. Because the main rationale was to enrich for genes with true association with refractive error, the participating studies needed to have sufficient power to detect true effects between 0.1 and 0.5 diopter, consistent with the range of effects observed in previous successful myopia association studies.22,23 In the smaller cohort (1958 British Birth Cohort), 80% power for that range of alleles and variant frequencies was achieved at α = .001, and for the sake of consistency, this threshold was adopted for both participating cohorts. Only genes whose best-associated SNP passed a preset significance threshold of 0.001 were included in the subsequent enrichment analyses.
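The annotation and gene-selection rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the record types, gene names, SNP identifiers, and coordinates are all hypothetical, and the overlap rule is interpreted as "the nearest gene plus any nearby gene whose transcript overlaps it".

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name chrom start end")
Snp = namedtuple("Snp", "rsid chrom pos p")

WINDOW = 100_000  # annotation window of 100 kb around each transcript
ALPHA = 0.001     # significance threshold for entering the gene list


def distance(snp, gene):
    """Distance in base pairs from a SNP to a transcript (0 if inside it)."""
    if gene.start <= snp.pos <= gene.end:
        return 0
    return min(abs(snp.pos - gene.start), abs(snp.pos - gene.end))


def genes_for_snp(snp, genes):
    """Nearest gene within the window, plus nearby genes overlapping it."""
    nearby = [g for g in genes
              if g.chrom == snp.chrom and distance(snp, g) <= WINDOW]
    if not nearby:
        return []
    nearest = min(nearby, key=lambda g: distance(snp, g))
    return [g for g in nearby
            if g.start <= nearest.end and nearest.start <= g.end]


def gene_list(snps, genes):
    """Give each gene the P value of its best SNP, then apply the threshold."""
    best = {}
    for snp in snps:
        for g in genes_for_snp(snp, genes):
            best[g.name] = min(snp.p, best.get(g.name, 1.0))
    return {name: p for name, p in best.items() if p < ALPHA}


# Hypothetical genes and SNPs for illustration only.
genes = [
    Gene("GENE_A", "1", 1_000_000, 1_050_000),
    Gene("GENE_B", "1", 1_040_000, 1_090_000),  # overlaps GENE_A
    Gene("GENE_C", "1", 1_300_000, 1_320_000),
    Gene("GENE_D", "2", 500_000, 520_000),
]
snps = [
    Snp("rs_a", "1", 990_000, 5e-4),    # nearest GENE_A, which overlaps GENE_B
    Snp("rs_b", "1", 1_250_000, 2e-5),  # 50 kb upstream of GENE_C
    Snp("rs_c", "1", 1_310_000, 0.02),  # inside GENE_C, weaker than rs_b
    Snp("rs_d", "1", 2_000_000, 1e-8),  # no gene within 100 kb
    Snp("rs_e", "2", 510_000, 0.01),    # inside GENE_D, fails the threshold
]
selected = gene_list(snps, genes)
print(selected)
```

In this example GENE_A, GENE_B, and GENE_C enter the gene list, whereas GENE_D is dropped because its best SNP does not reach 0.001 and rs_d is ignored because it lies more than 100 kb from any transcript.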
Gene list enrichment analyses were used to assess if the genes that were identified as statistically significant in the GWAS participated in specific annotation categories more than by chance. These analyses were carried out using DAVID (the Database for Annotation, Visualization, and Integrated Discovery).35,36 We restricted the enrichment analyses to the 3 classes of the Gene Ontology (GO) database (http://www.geneontology.org/) and to the BioCarta (http://www.biocarta.com/) and KEGG (Kyoto Encyclopedia of Genes and Genomes) (http://www.kegg.jp/) pathways.
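The idea behind a gene list enrichment test can be illustrated with a plain one-sided hypergeometric test, which is equivalent to a one-sided Fisher exact test on the corresponding 2 × 2 table. This is only a sketch of the general technique: DAVID computes its own variant of the statistic, and the function names and counts below are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when a list of n genes is drawn from a background of N genes,
    K of which carry the annotation, and k annotated genes are observed."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return hits / total


def fold_enrichment(k: int, n: int, K: int, N: int) -> float:
    """Observed proportion of annotated genes divided by the expected one."""
    return (k / n) / (K / N)


# Hypothetical counts: 20 background genes, 5 annotated, a list of 4 genes
# of which 2 are annotated.
p = hypergeom_upper_tail(k=2, n=4, K=5, N=20)
fold = fold_enrichment(k=2, n=4, K=5, N=20)
print(round(p, 4), fold)  # 0.2487 2.0
```

A fold enrichment above 1 means the annotation is more frequent in the gene list than in the background; the P value measures how surprising that excess would be by chance.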
From our analyses, we obtained a Fisher exact test P value for enrichment, a fold change compared with expectations, and a hypergeometric Bonferroni-corrected value controlling for multiple testing. Fisher exact test results from both analyses were meta-analyzed using the Fisher combined probability method.37 Only results whose fold-change enrichments went in the same direction in both cohorts (ie, both greater than 1 or both smaller than 1) were considered. To avoid spurious associations, only results that passed Bonferroni multiple-testing correction (P < .05) in each of the 2 data sets were considered for meta-analysis. Finally, because GO entries are not necessarily independent classes but are organized hierarchically with partial or complete overlaps between the different entries, we used tree visualizations generated by the AmiGO tools (http://amigo.geneontology.org) to ascertain interdependencies of the significantly associated GO entries.
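The Fisher combined probability method sums the logarithms of k independent P values: the statistic −2 Σ ln(p_i) follows a χ² distribution with 2k degrees of freedom under the null hypothesis. The sketch below uses the closed-form tail probability that exists for even degrees of freedom, together with the two eligibility rules described above. The function names and example values are illustrative assumptions, not taken from the study.

```python
from math import exp, factorial, log


def fisher_combined(p_values):
    """Combined P value from k independent P values (Fisher's method).

    For a chi-square variable X with 2k degrees of freedom,
    P(X >= x) = exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # equals X / 2
    return exp(-half_x) * sum(half_x ** j / factorial(j) for j in range(k))


def eligible(fold_a, fold_b, bonf_a, bonf_b, alpha=0.05):
    """Same direction of fold change and Bonferroni significance in both cohorts."""
    same_direction = (fold_a > 1 and fold_b > 1) or (fold_a < 1 and fold_b < 1)
    return same_direction and bonf_a < alpha and bonf_b < alpha


# Hypothetical per-cohort results for a single annotation term.
if eligible(fold_a=1.8, fold_b=1.6, bonf_a=0.003, bonf_b=0.02):
    print(fisher_combined([1e-6, 1e-5]))
```

With a single P value the function returns that value unchanged, which is a quick sanity check of the formula.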
We investigated 14 721 entries from the GO database: 9818 biological processes, 1208 cellular component entries, and 3695 different molecular functions. The GO entries are organized hierarchically and often partially overlap with other entries. Each of these categories had between 18 546 (biological processes) and 19 042 (cellular component) unique annotated human genes. We also investigated 202 KEGG and 217 BioCarta pathways.
A total of 3312 unique genes or transcripts were identified for TwinsUK and 2483 in the 1958 British Birth Cohort. Of these, only 2422 genes or transcripts (73.1%) from the TwinsUK data set and 1878 (75.6%) from the 1958 British Birth Cohort had any functional annotation in any of the relevant databases. The number of genes entering the analyses corresponded to roughly 10% of the total number of annotated genes in each of the GO major categories.
With the exception of the BioCarta pathways, in which no significant enrichment was observed in any of the data sets, high levels of entry enrichment were found in all other databases. Despite stringent predefined criteria, there was remarkable overlap between results in both data sets, both in terms of strong probabilities for significant enrichment and, with few exceptions, highly correlated values of fold enrichment for the respective terms (Tables 1, 2, and 3).
Across the different GO categories, strong suggestions of association between sensory pathways and refractive error were found. Among the most associated GO entries were adhesion of cells (P = 2.42 × 10^−18), synapse (P = 2.80 × 10^−14) and synaptic transmission (P = 2.70 × 10^−14), and ion channel (P = 2.84 × 10^−13) and passive voltage-gated channel activity (P = 6.50 × 10^−10). Most of these suggest sensory participation in refractive error pathophysiology. Interestingly, however, genes involved in morphogenesis were also highly enriched in our analysis: cell (P = 1.63 × 10^−14) and axon (P = 1.10 × 10^−10) morphogenesis and the axon guidance KEGG pathway (P = 1.90 × 10^−8). A simplified scheme of the associated GO entries taking into account their dependencies is shown in the Figure.
Functional inferences from GWAS data can provide some insight into the molecular mechanisms that influence emmetropization, the highly active process of self-regulation of axial growth in response to changes in the focal plane.7 Visually guided regulation appears to act at a local level and persists even if communication between the retina and the central nervous system is eliminated.38 Local defocus involving only parts of the retina causes differential local elongation.39 Cell adhesion and junction functional categories showed some of the highest statistical enrichment in both cohorts, consistent with highly significant associations previously reported for the GJD2 gene with refractive error.23 The visual signal starts when photons are absorbed by the photopigment present in rod or cone photoreceptors, eventually leading to hyperpolarization of these cells and a reduction in glutamate release.40,41 Photoreceptors synapse with bipolar cells and horizontal cells at the outer plexiform layer. The connexin 36 protein, coded by the GJD2 gene, appears to play an important stabilizing role here.42-44 Also, RASGRF1 expression localizes to these layers, with knockout mice showing reduced retinal responses.21 RASGRF1 has also been associated with refractive error.22 Synaptic transmission GO entries and channel activity as well as calcium and calmodulin (calcium is important in both synaptic transmission and retinal adaptation) showed consistent patterns of highly significant enrichment in both data sets. Thus, the results of this study, taken together with the previously reported genetic associations, lend strong support to the notion that alterations in transmission of the visual signal, even at the first synapse, play a significant role in the development of refractive error.
Gene Ontology entries are often imprecise annotation categories describing known gene functionalities but lacking the precise mechanisms by which these genes act coherently together. KEGG pathways, on the other hand, are less extensive but better characterized entries from a molecular point of view. Interestingly, the 2 KEGG pathways that met our conservative significance criteria were both related to electrophysiologically active cells involved in signal transmission. Axon guidance is the most strongly enriched category and may suggest a role for axonal/presynaptic alterations affecting synaptic transmission. The arrhythmogenic right ventricular cardiomyopathy enrichment in our data sets seems to suggest that at least some of the genes and their protein products instrumental in visual signaling pathways have multiple roles in the organism, affecting cellular electrical activity in other contexts, organs, or systems, which adds strength to life-course epidemiological observations, for example, relating general growth to ocular growth.17 Clearly, future drug development to prevent myopia progression will need to target pathways specific to the failure of emmetropization, akin to the actions of the HMG-CoA reductase enzyme targeted by statins, rather than more general pathways. The hope is that identification of pathways through this gene enrichment process will identify potential targets, which might not have been found by GWAS or other genetic studies.
Enrichment analysis based on GWAS data cannot irrefutably prove involvement or lack of involvement of specific pathways or molecular mechanisms. First, both enrichment and genetic association studies are methods of a probabilistic nature, where the right balance between high-sensitivity thresholds and high specificity is complicated by the relatively high level of multiple testing. In this work, we used some very conservative methods to control for multiple testing by requiring Bonferroni significant probabilities and consistent replication in both data sets. This option may have favored specificity at the expense of sensitivity. For example, the glutamate receptor ontology, which could be related to glutamate transmission of the signal to bipolar cells, narrowly missed inclusion because it did not reach Bonferroni-corrected statistical significance in the smaller 1958 British Birth Cohort data set. Second, we deliberately opted to minimize heterogeneity, both genetic and environmental, by studying enrichment in 2 populations with similar demographic characteristics. The high complexity of ethnic and environmental factors leading to refractive error45 makes it unlikely that the mechanisms we identified are either undisputedly universal or exclusive across the world. Finally, enrichment for annotated functional classes always has the inherent publication bias risks because existing knowledge at any given moment may cover different areas unevenly within the same scientific discipline.
Notwithstanding these potential limitations, our work provides some clues as to the most likely mechanisms involved in refractive error and emmetropization. These results were surprising in 2 different ways. First, they were remarkably in agreement with what we know so far about photosignal transduction and transmission. Second, the results between the larger TwinsUK data set and the much smaller 1958 British Birth Cohort were very consistent in the direction of changes and magnitude. This reflects the robustness of the association and enrichment signals, lending additional support to the notion that the results of this study represent a credible advance in our endeavor to identify the pathophysiology of refractive error at a molecular level. The consistency of enrichment analyses is encouraging and may be used more widely in the future, both to improve knowledge on multigenic mechanisms of complex disease and to analyze GWAS in a way that limits their focus to smaller prioritized groups of genes. This would reduce the high dimensionality of SNP data sets and increase the power to detect more heritable risk factors associated with clinical phenotypes.
Corresponding Author: Christopher J. Hammond, MD, Department of Twin Research and Genetic Epidemiology, King’s College London, St Thomas Hospital Campus, 3rd Floor S Wing Block D, Westminster Bridge Road, London SE1 7EH, England (email@example.com).
Submitted for Publication: January 12, 2013; final revision received April 2, 2013; accepted June 16, 2013.
Published Online: November 21, 2013. doi:10.1001/jamaophthalmol.2013.6022.
Author Contributions: Dr Hysi had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Hysi, Young, Mackey, Hammond.
Acquisition of data: Hysi, Cumberland, Young, Mackey, Rahi, Hammond.
Analysis and interpretation of data: Hysi, Mahroo, Cumberland, Wojciechowski, Williams, Hammond.
Drafting of the manuscript: Hysi, Hammond.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Hysi, Cumberland, Wojciechowski.
Obtained funding: Young, Mackey, Rahi, Hammond.
Administrative, technical, or material support: Hysi, Mahroo, Williams, Mackey, Hammond.
Study supervision: Rahi, Hammond.
Conflict of Interest Disclosures: None reported.
Funding/Support: Phenotyping for the 1958 British Birth Cohort was funded by the Medical Research Council Health of the Public grant (principal investigators, C. Power, PhD, and D. P. Strachan, MD) and the genetic studies, by the Wellcome Trust (grant 083478 to Dr Rahi), with additional personal funding (Dr Hysi) by the Ulverscroft Vision Research Group. The Centre for Paediatric Epidemiology and Biostatistics was supported in part by the Medical Research Council in its capacity as the MRC Centre of Epidemiology for Child Health. Research at the UCL Institute of Child Health, Great Ormond Street Hospital for Children, Moorfields Eye Hospital NHS Foundation Trust, and UCL Institute of Ophthalmology receives a proportion of their funding from the National Institute for Health Research Biomedical Research Centres funding scheme. TwinsUK received funding from the Wellcome Trust; the European Union MyEuropia Marie Curie Research Training Network; Guide Dogs for the Blind Association; the European Community’s FP7 (grant HEALTHF22008201865GEFOS); ENGAGE (grant HEALTHF42007201413); the FP-5 GenomEUtwin Project (grant QLG2CT200201254); US National Institutes of Health/National Eye Institute (grant 1RO1EY018246, principal investigator, T. L. Young, MD); NIH Center for Inherited Disease Research; and the National Institute for Health Research comprehensive Biomedical Research Centre award to Guy’s and St Thomas’ National Health Service Foundation Trust partnering with King’s College London. Dr Hysi is the recipient of a Fight for Sight ECI award. Dr Williams acknowledges financial support from the TFC Frost Charitable Trust. Dr Mahroo is a recipient of the Fight for Sight New Lecturers’ Grant Award.
Role of the Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The views expressed are those of the authors and not necessarily those of the National Health Service, the National Institute for Health Research, or the Department of Health.