Penetrance and Pleiotropy of Polygenic Risk Scores for Schizophrenia, Bipolar Disorder, and Depression Among Adults in the US Veterans Affairs Health Care System

This cross-sectional study benchmarks the penetrance of current neuropsychiatric polygenic risk scores in the Veterans Health Administration health care system, and explores associations between polygenic risk scores and broad categories of human disease via phenome-wide association studies.

Serious mental illnesses such as schizophrenia, bipolar disorder, and major depression are leading causes of disability and public health expenditure, and affected persons disproportionately experience increased morbidity and early mortality. Recent years have seen important advances in our understanding of the complex multifactorial underpinnings of serious mental illnesses, with genome-wide association studies (GWAS) yielding robust and replicable associations with specific loci (270, 64, and 44 for schizophrenia, bipolar disorder, and major depression, respectively). [1][2][3] However, small effect sizes at individual variants and extreme polygenicity have thwarted the transformative mechanistic insights needed for development of novel therapeutics and prevention strategies.
Polygenic risk scores (PRSs) aggregate genetic associations across the genome, including many variants that do not attain genome-wide significance and can account for more variance in liability than genome-wide significant findings alone, albeit they are typically less predictive than a positive family history 4 or certain rare copy number variants. 5 Ever-increasing GWAS sample sizes have seen the variance in liability captured by PRSs climb steadily, from 3% in the first demonstrative application to schizophrenia 6 to upwards of 10% in recent Psychiatric Genomics Consortium analyses. 1 As applied to nonpsychiatric traits, the clinical utility of PRSs is emergent, 7 and potential applications in psychiatry are actively being explored, 8 including risk stratification and predicting treatment response.
With large biobanks now linking the electronic health records (EHRs) of hundreds of thousands of patients to their individual-level genomic data, there are opportunities to explore the associations of PRSs (or specific variants) with a wide range of clinical phenotypes, ie, a genotype-to-phenotype or reverse genetics paradigm. Also known as phenome-wide association studies (PheWAS), 9 this unbiased, disease-agnostic approach has the potential to uncover hitherto unrecognized associations between distinct diagnostic entities and to aid in disentangling complex pleiotropic associations. A recent application of PheWAS from the PsycheMERGE Consortium analyzed schizophrenia PRSs in more than 100 000 patients from 4 large health care systems (Geisinger Health System, Mount Sinai Health System, Partners HealthCare System, and Vanderbilt University Medical Center) and uncovered robust associations with both psychiatric and nonpsychiatric diagnoses. 10 The Veterans Health Administration (VHA) is the largest integrated health care system in the United States, with 171 medical centers and 1112 outpatient clinics serving more than 9 million veterans. Launched in 2010, the Million Veteran Program (MVP) is a landmark endeavor that links genomic laboratory testing, survey-based self-report data, and EHRs spanning decades, with the goal of creating a mega-biobank and novel evidence base for precision medicine initiatives. 11 Demographically and clinically, the 850 000 enrolled participants reflect the population that uses the VHA, with overrepresentation of older individuals and male individuals, as well as higher rates of multiple, chronic conditions compared with the general population, 12,13 despite better access to health care.
Cooperative Studies Program (CSP) #572 14 is a cohort of approximately 9300 veterans with schizophrenia or bipolar 1 disorder who received detailed in-person assessments of clinical diagnosis, functioning, and symptomatology. Within this companion study to the MVP, we evaluated the sensitivity of ICD-9/10 billing codes for schizophrenia, bipolar disorder, and major depression, applying case-control definitions of varying stringency and breadth of clinical phenotype (eg, schizophrenia vs any psychosis). We benchmarked the penetrance of current neuropsychiatric PRSs for schizophrenia, 1 bipolar disorder, 2 and major depression 3 in 400 000 ancestrally diverse MVP participants and explored the broad associations of PRS with physical and mental health conditions via PheWAS. Finally, recognizing the considerable shared genetic basis of these disorders, we applied genomic structural equation modeling to derive common and disorder-specific latent genetic factors for comparative genomic analyses and explored pleiotropic associations of these latent factors with PheWAS.

Study Participants
This study was approved by the Veterans Affairs (VA) Central Institutional Review Board, and all patients provided written informed consent. Additional details of study ascertainment and assessment are described elsewhere. 14

CSP #572
Participants were recruited through their clinicians, posted notices at participating VA hospitals, and through word of mouth from January 2011 to January 2020. All patients received the Structured Clinical Interview for the DSM 15 and met lifetime DSM-IV criteria for schizophrenia (n = 3953) or bipolar 1 disorder (n = 5425). Patients with major neurologic illnesses or medical problems that could interfere with central nervous system function were excluded. Information from medical records, patients' clinicians, or other informants was used, if needed, to confirm diagnoses. Diagnosed substance misuse was not an exclusion criterion, given some concerns about representativeness. Participants received a brief cognitive assessment and the University of California, San Diego, Performance-Based Skills Assessment, Brief version, a performance-based measure of everyday functional skills. 16

Key Points
Question What is the penetrance of polygenic risk scores (PRSs) for schizophrenia, bipolar disorder, and major depression among US veterans who use the Veterans Health Administration health care system, and what health problems are associated with having a higher polygenic burden?
Findings In this cross-sectional study of 707 299 individuals, PRSs were associated with having ever received a relevant psychiatric diagnosis and were enriched among more frequently hospitalized patients. Higher PRSs were associated with increased odds for numerous mental and physical health diagnoses, even among individuals who lack a formal diagnosis.
Meaning Individual-level PRSs informed by large-scale genetic studies are portable across US health care systems and have emergent potential for risk stratification, albeit with disparate specificity across ancestries.

MVP
Participants were active users of the VHA health care system and were recruited through invitational mailings or by MVP staff while receiving clinical care. Informed consent and authorization per the Health Insurance Portability and Accountability Act were the only other inclusion criteria. Participants were recruited from January 2011 to January 2020. All participants completed a baseline survey, which includes information on demographic factors, health status, lifestyle habits, military experiences, medical history, and family history of specific illnesses and physical features; many also completed an optional lifestyle survey. 11 At the time of manuscript submission, 4697 individuals (approximately 50% of CSP #572 and approximately 0.7% of MVP) were dually enrolled in both CSP #572 and MVP. CSP #572 and MVP participants were genotyped on the MVP 1.0 Axiom array 17 (eMethods in Supplement 1). Participants were classified as being of African or European ancestry using the harmonized ancestry and race and ethnicity method, 18 which combines information on genetic ancestry with self-identified race and ethnicity.

EHRs
For 9378 CSP #572 participants and 697 921 nonoverlapping MVP enrollees, we extracted ICD-9/10 billing codes related to schizophrenia, bipolar disorder, and major depression (eTable 1 in Supplement 1) and prescription records for commonly prescribed antipsychotics, mood stabilizers, and antidepressants (eTable 2 in Supplement 1) from the VHA Corporate Data Warehouse.
We compared CSP #572 participants' Structured Clinical Interview for the DSM-confirmed diagnoses with the ICD-9/10 codes recorded in their EHRs. Given the challenge of the differential diagnosis, for individuals with both schizophrenia and bipolar disorder codes, we took the mode of the 5 most recent entries as the prevailing diagnosis.
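The tie-breaking rule for conflicting codes can be illustrated with a minimal Python sketch. The function name, the `(date, label)` record layout, and the handling of ties when fewer than 5 entries exist are assumptions made for illustration; the source specifies only that the mode of the 5 most recent entries was taken.

```python
from collections import Counter
from datetime import date


def prevailing_diagnosis(entries, n_recent=5):
    """Return the most frequent label among the n_recent latest entries.

    entries: list of (date, label) tuples, e.g. (date(2019, 1, 1), "BD").
    Returns None when there are no entries or the top labels are tied
    (tie handling is an assumption; the source does not describe it).
    """
    # Keep only the most recent entries, newest first.
    recent = sorted(entries, key=lambda e: e[0], reverse=True)[:n_recent]
    counts = Counter(label for _, label in recent).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical record with mixed schizophrenia (SCZ) and bipolar (BD) codes.
history = [
    (date(2015, 1, 1), "SCZ"),
    (date(2016, 1, 1), "SCZ"),
    (date(2017, 1, 1), "BD"),
    (date(2018, 1, 1), "BD"),
    (date(2019, 1, 1), "BD"),
    (date(2020, 1, 1), "SCZ"),
    (date(2021, 1, 1), "BD"),
]
print(prevailing_diagnosis(history))
```

With two competing labels and a full window of 5 entries, the mode is always unique, so the `None` branch matters only for sparse records.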

PRS Profiling
We constructed PRSs from published Psychiatric Genomics Consortium GWAS results, 1-3 testing these for association with disease outcomes in the MVP cohort. Variants that met quality control filtering in both the training and target data sets were clumped in the appropriate 1000 Genomes Project phase 3 population (r² > 0.1; 500-kb window), excluding the major histocompatibility complex. For varying P value thresholds in each training data set (eMethods in Supplement 1), scores were constructed by summing the number of tested alleles weighted by their effect estimates (ie, the log of the allelic odds ratio). To better facilitate comparison of our results with those based on civilian cohorts, we used a similar approach to Zheutlin et al, 10 including comparing results based on a recently developed bayesian framework that applies continuous shrinkage to test statistics, PRS-continuous shrinkage. 19
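As a rough illustration of the scoring step, the sketch below sums tested-allele counts weighted by the log odds ratio for variants passing a P value threshold. It assumes clumping has already been done upstream; the variant identifiers, odds ratios, and P values are invented for the example and do not come from the study.

```python
import math


def polygenic_score(dosages, odds_ratios, p_values, p_threshold):
    """Sum of tested-allele counts weighted by log(OR).

    dosages:      variant id -> count of the tested allele (0, 1, or 2)
    odds_ratios:  variant id -> allelic odds ratio from the training GWAS
    p_values:     variant id -> training GWAS P value
    p_threshold:  variants with P above this value are ignored
    """
    score = 0.0
    for variant, odds_ratio in odds_ratios.items():
        if variant in dosages and p_values[variant] <= p_threshold:
            score += dosages[variant] * math.log(odds_ratio)
    return score


# Hypothetical, already-clumped variants for one individual.
dosages = {"var_a": 2, "var_b": 1, "var_c": 0}
odds_ratios = {"var_a": 1.10, "var_b": 0.95, "var_c": 1.20}
p_values = {"var_a": 1e-8, "var_b": 0.03, "var_c": 0.20}

score_05 = polygenic_score(dosages, odds_ratios, p_values, p_threshold=0.05)
print(score_05)
```

Repeating the call with different `p_threshold` values corresponds to the multiple-threshold benchmarking described in the text.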

Genomic Structural Equation Modeling
We used genomic structural equation modeling 20 to model the genetic covariance structure underlying schizophrenia, bipolar disorder, and major depression. Briefly, genomic structural equation modeling models the multivariate genetic architecture of complex traits by estimating the associations of individual single-nucleotide variants (SNVs; formerly, single-nucleotide polymorphisms or SNPs) with latent constructs, is robust to sample overlap and sample-size imbalance, and does not require individual-level genotypes. We estimated the SNV associations with a common factor (shared across disorders) as well as associations specific to each disorder (eMethods in Supplement 1).

PheWAS
We used PheWAS to explore the associations between neuropsychiatric PRSs and phecodes representing groupings of associated ICD-9/10 billing codes. 9 When testing individual phecodes, we required individuals with these disorders and controls to have 2 or more and zero codes, respectively. We applied logistic regression to test scaled PRSs (mean [SD], 0 [1]) for association with phecodes within ancestry groups, covarying for age, age², sex, and 6 ancestry principal components.
We performed a series of sensitivity analyses, covarying for selected diagnoses or treatment with antipsychotics, mood stabilizers, and antidepressants, or removing individuals with any lifetime diagnosis of psychotic, mood, or substance disorders (eMethods in Supplement 1). Two-sided P values were considered statistically significant at 10⁻⁵. Analysis took place from January 2021 to January 2022.
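The case-control rule and the PRS scaling described for the PheWAS can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the use of the population SD for scaling is an assumption, since the source states only that scores had mean 0 and SD 1.

```python
from statistics import mean, pstdev


def phecode_status(n_codes):
    """Map a count of phecode instances to case (1), control (0), or excluded (None)."""
    if n_codes >= 2:
        return 1  # case: 2 or more codes
    if n_codes == 0:
        return 0  # control: zero codes
    return None  # exactly 1 code: neither case nor control


def standardize(scores):
    """Scale scores to mean 0 and SD 1 (population SD; an assumption)."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical code counts and raw scores for four individuals.
statuses = [phecode_status(n) for n in (0, 1, 2, 7)]
scaled = standardize([0.12, 0.30, 0.25, 0.41])
print(statuses, scaled)
```

Individuals with a status of `None` would be dropped before fitting the regression for that phecode.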

Validation of EHR-Derived Phenotypes
We first sought to evaluate the precision and accuracy of EHR-derived phenotypes to capture caseness based on the Structured Clinical Interview for the DSM-based diagnoses available in CSP #572. Among 9378 patients, 8962 (95.6%) were correctly assigned using 2 or more relevant phecodes.
Overall, 1153 of 3953 confirmed patients with schizophrenia (29.2%) had 2 or more bipolar disorder-related phecodes, and 1869 of 5425 confirmed patients with bipolar 1 disorder [...]
Comparing receiver operating characteristic curves for predictive models based on the varying criteria displayed in the Table, we concluded that a minimum of 2 phecodes offered the best overall balance of sensitivity vs specificity (Figure 1; eTable 6 in Supplement 1).

Penetrance of Neuropsychiatric PRS in the VA Health Care System
Benchmarking results for neuropsychiatric PRSs based on varying P value thresholds are given in eTables 7 to 9 in Supplement 1. Case prevalence estimates for each decile of PRS, representing the absolute disease prevalence, are displayed in Figure 2. As expected, the prevalence of serious mental illnesses was higher among veterans treated at VHA facilities than in the general population. 21,22 The prevalence of schizophrenia among participants of European ancestry in the top and bottom deciles of schizophrenia PRS was 4% and 0.9%, respectively, corresponding to 4.8-fold higher odds (95% CI, 4.22-5.43; P < 10⁻¹³¹). Individuals in the top decile had approximately 2.4-fold higher odds of being diagnosed with schizophrenia than those below the 90th percentile (95% CI, 2.26-2.55; P < 10⁻¹⁸³); 2-fold higher odds for psychosis (phecode 295.3) (95% CI, 1.85-2.09; P < 10⁻¹⁰¹); 1.6-fold higher odds for bipolar disorder (95% CI, 1.53-1.66; P < 10⁻¹⁰⁶); and 1.2-fold higher odds (95% CI, 1.15-1.21; P < 10⁻³⁵) for major depression.
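As a back-of-the-envelope check on how decile prevalences relate to an odds ratio, the sketch below computes a crude (unadjusted) odds ratio from the two rounded prevalences quoted above. It is not the study's estimate: the reported 4.8-fold figure presumably comes from a covariate-adjusted model on unrounded data, so only approximate agreement should be expected.

```python
def odds(prevalence):
    """Convert a prevalence (proportion) to odds."""
    return prevalence / (1.0 - prevalence)


def crude_odds_ratio(p_high, p_low):
    """Unadjusted odds ratio comparing two groups' prevalences."""
    return odds(p_high) / odds(p_low)


# Rounded prevalences from the text: top decile 4%, bottom decile 0.9%.
ratio = crude_odds_ratio(0.04, 0.009)
print(round(ratio, 2))
```

The crude ratio comes out at roughly 4.6, close to but not identical with the reported value, which is consistent with rounding of the inputs and adjustment in the original model.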

Cross-Ancestry Portability of Neuropsychiatric PRS
The higher prevalence of schizophrenia and major depression among individuals of African ancestry was largely not associated with individuals' risk strata (Figure 2) and was as high as 6.8% and 51%, respectively, in the top 10%. In contrast, across bipolar disorder PRS deciles, the absolute prevalence of bipolar disorder was between 5.8% and 6.8%.
Individuals of African ancestry in the top decile of schizophrenia PRSs had approximately 1.4-fold higher risk of diagnoses of

Schizophrenia PRS Have Equivalent Relative Penetrance in Civilian and Veteran Health Care Systems
The demographics of the US veteran population differ from cohorts recruited from civilian health care systems. For instance, the VA population is composed of mostly male individuals (approximately 90%) and has higher prevalence of neuropsychiatric illnesses. 23

Higher Loadings of Neuropsychiatric PRS in More Chronic Illness
We observed a trend of increased polygenic loading in more chronic illness presentations. 1,24 Patients who received inpatient [...]
Associations of schizophrenia and bipolar disorder PRSs with inpatient treatment and number of hospitalizations were replicable in participants of African ancestry (eTables 14-15 in Supplement 1).

Pleiotropic Influences of Neuropsychiatric PRS
Higher polygenic loading for schizophrenia, bipolar disorder, and major depression was associated with increased odds for numerous psychiatric diagnoses and physical health conditions (eTables 18-20 and eFigures 1-3 in Supplement 1). Associations between schizophrenia and respiratory symptoms and infections and between major depression and cardiovascular disease, hypertension, diabetes, somatic symptoms, and respiratory problems were robust among veterans without a lifetime diagnosis of psychotic, mood, or substance use disorders or who received relevant pharmacological treatment. Comparing PheWAS results, we observed relative enrichments of schizophrenia polygenic risk in mental health problems and of major depression PRSs in circulatory, respiratory, and endocrine problems, among others (eFigures 4-7 in Supplement 1).
Associations between schizophrenia PRSs and dental problems, respiratory symptoms, skin infections, substance use disorders, and suicide were replicable among veterans of African ancestry (eTables 21-23 and eFigures 8-10 in Supplement 1) and remained significant after adjusting for lifetime diagnoses and medications.
We investigated the apparent protective effects of schizophrenia PRSs 10 by examining associations between individual genome-wide significant schizophrenia loci 1 and selected phecodes (Figure 3), based on the same linkage disequilibrium-independent SNVs and analytic framework as used for PheWAS. Using a simple binomial test, we found that significantly more schizophrenia-associated SNVs than expected by chance (ie, 50%) had reversed directions of asso-

Genomic Structural Equation Modeling and Latent Factor PRS
Comparing our primary results with those based on latent genomic factors, we found that both schizophrenia-specific and common factor PRSs were associated with increased odds of psychosis-spectrum diagnoses (eTables 24-30 in Supplement 1). Observed protective associations of schizophrenia PRSs for sleep apnea, osteoarthritis, and hearing loss appear to be driven by schizophrenia-specific influences (eTables 24-25 in Supplement 1). The majority of associations between PRSs and broader psychiatric diagnoses and physical health problems were driven by a shared genetic liability (eTables 29-30 in Supplement 1).

Polygenic Validation of the Psychosis-Affective Spectrum
We further explored the transdiagnostic spectrum concept via hierarchical assignments of participants to schizophrenia, bipolar disorder, major depression, or related diagnoses; schizoaffective disorders, bipolar II disorder, cyclothymia, and dysthymia were considered as intermediate categories of illness and were included in analyses given adequate sample sizes (eTables 31-32 in Supplement 1). Figure 4 displays PRS estimates across disorders, comparing individuals with these disorders to a common set of screened controls.

Discussion
Building on our previous reports that published GWAS results are robustly generalizable to the US veteran population, 25 we have demonstrated that the penetrance of schizophrenia PRSs is equivalent across VA and civilian health care systems, despite marked differences in absolute prevalence. 10 Leveraging the VA's extensive EHR, we confirm and extend reported associations between neuropsychiatric PRSs and broad disease categories in approximately 400 000 individuals. We derived novel, latent factors indexing disorder-specific and shared cross-disorder risk and attempted to disentangle widespread pleiotropy from confounding through extensive secondary modeling. We first validated an EHR-based phenotyping approach in an embedded, well-characterized cohort with confirmed diagnoses of schizophrenia or bipolar 1 disorder and found that a simple approach 10 requiring 2 or more ICD-9/10 codes correctly identified approximately 95% of cases. Applying this phenotyping strategy to the full MVP cohort, we detected prevalences of schizophrenia, bipolar disorder, and major depression that were several-fold higher than in the general population, although representative of the US veteran population at large. 23 Rates of schizophrenia and major depression among veterans of African ancestry were markedly elevated compared with veterans of European ancestry, which may be suggestive of implicit bias in diagnosis, 26 self-selection bias for VHA utilization, limited alternatives for health care, or other structural issues.
Critically, despite markedly higher prevalence of schizophrenia, bipolar disorder, and major depression in the VA health care system, we did not find predictive values to be meaningfully attenuated. Schizophrenia PRSs yielded effect sizes for participants of European ancestry that were within the 95% CI of those reported by the PsycheMERGE Consortium. 10 This suggests that the penetrance of within-population relative PRSs in the US veteran population is equivalent to that of civilian cohorts despite nonrandom recruitment and exposure to distinct environmental factors and experiences.
Current neuropsychiatric PRSs were robustly associated with a range of psychiatric problems. Higher schizophrenia PRSs were also associated with increased risk of suicide, obsessive-compulsive disorder and personality disorders, anxiety, and substance use behaviors, as well as a host of physical and somatic symptoms, recapitulating recent findings based on 4 US civilian health care systems. 10 Notably, increased risks of certain infections and dental problems were detectable even when adjusting for psychotic and affective diagnoses and treatment, and excluding diagnosed substance use disorders, suggesting that neuropsychiatric liability may be penetrant even in individuals who lack a formal diagnosis. Other associations, such as those observed with erectile dysfunction and polydipsia, were explained by relevant diagnoses and adverse reactions of prescribed medications. 27 We observed protective associations of schizophrenia PRSs with hearing loss, osteoarthritis, and diabetes, in contrast to widely documented, risk-increasing iatrogenic effects of second-generation antipsychotics. 28 Post hoc analyses of schizophrenia-associated loci 1 revealed an enrichment of SNVs at which the schizophrenia risk allele is protective for these conditions, suggestive of antagonistic pleiotropic effects.
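The enrichment of protective SNVs rests on a simple binomial test against a 50% null, as described in the Results. A minimal sketch of such a one-sided exact test follows; the counts in the example are invented for illustration, and whether the original analysis used a one-sided or two-sided test is not stated in the text, so the one-sided form here is an assumption.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact one-sided P value: probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 70 of 100 independent SNVs have a reversed
# direction of association, where 50 would be expected by chance.
p_value = binomial_upper_tail(70, 100)
print(p_value)
```

Because the SNVs are linkage disequilibrium-independent, treating each as an independent trial with success probability 0.5 under the null is reasonable.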
Schizophrenia-specific PRSs were primarily associated with diagnoses of schizophrenia, paranoid disorders, psychosis, and schizoid personality disorder but only modestly with bipolar disorder, evincing some fidelity of published GWAS to the Kraepelinian dichotomy. Intriguingly, major depression-specific and common factor PRSs, but not schizophrenia-specific PRSs, were strongly associated with increased number of comorbidities, suggesting that underlying confounding may be driving many of the observed pleiotropic associations. Future studies using mendelian randomization and within-family approaches may help to determine whether these findings are due to some likely causal influence or accounted for by other shared environmental risk factors (eg, socioeconomic status).
Strikingly, the higher prevalence of schizophrenia and major depression among veterans of African ancestry was largely not associated with individuals' risk strata; for example, only veterans of European ancestry in the uppermost schizophrenia PRS decile had absolute risk comparable with those of African ancestry in the lowest decile. Despite lower cross-population generalizability of bipolar disorder and major depression PRSs, comparisons of individuals of African ancestry at extremes of PRSs yielded significant associations.

Limitations
We did not attempt to model environmental or experiential differences associated with participants' military service, which may partially explain increased rates of some illnesses. We did not specifically investigate the implications of predominantly male ascertainment in MVP. CSP #572 participants largely served in the period between the Vietnam War and Gulf War conflicts, while MVP participants' service eras were more broadly distributed.
Because available EHR data are restricted to treatment received at VA facilities, any relevant medical history outside the VHA health system, including before or during participants' military service, is limited to self-report.
We focused on populations of African and European ancestry in the current study because these broadly defined ancestries comprised the majority of the CSP #572 and MVP cohorts. Our ongoing work in this area will extend these findings to diverse Asian, Hispanic, and Latino populations.