Use of Steroid Profiling Combined With Machine Learning for Identification and Subtype Classification in Primary Aldosteronism

Key Points
Question: Does steroid profiling combined with machine learning offer a potential 1-step strategy to facilitate diagnosis and subtype classification for treatment stratification of patients with primary aldosteronism?
Findings: This diagnostic study involving patients tested for primary aldosteronism found that those with unilateral adenomas harboring pathogenic KCNJ5 sequence variants showed the most clinical benefit from surgical intervention and could be effectively identified at a single screening step using machine-learning combinatorial marker profiles of 7 steroids.
Meaning: The outlined strategy offers a potential approach to improve diagnosis of primary aldosteronism and facilitate more efficient and effective stratification of patients for surgical intervention.


Introduction
Applications of artificial intelligence, including machine learning, are gaining increasing recognition for informing medical decision-making. [1-4] Machine learning may be particularly useful in heterogeneous disorders where there is a need for stratification to guide therapy. [5-8] One such disorder is primary aldosteronism (PA), a common cause of secondary hypertension with 2 main subtypes for which treatment stratification is crucial but difficult. [9,10] With a prevalence of 5% to 7% among unselected patients with hypertension and up to 20% among patients with severe hypertension, PA affects large numbers of patients and is associated with considerable morbidity exceeding that of patients with primary hypertension (PHT) and similar elevations of blood pressure. [11,12] These considerations highlight the importance of effective methods for diagnosis and treatment of PA, which must allow for stratification according to unilateral vs bilateral hypersecretion of aldosterone. [9,10] Cure of the former can be achieved by adrenalectomy, whereas mineralocorticoid receptor antagonists are indicated for the bilateral subtype. Attaining this stratification is not simple and usually requires adrenal venous sampling (AVS), a technically demanding, expensive, time-consuming, and not infallible procedure. [9,13-16] In 2 independent studies, [13,16] discordant lateralization results were observed in 24% to 28% of patients who underwent AVS with vs without adrenocorticotropin. In another study, [14] clinical outcomes did not differ according to determination of unilateral disease by AVS vs radiological imaging. In a fourth study, [15] there were no significant differences in rates of biochemical cure (76% vs 69%) in patients younger than 65 years who underwent adrenalectomy according to AVS lateralization ratios larger vs smaller than 4.
Apart from the difficulties and limited effectiveness of AVS for subtype classification, there are also problems with earlier steps in the diagnosis of PA. Although the aldosterone to renin ratio (ARR) offers a time-honored method for screening, there is considerable overlap of ratios among patients with and without PA [17,18]; thus, at ARR cutoffs selected to optimize diagnostic sensitivity, there are many false-positives, leading to the need for confirmatory studies. [19,20] Such multiple steps, poor standardization, requirements to consider antihypertensive medications, and difficulties with AVS all represent barriers to diagnostic stratification; consequently, most patients remain undiagnosed and are not appropriately treated. [9,21] Improved approaches for diagnostic stratification are therefore needed.
With the aforementioned considerations in mind, we examined the use of mass spectrometry-based steroid profiling combined with machine learning for diagnostic stratification, with the hypothesis that this approach at screening might facilitate case detection and also allow for subtype classification. This hypothesis was based on findings that distinct steroid profiles in adrenal venous plasma of patients with bilateral and unilateral PA translated to similarly distinct profiles in peripheral plasma. [22] Patients with unilateral aldosterone-producing adenomas (APAs) due to pathogenic sequence variants of KCNJ5 have particularly distinct steroid profiles. [23] These patients also have larger and more clearly visualized APAs and show the most favorable outcomes after adrenalectomy. [24-27] The use of steroid profiles to identify these patients may, therefore, be especially useful. Thus, the primary objective of this study was to establish whether steroid profiling could facilitate both identification and subtype classification of patients with PA, particularly those with unilateral APAs due to KCNJ5 sequence variants.

Statistical Analysis
Statistical analyses used JMP Pro statistical software version 14 (SAS Institute). Unless otherwise specified, significance was defined as P < .05. Statistical tests were 2-tailed and included the Fisher exact test and the Mann-Whitney U test. Nominal logistic modeling was used to assess for associations of the presence vs absence of a pathogenic KCNJ5 sequence variant with PASO criteria-based outcomes, with sex and age as additional covariates. Associations are shown according to whole model and likelihood ratio tests. Data for steroids were normalized by logarithmic transformation before analyses, including for generation of geometric means and 95% CIs. Least-squares multivariable models were used to assess differences in plasma steroids according to patient group, age, sex, and assay batch. Differences among patient groups were assessed using the Tukey honestly significant difference test. Logistic regression was used to generate receiver operating characteristic curves, with selections of steroids in profiles based on both stepwise regression and likelihood ratios for each steroid. Differences between areas under receiver operating characteristic curves (AUROCs) and data from confusion matrices were used to assess performance of logistic regression models.
Data were normalized according to upper cutoffs of reference intervals, which for most of the plasma steroids were specific for either or both age and sex (eTable 3 in the Supplement). Data analyses were performed from September 2018 to August 2019.
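The log transformation and geometric means with 95% CIs described above can be illustrated with a minimal sketch. This is not the JMP Pro procedure used in the study: the function name `geometric_mean_ci`, the normal-approximation interval, and the example values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(values, level=0.95):
    """Geometric mean and confidence interval computed on the log scale.

    Assumption: a normal (z) quantile is used for the interval; a t quantile
    would give a wider interval for small groups.
    """
    logs = [math.log(v) for v in values]  # log-transform; values must be > 0
    n = len(logs)
    log_mean = mean(logs)
    standard_error = stdev(logs) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the mean and both interval limits to the original scale
    return (
        math.exp(log_mean),
        math.exp(log_mean - z * standard_error),
        math.exp(log_mean + z * standard_error),
    )


# Hypothetical concentrations for illustration only (not study data)
example_values = [1.0, 2.0, 4.0, 8.0]
gm, ci_low, ci_high = geometric_mean_ci(example_values)
```

Because the interval is built on the log scale and then back-transformed, it is asymmetric around the geometric mean on the original scale, which suits right-skewed concentration data.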

Machine Learning
In brief, the machine-learning workflow involved 3 phases (eFigure 2 in the Supplement): data preparation, model learning, and external validation. Data preparation included several procedures for normalization, batch correction, and, in some models, adjustments for age and sex (see eTable 3, eTable 4, and eTable 5 in the Supplement). At this stage, each of the 13 different data sets was subdivided, in 2 different proportions, into learning and external validation data sets, as outlined in eAppendix 1 in the Supplement. After data preparation, machine-learning tasks for feature selection, model training, and sample classification were performed in the second (model learning) phase according to different algorithms, with their application in this phase restricted to learning data sets.
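The subdivision into learning and external validation data sets can be sketched as a stratified split. This is a minimal sketch under stated assumptions, not the procedure in eAppendix 1: the function `stratified_split`, the 20% validation fraction, and the group sizes are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction=0.2, seed=0):
    """Return (learning_idx, validation_idx) with similar class proportions.

    validation_fraction is a hypothetical value; the proportions actually
    used are described in the supplement of the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    learning, validation = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_validation = round(len(indices) * validation_fraction)
        validation.extend(indices[:n_validation])
        learning.extend(indices[n_validation:])
    return sorted(learning), sorted(validation)


# Hypothetical group sizes for illustration only (not study data)
labels = ["PA"] * 10 + ["PHT"] * 20
learn_idx, val_idx = stratified_split(labels)
```

Holding the validation samples out before any feature selection or training is what allows them to serve as an external check on the learned models.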
Feature selection involved the use of 4 different algorithms to identify specific steroid combinations that provided either optimal segregation of patients with and without PA or identification of those with unilateral disease due to KCNJ5 sequence variants among all patients.
Several combinations of the aforementioned procedures were investigated for optimized data analysis and assessed according to 9 machine-learning algorithms corresponding to variations of 4 models commonly used in medicine: random forest (RF), support vector machine (SVM), linear discriminant analysis, and logistic regression. A total of 585 models arising from 13 data sets and 9 machine-learning algorithms were tested, each involving a 5-fold cross-validation step repeated 10 times (eFigure 3 in the Supplement). Optimal classification was determined in the final validation phase, according to either AUROC or F scores, by applying the algorithms for each of the 585 models to the external validation data sets.
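A 5-fold cross-validation repeated 10 times yields 50 train/test partitions of the learning data. The sketch below generates such partitions; it is an illustration only, the function `repeated_kfold` and the sample count are hypothetical, and the folds here are not stratified by class, which the study's own implementation may have handled differently.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random ordering for each repeat
        # Deal the shuffled indices into n_splits folds of near-equal size
        folds = [order[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test


# Hypothetical number of learning samples, for illustration only
splits = list(repeated_kfold(n_samples=23))
```

Each sample appears in exactly one test fold per repeat, so averaging a performance measure over all 50 partitions reduces the dependence on any single random split.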

Genotype-Related Therapeutic Outcomes and Patient Group Reclassification
Among patients who underwent adrenalectomy because of presumed unilateral PA, those with APAs due to KCNJ5 variants were, on average, 5 years younger compared with those with wild-type KCNJ5 APAs (Table 1). According to the PASO classification, the presence of KCNJ5 variants conferred significantly better clinical and biochemical outcomes after adrenalectomy compared with the absence of KCNJ5 variants. However, logistic modeling indicated that improved blood pressure control in patients with APAs due to KCNJ5 variants vs wild-type APAs was accounted for by the younger age and female predominance of patients with KCNJ5 variants. In contrast, the presence of a KCNJ5 variant remained independently associated with biochemical cure.
The overall postadrenalectomy biochemical cure rate in this study was 88.5%; the cure rates were 96.6% for patients with KCNJ5 variants and 83.6% for patients without KCNJ5 variants.

Steroid Profiles
With least-squares adjustments for sex, age, and assay batch, all plasma steroids showed some differences among the 5 patient groups (eTable 4 in the Supplement). Plasma 18-oxocortisol showed differences among all groups but especially the group with unilateral APAs due to KCNJ5 variants, in whom plasma concentrations were 6.2- to 10.3-fold higher than in all other groups (Table 2). Plasma 18-hydroxycortisol in the KCNJ5 variant group was also 3.3- to 4.0-fold higher than in other groups.
Plasma aldosterone concentrations in the 2 unilateral disease groups, which did not differ from each other, were higher than in the other 3 groups. Other steroids were either similarly increased in patients with PA or showed differing patterns of decreases or increases compared with patients with hypertension according to the particular subtype of PA. (Figure 1). Combination of the steroid profile with the ARR was nevertheless more effective for discriminating PA from PHT than use of either the steroid profile (difference in AUROC, 0.089; 95% CI, 0.059 to 0.119; P < .001) or the ARR (difference in AUROC, 0.036; 95% CI, 0.013 to 0.060; P = .003) alone. Combination of the steroid profile with the ARR improved performance over

Steroid Profiling With Machine Learning
After batch corrections (eFigure 4 and eFigure 5 in the Supplement) and using feature selection within machine-learning approaches, combinatorial markers composed of up to 7 steroids were identified that offered the best performance for discriminating patient groups (eFigure 5, eFigure 6, eFigure 7, eFigure 8, eFigure 9, and eFigure 10 in the Supplement). Among those steroids, aldosterone, 18-oxocortisol, and 18-hydroxycortisol commonly occupied the top 3 places for discriminatory power. The next steroid with useful discriminatory power was 11-deoxycorticosterone, followed by several others depending on the model.
The final selection of models for optimal classification was reduced to the 21 best models according to either AUROCs or F scores (eTable 7 in the Supplement). Among these, an RF model provided optimal performance for the classification of patients with and without PA, whereas a nonlinear (radial basis function kernel) SVM model was optimal for identifying patients with APAs due to KCNJ5 variants (Figure 2). For both models, aldosterone, 18-oxocortisol, and 18-hydroxycortisol occupied the top 3 places, with 11-deoxycorticosterone following in fourth and fifth places, respectively, for the SVM and RF models. In the SVM model, cortisone, 11-deoxycortisol, and androstenedione took the place of corticosterone, 17-hydroxyprogesterone, and dehydroepiandrosterone, which were the remaining selected features of the RF model.
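The two criteria used to rank models, AUROC and F score, can be computed as in the following minimal sketch. It assumes the F score is the balanced F1 variant, which the text does not specify; the function names, scores, labels, and the 0.5 decision threshold are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def f1_score(predicted, labels):
    """F1 (assumed variant): harmonic mean of precision and recall."""
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical classifier outputs for illustration only (not study data)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

model_auroc = auroc(scores, labels)
model_f1 = f1_score(predicted, labels)
```

AUROC summarizes ranking quality across all thresholds, whereas the F score depends on one chosen threshold and ignores true negatives, so the two criteria can favor different models.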
Performance of RF and SVM models upon external validation was similar or even appeared to exceed that of the learning series (

Discussion
To our knowledge, this study is the first to demonstrate the application of multidimensional pattern recognition and machine learning for analysis of steroidomic data in the diagnosis of PA. This approach offers the potential for more efficient and effective diagnostic stratification than the
present study, the 88.5% postadrenalectomy biochemical cure rate lies between those found previously [29,41,43] and is similar to that found in a single prospective study. [14] The failure of adrenalectomy to cure PA may reflect asymmetric bilateral disease in some patients. [44] Aldosterone-producing cell clusters have been identified in the zona glomerulosa of aging adrenal glands and in the adrenal glands of patients with PA due to bilateral adrenal aldosterone hypersecretion; in both cases, cells of those clusters are characterized by high rates of pathogenic variants of CACNA1D, but not KCNJ5. [45,46] This raises the possibility that KCNJ5 sequence variants might be characteristic of unilateral adenomas. Nevertheless, 2 of our patients with APAs due to KCNJ5 variants did not experience complete biochemical cure after adrenalectomy, suggesting that KCNJ5 sequence variants are not strictly associated with unilateral disease. Even so, failure to reach cure in patients with APAs due to KCNJ5 variants was rare, confirming findings that these patients show more clinical benefit after adrenalectomy than others. [25-27] As we further establish here, the benefit in terms of biochemical cure is independent of age and sex, highlighting the importance of triaging patients with APAs due to KCNJ5 variants for further interventions.
There have been other studies that combined steroid profiling with machine learning, [47,48] but, to our knowledge, this is the first to apply a combinatorial marker design strategy to PA. The potential benefits for diagnostic stratification of PA are multiple.
First, during screening it may be possible to more effectively distinguish patients with PA from those with other causes of hypertension. Second, by identifying within the same screening step patients with unilateral APAs due to KCNJ5 variants, it should be possible to immediately triage those patients for AVS; alternatively, with clear imaging evidence of a unilateral adenoma, it may be possible to directly proceed to an adrenalectomy without AVS. These considerations underscore the potential advantages of moving away from traditional unidimensional approaches (eg, ARR) for diagnostic stratification to multidimensional approaches that take advantage of today's computational power for applications of artificial intelligence.

Conclusions
These findings suggest that plasma steroid profiles obtained during initial screening for PA can improve case detection beyond that possible using the ARR alone. Moreover, the use of distinctive profiles to identify patients with unilateral APAs due to KCNJ5 variants further illustrates the potential of steroid profiling for disease stratification at a single screening step. Along with advances in functional imaging [49-51] and other measurements, such as the angiotensin peptidome, [52,53] steroid profiling combined with machine learning may facilitate more rapid identification of patients with PA for appropriate therapeutic interventions. As detailed in eAppendix 3 in the Supplement, such strategies are now being tested in further patient populations, and with those developments it may become possible to screen more than the small proportion of patients with PA who are currently tested and treated according to disease subtype.