Multivariable modeling and goodness-of-fit validation results using plasma (A) and CSF (B) biomarkers. Dashed lines delimit predictions that are 50% above or 33% below actual total disease duration. Dots represent predictions that fall within this range (match), upward arrowheads predictions that are above this range (high), and downward arrowheads predictions that are below this range (low). Shaded bands represent prediction limit 95% CI.
Multivariable modeling and goodness-of-fit validation results using combined plasma and CSF biomarkers as well as plasma-CSF ratios. Dash-dotted lines delimit predictions that are 50% above or 33% below actual total disease duration. Dots represent predictions that fall within this range (match), upward arrowheads predictions that are above this range (high), and downward arrowheads predictions that are below this range (low). Shaded band represents prediction limit 95% CI.
eTable 1. Plasma regression statistics
eTable 2. CSF regression statistics
eTable 3. Combined CSF and plasma regression statistics
Su XW, Simmons Z, Mitchell RM, Kong L, Stephens HE, Connor JR. Biomarker-Based Predictive Models for Prognosis in Amyotrophic Lateral Sclerosis. JAMA Neurol. 2013;70(12):1505-1511. doi:10.1001/jamaneurol.2013.4646
Although median survival in amyotrophic lateral sclerosis (ALS) is 2 to 4 years, survival ranges from months to decades, creating prognostic uncertainty. Strategies to predict prognosis would benefit clinical management and outcomes assessments of clinical trials.
To identify biomarkers in plasma and cerebrospinal fluid (CSF) of patients with ALS that can predict prognosis.
Design, Participants, and Setting
We conducted a retrospective study of plasma (n = 29) and CSF (n = 33) biomarkers identified in samples collected between March 16, 2005, and August 22, 2007, from patients with ALS at an academic tertiary care center. Participants included patients who were undergoing diagnostic evaluation in the neurology outpatient clinic and were eventually identified as having definite, probable, laboratory-supported probable, or possible ALS as defined by revised El-Escorial criteria. All were white and none had a family history of ALS. Clinical information extended from initial presentation to death. Genotyping for hemochromatosis (HFE) gene status was performed. Multiplex and immunoassay analysis of plasma and CSF was used to measure levels of 35 biomarkers. Statistical modeling was used to identify biomarker panels that could predict total disease duration.
Main Outcomes and Measures
Total disease duration, defined as the time from symptom onset to death, was the main outcome. The hypothesis being tested was formulated after data collection.
Multivariable models for total disease duration using biomarkers from plasma, CSF, and plasma and CSF combined incorporated 7, 6, and 6 biomarkers to achieve goodness-of-fit R2 values of 0.769, 0.617, and 0.962, respectively. After classification into prognostic categories, actual and predicted values achieved moderate to good agreement, with Cohen κ values of 0.526, 0.515, and 0.930 for plasma, CSF, and plasma and CSF combined models, respectively. Inflammatory biomarkers, including select interleukins, growth factors such as granulocyte colony-stimulating factor, and l-ferritin, had predictive value.
Conclusions and Relevance
This study provides proof-of-concept for a novel multivariable modeling strategy to predict ALS prognosis. These results support unbiased biomarker discovery efforts in larger patient cohorts with detailed longitudinal follow-up.
Amyotrophic lateral sclerosis (ALS) is highly heterogeneous in clinical course and survival, with the latter ranging from months to decades. The cause of most cases of ALS remains unknown. Although several genetic and environmental factors have been identified, each accounts for a fraction of the total caseload.1 These elements confound clinical management as well as the development of new treatments. Biomarkers have the potential to address both of these concerns.
Biomarkers are objective physiologic measures reflecting biological processes or treatment effects.2,3 By improving prognostic determination, they could aid clinical care planning, including discussions of gastrostomy tube placement, noninvasive ventilation, and end-of-life decisions. Prognostic biomarkers also could have a meaningful effect on the conduct of clinical trials. It is now often impossible to determine whether a potential therapeutic agent in a failed clinical trial is wholly ineffective or may have benefited a subgroup of patients who could be stratified by as-yet unidentified biomarkers. Biomarker subgroup analyses in clinical trials have the potential to permit the stratification of clinical response results according to predicted prognosis.
Several ALS biomarker studies have been conducted or are ongoing. In blood, amino acids,4-6 inflammatory cytokines,7-11 growth factors,12-14 and metabolites15,16 have been studied. In cerebrospinal fluid (CSF), researchers have analyzed similar classes of biomarkers,6,17-19 with special attention to neurofilament protein,20-23 τ,24 S100-β,24,25 and cystatin C.26,27 Previous studies28,29 have demonstrated that protein biomarkers in both plasma and CSF may aid the diagnosis and stratification of patients with ALS. However, these studies focused primarily on diagnosis rather than clinically relevant prognostic end points and generally followed a targeted rather than unbiased discovery approach. As such, biomarker studies using unbiased screens of candidate targets for disease progression and prognosis are limited.
This study attempted to identify biomarkers relevant for ALS prognosis. Study design, recruitment, sample collection and biobanking, quality control, and data analysis proceeded with attention to recent ALS biomarker research guidelines.30 The results support the usefulness of biomarker-based approaches to analyze ALS disease progression and prognosis.
We conducted a retrospective study of plasma and CSF biomarkers identified in samples collected between March 16, 2005, and August 22, 2007, from patients who were undergoing diagnostic evaluation in the neurology outpatient clinic and were eventually identified as having definite, probable, laboratory-supported probable, or possible ALS.31 Clinical information extended from initial presentation to death and spanned June 1, 1989, through March 30, 2013. Blood samples were collected by venipuncture and were centrifuged immediately to isolate plasma. After written informed consent was received from patients, CSF samples were obtained by standard lumbar puncture under sterile conditions with local anesthetic. Plasma and CSF were obtained between 8 am and 12 pm to limit circadian effects. Samples were frozen after collection, then later thawed on ice and centrifuged to remove particulate matter. Protease inhibitor cocktail (for use with mammalian cell and tissue extracts; Sigma-Aldrich) was added, and samples were refrozen at −80°C in 200-µL aliquots until use. The study was approved by the Penn State Hershey Medical Center institutional review board.
Clinical variables recorded included date of birth, sex, site and time of symptom onset, ALS Functional Rating Scale–Revised scores,32 and time of death. Total disease duration was defined as time from symptom onset to death. None of the patients underwent tracheostomy and mechanical ventilation. Because of the suggested association between ALS and hemochromatosis gene (HFE) polymorphisms,33-37 histidine-63-to-aspartic acid (H63D) and cysteine-282-to-tyrosine (C282Y) HFE genotyping was performed.
Genomic DNA was purified from leukocytes (QIAamp DNA Mini kit; Qiagen). Polymerase chain reaction followed by restriction fragment length analysis and confirmation DNA sequencing as previously reported35 were used to analyze H63D and C282Y HFE status.
Multiplex biomarker analysis of plasma and CSF was performed using an assay system (Bio-Plex Pro Human Cytokine 27-Plex; Bio-Rad Laboratories). Briefly, 50 µL of plasma at a 1:3 dilution or undiluted CSF was added to 50 µL of antibody-conjugated beads on respective assay wells, followed by 25 µL of detection antibody. At that time, 50 µL of streptavidin-phycoerythrin was added, and the reaction proceeded to completion. Washes between steps were performed using an automated wash-station (Bio-Plex Pro; Bio-Rad Laboratories). Assay plates were read using a multiplex system (Bio-Plex 200; Bio-Rad Laboratories), and data were analyzed using commercial software (Bio-Plex Manager; Bio-Rad Laboratories). Analyte concentration was calculated based on the standard curve for each cytokine. Each sample was analyzed in duplicate, and the coefficient of variation was less than 10% for each sample included in the final analysis. Levels of all analytes were measured using the multiplex assay system except those related to iron metabolism, which are not available on the multiplex panel. These biomarkers were analyzed by enzyme-linked immunosorbent assay (ELISA), immunoradiometric assay, or atomic absorption spectrometry as detailed below.
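The duplicate-measurement quality check described above can be sketched as follows. This is only an illustration, not the authors' pipeline: the function names are hypothetical, and it assumes the variability of a duplicate pair is computed as the sample standard deviation divided by the mean of the two readings, which the text does not specify.

```python
from statistics import mean, stdev


def duplicate_cv_percent(reading_a: float, reading_b: float) -> float:
    """Return the coefficient of variation (%) of two replicate readings.

    Assumption: sample standard deviation (n - 1 denominator) over the mean.
    """
    readings = [reading_a, reading_b]
    average = mean(readings)
    if average == 0:
        raise ValueError("mean of replicates is zero; CV is undefined")
    return stdev(readings) / average * 100.0


def passes_duplicate_qc(reading_a: float, reading_b: float,
                        threshold_percent: float = 10.0) -> bool:
    """True when the duplicate pair varies by less than the threshold."""
    return duplicate_cv_percent(reading_a, reading_b) < threshold_percent
```

Under this assumption, a pair of readings such as 100 and 105 would pass the 10% criterion, whereas 100 and 130 would be excluded.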
Plasma and CSF levels of β-2 microglobulin (β-2 microglobulin BioAssay ELISA Kit; USBiological) and transferrin (Human Transferrin ELISA Kit; Bethyl Laboratories) were analyzed using ELISA. l-ferritin levels were measured by immunoradiometric assay, which uses an antibody targeting human spleen ferritin largely composed of l-ferritin (Coat-A-Count Ferritin IRMA; Siemens Medical Solutions), and h-ferritin levels were measured by ELISA according to previously published protocols.38 Plasma levels of C-reactive protein (Human C-Reactive Protein/CRP Quantikine ELISA Kit; R&D Systems) and pro-hepcidin (DRG Hepcidin Prohormone ELISA Kit; DRG International) were analyzed by ELISA. Assays were conducted per manufacturers’ protocols.
Total iron content in plasma and CSF was determined by digestion in ultrapure nitric acid (9598-00; J.T. Baker), 1:4 vol/vol, followed by incubation at 60°C for 24 hours. Samples were diluted 1:100 in ddH2O and then analyzed (Perkin Elmer Atomic Absorption Spectrometer 600 series). Transferrin saturation in plasma was calculated as transferrin saturation (%) = plasma iron (mol/L) / [2 × transferrin (mol/L)] × 100. Replicate sample variation was less than 5%, and an external standard was included in each set of analyses.
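As a worked illustration of the transferrin saturation calculation, the following sketch divides molar plasma iron by twice the molar transferrin concentration (each transferrin molecule can bind 2 iron atoms) and scales to a percentage. The function name and the example concentrations are hypothetical and are not values from the study.

```python
def transferrin_saturation_percent(plasma_iron_mol_per_l: float,
                                   transferrin_mol_per_l: float) -> float:
    """Transferrin saturation (%) = iron / (2 x transferrin) x 100.

    Both inputs must be molar concentrations in the same unit (mol/L).
    """
    if transferrin_mol_per_l <= 0:
        raise ValueError("transferrin concentration must be positive")
    return plasma_iron_mol_per_l / (2.0 * transferrin_mol_per_l) * 100.0


# Hypothetical example: 20 µmol/L iron and 30 µmol/L transferrin.
example = transferrin_saturation_percent(20e-6, 30e-6)
```

With these illustrative inputs, the result is approximately 33%.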
Total disease duration was calculated from clinical data and treated as the dependent variable. Descriptive statistics were calculated for biomarker and clinical values. Multiple linear regression incorporating stepwise-forward, main effects–only analyses was conducted. Variable selection algorithms used McHenry’s39 method. Briefly, the variable with the highest R2 was entered as an independent variable, followed by the variable that next most increased predictive likelihood. Switching was integral to the algorithm, such that with each successive variable added to the model, all other variables were checked for increases in the likelihood function, until the set of variables for an n-variable model was stable. The process was then repeated for n + 1 variables until no further significant variables remained.
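The forward selection with switching outlined above can be sketched roughly as follows. This is a simplified illustration and not the implementation in the statistical packages used for the study: it scores candidate sets by ordinary least-squares R2 rather than a likelihood function, and it stops on a hypothetical minimum R2 gain (min_gain) instead of a significance test. All function and parameter names are assumptions made for the example.

```python
import numpy as np


def r_squared(X, y, cols):
    """R^2 of a least-squares fit of y on the chosen columns plus intercept."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)


def forward_select_with_switching(X, y, max_vars, min_gain=0.01):
    """Greedy forward selection; after each addition, try swapping variables."""
    selected, best = [], 0.0
    n_features = X.shape[1]
    while len(selected) < max_vars:
        outside = [c for c in range(n_features) if c not in selected]
        if not outside:
            break
        # Forward step: add the variable that raises R^2 the most.
        scores = {c: r_squared(X, y, selected + [c]) for c in outside}
        candidate = max(scores, key=scores.get)
        if scores[candidate] - best < min_gain:
            break  # assumed stopping rule, standing in for a significance test
        selected.append(candidate)
        best = scores[candidate]
        # Switching step: swap any selected variable for an outside one
        # whenever the swap improves R^2; repeat until the set is stable.
        improved = True
        while improved:
            improved = False
            for i in range(len(selected)):
                for c in range(n_features):
                    if c in selected:
                        continue
                    trial = selected[:i] + [c] + selected[i + 1:]
                    score = r_squared(X, y, trial)
                    if score > best + 1e-12:
                        selected, best, improved = trial, score, True
    return selected, best
```

Because the same data are used both to select variables and to score the model, R2 values obtained this way tend to be optimistic, particularly with small samples.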
Models were validated using the original biomarker and clinical values of the plasma- and CSF-discovery cohorts; limitations of this approach are covered in the Discussion section. Confidence bands were constructed for model-based predictions at the individual-subject level. Actual and predicted values for total disease duration were grouped into prognostic categories, and agreement was analyzed using Cohen’s κ statistics. Commercial software was used for statistical analyses (SAS, version 9.3; SAS Institute or NCSS, version 8; NCSS LLC). Significance tests were 2-tailed, with significance set at the P < .05 level.
Plasma and CSF samples were available from 29 and 33 patients with ALS, respectively. Of these, 18 patients provided both plasma and CSF samples. All participants were white, and none had a family history of ALS. Descriptive statistics of patient characteristics are provided in Table 1.
Biomarkers analyzed are listed in Table 2. Five biomarkers were associated with greater than 10% variability between measurements and were excluded from subsequent analyses: interleukin 4 (IL-4), IL-8, IL-15, and IL-17 in plasma and IL-1β in CSF. Status of C282Y HFE was excluded because of the limited number of patients harboring this allele. A total of 31 biomarkers and 3 categorical clinical variables (sex, H63D HFE status, and site of symptom onset) were entered into individual plasma or CSF multivariable model algorithms. For the combined plasma and CSF model, each available plasma and CSF biomarker was entered into the algorithm, as well as the ratio of plasma-CSF levels for each biomarker that was measured in both plasma and CSF. The ranges of biomarkers assayed are given in Table 2.
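The candidate set for the combined model, as described above, consists of each plasma level, each CSF level, and a plasma-CSF ratio for every biomarker measured in both fluids. A minimal sketch of assembling such a set for one patient is shown here. The function name, the key labels, and the choice to omit a ratio when the CSF level is zero are assumptions for illustration; the example values are not study data.

```python
def combined_candidates(plasma: dict, csf: dict) -> dict:
    """Build plasma, CSF, and plasma/CSF ratio features for one patient."""
    features = {}
    for name, level in plasma.items():
        features[f"plasma {name}"] = level
    for name, level in csf.items():
        features[f"CSF {name}"] = level
    # Ratios exist only for biomarkers measured in both fluids.
    for name in sorted(plasma.keys() & csf.keys()):
        if csf[name] != 0:  # assumed handling: omit undefined ratios
            features[f"plasma/CSF {name}"] = plasma[name] / csf[name]
    return features


# Hypothetical levels for illustration only.
patient = combined_candidates(
    plasma={"IFN-gamma": 10.0, "IP-10": 500.0},
    csf={"IFN-gamma": 4.0, "IL-8": 30.0},
)
```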
The plasma duration model is presented in Figure 1A, with regression coefficient CIs, model selection iteration summaries, and analysis of variance tables presented in the Supplement (eTable 1). The predictive model achieved R2 = 0.769 and took the following form: disease duration = 0.128 (C-X-C motif chemokine 10 [IP-10]) + 12.825 (IL-10) – 22.472 (IL-1β) – 0.248 (IL-1RA) – 5.467 (IL-12) – 0.00463 (chemokine [C-C motif] ligand 5 [RANTES]) + 0.380 (eotaxin) + 114.654. The agreement between actual and predicted total duration was categorized, with predictions more than 50% above or 33% below actual (a geometrically symmetric error range) classified as high or low predictions, respectively, and those within this range classified as a match. Prediction matched actual in 14 cases, was high in 9 cases, and was low in 6 cases.
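To make the plasma model and the match, high, and low categorization concrete, the sketch below evaluates the reported equation and classifies a prediction against the actual duration. The coefficients and intercept are taken from the equation above; the function names are hypothetical. Two points are assumptions: the units of the biomarker levels and of disease duration are not stated in this section, and the lower limit is taken as actual / 1.5 (about 33% below actual), following the description of the range as geometrically symmetric.

```python
PLASMA_COEFFICIENTS = {
    "IP-10": 0.128,
    "IL-10": 12.825,
    "IL-1beta": -22.472,
    "IL-1RA": -0.248,
    "IL-12": -5.467,
    "RANTES": -0.00463,
    "eotaxin": 0.380,
}
PLASMA_INTERCEPT = 114.654


def predict_plasma_duration(levels: dict) -> float:
    """Evaluate the reported plasma equation for one patient's levels."""
    return PLASMA_INTERCEPT + sum(
        coefficient * levels[name]
        for name, coefficient in PLASMA_COEFFICIENTS.items()
    )


def classify_prediction(predicted: float, actual: float) -> str:
    """Return 'high', 'low', or 'match' relative to the error band."""
    if actual <= 0:
        raise ValueError("actual duration must be positive")
    if predicted > 1.5 * actual:
        return "high"
    if predicted < actual / 1.5:  # assumed reading of "33% below"
        return "low"
    return "match"
```

For an actual duration of 24 (in whatever time unit the model uses), predictions between 16 and 36 would count as a match under this reading.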
The CSF duration model is presented in Figure 1B, with additional statistical information presented in the Supplement (eTable 2). The predictive model achieved R2 = 0.617 and took the following form: disease duration = 1.731 (IL-9) – 1.799 (age of onset) – 35.485 (IL-5) – 4.657 (IL-12) + 4.277 (macrophage inflammatory protein-1-β [MIP-1β]) + 4.870 (granulocyte colony-stimulating factor [G-CSF]) + 155.210. After categorization as above, prediction matched actual in 19 cases, was high in 6 cases, and was low in 8 cases.
The combined plasma and CSF duration model is presented in Figure 2, with additional statistical information presented in the Supplement (eTable 3). The predictive model achieved R2 = 0.962 and took the following form: total disease duration = 0.1323 (plasma IP-10) – 18.004 (CSF IL-8) + 10.871 (plasma IL-5) + 0.338 (plasma l-ferritin) + 0.176 (CSF chemokine [C-C motif] ligand 2 [MCP-1]) – 7.480 (plasma/CSF interferon-γ [IFN-γ]) – 12.058. After categorization as above, prediction matched actual in 16 cases, was high in 1 case, and was low in 1 case.
Actual and predicted disease duration were classified into prognostic categories using a separate scheme according to the following criteria: rapid progression (disease duration <2 years); average progression (disease duration 2-4 years); and slow progression (disease duration >4 years). Agreement was measured using weighted Cohen κ statistics on the resultant contingency tables (Table 3). The combined plasma and CSF model resulted in agreement between actual and predicted prognostic categories in 17 of 18 patients. Actual and predicted prognoses achieved κ = 0.526 for the plasma model, κ = 0.515 for the CSF model, and κ = 0.930 for the combined plasma and CSF model.
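The prognostic categorization and agreement statistic can be illustrated with the short sketch below. The category boundaries follow the criteria above, expressed in months. The text does not state which weighting scheme was used for the weighted κ, so linear weights are assumed here; the function names are hypothetical.

```python
def prognostic_category(duration_months: float) -> int:
    """0 = rapid (<2 years), 1 = average (2-4 years), 2 = slow (>4 years)."""
    if duration_months < 24:
        return 0
    if duration_months <= 48:
        return 1
    return 2


def weighted_kappa(actual, predicted, n_categories=3):
    """Cohen's kappa with linear disagreement weights (assumed scheme)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    k, n = n_categories, len(actual)
    table = [[0] * k for _ in range(k)]
    for a, p in zip(actual, predicted):
        table[a][p] += 1
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at extremes
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when only one category occurs")
    return 1.0 - observed / expected
```

A value of 1 indicates perfect agreement between actual and predicted categories, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.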
This study used a novel modeling strategy to predict ALS prognosis using panels of plasma and CSF biomarkers, with 2 significant findings. First, the methods identified several biomarkers with predictive value that are biologically relevant to ALS, including inflammatory cytokines, growth factors, and markers of iron metabolism, suggesting new directions for research on characteristics of disease pathophysiology. Second, multivariable modeling techniques serve as proof-of-concept for a novel strategy aimed at predicting prognosis. Mathematically, the models incorporated a manageable number of predictive factors to achieve reasonable goodness of fit. Although the particular models obtained in this study may not be generalizable to larger cohorts, the results argue for the usefulness of multivariable modeling in biomarker-based ALS research.
Plasma models had reasonable ability to predict total disease duration using 7 biomarkers; in descending order of predictive value (as measured by R2), these were IP-10, IL-10, IL-1β, IL-1RA, IL-12, RANTES, and eotaxin. Among these, IL-1β and IL-12 predict shorter disease duration in the model. These cytokines are secreted by activated macrophages to stimulate T-cell–based inflammatory responses and may reflect deleterious chronic inflammation. In contrast, IL-10 predicts longer disease duration. This immunoregulator mediates a number of anti-inflammatory effects via suppression of macrophages and antigen-presenting cells; inhibition of several inflammatory cytokines, including IL-1β and IL-12; and prevention of overwhelming immune responses leading to tissue damage.40 These results suggest that, in the periphery, a lower level of inflammation is associated with longer disease duration. Although speculative, the plasma cytokine profile is suggestive of M2 vs M1 macrophage activation.41 In this context, the inverse relationship between IL-1RA, which inhibits the effects of IL-1β, and total disease duration suggests IL-1RA levels may be more important as a marker of systemic inflammation than a direct indicator of an anti-inflammatory response. Moreover, the expression of IL-1RA may indicate chronic inflammation in ALS. We also showed that RANTES, which has been previously associated with ALS42 and is a chemotactic molecule for T cells, eosinophils, and basophils, predicts shorter disease duration.
The CSF model incorporated 6 biomarkers; in descending order of predictive value, these were IL-9, age of onset, IL-5, IL-12, MIP-1β, and G-CSF. Consistent with clinical observations, increasing age of onset predicts shorter disease duration, arguing for the validity of results. As in plasma, IL-12 in CSF predicts shorter disease duration, supporting the argument for a negative effect of inflammation in both the central nervous system and periphery on disease duration. Levels of G-CSF, a growth factor previously shown to have neuroprotective effects on motor neurons in ALS,18 predict longer disease duration in the CSF model; MIP-1β, a chemoattractant for macrophages and microglia, was also a positive predictor in CSF. These results support the concept that balanced immune modulation in the central nervous system, possibly mediated by distinct classes of immune regulators (eg, M2 vs M1 macrophages), is required to prevent neurotoxicity.41,43 In CSF, IL-9 and IL-5, together with eotaxin in plasma, mediate eosinophil-based immune responses and may be connected to findings suggesting elevated eosinophil-derived neurotoxin levels in the CSF of patients with ALS.44
The combined plasma and CSF model, which achieved high R2 and Cohen κ, was based on 6 biomarkers; in descending order of predictive value, these were plasma IP-10, CSF IL-8, plasma IL-5, plasma l-ferritin, CSF MCP-1, and the ratio of plasma-CSF IFN-γ. Plasma l-ferritin predicts longer disease duration in the model, implicating iron status in ALS disease course. Other markers of iron metabolism were not significant. A low plasma-CSF IFN-γ ratio predicts longer disease duration, whereas absolute levels did not affect disease duration. This may reflect a situation in which low levels of systemic inflammation, coupled with moderate levels of central nervous system immune activation, promote neuroprotection. The positive correlation between plasma IP-10 and total disease duration suggests that a specific induction of this cytokine by IFN-γ, vs the direct proinflammatory effects of IFN-γ, is beneficial. In CSF, levels of IL-8, a proinflammatory cytokine that activates neutrophils, predict shorter duration, whereas MCP-1, which attracts monocytes, predicts longer duration. This may again implicate differing immune cells, or even subclasses of the same cell, in central nervous system immune modulation that affords neuroprotection. Results also suggest that a combination of plasma and CSF biomarkers has greater predictive power than their levels in plasma or CSF alone.
With regard to prediction accuracy, R2 best reflects goodness of fit; however, it is not likely to be immediately intuitive to the clinician. Classification by error bands best accounts for continuous error between actual and predicted duration. Less error is acceptable when durations are short than when they are long: a prediction error of 6 months is unacceptable if the actual duration is 2 months but reasonable if the actual duration is 10 years. As a consequence, this method allows greater absolute error with longer durations. Classification by prognostic categories is potentially the most clinically relevant. However, this method is artificially stringent at the boundaries between categories: for example, an overestimate of only 2 months when the actual duration is 23 months yields a mismatch. Practical application of the results requires attention to these strengths and limitations.
This study provides proof-of-concept for a novel multivariable modeling strategy to predict ALS prognosis. Statistical methods used relatively simple models and coefficients because it was assumed that, all else being equal, less complex formulas requiring fewer independent factors provide more valid predictions. Models incorporated only main effects and basic interaction terms (biomarker plasma-CSF ratios), maximizing the generalizability of equations by decreasing the potential for detecting false-positive higher-order relationships.
A limitation of this study is its cross-sectional design, using samples obtained at a single time for each patient. It is possible that levels of biomarkers change both in absolute value and relative to one another during the disease course. Better knowledge of these variations may improve the precision of models and elucidate underlying disease mechanisms. A longitudinal follow-up study using measurements of biomarkers at more than one point in patients’ disease trajectories is planned.
The small sample sizes precluded independent discovery and validation cohorts for predictive modeling. This is another limitation of this study, potentially resulting in artificially inflated goodness of fit. However, the present results provide a compelling starting point for the use of this method in larger cohorts. More generally, survival time prediction using statistical models suffers from inherent shortcomings independent of any specific strategy. Individual variations in survival times are large enough that the best clinical models provide only approximate indications of prognosis. Point predictions are most error prone, with rates of serious error (predictions less than half or more than twice the actual values) often exceeding 50%.45 Predictions using prognostic categories (Table 3) may be more appropriate.
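The serious-error criterion mentioned above (a prediction less than half or more than twice the actual value) can be computed for a set of point predictions as in the following sketch. The function name and the example durations are hypothetical and are not data from this study.

```python
def serious_error_rate(actual, predicted) -> float:
    """Fraction of predictions below half or above twice the actual value."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    serious = sum(
        1 for a, p in zip(actual, predicted)
        if p < 0.5 * a or p > 2.0 * a
    )
    return serious / len(actual)


# Hypothetical durations in months, for illustration only.
rate = serious_error_rate(actual=[12, 24, 36, 48],
                          predicted=[30, 20, 10, 50])
```

In this illustrative set, 2 of the 4 predictions (30 for an actual value of 12, and 10 for an actual value of 36) meet the criterion.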
The present results suggest that multivariable models incorporating plasma biomarker panels may have prognostic value in ALS. Future studies should use unbiased discovery methodologies in large patient cohorts, with detailed longitudinal follow-up.
Accepted for Publication: August 5, 2013.
Corresponding Author: James Robert Connor, PhD, Department of Neurosurgery, Office C3830J, Mailcode: H110, The Pennsylvania State University College of Medicine, 500 University Dr, Hershey, PA 17033 (firstname.lastname@example.org).
Published Online: October 21, 2013. doi:10.1001/jamaneurol.2013.4646.
Author Contributions: Mr Su had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Simmons, Connor.
Acquisition of data: Su, Simmons, Mitchell, Stephens.
Analysis and interpretation of data: Su, Kong.
Drafting of the manuscript: Su, Simmons, Connor.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Su, Kong.
Conflict of Interest Disclosures: None reported.
Funding/Support: The study was supported in part by the Paul and Harriett Campbell Fund for ALS Research; the Zimmerman Family Love Fund; and the ALS Association, Greater Philadelphia Chapter.
Role of the Sponsor: The sponsors provide general funding for ALS research under the supervision of James Connor and Zachary Simmons; none had roles in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.