Figure 1. Receiver operating characteristic curves for the 3 prediction functions in the validation data set. Continuous, categorical, and categorical S indicate the best-fitting continuous model, best-fitting categorical model, and simplified categorical models, respectively.
Figure 2. Event rate by total score in the development and validation data sets. A, Best-fitting categorical model. B, Simplified categorical model. Light grey indicates the development data set and dark grey the validation data set. Some categories were combined because of very small sample size. CKD indicates chronic kidney disease.
Kshirsagar AV, Bang H, Bomback AS, Vupputuri S, Shoham DA, Kern LM, Klemmer PJ, Mazumdar M, August PA. A Simple Algorithm to Predict Incident Kidney Disease. Arch Intern Med. 2008;168(22):2466–2473. doi:10.1001/archinte.168.22.2466
Despite the growing burden of chronic kidney disease (CKD), there are no algorithms (to our knowledge) to quantify the effect of concurrent risk factors on the development of incident disease.
A combined cohort (N = 14 155) of 2 community-based studies, the Atherosclerosis Risk in Communities Study and the Cardiovascular Health Study, was formed among men and women 45 years or older with an estimated glomerular filtration rate (GFR) exceeding 60 mL/min/1.73 m2 at baseline. The primary outcome was the development of a GFR less than 60 mL/min/1.73 m2 during a follow-up period of up to 9 years. Three prediction algorithms derived from the development data set were evaluated in the validation data set.
The 3 prediction algorithms were continuous and categorical best-fitting models with 10 predictors and a simplified categorical model with 8 predictors. All showed discrimination with area under the receiver operating characteristic curve in a range of 0.69 to 0.70. In the simplified model, age, anemia, female sex, hypertension, diabetes mellitus, peripheral vascular disease, and history of congestive heart failure or cardiovascular disease were associated with the development of a GFR less than 60 mL/min/1.73 m2. A numeric score of at least 3 using the simplified algorithm captured approximately 70% of incident cases (sensitivity) and accurately predicted a 17% risk of developing CKD (positive predictive value).
An algorithm containing commonly understood variables helps to stratify middle-aged and older individuals at high risk for future CKD. The model can be used to guide population-level prevention efforts and to initiate discussions between practitioners and patients about risk for kidney disease.
Patients with chronic kidney disease (CKD) lose kidney function and over time are at risk of developing end-stage kidney disease (ESKD). Identifying individuals at risk for CKD is an important first step in modifying the progressive course of CKD. Early identification1,2 of CKD would provide the best opportunity to implement strategies known to decelerate the loss of kidney function.3-5
Epidemiological studies6-16 have identified independent risk factors for CKD, including diabetes mellitus (DM), hypertension, vascular disease, and advanced age. Yet, clinical experience has shown that individuals often have 1 or more concurrent risk factors. For our group's previous studies,17,18 a system was developed and validated to quantify the likelihood of prevalent CKD based on the presence of 1 or more risk factors. The development of an algorithm that predicts future CKD would be important to health care practitioners and to patients. To date, we are unaware of any studies that have quantified the cumulative effect of concurrent risk factors on the development of incident kidney disease in the general population.
Therefore, we derived and validated a simple risk score to predict incident kidney disease in a group of community-dwelling middle-aged and older adults. Incident CKD was defined as a glomerular filtration rate (GFR) less than 60 mL/min/1.73 m2, a level generally considered to reflect abnormal kidney function even in participants with expected age-related decline in GFR. As with the recent prediction rule for prevalent kidney disease,17 we attempted to retain the following 2 important characteristics for the new prediction rule: (1) the use of routinely available and minimally intrusive variables easily understood by lay persons and by health care practitioners and (2) estimation of the cumulative effect of concurrent risk factors on the likelihood of developing renal disease.
This study analyzed subject-level data from 2 community-based, prospective, public-use data sets to ascertain the relationship between baseline characteristics and incident CKD. Risk scoring rules were developed based on data available in the event-free population.
Data were combined from 2 nonconcurrent cohort studies, the Atherosclerosis Risk in Communities (ARIC) Study and the Cardiovascular Health Study (CHS) (http://www.nhlbi.nih.gov/resources/deca/datasets_obv.htm). Detailed descriptions of these 2 studies have been published previously.19,20
Briefly, ARIC enrolled 15 732 biracial participants aged 45 to 64 years between 1987 and 1989 (visit 1) from 4 communities and followed them up for a maximum of 4 visits, approximately 3 years apart, for a maximum follow-up of 9 years. The CHS recruited 5201 participants 65 years and older between 1989 and 1990 from 4 communities. Both studies recruited from 2 common communities of Forsyth County, North Carolina, and Washington County, Maryland. The 2 distinct recruiting regions selected by ARIC are suburban Minneapolis, Minnesota, and Jackson, Mississippi, whereas the CHS recruited from Sacramento, California, and Pittsburgh, Pennsylvania. Between 1992 and 1993, the CHS enrolled an additional 687 black subjects to increase minority participation. The CHS participants were followed up annually for up to 10 years.
We chose comprehensive demographic and clinical variables related to CKD. These included age, sex, marital status, race/ethnicity, education level, smoking status, body mass index, hemoglobin levels and anemia, DM, hypertension, peripheral vascular disease (PVD), history of cardiovascular disease (CVD), and history of heart failure, as well as triglyceride level and concentrations of low-density lipoprotein (LDL) (calculated using the Friedewald equation) and high-density lipoprotein (HDL) cholesterol. Family history of hypertension and DM was also recorded in ARIC. Definitions of derived variables are available from the corresponding author on request.
In ARIC, serum creatinine level was measured at baseline and at visit 2 (3-year follow-up) and at visit 4 (9-year follow-up) using the modified kinetic Jaffe method. In the CHS, serum creatinine level was measured at baseline and at years 3 and 7 of the follow-up period by means of a colorimetric method (Kodak Ektachem 700 analyzer; Eastman Kodak Corporation, Rochester, New York). The black cohort in the CHS had only 2 serum creatinine measurements, while the original cohort study had 3 serum creatinine measurements.
Kidney function was quantified by estimated GFR from the 4-variable Modification of Diet in Renal Disease (MDRD) study function as follows21: $\text{GFR (mL/min/1.73 m}^2\text{)} = 186 \times (\text{serum creatinine level in mg/dL})^{-1.154} \times (\text{age in years})^{-0.203} \times (1.212 \text{ if black}) \times (0.742 \text{ if female})$. Because serum creatinine levels vary across clinical laboratories, creatinine data in our study were calibrated using published adjustment coefficients. In ARIC, serum creatinine levels were calibrated for the MDRD equation using the following constants: -0.24, -0.24, and 0.18 mg/dL for the creatinine levels at visits 1, 2, and 4, respectively.22 (To convert creatinine level to micromoles per liter, multiply by 88.4.) In the CHS, serum creatinine levels were calibrated with constants of -0.11 mg/dL (for the original cohort) and -0.04 mg/dL (for the CHS black cohort) for baseline,23 -0.04 for 3-year follow-up, and -0.11 for 7-year follow-up.16 Incident CKD was considered a GFR less than 60 mL/min/1.73 m2 occurring at any time during the follow-up period. This corresponds to stage 3 or higher CKD based on the Kidney Disease Outcomes and Quality Initiative guidelines.21
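The 4-variable MDRD calculation described above can be sketched in a few lines of Python. This is an illustration only, not the authors' SAS code: the function name `mdrd_egfr` and the `calibration` parameter are hypothetical, and the sketch assumes that each published calibration constant is an additive offset (in mg/dL) applied to the measured creatinine level before the equation is evaluated.

```python
def mdrd_egfr(creatinine_mg_dl, age_years, black, female, calibration=0.0):
    """Estimated GFR (mL/min/1.73 m2) from the 4-variable MDRD equation.

    `calibration` is an assumed additive offset in mg/dL (for example,
    -0.24 for ARIC visit 1); it is applied to the raw creatinine value.
    """
    scr = creatinine_mg_dl + calibration
    if scr <= 0 or age_years <= 0:
        raise ValueError("calibrated creatinine and age must be positive")
    egfr = 186.0 * scr ** -1.154 * age_years ** -0.203
    if black:
        egfr *= 1.212
    if female:
        egfr *= 0.742
    return egfr


def has_ckd(egfr, threshold=60.0):
    """Outcome definition used in the text: GFR below 60 mL/min/1.73 m2."""
    return egfr < threshold
```

A call such as `mdrd_egfr(1.2, 60, black=False, female=True, calibration=-0.24)` would return the estimate for a hypothetical ARIC visit 1 participant under the additive-offset assumption.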
The split-sample method was used for risk equation and score development and for validation.24,25 Eligible participants from the data set were randomly allocated to development and validation samples using a 2:1 ratio within each data set. Multiple logistic regression analysis was used to create a prediction model in the development data set.
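A minimal sketch of a 2:1 random allocation performed separately within each source data set is shown below. The function name, the seed, and the use of participant identifiers are assumptions made for illustration; the original allocation procedure is not described in more detail than the paragraph above.

```python
import random


def split_two_to_one(ids_by_cohort, seed=0):
    """Randomly allocate about 2/3 of each cohort to development, 1/3 to validation.

    `ids_by_cohort` maps a cohort label (e.g. "ARIC", "CHS") to an iterable of
    participant identifiers. The split is done within each cohort so that both
    samples keep roughly the same cohort mix.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    development, validation = [], []
    for cohort, ids in ids_by_cohort.items():
        shuffled = list(ids)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * 2 / 3)
        development.extend((cohort, pid) for pid in shuffled[:cut])
        validation.extend((cohort, pid) for pid in shuffled[cut:])
    return development, validation
```

Stratifying by cohort in this way keeps the ratio of ARIC to CHS participants approximately equal in the two samples.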
We used 3 different strategies for multivariate modeling. First, we used continuous variables whenever available (listed in Table 1), aiming for efficient estimation of regression variables and minimal residual confounding with increased statistical power. All covariates were considered main effects. Backward elimination technique was used to reach the final model: a factor with the largest P value was deleted one at a time until all the predictors in the model were significant at P ≤ .05. After reaching the final parsimonious model, the significance of each of the deleted variables was tested to ensure that no covariate was erroneously omitted in this sequential process. Second, all the covariates in the final model were replaced with categorical versions that could be converted to a user-friendly integer risk score. Third, a simplified categorical model was constructed omitting less readily available variables. To assess discrimination, area under the receiver operating characteristic curve (AUC) was computed. Akaike and Bayesian information criteria were evaluated as model fit statistics,26,27 with lower values indicating better model fit. The Hosmer-Lemeshow goodness-of-fit test was also performed.28
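The discrimination and fit statistics named above have simple closed forms. The sketch below, with hypothetical function names, computes the AUC as the proportion of case-noncase pairs in which the case receives the higher predicted risk (ties counted as one-half) and evaluates the Akaike and Bayesian information criteria from a model's maximized log-likelihood. It is a generic illustration, not a reproduction of the SAS procedures used in the study.

```python
import math


def auc(predicted, outcomes):
    """Area under the ROC curve via pairwise concordance.

    `predicted` holds risk estimates or scores; `outcomes` holds 1 (event)
    or 0 (no event) for the same participants.
    """
    cases = [p for p, y in zip(predicted, outcomes) if y == 1]
    noncases = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")
    concordant = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(noncases))


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate better fit."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based formula would be preferable for cohorts of the size analyzed here.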
Once the most parsimonious model was reached, diagnostic properties were tested in the validation data set that was not used for model building. Using the regression coefficients in the risk function, the probability of developing CKD was estimated, which allowed the establishment of a rule to characterize different degrees of risk based on cut points of the probability distribution. Basic scores were assigned by rounding up the regression coefficients from the multiple regression model to appropriate integers and by capturing the monotonicity of continuous risk (eg, for age).17
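The two steps in this paragraph, estimating a probability from the logistic risk function and converting coefficients to integer points, can be illustrated as follows. The coefficient values and the intercept below are invented for illustration and are not the estimates reported in Table 2; taking the ceiling of each coefficient is one assumed reading of "rounding up," and the study's exact rounding rule may differ.

```python
import math

# Hypothetical values for illustration only; NOT the coefficients in Table 2.
EXAMPLE_COEFFICIENTS = {
    "hypertension": 0.45,
    "diabetes": 0.60,
    "anemia": 0.55,
    "age_70_plus": 2.20,
}
EXAMPLE_INTERCEPT = -3.5


def predicted_probability(present, coefficients, intercept):
    """Logistic risk function: p = 1 / (1 + exp(-(b0 + sum of b_i)))."""
    linear_predictor = intercept + sum(coefficients[name] for name in present)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def integer_points(coefficients):
    """Round each coefficient up to an integer number of points (assumed rule)."""
    return {name: math.ceil(beta) for name, beta in coefficients.items()}


def total_score(present, points):
    """Sum the points for the risk factors a person has."""
    return sum(points[name] for name in present)
```

With these made-up values, a person with hypertension who is 70 years or older would receive `total_score(["hypertension", "age_70_plus"], integer_points(EXAMPLE_COEFFICIENTS))` points, and each additional risk factor raises both the score and the estimated probability.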
The prediction models were evaluated based on the following standard measures: AUC, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and percentage of persons who are identified to be at high risk by a specified selection strategy. The final cut-point selection for populations at high risk was determined using the index by Youden29 (sensitivity + specificity - 1).
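For a given score threshold, the measures listed above follow directly from the 2 × 2 table of predicted risk status against observed outcome. The sketch below uses hypothetical function names and treats a score at or above the cutoff as high risk; it illustrates the definitions and the Youden index and is not the study's analysis code.

```python
def diagnostic_measures(scores, outcomes, cutoff):
    """Sensitivity, specificity, PPV, NPV, high-risk fraction, and Youden index.

    A participant is classified as high risk when score >= cutoff.
    `outcomes` holds 1 (developed CKD) or 0 (did not).
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        high_risk = score >= cutoff
        if high_risk and outcome:
            tp += 1
        elif high_risk:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one noncase")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
        "high_risk_fraction": (tp + fp) / (tp + fp + tn + fn),
        "youden": sensitivity + specificity - 1.0,
    }


def best_cutoff(scores, outcomes, candidate_cutoffs):
    """Cutoff with the largest Youden index (first one wins on ties)."""
    return max(
        candidate_cutoffs,
        key=lambda c: diagnostic_measures(scores, outcomes, c)["youden"],
    )
```

Because the Youden index weights sensitivity and specificity equally, a different criterion may be preferable when false-negative and false-positive predictions carry unequal costs.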
For the sensitivity analysis, 2 additional analyses were performed to determine the robustness of the results. First, we performed an analysis with the outcome as a GFR less than 60 mL/min/1.73 m2 and at least a 10 mL/min change in GFR from baseline. Second, we used a Cox proportional hazards regression model for survival analysis to better account for censoring of follow-up data.
All analyses were performed using commercially available statistical software (SAS version 9.1; SAS Institute, Cary, North Carolina). Two-sided hypotheses tests with 5% type I error were adopted for all statistical inferences.
At the baseline visit, ARIC had 15 732 participants and the CHS had 5888 participants, for a combined cohort of 21 620 individuals. A total of 1998 individuals were excluded from analysis because of missing data for creatinine level or for variables in the GFR formula or because of baseline renal insufficiency (150 with missing data and 464 prevalent cases in ARIC and 172 with missing data and 1212 prevalent cases in the CHS). In addition, 3823 individuals from ARIC and 1644 individuals from the CHS were excluded because there were no incident outcome data by the time of the last contact day. Therefore, the final study population consisted of 14 155 participants (11 295 in ARIC and 2860 in the CHS). Of these individuals, 9470 participants were randomly selected from the original data set to form the development data set, and the remaining 4624 comprised the validation data set after excluding 61 individuals with missing covariate data.
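The cohort accounting in this paragraph can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts taken directly from the paragraph above.
baseline = {"ARIC": 15732, "CHS": 5888}
missing_data = {"ARIC": 150, "CHS": 172}
prevalent_cases = {"ARIC": 464, "CHS": 1212}
no_outcome_data = {"ARIC": 3823, "CHS": 1644}

# First round of exclusions: missing data or prevalent disease at baseline.
excluded_at_baseline = sum(missing_data.values()) + sum(prevalent_cases.values())

# Final study population per cohort after all exclusions.
final = {
    cohort: baseline[cohort]
    - missing_data[cohort]
    - prevalent_cases[cohort]
    - no_outcome_data[cohort]
    for cohort in baseline
}
total_final = sum(final.values())

# Split-sample sizes plus the participants dropped for missing covariates.
development, validation, missing_covariates = 9470, 4624, 61
accounted_for = development + validation + missing_covariates
```

Under these counts, the per-cohort totals, the 1998 baseline exclusions, and the split-sample sizes all reconcile with the final population of 14 155.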
The characteristics of the study population at baseline by event status are given in Table 1. On average, participants with CKD were almost 5 years older and less likely to have completed high school compared with participants without CKD. A greater proportion of white participants had CKD. Participants with CKD were also more likely to have DM, PVD, hypertension, and a history of CVD.
During the study period, 1605 participants developed incident CKD: 1123 of 11 295 (9.9%) from ARIC and 482 of 2860 (16.9%) from the CHS. Using continuous variables, the multiple logistic regression model in the development data set identified the following 10 predictors of incident CKD: age, DM, PVD, anemia, female sex, white race/ethnicity, systolic blood pressure, history of CVD, history of heart failure, and HDL cholesterol concentration (data not shown).
Table 2 gives the results of categorical variables in the full (best fitting) and simplified prediction models for incident CKD. The AUCs for the 2 models were similar, 0.70 vs 0.69, while the more comprehensive model demonstrated improved model fits as reflected in smaller Akaike and Bayesian information criteria values. The Hosmer-Lemeshow goodness-of-fit test showed no lack of fit for the 2 fitted models (P > .2). The 3 prediction equations were fitted to the full data set, and the results are available from the corresponding author on request.
Diagnostic characteristics of the full (best fitting) and simplified categorical models using validation data are given in Table 3. For both models, the sensitivity and NPV of the model decreased, while its specificity and PPV increased as the event rate estimated by the prediction model increased, corresponding to higher total scores derived. For example, for a score of 6 or higher in the full model, which defined about 10% of participants tested as high risk, the sensitivity was 25% and the specificity 93%, with the actual risk of CKD (ie, the PPV) being 29%. At the other (lower) end, a score of 3 or higher defined almost 74% of participants tested at high risk and yielded high sensitivity (91%) and low specificity (28%), with actual risk (PPV) of 14%. The NPV remained higher than 90% for all thresholds examined. The Youden index identified a score of 5 or higher for the full model and a score of 3 or higher for the simplified model as the cut points that yielded the highest level of test accuracy.
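The relationship among sensitivity, specificity, event rate, and PPV can be verified approximately from the figures quoted above. The sketch below assumes that the event rate in the validation data set equals the overall rate in the combined cohort (1605 of 14 155, about 11.3%); that rate is an assumption, because the validation-specific rate is not stated in the text, so the results are expected to agree with the reported values only roughly.

```python
def ppv_and_flagged_fraction(sensitivity, specificity, event_rate):
    """PPV and the fraction classified as high risk, from Bayes' rule."""
    true_positive = sensitivity * event_rate
    false_positive = (1.0 - specificity) * (1.0 - event_rate)
    flagged = true_positive + false_positive
    return true_positive / flagged, flagged


# Assumed event rate: overall incidence in the combined cohort.
ASSUMED_EVENT_RATE = 1605 / 14155

# Full model, score of 3 or higher: sensitivity 91%, specificity 28%.
ppv_3, flagged_3 = ppv_and_flagged_fraction(0.91, 0.28, ASSUMED_EVENT_RATE)

# Full model, score of 6 or higher: sensitivity 25%, specificity 93%.
ppv_6, flagged_6 = ppv_and_flagged_fraction(0.25, 0.93, ASSUMED_EVENT_RATE)
```

Under this assumption, the computed values for a score of 3 or higher fall close to the reported PPV of 14% and the "almost 74%" of participants classified as high risk; the values for a score of 6 or higher are near, but not identical to, the reported 29% and "about 10%," as would be expected from rounding and from any difference between the assumed and actual validation event rates.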
Figure 1 shows the corresponding receiver operating characteristic curves from 3 different modeling strategies. Figure 2 shows the proportions of participants who progressed to CKD during the study period by total risk scores in the 2 categorical models in the development and validation data sets. The estimated event rates from the combined data set are given in a user-friendly format in Table 4.
For the sensitivity analysis, the use of a definition of incident disease that included a GFR less than 60 mL/min/1.73 m2 and at least a 10 mL/min decrease in GFR from baseline did not markedly change the approximate 0.70 AUC of the model. Cox proportional hazards regression analysis for censored data also yielded results similar to those of logistic regression analysis, including the identical scoring system (data not shown).
We developed a simple scoring algorithm that stratifies persons at risk of developing clinically significant CKD. This prediction rule, which we believe to be the first of its kind, translates a parsimonious set of medical and demographic characteristics into a mean likelihood of developing CKD among middle-aged and older adults with 4 to 9 years of follow-up. These characteristics are often present together and cumulatively affect the risk of kidney disease. Most of the characteristics—age, DM, hypertension, and CVD (divided into PVD, heart failure, and coronary artery disease)—are easily identified by health care practitioners and by the general public. Most other variables (eg, serum lipid and hemoglobin levels) are also frequently checked by health care practitioners.
Risk scores have recently been emphasized as practical tools to help stratify individuals at increased risk of having kidney disease.1,2 In previous studies,17,18 a prediction rule was developed and validated that identified individuals likely to have existing CKD. That analysis used cross-sectional data and could not predict future disease. Herein, we report a tool for prediction of kidney disease akin to the Framingham risk score, a widely used cardiovascular scoring system that is used extensively to stratify individuals and guide therapy in diverse settings. Clinicians can use our scoring system to communicate expected risk among patients and to facilitate discussions about possible preventive strategies. This algorithm also has potential public health applications. It can be posted on medical Web sites for the public to access or may be used in community settings to identify individuals who may wish to be referred to health care practitioners. The identification of high-risk individuals using this scoring system can also optimize the benefit and cost-effectiveness of targeted screening and monitoring. At the least, the prediction rule can be used in concert with other public health initiatives to increase the awareness of CKD, which has been traditionally low.30,31
The 2 categorical models, the best fitting and the simplified, were comparable in terms of discrimination (ie, distinguishing cases vs noncases), with the best-fitting model having greater statistical fit. When information on laboratory variables such as HDL cholesterol concentration is unavailable, the use of the simplified model would be appropriate. Based on the diagnostic characteristics, we recommend using a cut point of 5 with the best-fitting model and a cut point of 3 with the simplified model. Because the score predicts future events, the immediate response to a high test score would likely involve increased frequency of screening, intensive management of risk factors (DM and hypertension), and possibly lifestyle modifications, as well as further laboratory testing. In our opinion, the benefits of such interventions far outweigh their minimal risk (false-positive prediction of renal disease) and may simultaneously serve to reduce cardiovascular risk.
In this investigation, we purposefully chose to define CKD for the prediction equation using a GFR less than 60 mL/min/1.73 m2 (rather than <90 mL/min/1.73 m2) for several reasons. First, we wanted to minimize the detection of individuals with age-related physiologic decline in kidney function. Nevertheless, age adds the greatest predictive capability of all the variables; in fact, the other variables add about 5% additional predictive capacity to the equation. Second, it has become increasingly clear to practicing clinicians that a GFR of 60 mL/min/1.73 m2 represents a practical threshold for clinical action. Third, the MDRD estimation formula, derived among individuals with a baseline GFR less than 60 mL/min/1.73 m2, is most accurate for individuals with stage 3 or higher CKD.32
We excluded the baseline level of kidney function from the prediction model because of clinical and methodological concerns. Clinically, the algorithm is designed to help health care practitioners and potential patients by focusing on variables that are available without prior serologic testing. Methodologically, the use of an estimated GFR term as a potential exposure variable and as an outcome would cause collinearity, potentially introducing bias.
The finding of increased risk of CKD among white subjects is noteworthy. While there are well-documented ethnic/racial differences in incident and prevalent ESKD33 and in different stages of CKD, some racial/ethnic differences observed in the prevalence of CKD may be related to variation in the rates of progression among black subjects and white subjects. In National Health and Nutrition Examination Survey (NHANES) III and NHANES 1999-2000 data, black subjects had a lower age-adjusted prevalence of CKD than white subjects.31 In the NHANES 1999-2002 data used in this study,31 the prevalence of CKD was similarly higher among white subjects compared with black subjects. Baseline results from the renal Reasons for Geographic and Racial Difference in Stroke cohort support these findings.34 Furthermore, in the present CHS data set, fewer follow-up visits for black subjects may translate into fewer opportunities to capture incident cases of CKD. Speculatively, the competing risk of death may be greater than the risk of CKD among black subjects. Finally, different statistical weights are assigned to the 2 racial/ethnic groups in the MDRD formula.
There are several important limitations to our study. First, no qualitative or quantitative urinary indexes were available at the baseline visit. Unlike a previous study17 that predicted prevalent renal disease, this analysis could not include proteinuria as a variable. Low levels of proteinuria and hematuria are important clues to the presence of underlying kidney disease, particularly glomerulonephritis and diabetic nephropathy and sometimes autosomal dominant polycystic kidney disease. Therefore, it is possible that participants in ARIC or the CHS with preserved renal function but who had proteinuria, hematuria, or both were included in the baseline cohort of participants. Nevertheless, our scoring system would still be useful in identifying those individuals with urinary abnormalities who are at risk for deterioration in GFR (eg, those with DM, hypertension, and vascular disease).
A second limitation of the scoring system is our inability to include family history of kidney disease in the model. Most epidemiological studies (eg, ARIC, CHS, Framingham, and NHANES) do not include questions about family history of CKD. More recent studies such as the Kidney Early Evaluation Program35 have begun surveying family history among targeted populations with a high burden of ESKD. In the future, we anticipate that national or community-based surveys will add family history of kidney disease to better assess its effect on CKD. In studies36,37 of ESKD, family history has been demonstrated to modify the effect of DM and hypertension; therefore, inclusion of family history may alter our prediction model.
Third, we based our diagnosis of CKD on a single estimate of GFR, which we acknowledge tends to overestimate the incidence of kidney disease. Estimated GFR measurements exhibit a high degree of intraindividual variability and ideally require second measurements to accurately represent kidney function.38 The use of successive GFR measurements, had they been available, would likely have reduced the incidence of CKD but should not have affected the association of the predictor variables with the outcome. Furthermore, most studies of CKD, epidemiologic and interventional, use single serum creatinine measurements.
Fourth, this risk prediction model applies primarily to groups defined by a parsimonious set of clinically relevant variables rather than directly to individuals. This is a limitation common to all risk prediction models.39 Indeed, prevention of CKD and ESKD may require population-based interventions that are beyond the control of individual physicians and patients (eg, the risk contribution of CVD).40 We caution that our risk prediction rule serves only as a guideline and should not be taken as an absolute definition of high risk.
Several strengths of the study are worthy of emphasis. This analysis uses 2 well-studied and representative community-based cohorts (ARIC and CHS), albeit with somewhat different follow-up periods. In addition, the complementary age of the participants of the 2 cohorts provides an age range that mirrors the age range of most individuals who are at risk of developing CKD. Finally, to our knowledge, this is the first prediction model for incident CKD.
Reaching ESKD represents a serious health event that may be significantly delayed or even prevented.3-5 The global burden of CKD is growing,41-43 with the incidence of ESKD more than doubling in Europe and the United States during the past 2 decades.33,44 Our study demonstrates that the risk of incident CKD is predictable with good accuracy in a middle-aged or older population bearing major risk factors for CKD. In clinical practice, the use of our scoring system would allow clinicians to easily stratify the renal disease risk of individual patients. Patients may also calculate their risk scores and probabilities of developing this asymptomatic disease during a 10-year period and query their primary care practitioners about their renal function. Furthermore, this scoring system may provide guidance for policymakers by calling attention to the multiple conditions that contribute to rising CKD and ESKD prevalences. Additional studies and validation of this prediction rule in various real-world settings (eg, high vs low risk, clinical vs community settings, and among different ethnic groups) will be necessary to rigorously assess its usefulness.
Correspondence: Abhijit V. Kshirsagar, MD, MPH, Division of Nephrology and Hypertension, Department of Medicine, School of Medicine, University of North Carolina at Chapel Hill, Campus Box 7155, 7017 Burnett-Womack Hall, Chapel Hill, NC 27599-7155 (firstname.lastname@example.org).
Accepted for Publication: September 28, 2007.
Author Contributions: Study concept and design: Kshirsagar and Bang. Analysis and interpretation of data: Kshirsagar, Bang, Vupputuri, Shoham, Kern, and Mazumdar. Drafting of the manuscript: Kshirsagar, Bang, and Bomback. Critical revision of the manuscript for important intellectual content: Kshirsagar, Bomback, Vupputuri, Shoham, Klemmer, Mazumdar, and August. Statistical analysis: Bang, Vupputuri, Shoham, and Mazumdar. Obtained funding: Bang and Mazumdar. Study supervision: Kshirsagar and Bang.
Financial Disclosure: None reported.
Funding/Support: This study was supported by the University of North Carolina Kidney Center and by Clinical and Translational Science Award UL1-RR024996 to Drs Bang and Mazumdar.
Disclaimer: The ARIC and the CHS are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with ARIC and CHS investigators, respectively. The manuscript was prepared using limited access data sets obtained from the NHLBI, and the conclusions do not necessarily reflect the opinions or views of ARIC, the CHS, or the NHLBI.
Additional Contributions: Kylie Braynt, MS, provided CHS data setup, and Sean Coady, MA, assisted with data issues and clarifications. The staff and participants of ARIC and the CHS contributed valuable information for health research.