eAppendix 2 lists the inclusion and exclusion criteria for title and abstract review. Specific exclusion codes were not recorded at the abstract level.
Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, Kripalani S. Risk Prediction Models for Hospital Readmission: A Systematic Review. JAMA. 2011;306(15):1688-1698. doi:10.1001/jama.2011.1515
Author Affiliations: VA Evidence-Based Synthesis Program (Dr Kansagara and Ms Freeman), Department of General Internal Medicine (Drs Kansagara and Kagen), Portland Veterans Affairs Medical Center, Portland, Oregon; Department of Internal Medicine, Oregon Health & Science University, Portland (Drs Kansagara, Englander, and Kagen); Geriatric Research, Education and Clinical Center, VA Tennessee Valley Healthcare System, Nashville (Dr Salanitro); and Section of Hospital Medicine, Division of General Internal Medicine and Public Health, Department of Medicine, Vanderbilt University, Nashville, Tennessee (Drs Salanitro, Theobald, and Kripalani).
Context: Predicting hospital readmission risk is of great interest to identify which patients would benefit most from care transition interventions, as well as to risk-adjust readmission rates for the purposes of hospital comparison.
Objective: To summarize validated readmission risk prediction models, describe their performance, and assess suitability for clinical or administrative use.
Data Sources and Study Selection: The databases of MEDLINE, CINAHL, and the Cochrane Library were searched from inception through March 2011, the EMBASE database was searched through August 2011, and hand searches were performed of the retrieved reference lists. Dual review was conducted to identify studies published in the English language of prediction models tested with medical patients in both derivation and validation cohorts.
Data Extraction: Data were extracted on the population, setting, sample size, follow-up interval, readmission rate, model discrimination and calibration, type of data used, and timing of data collection.
Data Synthesis: Of 7843 citations reviewed, 30 studies of 26 unique models met the inclusion criteria. The most common outcome used was 30-day readmission; only 1 model specifically addressed preventable readmissions. Fourteen models that relied on retrospective administrative data could be potentially used to risk-adjust readmission rates for hospital comparison; of these, 9 were tested in large US populations and had poor discriminative ability (c statistic range: 0.55-0.65). Seven models could potentially be used to identify high-risk patients for intervention early during a hospitalization (c statistic range: 0.56-0.72), and 5 could be used at hospital discharge (c statistic range: 0.68-0.83). Six studies compared different models in the same population and 2 of these found that functional and social variables improved model discrimination. Although most models incorporated variables for medical comorbidity and use of prior medical services, few examined variables associated with overall health and function, illness severity, or social determinants of health.
Conclusions: Most current readmission risk prediction models that were designed for either comparative or clinical purposes perform poorly. Although in certain settings such models may prove useful, efforts to improve their performance are needed as use becomes more widespread.
An increasing body of literature attempts to describe and validate hospital readmission risk prediction tools. Interest in such models has grown for 2 reasons. First, transitional care interventions may reduce readmissions among chronically ill adults.1-3 Readmission risk assessment could be used to help target the delivery of these resource-intensive interventions to the patients at greatest risk. Ideally, models designed for this purpose would provide clinically relevant stratification of readmission risk and give information early enough during the hospitalization to trigger a transitional care intervention, many of which involve discharge planning and begin well before hospital discharge. Second, there is interest in using readmission rates as a quality metric. The Centers for Medicare & Medicaid Services (CMS) recently began using readmission rates as a publicly reported metric and has plans to lower reimbursement to hospitals with excess risk-standardized readmission rates.4 Valid risk adjustment methods are required for calculation of risk-standardized readmission rates, which could be used for hospital comparison, public reporting, and reimbursement determinations. Models designed for these purposes should have good predictive ability; be deployable in large populations; use reliable data that can be easily obtained; and use variables that are clinically related to and validated in the populations in which use is intended.5
This systematic review was performed to synthesize the available literature on validated readmission risk prediction models, describe their performance, and assess their suitability for clinical or administrative use.
We searched Ovid MEDLINE, CINAHL, and the Cochrane Library (Central Trial Registry, Systematic Reviews, and Abstracts of Reviews of Effectiveness) from database inception through March 2011, and EMBASE through August 2011, for studies published in the English language of readmission risk prediction models in medical populations. All citations were imported into an electronic database (EndNote X2, Thomson Reuters, New York, NY). The search strategies are provided in detail in eAppendix 1.
All of the authors reviewed the citations and abstracts identified from electronic literature searches using the eligibility criteria shown in eAppendix 2. Full-text articles of potentially relevant references were retrieved and each was independently assessed for eligibility by 2 of the authors. Eligible articles were published in English and evaluated the ability of statistical models to predict hospital readmission risk. Because a set of predictive factors derived in only 1 population may lack validity and applicability,6 we included only studies of models that were tested in both a derivation and a validation cohort, even if these results were presented in separate reports. We neither prespecified the method of validation, nor excluded studies in which the derivation and validation cohorts were drawn from the same population (ie, split-half validation). We did not limit studies by diagnosis within medical populations. We excluded studies that focused on psychiatric, surgical, and pediatric populations because factors contributing to readmission risk might be considerably different in these patient groups. Finally, we excluded studies from developing nations because these were unlikely to provide directly applicable results.
From each study, we abstracted the following: population characteristics, setting, number of patients in the derivation and validation cohorts, timeframe of readmission outcome, readmission rate, range of readmission rates according to predicted risk, and model discrimination. To facilitate a high-level comparison of predictor variables, we grouped final model variables into 1 of 6 categories (medical comorbidity, mental health comorbidity, illness severity, prior use of medical services, overall health and function, and sociodemographic and social determinants of health).7
To characterize the practical utility of each model, 2 of the authors independently abstracted the type of data used and the timing of data collection from each study. Disagreements between reviewers about these classifications were resolved through group discussion. Data type consisted of administrative, primary (eg, survey, chart review), or both. Regarding timing, we classified a model as using real-time data if the variables would be available on or shortly after index hospital admission, and as using retrospective data if the variables would not be available early during a hospitalization. For example, a model using prior health care use and data from patient surveys conducted early during a hospitalization would be classified as using real-time data, while a model using length of stay or discharge diagnostic codes for the index hospitalization would be classified as using retrospective data. Because of coding delays, models relying on administrative codes from index hospital admissions were considered retrospective.
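The timing rule described above can be sketched in code. This is only an illustration of the stated rule, not the procedure the reviewers used (classification was done by dual review and group discussion). The variable names and the `RETROSPECTIVE_ONLY` set are hypothetical labels chosen for this example.

```python
# Hypothetical variable labels for data that, per the rule above, would not be
# available early during the index hospitalization.
RETROSPECTIVE_ONLY = {
    "length_of_stay",
    "discharge_diagnosis_codes",
    "index_admission_administrative_codes",
}


def classify_timing(variables):
    """Return 'retrospective' if any model variable is unavailable early
    during the index hospitalization, otherwise 'real-time'."""
    if any(name in RETROSPECTIVE_ONLY for name in variables):
        return "retrospective"
    return "real-time"


# Prior health care use plus an early patient survey: available at admission.
early_model = classify_timing(["prior_hospitalizations", "early_patient_survey"])

# Length of stay is only known at discharge, so the model is retrospective.
late_model = classify_timing(["comorbidity_index", "length_of_stay"])
```

A single late-arriving variable is enough to make the whole model retrospective under this reading, because the model cannot be scored until all of its inputs exist.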
The c statistic, with 95% confidence intervals when available, was used to describe model discrimination. The c statistic, which is equivalent to the area under the receiver operating characteristic curve, is defined as the proportion of times the model correctly discriminates a pair of high- and low-risk individuals.8 A c statistic of 0.50 indicates that the model performs no better than chance; a c statistic of 0.70 to 0.80 indicates modest or acceptable discriminative ability; and a c statistic of greater than 0.80 indicates good discriminative ability.9,10 If the c statistic was not reported, we abstracted other operational statistics such as sensitivity, specificity, and predictive values for representative risk score cutoffs when available. Model calibration is the degree to which predicted rates are similar to those observed in the population. To describe model calibration, we report the range of observed readmission rates from the predicted lowest to highest risk groupings.
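As a minimal sketch of the pairwise definition, the c statistic can be computed by comparing every readmitted patient with every patient who was not readmitted. The function name, the toy data, and the convention of counting tied predictions as half a correct pair are assumptions made for this illustration; none of the included studies is known to have used this exact code.

```python
from itertools import product


def c_statistic(predicted_risk, readmitted):
    """Proportion of (readmitted, not readmitted) patient pairs in which the
    readmitted patient received the higher predicted risk.
    Tied predictions count as half a correct pair (an assumed convention)."""
    cases = [p for p, y in zip(predicted_risk, readmitted) if y == 1]
    controls = [p for p, y in zip(predicted_risk, readmitted) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one patient with and one without readmission")
    score = 0.0
    for case, control in product(cases, controls):
        if case > control:
            score += 1.0
        elif case == control:
            score += 0.5
    return score / (len(cases) * len(controls))


# Toy data: 6 patients, 3 of whom were readmitted (1 = readmitted).
risks = [0.10, 0.20, 0.30, 0.40, 0.60, 0.80]
outcomes = [0, 0, 1, 0, 1, 1]
c = c_statistic(risks, outcomes)  # 8 of the 9 pairs are ordered correctly
```

The loop is quadratic in cohort size, which is fine for illustration; rank-based formulas give the same value for large cohorts.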
To guide our methodological assessment of included studies, we adapted elements (including cohort definition, follow-up, adequacy of prognostic and outcome variable measurement, and the validation method) from a prognosis study quality tool and clinical decision rule assessment tool (eTable).6,11
The included studies were too heterogeneous to permit meta-analysis. Therefore, we qualitatively synthesized results, focusing on model discrimination, the populations in which the model has been tested, practical aspects of model implementation, and the types of variables included in each model.
From 7843 titles and abstracts, 286 articles were selected for full-text review (Figure). Of these, 30 studies of 26 unique models across a broad variety of settings and patient populations met our inclusion criteria (Table 1, Table 2, and Table 3). Most studies (n = 23) were based on US health care data. The remainder were from Australia (n = 2), England (n = 2), Ireland (n = 1), Switzerland (n = 1), or Canada (n = 1). Fourteen studies included only patients aged 65 years or older. Of these, 7 relied solely on Medicare administrative data. Four studies used Veterans Affairs' data.
Total sample size ranged from 173 patients to more than 2.7 million patients. The outcome of 30-day readmission was reported most commonly, although some models chose other follow-up intervals ranging from 14 days to 4 years. Among 21 studies reporting c statistics (Table 1, Table 2, and Table 3), values ranged from 0.55 to 0.83, but only 6 studies reported a c statistic above 0.70, indicating modest discriminative ability. Performance was similar between studies using split-sample validation methods (n = 21; c statistic range: 0.59-0.75), and those that used external validation methods (n = 9; c statistic range: 0.53-0.83). Among models that analyzed the relationship between risk categories and actual readmission rates, a substantial gradient in readmission rate was present between patients at the lowest and at the highest risk level. For example, among 6 models using 30-day readmission as an outcome, the lowest and highest risk groups differed by 20.4 to 34.5 percentage points in their actual readmission rates.
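The gradient between the lowest and highest risk groups can be illustrated with a short sketch: patients are ranked by predicted risk, split into equal-sized groups, and the observed readmission rate is computed in each group. The function names, the equal-sized grouping, and the toy data are assumptions for this example; the included studies defined their risk groupings in different ways.

```python
def observed_rates_by_risk_group(predicted_risk, readmitted, n_groups=5):
    """Rank patients by predicted risk, split them into n_groups groups of
    near-equal size, and return the observed readmission rate in each group,
    ordered from lowest to highest predicted risk."""
    ranked = sorted(zip(predicted_risk, readmitted), key=lambda pair: pair[0])
    n = len(ranked)
    if n < n_groups:
        raise ValueError("need at least one patient per risk group")
    rates = []
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        rates.append(sum(outcome for _, outcome in group) / len(group))
    return rates


def gradient_percentage_points(rates):
    """Difference between the highest and lowest risk groups, in percentage points."""
    return 100.0 * (rates[-1] - rates[0])


# Toy data: 10 patients split into a lower-risk and a higher-risk half.
risks = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
outcomes = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
rates = observed_rates_by_risk_group(risks, outcomes, n_groups=2)
gap = gradient_percentage_points(rates)  # about 40 percentage points here
```

A model can show a wide gradient like this while still having a modest c statistic, because the gradient compares only the extreme groups rather than every pair of patients.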
Fourteen models were based on retrospective administrative data and could potentially be used for hospital comparison purposes (Table 1). Most of these included variables for medical comorbidity and use of prior medical services, but few considered mental health, functional status, and social determinant variables (Table 4). The 3 models with c statistics of 0.70 or higher were developed and tested in large European or Australian cohorts. One examined the risk of 2 or more unplanned readmissions for all hospitalized patients in England, including pediatric and obstetric patients, for 1 calendar year.13 A Swiss study17 examined potentially preventable readmissions. An Australian model incorporating more than 100 medical comorbidities and administrative social determinant variables performed at a modest level in asthma patients, but poorly in patients with myocardial infarction.20
The 9 large population-based or multicenter US studies generally had poor discriminative ability (c statistic range: 0.55-0.65). The CMS used a methodologically rigorous process to create 3 models for congestive heart failure, acute myocardial infarction, and pneumonia admissions based on hierarchical condition categories, which are groups of related comorbidities.14-16 All 3 models showed relatively poor ability to predict 30-day all-cause readmissions (c statistics: 0.61 for congestive heart failure, 0.63 for acute myocardial infarction, and 0.63 for pneumonia). In a recent study, the CMS heart failure model and an older heart failure model fared similarly (c statistics: 0.59 and 0.61, respectively).18,23 The other 4 US models have limited generalizability; for example, one model captured readmissions to 1 medical center only,24 and the other models were developed more than 2 decades ago.12,22,25
Three administrative data–based models were designed to identify high-risk patients in real time to potentially facilitate targeted interventions (Table 2). A model with modest discriminative ability (c statistic: 0.72; 95% CI, 0.70-0.75) examined 30-day heart failure readmissions in a single urban US health system with a large socioeconomically disadvantaged population.26 It incorporated variables from an automated electronic medical record system, including numerous social factors such as number of address changes, census tract socioeconomic status, history of cocaine use, and marital status. The only study that focused specifically on Medicaid enrollees used a risk score range of 0 to 100 for 12-month readmissions and found that patient cost profiles varied widely with risk score.27 Finally, a British model used data on use of prior medical services and comorbidity, and also controlled for observed and expected hospital readmission rates, but predictive ability remained modest (c statistic: 0.69).28
Nine models incorporated survey or chart review data and could potentially be used for clinical intervention purposes, although 5 used data unlikely to be available early during a hospitalization (Table 2). The best performing of these models used administrative data on comorbidity and prior use of medical services (c statistic: 0.77) along with functional status data (c statistic: 0.83) from the Medicare Beneficiaries Survey to predict a composite outcome of hospital readmissions and nursing home transfers.29 The survey was not routinely administered during index hospitalization and it is unclear to what extent the use of retrospective survey data affects the predictive ability of the model. Similarly, a medical record study in Ireland retrospectively applied a 9-item questionnaire, including items such as discharge polypharmacy, and performed modestly well (c statistic: 0.70).31 A simple Canadian model used medical comorbidities up through index hospital discharge along with index hospital length of stay and prior use of medical services (c statistic: 0.68; 95% CI, 0.65-0.71).35 Increasing scores on another 4-item model of medical comorbidities, prior use of medical services, and levels of creatinine at discharge were associated with increasing readmission rates in patients with heart failure.30
Four models incorporated primary data collected in real time (Table 3). Only 2 of these models have been tested in contemporary populations; the others were tested more than 2 decades ago. One survey-based model developed at 6 academic hospitals included social determinant, comorbidity, prior use of medical services, and self-rated health variables, but had poor predictive ability (c statistic: 0.61).38 The Probability of Repeated Admission is a simple 8-item survey tool developed in older Medicare beneficiaries; however, it also had poor predictive ability across several studies (c statistic range: 0.56-0.61; 95% CI, 0.44-0.67).39-41
A comparison of the types of variables considered for and included in the final models can provide some information about the contribution of different types of variables to readmission risk prediction (Table 4). Nearly all studies included medical comorbidity data and many included variables for prior use of medical services, usually prior hospitalizations. Basic sociodemographic variables such as age and sex were considered by most studies but, in many instances, these variables did not contribute enough to be included in the final model. Table 4 also highlights important gaps in model development in that few studies considered variables associated with illness severity, overall health and function, and social determinants of health.
Six studies compared the performance of different models within the same population and offer further insights about the incremental value of different types of variables (Table 5). Amarasingham et al26 found a model based on automated electronic medical records that incorporated sociodemographic factors such as drug use and housing discontinuities was more predictive than comorbidity-based models. Coleman et al29 found that the inclusion of variables such as functional status from survey data improved model performance slightly compared with the use of medical services and comorbidity-based administrative data alone (c statistics: 0.83 vs 0.77, respectively). A large Swiss study of potentially preventable readmission risk compared a simple nonclinical model, a Charlson comorbidity–based model, and a more complex hierarchical diagnosis and procedures-based model called SQLape (Striving for Quality Level and Analyzing of Patient Expenditures), and found small differences among them (c statistics: 0.67, 0.69, and 0.72, respectively).17
Other comparative studies found little difference among models. Clinical data such as laboratory and physiological variables from medical records or registries did not enhance performance of claims-only CMS models.14-16,31 A US study of older patients found that an intricate International Classification of Diseases, Ninth Revision code-based disease complexity system added little discriminative ability to a poorly performing Health Care Financing Administration model.22 Finally, Allaudeen et al40 found internal medicine interns using a gestalt approach predicted readmissions with a similarly poor level of ability as an older, established survey-based model (ie, Probability of Repeated Admission) in a small, single-center cohort.
Only 1 model attempted to explicitly define and identify potentially preventable readmissions.46 Investigators conducted a systematic medical record review to define potentially preventable readmissions and develop an administrative data–based algorithm. A subsequent Swiss study compared the performance of 3 models in predicting readmissions according to their algorithm.17
In this systematic review, we found 26 readmission risk prediction models of medical patients tested in a variety of settings and populations. Several are being applied currently in clinical, research, and policy arenas. Half of the models were largely designed to facilitate calculation of risk-standardized readmission rates for hospital comparison purposes. The other half were clinical models that could be used to identify high-risk patients for whom a transitional care intervention might be appropriate. Most models in both categories have poor predictive ability.
Readmission risk prediction remains a poorly understood and complex endeavor. Indeed, models of patient-level factors such as medical comorbidities, basic demographic data, and clinical variables are much better able to predict mortality than readmission risk.18,26,35 Broader social, environmental, and medical factors such as access to care, social support, substance abuse, and functional status contribute to readmission risk in some models, but the utility of such factors has not been widely studied.
It is likely that hospital and health system–level factors, which are not present in current readmission risk models, contribute to risk.47 For instance, the timeliness of postdischarge follow-up, coordination of care with the primary care physician, and quality of medication reconciliation may be associated with readmission risk.48,49 The supply of hospital beds may independently contribute to higher readmission rates.50 Finally, the quality of inpatient care could also contribute to risk,51 although the evidence is mixed.52 Although the inclusion of such hospital-level factors would conceivably improve the predictive ability of models, it would be inappropriate to include them in models that are used for risk-standardization purposes. Doing so would adjust hospital readmission rates for the very deficits in quality and efficiency that hospital comparison efforts seek to reveal, and which could be targets for quality improvement interventions.
Public reporting and financial penalties for hospitals with high 30-day readmission rates are spurring organizations to innovate and implement quality improvement programs.53,54 Nevertheless, the poor discriminative ability of most of the administrative models we examined raises concerns about the ability to standardize risk across hospitals to fairly compare hospital performance. Until risk prediction and risk adjustment become more accurate, it seems inappropriate to compare hospitals in this way and reimburse (or penalize) them on the basis of risk-standardized readmission rates. Others have reached similar conclusions,55 and also have expressed concern that such financial penalties could exacerbate health disparities by penalizing hospitals with fewer resources.56 Still others have argued that readmission rate is an incomplete accountability measure that fails to consider “the real outcomes of interest—health, quality of life, and value.”57
Use of readmission rates as a quality metric assumes that readmissions are related to poor quality care and are potentially preventable. However, the preventability of readmissions remains unclear and understudied. We found only 1 validated prediction model that explicitly examined potentially preventable readmissions as an outcome, and it found that only about one-quarter of readmissions were clearly preventable.17 A recent systematic review of 34 studies found wide variation in the percentage of readmissions considered preventable and estimates ranged from 5% to 79% (median, 27%).58 More work is needed to develop readmission risk prediction models with an outcome of preventable readmissions. This could not only improve risk-standardization efforts, but also allow hospitals to better focus limited clinical resources in readmission avoidance programs.
As with models that are used for risk-standardization, readmission risk models that are intended for clinical use also have certain requirements and limitations. Clinical models would ideally provide data prior to discharge, discriminate high- from low-risk patients, and would be adapted to the settings and populations in which they are to be used. Few models met all these criteria, and only 1 of these (a single-center study) had acceptable discriminative ability.26 As with the risk-adjustment models, most of the models developed for clinical purposes had poor predictive ability, although notable exceptions suggest the addition of social or functional variables may improve overall performance.26,29
The best choice of model may depend on setting and the population being studied. The success of some models in certain populations and the lack of success of others suggest that the patient-level factors associated with readmission risk may differ according to the population studied. For example, while medical comorbidities may account for a large proportion of risk in some populations, social determinants may disproportionately influence risk in socioeconomically disadvantaged populations. Our review found that few models have incorporated such variables.
Even though the overall predictive ability of the clinical models was poor, we did find that high- and low-risk scores were associated with a clinically meaningful gradient of readmission rates. This is important given resource constraints and the need to selectively apply potentially costly care transition interventions. Even limited ability to identify a proportion of patients at risk for future high-cost medical services use can increase the cost-effectiveness of such programs.28,59
Of note, few models incorporated clinically actionable data that could be used to triage patients to different types of interventions. For example, marginally housed patients or those struggling with substance abuse might require unique discharge services. Relatively simple, practical models that use real-time clinically actionable data, such as the Project BOOST model, have been created, but their performance has not yet been rigorously validated.60
Our review concurs with and adds to the findings of several other reviews that found deficiencies in risk prediction models. One recent review limited to US studies examined general risk factors for preventable readmissions, but did not search explicitly for validated models, and many of the included studies had poor study designs.61 The study's authors suggested that measures of poor health such as comorbidity burden, prior medical services use, and increasing age were associated with readmissions. Three other reviews focused on specific diagnoses and found few readmission risk models for heart failure,55 chronic obstructive pulmonary disease,62 and myocardial infarction.63
Our review has certain limitations. We included studies outside of the United States, given that portions of US health care may resemble other countries' health systems, but applicability of models from other countries to the United States may still be limited. Our classifications of data types, data collection timing, and the intended use of each model are subject to interpretation, but we attempted to mitigate subjectivity by using a dual-review and consensus process. Finally, few studies directly compared models within the same population, and summary statistics such as the c statistic should not be used to directly compare models across different populations.
Additional research is needed to assess the true preventability of readmissions in US health systems. Given the broad variety of factors that may contribute to preventable readmission risk, models that include factors obtained through medical record review or patient report may be valuable. Innovations to collect broader variable types for inclusion in administrative data sets should be considered. Future studies should assess the relative contributions of different types of patient data (eg, psychosocial factors) to readmission risk prediction by comparing the performance of models with and without these variables in a given population. These models should ideally be based on population-specific conceptual frameworks of risk. Implementation of risk stratification models and their effect on work flow and resource prioritization should be assessed in a broad variety of hospital settings. Also, given that many models have limited predictive ability and may require some investment of time and cost to implement, future studies should further evaluate the relative value of clinician gestalt compared with predictive models in assessing readmission risk.
In summary, readmission risk prediction is a complex endeavor with many inherent limitations. Most models created to date, whether for hospital comparison or clinical purposes, have poor predictive ability. Although in certain settings such models may prove useful, better approaches are needed to assess hospital performance in discharging patients, as well as to identify patients at greater risk of avoidable readmission.
Corresponding Author: Devan Kansagara, MD, MCR, Portland Veterans Affairs Medical Center, Mailcode RD71, 3710 SW US Veterans Hospital Rd, Portland, OR 97239 (firstname.lastname@example.org).
Author Contributions: Dr Kansagara had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Kansagara, Englander, Theobald, Kripalani.
Acquisition of data: Kansagara, Englander, Salanitro, Kagen, Theobald, Freeman, Kripalani.
Analysis and interpretation of data: Kansagara, Englander, Salanitro, Kagen, Theobald, Kripalani.
Drafting of the manuscript: Kansagara, Englander, Salanitro, Kripalani.
Critical revision of the manuscript for important intellectual content: Kansagara, Englander, Salanitro, Kagen, Theobald, Freeman, Kripalani.
Administrative, technical, or material support: Freeman.
Study supervision: Kansagara, Kripalani.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This report is based on research conducted by the Evidence-Based Synthesis Program Center located at the Portland VA Medical Center, and funded by the Department of Veterans Affairs and the Veterans Health Administration, Office of Research and Development, Health Services Research and Development. The research also was funded in part by Vanderbilt CTSA grant 1 UL1 RR024976 from the National Center for Research Resources, National Institutes of Health.
Role of the Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.
Disclaimer: The findings and conclusions in this report are those of the authors who are responsible for its contents; the findings and conclusions do not necessarily represent the views of the Department of Veterans Affairs or the US government. Therefore, no statement in this article should be construed as an official position of the Department of Veterans Affairs.
Additional Contributions: We thank Rose Relevo, MLS, MS, AHIP, research librarian (Oregon Health & Science University), for constructing and deploying the search strategy, as well as Tomiye Akagi, BA, administrative assistant (Portland VA Medical Center). We also thank Ed Vasilevskis, MD, Frank Harrell, PhD, Art Wheeler, MD, and Italo Biaggioni, MD (all 4 with Vanderbilt University) for critically reviewing a draft of the manuscript. Dr Wheeler was compensated by the Vanderbilt CTSA grant. Drs Vasilevskis, Harrell, and Biaggioni did not receive compensation for their contributions.