Imperiale TF, Dominitz JA, Provenzale DT, Boes LP, Rose CM, Bowers JC, Musick BS, Azzouz F, Perkins SM. Predicting Poor Outcome From Acute Upper Gastrointestinal Hemorrhage. Arch Intern Med. 2007;167(12):1291-1296. doi:10.1001/archinte.167.12.1291
Copyright 2007 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Uncertainty about the outcome of acute upper gastrointestinal bleeding often results in a longer-than-necessary hospital stay.
We derived and internally validated clinical prediction rules (CPRs) to predict outcome from upper gastrointestinal bleeding. This multisite, prospective cohort study involved consecutive patients admitted for acute upper gastrointestinal bleeding. Multivariate logistic regression was used to derive CPRs on two thirds of the cohort (derivation set) that predicted bleeding-specific outcomes (rebleeding, need for urgent surgery, or hospital death [poor outcome 1]) and bleeding-specific outcomes plus new or worsening comorbidity (poor outcome 2). Both CPRs were then tested on the remaining third of the cohort (validation set).
A total of 391 individuals (99% men; mean age, 63.4 years) were enrolled, of whom 4.6% rebled and 3.1% died. Independent predictors of poor outcome 1 were APACHE (Acute Physiology and Chronic Health Evaluation) II score of 11 or greater, esophageal varices, and stigmata of recent hemorrhage. Predictors of poor outcome 2 were these 3 factors plus unstable comorbidity on admission. Of patients with no risk factors, only 1 (1.1%) of 92 experienced poor outcome 1 and only 6 (6.2%) of 97 experienced poor outcome 2. Risks in the validation set were comparable. The CPRs identified 37.8% and 32.2% of patients in the derivation and validation sets, respectively, who were eligible for a shorter hospital stay.
Patients admitted with acute upper gastrointestinal bleeding were unlikely to have a poor outcome if these risk factors were absent. These CPRs might make hospital management more efficient by identifying low-risk patients for whom early hospital discharge is possible.
Acute upper gastrointestinal (GI) hemorrhage is a life-threatening condition that results in 250 000 to 300 000 hospitalizations and 15 000 to 30 000 deaths per year in the United States.1,2 More than $2.5 billion is spent annually for inpatient care of this problem.2 Rates of major morbidity and mortality are 10% to 12% and 8% to 10%, respectively, and they have remained fairly constant during the past 40 years. Considerable variation in resource use for management has been demonstrated.2-4
Most patients are at low risk for further hemorrhage and mortality, but many remain in the hospital for several days beyond the high-risk period. Less common are patients readmitted soon after discharge for rebleeding or a delayed complication of the index hospitalization. Both scenarios represent inefficiency resulting from uncertainty in predicting patient outcome. Accurate risk stratification would be useful for identifying low-risk patients, who might not require hospitalization or who could be discharged soon after admission; intermediate-risk patients, who do not need intensive care; and high-risk patients, who require aggressive care in a closely monitored setting.
Although several clinical prediction rules (CPRs) for risk stratification of patients with acute upper GI hemorrhage have been published,5-11 none have achieved widespread use in clinical practice. The reasons include insufficient validation,8-11 limited clinical applicability,5,7 complexity of use,6,10-12 and inability to identify patients at risk for non-GI comorbidity that requires continued inpatient care.5-11 We attempted to develop and internally validate 2 CPRs to stratify risk of poor outcome from acute upper GI hemorrhage: one to predict risk of GI complications and the other to predict risk of GI and non-GI complications. The goal was to create CPRs that were clinically relevant, valid, and easy to use.
This prospective cohort study involved Veterans Affairs Medical Centers in Durham, Indianapolis, and Seattle. The study protocol was approved by institutional review boards at each site. All the patients admitted to the hospital with acute upper GI hemorrhage were considered for enrollment if they met at least 1 criterion in Table 1. Patients were excluded for any of the following reasons: bleeding while hospitalized for another reason; inpatient transfer from another hospital; presence of a gastrostomy tube, esophageal stent, or other nonbiliary upper GI device; terminal illness affecting physician management of the bleeding episode; and use of warfarin or full-dose heparin.
During hospitalization, patients were followed up by research nurses (L.P.B., C.M.R., and J.C.B.) who abstracted admission and daily information, including demographic features; comorbidity; medications; results of laboratory, radiographic, and endoscopic tests; and treatments. Comorbidity was considered unstable if it was identified as a problem by the admitting physician and required orders directed toward diagnosis (eg, cardiac enzymes for chest pain) or treatment (eg, nebulized bronchodilators for wheezing). Daily progress notes were reviewed to determine whether complications occurred during hospitalization, including rebleeding, the need for advanced techniques to control bleeding (repeated endoscopy, angiography with embolization, transjugular intrahepatic portosystemic shunt insertion, and surgery), the development of new or worsened comorbidity (identified as a separate problem requiring treatment), or death. Patients were contacted by telephone at least 30 days after hospital discharge to determine whether they had rebled or were hospitalized anywhere within 30 days of discharge.
We defined 2 outcomes a priori. Poor outcome 1 is a GI hemorrhage–specific composite variable and includes rebleeding, the need for urgent surgery or an advanced technique to control hemorrhage (eg, radiographic embolization and transjugular intrahepatic portosystemic shunt), and all-cause hospital mortality. Poor outcome 2 is a more-inclusive composite outcome that includes poor outcome 1 plus new or worsening comorbidity, which was identified from a clinical diagnosis that developed subsequent to admission (eg, pneumonia or stroke) or a preexisting illness that required more than the patient's usual medications (eg, use of intravenous diuretics for congestive heart failure). For both outcomes, only major rebleeding was considered, defined as hematemesis, bloody nasogastric aspirate, or bleeding documented endoscopically, along with either hypotension or a decrease in the hematocrit value of more than 4% in 24 hours.
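The major rebleeding definition above can be read as a two-part test: evidence of bleeding plus evidence of a hemodynamic or hematologic consequence. The following is a minimal Python sketch of that reading, not the study's data collection logic. The class and field names are hypothetical, hypotension is passed in as a flag rather than derived from a blood pressure threshold, and the decrease of "more than 4%" is assumed here to mean hematocrit percentage points.

```python
from dataclasses import dataclass


@dataclass
class BleedingObservation:
    """Hypothetical record of one 24-hour observation window."""
    hematemesis: bool
    bloody_ng_aspirate: bool
    endoscopic_bleeding: bool
    hypotension: bool
    hematocrit_drop_24h: float  # assumed unit: percentage points


def is_major_rebleeding(obs: BleedingObservation) -> bool:
    # Part 1: any one of the three forms of bleeding evidence.
    evidence = bool(
        obs.hematemesis or obs.bloody_ng_aspirate or obs.endoscopic_bleeding
    )
    # Part 2: hypotension OR a hematocrit decrease strictly greater than 4.
    consequence = bool(obs.hypotension or obs.hematocrit_drop_24h > 4.0)
    return evidence and consequence
```

Under these assumptions, hematemesis with a 5-point hematocrit decrease counts as major rebleeding, whereas hematemesis with a decrease of exactly 4 points and no hypotension does not.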
Study nurses completed standardized data collection forms for admission data, daily clinical information, endoscopic findings, and postdischarge follow-up. Descriptive analyses were conducted on the entire data set. Before univariate analysis, the data set was randomly divided into a derivation set containing two thirds of the observations and a validation set containing the remaining third.
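A random split of this kind can be sketched in a few lines of Python. This is an illustration only: the analyses were run in SAS, and the function name, the seed, and the shuffle-then-cut approach are assumptions rather than the procedure actually used.

```python
import random


def split_cohort(patient_ids, derivation_fraction=2 / 3, seed=0):
    """Randomly partition patient IDs into derivation and validation sets.

    The seed is arbitrary and serves only to make the sketch reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = round(len(ids) * derivation_fraction)
    return ids[:cut], ids[cut:]


# Example with 391 hypothetical patient IDs (the size of the enrolled cohort).
derivation, validation = split_cohort(range(391))
```

Splitting before any univariate screening keeps the validation set untouched by variable selection, which is the point of performing the split at this stage.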
From the derivation set, we built a logistic regression model using methods suggested by Hosmer and Lemeshow.13 Univariate analysis was used to identify candidate variables for multivariate analysis for both composite outcomes using a threshold of P≤.20. Candidate variables were then entered into forward and backward selection logistic regression to identify variables with significant main effects (α = .05). Likelihood ratio, score, and Wald statistics were examined for each variable in the model, and each variable coefficient was compared with the coefficient from the univariate model. All 3 statistics were consistent in value and significance. Model goodness of fit was examined by computing the deviance and the Hosmer-Lemeshow statistic and by examining outliers and influential groups through computation of residuals and Δβs.13,14 Model discrimination between patients with vs without poor outcomes was evaluated using the C statistic, which is the area under the receiver operating characteristic curve.15 Only the subgroup of 360 patients who underwent esophagogastroduodenoscopy was included in the multivariate analyses.
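For readers less familiar with the C statistic, it equals the proportion of all (poor outcome, good outcome) patient pairs in which the patient with the poor outcome has the higher predicted risk, with ties counted as one half. The Python sketch below computes it by direct pair counting; the function name and the toy data are illustrative assumptions and are not taken from the study data.

```python
def c_statistic(scores, outcomes):
    """Concordance (area under the ROC curve) by pairwise comparison.

    scores   -- predicted risk or risk score for each patient
    outcomes -- truthy if the patient had the poor outcome, falsy otherwise
    """
    poor = [s for s, y in zip(scores, outcomes) if y]
    good = [s for s, y in zip(scores, outcomes) if not y]
    if not poor or not good:
        raise ValueError("need at least one patient in each outcome group")
    concordant = 0.0
    for p in poor:
        for g in good:
            if p > g:
                concordant += 1.0   # poor-outcome patient ranked higher
            elif p == g:
                concordant += 0.5   # tie counts as half
    return concordant / (len(poor) * len(good))


# Toy example: four hypothetical patients with integer risk scores.
toy = c_statistic([0, 1, 1, 2], [0, 0, 1, 1])
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect separation of the two outcome groups.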
To facilitate clinical use of the CPRs, we decided a priori to assign 1 point for each independent predictor variable. Risks of poor outcomes 1 and 2 were then measured based on the number of predictor variables, with discrimination measured using the C statistic. Both models were tested on the remaining third of the cohort for validation (reproducibility). C statistics were compared between the derivation and validation subgroups. The nonparametric median test was used to compare the length of stay among the different risk groups. A statistical software program (SAS version 9.1; SAS Institute Inc, Cary, NC) was used for all analyses.
A total of 1034 patients were evaluated during the 35-month enrollment period; 643 patients were excluded because they did not meet the case definition of acute upper GI bleeding or they met the exclusion criteria: 231 from Durham, 212 from Indianapolis, and 200 from Seattle. Demographic features of the excluded patients and reasons for exclusion were comparable across study sites.
Three hundred ninety-one patients were enrolled: 134 from Durham, 110 from Indianapolis, and 147 from Seattle. Mean ± SD patient age was 63.4 ± 13.5 years, and 136 patients (34.8%) were older than 70 years. Seventy-one percent of the patients were white, 26% were black, and 99% were men. Clinical characteristics of the cohort are given in Table 2. Hypotension (systolic blood pressure <100 mm Hg) was present initially in 94 patients (24.0%). Esophagogastroduodenoscopy was performed in 360 patients (92.1%).
Clinical and demographic features were comparable among the 3 sites (Table 3). At all the sites, esophagogastroduodenoscopies were performed by staff gastroenterologists and fellows concurrently. Acid suppression was used initially in more than 90% of patients, with no difference in use among sites based on whether stigmata of recent hemorrhage (SRH) were present. Among patients with SRH, the proportions treated endoscopically were comparable across sites: 52% for Durham, 62% for Indianapolis, and 55% for Seattle. The frequencies of endoscopic treatment types (ie, injection, electrocoagulation, and ligation) were also comparable.
Thirty-one patients (7.9%) experienced major rebleeding (n = 18; 4.6%), required surgery for bleeding (n = 6), or died during the hospital stay (n = 12) (poor outcome 1), whereas 85 patients (21.7%) had major rebleeding, required surgery, died during hospitalization, or developed new or unstable comorbidity (poor outcome 2). Hospital mortality was 3.1% (n = 12); death within 30 days of hospital discharge occurred in 11 patients (2.8%).
Several variables were associated with each composite outcome (Table 4). Because only 19 patients (7.8%) from the derivation set of 244 experienced poor outcome 1, the number of candidate variables was limited to 3 to avoid deriving an overfitted model. The final model included SRH, esophageal varices, and APACHE (Acute Physiology and Chronic Health Evaluation) II score of 11 or greater (Table 5). The C statistic was 0.81, indicating good model discrimination, and the goodness-of-fit test indicated good model calibration (P = .80). Forward and backward elimination regression produced the same 3 variables. Validation on the remaining third of the cohort (n = 116), in which the risk of poor outcome 1 was 7%, produced a C statistic of 0.83, which was not statistically significantly different from that for the derivation set.
Based on the same derivation set of 244 patients, of whom 52 (21.3%) experienced poor outcome 2, 4 variables were identified: presence of unstable comorbidity on admission, APACHE II score of 11 or greater, SRH, and esophageal varices (Table 4). This model demonstrated good discrimination between patients with and without poor outcome 2 (C statistic = 0.78) and good calibration (P = .69). Forward and backward elimination regression again resulted in the same independent variables. In validation, in which the risk of poor outcome 2 was 22%, the C statistic was 0.76, which is no different statistically from that of the derivation group.
Using the independent variables, we created 2 CPRs with which to predict poor outcome. Results for both CPRs are given in Table 5. For poor outcome 1, 1 point was assigned to each predictor such that scores ranged from 0 to 3. Because the magnitude of risk of poor outcome 1 was comparable for patients with scores of 2 and 3, we combined these 2 subgroups, which resulted in 3 risk categories: low risk (a score of 0), intermediate risk (a score of 1), and high risk (a score of 2 or 3), with risks of 1.1%, 5.0%, and 25.5%, respectively, in the derivation group. We used the same procedures for poor outcome 2, assigning 1 point for each of 4 variables (the aforementioned 3 plus unstable comorbidity on admission). In this case, the high-risk category consisted of patients with scores of 2 or greater. Risks of poor outcome 2 for low-, intermediate-, and high-risk categories were 4.8%, 16.7%, and 46.5%, respectively.
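The scoring scheme described above is simple enough to express directly. The Python sketch below assigns 1 point per predictor and maps the total to the 3 risk categories; the function names and the lookup table layout are hypothetical, and the risk percentages are the derivation-group values quoted in this paragraph.

```python
# Derivation-group risks (percent) as reported in the text, by risk category.
DERIVATION_RISK_PERCENT = {
    "poor_outcome_1": {"low": 1.1, "intermediate": 5.0, "high": 25.5},
    "poor_outcome_2": {"low": 4.8, "intermediate": 16.7, "high": 46.5},
}


def cpr_score(apache_ii, varices, srh, unstable_comorbidity=None):
    """One point per predictor present.

    Leave unstable_comorbidity as None to score the poor outcome 1 rule
    (3 predictors); pass True or False to score the poor outcome 2 rule.
    """
    points = int(apache_ii >= 11) + int(bool(varices)) + int(bool(srh))
    if unstable_comorbidity is not None:
        points += int(bool(unstable_comorbidity))
    return points


def risk_category(score):
    """Scores of 2 or more are pooled into a single high-risk category."""
    if score < 0:
        raise ValueError("score cannot be negative")
    if score == 0:
        return "low"
    if score == 1:
        return "intermediate"
    return "high"


# Hypothetical patient: APACHE II score of 12, no varices, SRH present.
example = risk_category(cpr_score(12, varices=False, srh=True))
```

For this hypothetical patient the score is 2, which falls in the high-risk category for poor outcome 1.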
Risk scores for the 2 CPRs in the validation group are given in Table 6. Because the risks of both poor outcomes were comparable in the derivation and validation groups for each score level, we combined groups to increase the precision of the risk estimates. For the cohort of 360, risks of poor outcome 1 were 1.5%, 4.7%, and 24.7%, in the low-, intermediate-, and high-risk groups, respectively. For poor outcome 2, comparable risks were 5.7%, 17.5%, and 49.0%, respectively. Risk gradients for all 3 risk categories resulted in clinically meaningful and statistically significant separation of risk for both CPRs (Table 6). C statistics for the derivation group, validation group, and entire cohort were 0.81, 0.83, and 0.81, respectively, for poor outcome 1 and 0.78, 0.76, and 0.78, respectively, for poor outcome 2.
The CPRs were then dichotomized into a score of 0 vs 1 or more points (Table 7). The dichotomized CPR detected 25 of 27 patients with poor outcome 1 (sensitivity, 92.6%) and identified 136 patients (37.8%) with a score of 0 and a good outcome who were eligible for a shorter hospital stay. For poor outcome 2, the dichotomized CPR detected 71 of 78 patients (sensitivity, 91.0%) and identified 116 patients (32.2%) who were eligible for a shorter hospital stay. Positive and negative likelihood ratios are 1.6 and 0.17, respectively, for outcome 1 and 1.5 and 0.22 for outcome 2, suggesting greater value in excluding poor outcomes among persons with none of the factors, reflective of the CPRs' high sensitivity.
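The sensitivities and likelihood ratios above can be approximately recomputed from the counts given in this paragraph. The Python sketch below does so under two assumptions: the denominator is the 360 patients who underwent endoscopy, and the number of patients with a good outcome is that total minus the number with a poor outcome. The function name is hypothetical. Recomputed this way, the negative likelihood ratio for outcome 1 comes out near 0.18 rather than the reported 0.17, a small difference that may reflect rounding or counts not shown in the text.

```python
def dichotomized_performance(total, poor_total, poor_detected, score0_good):
    """Test characteristics for a score of 0 vs 1 or more points.

    total          -- patients scored
    poor_total     -- patients who had the poor outcome
    poor_detected  -- poor-outcome patients with a score of 1 or more
    score0_good    -- patients with a score of 0 and a good outcome
    """
    good_total = total - poor_total
    sensitivity = poor_detected / poor_total
    specificity = score0_good / good_total
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Counts taken from the paragraph above.
outcome1 = dichotomized_performance(360, 27, 25, 136)
outcome2 = dichotomized_performance(360, 78, 71, 116)
```

The low negative likelihood ratios are what make a score of 0 informative: a patient with none of the risk factors is much less likely to have a poor outcome than the cohort as a whole.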
Length of hospital stay was closely related to risk score. For poor outcomes 1 and 2, median lengths of stay for patients with risk scores of 0, 1, and 2 or more were 3, 4, and 5 days, respectively (P<.001). Of the 136 patients who did not experience poor outcome 1 and who had no risk factors for it (ie, a risk score of 0), 11 (8.1%) were readmitted within 30 days of discharge. These 11 individuals, readmitted a mean of 13.7 days after discharge, had readmission diagnoses consistent with their index admission diagnoses: 5 had recurrent upper GI bleeding, 1 had elective resection of a gastric adenocarcinoma, 1 had upper GI bleeding after starting anticoagulation, and 1 each had anemia, syncope, bowel obstruction, and a perforated gastric ulcer. These same 11 patients (9.5%) compose the group readmitted within 30 days among the 116 patients who did not experience poor outcome 2 and who had no risk factors for it.
The APACHE II score is often not available, so we used the derivation set post hoc to rerun the regression analyses without APACHE II scores to determine whether other models would discriminate comparably with the original models. The model for poor outcome 1 had a C statistic of 0.79 and contained the variables SRH (odds ratio [OR], 5.16; 95% confidence interval [CI], 1.86-14.34), esophageal varices (OR, 4.39; 95% CI, 1.47-13.10), and 2 or more comorbid conditions (OR, 3.15; 95% CI, 1.09-8.52). When this model was used as a risk index with 1 point for each variable, the risk of poor outcome 1 in the low-risk group (score of 0) was 1.2%, in the intermediate-risk group (score of 1) was 5.3%, and in the high-risk group (score ≥2) was 24%. The model for poor outcome 2 had a C statistic of 0.76 and contained the following 4 variables: unstable comorbidity on admission (OR, 5.90; 95% CI, 2.55-13.68), serum creatinine level of 1.5 mg/dL or greater (≥133 μmol/L) (OR, 3.05; 95% CI, 1.27-7.41), esophageal varices (OR, 3.41; 95% CI, 1.47-7.91), and SRH (OR, 2.85; 95% CI, 1.36-5.97). For the corresponding risk index, the low-risk group (score of 0) had a risk of 5%, the intermediate-risk group (score of 1) had a risk of 20%, and the high-risk group (score ≥2) had a risk of 47%. In validation, the C statistics for both models declined by approximately 0.10.
Acute upper GI hemorrhage is a life-threatening problem, one for which hospital-based care is often inefficient because of a lack of accurate prognostic information. For this reason, we derived and internally validated CPRs that stratify risk of poor outcome, either directly related to bleeding or for any outcome that would prolong the hospital stay. These CPRs showed good discrimination, detecting nearly all patients with a poor outcome and identifying a subgroup for which a shorter hospital stay was possible.
Although both CPRs performed reasonably well, neither detected all patients with “poor outcomes.” When scores were dichotomized, the GI bleed–specific CPR missed 2 patients who experienced poor outcome 1 (of a total of 138), both of whom had rebleeding. Both patients underwent endoscopy on the day of admission, revealing a clean-based duodenal ulcer in 1 patient and gastric and esophageal erosions in the other. In retrospect, these erosions were likely due to nasogastric tube trauma. Both patients rebled within 24 hours of endoscopy. Repeated endoscopy revealed the same clean-based ulcer in the first patient, whereas the second patient had a Dieulafoy lesion, which was treated endoscopically. Because rebleeding occurred soon after the index endoscopy in both patients (and within 24 hours of admission), it is unlikely that these patients would have been “missed” as rebleeders unless the CPR was used to discharge patients immediately after endoscopy, a use for which it is not intended but one for which it could be tested.
The second, more inclusive CPR missed 7 of 123 patients. Two of the 7 patients experienced rebleeding and have already been described. The remaining 5 patients developed nonbleeding complications: 1 developed urinary catheter–induced Escherichia coli sepsis; 3 developed pneumonia manifested by fever, leukocytosis, and a new infiltrate on chest radiography; and 1 developed alcohol withdrawal syndrome.
All variables in the 2 CPRs are readily measured and available except the APACHE II score. Although these post hoc models without APACHE II scores did not perform as well as the originals, they offer potential alternatives and are worthy of further development and testing.
Several systems for risk stratification of acute upper GI hemorrhage have been published. They vary in how and when they apply to acute upper GI hemorrhage. For example, 2 can be used to aid the decision about the need for hospitalization,7,10 whereas others apply to level of inpatient care,8,10,11,16 timing of endoscopy,5 and length of hospital stay.6,12 All these studies5,7-11,16,17 contain variables that we tested as candidate predictor variables.
Probably the most widely known CPR for acute upper GI bleeding was created by Rockall and colleagues.6,17 This CPR uses age, shock, comorbidity, endoscopic diagnosis, and SRH to predict the risks of rebleeding and death. It was developed on patients admitted with bleeding and those who bled while hospitalized for other reasons, 2 subgroups with distinctly different prognoses. It was subsequently validated retrospectively on 2531 patients admitted for acute upper GI hemorrhage,17 of whom 744 in the low-risk group (score <2) had a risk of rebleeding of 4.3% and mortality of 0.1%. Although the Rockall CPR effectively stratifies the risks of rebleeding and mortality, the scoring is difficult to remember, and it does not consider the non-GI morbid events that often affect hospital stay.
The study populations of Rockall et al6,17 and the present study differ in calendar time by several years and in criteria for upper GI hemorrhage. Furthermore, the Rockall rule and the present CPRs measure performance for different outcomes: the Rockall CPR measures discrimination for rebleeding and death separately, whereas the present CPRs measure discrimination for composite outcomes that include rebleeding, need for surgery, and death. Nonetheless, we examined performance of the Rockall rule on the present study population. In the validation study by Rockall et al,17 the C statistic was 0.72 (95% CI, 0.69-0.74) for rebleeding and 0.81 (95% CI, 0.78-0.84) for death. On this Veterans Affairs cohort, the Rockall rule has a C statistic of 0.52 (95% CI, 0.44-0.60) for rebleeding (P = .03 for the difference in C statistics) and 0.64 (95% CI, 0.48-0.80) for death (P = .03 for the difference in C statistics). The C statistic for poor outcome 1 was 0.81 (95% CI, 0.73-0.90), which is clinically and statistically greater than Rockall et al's values of 0.52 for rebleeding and 0.64 for death.
Despite the development of several CPRs for outcome from acute upper GI hemorrhage, none has achieved widespread use. This lack of integration into clinical practice may be due to a lack of appropriately rigorous validation,10,18 medicolegal concerns, incongruence with current management,5 imprecision for clinical outcomes, or lack of ease of clinical use in day-to-day practice.6,10-12,16 For these reasons, we attempted to develop the CPRs described herein. Both CPRs perform reasonably well and are easy to use. However, they have limitations that require comment. First, the population from which they were created was US veterans, 99% of whom were men. This population has unique characteristics that may not generalize to nonveteran populations. Second, although both CPRs performed reasonably well in internal, split-sample validation, such validation is preliminary. More rigorous validation is required before these CPRs can be used in clinical practice.18 Third, the performance of these CPRs was not perfect. Neither identified all persons who had adverse outcomes either directly or indirectly related to GI bleeding. Such imperfection of prediction rules is a reminder that these tools must be considered as aids to clinical judgment, not as substitutes for it.
In summary, we created and preliminarily validated 2 simple and potentially useful CPRs for patients with acute upper GI hemorrhage. Both CPRs identify a low-risk group in which a shortened hospital stay and perhaps outpatient management may be considered. Subsequent research should include further validation of these CPRs, particularly in nonveteran populations, and an evaluation of the effect of presenting information about risk to managing physicians to determine whether and how such information affects clinical decision making.
Correspondence: Thomas F. Imperiale, MD, The Regenstrief Institute Inc, 1050 Wishard Blvd (RG-6), Indianapolis, IN 46202.
Accepted for Publication: March 2, 2006.
Author Contributions: Dr Imperiale had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Imperiale, Dominitz, and Boes. Acquisition of data: Imperiale, Dominitz, Provenzale, Boes, Rose, and Bowers. Analysis and interpretation of data: Imperiale, Dominitz, Musick, Azzouz, and Perkins. Drafting of the manuscript: Imperiale and Bowers. Critical revision of the manuscript for important intellectual content: Imperiale, Dominitz, Provenzale, Rose, Musick, and Perkins. Statistical analysis: Azzouz and Perkins. Obtained funding: Imperiale. Administrative, technical, and material support: Dominitz, Boes, and Bowers. Study supervision: Imperiale and Provenzale.
Financial Disclosure: None reported.
Funding/Support: This study was funded by grant IIR 98-213 from the Veterans Affairs Health Services Research and Development Service.
Previous Presentation: This study was presented at the American College of Gastroenterology meeting; October 13, 2003; Baltimore, Md.