Befeler AS, Palmer DE, Hoffman M, Longo W, Solomon H, Di Bisceglie AM. The Safety of Intra-abdominal Surgery in Patients With Cirrhosis: Model for End-Stage Liver Disease Score Is Superior to Child-Turcotte-Pugh Classification in Predicting Outcome. Arch Surg. 2005;140(7):650–654. doi:10.1001/archsurg.140.7.650
Hypothesis
We hypothesized that the model for end-stage liver disease (MELD) score may be a better and less subjective method than the Child-Turcotte-Pugh score for stratifying patients with cirrhosis before abdominal surgery.
Design
Retrospective medical record review.
Setting
Tertiary care institution.
Patients
Fifty-three adult patients with histologically proven cirrhosis undergoing abdominal surgery at Saint Louis University Hospital, St Louis, Mo, between 1991 and 2001. Those undergoing hepatic surgery (such as resection or transplantation) or closed abdominal surgery (such as hernia repair) were excluded.
Main Outcome Measure
A poor outcome after surgery was defined as death or liver transplantation within 90 days of the operative procedure or a hospital stay of longer than 21 days. Demographic, clinical, and laboratory features predictive of poor outcome were assessed by multivariate analysis.
Results
A total of 13 patients (25%) had poor outcomes including 9 deaths (17%). Model for end-stage liver disease score and plasma hemoglobin levels lower than 10 g/dL were found to be independent predictors of poor outcomes. A MELD score of 14 or greater was a better clinical predictor of poor outcome than Child-Turcotte-Pugh class C.
Conclusions
A MELD score of 14 or greater should be considered as a replacement for Child-Turcotte-Pugh class C as a predictor of being very high risk for abdominal surgery. Patients with cirrhosis with hemoglobin levels lower than 10 g/dL should receive corrective blood transfusions before abdominal surgery.
Deciding whether patients with cirrhosis are medically fit to undergo abdominal surgery remains a common clinical dilemma. Not only do the complications of cirrhosis, such as coagulopathy, thrombocytopenia, varices, and ascites, increase the technical difficulty and risk of the surgical procedure directly, but the combination of general anesthesia and an abdominal incision may lead to new or worsening hepatic decompensation.
Previous studies have identified various risk factors associated with poor outcomes, including intraoperative blood transfusion, low albumin levels, abnormal coagulation parameters, ascites, elevated alkaline phosphatase levels, gastrointestinal bleeding, biliary surgery, urinary tract infection, pulmonary failure, emergency surgery, and Child-Turcotte-Pugh (CTP) score.1-3 Because of the diversity of these factors, many physicians use the CTP score to stratify patients and identify operative risks associated with cirrhosis. While the CTP score is generally reliable, it contains subjective parameters, such as ascites and hepatic encephalopathy, that may blur the margins of categorization. The model for end-stage liver disease (MELD) score was originally developed to predict survival after transjugular intrahepatic portosystemic shunts and was shown to be a reliable predictor of mortality, superior to the CTP score.4 Subsequently, it was shown to be superior to CTP score as a predictor of all-cause 3-month mortality in a variety of cohorts of patients with cirrhosis5 and has been adopted as the method of organ allocation for liver transplantation in the United States.6 We hypothesized that MELD score may be a better and less subjective method for stratifying patients with cirrhosis before abdominal surgery.
A list of all patients with biopsy-proven cirrhosis who underwent abdominal surgery at Saint Louis University Hospital, St Louis, Mo, from 1991 to 2001 was generated from the surgical pathology database. Electronic and paper medical records were obtained, and a retrospective medical record review was performed. To be included in the analysis, the subject had to have biopsy-proven cirrhosis of any etiology and undergo intra-abdominal surgery within this period. Patients having hepatic surgery (including transplantation) or extraperitoneal abdominal surgery (for example, hernia repair) were excluded.
For the purposes of our analysis, the subjects were divided into 2 groups, those with and without poor outcome. Poor outcome was defined as any 1 of the following: death, liver transplantation within 90 days of the operative procedure, or a hospital stay longer than 21 days, a prolonged stay being a surrogate marker of postoperative morbidity.
We used χ2 and t tests to examine the differences in baseline characteristics of the 2 groups. Single variables were then assessed using logistic regression to predict associations between the test variable and poor outcome. Multivariable logistic regression models were then constructed using variables that were significant in univariable analysis. Models were compared based on statistical significance of omnibus model χ2, Cox and Snell pseudo-R2, and ability to correctly classify patient groups. Receiver operating characteristic curves were used to assess the best trade-offs between sensitivity and specificity for the proposed clinical models of prediction. P values less than .05 were considered significant except in model building where P values less than .10 were used. All statistical analysis was run on SPSS version 11.5 (SPSS Inc, Chicago, Ill).
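For readers unfamiliar with the model-comparison measures named above, the standard definition of the Cox and Snell pseudo-R2 can be written in terms of the omnibus model χ2. The symbols below (L_0, L_M, n, G) are notation introduced here for illustration and do not appear in the original analysis.

```latex
% L_0 : likelihood of the intercept-only (null) logistic model
% L_M : likelihood of the fitted model with predictors
% n   : number of patients
% G   : omnibus model chi-square (likelihood ratio statistic)
\[
G = -2 \ln\!\left(\frac{L_0}{L_M}\right) = 2\left(\ln L_M - \ln L_0\right)
\]
\[
R^2_{\mathrm{CS}} = 1 - \left(\frac{L_0}{L_M}\right)^{2/n} = 1 - \exp\!\left(-\frac{G}{n}\right)
\]
```

Because the maximum of this statistic is 1 - L_0^{2/n}, which is less than 1, its values are best read as a relative measure for comparing models fitted to the same data set rather than as a proportion of explained variance.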
Sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, and negative likelihood ratio were calculated with a spreadsheet. Likelihood ratios are methods to assess the ability of a test or measurement to predict an outcome but, as opposed to positive and negative predictive values, they are independent of disease prevalence. This makes them easier to apply to a different population of patients than the one studied. Positive likelihood ratios usually range from 1 to 10, with the higher numbers indicating a higher likelihood of disease given a positive test. Negative likelihood ratios usually range from 0 to 1, with smaller numbers indicating better likelihood of absence of disease given a negative test.
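The spreadsheet calculations described above can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions, not the authors' spreadsheet; the function name, the class name, and the example counts are hypothetical and are not taken from Table 4.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_positive: float
    lr_negative: float


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Compute test-performance measures from a 2x2 table.

    tp/fn: patients with a poor outcome who were test positive/negative.
    fp/tn: patients with a good outcome who were test positive/negative.
    Assumes every row and column of the table contains at least one patient.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # depends on prevalence
    npv = tn / (tn + fn)  # depends on prevalence
    # Likelihood ratios use only sensitivity and specificity,
    # which is why they do not depend on prevalence.
    lr_positive = (
        sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    )
    lr_negative = (
        (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    )
    return DiagnosticMetrics(
        sensitivity, specificity, ppv, npv, lr_positive, lr_negative
    )


# Hypothetical counts, for illustration only (not study data).
example = diagnostic_metrics(tp=8, fp=10, fn=2, tn=30)
```

With these hypothetical counts, sensitivity is 0.80 and specificity is 0.75, so the positive likelihood ratio is 0.80/0.25 = 3.2 and the negative likelihood ratio is 0.20/0.75, or about 0.27.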
The baseline demographics, etiology of liver disease, preoperative laboratory and clinical findings, prognostic classifications, type of operations, and outcomes are listed in Table 1. Indications for cholecystectomy included pain, cholecystitis, cholelithiasis, porcelain gallbladder, bariatric surgery, and trauma. Indications for laparotomy included gastrointestinal bleeding, small-bowel obstruction, perforation with peritonitis, appendicitis, ulcerative colitis with dysplasia, and colon cancer. The laparotomy group included 19 bowel resections; 4 had adhesiolysis. Surgery was performed as an emergency in 22 of the cases and urgently in the rest.
A total of 13 patients (25%) had poor outcomes including 9 deaths (17%). One patient in the cholecystectomy group died of a bile leak and subsequent sepsis and had a MELD score of 19 and CTP score of 10. Eight patients in the laparotomy group died of multiorgan failure, respiratory failure, acute tubular necrosis, sepsis, hepatorenal syndrome, peritonitis, hepatic encephalopathy, and hepatic failure. Eight patients had a hospital stay longer than 21 days, 5 of whom also died within 90 days. One patient underwent transplantation within 30 days of her procedure. An additional patient died 109 days after surgery.
Binary logistic regression was used to assess for predictors of poor outcomes. Table 2 shows significant associations between poor outcomes and African American race, end-stage renal disease with dialysis, ascites, hemoglobin level, CTP score, cholecystectomy vs exploratory laparotomy, and MELD score.
Multivariable analysis was then used to assess for the best independent predictors of poor outcome. Stepwise logistic regression modeling was not used because it results in arbitrary models that do not have a good basis in theory or prior empirical knowledge. In addition, stepwise models could not properly deal with the overlap in the parameters of the CTP and MELD scores.
The CTP score (scale, 5-15) was chosen as the basis of the first set of models because it is the most commonly used clinical parameter to assess the safety of surgery in patients with cirrhosis (Table 3, model 1). Plasma hemoglobin level was then added because it was found to be a powerful predictor and is not included as part of the CTP score. The addition of hemoglobin level to CTP score resulted in a statistically improved model with better Cox and Snell pseudo-R2 (Table 3, model 2). Addition of type of surgical procedure did not improve model 2. Addition of renal dialysis to model 2 resulted in a statistically improved model with better pseudo-R2 but loss of significance of the CTP score variable. An analysis was also performed using CTP score coded as class A, B, or C. All the models generated were equal to or inferior to those indicated earlier. Thus, model 2 was the best CTP score–based predictor of poor outcome.
The MELD score (continuous scale, 0-40) was then assessed as the basis of the predictor model (Table 3, model 3). Addition of hemoglobin level to MELD score resulted in a statistically improved model and better Cox and Snell pseudo-R2, but the MELD score variable became only borderline significant (Table 3, model 4). When the hemoglobin level was converted to a categorical variable (> or <10 g/dL), then the model was statistically improved with maintenance of significance of all of the variables in the model (Table 3, model 5). Adding type of operation to model 5 did not result in an improved predictive model. Creating a model with MELD score and type of operation resulted in borderline significance for the type of operation variable and a low Cox and Snell pseudo-R2. Models were also created using MELD score as a categorical variable using cutoffs based on the investigators’ experience with predicting survival in patients waiting for liver transplantation (MELD score categories, 6-9, 10-17, 18-24, >25). All the models generated were equal to or inferior to those indicated earlier. As described later, receiver operating characteristic curves indicated better cutoffs for this data set.6- 21 Multivariable models using MELD score coded this way resulted in the best Cox and Snell pseudo-R2 and maintenance of significance for all variables (Table 3, model 6). Thus, CTP score plus hemoglobin level and MELD score plus hemoglobin level appear to be the best models to predict poor outcome.
To be clinically useful, the predictors from these models need to be used to create some simple rules for dividing the patients into categories of risk for abdominal surgery. Thus, receiver operating characteristic curves were constructed for continuous-scale CTP and MELD scores to predict poor outcome and they yielded similar areas under the curve of 0.814 and 0.826, respectively. The standard CTP cutoff score for class B maximized sensitivity and the cut point for class C was the optimal balance of sensitivity and specificity. A MELD score of 14 or greater was the optimal cutoff and a MELD score of 22 or greater maximized specificity by receiver operating characteristic curve. Table 4 indicates the sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios for CTP score, MELD score, and modified CTP and MELD scores. Given the relatively small size of some of the categories and the resulting low statistical power, 95% confidence intervals for the measures in Table 4 were usually overlapping.
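The way a receiver operating characteristic analysis yields an area under the curve and candidate cutoffs can be sketched as follows. The analysis in this study was run in SPSS; the code below is only an illustration of the general technique. Selecting the cutoff by the Youden index (sensitivity + specificity - 1) is an assumption about what "optimal balance" means, and the scores and outcomes are hypothetical, not patient data.

```python
def roc_points(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) for each rule 'score >= cutoff'."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, outcomes) if not y and s < cutoff)
        points.append((cutoff, tp / positives, tn / negatives))
    return points


def auc(scores, outcomes):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen poor-outcome patient has a higher score than a randomly chosen
    good-outcome patient (ties count as one half)."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def best_cutoff_youden(scores, outcomes):
    """Cutoff maximizing sensitivity + specificity - 1 (assumed criterion)."""
    return max(roc_points(scores, outcomes), key=lambda p: p[1] + p[2] - 1)[0]


# Hypothetical scores and outcomes (1 = poor outcome), for illustration only.
scores = [6, 8, 9, 11, 12, 14, 15, 18, 22, 25]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff = best_cutoff_youden(scores, outcomes)
area = auc(scores, outcomes)
```

A cutoff that maximizes sensitivity or specificity alone, as described for CTP class B and for a MELD score of 22 or greater, corresponds to reading a different point from the same list returned by `roc_points`.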
While CTP class B or C identified all the patients with poor outcome, it had low positive predictive value and positive likelihood ratio, thus limiting it as a clinical tool. A MELD score of 9 or greater had similar performance to CTP class B or C. A MELD score of 14 or greater had improved sensitivity and negative predictive value for poor outcome. The positive likelihood ratio of a MELD score of 14 or greater was better than CTP class B or C (P<.05) but no different than CTP class C. It correctly predicted 77% of poor outcomes compared with 23% for CTP class C. The negative likelihood ratio of a MELD score of 14 or greater was better than CTP class C (P<.05). Thus, it had the best balance between positive and negative likelihood ratios. Adding additional points for hemoglobin level did not improve either CTP or MELD score as a clinical predictor of poor outcome, despite the fact it was an independent predictor of outcome.
In patients with liver cirrhosis undergoing abdominal surgery, the CTP class has previously been shown to be an accurate preoperative predictor of outcomes,2,12 with elective surgery generally being considered safe in patients with CTP class A or B. Operations in patients with CTP class C and emergency surgery have carried high mortalities.14
Despite the quoted mortality rate of 30% in patients with liver cirrhosis undergoing abdominal surgery,1 numerous articles have reported the safety of performing a laparoscopic cholecystectomy in patients with CTP class A and B cirrhosis,7-12 with laparoscopic cholecystectomy tending to be the procedure of choice11,13,14 and conversion to open cholecystectomy being considered part of appropriate technique and not a complication. In addition, operative technique and postoperative care have undoubtedly improved since the 1984 data were acquired.
The CTP classification was initially designed to assign risk for patients with liver disease undergoing portocaval shunt surgery and was later found to be useful in other types of surgery. There are, however, inherent problems in using the CTP classification. For example, the presence and degree of hepatic encephalopathy are assessed subjectively, and consequently, the scoring system is prone to interobserver variability. The presence or absence of ascites on clinical assessment may not correspond with more accurate findings on imaging studies, such as ultrasonography, and the significance of ascites in a patient undergoing peritoneal dialysis is unclear. Also, no significance is placed on whether a prolonged prothrombin time corrects with administration of vitamin K.
On the other hand, the MELD scoring system uses objective parameters and is easy to calculate. The MELD score is able to accurately predict 3-month mortality among patients with chronic liver disease5 and is widely used in organ allocation and to predict the outcome of transjugular intrahepatic shunting. Our data demonstrate that patients with MELD scores of 14 and higher have a significantly increased risk of a poor outcome when undergoing nongallbladder intra-abdominal surgery. Conversely, patients with MELD scores less than 14 generally do well.
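To illustrate how simple the calculation is, the sketch below implements the commonly cited United Network for Organ Sharing form of the MELD score from serum bilirubin, international normalized ratio, and serum creatinine. This article does not state which version of the formula was applied to the study patients, so the coefficients, the floor of 1.0 for each laboratory value, the creatinine ceiling of 4.0 mg/dL, the dialysis rule (simplified here to a single flag), and the cap at 40 are assumptions drawn from that allocation formula; the function name is hypothetical.

```python
import math


def meld_score(bilirubin_mg_dl: float, inr: float, creatinine_mg_dl: float,
               on_dialysis: bool = False) -> int:
    """MELD score using the commonly cited UNOS allocation formula.

    Assumed conventions (not taken from this article):
    - laboratory values below 1.0 are set to 1.0, so no logarithm is negative
    - creatinine is capped at 4.0 mg/dL, and set to 4.0 for dialysis patients
    - the final score is rounded and capped at 40
    """
    bilirubin = max(bilirubin_mg_dl, 1.0)
    inr_value = max(inr, 1.0)
    if on_dialysis:
        creatinine = 4.0
    else:
        creatinine = min(max(creatinine_mg_dl, 1.0), 4.0)
    raw = (3.78 * math.log(bilirubin)
           + 11.2 * math.log(inr_value)
           + 9.57 * math.log(creatinine)
           + 6.43)
    return min(round(raw), 40)


# Hypothetical laboratory values, for illustration only.
example_score = meld_score(bilirubin_mg_dl=2.0, inr=1.5, creatinine_mg_dl=1.5)
```

Under these assumptions the hypothetical patient above scores 17, which would fall above the cutoff of 14 discussed in this study.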
In this study, the clinical utility of a MELD score of 14 or greater in predicting poor outcomes was superior to the previous standard of CTP class C. It correctly predicted 77% vs 23% of the total poor outcomes. If the MELD score was less than 14, then the chance of a poor outcome was only 9% compared with 22% for CTP class A and B. These differences were confirmed by a superior negative likelihood ratio. A MELD score of 14 or greater provides the optimal balance between sensitivity and specificity, as shown by its high positive likelihood ratio and low negative likelihood ratio. Child-Turcotte-Pugh class B or C and a MELD score of 9 or greater provide excellent negative likelihood ratios but have relatively low positive likelihood ratios. They predict poor outcomes in 55% and 65%, respectively, of patients who actually had good outcomes. If used as a clinical rule, many patients would needlessly be denied surgery. Thus, this study suggests that the MELD score should replace CTP class as a clinical tool to predict outcomes after intra-abdominal surgery in patients with cirrhosis.
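The likelihood ratios for CTP class B or C can be checked from figures stated in the text alone: this rule identified all patients with a poor outcome (sensitivity 100%) and was positive in 55% of patients who had good outcomes (specificity 45%). The worked calculation below uses only those two numbers and the standard definitions; it is a consistency check, not a reproduction of Table 4.

```latex
% CTP class B or C as a predictor of poor outcome
% sensitivity = 1.00, false-positive rate = 1 - specificity = 0.55
\[
LR^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}
       = \frac{1.00}{0.55} \approx 1.8
\]
\[
LR^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
       = \frac{1 - 1.00}{0.45} = 0
\]
% Negative predictive values implied by the stated risks after a negative test:
\[
NPV_{\text{MELD} < 14} = 1 - 0.09 = 0.91, \qquad
NPV_{\text{CTP A or B}} = 1 - 0.22 = 0.78
\]
```

A negative likelihood ratio of 0 with a positive likelihood ratio below 2 is the pattern described above: the rule misses no poor outcomes but flags many patients who go on to do well.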
Low preoperative hemoglobin levels were found to be independently important. A low hemoglobin level was associated with a risk of poor outcome similar in degree to that of a MELD score greater than 18. Previous studies have identified intraoperative blood loss as a predictor of an adverse outcome1-3; yet, a low preoperative hemoglobin level was not identified as a risk in these studies. However, a study by Metcalf et al16 did find that patients with cirrhosis undergoing colectomy with a plasma hemoglobin level less than 90% of normal had a significantly higher mortality. Current surgical guidelines state that no specific hematocrit value is an indication for preoperative transfusion in a stable patient. However, a symptomatic patient with anemia who is about to undergo a procedure that involves significant blood loss should receive a transfusion before the operation.17 Currently, the use of a hematocrit value of 30% (or a hemoglobin level of 10 g per 100 mL) as a transfusion trigger is no longer acceptable without considering the clinical situation.18
In a recent multicenter, randomized, controlled study of transfusion in patients in the critical care setting, a liberal transfusion strategy (transfusion for hemoglobin level <10 g per 100 mL) was compared with a restrictive strategy (transfusion for hemoglobin level <7 g per 100 mL). The restrictive strategy was found to be at least as effective as the liberal strategy, with the possible exception of patients with acute myocardial infarction and unstable angina.19
A retrospective, multicenter cohort study of patients undergoing surgical repair of hip fracture was conducted to determine the effect of perioperative transfusion on 30- and 90-day mortality. A total of 8787 patients were studied; all were older than 60 years. The authors were unable to demonstrate any differences in the outcome of patients who received transfusions at a transfusion trigger of 8 g per 100 mL of hemoglobin compared with those receiving blood for a hemoglobin level of 10 g per 100 mL.20
Our study results suggest that, contrary to the recommendations mentioned earlier, patients with cirrhosis and anemia undergoing abdominal surgery might benefit from preoperative correction of their anemia. All multivariable models, whether based on the MELD score or CTP score, showed that low hemoglobin level was an independent predictor of poor outcome. Controlling for MELD score, low hemoglobin level added an 8.9-time additional risk for poor outcomes. Clearly, our findings should be validated in a prospective fashion, but the number of patients required may make this difficult to achieve in a reasonable amount of time. The benefits of blood transfusion should also be tested prospectively, if possible, before being adopted into practice.
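Assuming the 8.9-fold figure is the odds ratio from the multivariable logistic regression (the text does not label it explicitly), its relationship to the regression coefficient and to absolute risk can be written out as follows. The symbols and the baseline risk of 10% are introduced here purely for illustration and are not study results.

```latex
% beta_Hb : logistic regression coefficient for hemoglobin < 10 g/dL,
%           holding MELD score fixed (notation assumed here)
\[
OR = e^{\beta_{Hb}} = 8.9
\quad\Longrightarrow\quad
\beta_{Hb} = \ln 8.9 \approx 2.19
\]
% Converting an odds ratio to a risk for an illustrative baseline risk p_0:
\[
p_1 = \frac{OR \cdot \dfrac{p_0}{1 - p_0}}{1 + OR \cdot \dfrac{p_0}{1 - p_0}},
\qquad
p_0 = 0.10 \;\Rightarrow\; p_1 = \frac{8.9 \times 0.111}{1 + 8.9 \times 0.111} \approx 0.50
\]
```

The example shows why an odds ratio of 8.9 should not be read as a risk that is 8.9 times higher: at an illustrative baseline risk of 10%, the corresponding risk is about 50%, a fivefold increase.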
This analysis has limitations similar to those of other studies of patients with cirrhosis undergoing abdominal surgery, including retrospective methods and relatively small size. Thus, the lack of statistical power creates many overlapping 95% confidence intervals for the parameters in Table 4. Additionally, the end point of poor outcome included duration of postoperative hospitalization as a surrogate marker. The data were reanalyzed using 28 days rather than 21 days as an indicator of poor outcome, and none of the logistic models changed significantly. Eight patients had a hospital stay longer than 21 days, including 5 who died within 90 days, 1 who died within 120 days, and 1 who underwent a liver transplantation within 90 days. Thus, hospital stay longer than 21 days appears to be a good marker of poor outcome.
In conclusion, we believe that physicians should be able to make a more informed decision regarding the risks of intra-abdominal surgery in patients with cirrhosis by using the MELD score, opting for less invasive bridging procedures where possible, and correcting reversible factors, including anemia, prior to surgery.
Correspondence: Adrian M. Di Bisceglie, MD, Division of Gastroenterology and Hepatology, Saint Louis University School of Medicine, 3635 Vista Ave, St Louis, MO 63110 (firstname.lastname@example.org).
Accepted for Publication: October 22, 2004.