Figure. Receiver operating characteristic curves for the performance of the Portsmouth (P) Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM) and colorectal (Cr) POSSUM scores in patients with colorectal cancer.
Horzic M, Kopljar M, Cupurdija K, Bielen DV, Vergles D, Lackovic Z. Comparison of P-POSSUM and Cr-POSSUM Scores in Patients Undergoing Colorectal Cancer Resection. Arch Surg. 2007;142(11):1043–1048. doi:10.1001/archsurg.142.11.1043
To compare the Portsmouth (P) Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM) and specialized colorectal (Cr) POSSUM scoring systems in the prediction of mortality after resection of colorectal cancer.
Retrospective study of patients after resection of colorectal cancer.
One hundred twenty patients with complete medical records who underwent resection of colorectal cancer between January 1, 1996, and December 31, 2004, at our institution were enrolled in the study.
Main Outcome Measures
P-POSSUM and Cr-POSSUM scores were calculated for each patient. In-hospital mortality rate and number of deaths within 30 days after surgery were recorded. The ratio of observed to expected deaths was calculated for each analysis.
The P-POSSUM system underpredicted mortality by 25%, with no significant difference between the predicted and observed values (P = .96). The observed to expected ratio for Cr-POSSUM was 1.11, with no significant difference between the observed and predicted values (P = .19). The area under the receiver operating characteristic curve was 0.70 for P-POSSUM and 0.59 for Cr-POSSUM.
Both P-POSSUM and Cr-POSSUM perform well in predicting mortality after colorectal cancer surgery, but the Cr-POSSUM is more accurate. There is a constant need for reevaluation of existing and any new scoring systems outside original development and validation populations. The Cr-POSSUM score is a promising specialized tool for monitoring surgical outcomes in colorectal cancer surgery.
Operative mortality rate is a common measure of outcome and can be used to compare quality of health care.1 However, when comparing quality of care, mortality and morbidity rates have obvious limitations and may give misleading results because they do not consider the physiologic condition of the patient at the time of surgery, the severity of the surgery, and the age and general health of the patient.2,3 To give a more objective comparison for quality of care, various scoring systems have been introduced.
One of the first scoring systems for predicting outcome in surgery was the Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM), which was designed for general surgery.4 Since the original POSSUM system was introduced, several modifications have been suggested for the specific requirements of certain surgical subspecialties.1,5,6 Also, there is concern about the applicability of POSSUM scores in health care domains other than the one for which they were originally designed.7 Therefore, modifications of the original POSSUM score were created. The Portsmouth POSSUM (P-POSSUM) system was designed to overcome the overprediction of mortality in patients at low risk that occurs with the original POSSUM score.5,8 The P-POSSUM system was found to be more accurate in predicting mortality in general surgery.5
Colorectal surgery is a specific surgical subspecialty. The colorectal POSSUM (Cr-POSSUM) system was created in 2004 specifically for this field of surgery.1 Within colorectal surgery, oncologic colorectal surgery is particularly demanding. Patients with colorectal cancer are often at increased risk of complications owing to specific features of colorectal cancer such as malnutrition, anemia, and compromised immune systems.9-11 The objective of this study was to assess the accuracy of P-POSSUM and Cr-POSSUM systems in predicting postoperative mortality in patients with colorectal cancer.
Patients who underwent resection of colorectal cancer between January 1, 1996, and December 31, 2004, at our institution were retrospectively included in the study. Those patients for whom P-POSSUM and Cr-POSSUM scores could not be calculated because of lack of data were excluded. Parameters for calculating P-POSSUM and Cr-POSSUM are given in Table 1 and Table 2. The remaining 120 patients were included in the study. Physiologic scores for both P-POSSUM and Cr-POSSUM were calculated for each patient from their medical records. Operative severity scores were calculated based on findings recorded by the operating surgeon. In-hospital mortality and death within 30 days after colorectal surgery were recorded. Both scores were calculated as previously described.1,5
Data were analyzed using the linear method of analysis described by Wijesinghe et al.6 In this type of analysis, patients are stratified into groups according to the predicted risk of death. The expected number of deaths is then calculated for each risk group by multiplying the number of patients in the group by the average risk of death in that group. The ratio of observed to expected deaths (O:E ratio) was calculated for each analysis; a ratio greater than 1 indicates that the score underpredicted mortality, and a ratio less than 1 indicates overprediction. The χ² test of Hosmer and Lemeshow12 was used to assess any differences between predicted and observed morbidity and mortality rates. Furthermore, 3 separate subgroups were analyzed according to the type of operation, including right-sided hemicolectomy or transverse colon resection; left-sided hemicolectomy, the Hartmann procedure, anterior resection of the rectum, or resection of the sigmoid colon; and abdominoperineal resection. Discrimination ability, that is, the ability of the model to assign higher probabilities of death to those patients who died, was measured using receiver operating characteristic curves, which were analyzed for both scores. P < .05 was considered statistically significant.
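The linear analysis described above can be illustrated with a minimal Python sketch. It is not the authors' code: the `RiskGroup` type, the function names, and the three risk bands with their counts are hypothetical and chosen only to show the arithmetic. The goodness-of-fit statistic uses the standard Hosmer-Lemeshow form, in which each group contributes (observed − expected)² divided by n × p × (1 − p); converting the statistic to a P value would additionally require a χ² distribution function and an appropriate number of degrees of freedom, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class RiskGroup:
    n_patients: int       # patients whose predicted risk falls in this band
    mean_risk: float      # average predicted probability of death in the band
    observed_deaths: int  # deaths actually recorded in the band


def expected_deaths(group: RiskGroup) -> float:
    # Linear analysis: expected deaths = group size x average predicted risk.
    return group.n_patients * group.mean_risk


def oe_ratio(groups: list[RiskGroup]) -> float:
    # Observed:expected ratio pooled over all risk groups.
    observed = sum(g.observed_deaths for g in groups)
    expected = sum(expected_deaths(g) for g in groups)
    return observed / expected


def hosmer_lemeshow_chi2(groups: list[RiskGroup]) -> float:
    # Sum over groups of (O - E)^2 / (n * p * (1 - p)).
    chi2 = 0.0
    for g in groups:
        e = expected_deaths(g)
        chi2 += (g.observed_deaths - e) ** 2 / (e * (1.0 - g.mean_risk))
    return chi2


# Hypothetical risk bands (NOT data from this study).
groups = [
    RiskGroup(n_patients=50, mean_risk=0.02, observed_deaths=2),
    RiskGroup(n_patients=40, mean_risk=0.05, observed_deaths=2),
    RiskGroup(n_patients=30, mean_risk=0.20, observed_deaths=6),
]

print(f"O:E ratio = {oe_ratio(groups):.2f}")
print(f"Hosmer-Lemeshow chi-square = {hosmer_lemeshow_chi2(groups):.2f}")
```

In this made-up example, 9 deaths are expected and 10 are observed, so the pooled ratio exceeds 1 and the score would be said to underpredict mortality.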
The study included 69 men and 51 women. Ten patients (8.3%) died either in hospital or within 30 days after colorectal surgery and 23 (19.2%) developed complications. Two patients (1.7%) underwent 2 repeated laparotomies and 15 patients (12.5%) underwent 1 repeated operation.
Potentially curative surgery was performed in 101 patients (84.2%) and included right-sided hemicolectomy in 19 patients, left-sided hemicolectomy in 7, resection of the transverse colon in 5, resection of the sigmoid colon in 21, anterior resection of the rectum in 18, abdominoperineal resection in 22, and the Hartmann procedure in 9. In the remaining 19 patients, palliative surgery was performed that always included laparotomy, as follows: bypass surgery in 5 patients, local excision of the tumor in 2, creation of a palliative stoma in 8, and surgical exploration in 4.
In 24 patients in whom right-sided hemicolectomy or transverse colon resection was performed, the O:E ratio for all risk groups was 1.00, indicating that the Cr-POSSUM system correctly predicted mortality (1 predicted vs 1 observed). There was no significant difference between the observed and predicted values (χ²₈ = 0.35; P = 1.00). P-POSSUM also correctly predicted mortality (O:E ratio 1.00). There was no significant difference between the predicted and observed values (χ²₈ = 8.93; P = .35).
In 55 patients in whom left-sided hemicolectomy, the Hartmann procedure, or anterior or sigmoid resection was performed, the O:E ratio for all risk groups was 0.80, indicating that Cr-POSSUM overpredicted mortality in this study by 20% (5 predicted vs 4 observed). There was no significant difference between the observed and predicted values (χ²₇ = 5.55; P = .59). The P-POSSUM system correctly predicted mortality (O:E ratio 1.00). There was no significant difference between the predicted and observed values (χ²₇ = 7.24; P = .41).
In 22 patients in whom abdominoperineal resection was performed, the O:E ratio for all risk groups was 1.00, indicating that Cr-POSSUM correctly predicted mortality (2 predicted vs 2 observed). There was no significant difference between the observed and predicted values (χ²₆ = 0.95; P = .99). P-POSSUM underpredicted mortality with an overall O:E ratio of 2.00. There was no significant difference between the predicted and observed values (χ²₈ = 8.40; P = .40).
Table 3 gives the number of deaths predicted by Cr-POSSUM with linear analysis when all patients were analyzed, including those who underwent palliative procedures. The O:E ratio for all risk groups was 1.11, indicating that the Cr-POSSUM system underpredicted mortality in this study by 11%. However, there was no significant difference between the observed and predicted values (χ²₇ = 10.05; P = .19). The P-POSSUM system also underpredicted mortality by 25%, with an overall O:E ratio of 1.25. There was no significant difference between the predicted and observed values (χ²₈ = 2.54; P = .96) (Table 3).
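The percentages quoted alongside the O:E ratios follow from a single relation, sketched here as a worked equation. The symbols O (observed deaths) and E (expected deaths) are notation introduced for this illustration; the figures are the ratios reported in the text, and the expected counts themselves are not reproduced here.

```latex
\[
\text{O:E} = \frac{O}{E},
\qquad
\frac{O - E}{E} = \text{O:E} - 1
\]
\[
\begin{aligned}
\text{Cr-POSSUM, all patients:}\quad & 1.11 - 1 = +0.11 && \text{(11\% underprediction)}\\
\text{P-POSSUM, all patients:}\quad & 1.25 - 1 = +0.25 && \text{(25\% underprediction)}\\
\text{Cr-POSSUM, left-sided group:}\quad & \tfrac{4}{5} - 1 = -0.20 && \text{(20\% overprediction)}
\end{aligned}
\]
```

A positive value means more deaths were observed than the score expected; a negative value means fewer.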
Discriminatory power of P-POSSUM and Cr-POSSUM scores in predicting death as an outcome measure was analyzed using receiver operating characteristic curves. Area under the receiver operating characteristic curve (AUC) for Cr-POSSUM was 0.59 (95% confidence interval, 0.36-0.82) (Figure). For P-POSSUM, the AUC was 0.70 (95% confidence interval, 0.52-0.88), indicating satisfactory discriminatory power (Figure).
In this study, the validity of P-POSSUM and Cr-POSSUM scores in patients who underwent resection of colorectal cancer was analyzed by assessing calibration and discrimination. Calibration is the ability of the model to assign the correct probabilities of outcome to individual patients. In this analysis, patients were stratified into risk groups on the basis of predicted mortality. The predicted number of deaths in each risk group was compared with the observed number of deaths using the Hosmer-Lemeshow goodness-of-fit test (Table 1, Table 2, and Table 3). Both scores demonstrated good calibration ability, with no statistically significant differences in observed to expected number of deaths. The P-POSSUM system underpredicted mortality by 25%. More accurate prediction of mortality with P-POSSUM was observed in patients at high risk compared with patients at low risk. Similarly, the Cr-POSSUM system also underpredicted mortality in patients at low risk, but overall accuracy was greater, with an O:E ratio of 1.11.
Discriminatory power of P-POSSUM and Cr-POSSUM, that is, the ability of the model to assign higher probabilities of outcome to patients who die compared with those who live, was analyzed using the AUC. In general, the AUC ranges from 0.5 for chance performance to 1.0 for perfect prediction.13 In this study, the AUC for P-POSSUM was 0.70, representing good discrimination power of this score. However, Cr-POSSUM did not perform as well, and the AUC was only 0.59. These results indicate that although P-POSSUM and Cr-POSSUM may be used to calculate predicted mortality rates in given populations, they are less accurate for predicting the risk of death for individual patients.
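The meaning of the AUC as a measure of discrimination can be made concrete with a short Python sketch. It relies on a standard equivalence: the AUC equals the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name and the example risks are hypothetical, and the article does not state which software or method produced its AUC values, so this is an illustration rather than a reconstruction.

```python
def auc(risks_died: list[float], risks_survived: list[float]) -> float:
    """Fraction of (death, survivor) pairs in which the patient who died
    had the higher predicted risk; tied pairs count as half."""
    wins = 0.0
    for d in risks_died:
        for s in risks_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(risks_died) * len(risks_survived))


# Hypothetical predicted risks (NOT data from this study).
perfect = auc([0.9, 0.8], [0.1, 0.2])   # every death ranked above every survivor
chance = auc([0.4, 0.1], [0.2, 0.3])    # deaths ranked no better than survivors
print(perfect, chance)
```

Under this reading, an AUC of 0.70 means that in about 70% of death-survivor pairs the patient who died had the higher predicted risk, and an AUC of 0.59 is only modestly better than the 0.5 expected by chance.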
The results obtained in this study are somewhat different from those previously published. Some validation studies of P-POSSUM report slight overprediction of mortality, especially in populations at low risk.7,14 This overprediction has been explained in part by the mathematical characteristics of the scoring system; that is, the lowest probability for each patient with the P-POSSUM scoring system is 0.2%, which is substantially more than observed in young, fit patients undergoing elective minor surgery.
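The floor on predicted risk mentioned above can be checked numerically. The sketch below assumes the P-POSSUM logistic equation as it is commonly cited in the literature, ln[R/(1 − R)] = −9.065 + 0.1692 × physiological score + 0.1550 × operative severity score, together with the minimum possible scores of 12 (physiological) and 6 (operative). Neither the coefficients nor the minimum scores are given in this article, so they should be verified against the original P-POSSUM publication5 before any real use; the function name is hypothetical.

```python
import math


def p_possum_mortality(physiological_score: int, operative_score: int) -> float:
    # Assumed coefficients (commonly cited P-POSSUM equation, not from this article).
    logit = -9.065 + 0.1692 * physiological_score + 0.1550 * operative_score
    # Convert log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-logit))


# Lowest possible scores: 12 physiological variables and 6 operative variables,
# each scored at its minimum of 1 (assumption from the POSSUM scoring scheme).
min_risk = p_possum_mortality(12, 6)
print(f"Minimum predicted mortality: {min_risk:.2%}")
```

With these assumed inputs the minimum predicted mortality comes out at roughly 0.2%, consistent with the floor described in the text.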
Substantial differences in prediction of mortality based on P-POSSUM have been described when applying this score in different populations and health care systems.7 Bennett-Guerrero et al7 compared P-POSSUM mortality rates after surgery between patients in the United States and the United Kingdom and found overprediction of mortality by a factor of 4 to 6 in the United States. Possible reasons for such overprediction may include differences in the organization of intensive care units. Another possible explanation may be the difference in population characteristics. For example, patients may have more advanced disease, which may have profound implications in the development of a scoring system.7 Patients with advanced gastrointestinal cancer often have pronounced nutritional deficit.11 Advanced protein-calorie malnutrition caused by decreased nutritional intake, dysfunctional metabolic processes, and hormonal- and cytokine-related abnormalities is a major cause of morbidity and mortality in patients with cancer.11 According to official cancer registry data, age at onset of colorectal cancer in Croatia15 is 5 to 10 years later than in the United Kingdom.16
Specific scoring systems may be required to evaluate surgical outcomes in different specialties. The Cr-POSSUM system was created as a modification of an original POSSUM score to suit the specific needs of colorectal surgery.1 The results of our study demonstrate better accuracy of Cr-POSSUM compared with P-POSSUM in predicting mortality after surgery for colorectal cancer, which is in agreement with the results of another published study.14 However, all scoring systems tend to optimize the fit of the data to the original population. Although during development, Cr-POSSUM fitted the data well in both the development and validation sets, it is important to cross-validate the scoring system externally by applying the model to a different population to assess its predictive power.1
Practical value of the scores can be noted at different levels. Scores can indicate patients at high risk who require additional postoperative care in intensive care units or surgical wards, although their vital functions could be satisfactory. Also, scores could indicate patients at high risk who could benefit from postponing surgical treatment and receive preoperative treatment to improve their condition and decrease operative risk. Scores might aid in decision making about the extent of a surgical procedure in patients at high risk. In addition, they can offer an objective parameter of risk that could help the patients in deciding to consent to a surgical procedure. In comparing mortality rates between institutions or individuals, scores can give an objective measure of patient preoperative condition and operative risk and, thus, provide a basis for comparison of quality of health care and surgical procedures.
The results of this study demonstrate that both P-POSSUM and Cr-POSSUM perform well in prediction of mortality after surgery for colorectal cancer. Specialized scoring systems more accurately predict mortality. Although both scoring systems are based on universally available and clearly defined variables, there are differences in observed and expected mortality in various geographic settings. Therefore, there is a constant need for reevaluation of existing and new scoring systems outside of original development and validation populations. Cr-POSSUM is a promising tool for monitoring surgical outcomes in colorectal cancer surgery.
Correspondence: Kristijan Cupurdija, MD, MSc, Department of Surgery, University Hospital Dubrava, Avenija Gojka Suska 6, HR-10000 Zagreb, Croatia.
Accepted for Publication: April 23, 2006.
Author Contributions: Study concept and design: Horzic, Kopljar, and Cupurdija. Acquisition of data: Kopljar, Vanjak Bielen, Vergles, and Lackovic. Analysis and interpretation of data: Horzic, Kopljar, Cupurdija, Vanjak Bielen, Vergles, and Lackovic. Drafting of the manuscript: Horzic, Kopljar, and Cupurdija. Critical revision of the manuscript for important intellectual content: Horzic, Kopljar, Cupurdija, Vanjak Bielen, Vergles, and Lackovic. Statistical analysis: Horzic and Kopljar. Obtained funding: Horzic. Administrative, technical, and material support: Kopljar, Cupurdija, Vanjak Bielen, Vergles, and Lackovic. Study supervision: Horzic.
Financial Disclosure: None reported.
Funding/Support: This study was supported by the Ministry of Science, Education, and Sports of the Republic of Croatia (project 0198020).