Objective To validate and refine a clinical prediction rule to identify which children with acute abdominal pain are at low risk for appendicitis (Low-Risk Appendicitis Rule).
Design Prospective, multicenter, cross-sectional study.
Setting Ten pediatric emergency departments.
Participants Children and adolescents aged 3 to 18 years who presented with suspected appendicitis from March 1, 2009, through April 30, 2010.
Main Outcome Measures The test performance of the Low-Risk Appendicitis Rule.
Results Among 2625 patients enrolled, 1018 (38.8% [95% CI, 36.9%-40.7%]) had appendicitis. Validation of the rule resulted in a sensitivity of 95.5% (95% CI, 93.9%-96.7%), specificity of 36.3% (33.9%-38.9%), and negative predictive value of 92.7% (90.1%-94.6%). Theoretical application would have identified 573 (24.0%) as being at low risk, misclassifying 42 patients (4.5% [95% CI, 3.4%-6.1%]) with appendicitis. We refined the prediction rule, resulting in a model that identified patients at low risk with (1) an absolute neutrophil count of 6.75 × 10³/μL or less and no maximal tenderness in the right lower quadrant or (2) an absolute neutrophil count of 6.75 × 10³/μL or less with maximal tenderness in the right lower quadrant but no abdominal pain with walking/jumping or coughing. This refined rule had a sensitivity of 98.1% (95% CI, 97.0%-98.9%), specificity of 23.7% (21.7%-25.9%), and negative predictive value of 95.3% (92.3%-97.0%).
Conclusions We have validated and refined a simple clinical prediction rule for pediatric appendicitis. For patients identified as being at low risk, clinicians should consider alternative strategies, such as observation or ultrasonographic examination, rather than proceeding to immediate computed tomographic imaging.
Appendicitis is the most common surgical emergency in children, and acute abdominal pain accounts for 5% to 10% of all pediatric emergency department (PED) visits.1-3 The diagnosis of appendicitis can be difficult, with many children receiving a misdiagnosis on initial presentation.4 Furthermore, negative appendectomy and perforation rates remain high, indicating a need to reevaluate the diagnostic assessment for this condition.5-8
Computed tomography (CT) has high sensitivity and specificity for appendicitis and is heavily relied on in the evaluation of possible appendicitis.9 However, despite dramatic increases in CT use, substantial improvements in patient outcomes have not been realized.5,10-13 This discrepancy is potentially the result of overuse of CT, which is problematic because it results in unnecessary exposure to ionizing radiation, prolonged PED visits, and increased costs.6,13,14
Prior studies have described substantial variability in the evaluation and management of suspected appendicitis in children.10,15 Standardizing the approach to patients with suspected appendicitis through clinical prediction rules could reduce variability and reliance on CT, thus promoting the delivery of efficient, safe, and cost-effective health care.16 Clinical prediction rules can be used to stratify patients by risk, allowing for tailored management based on patients' risks for disease.17
In 2005, our research team published a low-risk clinical prediction rule for pediatric appendicitis.18 Single-center internal validation revealed a sensitivity and negative predictive value (NPV) of 98% (95% CI, 89%-100%) and 98% (85%-100%), respectively.18 Hypothetical application of the rule could have led to a 20% reduction in CT use. Before implementation, independent validation of this rule is important. The objective of the present study was to validate and potentially refine our clinical prediction rule in a multicenter cohort of children and adolescents with suspected appendicitis.
We performed a prospective, cross-sectional study of children and adolescents with suspected appendicitis at 10 PEDs that are members of the Pediatric Emergency Medicine Collaborative Research Committee (PEM-CRC) of the American Academy of Pediatrics. The PEM-CRC reviewed and approved the final study protocol. The study was approved by each participating site's institutional review board, and data user agreements were formalized between the sites and the central data center. Seven institutional review boards granted a waiver of written informed consent/assent and instead allowed verbal consent. At the 3 remaining sites, written consent from the guardians and assent from patients 7 years or older were obtained.
Children and adolescents aged 3 to 18 years presenting to the PED with acute abdominal pain of less than 96 hours' duration and undergoing evaluation for suspected appendicitis were approached for enrollment. We defined patients with suspected appendicitis as those for whom the treating physician obtained blood tests, radiological studies (CT and/or ultrasonography [US]), or a surgical consultation for the purpose of diagnosing appendicitis. Radiological studies or surgical consultations were obtained at the discretion of the treating physician. We excluded patients with pregnancy, prior abdominal surgery (eg, gastrostomy tube or abdominal hernia repair), chronic abdominal illness or pain (eg, inflammatory bowel disease, chronic pancreatitis, or chronic/recurrent appendicitis), sickle cell anemia, cystic fibrosis, or a medical condition affecting the provider's ability to obtain an accurate history. We also excluded patients who had radiological studies (CT or US) of the abdomen performed before arrival in the PED or a history of abdominal trauma within 7 days of the PED evaluation.
Before initiation of the study, principal investigators at each site received standardized training that included a detailed manual of operations and instructions on the proper completion of case report forms (CRFs). Principal investigators subsequently conducted group and one-on-one instructional sessions with clinicians who worked in their respective PEDs.
A PEM attending or fellow physician completed a standardized history and physical examination on a structured CRF. A resident physician, nurse practitioner, or physician assistant was allowed to complete the CRF with attending oversight. A subset of participants had a separate, independent assessment performed by a second clinician within 30 to 60 minutes of the first evaluation. Clinicians completed CRFs before knowledge of CT or US results.
The CRFs were completed on paper and subsequently entered into a computer program (Adobe Pro; Adobe Systems) for electronic transfer to the central data management warehouse through an electronic CRF (TeleForm; Verity, Inc). Quality assurance practices at the data warehouse included surveillance for missing and duplicate data. We determined capture rate by reviewing the PED visit, admission, pathology and radiology logs for 2 random days of each study month. Two sites were able to perform active surveillance (daily data capture monitoring). We compared demographic, clinical, and outcome data between enrolled and missed patients to detect possible enrollment bias.
The primary outcome was the test performance of the clinical prediction rule to identify patients at low risk for appendicitis. Patient disposition was based on physician discretion. Among patients undergoing surgery, we determined the presence of appendicitis from the attending pathologist's written report. Appendiceal perforation was determined from the attending surgeon's written operative report. A priori, we standardized the terms used to code pathology and operative reports.
For patients discharged from the PED, we conducted telephone follow-up within 2 weeks to determine resolution of signs and symptoms, visits to other sites of care, and need for surgical intervention. If we were unable to contact the guardian, we reviewed the medical record for 90 days after the index PED visit to determine whether the patient underwent CT, US, or an operation at that facility.
The previously published low-risk prediction rule consisted of the following variables: absolute neutrophil count of 6.75 × 10³/μL or less (to convert the count to ×10⁹ per liter, multiply by 1), absence of nausea, and absence of maximal tenderness in the right lower quadrant (RLQ) of the abdomen. On the CRFs, clinicians had the option of coding the presence of nausea as yes, no, or don't know and maximal tenderness in the RLQ as yes, no, or unsure. Responses of don't know or unsure were analyzed as if the patient had the finding. We excluded patients if any of the prediction rule components were missing. A sensitivity analysis was performed to determine the effect on test performance of recoding don't know/unsure findings as present, absent, or missing. We calculated performance of the rule as sensitivity, specificity, positive predictive value (PPV), and NPV. We assessed the accuracy of the low-risk rule based on whether patients were identified as being at low risk in either of the terminal decision tree nodes (as analyzed in the original study).18
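The four performance measures named above follow from a 2 × 2 table of rule result against final diagnosis. The sketch below assumes that a "positive" rule result means not low risk, so a false negative is a patient with appendicitis whom the rule labels low risk. The function name and the counts in the usage lines are hypothetical and are not study data.

```python
def rule_performance(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2 x 2 counts.

    tp: appendicitis, rule says not low risk
    fn: appendicitis, rule says low risk (missed case)
    fp: no appendicitis, rule says not low risk
    tn: no appendicitis, rule says low risk
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not from the study).
result = rule_performance(tp=90, fn=10, fp=60, tn=40)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under this convention, NPV is the share of low-risk patients who truly do not have appendicitis, which is the quantity a low-risk rule is mainly judged on.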
We anticipated that our validated prediction rule may have diminished performance; thus, a priori we planned to refine the rule. We conducted binary recursive partitioning analyses (CART, version 6.0; Salford Systems) to refine our prediction rule and create models that had higher sensitivity (>95%) without affecting specificity (25%-35%). We aimed to create rules for which the risk of appendicitis in the low-risk group was less than or, at minimum, similar to the approximately 6.0% to 7.5% false-negative rate of CT findings.9,19 We entered variables into the model that were included in our original study as well as any patient history and physical examination variables that had at least moderate interrater reliability (κ > 0.4).20 The following variables were entered: duration of abdominal pain, nausea, emesis, history of focal RLQ pain, presence of abdominal tenderness, maximal tenderness in the RLQ, abdominal pain with walking, abdominal pain on the right side with walking, and the absolute neutrophil and white blood cell counts using both continuous and categorical cutoff points. We identified the categorical cutoff points through the use of univariate recursive partitioning. For this analysis, responses that were marked unsure or don't know were coded as missing data. We used the Gini splitting method for classification trees and internally validated the results of our refined model using 10-fold cross validation. To create the models, we varied costs to always favor not missing a case of appendicitis rather than diagnosing appendicitis in a patient who did not have the illness.
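As a rough illustration of the Gini splitting criterion mentioned above, the sketch below computes the Gini impurity of a node and the impurity decrease produced by one candidate binary split. It uses the standard textbook definition only; it does not reproduce the CART software, the misclassification costs, or the cross-validation used in the study. The function names and the counts are hypothetical.

```python
def gini_impurity(n_appendicitis, n_no_appendicitis):
    """Gini impurity of a node holding two classes of patients."""
    total = n_appendicitis + n_no_appendicitis
    if total == 0:
        return 0.0
    p = n_appendicitis / total
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)


def impurity_decrease(parent, left, right):
    """Impurity decrease of a binary split.

    Each argument is a tuple (n_appendicitis, n_no_appendicitis).
    The two children must partition the parent node.
    """
    n_parent = sum(parent)
    weighted_children = (
        sum(left) / n_parent * gini_impurity(*left)
        + sum(right) / n_parent * gini_impurity(*right)
    )
    return gini_impurity(*parent) - weighted_children


# Hypothetical node of 100 patients split on one candidate variable.
parent = (40, 60)
left = (5, 45)    # eg, finding absent
right = (35, 15)  # eg, finding present
print(f"impurity decrease: {impurity_decrease(parent, left, right):.3f}")
```

A recursive partitioning procedure evaluates many candidate splits in this way and keeps the one with the largest decrease, then repeats the search within each child node.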
Patients were enrolled in 10 PEDs with broad United States geographic distribution from March 1, 2009, through April 30, 2010. We removed data from 1 site before analysis because their capture rate was less than 40%. Therefore, the study cohort consisted of 2625 patients across the remaining 9 sites, representing 70.8% of eligible patients. Enrollment by site ranged from 223 to 473 patients, and the capture rate varied from 48% to 96%. A total of 1018 patients (38.8% [95% CI, 36.9%-40.7%]) were diagnosed as having appendicitis, of whom 275 (27.0% [24.4%-30.0%]) had a perforated appendix. Of those undergoing an operation, no evidence of appendicitis by pathology was found in 95 patients (negative appendectomy rate, 8.5% [95% CI, 7.0%-10.3%]). We completed telephone follow-up on 87.8% of patients discharged from the PED. None of the 186 patients lost to telephone follow-up had evidence of an appendectomy via review of the medical record (Figure 1).
Figure 1. Flow diagram of study population and final diagnosis. ED indicates emergency department.
The mean (SD) age of enrolled patients was 10.8 (3.8) years; 51.0% were male. The most common diagnoses among patients who did not undergo an appendectomy included nonspecific abdominal pain (42.6%), gastroenteritis (14.3%), and constipation (12.1%). Clinicians obtained CT in 55.4%, US in 36.8%, and both procedures in 11.6% of patients. In total, 2116 patients (80.6%) underwent diagnostic imaging. Missed patients (those not enrolled) were similar to those enrolled, with a mean (SD) age of 11.0 (4.1) years, 52.8% being male, and a 41.5% rate of appendicitis (29.5% of whom had a perforated appendix) (Table 1). Among missed patients, clinicians used US more frequently (67.9%) and CT less frequently (44.3%), and there was a higher rate of using CT or US (93.4%).
Complete data for rule performance were available for 2390 patients (91.0%). The most common reason for exclusion from analysis was the absence of a white blood cell count (188 patients). The test characteristics of validation are provided in Table 2; we include the test characteristics of the derivation sample from our previously published study18 for comparison.
Theoretical application of the low-risk prediction rule for appendicitis is presented in Figure 2. A sensitivity analysis revealed no significant change in test performance based on the coding of unsure and don't know responses (data available on request). In total, 573 patients (24.0% of those with complete data) were identified as being at low risk; of these, 64 (11.2%) underwent an operation for presumed appendicitis, of whom 42 had pathology-proven appendicitis and 22 had negative findings. In addition, 296 (51.7%) underwent CT; 241 (42.1%), US; and in total, 465 (81.2%), CT or US. Application of the low-risk rule would have theoretically prevented 22 unnecessary operations and 465 (24%) diagnostic imaging studies but would have missed 42 patients (4.5% [95% CI, 3.4%-6.1%]) who were ultimately diagnosed as having appendicitis. In Table 3, we present the clinical characteristics of the 42 patients with appendicitis who were misclassified by the prediction rule.
Figure 2. Effect of hypothetical application of the Low-Risk Appendicitis Rule. ANC indicates absolute neutrophil count (to convert count to ×10⁹ per liter, multiply by 1); RLQ, right lower quadrant.
The refined model identified patients as being at low risk for appendicitis if they met one of the following: (1) absolute neutrophil count of 6.75 × 10³/μL or less and no maximal tenderness in the RLQ or (2) absolute neutrophil count of 6.75 × 10³/μL or less with maximal tenderness in the RLQ but no abdominal pain with walking/jumping or coughing (Figure 3). Test characteristics of the refined model are presented in Table 4. Of the 400 patients identified as being at low risk, 27 (6.8%) underwent an operation, 19 of whom had appendicitis. In addition, of these 400 patients, clinicians obtained CT or US in 301 (75.2%), including 180 patients (45.0%) who had a CT.
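The two low-risk criteria of the refined model can be written as a short decision function. This is a minimal sketch of the rule as stated in the text, not the study's software. The function and parameter names are hypothetical, and returning None when a needed input is missing is an assumption made for illustration.

```python
from typing import Optional

ANC_CUTOFF = 6.75  # absolute neutrophil count cutoff, x10^3/uL, from the text


def refined_low_risk(
    anc: Optional[float],
    max_rlq_tenderness: Optional[bool],
    pain_walk_jump_cough: Optional[bool],
) -> Optional[bool]:
    """Return True if low risk, False if not low risk, None if undetermined.

    Inputs are evaluated in tree order, so a later variable is only
    required when the earlier answers make it relevant (an assumption).
    """
    if anc is None:
        return None
    if anc > ANC_CUTOFF:
        return False  # neither low-risk criterion can be met
    if max_rlq_tenderness is None:
        return None
    if not max_rlq_tenderness:
        return True   # criterion 1: low ANC, no maximal RLQ tenderness
    if pain_walk_jump_cough is None:
        return None
    # criterion 2: low ANC, maximal RLQ tenderness, no pain with movement
    return not pain_walk_jump_cough


print(refined_low_risk(5.2, False, True))
print(refined_low_risk(5.2, True, False))
print(refined_low_risk(5.2, True, True))
print(refined_low_risk(9.0, False, False))
```

Written this way, the rule reduces to a low neutrophil count combined with the absence of at least one of the two examination findings.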
Figure 3. Refined Low-Risk Appendicitis Rule and rule performance. ANC indicates absolute neutrophil count (to convert count to ×10⁹ per liter, multiply by 1); RLQ, right lower quadrant.
In this large, prospective, multicenter study of children and adolescents with suspected appendicitis, our previously derived low-risk prediction rule maintained high sensitivity and modest specificity in a validation cohort. Furthermore, we refined our low-risk rule to improve test sensitivity. These low-risk rules identify pediatric patients with suspected appendicitis at low but not zero risk for appendicitis.
Our study adds to a growing literature on the use of clinical prediction rules for treating patients in the emergency department.17,21-23 Similar to prior studies, our goal was to identify patients at low risk for illness to reduce reliance on diagnostic imaging and inefficient care delivery. As our study confirms, CT is heavily relied on to diagnose and manage acute abdominal pain in children.10 The potential benefit of our clinical prediction rule lies in its ability to stratify patients, identifying those at low risk for appendicitis.
Several previous investigators have developed clinical prediction rules or scores for the diagnosis of appendicitis.24-27 The Samuel24 and Alvarado25 scores are the most commonly cited, and although the original studies noted excellent test performance, external validation by independent investigators revealed conflicting results.28-30 Both scoring systems were intended to identify patients with appendicitis rather than identify a low-risk group.24,25 Compared with these prior scores, advantages of our prediction rule include its simplicity, external validation in a large sample across multiple PEDs, and ability to more accurately identify a low-risk cohort. Last, a decision tree format may be easier than a numerical-based score for clinicians to remember and use.
Although the sensitivity of our validated low-risk prediction rule was high, the NPV was lower than in the derivation study (92.7% vs 98%). As a result, 42 children (4.5% of patients with appendicitis) were misclassified as being at low risk. This rate of misclassification may concern clinicians, given the potential medical and legal consequences associated with missed appendicitis. We anticipated this issue and thus refined our rule with the goal of improving the sensitivity and NPV. Our refined prediction rule provides sensitivity and NPV that are somewhat higher (98.1% and 95.3%, respectively), but the specificity and PPV of the rule diminish. Furthermore, the refined rule would still miss some cases of appendicitis (19 patients). Consequently, either rule may be appropriate to identify a low-risk population (risk of appendicitis: 7.3% with the validated rule and 4.8% with the refined rule), whom clinicians may choose to observe for progression of abdominal symptoms. The use of US and/or surgical consultation may also be viable alternatives. Given the high rate of negative appendectomies (no appendicitis on pathology) in the low-risk cohort (>30%) compared with the overall study cohort (8.5%), it would be prudent for surgeons to be cautious about operating on low-risk patients. Ultimately, our prediction rules may be best suited for integration into an appendicitis care algorithm to help stratify risk and guide clinical management (eg, observation with serial examination for low-risk patients).
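The risk of appendicitis in each low-risk group, and the corresponding NPV, can be checked from counts reported in the Results: 573 low-risk patients with 42 cases of appendicitis for the validated rule, and 400 low-risk patients with 19 cases for the refined rule. The sketch below only restates that arithmetic; the function name is hypothetical.

```python
def low_risk_group_summary(n_low_risk, n_appendicitis):
    """Risk of appendicitis among patients labeled low risk, and the NPV."""
    risk = n_appendicitis / n_low_risk
    return {"risk": risk, "npv": 1.0 - risk}


# Counts taken from the Results section of the text.
validated = low_risk_group_summary(n_low_risk=573, n_appendicitis=42)
refined = low_risk_group_summary(n_low_risk=400, n_appendicitis=19)

for label, summary in (("validated", validated), ("refined", refined)):
    print(f"{label}: risk {summary['risk']:.4f}, NPV {summary['npv']:.4f}")
```

The computed proportions agree, after rounding, with the 7.3% and 4.8% risks and the 92.7% and 95.3% NPVs reported in the text.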
We should consider the potential use of our low-risk prediction rules in relation to the performance of CT. Although CT has demonstrated a sensitivity of 94% (95% CI, 92%-97%) and a specificity of 95% (94%-97%) for appendicitis, the PPV of CT will be lower when it is used in populations with a low prevalence of appendicitis.9 In addition, the NPV of CT is not 100%.19 In our present study, if clinicians had acted on CT results in isolation, appendicitis would have been missed in 20 patients inappropriately discharged home, and 27 patients would have had negative appendectomies (data available on request). These results support concerns raised by several investigators that the excessive use of CT may lead to unnecessary operations, delays in care, and increased costs.31-33
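The dependence of PPV on prevalence can be made concrete with Bayes' theorem. The sketch below takes the CT sensitivity (94%) and specificity (95%) cited above as fixed and varies the prevalence. The 38.8% value is the rate of appendicitis in this cohort; the 10% and 5% values are hypothetical lower-prevalence settings chosen only for illustration, and the function name is likewise hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 0.388 is the cohort prevalence from the text; 0.10 and 0.05 are hypothetical.
for prevalence in (0.388, 0.10, 0.05):
    ppv, npv = predictive_values(0.94, 0.95, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With sensitivity and specificity held constant, PPV falls and NPV rises as prevalence falls, which is the pattern described above for CT in low-prevalence populations.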
Physicians may have concerns regarding the reliability of the clinical variables included in our prediction rules. Through the course of our study, we collected data on the interrater reliability of clinical history and physical examination findings, the results of which have been presented previously.20 The presence of nausea had a κ value of 0.44 (95% CI, 0.37-0.52); maximal tenderness in the RLQ, 0.45 (0.36-0.54); and pain with walking, 0.54 (0.45-0.63), indicating moderate reliability for all 3 variables.
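For readers unfamiliar with the κ statistic, the sketch below shows how agreement between 2 clinicians on a yes/no finding can be summarized. It assumes the unweighted Cohen κ for 2 raters; the text does not specify which κ variant was used. The function name and the counts are hypothetical and are not study data.

```python
def cohen_kappa(both_yes, both_no, first_yes_only, second_yes_only):
    """Unweighted Cohen kappa for 2 raters and a yes/no finding."""
    n = both_yes + both_no + first_yes_only + second_yes_only
    observed = (both_yes + both_no) / n
    # Marginal proportion of "yes" for each rater.
    first_yes = (both_yes + first_yes_only) / n
    second_yes = (both_yes + second_yes_only) / n
    # Agreement expected by chance from the marginal proportions.
    expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical paired assessments of one finding in 100 patients.
kappa = cohen_kappa(both_yes=40, both_no=30, first_yes_only=15, second_yes_only=15)
print(f"kappa: {kappa:.2f}")
```

Because κ subtracts chance agreement, 2 clinicians can agree on most patients and still yield only a moderate value.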
Ultimately, the clinical utility of our prediction rules is in their ability to provide a quantitative assessment of risk for appendicitis. In this study, we elected to stratify patients as being at low risk or not low risk for appendicitis. In this scheme, patients identified as being at low risk had a risk of appendicitis of 7.3% (validated rule) or 4.8% (refined rule). However, by observing how patients flow within the decision trees, specific risks for appendicitis can be determined depending on a patient's particular signs and symptoms (range, 3.6%-12.4% for the various terminal nodes). As electronic health record–based clinical decision support becomes more common within emergency departments, the ability to calculate an appendicitis risk may allow physicians to tailor management based on their own risk tolerance and availability of diagnostic imaging and surgical resources.
Our study had several limitations. Enrollment of patients varied considerably by site. To assess for enrollment bias, we conducted random medical record audits, which revealed that missed patients were similar to those enrolled. Although we enrolled pediatric patients from numerous geographic regions, enrollment occurred exclusively in PEDs. Therefore, our results may not be generalizable to other settings. Our clinical prediction rule was developed and validated in cohorts in which the rate of appendicitis was quite high (>30%). Use of the rule in an urgent care or clinic setting, where the rate of appendicitis is lower, might result in a higher NPV but lower PPV. We collected clinical variables only at the time of enrollment; thus, the patients' examination findings may have changed before final disposition. Although we made every attempt to follow up patients discharged from the PED, we cannot exclude the possibility that some underwent appendectomies at alternative facilities. Last, we stress that our study was not an implementation study; clinicians should understand the potential risks and benefits of using the validated rule prior to formal implementation and of the refined rule before external validation.
We validated and refined a clinical prediction rule for pediatric appendicitis, identifying a population of children with suspected appendicitis who are at low but not zero risk for appendicitis. If applied, clinicians will need to balance the risks of missing a case of appendicitis with the increased risk of negative appendectomies and the potential long-term risks associated with exposure to ionizing radiation. Clinicians should consider alternative strategies, such as observation or US, for patients identified as being at low risk rather than proceeding to immediate CT.
Correspondence: Anupam B. Kharbanda, MD, MSc, Department of Pediatric Emergency Medicine, Children's Hospital and Clinics of Minnesota, 2525 Chicago Ave S, Minneapolis, MN 55404 (email@example.com).
Accepted for Publication: March 6, 2012.
Author Contributions: This manuscript was written by Dr Kharbanda, and all authors take full responsibility for the integrity of the data and the accuracy of data analysis. Study concept and design: Kharbanda, Dudley, Bajaj, Stevenson, Macias, Mittal, Bachur, Bennett, Sinclair, Huang, and Dayan. Acquisition of data: Kharbanda, Dudley, Bajaj, Stevenson, Macias, Mittal, Bachur, Bennett, Sinclair, Huang, and Dayan. Analysis and interpretation of data: Kharbanda, Dudley, Stevenson, Macias, Bachur, Sinclair, and Dayan. Drafting of the manuscript: Kharbanda and Dayan. Critical revision of the manuscript for important intellectual content: Kharbanda, Dudley, Bajaj, Stevenson, Macias, Mittal, Bachur, Bennett, Sinclair, Huang, and Dayan. Statistical analysis: Kharbanda, Bachur, and Dayan. Obtained funding: Kharbanda and Dayan. Administrative, technical, and material support: Kharbanda, Bajaj, Stevenson, Macias, and Sinclair. Study supervision: Macias, Mittal, Sinclair, and Dayan.
Members of the Executive Committee of the Pediatric Emergency Medicine Collaborative Research Committee of the American Academy of Pediatrics: Marc Auerbach, MD, and Lei Chen, MD, Yale School of Medicine, New Haven, Connecticut; Todd Chang, MD, Children's Hospital Los Angeles, Los Angeles, California; Andrea Cruz, MD, and Charles G. Macias, MD, Baylor College of Medicine, Texas Children's Hospital, Houston; Denise Dowd, MD, Children's Mercy Hospitals and Clinics and University of Missouri–Kansas City School of Medicine; Stephen Freedman, MD, Pediatric Hospital for Sick Children, Toronto, Ontario, Canada; Anupam B. Kharbanda, MD, University of Minnesota, Minneapolis; Prashant Mahajan, MD, University of Michigan, Detroit; Jared Muenzer, MD, and David Schnadower, MD, Washington University School of Medicine, St Louis, Missouri; and Joe Zorc, MD, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.
Financial Disclosure: Dr Kharbanda received salary support through the Empire Clinical Research Program of New York State.
Funding/Support: This study was supported by grant UL1 RR024156 from the National Center for Research Resources, a component of the National Institutes of Health (NIH) and NIH Roadmap for Medical Research. The PEM-CRC data center is supported in part by the Center for Clinical Effectiveness at Baylor College of Medicine and Texas Children's Hospital.
Previous Presentations: This study was presented in part at the Annual Meeting of Pediatric Academic Societies; April 30, 2011; Denver, Colorado.
Role of the Sponsor: The sponsor had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Online-Only Material: This article is featured in the Archives Journal Club.
Additional Contributions: We thank all the clinicians who enrolled patients into this study and the research coordinators who greatly facilitated study completion.
This article was corrected for errors on August 30, 2012.
Kharbanda AB, Dudley NC, Bajaj L, et al; Pediatric Emergency Medicine Collaborative Research Committee of the American Academy of Pediatrics. Validation and Refinement of a Prediction Rule to Identify Children at Low Risk for Acute Appendicitis. Arch Pediatr Adolesc Med. 2012;166(8):738-744. doi:10.1001/archpediatrics.2012.490