Receiver operating characteristic curves for computed tomographic coronary angiography at 50% stenosis and 70% stenosis. Area under the receiver operating characteristic curve point estimates of study: 50% threshold, 0.873; 70% threshold, 0.848.
Chow BJW, Freeman MR, Bowen JM, Levin L, Hopkins RB, Provost Y, Tarride J, Dennie C, Cohen EA, Marcuzzi D, Iwanochko R, Moody AR, Paul N, Parker JD, O’Reilly DJ, Xie F, Goeree R. Ontario Multidetector Computed Tomographic Coronary Angiography Study: Field Evaluation of Diagnostic Accuracy. Arch Intern Med. 2011;171(11):1021-1029. doi:10.1001/archinternmed.2011.74
Computed tomographic coronary angiography (CTCA) has rapidly gained clinical acceptance as a diagnostic tool for the detection of obstructive coronary artery disease (CAD). Single-center CTCA studies have reported very good operating characteristics for CAD diagnosis and a positive impact on referrals for invasive CA (ICA).1- 5 Multicenter studies using CTCA have yielded mixed results, with the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) ranging from 85% to 99%, 64% to 90%, 64% to 91%, and 83% to 99%, respectively.6- 9 Although initial results are promising, they may be biased by the use of “core laboratories/readers” and the restriction to specific CT vendors; therefore, the results may not be reflective of those expected in day-to-day practice. To better understand the potential diagnostic accuracy of 64-slice CTCA in day-to-day practice, a multicenter, multivendor field evaluation was performed comparing CTCA with ICA in patients with low and intermediate pretest probability for obstructive CAD.
Between September 2006 and June 2009, patients referred for ICA were screened at the 4 participating institutions. The following 2 groups of patients scheduled for ICA were eligible for the study: group 1 comprised patients with valvular heart disease, congenital heart disease, cardiomyopathy, or aortic disease, and group 2 comprised symptomatic patients with intermediate (10%-90%) pretest probability for CAD.
Patients with a high pretest probability (>90%) for CAD; documented CAD; a history of revascularization; renal insufficiency (glomerular filtration rate <40 mL/min for nondiabetic patients and <60 mL/min for patients with diabetes mellitus); age younger than 18 years; contrast allergy; pregnancy or breastfeeding; an uncontrolled heart rate; chronic atrial fibrillation; or those unable to perform a 20-second breath-hold were excluded. Patients were also excluded if the CTCA could not be performed within 10 days of the ICA. The study was approved by each participating Institutional Human Research Ethics Board, and all patients provided written informed consent.
At the time of CTCA, a medical history was ascertained to document symptoms, cardiac risk factors, and medications. Each patient's pretest probability for obstructive CAD was calculated using age, sex, symptoms,10- 12 and, if available, results from recent stress tests.13
Before image acquisition, metoprolol or diltiazem (oral and/or intravenous) was administered targeting a heart rate of 65 beats or fewer per minute, and in the absence of contraindications, nitroglycerin (0.3-0.8 mg) was administered sublingually.5,14- 16 Image acquisition was performed according to the enrolling institutions' clinical protocols.5,7,16 The CTCA data sets were acquired using a triphasic intravenous contrast (Visipaque 320 or Omnipaque 350; GE Healthcare, Princeton, New Jersey) administration protocol with a bolus tracking or timing bolus technique. Contrast infusion rates were tailored according to patient weight and scan duration using a minimum of 4 mL/s (<60 kg), 5 mL/s (60-80 kg), and 6 mL/s (>80 kg), for a total of 60 to 120 mL followed by a 50-mL saline bolus.
Retrospective electrocardiographic-gated data sets were acquired with 1 of 2 commercially available CT scanners (GE Volume CT Scanner; GE Healthcare, Milwaukee, Wisconsin; or Aquilion 64 MDCT scanner; Toshiba Medical Systems, Tochigi, Japan). For evaluation of the coronary arteries, the data sets were reconstructed at the phase(s) with the least cardiac motion.5,7,16
The CTCA data sets were postprocessed using 1 of 2 workstations (GE Advantage Volume Share; GE Healthcare, Milwaukee, Wisconsin; or Vitrea Imaging Software; Vital Images Inc, Minnetonka, Minnesota). Each study was interpreted independently at each site by 2 expert observers who were blinded to all clinical data, and discrepancies were resolved by consensus or a third reader. Coronary artery lumina were assessed using axial images and oblique and curved multiplanar reformations, with window levels and widths optimized for each study.
A 17-segment model of the coronary arteries and a 4-point grading score (normal, mild [<50%], moderate [50%-69%], and severe [≥70%]) were used for the evaluation of coronary diameter stenosis.17 In segments that were “unevaluable,” forced reading was performed, and readers provided their “best educated guess.”9,16 The diagnosis of obstructive CAD was assessed on a per-patient and a per-vessel level.6 Patients with obstructive CAD were further categorized as having high-risk CAD or non–high-risk CAD (CAD model 1). High-risk CAD was defined as having left main stenosis (≥50%) or 3-vessel (≥70%) or 2-vessel (≥70%) disease involving the proximal left anterior descending artery.18,19 Also, patients with obstructive CAD (≥50% diameter stenosis) were categorized as having 1-, 2-, or 3-vessel disease (CAD model 2).16,20
Invasive CA was performed according to clinical routine.21 Using the same CTCA visual grading system, all ICAs were reviewed by 2 observers who were blinded to clinical data and prior CTCA results. Discrepancies were resolved by consensus.
Statistical analyses were performed using SAS software (Version 9.1.3; SAS Institute Inc, Cary, North Carolina), and statistical significance was defined as P < .05. Continuous variables with normal distributions were presented as means and standard deviations, and those with nonnormal distributions were presented as medians and interquartile ranges. Categorical variables were presented as frequencies with percentages. To compare patient characteristics and CTCA imaging parameters, the Wilcoxon rank sum test was used for continuous variables, and the Fisher exact test was used for categorical variables. Receiver operating characteristic curves were constructed for CTCA to detect obstructive CAD. Furthermore, a multiple logistic regression was run with false CTCA results (false-positive or false-negative) as the dependent binary variable, with a binary predictor for center 1 vs others (secondary to observed differences in diagnostic accuracy) and continuous predictors for body mass index, heart rate during CT, pretest probability for CAD, and coronary calcification (number of coronary segments with calcification on CT scans).
Over an enrollment period of 34 months, 594 patients were prospectively screened, with a total of 250 patients meeting the enrollment criteria. Of these, 181 consented to the study, but 11 withdrew from the study and 1 was excluded from the analysis because of an interval of more than 10 days between CTCA and ICA. The final study population comprised 169 patients (mean [SD] age, 61.0 [10.4] years; men, 52.6%; mean [SD] pretest likelihood for obstructive CAD, 46.8% [29.4%]) (Table 1). A total of 344 patients were excluded because of renal insufficiency (n = 82 [23.8%]), chronic atrial fibrillation (n = 76 [22.1%]), history of acute myocardial infarction (n = 36 [10.5%]), need for urgent ICA (n = 32 [9.3%]), previous coronary artery bypass graft or percutaneous coronary intervention (n = 30 [8.7%]), allergy to contrast (n = 10 [2.9%]), uncontrolled heart rate (n = 6 [1.7%]), or being unable to hold breath for 20 seconds (n = 4 [1.2%]). Various other nonprotocol reasons for ineligibility were present in 68 of the patients (19.8%), with the most common being a recent other CT test, cardiac catheterization, or magnetic resonance imaging (n = 12 [3.5%]) and the inability to coordinate successive CTCA and ICA (n = 15 [4.4%]).
Group 1 comprised 52 patients who were primarily referred for valvular heart disease (n = 46), cardiomyopathy (n = 3), congenital heart disease (n = 2), or aortic disease (n = 1). The remaining 117 patients (group 2) were symptomatic and were referred to ICA for intermediate pretest probability for CAD. Radiation exposure resulting from CTCA and diagnostic ICA was measured, and the mean (SD) exposure was 18.6 (4.7) mSv from CTCA and 11.0 (6.8) mSv from ICA (Table 2). The CTCA and ICA imaging parameters and results are listed in Table 2.
The overall prevalence of angiographic disease (≥50% stenosis by ICA) was 53%, with a prevalence of 21% in group 1 and 61% in group 2 (P < .001). Therefore, the overall patient-based sensitivity, specificity, PPV, and NPV of CTCA for detecting obstructive CAD (≥50% diameter stenosis) were 81.3% (95% confidence interval [CI], 71.0%-89.1%), 93.3% (95% CI, 85.9%-97.5%), 91.6% (95% CI, 82.5%-96.8%), and 84.7% (95% CI, 76.0%-91.2%), respectively (Table 3). The area under the receiver operating characteristic curve (AUC) was 0.873 (Figure and Table 4). The operating characteristics in each subgroup (groups 1 and 2) were not statistically different (Table 3). Using a 70% threshold for obstructive CAD, the sensitivity of CTCA was lower, while the other operating characteristics remained unchanged (Table 3). Using a vessel-based analysis (≥50% and ≥70% diameter stenosis), the apparent decrease in sensitivity was not statistically significant (P = .56) (Table 3).
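As an illustrative sketch, the patient-based operating characteristics can be recomputed from a 2 × 2 table. The counts used here (65 true-positives, 15 false-negatives, 6 false-positives, and 83 true-negatives) are not taken from Table 3; they are back-calculated assumptions chosen only because they reproduce the reported point estimates. The exact (Clopper-Pearson) binomial interval is also an assumption, since the text does not name the method used for the confidence intervals.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for the proportion k/n, located by bisection."""

    def solve(successes, target):
        # binom_cdf decreases as p increases, so bisect on p in [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(successes, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


# Hypothetical counts (assumption: back-calculated, not read from Table 3).
TP, FN, FP, TN = 65, 15, 6, 83

metrics = {
    "sensitivity": (TP, TP + FN),
    "specificity": (TN, TN + FP),
    "PPV": (TP, TP + FP),
    "NPV": (TN, TN + FN),
}

# Each entry: (point estimate, lower 95% bound, upper 95% bound).
results = {
    name: (k / n, *clopper_pearson(k, n)) for name, (k, n) in metrics.items()
}

for name, (est, lo, hi) in results.items():
    print(f"{name}: {est:.4f} (95% CI, {lo:.4f}-{hi:.4f})")
```

The bisection avoids any dependency outside the standard library; a statistics package with an inverse beta function would give the same interval more directly.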
The agreement between CTCA and ICA for the severity of CAD was good, with a weighted κ of 0.72 for CAD model 1 and 0.72 for CAD model 2 (Table 5).
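For readers unfamiliar with the weighted κ, the following is a minimal sketch of how it is computed from an agreement table of ordered categories (for example, no obstructive CAD and 1-, 2-, or 3-vessel disease). The table is entirely hypothetical and is not study data, and the linear weighting scheme is an assumption because the text does not state which weights were used.

```python
def weighted_kappa(table, scheme="linear"):
    """Cohen's weighted kappa for a square table (rows: method A, columns: method B)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant categories.
        d = abs(i - j) / (k - 1)
        return d if scheme == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * table[i][j] / total for i, j in cells)
    expected = sum(weight(i, j) * row_tot[i] * col_tot[j] / total**2 for i, j in cells)
    return 1 - observed / expected


# Hypothetical 4-category agreement table (assumption, for illustration only).
toy_table = [
    [40, 5, 1, 0],
    [6, 20, 4, 0],
    [1, 5, 10, 2],
    [0, 1, 2, 3],
]

print(weighted_kappa(toy_table))
print(weighted_kappa(toy_table, scheme="quadratic"))
```

Near-miss disagreements (adjacent categories) are penalized less than distant ones, which is why a weighted statistic suits ordered severity grades better than an unweighted κ.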
Potential predictors of diagnostic accuracy were examined (Table 6). On univariate analysis, factors that increased the likelihood of false CTCA results (false-positive or false-negative) were pretest probability of CAD (odds ratio [OR], 1.02; P = .005) and presence of coronary calcification (OR, 1.09; P = .03). The CTCAs performed at center 1 were less likely to have false-positive or false-negative results (OR, 0.31; P = .004). On multivariate analysis, the enrolling center predicted false CTCA results (OR, 0.28; P = .005). When the center variable was excluded from the regression model, pretest probability became a significant predictor of increased false CTCA results (OR, 1.02; P = .02).
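The odds ratios for the continuous predictors look small because they apply to a single unit of the predictor. As a hedged illustration, and assuming that pretest probability was entered per percentage point and calcification per calcified segment (the units are assumptions; only the calcification count is described in the statistical methods), a logistic model implies that the odds ratio for a difference of several units is the per-unit odds ratio raised to that power.

```python
def scaled_odds_ratio(or_per_unit, delta):
    """Odds ratio implied by a logistic model for a `delta`-unit change in a predictor."""
    return or_per_unit**delta


# Assumed units: percentage points of pretest probability, calcified segments.
or_pretest_30_points = scaled_odds_ratio(1.02, 30)
or_calcification_5_segments = scaled_odds_ratio(1.09, 5)

print(f"30-point higher pretest probability: OR {or_pretest_30_points:.2f}")
print(f"5 additional calcified segments: OR {or_calcification_5_segments:.2f}")
```

Under these assumptions, a patient with a pretest probability 30 points higher would have roughly 1.8 times the odds of a false CTCA result, and 5 additional calcified segments would correspond to roughly 1.5 times the odds.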
Significant variability was observed in diagnostic accuracy across enrolling centers (P < .001) (Table 7), with the greatest variability observed in sensitivity (range, 50.0%-93.2%) and the NPV (range, 42.9%-94.7%). When center 1 was compared with the remaining centers, there was a statistically significant difference in operating characteristics (P < .001). Differences in patient demographics and CT parameters across centers were examined, and there was a significant difference in pretest likelihood (P = .003), prevalence of CAD (P = .005), smoking status (P = .03), and contrast infusion rate (P < .001) (Table 8).
Acknowledging the potential for bias in patient populations at the enrolling centers and the presumed high NPV of CTCA, sensitivity analyses were performed in patients with a lower pretest probability for CAD (<50% and <70%). Although limited by issues of power, the operating characteristics (sensitivity, specificity, PPV, NPV, and AUC) of CTCA for patients with a pretest probability of less than 50% were 95.7% (95% CI, 78.1%-99.9%), 94.6% (95% CI, 81.8%-99.3%), 91.7% (95% CI, 73.0%-99.0%), 97.2% (95% CI, 85.5%-99.9%) and 0.951, respectively, at center 1 and 25.0% (95% CI, 0.63%-80.6%), 94.4% (95% CI, 72.7%-99.9%), 50.0% (95% CI, 1.26%-98.7%), 85.0% (95% CI, 62.1%-96.8%), and 0.597, respectively, at centers 2, 3, and 4. Similarly, for patients with a pretest probability of less than 70%, the sensitivity, specificity, PPV, NPV, and AUC at center 1 were 93.1% (95% CI, 77.2%-99.2%), 95.8% (95% CI, 85.7%-99.5%), 93.1% (95% CI, 77.2%-99.2%), 95.8% (95% CI, 85.7%-99.5%), and 0.945, respectively, compared with 57.1% (95% CI, 28.9%-82.3%), 92.9% (95% CI, 76.5%-99.1%), 80.0% (95% CI, 44.4%-97.5%), 81.3% (95% CI, 63.6%-92.8%), and 0.750, respectively, at centers 2, 3, and 4.
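The rationale for examining lower pretest probability groups can be sketched with Bayes' theorem: if sensitivity and specificity are held fixed, the NPV rises and the PPV falls as disease prevalence falls. The example below uses the overall reported sensitivity (81.3%) and specificity (93.3%); the prevalence values are arbitrary illustrations. Holding sensitivity and specificity constant is a simplifying assumption that the subgroup results above do not fully support.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by Bayes' theorem at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative prevalence values (assumption), with the overall reported accuracy.
for prevalence in (0.2, 0.5, 0.7):
    ppv, npv = predictive_values(0.813, 0.933, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.3f}, NPV {npv:.3f}")
```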
To better understand whether the disagreements between CTCA and ICA (κ = 0.75; 95% CI, 0.65-0.85) might be explained by interobserver variability, the interobserver variability of ICA and CTCA interpretations was examined. The agreement between ICA readers was 0.88 (95% CI, 0.81-0.94), which was similar to the agreement between CTCA readers of 0.81 (95% CI, 0.75-0.88). To resolve disagreement between readers, consensus third reviews were required in 69 of the CTCA images (40.8%) and 65 of the ICA images (38.5%).
Our real-world field evaluation of the diagnostic accuracy of CTCA suggests that the operating characteristics of CTCA are good, but implementation into clinical practice may result in a decline in sensitivity and NPV. Such a change, although unexpected, is consistent with the application of single-site results of testing to real-life practice. As testing becomes more widely applied to additional populations and used by multiple users, the sensitivity and specificity are frequently adversely affected.22,23 Multivariate logistic regression analysis demonstrated that the enrolling center was a predictor of CTCA false reads and that center 1 had greater diagnostic accuracy with fewer false CTCA results. When the CTCA images for those patients with false diagnoses from the other centers (n = 14) were read at center 1, the presence of 50% or more stenosis was identified in 5 individual patients (36%) who were initially identified by CTCA as not having significant CAD, resulting in a recalculated sensitivity estimate of 87.5%, with no change in specificity. Furthermore, using both the CTCA and ICA readings from center 1, 10 patients (72%) had changes to their diagnosis, resulting in a change of sensitivity to 92.1% and specificity to 93.5%. Although the study was not designed to provide detailed statistical comparisons between the centers, it is highly likely that the discrepancy in diagnostic performance was influenced by patient-related parameters known to strongly affect the performance of CTCA and by center-specific factors. The higher proportion of patients with a lower pretest probability of CAD and disease prevalence in center 1 (only vs centers 3 and 4) likely influenced the overall performance of both CTCA and ICA. Center 1 also had significantly higher contrast infusion rates than the other centers. It is possible that additional factors such as differences in reading styles and visual thresholds for abnormal study findings may have influenced the results.
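The recalculated sensitivity can be checked with simple arithmetic. The sketch below assumes 65 true-positive and 15 false-negative CTCA results in the original reading; these counts are back-calculated from the reported 81.3% sensitivity and are an assumption rather than figures quoted in the text. Moving the 5 re-read patients from false-negative to true-positive then gives the recalculated estimate.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of ICA-positive patients that CTCA correctly identified."""
    return true_pos / (true_pos + false_neg)


# Assumed counts (back-calculated from the reported percentages).
TP, FN = 65, 15
RECLASSIFIED = 5  # false-negatives detected on re-reading at center 1

original = sensitivity(TP, FN)
recalculated = sensitivity(TP + RECLASSIFIED, FN - RECLASSIFIED)

print(f"original sensitivity: {original:.4f}")
print(f"recalculated sensitivity: {recalculated:.4f}")
```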
Given the potential for confounding variables associated with the diagnostic performance of CTCA, it is probable that a dedicated cardiac CT program with a small, focused group of technologists and nurses who routinely perform CTCA is more likely to be able to optimize patient heart rate and image acquisition parameters. Therefore, it is important to acknowledge that the accuracy results of a single center may not apply uniformly across all centers depending on local practice and the experience of observers.
Early multicenter studies have reported the diagnostic accuracy of CTCA, but their results also may not be uniformly translatable to all centers performing CTCA.6- 9 Because 2 of the 3 multicenter studies were restricted to a single vendor, their study results may not be applicable to centers using different or newer CT scanners. Similarly, previous multicenter studies used a core laboratory, with 2 to 6 core readers for CTCA image analysis, which could potentially overestimate the diagnostic accuracy of CTCA. More important is understanding the diagnostic accuracy in day-to-day practice. Although it is difficult to simulate daily practice when enrolling patients referred for ICA, several steps were undertaken to ensure that our results may better reflect real-world expectations. Our study purposely did not centralize CTCA reading (9 readers) or ICA reading (12 readers), which would potentially simulate the variability that might be observed at different centers in the real world. The study did use dual readings for both CTCA and ICA, which might result in potentially better diagnostic accuracy than would be seen in daily clinical practice. Similarly, we did not restrict the study to a single vendor, but we did not have representation from all vendors. Also, there was no restriction based on Agatston score, vessel size,7 or patient age.9 Therefore, the potential inclusion of older patients, patients with severe coronary artery calcification, or patients with small coronary vessels might bias the results toward a lower accuracy.
Unlike previous studies, quantitative CA was not routinely performed,6,7,9 which might result in greater interobserver variability in ICA and even greater discrepancy between CTCA and ICA. Since current clinical practice uses visual analysis of ICA, our method better reflects clinical practice in the real world. Similar to the study by Meijboom et al,9 “forced reads” were performed with an intention-to-diagnose analysis as opposed to excluding unevaluable segments,6,7 again better emulating the challenges of image interpretation that are experienced in daily clinical practice.
Previous CTCA multicenter studies also had a wide variation in disease prevalence and enrolled patients with known CAD, which could bias the interpretation of studies and thereby increase sensitivity for disease detection and decrease specificity. In fact, these studies did observe a higher sensitivity and a lower specificity compared with the present study.
Because our study was not powered to assess accuracy at individual centers, the observed results at centers with low enrollment may have occurred from chance or local bias. We acknowledge that the sensitivity of CTCA was lower than that of previous multicenter studies; however, we are mindful that the operating characteristics of our study remain comparable to those of the traditional noninvasive modalities that are routinely used.24,25 Further comparison studies are needed to better understand how CTCA compares with conventional noninvasive techniques.
This was a multicenter, multivendor, single-blinded prospective study. Although we did not restrict enrollment centers to a single vendor, our results may not translate to centers that use different vendors or newer CT systems. Because 60% of the enrolled patients were recruited from center 1, a bias may have been introduced, potentially inflating the real-world operating characteristics of CTCA. Also, although differences in diagnostic accuracy were observed between the centers, this study was not designed or powered to detect or determine the cause of these differences. We acknowledge the potential differences in CTCA accuracy at the different centers. Such results may reflect the real-world experience if CTCA is indiscriminately adopted.
However, our results highlight the need for quality assurance at centers that are planning to implement CTCA. We recognize that the enrollment of patients referred to ICA may subject the study to referral bias. To overcome such bias, further studies are needed to understand downstream resource use after CTCA and to confirm the accuracy of CTCA by enrolling a large consecutive CTCA cohort and performing ICA.
The calculated radiation exposure for both CTCA and ICA appears to be higher than that in previously reported studies. Patient exposure from ICA was directly measured and prospectively collected and likely reflects real-world practice. However, newer algorithms for CTCA acquisition have since been developed that significantly reduce radiation exposure without reduction in image quality. Our study supports the requirement to adopt radiation dosage reduction algorithms to lower CTCA radiation exposure below that associated with ICA.
Although CTCA has disseminated rapidly, enthusiasm for CTCA must be tempered by the reality that centers may have different patient cohorts, acquisition protocols, expertise, and interpretation thresholds. There is a need to develop standardized measures for CTCA acquisition and interpretation to ensure optimal patient diagnosis and care.
In conclusion, compared with ICA, CTCA appears to have good accuracy; however, significant variability in diagnostic accuracy was observed across the different enrolling centers. This variability may have clinical implications as more centers adopt CTCA. Further real-world evaluations are needed to fully understand the impact of accepting CTCA into routine clinical care.
Correspondence: Ron Goeree, MA, Programs for Assessment of Technology in Health Research Institute, St Joseph's Healthcare Hamilton, 25 Main St W, Ste 2000, Hamilton, ON L8P 1H1, Canada (firstname.lastname@example.org).
Accepted for Publication: December 17, 2010.
Published Online: March 14, 2011. doi:10.1001/archinternmed.2011.74
Author Contributions: Mr Goeree had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Chow, Freeman, Bowen, Levin, Tarride, Dennie, Cohen, Moody, Parker, and Goeree. Acquisition of data: Chow, Freeman, Bowen, Provost, Dennie, Marcuzzi, Iwanochko, Moody, Paul, and Goeree. Analysis and interpretation of data: Chow, Freeman, Bowen, Hopkins, Provost, Tarride, Cohen, Marcuzzi, Moody, O’Reilly, Xie, and Goeree. Drafting of the manuscript: Chow, Freeman, Bowen, Hopkins, Tarride, and Moody. Critical revision of the manuscript for important intellectual content: Chow, Freeman, Bowen, Levin, Provost, Tarride, Dennie, Cohen, Marcuzzi, Iwanochko, Moody, Paul, Parker, O’Reilly, Xie, and Goeree. Statistical analysis: Hopkins, Tarride, Xie, and Goeree. Obtained funding: Levin, Cohen, Parker, and Goeree. Administrative, technical, and material support: Chow, Freeman, Bowen, Provost, Marcuzzi, Iwanochko, Moody, Parker, and Goeree. Study supervision: Freeman, Provost, Dennie, Paul, Parker, O’Reilly, and Goeree.
Financial Disclosure: Dr Chow receives research support from GE Healthcare, Pfizer, and AstraZeneca and educational support from TeraRecon Inc. Drs Chow and Dennie receive fellowship training support from GE Healthcare. Dr Paul receives research support from Toshiba Medical Systems.
Funding/Support: Dr Chow is supported by Canadian Institutes of Health Research New Investigator Award MSH-83718. Drs Tarride and O’Reilly are supported by Ontario Ministry of Health and Long-Term Care Career Scientists Awards.
Additional Contributions: We acknowledge the support of the Medical Advisory Secretariat and the Ontario Health Technology Advisory Committee and extend our gratitude to the support staff at St Joseph's Healthcare Hamilton, St Michael's Hospital, Sunnybrook Health Sciences Centre, University Health Network, and University of Ottawa Heart Institute for their dedication to cardiac CT research.