Distribution of individuals within each risk category when the model includes traditional risk factors vs traditional risk factors plus coronary artery calcium score (CACS).
Tamar S. Polonsky, Robyn L. McClelland, Neal W. Jorgensen, Diane E. Bild, Gregory L. Burke, Alan D. Guerci, Philip Greenland. Coronary Artery Calcium Score and Risk Classification for Coronary Heart Disease Prediction. JAMA. 2010;303(16):1610–1616. doi:10.1001/jama.2010.461
Author Affiliations: Department of Preventive Medicine, Northwestern University, Chicago, Illinois (Drs Polonsky and Greenland); Department of Biostatistics, University of Washington, Seattle (Dr McClelland and Mr Jorgensen); Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, Maryland (Dr Bild); Division of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, North Carolina (Dr Burke); and St Francis Hospital, The Heart Center, Roslyn, New York (Dr Guerci).
Context The coronary artery calcium score (CACS) has been shown to predict future coronary heart disease (CHD) events. However, the extent to which adding CACS to traditional CHD risk factors improves classification of risk is unclear.
Objective To determine whether adding CACS to a prediction model based on traditional risk factors improves classification of risk.
Design, Setting, and Participants CACS was measured by computed tomography in 6814 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), a population-based cohort without known cardiovascular disease. Recruitment spanned July 2000 to September 2002; follow-up extended through May 2008. Participants with diabetes were excluded from the primary analysis. Five-year risk estimates for incident CHD were categorized as 0% to less than 3%, 3% to less than 10%, and 10% or more using Cox proportional hazards models. Model 1 used age, sex, tobacco use, systolic blood pressure, antihypertensive medication use, total and high-density lipoprotein cholesterol, and race/ethnicity. Model 2 used these risk factors plus CACS. We calculated the net reclassification improvement and compared the distribution of risk using model 2 vs model 1.
Main Outcome Measures Incident CHD events.
Results During a median of 5.8 years of follow-up among a final cohort of 5878, 209 CHD events occurred, of which 122 were myocardial infarction, death from CHD, or resuscitated cardiac arrest. Model 2 resulted in significant improvements in risk prediction compared with model 1 (net reclassification improvement = 0.25; 95% confidence interval, 0.16-0.34; P < .001). In model 1, 69% of the cohort was classified in the highest or lowest risk categories compared with 77% in model 2. An additional 23% of those who experienced events were reclassified as high risk, and an additional 13% without events were reclassified as low risk using model 2.
Conclusion In this multi-ethnic cohort, addition of CACS to a prediction model based on traditional risk factors significantly improved the classification of risk and placed more individuals in the most extreme risk categories.
The coronary artery calcium score (CACS) has been shown in large prospective studies to be associated with the risk of future cardiovascular events.1-4 Recent data from the Multi-Ethnic Study of Atherosclerosis (MESA), a population-based cohort of individuals without known cardiovascular disease, found that a CACS greater than 300 was associated with a hazard ratio of nearly 10 for future coronary heart disease (CHD) events.4 In addition, including CACS in a prediction model based on traditional risk factors significantly improved the prediction of future CHD events.
While these findings clearly demonstrated strong statistical association of CACS with cardiovascular risk, assessing the clinical value of new markers in risk prediction requires assessment of several additional measures.5 Further investigation should evaluate how closely the predicted probabilities of risk using the new marker reflect observed risk. In addition, Pencina et al6 recently introduced the concept of net reclassification improvement (NRI), which measures the extent to which persons with and without events are appropriately reclassified into clinically accepted higher or lower risk categories with the addition of a new marker. The NRI therefore provides a method of quantifying the enhancement in clinically useful risk estimation when a novel marker is added to a standard risk prediction model. This new approach is rapidly being accepted as an important method for evaluating the clinical utility of new risk markers.7,8
We evaluated the extent to which adding CACS to a model based on traditional risk factors correctly reclassifies participants in the MESA cohort in terms of risk of future CHD events. We determined how the addition of CACS to a prediction model changes the overall distribution of estimated risk. In contrast to previous studies that reported statistical associations only, we sought to clarify the potential utility of CACS as a tool for risk stratification.
The study design for MESA has been published elsewhere.9 In brief, MESA is a prospective cohort study of 6814 persons aged 45 to 84 years without known cardiovascular disease. Participants were recruited from July 2000 through September 2002 and identified themselves as white (38%), black (28%), Hispanic (22%), or Chinese (12%) at the time of enrollment. The study was approved by the institutional review boards of each site, and all participants gave written informed consent.
Carr et al10 reported the details of the MESA CT scanning and interpretation methods. Scanning centers assessed coronary calcium by chest computed tomography (CT) with either a cardiac-gated electron-beam CT scanner (Chicago, Illinois; Los Angeles, California; and New York, New York field centers) or a multidetector CT system (Baltimore, Maryland; Forsyth County, North Carolina; and St Paul, Minnesota field centers). Certified technologists scanned all participants twice over phantoms of known physical calcium concentration. A radiologist or cardiologist read all CT scans at a central reading center (Los Angeles Biomedical Research Institute at Harbor–UCLA, Torrance, California). We used the mean Agatston score for the 2 scans in all analyses.11 Intraobserver and interobserver agreements were excellent (κ = 0.93 and κ = 0.90, respectively). The participants were told either that they had no coronary calcification or that the amount was less than average, average, or greater than average and that they should discuss the results with their physicians.
As part of the baseline examination, clinical teams collected information on traditional cardiovascular risk factors, including age, blood pressure, and tobacco use (current, former, or no prior use). Total and high-density lipoprotein cholesterol, triglycerides, and plasma glucose were measured from blood samples obtained after a 12-hour fast. Using a Dinamap Pro 1000 automated oscillometric sphygmomanometer (Critikon, Tampa, Florida), we measured resting blood pressure 3 times with the participant in a seated position. The mean of the last 2 blood pressure measurements was used.
For the primary analysis, 883 individuals with diabetes were excluded because current National Cholesterol Education Program guidelines consider diabetes a CHD risk equivalent.12 Diabetes was defined as a fasting plasma glucose level greater than 126 mg/dL (7.0 mmol/L) or a history of medical treatment for diabetes.
At intervals of 9 to 12 months, interviewers telephoned participants or a family member to inquire about interim hospital admissions, outpatient diagnoses of cardiovascular disease, and deaths. Follow-up for this analysis extended through May 2008. To verify self-reported diagnoses, trained personnel abstracted data from hospital records for an estimated 96% of hospitalized cardiovascular events; records were available for 95% of outpatient diagnostic encounters. Next of kin and physicians were interviewed for participants who experienced out-of-hospital cardiovascular deaths. Two physician members of the MESA mortality and morbidity review committee independently classified events and assigned incidence dates. If they disagreed, the full committee made the final classification. We classified CHD events as myocardial infarction (MI), death due to CHD, resuscitated cardiac arrest, definite or probable angina followed by coronary revascularization, and definite angina not followed by coronary revascularization. Revascularizations that were not based on a diagnosis of angina were not included in the primary end point.
The diagnosis of MI was based on a combination of symptoms, electrocardiographic findings, and levels of circulating cardiac biomarkers. A death was considered related to CHD if it occurred within 28 days after an MI, if the participant had had chest pain within 72 hours before death, or if the participant had a history of CHD and there was no known nonatherosclerotic, noncardiac cause of death. Reviewers classified resuscitated cardiac arrest when a patient successfully recovered from full cardiac arrest through cardiopulmonary resuscitation (including cardioversion). Adjudicators graded angina on the basis of their clinical judgment. A classification of definite or probable angina required clear and definite documentation of symptoms distinct from the diagnosis of MI. A classification of definite angina also required objective evidence of reversible myocardial ischemia or obstructive coronary artery disease. A more detailed description of the MESA follow-up methods is available at http://www.mesa-nhlbi.org.
Five-year estimated incident CHD risk was calculated for each participant using a Cox proportional hazards model. Model 1 used the standard Framingham risk factors (age, sex, smoking, systolic blood pressure, use of antihypertensive medications, and high-density lipoprotein and total cholesterol) and race/ethnicity. Model 2 used these standard risk factors plus CACS (expressed as ln[CACS + 1]). The risk estimates were categorized as 0% to less than 3%, 3% to less than 10%, and 10% or more, corresponding to low, intermediate, and high risk, respectively. Tests for nonproportional hazards using Schoenfeld residuals were not significant. Interaction of CACS with sex was also tested and was not significant (P = .97).
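The two transformations described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names `transform_cacs` and `risk_category` are hypothetical, and the predicted risk is assumed to be supplied as a proportion (eg, 0.05 for 5%) from a separately fitted Cox model.

```python
import math


def transform_cacs(cacs: float) -> float:
    """Return ln(CACS + 1), the form in which CACS enters model 2."""
    if cacs < 0:
        raise ValueError("CACS cannot be negative")
    return math.log(cacs + 1.0)


def risk_category(five_year_risk: float) -> str:
    """Map a 5-year predicted CHD risk (a proportion) to a risk category.

    Cut points follow the text: 0% to <3% low, 3% to <10% intermediate,
    10% or more high.
    """
    if not 0.0 <= five_year_risk <= 1.0:
        raise ValueError("risk must be a proportion between 0 and 1")
    if five_year_risk < 0.03:
        return "low"
    if five_year_risk < 0.10:
        return "intermediate"
    return "high"
```

Adding 1 before taking the logarithm keeps the transform defined for participants with a score of 0, who map to exactly 0.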
We assessed discrimination, which reflects a marker's ability to differentiate between individuals who do and do not have events. We constructed receiver operating characteristic (ROC) curves and compared the areas under the ROC curves with and without CACS in the model. We estimated predicted values from a survival model and then treated the end point as binary and uncensored for purposes of estimating and testing the areas under the ROC curves.13 As a sensitivity analysis, we also calculated the Harrell C statistic, which allows censored data.14 These estimates were identical through 2 decimal places to the binary version for both models.
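For a binary, uncensored end point, the area under the ROC curve equals the probability that a randomly chosen participant with an event has a higher predicted risk than a randomly chosen participant without one. The sketch below illustrates that definition by direct pairwise comparison; the function name `auc` and its inputs are hypothetical, and it does not handle censoring as the Harrell C statistic does.

```python
def auc(event_scores, nonevent_scores):
    """Area under the ROC curve by pairwise comparison.

    Each (event, nonevent) pair scores 1 if the event has the higher
    predicted risk, 0.5 if the two are tied, and 0 otherwise.
    """
    if not event_scores or not nonevent_scores:
        raise ValueError("need at least one event and one nonevent")
    concordant = 0.0
    for e in event_scores:
        for n in nonevent_scores:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(event_scores) * len(nonevent_scores))
```

The pairwise loop is quadratic in cohort size, which is acceptable for illustration; a rank-based calculation would be preferable for a cohort of several thousand.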
The integrated discrimination index (IDI) measures the improvement in the average sensitivity with the new marker, and subtracts any increase in the mean 1 − specificity. The integrals of sensitivity and 1 − specificity over all possible cutoff values from the (0, 1) interval are used.6 The IDI can be expressed as (EY1 − EY0) − (EX1 − EX0), where EY1 and EY0 are the mean expected probabilities of events and nonevents, respectively, for the model including the new marker and EX1 and EX0 are the mean expected probabilities of events and nonevents, respectively, for the model without the new marker. When the incidence of events is relatively small, it is recommended to calculate the relative IDI as well.6 The relative IDI is defined as (EY1 − EY0)/(EX1 − EX0) − 1.
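The two expressions above translate directly into code. The following is a minimal sketch under stated assumptions: the function name `idi` is hypothetical, and each argument is assumed to be a list of model-predicted probabilities for participants with or without events.

```python
from statistics import mean


def idi(new_events, new_nonevents, old_events, old_nonevents):
    """Return (absolute IDI, relative IDI).

    new_* are predicted probabilities from the model with the marker,
    old_* from the model without it, split by event status.
    """
    # Discrimination slope: mean predicted risk in events minus nonevents.
    slope_new = mean(new_events) - mean(new_nonevents)  # EY1 - EY0
    slope_old = mean(old_events) - mean(old_nonevents)  # EX1 - EX0
    absolute = slope_new - slope_old
    relative = slope_new / slope_old - 1.0
    return absolute, relative
```

For example, if the discrimination slope rises from 0.10 without the marker to 0.20 with it, the absolute IDI is 0.10 and the relative IDI is 1.0 (a 100% improvement).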
Cross-tabulations of risk categories based on the models with and without CACS were performed to describe the number and percentage of participants who were reclassified appropriately (ie, to a lower risk group for nonevents or to a higher risk group for events) and inappropriately (ie, to a lower risk group for events or to a higher risk group for nonevents). We calculated the NRI per Pencina et al.6 The NRI is estimated as ([number of events reclassified higher − number of events reclassified lower]/number of events) + ([number of nonevents reclassified lower − number of nonevents reclassified higher]/number of nonevents).
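The NRI formula above can be computed from per-participant risk categories under the two models. This is an illustrative sketch only; the function name `nri`, the category labels, and the input layout are assumptions rather than the authors' implementation.

```python
ORDER = {"low": 0, "intermediate": 1, "high": 2}


def nri(old_categories, new_categories, had_event):
    """Return (NRI for events, NRI for nonevents, overall NRI).

    old_categories and new_categories hold each participant's risk
    category under the model without and with the new marker;
    had_event holds True for participants who experienced an event.
    """
    if not (len(old_categories) == len(new_categories) == len(had_event)):
        raise ValueError("inputs must have the same length")
    events_up = events_down = nonevents_up = nonevents_down = 0
    n_events = sum(1 for e in had_event if e)
    n_nonevents = len(had_event) - n_events
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one nonevent")
    for old, new, event in zip(old_categories, new_categories, had_event):
        shift = ORDER[new] - ORDER[old]
        if event:
            if shift > 0:
                events_up += 1
            elif shift < 0:
                events_down += 1
        else:
            if shift > 0:
                nonevents_up += 1
            elif shift < 0:
                nonevents_down += 1
    nri_events = (events_up - events_down) / n_events
    nri_nonevents = (nonevents_down - nonevents_up) / n_nonevents
    return nri_events, nri_nonevents, nri_events + nri_nonevents
```

Returning the two components separately, rather than only their sum, makes it possible to see whether an improvement comes mainly from participants with events or from those without.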
Kaplan-Meier 5-year event rates were calculated. The threshold for statistical significance was set a priori at P < .05.
We sought to determine how the use of lipid-lowering therapy and the presence of diabetes might change the NRI. The NRI was recalculated after excluding individuals who were receiving lipid-lowering therapy at the baseline examination (16% of the cohort). We also recalculated the NRI after including individuals with diabetes. Presence or absence of diabetes was incorporated into the model as an additional variable.
We assessed calibration, which measures how closely the predicted probabilities of risk using the new marker reflect observed risk. We calculated the survival-adapted Hosmer-Lemeshow χ2 statistic for both models.15 A P value less than .05 represents a significant difference between the expected and observed event rates and suggests that the model is not well calibrated.
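To illustrate the idea behind a calibration statistic, the sketch below computes the standard Hosmer-Lemeshow statistic for a binary outcome. It is not the survival-adapted version used in this study, which replaces observed counts with Kaplan-Meier estimates; the function name `hosmer_lemeshow` and the equal-size grouping rule are assumptions.

```python
def hosmer_lemeshow(predicted, observed, groups=10):
    """Standard (binary-outcome) Hosmer-Lemeshow chi-square statistic.

    Participants are sorted by predicted risk and split into roughly
    equal-size groups; observed and expected event counts are then
    compared within each group.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one participant per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        events = sum(o for _, o in chunk)
        variance = expected * (1.0 - expected / size)
        # Skip groups with zero binomial variance (every prediction is
        # exactly 0 or exactly 1) to avoid dividing by zero.
        if variance > 0:
            statistic += (events - expected) ** 2 / variance
    return statistic
```

A larger statistic indicates a bigger gap between expected and observed event counts; converting it to a P value requires a χ2 distribution function, which is omitted here.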
Finally, we examined the risk stratification capacity as described by Janes et al.16 The risk stratification capacity measures the ability of a model to reclassify participants from the intermediate risk categories to the highest and lowest risk categories, where treatment strategies are better delineated.
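One simple summary of risk stratification capacity is the share of participants a model places in the highest or lowest risk category. The sketch below assumes category labels of "low", "intermediate", and "high"; the function name `proportion_extreme` is hypothetical.

```python
def proportion_extreme(categories):
    """Proportion of participants classified as low or high risk."""
    if not categories:
        raise ValueError("no participants supplied")
    extreme = sum(1 for c in categories if c in ("low", "high"))
    return extreme / len(categories)
```

Applying the function to the categories from each model and comparing the two proportions shows how far the new marker moves participants out of the intermediate group.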
All analyses were conducted with Stata software, version 11.0 (Stata Corp, College Station, Texas).
The study population included 5931 individuals without diabetes at baseline. Follow-up or risk factor information was not available for 53 individuals, leaving a final cohort of 5878 participants. There were 209 CHD events during a median follow-up of 5.8 years (interquartile range, 5.6-5.9 years). One hundred twenty-two individuals had a major event (96 had an MI, 14 died of CHD, and 12 had a resuscitated cardiac arrest) and 87 had angina (81 with definite angina, of whom 67 were revascularized, and 6 with probable angina followed by revascularization).
Table 1 shows the baseline cardiovascular risk factors, stratified by estimated 5-year risk categories. As expected, the cardiovascular risk profile was less favorable in those with a higher predicted risk and included a higher proportion of men and older individuals.
Measures of discrimination improved significantly when CACS was included in the prediction model. The area under the ROC curve for the prediction of CHD events was 0.76 (95% confidence interval [CI], 0.72-0.79) using model 1 and increased to 0.81 (95% CI, 0.78-0.84; P < .001) with the addition of CACS, consistent with a previous MESA report based on fewer events.4 The IDI was 0.026 (P < .001), with the relative IDI showing an 81% improvement in the discrimination slope.
Cross-tabulations of the 5-year estimated risk using the models with and without CACS are shown in Table 2. Kaplan-Meier event rates for the model using traditional risk factors and the model using risk factors plus CACS are shown. The survival-adapted Hosmer-Lemeshow χ2 statistic was 6.72 (P = .46) for the model with traditional risk factors and was 9.15 (P = .24) with the addition of CACS, suggesting that neither model had a significant lack of fit.
The addition of CACS to the predictive model resulted in reclassification of 26% of the sample. The NRI for events was 0.23 and the NRI for nonevents was 0.02, achieving an NRI for the entire study cohort of 0.25 (95% CI, 0.16-0.34; P < .001) (Table 2). The NRI was essentially unchanged after including participants with diabetes (0.27; 95% CI, 0.19-0.34) or excluding participants who were receiving lipid-lowering therapy at the baseline examination (0.26; 95% CI, 0.16-0.37).
Overall, 728 individuals in the entire cohort were reclassified to a higher risk category, with an event rate of 8.7% (95% CI, 6.9%-11.1%), and 814 were reclassified to a lower risk category, with an event rate of 2.7% (95% CI, 1.8%-4.1%). The 5-year event rate for the entire cohort was 3.1% (95% CI, 2.7%-3.6%).
We evaluated separately the most clinically meaningful reclassifications, which would presumably have the largest effect on treatment decisions. When CACS was added to the model, 298 (5.1%) were reclassified as high risk. Among those upgraded to high risk, 49 individuals (16.4%) experienced events. Conversely, 744 (12.7%) were reclassified as low risk, of whom 17 (2.3%) experienced events. Two high-risk individuals who were reclassified as low risk experienced events (6.3%).
Among intermediate-risk individuals, 292 (16%) were reclassified as high risk, while 712 (39%) were classified as low risk (NRI, 0.55; 95% CI, 0.41-0.69; P < .001). The improvement in risk classification is more balanced between events and nonevents for intermediate-risk individuals than the overall cohort (0.29 for events and 0.26 for nonevents). Furthermore, of the 115 events that occurred among intermediate-risk participants, 48 (41%) were among individuals reclassified as high risk whereas 15 (13%) were among individuals reclassified as low risk.
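The events component of the intermediate-risk NRI can be checked from the counts reported above. This is a worked arithmetic sketch; the variable names are illustrative. The nonevents component is not recomputed because the total number of intermediate-risk participants without events is not stated in the text.

```python
# Counts reported for participants at intermediate risk under model 1.
events_total = 115            # events among intermediate-risk participants
events_reclassified_up = 48   # events moved to high risk by model 2
events_reclassified_down = 15 # events moved to low risk by model 2

# NRI for events = (events moved up - events moved down) / all events.
nri_events = (events_reclassified_up - events_reclassified_down) / events_total
print(round(nri_events, 2))  # 0.29
```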
The hazard ratios associated with risk of a CHD event before and after adjustment for CACS are shown in Table 3. Inclusion of CACS into the model substantially attenuated the risk associated with all of the risk factors, although the hazard ratio associated with high-density lipoprotein cholesterol was least influenced by the inclusion of CACS to the model.
The risk stratification capacity of a CACS-adjusted model is shown in the Figure. The left panel shows that including CACS in the model places 77% of the overall population into either the highest or lowest risk categories, compared with 69% with traditional risk factors alone. With the addition of CACS to the model, an additional 23% of those who experienced events were reclassified as high risk (center panel) and an additional 13% of those who did not experience events were reclassified as low risk (right panel).
This study demonstrates that adding CACS to traditional risk factors significantly improves the classification of risk for the prediction of CHD events in an asymptomatic population-based sample of men and women drawn from 4 US racial/ethnic groups. Incorporation of an individual's CACS leads to a more refined estimation of future risk of CHD events than traditional risk factors alone. The intermediate-risk group achieved a substantially higher NRI than the overall cohort and, therefore, appears to benefit the most from a CACS-adjusted strategy. These results suggest that a significant amount of clinically useful reclassification may occur when CACS is added to risk assessment in asymptomatic intermediate-risk patients.
Considerable debate remains about how best to use CACS for risk assessment. Current American College of Cardiology/American Heart Association statements recommend that asymptomatic individuals at intermediate Framingham risk may be reasonable candidates for CHD testing using CACS.17 However, particular concern has been raised about the safety and cost associated with the widespread use of CACS. One recent study suggested an elevated cancer risk if a calcium score is obtained every 5 years.18 Others have questioned whether a CACS-guided strategy may actually cost more money and prevent fewer events than simply treating all patients at intermediate risk.19 In the setting of such uncertainty, it is important to understand how to maximize the potential benefits of using CACS while minimizing harm.
Direct comparisons with studies evaluating the NRI of other biomarkers should be made with caution because the number of risk categories used, the definition of the primary outcome, and the length of follow-up often differ between studies. However, it is of interest that the NRI achieved was negligible with the addition of lipoprotein particles, 0.034 with glycosylated hemoglobin, 0.047 with midregional proadrenomedullin plus N-terminal pro-B-type natriuretic peptide, and 0.068 with high-sensitivity C-reactive protein plus family history.20-23 In another study from MESA, the use of brachial artery flow-mediated dilation resulted in an NRI of 0.29.24 However, this included a substantial proportion of inappropriate reclassifications downward among individuals who experienced events (23%).
An important effect of a marker for the prediction of risk is the number of persons identified as having a higher disease risk and, consequently, becoming eligible to receive more intensive therapy as a result of screening. A relatively small proportion of the total MESA population, 5.1%, was reclassified as high risk. Importantly, almost 60% of the events (123/209) occurred among individuals who were not classified as high risk either by traditional risk factors or CACS. The smaller number of participants who were classified as high risk is likely in part a reflection of the study population. More than half of the MESA cohort is in the lowest 5-year risk category based on traditional risk factors. Participants who were low risk required very elevated CACS to be reclassified as high risk. In contrast, the proportions of individuals reclassified were larger among intermediate-risk participants (16% to high risk and 39% to low risk). Almost half of the events among participants who were intermediate risk based on traditional risk factors alone occurred in individuals who were reclassified as high risk based on their CACS (48/115).
Inspection of the relative contribution of correct reclassification for events and nonevents also reveals important strengths and weaknesses of a CACS-adjusted strategy. For the entire cohort, the NRI for events was 0.23, whereas the NRI for nonevents was 0.02. The results suggest that when applied to a general population, a CACS-adjusted strategy may effectively identify more individuals who experience events, but at the expense of identifying many other individuals as higher risk who do not experience events. With the availability of generic statins and years of data confirming their tolerability, the disadvantages of “overtreatment” may have become less significant over time. However, the improvement in risk classification is more balanced among intermediate-risk individuals (0.29 for events and 0.26 for nonevents), again suggesting that a CACS-adjusted strategy may be most clinically useful in this group.
Another metric of a risk marker's utility is whether it separates individuals into more clinically relevant risk categories, as measured by the risk stratification capacity. Ideally, a model would reclassify most individuals out of the intermediate-risk group and into the highest or lowest risk categories. When CACS is added to the model, more than half of the intermediate-risk individuals are reclassified as high or low risk, where treatment strategies are better established.
The values in the margins of the reclassification table best represent the net effect of including a novel marker in a risk prediction model.16 However, looking at individual cells can shed light on the potential limits of applying a marker to the clinical setting. Only 6 of more than 3000 low-risk individuals were reclassified as high risk, suggesting that CACS may not be an efficient screening tool among low-risk individuals. An additional concern is whether physicians can safely withhold or decrease therapy for patients who are reclassified to lower risk categories. We report that individuals who were reclassified from high risk to low risk experienced an event rate that was higher than predicted by the model with CACS. While the absolute number of events was small, our data support the recommendation that patients who are at high risk should be treated regardless of their CACS and, as a result, should not undergo CACS testing for additional risk assessment.
A critical question not answered in this study is whether screening for subclinical disease with CACS improves patient outcomes. In a recent American Heart Association scientific statement, the steps needed before widespread adoption of a risk marker were outlined.5 Initial phases of evaluation should demonstrate that a marker can differentiate between people with and without events, prospectively predict future events, and add predictive information to traditional risk factors—all of which have been accomplished with CACS. The results in the current report address the fourth phase, in which a marker must be shown to adjust predicted risk sufficiently to change recommended therapy. Whether the use of a marker improves clinical outcomes enough to justify the associated cost should be tested in the final phase, preferably with a randomized clinical trial.
Our study has limitations that should be acknowledged. Our results will need to be validated in additional populations. Had our study population contained a larger proportion of higher-risk individuals, we might have seen higher event rates and different rates of reclassification. It is also possible that with longer follow-up and additional events, our results could change.
In MESA, CACS was revealed to participants and their physicians. This could have affected our results in 2 ways. Knowledge of a high CACS may have biased the diagnosis of angina and, thus, could have increased the NRI. Alternatively, participants with a high CACS may have had more intensive risk factor modification, thereby reducing the number of events and decreasing the NRI. We do not expect that the diagnosis of major coronary events would have been influenced by CACS.
In conclusion, we found that use of CACS plus traditional risk factors substantially enhances the ability to classify a multiethnic cohort of asymptomatic persons without known cardiovascular disease into clinically accepted categories of risk of future CHD events. The results provide encouragement for moving to the next stage of evaluation to assess the effect of CACS use on clinical outcomes.
Corresponding Author: Philip Greenland, MD, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Dr, 11th Floor, Chicago, IL 60611.
Author Contributions: Dr Polonsky had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Polonsky, McClelland, Greenland.
Acquisition of data: McClelland, Bild, Burke, Guerci, Greenland.
Analysis and interpretation of data: Polonsky, McClelland, Jorgensen, Bild, Burke, Guerci, Greenland.
Drafting of the manuscript: Polonsky, Greenland.
Critical revision of the manuscript for important intellectual content: McClelland, Jorgensen, Bild, Burke, Guerci.
Statistical analysis: McClelland, Jorgensen.
Obtained funding: Bild, Burke, Guerci, Greenland.
Administrative, technical or material support: Polonsky, McClelland, Jorgensen, Bild, Burke, Guerci, Greenland.
Study supervision: McClelland, Bild, Burke, Guerci, Greenland.
Financial Disclosures: Dr Guerci reports that he has received grant support from Pfizer. Dr Greenland reports that he has served as a consultant to Pfizer and GE/Toshiba. No other disclosures were reported.
Funding/Support: MESA was supported by contracts N01-HC-95159 through N01-HC-95169 from the National Heart, Lung, and Blood Institute (NHLBI). Dr Polonsky is supported by an NHLBI training grant in cardiovascular epidemiology and prevention (grant 5T32HL069771). Dr Greenland is supported by a grant from the National Center for Research Resources (grant number 1UL1RR025741).
Role of the Sponsors: The NHLBI participated in the design and conduct of MESA. A member of the NHLBI staff served as a coauthor and had input into the collection, management, analysis, and interpretation of the data and in preparation of the manuscript, as did the other coauthors. Although additional members of the NHLBI staff were able to view the manuscript prior to submission, they did not participate in the decision to submit the manuscript or approve it prior to publication. The National Center for Research Resources had no role in the design and conduct of the study, in the collection, analysis, and interpretation of the data, or in the preparation, review, or approval of the manuscript.
Additional Information: A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
Additional Contributions: We thank the other investigators, the staff, and the participants of MESA for their valuable contributions.