Evidence reviews for the US Preventive Services Task Force (USPSTF) use an analytic framework to visually display the key questions that the review will address to allow the USPSTF to evaluate the effectiveness and safety of a preventive service. The questions are depicted by linkages that relate interventions and outcomes. Further details are available in the USPSTF procedure manual.16 ABI indicates ankle-brachial index; CAC, coronary artery calcium; CVD, cardiovascular disease; hsCRP, high-sensitivity C-reactive protein; MI, myocardial infarction.
aRisk factors: age, sex, blood pressure, levels of total and high-density lipoprotein cholesterol, smoking, diabetes, race/ethnicity.
In total, the current review included 54 articles (43 studies); studies may appear in more than 1 key question (KQ). Reasons for exclusion: Aim: Study aim was not relevant. Setting: Study was not conducted in a country relevant to US practice, or not conducted in, recruited from, or feasible for primary care or a health care system. Population: Study was not conducted in adults without known cardiovascular disease. Outcomes: Study did not report required outcomes. Intervention: Intervention was out of scope. Study design: Study did not use an included design. Quality: Study was poor quality. Base model: Eligible base models had to include age, sex, systolic blood pressure, antihypertensive medication use, total cholesterol, high-density lipoprotein cholesterol, and current smoking status; eligible base models could not include additional risk factors. Comparator: Study did not include an eligible model for comparison. Ancillary: The article met inclusion criteria, but more complete data such as a larger sample or longer follow-up was abstracted from a different included article.
aJanuary 2007-May 19, 2017 (high-sensitivity C-reactive protein); January 2008-May 19, 2017 (coronary artery calcium score); January 2012-May 19, 2017 (ankle-brachial index).
eMethods. Literature Search Strategies for Primary Literature
eTable 1. Inclusion Criteria
eTable 2. Quality Assessment Criteria
eTable 3. Examples of Types of Test Performance Measures for Comparing Risk Assessment or Prediction Models
eTable 4. Discrimination Outcomes in Included ABI Risk Prediction Studies (KQ2)
eTable 5. Discrimination Outcomes in Included hsCRP Risk Prediction Studies (KQ2)
eTable 6. Discrimination Outcomes in Included CAC Risk Prediction Studies (KQ2)
eTable 7. Evidence Overview for ABI, hsCRP, and CAC in Cardiovascular Risk Assessment
Lin JS, Evans CV, Johnson E, Redmond N, Coppola EL, Smith N. Nontraditional Risk Factors in Cardiovascular Disease Risk Assessment: Updated Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2018;320(3):281–297. doi:10.1001/jama.2018.4242
Importance
Incorporating nontraditional risk factors may improve the performance of traditional multivariable risk assessment for cardiovascular disease (CVD).
Objective
To systematically review evidence for the US Preventive Services Task Force on the benefits and harms of 3 nontraditional risk factors in cardiovascular risk assessment: the ankle-brachial index (ABI), high-sensitivity C-reactive protein (hsCRP) level, and coronary artery calcium (CAC) score.
Data Sources
MEDLINE, PubMed, and the Cochrane Central Register of Controlled Trials for studies published through May 22, 2017. Surveillance continued through February 7, 2018.
Study Selection
Studies of asymptomatic adults with no known cardiovascular disease.
Data Extraction and Synthesis
Independent critical appraisal and data abstraction by 2 reviewers.
Main Outcomes and Measures
Cardiovascular events, mortality, risk assessment performance measures (calibration, discrimination, or risk reclassification), and serious adverse events.
Results
Forty-three studies (N = 267 244) were included. No adequately powered trials have evaluated the clinical effect of risk assessment with nontraditional risk factors on patient health outcomes. The addition of the ABI (10 studies), hsCRP level (25 studies), or CAC score (19 studies) can improve both discrimination and reclassification; the magnitude and consistency of improvement varies by nontraditional risk factor. For the ABI, improvements in performance were the greatest for women, in whom traditional risk assessment has poor discrimination (C statistic change of 0.112 and net reclassification index [NRI] of 0.096). Results were inconsistent for hsCRP level, with the largest analysis (n = 166 596) showing a minimal effect on risk prediction (C statistic change of 0.0039, NRI of 0.0152). The largest improvements in discrimination (C statistic change ranging from 0.018 to 0.144) and reclassification (NRI ranging from 0.084 to 0.35) were seen for CAC score, although CAC score may inappropriately reclassify individuals not having cardiovascular events into higher-risk categories, as determined by negative nonevent NRI. Evidence for the harms of nontraditional risk factor assessment was limited to computed tomography imaging for CAC scoring (8 studies) and showed that radiation exposure is low but may result in additional testing.
Conclusions and Relevance
There are insufficient adequately powered clinical trials evaluating the incremental effect of the ABI, hsCRP level, or CAC score in risk assessment and initiation of preventive therapy. Furthermore, the clinical meaning of improvements in measures of calibration, discrimination, and reclassification in risk prediction studies is uncertain.
Cardiovascular disease (CVD) is the leading cause of death in the United States, accounting for about 1 in 3 deaths.1 The incidence of CVD is strongly associated with a set of traditional risk factors,2-5 which have been combined using multivariable risk assessment tools to estimate an individual’s risk for having a CVD event.6 However, these tools can underestimate or overestimate CVD risk.7 Inclusion of nontraditional biological and physiologic risk factors might improve the performance of tools. Given that risk estimates are used to guide preventive therapy such as aspirin and statins,6,8-10 improved risk assessment performance could result in improved CVD outcomes.
Previous research has established that abnormal values for the nontraditional risk factors of the ankle-brachial index (ABI), high-sensitivity C-reactive protein (hsCRP) level, and coronary artery calcium (CAC) score are significantly associated with morbidity and mortality.11-13 Moreover, this research has shown that these factors can improve the ability of models to distinguish between individuals who will and will not have events and to reclassify individuals into clinically actionable risk strata. However, previous systematic reviews were conducted before the release of newly recommended risk tools and treatment thresholds. Additionally, these reviews did not address the ability of nontraditional risk factors to improve on existing models’ overestimation or underestimation of risk.
Based on previous systematic reviews, the US Preventive Services Task Force (USPSTF) issued recommendations in 2009 and 2013, concluding that the evidence was insufficient to assess the benefits and harms of using nontraditional risk factors for cardiovascular risk assessment (I statements).14,15 This review sought to identify and appraise updated evidence on the benefits and harms of nontraditional risk factor assessment and therapy guided by the ABI, hsCRP level, and CAC score to support the USPSTF in updating its 2009 and 2013 recommendations.
This review addressed 5 key questions (KQs) as shown in Figure 1. Methodological details including study selection, a list of excluded studies, data analysis methods, and additional subpopulation results, as well as detailed descriptions of all models, are available in the full evidence report at https://www.uspreventiveservicestaskforce.org/Page/Document/UpdateSummaryFinal/cardiovascular-disease-screening-using-nontraditional-risk-assessment.
MEDLINE, PubMed, and the Cochrane Central Register of Controlled Trials were searched through May 22, 2017, to identify literature published after 2 reviews for the USPSTF11,12,17 (eMethods in the Supplement). All studies in the prior reviews were also evaluated, as well as reference lists of other systematic reviews.18-24 ClinicalTrials.gov and the World Health Organization International Clinical Trials Registry Platform were searched for relevant ongoing trials. Since May 22, 2017, ongoing surveillance through article alerts and targeted searches of journals with a high impact factor and journals relevant to the topic was conducted to identify major studies published in the interim that may affect the conclusions or understanding of the evidence and therefore the related USPSTF recommendation. The last surveillance was conducted on February 7, 2018, and identified no additional relevant studies.
Investigators reviewed 22 707 unique citations and 483 full-text articles against a priori inclusion criteria (Figure 2; eTable 1 in the Supplement).
Studies of adults without known CVD that were published in English were eligible for inclusion. For each KQ, subpopulation analyses by sex, race/ethnicity, and diabetes were identified a priori. For KQ1, trials comparing traditional risk assessment with traditional risk assessment plus the ABI, hsCRP level, or CAC score that reported patient health outcomes (ie, CVD events, mortality, or both) were included. For KQ2, individual participant data meta-analyses, trials, and well-designed prospective cohort studies evaluating risk prediction in models with traditional risk factors (base model), compared with models additionally including the ABI, hsCRP level, or CAC score (extended model), were included. KQ2 studies were required to include a measure of calibration, discrimination, or reclassification (Box). Eligible base models were the Pooled Cohort Equations,27 the Framingham Risk Score,28-32 or models including the same variables: age, sex, systolic blood pressure, use of antihypertensive medication, total cholesterol level, high-density lipoprotein cholesterol level, and smoking status. Models were eligible with or without inclusion of race/ethnicity and diabetes status. Studies with additional variables in their base models were excluded, as this would have precluded isolation of the effect of the nontraditional risk factor of interest. Extended models that incorporated multiple nontraditional risk factors were excluded if the effect of newly added risk factors could not be isolated.
Calibration refers to the agreement between observed and predicted outcomes. Calibration plots and observed to expected ratios were identified a priori as preferable measures of calibration because of their ability to indicate direction of miscalibration, capacity to compare across models, and intuitive graphic interpretation.
Discrimination is the ability to distinguish between individuals who will and will not have an event. This analysis uses the change in C statistic to measure discrimination. The C statistic is the probability that, for a randomly selected pair of individuals, 1 with disease and the other without, the person with disease will have the higher estimated disease probability according to the model.25
Reclassification reflects the ability of a new model to appropriately reassign people into different risk strata. The most commonly used measure in this review was the Net Reclassification Index (NRI), which captures the reclassification of individuals that occurs when traditional risk assessment is enriched with the ABI, hsCRP level, or CAC score, where clinically meaningful risk strata are typically defined by treatment thresholds. Reclassification to a higher risk category is considered appropriate when the individual has an outcome, and reclassification to a lower risk category is considered appropriate when an individual does not have an outcome. The NRI is the sum of 2 components: among individuals with a cardiovascular disease outcome, the proportion moving up a risk category minus the proportion moving down (known as the event NRI); and among individuals without an outcome, the proportion moving down a risk category minus the proportion moving up (known as the nonevent NRI).26
Abbreviations: ABI, ankle-brachial index; CAC, coronary artery calcium; hsCRP, high-sensitivity C-reactive protein; KQ, key question.
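The Box definitions of discrimination and reclassification can be made concrete with a short sketch. The Python below is illustrative only and is not the method of any included study: the function names `c_statistic` and `net_reclassification`, the toy risks, and the half-credit rule for tied risks are assumptions introduced here. The single 7.5% 10-year risk threshold is one of the treatment thresholds mentioned in the text.

```python
from itertools import product


def c_statistic(risks, events):
    """Share of (event, nonevent) pairs in which the person with the event
    has the higher predicted risk."""
    with_event = [r for r, e in zip(risks, events) if e]
    without_event = [r for r, e in zip(risks, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one nonevent")
    score = 0.0
    for r_event, r_nonevent in product(with_event, without_event):
        if r_event > r_nonevent:
            score += 1.0
        elif r_event == r_nonevent:
            score += 0.5  # assumption: tied risks earn half credit
    return score / (len(with_event) * len(without_event))


def net_reclassification(base_risks, extended_risks, events, threshold=0.075):
    """Return (event NRI, nonevent NRI, total NRI) for 2 risk strata
    split at `threshold` (below vs at or above)."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for base, extended, event in zip(base_risks, extended_risks, events):
        moved_up = base < threshold <= extended
        moved_down = extended < threshold <= base
        if event:
            n_event += 1
            up_event += int(moved_up)
            down_event += int(moved_down)
        else:
            n_nonevent += 1
            up_nonevent += int(moved_up)
            down_nonevent += int(moved_down)
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one nonevent")
    # Upward moves are appropriate for events; downward moves for nonevents.
    event_nri = (up_event - down_event) / n_event
    nonevent_nri = (down_nonevent - up_nonevent) / n_nonevent
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# Hypothetical 10-person cohort (not data from any included study).
EVENTS = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
BASE = [0.12, 0.06, 0.05, 0.09, 0.03, 0.08, 0.04, 0.10, 0.02, 0.06]
EXTENDED = [0.15, 0.09, 0.06, 0.11, 0.02, 0.05, 0.09, 0.12, 0.01, 0.08]

change_in_c = c_statistic(EXTENDED, EVENTS) - c_statistic(BASE, EVENTS)
event_nri, nonevent_nri, total_nri = net_reclassification(BASE, EXTENDED, EVENTS)
print(f"C statistic change: {change_in_c:.3f}")
print(f"event NRI {event_nri:.3f}, nonevent NRI {nonevent_nri:.3f}, "
      f"total NRI {total_nri:.3f}")
```

In this toy cohort the event NRI is positive while the nonevent NRI is negative, so the total NRI alone would hide the net upward reclassification of people without events.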
For KQ3 and KQ5, trials, prospective and retrospective cohort studies, and well-designed case-control studies examining harms of risk assessment or treatment guided by risk assessment were included. Harms included any serious adverse event requiring unexpected or unwanted medical attention resulting from risk assessment or harms from risk factor modification. For assessment of CAC score, radiation exposure from computed tomography (CT) was included as a potential harm. For KQ4, trials of treatment guided by nontraditional risk factor assessment in addition to traditional risk assessment vs no treatment or usual care that reported patient health outcomes (ie, CVD events, mortality, or both) were included.
Two independent reviewers critically appraised the included studies using predefined criteria,16,33,34 with a third reviewer resolving disagreements (eTable 2 in the Supplement). Articles were rated as good, fair, or poor quality. In general, a good-quality study met all criteria. A fair-quality study did not meet, or it was unclear whether it met, at least 1 criterion but had no known important limitations that could invalidate its results. A poor-quality study had a single fatal flaw or multiple important limitations. Poor-quality studies were excluded from the review. Two studies, both for KQ2, were excluded as poor quality because of nonrepresentative sampling of patients, self-reported outcomes, limited duration of follow-up, and/or a small number of CVD events.35,36 One reviewer abstracted descriptive and outcome data from each included study into standardized evidence tables; a second checked for accuracy and completeness.
Data for each KQ are summarized narratively according to nontraditional risk factor. Quantitative analyses were not conducted because of the limited number of studies for each KQ or methodological and clinical heterogeneity, including differences in outcomes and treatments evaluated. KQ2 results are stratified by calibration, discrimination, or reclassification (Box). A full description of selected measures for each domain and their limitations is available in eTable 3 in the Supplement.
Because there is no guidance in existing literature about how to characterize the magnitude or clinical meaning of changes in discrimination (KQ2),26 the following definitions were used for practical reasons. For changes in the C statistic, the term “large” is used to denote changes of 0.1 or greater, “moderate” for changes of 0.05 to 0.1, “small” for changes of 0.025 to 0.05, and “very small” for changes less than 0.025. C statistics range from 0.5 to 1.0; the 0.1 cutpoint for “large” was set because it represents 20% of the possible range. A change in C statistic of 0.025 approximates a 5% higher sensitivity when specificity is 50%.37
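These magnitude labels can be written as a small lookup. The sketch below is illustrative: the function name `label_c_statistic_change` is hypothetical, and 2 choices are assumptions not stated in the text, namely that each boundary value (0.025, 0.05) belongs to the higher category and that the label is applied to the absolute size of the change.

```python
def label_c_statistic_change(delta):
    """Map a change in C statistic to the review's descriptive label."""
    magnitude = abs(delta)  # assumption: label the size, ignoring direction
    if magnitude >= 0.1:
        return "large"
    if magnitude >= 0.05:  # assumption: boundary values go to the higher label
        return "moderate"
    if magnitude >= 0.025:
        return "small"
    return "very small"


# The 0.1 cutpoint as a share of the possible C statistic range (0.5 to 1.0).
share_of_range = 0.1 / (1.0 - 0.5)
print(f"0.1 is {share_of_range:.0%} of the possible range")

# Changes quoted in the review's results.
for change in (0.112, 0.04, 0.0039):
    print(change, label_c_statistic_change(change))
```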
KQ2 analyses were additionally stratified by (1) type of model design—ie, published coefficient vs model development; (2) choice of the Pooled Cohort Equations or the Framingham Risk Score as the base model; and (3) prediction of global CVD outcomes, which include coronary and noncoronary events. Published coefficient model studies are risk prediction studies evaluating the added prognostic value of the ABI, hsCRP level, or CAC score that preserved the coefficients from the published Pooled Cohort Equations or Framingham Risk Score, which are readily available for use in clinical practice. This is in contrast to model development studies, which fit entirely new models with locally developed coefficients, such that these models are generally not available in the public domain for clinical use. Studies that used the original published coefficients of the Framingham Risk Score or the Pooled Cohort Equations were considered preferable to model development studies because of their applicability.
To help illustrate the clinical meaning of the Net Reclassification Index (NRI), event NRI, and nonevent NRI for each nontraditional risk factor, reclassification tables from selected studies were modified to correspond to current treatment thresholds and show reclassification in both absolute and relative terms. A 2016 Multi-Ethnic Study of Atherosclerosis (MESA) published coefficient analysis (which evaluated all 3 nontraditional risk factors using both Pooled Cohort Equations and Framingham Risk Score base models)38 and 2 individual patient data meta-analyses (available for the ABI39 and hsCRP level40) were selected based on the applicability of the models evaluated, the ability to compare across risk factors, the larger populations represented therein, and the reporting of reclassification tables allowing for this analysis. For the Framingham Risk Score–based analyses, which used 3 risk strata, the top 2 risk strata (10%-20% and >20%) were combined to conform to the current USPSTF recommendation to initiate preventive low-dose aspirin and statins based on a 10-year CVD risk of 10% or greater.
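Combining the top 2 risk strata amounts to collapsing a 3 × 3 reclassification table into a 2 × 2 table. A minimal sketch follows, assuming rows index the base-model stratum and columns the extended-model stratum; the function name `collapse_top_two` and all counts are hypothetical and are not taken from any included study.

```python
def collapse_top_two(table):
    """Merge the 10%-20% and >20% strata of a 3x3 reclassification table
    into a single >=10% stratum, returning a 2x2 table.

    Rows are base-model strata and columns are extended-model strata,
    each ordered <10%, 10%-20%, >20%.
    """
    groups = [[0], [1, 2]]  # stratum 0 stays; strata 1 and 2 are merged
    return [
        [sum(table[r][c] for r in rows for c in cols) for cols in groups]
        for rows in groups
    ]


# Hypothetical counts for illustration only.
three_strata = [
    [800, 40, 5],
    [30, 200, 25],
    [2, 20, 100],
]
two_strata = collapse_top_two(three_strata)
print(two_strata)
```

After collapsing, movement between the former middle and upper strata falls within a single cell and no longer counts as reclassification, which is the intent when only the 10% threshold matters for treatment.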
In total, 43 studies reported in 54 publications (n = 267 244) were included (Figure 2).24,25,38-89 Thirty-eight unique trials or cohorts are represented in the included literature. For all KQs, additional descriptive and outcome data are available in the full report.
Key Question 1. Compared with the Pooled Cohort Equations or Framingham risk factors alone, does risk assessment of asymptomatic adults using nontraditional risk factors lead to reduced incidence of cardiovascular events, mortality, or both?
Only 1 fair-quality trial evaluated the incremental value of nontraditional risk factor assessment on CVD events. The Early Identification of Subclinical Atherosclerosis by Non-invasive Imaging Research (EISNER) trial (n = 2137), conducted in the United States, randomized volunteers to undergo CT scanning for CAC scoring in addition to the Framingham Risk Score vs no CAC scoring before risk factor counseling.75 Participants were middle-aged adults with CVD risk factors but no known CVD or symptoms. This study found no statistically significant difference in myocardial infarction, mortality, or combined myocardial infarction and mortality at 4 years between the 2 groups; however, the trial did not have adequate sample size and length of follow-up to detect differences in these outcomes, as the primary outcome of this trial was a change in CVD risk factors and Framingham Risk Score at 4 years.
Key Question 2. Does use of nontraditional risk factors in addition to traditional risk factors to predict cardiovascular disease risk improve measures of calibration, discrimination, and risk reclassification?
Thirty-three studies reported in 43 articles evaluated the ABI, hsCRP level, and/or CAC score in addition to traditional cardiovascular risk assessment and reported 1 or more measures of calibration, discrimination, and/or risk reclassification. Ten studies evaluated the ABI, 25 studies evaluated hsCRP level, and 19 studies evaluated CAC score (Table 1). The evidence base for the ABI and hsCRP included large individual patient data meta-analyses39,40; in these cases, the meta-analyses are discussed as the central piece of evidence and other studies of individual cohorts are discussed in relation to these meta-analyses.
In general, participants included in studies represented a broad range of primary prevention populations to whom CVD risk assessment is applicable. Most individuals were recruited as part of population-based cohort studies; additional populations were from cohorts derived from treatment trials72,77,79,83,86 and 1 registry of individuals who received CAC scores.44 Some analyses excluded people with known diabetes or already taking statins.25,38,53,57,65,69,71,76,88
Most included studies have important limitations in their applicability to current clinical practice. Only 14 studies (32.6%) used the published coefficients of publicly available Framingham Risk Score or Pooled Cohort Equations models. Moreover, the “low,” “intermediate,” and “high” risk strata used in many studies are not consistent with those used in current practice, where a single threshold, such as 7.5% or 10% 10-year risk, is used for decision making to initiate preventive therapy.
Limited data are available to inform whether the addition of the ABI (5 studies), hsCRP level (9 studies), or CAC score (8 studies) can improve agreement between predicted and observed events of risk assessment models. Across all 3 nontraditional risk factors, less than half of studies reported a measure of calibration (Table 1). Of these, only 2 studies24,25 reported the preferred measures of calibration plots and observed to expected ratios, and these are reported only for studies evaluating hsCRP level. These studies showed that the addition of hsCRP can improve calibration in some risk groups but worsen it in others; however, the small number of events, particularly in lower-risk groups, precludes definitive conclusions.
A large body of literature (10 studies; n = 79 583) consistently shows that the addition of the ABI to risk prediction models generally results in no to small improvement in a model’s ability to distinguish between individuals who will and will not have cardiovascular events (C statistic change of −0.006 to 0.036); however, improvements in discrimination can be large (eg, change of 0.112) when base models perform poorly (eTable 4 in the Supplement). Most models are not directly applicable to current practice because most are model development studies rather than published coefficient analyses, and evidence for the addition of the ABI to the Pooled Cohort Equations is sparse. The 1 study evaluating the Pooled Cohort Equations, a 2016 analysis of the MESA study,38 showed moderate base model discrimination of 0.74 and a change in discrimination of 0.01 that was not statistically significant when the ABI was added. The central piece of evidence for the addition of the ABI to the Framingham Risk Score is the individual patient data meta-analysis from the ABI Collaboration, which includes 11 421 individuals in the external validation data set. Both published coefficient and model development analyses were conducted, and results were stratified by sex.39 For published coefficient analyses, the base model performed poorly in women, with a C statistic of 0.578 (95% CI, 0.492 to 0.661), and the addition of the ABI to the model showed a large improvement in discrimination, with a change of 0.112. The base model for men showed better discrimination, with a C statistic of 0.672 (95% CI, 0.599 to 0.737) and a smaller change of 0.013 when the ABI was added to the model. In the individual patient data meta-analysis model development analyses, base model performance was better because the model was fit to the studied population. The base model C statistics were 0.788 (95% CI, 0.709 to 0.850) for women and 0.683 (95% CI, 0.611 to 0.748) for men. The resulting changes in discrimination were very small when the ABI was added to the model: 0.003 for women and 0.007 for men. Results for cohorts not included in the individual patient data meta-analysis were consistent with those included in the analysis.38,52,56,81,88
The body of evidence for hsCRP level is much larger (25 studies; n = 265 704) than for ABI and CAC but demonstrates less consistent findings (eTable 5 in the Supplement). The hsCRP literature is dominated by model development studies, which add the nontraditional risk factor to base models including Framingham Risk Score variables. The 1 study evaluating the addition of hsCRP to the Pooled Cohort Equations, the 2016 MESA analysis,38 showed moderate base model discrimination of 0.74 and no change in discrimination when hsCRP level was added to the model. The central piece of evidence for the addition of hsCRP level to the Framingham Risk Score is the Emerging Risk Factors Collaboration individual patient data meta-analysis, which involved 166 596 participants and 13 568 hard CVD events; this was a model development study with a moderate base model discrimination of 0.714.40 This analysis showed that the addition of hsCRP level increased discrimination by 0.0039 (95% CI, 0.0028 to 0.0050) for predicting a composite of fatal and nonfatal myocardial infarction and cerebrovascular accident. Exploratory subgroup analyses showed a very small, statistically significant improvement in men and no change in women; the P value for heterogeneity was less than .001. Results were inconsistent for studies not included in the individual patient data meta-analysis, ranging from a worsening of discrimination with the addition of hsCRP level24 to a small improvement59,71; however, estimates of even small improvement likely represent an upper bound because of study design limitations.
The evidence (18 studies; n = 60 486) shows that the addition of CAC score to traditional risk assessment models results in the largest improvement in discrimination of all nontraditional risk factors evaluated, with change in the C statistic ranging from 0.018 to 0.144 (eTable 6 in the Supplement); however, there is no available individual patient data meta-analysis allowing for robust sex-stratified analyses. Two studies evaluated published coefficient Pooled Cohort Equations models; however, both were analyses of the same cohort.38,51 The MESA analysis by Yeboah et al,38 which had moderate base model discrimination of 0.74, found a very small to small, statistically significant improvement of 0.02 with the addition of CAC score to the model. The MESA analysis by Fudim et al,51 which presented only sex-stratified results, demonstrated similar findings, although statistically significant only for men. Results for published coefficient Framingham Risk Score base models were statistically significant and showed a slightly higher magnitude of change. For example, the 2016 MESA analysis found a statistically significant improvement in discrimination of 0.04.38 Results for model development studies were generally consistent with results from published coefficient analyses.
A large but heterogeneous body of literature (9 studies; n = 46 979) showed that the addition of the ABI to risk prediction models can improve the appropriate reclassification of individuals into clinically meaningful risk strata; NRIs are at best less than 0.1 and are usually much smaller and often nonsignificant (Table 2). There was considerable variation in the definitions of risk strata used across studies, and only 1 study used the Pooled Cohort Equations, with a treatment threshold of 7.5% 10-year risk.38 In this study, the total NRI was 0.017 (95% CI, −0.031 to 0.058). The individual patient data meta-analysis published coefficient Framingham Risk Score analysis used 10-year risk strata of less than 10%, 10% to 19%, and 20% or greater.39 The NRIs based on these 3 strata are no longer as relevant because current treatment thresholds to initiate preventive therapies have been lowered to 7.5% or 10%, and reported NRI will capture movement between the middle and upper categories. Consistent with findings for discrimination, the addition of the ABI showed a larger improvement in women than in men. In contrast to the findings for the Pooled Cohort Equations, this improvement was statistically significant for the Framingham Risk Score base model: 0.096 (95% CI, 0.061 to 0.164) for women and 0.043 (95% CI, 0.008 to 0.076) for men. Event NRIs (the appropriate upward reclassification of individuals having CVD events) were larger than nonevent NRIs (the appropriate downward reclassification of individuals not having CVD events). However, the nonevent NRI for women was negative and statistically significant, meaning that more women without events were inappropriately reclassified to a higher risk stratum than appropriately reclassified to a lower risk stratum. Consideration of the event and nonevent NRI separately is important because the total NRI does not weight for the proportion of individuals who have or do not have events; in the included primary prevention populations, it is much more common not to have an event. Cohorts not included in the individual patient data meta-analysis showed a similar magnitude of reclassification with the addition of the ABI, with mixed statistical significance.
The body of evidence for the addition of hsCRP level to risk prediction models (15 studies; n = 115 686) shows inconsistent evidence for an improvement in the appropriate reclassification of individuals into clinically meaningful risk strata; evidence from the largest analysis suggests an overall NRI of no greater than 0.02 (Table 2). As with the evidence for the ABI, risk strata were defined variably among studies, with few using thresholds applicable to current practice. Additionally, many studies used 3 risk strata, which is no longer as relevant to current clinical practice. Only 1 study used the Pooled Cohort Equations with a treatment threshold of 7.5% 10-year risk, and this was the same study that evaluated the addition of the ABI to the Pooled Cohort Equations.38 In this study, the total NRI was 0.024 (95% CI, −0.015 to 0.067). The individual patient data meta-analysis model development study (n = 72 574) used 10-year risk strata of less than 10%, 10% to 19%, and 20% or greater.40 The total NRI and event NRI were 0.0152 (95% CI, 0.0078 to 0.0227) and 0.0146 (95% CI, 0.0073 to 0.0219), respectively. However, the nonevent NRI was much smaller and nonsignificant (0.0006 [95% CI, −0.0009 to 0.0022]); about 92% of individuals in the individual patient data meta-analysis did not have events. Similar to results for discrimination, exploratory subgroup analyses by sex show that reclassification was greater for men than for women, but reclassification results were not significant for either sex individually. Two other studies offer some confirmatory evidence about a larger effect for men, although reporting inconsistencies preclude definitive conclusions.60,76 The results of cohorts not included in the individual patient data meta-analysis were also inconsistent with respect to magnitude and statistical significance. Predicted outcome, definitions of risk strata, and case mix did not explain differences among studies; however, these comparisons were limited by several concurrent sources of heterogeneity.
Of the 3 nontraditional risk factors evaluated in this review, the addition of CAC score to traditional risk assessment models has the strongest effect on appropriate reclassification (15 studies; n = 58 289). Results range from 0.084 (95% CI, 0.024 to 0.196) to 0.35 (95% CI, 0.11 to 0.58) and are consistently statistically significant; however, the nonevent NRI is negative in several studies (Table 2). In the absence of an individual patient data meta-analysis for CAC score, few reliable data are available to inform whether effect modification exists by sex. As for other nontraditional risk factors, studies used a variety of definitions of risk strata, which precludes definitive comparisons. The most applicable evidence is from the published coefficient Pooled Cohort Equations analysis,38 which reported an overall NRI of 0.119 (95% CI, 0.080 to 0.256). Numerically, the overall NRI was driven by the event NRI of 0.178 (95% CI, 0.080 to 0.256). The nonevent NRI was negative (−0.059 [95% CI, −0.075 to 0.030]), meaning that individuals not having events were inappropriately reclassified upward into higher risk and above a treatment threshold. While not statistically significant in this study, the negative nonevent NRI is statistically significant in other studies.55,58,68
The clinical significance of the NRI, event NRI, and nonevent NRI can be understood by illustrating the absolute number of people appropriately and inappropriately reclassified in terms of current treatment thresholds. Selected examples from included studies are provided in Table 3. In the example for the Pooled Cohort Equations, 76 people having CVD events were appropriately reclassified upward when CAC score was added to risk assessment, and 19 people having CVD events were inappropriately reclassified downward, a net improvement of 57 individuals among the 320 having events, or about 18 per 100 people having events (reported event NRI of 0.178 [95% CI, 0.080 to 0.256]). However, in the primary prevention populations to which CVD risk assessment with the Pooled Cohort Equations or Framingham Risk Score applies, the majority of people will not experience a CVD event. With the addition of CAC score to risk assessment, 202 people not having events are appropriately reclassified downward, but 496 people are inappropriately reclassified upward; on net, this is a worsening of reclassification for 294 of the 4865 individuals not having events, or about 6 per 100 people not having events (reported nonevent NRI of −0.059 [95% CI, −0.075 to 0.030]). Therefore, the NRI of 0.119 (95% CI, 0.080 to 0.256) does not convey that for CAC score a sizeable proportion of individuals who are not having events will now be considered for treatment. The addition of the ABI to the Framingham Risk Score in women showed a similar pattern.
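The arithmetic behind this example can be laid out explicitly. The following is a worked check using only the counts quoted in the paragraph above; the values recomputed from those counts differ slightly in the third decimal place from the reported nonevent and total NRI, and the text does not explain the difference.

```latex
\begin{align*}
\text{event NRI}    &= \frac{76 - 19}{320}    = \frac{57}{320}    \approx 0.178 \\
\text{nonevent NRI} &= \frac{202 - 496}{4865} = \frac{-294}{4865} \approx -0.060
                       \quad (\text{reported: } -0.059) \\
\text{total NRI}    &= \text{event NRI} + \text{nonevent NRI}
                       \approx 0.178 - 0.060 = 0.118
                       \quad (\text{reported: } 0.178 - 0.059 = 0.119)
\end{align*}
```

Because the 2 components use different denominators (320 people with events vs 4865 without), the total NRI is not a proportion of any single group, which is why the absolute counts are more informative than the summed index.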
Key Question 3. What are the harms of nontraditional risk factor assessment?
Evidence for the harms of nontraditional risk factor assessment was limited to 8 studies evaluating CAC score; no eligible studies evaluated the potential harms of the ABI or hsCRP level. Four studies reported radiation exposure for CT imaging to obtain CAC score,43,55,62,75 and 5 studies reported other potential adverse events from CAC score measurement such as psychological outcomes, adverse cardiovascular events, and health care utilization.45,66,67,75,80
Overall, the radiation exposure or effective radiation dose per CT examination is low: 2 mSv or less. Based on 2 studies (n = 1619)—a small randomized clinical trial (RCT) and a subsample from a population-based cohort—risk assessment with CAC score does not appear to cause short-term mental distress.66,67 Additionally, 2 studies (n = 11 364) using administrative claims data showed that risk assessment with CAC score did not appear to paradoxically increase CVD events.45,80 However, a small body of evidence with study design and applicability limitations shows mixed findings for the effect of CAC score on downstream health care utilization.45,75,80 Both the EISNER RCT and 1 cohort using administrative data found no statistically significant increase in cardiac imaging or revascularization procedures in patients receiving vs not receiving CAC screening in follow-up of 4 years or 6 months, respectively.45,75 However, 1 study using Medicare claims data found a greater number of subsequent cardiac imaging tests and revascularization in asymptomatic people who received CAC screening compared with those receiving hsCRP or lipid screening.80
Key Question 4. Does treatment guided by nontraditional risk factors, in addition to traditional risk factors, lead to reduced incidence of cardiovascular events, mortality, or both?
No trials compared treatment guided by nontraditional risk factors in addition to traditional risk factors vs no treatment or usual care. In the absence of this evidence, studies in which preventive therapies were guided by nontraditional risk factors alone, without formal traditional risk assessment, were included. Four such RCTs reported the outcomes of CVD events, mortality, or both.41,42,50,64,73 Two of these trials evaluated aspirin in individuals with an abnormal ABI, 1 trial evaluated high-intensity statins in those with an abnormal hsCRP level, and 1 trial evaluated moderate-intensity statins in those with an abnormal CAC score.
Two good-quality RCTs (n = 4626) in asymptomatic adults with an abnormal ABI—including 1 trial exclusively in participants with diabetes—did not find any statistically significant benefit for aspirin (100 mg daily) on reducing CVD outcomes or all-cause mortality compared with placebo after approximately 7 to 8 years of follow-up.42,50 However, neither trial used the conventional 0.90 threshold for an abnormal ABI; 1 defined an abnormal ABI as 0.95 or less and the other as 0.99 or less. One good-quality RCT (n = 17 802) in asymptomatic people with elevated hsCRP level (≥2.0 mg/L [19.05 nmol/L]) but normal low-density lipoprotein cholesterol level (<130 mg/dL [3.37 mmol/L]) found a significant relative reduction in CVD events for rosuvastatin (20 mg daily) compared with placebo (hazard ratio, 0.56 [95% CI, 0.46 to 0.69]) at approximately 2 years.73 One fair-quality trial (n = 1005) in asymptomatic people with elevated CAC score, defined as 80th percentile or greater for age and sex, and with low-density lipoprotein cholesterol levels less than 175 mg/dL (4.5 mmol/L), did not find any statistically significant benefit for atorvastatin (20 mg daily) on reducing CVD outcomes compared with placebo after a mean of 4.3 years of follow-up41; however, that study had a lower than expected number of events and was terminated early because of futility.
Key Question 5. What are the harms of treatment guided by nontraditional risk factors?
Three of the 4 RCTs included for KQ4 reported harms of aspirin or statins guided by nontraditional risk factor assessment.42,50,73 No other studies evaluating harms met inclusion criteria. Neither aspirin trial (n = 4626) found a statistically significant increase in hemorrhagic cerebrovascular accident after approximately 7 to 8 years of follow-up, although analyses are limited by the rarity of these events.42,50 In the 1 trial reporting major bleeding, the association with low-dose aspirin approached statistical significance (hazard ratio, 1.71 [95% CI, 0.99 to 2.97]).50 The trial evaluating high-intensity statins in adults with elevated hsCRP levels (n = 17 802) found evidence of an increased incidence of diabetes in the treatment group compared with placebo after approximately 2 years (3.0% vs 2.4%, P = .01); however, it did not find evidence of increases in other serious adverse effects, including hemorrhagic cerebrovascular accident or myopathic events, for high-intensity statin therapy compared with placebo.
Direct evidence from adequately powered clinical trials evaluating the incremental effect of the ABI, hsCRP level, or CAC score to improve health outcomes is lacking (Table 4; eTable 7 in the Supplement). This remains unchanged from the 2009 and 2013 USPSTF recommendations, which concluded that the evidence was insufficient to assess the benefits and harms of using nontraditional risk factors for cardiovascular risk assessment (I statements).14,15
Advances in the evaluation of risk prediction literature, as well as an accrual of new evidence since the previous recommendations, have allowed for more detailed analyses of risk prediction studies that offer indirect evidence for the use of nontraditional risk factors (Figure 1). A substantial body of evidence evaluates the ability of the ABI, hsCRP level, or CAC score to improve on the discrimination and reclassification of traditional risk assessment, but most of this evidence is not readily applicable to current clinical practice because it does not use published coefficients, rarely evaluates the addition of nontraditional risk factors in the context of the Pooled Cohort Equations, and seldom uses current risk thresholds for aiding treatment decisions. Regardless, CAC score shows the largest potential for improvement in discrimination and reclassification. Improvements for the ABI are larger when base models perform poorly, such as the Framingham Risk Score in women. Results for hsCRP level are inconsistent. Based on selected included studies, examination of the absolute number of people appropriately and inappropriately reclassified using current treatment thresholds (ie, 7.5% or 10% 10-year CVD risk) demonstrates that even with a favorable NRI for CAC score and ABI, more people will be incorrectly reclassified to a higher risk category, thus needing treatment, vs correctly reclassified as needing or not needing treatment.
A 2015 analysis of data from the MESA study91 found inconsistent results. That analysis, while valid, was designed to answer a different question about risk stratification and was excluded from this review because of its different study design. Multiple methodologic differences between this analysis and the included MESA analysis38 (ie, differences in definitions of risk strata, differences in risk classification of individuals with diabetes, differences in comparisons of CAC score alone vs CAC score plus Pooled Cohort Equations, and recalibration of the Pooled Cohort Equations) prohibit direct comparisons between the 2 studies’ findings. Therefore, while CAC score appears to be the most informative nontraditional risk factor, there remains uncertainty around the overall clinical effect of the inappropriate upward reclassification of individuals and of resultant downstream testing, including from incidental findings. While experts have argued that large-scale clinical trials evaluating the effectiveness of CAC screening on patient health outcomes are not feasible,92 these studies may be necessary to address this remaining clinical uncertainty.
Despite the recent proliferation of risk prediction studies, few clarify whether the addition of the ABI, hsCRP level, or CAC score can improve the calibration of traditional risk assessment. The lack of reporting of preferred measures such as calibration plots and observed-to-expected ratios limits the ability to interpret the clinical meaning of improvements in calibration. Even when preferred measures are reported, confidence in their results is usually limited by small numbers of events, especially among low-risk groups. As treatment thresholds have decreased over time, the implications of calibration in lower-risk groups have become especially important. Historically, evaluation of the performance of risk prediction models has focused on discrimination,93 so the sparse reporting of calibration measures is not surprising and is consistent with the findings of other systematic reviews.94 The focus on discrimination is incomplete because the C statistic is a rank order statistic; therefore, a model can discriminate well but still systematically underestimate or overestimate risk.95
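The distinction between discrimination and calibration can be illustrated with a minimal sketch. The 10 predicted risks and outcomes below are synthetic and hypothetical, chosen only for illustration, and the function names are assumptions rather than methods used in this review. Halving every predicted risk leaves the rank order, and therefore the C statistic, unchanged, while the observed-to-expected ratio shifts from about 1 to about 2, indicating systematic underestimation of risk.

```python
# Synthetic, hypothetical data for illustration; not drawn from any study.
from itertools import product

def c_statistic(risks, events):
    """Share of (event, nonevent) pairs in which the person with the event
    has the higher predicted risk; ties count as one half."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    score = 0.0
    for case, control in product(cases, controls):
        if case > control:
            score += 1.0
        elif case == control:
            score += 0.5
    return score / (len(cases) * len(controls))

def observed_to_expected(risks, events):
    """Observed number of events divided by the sum of predicted risks."""
    return sum(events) / sum(risks)

# Ten hypothetical people: predicted risk and whether an event occurred.
risks = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.70]
events = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

# A second model that ranks people identically but predicts half the risk.
halved = [r / 2 for r in risks]

for label, model in (("original", risks), ("halved", halved)):
    print(f"{label}: C = {c_statistic(model, events):.3f}, "
          f"O/E = {observed_to_expected(model, events):.2f}")
```

In this sketch the two models would be indistinguishable by the C statistic alone, which is why calibration measures need to be reported alongside discrimination.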
The evidence report has several limitations. The review focused on just 3 nontraditional risk factors: the ABI, hsCRP level, and CAC score. Many more have been identified in the literature; however, these risk factors were selected because other synthesized literature identified them as having the greatest clinical potential.21,27 Additionally, the predictive value of traditional risk factors such as levels of total or high-density lipoprotein cholesterol was taken as given, but some literature suggests that this value, too, might be very small to small when assessed in terms of the C statistic.40 Given the large volume of studies included for KQ2, some explicit exclusions were made so as to focus on the most clinically relevant analyses. Additionally, comparisons across studies are difficult because of the heterogeneity of base models, model type, predicted outcome, and definitions of risk strata as well as limited reporting of confidence intervals, statistical significance, and separate reporting of event and nonevent NRIs. In addition, 1 of the risk assessment analysis stratifications was whether a study preserved published coefficients or developed a new model; because methods were often only briefly described, the categorization represents a best guess for many studies, and, in cases of uncertainty, the study was categorized as model development.
There are insufficient adequately powered clinical trials evaluating the incremental effect of the ABI, hsCRP level, or CAC score in risk assessment and initiation of preventive therapy. Furthermore, the clinical meaning of improvements in measures of calibration, discrimination, and reclassification in risk prediction studies is uncertain.
Corresponding Author: Jennifer S. Lin, MD, MCR, Kaiser Permanente Research Affiliates Evidence-based Practice Center, Center for Health Research, Kaiser Permanente Northwest, 3800 N Interstate Ave, Portland, OR 97227 (email@example.com).
Accepted for Publication: March 19, 2018.
Published Online: July 10, 2018. doi:10.1001/jama.2018.4242
Author Contributions: Dr Lin had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Lin, Evans, Coppola.
Acquisition, analysis, or interpretation of data: Lin, Evans, Johnson, Redmond, Smith.
Drafting of the manuscript: Lin, Evans, Redmond, Coppola.
Critical revision of the manuscript for important intellectual content: Lin, Evans, Johnson, Smith.
Statistical analysis: Johnson, Redmond, Smith.
Obtained funding: Lin.
Administrative, technical, or material support: Evans, Coppola.
Supervision: Lin, Evans.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This research was funded under contract HHSA290201500007I, Task Order 2, from the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services, under a contract to support the USPSTF.
Role of the Funder/Sponsor: Investigators worked with USPSTF members and AHRQ staff to develop the scope, analytic framework, and key questions for this review. AHRQ had no role in study selection, quality assessment, or synthesis. AHRQ staff provided project oversight, reviewed the report to ensure that the analysis met methodological standards, and distributed the draft for peer review. Otherwise, AHRQ had no role in the conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, and approval of the manuscript findings. The opinions expressed in this document are those of the authors and do not reflect the official position of AHRQ or the US Department of Health and Human Services.
Additional Contributions: We gratefully acknowledge the following individuals for their contributions to this project: Todd Hannon, MLS, Janelle Guirguis-Blake, MD, and Katherine Essick, BS (Kaiser Permanente Center for Health Research); Robert Platt, PhD (McGill University); Justin Mills, MD, MPH (Agency for Healthcare Research and Quality [AHRQ]); Elisabeth Kato, MD, MRP (formerly at AHRQ); and current and former members of the US Preventive Services Task Force (USPSTF) who contributed to topic deliberations. USPSTF members, peer reviewers, and federal partner reviewers did not receive financial compensation for their contributions.
Additional Information: A draft version of this evidence report underwent external peer review from 5 content experts (Nancy Cook, ScD, Harvard T. H. Chan School of Public Health; Donald Lloyd-Jones, MD, Northwestern University Feinberg School of Medicine; Gerry Fowkes, MD, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh; Matthew Budoff, MD, David Geffen School of Medicine at UCLA; Bruce Psaty, MD, PhD, University of Washington) and 1 federal partner (the National Institutes of Health). Comments from reviewers were presented to the USPSTF during its deliberation of the evidence and were considered in preparing the final evidence review.
Editorial Disclaimer: This evidence report is presented as a document in support of the accompanying USPSTF Recommendation Statement. It did not undergo additional peer review after submission to JAMA.