The opportunity to intervene in the natural history through screening is noted in red. Screening can either remove an adenoma, thus moving a person to the “no lesion” state, or diagnose a preclinical cancer, which, if detected at an earlier stage, may be more amenable to treatment.
a The SimCRC and MISCAN models simulate discrete adenoma size categories (ie, 1-5 mm, 6-9 mm, ≥10 mm). The CRC-SPIN model simulates continuous adenoma size.
b Screening may allow for detection of cancer at an earlier stage than symptom-detected cancer and therefore create the conditions necessary for a better prognosis.
The calibrated models were used to project estimates for ages for which calibration data were not available. A, Adenoma prevalence from autopsy studies14-23 and as predicted by the models. Multiple observations at each data point reflect estimates from different studies. The SimCRC and MISCAN models were each simultaneously calibrated to adenoma prevalence estimates from 10 autopsy studies.14-23 The CRC-SPIN model incorporates the distribution of adenoma risk based on a Bayesian meta-analysis49 of the 10 autopsy studies.14-23 B, Colorectal cancer cases per 100 000 from the Surveillance, Epidemiology, and End Results (SEER) program (1975-1979)24 and as predicted by the models. The models were calibrated to SEER colorectal cancer incidence rates in 1975-1979 because this period represents colorectal cancer incidence in the United States when there was little or no screening for the disease. (SEER data do not distinguish between screen-detected cancer and clinically detected cancer.)
Labeled strategies are efficient or near-efficient with an age to begin screening of 50 or 55 years.
a Strategy is near-efficient (it is weakly dominated and its life-years gained [LYG] are within 98% of the efficient frontier).
eTable 1. Comparison of natural history model structures.
eTable 2. FIT test characteristics (per person) by cutoff for positivity.
eTable 3. Outcomes with colonoscopy screening strategies.
eTable 4. Outcomes with gFOBT screening strategies.
eTable 5. Outcomes with FIT screening strategies.
eTable 6. Outcomes with FIT-DNA screening strategies.
eTable 7. Outcomes with SIG screening strategies.
eTable 8. Outcomes with SIG+gFOBT screening strategies.
eTable 9. Outcomes with SIG+FIT screening strategies.
eTable 10. Outcomes with CTC screening strategies.
eTable 11. Efficiency ratios for stool-based screening strategies.
eTable 12. Efficiency ratios for screening strategies combining SIG with stool testing.
eTable 13. Model-recommendable strategies with the colonoscopy strategy with a five-year interval selected as the benchmark strategy.
eTable 14. Sensitivity analysis: Percent change in outcomes compared to the base-case analysis for model-recommendable strategies using the worst-case and best-case test characteristics.
eTable 15. Sensitivity analysis: Model-recommendable stool-based strategies with the inclusion of FIT strategies with a lower cutoff for positivity.
eTable 16. Sensitivity analysis: Model-recommendable CTC strategies using the number of procedures requiring cathartic bowel preparations as the measure of the burden of screening.
eTable 17. Efficiency ratios for colonoscopy screening strategies.
eTable 18. Efficiency ratios for SIG screening strategies.
eTable 19. Efficiency ratios for CTC screening strategies.
eTable 20. Sensitivity analysis: Efficiency ratios for stool-based screening strategies with the inclusion of FIT strategies with a lower cutoff for positivity.
eTable 21. Sensitivity analysis: Efficiency ratios for CTC screening strategies using the number of procedures requiring cathartic bowel preparation as the measure of the burden of screening.
eFigure 1. Age-specific risks of complications from colonoscopy with polypectomy.
eFigure 2. Example of the assessment of near-efficiency for weakly dominated strategies.
eFigure 3. Lifetime number of colonoscopies and life-years gained for gFOBT screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 4. Lifetime number of colonoscopies and life-years gained for FIT screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 5. Lifetime number of colonoscopies and life-years gained for FIT-DNA screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 6. Lifetime number of colonoscopies and life-years gained for SIG screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 7. Lifetime number of colonoscopies and life-years gained for SIG+gFOBT screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 8. Lifetime number of colonoscopies and life-years gained for SIG+FIT screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 9. Lifetime number of colonoscopies and life-years gained for CTC screening strategies that vary by age to begin, age to end, and screening interval.
eFigure 10. Lifetime number of colonoscopies and life-years gained for screening strategies combining SIG with stool-based testing that vary by age to begin, age to end, and screening interval.
Knudsen AB, Zauber AG, Rutter CM, Naber SK, Doria-Rose VP, Pabiniak C, Johanson C, Fischer SE, Lansdorp-Vogelaar I, Kuntz KM. Estimation of Benefits, Burden, and Harms of Colorectal Cancer Screening Strategies: Modeling Study for the US Preventive Services Task Force. JAMA. 2016;315(23):2595-2609. doi:10.1001/jama.2016.6828
Importance
The US Preventive Services Task Force (USPSTF) is updating its 2008 colorectal cancer (CRC) screening recommendations.
Objective
To inform the USPSTF by modeling the benefits, burden, and harms of CRC screening strategies; estimating the optimal ages to begin and end screening; and identifying a set of model-recommendable strategies that provide similar life-years gained (LYG) and a comparable balance between LYG and screening burden.
Design, Setting, and Participants
Comparative modeling with 3 microsimulation models of a hypothetical cohort of previously unscreened US 40-year-olds with no prior CRC diagnosis.
Exposures
Screening with sensitive guaiac-based fecal occult blood testing, fecal immunochemical testing (FIT), multitarget stool DNA testing, flexible sigmoidoscopy with or without stool testing, computed tomographic colonography (CTC), or colonoscopy starting at age 45, 50, or 55 years and ending at age 75, 80, or 85 years. Screening intervals varied by modality. Full adherence for all strategies was assumed.
Main Outcomes and Measures
Life-years gained compared with no screening (benefit), lifetime number of colonoscopies required (burden), lifetime number of colonoscopy complications (harms), and ratios of incremental burden and benefit (efficiency ratios) per 1000 40-year-olds.
Results
The screening strategies provided LYG in the range of 152 to 313 per 1000 40-year-olds. Lifetime colonoscopy burden per 1000 persons ranged from fewer than 900 (FIT every 3 years from ages 55-75 years) to more than 7500 (colonoscopy screening every 5 years from ages 45-85 years). Harm from screening was at most 23 complications per 1000 persons screened. Strategies with screening beginning at age 50 years generally provided more LYG as well as more additional LYG per additional colonoscopy than strategies with screening beginning at age 55 years. There were limited empirical data to support a start age of 45 years. For persons adequately screened up to age 75 years, additional screening yielded small increases in LYG relative to the increase in colonoscopy burden. With screening from ages 50 to 75 years, 4 strategies yielded a comparable balance of screening burden and similar LYG (median LYG per 1000 across the models): colonoscopy every 10 years (270 LYG); sigmoidoscopy every 10 years with annual FIT (256 LYG); CTC every 5 years (248 LYG); and annual FIT (244 LYG).
Conclusions and Relevance
In this microsimulation modeling study of a previously unscreened population undergoing CRC screening that assumed 100% adherence, the strategies of colonoscopy every 10 years, annual FIT, sigmoidoscopy every 10 years with annual FIT, and CTC every 5 years performed from ages 50 through 75 years provided similar LYG and a comparable balance of benefit and screening burden.
Randomized clinical trials (RCTs) have demonstrated that colorectal cancer (CRC) screening reduces CRC mortality.1-8 However, although there are multiple screening modalities, trial data are available only for screening with low-sensitivity guaiac-based fecal occult blood tests (gFOBT)1-4 and with flexible sigmoidoscopy (SIG)5-8 and only for select ages and intervals of screening. No trials have reported long-term findings of direct comparisons of the various screening methods. Recognizing that simulation models provide a way to extrapolate available evidence and predict long-term outcomes, the US Preventive Services Task Force (USPSTF) requested simulation modeling to assess the benefits, burden, and harms of various screening strategies for the general population for its update to the 2008 CRC screening recommendations.
Three independently created microsimulation models of CRC developed within the National Cancer Institute–funded Cancer Intervention and Surveillance Modeling Network (CISNET) were used to evaluate 204 screening strategies for the US general population without a prior CRC diagnosis. The goals were to model different ages to begin and end screening and screening intervals and to identify a set of recommendable strategies that are estimated to provide similar clinical benefit and a comparable balance of benefit and screening burden.
Three models were used for this analysis: Simulation Model of CRC (SimCRC), Microsimulation Screening Analysis (MISCAN) for CRC, and CRC Simulated Population Model for Incidence and Natural History (CRC-SPIN). The 2008 analysis for the USPSTF9 used SimCRC and MISCAN, although MISCAN has since been revised10 based on findings from a joint model validation study.11
Each model consists of a natural history component and a screening component, which were used to simulate individual life histories from birth to death under alternative CRC screening strategies. These components are described briefly below and in more detail in a full report to the USPSTF12 (http://www.uspreventiveservicestaskforce.org/Page/Document/modeling-report/colorectal-cancer-screening2) and in the CISNET model registry.13
Each model simulates the natural history of CRC based on the adenoma-carcinoma sequence (Figure 1). Simulated persons enter the models free of disease, and over time they may develop 1 or more adenomas. Adenomas may grow, and some may transition to preclinical CRC. A preclinical CRC may become symptomatic, leading to clinical detection. Simulated persons may die of other causes at any age, and those with clinically detected CRC may die from the disease. Postdiagnosis survivorship depends on age and stage at diagnosis and tumor location. Each model’s natural history component was calibrated to data on adenoma prevalence14-23 and CRC incidence24 from a period before the dissemination of CRC screening. The models use all-cause mortality rates from the 2009 US life table25 and stage-specific CRC relative survival estimates from analysis of data from the Surveillance, Epidemiology, and End Results (SEER) Program.26 Further details on the natural history structures of the models are provided in eTable 1 in the Supplement.
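The adenoma-carcinoma sequence described above can be illustrated with a toy state-transition microsimulation. This is only a sketch: the transition probabilities, the single-lesion state, the annual time step, and the function names are all assumptions made for illustration, whereas the published models use calibrated, age- and location-dependent parameters and track multiple adenomas per person.

```python
import random

# Hypothetical annual transition probabilities (illustration only; the
# published models use calibrated, age- and location-dependent parameters).
P_ADENOMA = 0.01        # no lesion -> adenoma
P_PRECLINICAL = 0.02    # adenoma -> preclinical CRC
P_CLINICAL = 0.25       # preclinical CRC -> clinically detected CRC
P_CRC_DEATH = 0.10      # clinically detected CRC -> death from CRC
P_OTHER_DEATH = 0.01    # death from other causes, from any state

# Each pre-diagnosis state maps to (next state, annual progression probability).
NEXT_STATE = {
    "no lesion": ("adenoma", P_ADENOMA),
    "adenoma": ("preclinical CRC", P_PRECLINICAL),
    "preclinical CRC": ("clinical CRC", P_CLINICAL),
}


def simulate_person(rng, max_age=100):
    """Simulate one life history from birth.

    Returns (age at death, cause of death, disease state at death).
    """
    state = "no lesion"
    for age in range(max_age):
        # Death from other causes can occur at any age and in any state.
        if rng.random() < P_OTHER_DEATH:
            return age, "other causes", state
        if state == "clinical CRC":
            # Only clinically detected CRC can lead to death from the disease.
            if rng.random() < P_CRC_DEATH:
                return age, "CRC", state
        else:
            nxt, p = NEXT_STATE[state]
            if rng.random() < p:
                state = nxt
    # Survivors to max_age are treated as dying of other causes.
    return max_age, "other causes", state


def simulate_cohort(n, seed=0):
    """Simulate n independent life histories with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_person(rng) for _ in range(n)]


cohort = simulate_cohort(10_000)
crc_deaths = sum(1 for _, cause, _ in cohort if cause == "CRC")
print(f"CRC deaths per 1000: {1000 * crc_deaths / len(cohort):.1f}")
```

A screening component would alter these histories by moving a person from "adenoma" back to "no lesion" when an adenoma is removed, or by diagnosing a preclinical CRC earlier than symptoms would.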
Each model also has a screening component that allows a simulated lifetime to be altered because of detection of a preclinical CRC or detection and removal of an adenoma. The effect of screening depends on the test performed; its sensitivity and specificity; how frequently it is repeated; and, for endoscopic tests, the reach of the scope (Table 1; references 9,27-39). The models incorporate the risk of complications from colonoscopy with polypectomy,30,31 including the potential for death from perforation.32 Further assumptions on risk of colonoscopy complications (eFigure 1 in the Supplement) and test characteristics (Table 1) can be found in the full report.12
The models have been validated11 against the findings from the UK Flexible Sigmoidoscopy Screening (UKFSS) Trial of once-only SIG.5 All 3 models predicted CRC mortality reductions 10 years after SIG screening that were within the trial’s 95% confidence interval. Two models (SimCRC and CRC-SPIN) also predicted CRC incidence reductions that were within the trial’s 95% confidence interval. The MISCAN model underestimated the incidence reduction. The natural history component of the MISCAN model has since been recalibrated10 and now yields predictions that are consistent with both the mortality and incidence reductions of the UKFSS Trial.11 In this analysis, the validated and recalibrated models were used.
Eight screening modalities were evaluated: high-sensitivity gFOBT (HSgFOBT), fecal immunochemical testing (FIT) with a cutoff for positivity of 100 ng or more of hemoglobin (Hb) per mL of buffer (≥20 μg Hb/g of feces), multitarget stool DNA testing (FIT-DNA), SIG alone or with interval HSgFOBT or FIT, computed tomographic colonography (CTC), and colonoscopy. For each modality, multiple ages to begin screening (45, 50, or 55 years) and end screening (75, 80, or 85 years) and multiple screening intervals were evaluated (Table 2). It was assumed that no screening occurs after the stopping age, but that colonoscopy surveillance of persons with a history of adenomas continues through at least age 85 years. In addition, it was assumed that screening, follow-up, and surveillance procedures are performed regardless of the simulated person’s life expectancy; that is, they do not cease among persons with limited life expectancy. In all, 204 unique strategies were evaluated, including a strategy with no screening. It was assumed that there is 100% adherence to all procedures. As a result, predicted outcomes from the models reflect the potential lifetime benefits, burden, and harms of screening among a 40-year-old US population with full willingness to participate.
Benefit of screening was measured by the number of life-years gained (LYG) from the prevention or delay of CRC death. The life-years lost as a result of death from screening complications were also accounted for. As in the 2008 analysis for the USPSTF,9 the number of required colonoscopies was used as a measure of the burden of screening and includes colonoscopies for screening, follow-up, surveillance, and the diagnosis of symptomatic cancers. The number of screening tests as a measure of the burden of screening has been used for modeling analyses for the USPSTF for mammography for breast cancer screening,40,41 computed tomography for lung cancer screening,42 and colposcopies for cervical cancer screening.43,44 Harms from screening were measured by the number of complications from colonoscopy, including serious gastrointestinal events (perforations, gastrointestinal bleeding, or transfusions), other gastrointestinal events (paralytic ileus, nausea and vomiting, dehydration, or abdominal pain), and cardiovascular events (myocardial infarction or angina, arrhythmias, congestive heart failure, cardiac or respiratory arrest, syncope, hypotension, or shock) (eFigure 1 in the Supplement).30,31 For each outcome, the range of findings across the 3 models is reported.
As in the 2008 analysis for the USPSTF,9 it was decided a priori that it was important to consider not only the LYG from screening but also the burden of testing required to achieve those gains. Because the measure of burden—the number of required colonoscopies—does not capture the burden of other testing, direct comparisons of the benefit and burden across screening strategies were limited to those with similar noncolonoscopy burden. This was accomplished by grouping comparable tests, which resulted in 5 classes of screening modalities: stool-based modalities (ie, HSgFOBT, FIT, and FIT-DNA), SIG with stool-based modalities (ie, SIG + HSgFOBT and SIG + FIT), SIG alone, CTC, and colonoscopy.
Within each class of screening modality, the LYG and the colonoscopy burden were plotted for each screening strategy, creating an efficient frontier, the line connecting the strategies that provide the largest incremental increase in LYG per additional colonoscopy performed (eFigure 2 in the Supplement). All of the screening strategies that composed the frontier were considered efficient screening options.45,46 As in the 2008 analysis,9 it was decided a priori that weakly dominated strategies that fell below the frontier but had LYG within 98% of the efficient frontier would be defined as near-efficient (eFigure 2 in the Supplement). All other strategies that fell below the efficient frontier were considered inefficient. For efficient and near-efficient strategies, the incremental number of colonoscopies (∆COL), the incremental number of LYG (∆LYG), and the efficiency ratio (ie, ∆COL/∆LYG) relative to the next less effective efficient strategy were calculated.
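The frontier construction can be sketched in a few lines of Python. The strategy names and numbers below are hypothetical, not results from the models. The sketch also makes two assumptions that the text does not spell out: the no-screening strategy is supplied as the reference point, and "within 98% of the efficient frontier" is read as having LYG of at least 98% of the frontier's linearly interpolated LYG at the same colonoscopy burden.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    colonoscopies: float  # lifetime colonoscopies per 1000 persons
    lyg: float            # life-years gained per 1000 vs no screening


def efficient_frontier(strategies):
    """Return the efficient strategies, ordered by increasing burden."""
    pts = sorted(strategies, key=lambda s: (s.colonoscopies, -s.lyg))
    frontier = []
    for s in pts:
        # Strongly dominated: no more LYG than a less burdensome strategy.
        if frontier and s.lyg <= frontier[-1].lyg:
            continue
        # Weakly dominated: drop frontier points lying on or below the line
        # that joins their predecessor to the new strategy.
        while len(frontier) >= 2:
            a, b = frontier[-2], frontier[-1]
            slope_ab = (b.lyg - a.lyg) / (b.colonoscopies - a.colonoscopies)
            slope_as = (s.lyg - a.lyg) / (s.colonoscopies - a.colonoscopies)
            if slope_as >= slope_ab:
                frontier.pop()
            else:
                break
        frontier.append(s)
    return frontier


def frontier_lyg_at(frontier, colonoscopies):
    """Linearly interpolate the frontier's LYG at a given burden."""
    for a, b in zip(frontier, frontier[1:]):
        if a.colonoscopies <= colonoscopies <= b.colonoscopies:
            frac = (colonoscopies - a.colonoscopies) / (b.colonoscopies - a.colonoscopies)
            return a.lyg + (b.lyg - a.lyg) * frac
    return frontier[-1].lyg  # beyond the last efficient strategy


def is_near_efficient(s, frontier, threshold=0.98):
    """True if s is off the frontier but within `threshold` of its LYG."""
    return s not in frontier and s.lyg >= threshold * frontier_lyg_at(frontier, s.colonoscopies)


def efficiency_ratios(frontier):
    """Delta COL / delta LYG relative to the next less effective efficient strategy."""
    return {
        b.name: (b.colonoscopies - a.colonoscopies) / (b.lyg - a.lyg)
        for a, b in zip(frontier, frontier[1:])
    }


# Hypothetical strategies within one class of screening modality.
strategies = [
    Strategy("no screening", 50, 0),
    Strategy("A", 1000, 150),
    Strategy("B", 2000, 200),
    Strategy("C", 2500, 205),
    Strategy("D", 1500, 174),  # just below the A-B segment
    Strategy("E", 1800, 150),  # more burden than A, no more LYG
]
frontier = efficient_frontier(strategies)
ratios = efficiency_ratios(frontier)
near = [s.name for s in strategies if is_near_efficient(s, frontier)]
print([s.name for s in frontier], ratios, near)
```

In this toy example, D is weakly dominated but counts as near-efficient because its LYG (174) is at least 98% of the frontier value at the same burden (175), whereas E is inefficient.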
It was assumed that model-recommendable screening strategies would be efficient or near-efficient options within their class of screening modality; all other strategies were eliminated from consideration. For ease of clinical implementation, it was assumed that a set of recommendable strategies would have the same ages to begin and end screening. For each combination of screening initiation and cessation ages, a benchmark strategy was selected, defined as a colonoscopy strategy with predicted (benchmark) LYG that are at least as large as the predicted LYG for colonoscopy every 10 years from ages 50 to 75 years, the colonoscopy strategy included in the 2008 CRC screening recommendation.47 This ensured that the model-recommendable colonoscopy strategy was no less effective than the previously recommended colonoscopy strategy. Within each class of screening modalities, the number of strategies under consideration was narrowed by eliminating those that were not efficient or near-efficient, those that resulted in LYG that were less than 90% of the benchmark LYG, and those that required more additional colonoscopies per LYG than the benchmark strategy (ie, strategies with a larger efficiency ratio than the benchmark strategy). The 90% threshold was selected before analysis of simulation results and was intended to yield model-recommendable strategies with similar LYG. The focus was on strategies with efficiency ratios less than or equal to that of the benchmark strategy because all noncolonoscopy strategies require use of additional tests, and hence impose additional burden, while colonoscopy strategies do not. The final set of model-recommendable strategies included all those that were recommendable by at least 2 of the 3 models. It was possible to have no recommendable strategy within a class of screening modalities. 
If more than 1 strategy within a class was recommendable by at least 2 models, then only the strategy yielding the most LYG was included in the final set of model-recommendable strategies.
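The selection rules can be expressed as a short filter followed by a vote across models. In the sketch below, the benchmark LYG and efficiency ratios are those reported for colonoscopy every 10 years from ages 50 to 75 years, while the candidate strategies, their values, and all function names are hypothetical. Ranking the surviving strategies within a class by median LYG across models is also an assumption, because the text states only that the strategy yielding the most LYG was kept.

```python
from collections import defaultdict
from statistics import median

LYG_FRACTION = 0.90  # minimum share of benchmark LYG
MIN_MODELS = 2       # recommendable by at least 2 of the 3 models


def recommendable_in_model(s, bench_lyg, bench_ratio):
    """Apply the three elimination criteria within a single model."""
    return (
        s["efficient"]                              # efficient or near-efficient
        and s["lyg"] >= LYG_FRACTION * bench_lyg    # at least 90% of benchmark LYG
        and s["ratio"] <= bench_ratio               # no more colonoscopies per LYG
    )


def final_set(results, benchmarks):
    """Return {class: strategy} for the final model-recommendable set."""
    votes = defaultdict(int)
    for model, strategies in results.items():
        bench_lyg, bench_ratio = benchmarks[model]
        for name, s in strategies.items():
            if recommendable_in_model(s, bench_lyg, bench_ratio):
                votes[name] += 1

    best = {}
    any_model = next(iter(results.values()))
    for name, n in votes.items():
        if n < MIN_MODELS:
            continue
        cls = any_model[name]["cls"]
        # Assumption: rank by median LYG across the models.
        med = median(m[name]["lyg"] for m in results.values())
        if cls not in best or med > best[cls][1]:
            best[cls] = (name, med)
    return {cls: name for cls, (name, _) in best.items()}


# Benchmark (LYG per 1000, efficiency ratio) per model, as reported in the text.
benchmarks = {"SimCRC": (275, 55), "MISCAN": (248, 39), "CRC-SPIN": (270, 65)}


def row(cls, lyg, ratio, efficient=True):
    return {"cls": cls, "lyg": lyg, "ratio": ratio, "efficient": efficient}


# Hypothetical per-model outcomes for three candidate strategies.
results = {
    "SimCRC": {"FIT 1y": row("stool", 260, 40), "FIT 2y": row("stool", 248, 30),
               "SIG 5y": row("SIG", 230, 35)},
    "MISCAN": {"FIT 1y": row("stool", 231, 32), "FIT 2y": row("stool", 224, 25),
               "SIG 5y": row("SIG", 210, 30)},
    "CRC-SPIN": {"FIT 1y": row("stool", 244, 50), "FIT 2y": row("stool", 235, 45),
                 "SIG 5y": row("SIG", 220, 40)},
}
selected = final_set(results, benchmarks)
print(selected)
```

Here both stool-based candidates pass in at least 2 models, so only the one with more LYG is retained, and the SIG class ends up with no recommendable strategy, an outcome the selection rules explicitly allow.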
Additional simulations were conducted using best-case and worst-case values for test sensitivity (Table 1) for the model-recommendable screening strategies; the time frame for the USPSTF recommendation process precluded evaluation of all 204 unique screening strategies with best-case and worst-case analyses. In addition, FIT was evaluated with a lower cutoff for positivity (eTable 2 in the Supplement). Because the number of colonoscopies does not fully capture the burden of CRC screening, particularly in terms of bowel preparation (required for colonoscopy and for CTC), the number of procedures requiring cathartic bowel preparation was considered as an alternative measure of the burden of screening (continuing to assume that harms arise only from colonoscopy with polypectomy30,31,34). It was assumed that follow-up colonoscopy for a positive CTC finding would be performed on the same day,48 eliminating the need for 2 bowel preparations.
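The alternative burden measure amounts to simple bookkeeping, sketched below. The function name and the example counts are hypothetical. The sketch assumes that every colonoscopy and every CTC examination requires one cathartic bowel preparation, except that a same-day follow-up colonoscopy shares the preparation of the positive CTC examination that triggered it.

```python
def cathartic_prep_burden(colonoscopies, ctc_exams, same_day_followups):
    """Count procedures requiring a cathartic bowel preparation.

    colonoscopies:       all colonoscopies (screening, follow-up, surveillance,
                         and diagnosis of symptomatic cancers)
    ctc_exams:           all CTC examinations
    same_day_followups:  follow-up colonoscopies performed on the same day as
                         a positive CTC, which need no second preparation
    """
    if same_day_followups > min(colonoscopies, ctc_exams):
        raise ValueError("same-day follow-ups cannot exceed either procedure count")
    return colonoscopies + ctc_exams - same_day_followups


# Hypothetical lifetime counts per 1000 persons for a CTC strategy.
preps = cathartic_prep_burden(colonoscopies=1700, ctc_exams=5000, same_day_followups=600)
print(preps)
```

Under this measure a CTC strategy carries the preparation burden of its CTC examinations as well as its colonoscopies, which is why the choice of burden measure can change which strategies are recommendable.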
SimCRC was programmed in C++, MISCAN in Delphi, and CRC-SPIN in C#. Output from each model was analyzed in RStudio version 0.98.1103.
In the absence of screening, the models simulated nearly identical life expectancy among 40-year-olds: 39.6 years with SimCRC and 40.0 years with MISCAN and CRC-SPIN. Estimated adenoma prevalence among an unscreened population ranged from 11% to 13% across models at age 40 years, 26% to 36% at age 60 years, and 43% to 50% at age 80 years, with highest prevalence at younger ages with MISCAN and highest prevalence at older ages with SimCRC (Figure 2A; references 14-23,49). Although adenoma prevalence was comparable across models, the models differed in the distribution of adenomas by location within the colon and rectum (Table 3). The proportion of adenomas in the distal colon (ie, descending or sigmoid colon) or rectum ranged from 38% to 63%, with a higher proportion in MISCAN compared with SimCRC and CRC-SPIN. The models also differed in the distribution of the size of the largest adenoma (Table 3). Compared with MISCAN and CRC-SPIN, persons with adenomas in SimCRC were less likely to have a 1- to 5-mm adenoma as the largest adenoma, while persons in CRC-SPIN were more likely to have an adenoma of at least 10 mm as the largest adenoma.
Prior to age 75 years, the models reproduced age-specific CRC incidence rates from SEER from 1975-1979,24 a period with little to no CRC screening (Figure 2B). At older ages, SimCRC and CRC-SPIN predicted incidence rates that were higher than those observed in SEER. The models generally replicated the stage distribution observed in SEER among a largely unscreened population, although the proportion of cases diagnosed at stage IV was lower with CRC-SPIN (19% of cases vs 25% of cases in SEER) (Table 3).
In the absence of screening, the models estimate that 67 to 72 per 1000 40-year-olds will be diagnosed with CRC in their lifetimes and that 27 to 28 per 1000 40-year-olds will die from CRC (eTable 3 in the Supplement).
Outcomes for all screening strategies are shown in eTables 3 through 10 in the Supplement. Although the models differed slightly in terms of absolute benefits, burden, and harms of screening, overall they yielded consistent relative predictions across screening modalities and similar rankings of strategies within each class of modalities. All strategies yielded clinically important LYG compared with no screening (range, 152-313 per 1000 40-year-olds). Lifetime colonoscopy burden ranged from fewer than 900 per 1000 persons (FIT every 3 years from ages 55-75 years) to more than 7500 per 1000 persons (colonoscopy screening every 5 years from ages 45-85 years). The lifetime number of harms from screening (ie, colonoscopy-related complications) was low, with at most 23 per 1000 40-year-olds with colonoscopy screening every 5 years from ages 45 to 85 years.
The LYG relative to the number of colonoscopies required and the efficient frontier for all colonoscopy strategies are presented in Figure 3. Across the 3 models, the LYG and colonoscopy burden were lowest with colonoscopy screening every 15 years from ages 55 to 75 years (range of LYG, 214-236 per 1000 persons; range of colonoscopy burden, 2968-3079 per 1000 persons) and highest with colonoscopy screening every 5 years from ages 45 to 85 years (range of LYG, 282-313 per 1000 persons; range of colonoscopy burden, 7552-7630 per 1000 persons). Similar plots for the other modalities are presented in eFigures 3 through 9 in the Supplement. For all modalities, strategies with screening beginning at age 45 years predominated on the efficient frontier; that is, they generally provided additional LYG at a lower number of additional colonoscopies than strategies with screening beginning at later ages. However, the additional LYG from starting screening at age 45 years instead of 50 years were small relative to the additional number of colonoscopies. For example, with colonoscopy screening every 10 years to age 75 years, lowering the age to begin screening from age 50 to age 45 years yielded 15 to 28 additional LYG per 1000 and required an additional 827 to 856 colonoscopies per 1000. Given this small increase in LYG and the limited empirical data to support lowering the recommended age to begin CRC screening from 50 to 45 years, subsequent analyses presented are limited to strategies with age to begin screening of 50 or 55 years. Within this subset, strategies with screening beginning at age 50 years predominated among those that were on or near the efficient frontier, suggesting that age 50 years would be a reasonable age to begin screening.
Unlike with the age to begin screening, no age to end screening predominated on the efficient frontier. However, the LYG associated with extending the age to end screening were generally small relative to the number of additional colonoscopies required. For example, with colonoscopy every 10 years from age 50 years, raising the age to end screening from 75 to 80 or 85 years (so an additional screening colonoscopy was performed at age 80 years) increased LYG by 2 to 3 per 1000 persons (a 1% change for each model) and the number of colonoscopies by 384 to 414 per 1000 (a 9%-10% change for each model). This suggests that 75 years would be a reasonable age to end screening.
Figure 4 shows the efficient frontiers for the stool-based modalities. When HSgFOBT, FIT, and FIT-DNA were evaluated together, FIT strategies comprised the majority of those that were efficient or near-efficient (efficient strategies are on the efficient frontier and near-efficient strategies have LYG within 98% of the efficient frontier [eFigure 2 in the Supplement]). Strategies involving FIT-DNA with annual screening from age 50 years to age 75, 80, or 85 years were also efficient or near-efficient options in all 3 models, while FIT-DNA strategies with screening every 3 years (the interval at which the test is currently reimbursed by the Centers for Medicare & Medicaid Services50) or every 5 years were dominated in all 3 models. With 2 models (SimCRC and MISCAN), no HSgFOBT strategies were included among those that were efficient or near-efficient, and in 1 model (CRC-SPIN) only 1 HSgFOBT strategy—annual HSgFOBT from ages 50 to 85 years—was near-efficient (eTable 11 in the Supplement).
When strategies combining SIG and stool-based testing were evaluated as a group, SIG-plus-FIT strategies predominated among those that were efficient or near-efficient. For all models, efficient and near-efficient strategies included SIG plus FIT; 1 model (CRC-SPIN) also included 1 SIG-plus-HSgFOBT strategy as efficient: SIG every 10 years with annual HSgFOBT from ages 50 to 85 years (eFigure 10 and eTable 12 in the Supplement).
In light of the limited benefits from extending the age to end screening beyond 75 years, the predominance of earlier ages to begin screening on the efficient frontier, and the lack of empirical evidence to support lowering the recommended age to begin screening from 50 to 45 years, only strategies with CRC screening from ages 50 to 75 years were eligible for model recommendation. There were 3 efficient or near-efficient colonoscopy strategies from age 50 to 75 years: colonoscopy at a 5-, 10-, or 15-year interval (Figure 3). The 15-year interval was eliminated because it yielded fewer LYG than colonoscopy every 10 years from ages 50 to 75 years (the colonoscopy strategy included in the 2008 USPSTF recommendation).47 Model-recommendable strategies with the selection of colonoscopy with a 10-year interval as the benchmark strategy are described below, and model-recommendable strategies with a 5-year colonoscopy interval as the benchmark are shown in eTable 13 in the Supplement. Focus is on the 10-year interval benchmark because moving from a 10-year to a 5-year colonoscopy interval had a small effect on LYG (a 3%-7% increase) relative to the effect on the colonoscopy burden (a 45%-49% increase).
With colonoscopy every 10 years from ages 50 to 75 years selected as the benchmark strategy, the benchmark number of LYG (per 1000 persons) and efficiency ratio against which other strategies were compared equaled 275 and 55, respectively, for SimCRC; 248 and 39 for MISCAN; and 270 and 65 for CRC-SPIN (Table 4).
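As a quick worked check, the benchmark values above determine the minimum LYG that a strategy in another class must reach in each model (90% of the benchmark LYG), along with the median and range of benchmark LYG across the 3 models. The variable names are illustrative; the numbers are the benchmark values just reported.

```python
from statistics import median

# Benchmark (LYG per 1000 persons, efficiency ratio) for colonoscopy every
# 10 years from ages 50 to 75 years, as reported for each model.
benchmark = {"SimCRC": (275, 55), "MISCAN": (248, 39), "CRC-SPIN": (270, 65)}

# Minimum LYG per model: 90% of that model's benchmark LYG.
lyg_floor = {model: round(0.90 * lyg, 1) for model, (lyg, _) in benchmark.items()}

lyg_values = [lyg for lyg, _ in benchmark.values()]
median_lyg = median(lyg_values)
lyg_range = (min(lyg_values), max(lyg_values))

print(lyg_floor)   # per-model 90% thresholds
print(median_lyg)  # median benchmark LYG across models
print(lyg_range)   # range of benchmark LYG across models
```

The thresholds come to 247.5 (SimCRC), 223.2 (MISCAN), and 243.0 (CRC-SPIN) LYG per 1000 persons, and the median of 270 matches the median LYG reported for colonoscopy every 10 years.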
Selecting strategies from the other classes of screening modalities that were efficient or near-efficient and that had LYG at least 90% of the benchmark colonoscopy strategy, while requiring a lower efficiency ratio than the benchmark, resulted in the following set of model-recommendable strategies in addition to colonoscopy every 10 years (the benchmark strategy): annual FIT; SIG every 10 years with annual FIT; and CTC every 5 years (Table 4). Findings were consistent across models. Flexible sigmoidoscopy alone was not selected because, for each model and each SIG strategy, LYG were less than 90% of the benchmark LYG. The strategies of SIG every 10 years with either annual or biennial FIT met the criteria for being a recommendable strategy in at least 2 models, but only 10-yearly SIG with annual FIT was included in the final set of model-recommendable strategies because it yielded more LYG. No HSgFOBT strategy was selected because no strategies with screening from ages 50 to 75 years were efficient or near-efficient. Although annual screening with FIT-DNA from ages 50 to 75 years was near-efficient in all 3 models, it was not selected because its efficiency ratio exceeded that of the benchmark.
Among the 4 model-recommendable strategies (with colonoscopy screening every 10 years from ages 50-75 years as the benchmark), median LYG across the 3 models ranged from 244 per 1000 with annual FIT to 270 per 1000 persons with colonoscopy every 10 years (the benchmark strategy); median colonoscopy burden ranged from 1743 per 1000 persons with CTC every 5 years to 4049 per 1000 persons with colonoscopy every 10 years (Table 4). The median reduction in the lifetime risk of dying from CRC across the 3 models was 81% with annual FIT (eTable 5 in the Supplement), 82% with CTC every 5 years (eTable 10), 85% with SIG every 10 years with annual FIT (eTable 9), and 87% with colonoscopy every 10 years (eTable 3).
Model predictions for the sensitivity analysis using the best-case and worst-case assumptions for test sensitivity are presented in eTable 14 in the Supplement for the set of model-recommendable strategies with screening from ages 50 to 75 years with colonoscopy every 10 years selected as the benchmark strategy. The percent change in numbers of colonoscopies, noncolonoscopy tests, LYG, complications, and CRC deaths averted relative to the base-case analysis ranged from −2% to 3% for the colonoscopy strategy, −6% to 6% for the FIT strategy, −4% to 5% for the SIG-plus-FIT strategy, and −5% to 7% for the CTC strategy.
Overall conclusions for stool-based testing did not change with the inclusion of FIT strategies with a lower cutoff for positivity. Annual FIT with a high positivity threshold continued to be the model-recommendable stool-based strategy in all models (eTable 15 in the Supplement).
Findings were sensitive to the measure used for the burden of screening. When the number of procedures requiring cathartic bowel preparation was used, rather than the number of colonoscopies, CTC was no longer included as a model-recommendable strategy (eTable 16 in the Supplement).
In these analyses of a general population of US 40-year-olds without prior diagnosis of CRC undergoing screening, the following screening strategies from ages 50 to 75 years were estimated to provide comparable LYG and a comparable balance of benefit and burden: colonoscopy every 10 years, annual FIT, SIG every 10 years with annual FIT, and CTC every 5 years. With these strategies, median LYG across the 3 models ranged from 244 per 1000 persons with FIT to 270 per 1000 persons with colonoscopy; median colonoscopy burden ranged from more than 1700 per 1000 persons with CTC to approximately 4000 per 1000 persons with colonoscopy. The median reduction in the lifetime risk of dying from CRC was 81% with annual FIT, 82% with CTC every 5 years, 85% with SIG every 10 years with annual FIT, and 87% with colonoscopy every 10 years. Although the model-recommendable strategies are based on beginning screening at age 50 years, model results suggested that starting screening at age 45 years was more effective and provided a more favorable balance between LYG and screening burden than starting at age 50 years. However, empirical evidence is lacking to support lowering the age to begin screening. Consistent with the 2008 analysis,9 continuing screening beyond age 75 years for regularly screened persons in whom no adenomas or CRCs have been detected was estimated to provide limited benefit relative to the increase in the number of colonoscopies required.
There are some important differences between the current analysis and the 2008 analysis.9 In the current analysis, screening modalities with similar noncolonoscopy burden were grouped. The gFOBT Hemoccult II was not considered because of its low sensitivity,51 which resulted in lower LYG than with other stool-based modalities.9 HSgFOBT and FIT were again considered, although there are now empirical data to suggest that FIT has higher sensitivity and specificity for CRC than HSgFOBT.34 Also considered was the newly developed FIT-DNA test, which has higher sensitivity for CRC and for advanced adenomas than a FIT with a positivity cutoff of 100 ng or more of Hb/mL of buffer (≥20 μg Hb/g of feces) alone.33 Among the stool-based tests, FIT strategies predominated on the efficient frontier; HSgFOBT strategies were consistently below the frontier. Strategies including FIT-DNA with annual testing were on or near the frontier but were not among the model-recommendable strategies because their efficiency ratios were larger than that of the benchmark colonoscopy strategy. The strategy of FIT-DNA every 3 years (the interval currently reimbursed by the Centers for Medicare & Medicaid Services50) provided fewer LYG than that of the benchmark colonoscopy strategy and was dominated by other stool-based modalities. Flexible sigmoidoscopy alone provided fewer LYG than other strategies, but SIG every 10 years combined with annual FIT emerged as a model-recommendable strategy in all models. The latter strategy may be attractive to persons who opt for annual FIT but who also want reassurance from endoscopic testing. However, combined SIG and stool testing is nearly obsolete in the United States.52
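The efficiency concepts used above (efficient frontier, weak dominance, near-efficiency, and efficiency ratio) can be illustrated with a minimal sketch. This is not the code used by the SimCRC, MISCAN, or CRC-SPIN models. The strategy names and the numbers of colonoscopies and LYG below are hypothetical, and the near-efficiency check assumes that "within 98% of the efficient frontier" is evaluated against the frontier linearly interpolated at the same colonoscopy burden.

```python
from typing import List, NamedTuple


class Strategy(NamedTuple):
    name: str
    colonoscopies: float  # lifetime colonoscopies per 1000 persons (burden)
    lyg: float            # life-years gained per 1000 persons (benefit)


NO_SCREENING = Strategy("No screening", 0.0, 0.0)


def ratio(a: Strategy, b: Strategy) -> float:
    """Efficiency ratio: incremental colonoscopies per incremental LYG from a to b."""
    return (b.colonoscopies - a.colonoscopies) / (b.lyg - a.lyg)


def efficient_frontier(strategies: List[Strategy]) -> List[Strategy]:
    """Return the efficient strategies ordered by burden, starting at no screening."""
    ordered = sorted(strategies, key=lambda s: (s.colonoscopies, -s.lyg))
    frontier = [NO_SCREENING]
    for s in ordered:
        # Strongly dominated: no more LYG than a strategy with no more colonoscopies.
        if s.lyg <= frontier[-1].lyg:
            continue
        # Weakly dominated: drop earlier strategies whose efficiency ratio is
        # not smaller than the ratio of the step that follows them.
        while len(frontier) >= 2 and ratio(frontier[-2], frontier[-1]) >= ratio(frontier[-1], s):
            frontier.pop()
        frontier.append(s)
    return frontier


def frontier_lyg_at(frontier: List[Strategy], colonoscopies: float) -> float:
    """LYG on the frontier at a given burden (linear interpolation, an assumption)."""
    for a, b in zip(frontier, frontier[1:]):
        if a.colonoscopies <= colonoscopies < b.colonoscopies:
            share = (colonoscopies - a.colonoscopies) / (b.colonoscopies - a.colonoscopies)
            return a.lyg + share * (b.lyg - a.lyg)
    return frontier[-1].lyg


def is_near_efficient(s: Strategy, frontier: List[Strategy], threshold: float = 0.98) -> bool:
    """True if the strategy's LYG are within `threshold` of the frontier at its burden."""
    return s.lyg >= threshold * frontier_lyg_at(frontier, s.colonoscopies)


# Hypothetical strategies, for illustration only (not model output).
strategies = [
    Strategy("A", 1000, 200),
    Strategy("B", 2000, 240),
    Strategy("C", 2500, 245),  # weakly dominated by a mix of B and D
    Strategy("D", 4000, 270),
    Strategy("E", 3000, 230),  # strongly dominated by C
]
frontier = efficient_frontier(strategies)

for prev, cur in zip(frontier, frontier[1:]):
    print(f"{cur.name}: efficiency ratio {ratio(prev, cur):.1f} colonoscopies per LYG")
for s in strategies:
    if s not in frontier:
        print(f"{s.name}: off the frontier, near-efficient = {is_near_efficient(s, frontier)}")
```

In this hypothetical example, a strategy such as D would be compared against the benchmark by its efficiency ratio, while C, although weakly dominated, would still be labeled near-efficient under the stated interpolation assumption.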
Computed tomographic colonography strategies were also included in the current analysis, whereas CTC was excluded from the 2008 analysis. A strategy involving CTC was model-recommendable provided that the number of colonoscopies, rather than the number of procedures requiring cathartic bowel preparation, was used as the measure of screening burden. When cathartic bowel preparations were included as part of the burden metric, there was an optimistic assumption that colonoscopy for the follow-up of a positive CTC finding would be performed on the same day, thereby eliminating the need for 2 bowel preparations. Same-day follow-up colonoscopy requires integration between radiology and gastroenterology units and is available at some specialized centers in the United States.53 Although the burden of bowel preparation with CTC was included in the sensitivity analysis, none of the analyses accounted for the harms associated with the small risk of radiation-induced cancer from CTC, nor did the analyses account for the harms (or potential benefits) from the follow-up of extracolonic findings detected at CTC. Accounting for these benefits, harms, and burdens might alter whether CTC was a model-recommendable strategy, but evidence is insufficient to reliably quantify the magnitude of these effects.34
Having multiple independently developed models that provide similar findings despite differences in underlying assumptions provides a stronger case for model results. Each model simulates a different average dwell time from adenoma to clinical cancer,11,54 reflecting uncertainty in clinical understanding of these unobservable processes. Using 3 distinct models provides a range of outcomes based on different assumptions, similar to a sensitivity analysis. In general, while the models differed slightly in terms of absolute outcomes (eg, number of LYG from screening, number of colonoscopies required, and number of CRC deaths averted), they yielded consistent relative predictions across screening modalities and similar rankings within classes of screening modalities.
This study should be interpreted in the context of several limitations. First, the models assumed perfect adherence to screening regimens, including all screening, follow-up, and surveillance tests, resulting in a prediction of the maximum achievable benefit for each strategy. Adherence to screening is a crucial component of screening effectiveness. Currently there is limited empirical evidence on test-specific adherence over multiple rounds of screening.55 Furthermore, there are no data describing screening adherence over an extended period (such as the 40-year period from ages 45-85 years, as simulated by the models), making it impossible to inform the models with empirical evidence. The 2008 analysis for the USPSTF9 included a sensitivity analysis examining the effect of adherence, with the expected result that reducing adherence resulted in fewer LYG and lower colonoscopy burden. However, adherence was not incorporated into selection of model-recommendable strategies in 2008 (nor was it in the current analysis), because identifying model-recommendable strategies based on imperfect adherence could result in selection of strategies with short intervals to make up for suboptimal population-level adherence; it could also lead to overscreening for those individuals who adhere to recommendations, potentially at the cost of unnecessary risks and burden.
Second, this analysis is meant to inform population guidelines. It is based on simulation of the general US population and is not intended for individual-level decision making, which would incorporate information about personal risk and patient preferences. Evaluation of personalized screening scenarios was beyond the scope of this analysis. However, screening strategies tailored to family history,56 comorbidity status,57 and screening history10 have been evaluated in other analyses.
Third, although the results provide a framework for evaluating a program of screening, much of the empirical data on test sensitivity and specificity are based on a single round of screening. Additional studies with multiple rounds of screening are needed to inform whether and how test performance varies at repeat screenings. In the absence of data to suggest otherwise, conditional independence of repeat screenings was assumed, meaning there were no systematic false-negative results for adenomas and cancers. This assumption would not hold for HSgFOBT, FIT, or the FIT component of the FIT-DNA test if some lesions never bleed. There is no evidence to inform whether that is the case, or whether the DNA assay component of the FIT-DNA test would also be subject to systematic false-negative findings. Colonoscopy and CTC might also have systematic false-negative findings due to lesions located behind a colonic fold or flat lesions. If test sensitivity is lower at subsequent rounds of screening, estimates of the benefits of screening might be overstated.
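The consequence of the conditional independence assumption can be illustrated with a minimal sketch. It assumes a lesion that is present, and unchanged, across all screening rounds (ignoring growth and progression), and it is not taken from the models. The per-round sensitivity of 0.24 and the 20% share of never-detectable lesions are hypothetical values chosen only to show the direction of the effect.

```python
def detected_independent(sensitivity: float, rounds: int) -> float:
    """Probability of at least one positive result in `rounds` tests when each
    test is conditionally independent with the same per-round sensitivity."""
    return 1.0 - (1.0 - sensitivity) ** rounds


def detected_with_systematic_misses(sensitivity: float, rounds: int,
                                    never_detectable: float) -> float:
    """Same quantity when a fraction of lesions is never detectable
    (eg, lesions that never bleed), holding the average single-round
    sensitivity fixed at `sensitivity`."""
    if not 0.0 <= never_detectable < 1.0:
        raise ValueError("never_detectable must be in [0, 1)")
    detectable = 1.0 - never_detectable
    # Sensitivity among detectable lesions must be higher to keep the same average.
    per_round = sensitivity / detectable
    if per_round > 1.0:
        raise ValueError("average sensitivity is too high for this never-detectable share")
    return detectable * (1.0 - (1.0 - per_round) ** rounds)


# Hypothetical inputs, for illustration only.
SENSITIVITY = 0.24       # single-round sensitivity for a given lesion type
NEVER_DETECTABLE = 0.20  # assumed share of lesions that are always missed

for n in (1, 5, 10):
    independent = detected_independent(SENSITIVITY, n)
    systematic = detected_with_systematic_misses(SENSITIVITY, n, NEVER_DETECTABLE)
    print(f"{n:2d} rounds: independent {independent:.3f}, systematic misses {systematic:.3f}")
```

Both calculations agree for a single round, but with repeated rounds the cumulative detection probability under systematic false-negative results stays below the value under conditional independence and can never exceed the detectable share, which is why the benefits of a screening program might be overstated if the assumption does not hold.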
Fourth, adenoma size was used as an indicator for advanced adenomas, but the models did not explicitly simulate adenoma histology, largely because it is correlated with size. The models did not include the serrated polyp pathway58,59 due to insufficient evidence on the prevalence of sessile-serrated polyps by age, size, and location; their malignant potential; and the ability of screening tests to detect them.
Fifth, it was assumed that, conditional on size, colonoscopy sensitivity is the same for each adenoma within reach of the endoscope, regardless of its location. Observational studies suggest a smaller mortality reduction for proximal than for distal or rectal cancer with colonoscopy,60-68 implying that test sensitivity (and natural history) might differ by location.
Sixth, the effect of uncertainty in model input parameters on the model-recommendable screening strategies was not evaluated with a probabilistic sensitivity analysis. The USPSTF recommendation process necessitated the completion of the systematic evidence review prior to estimation of screening effects by the models. The time frame for presentation of model findings to the USPSTF did not allow for completion of a probabilistic sensitivity analysis, which, with 204 unique screening strategies, would have required at least 200 000 additional simulations per model. The uncertainty in the deep natural history parameters is captured to some degree by the use of 3 models that have different assumptions with respect to natural history.54 To the extent that the 3 models yield similar conclusions, the results appear to be less sensitive to the natural history parameters. The most important external parameters are the test sensitivity estimates, which were varied.
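The scale of the computation described above can be checked with simple arithmetic. The text gives only the number of strategies (204) and the lower bound on simulations (200 000); the number of parameter draws per strategy is not stated, so the value of 1000 draws used below is an assumption.

```python
import math

N_STRATEGIES = 204        # unique screening strategies (from the text)
MIN_TOTAL_RUNS = 200_000  # "at least 200 000 additional simulations per model"
ASSUMED_DRAWS = 1000      # assumption: parameter sets drawn per strategy


def total_runs(n_strategies: int, draws_per_strategy: int) -> int:
    """Simulations needed if every strategy is run once per parameter draw."""
    return n_strategies * draws_per_strategy


# Smallest number of draws per strategy that reaches the stated lower bound.
min_draws = math.ceil(MIN_TOTAL_RUNS / N_STRATEGIES)

print(f"Draws per strategy needed to reach {MIN_TOTAL_RUNS}: {min_draws}")
print(f"Runs with {ASSUMED_DRAWS} draws per strategy: {total_runs(N_STRATEGIES, ASSUMED_DRAWS)}")
```

Under these assumptions, roughly 1000 parameter draws per strategy would be consistent with the stated figure, and the total would apply to each of the 3 models separately.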
Seventh, the measures of the benefits and burden of screening used in the analysis were imperfect. The benefits of screening were measured by LYG and did not account for quality of life. Utility weights have been estimated for diagnosed CRC states,69 but utility weights have not been estimated for the 8 CRC screening tests, nor for colonoscopy complications. Had quality of life been accounted for in this analysis, alternative model-recommendable screening strategies might have emerged. In addition, the number of required colonoscopies was used as the measure of the burden of screening. This was chosen because colonoscopy is the only burden shared by all modalities. All tests are burdensome but in different ways.70 Ideally, a metric would have been identified that accounts for the burden of all testing, but doing so requires subjective assumptions about how many of one test are equivalent to one of another (eg, x stool tests are equivalent to y SIGs and to z colonoscopies). The relative burden of different tests likely varies across patients according to different preferences. In a sensitivity analysis in which the number of cathartic bowel preparations was used as an alternative measure of the burden of screening, CTC every 5 years was no longer included as a model-recommendable strategy, suggesting that the recommendable strategies are sensitive to the measure of screening burden. Future work should consider alternative measures of test burden that would enable direct comparison across all screening strategies.
In this microsimulation modeling study of a previously unscreened population undergoing CRC screening that assumed 100% adherence, the strategies of colonoscopy every 10 years, annual FIT, SIG every 10 years with annual FIT, and CTC every 5 years performed from ages 50 to 75 years provided similar LYG and a comparable balance of benefit and screening burden.
Corresponding Author: Ann G. Zauber, PhD, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, 485 Lexington Ave, Second Floor, New York, NY 10017 (email@example.com).
Published Online: June 15, 2016. doi:10.1001/jama.2016.6828.
Author Contributions: Dr Knudsen had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Drs Knudsen and Zauber served as dual first authors, and Drs Lansdorp-Vogelaar and Kuntz served as dual senior authors.
Study concept and design: Knudsen, Zauber, Rutter, Doria-Rose, Lansdorp-Vogelaar, Kuntz.
Acquisition, analysis, or interpretation of data: Knudsen, Zauber, Rutter, Naber, Doria-Rose, Pabiniak, Johanson, Fischer, Lansdorp-Vogelaar, Kuntz.
Drafting of the manuscript: Knudsen, Zauber, Rutter, Doria-Rose, Lansdorp-Vogelaar, Kuntz.
Critical revision of the manuscript for important intellectual content: Knudsen, Zauber, Rutter, Naber, Doria-Rose, Pabiniak, Johanson, Fischer, Lansdorp-Vogelaar, Kuntz.
Statistical analysis: Knudsen, Zauber, Johanson, Lansdorp-Vogelaar, Kuntz.
Obtained funding: Knudsen, Zauber, Rutter, Lansdorp-Vogelaar, Kuntz.
Administrative, technical, or material support: Zauber, Naber, Doria-Rose, Pabiniak, Johanson, Fischer.
Study supervision: Knudsen, Zauber, Rutter, Lansdorp-Vogelaar, Kuntz.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This analysis was supported by the National Cancer Institute (NCI) of the National Institutes of Health (NIH) (through a supplement from the Agency for Healthcare Research and Quality [AHRQ]) under award U01CA152959. Dr Zauber was also supported in part by an NCI Cancer Center Support Grant under award P30CA008748.
Role of the Funder/Sponsor: Investigators worked with AHRQ staff and members of the USPSTF to define the scope of the project and specific questions to be addressed. AHRQ and NCI staff reviewed the manuscript prior to submission to ensure that the analysis met methodological standards. The authors are solely responsible for the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The views expressed are those of the authors only and do not represent any official position of the NIH, NCI, or AHRQ.
An Interactive Resource for many of the figures shown in the text and in the Supplement is available at https://resources.cisnet.cancer.gov/projects/#crcr/uspstf/.
Additional Contributions: We thank Jennifer Croswell, MD, MPH, from AHRQ, for assistance, support, and feedback throughout the project; members of the USPSTF for comments on earlier versions of this research; the Kaiser Permanente Research Affiliates Evidence-based Practice Center for sharing early drafts of their review of the literature for incorporation in this analysis; the peer reviewers noted in the Additional Information section for suggestions on earlier versions of this study; James Allison, MD, University of California, San Francisco, for addressing questions about fecal immunochemical tests; and Eric “Rocky” Feuer, PhD, NCI, for leadership of CISNET. None of these individuals were compensated in association with their respective contribution to this article.
Additional Information: A draft version of this modeling report underwent external peer review from 3 content experts (Jason Dominitz, MD, MHS, University of Washington and the Department of Veterans Affairs; Russell Harris, MD, University of North Carolina; and David Ransohoff, MD, University of North Carolina) and 3 federal partners (Paul Pinsky, PhD, NCI; Marion Nadel, PhD, MPH, Centers for Disease Control and Prevention; and Jean Shapiro, PhD, Centers for Disease Control and Prevention). Comments were presented to the USPSTF during its deliberation of the evidence and were considered in preparing the final modeling report.
Editorial Disclaimer: This modeling study is presented as a document in support of the accompanying USPSTF Recommendation Statement. It did not undergo additional peer review after submission to JAMA.