Evidence reviews for the US Preventive Services Task Force (USPSTF) use an analytic framework to visually display the key questions that the review will address to allow the USPSTF to evaluate the effectiveness and safety of a preventive service. The questions are depicted by linkages that relate interventions and outcomes. Additional information is available in the USPSTF Procedure Manual.8 FIT indicates fecal immunochemical test; gFOBT, guaiac-based fecal occult blood test; mSEPT9, methylated septin 9 gene; sDNA test, stool DNA test; SSP, sessile serrated polyp.
aScreening technology with conditional approval from the US Food and Drug Administration for screening for colorectal cancer.
bScreening modality not discussed in this article.
KQ indicates key question.
aArticles could be reviewed for more than 1 KQ.
bReasons for exclusion: Relevance: Study aim not relevant. Design: Study did not use an included design. Setting: Study not conducted in a country relevant to US practice or not conducted in, recruited from, or feasible for primary care or a health system. Population: Study not conducted in an included population. Outcomes: Study did not have relevant outcomes or had incomplete outcomes. Screening test: Screening test was out of scope. Quality: Study was poor quality. Abstract only: Full-text publication not available.
eMethods. Literature Search Strategies for Primary Literature
eMethods. Included Studies
eTable 1. Inclusion/Exclusion Criteria
eTable 2. Study Design–Specific Quality Rating Criteria
eTable 3. Key Question 1: Results of Screening gFOBT Trials
eTable 4. Ongoing Studies
eTable 5. Extracolonic Findings
eFigure 1. Key Question 1: FS trial Findings on CRC Incidence and Mortality
eFigure 2. FS Trial Findings on CRC Incidence by Sex
eFigure 3. FS on CRC Mortality by Sex
eFigure 4. Key Question 2: Forest Plot of CT Colonography With Bowel Prep Sensitivity and Specificity for Adenomas ≥10 mm
eFigure 5. Key Question 2: Forest Plot of CT Colonography With Bowel Prep Sensitivity and Specificity for Adenomas ≥6 mm
eFigure 6. Key Question 2: Forest Plot of OC-Sensor Sensitivity and Specificity to Detect Colorectal Cancer (All Colonoscopy Follow-up), by Cutoff (μg Hb/g Feces)
eFigure 7. Key Question 2: Forest Plot of OC-Sensor Sensitivity and Specificity to Detect Advanced Adenomas (All Colonoscopy Follow-up), by Cutoff (μg Hb/g Feces)
eFigure 8. Key Question 2: Cologuard Sensitivity and Specificity to Detect Colorectal Cancer, Advanced Neoplasia, and Advanced Adenomas
Lin JS, Perdue LA, Henrikson NB, Bean SI, Blasi PR. Screening for Colorectal Cancer: Updated Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2021;325(19):1978–1998. doi:10.1001/jama.2021.4417
Colorectal cancer (CRC) remains a significant cause of morbidity and mortality in the US.
To systematically review the effectiveness, test accuracy, and harms of screening for CRC to inform the US Preventive Services Task Force.
MEDLINE, PubMed, and the Cochrane Central Register of Controlled Trials for relevant studies published from January 1, 2015, to December 4, 2019; surveillance through March 26, 2021.
English-language studies conducted in asymptomatic populations at general risk of CRC.
Data Extraction and Synthesis
Two reviewers independently appraised the articles and extracted relevant study data from fair- or good-quality studies. Random-effects meta-analyses were conducted.
Main Outcomes and Measures
Colorectal cancer incidence and mortality, test accuracy in detecting cancers or adenomas, and serious adverse events.
The review included 33 studies (n = 10 776 276) on the effectiveness of screening, 59 (n = 3 491 045) on the test performance of screening tests, and 131 (n = 26 987 366) on the harms of screening. In randomized clinical trials (4 trials, n = 458 002), intention to screen with 1- or 2-time flexible sigmoidoscopy vs no screening was associated with a decrease in CRC-specific mortality (incidence rate ratio, 0.74 [95% CI, 0.68-0.80]). Annual or biennial guaiac fecal occult blood test (gFOBT) vs no screening (5 trials, n = 419 966) was associated with a reduction of CRC-specific mortality after 2 to 9 rounds of screening (relative risk at 19.5 years, 0.91 [95% CI, 0.84-0.98]; relative risk at 30 years, 0.78 [95% CI, 0.65-0.93]). In observational studies, receipt of screening colonoscopy (2 studies, n = 436 927) or fecal immunochemical test (FIT) (1 study, n = 5.4 million) vs no screening was associated with lower risk of CRC incidence or mortality. Nine studies (n = 6497) evaluated the test accuracy of screening computed tomography (CT) colonography, 4 of which also reported the test accuracy of colonoscopy; pooled sensitivity to detect adenomas 6 mm or larger was similar between CT colonography with bowel prep (0.86) and colonoscopy (0.89). In pooled values, commonly evaluated FITs (14 studies, n = 45 403) (sensitivity, 0.74; specificity, 0.94) and stool DNA with FIT (4 studies, n = 12 424) (sensitivity, 0.93; specificity, 0.85) performed better than high-sensitivity gFOBT (2 studies, n = 3503) (sensitivity, 0.50-0.75; specificity, 0.96-0.98) to detect cancers. Serious harms of screening colonoscopy included perforations (3.1/10 000 procedures) and major bleeding (14.6/10 000 procedures). CT colonography may have harms resulting from low-dose ionizing radiation. It is unclear if detection of extracolonic findings on CT colonography is a net benefit or harm.
Conclusions and Relevance
There are several options to screen for colorectal cancer, each with a different level of evidence demonstrating its ability to reduce cancer mortality, its ability to detect cancer or precursor lesions, and its risk of harms.
Although the incidence of colorectal cancer (CRC) has declined over time, it remains a significant cause of morbidity and mortality in the US. Among all cancers, it is third in incidence and cause of cancer death for both men and women.1 In addition, cohort trends indicate that CRC incidence is decreasing only for persons 55 years or older.2 From the mid-1990s until 2013 the incidence of CRC had increased annually by 0.5% to 1.3% in adults aged 40 to 54 years.2
In 2016, the US Preventive Services Task Force (USPSTF) recommended screening for CRC starting at age 50 years and continuing until age 75 years (A recommendation). The task force recommended that the decision to screen for CRC in adults aged 76 to 85 years should be based on the individual, accounting for the patient’s overall health and prior screening history (C recommendation).3 To complete screening, this recommendation offered a number of stool-based and direct visualization tests.
This systematic review was conducted to update the previous review4,5 on the effectiveness, test accuracy, and harms of CRC screening as well as to inform a separate modeling report,6,7 which together were used by the USPSTF in the process of updating its CRC screening recommendation.
This review addressed 3 key questions (KQs), which are listed in Figure 1. No major changes were made to the scope of the previous review for the conduct of the current review except for the addition of 2 screening modalities (ie, capsule endoscopy, urine testing), which are not discussed in this article. The full report9 provides additional details on the methods, results, and contextual issues addressed.
Ovid MEDLINE, PubMed (publisher-supplied records only), and the Cochrane Central Register of Controlled Trials were searched to locate primary studies informing the key questions (eMethods in the Supplement). Searches included literature published between January 1, 2015, and December 4, 2019. The searches were supplemented with expert suggestions and by reviewing reference lists from other relevant systematic reviews, including the 2016 USPSTF evidence report.4 Ongoing surveillance was conducted through March 26, 2021, through article alerts and targeted searches of high-impact journals to identify major studies published in the interim that may affect the conclusions or understanding of the evidence. Two new studies were identified10,11; however, they did not substantively change the review’s interpretation of findings or conclusions and are not discussed further.
Two independent reviewers screened the titles, abstracts, and relevant full-text articles to ensure consistency with a priori inclusion and exclusion criteria (eTable 1 in the Supplement). Included studies were English-language studies of asymptomatic screening populations of individuals 40 years or older who were either at average risk for CRC or not selected for inclusion based on CRC risk factors. Studies that evaluated direct visualization (ie, colonoscopy, flexible sigmoidoscopy, computed tomography [CT] colonography) or currently available stool-based (ie, guaiac fecal occult blood test [gFOBT], fecal immunochemical test [FIT], stool DNA with a FIT [sDNA-FIT]), or serum-based (ie, methylated SEPT9 gene) tests were included.
For KQ1, randomized clinical trials (RCTs) or nonrandomized controlled intervention studies of CRC screening vs no screening or trials comparing screening tests were included. Included studies needed to report outcomes of CRC incidence, CRC-specific mortality, or all-cause mortality. For tests without trial-level evidence, well-conducted prospective cohort studies were included.
For KQ2, test accuracy studies that used colonoscopy as the reference standard were included. Well-conducted test accuracy studies that used robust registry follow-up for screen-negative participants were also included. Studies whose design was subject to a high risk of bias were excluded, including those studies subject to verification bias, spectrum bias, or both.12-16
For KQ3, all trials and observational studies that reported serious adverse events requiring unexpected or unwanted medical attention or resulting in death were included. These events included, but were not limited to, perforation, major bleeding, severe abdominal symptoms, and cardiovascular events. Studies designed to assess for extracolonic findings (ie, incidental findings on CT colonography) and the resultant diagnostic yield and harms of workup were also included.
Two reviewers critically appraised all articles that met inclusion criteria using prespecified quality criteria (eTable 2 in the Supplement).8 Disagreements about critical appraisal were resolved by consensus. Poor-quality studies (ie, those with methodological shortcomings resulting in a high risk of bias) were excluded. One reviewer extracted descriptive information and outcome data into standardized evidence tables and a second reviewer checked the data for accuracy.
The results were synthesized by KQ, type of screening test, and study design. For KQ1, the syntheses were organized into 3 main categories: (1) trials designed to assess the effectiveness (intention to screen) of screening tests compared with no screening; (2) observational studies designed to assess the association of receipt of a screening test compared with no screening; and (3) comparative trials of one screening test vs another screening test. Many of the trials comparing screening tests that met inclusion criteria, however, were designed to determine the differential uptake of tests, determine the comparative yield between tests, or both. As such, they were not powered to detect differences in CRC outcomes or mortality (ie, comparative effectiveness) and are not discussed in this article. When data were available, random-effects meta-analyses were conducted using the restricted maximum likelihood method to estimate the pooled incidence rate ratio (IRR).
For KQ2, the analyses primarily focused on per-person test accuracy of a single test application to detect CRC, advanced adenomas, advanced neoplasia, and adenomas by size (≥6 mm or ≥10 mm). When possible, data from contingency tables were analyzed using a bivariate model, which modeled sensitivity and specificity simultaneously. Although studies evaluating stool-based tests using a colonoscopy reference standard for all persons and studies using a registry follow-up for screen-negative persons were included, only results from the former study design are detailed in this article. For the FITs, random-effects meta-analyses were conducted by test “family” (ie, tests produced by the same manufacturer, using the same components and method and compatible automated analyzers) and by cutoff values (in μg Hb/g feces).
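As a minimal illustration of the per-person accuracy measures described above, the Python sketch below computes sensitivity and specificity from a single 2x2 contingency table. The function name and the counts are hypothetical and are not drawn from any included study. The review pooled such tables with a bivariate model, which this sketch does not implement.

```python
def accuracy_from_table(tp, fp, fn, tn):
    """Return (sensitivity, specificity) for one study's 2x2 table.

    tp/fn: persons with the target lesion who screened positive/negative.
    fp/tn: persons without the lesion who screened positive/negative.
    """
    sensitivity = tp / (tp + fn)  # share of persons with the lesion who screened positive
    specificity = tn / (tn + fp)  # share of persons without the lesion who screened negative
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from the review).
sens, spec = accuracy_from_table(tp=37, fp=570, fn=13, tn=9380)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```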
For KQ3, there were no hypothesized serious harms for stool-, blood-, or serum-based tests beyond test inaccuracy and harms accrued from subsequent colonoscopy. Harms for direct visualization tests were categorized by indication (ie, screening vs follow-up for an abnormal flexible sigmoidoscopy or stool test). For colonoscopy and flexible sigmoidoscopy, random-effects meta-analyses using the DerSimonian and Laird method were conducted to estimate rates of perforation and major bleeding.
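The DerSimonian and Laird estimator named above can be sketched in a few lines of Python. This is a generic textbook form of the method, not the Stata routine used in the review. The function name `dersimonian_laird`, the example rates, and their variances are hypothetical, and in practice rates are often pooled on a transformed scale.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study estimates with the DerSimonian-Laird random-effects method.

    Returns (pooled estimate, standard error, tau^2, Cochran Q).
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least 2 studies are needed")
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


# Hypothetical per-study event rates per 10 000 procedures and their variances.
rates = [2.5, 3.0, 4.2]
rate_variances = [0.30, 0.20, 0.50]
pooled, se, tau2, q = dersimonian_laird(rates, rate_variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se  # approximate 95% CI
print(f"pooled={pooled:.2f} (95% CI, {low:.2f}-{high:.2f}), tau^2={tau2:.3f}")
```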
All quantitative analyses were conducted in Stata version 16 (StataCorp). The presence of statistical heterogeneity was assessed among pooled studies using the I² statistic. All tests were 2-sided, with P < .05 indicating statistical significance.
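For reference, the I² (I-squared) statistic is conventionally derived from Cochran's Q as the share of total variation attributable to between-study heterogeneity. The sketch below uses that standard definition. The function name and inputs are hypothetical, and it is not the Stata implementation used in the review.

```python
def i_squared(effects, variances):
    """Return I^2 (as a percentage) from study estimates and within-study variances."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))  # Cochran Q
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # no observed dispersion between studies
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical log rate ratios and variances, for illustration only.
example_i2 = i_squared([-0.30, -0.22, -0.35], [0.004, 0.006, 0.010])
print(f"I^2 = {example_i2:.1f}%")
```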
The aggregate strength of evidence (ie, high, moderate, or low) was subsequently assessed for each KQ using the approach described in the Methods Guide for the Effectiveness and Comparative Effectiveness Reviews,17 based on consistency, precision, reporting bias, and study quality.
Investigators reviewed 11 306 unique citations and 502 full-text articles for all KQs (Figure 2). Overall, 196 studies reported in 255 publications were included, 70 of which were newly identified since the prior review. A full list of included studies by KQ is available in the Supplement.
Key Question 1. What is the effectiveness or comparative effectiveness of screening in reducing colorectal cancer, mortality, or both?
Thirty-three unique fair- to good-quality studies (n = 10 776 276)18-50 (published in 66 articles18-83) were included to assess the effectiveness or comparative effectiveness of screening tests on CRC incidence and mortality. These included 2 prospective cohort studies37,47 (n = 436 927) that examined the effectiveness of screening colonoscopy, 4 RCTs19,24,29,35 (n = 458 002) that examined the effectiveness of flexible sigmoidoscopy with or without a FIT, 6 trials20,21,27,36,38,39 (n = 525 966) that examined the effectiveness of a gFOBT, and 1 prospective cohort study46 (n = 5 417 699) that examined the effectiveness of a FIT. In addition to 1 screening RCT19 (n = 98 678) that evaluated flexible sigmoidoscopy plus FIT vs flexible sigmoidoscopy alone, 20 studies18,22,23,25,26,28,30-34,40-45,48-50 (n = 471 860) that compared screening modalities were included. The magnitude of benefit in CRC mortality and cancer incidence among screening tests could not be directly compared because of major differences in the design of included studies for each test type (eg, trial vs observational study, intention to screen vs as screened, outcome metric reported). No studies were found evaluating the effectiveness of CT colonography, high-sensitivity gFOBT, sDNA with or without FIT, or serum tests on CRC incidence, CRC mortality, or both.
Two large, prospective observational studies37,47 (n = 436 927) evaluating the association of receipt of screening colonoscopy with CRC incidence or mortality were included (Table 1). After 24 years of follow-up, 1 study among health professionals (n = 88 902) found that the CRC-specific mortality rate was lower in people who self-reported at least 1 screening colonoscopy compared with those who had never had a screening colonoscopy (adjusted hazard ratio, 0.32 [95% CI, 0.24-0.45]).37 This study found that screening colonoscopies were associated with lower CRC mortality from both distal and proximal cancers. Another study conducted among Medicare beneficiaries (n = 348 025) with much shorter follow-up found that people aged 70 to 74 years who underwent a screening colonoscopy had a lower 8-year standardized risk for CRC (−0.42% [95% CI, −0.24% to −0.63%]) than those who did not undergo the test.47
Four well-conducted trials19,24,29,35 (n = 458 002) of 1- or 2-time flexible sigmoidoscopy screening that demonstrated a reduction in CRC incidence and mortality were included (Table 1). All 4 trials were included in the previous review. While 3 of these trials have published longer follow-up since the previous review,19,24,29 the new data did not change the conclusions on screening effectiveness. Based on 4 RCTs that used intention-to-screen analyses, 1- or 2-time flexible sigmoidoscopy was consistently associated with a decrease in CRC incidence (IRR, 0.78 [95% CI, 0.74-0.83], with 28 to 47 fewer CRC cases per 100 000 person-years) and CRC-specific mortality (IRR, 0.74 [95% CI, 0.68-0.80], with 10 to 17 fewer CRC deaths per 100 000 person-years) when compared with no screening at 11 to 17 years of follow-up (eFigure 1 in the Supplement).
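To show how a relative measure such as an IRR relates to an absolute difference, the sketch below applies the pooled IRR for CRC incidence (0.78) to a comparison-group rate. The baseline rate of 150 cases per 100 000 person-years is a hypothetical value chosen for illustration, as is the function name. The absolute ranges reported above come from the trials' own rates, not from this calculation.

```python
def fewer_events_per_100k(baseline_rate_per_100k, irr):
    """Absolute rate reduction implied by an incidence rate ratio."""
    screened_rate = baseline_rate_per_100k * irr  # rate expected under screening
    return baseline_rate_per_100k - screened_rate  # fewer events per 100 000 person-years


baseline = 150.0  # hypothetical comparison-group rate per 100 000 person-years
averted = fewer_events_per_100k(baseline, irr=0.78)
print(f"{averted:.1f} fewer CRC cases per 100 000 person-years")
```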
Six well-conducted trials20,21,27,36,38,39 (n = 780 458) of biennial or annual gFOBT screening that demonstrated a reduction in CRC incidence and mortality were included (Table 1). Based on 5 RCTs20,21,27,36,39 (n = 419 966) that used intention-to-screen analyses, biennial screening with Hemoccult II (Beckman Coulter) was associated with a reduction of CRC-specific mortality compared with no screening after 2 to 9 rounds of screening at 11 to 30 years of follow-up (relative risk [RR], 0.91 [95% CI, 0.84-0.98] at 19.5 years; RR, 0.78 [95% CI, 0.65-0.93] at 30 years) (eTable 3 in the Supplement). One additional trial38 of screening with Hemoccult II in Finland (n = 360 492) reported only interim findings, with a follow-up of 4.5 years.
Although many observational studies have evaluated national FIT screening programs, only 1 prospective observational study46 (n = 5 417 699) that evaluated receipt of FIT on CRC incidence, CRC mortality, or both met the inclusion criteria (Table 1). This study found that 1 to 3 rounds of screening with a biennial FIT (OC-Sensor [Eiken Chemical] or HM JACK [Kyowa Medex]) were associated with lower CRC mortality at 6 years’ follow-up, compared with no screening (adjusted RR, 0.90 [95% CI, 0.84-0.95]).46
In 1 flexible sigmoidoscopy screening RCT (n = 98 678), when each intervention group was compared with the no-screening group, persons in the flexible sigmoidoscopy plus FIT group had a lower risk of CRC-specific mortality than those in the flexible sigmoidoscopy–only group (age-adjusted hazard ratio, 0.62 [95% CI, 0.42-0.90] vs 0.84 [95% CI, 0.61-1.17]), although this difference was not statistically significant.19 Additional included trials were primarily designed to evaluate the comparative uptake/adherence, test positivity, and initial cancer detection of one screening test vs another. Several adequately powered studies currently underway are evaluating the comparative effectiveness of direct visualization vs stool-based screening programs (eTable 4 in the Supplement).
Overall, age-stratified analyses from flexible sigmoidoscopy and gFOBT trials did not demonstrate statistically significant differences in benefit in older vs younger adults, although the age strata used were not consistent across trials. Only 3 gFOBT studies included adults younger than 50 years at recruitment, and none of these studies provided age-stratified analyses for this age group.27,36,39 One study evaluating receipt of screening colonoscopy among Medicare beneficiaries did not find a benefit in 8-year standardized risk for CRC in those aged 75 to 79 years, in contrast to the benefit seen in those aged 70 to 74 years.47 Reductions in CRC incidence (eFigure 2 in the Supplement) and mortality (eFigure 3 in the Supplement) from flexible sigmoidoscopy trials were greater for men than for women. This evidence, however, was less consistent in 3 trials that reported sex differences for gFOBT screening programs.
Key Question 2. What is the accuracy of direct visualization, stool-based, or serum-based screening tests for detecting colorectal cancer, advanced adenomas, or adenomatous polyps based on size?
Fifty-nine studies84-142 (n = 3 491 045) (published in 78 articles84-161) that evaluated the accuracy of various screening tests were included. There were no new studies published since the prior review that would add to the understanding of screening sensitivity or specificity for colonoscopy, CT colonography, or flexible sigmoidoscopy. New studies were identified that evaluated the sensitivity and specificity of stool-based (ie, high-sensitivity gFOBT, FIT, sDNA-FIT) and serum-based tests for screening.
Nine fair- to good-quality studies102,105,110,111,114,117,121,128,138 (n = 6497) that evaluated screening CT colonography were included, 4 of which (n = 4821) also reported the test accuracy of colonoscopy (Table 2).110,111,128,138 In these studies, neither colonoscopy nor CT colonography accurately identified all cancers; however, the number of CRCs was low, and the studies were not powered to estimate the test accuracy for CRC.
Based on 3 studies111,128,138 (n = 2290) that compared colonoscopy to a reference standard of CT colonography–enhanced colonoscopy or repeat colonoscopy, the per-person sensitivity for adenomas 10 mm or larger ranged from 0.89 (95% CI, 0.78-0.96) to 0.95 (95% CI, 0.74-0.99). The per-person sensitivity for adenomas 6 mm or larger ranged from 0.75 (95% CI, 0.63-0.84) to 0.93 (95% CI, 0.88-0.96). Specificity could be calculated only from 1 of the included studies and was 0.89 (95% CI, 0.86-0.91) for adenomas 10 mm or larger and 0.94 (95% CI, 0.92-0.96) for adenomas 6 mm or larger.138
Based on 7 studies105,110,111,114,117,121,128 (n = 5328) evaluating CT colonography with bowel preparation, the sensitivity to detect adenomas 10 mm or larger ranged from 0.67 (95% CI, 0.45-0.84) to 0.94 (95% CI, 0.84-0.98) and specificity ranged from 0.86 (95% CI, 0.85-0.87) to 0.98 (95% CI, 0.96-0.99) (eFigure 4 in the Supplement). Likewise, the sensitivity to detect adenomas 6 mm or larger ranged from 0.73 (95% CI, 0.58-0.84) to 0.98 (95% CI, 0.91-0.99) and specificity ranged from 0.80 (95% CI, 0.77-0.82) to 0.93 (95% CI, 0.90-0.96) (eFigure 5 in the Supplement). Although there was some variation in estimates of sensitivity and specificity among included studies, it remains unclear whether the variation of test performance was due to differences in study design, populations, CT colonography imaging, reader experience, or reading of protocols.
Two84,133 (n = 3503) of the 5 studies that evaluated Hemoccult Sensa (Beckman Coulter) applied a colonoscopy reference standard to all persons (Table 3). In these 2 studies, the sensitivity to detect CRC ranged from 0.50 to 0.75 (95% CI range, 0.09-1.0) and specificity ranged from 0.96 to 0.98 (95% CI range, 0.95-0.99). Hemoccult Sensa was not sensitive to detect advanced adenoma (sensitivity range, 0.06-0.17; 95% CI range, 0.02-0.23).
There is a wide variety of FITs available. Those most commonly evaluated in this review were part of the OC-Sensor family (Eiken Chemical; includes tests OC FIT-CHEK, OC-Auto, OC-Micro, OC-Sensor, and OC-Sensor Micro) or the OC-Light test (by the same manufacturer but using a different methodology) (Table 3). Based on 9 studies89,97,100,107,108,113,127,130,133 (n = 34 352) that used OC-Sensor tests to detect CRC with a colonoscopy reference standard and the manufacturer-recommended cutoff of 20 μg Hb/g feces, pooled sensitivity was 0.74 (95% CI, 0.64-0.83; I² = 31.6%) and pooled specificity was 0.94 (95% CI, 0.93-0.96; I² = 96.6%) (eFigure 6 in the Supplement). As expected at lower cutoffs (10 and 15 μg Hb/g feces), the sensitivity increased and the corresponding specificities decreased. Based on 10 studies89,91,97,100,107,108,113,127,130,133 (n = 40 411) that used OC-Sensor tests to detect advanced adenoma with a colonoscopy reference standard, sensitivity using a cutoff of 20 μg Hb/g feces was 0.23 (95% CI, 0.20-0.25; I² = 47.4%) and specificity was 0.96 (95% CI, 0.95-0.97; I² = 94.8%) (eFigure 7 in the Supplement). Based on 3 studies95,96,98 (n = 31 803), OC-Light had similar sensitivity and specificity to detect CRC and advanced adenoma compared with OC-Sensor.
The only available sDNA screening test, marketed as Cologuard (Exact Sciences), includes a FIT assay and is sometimes referred to as a multitarget stool DNA test. Based on 4 studies99,108,130,142 (n = 12 424) that used a colonoscopy reference standard, pooled sensitivity to detect CRC was 0.93 (95% CI, 0.87-1.0) and pooled specificity was 0.85 (95% CI, 0.84-0.86); to detect advanced adenoma, pooled sensitivity was 0.43 (95% CI, 0.40-0.46) and pooled specificity was 0.89 (95% CI, 0.86-0.92) (Table 3; eFigure 8 in the Supplement).
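The trade-off between the pooled CRC estimates for OC-Sensor FIT (sensitivity, 0.74; specificity, 0.94) and sDNA-FIT (sensitivity, 0.93; specificity, 0.85) can be made concrete by applying them to a screened cohort. In the sketch below, the cohort size of 10 000 and the CRC prevalence of 0.5% are hypothetical assumptions, as is the function name. Only the sensitivity and specificity values come from the text.

```python
def expected_results(n_screened, prevalence, sensitivity, specificity):
    """Expected single-round screening results for a cohort."""
    with_crc = n_screened * prevalence
    without_crc = n_screened - with_crc
    true_pos = with_crc * sensitivity  # cancers detected
    false_pos = without_crc * (1.0 - specificity)  # positives without cancer
    return {
        "detected": true_pos,
        "missed": with_crc - true_pos,
        "false_positives": false_pos,
        "ppv": true_pos / (true_pos + false_pos),  # positive predictive value
    }


COHORT, PREVALENCE = 10_000, 0.005  # hypothetical values, not from the review
fit = expected_results(COHORT, PREVALENCE, sensitivity=0.74, specificity=0.94)
sdna_fit = expected_results(COHORT, PREVALENCE, sensitivity=0.93, specificity=0.85)
for name, r in (("FIT", fit), ("sDNA-FIT", sdna_fit)):
    print(f"{name}: detected={r['detected']:.1f}, missed={r['missed']:.1f}, "
          f"false positives={r['false_positives']:.1f}, PPV={r['ppv']:.3f}")
```

Under these assumptions, sDNA-FIT detects more cancers than FIT but also produces more false-positive results, which is the pattern the pooled estimates imply.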
Currently, one serum test—Epi proColon (Epigenomics)—is available to screen average-risk adults for CRC through detection of circulating methylated SEPT9 DNA. Based on 1 fair-quality nested case-control study129 (n = 6857), sensitivity to detect CRC was 0.68 (95% CI, 0.53-0.80) and specificity was 0.79 (95% CI, 0.77-0.81) (Table 3). The sensitivity to detect advanced adenoma was 0.22 (95% CI, 0.18-0.24) and specificity was 0.79 (95% CI, 0.76-0.82).
While FIT studies that examined differences in test accuracy by age, sex, or race/ethnicity were limited, no consistent differences by subgroup were found. Overall, in 10 studies there were no significant differences in test accuracy by age strata, including 2 studies reporting stratified analyses for persons younger than 50 years; however, 2 studies suggested possible lower specificity to detect CRC in older persons (70 years or older). Six studies reported test accuracy by sex and produced inconsistent findings. One OC-Sensor study reported no difference in test accuracy for advanced neoplasia in Black vs White participants.99
The largest study108,162 on sDNA-FIT reported test accuracy by age, sex, and race/ethnicity groups, although this study was not designed to examine these differences. This study found that the specificity to detect CRC and advanced adenoma decreases as age increases, but there was not a clear pattern for increasing sensitivity with increasing age. Findings were inconsistent in 2 studies that reported test accuracy for White participants compared with Black participants.
Key Question 3. What are the serious harms of the different screening tests?
One hundred thirty-one fair- or good-quality studies18-29,33-36,43,47,49,102,105,110,114,117,128,131,138,163-266 (published in 162 articles18-29,33-36,43,47,49,51-54,56-58,60,61,64,65,68,69,71-80,102,105,110,114,117,128,131,138,143,163-273) were included. Among these, 18 studies19,22,24,28,29,33-35,49,203,206,212,216,234,235,239,254,260 (n = 395 077) evaluated serious harms from screening flexible sigmoidoscopy; 67 studies26,43,47,163,164,166,168,171,172,174,179,180,182-189,191-195,197-199,201,203-205,210,213,215-218,226,229,231,233,237-252,255,256,258,261-266 (n = 25 784 107) evaluated screening colonoscopy; 21 studies19-21,24,26,27,29,34-36,49,169,172,173,175-177,181,225,227,236 (n = 903 872) evaluated colonoscopy following an abnormal result from a stool test, flexible sigmoidoscopy, or CT colonography; and 38 studies18,23,43,102,105,110,114,117,128,138,165,167,170,178,189,190,196,200,202,203,207-211,214,219-224,228,230,232,253,257,259 (n = 140 607) evaluated CT colonography. Of the studies evaluating CT colonography, 7 studies102,105,117,138,202,203,253 (n = 3365) provided estimates of radiation exposure and 27 studies18,23,43,110,128,138,165,167,170,178,200,207-211,214,219-224,230,232,257,259 (n = 48 235) reported extracolonic findings. While no studies examined the harms of stool or serum testing, there are no hypothesized serious harms for these noninvasive tests other than diagnostic inaccuracy (ie, false-positive or false-negative test results) or downstream harms of follow-up tests.
Serious adverse events from colonoscopy among screening populations were estimated at 3.1 perforations (95% CI, 2.3-4.0) per 10 000 procedures (26 studies, n = 5 272 600) and 14.6 major bleeding events (95% CI, 9.4-19.9) per 10 000 procedures (20 studies, n = 5 172 508) (Table 4). Serious adverse events from screening flexible sigmoidoscopy alone were less common, with a pooled estimate of 0.2 perforations (95% CI, 0.1-0.4) per 10 000 procedures (11 studies, n = 359 679) and 0.5 major bleeding events (95% CI, 0-1.3) per 10 000 procedures (10 studies, n = 179 854). However, for colonoscopies following flexible sigmoidoscopy with abnormal findings, the pooled estimates were 12.0 perforations (95% CI, 7.5-16.5) per 10 000 colonoscopy procedures (4 studies, n = 23 022) and 20.7 major bleeding events (95% CI, 8.2-33.2) per 10 000 colonoscopy procedures (4 studies, n = 5790). Serious adverse events from colonoscopy following stool testing with an abnormal result were estimated at 5.4 perforations (95% CI, 3.4-7.4) per 10 000 colonoscopy procedures (12 studies, n = 341 922) and 17.5 serious bleeding events (95% CI, 7.6-27.5) per 10 000 colonoscopy procedures (11 studies, n = 78 793). Other harms that may result from screening, such as cardiopulmonary events or infections, are best assessed using comparative study designs. Only 4 studies47,187,191,262 (n = 4 173 949) reported harms in a cohort that received colonoscopy compared with a cohort that did not. These studies did not find a higher risk of serious harms associated with colonoscopy.
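Because the harms above are expressed per 10 000 procedures, they can be rescaled to any program size. The sketch below does this for the pooled screening colonoscopy estimates and their confidence limits. The program size of 100 000 procedures and the function name are hypothetical.

```python
def expected_events(rate_per_10k, n_procedures):
    """Scale a rate expressed per 10 000 procedures to n_procedures."""
    return rate_per_10k * n_procedures / 10_000


N_PROCEDURES = 100_000  # hypothetical program size

# (point estimate, lower 95% CI limit, upper 95% CI limit) per 10 000 procedures,
# taken from the pooled screening colonoscopy estimates in the text.
estimates = {
    "perforation": (3.1, 2.3, 4.0),
    "major bleeding": (14.6, 9.4, 19.9),
}
for harm, (point, low, high) in estimates.items():
    print(f"{harm}: {expected_events(point, N_PROCEDURES):.0f} expected events "
          f"(range {expected_events(low, N_PROCEDURES):.0f}"
          f"-{expected_events(high, N_PROCEDURES):.0f})")
```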
Data from 17 studies (n = 89 073) showed little to no risk of serious adverse events (eg, symptomatic perforation) for screening CT colonography. While CT colonography may also require a follow-up colonoscopy, sufficient evidence was not found to estimate serious adverse events from colonoscopy follow-up. CT colonography also entails exposure to low-dose ionizing radiation (range, 0.8 to 5.3 mSv), which may increase the risk of malignancy. Additionally, extracolonic findings on CT colonography were common (eTable 5 in the Supplement) (27 studies, n = 48 234). Approximately 1.3% to 11.4% of CT colonographies had potentially important extracolonic findings (CT Colonography Reporting and Data System [C-RADS] category E4) that necessitated diagnostic follow-up. Additionally, 3.4% to 26.9% of CT colonographies had C-RADS category E3 findings, some of which may require additional workup because of incompletely characterized findings. Although some included studies did report the final diagnosis of extracolonic findings, it is still unclear if the detection of extracolonic findings represents an overall benefit (detection and treatment of clinically significant disease) or harm (unnecessary diagnostic workup or identification of condition not needing intervention).
Twenty-three studies provided analyses of differential harms of colonoscopy by age. These studies generally found increasing rates of serious adverse events with increasing age, including perforation and bleeding. Sex differences in serious harms, when reported in 12 studies, suggested little differential risk between men and women. There were inconsistent findings in 4 studies that reported harms stratified by race/ethnicity.
In 4 studies, extracolonic findings on CT colonography were more common with increasing age.110,208,209,211 Three studies reported extracolonic findings by sex, finding similar rates of extracolonic findings in both groups.207,219,221
This systematic review assessed the effectiveness, test accuracy, and harms of CRC screening. A summary of the identified evidence is shown in Table 5. Since the 2016 USPSTF recommendation, more evidence has been published on the effectiveness and test accuracy of newer stool tests (FIT and sDNA-FIT) and the test accuracy of a US Food and Drug Administration–approved serum test (Epi proColon) for use in persons declining colonoscopy, flexible sigmoidoscopy, gFOBT, or FIT. More data on colonoscopy harms have also been published, reporting higher estimates of major bleeding than previously appreciated. Overall, the different screening tests evaluated have different levels of evidence to demonstrate their ability to reduce cancer mortality and to detect cancer, precursor lesions, or both, as well as their risk of serious adverse events.
Data from well-conducted population-based screening RCTs demonstrate that intention to screen with Hemoccult II or flexible sigmoidoscopy can reduce CRC mortality. Hemoccult II and flexible sigmoidoscopy, however, are no longer widely used for screening in the US. Newer screening tests with similar sensitivity may result in CRC mortality reductions similar to reductions shown in existing trials. If sensitivity is better, without a trade-off in specificity (eg, various FITs), mortality reductions could be greater.275 Decision analyses can help understand the trade-offs of false-positive results and optimal intervals of testing for tests that maximize sensitivity with a reduction in specificity (eg, sDNA-FIT). To date, while serum testing has more limited evidence around test accuracy, it has better patient acceptability and adherence than stool-based testing.276 While CT colonography has evidence to support the adequate detection of precursor lesions greater than or equal to 6 mm (similar to colonoscopy), it may have harms associated with cumulative radiation exposure from repeated examinations, the detection of incidental findings, or both.
Adherence to screening remains the biggest challenge to implementation of screening and has consistently lagged behind recommended screenings for other cancers.277 Adherence to a single round of screening, repeated screening, and follow-up colonoscopy varies across studies, settings, and populations.278 Differential adherence to screening tests influences the benefits and harms of a screening program and may influence the selection of a preferred strategy.
Although the incidence of CRC has been increasing among adults younger than 50 years, there is little empirical evidence evaluating potential differences in the effectiveness of screening, test performance of screening tests, and the harms of screening in adults younger than 50 years. Any differences in the effectiveness of screening at younger ages would be attributable to variation in the underlying risk or incidence of CRC, the natural history of disease, or both, as well as differences in test accuracy by age. Limited studies demonstrate no difference in test accuracy of stool testing or harms of colonoscopy in people younger than 50 years. Although it is not hypothesized that colonoscopy or CT colonography is more harmful in younger adults than in older adults, initiating screening at an earlier age will accrue more procedural harms and extracolonic findings, which should be weighed against any incremental benefit of an earlier start to screening.
Systematic reviews have identified multivariable risk prediction models with adequate discrimination,279,280 many of which have been externally validated281,282; however, they are not commonly used in clinical practice.279,283 In theory, multivariable risk assessment can identify persons at higher risk for CRC and tailor when to initiate screening.
While several CRC screening trials evaluating colonoscopy, CT colonography, and FIT are underway, future research should also include trials or well-designed cohort studies in average-risk populations to evaluate the effects of new serum- and urine-based tests on cancer mortality and incidence. In addition, future research should include adequate sampling of different populations (by age, family risk, and race/ethnicity) to allow for robust subgroup analyses, use multivariable risk assessment to guide screening, or both. Studies to confirm the screening test performance of FITs with thus-far limited reproducibility would be helpful to offer other FIT alternatives to OC-Sensor and OC-Light. Likewise, test accuracy studies adequately powered for cancer detection to establish or confirm the screening test performance of promising serum- and urine-based tests are needed to bolster a menu of options for screening that may have greater acceptability and feasibility. In general, test accuracy studies to clarify any differences in accuracy for proximal vs distal lesions, and in the detection of precursor lesions with more potential for malignant transformation (eg, sessile serrated lesions), would also be informative. In addition, understanding the overall net effect of detection of extracolonic findings may be helped by reporting of the downstream benefits and harms of extracolonic findings in randomized or nonrandomized studies with longer-term follow-up.
This review has several limitations. First, it excluded studies in symptomatic people and people with the highest hereditary risk. Second, it included only trials or prospective cohort studies designed to evaluate the association of screening with CRC incidence or mortality. It is possible that excluded well-designed nested case-control studies of colonoscopy or FIT may have lower risk of bias than included prospective cohort studies. Third, although this review addressed some important contextual issues related to screening (eg, adherence to testing, risk assessment to tailor screening, test acceptability and availability), it did not include an assessment of the mechanism of benefit of the different screening tests (primary prevention vs early detection), methods to increase screening adherence, prevalence of interval cancers between screenings, potential harms of overdetection of adenomas or unnecessary polypectomy, technological enhancements to improve the test accuracy of direct visualization, and surveillance after screening.
There are several options to screen for colorectal cancer, each with a different level of evidence demonstrating its ability to reduce cancer mortality, its ability to detect cancer or precursor lesions, and its risk of harms.
Corresponding Author: Jennifer S. Lin, MD, MCR, Kaiser Permanente Evidence-based Practice Center, The Center for Health Research, Kaiser Permanente Northwest, 3800 N Interstate Ave, Portland, OR 97227 (email@example.com).
Accepted for Publication: March 9, 2021.
Correction: This article was corrected on July 20, 2021, for incorrect terminology in the Results section.
Author Contributions: Dr Lin had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: Perdue, Henrikson, Blasi.
Statistical analysis: Perdue.
Obtained funding: Lin.
Administrative, technical, or material support: Perdue, Bean, Blasi.
Conflict of Interest Disclosures: None reported.
Funding/Support: This research was funded under contract HHSA-290-2015-00007-I, Task Order 6, from the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services, under a contract to support the US Preventive Services Task Force (USPSTF).
Role of the Funder/Sponsor: Investigators worked with USPSTF members and AHRQ staff to develop the scope, analytic framework, and key questions for this review. AHRQ had no role in study selection, quality assessment, or synthesis. AHRQ staff provided project oversight, reviewed the report to ensure that the analysis met methodological standards, and distributed the draft for peer review. Otherwise, AHRQ had no role in the conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript findings. The opinions expressed in this document are those of the authors and do not reflect the official position of AHRQ or the US Department of Health and Human Services.
Additional Contributions: We gratefully acknowledge the following individuals for their contributions to this project: Tina Fan, MD, MPH (AHRQ); current and former members of the USPSTF who contributed to topic deliberations; Samir Gupta, MD, MSCS (University of California, San Diego), and Carolyn Rutter, PhD (RAND Corporation), for their content expertise and review of the draft report; Rebecca Siegel, MPH (American Cancer Society), for providing incidence data; and Todd Hannon, MLS, Katherine Essick, BS, and Kevin Lutz, MFA (Kaiser Permanente Center for Health Research), for library and editorial assistance. USPSTF members, peer reviewers, and those commenting on behalf of partner organizations did not receive financial compensation for their contributions.
Additional Information: A draft version of this evidence report underwent external peer review from 6 content experts (Douglas A. Corley, MD, PhD, MPH [Kaiser Permanente Northern California]; Desmond Leddin, MB, MSc [Dalhousie University]; David Lieberman, MD [Oregon Health and Science University]; Dawn Provenzale, MD, MS [Duke University]; and Paul Pinsky, PhD, and Carrie Klabunde, PhD [National Institutes of Health]) and 2 federal partners (Centers for Disease Control and Prevention and the National Cancer Institute). Comments were presented to the USPSTF during its deliberation of the evidence and were considered in preparing the final evidence review.
Editorial Disclaimer: This evidence report is presented as a document in support of the accompanying USPSTF Recommendation Statement. It did not undergo additional peer review after submission to JAMA.