Assessment of Radiologist Performance in Breast Cancer Screening Using Digital Breast Tomosynthesis vs Digital Mammography | Breast Cancer | JAMA Network Open | JAMA Network
Figure 1.  Screening Recall Rates on Mammography Examinations with Digital Breast Tomosynthesis (DBT) vs Digital Mammography (DM) Before Use of DBT

A, Scatterplot of raw recall rate on DBT examinations vs DM examinations pre-DBT. Each point represents a single radiologist; error bars depict 95% CI; the dashed black line represents the points at which the recall rates are equal in the comparison groups; and the dashed blue lines represent the expert-recommended upper threshold recall rate of 12%. B, Distribution of the multivariable-adjusted difference in recall rate on DBT examinations vs DM examinations pre-DBT by radiologist. Error bars represent the width of the 95% CI, and the horizontal dashed line represents no difference in recall rate between comparison groups.

Figure 2.  Secular Trends in Recall and Cancer Detection Rates

Multivariable-adjusted recall rate (A) and cancer detection rate (B) by calendar year according to comparison group, adjusted for the average examination characteristics profile and radiologist-level effects. Error bars depict 95% CIs. Partial-year data for 2017 are not shown. DBT indicates digital breast tomosynthesis; DM, digital mammography.

Table 1.  Characteristics of 2 252 065 Screening Examinations Interpreted From 2009 to 2017 by 198 Radiologists From 104 Facilities in the BCSC
Table 2.  Characteristics of 198 Radiologists Included in the Study From 104 Facilities in the Breast Cancer Surveillance Consortium
Table 3.  Recall Rate and Cancer Detection Rate by Screening Mammography Modality Among 126 Radiologists in the Breast Cancer Surveillance Consortium Who Began Interpreting DBT Between 2009 and 2017
1. Nelson HD, Pappas M, Cantor A, Griffin J, Daeges M, Humphrey L. Harms of breast cancer screening: systematic review to update the 2009 U.S. Preventive Services Task Force recommendation. Ann Intern Med. 2016;164(4):256-267. doi:10.7326/M15-0970
2. Lehman CD, Arao RF, Sprague BL, et al. National performance benchmarks for modern screening digital mammography: update from the Breast Cancer Surveillance Consortium. Radiology. 2017;283(1):49-58. doi:10.1148/radiol.2016161174
3. Institute of Medicine. Improving Breast Imaging Quality Standards. National Academies Press; 2005.
4. D’Orsi CJ, Sickles EA, Mendelson EB, Morris EA, et al. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. American College of Radiology; 2013.
5. Carney PA, Sickles EA, Monsees BS, et al. Identifying minimally acceptable interpretive performance criteria for screening mammography. Radiology. 2010;255(2):354-361. doi:10.1148/radiol.10091636
6. Rosenberg RD, Yankaskas BC, Abraham LA, et al. Performance benchmarks for screening mammography. Radiology. 2006;241(1):55-66. doi:10.1148/radiol.2411051504
7. Sechopoulos I. A review of breast tomosynthesis: part I—the image acquisition process. Med Phys. 2013;40(1):014301. doi:10.1118/1.4770279
8. Sechopoulos I. A review of breast tomosynthesis: part II—image reconstruction, processing and analysis, and advanced applications. Med Phys. 2013;40(1):014302. doi:10.1118/1.4770281
9. US Food and Drug Administration. 2018 Scorecard statistics. Accessed April 11, 2019. https://www.fda.gov/Radiation-EmittingProducts/MammographyQualityStandardsActandProgram/FacilityScorecard/ucm595007.htm
10. Marinovich ML, Hunter KE, Macaskill P, Houssami N. Breast cancer screening using tomosynthesis or mammography: a meta-analysis of cancer detection and recall. J Natl Cancer Inst. 2018;110(9):942-949. doi:10.1093/jnci/djy121
11. Ballard-Barbash R, Taplin SH, Yankaskas BC, et al; Breast Cancer Surveillance Consortium. Breast Cancer Surveillance Consortium: a national mammography screening and outcomes database. AJR Am J Roentgenol. 1997;169(4):1001-1008. doi:10.2214/ajr.169.4.9308451
12. Breast Cancer Surveillance Consortium. Research sites and principal investigators. Accessed February 25, 2020. https://www.bcsc-research.org/about/sites
13. Burnside ES, Lin Y, Munoz del Rio A, et al. Addressing the challenge of assessing physician-level screening performance: mammography as an example. PLoS One. 2014;9(2):e89418. doi:10.1371/journal.pone.0089418
14. Tice JA, Miglioretti DL, Li CS, Vachon CM, Gard CC, Kerlikowske K. Breast density and benign breast disease: risk assessment to identify women at high risk of breast cancer. J Clin Oncol. 2015;33(28):3137-3143. doi:10.1200/JCO.2015.60.8869
15. Miglioretti DL, Rutter CM, Geller BM, et al. Effect of breast augmentation on the accuracy of mammography and cancer characteristics. JAMA. 2004;291(4):442-450. doi:10.1001/jama.291.4.442
16. Houssami N, Abraham LA, Miglioretti DL, et al. Accuracy and outcomes of screening mammography in women with a personal history of early-stage breast cancer. JAMA. 2011;305(8):790-799. doi:10.1001/jama.2011.188
17. Gardiner JC, Luo Z, Roman LA. Fixed effects, random effects and GEE: what are the differences? Stat Med. 2009;28(2):221-239. doi:10.1002/sim.3478
18. Gilbert FJ, Tucker L, Young KC. Digital breast tomosynthesis (DBT): a review of the evidence for use as a screening tool. Clin Radiol. 2016;71(2):141-150. doi:10.1016/j.crad.2015.11.008
19. Friedewald SM, Rafferty EA, Rose SL, et al. Breast cancer screening using tomosynthesis in combination with digital mammography. JAMA. 2014;311(24):2499-2507. doi:10.1001/jama.2014.6095
20. Melnikow J, Fenton JJ, Miglioretti D, Whitlock EP, Weyrich MS. Screening for breast cancer with digital breast tomosynthesis. Published January 2016. Accessed February 21, 2020. https://www.uspreventiveservicestaskforce.org/Home/GetFile/1/16477/dbt-screen-tomo-finalevidrev/pdf
21. Siu AL, US Preventive Services Task Force. Screening for breast cancer: US Preventive Services Task Force recommendation statement. Ann Intern Med. 2016;164(4):279-296. doi:10.7326/M15-2886
22. Oeffinger KC, Fontham ET, Etzioni R, et al; American Cancer Society. Breast cancer screening for women at average risk: 2015 guideline update from the American Cancer Society. JAMA. 2015;314(15):1599-1614. doi:10.1001/jama.2015.12783
23. Houssami N, Miglioretti DL. Digital breast tomosynthesis: a brave new world of mammography screening. JAMA Oncol. 2016;2(6):725-727. doi:10.1001/jamaoncol.2015.5569
24. Richman IB, Hoag JR, Xu X, et al. Adoption of digital breast tomosynthesis in clinical practice. JAMA Intern Med. 2019. doi:10.1001/jamainternmed.2019.1058
25. DiPrete O, Lourenco AP, Baird GL, Mainiero MB. Screening digital mammography recall rate: does it change with digital breast tomosynthesis experience? Radiology. 2017;286(3):838-844. doi:10.1148/radiol.2017170517
26. Narod S. Breast cancer: the importance of overdiagnosis in breast-cancer screening. Nat Rev Clin Oncol. 2016;13(1):5-6. doi:10.1038/nrclinonc.2015.203
27. Welch HG, Prorok PC, O’Malley AJ, Kramer BS. Breast-cancer tumor size, overdiagnosis, and mammography screening effectiveness. N Engl J Med. 2016;375(15):1438-1447. doi:10.1056/NEJMoa1600249
28. Hovda T, Holen AS, Lang K, et al. Interval and consecutive round breast cancer after digital breast tomosynthesis and synthetic 2D mammography versus standard 2D digital mammography in BreastScreen Norway. Radiology. 2020;294(2):256-264. doi:10.1148/radiol.2019191337
29. McDonald ES, Oustimov A, Weinstein SP, Synnestvedt MB, Schnall M, Conant EF. Effectiveness of digital breast tomosynthesis compared with digital mammography: outcomes analysis from 3 years of breast cancer screening. JAMA Oncol. 2016;2(6):737-743. doi:10.1001/jamaoncol.2015.5536
30. Bahl M, Gaffney S, McCarthy AM, Lowry KP, Dang PA, Lehman CD. Breast cancer characteristics associated with 2d digital mammography versus digital breast tomosynthesis for screening-detected and interval cancers. Radiology. 2018;287(1):49-57. doi:10.1148/radiol.2017171148
    Original Investigation
    Oncology
    March 30, 2020

    Assessment of Radiologist Performance in Breast Cancer Screening Using Digital Breast Tomosynthesis vs Digital Mammography

    Author Affiliations
    • 1Department of Surgery, University of Vermont Cancer Center, University of Vermont, Burlington
    • 2Department of Radiology, University of Vermont Cancer Center, University of Vermont, Burlington
    • 3Kaiser Permanente Washington Health Research Institute, Seattle
    • 4Department of Medicine, University of California, San Francisco
    • 5General Internal Medicine Section, Department of Veterans Affairs, University of California, San Francisco
    • 6Department of Epidemiology and Biostatistics, University of California, San Francisco
    • 7Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago
    • 8Department of Radiology, University of North Carolina, Chapel Hill
    • 9Department of Epidemiology, University of North Carolina, Chapel Hill
    • 10Department of Biomedical Data Science, The Dartmouth Institute for Health Policy and Clinical Practice and Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire
    • 11Department of Epidemiology, The Dartmouth Institute for Health Policy and Clinical Practice and Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire
    • 12Department of Radiology, University of Washington School of Medicine, Seattle
    • 13Hutchinson Institute for Cancer Outcomes Research, Seattle, Washington
    • 14Department of Radiology, University of Vermont Cancer Center, University of Vermont, Burlington
    • 15The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, New Hampshire
    • 16Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire
    • 17Division of Biostatistics, Department of Public Health Sciences, University of California Davis School of Medicine, Davis
    • 18Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington, Seattle
    JAMA Netw Open. 2020;3(3):e201759. doi:10.1001/jamanetworkopen.2020.1759
    Key Points

    Question  Is digital breast tomosynthesis (DBT) associated with improved radiologist-level breast cancer screening performance?

    Findings  In this cohort study of 198 radiologists from 104 radiology facilities in the Breast Cancer Surveillance Consortium cohort, DBT was associated with an overall 15% decrease in recall rate and a 21% increase in cancer detection rate compared with digital mammography. Recall rates were significantly lower on DBT examinations compared with digital mammography examinations interpreted before DBT use for 35.7% of radiologists and significantly higher for 14.3%; 50.0% had no statistically significant change in recall rate.

    Meaning  In this study, DBT was associated with a clinically important overall reduction in recall rate and an overall increase in cancer detection rate; however, there was wide variability, with many radiologists showing no significant improvement in recall rate with DBT.

    Abstract

    Importance  Many US radiologists have screening mammography recall rates above the expert-recommended threshold of 12%. The influence of digital breast tomosynthesis (DBT) on the distribution of radiologist recall rates is uncertain.

    Objective  To evaluate radiologists’ recall and cancer detection rates before and after beginning interpretation of DBT examinations.

    Design, Setting, and Participants  This cohort study included 198 radiologists from 104 radiology facilities in the Breast Cancer Surveillance Consortium who interpreted 251 384 DBT and 2 000 681 digital mammography (DM) screening examinations from 2009 to 2017, including 126 radiologists (63.6%) who interpreted DBT examinations during the study period and 72 (36.4%) who exclusively interpreted DM examinations (to adjust for secular trends). Data were analyzed from April 2018 to July 2019.

    Exposures  Digital breast tomosynthesis and DM screening examinations.

    Main Outcomes and Measures  Recall rate and cancer detection rate.

    Results  A total of 198 radiologists interpreted 2 252 065 DM and DBT examinations (2 000 681 [88.8%] DM examinations; 251 384 [11.2%] DBT examinations; 710 934 patients [31.6%] aged 50-59 years; 1 448 981 [64.3%] non-Hispanic white). Among the 126 radiologists (63.6%) who interpreted DBT examinations, 83 (65.9%) had unadjusted DM recall rates of no more than 12% before using DBT, with a median (interquartile range) recall rate of 10.0% (7.5%-13.0%). On DBT examinations, 96 (76.2%) had an unadjusted recall rate of no more than 12%, with a median (interquartile range) recall rate of 8.8% (6.3%-11.3%). A secular trend in recall rate was observed, with the multivariable-adjusted risk of recall on screening examinations declining by 1.2% (95% CI, 0.9%-1.5%) per year. After adjusting for examination characteristics and secular trends, recall rates were 15% lower on DBT examinations compared with DM examinations interpreted before DBT use (relative risk, 0.85; 95% CI, 0.83-0.87). Adjusted recall rates were significantly lower on DBT examinations compared with DM examinations interpreted before DBT use for 45 radiologists (35.7%) and significantly higher for 18 (14.3%); 63 (50.0%) had no statistically significant change. The unadjusted cancer detection rate on DBT was 5.3 per 1000 examinations (95% CI, 5.0-5.7 per 1000 examinations) compared with 4.7 per 1000 examinations (95% CI, 4.6-4.8 per 1000 examinations) on DM examinations interpreted before DBT use (multivariable-adjusted risk ratio, 1.21; 95% CI, 1.11-1.33).

    Conclusions and Relevance  In this study, DBT was associated with an overall decrease in recall rate and an increase in cancer detection rate. However, our results indicated that there is wide variability among radiologists, including a subset of radiologists who experienced increased recall rates on DBT examinations. Radiology practices should audit radiologist DBT screening performance and consider additional DBT training for radiologists whose performance does not improve as expected.

    Introduction

    In the United States, high false-positive rates on mammography have been recognized as a significant harm of breast cancer screening.1 Only 4% to 5% of positive mammograms recalled for further evaluation ultimately lead to a cancer diagnosis.2 There have long been calls for quality improvement efforts to lower screening recall rates in the United States,3 but there is little evidence of improvements to date. The American College of Radiology professional guidelines for mammography interpretation,4 issued in 2013, included a recommended upper threshold of 12% for recall rate, citing the findings of a panel of expert breast imaging physicians.5 In an evaluation2 of digital mammography (DM) screening performance in the Breast Cancer Surveillance Consortium (BCSC) from 2007 to 2013, only 155 of 249 radiologists (62.2%) had a recall rate below the expert-recommended upper threshold of 12%. The median recall rate among BCSC radiologists2,6 increased from 9.7% on film-screen mammography during 1996 to 2002 to 10.8% on DM during 2007 to 2013.

    Digital breast tomosynthesis (DBT) is a new mammography-based tool for breast cancer screening that has quickly disseminated. Digital breast tomosynthesis acquires multiple low-dose 2-dimensional mammography images that are reconstructed computationally and can be scrolled through in a 3-dimensional format.7,8 Approved by the US Food and Drug Administration in 2011, this approach is designed to clarify areas of overlapping breast tissue that may obscure lesions. Approximately half of Food and Drug Administration–certified mammography facilities now have a DBT-capable mammography unit.9

    Prior studies suggest that DBT screening examinations typically have lower recall rates in the United States than conventional DM examinations, while maintaining or even increasing cancer detection.10 To our knowledge, radiologist-level variability in DBT performance has not been examined in a large national sample. We compared radiologist recall and cancer detection rates with DBT and DM using clinical data from the BCSC, which includes a large sample of radiologists from diverse practice settings in the United States. We hypothesized that most radiologists would have a lower recall rate on DBT examinations compared with their recall rate on DM examinations that were interpreted before beginning to use DBT, without a negative association with cancer detection rates.

    Methods
    Study Setting and Data Sources

    We used data from the following 6 active BCSC registries: Carolina Mammography Registry, Kaiser Permanente Washington, New Hampshire Mammography Network, Vermont Breast Cancer Surveillance System, San Francisco Mammography Registry, and Metropolitan Chicago Breast Cancer Registry.2,11,12 The BCSC registries capture examination-level risk factor and radiology data directly from participating radiology facilities. Data on breast cancer diagnoses are obtained by linking to pathology databases; regional Surveillance, Epidemiology, and End Results programs; and state tumor registries. Each registry and the Statistical Coordinating Center received institutional review board approval for this study with either active or passive consenting processes or a waiver of consent. All procedures were Health Insurance Portability and Accountability Act compliant. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cohort studies.

    Participants

    Radiologists who interpreted screening mammography examinations at participating BCSC facilities were eligible for the study. For each radiologist who interpreted DBT examinations, we determined the date of their first DBT examination interpretation (range, June 2011 to November 2016). A total of 3 groups of screening mammograms were defined for each radiologist who initiated use of DBT during the study period, as follows: (1) DM examinations in the 2 years before the radiologist’s DBT start date; (2) DBT examinations; and (3) DM examinations after the radiologist’s DBT start date. Radiologists who interpreted DBT examinations were required to have interpreted at least 100 DBT examinations as well as at least 100 DM examinations before using DBT (n = 126) to ensure reasonable precision13 for the primary comparison of recall rate among groups while also maintaining representation of low-volume radiologists.

    To estimate and control for secular trends in mammography performance, we also identified radiologists who did not interpret DBT examinations during the study period (ie, DBT nonusers). We restricted these to the 72 radiologists who read 960 or more DM examinations (the minimum required by the Mammography Quality Standards Act) in the BCSC database during the 2-year period surrounding July 23, 2014 (the median DBT start date for DBT interpreters).
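
The eligibility rules in this section can be sketched as a small filter. This is a minimal illustration, not the study's code: the `RadiologistCounts` record, its field names, and the `cohort_group` function are hypothetical, and only the thresholds (100 DBT examinations, 100 DM examinations before DBT, 960 DM examinations in the 2-year window) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds stated in the Participants section.
MIN_DBT_EXAMS = 100          # DBT examinations required of DBT interpreters
MIN_DM_PRE_DBT_EXAMS = 100   # DM examinations required before first DBT use
MIN_DM_NONUSER_EXAMS = 960   # DM examinations required of DBT nonusers in the 2-year window


@dataclass
class RadiologistCounts:
    """Hypothetical per-radiologist tallies; field names are illustrative only."""
    radiologist_id: str
    dbt_exams: int            # DBT examinations interpreted during the study period
    dm_pre_dbt_exams: int     # DM examinations interpreted before the DBT start date
    dm_exams_in_window: int   # DM examinations in the 2-year window around July 23, 2014


def cohort_group(r: RadiologistCounts) -> Optional[str]:
    """Return 'dbt_user', 'dbt_nonuser', or None when the radiologist is excluded."""
    if r.dbt_exams > 0:
        # DBT interpreters need enough volume on both sides of the comparison.
        if r.dbt_exams >= MIN_DBT_EXAMS and r.dm_pre_dbt_exams >= MIN_DM_PRE_DBT_EXAMS:
            return "dbt_user"
        return None
    # Radiologists who never read DBT serve as the secular-trend comparison group.
    if r.dm_exams_in_window >= MIN_DM_NONUSER_EXAMS:
        return "dbt_nonuser"
    return None
```

Applied to a list of such records, the filter would yield the two analysis groups (126 DBT interpreters and 72 nonusers in this study) and drop everyone else.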

    Measures and Definitions

    Demographic, risk factor, and medical history information for women undergoing breast imaging was obtained on a self-administered questionnaire completed at each mammogram or by extraction from the electronic medical record. We estimated each woman’s 5-year breast cancer risk using version 2.0 of the BCSC risk model.14

    Breast imaging data, including modality, examination indication, breast density, and assessments, were provided by radiology facilities using standard nomenclature defined by the American College of Radiology Breast Imaging Reporting and Data System.4 All analyses were restricted to DM and DBT screening examinations conducted from 2009 to 2017 among women without a personal history of breast cancer or a history of breast augmentation, given that screening performance is known to differ markedly among women with these histories.15,16 The recall rate was defined as the fraction of screening examinations with a positive initial assessment (ie, category 0, 3, 4, or 5) based on American College of Radiology guidelines.4 The cancer detection rate was defined as the number of positive examinations with invasive cancer or ductal carcinoma in situ diagnosed within 365 days and before the next screening mammogram divided by the total number of examinations.
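
The two outcome definitions above can be written out directly. The following is a minimal sketch under stated assumptions: the `ScreeningExam` record and its field names are hypothetical, while the positive assessment categories (0, 3, 4, 5), the 365-day window, and the "before the next screening mammogram" condition are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

POSITIVE_ASSESSMENTS = {0, 3, 4, 5}  # BI-RADS initial assessments counted as recall
FOLLOW_UP_DAYS = 365                 # cancer must be diagnosed within this window


@dataclass
class ScreeningExam:
    """Hypothetical examination record; field names are illustrative only."""
    birads_assessment: int
    days_to_cancer_dx: Optional[int] = None    # days to invasive cancer or DCIS diagnosis, if any
    days_to_next_screen: Optional[int] = None  # days to the next screening mammogram, if any


def is_recalled(exam: ScreeningExam) -> bool:
    return exam.birads_assessment in POSITIVE_ASSESSMENTS


def is_detected_cancer(exam: ScreeningExam) -> bool:
    """Positive examination with cancer diagnosed within 365 days and before the next screen."""
    if not is_recalled(exam) or exam.days_to_cancer_dx is None:
        return False
    if exam.days_to_cancer_dx > FOLLOW_UP_DAYS:
        return False
    if exam.days_to_next_screen is not None and exam.days_to_cancer_dx >= exam.days_to_next_screen:
        return False
    return True


def recall_rate(exams: Sequence[ScreeningExam]) -> float:
    """Fraction of screening examinations with a positive initial assessment."""
    return sum(is_recalled(e) for e in exams) / len(exams)


def cancer_detection_rate_per_1000(exams: Sequence[ScreeningExam]) -> float:
    """Detected cancers per 1000 screening examinations."""
    return 1000 * sum(is_detected_cancer(e) for e in exams) / len(exams)
```

Note that the denominator for both rates is all screening examinations, not only the recalled ones.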

    Statistical Analysis

    Descriptive statistics were used to evaluate the unadjusted recall rates and cancer detection rates within comparison groups. To control for secular trends and differences in covariates among comparison groups, a generalized linear model with log link and robust variance estimation was used to estimate the probability of a positive examination (ie, recall) associated with 4 different comparison groups, as follows: DM interpreted before any DBT interpretation (ie, DM pre-DBT), DBT examinations, DM examinations interpreted after DBT interpretation (ie, DM post-DBT), and DM examinations interpreted by radiologists who did not interpret DBT examinations during the study period. Examination-level covariates in the models were identified a priori based on known associations with screening performance, and included age, race/ethnicity, first-degree family history of breast cancer, history of breast biopsy, Breast Imaging Reporting and Data System breast density, BCSC 5-year risk, time since last mammogram, history of DBT imaging, examination year, and BCSC registry site. We included radiologist-level fixed effects to account for differences in performance owing to radiologist-level factors associated with DBT use or performance; thus, the estimated relative risks are within-radiologist effects that are adjusted for between-radiologist differences, examination-level covariates, and secular trends.17 For radiologists who interpreted DBT examinations during the study period, separate radiologist-level fixed effects were estimated for DM examinations pre-DBT, DBT examinations, and DM examinations post-DBT. We estimated adjusted recall rates over time for each comparison group using model-based estimates of examination year effects and comparison group–specific radiologist-level fixed effects, adjusted for the average examination characteristics.
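
One way to write the recall model described above is shown below. The notation is introduced here for illustration and does not appear in the original: $Y_{ij}$ is the recall indicator for examination $i$ read by radiologist $j$, $\mathbf{x}_{ij}$ collects the examination-level covariates (including examination year and registry site), $g(i)$ is the comparison group of examination $i$, and $\alpha_{j,g}$ is the fixed effect for radiologist $j$ in comparison group $g$. The radiologist-specific relative risk is then a contrast of two fixed effects; this is a sketch of the structure, not a statement of the authors' exact parameterization.

```latex
\log \Pr\left(Y_{ij} = 1 \mid \mathbf{x}_{ij}\right)
  = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \alpha_{j,\,g(i)},
\qquad
\mathrm{RR}_{j}
  = \exp\left(\alpha_{j,\mathrm{DBT}} - \alpha_{j,\mathrm{DM\ pre\text{-}DBT}}\right)
```

Because the link is logarithmic, exponentiated coefficients and contrasts are interpreted directly as relative risks rather than odds ratios.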

    Radiologist characteristics were described according to whether the radiologist interpreted DBT during the study period and, for DBT interpreters, whether a change in adjusted recall rate on DBT vs DM was observed. We used χ2 tests to compare radiologist characteristics among radiologist groups that had a statistically significant decrease, no significant change, or a statistically significant increase in recall rates on DBT examinations compared with DM examinations pre-DBT.

    Sample size was insufficient to evaluate radiologist-level cancer detection rates for each comparison group in our study. However, we estimated the overall relative risk of cancer detection by comparison group using a generalized linear regression model with log link, using a similar approach as described earlier for recall rate, including radiologist-level fixed effects but excluding radiologist-level and comparison group interactions. We restricted these analyses to 2 037 013 examinations with at least 365 days of follow-up for cancer diagnoses. We estimated adjusted cancer detection rates over time for each comparison group using model-based estimates of examination year effects and radiologist-level fixed effects, adjusted for the average examination characteristics.

    Statistical analyses were performed using R version 3.5 (R Project for Statistical Computing). Tests of statistical significance were 2-sided with α = .05.

    Results

    A total of 198 radiologists interpreted 2 252 065 DM and DBT examinations (2 000 681 [88.8%] DM examinations; 251 384 [11.2%] DBT examinations; 710 934 patients [31.6%] aged 50-59 years; 1 448 981 [64.3%] non-Hispanic white). Characteristics of women undergoing screening were generally similar across comparison groups (Table 1), although more women who underwent DBT examinations, compared with those who underwent DM examinations, were non-Hispanic white patients (190 399 of 251 384 [75.7%] vs 1 258 582 of 2 000 681 [62.9%]). A lower proportion of non-Hispanic black women (17 361 [6.9%] vs 235 429 [11.8%]) and Asian or Pacific Islander women (13 872 [5.5%] vs 254 999 [12.7%]) underwent DBT examinations than DM examinations. There were more women with a BCSC 5-year risk of at least 1.67% who had DM examinations interpreted by DBT nonusers during the study period than those who had DM or DBT examinations interpreted by DBT users (195 075 of 724 316 [26.9%] vs 338 428 of 1 527 749 [22.2%]). Most radiologists were not breast imaging specialists (127 of 163 [77.9%]) and did not practice at an academic medical center (171 of 197 [86.8%]) (Table 2). Compared with radiologists who only interpreted DM examinations, more radiologists who interpreted DBT examinations were breast imaging specialists (28 of 104 [26.9%] vs 8 of 59 [13.6%]), had fellowship training (30 of 106 [28.3%] vs 5 of 59 [8.5%]), and practiced at an academic medical center (25 of 125 [20.0%] vs 1 of 72 [1.4%]), in a hospital (113 of 125 [90.4%] vs 40 of 72 [55.6%]), or in a for-profit practice (41 of 109 [37.6%] vs 8 of 43 [18.6%]).

    Recall Rate
    DBT vs DM Pre-DBT

    Among the 126 radiologists (63.6%) who interpreted DBT examinations, the median (interquartile range) unadjusted DM recall rate before DBT use was 10.0% (7.5%-13.0%), and 83 radiologists (65.9%) had a recall rate below the expert-recommended upper threshold of 12% (Table 3; Figure 1A). The median (interquartile range) unadjusted DBT recall rate was 8.8% (6.3%-11.3%), with 96 radiologists (76.2%) meeting the 12% threshold. After adjusting for examination-level covariates and secular trends, the overall relative risk (RR) for recall on DBT vs DM pre-DBT examinations was 0.85 (95% CI, 0.83-0.87) (Table 3).

    There was a statistically significant decrease in the multivariable-adjusted recall rate for 45 radiologists (35.7%), a statistically significant increase for 18 radiologists (14.3%), and no statistically significant difference for 63 radiologists (50.0%) (Figure 1B). Radiologist characteristics were comparable across these groups in univariate tests, with the exception of the relative amount of DBT vs DM interpreted in the post-DBT period (Table 2). Most (27 [60.0%]) radiologists who experienced a decrease in recall rate on DBT had at least 50% DBT vs DM volume after beginning DBT interpretation.
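
The three-way grouping above can be sketched as a rule on each radiologist's 95% CI for the adjusted difference in recall rate (DBT minus DM pre-DBT). This assumes the classification is equivalent to checking whether the interval excludes zero, which the text does not state explicitly; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_change(ci_lower: float, ci_upper: float, null_value: float = 0.0) -> str:
    """Classify a radiologist by whether the 95% CI for the difference excludes the null."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_upper < null_value:
        return "decrease"   # whole interval below zero: significantly lower recall on DBT
    if ci_lower > null_value:
        return "increase"   # whole interval above zero: significantly higher recall on DBT
    return "no significant change"


def summarize(intervals: Iterable[Tuple[float, float]]) -> Dict[str, Tuple[int, float]]:
    """Return the count and percentage of radiologists in each category."""
    labels = [classify_change(lo, hi) for lo, hi in intervals]
    counts = Counter(labels)
    return {label: (n, round(100 * n / len(labels), 1)) for label, n in counts.items()}
```

With 45, 18, and 63 intervals falling in the three categories, `summarize` returns 35.7%, 14.3%, and 50.0%, matching the proportions reported for the 126 DBT interpreters.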

    DM Before vs After DBT Use

    The unadjusted statistics also suggested a reduction in recall rate on DM examinations interpreted after DBT use. However, the multivariable-adjusted model estimates indicated comparable recall rates with DM examinations interpreted before DBT use (RR, 1.01; 95% CI, 0.99-1.03) (Table 3).

    DBT vs DM Among Radiologists Who Did Not Interpret DBT

    The median (interquartile range) unadjusted DM recall rate among radiologists who did not interpret DBT was 9.5% (6.5%-10.5%), with 61 radiologists (84.7%) meeting the 12% recall rate threshold (eTable in the Supplement). A secular trend was observed, with the multivariable-adjusted risk of recall on screening examinations declining by 1.2% (95% CI, 0.9%-1.5%) per year. Figure 2A illustrates the multivariable-adjusted model estimates of recall rate in each comparison group in the presence of this secular trend and also demonstrates that, after adjustment for secular trends and woman-level characteristics, recall rates on DBT examinations were lower than recall rates on DM examinations among radiologists who did not interpret DBT examinations.

    Cancer Detection Rate
    DBT vs DM Before DBT Use

    Among radiologists who interpreted DBT examinations, the unadjusted cancer detection rate on DM examinations before DBT use was 4.7 per 1000 examinations (95% CI, 4.6-4.8 per 1000 examinations), and the unadjusted cancer detection rate on DBT examinations was 5.3 per 1000 examinations (95% CI, 5.0-5.7 per 1000 examinations) (P < .001) (Table 3). In the multivariable-adjusted regression analysis, the RR of cancer detection on DBT vs DM examinations before DBT use was 1.21 (95% CI, 1.11-1.33) (Table 3). Among 45 radiologists who had a statistically significant decrease in recall rate on DBT compared with DM pre-DBT (median DBT recall rate, 7.5%), the cancer detection rate on DBT examinations was 5.0 per 1000 examinations (95% CI, 4.6-5.4 per 1000 examinations) compared with 4.7 per 1000 examinations (95% CI, 4.5-4.9 per 1000 examinations) on DM pre-DBT examinations (P = .22). Among 81 radiologists who did not have a statistically significant decrease in recall on DBT (median DBT recall rate, 9.1%), the cancer detection rate on DBT examinations was 5.8 per 1000 examinations (95% CI, 5.2-6.3 per 1000 examinations) compared with 4.7 per 1000 examinations (95% CI, 4.5-4.9 per 1000 examinations) on DM pre-DBT examinations (P < .001).

    DM Before vs After DBT Use

    The unadjusted statistics also suggested an increase in cancer detection rate on DM examinations interpreted after DBT use. However, the multivariable-adjusted model estimates indicated comparable cancer detection rates with DM examinations interpreted before DBT use (RR, 1.06; 95% CI, 0.98-1.14) (Table 3).

    DBT vs DM Among Radiologists Who Did Not Interpret DBT

    The unadjusted statistics indicated that radiologists who did not interpret DBT examinations had relatively high cancer detection rates (eTable in the Supplement), but the multivariable-adjusted model indicated that, after adjustment for secular trends and woman-level characteristics, cancer detection rates on DBT examinations were higher than cancer detection rates on DM examinations among radiologists who did not interpret DBT examinations (Figure 2B). In the multivariable-adjusted regression model, the secular trend in cancer detection rates was small and not statistically significant (annual change, −0.4%; 95% CI, −1.8% to 1.0%).

    Discussion

    Our results demonstrate a clinically important downward shift in the distribution of radiologist recall rates with DBT screening among a large, geographically diverse sample of US radiologists. This shift was accompanied by elevated cancer detection rates on DBT examinations, indicating that the reduction in recall was not associated with a decrease in cancer detection rates. Improvements in screening outcomes remained after adjusting for differences in woman- and examination-level characteristics and secular trends.

    Overall, our results are consistent with prior studies evaluating average differences in recall rates on DBT vs DM examinations in the US.10,18-20 A 2018 meta-analysis of 13 US studies10 found that the overall DBT recall rate was 2.2 percentage points lower than that for DM examinations, while the cancer detection rate was elevated by 1.1 per 1000 examinations. However, our study demonstrated that the association of DBT with recall rate varied widely among radiologists. The observed variability indicates that decreased recall on DBT is not universal. We found no strong differences in DBT vs DM recall rate patterns associated with radiologist specialty, fellowship training, or practice characteristics, but radiologists who shifted toward predominant use of DBT (ie, >50% of screening volume) were more likely to have a decrease in recall rate on DBT examinations compared with radiologists who continued interpreting a large percentage of DM examinations after beginning DBT. Additional evidence is needed to characterize radiologist- and practice-level factors associated with DBT performance and to identify settings where additional training or other interventions are warranted to realize the potential benefits of DBT.
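
    To put the meta-analysis averages on a concrete scale, the sketch below converts them into expected counts for a cohort of screening examinations. The cohort size of 10 000 and the function name are assumptions chosen for illustration. The two differences are the pooled averages quoted above, so the result describes an average setting rather than any individual radiologist.

```python
def expected_change(n_examinations, recall_diff_pct_points, detection_diff_per_1000):
    """Convert average rate differences into expected counts (illustrative helper).

    recall_diff_pct_points: reduction in recall rate, in percentage points.
    detection_diff_per_1000: increase in cancer detection rate, per 1000 examinations.
    """
    fewer_recalls = n_examinations * recall_diff_pct_points / 100
    additional_cancers = n_examinations * detection_diff_per_1000 / 1000
    return fewer_recalls, additional_cancers


# Hypothetical cohort size; the differences are the meta-analysis averages from the text.
fewer_recalls, additional_cancers = expected_change(10_000, 2.2, 1.1)
print(f"Fewer recalls per 10 000 examinations: {fewer_recalls:.0f}")
print(f"Additional cancers detected per 10 000 examinations: {additional_cancers:.0f}")
```

    Under these averages, 10 000 DBT examinations would correspond to about 220 fewer recalls and about 11 additional cancers detected compared with DM.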

    Our analyses were designed to test the hypothesis that radiologist-specific recall rates on DBT were lower than their recall rates on DM before the use of DBT. We included a group of radiologists who did not interpret DBT to estimate and adjust for secular trends. The reasons for the declining secular trend in recall rate observed in our study are not clear, but we speculate that it may be because of increased attention to high recall rates in the United States through the publication of updated national benchmarks,2,5 the publicity of screening recommendation statements noting the harms of mammography screening,21,22 and the expectation for lower recall as DBT disseminates.23

    Notably, differences in unadjusted recall and cancer detection rates between radiologists who did vs did not interpret DBT cannot be used to infer the effects of DBT on performance. We found differences in the characteristics of radiologists who did vs did not interpret DBT and differences in the characteristics of the women they screened, all of which may contribute to different unadjusted performance statistics. This is supported by our findings that radiologists who interpreted DBT examinations had higher unadjusted recall rates and lower unadjusted cancer detection rates on DM before DBT interpretation compared with the cohort of radiologists who did not interpret DBT.

    In our study, DBT use was more frequent among non-Hispanic white women and more commonly performed by fellowship-trained breast imaging specialists at academic medical centers. This suggests potential inequities in access to DBT, consistent with a 2019 study24 examining DBT adoption according to community-level socioeconomic resources, and warrants further investigation.

    We are aware of only 1 study examining DM performance after DBT experience, which reported increased recall and cancer detection rates for DM after vs before DBT use in a single practice.25 Notably, the 6 radiologists in that study had very low DM recall rates (mean, 6.8%) and very low cancer detection rates (mean, 2.5 per 1000 examinations) before DBT use. In contrast, our study, which included more than 100 radiologists from varied clinical settings across the US, suggests that DBT experience has little association with DM recall rates or cancer detection rates.

    Limitations

    This study has limitations. The large number of examinations and radiologists from a geographically and racially diverse sample of academic and nonacademic practice settings in the BCSC engenders greatly increased precision and representativeness compared with prior studies of DBT performance. However, the potential for selection bias must be considered. Our analysis of within-radiologist effects minimizes this source of bias by effectively using each radiologist as their own control. Collection of comprehensive examination-level data permitted the control of numerous factors known to be associated with recall and cancer detection rates, such as age, breast density, and time since last mammogram. Our inclusion of data from radiologists who began using DBT at a variety of points as well as radiologists who did not interpret DBT permitted control for secular trends.

    A further limitation of the study was our inability to examine radiologist-level cancer detection rates. However, we were able to evaluate the cancer detection rate separately for the subgroup of radiologists with a significant decrease in recall rate on DBT. While our results indicate that DBT examinations were associated with increases in cancer detection rates overall, future studies will be needed to examine variation in radiologist-level cancer detection rates. Additionally, given concerns regarding overdiagnosis in breast cancer,26,27 it is important to evaluate DBT-detected cancer characteristics. It remains unclear whether DBT is associated with a reduction in interval cancers, advanced stage cancers, or breast cancer mortality among women undergoing screening.28-30

    Conclusions

    In this cohort study, we found that implementation of DBT was associated with an overall reduction in radiologist recall rate and an overall increase in cancer detection rate. However, there is wide variability in these associations among radiologists. Thus, radiology practices should audit radiologist DBT performance statistics closely and consider additional DBT training for radiologists whose screening performance does not improve as expected. Policy makers and women should be informed that most women undergoing DBT examinations will experience a reduced recall rate and elevated cancer detection rate, although these benefits will vary substantially depending on the individual radiologist interpreting the examination.
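
    One way a practice might begin such an audit is sketched below. It is not the procedure used in this study. It computes a radiologist's recall rate with a Wilson 95% CI and flags the radiologist only when the whole interval lies above the expert-recommended upper threshold of 12% shown in Figure 1. The choice of the Wilson interval, the flagging rule, the function names, and the example counts are all assumptions made for illustration.

```python
import math

RECALL_THRESHOLD = 0.12  # expert-recommended upper threshold recall rate (Figure 1)


def wilson_interval(events, total, z=1.96):
    """Wilson score interval for a binomial proportion (approximate 95% CI for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


def audit_recall(recalls, examinations):
    """Summarize one radiologist's recall rate (hypothetical audit rule).

    Flags the radiologist only if the lower CI bound exceeds the threshold,
    so that small volumes with wide intervals are not flagged on noise alone.
    """
    low, high = wilson_interval(recalls, examinations)
    return {
        "rate": recalls / examinations,
        "ci": (low, high),
        "above_threshold": low > RECALL_THRESHOLD,
    }


# Hypothetical counts for two radiologists, not data from the study.
high_recall = audit_recall(recalls=150, examinations=1000)
typical_recall = audit_recall(recalls=90, examinations=1000)
for label, result in [("A", high_recall), ("B", typical_recall)]:
    low, high = result["ci"]
    print(f"Radiologist {label}: recall {result['rate']:.1%} "
          f"(95% CI, {low:.1%}-{high:.1%}), flagged={result['above_threshold']}")
```

    A fuller audit would also track cancer detection rates and compare each radiologist's DBT performance with their own earlier DM performance, as the within-radiologist analyses in this study do.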

    Article Information

    Accepted for Publication: February 2, 2020.

    Published: March 30, 2020. doi:10.1001/jamanetworkopen.2020.1759

    Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Sprague BL et al. JAMA Network Open.

    Corresponding Author: Brian L. Sprague, PhD, Departments of Surgery and Radiology, University of Vermont Cancer Center, University of Vermont, 1 S Prospect St, UHC Room 4425, Burlington, VT 05401 (bsprague@uvm.edu).

    Author Contributions: Drs Coley and Miglioretti had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

    Concept and design: Sprague, Coley, Kerlikowske, Henderson, Onega, Lee, Tosteson, Miglioretti.

    Acquisition, analysis, or interpretation of data: All authors.

    Drafting of the manuscript: Sprague, Coley, Lee.

    Critical revision of the manuscript for important intellectual content: Coley, Kerlikowske, Rauscher, Henderson, Onega, Lee, Herschorn, Tosteson, Miglioretti.

    Statistical analysis: Coley, Onega, Miglioretti.

    Obtained funding: Sprague, Kerlikowske, Henderson, Onega, Lee, Tosteson, Miglioretti.

    Administrative, technical, or material support: Kerlikowske.

    Supervision: Miglioretti.

    Conflict of Interest Disclosures: Dr Kerlikowske reported receiving grants from Google Sciences outside the submitted work and serving as an unpaid consultant to Gene Relationships Across Implicated Loci for the Strive study. Dr Lee reported receiving grants from GE Healthcare; receiving personal fees from the American College of Radiology and Gene Relationships Across Implicated Loci; receiving textbook royalties from McGraw Hill, Oxford University Press, and Wolters Kluwer; and serving on the advisory board of Hologic Scientific outside the submitted work. No other disclosures were reported.

    Funding/Support: This study was funded through award PCS-1504-30370 from the Patient-Centered Outcomes Research Institute. Data collection for this research was additionally supported by the Breast Cancer Surveillance Consortium with funding from the National Cancer Institute (grants P01CA154292 and U54CA163303), the Agency for Healthcare Research and Quality (grant R01 HS018366-01A1), and the University of Vermont Cancer Center, with funds generously awarded by the Lake Champlain Cancer Research Organization (grant 032800). Dr Lee was also supported in part by the American Cancer Society (grant 126947-MRSG-1416001CPHPS). Cancer and vital status data collection were supported by several state public health departments and cancer registries (http://www.bcsc-research.org/work/acknowledgement.html).

    Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    Disclaimer: The statements presented in this work are solely the responsibility of the authors and do not necessarily represent the official views of the Patient-Centered Outcomes Research Institute, its Board of Governors or Methodology Committee, the National Cancer Institute, or the National Institutes of Health.

    Additional Contributions: We thank the participating women, mammography facilities, and radiologists for the data they have provided for this study. You can learn more about the Breast Cancer Surveillance Consortium at http://www.bcsc-research.org/.

    References
    1. Nelson HD, Pappas M, Cantor A, Griffin J, Daeges M, Humphrey L. Harms of breast cancer screening: systematic review to update the 2009 U.S. Preventive Services Task Force recommendation. Ann Intern Med. 2016;164(4):256-267. doi:10.7326/M15-0970
    2. Lehman CD, Arao RF, Sprague BL, et al. National performance benchmarks for modern screening digital mammography: update from the Breast Cancer Surveillance Consortium. Radiology. 2017;283(1):49-58. doi:10.1148/radiol.2016161174
    3. Institute of Medicine. Improving Breast Imaging Quality Standards. National Academies Press; 2005.
    4. D’Orsi CJ, Sickles EA, Mendelson EB, Morris EA, et al. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. American College of Radiology; 2013.
    5. Carney PA, Sickles EA, Monsees BS, et al. Identifying minimally acceptable interpretive performance criteria for screening mammography. Radiology. 2010;255(2):354-361. doi:10.1148/radiol.10091636
    6. Rosenberg RD, Yankaskas BC, Abraham LA, et al. Performance benchmarks for screening mammography. Radiology. 2006;241(1):55-66. doi:10.1148/radiol.2411051504
    7. Sechopoulos I. A review of breast tomosynthesis: part I—the image acquisition process. Med Phys. 2013;40(1):014301. doi:10.1118/1.4770279
    8. Sechopoulos I. A review of breast tomosynthesis: part II—image reconstruction, processing and analysis, and advanced applications. Med Phys. 2013;40(1):014302. doi:10.1118/1.4770281
    9. US Food and Drug Administration. 2018 Scorecard statistics. Accessed April 11, 2019. https://www.fda.gov/Radiation-EmittingProducts/MammographyQualityStandardsActandProgram/FacilityScorecard/ucm595007.htm
    10. Marinovich ML, Hunter KE, Macaskill P, Houssami N. Breast cancer screening using tomosynthesis or mammography: a meta-analysis of cancer detection and recall. J Natl Cancer Inst. 2018;110(9):942-949. doi:10.1093/jnci/djy121
    11. Ballard-Barbash R, Taplin SH, Yankaskas BC, et al; Breast Cancer Surveillance Consortium. Breast Cancer Surveillance Consortium: a national mammography screening and outcomes database. AJR Am J Roentgenol. 1997;169(4):1001-1008. doi:10.2214/ajr.169.4.9308451
    12. Breast Cancer Surveillance Consortium. Research sites and principal investigators. Accessed February 25, 2020. https://www.bcsc-research.org/about/sites
    13. Burnside ES, Lin Y, Munoz del Rio A, et al. Addressing the challenge of assessing physician-level screening performance: mammography as an example. PLoS One. 2014;9(2):e89418. doi:10.1371/journal.pone.0089418
    14. Tice JA, Miglioretti DL, Li CS, Vachon CM, Gard CC, Kerlikowske K. Breast density and benign breast disease: risk assessment to identify women at high risk of breast cancer. J Clin Oncol. 2015;33(28):3137-3143. doi:10.1200/JCO.2015.60.8869
    15. Miglioretti DL, Rutter CM, Geller BM, et al. Effect of breast augmentation on the accuracy of mammography and cancer characteristics. JAMA. 2004;291(4):442-450. doi:10.1001/jama.291.4.442
    16. Houssami N, Abraham LA, Miglioretti DL, et al. Accuracy and outcomes of screening mammography in women with a personal history of early-stage breast cancer. JAMA. 2011;305(8):790-799. doi:10.1001/jama.2011.188
    17. Gardiner JC, Luo Z, Roman LA. Fixed effects, random effects and GEE: what are the differences? Stat Med. 2009;28(2):221-239. doi:10.1002/sim.3478
    18. Gilbert FJ, Tucker L, Young KC. Digital breast tomosynthesis (DBT): a review of the evidence for use as a screening tool. Clin Radiol. 2016;71(2):141-150. doi:10.1016/j.crad.2015.11.008
    19. Friedewald SM, Rafferty EA, Rose SL, et al. Breast cancer screening using tomosynthesis in combination with digital mammography. JAMA. 2014;311(24):2499-2507. doi:10.1001/jama.2014.6095
    20. Melnikow J, Fenton JJ, Miglioretti D, Whitlock EP, Weyrich MS. Screening for breast cancer with digital breast tomosynthesis. Published January 2016. Accessed February 21, 2020. https://www.uspreventiveservicestaskforce.org/Home/GetFile/1/16477/dbt-screen-tomo-finalevidrev/pdf
    21. Siu AL, US Preventive Services Task Force. Screening for breast cancer: US Preventive Services Task Force recommendation statement. Ann Intern Med. 2016;164(4):279-296. doi:10.7326/M15-2886
    22. Oeffinger KC, Fontham ET, Etzioni R, et al; American Cancer Society. Breast cancer screening for women at average risk: 2015 guideline update from the American Cancer Society. JAMA. 2015;314(15):1599-1614. doi:10.1001/jama.2015.12783
    23. Houssami N, Miglioretti DL. Digital breast tomosynthesis: a brave new world of mammography screening. JAMA Oncol. 2016;2(6):725-727. doi:10.1001/jamaoncol.2015.5569
    24. Richman IB, Hoag JR, Xu X, et al. Adoption of digital breast tomosynthesis in clinical practice. JAMA Intern Med. 2019. doi:10.1001/jamainternmed.2019.1058
    25. DiPrete O, Lourenco AP, Baird GL, Mainiero MB. Screening digital mammography recall rate: does it change with digital breast tomosynthesis experience? Radiology. 2017;286(3):838-844. doi:10.1148/radiol.2017170517
    26. Narod S. Breast cancer: the importance of overdiagnosis in breast-cancer screening. Nat Rev Clin Oncol. 2016;13(1):5-6. doi:10.1038/nrclinonc.2015.203
    27. Welch HG, Prorok PC, O’Malley AJ, Kramer BS. Breast-cancer tumor size, overdiagnosis, and mammography screening effectiveness. N Engl J Med. 2016;375(15):1438-1447. doi:10.1056/NEJMoa1600249
    28. Hovda T, Holen AS, Lang K, et al. Interval and consecutive round breast cancer after digital breast tomosynthesis and synthetic 2D mammography versus standard 2D digital mammography in BreastScreen Norway. Radiology. 2020;294(2):256-264. doi:10.1148/radiol.2019191337
    29. McDonald ES, Oustimov A, Weinstein SP, Synnestvedt MB, Schnall M, Conant EF. Effectiveness of digital breast tomosynthesis compared with digital mammography: outcomes analysis from 3 years of breast cancer screening. JAMA Oncol. 2016;2(6):737-743. doi:10.1001/jamaoncol.2015.5536
    30. Bahl M, Gaffney S, McCarthy AM, Lowry KP, Dang PA, Lehman CD. Breast cancer characteristics associated with 2D digital mammography versus digital breast tomosynthesis for screening-detected and interval cancers. Radiology. 2018;287(1):49-57. doi:10.1148/radiol.2017171148