IRB indicates institutional review board. Positive reference standard is a diagnosis of cancer within 365 days of the initial screening examination. Negative reference standard is the absence of a diagnosis of cancer at 1-year follow-up or, for 3 cases, double prophylactic mastectomy. Early second-year screens contribute to the reference standard within the 365-day window.
a A Breast Imaging Reporting and Data System score of greater than 3 was considered a positive test result; a score of 3 or less, negative. One thousand eight hundred thirty participants with both negative mammographic and negative ultrasonographic results were imputed as having a negative integrated reading.
b Because of the paired design, missing reference standard data would not bias the comparison of mammography with integrated mammography and ultrasound but may affect generalizability.
Receiver operating characteristic (ROC) curves were calculated based on a bivariate, binormal model (see “Methods” section for details). Table 2 presents summary characteristics for these curves. The ultrasound ROC is included for completeness; the study was not designed to permit direct comparison to ultrasound alone. The fitted area under the curve for mammography alone is 0.78 (95% confidence interval [CI], 0.67-0.87); for mammography plus ultrasound, 0.91 (95% CI, 0.84-0.96); and for ultrasound alone, 0.80 (95% CI, 0.70-0.88).
Berg WA, Blume JD, Cormack JB, Mendelson EB, Lehrer D, Böhm-Vélez M, Pisano ED, Jong RA, Evans WP, Morton MJ, Mahoney MC, Hovanessian Larsen L, Barr RG, Farria DM, Marques HS, Boparai K; for the ACRIN 6666 Investigators. Combined Screening With Ultrasound and Mammography vs Mammography Alone in Women at Elevated Risk of Breast Cancer. JAMA. 2008;299(18):2151–2163. doi:10.1001/jama.299.18.2151
Author Affiliations: American Radiology Services Inc, Johns Hopkins Green Spring, Lutherville, Maryland (Dr Berg); Center for Statistical Sciences, Brown University, Providence, Rhode Island (Drs Blume and Cormack and Ms Marques); Feinberg School of Medicine, Northwestern University, Chicago, Illinois (Dr Mendelson); CERIM, Buenos Aires, Argentina (Dr Lehrer); Weinstein Imaging Associates, Pittsburgh, Pennsylvania (Dr Böhm-Vélez); University of North Carolina School of Medicine, Chapel Hill (Dr Pisano); University of Toronto, Sunnybrook and Women's Hospital, Toronto, Ontario, Canada (Dr Jong); University of Texas Southwestern Medical Center, Dallas (Dr Evans); Mayo Clinic, Rochester, Minnesota (Dr Morton); University of Cincinnati, Cincinnati, Ohio (Dr Mahoney); Keck School of Medicine, University of Southern California, Los Angeles (Dr Larsen); Forum Health, Western Reserve Care System, Youngstown, Ohio (Dr Barr); Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, Missouri (Dr Farria); American College of Radiology, Philadelphia, Pennsylvania (Ms Boparai).
Context Screening ultrasound may depict small, node-negative breast cancers not seen on mammography.
Objective To compare the diagnostic yield, defined as the proportion of women with positive screen test results and positive reference standard, and performance of screening with ultrasound plus mammography vs mammography alone in women at elevated risk of breast cancer.
Design, Setting, and Participants From April 2004 to February 2006, 2809 women, with at least heterogeneously dense breast tissue in at least 1 quadrant, were recruited from 21 sites to undergo mammographic and physician-performed ultrasonographic examinations in randomized order by a radiologist masked to the other examination results. Reference standard was defined as a combination of pathology and 12-month follow-up and was available for 2637 (96.8%) of the 2725 eligible participants.
Main Outcome Measures Diagnostic yield, sensitivity, specificity, and diagnostic accuracy (assessed by the area under the receiver operating characteristic curve) of combined mammography plus ultrasound vs mammography alone and the positive predictive value of biopsy recommendations for mammography plus ultrasound vs mammography alone.
Results Forty participants (41 breasts) were diagnosed with cancer: 8 suspicious on both ultrasound and mammography, 12 on ultrasound alone, 12 on mammography alone, and 8 participants (9 breasts) on neither. The diagnostic yield for mammography was 7.6 per 1000 women screened (20 of 2637) and increased to 11.8 per 1000 (31 of 2637) for combined mammography plus ultrasound; the supplemental yield was 4.2 per 1000 women screened (95% confidence interval [CI], 1.1-7.2 per 1000; P = .003 that supplemental yield is 0). The diagnostic accuracy for mammography was 0.78 (95% CI, 0.67-0.87) and increased to 0.91 (95% CI, 0.84-0.96) for mammography plus ultrasound (P = .003 that difference is 0). Of 12 supplemental cancers detected by ultrasound alone, 11 (92%) were invasive with a median size of 10 mm (range, 5-40 mm; mean [SE], 12.6 [3.0] mm) and 8 of the 9 lesions (89%) reported had negative nodes. The positive predictive value of biopsy recommendation after full diagnostic workup was 19 of 84 for mammography (22.6%; 95% CI, 14.2%-33%), 21 of 235 for ultrasound (8.9%; 95% CI, 5.6%-13.3%), and 31 of 276 for combined mammography plus ultrasound (11.2%; 95% CI, 7.8%-15.6%).
Conclusions Adding a single screening ultrasound to mammography will yield an additional 1.1 to 7.2 cancers per 1000 high-risk women, but it will also substantially increase the number of false positives.
Trial Registration clinicaltrials.gov Identifier: NCT00072501
Early detection reduces deaths due to breast cancer. The US Preventive Services Task Force analysis of 7 randomized trials of mammographic screening found that the point estimate of the reduction in mortality from screening mammography was 22% in women aged 50 years or older and 15% among women between 40 and 49 years,1 with some individual trials showing far greater benefits in both age groups and with any specific age distinction arbitrary. The magnitude of reduction in mortality seen in individual trials parallels reductions in size distribution2 and rates of node-positive breast cancer.3
Mammography can depict calcifications due to malignancy, including ductal carcinoma in situ (DCIS). Invasive cancers, which can spread to lymph nodes and cause systemic metastases, are most often manifest as noncalcified masses4 and can be mammographically subtle or occult, particularly when the parenchyma is dense. Dense breast tissue is common. More than half of women younger than 50 years5 have either heterogeneously dense, visually estimated as 51% to 75% glandular,6 or extremely dense, visually estimated as more than 75% glandular6 breasts, as do at least one-third of women older than 50 years.5 In women with dense breasts, mammographic sensitivity may be as low as 30% to 48%,7,8 with much higher interval cancer rates7,9 and worse prognosis for resulting clinically detected cancers. Furthermore, dense breast tissue is itself a marker of increased risk of breast cancer on the order of 4- to 6-fold.10 In dense breasts, digital mammography has improved performance, with sensitivity increasing from 55% with screen film to 70% with digital in 1 large series using mammographic and clinical follow-up as a gold standard.11 Digital mammography does not, however, eliminate the fundamental limitation that noncalcified breast cancers are often obscured by surrounding and overlying dense parenchyma.
In women younger than 50 years, the reduced benefit of mammographic screening is attributed to increased breast density, biologically more aggressive cancers, and reduced prevalence of disease. Using a screening interval of 12 months, rather than 24 months, should improve results with rapidly growing malignancies, although dense tissue remains a major limitation to improving outcomes.12 Methods that improve detection despite dense breast tissue are needed.
Supplemental screening ultrasound has the potential of depicting small, node-negative breast cancers not seen on mammography,8,13-17 and its performance is improved in dense parenchyma.8 It is natural to expect that methods that improve the detection of small, node-negative cancers would further reduce mortality when performed in addition to screening mammography. However, direct evidence of a mortality reduction due to screening can only be generated in a large prospective randomized screening trial with mortality as an end point. Such trials are costly, require extensive infrastructure and resources, and are not practical in all contexts. Surrogate aims and end points, such as the diagnostic performance of the screening modality or the size and stage of breast cancers depicted, have been correlated with mortality outcomes,18,19 and can be used to project the mortality reduction if the screening modality were implemented.
Across 42 838 examinations from the 6 published single-center studies of screening ultrasound to date,8,13-17 126 women (0.29%) were shown to have 150 cancers identified only on supplemental ultrasound.20 Of 141 invasive cancers detected only on ultrasound, 99 (70%) were 1 cm or smaller in size.20 In studies for which staging was detailed, 36 of 40 cancers (90%) depicted by ultrasonography alone were categorized as stage 0 or I.20
Concerns remain, however, over the generalizability of such favorable results with screening ultrasound. In particular, there is concern for the operator dependence of freehand screening breast ultrasound because an abnormality must be perceived while scanning for it to be documented. Importantly, recent reports have shown that consistent breast ultrasound examination performance and interpretation are possible with minimal training.21,22 Other limitations to implementing widespread screening ultrasound include a shortage of qualified personnel to perform and interpret the examination and lack of standardized scanning protocols. These concerns have hampered use of screening ultrasound; 35% of surveyed facilities specializing in breast imaging offered it in 2005,23 although most facilities offering screening ultrasound will do so only on a limited basis.
In this study, we report a prospective, multicenter trial, randomized to sequence of performance of mammography and ultrasound, designed to investigate and validate the performance of screening ultrasound in conjunction with mammography, using a standardized protocol and interpretive criteria. This trial was designed to compare the diagnostic yield of screening breast mammography plus ultrasound with mammography alone in women at increased risk of breast cancer. Since this trial began, a multicenter study from Italy was published in which 6449 women with dense breasts and negative mammogram results underwent screening ultrasound, with 29 cancers depicted by ultrasound (cancer detection rate, 0.45%).24 The American College of Radiology Imaging Network (ACRIN) 6666 is the largest trial of screening ultrasound in which mammography and ultrasound have been performed and read independently, allowing detailed analysis of the performance of each modality separately and in combination and reducing potential biases in patient recruitment and interpretation of both mammography and ultrasound. Furthermore, we used standardized scanning and interpretive criteria (http://www.acrin.org/TabID/153/Default.aspx), which should facilitate generalizability of our results.
Unlike previous reports evaluating screening ultrasound, we chose to study a population at elevated risk of breast cancer. Supplemental screening in addition to mammography may be more cost-effective in such populations because the expected prevalence of disease is higher than it is for populations with no risk factors. Furthermore, patients at higher risk may be encouraged to begin screening at an earlier age when the tissue is denser and mammography is more limited in its benefits. Indeed, annual magnetic resonance imaging (MRI) is now recommended in addition to mammography for women at very high risk of breast cancer,25 but it remains limited by high cost, required injection of contrast, reduced patient tolerance, and limited availability and expertise. Ultrasound is relatively inexpensive, requires no contrast, is well tolerated, and is widely available.
Participants were women at elevated risk of breast cancer (Table 1) who presented for routine annual mammography and provided written informed consent. Each participant underwent mammographic and ultrasonographic screening examinations in randomized order with the interpreting radiologist for each examination masked to results of the other. Random assignment of screening order was stratified by site and block randomization with alternating block sizes of 6 and 8 used within each site. If the recommendation from the study mammography or ultrasound was for other than routine annual screening, an integrated mammography plus ultrasound interpretation was recorded by a qualified site investigator radiologist. Otherwise, if both ultrasound and mammography were interpreted as negative or benign, no separate integrated interpretation was performed, and the combination of mammography plus ultrasound was assumed to be negative. Management was based on recommendations from the integrated examination. If needed, targeted ultrasonographic or additional mammographic views were then performed and results, assessments, and recommendations were separately recorded. Results of repeat screening at 12 and 24 months after study entry are still being collected. Race and ethnic group were self-assigned from a list of options for ethnicity and a series of yes or no questions for race.
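The randomization of screening order described above can be sketched as follows. This is a minimal illustration, assuming permuted blocks that alternate in size (6, then 8, then 6, and so on) within one site, each block holding equal numbers of the two orders; the function name, arm labels, and seed handling are hypothetical and are not taken from the trial's actual randomization system.

```python
import random

# Hypothetical labels for the two screening orders.
ARMS = ("mammography_first", "ultrasound_first")


def block_sequence(n_participants, block_sizes=(6, 8), seed=0):
    """Return a list of screening-order assignments for one site.

    Blocks alternate in size (6, 8, 6, ...). Within each block the two
    orders appear equally often and are shuffled.
    """
    rng = random.Random(seed)  # one generator per site keeps sites independent
    sequence = []
    block_index = 0
    while len(sequence) < n_participants:
        size = block_sizes[block_index % len(block_sizes)]
        block = list(ARMS) * (size // 2)  # balanced block
        rng.shuffle(block)
        sequence.extend(block)
        block_index += 1
    # The final block may be only partly used if enrollment stops mid-block.
    return sequence[:n_participants]


site_schedule = block_sequence(14, seed=42)
```

Stratifying by site then amounts to generating one such schedule per site, so that the two orders stay balanced within every site regardless of how many participants each site enrolls.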
Web-based data capture and quality monitoring were conducted by ACRIN's Biostatistics and Data Management Center. For all analyses in this study, data were cleaned and locked as of May 14, 2007. The study received institutional review board approval from all participating sites; ACRIN and National Cancer Institute-Cancer Imaging Program approval; and data and safety monitoring committee review every 6 months.
A total of 2809 women were recruited from 21 sites between April 2004 and February 2006, of whom 2725 were eligible (Figure 1, Table 1). Women aged at least 25 years who presented for routine annual mammography were eligible to participate if they met uniform definitions of elevated risk (Table 1) as determined by study personnel and had heterogeneously dense or extremely dense parenchyma6 in at least 1 quadrant, either by prior mammography report or by review of prior mammograms. Otherwise eligible women with no prior mammography were allowed to enroll under the rationale that such women would be high-risk young women presenting for baseline screening who would usually have dense breasts. Women were excluded if they had signs or symptoms of breast cancer, recent surgical or percutaneous image-guided breast interventional procedures or MRI or tomosynthesis of the breast(s) within the prior 12 months, or mammography or whole breast ultrasound fewer than 11 months earlier. Also excluded were women with breast implants and those who were pregnant, lactating, or planning to become pregnant within 2 years of study entry or who had known metastatic disease. We did not exclude women with prior breast cancer or basal or squamous cell skin cancer or in situ cervical cancer. Women with other prior cancers were eligible to enroll if they had been disease-free for at least 5 years.
At least 2-view mammography was performed using either screen-film or digital mammography. Visually estimated overall mammographic breast density on study mammograms was recorded as less than 25%; 26% to 40%; 41% to 60%; 61% to 80%; or more than 80% dense. Computer-assisted detection was not permitted. Radiologist investigators who had successfully completed both phantom scanning26 and mammographic and ultrasonographic interpretive skills tasks27 performed separate, masked interpretations of mammographic and ultrasonographic examinations. Survey ultrasound was performed using high-resolution linear array, broad bandwidth transducers with maximum frequency of at least 12 MHz, with scanning in transverse and sagittal planes. Lesions other than simple cysts were imaged with and without spatial compounding and power or color Doppler in orthogonal planes (typically radial and antiradial orientations). An image (with embedded clock time) was recorded on entering the ultrasound suite, at the beginning and end of ultrasonographic screening, and on leaving the suite to determine the time to scan and the total physician time in the room. Electively, the axilla could be scanned, and its inclusion was recorded. Investigators recorded ultrasonographic background echotexture and lesion features using Breast Imaging Reporting and Data System (BI-RADS): Ultrasound descriptors28 and average breast thickness to the nearest centimeter.
Assessments for each lesion and for each breast overall were recorded on the expanded 7-point BI-RADS6 scale: 1, negative; 2, benign; 3, probably benign; 4a, low suspicion; 4b, intermediate suspicion; 4c, moderate suspicion; and 5, highly suggestive of malignancy. To allow for meaningful receiver operating characteristic (ROC) analysis, we did not allow use of a 0 BI-RADS score. The ability to recommend additional imaging was separately allowed. Investigators were also asked to rate likelihood of malignancy from 0% to 100% to provide a scale that would potentially improve the ROC analysis. Recommendations for routine annual follow-up, short interval follow-up in 6 months, additional imaging, and biopsy were recorded separately from assessments.
Reference standard information is a combination of biopsy results within 365 days and clinical follow-up at 1 year. One-year follow-up was targeted for 365 days after the last screening date, and very few visits were early: of 2637 participants, 32 (1.2%) had follow-up before 11 months and 12 (0.46%) before 10.5 months. The absence of a known diagnosis of cancer on a participant interview, review of medical records at the 1-year screening follow-up, or both was considered disease negative, as were 3 cases with double prophylactic mastectomies. Biopsy results showing cancer (in situ or infiltrating ductal carcinoma, or infiltrating lobular carcinoma) in the breast or axillary lymph nodes were considered malignant, disease positive, as was 1 other invasive cancer, which proved to be a case of melanoma metastatic to axillary lymph nodes. The melanoma case was retained in the analysis because of its classification at the time the database was locked for analysis. Excision was prompted for core biopsy results of atypical or high-risk lesions including atypical ductal or lobular hyperplasia, lobular carcinoma in situ (LCIS), atypical papilloma, and radial sclerosing lesion.
Statistical software used to perform this analysis was SAS, version 9.1 (SAS Institute Inc, Cary, North Carolina), STATA, version 9.2 (STATA Corp, College Station, Texas), S-PLUS, version 7 (Insightful Corp, Seattle, Washington), and ROCKIT, version 0.9.4 beta (available from the Kurt Rossmann Laboratories for Radiologic Image Research, University of Chicago, Chicago, Illinois). All P values were reported as 2-sided. P <.05 was set as the threshold for significance. All confidence intervals (CIs) are reported at the 95% level.
The primary unit of analysis is the participant, with the most severe breast imaging assessment on mammography or on mammography plus ultrasound used as the primary end point. A BI-RADS assessment of 4a, 4b, 4c, or 5 was considered positive (seen and suspicious) for the mammographic or ultrasonographic imaging test or combination of tests, and an assessment of BI-RADS 1, 2, or 3 was considered negative, as is standard in audits of mammographic outcomes.6,29 We separately analyzed results based on recommendations, with additional imaging or biopsy or both considered positive and short interval or routine follow-up considered negative. Sample-size projections were designed to achieve both the desired level of statistical precision for estimating the yields and at least 80% power to detect a difference in the yields of at least 3 per 1000, while allowing for 17% missing data.
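The participant-level end point described above can be sketched in a few lines. This is only an illustration of the stated rule (most severe assessment per participant; BI-RADS 4a or higher counts as positive); the function and variable names are hypothetical and do not come from the study's analysis code.

```python
# Expanded 7-point BI-RADS scale, ordered from least to most suspicious.
BIRADS_SCALE = ("1", "2", "3", "4a", "4b", "4c", "5")


def severity(assessment):
    """Rank of an assessment on the expanded scale (0 = negative)."""
    return BIRADS_SCALE.index(assessment)


def participant_assessment(breast_assessments):
    """Most severe breast-level assessment for one participant."""
    return max(breast_assessments, key=severity)


def is_positive(assessment):
    """BI-RADS 4a, 4b, 4c, or 5 is a positive test; 1, 2, or 3 is negative."""
    return severity(assessment) >= severity("4a")


# Example: left breast benign, right breast low suspicion.
example = participant_assessment(["2", "4a"])
example_positive = is_positive(example)
```

Because the scale is ordinal, the same ranking can feed the ROC analysis, while the 4a threshold gives the binary test result used for yield, sensitivity, and specificity.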
The diagnostic yield (ie, the proportion of women with a positive screen test and positive reference standard), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were estimated as simple proportions with exact 95% CIs. To account for the natural pairing of assessments within a participant, the McNemar test was used to compare the diagnostic yields, sensitivity, and specificity (Table 2) and the test was inverted to provide a CI for their difference. Conditional logistic regression was used where appropriate. Comparison of PPVs and NPVs was done according to Leisenring et al.30 For sensitivity at the lesion level, we accounted for clustering by using a logistic regression with robust SEs. Empirical and model-based ROC curves were estimated from degree of suspicion (BI-RADS) and quasi-continuous probability scales pooled across the study.31 The areas under the curve (AUCs) were compared under a bivariate, binormal model that accounts for the paired-test design (ie, every participant underwent both screening modalities).32,33 The paired-study design eliminates confounding by participant characteristics in the primary comparison between modalities.
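As a worked illustration of the paired comparison of yields, the sketch below applies an exact (binomial) McNemar test to the discordant pairs implied by the Results: 12 cancers positive on mammography plus ultrasound but not on mammography alone, and 1 cancer positive on mammography alone but downgraded after integration. Whether the authors used the exact or the asymptotic form of the test is not stated, so the exact form here is an assumption, and the function name is hypothetical.

```python
from math import comb


def exact_mcnemar_p(b, c):
    """Two-sided exact McNemar P value from the two discordant counts.

    Under the null hypothesis, each discordant pair is equally likely to
    fall in either direction, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


only_combined = 12      # detected by mammography plus ultrasound only
only_mammography = 1    # detected by mammography alone only (downgraded case)
n_participants = 2637

p_value = exact_mcnemar_p(only_combined, only_mammography)
supplemental_yield_per_1000 = (
    1000 * (only_combined - only_mammography) / n_participants
)
print(f"P = {p_value:.4f}")
print(f"supplemental yield = {supplemental_yield_per_1000:.1f} per 1000")
```

Under these assumptions the P value is 28/8192, about .003, and the yield difference is about 4.2 per 1000, in line with the figures reported for the supplemental yield. Only the discordant pairs matter: participants classified the same way by both readings contribute nothing to the test.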
Of 2725 eligible participants enrolled, only 88 (3.23%) were excluded due to missing data: 13 (0.48%) never completed imaging and 75 (2.75%) yielded no reference standard information (Figure 1). The analysis cohort, consisting of all eligible participants with assessment data and reference standard (n = 2637), was compared with the full eligible study cohort (n = 2725) on baseline characteristics to detect potential biases (Table 1). We note that among the 88 participants with missing data, we would expect only 1 cancer if the data are missing at random.
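The expectation of about 1 cancer among the excluded participants follows from applying the observed cancer prevalence to the 88 participants with missing data. A minimal sketch, assuming the prevalence in the analysis cohort (40 of 2637, from the Results) carries over unchanged under missingness at random; the variable names are illustrative only.

```python
analysis_cohort = 2637   # participants with assessment data and reference standard
cancers_observed = 40    # participants diagnosed with cancer in the analysis cohort
missing = 88             # eligible participants excluded for missing data

prevalence = cancers_observed / analysis_cohort
expected_missing_cancers = missing * prevalence
print(f"prevalence = {100 * prevalence:.2f}%")
print(f"expected cancers among excluded = {expected_missing_cancers:.2f}")
```

The expected count is a little over 1, which rounds to the single cancer stated in the text.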
There were no differences in demographics or risk factors between the analysis cohort of 2637 (4786 breasts) and the overall eligible group of 2725 (Figure 1 and Table 1). The mean (SE) age at enrollment was 55 (0.2) years (range, 25-91 years). Fourteen hundred women (53.09%) had a personal history of breast cancer. Nine of 23 women who carried either the BRCA-1 or BRCA-2 mutation also had a history of breast cancer, as did 4 of 8 women who had undergone chest or mediastinal radiation. Seventy-three percent of participants had undergone mammography between 11 full months and 14 months before entering the study, 11% had prior screening ultrasound, and 7% had prior contrast-enhanced breast magnetic resonance imaging at least a year before entering the study.
Forty of 2637 participants (1.5%) were diagnosed with cancer, 39 of whom had breast cancer: 6, DCIS; 20, invasive ductal carcinoma (IDC) with or without DCIS; 3, invasive lobular carcinoma; and 10, mixed invasive ductal and lobular carcinoma with or without DCIS. One participant had melanoma metastatic to axillary nodes with no evidence of cancer in the breasts. One patient with IDC had contralateral DCIS (41 total breasts with cancer). Four patients had multifocal invasive cancer (45 total malignant lesions). Median size of invasive cancers (considering only the largest per participant) was 12.0 mm (range, 4-40 mm; interquartile range [IQR], 8-18 mm; mean [SE], 14 [1.5] mm; 95% CI, 11.1-17.4 mm). Axillary lymph node staging was performed for 25 participants with invasive cancer, with nodal metastases found in 5 (20%, including the melanoma); axillary staging was not performed for 6 participants with recurrent breast cancer nor was it performed for 3 others.
At the participant level, based on BI-RADS assessments, 20 of 40 cancers (50%) were identified on mammography for a yield of 7.6 per 1000 (Table 2 and Table 3); 5 of 6 DCIS lesions (83%) were seen only on mammography. Fifteen invasive cancers, with a median size of 12 mm (range, 4-25 mm; IQR, 7-20 mm; mean [SE], 14 [1.9] mm; 95% CI, 9.9-18.2 mm), were seen on mammography, with axillary nodes negative in 7 of 10 participants (70%) with staging. Seven invasive cancers were suspicious only on mammography and 8 were suspicious on both mammography and ultrasound. Ultrasound alone depicted cancer in 12 participants: 1, DCIS, and 11, invasive cancers with median size of 10 mm (range, 5-40 mm; IQR, 6-15 mm; mean [SE], 12.6 [3.0] mm; 95% CI, 6.0-19.1 mm), with axillary nodes negative in 8 of 9 participants (89%) with staging. One 4-mm IDC lesion considered suspicious initially on mammography (true positive on mammography) was downgraded to a BI-RADS score of 3 after integration with ultrasound (false negative on mammography plus ultrasound), although it was still recalled for additional mammographic views (thought to be probably benign after recall, and benign at the 6-month follow-up); it was diagnosed when the patient presented with palpable metastatic adenopathy 264 days after study entry. This case is not included among interval cancers.
Thirty-one cancers were depicted in 2637 participants by the combination of mammography plus ultrasound, producing a yield of 11.8 per 1000 women and an increased yield due to ultrasound of 4.2 per 1000 (95% CI, 1.1-7.2; Table 2) over mammography alone. The diagnostic accuracy of mammography alone was 0.78 (95% CI, 0.67-0.87), for ultrasound alone was 0.80 (95% CI, 0.70-0.88), and for combined mammography plus ultrasound was 0.91 (95% CI, 0.84-0.96, Table 2, Figure 2). The AUC for mammography plus ultrasound did not change when incorporating full diagnostic workup that included additional mammographic views.
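To make concrete what an area under the ROC curve over ordinal BI-RADS assessments measures, the sketch below computes an empirical AUC as the probability that a randomly chosen participant with cancer has a more suspicious assessment than a randomly chosen participant without cancer, with ties counted as one-half. The toy assessments are hypothetical and are not study data, and the AUCs reported here came from a fitted binormal model that this sketch does not reproduce.

```python
# Rank of each assessment on the expanded 7-point BI-RADS scale.
BIRADS_RANK = {"1": 0, "2": 1, "3": 2, "4a": 3, "4b": 4, "4c": 5, "5": 6}


def empirical_auc(scores_diseased, scores_healthy):
    """Empirical AUC: P(diseased score > healthy score) + 0.5 * P(tie)."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))


# Hypothetical toy assessments, for illustration only.
diseased = [BIRADS_RANK[a] for a in ["5", "4b", "3", "4a"]]
healthy = [BIRADS_RANK[a] for a in ["1", "2", "3", "1", "4a", "2"]]

toy_auc = empirical_auc(diseased, healthy)
print(f"toy empirical AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to assessments that carry no information about disease status, and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.78 and 0.91.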
Defined as the percentage of participants with a BI-RADS 4a assessment or higher, without a diagnosis of cancer in the following 12 months, the false-positive rate for mammography alone was 4.4% (136 had a BI-RADS score of 4a or higher, of whom 20 had cancer: 116 were false positive of 2637 participants [95% CI, 3.7%-5.3%; Table 3]); for ultrasound alone, the false-positive rate was 8.1% (213 of 2637; 95% CI, 7.1%-9.2%); and for combined mammography plus ultrasound was 10.4% (275 of 2637; 95% CI, 9.3%-11.7%). In 5.2% of participants (136 of 2637; 95% CI, 4.3%-6.1%), ultrasound, but not mammography, resulted in a suspicious assessment and biopsy and 8.8% (12 of 136; 95% CI, 4.6%-14.9%) of these participants had cancer. Seventy-one participants had only a cyst aspiration, without a biopsy, with no malignancies among these lesions; 43 of these participants had a suspicious assessment only on ultrasound and 2 had a suspicious assessment only on mammography.
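The false-positive rates above use all 2637 participants as the denominator, not only those without cancer. The sketch below reproduces the three percentages from the reported counts and, for contrast, shows the mammography figure with the 2597 participants without cancer as the denominator (1 minus specificity); that second figure is computed here for illustration and is not quoted from the text. Variable names are hypothetical.

```python
n_participants = 2637
n_with_cancer = 40

# Participants with BI-RADS 4a or higher and no cancer within 12 months.
false_positives = {
    "mammography": 116,
    "ultrasound": 213,
    "mammography_plus_ultrasound": 275,
}

# Rate as defined in the text: false positives over all participants.
false_positive_pct = {
    modality: 100 * count / n_participants
    for modality, count in false_positives.items()
}

# Alternative denominator: participants without cancer only.
mammography_one_minus_specificity = (
    100 * false_positives["mammography"] / (n_participants - n_with_cancer)
)

for modality, pct in false_positive_pct.items():
    print(f"{modality}: {pct:.1f}%")
```

With a cancer prevalence of about 1.5%, the two denominators give nearly the same value, so the choice of definition changes the reported rates only slightly.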
Table 4 details the recommendations by modality. To calculate the PPV1 of recall,6 the number of participants with cancer is divided by the number who were recalled for additional evaluation, biopsy, or both. For mammography, 21 participants were diagnosed with cancer among 276 participants who underwent additional evaluation, biopsy, or both, for a PPV1 of 7.6% (95% CI, 4.8%-11.4%); for ultrasound, the PPV1 was 22 of 337 (6.5%; 95% CI, 4.1%-9.7%); and for combined integrated mammography plus ultrasound, 32 of 436 (7.3%; 95% CI, 5.1%-10.2%). Of those 276 participants recalled from routine mammography, after complete diagnostic workup, 84 participants were recommended for biopsy, of whom 19 had cancer, resulting in a PPV2 of 22.6% (95% CI, 14.2%-33%).6 For ultrasound, 21 of 235 patients with a biopsy recommendation after workup had cancer, resulting in a PPV2 of 8.9% (95% CI, 5.6%-13.3%). Although 1 of these cancers was classified as BI-RADS 3 based on the initial ultrasound, it was nevertheless worked up and classified as BI-RADS 4b on mammography. After mammography plus ultrasound and a full diagnostic workup, 31 of 276 participants who had undergone biopsy had cancer, resulting in a PPV2 of mammography plus ultrasound of 11.2% (95% CI, 7.8%-15.6%).
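The predictive values above are simple proportions with exact confidence intervals. As a sketch of how such an interval can be obtained, the code below computes a Clopper-Pearson interval by bisection on the binomial distribution and applies it to the mammography PPV2 of 19 of 84. The Methods say only that the intervals are "exact"; treating that as Clopper-Pearson is an assumption, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_term)
    return min(1.0, total)


def bisect_root(func, lo=0.0, hi=1.0, iterations=100):
    """Root of func on [lo, hi], assuming a sign change between the ends."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials."""
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | p) equals alpha / 2.
        lower = bisect_root(lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= k | p) equals alpha / 2.
        upper = bisect_root(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


cancers, biopsies_recommended = 19, 84
ppv2 = cancers / biopsies_recommended
ppv2_lower, ppv2_upper = exact_ci(cancers, biopsies_recommended)
print(f"PPV2 = {100 * ppv2:.1f}% "
      f"(95% CI, {100 * ppv2_lower:.1f}%-{100 * ppv2_upper:.1f}%)")
```

The same function applies to the other proportions in this section by substituting the relevant counts, for example 21 of 276 for the mammography PPV1.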
Based on mammographic results, 177 women (6.7%) were classified as BI-RADS 3 (Table 3); of those, 1 (0.6%) was diagnosed with cancer detected at the early second screen, 363 days after study entry (after initial additional mammographic recall at time 0 for unrelated benign findings). Three hundred twenty-one participants (12.2%) were classified as BI-RADS 3 based on ultrasonographic screening, 5 of whom (1.6%) were diagnosed with cancer within the first 12 months of follow-up. Of the 5 participants with cancer who were classified as BI-RADS 3 on ultrasound, 3 were BI-RADS 5 on mammography and were diagnosed from 1 to 23 days after initial screens were completed. Two women had interval cancers that were identified incidentally as a result of a 6-month follow-up ultrasound for complicated cysts (the first was a 7-mm IDC found at surgery in adjacent tissue after a core biopsy result of LCIS from the lesion being followed up; the other was a 27-mm IDC-DCIS adjacent to the cyst being followed up). Both participants were node negative.
Based on results from mammography, short-interval follow-up was recommended for 59 (2.2%) of 2637 participants (95% CI, 1.7%-2.9%), and based on ultrasound, recommendations for short-term follow-up were made for 227 (8.6%) participants (95% CI, 7.6%-9.7%). Two hundred twenty of these recommendations were based on ultrasound alone. Two hundred eighty-six participants (10.8%) were recommended for short-term follow-up after mammography plus ultrasound (95% CI, 9.7%-12.1%).
Initial assessments of 27 participants as BI-RADS 3, 7 as BI-RADS 4a, and 1 as BI-RADS 4b based on mammography were downgraded to BI-RADS 2 after integrating mammographic and ultrasonographic results. Similarly, initial assessments of 26 participants as BI-RADS 3, 3 as BI-RADS 4a, 4 as BI-RADS 4b, and 1 as BI-RADS 5 based on ultrasound were downgraded to a BI-RADS score of 2 after integrating ultrasonographic and mammographic results.
Eight participants had cancer not considered suspicious on either mammography or ultrasound, with cancer identified during the 12 months after initial screening, ie, interval cancers. Three node-negative cancers (an 8-mm IDC and ILC, a 35-mm ILC, and a 20-mm IDC-DCIS) were identified at the second screen (performed early, after 11 full months), with biopsies taken from 359 to 364 days after study entry. One participant noted a palpable lump, with biopsy showing a 12-mm mixed IDC/ILC 337 days after study entry. One participant presented with skin recurrence of prior breast cancer 231 days after study entry. Two cancers were found at the 6-month follow-up ultrasound as detailed in the section on short-interval follow-up. One nonbreast malignancy was identified in the interval in a participant with prior melanoma of the back, who, 6 years later, developed a palpable axillary mass due to metastatic adenopathy, with no evidence of malignancy within the breasts. Thus, the interval cancer rate was 8 of 40 (20%) if the melanoma case is included as cancer, or 7 of 39 (18%) if not; only 2 of 39 participants (5.1%) with breast cancer were identified because of symptoms in the interval between screenings, or 3 of 39 (7.7%) if one includes the 4-mm IDC seen on initial mammography but not on additional imaging or at the 6-month follow-up, but which was diagnosed when the participant presented with palpable metastatic adenopathy 264 days after study entry. A ninth breast had cancer not seen on either mammography or ultrasound: DCIS was identified only at prophylactic mastectomy after diagnosis of contralateral multifocal IDC seen only on ultrasound.
Cancers seen only on ultrasound were evenly distributed across breast density categories (Table 5). The data were inconclusive with respect to most differences between film-screen and digital mammography; however, slightly higher specificity was observed with digital mammography than with film screen (97.0% vs 94.7%, P = .007).
In 1400 women with a personal history of breast cancer, 28 (2.0%) were found to have cancer, with 9 of 28 (32%) seen only on ultrasound (Table 5). Cancers were evenly distributed between the breast ipsilateral to the initial cancer and contralateral disease. Among 1237 women with risk factors other than a personal history of breast cancer, 12 (1.0%) were found to have cancer, 3 of which cancers (25%) were seen only on ultrasound. Significantly more cancers overall were found in women with a personal history of cancer (P = .03), but there was no difference in supplemental yield of ultrasound in women with or without a personal history of breast cancer.
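The comparison of cancer rates between the two risk groups can be checked with a simple calculation. The sketch below applies a pooled two-proportion z-test to the counts quoted above (28 of 1400 vs 12 of 1237). The choice of test is an assumption made for illustration; this section does not state which test produced P = .03, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for a difference between two proportions.

    Assumption: this test is used only as an illustration; the article
    does not say which test yielded the reported P value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the text: personal history vs other risk factors.
z, p = two_proportion_z_test(28, 1400, 12, 1237)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

Under this assumed test, the P value rounds to the .03 reported in the text.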
The median time to perform screening breast ultrasound was 19 minutes for a bilateral scan (range, 2-90; IQR, 12-27; mean [SE], 20.8 [0.3]; 95% CI, 20.3-21.3 minutes) and 9 minutes for a unilateral scan (range, 1-70; IQR, 5-15; mean [SE], 11.6 [0.4]; 95% CI, 10.7-12.4 minutes). A median of another 2.0 minutes was spent in the room with the participant (range, 0-19; IQR, 2-3; mean [SE], 2.7 [0.04]; 95% CI, 2.6-2.7 minutes). For 869 (33.0%) of 2637 participants, the investigator scanned at least 1 axilla while performing ultrasonographic scanning of the breast(s). Ninety-four percent of breasts were less than 4 cm thick.
Supplemental physician-performed screening ultrasound increases the cancer detection yield by 4.2 cancers per 1000 women at elevated risk of breast cancer, as defined in this protocol (95% CI, 1.1-7.2 cancers per 1000) on a single, prevalent screen. This is similar to rates of ultrasound-only cancers of 2.7 to 4.6 cancers per 1000 women screened in other series.8,13-17,24 As in prior studies, the vast majority of cancers seen only on ultrasound were invasive because DCIS is difficult to see on ultrasound. All but 1 cancer seen only on ultrasound was node negative. Invasive cancers not seen on mammography can be expected to present as interval cancers with a worse prognosis: detection of asymptomatic, mammographically occult, node-negative invasive carcinomas with ultrasound should reduce mortality from breast cancer, although mortality was not an end point of this study.
Strengths of our study include its matching within a participant, and examinations performed by radiologists who were masked to results of the other examination. Randomized order of these tests helped control biases of recruiting women with vague mammographic abnormalities. Furthermore, these results were consistent and generalizable across 21 international centers. The radiologist investigators in this trial were all specialists in breast imaging who met experience requirements and completed qualification tasks. As such, our results may vary slightly from those observed in general practice, although similar results were observed by Kaplan,16 in whose study technologists performed screening ultrasound. Educational materials used for radiologist investigator training in ultrasound lesion detection and characterization are archived by ACRIN.
The use of the Gail and Claus models to calculate risk may have affected the racial distribution of participants, for the Gail model is known to underestimate risk in African Americans.34 Neither model has been validated in races other than whites,34,35 although Gail et al36 have recently validated a new risk assessment tool based on data from the Contraceptives and Reproductive Experiences (CARE) Study, which involved African American women; this tool was not available for use in this protocol.
In our elevated-risk study population, enriched in women with dense breasts, mammographic sensitivity was only 50% (95% CI, 33.8%-66.2%) and the sensitivity of mammography plus ultrasound was 77.5% (95% CI, 61.6%-89.2%; Table 2). From a detection standpoint, it may be reasonable to offer supplemental screening ultrasound to women with similar risk criteria. As stated, dense breast tissue is common: approximately half of women younger than 50 years and a third of older women have dense breast parenchyma.5 Approximately 6% of women presenting for routine annual mammography have a personal history of breast cancer,29 and 15% have a family history of breast cancer.29
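The confidence intervals quoted for sensitivity are consistent with exact binomial (Clopper-Pearson) intervals. The sketch below illustrates that calculation using only the standard library. Two things are assumptions rather than statements from this section: the counts 20 of 40 and 31 of 40 are inferred from the reported percentages and the 40 participants with cancer, and the exact method is inferred from the reported limits. Function names are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def bisect(pred, lo=0.0, hi=1.0, iters=60):
    """Find the p in [lo, hi] where a monotone predicate flips from True to False."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if pred(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for x successes in n trials."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > alpha / 2)
    return lower, upper

# Assumed counts: 50% and 77.5% of 40 participants with cancer.
for label, x in [("mammography alone", 20), ("mammography plus ultrasound", 31)]:
    lo, hi = clopper_pearson(x, 40)
    print(f"{label}: {x / 40:.1%} (95% CI, {lo:.1%}-{hi:.1%})")
```

The width of these intervals reflects the small number of cancers (40), which is why the two sensitivity estimates carry overlapping confidence limits despite the large difference in point estimates.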
Our ongoing study, allowing for contrast-enhanced breast magnetic resonance imaging (MRI) within 8 weeks of the final 24-month mammography and ultrasound screening round, may soon shed some light on the possible competitive roles of ultrasound and MRI as adjuncts to mammographic screening for breast cancer. Across 4 other series for which screening mammography, ultrasound, and MRI had been performed for women at very high risk of breast cancer, the combined sensitivity of mammography and ultrasound averaged 55% vs 93% after combined mammography and MRI.37-40 There appears to be no role for screening ultrasound in women undergoing screening MRI, even though ultrasound may be helpful in guiding biopsy of suspicious findings seen first on MRI.37-40 Ultrasound may be more appropriate than MRI for screening women of intermediate risk due to its reduced cost relative to MRI. Many of the cancers seen only on MRI are small, node-negative invasive cancers.37-40 Unlike ultrasound, MRI readily depicts DCIS,41 although DCIS remains overrepresented among false-negative MRI examinations.42 It is uncertain whether detection of DCIS is required or whether detection of node-negative invasive breast cancer is sufficient for a screening test. It will be important to see the stage distribution of breast cancers in subsequent rounds of screening with mammography plus ultrasound in this study and to know how many invasive cancers will be seen only on MRI at the 24-month time point.
Despite a 20% interval cancer rate (8 of 40 participants with cancer) in our series, none of the interval breast carcinomas were node positive; the only interval cancer that was node positive was a nonbreast cancer (melanoma metastatic to axillary nodes). Another cancer considered suspicious on initial mammography (and therefore not included among “interval cancers”) was considered probably benign after full diagnostic workup and went unbiopsied until the patient presented with palpable, metastatic nodes, yet was only 4 mm in size at eventual detection. One interval cancer was a skin recurrence of prior breast cancer.
Ultrasound is well tolerated, the technology is widely available, and it does not require intravenous contrast material. If, however, screening ultrasound is to be widely implemented, several major issues remain. First, it will be very important to know the role of annual screening ultrasound in addition to mammography, and such a study is in progress with participants in this protocol. The time to perform bilateral screening ultrasound is problematic, at a median of 19 minutes. This does not include comparison to prior studies, discussion of results with patients, nor creation of a final report, although the time may be artificially prolonged by protocol requirements to measure each lesion other than a simple cyst in 2 planes and to fully characterize each such lesion with and without spatial compounding and with and without color or power Doppler. Nineteen minutes is considerably longer than the average 4 minutes 39 seconds reported by Kolb et al8 for physicians scanning or the average 10 minutes reported by Kaplan16 for technologists. Currently, there is only a single billing code for breast ultrasound (current procedural terminology code 76645), and Medicare global reimbursement is $85 in 2008, which does not fully cover the costs of performing and interpreting the examination. Outcomes similar to those of our physician-performed study have been reported with technologist-performed ultrasound,16 and specialized training of technologists is encouraged to counter a current shortage of qualified physician and technologist personnel. Further validation of technologist-performed screening breast ultrasound is encouraged. Automated whole-breast ultrasound may facilitate implementation and profitability of screening ultrasound but will result in hundreds of images to be reviewed and stored, with attendant increased capital and professional costs and potential increased malpractice exposure; validation of such methods is needed. 
The full costs of screening breast ultrasound in this protocol, including the costs of induced additional testing and biopsy, are being analyzed and reported separately.
The final barrier to implementing screening ultrasound is the risk of false-positive results. The performance characteristics of mammography were within accepted ranges (10.5% recalled for additional imaging or biopsy; 3.2% of participants biopsied after full workup, with 23% proving malignant; 2.2% recommended for short interval follow-up). We observed a 5.4% recall rate for ultrasound (142 of 2637 recommended for additional imaging), which may be artificially low in this series because physicians performed the screening ultrasound and could directly evaluate lesions in real time. Of 2637 participants, 233 (8.8%) had findings considered suspicious on ultrasound, with 136 having suspicious findings on ultrasound but not mammography and prompting biopsy, and 235 (8.9%) were recommended for biopsy based on ultrasound after full workup. Only 20 of 233 participants (8.6%) with suspicious ultrasonographic findings, including 12 of 136 (8.8%) of those with suspicious findings biopsied based on ultrasound alone, and 21 of 235 participants (8.9%) whose lesions were recommended for biopsy based on ultrasound proved to have cancer. The 8.8% to 8.9% PPV of biopsies prompted by ultrasound in our study is similar to the 11% rate seen across prior series.20,43 Diagnostic uncertainty for complicated cysts remains a major source of false-positive results, with 43 participants undergoing only cyst aspiration included among those with a suspicious finding on ultrasound.
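The recall rate and positive predictive values in the preceding paragraph are simple proportions of the quoted counts. As a minimal sketch of that arithmetic (the dictionary labels and the helper name are illustrative, not terms from the protocol):

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)

TOTAL = 2637  # participants, as quoted in the text

# Each entry: label -> (count, denominator), taken from the paragraph above.
ultrasound_rates = {
    "recalled for additional imaging": (142, TOTAL),
    "suspicious findings on ultrasound": (233, TOTAL),
    "biopsy recommended after full workup": (235, TOTAL),
    "PPV, suspicious ultrasound findings": (20, 233),
    "PPV, suspicious on ultrasound alone": (12, 136),
    "PPV, biopsy recommended on ultrasound": (21, 235),
}

for label, (count, denom) in ultrasound_rates.items():
    print(f"{label}: {count} of {denom} ({pct(count, denom)}%)")
```

Read the other way, roughly 91 of every 100 biopsies prompted by ultrasound in this series yielded benign results, which is the false-positive burden discussed here.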
Elastography, in which the deformability of the mass is assessed during ultrasound, can help distinguish complicated cysts from suspicious solid masses and should reduce this source of false positives.44 Another 227 participants (8.6%) were recommended for short interval follow-up based on ultrasound, similar to the 6.3% rate across other series.8,15,16,45 Whether the risk of false-positive results with ultrasound will diminish in our study population with subsequent screening rounds, as has been seen with mammography46 and in small series with both ultrasound and MRI,37 is under evaluation. We have been separately quantifying patient anxiety and discomfort (ie, “process utility”47) induced by addition of screening ultrasound.
The addition of a single screening ultrasonographic examination to mammography for women at elevated risk of breast cancer results in increased detection of breast cancers that are predominantly small and node-negative. We defined elevated risk using a variety of criteria, including personal history of breast cancer, prior atypical biopsy, and elevated risk by Gail or Claus models or both. Recent literature43 suggests that any combination of factors that confers 3-fold relative risk compared with women without the risk factor would be “high risk,” including dense breast tissue.9 Across all series to date, over 90% of cancers seen only on ultrasound have been in women with more than 50% dense breast tissue,20,24 although 3 of 12 cancers (25%) seen only on ultrasound in this series were in women with only 26% to 40% dense breast tissue (as visually estimated), suggesting that women with other risk factors may benefit from screening ultrasound even if their breast tissue is less dense. The age at which to begin screening women at increased risk would reasonably derive from the age at which the risk of breast cancer is equal to that for an average woman aged 40 or 50 years, depending on national policy.9
The detection benefit of a single screening ultrasound in women at elevated risk of breast cancer is now well validated. However, it comes with a substantial risk of false-positive results (ie, biopsy with benign results and/or short interval follow-up). Our results should be interpreted in the context of recent guidelines recommending annual MRI in women at very high risk of breast cancer.25 Importantly, evaluation of annual (incidence) screening ultrasound is continuing in ACRIN 6666, as is evaluation of a single screening MRI in these women.
Corresponding Author: Wendie A. Berg, MD, PhD, 10755 Falls Rd, Suite 440, Lutherville, MD 21093 (email@example.com).
Author Contributions: Dr Blume had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Berg, Blume, Mendelson, Pisano.
Acquisition of data: Berg, Blume, Mendelson, Lehrer, Böhm-Vélez, Pisano, Jong, Evans, Morton, Mahoney, Hovanessian-Larsen, Barr, Farria, Boparai.
Analysis and interpretation of data: Berg, Blume, Cormack, Mendelson, Pisano, Marques.
Drafting of the manuscript: Berg, Blume, Cormack, Marques.
Critical revision of the manuscript for important intellectual content: Berg, Blume, Mendelson, Lehrer, Böhm-Vélez, Pisano, Jong, Evans, Morton, Mahoney, Hovanessian-Larsen, Barr, Farria, Boparai.
Statistical analysis: Blume, Cormack, Marques.
Obtained funding: Berg, Blume.
Administrative, technical, or material support: Berg, Mendelson, Lehrer, Pisano, Jong, Morton, Barr, Boparai.
Study supervision: Berg, Blume.
Financial Disclosures: Dr Berg reports that she has served as a consultant to Naviscan PET Systems, MediPattern, and Siemens and has received equipment support from Siemens and a travel grant from General Electric. Dr Mendelson reports that she is a member of the scientific advisory boards of MediPattern and Siemens and has received equipment support from Philips. Dr Böhm-Vélez reports that she is a member of the physicians advisory board of MediPattern. Dr Pisano reports that her laboratory receives research support from General Electric, Hologic, Konica, Sectra, and Siemens. Dr Jong reports that she has a research collaboration with General Electric. Dr Evans reports that he is a member of the scientific advisory board of Hologic. Dr Mahoney reports that she is a consultant to Johnson & Johnson and SenoRx. Dr Larsen reports that she receives equipment support from Naviscan PET Systems. Dr Barr reports that he is a member of the ultrasound advisory boards of and has received equipment support from Siemens and Philips. The remaining coauthors report no financial disclosures.
ACRIN 6666 Site Investigators: Allegheny-Singer Research Institute, Pittsburgh, Pennsylvania: William R. Poller, MD, principal investigator (PI), Michelle Huerbin, research associate (RA); American Radiology Services–Johns Hopkins Green Spring, Baltimore, Maryland: Wendie A. Berg, MD, PhD (PI), Barbara E. Levit, RT (RA); Beth Israel Deaconess Medical Center, Boston, Massachusetts: Janet K. Baum, MD, and Valerie J. Fein-Zachary, MD (PIs), Suzette M. Kelleher, BA (RA); CERIM, Buenos Aires: Daniel E. Lehrer, MD (PI), Maria S. Ostertag (RA); Duke University Medical Center, Durham, North Carolina: Mary Scott Soo, MD (PI), Brenda N. Prince, RT (RA); Mayo Clinic, Rochester, Minnesota: Marilyn J. Morton, DO (PI), Lori M. Johnson, AAS (RA); Feinberg School of Medicine, Northwestern University, Chicago, Illinois: Ellen B. Mendelson, MD (PI), Marysia Kalata, AA (RA); Radiology Associates of Atlanta, Atlanta, Georgia: Handel Reynolds, MD (PI), Y. Suzette Wheeler, RN, MSHA (RA); Radiology Consultants/Forum Health, Youngstown, Ohio: Richard G. Barr, MD, PhD (PI), Marilyn J. Mangino, RN (RA); Radiology Imaging Associates, Denver, Colorado: A. Thomas Stavros, MD (PI), Margo Valdez (RA); Sunnybrook Health Sciences Centre, University of Toronto, Toronto, Ontario, Canada: Roberta A. Jong, MD (PI), Julie H. Lee, BSC (RA); Thomas Jefferson University Hospital, Philadelphia, Pennsylvania: Catherine W. Piccoli, MD, and Christopher R. B. Merritt, MS, MD (PIs), Colleen Dascenzo (RA); David Geffen School of Medicine at University of California Los Angeles Medical Center, Los Angeles: Anne C. Hoyt, MD (PI), Roslynn Marzan, BS (RA); University of Cincinnati Medical Center, Cincinnati, Ohio: Mary C. Mahoney, MD (PI), Monene M. Kamm, AS (RA); University of North Carolina, Chapel Hill: Etta D. Pisano, MD (PI), Laura A. Tuttle, MA (RA); Keck School of Medicine, University of Southern California, Los Angeles: Linda Hovanessian Larsen, MD (PI), Christina E. Kiss, AA (RA); University of Texas M. D. Anderson Cancer Center, Houston: Gary J. Whitman, MD (PI), Sharon R. Rice, AA (RA); University of Texas Southwestern Medical Center, Dallas: W. Phil Evans, MD (PI), Kimberly T. Taylor, AA (RA); Washington University School of Medicine, St. Louis, Missouri: Dione M. Farria, MD, MPH (PI), Darlene J. Bird, RT, AS (RA); and Weinstein Imaging Associates, Pittsburgh, Pennsylvania: Marcela Böhm-Vélez, MD (PI), Antoinette Cockroft (RA).
Funding/Support: The study was funded by the Avon Foundation and grants CA 80098 and CA 79778 from the National Cancer Institute.
Role of the Sponsors: The Avon Foundation was not involved in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript. The trial was conducted by the American College of Radiology Imaging Network, a member of the National Cancer Institute's Clinical Trials Cooperative Groups Program, and was developed and carried out adhering to the standard cooperative group processes. These processes include review of and input about the trial design from the NCI's Cancer Therapy Evaluation Program (CTEP). Upon CTEP's approval of the research protocol, the NCI was not involved in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.
Additional Contributions: We thank Amanda M. Adams, MPH, Center for Statistical Sciences, Brown University, Providence, Rhode Island, for assistance with data analysis; Eric A. Berns, PhD, University of Colorado, Denver, for ultrasound quality assurance; Cynthia B. Olson, MBA, MHS, and Sophia Sabina, MBA, American College of Radiology (ACR), Philadelphia, Pennsylvania, for administrative assistance; Glenna J. Gabrielli, BS, Stephanie Clabo, BS, CCRP, Jillene DeBari, BA, and Judy M. Green, RT(M) at ACR for data management; Cheryl L. Crozier, RN, ASQ, CQA, and Josephine Schloesser, AS, RT(R)(M), CCRP at ACR for monitoring; Anthony M. Levering, AS, RT(R)(CT)(MR) at ACR for image management; and Nancy S. Fredericks, MBA, at ACR for communications support. We also thank Cecilia M. Brennecke, MD, and other colleagues at American Radiology Services, Johns Hopkins Green Spring, for their support; and Mark D. Schleinitz, MD, MS, at Brown University, Barbara K. LeStage, BS, MHP, consultant to the ACR, Edward A. Sickles, MD, University of California, San Francisco Medical Center, and Elizabeth A. Patterson, MD, Seattle, Washington, for engaging in helpful discussions. We are indebted to the many investigators, coinvestigators, and research associates at the clinical sites. No one was compensated beyond their usual salary for their efforts for this study.
This article was corrected online for typographical errors on 4/21/2010.