A-D, Nodules that are entirely solid. A-C, Nodules with microcalcifications.
eAppendix. Details of Statistical Analysis
Smith-Bindman R, Lebda P, Feldstein VA, Sellami D, Goldstein RB, Brasic N, Jin C, Kornak J. Risk of Thyroid Cancer Based on Thyroid Ultrasound Imaging Characteristics: Results of a Population-Based Study. JAMA Intern Med. 2013;173(19):1788–1795. doi:10.1001/jamainternmed.2013.9245
There is wide variation in the management of thyroid nodules identified on ultrasound imaging.
To quantify the risk of thyroid cancer associated with thyroid nodules based on ultrasound imaging characteristics.
Retrospective case-control study of patients who underwent thyroid ultrasound imaging from January 1, 2000, through March 30, 2005. Thyroid cancers were identified through linkage with the California Cancer Registry.
A total of 8806 patients underwent 11 618 thyroid ultrasound examinations during the study period, including 105 subsequently diagnosed as having thyroid cancer. Thyroid nodules were common in patients diagnosed as having cancer (96.9%) and patients not diagnosed as having thyroid cancer (56.4%). Three ultrasound nodule characteristics—microcalcifications (odds ratio [OR], 8.1; 95% CI, 3.8-17.3), size greater than 2 cm (OR, 3.6; 95% CI, 1.7-7.6), and an entirely solid composition (OR, 4.0; 95% CI, 1.7-9.2)—were the only findings associated with the risk of thyroid cancer. If 1 characteristic is used as an indication for biopsy, most cases of thyroid cancer would be detected (sensitivity, 0.88; 95% CI, 0.80-0.94), with a high false-positive rate (0.44; 95% CI, 0.43-0.45) and a low positive likelihood ratio (2.0; 95% CI, 1.8-2.2), and 56 biopsies will be performed per cancer diagnosed. If 2 characteristics were required for biopsy, the sensitivity and false-positive rates would be lower (sensitivity, 0.52; 95% CI, 0.42-0.62; false-positive rate, 0.07; 95% CI, 0.07-0.08), the positive likelihood ratio would be higher (7.1; 95% CI, 6.2-8.2), and only 16 biopsies will be performed per cancer diagnosed. Compared with performing biopsy of all thyroid nodules larger than 5 mm, adoption of this more stringent rule requiring 2 abnormal nodule characteristics to prompt biopsy would reduce unnecessary biopsies by 90% while maintaining a low risk of cancer (5 per 1000 patients for whom biopsy is deferred).
Conclusions and Relevance
Thyroid ultrasound imaging could be used to identify patients who have a low risk of cancer for whom biopsy could be deferred. These findings should be validated in a large prospective cohort.
Ultrasound imaging has replaced nuclear medicine as the most frequently used imaging test of the thyroid.1 The increase in the use of thyroid ultrasound imaging by radiologists, endocrinologists, and head and neck surgeons has led to the discovery of large numbers of asymptomatic thyroid nodules, which may occur in 50% or more of adults,2,3 as well as a rapid increase in the diagnosis of thyroid cancer.4 In contrast, clinically apparent thyroid cancer is rare, affecting 1 in 10 000 people annually and less than 1% of individuals throughout their lives.4-6 Because of the high prevalence of nodules and the rarity of symptomatic cancer, only a small percentage of thyroid nodules are malignant. Uncertainty about which nodules may harbor cancer and lack of evidence-based management guidelines have resulted in a myriad of conflicting recommendations regarding which nodules warrant biopsy,6-21 frequent thyroid biopsies, and the overdiagnosis of thyroid cancers that would otherwise likely have remained asymptomatic in the absence of detection.4,22,23
Although many studies have analyzed the association between the ultrasound imaging characteristics of thyroid nodules and the risk of thyroid cancer, most studies are small, and all limited their analysis to patients who underwent biopsy, in which the decision to biopsy was influenced by the ultrasound imaging result.6-21 This ascertainment bias will overestimate the risk of cancer associated with thyroid biopsy and the accuracy of ultrasound imaging.24-26 The information that is most important to patients and physicians managing care is quantifying the risk of cancer associated with a nodule with a particular imaging characteristic, and no prior publication can accurately provide this information. This obstacle has hindered the development of an evidence-based strategy for determining which nodules should be biopsied because of an elevated cancer risk. The purposes of this study were to determine the ultrasound imaging characteristics that are associated with cancer and to use this information to create a standardized system for interpreting thyroid ultrasound imaging.
We conducted a retrospective, case-control study at the University of California, San Francisco, of consecutive patients who underwent thyroid ultrasound imaging from January 1, 2000, through March 30, 2005. A waiver of patient informed consent was obtained. This study was approved by the institutional review board of the University of California, San Francisco. Patients were excluded if they had a prior unilateral or bilateral thyroidectomy for benign or malignant disease.
Cancers in the cohort were identified through linkage with the California Cancer Registry, a population-based cancer registry collecting cancer incidence and mortality data for all of California.27 The registry is a collaboration between the Cancer Surveillance Section of the California Department of Public Health, The Public Health Institute, and 8 regional cancer registries that by legislative mandate have collected cancer incidence data from hospitals and other facilities across the state since 1988. The registry is certified by the North American Association of Central Cancer Registries as meeting their highest standard for completeness of cancer ascertainment, reflecting capture of more than 97% of cancers diagnosed in the state.28 We included thyroid cancers diagnosed through March 30, 2007, allowing a minimum 2 years of follow-up after the last enrolled patient’s ultrasound imaging during which a cancer could be diagnosed and at least 2 years of further follow-up to ensure reporting to the registry.27 Patients diagnosed as having nonthyroid malignant neoplasms (other than skin cancer) were excluded to prevent the inclusion of the rare but theoretically possible metastatic cancer to the thyroid because these metastatic cancers would not be captured by the cancer registry. All patients diagnosed as having cancer (cases, Table 1) and a sample of patients not diagnosed as having cancer (controls) matched 4 to 1 to the cancer patients for age, sex, and year of the ultrasound examination were selected for detailed review of the sonogram.
We retrieved and reviewed the results of the ultrasound examinations of 96 cancer patients (91.4%) from the Radiology Picture Archiving and Communication System and 369 controls. Each ultrasound examination result was reviewed independently by 2 board-certified radiologists (R.S.-B., P.L., V.A.F., D.S., R.B.G., and N.B.) masked to cancer status. Disagreement was resolved by consensus. For each patient, each reviewer independently recorded the number, size, and characteristics of all nodules larger than 5 mm. There was good to outstanding agreement (κ = 0.73 to 1.0) in the categorization of the specific ultrasound imaging characteristics.
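The κ statistic reported above measures agreement between 2 readers beyond what chance alone would produce. The following is a minimal sketch of Cohen's κ for two readers, not the authors' analysis code; the function name `cohen_kappa` and the ratings are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same items (any hashable labels)."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(reader_a)
    # Observed agreement: fraction of items on which both readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's marginal rate, summed over labels.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = microcalcifications recorded, 0 = not recorded.
reader_1 = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
reader_2 = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the readers agree on 9 of 10 nodules, but because most nodules lack the feature, much of that agreement is expected by chance, so κ is lower than the raw 90% agreement.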
In patients selected as controls, all nodules were considered benign. In 43 cancer patients (44.8%), a single nodule was identified and was considered malignant. In 50 cancer patients (52.1%), multiple thyroid nodules were identified. To ensure correct attribution of cancer to the correct nodule, one of the authors (D.S.) was unmasked, and patient records (radiology, pathology, and surgery) were reviewed to determine which nodules were malignant. In the small number of cases for which we were unable to determine which nodule harbored cancer, all nodules were considered malignant. Nodules in patients never diagnosed as having thyroid cancer (n = 428) and benign nodules in cancer patients (n = 87) were combined to create our final control group of benign nodules (n = 515). Note that 3 cancer patients (3.1%) did not have any nodules larger than 5 mm identified on ultrasound imaging.
We compared mean age, age group, sex, and year of study between patients diagnosed as having cancer and controls. We used the χ2 test to determine whether the number of nodules varied by age group. We performed single-predictor modeling to assess the association between specific ultrasound imaging characteristics and cancer status using generalized estimating equations, with a compound symmetry (exchangeable) correlation structure to account for the correlated outcomes among multiple nodules within a patient. For variables that were statistically significant in the single-predictor model, we calculated diagnostic accuracy statistics (sensitivity, specificity, likelihood ratios, and predictive values).
To build the generalized estimating equation models, we added variables that were statistically significant in single-predictor models one at a time in the order of the effect size. Variables were retained if the associated P value after inclusion was <.10 for that variable. The ultrasound imaging characteristics that were retained in the final multiple-predictor model (microcalcifications, size ≥2 cm, and solid composition; Figure) were combined in various ways, via logic and/or criteria, to define an abnormal ultrasound imaging interpretation. The risk of cancer (predictive values) associated with each definition of an abnormal ultrasound imaging interpretation was calculated, accounting for the sampling strategy in the entire cohort. The positive predictive value (PPV) is the risk of cancer for a patient who is found to have an abnormal ultrasound imaging interpretation, and the negative predictive value is the probability of being cancer free if the ultrasound imaging result is negative for cancer. For each definition of an abnormal ultrasound imaging result, we calculated the number of cancers missed per 1000 ultrasound imaging examinations. The number of patients needed to undergo a biopsy to detect a single cancer was defined as the inverse of the PPV. We performed several sensitivity analyses to determine whether implicit assumptions in the primary analysis were reasonable. More details on the analysis are provided in the eAppendix in the Supplement.
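The definition of the number of biopsies needed per cancer detected as the inverse of the PPV can be checked with a short calculation. This is an illustrative sketch only; the function name `biopsies_per_cancer` is hypothetical, and the PPVs of 1.8% and 6.2% are the values reported in the Results for the 1-characteristic and 2-characteristic rules.

```python
def biopsies_per_cancer(ppv):
    """Number of biopsies needed to detect one cancer, defined as 1 / PPV."""
    if not 0 < ppv <= 1:
        raise ValueError("ppv must be in the interval (0, 1]")
    return 1.0 / ppv


# PPVs taken from the Results section of this article.
reported_ppv = {
    "any 1 characteristic": 0.018,
    "at least 2 characteristics": 0.062,
}

for rule, ppv in reported_ppv.items():
    print(f"{rule}: PPV {ppv:.1%} -> about {biopsies_per_cancer(ppv):.0f} biopsies per cancer")
```

Rounded to whole biopsies, these reproduce the 56 and 16 biopsies per cancer quoted in the article.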
A total of 8806 patients underwent 11 618 thyroid ultrasound examinations during the study period, including 105 patients diagnosed as having thyroid cancer (incidence of 0.9 cancers per 100 ultrasound examinations). The cancers were diagnosed 1 day to 6.1 years after ultrasound imaging, and among control patients, there was a mean follow-up of 4.2 years (range, 2.0-10.9 years). There were no significant differences in the matching variables between cases and controls (Table 2).
Thyroid nodules were common among patients diagnosed as having thyroid cancer (96.9%) and patients not diagnosed as having thyroid cancer (56.4%) (Table 3). Among the 96 cases, 102 malignant nodules and 87 benign nodules were identified, with an increase in the number of nodules seen with advancing age. Among the 369 controls, 428 benign nodules were identified, and the number of nodules did not vary with age.
Several ultrasonographic findings were significantly associated with the odds of a nodule harboring cancer (Table 4). Microcalcifications had the strongest association with cancer; 38.2% of cancer nodules vs 5.4% of benign nodules had microcalcifications, reflecting approximately a 7-fold increase in the likelihood of cancer if microcalcifications were seen (positive likelihood ratio, 7.0; 95% CI, 6.0-8.2) and a 30% reduction in the likelihood of cancer if microcalcifications were not seen (negative likelihood ratio, 0.65; 95% CI, 0.56-0.76). The corresponding odds ratio (OR) was 11.6 (95% CI, 6.5-20.0). Coarse calcifications, nodule composition, nodule echogenicity, central vascularity, margins, and shape were also each significantly associated with cancer, but the magnitude of association was smaller, with ORs ranging from 1.6 to 2.9. Rim calcifications, comet tail artifacts, peripheral vascularity, and the presence of a halo were not associated with the likelihood of cancer. The odds of cancer increased with nodule size, and the largest nodules had the greatest odds of cancer: for nodules larger than 2 cm compared with nodules smaller than 1 cm, the likelihood ratio was 1.8 (95% CI, 1.5-2.1) and the OR was 3.1 (95% CI, 1.8-5.2). Simple cysts never reflected cancer.
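The likelihood ratios for microcalcifications follow directly from the two proportions given above (38.2% of cancer nodules and 5.4% of benign nodules). The sketch below shows the arithmetic under the standard definitions; the function name `likelihood_ratios` is hypothetical, and because the inputs are the rounded percentages from the text, the results agree with the published values only approximately.

```python
def likelihood_ratios(p_feature_in_cancer, p_feature_in_benign):
    """Return (LR+, LR-) for a binary imaging feature.

    LR+ = P(feature present | cancer) / P(feature present | benign)
    LR- = P(feature absent  | cancer) / P(feature absent  | benign)
    """
    lr_pos = p_feature_in_cancer / p_feature_in_benign
    lr_neg = (1 - p_feature_in_cancer) / (1 - p_feature_in_benign)
    return lr_pos, lr_neg


# Proportions of nodules with microcalcifications, as reported in the text.
lr_pos, lr_neg = likelihood_ratios(0.382, 0.054)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

The crude ratio of these two numbers is not expected to equal the reported OR of 11.6, which came from a generalized estimating equation model that accounts for multiple nodules per patient.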
Only 3 nodule characteristics were significantly associated with the risk of cancer in the multiple-predictor modeling: microcalcifications (OR, 8.1; 95% CI, 3.8-17.3; P<.001), size greater than 2 cm (OR, 3.6; 95% CI, 1.7-7.6; P=.001), and an entirely solid composition (OR, 4.0; 95% CI, 1.7-9.2; P=.001). The other nodule characteristics were not significantly associated with the risk of cancer, and including them in the definition of an abnormal nodule added less than 2% cancer detection.
The accuracy of the several definitions of an abnormal ultrasound imaging interpretation is given in Table 5. If any 1 of the 3 characteristics is used to prompt biopsy, most cases of thyroid cancer would be detected (sensitivity, 0.88; 95% CI, 0.80-0.94) at a false-positive rate of 0.44 (95% CI, 0.43-0.45). The high false-positive rate of this approach is reflected in a low PPV (ie, risk of cancer) of 1.8% (95% CI, 1.5%-2.2%) when a single characteristic is used to prompt biopsy, and 56 biopsies will be required per cancer diagnosed. If 2 abnormal ultrasound imaging characteristics were required to prompt biopsy, the sensitivity and false-positive rates would be lower (sensitivity, 0.52; 95% CI, 0.42-0.62; false-positive rate, 0.07; 95% CI, 0.07-0.08), and the risk of cancer in those with an ultrasound image suggestive of cancer would be higher (PPV, 6.2%; 95% CI, 4.7%-8.7%) and fewer biopsies (n=16) would be required per cancer diagnosed. Compared with existing guidelines that recommend biopsy of all thyroid nodules greater than 5 mm,7,8 adoption of this more stringent rule requiring 2 abnormal characteristics to prompt biopsy would reduce unnecessary biopsies by 90% while maintaining a low risk of cancer in patients in whom biopsy is deferred (ie, 5 cancers [0.5%] per 1000 ultrasound examinations).
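The relationship among sensitivity, false-positive rate, and the two risks quoted above (PPV and the risk of cancer when biopsy is deferred) can be illustrated with Bayes' rule. This is a simplified sketch, not the authors' method, which accounted for the case-control sampling strategy. It assumes the cohort incidence of 105 cancers per 11 618 examinations as the prior probability, and it uses the rounded sensitivities and false-positive rates from the text, so small differences from the published figures are expected. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, false_positive_rate, prevalence):
    """Return (PPV, risk of cancer after a negative result) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = (1 - false_positive_rate) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    risk_if_negative = false_neg / (false_neg + true_neg)  # equals 1 - NPV
    return ppv, risk_if_negative


# Assumption: cohort-level incidence used as the prior probability of cancer.
prevalence = 105 / 11618

# (sensitivity, false-positive rate) as reported for each biopsy rule.
rules = {
    "any 1 characteristic": (0.88, 0.44),
    "at least 2 characteristics": (0.52, 0.07),
}

results = {}
for rule, (sens, fpr) in rules.items():
    ppv, risk_neg = predictive_values(sens, fpr, prevalence)
    results[rule] = (ppv, risk_neg)
    print(f"{rule}: PPV {ppv:.1%}, missed cancers {1000 * risk_neg:.1f} per 1000 negative results")
```

With these rounded inputs the sketch gives values close to the published PPVs of 1.8% and 6.2% and to the reported 2 and 5 cancers per 1000 among patients with a negative result.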
The most specific definition of an abnormal ultrasound image is one requiring all 3 abnormal characteristics to prompt biopsy. This definition would have a high positive likelihood ratio of 28 (95% CI, 23-34) but would detect only a small proportion of cancers (sensitivity, 0.07; 95% CI, 0.03-0.14).
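The 3 definitions of an abnormal result (any 1, at least 2, or all 3 characteristics) amount to counting features and comparing the count with a threshold. The sketch below expresses that rule; the `Nodule` class, its field names, and the example nodule are hypothetical, and the size cutoff is taken as "greater than 2 cm" as stated in the Results.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    """Hypothetical record of the 3 characteristics retained in the final model."""
    size_cm: float
    microcalcifications: bool
    entirely_solid: bool


def abnormal_feature_count(nodule):
    """Count how many of the 3 abnormal characteristics the nodule shows."""
    return sum([
        nodule.microcalcifications,
        nodule.size_cm > 2.0,  # assumption: strict "greater than 2 cm" cutoff
        nodule.entirely_solid,
    ])


def meets_biopsy_rule(nodule, min_features):
    """True if the nodule has at least `min_features` abnormal characteristics."""
    if min_features not in (1, 2, 3):
        raise ValueError("min_features must be 1, 2, or 3")
    return abnormal_feature_count(nodule) >= min_features


# Illustrative nodule: large and solid, without microcalcifications.
example = Nodule(size_cm=2.4, microcalcifications=False, entirely_solid=True)
for k in (1, 2, 3):
    print(f"rule requiring {k} characteristic(s): biopsy = {meets_biopsy_rule(example, k)}")
```

Raising `min_features` makes the rule more specific and less sensitive, which is the tradeoff the reported sensitivities and likelihood ratios quantify.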
The tradeoff between the different definitions of an abnormal ultrasound imaging result and test accuracy is shown in the Figure. As the number of criteria required to prompt biopsy increases, the number of missed cancers (false-negative results) increases, and the number of patients who undergo biopsy to detect a cancer will decrease. For example, if 2 criteria instead of 1 are required to prompt biopsy, the rate of missed cancers among patients who do not undergo biopsy increases from 2 to 5 per 1000 ultrasound examinations, whereas the number of biopsies needed to detect a cancer decreases from 56 to 16.
The risk of cancer based on the appearance of the thyroid on the ultrasound image is given in Table 6. The risk of cancer is low among patients with a homogeneous thyroid, in which no nodules were identified (0.63 cancers per 1000 patients). The risk of cancer is also low in patients in whom the only ultrasound imaging characteristic is a simple cyst (0.32 cancers per 1000 patients).
If the presence of a single abnormal characteristic is used to define an abnormal ultrasound examination result, patients with a normal examination result will have a risk of cancer of 2 per 1000 patients, whereas patients with an abnormal examination result will have a risk of cancer of 18 per 1000 patients. If 2 or more characteristics are required to define an ultrasound examination result as abnormal, patients with a negative examination result will have a risk of cancer of 5 per 1000 patients, and patients with an abnormal examination result will have a risk of cancer of 62 per 1000 patients, putting them in a moderate risk category. Microcalcifications are the most predictive characteristic and are associated with a cancer risk of 82 per 1000 patients. If an abnormal ultrasound examination result is defined as one in which microcalcifications or a solid mass greater than 2 cm is seen, 58 cancers will be diagnosed per 1000 patients. When a solid mass larger than 2 cm with microcalcifications is seen, almost all of these nodules harbor cancer (960 per 1000 patients).
The results were robust across all of the sensitivity analyses and changed little when we varied our primary assumptions in the analysis.
Thyroid nodules are extremely common. Even among patients selected as controls in our study, 56.4% had thyroid nodules greater than 5 mm, and nearly one-third had multiple nodules. In contrast to previous reports that have suggested the prevalence of cancer in thyroid nodules as high as 23%, we found that only 1.6% of patients who had 1 or more thyroid nodules 5 mm or greater harbored cancer. Thus, although thyroid nodules are common, most (98.4%) are benign, highlighting the importance of being prudent in deciding which nodules should be sampled to reduce unnecessary biopsies.22 Unnecessary tissue sampling not only is invasive and costly but also leads to repeated sampling and unnecessary open surgical procedures because up to one-third of fine-needle aspiration biopsies may be nondiagnostic, requiring open surgical biopsy for diagnosis.8,9,29,30 We found that only 3 ultrasound imaging characteristics—microcalcifications, size larger than 2 cm, and entirely solid composition—were statistically significantly associated with the risk of cancer and that, when used in combination, these 3 characteristics could be used to help determine which nodules should be sampled. Simple cysts are essentially never malignant and should not be sampled.31
There are many ways to characterize the accuracy of ultrasound imaging. We believe the risk of cancer (PPV) is the most relevant to patients and physicians, and ours is the first study, to our knowledge, that permits estimating this risk. A patient’s risk of harboring cancer is 2 per 1000 patients among those whose thyroid ultrasound image has none of the 3 characteristics identified, 18 per 1000 patients if a patient has a nodule with a single characteristic, 62 per 1000 patients if a patient has a nodule with 2 abnormal characteristics, and 960 per 1000 patients if a patient has a nodule with all 3 characteristics. Although there is growing concern regarding overdiagnosis and overtreatment across all areas of medicine,22,32-34 there are no well-established guidelines regarding what risk is low enough that an imaging finding can be ignored. In other areas of diagnostic testing, for example, when assessing patients at risk for acute coronary syndrome or breast cancer (diseases with higher morbidity and mortality than thyroid cancer), often a risk of less than 1% or 0.5% is considered sufficiently low that further evaluation is deemed unnecessary. If a thyroid cancer risk less than 0.5% is considered acceptable for those in whom biopsy is deferred, using microcalcifications or the combined observations of a large (>2-cm) solid nodule as the only features to prompt biopsy is a good choice. In comparison with various guidelines that recommend biopsies in a larger number of patients,13 limiting biopsy to nodules that fulfill this definition would reduce the number of biopsies by as much as 90% while maintaining a low cancer rate of 5 per 1000 patients among individuals who do not undergo thyroid sampling.
Most thyroid cancers have a favorable prognosis, with a 20-year survival greater than 97% seen even among patients who do not receive immediate treatment.10,23,34,35 Thus, given the favorable prognosis of most thyroid cancers even without treatment, a risk of cancer of 0.5% among those with a negative examination result seems to balance between detection and unnecessary tissue sampling. Ongoing ultrasound imaging surveillance of patients with nodules who do not meet the criteria for biopsy is unlikely to prove beneficial given that our results ascribe these patients a low risk of cancer for as long as 10 years after imaging.
Our study was designed to determine how to reduce unnecessary and excessive thyroid surveillance and biopsy. Our study does not provide evidence as to whether the detection of thyroid cancers will lead to improved patient outcomes. There has been a recent increase in the observed incidence of small and microthyroid cancer4,5,35 without a corresponding change in the thyroid cancer mortality rate, raising the question of whether there is benefit to the earlier diagnosis or treatment of incidental thyroid cancer.22,23,36,37
Many previous studies6-8,10-21,38-43 have assessed the risk of cancer associated with the appearance of the thyroid on ultrasound imaging. All previous studies have inflated the association between nodule characteristics and cancer risk because they limited their analysis to nodules that underwent biopsy. For example, Ahn et al13 compared various existing guidelines for prompting fine-needle aspiration in a sample of 1398 patients who had undergone biopsy. In this sample, 20% of the included patients had cancer, contrasting with the 1.5% cancer rate in our study. They report that the PPV for cancer in a patient who has microcalcifications is 85.1%, whereas using our population-based approach without ascertainment bias, we found a PPV of 5.8%. We considered many nodule characteristics endorsed by other authors,5,7-26 but when put into the multiple-predictor models, most of the characteristics were not significantly associated with cancer risk.
It is widely reported that the number of benign thyroid nodules increases with age. We observed this relationship among patients diagnosed as having cancer but not among patients without cancer.
The main strength of our study is the large sample size and the linkage of the cohort with data from a comprehensive cancer registry, which allows accurate assessment of the true underlying prevalence of cancer. The analysis has several limitations. We did not have accurate information about why patients underwent imaging, and the risk of cancer may vary based on the reasons patients received sonograms. We did not stratify the results by the histologic type of cancer, although most included cancers were papillary cancer, as is the case for thyroid cancer in general. There are several ultrasonographic features that we did not assess, but these are rare, such as extracapsular growth or abnormal lymph nodes.11 We did not include the theoretical metastatic cancer to the thyroid because these would not be captured in the cancer registry data. However, we also linked to the local pathology database, and no cases of metastatic cancer were identified.
The increased use1 and improved technical quality of ultrasonographic imaging have given rise to the detection of multiple morphologic characteristics, without clear criteria for what nodules need further evaluation,22 resulting in greater tissue sampling and excessive treatment.23,35 In mammography, the adoption of uniform interpretation standards through the Breast Imaging Reporting and Data System has been useful in allowing comparative effectiveness work in breast imaging and efforts to standardize the interpretation of mammograms. Similar adoption of uniform standards for the interpretation of thyroid sonograms would be a first step toward standardizing the diagnosis and treatment of thyroid cancer and limiting unnecessary diagnostic testing and treatment.
Accepted for Publication: May 21, 2013.
Corresponding Author: Rebecca Smith-Bindman, MD, Department of Radiology and Biomedical Imaging, University of California, 350 Parnassus Ave, Ste 307, San Francisco, CA 94143-0336 (firstname.lastname@example.org).
Published Online: August 26, 2013. doi:10.1001/jamainternmed.2013.9245.
Author Contributions: Study concept and design: Smith-Bindman, Feldstein, Goldstein.
Acquisition of data: Smith-Bindman.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: Smith-Bindman.
Critical revision of the manuscript for important intellectual content: Smith-Bindman, Lebda, Feldstein, Sellami, Goldstein, Brasic, Kornak.
Statistical analysis: Smith-Bindman, Jin, Kornak.
Obtained funding: Smith-Bindman, Sellami.
Administrative, technical, and material support: All authors.
Study supervision: Smith-Bindman.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported by grants R21CA131698 and K24 CA125036 from the National Cancer Institute (Dr Smith-Bindman) and a SEED grant from the Department of Radiology and Biomedical Engineering, University of California, San Francisco (Dr Sellami).
Role of the Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The content is solely the responsibility of the authors and does not represent the official views of the National Cancer Institute or the National Institutes of Health.
Additional Contributions: Phillip Chu and the Northern California Cancer Registry gathered data for the study. The collection of cancer incidence data used in this study was supported by the California Department of Public Health as part of the statewide cancer reporting program mandated by California Health and Safety Code §103885; the National Cancer Institute's Surveillance, Epidemiology and End Results Program under contract HHSN261201000140C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute; and the Centers for Disease Control and Prevention's National Program of Cancer Registries, under agreement U58DP003862-01 awarded to the California Department of Public Health. The ideas and opinions expressed herein are those of the author(s), and endorsement by the State of California, Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors is not intended nor should be inferred.