The Mayo Clinic model probability of cancer is calculated from the following variables: patient age, smoking history (yes/no), history of previous nonthoracic cancer, lesion size, spiculation, and lesion location. AUC indicates area under the curve for the ability of PET to predict cancer diagnosis on logistic regression.
Maiga AW, Deppen SA, Mercaldo SF, et al. Assessment of Fluorodeoxyglucose F18–Labeled Positron Emission Tomography for Diagnosis of High-Risk Lung Nodules. JAMA Surg. 2018;153(4):329–334. doi:10.1001/jamasurg.2017.4495
How accurate is fluorodeoxyglucose F18–labeled positron emission tomography with computed tomography in diagnosing malignancy across different populations with varying cancer prevalence?
In this multi-institutional cohort study, 18F-fluorodeoxyglucose positron emission tomography with computed tomography imaging demonstrated good sensitivity but poor specificity for the diagnosis of lung cancer.
An abnormal positron emission tomography with computed tomography scan may be a reliable indicator that highly suspicious lung nodules may be cancerous, but a normal positron emission tomography with computed tomography scan is not a reliable indicator of the absence of disease.
Importance
Clinicians rely heavily on fluorodeoxyglucose F18–labeled positron emission tomography (FDG-PET) imaging to evaluate lung nodules suspicious for cancer. We evaluated the performance of FDG-PET for the diagnosis of malignancy in differing populations with varying cancer prevalence.
Objective
To determine the performance of FDG-PET/computed tomography (CT) in diagnosing lung malignancy across different populations with varying cancer prevalence.
Design, Setting, and Participants
Multicenter retrospective cohort study at 6 academic medical centers and 1 Veterans Affairs facility that comprised a total of 1188 patients with known or suspected lung cancer from 7 different cohorts from 2005 to 2015.
Exposure
18F fluorodeoxyglucose PET/CT imaging.
Main Outcome and Measures
Final diagnosis of cancer or benign disease was determined by pathological tissue diagnosis or at least 18 months of stable radiographic follow-up.
Results
Most patients were male smokers older than 60 years. Overall cancer prevalence was 81% (range by cohort, 50%-95%). The median nodule size was 22 mm (interquartile range, 15-33 mm). Positron emission tomography/CT sensitivity and specificity were 90.1% (95% CI, 88.1%-91.9%) and 39.8% (95% CI, 33.4%-46.5%), respectively. False-positive PET scans occurred in 136 of 1188 patients. Positive predictive value and negative predictive value were 86.4% (95% CI, 84.2%-88.5%) and 48.7% (95% CI, 41.3%-56.1%), respectively. On logistic regression, larger nodule size and higher population cancer prevalence were both significantly associated with PET accuracy (odds ratio, 1.027; 95% CI, 1.015-1.040 and odds ratio, 1.030; 95% CI, 1.021-1.040, respectively). As the Mayo Clinic model–predicted probability of cancer increased, the sensitivity and positive predictive value of PET/CT imaging increased, whereas the specificity and negative predictive value dropped.
Conclusions and Relevance
High false-positive rates were observed across a range of cancer prevalence. Normal PET/CT scans were not found to be reliable indicators of the absence of disease in patients with a high probability of lung cancer. In this population, aggressive tissue acquisition should be prioritized using a comprehensive lung nodule program that emphasizes advanced tissue acquisition techniques such as CT-guided fine-needle aspiration, navigational bronchoscopy, and endobronchial ultrasonography.
Following the results of the National Lung Screening Trial (NLST), lung cancer screening by low-dose computed tomography scans (LDCT) was adopted by the US Preventive Services Task Force and the Centers for Medicare & Medicaid Services.1 With increased penetrance of LDCT screening, even more indeterminate pulmonary nodules (IPNs) will be discovered and require evaluation. The evidence-based algorithm for the evaluation of lung nodules by the National Comprehensive Cancer Network and the American College of Radiology Lung CT Screening Reporting and Data System (Lung-RADS) practice guidelines for IPN diagnosis suggest that fluorodeoxyglucose F18–labeled positron emission tomography combined with computed tomography (FDG-PET/CT) should be used for evaluating lesions with at least 8-mm solid aspects.2,3
Initially, FDG-PET/CT was found to have high sensitivity (97%) and fairly high specificity (78%) for diagnosing IPNs suspicious for lung cancer, thereby reducing unnecessary invasive testing.4,5 A 2014 meta-analysis6 examining FDG-PET/CT accuracy to diagnose cancer in regions of endemic infectious lung diseases found the specificity of FDG-PET/CT to be significantly less in endemic regions (61%, translating to a 39% false-positive rate) compared with the specificity observed in nonendemic regions (77%).6 Infectious lung diseases, such as histoplasmosis, coccidioidomycosis, or tuberculosis, can generate benign lung granulomas. These cancer-mimicking lung granulomas frequently cause false-positive imaging by CT or FDG-PET/CT.7 False-positive imaging increases unnecessary surgical biopsies and the significant morbidity associated with invasive operations to the lung.8
Nevertheless, clinicians still rely heavily on FDG-PET/CT imaging to evaluate lung nodules suspicious for cancer. We sought to evaluate the performance of FDG-PET/CT for the diagnosis of malignancy in 7 different populations with varying cancer prevalence and methods of nodule discovery from across the United States.
We performed a multicenter retrospective cohort study of patients with known or suspected lung cancer from 7 different sources. This population was a subset analysis of a re-estimation of a predictive model for lung cancer, including all patients with PET imaging. Table 1 provides an overview of the 7 cohorts included.
Cohort 1 comprised patients evaluated by a pulmonary nodule clinic at the Vanderbilt University Medical Center in Nashville, Tennessee. Cohort 2 included patients evaluated by a pulmonary nodule clinic at the Mayo Clinic in Phoenix, Arizona. Cohort 3 included patients evaluated by a thoracic surgery clinic at Vanderbilt University Medical Center in Nashville, Tennessee, and has been described previously.9 Cohort 4 included patients from the Lahey Hospital and Medical Center in Burlington, Massachusetts, who underwent lung cancer screening and had a Lung-RADS 4 nodule or mass. Cohort 5 included patients who underwent a lung resection for known or suspected lung cancer at the Tennessee Valley Healthcare System Veterans Affairs hospital in Nashville from 2005 to 2015. Cohort 6 included patients who underwent a lung resection for known or suspected lung cancer at the University of Virginia in Charlottesville as part of the Lung Cancer Biospecimen Resource Network collaboration. Cohort 7 included patients who underwent a lung resection for known or suspected lung cancer at Vanderbilt University Medical Center in Nashville, Tennessee.
Cohorts 1 and 2 were high-risk pulmonary nodule cohorts enrolled from pulmonary nodule clinics whose IPNs were evaluated by pulmonologists. Cohorts 3 and 4 represent patients typically referred to a thoracic surgery clinic. Cohort 3 was from a thoracic specialist surgical clinic population. Cohort 4 was a Lung-RADS 4 population from a lung cancer screening program who were evaluated either in a multidisciplinary conference or in clinic by a surgeon. Indeterminate pulmonary nodule diagnosis for these 4 cohorts was either by radiographic determination of benign disease, 18 months of radiographic surveillance, or pathological tissue diagnosis. Cohorts 5, 6, and 7 were pure surgical cohorts of patients who underwent lung resections for known or suspected lung cancer. All diagnoses were pathologically determined in these 3 surgical cohorts. Table 2 shows the grouping of the 7 cohorts that reflect the similar populations and cancer prevalences (cohorts 1 and 2, cohorts 3 and 4, and cohorts 5, 6, and 7). The institutional review board of each facility approved this study, with waivers of individual patient consent.
Our study consisted of participants in these 7 cohorts who had the presence of nodules after chest PET imaging. Exclusion criteria for surgical patients included multiple nodules suspicious for cancer, evidence of benign diseases (eg, benign calcification, infiltrates, bronchiolitis obliterans organizing pneumonia, or empyema), preoperative neoadjuvant chemotherapy and/or radiation, and known metastatic disease. For patients who underwent a second lung resection for a new primary or recurrence during the study period, only the first resection was included. Exclusion criteria for nonsurgical patients also included less than 2 years of radiographic surveillance or clinical follow-up from the date of their initial nodule evaluation.
A Research Electronic Data Capture form was created to facilitate data entry across multiple sites. Reference standard information (outcome, cancer, or benign disease) was determined by either pathological tissue diagnosis, radiographic evidence of benign disease (eg, calcifications), or 18 months of radiographic surveillance among patients not undergoing a diagnostic procedural biopsy. Positron emission tomography avidity was determined by radiologist impression of the nodule or standardized uptake value of 2.5 or greater when impression was not reported.4,10
The estimated probability that a suspicious lesion was cancer was calculated for each patient using the published formula of the Mayo Clinic model, e^x/(1 + e^x), where x = −6.8272 + (0.0391 × Age) + (0.7917 × Smoking History) + (1.3388 × Previous Nonthoracic Cancer) + (0.1274 × Lesion Size) + (1.0407 × Spiculated Lesion Edge) + (0.7838 × Upper Lobe Location).11
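The Mayo Clinic model formula above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function and parameter names are hypothetical, age is assumed to be in years, lesion size is assumed to be the diameter in millimeters, and the four yes/no predictors are assumed to be coded as 1 or 0.

```python
import math


def mayo_probability(age_years, smoking_history, previous_nonthoracic_cancer,
                     lesion_size_mm, spiculated, upper_lobe):
    """Return the Mayo Clinic model probability of cancer, e^x / (1 + e^x).

    Coefficients are those quoted in the text. Units (years, mm) and the
    0/1 coding of the binary predictors are assumptions of this sketch.
    """
    x = (-6.8272
         + 0.0391 * age_years
         + 0.7917 * int(smoking_history)
         + 1.3388 * int(previous_nonthoracic_cancer)
         + 0.1274 * lesion_size_mm
         + 1.0407 * int(spiculated)
         + 0.7838 * int(upper_lobe))
    # 1 / (1 + e^-x) is algebraically identical to e^x / (1 + e^x).
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient (not from the study data): a 65-year-old smoker with
# a 22-mm, nonspiculated, upper-lobe nodule and no previous cancer.
example = mayo_probability(65, True, False, 22, False, True)
print(f"Predicted probability of cancer: {example:.2f}")
```

For this hypothetical patient the linear predictor x is about 0.09, so the predicted probability is slightly above 50%. Adding spiculation or a previous nonthoracic cancer raises x and therefore the probability.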
Analyses were performed in R, version 3.2.3 (R Programming) and Stata, version 12.1 (StataCorp). We calculated the overall accuracy (area under the curve), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of PET imaging for each cohort and the entire data set. Using a multivariable logistic regression model, we examined the degree to which population cancer prevalence and lesion size predicted FDG-PET accuracy. The primary purpose of this analysis was to determine whether population cancer prevalence was a primary driver of FDG-PET accuracy. The association between Mayo Clinic model–predicted cancer probability and PET performance was plotted. All analyses used a 2-sided P value of .05 as indication of statistical significance.
Our multicenter population included a total of 1188 patients (Table 1). The overall lung cancer prevalence was 81% (962 of 1188 patients). Most patients were male smokers older than 60 years (Table 2). Cohort 4 had a smoking prevalence of 100% because it was drawn from a lung cancer screening population constituted entirely of smokers. Median nodule size ranged from 16 mm (interquartile range [IQR], 13-22 mm) in cohort 4 (Lung-RADS 4 screening population) to 24 mm (IQR, 16-35 mm) in cohort 5 (veterans surgical population). As expected, the median nodule size was larger in the surgical cohorts than in the pulmonary nodule clinic cohorts. The median Mayo Clinic model–predicted probability of cancer was 67.5% (IQR, 39.6%-90.6%).
Median pack-year history of smoking ranged from 29 (IQR, 1-48) in cohort 1 (pulmonary nodule clinic population) to 50 (IQR, 40-80) in cohort 5 (veterans surgical population). Cancer prevalence was lowest (n = 112 of 215; 52%) in cohorts 1 and 2 (pulmonary nodule clinic populations), moderately high (n = 355 of 433; 82%) in cohorts 3 and 4 (typical thoracic surgery clinic populations), and high (n = 495 of 540; 92%) in cohorts 5, 6, and 7 (pure surgical populations).
The overall sensitivity and specificity of FDG-PET to diagnose lung cancer were 90.1% (95% CI, 88.1%-91.9%) and 39.8% (95% CI, 33.4%-46.5%), respectively. The sensitivity of PET ranged from 77% in cohort 2 (high-risk nodule population from Arizona) to 96% in cohort 5 (veteran surgical cohort from Tennessee) (Table 3). Sensitivity was significantly lower in the 3 cohorts with a median lesion size of less than 2 cm (mean, 79.1%; 95% CI, 72.2%-86.0%) than in the cohorts with a median lesion size greater than 2 cm (mean, 92.3%; 95% CI, 90.2%-94.4%; P < .001). Specificity was lowest at 22% in the 2 pure surgical cohorts from Tennessee and Virginia. Specificity was highest at 75% in cohort 4, the Lung-RADS 4 population from Massachusetts, albeit with a wide confidence interval (95% CI, 19%-99%) owing to the smaller cohort size. Cohort 4 was the only truly nonendemic cohort with regard to fungal lung disease.12 Specificity was also generally higher in cohorts that relied on a combination of pathology and radiographic surveillance.
The overall PPV was 86.4% and by population ranged from 55% to 97%. The overall NPV was 48.7%, ranging from 17% to 70%. To quantify the association between PET accuracy and both nodule size and cancer prevalence, a logistic model was fit. Based on this model, both larger nodule size and higher population cancer prevalence were significantly associated with PET accuracy (odds ratio, 1.027; 95% CI 1.015-1.040; P < .001 and 1.030; 95% CI, 1.021-1.040; P < .001, respectively).
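As a rough consistency check, the overall sensitivity, specificity, PPV, and NPV can be recomputed from a 2 × 2 table. Only three counts are stated in the text: 1188 patients, 962 cancers, and 136 false-positive scans. The true-positive count of 867 used below is not reported; it is an assumption, back-calculated as the only integer that yields the reported 90.1% sensitivity. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute the four standard test metrics from 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # cancers with an avid PET scan
        "specificity": tn / (tn + fp),  # benign nodules with a normal scan
        "ppv": tp / (tp + fp),          # avid scans that were cancer
        "npv": tn / (tn + fn),          # normal scans that were benign
    }


total, cancers, false_pos = 1188, 962, 136  # counts stated in the text
benign = total - cancers                    # 226 benign nodules
true_neg = benign - false_pos               # 90 true-negative scans
true_pos = 867                              # ASSUMED, back-calculated (see above)
false_neg = cancers - true_pos              # 95 false-negative scans

metrics = diagnostic_metrics(true_pos, false_neg, false_pos, true_neg)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under these assumed counts the recomputed values agree with the reported 90.1%, 39.8%, 86.4%, and 48.7% to within about 0.1 percentage point. The NPV works out to 90 of 185, or roughly 48.65%, so the last digit depends on rounding.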
As the Mayo Clinic model–estimated IPN probability of cancer increased, the sensitivity and PPV of PET/CT imaging increased, whereas the specificity and NPV dropped (Figure). The area under the curve for the ability of PET imaging to predict cancer diagnosis did not vary appreciably by probability of cancer quintile (results not shown). When the Mayo Clinic model estimated individual predicted probabilities of cancer to be less than 40%, PET imaging had greater than 50% specificity, sensitivity, NPV, and PPV.
Our findings suggest that FDG-PET/CT has more limited value than previously recognized to diagnose IPNs suspicious for lung cancer across a variety of clinical contexts, geographic regions, and nodule sizes. We report an overall sensitivity and specificity of FDG-PET to diagnose lung cancer of 90.1% (95% CI, 88.1%-91.9%) and 39.8% (33.4%-46.5%), respectively. These compare less favorably to the early reports of a sensitivity of 97% and specificity of 78% for diagnosing lung cancer.4,5
We found PET diagnostic performance to be particularly poor in smaller lesions from populations with lower cancer prevalence, where a normal test result does not always rule out cancer. The overall NPV of PET/CT was 48.7% (95% CI, 41.3%-56.1%), although it was higher at 69.1% (95% CI, 56.7%-79.8%) in the 2 pulmonary nodule cohorts. Both PET specificity and NPV drop as the Mayo Clinic model–predicted probability of cancer increases: essentially, a normal PET/CT did not rule out cancer in a meaningful way in this population. Of note, only at low predicted probabilities of cancer (less than 40%) did PET imaging have greater than 50% specificity, sensitivity, NPV, and PPV. Positive predictive value and NPV vary naturally with cancer prevalence.
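The dependence of PPV and NPV on prevalence follows from Bayes' theorem and can be illustrated with a short sketch. The function name is hypothetical, and the calculation makes a simplifying assumption that the study's own data contradict: it holds sensitivity and specificity fixed at the overall values (90.1% and 39.8%), whereas both varied by cohort. It therefore shows the arithmetic effect of prevalence alone, not the cohort-level results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Overall sensitivity and specificity from the text, ASSUMED constant here.
SENS, SPEC = 0.901, 0.398

# Grouped cohort prevalences reported in the text, plus the overall 81%.
for prevalence in (0.52, 0.81, 0.82, 0.92):
    ppv, npv = predictive_values(SENS, SPEC, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With sensitivity and specificity held constant, PPV rises and NPV falls as prevalence increases. This is the same direction the study reports for increasing Mayo Clinic model–predicted probability of cancer.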
18F fluorodeoxyglucose is a combination of deoxyglucose, a glucose analogue, and the positron-emitting fluorine 18 radionuclide. This radiographic biomarker of metabolic activity disperses quickly when injected and accumulates in tissues that use glycolysis, such as the kidneys and brain. Concentrations of the fluorine radionuclide are measured by PET. Normal lung tissue is not metabolically active and does not accumulate the fluorine emitter. However, active infections, growing cancers, and some benign lung processes, such as sarcoidosis, metabolize glucose and accumulate FDG in the lung. Conversely, slow-growing cancers of the lung, such as adenocarcinoma in situ and carcinoid, do not actively use glucose and often generate false-negative scan results.
On the heels of the National Lung Screening Trial, the United States is experiencing a veritable epidemic of lung nodules, driven largely by imaging, many of which turn out to be false-positives and lead to unnecessary invasive biopsies and/or surgeries. An estimated 1.6 million13 incidentally discovered IPNs are found annually, with a 1% to 12% chance of malignancy.14-16 These nodules represent a significant diagnostic burden. The National Lung Screening Trial demonstrated a reduction in lung cancer and overall mortality but also a 96% false-positive imaging rate, 24% benign operative rate, and 1.2% mortality in patients undergoing diagnostic surgical procedures. Extrapolating from the results of the National Lung Screening Trial to health care use from a national lung cancer screening program, an additional 1.5 million follow-up CT scans, approximately 250 000 FDG-PET/CT scans, and 120 000 diagnostic operations will be performed annually.17 Among the 120 000 diagnostic operations, 29 000 operations in the first 3 years would result in benign disease, subjecting patients to harm with little therapeutic benefit.1
18F fluorodeoxyglucose PET/CT scans are currently recommended for noninvasive clinical diagnosis of lung cancer.18 However, infectious fungal diseases, such as histoplasmosis, blastomycosis, and coccidioidomycosis, are well documented to cause false-positive scans.6,8 In addition, mycobacteria, including Mycobacterium tuberculosis and Mycobacterium avium-intracellulare, may result in false-positive imaging.19
Although the procedure costs between $2000 and $3000, FDG-PET/CT is still considered standard of care for the noninvasive diagnosis of IPNs. However, our work has questioned the use of FDG-PET/CT for the diagnosis of suspicious IPNs where fungal lung diseases are endemic.6,8,20,21 Limitations of this study include our reliance on data from areas with endemic fungal infections as well as the lack of prospective data collection with multiple cohorts specifically designed to represent a range of cancer prevalence, exposure to fungal lung disease, and other variables. Positron emission tomography specificity was highest in the Massachusetts cohort, a population with minimal fungal lung diseases and whose measured endemic rate is far lower than that of the other states’ cohorts (1.1%).22 In addition, there is some potential for verification bias in this retrospective study because in some cohorts, PET-avid lesions were more likely to have a definitive diagnosis. This was particularly true for cohort 4, where at least half of the Lung-RADS 4 source population did not have sufficient follow-up duration to prove benign disease on imaging alone. Accounting for this verification bias would likely shift the specificity for cohort 4 closer to 85%.
Current National Comprehensive Cancer Network and American College of Radiology algorithms for the evaluation of lung nodules recommend the use of FDG-PET/CT imaging before biopsy or surgical excision for pathological tissue diagnosis.2,3 In a cost-effectiveness study comparing diagnostic FDG-PET/CT scan with aggressive tissue acquisition, we have previously reported that clinicians should pursue diagnosis by pathological tissue procedures instead of a diagnostic FDG-PET/CT scan when the specificity of FDG-PET/CT is less than 72%.21 Frequent use of preoperative tissue acquisition strategies has been shown to result in lower rates of nontherapeutic lung resections.23
Given the overall poor specificity and NPV across our 7 populations of patients with IPNs, these findings reiterate the need for a comprehensive lung nodule program that emphasizes advanced tissue acquisition techniques, such as image-guided fine-needle aspiration, navigational bronchoscopy, and endobronchial ultrasonography, to obtain tissue, avoid thoracoscopic procedures driven by FDG-PET/CT inaccuracies, and ultimately reduce nontherapeutic resection rates. Positron emission tomographic imaging is still useful for cancer staging after pathological diagnosis, but a normal PET should not be relied on to rule out cancer in patients with a high likelihood of cancer.
Corresponding Author: Eric L. Grogan, MD, MPH, Department of Thoracic Surgery, Vanderbilt University Medical Center, 609 Oxford House, 1313 21st Ave S, Nashville, TN 37232 (firstname.lastname@example.org).
Accepted for Publication: August 6, 2017.
Published Online: November 8, 2017. doi:10.1001/jamasurg.2017.4495
Author Contributions: Dr Maiga had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Maiga, Blume, Rickman, Pinkerman, Nesbitt, Grogan.
Acquisition, analysis, or interpretation of data: Maiga, Deppen, Fletcher Mercaldo, Blume, Montgomery, Vaszar, Williamson, Isbell, Rickman, Lambright, Grogan.
Drafting of the manuscript: Maiga, Blume, Pinkerman.
Critical revision of the manuscript for important intellectual content: Deppen, Fletcher Mercaldo, Blume, Montgomery, Vaszar, Williamson, Isbell, Rickman, Lambright, Nesbitt, Grogan.
Statistical analysis: Maiga, Deppen, Fletcher Mercaldo, Blume, Vaszar.
Obtained funding: Grogan.
Administrative, technical, or material support: Isbell, Pinkerman, Lambright, Grogan.
Supervision: Blume, Williamson, Rickman, Nesbitt, Grogan.
Conflict of Interest Disclosures: None reported.
Funding/Support: Dr Maiga is supported by the Office of Academic Affiliations, Department of Veterans Affairs National Quality Scholars Program. Dr Grogan is a recipient of the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service career development award 10-024.
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Meeting Presentation: This paper was presented at the 2017 Association of VA Surgeons Annual Meeting; May 7, 2017; Houston, Texas.
Disclaimer: The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs.