Context Three fourths of US men older than 50 years have been screened with
prostate-specific antigen (PSA) for prostate cancer.
Objective To estimate the receiver operating characteristic (ROC) curve for PSA.
Design, Setting, and Participants Calculation of PSA ROC curves in the placebo group of the Prostate Cancer
Prevention Trial, a randomized, prospective study conducted from 1993 to 2003
at 221 US centers. Participants were 18 882 healthy men aged 55 years
or older without prostate cancer and with PSA levels less than or equal to
3.0 ng/mL and normal digital rectal examination results, followed up for 7
years with annual PSA measurement and digital rectal examination. If PSA level
exceeded 4.0 ng/mL or rectal examination result was abnormal, a prostate biopsy
was recommended. After 7 years of study participation, an end-of-study prostate
biopsy was recommended in all cancer-free men.
Main Outcome Measures Operating characteristics of PSA for prostate cancer detection, including
sensitivity, specificity, and ROC curve.
Results Of 8575 men in the placebo group with at least 1 PSA measurement and
digital rectal examination in the same year, 5587 (65.2%) had had at least
1 biopsy; of these, 1225 (21.9%) were diagnosed with prostate cancer. Of 1213
cancers with Gleason grade recorded, 250 (20.6%) were Gleason grade 7 or greater
and 57 (4.7%) were Gleason grade 8 or greater. The areas under the ROC curve
(AUC) for PSA to discriminate any prostate cancer vs no cancer, Gleason grade
7 or greater cancer vs no or lower-grade cancer, and Gleason grade 8 or greater
cancer vs no or lower-grade cancer were 0.678 (95% confidence interval [CI],
0.666-0.689), 0.782 (95% CI, 0.748-0.816), and 0.827 (95% CI, 0.761-0.893),
respectively (all P values <.001 for AUC vs 50%).
For detecting any prostate cancer, PSA cutoff values of 1.1, 2.1, 3.1, and
4.1 ng/mL yielded sensitivities of 83.4%, 52.6%, 32.2%, and 20.5%, and specificities
of 38.9%, 72.5%, 86.7%, and 93.8%, respectively. Age-stratified analyses showed
slightly better performance of PSA in men younger than 70 years vs those 70
years or older with AUC values of 0.699 (SD, 0.013) vs 0.663 (SD, 0.013) (P = .03).
Conclusion There is no cutpoint of PSA with simultaneous high sensitivity and high
specificity for monitoring healthy men for prostate cancer, but rather a continuum
of prostate cancer risk at all values of PSA.
One of the most common cancer screening activities in the United States
is the measurement of prostate-specific antigen (PSA) levels for the early
detection of prostate cancer. In 2001, approximately 75% of men in the United
States aged 50 years and older reported that they had previously undergone
PSA screening and 54% reported regular PSA screening.1,2 Prostate
cancer screening with PSA has been controversial, as no studies have proven
that this strategy reduces mortality from prostate cancer.3 After
almost 2 decades of PSA screening in the United States, mortality from prostate
cancer has decreased, but it is unknown if the mortality reduction is due
to screening or to other factors such as treatment efficacy.4-6 Of
concern relative to a causal interpretation is that prostate cancer mortality
rates have declined in countries where PSA screening is uncommon.7-9 In the United States,
regions with different rates of prostate cancer screening and treatment have
similar rates of disease-specific mortality.10
A potential explanation for these observations may lie in the characteristics
of PSA measurement as a screening test. In general, prostate biopsy has not
been recommended unless PSA levels exceed a threshold value, generally 4.0
ng/mL, with slightly lower values recommended recently by some authors.11,12 We have reported that as many as
15% of men with a PSA value less than 4.0 ng/mL have prostate cancer and that
15% of these cancers are high grade.13 With
an understanding that the performance characteristics of a screening test
play an important role in determining its efficacy and efficiency, we report
the receiver operating characteristic (ROC) curve for PSA.
The Prostate Cancer Prevention Trial, conducted from 1993 to 2003 at
221 US centers, randomized 18 882 men aged 55 years or older with a normal
digital rectal examination result and PSA level less than or equal to 3.0
ng/mL to receive either finasteride or placebo for 7 years.14 Measurement
of PSA levels and digital rectal examination were performed annually. Measurements
of PSA levels were performed in a central laboratory using the Tandem E assay
(Hybritech; Beckman Coulter Inc, Fullerton, Calif) until 2000, when this was
replaced with the Access assay (Beckman Coulter). A prostate biopsy with a
minimum of 6 cores was recommended if PSA levels exceeded 4.0 ng/mL or the
digital rectal examination result was suspicious for cancer. At the end of
7 years all participants not previously diagnosed with cancer were requested
to undergo an end-of-study prostate biopsy within 90 days of the randomization
anniversary date. Race was recorded because of the greater impact of prostate
cancer on African American men; race was self-reported by the participants
using categories defined by the National Institutes of Health. All participants
provided written informed consent, and the study was approved by the institutional
review boards of the participating institutions.
Two groups of participants were analyzed. Verified participants were defined as those who underwent prostate biopsy and
had had a PSA measurement and digital rectal examination within 1 year previous
to their biopsy. For individuals with multiple biopsies the last biopsy was
used; all analyses were repeated using the first biopsy instead, which confirmed
the results. Unverified participants were defined as
those without a prostate biopsy over the course of the trial; for this group,
the last PSA measurement available with an accompanying digital rectal examination
result within the year was used for analysis.
The operating characteristics are summarized in terms of the sensitivity
and specificity for cutoff values of PSA and the calculated ROC curve for
prostate cancer vs no prostate cancer. To examine the operating characteristics
of PSA for detecting more-aggressive, higher-grade disease, the operating
characteristics for Gleason grade 7 or greater prostate cancer vs Gleason
grade less than 7 or no prostate cancer, and Gleason grade 8 or greater prostate
cancer vs Gleason grade less than 8 or no prostate cancer were also calculated.
The sensitivity is defined as the proportion of cases with a PSA value exceeding
each cutoff value, and the specificity as the proportion of noncases with
a PSA value equal to or below each cutoff value. The ROC curve is a plot of
sensitivity against 1−specificity for all cutoff values in the range of PSA
levels observed. A test of the null hypothesis that the area under the ROC
curve (AUC) is 50% was performed using the Wilcoxon rank sum test.
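The definitions above can be illustrated with a minimal Python sketch. It is not the trial's analysis code: the function names and the toy PSA values are hypothetical and serve only to show how sensitivity, specificity, the ROC points, and the AUC relate. The AUC is computed here as the rank-based (Mann-Whitney, equivalent to Wilcoxon rank sum) probability that a randomly chosen case has a higher PSA value than a randomly chosen noncase, with ties counted as one half.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    case_psa: Sequence[float], noncase_psa: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity: share of cases with PSA above the cutoff.
    Specificity: share of noncases with PSA at or below the cutoff."""
    sens = sum(v > cutoff for v in case_psa) / len(case_psa)
    spec = sum(v <= cutoff for v in noncase_psa) / len(noncase_psa)
    return sens, spec


def roc_points(
    case_psa: Sequence[float], noncase_psa: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs for every observed cutoff."""
    # A cutoff below every observed value classifies everyone as test-positive.
    points = [(1.0, 1.0)]
    for cutoff in sorted(set(case_psa) | set(noncase_psa)):
        sens, spec = sensitivity_specificity(case_psa, noncase_psa, cutoff)
        points.append((1.0 - spec, sens))
    return points


def auc_rank(case_psa: Sequence[float], noncase_psa: Sequence[float]) -> float:
    """AUC as P(case > noncase) + 0.5 * P(case == noncase)."""
    wins = 0.0
    for c in case_psa:
        for n in noncase_psa:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_psa) * len(noncase_psa))


# Hypothetical PSA values (ng/mL), for illustration only.
toy_cases = [0.9, 1.8, 2.7, 4.5]
toy_noncases = [0.6, 1.0, 1.8, 3.2]

if __name__ == "__main__":
    print(sensitivity_specificity(toy_cases, toy_noncases, cutoff=1.5))
    print(roc_points(toy_cases, toy_noncases))
    print(auc_rank(toy_cases, toy_noncases))
```

An AUC of 0.5 (50%) from `auc_rank` would correspond to a test that ranks cases no higher than noncases, which is the null hypothesis described above.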
A confirmatory ROC analysis for prostate cancer vs no prostate cancer
was performed by adding unverified participants to biopsy-verified participants
and using a verification bias adjustment.15,16 To
perform the adjustment, a Markov Chain Monte Carlo algorithm using the covariates
age, family history of prostate cancer (0 = no; 1 = yes),
current digital rectal examination result (0 = negative or normal;
1 = positive indicating suspicion for cancer), and PSA level was
used to estimate the probability of cancer and to impute the missing disease
status indicator for each of the unverified participants.17 The
algorithm essentially weights the unknown disease statuses for the unverified
participants by what was observed for similar verified cases. Similar ROC
curves with and without verification bias adjustment indicate a lack of verification
bias. The algorithm was implemented in the C programming language. P<.05 was used to determine statistical significance.
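To make the idea of a verification bias adjustment concrete, the following Python sketch applies the simpler closed-form correction of Begg and Greenes15 to a single dichotomized test. It is not the Markov Chain Monte Carlo imputation used in this analysis: it ignores covariates and assumes that whether a man is biopsied depends only on his test result. The function names and all counts are hypothetical.

```python
from typing import Tuple


def naive_estimates(
    pos_diseased: int, pos_healthy: int, neg_diseased: int, neg_healthy: int
) -> Tuple[float, float]:
    """Sensitivity and specificity computed from verified participants only."""
    sens = pos_diseased / (pos_diseased + neg_diseased)
    spec = neg_healthy / (neg_healthy + pos_healthy)
    return sens, spec


def begg_greenes(
    pos_total: int, pos_diseased: int, pos_healthy: int,
    neg_total: int, neg_diseased: int, neg_healthy: int,
) -> Tuple[float, float]:
    """Corrected sensitivity and specificity.

    pos_total / neg_total count ALL test-positive / test-negative participants,
    verified or not; the *_diseased / *_healthy counts come from the verified
    subset. Assumes verification depends only on the test result.
    """
    p_pos = pos_total / (pos_total + neg_total)              # P(T+)
    p_neg = 1.0 - p_pos                                      # P(T-)
    p_d_pos = pos_diseased / (pos_diseased + pos_healthy)    # P(D | T+)
    p_d_neg = neg_diseased / (neg_diseased + neg_healthy)    # P(D | T-)
    p_d = p_d_pos * p_pos + p_d_neg * p_neg                  # P(D)
    sens = p_d_pos * p_pos / p_d                             # P(T+ | D)
    spec = (1.0 - p_d_neg) * p_neg / (1.0 - p_d)             # P(T- | no D)
    return sens, spec


if __name__ == "__main__":
    # Hypothetical cohort: 100 test-positive men (80 biopsied, 40 with cancer)
    # and 900 test-negative men (100 biopsied, 10 with cancer).
    print(naive_estimates(40, 40, 10, 90))
    print(begg_greenes(100, 40, 40, 900, 10, 90))
```

In this made-up cohort, test-positive men are biopsied far more often than test-negative men, so the verified-only estimate overstates sensitivity and understates specificity relative to the corrected estimate.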
Of 9459 men randomized to the placebo group of the study, 8575 had a
PSA value and digital rectal examination result available for analysis; characteristics
of these participants are shown in Table 1.
Of these 8575 participants, 5587 (65.2%) had at least 1 biopsy performed during
the 7 years of the study, with a respective PSA and digital rectal examination
result available. The participants who were verified were more likely to be
older, to have a family history of prostate cancer, and to be white than those
who did not undergo a biopsy (P<.001). Of the
participants who underwent biopsy, 1225 (21.9%) had prostate cancer (Table 1). Of 1213 cancers with Gleason grade
recorded, 250 (20.6%) were Gleason grade 7 or greater and 57 (4.7%) were Gleason
grade 8 or greater.
Prostate-specific antigen values and digital rectal examination results
for placebo group participants who did and did not undergo prostate biopsy
are shown in Table 2. Participants who
did not undergo prostate biopsy were more likely to have PSA values less than
or equal to 4.0 ng/mL and negative digital rectal examination results (P<.001).
The performance characteristics of PSA for detecting prostate cancer
of any grade, Gleason grade 7 or greater, and Gleason grade 8 or greater are
shown in the Figure and Table 3. For detecting any grade of cancer, the ROC curve for verified
participants only (AUC, 0.682; 95% CI, 0.664-0.699) is nearly identical to
that corrected for verification bias (AUC, 0.678; 95% confidence interval,
0.666-0.689), so results are shown only after verification bias adjustment.
Although the AUC is significantly better than 50% (P<.001),
a clear-cut decision rule for prostate biopsy based on PSA values would be
challenging to derive from these data. On one hand, the commonly used cutoff
value of 4.1 ng/mL would have a 6.2% false-positive rate (1−specificity)
but would detect only 20.5% of cancer cases (sensitivity). To improve cancer
detection, the cutoff could be lowered to 1.1 ng/mL, thus detecting 83.4%
of cancer cases, but would subject 61.1% of men without cancer to prostate
biopsy. The recently recommended cutoff of 2.6 ng/mL would detect only 40.5%
of cancer cases. Scanning the first 3 columns of Table 3 shows that there is no single cutoff that would simultaneously
yield both high sensitivity and high specificity.
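The tradeoff described above can be checked directly against the sensitivities and specificities reported for any prostate cancer at the 1.1, 2.1, 3.1, and 4.1 ng/mL cutoffs. The short Python sketch below derives the false-positive rate as 100% minus specificity and lists the cutoffs at which both measures reach a chosen level. The 80% level and the function names are assumptions made for illustration; the study does not define what counts as "high."

```python
from typing import List, Tuple

# (PSA cutoff in ng/mL, sensitivity %, specificity %) as reported for
# detection of any prostate cancer.
REPORTED: List[Tuple[float, float, float]] = [
    (1.1, 83.4, 38.9),
    (2.1, 52.6, 72.5),
    (3.1, 32.2, 86.7),
    (4.1, 20.5, 93.8),
]


def false_positive_rate(specificity_pct: float) -> float:
    """False-positive rate (%) = 100 - specificity (%), rounded to 1 decimal."""
    return round(100.0 - specificity_pct, 1)


def cutoffs_meeting(threshold_pct: float) -> List[float]:
    """Cutoffs whose sensitivity AND specificity both reach the threshold."""
    return [
        cutoff
        for cutoff, sens, spec in REPORTED
        if sens >= threshold_pct and spec >= threshold_pct
    ]


if __name__ == "__main__":
    for cutoff, sens, spec in REPORTED:
        print(f"{cutoff} ng/mL: sensitivity {sens}%, "
              f"false-positive rate {false_positive_rate(spec)}%")
    # The 80% threshold is a hypothetical choice, not a value from the study.
    print(cutoffs_meeting(80.0))
```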
The operating characteristics of PSA measurement improve for detection
of higher-grade disease, as shown in the Figure and Table 3. The AUCs for
Gleason grade 7 or greater and Gleason grade 8 or greater cancer are 0.782
(95% confidence interval, 0.748-0.816) and 0.827 (95% confidence interval,
0.761-0.893), respectively. For each PSA cutoff value, the test is more sensitive
for higher-grade disease (Table 3).
The standard cutoff of 4.1 ng/mL detects 50.9% of highest-grade (Gleason grade
≥8) disease. Lowering the cutoff to 1.6 ng/mL would increase sensitivity
for highest-grade disease to 90% at the expense of decreasing the specificity
to 53.5%. To examine the impact of age on PSA performance, men younger than
70 years (n = 2956) were compared with men aged 70 years and older
(n = 2631). PSA measurement performed slightly better in men younger than 70 years
than in men aged 70 years and older, with AUC values of 0.699 (SD, 0.013) and
0.663 (SD, 0.013), respectively (P = .03). Sensitivity and specificity for PSA
measurement in these 2 age ranges are provided in Table 3. The AUC for PSA measurement in men with normal digital
rectal examination results was 0.684 (SD, 0.010) vs 0.662 (SD, 0.024) for
men with abnormal results (P = .38).
The Prostate Cancer Prevention Trial provides the first large-scale
opportunity to evaluate the operating characteristics of PSA measurement in
a prospective screening setting. The ability to do so results from a unique
aspect of the trial—the protocol recommendation for universal verification
by prostate biopsy for all men at the end of the study, regardless of PSA
levels and digital rectal examination findings. Previous studies have retrospectively
estimated sensitivity and specificity.18 Other
prospective screening studies have generally performed a prostate biopsy only
in those men with PSA levels above 4.0 ng/mL and consequently have been subject
to verification bias. Punglia et al16 attempted
to correct for verification bias, but the conditions necessary for verification
bias adjustment were not strictly satisfied since the study protocol did not
require prostate biopsy in men with negative test results.
It is important to recognize the unique nature of this study population.
At enrollment, all participants had PSA values of 3.0 ng/mL or less and were
older than 54 years (mean age, 62 years), an age similar to that of participants
in previous screening studies.19 The participants
in the current trial anticipated semiannual visits and annual examinations
for 7 years of study. Because these criteria select a healthier, more compliant
population with generally low initial PSA values, they could affect the generalizability
of the estimates of performance characteristics of PSA measurement to the
general population.
The high frequency of cancer in men with PSA levels less than 4.0 ng/mL,
previously reported from the Prostate Cancer Prevention Trial, implies that
the use of PSA measurement for early detection of prostate cancer may result
in delayed detection with the common 4.0 ng/mL cutoff.13 However,
test performance may differ in men who have not had previous screening or
who have clinically important disease.20 We
found sensitivity to increase within the subset of higher-grade cases. Among
men with Gleason grade 8 and higher, the sensitivity of the standard PSA cutoff
of 4.1 ng/mL was 50.9%, considerably greater than the 20.5% sensitivity observed
among all cases. By comparison, Gann et al18 found
a 73% sensitivity for PSA measurement among symptomatic cancer cases diagnosed
within the 4 years following their serum draw. Cases in the series by Gann
et al had never undergone clinical PSA testing, whereas all participants in
the current study had PSA levels less than or equal to 3.0 ng/mL and a negative
digital rectal examination result at study enrollment. Because of repeated
screening, cases in our series were more likely to be diagnosed at an early
stage in their disease progression.
This analysis of the operating characteristics of PSA measurement may
help explain several observations regarding PSA screening and trends in prostate
cancer diagnosis and mortality in the United States since 1985. That many
prostate cancers, including high-grade tumors, are missed at low levels of
PSA could explain the discrepancy between the rate of PSA screening and the
change in prostate cancer mortality over the past 15 years of intensive PSA
screening. The delay in diagnosis of high-grade tumors until PSA levels exceed
current threshold “normal” values could also explain why there
is a 35% risk of subsequent treatment after radical prostatectomy, presumably
due to disease recurrence.21 However, lowering
the threshold would have 2 consequences: increased biopsy rates and the possibility
of increased detection and treatment of biologically inconsequential cancers.
Currently, men in the United States have a 17.3% lifetime risk of prostate
cancer diagnosis, while the lifetime risk of prostate cancer death is 3%.22 An inherent property of all screening tests is that
they disproportionately enhance the detection of slower-growing cancers, because
more-aggressive tumors have a greater likelihood of becoming clinically apparent
between screenings.23 While lowering the PSA
threshold is likely to increase the detection of such aggressive cancers at
an earlier stage, the unavoidable tradeoff is the increased detection of biologically
inconsequential cancers.
The implications of this analysis are substantial. Prior to clinical
use of biomarkers or other tests for cancer screening, properly designed validation
studies are essential. A multistep process for validation is currently used
by the Early Detection Research Network of the National Cancer Institute.24 While prostate cancer is not unique, it has a variable
natural history, ranging from markedly aggressive to indolent. Consideration
should be given to the development of biomarkers that incorporate disease
prognosis. Finally, it will be a challenge to the medical community to change
the long-held notion that there is a “normal” PSA level. Patients
and health care professionals must be reeducated that there is a continuum
of risk and no clearly defined PSA cutpoint at which to recommend biopsy.
It will be the patient, in concert with his health care professional, who
will ultimately have to weigh the sensitivity-specificity tradeoffs in combination
with the uncertain natural history of the disease to determine whether further
evaluation with a prostate biopsy is appropriate.
Corresponding Author: Ian M. Thompson, MD,
Department of Urology, University of Texas Health Science Center at San Antonio,
7703 Floyd Curl Dr, San Antonio, TX 78229 (thompsoni@uthscsa.edu).
Author Contributions: Drs Thompson, Ankerst,
and Crowley and Mss Chi and Goodman had full access to all of the data in
the study and take responsibility for the integrity of the data and the accuracy
of the data analysis.
Study concept and design: Thompson, Ankerst,
Goodman, Parnes, Coltman.
Acquisition of data: Thompson, Lucia, Goodman,
Crowley.
Analysis and interpretation of data: Ankerst,
Chi, Lucia, Goodman, Crowley.
Drafting of the manuscript: Thompson, Ankerst,
Coltman.
Critical revision of the manuscript for important
intellectual content: Thompson, Ankerst, Chi, Lucia, Goodman, Crowley,
Parnes.
Statistical analysis: Ankerst, Chi, Goodman.
Obtained funding: Coltman.
Administrative, technical, or material support:
Thompson, Coltman.
Study supervision: Thompson, Ankerst, Goodman,
Crowley.
Financial Disclosures: None reported.
Funding/Support: This study was supported in
part by Public Health Service grants CA 37429, CA 35178, CA 45808, and CA
86402 from the National Cancer Institute.
Role of the Sponsor: The National Cancer Institute
sponsored the conduct of the Prostate Cancer Prevention Trial, including collection
of primary participant information (PSA measurement and analysis, pathologic
evaluation, follow-up of participants, and initial data analysis), but had
no role in the data analysis or the decision to publish.
Acknowledgment: We thank Ruth Etzioni, PhD,
and Ross Prentice, PhD, of the Fred Hutchinson Cancer Research Center, Seattle,
Wash, for their assistance with development of the manuscript.
Reprint Requests: Southwest Oncology Group
Operations Office, 14980 Omicron Dr, San Antonio, TX 78245-3217.
1. Weir HK, Thun MJ, Hankey BF, et al. Annual report to the nation on the status of cancer, 1975-2000, featuring the uses of surveillance data for cancer prevention and control. J Natl Cancer Inst. 2003;95:1276-1299. PMID: 12953083
2. Sirovich BE, Schwartz LM, Woloshin S. Screening men for prostate and colorectal cancer in the United States: does practice reflect the evidence? JAMA. 2003;289:1414-1420. PMID: 12636464
3. Brawley OW. Prostate cancer screening: clinical applications and challenges. Urol Oncol. 2004;22:353-357. PMID: 15283896
4. Feuer EJ, Mariotto A, Merrill R. Modeling the impact of the decline in distant stage disease on prostate carcinoma mortality rates. Cancer. 2002;95:870-880. PMID: 12209732
5. Clegg LX, Li FP, Hankey BF, Chu K, Edwards BK. Cancer survival among US whites and minorities: a SEER (Surveillance, Epidemiology, and End Results) Program population-based study. Arch Intern Med. 2002;162:1985-1993. PMID: 12230422
6. Etzioni R, Legler JM, Feuer EJ, Merrill RM, Cronin KA, Hankey BF. Cancer surveillance series: interpreting trends in prostate cancer, III: quantifying the link between population prostate-specific antigen testing and recent declines in prostate cancer mortality. J Natl Cancer Inst. 1999;91:1033-1039. PMID: 10379966
7. Oliver SE, May MT, Gunnell D. International trends in prostate-cancer mortality in the “PSA ERA.” Int J Cancer. 2001;92:893-898. PMID: 11351313
8. Oliver SE, Gunnell D, Donovan JL. Comparison of trends in prostate-cancer mortality in England and Wales and the USA. Lancet. 2000;355:1788-1789. PMID: 10832832
9. Quinn M, Babb P. Patterns and trends in prostate cancer incidence, survival, prevalence and mortality, I: international comparisons. BJU Int. 2002;90:162-173. PMID: 12081758
10. Lu-Yao G, Albertsen PC, Stanford JL, Stukel TA, Walker-Corkery ES, Barry MJ. Natural experiment examining impact of aggressive screening and treatment on prostate cancer mortality in two fixed cohorts from Seattle area and Connecticut. BMJ. 2002;325:740. PMID: 12364300
11. Antenor JA, Ham M, Roehl K, Nadler RB, Catalona WJ. Relationship between initial prostate specific antigen level and subsequent prostate cancer detection in a longitudinal screening study. J Urol. 2004;172:90-93. PMID: 15201744
12. Krumholtz JS, Carvalhal GF, Ramos CG, et al. Prostate-specific antigen cutoff of 2.6 ng/mL for prostate cancer screening is associated with favorable pathologic tumor features. Urology. 2002;60:469-473. PMID: 12350486
13. Thompson IM, Pauler DK, Goodman PJ, et al. Prevalence of prostate cancer among men with a prostate-specific antigen level <4.0 ng per milliliter. N Engl J Med. 2004;350:2239-2246. PMID: 15163773
14. Thompson IM, Goodman PJ, Tangen CM, et al. The influence of finasteride on the development of prostate cancer. N Engl J Med. 2003;349:211-220. PMID: 12867604
15. Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983;39:207-215. PMID: 6871349
16. Punglia RS, D’Amico AV, Catalona WJ, Roehl KA, Kuntz KM. Effect of verification bias on screening for prostate cancer by measurement of prostate-specific antigen. N Engl J Med. 2003;349:335-342. PMID: 12878740
17. Gelfand AE, Smith AF. Sampling-based approaches to calculating marginal densities. J Am Stat Assoc. 1990;85:398-409.
18. Gann PH, Hennekens CH, Stampfer MJ. A prospective evaluation of plasma prostate-specific antigen for detection of prostatic cancer. JAMA. 1995;273:289-294. PMID: 7529341
19. Catalona WJ, Smith DS, Ratliff TL, Basler JW. Detection of organ-confined prostate cancer is increased through prostate-specific antigen–based screening. JAMA. 1993;270:948-954. PMID: 7688438
20. Etzioni R, Cha R, Feuer EJ, Davidov O. Asymptomatic incidence and duration of prostate cancer. Am J Epidemiol. 1998;148:775-785. PMID: 9786232
21. Lu-Yao GL, Potosky AL, Albertsen PC, Wasson JH, Barry MJ, Wennberg JE. Follow-up prostate cancer treatments after radical prostatectomy: a population-based study. J Natl Cancer Inst. 1996;88:166-173. PMID: 8632490
23. Prorok PC, Kramer BS, Gohagan JK. Screening theory and study design: the basics. In: Kramer BS, Gohagan JK, Prorok PC, eds. Cancer Screening, Theory and Practice. New York, NY: Marcel Dekker Inc; 1999:29-53.
24. Sullivan Pepe M, Etzioni R, Feng Z, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst. 2001;93:1054-1061. PMID: 11459866