AFP, alpha-fetoprotein; beta HCG, beta human chorionic gonadotropin; CA 15-3, cancer antigen 15-3; CA-125, cancer antigen-125; CEA, carcinoembryonic antigen; PSA, prostate-specific antigen; LDH, lactate dehydrogenase.
Appendix 1. Guideline organizations evaluated by cancer type
Appendix 2. Guideline methodology and evidence quality
Merkow RP, Korenstein D, Yeahia R, Bach PB, Baxi SS. Quality of Cancer Surveillance Clinical Practice Guidelines: Specificity and Consistency of Recommendations. JAMA Intern Med. 2017;177(5):701–709. doi:10.1001/jamainternmed.2017.0079
What is the specificity and consistency of recommendations for cancer surveillance after active treatment across guidelines?
In this retrospective cross-sectional analysis of 41 national cancer guidelines across 9 cancer types, we found recommendations are often nonspecific and inconsistent. Within the same disease, different guidelines often did not address all the same surveillance modalities, and relatively few surveillance modalities were recommended across all guidelines.
As guidelines continue to be revised, developers should clarify recommendations with simple, nonambiguous, definitive language for or against the use of specific tests to optimize care quality and resource utilization.
Primary care clinicians, who are increasingly responsible for caring for the growing population of cancer survivors, may be unfamiliar with appropriate cancer surveillance strategies. Clinical practice guidelines can inform cancer follow-up care and surveillance testing. Vague recommendations and inconsistencies among guidelines can lead to overuse and underuse of health care resources and have a negative impact on cost and quality of survivorship care.
To examine the specificity and consistency of recommendations for surveillance after active treatment across cancer guidelines.
Design, Setting, and Participants
Retrospective cross-sectional analysis of national cancer guidelines from North America and Europe published since 2010 addressing posttreatment care for survivors of the 9 most common cancers. We categorized surveillance modalities into history and physical examinations, tumor markers, diagnostic procedures (eg, colonoscopy), and imaging. Within each guideline, we classified individual recommendations into 5 categories: (1) risk-based recommendation, (2) recommendation for surveillance, (3) addressed but no clear recommendation, (4) recommendation against surveillance, or (5) cases in which surveillance was not addressed. We reviewed each surveillance recommendation for frequency and a stop date, evaluated consistency among guidelines, and analyzed associations between the organizations proposing the guidelines and recommendation characteristics.
Main Outcomes and Measures
Description of guideline recommendations for cancer surveillance.
We identified 41 guidelines published between January 1, 2010, and March 1, 2016. Eighty-five percent of guidelines (35) were from professional organizations. Ambiguous recommendations (ie, modality not discussed or discussed without a clear recommendation) were present in 83% of guidelines (34), and 44% (18) recommended against at least 1 test. European guidelines were more likely than North American guidelines to contain ambiguous recommendations (100% vs 68%; P < .01). Recommendations commonly specified testing frequency (from 88% [14 of 16] for tumor markers to 92% [24 of 26] for procedures and/or imaging) but infrequently provided a definitive stop time. Cross-sectional imaging recommendations varied among guidelines for each cancer. For example, among breast cancer guidelines, surveillance computed tomographic scans were recommended against in 2, discussed without a clear recommendation in 1, and not addressed in 3 guidelines.
Conclusions and Relevance
Guidelines addressing the care of cancer survivors have low specificity and consistency. As guidelines continue to be revised, developers should clarify recommendations with simple, nonambiguous, definitive language for or against the use of specific tests to optimize care quality and resource utilization.
There are an estimated 33 million living survivors of cancer globally, and this number is expected to grow owing to a rising cancer incidence in an aging population and improved survival following a cancer diagnosis.1,2 Currently, in the United States, 1 in every 20 people, or 14 million, meets the definition of cancer survivor3; this number is expected to grow to 18 million survivors by 2022.2 Surveillance for recurrent or secondary cancer is a fundamental component of survivorship care.4 Depending on the site of primary disease and time since treatment, surveillance modalities can include medical history and physical examinations, tumor markers, direct visualization with endoscopic procedures, and radiographic imaging.
Given a growing shortage of oncologists in the United States,5 survivorship care is increasingly provided by primary care physicians (PCPs).6,7 However, PCPs infrequently receive guidance from oncologists regarding appropriate surveillance care4 and may lack knowledge and confidence in this area.8
To provide optimal survivorship care, PCPs9 and professional organizations6 have acknowledged the need for clinical practice guidelines with clear recommendations addressing the care of cancer survivors. Given the size of this patient population, their potential vulnerability, and the high cost of some tests used for surveillance testing (eg, positron emission tomographic [PET] scanning), high-quality guidelines in the area of cancer survivorship have the potential to have a great impact on value, by both improving clinical outcomes and controlling costs. In other clinical settings, guidelines have been criticized for vagueness of recommendations10 and inconsistency,11 limiting their applicability and usefulness to clinicians for determining appropriate care. To our knowledge, characteristics of guidelines related to the care of cancer survivors have not been previously described. We sought to evaluate the specificity of national guidelines containing recommendations about surveillance testing in survivors and to analyze the consistency of recommendations across guidelines addressing the same cancer.
We performed a cross-sectional analysis of clinical practice guidelines from North America and Europe addressing cancers with the highest estimated number of survivors in the United States as identified by the American Cancer Society.2 We included 9 cancers (breast, colorectal, non–small-cell lung, prostate, melanoma, uterine corpus, bladder, thyroid, and testicular), which represented 73% of all cancer survivors (10 623 240 people) in the United States in 2014.2
We performed an online search for publicly available cancer guidelines for each selected cancer; searches were performed by 2 investigators (R.P.M. and R.Y.). We included any national-level guideline from a government agency or a professional group or society in North America or Europe published in English between January 1, 2010, and March 1, 2016, addressing posttreatment cancer surveillance. We identified guidelines for inclusion by performing internet searches using relevant keywords (eg, clinical practice guideline, oncology, cancer follow-up), examining websites of well-established guideline development organizations (eg, National Comprehensive Cancer Network [NCCN], National Institute for Health and Clinical Excellence [NICE]) and national societies (eg, American Society of Clinical Oncology [ASCO], European Society for Medical Oncology [ESMO]), and querying the Agency for Healthcare Research and Quality’s National Guideline Clearinghouse12 website. Clinical guidelines that did not contain surveillance recommendations were excluded. After guideline selection was complete, we recorded specific characteristics, including organization type (professional or government), year of guideline publication (2010-2013 or 2014-2016), and region of origin (North America or Europe). For each guideline we evaluated aspects of the guideline development process (specification of clinical questions, performance of a systematic review) and the reported strength of evidence in support of surveillance recommendations, based on prioritized elements from the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system, the Reporting Items for Practice Guidelines in Healthcare (RIGHT) checklist, and the Institute of Medicine (IOM) standards for guideline development.13-16
We categorized methods of surveillance as history and physical examination, tumor marker, diagnostic procedure (eg, colonoscopy), or imaging. We included any surveillance modality that was addressed by at least 1 guideline. One of 3 clinicians (R.P.M., D.K., or S.S.B.) classified each recommendation as 1 of the following: (1) risk-based recommendation, (2) recommendation for surveillance, (3) addressed but no clear recommendation provided, (4) recommendation against, or (5) cases in which surveillance was not addressed. We defined risk-based recommendations as those in which the use of a mechanism of surveillance differed based on the level of risk of recurrence. If the clinician was unsure how to classify a recommendation, it was reviewed by 1 or both of the others and consensus was reached.
To assess for the specificity of each recommendation, we evaluated for inclusion of a surveillance frequency (eg, tumor marker testing every 3 months), the presence of a definitive stop date (eg, tumor marker testing every 3 months for 1 year), and the presence of ambiguity (ie, without a clear recommendation for or against any given test). To evaluate for consistency regarding the same surveillance method for the same cancer, we compared testing recommendations among guidelines addressing the same cancer type. We defined inconsistent guidelines when recommendations did not agree, including when one guideline recommended for or against a test while another discussed a test without a clear recommendation or did not discuss that test at all.
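The classification and consistency rules described above can be sketched in code. This is a minimal illustration and not the authors' analysis code: the names `Rec`, `is_ambiguous`, and `is_consistent` are hypothetical, and treating any difference in category (including risk-based vs unconditional recommendation) as disagreement is an assumption. The example data are the breast cancer computed tomography recommendations reported in the abstract (recommended against in 2 guidelines, discussed without a clear recommendation in 1, and not addressed in 3).

```python
from enum import Enum


class Rec(Enum):
    """The 5 recommendation categories used in the study."""
    RISK_BASED = 1
    FOR = 2
    NO_CLEAR_RECOMMENDATION = 3
    AGAINST = 4
    NOT_ADDRESSED = 5


# Ambiguous: modality not discussed, or discussed without a clear recommendation.
AMBIGUOUS = {Rec.NO_CLEAR_RECOMMENDATION, Rec.NOT_ADDRESSED}


def is_ambiguous(rec):
    """True if the recommendation is neither for nor against the test."""
    return rec in AMBIGUOUS


def is_consistent(recs):
    """Assumption: guidelines agree only if all fall in the same category."""
    return len(set(recs)) <= 1


# Breast cancer, surveillance CT, across the 6 breast cancer guidelines.
breast_ct = (
    [Rec.AGAINST] * 2
    + [Rec.NO_CLEAR_RECOMMENDATION]
    + [Rec.NOT_ADDRESSED] * 3
)

ambiguous_count = sum(is_ambiguous(r) for r in breast_ct)
print("Consistent across guidelines:", is_consistent(breast_ct))
print("Guidelines with an ambiguous CT recommendation:", ambiguous_count)
```

Under these assumptions, the breast cancer CT recommendations are classified as inconsistent, and 4 of the 6 guidelines are ambiguous for this modality.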
We used descriptive statistics to characterize surveillance methods, recommendation types, specificity, and consistency, and used χ2 tests to evaluate associations between guideline sources and recommendation characteristics. Owing to small sample size, we did not perform multivariable analysis. Significance was set at P = .05, and all tests were 2-sided. All statistical analysis was performed using SAS software (version 9.4; SAS Institute Inc).
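For readers unfamiliar with the test, the following is a minimal sketch of a 2-sided Pearson χ2 test on a 2 × 2 table, written with the Python standard library only. It is not the SAS procedure used in the study, it applies no continuity correction, and the counts in the example are hypothetical values chosen for illustration, not study data. The function name `chi_square_2x2` is also an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                 outcome present   outcome absent
        group 1        a                 b
        group 2        c                 d

    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts for illustration only (not taken from the study).
chi2, p = chi_square_2x2(10, 5, 4, 11)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

With samples as small as those in this study, some expected cell counts may be low, in which case an exact test (eg, Fisher exact test) is a common alternative.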
We identified a total of 41 guidelines addressing posttreatment surveillance across the 9 cancer types (Table 1). The number of guidelines per cancer type ranged from 3 to 6 per cancer, and a total of 22 specific testing modalities were addressed. Thirty-five guidelines (85%) were from professional organizations, of which 25 (71%) were developed by national societies. Twenty guidelines (49%) were from North America, and most guidelines were published between 2014 and 2016 (66%). eAppendix 1 in the Supplement lists all guidelines included by cancer type.17-59 Guideline development processes were variable: clinical questions were specified in 11 guidelines (27%), a systematic review was performed in 14 (34%), and 34 (83%) rated the strength of evidence and/or strength of recommendations. Supporting evidence was weak for most recommendations (eAppendix 2 in the Supplement).
Medical history and physical examinations were recommended in most guidelines (37 [90%]) across all cancer types, while other forms of surveillance were less commonly addressed and varied more across cancers, including imaging (34 [83%]), endoscopic procedures (26 [63%]), and tumor markers (23 [56%]). Ambiguous recommendations (ie, recommendations neither for nor against a particular modality) were present in 34 guidelines (83%) across cancer types while 18 guidelines (44%) recommended against at least 1 test. Fourteen guidelines (34%) included risk-based recommendations (Table 1). A recommendation against use was included in at least 1 guideline for 12 of 22 total testing modalities identified, although no test was recommended against consistently. Recommendations for surveillance testing varied by cancer type and sometimes across guidelines addressing the same cancer type. Some testing modalities were universally recommended across guidelines for a specific cancer type, including mammography in breast cancer, colonoscopy and tumor markers in colorectal cancer, tumor markers in prostate cancer, and ultrasonography and tumor markers in thyroid cancer.
Recommendations regarding other surveillance modalities were less consistent. With regard to tumor markers, 2 of 4 testicular cancer guidelines (50%) and 1 of 4 melanoma guidelines (25%) recommended risk-based tumor marker testing; 2 of 6 breast cancer guidelines (33%) and 1 of 5 lung cancer guidelines (20%) recommended against tumor marker testing (Figure 1). The tests that were most commonly recommended against were CT imaging in uterine cancer (67% of relevant guidelines) and bone scans in prostate cancer (33%) (Table 2).
Positron emission tomographic imaging was recommended by only 1 of 41 guidelines; this was for bladder cancer (Figure 2). The remainder of guidelines either recommended against or did not address routine PET imaging. Uterine cancer had the most guidelines recommending against the use of PET imaging (67%) followed by lung cancer (60%). The cancer types with the most guidelines with ambiguous recommendations for PET scans were bladder (83%), prostate (83%), and breast (67%) cancers.
Testing frequency was provided for most of the surveillance modalities addressed (range, 88%-92%), but stop times were infrequently provided (range, 31%-38%). There was no statistically significant difference in testing frequency, inclusion of a stop time, presence of a risk-based recommendation, recommendation against at least 1 test, or guideline ambiguity by organization type or year of publication. However, there was significant variation in the presence of a stop time recommendation by cancer type (range, 0% for prostate, uterine, and thyroid cancers to 100% for colorectal cancer; P < .01). In addition, European guidelines were more likely than North American guidelines to contain ambiguous recommendations (100% vs 68%; P < .01) (Table 3).
Clinical practice guidelines addressing cancer surveillance testing are critical tools for clinicians for optimizing care of the large and growing population of cancer survivors. Unclear or imprecise recommendations present challenges for all health care providers (eg, physicians, nurses, nurse practitioners) caring for cancer survivors. The specificity and consistency of recommendations across guidelines is particularly important because survivorship care is increasingly transitioned to clinicians with less familiarity with specific cancers.6,7 In this study, we found multiple guidelines from North America and Europe addressing posttreatment cancer surveillance containing recommendations that were often nonspecific and inconsistent. In fact, within the same disease, different guidelines often did not address all the same surveillance modalities, and relatively few surveillance modalities were recommended across all guidelines. Our findings are consistent with those of prior studies addressing the specificity and consistency of guideline recommendations related to both screening10 and cancer care.60
Most surveillance recommendations included a testing frequency, but only about 1 in 3 provided a definitive stop time. Reasons for infrequent stop times are unclear, although there is a clear decreased risk of recurrence over time for most malignant neoplasms, and few surveillance modalities are required indefinitely.61-63 However, PCPs may be reluctant to halt testing without clear recommendations on when to do so. Similarly, specificity in guideline recommendations is key to their usability,10 and lack of recommendation specificity is associated with poor guideline adherence in other clinical contexts.64,65 The lack of clarity in cancer surveillance recommendations is particularly relevant because cancer survivors are transitioning earlier after active treatment66,67 from oncologists to PCPs. Just as high-quality guideline recommendations can help clinicians maximize patient benefit and minimize potential harm, a lack of specificity may impede guideline adherence and contribute to either overuse or underuse of care.68 Although underuse has been more thoroughly studied, excessive ongoing surveillance may harm patients through exposure either to direct harms of unnecessary surveillance tests or to harms of more invasive downstream procedures.69 Much of the lack of recommendation specificity is likely driven by the low quality of evidence to inform optimal surveillance strategies in cancer survivors that we documented in our study,70 and poor evidence is clearly a major barrier to the development of high-quality guidelines.71 However, developers can optimize guideline usability by maintaining transparency about the strength of evidence while still making specific recommendations even in the absence of strong evidence.
The Institute of Medicine has stated that guidelines should be valid, reliable, applicable, flexible, and clear, and should reflect a multidisciplinary process that can be regularly updated.72 The guidelines in our sample fall short in many of these domains, which is not unique among oncology guidelines.73 However, we believe that a number of simple changes to the development of cancer surveillance recommendations would improve their clarity, applicability, and, therefore, their ability to optimize patient outcomes.
First, recommendations about testing should use language that is unambiguous and includes a testing frequency with definitive start and stop intervals.65 For example, with respect to surveillance imaging, a guideline could state that a specific test should be performed “every 6 months for the first 2 years, yearly for 3 years and should not be performed after a total of 5 years if there is no evidence of recurrence.” Definitive statements such as “positron emission tomography scans should not be used for surveillance outside of a clinical trial”23 should be encouraged and adopted. While shared decision-making with patients is critical for optimizing care and clinicians may not apply every recommendation to every patient,74 clarity and consistency in guideline recommendations, along with transparent evidence ratings, can facilitate communication and patient understanding.
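The example wording above (every 6 months for the first 2 years, yearly for 3 years, and no testing after 5 years) is specific enough to be written down as an explicit schedule. The sketch below is illustrative only: the `Phase` class and `surveillance_months` function are hypothetical names, and counting months from the end of active treatment is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One phase of a surveillance schedule, with both values in months."""
    interval_months: int
    duration_months: int


def surveillance_months(phases):
    """Return the months (after treatment ends) at which a test is due.

    The schedule stops when the last phase ends, which makes the
    stop date explicit rather than open-ended.
    """
    months = []
    start = 0
    for phase in phases:
        end = start + phase.duration_months
        month = start + phase.interval_months
        while month <= end:
            months.append(month)
            month += phase.interval_months
        start = end
    return months


# Every 6 months for 2 years, then yearly for 3 years, then stop.
schedule = [Phase(interval_months=6, duration_months=24),
            Phase(interval_months=12, duration_months=36)]

due = surveillance_months(schedule)
print(due)  # [6, 12, 18, 24, 36, 48, 60]
```

A recommendation that cannot be expressed this way, because it lacks an interval or a stop date, is one a clinician would have to interpret unaided.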
Next, cancer surveillance strategies should include recommendations that are tailored to recurrence risk. There is increasing recognition that risk-based guideline recommendations may optimize outcomes and care value, both generally75,76 and specifically in the setting of long-term monitoring of survivors of childhood cancer.77 One-third of guidelines in our sample included at least 1 risk-based recommendation, although in these cases, risk was generally based on stage at diagnosis alone. Robust risk-based follow-up of adult cancer survivors should incorporate factors that are well established from randomized clinical trials and observational data, including cancer and patient characteristics (eg, stage, grade, genetic mutation status). In colorectal cancer, for example, extensive data exist outlining recurrence risk from decades of randomized clinical trials, including risk-based models, but these data are not currently incorporated into surveillance recommendations.78-82 The NCCN melanoma guidelines are an example of higher-quality, risk-based recommendations38 in which patients with stage I and II disease are followed by history and physical examinations only, while those with stage III and IV disease undergo more extensive surveillance with cross-sectional imaging (eg, CT scans, magnetic resonance imaging) including PET scans. The lack of risk-based recommendations among guidelines in our sample likely reflects the limited data available to help instruct surveillance programs60; indeed, most recommendations were based on low-quality evidence. Other barriers to risk-based recommendations include the inherent complexity of developing them and perhaps the perceived challenges with clinician interpretation. Nevertheless, risk-based recommendations are likely to provide a more efficient, cost-effective approach to patient follow-up, and further incorporation of risk into surveillance recommendations would improve their usefulness.
Third, survivorship guideline development panels should incorporate all stakeholders, including generalist physicians, advanced practice clinicians, and patient representatives.72 Currently, panels developing cancer guidelines are tasked with providing recommendations across the continuum of care, including diagnosis, treatment, and posttreatment management. While the panels may be multidisciplinary as recommended by the IOM,64 including a variety of oncologic specialists to address the complexity of diagnosis and treatment of cancer, the panels likely contain very few, if any, general practitioners.83 Yet it is often generalists who must translate surveillance recommendations into clinical care. Incorporating the diverse opinions and experiences of all groups affected by surveillance recommendations may facilitate greater guideline specificity and encourage more active engagement and adherence to guideline recommendations.65
Most recommendations addressing posttreatment care in cancer survivors are made in the context of guidelines addressing the diagnosis and treatment of a particular cancer; in this context, surveillance recommendations are included but not emphasized. Developing surveillance guidelines separately from general cancer care guidelines may allow for the inclusion of more appropriate panelists, better focus, and more specific recommendations. Recently, the American Cancer Society and the American Society of Clinical Oncology18 published the breast cancer survivorship guideline, illustrating the advantages of this approach. The guideline focused narrowly on breast cancer care after the completion of acute cancer therapy. The survivorship guideline was developed by a multidisciplinary panel that included appropriate stakeholders for surveillance testing (cancer clinicians, generalists, and patients) and made specific and actionable recommendations. More guidelines with focus only on survivorship care could facilitate change and allow developers to focus on improving recommendation quality.
There are several important limitations to our study. First, we restricted our search to national cancer guidelines and excluded regional recommendations. This approach excluded provincial clinical practice guidelines in Canada, although they may be widely used and influential.84 Nevertheless, including additional guidelines is likely to have increased the variation we found and would be unlikely to qualitatively change our results. Second, this study sample was small, and we were only able to evaluate the association between guideline characteristics and the specificity and consistency of recommendations using a univariable analysis. Owing to the nature of the study and the limited number of guidelines in existence, there was not an alternative methodological approach, and this would only influence the comparative analysis and not our primary findings. Third, there is inherent subjectivity in the interpretation of recommendations. However, we attempted to mitigate this issue by identifying and extracting important data elements to standardize guideline reporting and comparisons. Finally, our study is cross-sectional and offers a snapshot of the status of surveillance clinical practice guidelines up to March 1, 2016. Nevertheless, given the current state of cancer surveillance guidelines, it is unlikely that major qualitative changes will occur in the near future.
The number of cancer survivors is growing, and optimizing cancer surveillance is an important issue for individual patients, payers, and clinicians. Our review of surveillance recommendations from 41 clinical practice guidelines across 9 cancer types found a lack of specificity and consistency that hinders optimal patient care. As cancer guidelines are reviewed and revised, we believe developers should clarify recommendations with simple, nonambiguous, definitive language for or against the use of specific tests and should provide specific recommendations based on patient risk.
Corresponding Author: Ryan P. Merkow, MD, MS, Department of Surgery, Memorial Sloan Kettering Cancer Center, 1275 York Ave, C-1272, New York, NY 10065 (email@example.com).
Accepted for Publication: January 10, 2017.
Published Online: March 20, 2017. doi:10.1001/jamainternmed.2017.0079
Author Contributions: Dr Merkow had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Merkow, Korenstein, Bach, Baxi.
Acquisition, analysis, or interpretation of data: Merkow, Korenstein, Yeahia, Baxi.
Drafting of the manuscript: Merkow, Yeahia, Baxi.
Critical revision of the manuscript for important intellectual content: Korenstein, Bach, Baxi.
Statistical analysis: Merkow, Yeahia.
Administrative, technical, or material support: Korenstein, Yeahia, Baxi.
Supervision: Korenstein, Bach.
Conflict of Interest Disclosures: Dr Bach reports personal fees from the Association of Community Cancer Centers, America's Health Insurance Plans, AIM Specialty Health, American College of Chest Physicians, American Society of Clinical Oncology, Barclays, Defined Health, Express Scripts, Genentech, Goldman Sachs, McKinsey and Company, MPM Capital, National Comprehensive Cancer Network, Biotechnology Industry Organization, The American Journal of Managed Care, The Boston Consulting Group, Foundation Medicine, Anthem Inc, Novartis, and Excellus Health Plan. No other disclosures are reported.
Funding/Support: This study was supported in part by the National Institutes of Health/National Cancer Institute (NIH/NCI) P30 CA008748 Cancer Center Support Grant and by grants from the Kaiser Foundation Health Plan and the Laura and John Arnold Foundation.
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.