HPV indicates human papillomavirus; NSAID, nonsteroidal anti-inflammatory drug.
eAppendix 1. Coding Specification for Measuring 9 of the Earliest Choosing Wisely Recommendations Using Administrative Claims Data
eAppendix 2. Choosing Wisely Low-Value Services—Descriptive Results
Rosenberg A, Agiro A, Gottlieb M, Barron J, Brady P, Liu Y, Li C, DeVries A. Early Trends Among Seven Recommendations From the Choosing Wisely Campaign. JAMA Intern Med. 2015;175(12):1913-1920. doi:10.1001/jamainternmed.2015.5441
Importance
The Choosing Wisely campaign consists of more than 70 lists produced by specialty societies of medical practices or procedures of minimal clinical benefit to patients in most situations, with recommendations regarding judicious use.
Objective
To quantify the frequency and trends of some of the earliest Choosing Wisely recommendations using nationwide commercial health plan population-level data.
Design, Setting, and Participants
Retrospective analysis of claims data for members of Anthem-affiliated commercial health plans. The low-value services selected were (1) imaging tests for uncomplicated headache; (2) cardiac imaging without history of cardiac conditions; (3) low back pain imaging without red-flag conditions; (4) preoperative chest x-rays with unremarkable history and physical examination results; (5) human papillomavirus testing for women younger than 30 years; (6) use of antibiotics for acute sinusitis; and (7) use of prescription nonsteroidal anti-inflammatory drugs (NSAIDs) for members with hypertension, heart failure, or chronic kidney disease.
Main Outcomes and Measures
The number of members with medical and/or pharmacy claims for the included low-value services was assessed quarterly over a 2- to 3-year span through 2013. Trend changes in recommendations were evaluated across all quarters using Poisson regression with denominators as offsets.
Results
Two services had declines: Use of imaging for headache decreased from 14.9% to 13.4% (trend estimate, 0.99 [95% CI, 0.98-0.99]; P < .001), and cardiac imaging decreased from 10.8% to 9.7% (trend estimate, 0.99 [95% CI, 0.99-0.99]; P < .001). Two services had increases: Use of NSAIDs in select conditions increased from 14.4% to 16.2% (trend estimate, 1.02 [95% CI, 1.01-1.02]; P < .001), and human papillomavirus testing in younger women increased from 4.8% to 6.0% (trend estimate, 1.01 [95% CI, 1.00-1.01]; P < .001). Use of antibiotics for sinusitis remained stable (0.8% decrease from 84.5% to 83.7%; trend estimate, 1.00 [95% CI, 1.00-1.00]; P = .16). Use of preoperative chest x-rays (0.2% decrease, ending utilization 91.5%; trend estimate, 1.00 [95% CI, 1.00-1.00]; P = .70) and imaging for low back pain (53.7% utilization throughout study; P = .71) remained high with no statistically significant changes.
Conclusions and Relevance
In this population-level analysis of 7 low-value services, changes were modest but showed a desirable decrease for 2 recommendations (imaging for headache, cardiac imaging for low-risk patients). The effect sizes were marginal, however, and although 4 of the 7 lists had statistically significant changes (unsurprising given the large sample size), the clinical significance is uncertain. These results suggest that additional interventions are necessary for wider implementation of Choosing Wisely recommendations.
Reducing use of unnecessary medical procedures and treatments is a key component of controlling health care expenditures. As much as 30% of US health care expenditures are for interventions with marginal clinical benefit for patients and unnecessary diagnostic procedures that may lead to inappropriate treatment or a cascade of additional tests or procedures.1-4
The Choosing Wisely campaign represents a physician-driven effort to create conversations between physicians and patients around overuse and waste. The campaign was piloted in 2009 by the National Physicians Alliance, funded by the American Board of Internal Medicine (ABIM) Foundation, and was expanded in 2012.5 Choosing Wisely currently consists of more than 70 lists of approximately 400 recommendations produced by specialty societies of frequently used medical practices or procedures that provide minimal clinical benefit to patients in most situations.5,6 Each list includes recommendations regarding judicious use of tests, services, or medications in situations in which there is likely to be low value returned to the patient.7,8
Because each society produced its own list, the processes for choosing the services included in the lists varied from society to society. The ABIM Foundation, however, reviewed the lists and provided general parameters, such as requirements for evidence-based recommendations and inclusion of only frequently performed services.8
Whereas specialty societies have long promoted practice guidelines, guideline dissemination typically occurs within the confines of peer-reviewed journals, teaching hospitals, and conferences. That is, dissemination occurs within a specific physician area of specialty. The Choosing Wisely campaign, in contrast, involves a more public process, for example, partnering with Consumer Reports and including website material specifically directed toward patients.
Such lists draw attention to low-value services, but they must be translated into measurable recommendations to assess their effect on changing behavior.5,9 A recent article called for the ABIM Foundation and third-party payers to collaborate in creating high-impact lists whereby payers could estimate the volume, effect on quality, and cost of services listed in Choosing Wisely.5
The present study examined metrics related to a range of recommendations appearing on lists released early in the Choosing Wisely campaign. The goal was to assess the frequency at which use of these “low-value” services occurred, and to determine whether, after several years, use of the low-value services occurred more or less frequently. This study is among the first to evaluate a variety of recommendations from the Choosing Wisely campaign using nationwide population-level data from commercial health plans.
This retrospective analysis used medical and pharmacy claims from Anthem-affiliated Blue Cross and Blue Shield health care plans for approximately 25 million members across the United States. All study data were kept anonymous throughout to safeguard patient confidentiality; researchers only accessed a limited data set, devoid of individual patient identifiers, in full compliance with relevant provisions of the Health Insurance Portability and Accountability Act. This nonexperimental study, conducted under the Research Exception provisions of the Privacy Rule 45 CFR 164.514(e), was thus exempt from institutional review board approval and informed consent was not required.
For this analysis, we identified 7 services from the initial Choosing Wisely lists published in April 2012.8,10 Each selected service could be analyzed via commercial insurance claims data using the International Classification of Diseases, Ninth Revision; Current Procedural Terminology codes; and, whenever appropriate, National Drug Codes and Logical Observation Identifiers Names and Codes (Table 1 and eAppendix 1 in the Supplement). The selected services also reflected the variety of services and specialty societies participating in the Choosing Wisely campaign, had been published relatively early in the campaign (allowing time for adequate dissemination), and were anticipated to be used frequently.
Four of the 7 recommendations included in the analysis related to diagnostic imaging, consistent with the proportion of Choosing Wisely recommendations pertaining to imaging5: (1) imaging tests for headache with uncomplicated conditions, (2) cardiac imaging for members without a history of cardiac conditions, (3) preoperative chest x-rays with unremarkable history and physical examination results, and (4) low back pain imaging for members without red-flag conditions. One recommendation related to age guidelines for cervical cancer screening: (5) human papillomavirus (HPV) testing for women younger than 30 years. Last, 2 recommendations related to inappropriate medication use: (6) antibiotics for acute sinusitis and (7) prescription nonsteroidal anti-inflammatory drugs (NSAIDs) for members with select chronic conditions (hypertension, heart failure, or chronic kidney disease).11
Whereas cardiac testing represented a compilation of services listed by several specialty organizations (the American Academy of Family Physicians, the American College of Cardiology, the American College of Physicians, and the American Society of Nuclear Cardiology), the other services selected were listed by 1 or 2 societies. Preoperative chest x-rays were listed by the American College of Radiology and the American College of Physicians, whereas imaging for headache was listed only by the American College of Radiology; low back pain imaging, sinusitis antibiotics, and HPV testing were listed only by the American Academy of Family Physicians; and NSAID use was listed only by the American Society of Nephrology.
The rationale for including specific services in the Choosing Wisely lists varied, although most services selected typically yield a low level of clinically relevant information, such as imaging for headache, certain cardiac imaging, preoperative chest x-rays in certain situations, and low back pain imaging (Table 1). Use of NSAIDs to treat musculoskeletal pain in those with hypertension, heart failure, or chronic kidney disease was included because these drugs may increase blood pressure (interfering with antihypertensive medications) or cause fluid retention (which may worsen kidney function).
Each of the 7 recommendations analyzed had a separate target, or study, population. The denominators and numerators for each recommendation (eAppendix 2 in the Supplement) were determined by specific criteria relevant to that recommendation (Table 1).
For 5 of the 7 recommendations analyzed, we sought to include all of the target population that met our study criteria (Table 1 and eAppendix 1 in the Supplement). For 2 measures, however, the number of patients meeting denominator criteria was too large to include the entire target population. Consequently, we used a 5% random sample for cardiac imaging without a history of cardiac conditions and an 8% random sample for prescription NSAIDs for members with select chronic conditions (hypertension, heart failure, or chronic kidney disease). We selected a random sample percentage that optimized the analysis while capturing as much of the target populations as possible.
The number and percentage of members with claims for low-value services for each recommendation were assessed quarterly for at least 10 quarters through the third quarter of 2013.8 The starting point for each measure ranged from 2010 through 2011 on the basis of data availability. The 7 Choosing Wisely lists were published in the second quarter of 2012.
The HPV testing recommendation was a population-based measure (ie, the denominator definition depended on age and sex only); consequently, a member could be counted in the denominator (or numerator) more than once over the study period, once for each quarter in which the criteria were met. The remaining recommendations were event-based measures, so a member qualified for the numerator or denominator on the basis of an event that took place in a given quarter. The same member could be counted again in a later quarter, but only if that member had a new qualifying event in that quarter. For both population- and event-based measures, members were counted only once per quarter.
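The once-per-quarter counting rule described above can be sketched as follows. This is a minimal illustration only: the record layout `(member_id, quarter, received_low_value_service)` and the function name `quarterly_counts` are hypothetical and are not taken from the study's coding specification.

```python
from collections import defaultdict


def quarterly_counts(events):
    """Count distinct members per quarter.

    `events` is an iterable of hypothetical records of the form
    (member_id, quarter, received_low_value_service). A member contributes
    at most once to a quarter's denominator and numerator, no matter how
    many qualifying records that member has in the quarter.
    """
    denominators = defaultdict(set)  # quarter -> members meeting criteria
    numerators = defaultdict(set)    # quarter -> members with the service
    for member_id, quarter, low_value in events:
        denominators[quarter].add(member_id)
        if low_value:
            numerators[quarter].add(member_id)
    # Return {quarter: (numerator, denominator)} in quarter order.
    return {
        quarter: (len(numerators[quarter]), len(members))
        for quarter, members in sorted(denominators.items())
    }


# Invented example: member "A" has two records in 2012Q1 but counts once,
# then qualifies again in 2012Q2 through a new event.
example_events = [
    ("A", "2012Q1", True),
    ("A", "2012Q1", True),
    ("B", "2012Q1", False),
    ("A", "2012Q2", False),
]
print(quarterly_counts(example_events))
```

Using sets keeps the deduplication within each quarter while still allowing the same member to reappear in a later quarter.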
The descriptive statistics (numerator, denominator, and frequency) for each recommendation are provided in eAppendix 2 in the Supplement. We assessed trend changes in each recommendation across all quarters using Poisson regression with offsets. The dependent variable was the numerator count in each quarter, and time (quarter) was the only independent variable, serving as a measure of trend effect. The log of the denominator in each quarter served as an offset. Poisson regression with offsets assumes a linear change in trends over time (on the log scale). Poisson regression with population size (denominator) offsets offers a natural way of analyzing aggregate data in the absence of patient-level data sets. Because exponentiation was applied to the estimates and confidence intervals, the trend estimates should be interpreted as exponentiated rates (rate ratios) per quarter. All analyses were conducted using SAS, version 9.4. Statistical significance was set at P < .05.
In addition to the regression analysis, we plotted the quarterly frequencies using statistical process control charts. These charts are often used as a tool for continuous improvement or quality control in monitoring process measures. In this case, quarterly data (ie, frequency of low-value services from the 7 recommendations) were plotted along with 3 reference lines (Figure). The raw mean of frequencies across all quarters for each recommendation was graphed as the “average” reference line. The standard deviation of frequencies across all quarters for each recommendation was calculated and used to define upper and lower limits (eAppendix 2 in the Supplement).
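The model described above, with quarterly numerator counts as the outcome, quarter as the only covariate, and the log of the denominator as an offset, can be sketched in a few lines. The study itself used SAS; the Python below is only an illustrative stand-in that fits the same kind of model by Newton-Raphson using the standard library. The function names, the 1.96 multiplier for a 95% Wald interval, and the example data are all assumptions made for illustration, not the authors' code or results.

```python
import math


def _score_and_information(b0, b1, numerators, denominators):
    """Score vector and information matrix entries at (b0, b1)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for t, (y, n) in enumerate(zip(numerators, denominators)):
        mu = n * math.exp(b0 + b1 * t)  # expected count; log(n) is the offset
        g0 += y - mu
        g1 += t * (y - mu)
        h00 += mu
        h01 += t * mu
        h11 += t * t * mu
    return g0, g1, h00, h01, h11


def fit_quarterly_trend(numerators, denominators, max_iter=50, tol=1e-12):
    """Fit log E[y_t] = log(n_t) + b0 + b1 * t, with t = 0, 1, 2, ... (quarters)."""
    b0 = math.log(sum(numerators) / sum(denominators))  # overall log rate
    b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_and_information(
            b0, b1, numerators, denominators)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard error of b1 from the inverse information matrix.
    _, _, h00, h01, h11 = _score_and_information(
        b0, b1, numerators, denominators)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = 1.96  # assumed multiplier for a 95% Wald interval
    return {
        "rate_ratio": math.exp(b1),
        "ci_low": math.exp(b1 - z * se_b1),
        "ci_high": math.exp(b1 + z * se_b1),
    }


# Invented data: 11 quarters, a fixed denominator, and a rate that falls by
# 1% per quarter (counts are left unrounded to keep the example exact).
denominators = [100_000] * 11
numerators = [n * 0.15 * 0.99 ** t for t, n in enumerate(denominators)]
fit = fit_quarterly_trend(numerators, denominators)
print({name: round(value, 4) for name, value in fit.items()})
```

Because the invented counts follow the model exactly, the fitted rate ratio recovers the 0.99 used to generate them. A rate ratio below 1 corresponds to a declining trend per quarter and a value above 1 to an increasing trend.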
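The three reference lines described above can be computed as in the sketch below. The text does not state how many standard deviations separate the limits from the mean, or whether a sample or population standard deviation was used, so the multiplier `k` (defaulting to 1) and the use of the sample standard deviation are assumptions, as are the function names and the example values.

```python
import statistics


def control_limits(frequencies, k=1.0):
    """Return (lower, center, upper) reference lines for quarterly frequencies.

    `k` is an assumed multiplier on the standard deviation; the source does
    not specify the value used in the study.
    """
    center = statistics.mean(frequencies)
    spread = statistics.stdev(frequencies)  # sample SD (an assumption)
    return center - k * spread, center, center + k * spread


def quarters_outside_limits(frequencies, k=1.0):
    """Indices of quarters whose frequency falls outside the limits."""
    lower, _, upper = control_limits(frequencies, k)
    return [i for i, f in enumerate(frequencies) if f < lower or f > upper]


# Invented quarterly percentages, not study data.
example = [10.0, 10.0, 10.0, 10.0, 20.0]
print(control_limits(example))
print(quarters_outside_limits(example))
```

A run of recent quarters falling outside one limit, as reported for the 2013 quarters in the Figure, is the kind of pattern this check is meant to surface.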
During the study, the percentage of members with imaging for headaches with uncomplicated conditions decreased from 14.9% to 13.4%, a relative reduction of 10.1%. On the basis of Poisson regression, the trend effect was decreasing to a statistically significant extent (trend estimate, 0.99 [95% CI, 0.98-0.99]; P < .001) (Table 2). The percentage of members with cardiac imaging in the absence of cardiac disease (using a 5% random sample) decreased from 10.8% to 9.7%, a 10.2% relative reduction. The Poisson trend effect was decreasing to a statistically significant extent (trend estimate, 0.99 [95% CI, 0.99-0.99]; P < .001).
In contrast, the percentage of preoperative chest x-rays and low back pain imaging in the absence of red-flag conditions remained stable although high: the percentage for preoperative chest x-rays began at 91.3% and ended at 91.5%, a relative change of 0.2%; the percentage for low back pain imaging began and ended at 53.7%. There was no Poisson trend effect for preoperative chest x-rays (trend estimate, 1.00 [95% CI, 1.00-1.00]; P = .70) and low back pain imaging (trend estimate, 1.00 [95% CI, 1.00-1.00]; P = .71). Use of antibiotics for sinusitis decreased slightly, from 84.5% to 83.7%, a relative reduction of 0.9%. The Poisson trend effect was not statistically significant (trend estimate, 1.00 [95% CI, 1.00-1.00]; P = .16). Sinusitis antibiotic use exhibited minor seasonality that did not alter the overall stable trend.
The percentage of women younger than 30 years receiving HPV testing increased from 4.8% to 6.0%, a 25.0% relative increase. The Poisson trend effect was increasing to a statistically significant extent (trend estimate, 1.01 [95% CI, 1.00-1.01]; P < .001). Last, the use of prescription NSAIDs by members with select chronic conditions increased from 14.4% to 16.2%, a 12.5% relative increase. The Poisson trend effect was increasing to a statistically significant extent (trend estimate, 1.02 [95% CI, 1.01-1.02]; P < .001).
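The relative changes quoted above are consistent with the usual definition, (end − start) / start, applied to the first and last quarterly percentages. The short check below assumes that definition (the text does not spell out the formula) and uses only the starting and ending percentages reported in this section; the function name is illustrative.

```python
def relative_change(start, end):
    """Percent change from start to end, relative to the starting value."""
    return (end - start) / start * 100.0


# Starting and ending percentages as reported in the text.
reported = {
    "Imaging for headache": (14.9, 13.4),
    "Cardiac imaging": (10.8, 9.7),
    "Preoperative chest x-rays": (91.3, 91.5),
    "Antibiotics for sinusitis": (84.5, 83.7),
    "HPV testing, women <30 y": (4.8, 6.0),
    "NSAIDs, select conditions": (14.4, 16.2),
}

for service, (start, end) in reported.items():
    print(f"{service}: {relative_change(start, end):+.1f}%")
```

Note the distinction between the relative change and the absolute change in percentage points: HPV testing rose by 1.2 percentage points, which is a 25.0% relative increase because the starting level was low.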
The statistical process control charts shown in the Figure support the general findings from the regression analysis in terms of trend directionality. For example, imaging for headache showed a modestly consistent decline with all 3 quarterly measures for 2013 falling below the lower limit. In contrast, use of NSAIDs by members with select chronic conditions showed an increasing trend in which all 3 quarterly measures for 2013 were at or above the upper limit.
The Choosing Wisely campaign represents an innovative approach toward changing physician and patient attitudes, via the greater emphasis on including the patient in the conversation. However, the campaign currently lacks a structural mechanism to evaluate its influence on clinical practice or to gauge whether additional efforts are necessary.5,8,9,12 Accessing claims data allowed an early evaluation of Choosing Wisely recommendations on low-value services based on nationwide population-level data. Prior studies on Choosing Wisely recommendations either focused on 1 or 2 recommendations13 or consolidated different recommendations into a single summary measure.14 To our knowledge, this study encompassed the largest number of patients studied to date and was the first to separately analyze a number of recommendations.
For the 7 low-value services analyzed, trend changes were modest but showed a desirable decrease for 2 recommendations (imaging for headache, cardiac imaging for low-risk patients). These results were statistically significant, which is unsurprising given the large sample size. The effect sizes were marginal, however, and may not represent clinically significant changes. Increasing trends were observed for HPV testing in women younger than 30 years and in 1 list focused on medication use (NSAIDs for people with certain chronic conditions), an undesirable trend in light of the Choosing Wisely recommendations. There was no significant change in the remaining 3 metrics: antibiotic use for sinusitis, preoperative chest x-rays, and low back pain imaging in the absence of red-flag conditions. Although the trend changes were statistically significant for 4 of the 7 lists analyzed, the clinical significance is uncertain. Our mixed results highlight the need for interventions beyond the current level of promotion, such as data feedback, physician communication training, systems interventions (eg, clinical decision support in electronic medical records), clinician scorecards, patient-focused strategies, and financial incentives.15
Although HPV testing was used by fewer than 10% of the members in the target population, other metrics in this study showed much higher use. Antibiotics for sinusitis were used by 8 of 10 members in the target population, preoperative chest x-rays were performed in 9 of 10 members of the target population, and low back pain imaging was performed in 5 of 10 members of the target population. The fact that some procedures or treatments were used quite often (eg, preoperative chest x-rays) while others were relatively rare (eg, HPV testing) highlights the importance of establishing frequency when creating recommendations and priorities. We initially looked at 2 additional recommendations (Papanicolaou tests for women younger than 21 years or with hysterectomy for noncancer disease and sinusitis imaging for acute mild-to-moderate sinusitis or uncomplicated acute rhinitis) but found them to be used so infrequently as to be unsuitable for trend analysis. In fact, a recent publication showing that preoperative cardiac stress testing was infrequently used in the target population before Choosing Wisely lists were published also highlighted the need for specialty societies to focus Choosing Wisely lists on procedures with a high baseline frequency of use.13
Frequency of the most highly used services either remained stable (preoperative chest x-rays and low back pain imaging) or decreased slightly (sinusitis antibiotics) during the study period, which underscores the view that simple publication of recommendations—such as the Choosing Wisely lists—is insufficient to produce major changes to practice.16 The continued evaluation of low-value services will allow payers to use validated lists to inform quality improvement programs, as well as coverage, payment, and utilization management decisions.
This analysis did not examine patient-level risk differences or regional variations. Because our population-level data analysis could not account for confounding, the potential effects of individual or clinician-level variations could be the focus of future studies. By examining each recommendation separately, this analysis preserved the effects of factors such as payment incentives, malpractice concerns, and patient demand on each specific recommendation.16 It is possible that assessing the local implementation of Choosing Wisely recommendations at the delivery system level could be a fruitful area for future studies.
Our study has several limitations. Our analysis is based on administrative claims data that do not adequately capture the clinical circumstances that led to ordering a service, which may be especially important for recommendations on NSAIDs. Therefore, the recommendations that we defined may include some services that might be appropriate for an individual patient. In addition, claims data are subject to coding inaccuracies in terms of diagnostic designation and the reported timing of services. Because our trend analysis was based on quarterly data, we may have excluded some patients because workup and treatments could span more than 1 calendar quarter. However, as a population-level analysis, this will probably not affect the modest findings in this study.
It is likely that several factors in addition to the Choosing Wisely campaign may have been responsible for changes or lack of changes in frequency of low-value services. The first is secular trends and preexisting patterns. For example, during the study period, guidelines from different organizations regarding Papanicolaou testing in young women converged around the notion that those younger than 21 years should not get Papanicolaou tests. This might explain the relatively rare use of Papanicolaou tests. At the same time, HPV testing became more widely available in the marketplace and its use increased over time.
In fact, the changes or lack of changes in frequency of use of low-value services after publication of Choosing Wisely recommendations could be affected by changes in the nature and size of the target population over time, or in criteria used for the underlying clinical diagnoses. Therefore, creating and validating measures for Choosing Wisely recommendations is another area for future studies.
This analysis of 7 Choosing Wisely recommendations provides a starting point for further evaluation of the influence of the initiative on changing behavior by analyzing changes in volume of these services from the early years of the initiative. Our population-level analysis showed both desirable and undesirable modest trends in use of low-value services. The relatively small use changes suggest that additional interventions are necessary for wider implementation of Choosing Wisely recommendations in general practice. Some of the additional interventions needed include data feedback, physician communication training, systems interventions (eg, clinical decision support in electronic medical records), clinician scorecards, patient-focused strategies, and financial incentives.
Accepted for Publication: July 7, 2015.
Corresponding Author: Abiy Agiro, PhD, HealthCore Inc, 123 Justison St, Ste 200, Wilmington, DE 19801 (email@example.com).
Published Online: October 12, 2015. doi:10.1001/jamainternmed.2015.5441.
Author Contributions: Dr Rosenberg had full access to all the data in the study and takes responsibility for the integrity and accuracy of the data analysis.
Study concept and design: Rosenberg, Gottlieb, Barron, Brady.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Agiro, Brady, DeVries.
Critical revision of the manuscript for important intellectual content: Rosenberg, Agiro, Gottlieb, Barron, Liu, Li, DeVries.
Statistical analysis: Agiro.
Obtained funding: DeVries.
Administrative, technical, or material support: Rosenberg, Agiro, Brady, Liu, Li.
Study supervision: Rosenberg, Gottlieb, Barron, DeVries.
Conflict of Interest Disclosures: Drs Agiro, Barron, and DeVries are employees of HealthCore, which is a wholly owned Anthem subsidiary. Dr Rosenberg, Messrs Gottlieb and Brady, and Mss Liu and Li are employees of Anthem. All authors have stock or other ownership with Anthem. No other disclosures are reported.
Funding/Support: Research support was provided by Anthem.
Role of the Funder/Sponsors: Employees of Anthem had a role in all aspects of the study, including the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The interpretation and reporting of these data are the sole responsibility of the authors.
Additional Contributions: The authors gratefully acknowledge the contributions of Thomas Wasser, PhD, MEd, CIM, senior scientist for biostatistics, HealthCore, Inc, for reviewing the statistical and design methods used in this study. The authors also acknowledge Cheryl Jones, senior medical writer, HealthCore, Inc, for editorial assistance. The authors gratefully acknowledge the following employees of Anthem, Inc, for the design and critical review of measures as well as oversight and technical assistance: Geoffrey B. Crawford, MD, MS; Jeffrey Clyman, MD; Weihong Huang, MD, MS; David Wetzel, PharmD; Jevon Mitsuoka, PharmD; David S. Chiou, BA; and Geraldine Nojadera, BS, CCS. None were compensated beyond their salary for their contribution to the study.
Correction: This article was corrected for an error in the abstract Results on October 19, 2015.