Sheridan SL, Sutkowi-Hemstreet A, Barclay C, Brewer NT, Dolor RJ, Gizlice Z, Lewis CL, Reuland DS, Golin CE, Kistler CE, Vu M, Harris R. A Comparative Effectiveness Trial of Alternate Formats for Presenting Benefits and Harms Information for Low-Value Screening Services: A Randomized Clinical Trial. JAMA Intern Med. 2016;176(1):31-41. doi:10.1001/jamainternmed.2015.7339
Importance
Healthcare overuse, the delivery of low-value services, is increasingly recognized as a critical problem. However, little is known about the comparative effectiveness of alternate formats for presenting benefits and harms information to patients as a strategy to reduce overuse.
Objective
To examine the effect of different benefits and harms presentations on patients’ intentions to accept low-value or potentially low-value screening services (prostate cancer screening in men ages 50-69 years; osteoporosis screening in low-risk women ages 50-64 years; or colorectal cancer screening in men and women ages 76-85 years).
Design, Setting, and Participants
Randomized clinical trial of 775 individuals eligible to receive information about any 1 of the 3 screening services and scheduled for a visit with their clinician. Participants were randomized to 1 of 4 intervention arms that differed in terms of presentation format: words, numbers, numbers plus narrative, and numbers plus framed presentation. The trial was conducted from September 2012 to June 2014 at 2 family medicine and 2 internal medicine practices affiliated with the Duke Primary Care Research Consortium. The data were analyzed between May and September of 2015.
Interventions
One-page evidence-based decision support sheets on each of the 3 screening services, with benefits and harms information presented in 1 of 4 formats: words, numbers, numbers plus narratives, or numbers plus a framed presentation.
Main Outcomes and Measures
The primary outcome was change in intention to accept screening (on a response scale from 1 to 5). Our secondary outcomes included general and disease-specific knowledge, perceived risk and consequences of disease, screening attitudes, perceived net benefit of screening, values clarity, and self-efficacy for screening.
Results
We enrolled and randomly allocated 775 individuals, aged 50 to 85 years, to 1 of 4 intervention arms: 195 to words, 192 to numbers, 196 to narrative, and 192 to framed formats. Intentions to accept screening were high before the intervention and change in intentions did not differ across intervention arms (words, −0.07; numbers, −0.05; numbers plus narrative, −0.12; numbers plus framed presentation, −0.02; P = .57 for all comparisons). Change in other outcomes also showed no difference across intervention arms. Results were similar when stratified by screening service.
Conclusions and Relevance
Single, brief, written decision support interventions, such as the ones in this study, are unlikely to be sufficient to change intentions for screening. Alternate and additional interventions are needed to reduce overused screening services.
Trial Registration
clinicaltrials.gov Identifier: NCT01694784
Delivery of low-value services, also known as health care overuse, is a critical problem for US health care. Low-value services are those in which the degree of benefit does not justify the harms and costs.1 The United States spends an estimated $192 billion annually on delivery of such services.2 This results in physical, psychological, and financial harms; hassle; and opportunity costs without potential benefit for patients.3
However, the best ways to address health care overuse are unknown. It is unclear whether the same strategies that increase use of effective health care services work equally well to reduce overuse of health care services.4 Furthermore, little comparative effectiveness research is available to guide prevention and deimplementation of strategies targeted at overuse.5,6
Any comprehensive strategy to address health care overuse must include messages for patients.6 While the message might vary based on several factors, experts agree that key components to inform patients about health care services include the likelihood of acquiring and dying from the targeted disease; the benefits and harms of the health care service; and encouragement to make a decision about whether to have the service based on the patient’s individual preferences and values.7
The effectiveness of such messages is likely to be influenced by many factors, including their presentation format. Convincing literature8 suggests that words describing probability do not have shared meaning and that the format of numbers has substantial impact on understanding and behavioral intention.9,10 Furthermore, different narratives and framed presentation formats differentially affect these same outcomes.11-13 However, to our knowledge, no studies have compared the effectiveness of these presentation formats in reducing intentions or behavior for low-value or potentially low-value screening services.
Therefore, our goal was to examine the comparative effectiveness of 4 alternate formats for presenting benefits and harms information in reducing intentions for screening and changing secondary behavioral and decision-making outcomes for patients eligible for 1 of 3 low-value or potentially low-value screening services.
We conducted a randomized clinical trial in a convenience sample of 4 community-based practices (2 family medicine, 2 internal medicine), with 26 810 patients and 32 clinicians, that were affiliated with the Duke Primary Care Research Consortium (PCRC). The institutional review boards at the University of North Carolina and Duke University approved the study, and a data safety monitoring committee monitored the study. The trial protocol is provided in Supplement 1.
From weekly practice patient lists, study staff recruited a consecutive sample of individuals, ages 50 to 85 years, who received continuing care for more than 1 year at any of the participating PCRC sites, had an upcoming medical visit with their clinician, and were eligible to receive information about any 1 of the 3 screening services of interest. These services had either net harm at the population level (ie, prostate cancer screening in men ages 50-69 years) or small net benefit (making it likely that some individuals would benefit and some be harmed; ie, osteoporosis screening in low-risk women ages 50-64 years or colorectal cancer screening in men and women ages 76-85 years). We chose these services because they represent a spectrum of net benefit, encompass a variety of test types (a laboratory test, a procedural test, and a radiographic study), and are applied to a spectrum of adult patients. Because age groups did not overlap, each patient was eligible for only 1 service. Although we initially planned to include cardiovascular screening in low-risk men and women as an additional service, we dropped this service before recruitment given low numbers of eligible patients at study sites. Additional inclusion and exclusion criteria are published elsewhere.14
We identified eligible patients for each service through electronic clinical records and consecutively sampled them until we reached target enrollment (n = 775). Patients provided written informed consent and were given $35 in compensation for participating. We stratified recruitment by site and service and attempted to recruit at least 25% of the sample with no prior screening for each service (although this was not always possible). We mailed letters to all potentially eligible patients and then followed up with up to 3 telephone calls to reach the patient and verify eligibility.
Staff invited eligible participants to a study visit before or after a regularly scheduled physician’s appointment or at a separate time if necessary. At the study visit, patients gave informed consent, completed a preintervention survey, and were assigned using central computerized randomization to 1 of 4 intervention arms for a single screening service. Randomization was stratified by site, screening service, and prior screening with allocation concealed from staff in a computerized database until after baseline survey completion. Patients were told only that they were participating in a study about how to best communicate with patients about screening. Patients read their 1 assigned evidence-based decision support sheet and completed postintervention surveys but did not discuss the materials with their clinicians.
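The stratified allocation described above can be illustrated with a minimal sketch. The article states only that randomization was central, computerized, and stratified by site, screening service, and prior screening; the permuted-block scheme, the block size of 4, the seed, and every name below are assumptions made for illustration, not the trial's actual procedure.

```python
import random
from collections import defaultdict

# The 4 intervention arms named in the article.
ARMS = ["words", "numbers", "narrative", "framed"]


class StratifiedRandomizer:
    """Permuted-block allocation within each stratum (hypothetical scheme)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        # One list of pending assignments per stratum.
        self._blocks = defaultdict(list)

    def assign(self, site, service, prior_screening):
        """Return the next arm for a participant in the given stratum."""
        stratum = (site, service, prior_screening)
        if not self._blocks[stratum]:
            # Start a fresh block containing each arm once, in random order.
            block = ARMS.copy()
            self._rng.shuffle(block)
            self._blocks[stratum] = block
        return self._blocks[stratum].pop()
```

Under this scheme, every 4 consecutive participants in the same stratum receive each of the 4 formats exactly once, which keeps the arms balanced within site, service, and prior-screening groups.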
The intervention consisted of 1-page, written, evidence-based decision support sheets for each of the screening services. These were based on US Preventive Services Task Force (USPSTF) recommendations,15-18 were developed through an expert consensus process (that included 1 former USPSTF member), and were written at an eighth-grade reading level. Each sheet included a description of the disease for which screening could be undertaken (including disease incidence and mortality rates as derived from population estimates19,20), a description of the screening test and its benefits (primarily disease-specific mortality reduction), and harms (including physical and psychological harms across the screening cascade),14 and encouragement to make a decision. We represented overdiagnosis only indirectly by showing that incident disease rates exceed clinically important outcomes (see eTable 1 in Supplement 2) and, for prostate cancer only, overtreatment as a physical harm in which participants received unnecessary treatment for harmless disease. Decision support sheets within each screening service were identical, except that we provided information on the benefits and harms of screening in 1 of the 4 alternate presentations: words, numbers, numbers plus narrative (narratives), or numbers plus a framed presentation (framed) (see eFigures 1-16 in Supplement 2). We chose these formats to represent presentations of information that might affect intentions for accepting screening based on behavioral, communication, and economic theory21-23 and other literature.10,11
The words presentation format used ordered descriptions for probabilities (eg, “many,” “few,” or “very few” people affected). The numbers format presented probabilities in a frequency format with a common denominator (x/1000) in text and a separate “facts box.” Benefits were presented among those screened, and harms were presented among those treated, which may have presented a slightly more negative view of screening than if harms had been presented among those screened. The narrative format built on an approach that we previously used24 and was designed to engage readers, inform them about the screening decision, and model the process of decision-making.13 Narratives included (1) photographs of 4 racially diverse patients in the screening age group, and (2) text that showed individuals investigating the facts. The framed format was designed to dissuade screening. It used a gain frame to promote risk aversion and discourage screening instead of a loss frame to promote risk-seeking in the face of the risky option of screening to detect disease. The format presented the benefits of not screening (ie, harms avoided by not screening) rather than the harms of screening, which were presented in the other study arms. We tested the quantitative vignette during a linked qualitative study, revising it based on participant feedback. We then created the other intervention formats and tested them among administrative staff.
Study outcomes were intention to accept screening (primary outcome) and multiple decision-making and behavioral theory–related outcomes (secondary outcomes).
Intention is a measurable antecedent to behavior25 and explains as much as 30% of variance in health behavior.26 The survey measured intention to accept screening by assessing patients’ plans to be screened for the service for which they were eligible during the usually recommended screening interval (1 year for prostate cancer screening, 5 years for osteoporosis screening, and 10 years for colon cancer screening). The 5-point response scale ranged from strongly disagree (coded as 1) to strongly agree (5).
The survey assessed knowledge of key screening concepts using 8 items created by the study team that addressed: the definition of screening, false-positive results, false-negative results, overdiagnosis (3 questions regarding harmless disease and the need to live long enough and have effective treatments in order to benefit from screening), overtreatment, and the potential for harm. Response options were true, false, and don’t know, with scoring based on total number of correct responses, from 0 to 8.
The survey assessed disease-specific knowledge using 2 items for each service that we adapted from prior work27,28 or, in the case of osteoporosis screening, we developed. For prostate screening, items were (1) “Some men can live long, normal lives with untreated prostate cancer”; and (2) “Problems with sexual function and urination are common side effects of prostate cancer treatments.” For colorectal cancer screening, items were (1) “Most polyps in the bowel never become cancer”; and (2) “Bleeding and tears in the bowel are complications of a colonoscopy.” For osteoporosis screening, items were (1) “Broken hip bones are uncommon before the age of 65”; and (2) “Treatments for osteoporosis can sometimes result in bone damage.” Response options were true, false, and don’t know, with scoring based on total number of correct responses, from 0 to 2.
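The scoring rule shared by both knowledge measures (total number of correct responses, with "don't know" never counted as correct) can be sketched as follows. The item identifiers and the answer key are placeholders; the article does not publish the key.

```python
def knowledge_score(responses, answer_key):
    """Count items answered correctly.

    responses:  dict mapping item id -> "true", "false", or "don't know"
    answer_key: dict mapping item id -> "true" or "false"
    Missing items and "don't know" responses score 0.
    """
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item) == correct
    )


# Hypothetical key for the 8 general screening items (placeholder answers).
GENERAL_KEY = {f"item_{i}": "true" for i in range(1, 9)}
```

With an 8-item key the score ranges from 0 to 8; with a 2-item disease-specific key it ranges from 0 to 2, as described above.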
The survey assessed patients’ perceived risk of disease with a single question: “How likely is it that you will get disease x in the next 10 years?” The 4-point response scale ranged from not at all likely (coded as 1) to very likely (coded as 4).
The survey assessed perceived severity of the disease using the Lay Perceptions of Serious Illnesses Scale that had 4 items assessing perceptions that the “disease is very serious,” “has serious financial consequences,” “affects the way the person sees himself as a person,” and “causes difficulties for those close to the patient” (α = .66).29 The 5-point response scale ranged from strongly disagree (coded as 1) to strongly agree (5). For this and the remaining multi-item measures, we averaged items to create a scale except as noted.
The survey measured disease-specific screening attitudes using 6 items developed by investigators that assessed agreement that screening is a good idea in a healthy person of the patient’s age, not a special responsibility (reverse coded), associated with little harm, owed to one’s family or physician, and would be regretted if not done. The 5-point response scale ranged from strongly disagree (coded as 1) to strongly agree (5).
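As an illustration of how a multi-item measure with a reverse-coded item could be averaged into a scale score, consider the sketch below. The item names are hypothetical, and the reversal formula (low + high − value) is the conventional one for a 1-to-5 scale rather than a detail taken from the article.

```python
def reverse_code(value, low=1, high=5):
    """Flip a response on a low..high scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return low + high - value


def scale_mean(item_scores, reverse_items=()):
    """Average item responses after flipping any reverse-coded items.

    item_scores:   dict mapping item name -> response coded 1..5
    reverse_items: names of items that must be reverse coded first
    """
    adjusted = [
        reverse_code(score) if name in reverse_items else score
        for name, score in item_scores.items()
    ]
    return sum(adjusted) / len(adjusted)
```

For example, `scale_mean({"good_idea": 5, "not_special_responsibility": 1, "little_harm": 5}, reverse_items={"not_special_responsibility"})` treats the reverse-coded 1 as a 5 before averaging.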
We assessed perceived net benefit using a single question on decisional balance adapted from our previous work.27 Respondents were instructed to think about how they felt at that moment about the decision to accept the screening in question. The 5-point response scale ranged from the harms greatly outweigh the benefits (coded as 1) to the benefits greatly outweigh the harms (5).
We assessed values clarity using the Values Clarity Subscale of the Decisional Conflict Scale that had 3 items that assessed whether patients agreed that they are clear about which benefits and harms matter most and whether benefits or harms are most important (α > .78). The 5-point response scale ranged from strongly agree (coded as 0) to strongly disagree (100), so that lower scores indicated greater clarity about personal values.
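A minimal sketch of the 0-to-100 coding follows. The article gives only the endpoints (strongly agree = 0, strongly disagree = 100); the evenly spaced intermediate codes and the intermediate response labels are assumptions.

```python
# Assumed evenly spaced codes between the two endpoints given in the text.
RESPONSE_CODES = {
    "strongly agree": 0,
    "agree": 25,
    "neither agree nor disagree": 50,
    "disagree": 75,
    "strongly disagree": 100,
}


def values_clarity_score(responses):
    """Average the coded responses; lower scores mean greater values clarity."""
    codes = [RESPONSE_CODES[response] for response in responses]
    return sum(codes) / len(codes)
```

A respondent answering "strongly agree", "agree", and "neither agree nor disagree" to the 3 items would score 25 under these assumed codes.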
We measured self-efficacy for screening using a single item for each type of screening that read “How confident are you that you could get screened for disease x if you wanted to?” The 5-point response scale ranged from not at all confident (coded as 1) to very confident (5).
To support exploratory analysis of moderators of the intervention’s effect, we measured variables related to the ability and motivation to process information. These included education; numeracy (3 items, reported in aggregate as percentage correct, “Imagine that we flip a fair coin 1000 times. What is your best guess about how many times the coin would come up heads in 1000 flips?”; “In the lottery, the chance of winning the prize is 1%. If 1000 people each buy a single ticket to the lottery, how many people would win the prize?”; “In a publisher’s sweepstakes, the chance of winning a car is 1 in 1000. What percent of tickets in this sweepstakes win a car?”)30; need for cognition (3 items with the highest item-total correlations from an 18-item scale: “thinking is not my idea of fun”; “I would rather do something that requires little thought than something that is sure to challenge my thinking abilities”; and “I find satisfaction in deliberating hard and for long hours”)31; anticipated regret of not screening (1 item, reported on a 5-point scale from strongly disagree to strongly agree, “I would regret opting not to get screened if I later tested positive for disease.”)32; worry (3 items from the Illness Attitudes Scale, with responses on a 5-point scale from “no” to “most of the time” and summed from 0 to 12: “Do you worry about your health?”; “Are you worried you might get a serious illness in the future?”; “Does the thought of serious illness scare you?”)33; prior screening noted on medical chart review; and prior information (5 investigator-developed items with a common stem reported on 5-point scales from “never” to “always” and summed for a range of 1-25; items addressed exposure to screening information through newspapers, magazines, the Internet, television, or friends and family). We also assessed demographics.
We monitored potential harms of our intervention by assessing increases in illness-related worry between preintervention and postintervention.33
As important characteristics of the sample, we also measured general screening attitudes using 11 items, in 2 subscales, developed and validated as part of this study rather than used as an outcome as originally planned (Jessica DeFrank, PhD, email communication, May 5, 2015). These items were about the perceived benefits of screening (α = .82) and feelings of duty or obligation to screen (α = .84) and were correlated with intention for screening (r = 0.25 and r = 0.35, respectively).
We calculated our sample sizes to be able to detect a mean difference in pre-post changes in intention to accept screening of at least 0.5 points across intervention groups overall and within specific screening services. Based on prior work, we considered this 0.5 point difference the minimally clinically important difference in intention to accept screening services; this difference corresponds to a 21% reduction in screening intention.34 Assuming 2-sided t tests with α = .001 and a standard deviation of change of 1,35 we calculated that we would need 184 participants in each of the 4 intervention arms to give 95% power to detect this difference. This sample size provided about 80% power to detect a 0.5-point mean change in intention in subgroup analyses of the 3 screening services.
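For readers who want to see how a calculation of this kind is structured, the sketch below applies the standard normal-approximation formula for comparing 2 independent means, n = 2[(z(1 − α/2) + z(power)) × SD / difference]². This is an illustrative approximation only: the exact method and software behind the reported figure are not described in the text, and the approximation is not expected to reproduce that figure exactly. The function and parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, alpha, power):
    """Approximate participants per arm for a 2-sided, 2-sample comparison.

    delta: smallest mean difference to detect
    sd:    standard deviation of the outcome (here, of pre-post change)
    alpha: 2-sided type I error rate
    power: desired probability of detecting delta
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)
```

With the inputs stated above, `n_per_arm(delta=0.5, sd=1.0, alpha=0.001, power=0.95)` gives 195 under this approximation, which is of the same order as the 184 per arm reported but not identical to it.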
We summarized sample characteristics using descriptive statistics. Per a priori plans, we tested the effectiveness of intervention formats on change in intention to accept screening in our overall sample, and subsequently in subgroups of screening services. We conducted analysis of covariance that included format as the key variable of interest, baseline intent for screening, and other covariates that differed among intervention arms. We compared several approaches for assessing change and found that all produced similar effects.36,37 Similar analyses were conducted for secondary outcomes (using logistic regressions for binary outcomes). Per a priori plans, we first tested the difference between all study arms using an omnibus F test. If this test result was negative after accounting for multiple comparisons (in which we considered P <.001 significant), we did not pursue additional statistical testing between intervention arms. To examine potential moderators of the impact of the interventions on our primary outcome, we visually depicted moderation and added interaction terms to the models. To examine pre-post changes in primary and secondary outcomes, we used paired t tests for continuous outcomes and McNemar χ2 tests for binary ones. Within subgroups of screening services, we repeated similar analyses (finding similar effects, thereby supporting our decision to combine analyses).
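The 2 pre-post tests named above can be sketched with their textbook test statistics. This is a standard-library illustration, not the authors' analysis code; the function names are hypothetical, and the McNemar statistic is shown without a continuity correction because the text does not say whether one was applied. Converting either statistic to a P value requires the t or χ2 distribution, which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """t statistic for the mean within-person change (post minus pre).

    Degrees of freedom are len(pre) - 1.
    """
    diffs = [after - before for before, after in zip(pre, post)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


def mcnemar_chi2(changed_to_yes, changed_to_no):
    """McNemar chi-square (1 df) from the 2 discordant-pair counts.

    changed_to_yes: participants who moved from "no" to "yes"
    changed_to_no:  participants who moved from "yes" to "no"
    No continuity correction is applied (an assumption of this sketch).
    """
    discordant = changed_to_yes + changed_to_no
    return (changed_to_yes - changed_to_no) ** 2 / discordant
```

Only participants whose binary response changed between surveys contribute to the McNemar statistic, which is why concordant pairs do not appear in the function.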
We enrolled and randomized 775 patients to the 4 intervention arms: words (n = 195), numbers (n = 192), narrative (n = 196), or framed (n = 192) formats (Figure 1). Within each intervention arm, patients were distributed evenly across the 3 screening services. Baseline characteristics were mostly well-balanced, although those in the framed format arm were slightly more educated (Table 1). Baseline characteristics were less well balanced in subgroups (eTables 2-4 in Supplement 2).
The intervention arms had high intention to accept screening at baseline (words, 3.56; numbers, 3.71; narrative, 3.66; framed, 3.53 out of 5.00; Table 2). The 4 intervention arms did not differ in change in screening intentions (adjusted P = .57). In analyses within each arm, the narrative format had lower intention to accept screening at postintervention compared with preintervention (−0.12; 95% CI, −0.22 to −0.02 on a 5-point scale), but other intervention arms had no changes from baseline. In separate subgroup analyses for the 3 screening services, we observed a similar pattern of findings (see Table 2 and eTables 5-7 in Supplement 2).
Overall and subgroup analyses found no statistically significant differences across intervention arms in change for any of the secondary study outcomes after accounting for multiple comparisons. However, within intervention arms, some secondary outcomes improved from baseline (eg, screening knowledge, screening attitudes, and perceived net benefit of screening; see Table 2 and eTables 5-7 in Supplement 2).
In the overall sample, change in intention for screening did not differ across study arms for subgroups of patients defined by ability, motivation, or demographics (Figure 2).
We observed no evidence of increase in postintervention illness-related worry.
In a randomized clinical trial of 4 formats for presenting benefits and harms of 3 low-value or potentially low-value screening services, we found no differences in change in intention for screening across intervention arms. Furthermore, while secondary outcomes showed small improvements from baseline, none of these changes were sufficient to change intentions to accept screening. There were no clinically important differences in subgroups of patients defined by ability, motivation, or demographics and no evidence of harm from the interventions.
Our findings are consistent with those of systematic reviews38 showing that patient decision aids produce increases in screening knowledge and improve other decision-making outcomes; however, they also suggest that single, brief, written decision support sheets, such as those used in this trial, are unlikely to be sufficient to change intention for low-value or potentially low-value screening services, regardless of their format. Decisions about screening are driven by a complex interplay of attitudes, social norms, and self-efficacy, many of which often strongly favor screening. Furthermore, many decision-makers rely on emotions and heuristic decision-making, rather than the rational processes involved in weighing harms and benefits, and are subject to a host of cognitive biases that make forgoing health care services difficult.14,39,40 This suggests that either more intensive interventions or new approaches will be needed. More intensive decision support interventions for prostate cancer screening have been shown to reduce screening intentions and behavior (one by 22% at 9 months),24,41 likely through their inclusion of more detailed information and additional components such as modeling self-discovery about harms. Such interventions may have an important role in reducing screening intentions for low-value services. However, even more intensive interventions may not be enough.
Rather than simply intensifying current clinical interventions, effective approaches to reducing overuse of low-value services may need to take a comprehensive approach. The most successful campaigns have targeted multiple levels of the public health pyramid.6 It may be that prevention and deimplementation of low-value care will require combinations of interventions such as (1) patient and clinician engagement through campaigns, like Choosing Wisely; (2) aligned recommendations and incentives42,43; (3) committed leaders and champions; (4) the time and space for change; (5) system-level supports42,44-47; and (6) more intensive clinician and patient decision support than the 1-page written decision support sheets provided in this study.24,48 Possible adjunctive decision-making interventions include highlighting the financial and opportunity costs of screening, emphasizing the potential harms of overdiagnosis and overtreatment,49 increasing the salience of harms through video or other media, and testing appeals to peripheral cues that are persuasive to those who do not centrally process benefits and harms information.22,40,46
In interpreting our results, readers should consider the limitations of our study. First, we tested decision support sheets for only 3 screening services. Similar interventions for other screening services could produce other results, particularly if services have different rates of overuse or public visibility. Second, some of our measures were single items or previously unvalidated measures adapted from other studies. Different measures may produce different results.50 Third, some characteristics differed across trial arms at baseline. Analyses controlled for these potential confounders, but residual confounding remains a possibility. Fourth, the success of our gain-framed option depended on patients’ perceptions that screening is the riskier option; however, we did not measure this perception explicitly. Fifth, we may have slightly overestimated the rate of osteonecrosis of the jaw in average-risk individuals in the osteoporosis decision support sheet; however, this does not change the net benefit of the service. Finally, we conducted the study in 4 clinics in the southeastern United States. To the extent that screening rates, clinician training, local decision-making patterns, or patient characteristics (eg, education, numeracy, insurance, presence of usual source of care) are different, results could be different in future studies.
Despite limitations, our study provides important insights about what is required to change decision-making about low-value screening services. A single brief decision support intervention, regardless of format, is unlikely to be sufficient to change intentions for screening. Alternate and additional interventions should be explored.
Corresponding Author: Stacey L. Sheridan, MD, MPH, University of North Carolina at Chapel Hill, Division of General Medicine and Epidemiology, 5039 Old Clinic Building, CB 7110, Chapel Hill, NC 27599 (Stacey_sheridan@med.unc.edu).
Accepted for Publication: October 29, 2015.
Correction: This article was corrected on January 5, 2016, to clarify editorial issues in the abstract, text, and Figure 2 title and caption.
Published Online: December 28, 2015. doi:10.1001/jamainternmed.2015.7339.
Author Contributions: Dr Sheridan and Ms Sutkowi-Hemstreet had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Sheridan, Brewer, Gizlice, Lewis, Golin, Kistler, Harris.
Acquisition, analysis, or interpretation of data: Sheridan, Sutkowi-Hemstreet, Barclay, Brewer, Dolor, Gizlice, Reuland, Golin, Kistler, Vu, Harris.
Drafting of the manuscript: Sheridan, Barclay, Brewer, Harris.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Sutkowi-Hemstreet, Brewer, Gizlice.
Obtained funding: Sheridan, Brewer, Lewis, Golin, Harris.
Administrative, technical, or material support: Sutkowi-Hemstreet, Barclay, Brewer, Vu, Harris.
Study supervision: Sheridan, Sutkowi-Hemstreet, Lewis.
Conflict of Interest Disclosures: None reported.
Funding/Support: This project was supported by the Agency for Healthcare Research and Quality (AHRQ) Research Centers for Excellence in Clinical Preventive Services, grant P01 HS021133-03.
Role of the Funder/Sponsor: The Agency for Healthcare Research and Quality (AHRQ) Research Centers for Excellence in Clinical Preventive Services had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.
Additional Contributions: We are grateful for the participation of the clinicians and staff at Duke Primary Care Croasdaile, Sutton Station Internal Medicine, Duke Primary Care Pickett Road, and Duke Primary Care Timberlyne; and for data collection by the coordinators and trial assistants at Duke PCRC: Erica Suarez, Liz Fisher, Lynn Harrington, Kathlene Chmielewski, Nick Walter, Beth Patterson, Luis Ballon, and Sarah Ricker. Clinicians participating in the study received no reimbursement; however, practices received reimbursement ($1000/y) for assisting with recruitment letters and use of clinic space to conduct study visits. Coordinators and trial assistants were reimbursed at an hourly rate for their assistance.