Key Points
Question
How does a comprehensive workplace wellness program affect health, health beliefs, and medical use among university employees after 24 months?
Findings
In a 2-year randomized clinical trial of 4834 employees at a large US university, employees invited to join a wellness program showed no significant differences in biometrics, medical diagnoses, or medical use relative to the control group. The intervention increased self-reports of having a primary care physician and improved a set of employee health beliefs among the treatment group.
Meaning
The workplace wellness program changed health beliefs and increased self-reports of having a primary care physician but did not significantly affect clinical outcomes.
Importance
Many employers use workplace wellness programs to improve employee health and reduce medical costs, but randomized evaluations of their efficacy are rare.
Objective
To evaluate the effect of a comprehensive workplace wellness program on employee health, health beliefs, and medical use after 12 and 24 months.
Design, Setting, and Participants
This randomized clinical trial of 4834 employees of the University of Illinois at Urbana-Champaign was conducted from August 9, 2016, to April 26, 2018. Members of the treatment group (n = 3300) received incentives to participate in the workplace wellness program. Members of the control group (n = 1534) did not participate in the wellness program. Statistical analysis was performed on April 9, 2020.
Interventions
The 2-year workplace wellness program included financial incentives and paid time off for annual on-site biometric screenings, annual health risk assessments, and ongoing wellness activities (eg, physical activity, smoking cessation, and disease management).
Main Outcomes and Measures
Measures taken at 12 and 24 months included clinician-collected biometrics (16 outcomes), administrative claims related to medical diagnoses (diabetes, hypertension, and hyperlipidemia) and medical use (office visits, inpatient visits, and emergency department visits), and self-reported health behaviors and health beliefs (14 outcomes).
Results
Among the 4834 participants (2770 women; mean [SD] age, 43.9 [11.3] years), no significant effects of the program on biometrics, medical diagnoses, or medical use were seen after 12 or 24 months. A significantly higher proportion of employees in the treatment group than in the control group reported having a primary care physician after 24 months (1106 of 1200 [92.2%] vs 477 of 554 [86.1%]; adjusted P = .002). The intervention significantly improved a set of employee health beliefs on average: participant beliefs about their chance of having a body mass index greater than 30, high cholesterol, high blood pressure, and impaired glucose level jointly decreased by 0.07 SDs (95% CI, −0.12 to −0.01 SDs; P = .02); however, effects on individual belief measures were not significant.
Conclusions and Relevance
This randomized clinical trial showed that a comprehensive workplace wellness program had no significant effects on measured physical health outcomes, rates of medical diagnoses, or the use of health care services after 24 months, but it increased the proportion of employees reporting that they have a primary care physician and improved employee beliefs about their own health.
Trial Registration
American Economic Association Randomized Controlled Trial Registry number: AEARCTR-0001368
Employers increasingly offer workplace wellness programs to reduce health care costs and improve employee health. Among large US firms offering health benefits in 2019, 84% also offered a wellness program.1 The wellness industry has grown rapidly since the passage of the 2010 Patient Protection and Affordable Care Act, which encouraged firms to adopt wellness programs by raising the maximum limit on financial incentives offered to program participants.
However, evidence of the causal effects of workplace wellness programs is limited. Observational studies that compare participants with nonparticipants are susceptible to selection bias.2 Randomized trials frequently evaluate narrow wellness interventions with only 1 or 2 program components and examine only a few outcomes.3-8 Reviews of the literature have yielded mixed results and raised concerns about publication bias.9,10 A recently published randomized clinical trial (RCT) with 160 randomized worksites reported outcomes at 18 months after the intervention.11 Another recently published RCT with 4834 randomized participants reported effects on medical spending and employee productivity, but not clinical outcomes.2 Neither study investigated the effect of workplace wellness programs on employees’ beliefs about their own health. Measuring these beliefs sheds light on employees’ perceptions about the effectiveness of participating in wellness programs. These beliefs may also shape how much value and effort individuals place on health behaviors, a channel emphasized by social cognitive theory.12,13
In this study, individual employees were randomly assigned to a treatment group, which was eligible to participate in a 2-year comprehensive workplace wellness program, or a control group, which was not eligible. We evaluated the effects of this program on health beliefs, self-reported health behaviors, clinician-collected biometrics, and claims-based medical diagnoses and medical use during the 24 months after initial randomization into the program.
We conducted an RCT of a workplace wellness program among employees of the University of Illinois at Urbana-Champaign (UIUC). Our preanalysis protocol was publicly archived and is available in Supplement 1. Among the study population, 3300 employees were randomly assigned to be eligible for program participation (treatment group). The other 1534 study participants were ineligible for the program (control group). Randomization was stratified by employee class, sex, age, salary, and race/ethnicity (eAppendix 1 in Supplement 2). We specified the research design, subgroup analysis, and the health belief, biometric, and medical use outcomes prior to analysis. Self-reported health behavior and medical diagnosis outcomes were specified post hoc. The UIUC, University of Chicago, and National Bureau of Economic Research institutional review boards approved the study. All study participants provided written informed consent. We followed the Consolidated Standards of Reporting Trials (CONSORT) reporting guideline.
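The stratified assignment described above can be sketched in a few lines. This is a minimal illustration, not the study's procedure, which is documented in eAppendix 1 in Supplement 2. The function name `stratified_assign`, the toy roster, and the rule of treating a fixed share within each stratum are all assumptions; only the treatment share of 3300 of 4834 comes from the group sizes reported here.

```python
import random
from collections import defaultdict


def stratified_assign(units, stratum_of, treat_share, seed=0):
    """Randomly assign units to arms separately within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit in units:
        by_stratum[stratum_of(unit)].append(unit)

    assignment = {}
    for key in sorted(by_stratum):  # fixed order keeps the draw reproducible
        members = by_stratum[key]
        rng.shuffle(members)
        n_treat = round(len(members) * treat_share)
        for i, unit in enumerate(members):
            assignment[unit] = "treatment" if i < n_treat else "control"
    return assignment


# Hypothetical roster of (employee_id, sex, age_band); not study data.
roster = [(i, "F" if i % 2 else "M", "50+" if i % 3 == 0 else "<50")
          for i in range(600)]
share = 3300 / 4834  # treatment share implied by the reported group sizes
arms = stratified_assign(roster, lambda e: (e[1], e[2]), share)
n_treated = sum(1 for arm in arms.values() if arm == "treatment")
print(n_treated, len(roster) - n_treated)
```

Shuffling within strata keeps the treated share nearly identical across strata, which is the balance property that stratification is meant to provide.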
A total of 12 459 benefits-eligible UIUC employees were invited in July 2016 to enroll in the study and complete a survey (Figure). Employees were informed that they might be selected for further participation in the study, but no other details about the intervention were disclosed prior to enrollment. Invitations were sent by mail and email (eFigures 4-6 in Supplement 2). Our study population consisted of 4834 employees who enrolled in the study during a 3-week enrollment period. Random assignment of study participants to treatment and control groups occurred in August 2016, after study enrollment had closed.
A comprehensive workplace wellness program named iThrive was introduced at UIUC and ran for 2 years, from August 9, 2016, to April 26, 2018. The program, designed to be representative of typical comprehensive wellness programs offered by employers, included 3 annual components: an onsite biometric screening and survey, an online health risk assessment (HRA), and a choice of wellness activities (eFigure 1 and eFigure 2 in Supplement 2).14 Employees in the treatment group were eligible to participate in all 3 intervention components using paid time off and received randomly assigned cash awards that ranged from $0 to $200 per year for completing the annual screening and HRA. Treatment group participants who completed the biometric screening and HRA were then eligible to register for 1 wellness activity class per semester, for a total of 2 activities per year. Classes ranged in length from 6 to 12 weeks and addressed numerous topics (eg, physical activity, nutrition, and stress management) (eTables 2, 3, 5, and 6 in Supplement 2). On completion of a wellness activity, participants earned $0 to $75 as a cash reward or Amazon.com gift card. The onsite biometric screenings and surveys were administered by local clinicians. The HRA was designed by Wellsource, an established wellness vendor. The wellness activities were selected and implemented by UIUC’s director of campus well-being services. Details on all study components are provided in eFigures 4 to 37 and eTables 1 to 7 of Supplement 2.
Employees in the control group were invited to complete the onsite biometric screening and survey in August 2017 (12 months after randomization) and in August 2018 (24 months after randomization) to serve as a comparison group. Control group employees were not eligible to participate in the first onsite biometric screening and survey in August 2016 and were never eligible to participate in any of the HRAs or wellness activities offered throughout the 2-year iThrive program. Although the research team never informed the control group about the intervention, some may have learned about it from coworkers. To assess how often control group members learned about the intervention from coworkers, a 2017 online survey asked study participants whether they ever communicated about iThrive with coworkers. Only 3.4% (39 of 1157) of the control group responded affirmatively, compared with 43.6% (1050 of 2410) of the treatment group.2
Health beliefs, self-reported health behaviors, and biometrics were collected onsite by clinicians. Study participants were asked to report their height and weight. They also reported, on a scale from 0 to 100, their expected chances (subjective probabilities) of having high cholesterol, high blood pressure, an impaired fasting glucose level, and a body mass index above 30 (calculated as weight in kilograms divided by height in meters squared) (eFigure 17 in Supplement 2). We interpret self-reported height and weight and these expected probabilities as measures of participants’ health beliefs.12,15,16 Study participants were then directed to a station where a clinician measured their height, weight, waist circumference, and blood pressure. The clinician also measured their cholesterol (total, high-density lipoprotein, and low-density lipoprotein), triglycerides, and glucose levels using a CardioChek Plus Analyzer (PTS Diagnostics), and recorded their answers to questions about tobacco use, physical activity, mood, and having a primary care physician (eFigure 18 in Supplement 2).
Administrative health claims data were obtained for employees enrolled in UIUC’s Health Alliance insurance plan, which covers 69.3% (3350 of 4834) of employees in our sample. These data include all inpatient, outpatient, and prescription drug claims with a date of service between October 1, 2015, and July 31, 2018. Additional details on these and other data sets collected for the study are described in eAppendix 2 and eFigure 3 of Supplement 2.
Statistical analysis was performed on April 9, 2020. We performed power calculations for all outcomes by calculating ex post minimum detectable effects.17 The results are provided in eAppendix 1 and eTable 1 in Supplement 3. We estimated the effect of being invited to participate in the iThrive wellness program in the available population. Some employees in our sample ceased employment with the university during the 24-month study. For administrative health claims outcomes, we restricted comparisons to employees enrolled in Health Alliance. For all other outcomes, we compared participants in the treatment group who completed the follow-up (2017 or 2018) onsite screening and survey with all employees in the control group who completed the follow-up (2017 or 2018) onsite screening and survey (Figure). Baseline characteristics of the treatment and control groups were compared to evaluate the potential for bias due to missing data (eAppendix 2 and eTables 2 and 3 in Supplement 3).18
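A minimum detectable effect is commonly approximated from an estimate's SE. The sketch below uses the conventional formula (the sum of the critical values for the test size and the target power, multiplied by the SE), which gives roughly 2.8 SEs for a 2-tailed 5% test at 80% power. Whether the study used exactly this formula is an assumption; the calculations themselves are in eAppendix 1 in Supplement 3. The function name and the example SE are hypothetical.

```python
from statistics import NormalDist


def minimum_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest true effect detectable with the given power in a 2-tailed test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se


se_example = 0.68  # hypothetical SE, not a value reported by the study
print(round(minimum_detectable_effect(se_example), 2))
```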
For each outcome, we estimated an individual-level linear model with a binary indicator for treatment assignment as the key independent variable. For biometric and self-reported outcomes, we included all study participants who completed the onsite follow-up screening and survey in 2017 (n = 2004) or 2018 (n = 1761). For medical diagnoses and medical use outcomes, we included all study participants (n = 4834) and weighted each individual by the number of months with Health Alliance insurance coverage. We included baseline values of the outcome (when available) and stratification variables as controls in our linear model to improve precision. Analyses were performed using Stata, version 16 (StataCorp).19 We calculated SEs that are robust to arbitrary heteroscedasticity and used 2-tailed tests with a significance level of P = .05.
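With no controls, the coefficient on a binary treatment indicator in a linear model equals the difference in group means, and its heteroscedasticity-robust (HC0) variance reduces to the sum over the two groups of the within-group sum of squares divided by the squared group size. The sketch below applies that simplified, unadjusted version to the primary care physician counts reported in the Results (1106 of 1200 vs 477 of 554). It omits the baseline and stratification controls used in the study, so it is an illustration and not a reproduction of the adjusted estimate. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def diff_in_means_robust(treated, control, level=0.95):
    """OLS slope on a binary treatment indicator with an HC0 robust SE."""
    m1, m0 = fmean(treated), fmean(control)
    n1, n0 = len(treated), len(control)
    ss1 = sum((y - m1) ** 2 for y in treated)  # residual sum of squares, treated
    ss0 = sum((y - m0) ** 2 for y in control)  # residual sum of squares, control
    se = sqrt(ss1 / n1**2 + ss0 / n0**2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    est = m1 - m0
    return est, se, (est - z * se, est + z * se)


# Binary outcomes rebuilt from the counts reported in the Results.
treated = [1] * 1106 + [0] * (1200 - 1106)
control = [1] * 477 + [0] * (554 - 477)
est, se, ci = diff_in_means_robust(treated, control)
print(f"{100 * est:.1f} pp, SE {100 * se:.1f} pp")
```

The unadjusted difference is close to the reported 6.1 percentage points; the interval differs slightly from the published one because the controls are left out.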
To account for less-than-universal participation among the treatment group, we used an instrumental-variable approach to estimate the local mean treatment effect of participating in the program, instrumenting participation with assignment into the treatment group.11,20,21 Participation was defined as completing the first (2016) screening component, which was offered only to members of the treatment group (Figure). The results are provided in eAppendix 3 and eTables 4 to 6 in Supplement 3.
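With a single binary instrument and no controls, the instrumental-variable estimate reduces to the Wald ratio: the effect of assignment on the outcome divided by the effect of assignment on participation. The sketch below is a simplified illustration under that assumption. The function name and the input values are hypothetical and are not study estimates; the study's results are in eAppendix 3 in Supplement 3.

```python
def wald_late(itt_effect, take_up_treatment, take_up_control=0.0):
    """Wald ratio: intention-to-treat effect scaled by the first stage."""
    first_stage = take_up_treatment - take_up_control
    if first_stage <= 0:
        raise ValueError("assignment must raise participation")
    return itt_effect / first_stage


# Hypothetical inputs: a 0.05 effect of assignment and 50% participation in
# the treatment group. Control participation is 0 because the first
# screening was offered only to the treatment group.
print(wald_late(0.05, 0.5))
```

Because the control group could not participate, the denominator is just the treatment group's participation rate, so the ratio scales the effect of assignment up to the effect among participants.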
Because we estimated our model for many outcomes, the probability that we incorrectly reject at least 1 null hypothesis is greater than the significance level used for each individual hypothesis test. We accounted for this multiple testing concern in 2 ways. First, we calculated a standardized treatment effect for a “family” of outcomes by dividing the estimate for each individual outcome by its SD and then averaging across all the outcomes within the family.11,22 This method gives equal weight to each outcome in the family, which may be undesirable. Therefore, we also used resampling to calculate an adjusted P value for each outcome that corrects for the number of hypothesis tests within a family of outcomes.2,23 We considered effects to be statistically significant at an adjusted P < .05 or a standardized treatment effect P < .05.
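Both adjustments can be sketched briefly. The first function averages SD-scaled estimates across a family, as described above. The second computes single-step max-t adjusted P values by permuting treatment labels. That is one simple member of the resampling family and is an assumption here; the study's exact resampling procedure may differ (for example, a step-down variant). All names and the simulated data are hypothetical.

```python
import random
from statistics import fmean, variance


def standardized_effect(estimates, sds):
    """Average of the estimates after dividing each by its outcome SD."""
    return fmean(e / s for e, s in zip(estimates, sds))


def t_stat(y, d):
    """Two-sample t statistic with unequal variances."""
    treated = [v for v, g in zip(y, d) if g == 1]
    control = [v for v, g in zip(y, d) if g == 0]
    se = (variance(treated) / len(treated)
          + variance(control) / len(control)) ** 0.5
    return (fmean(treated) - fmean(control)) / se


def max_t_adjusted_p(outcomes, d, n_perm=2000, seed=0):
    """Single-step max-t adjusted P values from permuted treatment labels."""
    rng = random.Random(seed)
    observed = [abs(t_stat(y, d)) for y in outcomes]
    exceed = [0] * len(outcomes)
    labels = list(d)
    for _ in range(n_perm):
        rng.shuffle(labels)
        max_t = max(abs(t_stat(y, labels)) for y in outcomes)
        for k, obs in enumerate(observed):
            if max_t >= obs:
                exceed[k] += 1
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Simulated family of 2 outcomes: one shifted by 1 SD, one with no effect.
gen = random.Random(1)
d = [1] * 100 + [0] * 100
shifted = [gen.gauss(1.0 if g else 0.0, 1.0) for g in d]
null = [gen.gauss(0.0, 1.0) for _ in d]
adj = max_t_adjusted_p([shifted, null], d, n_perm=500)
print(adj)
```

Comparing each observed statistic with the permutation distribution of the largest statistic in the family controls the chance of any false rejection within that family.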
Baseline Characteristics and Program Participation
Table 1 reports baseline characteristics for the treatment (n = 3300) and control (n = 1534) groups. Among all 4834 study participants, the mean (SD) age was 43.9 (11.3) years, 2770 (57.3%) were female, 786 (16.3%) were nonwhite, 963 (19.9%) were faculty, and 1172 (24.2%) earned less than $40 000 per year. A total of 3217 participants (66.5%) were enrolled in Health Alliance insurance coverage during the 10-month preintervention period from October 2015 to July 2016. Among this subsample and during this time, study participants had 2.5 outpatient visits on average and had medical claims with diagnosis codes related to 3 common chronic conditions in the following proportions: type 1 and 2 diabetes (172 [5.3%]), hypertension (440 [13.7%]), and hyperlipidemia (508 [15.8%]). Inpatient and emergency department visits were uncommon in this sample. Overall, baseline participant characteristics were well balanced across both study groups.
Of the 3300 participants in the treatment group, 1848 (56.0%) completed both the biometric screening and online HRA in the first year and 1036 (31.4%) completed the biometric screening, online HRA, and at least 1 wellness activity in the first year. During the 2-year program, 2123 participants (64.3%) in the treatment group completed at least 1 component of the iThrive wellness program. These completion rates are similar to those reported for other comprehensive wellness programs.11,14
Effects of the Intervention
Table 2 reports effects of the intervention on health beliefs and self-reported health behaviors. When combined into a standardized treatment effect, participant beliefs about their chance of having a body mass index greater than 30, high cholesterol, high blood pressure, and impaired glucose level jointly decreased by 0.07 SDs (95% CI, −0.12 to −0.01 SDs; P = .02). Although these health beliefs changed significantly as a group, changes in specific measures of health beliefs were less precise and thus not individually significant.
Self-reports of having a primary care physician significantly increased by 6.1 percentage points (95% CI, 3.0-9.2 percentage points; adjusted P = .002) after 24 months. There were no significant effects on self-reported tobacco use, physical activity intensity, or mood after 12 or 24 months.
The intervention had no significant effects on height, weight, waist circumference, body mass index, blood pressure, cholesterol, or glucose level (Table 3). There were also no significant changes in diagnoses of hypertension, diabetes, or hyperlipidemia after 12 or 24 months (Table 4). Likewise, no significant effects were found for office visits, inpatient visits, or emergency department visits. The 95% CI for systolic blood pressure (−1.48 to 1.18 mm Hg) after 24 months rules out a decrease of more than 1.48 mm Hg compared with a control group mean of 122.4 mm Hg (Table 3). The 95% CI for diagnoses of hyperlipidemia (−2.47% to 3.07%) after 24 months rules out a decrease of more than 2.47% compared with a control group mean of 26.5% (Table 4). Likewise, the 95% CI for office visits (−0.30 to 0.46) after 24 months rules out a decrease of more than 0.30 compared with a control group mean of 6.67. For emergency department visits after 24 months, the 95% CI rules out a decrease of more than 0.1 compared with a control group mean of 0.28. Additional analysis also found no significant effects for primary care physician visits (eAppendix 5 and eTable 25 in Supplement 3).
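A reported interval can be restated as an implied point estimate, an implied SE, and bounds expressed relative to the control mean. The sketch below does this for the systolic blood pressure interval quoted above. It assumes a symmetric, normal-based 95% CI, so the implied estimate and SE are derived values and are not quoted from Table 3. The function name is hypothetical.

```python
from statistics import NormalDist


def summarize_ci(lower, upper, control_mean, level=0.95):
    """Back out the midpoint and SE of a symmetric normal-based interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "estimate": (lower + upper) / 2,
        "se": (upper - lower) / (2 * z),
        "lower_pct_of_mean": 100 * lower / control_mean,
        "upper_pct_of_mean": 100 * upper / control_mean,
    }


# Systolic blood pressure after 24 months: 95% CI and control mean from the text.
sbp = summarize_ci(-1.48, 1.18, 122.4)
for name, value in sbp.items():
    print(f"{name}: {value:.2f}")
```

Expressed this way, the largest decrease compatible with the interval is a little over 1% of the control group mean.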
eAppendix 4 and eTables 7 to 24 in Supplement 3 report effects for prespecified subgroups. Compared with women, men had higher effects on claims-based diabetes diagnoses after 12 months (2.4%; 95% CI, 0.6%-4.2%; adjusted P = .04), but not after 24 months (1.5%; 95% CI, −0.6% to 3.7%; adjusted P = .49) (eTable 9 in Supplement 3). Compared with younger employees, employees 50 years or older had lower effects on self-reports of having a primary care physician after 24 months (−9.9%; 95% CI, −15.1% to −4.7%; adjusted P = .006) (eTable 10 in Supplement 3). No significant heterogeneity was found with respect to race/ethnicity, employee classification (faculty, civil service, or academic professional), or salary.
This individual-level RCT of a 2-year comprehensive workplace wellness program demonstrated that the program significantly improved employee beliefs about their own health and increased the proportion of employees reporting that they have a primary care physician. However, no significant effects were found on biometrics, medical diagnoses, or medical use after 24 months. Our study was powered to detect clinically meaningful effects across these 3 domains.
These results complement recent RCT evidence that workplace wellness programs affect some self-reported outcomes but have limited effects on clinical or administrative outcomes. Prior findings showed that the iThrive program increased self-reported lifetime health screening rates and improved employee perceptions of management, but did not significantly affect administrative measures of medical spending.2 A cluster RCT of a wellness program at BJ’s Wholesale Club found significant effects on self-reports of engaging in regular exercise and actively managing weight but found no significant effects on medical spending or biometric outcomes after 18 months.11 The similarity in these RCT findings using different randomization designs in different populations increases confidence in their reliability and generalizability.
Our measures of health beliefs, elicited using self-reported subjective probabilities, are a contribution to the literature on wellness interventions. Employees in the treatment group believed they had lower chances of poor biometric health, suggesting that they expected their participation in the wellness program to improve their health. However, there was no significant effect of the program on biometrics or medical use, and prior findings showed no significant effects on administratively measured health behaviors.2 These results demonstrate a mismatch between employee perceptions and physical and administrative measures of health.
Findings from the Illinois Workplace Wellness Study2 and the BJ’s Wholesale Club study,11 both RCTs, differ from those of many prior studies that found that wellness programs improve employee health and reduce medical use. Many of these prior studies used observational research designs, which can result in significant selection bias even after controlling for many covariates.2 Findings from RCTs are less susceptible to selection bias.
This study has several limitations. The results may not be generalizable to other workplace settings with different populations or different wellness programs.24 Our 95% CIs do not rule out meaningful effects for some outcomes, such as a decrease in emergency department visits after 24 months of 0.1 compared with a control group mean of 0.28. Also, the outcomes were measured during the first 24 months after randomization. We do not know whether the significant effects on self-reported outcomes persisted beyond 24 months, or whether detectable effects on biometrics, medical diagnoses, or medical use emerged beyond 24 months.
Finally, data were not available for all study participants. Medical diagnoses and use outcomes were obtained only for participants enrolled in Health Alliance. Biometric and self-reported outcomes were obtained only for participants who completed the onsite screening and survey in 2017 or 2018. However, Health Alliance enrollment was well balanced between the treatment and control groups (Table 1). Baseline characteristics of participants who completed the onsite screenings and surveys were well balanced between the treatment and control groups (eTables 2 and 3 in Supplement 3). The balance between treatment and control groups suggests that bias from missing data is unlikely to be substantial.
Among employees of a large employer, a comprehensive workplace wellness program significantly changed a set of beliefs about biometric outcomes and significantly increased self-reports of having a primary care physician, but no significant effects on clinician-measured biometrics, medical diagnoses, or medical use were found after 24 months. These findings shed light on employees’ perceptions of workplace wellness programs, which may influence long-term effects. However, we add to a growing body of evidence from RCTs that workplace wellness programs are unlikely to significantly improve employee health or reduce medical use in the short term.
Accepted for Publication: March 23, 2020.
Corresponding Author: David Molitor, PhD, Gies College of Business, University of Illinois at Urbana-Champaign, 31206 S Sixth St, 40 Wohlers Hall, Champaign, IL 61820 (dmolitor@illinois.edu).
Published Online: May 26, 2020. doi:10.1001/jamainternmed.2020.1321
Author Contributions: Drs Reif and Molitor had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Reif, Jones, Payne, Molitor.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Reif, Jones, Molitor.
Obtained funding: Reif, Jones, Payne, Molitor.
Administrative, technical, or material support: All authors.
Supervision: Reif, Jones, Molitor.
Conflict of Interest Disclosures: Drs Reif, Jones, Payne, and Molitor reported receiving grants from the National Institutes of Health, the Abdul Latif Jameel Poverty Action Lab (J-PAL) North America US Health Care Delivery Initiative, the National Science Foundation, the Robert Wood Johnson Foundation, and the W. E. Upjohn Institute for Employment Research during the conduct of the study. No other disclosures were reported.
Funding/Support: This research was supported by award R01AG050701 from the National Institute on Aging of the National Institutes of Health; grant 1730546 from the National Science Foundation; the J-PAL North America US Health Care Delivery Initiative; Evidence for Action (E4A), a program of the Robert Wood Johnson Foundation; and the W. E. Upjohn Institute for Employment Research. Illinois Human Resources provided in-kind logistical support for developing the program.
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Data Sharing Statement: See Supplement 4.
Additional Contributions: Lauren Geary, MPH, University of Illinois at Urbana-Champaign, provided project management; she was compensated for her contribution. Michele Guerra, MS, Certificate of Advanced Study, University of Illinois at Urbana-Champaign, provided input into the design of the wellness program and the selection of wellness activities; she was not compensated for her contribution. Illinois Human Resources provided institutional support without financial compensation. Marian Huhman, PhD, University of Illinois at Urbana-Champaign, Mark Stehr, PhD, Drexel University, David Studdert, LLB, ScD, MPH, Stanford University, and seminar participants at the American Society of Health Economists provided comments on this article; they were not compensated for their contributions.
6. Royer H, Stehr M, Sydnor J. Incentives, commitments, and habit formation in exercise: evidence from a field experiment with workers at a Fortune-500 company. Am Econ J Appl Econ. 2015;7(3):51-84. doi:10.1257/app.20130327
7. Meenan RT, Vogt TM, Williams AE, Stevens VJ, Albright CL, Nigg C. Economic evaluation of a worksite obesity prevention and intervention trial among hotel workers in Hawaii. J Occup Environ Med. 2010;52(suppl 1):S8-S13. doi:10.1097/JOM.0b013e3181c81af9
17. Ioannidis JP, Stanley TD, Doucouliagos H. The Power of Bias in Economics Research. Oxford University Press; 2017. doi:10.1111/ecoj.12461
18. Gerber A, Green D. Field Experiments: Design, Analysis, and Interpretation. W. W. Norton; 2012.
19. Stata Statistical Software. Release 16 [computer program]. StataCorp LLC; 2019.
23. Westfall PH, Young SS. Resampling-Based Multiple Testing: Examples and Methods for P Value Adjustment. Vol 279. John Wiley & Sons; 1993.