Variation by Institution in Sexual Harassment Experiences Among US Medical Interns

This cross-sectional study investigates possible institutional and specialty variations in experiences of sexual harassment among US medical interns.


Recruitment strategy
The present study was performed as part of the Intern Health Study [1], an NIH-funded longitudinal cohort study that assesses stress and mood in medical interns at institutions around the US and has been conducted annually since 2007. Email addresses for incoming first-year residents across all specialties throughout the United States were gathered from residency programs and publicly available databases. Eligible residents were invited via email to complete a baseline survey two months prior to internship start and quarterly surveys during their internship year. The sexual harassment questions were included in the fourth quarterly survey for the 2016 and 2017 cohorts.

Survey questions
Baseline questions used in the present study

Sexual harassment experience
Self-reported sexual harassment experiences were defined as "yes" if the participant reported "once" or more frequent responses (e.g., sometimes, often, very often) to questions 2-20 from the SEQ-S survey instrument.
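
As a minimal sketch of this dichotomization, the Python snippet below codes one participant's item responses as a binary outcome. It assumes that a single item answered "once" or more frequently is sufficient for a "yes", and that the lowest response option is labeled "never"; the label and the function name are hypothetical and are not taken from the survey instrument.

```python
# Response labels counted as "yes". "never" as the remaining option is an assumption.
POSITIVE_RESPONSES = {"once", "sometimes", "often", "very often"}

def experienced_harassment(responses):
    """Return 1 if any item response is 'once' or more frequent, else 0."""
    return int(any(r.strip().lower() in POSITIVE_RESPONSES for r in responses))

# Questions 2-20 correspond to 19 items per participant.
all_never = ["never"] * 19
one_positive = ["never"] * 18 + ["once"]

print(experienced_harassment(all_never))
print(experienced_harassment(one_positive))
```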

General approach
We used a 2-level multilevel logistic regression model to assess institutional variation in intern experiences of sexual harassment. We initially calculated the variation in the reporting of sexual harassment across institutions without adjusting for any intern characteristics ("empty model"). We subsequently created a multilevel logistic model for experiencing sexual harassment (Outcome Yes=1) adjusting for intern demographics (e.g., age, sex, race/ethnicity).
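
For orientation, a 2-level random-intercept logistic model of this kind is conventionally written as follows. The notation is an assumption introduced here for illustration: Y_ij is the outcome for intern i at institution j, x_ij is the vector of intern demographics (omitted in the empty model), and u_j is the institution-level random intercept with variance sigma_u^2.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1 \mid \mathbf{x}_{ij}, u_j)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

In the empty model the term with x_ij is dropped, so sigma_u^2 captures all between-institution variation on the log-odds scale.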

Reliability adjustments
We used reliability adjustment to avoid overestimating the probability of experiencing sexual harassment at institutions with a low number of resident responders. To do this, we used a multilevel model with a random intercept for the institution. A random effects model has three main advantages: (1) it reduces the number of parameters estimated, (2) it allows adjustment for institution-level covariates, and (3) it benefits from the property of shrinkage. The shrinkage estimator places more weight on an institution's point estimate when it is measured reliably but "shrinks" it toward the population mean when there is more error in the measurement (e.g., fewer responders). We performed reliability adjustment by generating empirical Bayes estimates.
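
The idea of shrinkage can be illustrated with the simpler linear (normal-normal) case, where the reliability weight has a closed form. This is only a sketch of the principle: in a multilevel logistic model the empirical Bayes estimates are obtained numerically by the fitting software, and the function names and variance values below are hypothetical.

```python
def reliability(n, var_between, var_within):
    """Share of an institution's observed variation that reflects true signal.

    Approaches 1 as the number of responders n grows and 0 as n shrinks.
    """
    return var_between / (var_between + var_within / n)

def shrunken_estimate(observed, n, grand_mean, var_between, var_within):
    """Reliability-weighted average of the institution's estimate and the overall mean."""
    w = reliability(n, var_between, var_within)
    return w * observed + (1 - w) * grand_mean

# Illustrative values only (not study data): same observed estimate, different sample sizes.
small = shrunken_estimate(observed=1.0, n=4, grand_mean=0.0, var_between=0.25, var_within=1.0)
large = shrunken_estimate(observed=1.0, n=100, grand_mean=0.0, var_between=0.25, var_within=1.0)
print(small, large)
```

With the same observed estimate, the institution with 4 responders is pulled halfway to the overall mean, while the one with 100 responders stays close to its own estimate.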

Quantifying the variation between academic institutions
We created two multilevel logistic models and calculated the intraclass correlation coefficient (ICC) for each. We then directly compared the ICC for the two models (model 1 was the "empty model" and model 2 included resident demographics). The median odds ratio (MOR) was calculated from the final model as per the method of Merlo. We present MORs because they express the impact of institutions on the reporting of sexual harassment on the more interpretable odds ratio scale: for two randomly chosen institutions, the MOR is the median odds ratio between the institution at higher risk and the one at lower risk, for interns with the same characteristics.
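
Both quantities can be computed from the estimated random-intercept variance. The sketch below assumes the latent-variable formulation of the ICC for logistic models (level-1 variance fixed at pi^2/3), which the text does not specify, and uses the standard MOR formula attributed to Merlo; the function names and the variance value are hypothetical.

```python
import math
from statistics import NormalDist

def latent_icc(var_u):
    """ICC on the latent scale: between-institution variance over total variance."""
    return var_u / (var_u + math.pi ** 2 / 3)

def median_odds_ratio(var_u):
    """MOR = exp(sqrt(2 * var_u) * z_0.75), with z_0.75 the 75th percentile of N(0, 1)."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2 * var_u) * z75)

# Illustrative variance only (not an estimate from the study).
var_u = 0.2
print(round(latent_icc(var_u), 3), round(median_odds_ratio(var_u), 3))
```

An MOR of 1 means no between-institution variation; larger values indicate that the institution an intern trains at matters more for the odds of the outcome.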

Sensitivity Analysis
To understand if there was variation by specialty training, all interns who responded to the SEQ-S questions were included and categorized into specialty training programs.The specialty training programs were categorized into 9 broad categories: internal medicine, family medicine, pediatrics, emergency medicine, general surgery and specialties (e.g., orthopedic, urology), obstetrics-gynecology, neurology, psychiatry, and other.
A similar statistical approach was used to understand if there was variation across specialties by creating a 2-level multilevel logistic regression model to assess specialty training program variation in intern experiences of sexual harassment. We initially calculated the variation in the experiences of sexual harassment across specialty programs without adjusting for any intern characteristics ("empty model"). We subsequently created a model adjusting for intern characteristics (demographics). We then directly compared the ICC for the two models (model 1 was the "empty model" and model 2 included resident demographics). The median odds ratio (MOR) was calculated from the final model as per the method of Merlo.