In the last decade, methods based on propensity scores (PSs) have been frequently used in multiple sclerosis (MS) studies comparing disease-modifying treatments in nonrandomized observational settings.[1] Propensity score adjustment has been applied even in situations in which not all the conditions necessary for its applicability were satisfied. PS adjustment can reduce the intrinsic selection bias of nonrandomized studies only if all the confounders are measurable and overlap at least minimally between the treatment groups. Sometimes the calendar period or the geographical region in which the study was conducted is completely nonoverlapping between the compared groups. In such cases, in causal inference jargon, the positivity assumption, which requires that the probability of receiving each of the treatments be greater than 0 in every PS stratum, is violated. We show, with a practical example, the extent to which PS adjustment fails when the positivity assumption is violated.
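The positivity requirement described above can be illustrated with a minimal Python sketch. It is not part of the original analysis; the function name `positivity_violations` and the stratum labels in the usage note are hypothetical. The sketch reports every stratum in which at least one comparison group is absent, that is, where the empirical probability of belonging to that group is 0.

```python
from collections import defaultdict


def positivity_violations(strata, groups):
    """Return {stratum: [groups never observed in that stratum]}.

    strata[i] is the stratum of patient i (for example a calendar period,
    a region, or a PS stratum); groups[i] is that patient's comparison group.
    An empty dict means every group is observed in every stratum.
    """
    observed = defaultdict(set)
    for stratum, group in zip(strata, groups):
        observed[stratum].add(group)

    all_groups = set(groups)
    return {
        stratum: sorted(all_groups - present)
        for stratum, present in observed.items()
        if present != all_groups
    }
```

As a usage example under these assumptions, calling `positivity_violations(["period_1", "period_1", "period_2", "period_2"], ["A", "A", "B", "B"])` flags both periods, because each period contains only one of the two groups; no reweighting of the patients can create the missing comparison.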
We merged the placebo arm (placebo A) of the AFFIRM study (published in 2006[2]), which tested natalizumab vs placebo, and the pooled placebo group (placebo B) from the DEFINE and CONFIRM studies (published in 2012[3,4]), which tested dimethyl fumarate vs placebo. The DEFINE and CONFIRM trials were approved by central and local ethics committees and conducted in accordance with International Conference on Harmonisation Good Clinical Practice guidelines and the Declaration of Helsinki.
We assessed the association of placebo A vs placebo B with the time to 6-month confirmed disability progression (CDP) as defined in the study reports,[2-4] preferring the 6-month to the 3-month confirmation (the primary end point of the randomized trials) to mimic what happens in observational studies. The PS was calculated with a logistic regression model that included all the baseline covariates common to the 3 trials. Cohen standardized mean differences between the 2 placebo groups were calculated in the original samples and after matching or weighting.
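The balance diagnostic mentioned above can be sketched in Python as follows. This is an illustrative sketch, not the authors' Stata code; the function names are hypothetical, and the use of weighted population variances pooled as a simple average of the 2 group variances is an assumption made here for brevity.

```python
import math


def weighted_mean_var(values, weights):
    """Weighted mean and (population-style) weighted variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_mean_difference(x_a, x_b, w_a=None, w_b=None):
    """Standardized mean difference of one covariate between groups A and B.

    Without weights this describes the original samples; with matching or
    IPW weights it describes the adjusted samples. Assumes the covariate is
    not constant in both groups (otherwise the pooled SD is 0).
    """
    w_a = w_a if w_a is not None else [1.0] * len(x_a)
    w_b = w_b if w_b is not None else [1.0] * len(x_b)
    mean_a, var_a = weighted_mean_var(x_a, w_a)
    mean_b, var_b = weighted_mean_var(x_b, w_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return (mean_a - mean_b) / pooled_sd
```

Computing the same quantity before and after adjustment, covariate by covariate, shows whether the PS method has brought the measured baseline characteristics of the 2 groups closer together. It says nothing about characteristics, such as the trial period, that do not overlap at all.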
First, we used a greedy 5-to-1-digit, 1:1 matching algorithm in which matched pairs are randomly selected from all possible pairs with equal PSs. Second, we applied an inverse probability weighting (IPW) approach, analyzing all patients first (untrimmed IPW) and then including only patients with a PS overlapping between arms (trimmed IPW). Finally, we applied marginal structural models (MSMs) to account for differences in withdrawal patterns before CDP. The 2 placebo groups were compared using a Cox model. Stata, version 16 (StataCorp), was used, and statistical significance was set at P < .05.
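The matching and weighting steps described above can be sketched in Python under stated assumptions. This is not the authors' Stata implementation. The sketch assumes that the PS is the estimated probability of belonging to placebo B, that "5-to-1-digit" means matching on the PS rounded to 5 decimal digits first and then to progressively fewer digits, that the IPW weights are unstabilized, and that trimming keeps patients whose PS lies in the range shared by both arms. All function names are hypothetical.

```python
import random


def greedy_digit_match(ps, group, seed=0):
    """Greedy 1:1 matching on the PS rounded to 5, then 4, ..., then 1 digit.

    group[i] is 0 (placebo A) or 1 (placebo B).
    Returns a list of (index_a, index_b) pairs.
    """
    rng = random.Random(seed)
    free_a = [i for i, g in enumerate(group) if g == 0]
    free_b = [i for i, g in enumerate(group) if g == 1]
    pairs = []
    for digits in range(5, 0, -1):
        # Bucket the still-unmatched A patients by rounded PS.
        buckets = {}
        for i in free_a:
            buckets.setdefault(round(ps[i], digits), []).append(i)
        # Shuffle so that pairs are picked at random among equal PSs.
        for candidates in buckets.values():
            rng.shuffle(candidates)
        rng.shuffle(free_b)
        unmatched_b = []
        for j in free_b:
            candidates = buckets.get(round(ps[j], digits))
            if candidates:
                pairs.append((candidates.pop(), j))
            else:
                unmatched_b.append(j)
        free_a = [i for candidates in buckets.values() for i in candidates]
        free_b = unmatched_b
    return pairs


def ipw_weights(ps, group):
    """Unstabilized weights: 1/ps for group 1, 1/(1 - ps) for group 0.

    A PS of exactly 0 or 1 raises ZeroDivisionError, which is the numerical
    face of a positivity violation.
    """
    return [1.0 / p if g == 1 else 1.0 / (1.0 - p) for p, g in zip(ps, group)]


def overlap_indices(ps, group):
    """Indices of patients whose PS lies within the range shared by both arms."""
    ps_a = [p for p, g in zip(ps, group) if g == 0]
    ps_b = [p for p, g in zip(ps, group) if g == 1]
    low = max(min(ps_a), min(ps_b))
    high = min(max(ps_a), max(ps_b))
    return [i for i, p in enumerate(ps) if low <= p <= high]
```

In this sketch the untrimmed IPW analysis would use `ipw_weights` on all patients, and the trimmed analysis would first restrict the sample to `overlap_indices`. The resulting pairs or weights would then be passed to a Cox model, which is outside the scope of this example.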
The baseline differences were consistently reduced by the PS adjustments. The unadjusted analysis indicated the superiority of placebo B over placebo A in association with CDP (hazard ratio, 0.64; 95% CI, 0.47-0.88; P = .006). The superiority of placebo B vs placebo A was confirmed and reinforced after adjustment using IPW and MSM methods (Figure). Only the 1:1 PS matching showed a nonsignificant difference between the 2 placebo arms (hazard ratio, 0.78; 95% CI, 0.53-1.16; P = .22), mainly because of the reduced number of patients after matching. The analysis of 3-month CDP gave identical results.
In this study, we showed the superiority of one placebo over another placebo using data from randomized clinical trials, adjusting the comparison with various PS methods and MSMs. Although these techniques were effective in mitigating the baseline differences between the compared cohorts (the 1:1 PS matching generally displayed the lowest Cohen standardized mean differences but the largest potentially informative losses), the superiority of the placebo in DEFINE-CONFIRM vs the placebo in AFFIRM did not decrease after PS adjustment. In this setting, the PS cannot adjust for the different periods in which the trials were conducted. It is well known in the epidemiological community that, when the positivity assumption is violated, PS methods cannot guarantee adjustment for treatment selection bias. Nevertheless, a violation of the positivity assumption is present in most of the observational studies published in the MS field, in which the treatment cohorts do not overlap in geographical areas[5,6] or periods (as in the present example). Therefore, the results of such comparisons must be interpreted with caution.
Corresponding Author: Maria Pia Sormani, PhD, Department of Health Sciences, University of Genoa, Via Pastore 1, Genova, Italy (mariapia.sormani@unige.it).
Published Online: April 20, 2020. doi:10.1001/jamaneurol.2020.0678
Author Contributions: Drs Signori and Pellegrini had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Signori, Pellegrini, Carmisciano, de Moor, Sormani.
Acquisition, analysis, or interpretation of data: Signori, Pellegrini, Bovis, de Moor, Sormani.
Drafting of the manuscript: Signori, Pellegrini, Bovis, Sormani.
Critical revision of the manuscript for important intellectual content: Signori, Pellegrini, Carmisciano, de Moor, Sormani.
Statistical analysis: Signori, Pellegrini, Bovis, de Moor, Sormani.
Administrative, technical, or material support: Pellegrini, de Moor.
Supervision: Pellegrini, Sormani.
Conflict of Interest Disclosures: Dr Pellegrini reported that he is a Biogen employee and owns stock in the company. Dr Bovis reported personal fees from Novartis and Eisai outside the submitted work. Dr Sormani reported personal fees from Biogen during the conduct of the study. No other disclosures were reported.
Funding/Support: This study was supported by Biogen.
Role of the Funder/Sponsor: Biogen had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
5. Kalincik T, Brown JWL, Robertson N, et al; MSBase Study Group. Treatment effectiveness of alemtuzumab compared with natalizumab, fingolimod, and interferon beta in relapsing-remitting multiple sclerosis: a cohort study. Lancet Neurol. 2017;16(4):271-281. doi:10.1016/S1474-4422(17)30007-8