Assessment of Treatment Effects and Long-term Benefits in Immune Checkpoint Inhibitor Trials Using the Flexible Parametric Cure Model

Key Points

Question: Does the flexible parametric cure model (FPCM) provide additional information compared with the classic Cox proportional hazards regression model in the analysis of randomized immune checkpoint inhibitor (ICI) clinical trials using progression-free survival as an end point?

Findings: This systematic review of reconstructed individual patient data extracted from ICI advanced or metastatic melanoma and lung cancer phase 3 trials provides empirical evidence that FPCM is a complementary approach to the Cox proportional hazards regression model. The FPCM allows estimation of treatment effects on the overall population and on the following components of the population: long-term responder fraction and progression-free survival in non–long-term responders.

Meaning: The findings of this review suggest that FPCM is a complementary approach that provides a comprehensive and pertinent evaluation of benefit and risk by assessing whether ICI treatment is associated with an increased probability of patients being long-term responders or with an improved progression-free survival in patients who are not long-term responders.


Progression-free survival end point
To evaluate the accuracy of the reconstructed individual patient data (IPD), we compared the number of events, the median progression-free survival (PFS), and the PFS rates at given time points reported in the published articles with the corresponding values obtained from the reconstructed data.
The number of events was reported in 7 publications (19 treatment arms). We found a median difference of 3.
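The comparison described above can be sketched as follows. This is a minimal illustration, assuming the difference is summarized as the median of the absolute arm-level differences (the source does not state whether signed or absolute differences were used). The function name and the event counts are hypothetical and are not data from the review.

```python
from statistics import median


def median_abs_difference(reported, reconstructed):
    """Median of the absolute per-arm differences between two lists of counts."""
    if len(reported) != len(reconstructed):
        raise ValueError("both lists must have one entry per treatment arm")
    return median(abs(r - c) for r, c in zip(reported, reconstructed))


# Hypothetical event counts for five treatment arms (illustration only).
reported_events = [120, 98, 210, 155, 87]
reconstructed_events = [118, 101, 207, 155, 90]

print(median_abs_difference(reported_events, reconstructed_events))
```

The same helper could be applied to median PFS or to PFS rates at fixed time points, since it only requires one reported and one reconstructed value per arm.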

Estimation of long-term responder fraction
Flexible parametric cure models were used to predict the long-term responder fraction by modeling the cumulative hazard, denoted H(t). In a proportional hazards (PH) model, the log cumulative hazard function was modeled using natural cubic splines. Covariates included in the distribution F (the time-dependent component) characterize a "short-term effect", but they do not describe survival for patients who are not long-term responders (Othus, Clinical cancer research 2012). As long as no time-dependent effects are modeled, the FPCM can be written as a proportional hazards model. For models with time-dependent effects, the PH assumption is violated and hazard ratios (HRs) may not provide a relevant summary measure of the treatment effect in Table 1. The PH assumption is not appropriate, for example, when the primary effect of a treatment is on the long-term responder fraction, with little difference in early outcome; nor is it appropriate when an early difference in outcome does not translate into a difference in the long-term responder fraction.
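As a worked illustration of why a PH model ties the early and late treatment effects together, the following sketch assumes a single binary treatment covariate x with log hazard ratio β, a baseline cumulative hazard H_0(t), and a baseline long-term responder fraction π_0. These symbols are introduced here for illustration and are not taken from the source.

```latex
\begin{align*}
\ln H(t \mid x) &= \ln H_0(t) + \beta x \\
S(t \mid x) &= \exp\{-H(t \mid x)\} = S_0(t)^{\exp(\beta x)} \\
\pi(x) &= \lim_{t \to \infty} S(t \mid x) = \pi_0^{\exp(\beta x)}
\end{align*}
```

Under this constraint, the single parameter β fixes both the early separation of the survival curves and the difference in long-term responder fractions, so the two cannot diverge unless time-dependent effects are added.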

Estimation of treatment effect in the non-long-term responder population
As proposed by Chen et al. (Chen, JASA 2012), progression-free survival in the non-long-term responder population was modeled as a function of the long-term responder fraction and the distribution characterizing the short-term effect:

$$S_u(t, x) = \frac{S(t, x) - \exp[-\exp(\gamma_{00} + x\beta)]}{1 - \exp[-\exp(\gamma_{00} + x\beta)]}$$

where $S(t, x)$ is progression-free survival in the overall population and $\exp[-\exp(\gamma_{00} + x\beta)]$ is the long-term responder fraction. Progression-free survival in the non-long-term responder population was predicted using the Newton-Raphson algorithm. In the non-long-term responder fraction, the treatment effect was therefore measured by a time-dependent hazard ratio with corresponding 95% confidence intervals (obtained by a robust bootstrap method with 1000 samples). Time-varying hazard ratios and their corresponding confidence intervals were interpreted from graphs.
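The relationship between overall survival, the long-term responder fraction, and survival in non-long-term responders can be sketched numerically. The sketch below writes the long-term responder fraction as exp[-exp(gamma00 + beta * x)]. The function names, parameter names, and numeric values are assumptions chosen for illustration, not estimates from the review.

```python
import math


def long_term_responder_fraction(gamma00, beta, x):
    """Long-term responder fraction pi(x) = exp(-exp(gamma00 + beta * x))."""
    return math.exp(-math.exp(gamma00 + beta * x))


def survival_non_responders(s_overall, pi):
    """Survival among non-long-term responders: (S - pi) / (1 - pi)."""
    if not 0.0 <= pi < 1.0:
        raise ValueError("pi must lie in [0, 1)")
    if s_overall < pi:
        raise ValueError("overall survival cannot fall below the plateau pi")
    return (s_overall - pi) / (1.0 - pi)


# Hypothetical parameter values (illustration only).
gamma00, beta = 0.5, -0.7
for arm in (0, 1):
    pi = long_term_responder_fraction(gamma00, beta, arm)
    # Overall PFS of 0.60 at some time point is an assumed input.
    s_u = survival_non_responders(0.60, pi)
    print(f"arm={arm} pi={pi:.3f} S_u={s_u:.3f}")
```

As overall survival approaches the plateau pi, survival among non-long-term responders approaches zero, which is the behavior the formula is meant to capture.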

Royston-Parmar model - Goodness-of-fit and location of internal knots
In the Royston-Parmar model, the mathematical expression of the log cumulative hazard is similar to that of the FPCM (Appendix C):

$$\ln\{H(t, x)\} = s\{\ln(t), \gamma, k_0\} + x\beta + s\{\ln(t), \delta_1, k_1\}\,x$$

where $s\{\cdot\}$ denotes a restricted cubic spline function of $\ln(t)$, with knots $k_0$ for the baseline and $k_1$ for the time-dependent effect. To improve the stability of the fitted function, the boundary knots of the restricted cubic splines are located at the extremes of the uncensored survival times. To allow flexibility (Royston, Stat Med 2002), restricted cubic splines with one to six interior knots were examined to model the baseline hazard. For the time-dependent effect, the number of internal knots investigated varied between 1 and 5. Models with the lowest AIC and BIC were considered to have the best fit. When AIC and BIC were discordant, the BIC was used. The BIC favors the most parsimonious model, with the lowest number of knots, and was preferred to limit the risk of over-parametrization. To evaluate the sensitivity of the FPCM to the number of knots, we compared the FPCM hazard ratios and long-term responder fractions across different knot positions and varying numbers of knots. The sensitivity analysis was performed on the 6 models with the lowest BIC.
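The selection rule described above (lowest AIC and BIC, with BIC deciding when the two disagree) can be sketched as follows. The log-likelihoods, parameter counts, and sample size are hypothetical. Using the number of patients as the sample size in the BIC is also an assumption, since some survival software uses the number of events instead.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion (n_obs: assumed sample size)."""
    return n_params * math.log(n_obs) - 2 * log_lik


def select_model(fits, n_obs):
    """Return (chosen_fit, concordant); BIC decides when AIC and BIC disagree."""
    best_aic = min(fits, key=lambda f: aic(f["log_lik"], f["n_params"]))
    best_bic = min(fits, key=lambda f: bic(f["log_lik"], f["n_params"], n_obs))
    return best_bic, best_aic is best_bic


# Hypothetical candidate fits with 1, 3 and 6 interior knots (illustration only).
candidate_fits = [
    {"knots": 1, "log_lik": -500.0, "n_params": 3},
    {"knots": 3, "log_lik": -496.0, "n_params": 5},
    {"knots": 6, "log_lik": -492.5, "n_params": 8},
]

chosen, concordant = select_model(candidate_fits, n_obs=400)
print(chosen["knots"], concordant)
```

In this made-up example the AIC favors the 6-knot model while the BIC favors the 1-knot model, so the rule returns the more parsimonious fit.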

Sensitivity Analysis
When the automatic knot-finding process could not be performed, the sensitivity analysis of Checkmate-037 was conducted by manually assigning internal knots. The automatic process was not feasible when several knots were located at exactly the same time point. To avoid duplicate knot positions, one of the knots was therefore shifted to the next percentile of the event times.
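The knot-shifting step can be sketched as follows. This is a minimal illustration that assumes interior knots are placed at equally spaced percentiles of the uncensored event times and that "next percentile" means the next whole percentile. The source assigned the knots manually, so this automated loop is an assumption rather than the authors' procedure, and the event times are invented.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one event time")
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def interior_knots(event_times, n_knots):
    """Place interior knots at equally spaced percentiles, shifting duplicates upward."""
    times = sorted(event_times)
    knots = []
    for i in range(1, n_knots + 1):
        p = 100.0 * i / (n_knots + 1)
        value = percentile(times, p)
        # If the knot coincides with the previous one, move to the next percentile.
        while knots and value <= knots[-1] and p < 100.0:
            p = min(p + 1.0, 100.0)
            value = percentile(times, p)
        knots.append(value)
    return knots


# Hypothetical event times with many ties at the first assessment (illustration only).
tied_times = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5]
print(interior_knots(tied_times, 3))
```

With heavily tied event times, as happens when progression is assessed at scheduled visits, the 25th and 50th percentiles coincide and the second knot is moved up until it is distinct from the first.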

Short-term treatment effect
The following