Model of a Queuing Approach for Patient Accrual in Phase 1 Oncology Studies

Key Points
Question Can the duration of phase 1 studies using either the 3 + 3 or rolling 6 designs be substantially reduced without exceeding the patient risk limits or changing the operating characteristics of the parent design?
Findings This decision analytical model found that the modified study designs were associated with reduced expected study durations. The modified designs were associated with minimal changes in the number of patients treated and the determination of the maximum tolerated dose, without changing the operating characteristics or exceeding the risk limits of the parent design.
Meaning Per this analysis, a substantial reduction in the time required to bring new advances to the clinic can be accomplished by simple modifications of 2 commonly used phase 1 trial designs.

is consistently almost 3 times the number of phase 3 studies (eTable 1 in the Supplement). In addition, while multiple phase 2 and 3 studies can run in parallel, they must wait for the relevant phase 1 study to complete. Because phase 1 study duration is often slot limited and minimally affected by adding sites, we must look elsewhere for opportunities to alter this early bottleneck that delays therapeutic advances in oncology. Here, we focus on the patient queue and evaluation process to reduce phase 1 study duration. As a proof of principle, we focus on the traditional 3 + 3 phase 1 design 5 and the rolling 6 design often used in pediatric studies. 6 To our knowledge, this represents the most beneficial modification of 2 of the most commonly used designs (1 for adults and 1 for pediatric populations) to reduce expected study duration without affecting the operating characteristics. These new designs have been successfully implemented in several completed and ongoing clinical trials.

Methods
Phase 1 oncology designs typically consider only the first cycle of experimental treatment in both the determination of dose-limiting toxicities (DLTs) and the formal guidelines for dose-escalation decisions. To limit the number of patients at risk for a DLT, most phase 1 designs restrict the number of patients enrolled on the current dose level during the first cycle of experimental therapy. This is our starting point for the study of the phase 1 queue. The activities performed for the purposes of this simulation study are not considered to be human subjects research (per US Department of Health and Human Services regulation under the 45 CFR 46 Common Rule). Figure 1 illustrates the major steps involved in a phase 1 trial that form the basis for the patient queue and are reflected in the simulation tool. After each patient provides consent or any change occurs in any patient's evaluation status, the protocol team decides the dose level and availability of slots for the accrual of additional patients based on the accumulated data and the protocol-specified phase 1 design. The decision can be to escalate (accrue at the next higher dose level), accrue at the same dose level, deescalate (accrue at the next lower dose level), hold accrual, or end accrual and either declare a maximum tolerated dose (MTD) or declare that the lowest dose level tested is too toxic. The phase 1 design guides these decisions.
We modified the 3 + 3 design and rolling 6 design decisions to better accommodate the phase 1 queue. The modifications were evaluated on the time and number of patients required to determine the MTD while constrained to (1) inherit the maximum level of patient risk associated with the parent design, (2) provide at least an equally rigorous assessment of toxicity, (3) maintain the parent design's operating characteristics with respect to the MTD determination, and (4) revert to the parent design if accrual is slow or the principal investigator chooses to delay securing consent from patients when the queue-based designs permit accrual but the parent design does not.
The respective queue-based modifications ( Table 1 and Table 2) 7,8 are denoted the IQ 3 + 3 and the IQ rolling 6 designs. At any time, on the current dose level, there is the total number of patients enrolled (eg, promised or taking a slot, excluding those deemed inevaluable for DLT determination), the number of DLTs, and the number of patients evaluable. The number evaluable represents patients who were fully assessed and either had a DLT or did not (termed a pass), whereas the difference between total and evaluable numbers represents patients whose evaluations are pending.
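The bookkeeping described above can be sketched as a small data structure. This is only an illustration of the counts named in the text; the class name `DoseLevelState` and its field names are assumptions introduced here and do not come from the simulation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoseLevelState:
    """Counts tracked at the current dose level (names are illustrative)."""

    enrolled: int   # patients promised or taking a slot, excluding inevaluable
    evaluable: int  # fully assessed: either had a DLT or passed
    dlts: int       # evaluable patients who had a DLT

    def __post_init__(self):
        if not 0 <= self.dlts <= self.evaluable <= self.enrolled:
            raise ValueError("expected 0 <= dlts <= evaluable <= enrolled")

    @property
    def passes(self) -> int:
        # Evaluable patients who completed evaluation without a DLT.
        return self.evaluable - self.dlts

    @property
    def pending(self) -> int:
        # Enrolled patients whose evaluations are not yet complete.
        return self.enrolled - self.evaluable


state = DoseLevelState(enrolled=4, evaluable=2, dlts=1)
print(state.passes, state.pending)  # prints "1 2"
```

Deriving the pass and pending counts from the three stored counts keeps the state internally consistent, which matters when the state is updated after every consent or evaluation change.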
The column for the 3 + 3 design can be seen as representing the traditional rules: accrue 3 patients, escalate with 0 DLTs, deescalate with 2 DLTs, and expand to 6 patients with 1 DLT (in which case 2 or more DLTs result in dose deescalation and 1 DLT in 6 patients results in dose escalation if the next higher dose is open; the MTD requires 6 patients treated with at most 1 DLT). Both IQ-based modifications can have up to 8 patients per dose level during the escalation phase. In addition, when deescalating, both IQ designs can apply the rule that patients should not be denied treatment once given a consent form, which can place as many as 10 patients on a dose level (this occurs in <0.1% of simulations).
e The action to be taken for the next patient for the IQ design and the parent design, respectively. If a patient pending evaluation on a lower dose experiences a dose-limiting toxicity, the principal investigator in consultation with the sponsor may choose to reduce the dose level of any patients currently on a higher dose level, pending review of the adverse event data.
f If the next higher dose level is not available (there is no higher dose level or the higher dose level was already tested and found to be too toxic), a maximum of 8 patients can be treated at the current dose level and the principal investigator should declare the MTD with 0 or 1 dose-limiting toxicity in 6 (or 0 in 5) patients. For IQ 3 + 3, no more than 4 patients at risk are allowed, with no more than 6 patients at risk for the IQ rolling 6 design. Two dose-limiting toxicities in 7 or 8 patients means that the principal investigator can also declare the MTD (and continuing to use monitoring rules for the expanded cohort is suggested).
g Current level exceeds the MTD. The MTD is the highest level at which less than 33% of patients had dose-limiting toxicities, with at least 6 patients evaluable.
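The traditional 3 + 3 rules and the MTD criterion in footnote g can be written as two short functions. This is a simplified sketch, not the decision tables themselves: it assumes no patients are pending and that the next higher dose is open, it does not cover the 0-of-5 case in footnote f, and the function names and action labels are hypothetical.

```python
def three_plus_three_action(evaluable: int, dlts: int) -> str:
    """Traditional 3 + 3 decision once a cohort of 3 or 6 is fully evaluated.

    Assumes no patients are pending and the next higher dose is open.
    """
    if evaluable not in (3, 6):
        raise ValueError("decisions are made after 3 or 6 evaluable patients")
    if dlts >= 2:
        return "deescalate"
    if evaluable == 3:
        return "escalate" if dlts == 0 else "expand to 6"
    return "escalate"  # at most 1 DLT in 6 patients


def qualifies_as_mtd(evaluable: int, dlts: int) -> bool:
    """Footnote g: at least 6 evaluable patients and fewer than 33% with a DLT."""
    return evaluable >= 6 and dlts / evaluable < 0.33


print(three_plus_three_action(3, 1))  # prints "expand to 6"
print(qualifies_as_mtd(7, 2))         # prints "True"
```

Note that 2 DLTs in 6 patients (33.3%) fails the criterion while 2 DLTs in 7 patients (28.6%) meets it, which matches the allowance in footnote f for declaring the MTD with 2 DLTs in 7 or 8 patients.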

JAMA Network Open | Statistics and Research Methods
As an example modification, for the IQ 3 + 3 design (Table 1; row 4), enrolling a fourth patient at the same dose level, knowing that 1 of 3 patients has a pass with 2 patients pending, is considered less risky than putting 3 patients at risk without any patient data, as is permitted by the 3 + 3 design. This modification therefore does not exceed the risk allowed by the 3 + 3 design; rather, it starts a patient through the process in case a patient who started earlier does not meet screening criteria or becomes inevaluable. This is the key principle behind the IQ 3 + 3 design. Additional details and rationale for special IQ 3 + 3 decisions can be found in the eAppendix in the Supplement, including why and when up to 8 patients can be accrued during dose-level escalation.
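The risk comparison in this example reduces to counting patients whose evaluations are pending. The following sketch works through that arithmetic; the helper name `at_risk_without_data` is an assumption made for illustration.

```python
def at_risk_without_data(enrolled: int, evaluable: int) -> int:
    """Patients enrolled at a dose level whose DLT evaluation is still pending."""
    return enrolled - evaluable


# Parent 3 + 3 at the start of a cohort: 3 enrolled, none evaluated yet.
parent_at_risk = at_risk_without_data(enrolled=3, evaluable=0)

# IQ 3 + 3 example: 1 of 3 has passed, 2 are pending, and a fourth is enrolled.
iq_at_risk = at_risk_without_data(enrolled=3 + 1, evaluable=1)

print(parent_at_risk, iq_at_risk)  # prints "3 3"
```

The number of patients with pending evaluations is the same in both cases, but in the IQ 3 + 3 case one pass has already been observed at that dose level.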
For the IQ rolling 6 design (Table 2), the parent design allows 6 patients to be put at risk but does not allow escalation if the first 3 patients pass when there are patients pending. As a result, the rolling 6 design does not achieve its anticipated speed advantage. 9 The IQ rolling 6 design allows escalation with 3 or more patients who are treated and fully evaluated with no DLTs, with the additional knowledge that none of the patients whose data are pending or who have begun treatment have reported a DLT, which is consistent with the rules for the IQ 3 + 3 design with patients pending. The decision rules have also been modified so the IQ rolling 6 design improves patient flow but does not exceed the maximum risk permitted in the rolling 6 design (eAppendix in the Supplement).
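The contrast between the two escalation rules can be sketched as a pair of predicates. This covers only the zero-DLT case described in the text, not the full decision table in Table 2, and the function names are hypothetical.

```python
def rolling6_may_escalate(passes: int, dlts: int, pending: int) -> bool:
    """Parent rule as described here: no escalation while anyone is pending."""
    return passes >= 3 and dlts == 0 and pending == 0


def iq_rolling6_may_escalate(passes: int, dlts: int, pending: int) -> bool:
    """IQ rule as described here: pending patients do not block escalation.

    `dlts` counts every DLT reported at the dose level, including any from
    patients who have begun treatment but are not yet fully evaluated.
    `pending` is accepted only to keep the two signatures parallel.
    """
    return passes >= 3 and dlts == 0


# Three patients have passed, none has reported a DLT, two are still pending.
print(rolling6_may_escalate(3, 0, 2))     # prints "False"
print(iq_rolling6_may_escalate(3, 0, 2))  # prints "True"
```

The two rules agree whenever no patient is pending, which is how the IQ rolling 6 design reverts to the parent design under slow accrual.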
Using the decision rules in Tables 1 and 2, the operating characteristics of the 3 + 3 and IQ 3 + 3 designs and the rolling 6 and IQ rolling 6 designs were evaluated for 12 scenarios detailed in Table 3, each motivated by our clinical trial experience. In each scenario, we specify a starting dose level, the lowest and highest dose levels, and other parameters (Table 3). The maximum waiting time was assumed to be 0 in the simulations, representing patients who will not wait if slots are unavailable and instead will be accrued to a different phase 1 study (or treated outside of a study).
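The effect of a maximum waiting time of 0 can be illustrated with a toy queue in which an arriving patient is lost whenever no slot is open. This is a minimal sketch and not the simulation tool used in this study: exponential interarrival times, a fixed slot duration, a fixed number of slots, and all parameter names are assumptions made for illustration.

```python
import random


def simulate_accrual(n_arrivals, mean_interarrival_days, slot_days, max_slots, seed=0):
    """Count patients who get a slot vs those lost because none was open."""
    rng = random.Random(seed)
    now = 0.0
    release_times = []  # day on which each occupied slot frees up
    accrued = lost = 0
    for _ in range(n_arrivals):
        # Assumed arrival process: exponential gaps between patients.
        now += rng.expovariate(1.0 / mean_interarrival_days)
        # Drop slots that have already been released by the arrival time.
        release_times = [t for t in release_times if t > now]
        if len(release_times) < max_slots:
            release_times.append(now + slot_days)
            accrued += 1
        else:
            lost += 1  # maximum waiting time of 0: the patient goes elsewhere
    return accrued, lost


accrued, lost = simulate_accrual(
    n_arrivals=200, mean_interarrival_days=10.0, slot_days=28.0, max_slots=3
)
print(accrued + lost)  # prints "200"
```

Holding the arrival stream fixed and varying `max_slots` shows how slot limits, rather than patient availability, can determine how quickly patients are accrued.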
Scenarios A1 through A3 (Table 3)  in which we looked to demonstrate the result of an increase in screen failures, and scenario D, in which we modeled a specific clinical trial with a known screen failure rate.
Scenario B was provided to compare the simulation of the IQ designs with their respective parent designs for a safety lead-in study. Scenario C1 is based on a study of blinatumomab and lenalidomide (NCT02568553). This study's experimental treatment was introduced on cycle 2, increasing the number of inevaluable patients (whose data could not be considered for a dose-escalation decision), while simultaneously allowing patients to hold a slot for an extended period. Early in the study, two-thirds of the patients were inevaluable, and combined with the cycle 2 evaluation, this provides the opportunity for a substantial improvement when using queue-based methods. We amended the study to change from a 3 + 3 design to an IQ 3 + 3 design, which was subsequently approved by the Cancer Therapy Evaluation Program and the central institutional review board based on the simulation work. Scenarios C2 and C3 are variations that show the result of changes in accrual rate or inevaluability rate.
Scenario D is based on a study of intraperitoneal chemotherapy (NCT00825201), in which surgery preceding chemotherapy increased the time a patient can occupy a slot; the cycle length was 28 days, but surgery and recovery could delay chemotherapy up to 90 days (screening time). Forty percent of the patients were ineligible per screening criteria, but only 7.5% were inevaluable for DLTs. We estimated the screening distribution and the arrival distribution from the study data.


Results
For all scenarios, the IQ 3 + 3 design had shorter expected study durations than the 3 + 3 design, with reductions ranging from 1.6 to 10.4 months (Figure 2A), and likewise, the IQ rolling 6 design had shorter expected study durations than the rolling 6 design, with reductions ranging from 0.4 to 10.5 months (Figure 2B). There was a small increase in the mean number of patients treated on the IQ 3 + 3 design (difference in mean number of patients in all scenarios, <3.2 patients; Figure 2C), while the IQ rolling 6 design had a smaller mean number of patients than the rolling 6 design in 9 of the 12 scenarios, with a difference that did not exceed 3.3 patients (Figure 2D). Detailed results can be found in eTables 3 and 4 in the Supplement.
For scenario A1, the typical phase 1 study in the consortium, the expected (mean) duration of the phase 1 study using the traditional 3 + 3 design was 19.5 months (range, 7.1-41.3 months), whereas the expected duration was 15.8 months (range, 5.3-27.0 months) for the IQ 3 + 3 design. This represents an expected reduction of 3.7 months (difference in medians, 3.6 months; a 19% reduction). There was a mean increase of 2.8 patients. In eTable 5 in the Supplement, we extended this mean scenario to 9 dose levels, so the MTD would almost assuredly be reached before the highest allowable dose level. In that setting, the expected duration was reduced by 5.4 months (26.0 vs 20.6 months; a 21% reduction), and we also present additional metrics (expected percentage of patients on each dose level) that confirm that the operating characteristics of the 3 + 3 design are maintained.
By reducing inevaluability to 3.6% (scenario A2), the expected durations of both the 3 + 3 and the IQ 3 + 3 designs were shorter compared with scenario A1. The expected times were 16.5 vs 13.9 months, respectively (a 16% difference).
When increasing the rate of patients who are inevaluable to 44% (scenario A3), the expected durations of the 3 + 3 and the IQ 3 + 3 designs increased. The reduction because of the IQ-based design became 6.4 months (24%), compared with a reduction of 3.7 months (19%) associated with scenario A1.
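The percentage reductions quoted for scenario A1, its 9-dose-level extension, and scenario A2 follow directly from the reported expected durations. The short sketch below reproduces that arithmetic; the helper name `reduction` is an assumption, and only durations stated in the text are used.

```python
def reduction(parent_months, iq_months):
    """Return (months saved, percentage reduction relative to the parent design)."""
    saved = parent_months - iq_months
    return saved, 100.0 * saved / parent_months


# Expected durations in months (parent design, IQ design) as reported above.
reported = {
    "A1": (19.5, 15.8),
    "A1, 9 dose levels": (26.0, 20.6),
    "A2": (16.5, 13.9),
}
for name, (parent, iq) in reported.items():
    saved, pct = reduction(parent, iq)
    print(f"{name}: {saved:.1f} months ({pct:.0f}%)")
# A1: 3.7 months (19%)
# A1, 9 dose levels: 5.4 months (21%)
# A2: 2.6 months (16%)
```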
With all designs, a higher number of patients who are inevaluable tended to reduce the dose level selected as the MTD (eTables 3 and 4 in the Supplement). This is because of the competing events of a patient experiencing a DLT and a patient being deemed inevaluable. As an extreme example, if the DLT evaluation period was very long, all patients would eventually be inevaluable, so the only patients who are evaluable would be those with a DLT before being deemed inevaluable, making all doses appear toxic.
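The competing-events argument can be made concrete with a simple closed-form model. As an assumption made purely for illustration (it is not the model used in the simulations), suppose the time to a DLT and the time to becoming inevaluable are independent exponential clocks with constant daily hazards, and a patient passes only if neither event occurs before the end of the evaluation window. The function and parameter names are hypothetical.

```python
import math


def observed_dlt_rate(dlt_hazard, inevaluable_hazard, window_days):
    """DLT fraction among evaluable patients under two competing exponential clocks."""
    total = dlt_hazard + inevaluable_hazard
    # Pass: neither event occurs before the evaluation window ends.
    p_pass = math.exp(-total * window_days)
    # DLT: some event occurs within the window and the DLT clock fires first.
    p_dlt = dlt_hazard / total * (1.0 - p_pass)
    # Inevaluable patients are excluded from the denominator.
    return p_dlt / (p_dlt + p_pass)


no_dropout = observed_dlt_rate(0.005, 0.0, 28)
with_dropout = observed_dlt_rate(0.005, 0.02, 28)
print(with_dropout > no_dropout)  # prints "True"
```

Under these assumptions the same underlying DLT hazard looks more toxic when inevaluability is higher, and the observed rate approaches 100% as the evaluation window grows, which mirrors the extreme example in the text.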
In scenarios A4 through A7, we explored the results of (1) increasing screen failures, (2) increasing the availability of patients (reducing the interarrival time), (3) reducing the course length, and (4) increasing toxicity. The IQ designs had uniformly shorter expected durations (reduction of 6.7 months, 4.2 months, 3.7 months, and 3.1 months, respectively). The frequency of the selection of the MTD between the IQ design and the parent design differed by a range of less than 0.1% to 3%, and the mean number of DLTs at a dose level greater than the MTD differed by 0.01 to 0.10 patients.
Figure 2. Box-Whisker Plots of Study Duration
For scenario B, which was considered the safety lead-in, even though 77.8% and 77.1% of the simulations selected the starting dose as the MTD (for the 3 + 3 and the IQ 3 + 3, respectively), the expected study duration was reduced from 7.6 months to 6.0 months (a 21% reduction). The expected number of patients differed by 0.6 patients.
For scenario C1, the expected study duration was reduced from 34.2 months to 24.5 months, a 9.7-month (28%) reduction. This was expected given the inevaluable rate of 66% (with a 30% screen failure rate) and the amount of time a patient will be pending (up to 84 days). The

Discussion
The phase 1 queue depends on the treatment, patient population, and specific language in the protocol. For example, the screen failure rate is dependent on specific eligibility criteria, insurance issues, and the frailty of the patient population. Inevaluability for the purpose of DLT determination depends in part on the length of the evaluation period, the frailty of the patient population, the frequency of rapid disease progression, and other causes of drug discontinuation or unplanned dose reductions that are not associated with adverse events attributed to the dose of the investigational agent (or agents). Patients may also be considered inevaluable for DLT determination if critical tests are missed.
We observed in our phase 1 studies that these details can dramatically affect phase 1 study duration and that such imperfections are present in every study. We can adapt designs to these imperfections to reduce the study duration while not exceeding the risk permitted in the parent design or affecting the operating characteristics. Because the rolling 6 design was originally developed for special settings, such as pediatrics, after the adult study, we compared the 3 + 3 with the IQ 3 + 3 design and the rolling 6 with the IQ rolling 6 design. In our typical phase 1 study, the IQ 3 + 3 design is expected to save 3.7 months in study duration compared with the 3 + 3 design, and  12 in which prior data were judged sufficient to allow 6 patients to accrue, analogous to the pediatric setting. Since the first queue-based variation of the 3 + 3 design started accrual in May 2012, 10 the experience of investigators, coordinating centers, and statistical teams with these designs has grown. Their use has increased, and additional refinements have been implemented that are reflected in the rules and simulations presented here.
The use of queuing methods to evaluate and optimize the operating characteristics of phase 1 designs or other aspects of clinical research is a nascent field. As with the parent designs, the determination of the MTD with the IQ variations is not usually the end of the dose-finding study. An expansion cohort at the recommended phase 2 dose or MTD is recommended with additional monitoring rules (as noted in the eAppendix in the Supplement for 1 study [NCT02568553]). When appropriate, a randomized dose-ranging phase 2 study evaluating candidate doses may also be suggested. 13 Expanding the queue-based methods beyond the MTD determination is one possible area of future work. Queue-based modifications of DLT-rate targeting phase 1 designs with an implicit or explicit queue can also be explored. 14,15
The direct cost of the IQ 3 + 3 design when compared with the traditional 3 + 3 design comes in the form of a few extra patients. There is no similar cost when converting the rolling 6 to the IQ rolling 6 design. However, the IQ rolling 6 design does allow escalation with patients pending, which is less conservative than the rolling 6 design. Indirect costs for the IQ designs include more frequent review of the data and more clinical judgment as to whether to add new patients or escalate with patients pending (when permitted) or to hold accrual and revert to the parent design.

Limitations
We did not model the time to acquire data or the decision time, which is under the control of the data coordinating center and is usually short relative to the other intervals. We also did not implement separate screening-time distributions for patients considered screen failures vs successes or model delays in treatment.

Conclusions
The IQ 3 + 3 and the IQ rolling 6 designs should be considered as alternatives to the 3 + 3 and rolling 6 designs, respectively. The IQ designs better adapt to the patient queue to reduce study duration without exceeding the risk limits of the parent design or affecting operating characteristics.
