The width of the lines is proportional to the patient distribution from 1 clinical category to the next. Int indicates intermediate.
Power is plotted as a function of sample size for a given anatomical risk–based and minimum residual risk of recurrence score (rRS)-based eligibility criteria under 4 trial simulation scenarios (described in Table 1).
eFigure. Example of distribution of 10-year residual risk of recurrence in a simulated trial under Scenario 1 (A) and under Scenario 4 (B)
eTable. Patient characteristics and median 10-year residual risk of recurrence from Adjuvant! Online
eAppendix. Methods of Simulation
Wei W, Kurita T, Hess KR, et al. Comparison of Residual Risk–Based Eligibility vs Tumor Size and Nodal Status for Power Estimates in Adjuvant Trials of Breast Cancer Therapies. JAMA Oncol. 2018;4(4):e175092. doi:https://doi.org/10.1001/jamaoncol.2017.5092
Could refinements in how we define trial eligibility improve the reliability of power and event predictions in randomized clinical trials of adjuvant breast cancer therapies?
In this modeling study of 4 clinical trial simulation scenarios tested for 443 patients, using a minimum residual risk threshold to define eligibility led to more reliable power calculations than eligibility based on categories of tumor size and nodal status.
Defining trial eligibility based on a minimum residual risk threshold, calculated by a validated multivariate model, could ensure that the minimum required number of events occur in the control arm of a trial.
Many large adjuvant clinical trials end up underpowered because of fewer than expected events in the control arm. Ensuring a minimum number of events would result in more informative trials.
To calculate individualized residual risk estimates using residual risk prediction software and assess whether defining eligibility based on a minimum residual risk threshold could increase the reliability of clinical trial power calculations compared with eligibility criteria based on tumor size and nodal status.
Design, Setting, and Participants
We estimated residual risk in 443 consecutive patients with early-stage breast cancer and assessed clinical trial power as a function of residual risk distribution among the accrued patients. We defined residual risk as the risk of recurrence that remains despite receipt of standard-of-care therapy; this risk is determined by baseline prognostic risk and by the improvement from adjuvant therapy. We performed trial simulations to examine how the power of a 2-arm, 1:1 randomized clinical trial would change as the residual risk distribution of the trial population that met eligibility criteria based on tumor size and nodal status changes. We also simulated trials that use a minimum residual risk value as eligibility criterion.
Main Outcomes and Measures
Residual risk; clinical trial power as a function of residual risk distribution among the patients.
In the 443 patients (mean [SD] age, 56.1 [12.3] years; range, 23-89 years), baseline prognostic and residual risks differed substantially: 328 (74%) patients had more than 20% baseline risk of recurrence; however, after adjustment for treatment effect only 120 (27%) had more than 20% residual risk. We assessed residual risk distribution in patient cohorts that met tumor size– and nodal status–based eligibility criteria for 3 currently accruing randomized adjuvant trials; the median residual risks were 28% (interquartile range [IQR], 25%-31%), 22% (IQR, 15%-28%), and 22% (IQR, 15%-28%), respectively, indicating that the power of these trials could vary unpredictably. Simulations showed that trials that use anatomical risk–based eligibility criteria can become underpowered if they accrue patients with low residual risk despite all participants meeting eligibility requirements. Using a minimum required residual risk threshold as eligibility criterion produced more reliable power calculations.
Conclusions and Relevance
When tumor size and nodal status are used to determine trial eligibility, the residual risk of recurrence can vary broadly, leading to unstable power estimates. The success of future adjuvant trials could be improved by defining patient eligibility based on a minimal residual risk of recurrence, and these trials can achieve a prespecified power with smaller sample sizes.
Death from breast cancer has decreased over the past 25 years, and 85% of patients with early-stage breast cancer do not die from their disease.1 It is important that future adjuvant trials be restricted to patients who remain at risk of recurrence. Inclusion of low-risk patients, who are likely to be cured by existing therapies, reduces the statistical power to detect treatment effect. Adjuvant trials usually define patient eligibility based on a combination of baseline tumor characteristics including tumor size, lymph node involvement, and hormone receptor status that broadly reflect the prognostic risk of a patient population at diagnosis. Each of these variables contributes to risk, but they are not combined into a multivariable risk model that appropriately weighs and sums up the contributions of each risk factor; instead, these variables are used to create multiple binary categories that define eligibility. For example, eligibility might be defined as (1) patients with tumors larger than 2 cm are eligible, and also (2) patients with node-positive disease with any tumor size are eligible. Using these criteria, the baseline risk of recurrence will vary broadly among the eligible patients (eg, consider a 2.2-cm node-negative cancer vs a 4.5-cm cancer with 4 positive nodes). Equally important, baseline tumor size– and nodal status–based eligibility does not capture the expected improvement in the risk of recurrence that standard-of-care adjuvant therapies administered in the control arm will bring about. A critical component of the power and sample size calculations is the assumption about the number of events that will occur in the control arm. If the event rate in the control arm, which is determined by the baseline risk distribution of the study population and also by the survival improvement due to standard-of-care systemic adjuvant therapy, is misjudged, the power estimates to demonstrate improvement with the experimental therapy will become inaccurate.
There are several validated multivariable residual risk predictors such as Adjuvant! Online or PREDICT that consider both the baseline anatomical risk factors and the expected benefit from adjuvant therapy to estimate the risk of recurrence, or death, that remains after completing all therapy.2-4 We call this residual risk because it is the risk of recurrence that remains despite receiving standard-of-care therapy. Residual risk determines the event rates in the control arm of a randomized adjuvant trial. Despite the ready availability of these online tools, individualized residual risk estimates are not used to define trial eligibility.
To illustrate prognostic heterogeneity of patients accrued based on tumor size and nodal status, consider the following examples. The currently ongoing NRG BR003 adjuvant chemotherapy trial (NCT02488967) for patients with triple-negative breast cancer (TNBC) includes patients with node-negative disease if the primary tumor is larger than 3 cm or patients with node-positive disease with any tumor size (excluding T4 disease). The residual risk of patients who meet these eligibility criteria can vary broadly; the 10-year residual risk of recurrence, estimated by Adjuvant! Online, for a patient with a 4.5-cm, grade 3 TNBC with 4 positive lymph nodes treated with a third-generation adjuvant chemotherapy is 45%, while another patient who also meets study eligibility criteria with a 0.8-cm, grade 3 TNBC with 1 positive lymph node treated with a third-generation adjuvant chemotherapy has only 15% residual risk of recurrence. The NSABP-B55/BIG 6-13 adjuvant olaparib trial (NCT02032823) defines eligibility for TNBC as larger than 2 cm if node negative, or any tumor size if node positive, and for estrogen receptor (ER)-positive human epidermal growth factor receptor 2 (ERBB2/HER2)-negative cancers as at least 4 pathologically confirmed positive nodes regardless of tumor size. Among eligible patients, a 55-year-old woman with a 2-cm, grade 2, ER-positive cancer with 4 positive nodes, treated with adjuvant endocrine therapy and third-generation chemotherapy, would have an 18% residual risk of death at 10 years, calculated by PREDICT V2.0, while a patient with TNBC with the same disease characteristics would have a 40% risk. This broad range of residual risk among eligible patients implies that the power of these trials to detect a given survival difference between treatment arms is susceptible to the residual risk distribution of the accrued patient population.
Adjuvant breast cancer trials have increasingly faced the challenge of fewer events occurring in trial populations than anticipated in the statistical design.5,6 A recent example is the ALTTO trial, which reported only 555 breast cancer events at a median follow-up of 4.5 years instead of the 850 events assumed in the study design.6 The event rates for the control arm (ie, trastuzumab + chemotherapy) were estimated from the earlier HERA trial that compared 2 years of treatment with trastuzumab with 1 year of treatment, and compared 1 year of trastuzumab vs observation; all patients received adjuvant chemotherapy concurrent or sequential to trastuzumab.7 The eligibility criteria for the HERA and ALTTO trials were almost identical and required either node-positive disease, or node-negative disease with tumor size of at least 1 cm; both ER-positive and ER-negative patients were eligible. In the ALTTO trial, 41% of the participants had T1 (<2 cm) tumors, 40% were node negative, and 57% were ER positive.6 In the HERA trial, 40% of the participants had T1 tumors, 32% were node negative, and 51% were ER positive.7,8 The HERA trial accrued patients between December 2001 and June 2005, while the ALTTO trial accrued between June 2007 and July 2011. Between 2001 and 2011, practice standards changed. Only 18% of HERA patients received an aromatase inhibitor as endocrine therapy, compared with 50% in the ALTTO trial. The proportion of patients treated with combined anthracycline plus taxane chemotherapy was 25% in the HERA trial, and it was higher in the ALTTO trial. These relatively small numerical differences in the distribution of risk variables between the 2 trials and the changing practice standards over a 10-year period resulted in a substantial difference in event rates. The lower-than-expected event rate in ALTTO has led to reduced power to demonstrate statistically significant improvement in outcome.
The disease-free survival hazard ratio (HR) was 0.84 (97.5% CI, 0.70-1.02; P = .048), which did not reach the prespecified significance threshold of P ≤ .025.
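The dependence of power on the number of events can be illustrated with Schoenfeld's approximation for a 2-arm, 1:1 log-rank comparison. The sketch below is not the ALTTO statistical design: the target HR of 0.80 is an assumed value chosen for illustration, the 2-arm setup is a simplification, and the .025 threshold is treated as a 2-sided significance level. Only the event counts (850 planned, 555 observed) come from the text above.

```python
import math
from statistics import NormalDist


def schoenfeld_power(events: int, hazard_ratio: float, alpha: float) -> float:
    """Approximate power of a 2-sided log-rank test with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Under equal allocation the log-rank Z statistic is roughly normal
    # with mean sqrt(events) * |log HR| / 2 and unit variance.
    drift = math.sqrt(events) * abs(math.log(hazard_ratio)) / 2
    return NormalDist().cdf(drift - z_alpha)


ASSUMED_HR = 0.80  # hypothetical target effect, not taken from the article
ALPHA = 0.025      # assumed to be a 2-sided significance level

power_planned = schoenfeld_power(850, ASSUMED_HR, ALPHA)
power_observed = schoenfeld_power(555, ASSUMED_HR, ALPHA)
print(f"850 events: power = {power_planned:.2f}")
print(f"555 events: power = {power_observed:.2f}")
```

Under these assumptions, power falls from about 0.84 with 850 events to about 0.65 with 555 events, which shows how a shortfall in events alone can leave a fully accrued trial underpowered.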
In this study, we suggest that using a minimal residual risk threshold as the eligibility criterion, calculated by a validated multivariate model, could lead to more predictable event rates and therefore more reliable power estimates in adjuvant trials. This would also enable clinical trialists to more predictably identify high-risk patients who are not likely to be cured by current therapies and therefore require further improvements in therapy.
This study was reviewed and approved by the Yale Human Investigations committee and was determined to be low risk and exempt from informed consent. Medical records of 500 consecutive patients with stage I to III breast cancer treated at the Breast Center at Smilow Cancer Hospital were reviewed between April 1 and September 30, 2015. Fifty-seven patients (11%) were excluded because they received neoadjuvant therapy and therefore baseline pathologic tumor size and nodal status were not available for accurate residual risk calculations. Patient characteristics are given in the eTable in the Supplement. We calculated baseline prognostic risk (ie, risk of recurrence if treated with surgery alone) and residual risk (ie, risk of recurrence that remains after completion of all planned adjuvant therapies) using Adjuvant! Online, version 8.0 (https://www.adjuvantonline.com). Adjuvant chemotherapies were grouped into first- (cyclophosphamide-doxorubicin ×4; cyclophosphamide, methotrexate, and fluorouracil ×6), second- (cyclophosphamide, doxorubicin, fluorouracil ×6; fluorouracil, doxorubicin hydrochloride, and cyclophosphamide ×6; doxorubicin hydrochloride and cyclophosphamide once every 3 weeks followed by paclitaxel; docetaxel and cyclophosphamide ×4), and third-generation (docetaxel, doxorubicin hydrochloride, and cyclophosphamide ×6; dose-dense doxorubicin hydrochloride and cyclophosphamide ×4 followed by dose-dense paclitaxel ×4 or weekly paclitaxel ×12) regimens. To calculate residual risk of recurrence by means of Adjuvant! Online, we subtracted from the number of cases predicted to experience relapse (per 100 patients) if treated with locoregional therapy alone, the number of cases that avoided recurrence because of systemic adjuvant therapy. For example, if Adjuvant! Online estimated that 20 of 100 individuals would have a recurrence with surgery alone and 13 patients would avoid recurrence because of adjuvant therapy, the predicted residual risk for this individual is 20% − 13% = 7%; we call this number the residual risk of recurrence score (rRS). Residual risk of death was calculated by adding adjuvant treatment–related percent improvement in overall survival to the survival rate without adjuvant therapy and subtracting this sum from 100. For illustration purposes, we defined 3 rRS categories including low (<10%), intermediate (10%-20%), and high residual risk (>20%).
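The arithmetic described above can be written out as a minimal sketch. The function names are illustrative and are not part of Adjuvant! Online; the recurrence inputs reproduce the worked example in the text, and the survival inputs in the second call are hypothetical.

```python
def residual_risk_of_recurrence(relapses_without_systemic: float,
                                relapses_avoided: float) -> float:
    """rRS: relapses per 100 with locoregional therapy alone, minus
    relapses per 100 avoided because of systemic adjuvant therapy."""
    return relapses_without_systemic - relapses_avoided


def residual_risk_of_death(survival_without_adjuvant: float,
                           survival_gain: float) -> float:
    """100 minus (survival % without adjuvant therapy + % improvement)."""
    return 100 - (survival_without_adjuvant + survival_gain)


def rrs_category(rrs: float) -> str:
    """Illustrative categories: low (<10%), intermediate (10%-20%), high (>20%)."""
    if rrs < 10:
        return "low"
    if rrs <= 20:
        return "intermediate"
    return "high"


# Worked example from the text: 20 relapses expected, 13 avoided.
rrs = residual_risk_of_recurrence(20, 13)
print(rrs, rrs_category(rrs))  # 7 low

# Hypothetical survival inputs: 70% survival without adjuvant therapy, 12% gain.
print(residual_risk_of_death(70, 12))  # 18
```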
For trial simulations, we took eligibility criteria based on tumor size (T) and clinical nodal status (N) from a currently recruiting trial for patients with TNBC (NCT03036488) that defines eligibility as having either T1c/N1-N2 or T2/N0-N2 or T3/N0-N2 or T4/N0-N2 disease. These broad T and N categories were further grouped into smaller prognostic groups to form more uniform risk cohorts. We simulated 2-arm, 1:1 randomized clinical trials with target HR of recurrence-free survival of 0.70 under 4 different accrual scenarios as described in Table 1. In scenario 1, the proportion of patients in each T/N cohort and their corresponding 5-year survival rates were obtained from the MD Anderson Cancer Center Department of Breast Medical Oncology database; this patient population is described in detail by Wu et al.9 Scenarios 2 and 3 were created by increasing the proportion of low-risk patients (T2/N0) from the observed 40% to 55% and 70%, respectively, and correspondingly lowering the proportion of high-risk (T2/N1, T3/N1, T4) patients. Scenario 4 represented increasing both the proportion and the 5-year survival of the T2/N0 cohort. Clinical trial power of each scenario was determined based on 5000 simulated trials. The eAppendix in the Supplement describes details of the simulation steps. We also simulated trials that use a minimum rRS value as eligibility criterion and explored the impact of setting minimum residual risk to 20%, 30%, 40%, 50%, and 60% as eligibility requirement on power estimates. When residual risk is used for eligibility, we assume that all patients who enter the study have equal to or greater than the prespecified residual risk of recurrence.
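A simulation of this general kind can be sketched as follows. This is not the procedure in the eAppendix, which is not reproduced here. It assumes exponential event times, a fixed 5-year follow-up with no other censoring, alternating 1:1 allocation, and a 2-sided log-rank test at α = .05. The 2-cohort mixes are hypothetical: the 64% 5-year event-free rate for the low-risk cohort mirrors the 36% T2/N0 recurrence risk quoted for scenario 1, while the 45% rate for the high-risk cohort is invented for illustration. Only 200 trials are simulated per scenario to keep the run short.

```python
import math
import random
from statistics import NormalDist


def logrank_z(times, events, arms):
    """Log-rank Z statistic (arm 0 = control, 1 = experimental), no tied events."""
    at_risk = [arms.count(0), arms.count(1)]
    observed_minus_expected = 0.0
    variance = 0.0
    for i in sorted(range(len(times)), key=times.__getitem__):
        if events[i]:
            p1 = at_risk[1] / (at_risk[0] + at_risk[1])
            observed_minus_expected += arms[i] - p1
            variance += p1 * (1 - p1)
        at_risk[arms[i]] -= 1  # subject leaves the risk set after its time
    return observed_minus_expected / math.sqrt(variance) if variance > 0 else 0.0


def simulate_power(cohorts, n_patients, hazard_ratio,
                   follow_up=5.0, n_sims=200, alpha=0.05, seed=0):
    """Fraction of simulated trials in which the log-rank test rejects.

    cohorts: list of (share of accrual, 5-year event-free survival in control).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shares = [share for share, _ in cohorts]
    # Exponential hazard implied by each cohort's 5-year event-free survival.
    hazards = [-math.log(s5) / 5.0 for _, s5 in cohorts]
    arms = [i % 2 for i in range(n_patients)]
    rejections = 0
    for _ in range(n_sims):
        baseline = rng.choices(hazards, weights=shares, k=n_patients)
        times, events = [], []
        for arm, hazard in zip(arms, baseline):
            t = rng.expovariate(hazard * hazard_ratio if arm else hazard)
            times.append(min(t, follow_up))
            events.append(t <= follow_up)
        if abs(logrank_z(times, events, arms)) > z_crit:
            rejections += 1
    return rejections / n_sims


# Hypothetical accrual mixes: (share, 5-year event-free survival).
observed_mix = [(0.40, 0.64), (0.60, 0.45)]
low_risk_heavy_mix = [(0.70, 0.64), (0.30, 0.45)]

for label, mix in [("40% low risk", observed_mix),
                   ("70% low risk", low_risk_heavy_mix)]:
    print(label, simulate_power(mix, n_patients=800, hazard_ratio=0.70))
```

Because every simulated patient belongs to an eligible cohort, any difference in power between the 2 mixes arises purely from the accrual proportions, which is the mechanism that scenarios 2 and 3 are designed to probe.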
Baseline prognostic risk and residual risk distributions calculated by Adjuvant! Online are shown in Figure 1. For many patients, the baseline and the treatment-adjusted residual risks differ substantially. Among the 443 consecutive patients, 115 (26%) had intermediate baseline risk (ie, 10%-20% risk of recurrence) and 328 (74%) had high baseline risk (ie, >20% risk of recurrence). After adjustment for adjuvant therapy effect, 124 (28%) had low (ie, <10% risk of recurrence), 199 (45%) intermediate, and only 120 (27%) had high residual risk. Figure 2 shows how baseline tumor size and nodal status categories are distributed across clinical TNM stages and how clinical TNM stage is distributed into residual risk categories. While the residual risk categories that we use are arbitrary, the examples show how quantitative risk predictions for an individual patient change after treatment effect is factored in.
Next, we examined residual risk distribution in 3 patient cohorts that correspond to patients who would be eligible for 3 ongoing adjuvant randomized clinical trials, NRG BR003 (NCT02488967), NSABP-B55/BIG 6-13 (NCT02032823), and NSABP-B47 (NCT01275677). Table 2 presents the median residual risk and interquartile range for each cohort. The interquartile ranges spanned residual risks from 15% to 31%, indicating that the power of these trials could vary unpredictably based on the residual risk distribution of the accrued population.
Figure 3 shows how the clinical trial power changes as a function of sample size and as the proportions of patients with different T and N categories change corresponding to the 4 simulation scenarios. As the proportion of T2/N0 patients increases and the proportion of higher-risk cohorts decreases in scenarios 2 and 3, the power of a trial decreases. Power is further reduced if the 5-year survival of the T2/N0 cohort increases as in scenario 4 (in real life this might happen if the standard of care improved over time). With a sample size of 800, the respective power associated with scenarios 1, 2, 3, and 4 is 0.87, 0.84, 0.80, and 0.76. We emphasize that these changes in power occur despite all patients meeting eligibility criteria based on tumor size and nodal status. Figure 3 also illustrates that a minimum power can be more reliably assured if the eligibility is based on a minimum residual risk value. Using the same 4 trial scenarios, defining eligibility as a minimum 10-year risk of recurrence of less than 40% had little effect on the power because the mean 5-year risk of recurrence of the lowest-risk group (T2/N0) was 36% under scenario 1 (Table 1). However, the power of the trial increases if we define eligibility as greater than 40% residual risk. By defining eligibility as a minimum 10-year risk of recurrence of 40%, an 800-patient randomized clinical trial retains a power of 82% even when 70% of patients have T2/N0 disease and the 5-year survival of the T2/N0 group is even higher than the observed survival in the MD Anderson Cancer Center data as in scenario 4. In scenario 1, a trial would have an 82% power to detect an HR of 0.70 with a sample size of 600 if eligibility is defined as a minimum 10-year residual risk of 50%, whereas a trial with the same sample size using combinations of nodal status and tumor size for eligibility as in NCT03036488 would have a power of 0.77.
In scenario 4, a 600-patient randomized clinical trial that defines eligibility as a minimum 10-year residual risk of 50% has a power of 0.80, whereas a trial with the same sample size using anatomical risk–based eligibility would have a power of 0.63. The eFigure in the Supplement shows examples of distributions of residual risk estimates under different trial scenarios.
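The effect of a minimum rRS requirement on the expected number of control-arm events can be sketched in a few lines. The screened rRS values, the 40% threshold, and the control-arm size of 300 are hypothetical; the sketch also assumes that accrued patients mirror the eligible pool and that each patient's 10-year rRS equals the probability of an event in the control arm.

```python
def apply_min_rrs(residual_risks, min_rrs):
    """Keep patients whose 10-year rRS (%) is at or above the threshold."""
    return [r for r in residual_risks if r >= min_rrs]


def expected_control_events(residual_risks, n_control):
    """Expected 10-year control-arm events if accrual mirrors this pool."""
    mean_risk = sum(residual_risks) / len(residual_risks)
    return n_control * mean_risk / 100


# Hypothetical 10-year rRS values (%) for patients who all meet
# tumor size- and nodal status-based eligibility.
screened = [12, 18, 25, 33, 41, 47, 52, 58, 63, 70]
eligible = apply_min_rrs(screened, min_rrs=40)

print(expected_control_events(screened, n_control=300))
print(expected_control_events(eligible, n_control=300))
```

Raising the floor on rRS raises the mean risk of the accrued population and therefore the number of events expected for a fixed sample size, at the cost of a smaller eligible pool.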
Several validated multivariate prognostic models exist that combine risk variables into a single score that can be translated into a probability of recurrence adjusted for improvement that can be expected from systemic adjuvant therapies. These models can provide a numerical estimate of risk of recurrence for individual patients. The prognostic risk based on tumor size and nodal status that typically determines trial eligibility differs substantially from the residual risk of recurrence, which is baseline risk adjusted for the projected benefit from standard-of-care adjuvant therapy. Twenty-six percent of our patient population had 10% to 20% predicted risk of recurrence without systemic adjuvant therapy and 74% had greater than 20% risk. After adjustment for the effect of the adjuvant treatments that patients received, only 27% had high (>20%) residual risk. When we calculated individual residual risk estimates for patients who would have met eligibility criteria for ongoing adjuvant trials based on their tumor size and nodal status, we observed broad ranges in interquartile range values, which indicates that the eventual power of these trials could vary unpredictably based on the residual risk distribution of the accrued patients. If too many patients who meet eligibility criteria have low residual risk, the power to detect significant difference in outcome will be reduced.
Competing studies, changing practice standards, and patient referral and accrual biases can lead to underrepresentation or overrepresentation of low-risk patients in a trial despite their meeting eligibility requirements based on tumor size and nodal status.10-12 For example, it is common in most academic institutions to have several trials recruiting that have partially overlapping eligibility criteria; in such circumstances, patients and physicians may prefer enrolling higher-risk patients in a trial that is perceived to be “more aggressive.” This accrual bias can lead to an excess of patients with better prognosis in the “less aggressive” trials. Disease evaluation standards also change over time and can result in unrecognized shifts in classification. For example, due to changing prevalence of screening and awareness, patients within a given T or N category may have smaller median tumor size and a smaller median number of positive nodes today than 20 years ago. With the advent of sentinel node biopsies, fewer nodes are examined but more thoroughly. This may lead to nodal upstaging by finding small macroscopic metastases in nodes that in earlier times would have been examined less comprehensively and called negative. Multivariable models can better integrate these effects into more accurate risk estimates than our traditional anatomic risk–based eligibility criteria. Using a reproducible and standardized risk prediction method that online tools provide could also lead to more uniform treatment recommendations and more appropriate application of adjuvant trial results in the general patient population.
An unavoidable limitation of all multivariable risk models is that building and validation of these models require clinical databases with long follow-up, which implies that patients received diagnosis and treatment several decades earlier. Their survival experiences may not fully reflect the survival of contemporary patients. However, this limitation also holds true when individual variables, such as tumor size or nodal status, are considered as risk predictors. We also recognize that many large randomized adjuvant trials are conducted with registration intent and there is no precedent to define a patient population for a new drug indication based on a multivariate prognostic risk score. However, physicians already routinely use risk predictions from Adjuvant! Online or PREDICT for medical decision making, and using predicted risk as patient eligibility criteria would not be difficult to implement. Another potential limitation is illustrated by the current off-line status of Adjuvant! Online. Free online access to validated models depends on institutions and organizations that maintain these websites.
We hypothesized that by using individual, patient-specific residual risk estimates from multivariate models as eligibility criteria, one could ensure a minimum event rate more reliably in a trial population than by using a combination of tumor size and nodal status to define eligibility. We performed trial simulations in which the proportions of low– and high–residual risk patients were varied within the allowable tumor size and nodal status brackets that defined eligibility. Our results show that shifting a larger proportion of the trial eligible population to lower residual risk categories can jeopardize the power of a study. These shifts can occur in real life because anatomical risk factor–based eligibility criteria encompass broad ranges of residual risks. We propose that the success of future adjuvant trials could be improved by defining patient eligibility based on a minimal residual risk of recurrence value rather than a combination of tumor size and nodal status measures.
Accepted for Publication: November 13, 2017.
Corresponding Author: Lajos Pusztai, MD, DPhil, Yale Cancer Center, Yale School of Medicine, 333 Cedar St, PO Box 208032, New Haven, CT 06520 (email@example.com).
Published Online: January 25, 2018. doi:10.1001/jamaoncol.2017.5092
Author Contributions: Dr Pusztai had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Drs Wei and Kurita served as joint first authors, each with equal contribution to the manuscript.
Study concept and design: Wei, Hess, Sanft, Pusztai.
Acquisition, analysis, or interpretation of data: Wei, Kurita, Hess, Sanft, Szekely, Hatzis.
Drafting of the manuscript: Wei, Kurita, Sanft, Szekely, Hatzis, Pusztai.
Critical revision of the manuscript for important intellectual content: Hess, Szekely, Hatzis.
Statistical analysis: Wei, Kurita, Hess, Szekely, Hatzis.
Obtained funding: Pusztai.
Administrative, technical, or material support: Kurita, Szekely, Pusztai.
Study supervision: Hatzis, Pusztai.
Conflict of Interest Disclosures: None reported.
Funding/Support: This project was supported by Investigator Awards from the Breast Cancer Research Foundation to Drs Hatzis and Pusztai and by a Yale Cancer Center Core Grant (NCI 2P30 CA16359-34).
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.