High scores for the Medical Outcomes Study Short Form-36 (SF-36) mental and physical health components indicate better status; high scores on the Center for Epidemiologic Studies–Depression (CES-D) indicate more depressive affect. The P value is based on repeated measures analysis. Mental and physical health component scores were scaled to have a population normative SD of 10 units. The SD range for the mental health component score is 7.65 to 9.42; for the physical component score, it is 8.52 to 10.57. The SD range for the CES-D score is 6.80 to 8.73 units.
Higher scores indicate greater severity. The P value is based on regression of average postbaseline scores.
Stephanie R. Land, D. Lawrence Wickerham, Joseph P. Costantino, Marcie W. Ritter, Victor G. Vogel, Myoungkeun Lee, Eduardo R. Pajon, James L. Wade, Shaker Dakhil, James B. Lockhart, Norman Wolmark, Patricia A. Ganz. Patient-Reported Symptoms and Quality of Life During Treatment With Tamoxifen or Raloxifene for Breast Cancer Prevention: The NSABP Study of Tamoxifen and Raloxifene (STAR) P-2 Trial. JAMA. 2006;295(23):2742–2751. doi:10.1001/jama.295.23.joc60075
Author Affiliations: National Surgical Adjuvant Breast and Bowel Project (NSABP) Operations and Biostatistical Centers (Drs Land, Wickerham, Ritter, and Costantino and Mr Lee), Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh (Drs Land, Ritter, and Costantino and Mr Lee), Allegheny General Hospital (Drs Wickerham and Wolmark), the University of Pittsburgh Cancer Institute (Drs Vogel and Land), and Magee Women's Hospital (Dr Vogel), Pittsburgh, Pa; Colorado Cancer Research Program, Denver (Dr Pajon); Community Clinical Oncology Program Central Illinois, Decatur Memorial Hospital, Decatur (Dr Wade); Community Clinical Oncology Program Wichita, Wichita, Kan (Dr Dakhil); Oklahoma Community Clinical Oncology Program, Warren Cancer Research Foundation, Tulsa (Dr Lockhart); University of California Los Angeles Schools of Public Health and Medicine, Jonsson Comprehensive Cancer Center (Dr Ganz).
Context Tamoxifen has been approved for breast cancer risk reduction in high-risk women, but how raloxifene compares with tamoxifen is unknown.
Objective To compare the differences in patient-reported outcomes, quality of life (QOL), and symptoms in Study of Tamoxifen and Raloxifene (STAR) participants by treatment assignment.
Design, Setting, Participants, and Interventions STAR was a double-blind, randomized phase 3 prevention trial designed to evaluate the relative efficacy of raloxifene vs tamoxifen in reducing the incidence of invasive breast cancer in high-risk postmenopausal women. Between July 1, 1999, and November 4, 2004, 19 747 participants were enrolled at centers throughout North America, with a median potential follow-up time of 4.6 years (range, 1.2-6.5 years). Patient-reported symptoms were collected from all participants using a 36-item symptom checklist. Quality of life was measured with the Medical Outcomes Study Short-Form Health Survey (SF-36), the Center for Epidemiologic Studies-Depression (CES-D), and the Medical Outcomes Study Sexual Activity Questionnaire in a substudy of 1983 participants, median potential follow-up 5.4 years (range, 4.6-6.0 years). Questionnaires were administered before treatment, every 6 months for 60 months and at 72 months.
Main Outcome Measures Primary QOL end points were the SF-36 physical (PCS) and mental (MCS) component summaries.
Results Among women in the QOL analysis, mean PCS, MCS, and CES-D scores worsened modestly over the study's 60 months, with no significant difference between the tamoxifen (n = 973) and raloxifene (n = 1010) groups (P>.2). Sexual function was slightly better for participants assigned to tamoxifen (age-adjusted repeated measure odds ratio, 1.22; 95% CI, 1.01-1.46). Of the women in the symptom assessment analyses, the 9769 in the raloxifene group reported greater mean symptom severity over 60 months of assessments than the 9743 in the tamoxifen group for musculoskeletal problems (1.15 vs 1.10, P = .002), dyspareunia (0.78 vs 0.68, P<.001), and weight gain (0.82 vs 0.76, P<.001). Women in the tamoxifen group reported greater mean symptom severity for gynecological problems (0.29 vs 0.19, P<.001), vasomotor symptoms (0.96 vs 0.85, P<.001), leg cramps (1.10 vs 0.91, P<.001), and bladder control symptoms (0.88 vs 0.73, P<.001).
Conclusions No significant differences existed between the tamoxifen and raloxifene groups in patient-reported outcomes for physical health, mental health, and depression, although the tamoxifen group reported better sexual function. Although mean symptom severity was low among these postmenopausal women, those in the tamoxifen group reported more gynecological problems, vasomotor symptoms, leg cramps, and bladder control problems, whereas women in the raloxifene group reported more musculoskeletal problems, dyspareunia, and weight gain.
Trial Registration clinicaltrials.gov Identifier: NCT00003906
Published online June 5, 2006 (doi:10.1001/jama.295.23.joc60075).
The National Surgical Adjuvant Breast and Bowel Project (NSABP) Study of Tamoxifen and Raloxifene (STAR) was a multicenter, double-blind, randomized phase 3 prevention trial designed to evaluate the relative efficacy of raloxifene (60 mg/d for 5 years) compared with tamoxifen (20 mg/d for 5 years) in reducing the incidence of invasive breast cancer in high-risk postmenopausal women. In addition, it was hypothesized that raloxifene would have a better safety profile with respect to uterine cancer and a number of patient-reported symptoms and would provide a potential alternative to tamoxifen in the prevention of breast cancer in postmenopausal women. Therefore, measurement of patient-reported outcomes was an important secondary objective of the STAR trial.
Tamoxifen citrate and raloxifene hydrochloride are selective estrogen receptor modulators that have been approved by the US Food and Drug Administration for prevention of breast cancer and osteoporosis, respectively.1,2 The acceptability of drugs that are used for prevention often rests on their efficacy as well as their adverse-effect profiles. Selective estrogen receptor modulators bind competitively with the estrogen receptor in various tissues and can have properties of both an estrogen antagonist and agonist. This has resulted in patient reports of vasomotor symptoms with both tamoxifen and raloxifene.3,4 With tamoxifen, there have also been patient-reported symptoms of estrogen excess, eg, vaginal discharge and vaginal bleeding.3
The patient-reported outcome study for the STAR trial builds directly on the quality of life (QOL) outcome assessment in the Breast Cancer Prevention Trial (BCPT), which compared tamoxifen with placebo in the prevention of breast cancer.1,3,5 In that trial, we assessed patient-reported outcomes using several established self-report measures for which normative data were available in the general population of healthy women, and we established a comprehensive symptom checklist.5 We found no significant differences in QOL, depression, or sexual functioning between participants treated with tamoxifen and those given placebo but noted increased rates of hot flashes, night sweats, and vaginal discharge among women treated with tamoxifen.3,6-9
In 1999, when the STAR trial was being designed, relatively little was known about how raloxifene affected patient outcomes. In the Multiple Outcomes of Raloxifene Evaluation (MORE), a double-blind, randomized and placebo-controlled study examining osteoporosis prevention in older postmenopausal women, all participants were questioned about adverse events at each visit. Raloxifene-treated participants were found to have modest but significant increases in observer-reported hot flashes and leg cramps compared with placebo but no significant differences in vaginal bleeding.4
The STAR trial provided an opportunity to compare patient-reported outcomes with both of these selective estrogen receptor modulators in the setting of a double-blind, randomized trial. We expected STAR to show no difference in overall mental health, physical health, or QOL between tamoxifen and raloxifene. However, based on the extant data, we expected to see more frequent vasomotor and gynecological symptoms (eg, vaginal bleeding, discharge) with tamoxifen and more frequent leg cramps with raloxifene. This is the first report of the patient-reported outcomes in the STAR trial.
The study eligibility, recruitment procedures, and clinical outcomes are published elsewhere.10 Briefly, to be eligible for participation, a woman had to be postmenopausal, to be aged 35 years or older, and to have a 5-year predicted breast cancer risk of at least 1.66% as determined by the modified Gail model.11-14 Women were randomly assigned to receive 5 years of therapy with either 20 mg/d of tamoxifen and a placebo or 60 mg/d of raloxifene and a placebo. The protocol-defined monitoring plan called for a final analysis and release of findings when 327 invasive breast cancer cases had been diagnosed in the total population. The protocol and consent form were approved by the National Cancer Institute and the institutional review boards of all participating institutions. All participants provided written informed consent for the study. The race of participants was collected because it is one of the risk factors for breast cancer. It was reported by the participants, using the predefined classifications “Caucasian/White,” “African-American/Black,” “Hispanic/Latina,” and “Other (specify).”
The QOL assessment used standardized measures that were identical to those used in the BCPT.5 They will be described briefly. Health-related QOL was assessed with the Medical Outcomes Study Short Form-36 (SF-36),15-18 which contains 8 individual subscales. Each subscale is scored from 0 to 100, with 100 being the most favorable score. The scales are physical functioning, role function-physical, bodily pain, social functioning, emotional well-being, role function-emotional, vitality, and general health perceptions.19 General population norms are available for this instrument.19 The instrument can also be scored as 2 component summary scales—one for physical functioning (PCS) and a second for mental health (MCS).20 The data for these component summary scales are presented as T scores with a normal healthy population mean score set at 50 and a score of 60 or 40 representing an SD above or below the mean, respectively. This instrument has been widely used in recent health surveys and in multicenter clinical trials.21,22
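The T-score convention described above (normative mean of 50, SD of 10) can be illustrated with a minimal sketch. The function name and the normative mean and SD passed to it are hypothetical placeholders; the actual SF-36 component summaries are derived with published scoring weights that are not reproduced here.

```python
def to_t_score(value: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale a score so that the normative population has mean 50 and SD 10.

    norm_mean and norm_sd are assumed to come from general population norms;
    the values used in any call below are illustrative only.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (value - norm_mean) / norm_sd  # distance from the norm in SD units
    return 50.0 + 10.0 * z


# A score 1 SD above the normative mean maps to 60; 1 SD below maps to 40.
above = to_t_score(1.0, norm_mean=0.0, norm_sd=1.0)
below = to_t_score(-1.0, norm_mean=0.0, norm_sd=1.0)
```

On this scale, a between-group difference of 5 points corresponds to one half of a normative SD.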
Depressive symptoms were measured with the Center for Epidemiologic Studies−Depression (CES-D) scale,23 which is a 20-item self-report scale developed for the general population to measure depressive symptoms over the past week. Normative data are available from community-based samples.24,25 The instrument has excellent reliability and validity, including use with multiethnic samples.23 Responses to the CES-D are rated on a 4-point scale, and the instrument total score ranges from a minimum score of 0 to a maximum of 60. A higher score on the CES-D indicates a greater risk of depression, with scores greater than or equal to 16 indicating potentially significant levels of depression.23 The CES-D has been used in recent studies of healthy women participating in large clinical trials.5,26
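The CES-D scoring rule just described (20 items, each rated on a 4-point scale, total of 0 to 60, threshold of 16) can be sketched as follows. This is an illustration only: it assumes responses are already coded 0 to 3 and that any reverse-scoring of positively worded items has been applied beforehand. The function and constant names are hypothetical.

```python
CESD_ITEM_COUNT = 20   # number of items, per the description above
CESD_THRESHOLD = 16    # scores at or above this suggest potentially significant depression


def cesd_total(responses: list[int]) -> int:
    """Sum 20 item responses coded 0-3 (assumed already reverse-scored where needed)."""
    if len(responses) != CESD_ITEM_COUNT:
        raise ValueError("expected 20 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(responses)


def cesd_elevated(total: int) -> bool:
    """Flag totals at or above the threshold of 16."""
    return total >= CESD_THRESHOLD
```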
Sexual functioning was assessed using a modification of the Medical Outcomes Study Sexual Functioning Scale27 that had been successfully used in the prior BCPT.3,5 First, participants were asked a screening question about whether they had been sexually active in the past 6 months. Those responding affirmatively were then asked specifically about the past 4 weeks regarding how much of a problem each activity or function had been, with a scale of 0 as being not a problem; 1, a little problem; 2, a definite problem; and 3, a serious problem. The specific questions were “lack of sexual interest,” “difficulty in becoming sexually aroused,” “unable to relax and enjoy sex,” and “difficulty in having orgasm.”
The 43-item symptom checklist that had been used in the BCPT5 was shortened based on the data from the placebo comparison3 and from a preliminary psychometric evaluation of the scale items. This resulted in a 36-item symptom checklist that included a new item asking about leg cramps, as well as additional items to cover other potential adverse symptoms. The responses use a Likert-type scale and range from 0 to 4, representing the categories “not at all,” “slightly,” “moderately,” “quite a bit,” and “extremely,” respectively.
Symptom information was collected from all STAR participants using this modified symptom checklist. Based on statistical power considerations, the decision was made to collect QOL information from a smaller subset of participants. For practical reasons, this subset was restricted to English-speaking participants enrolled at institutions in the Community Clinical Oncology Program (CCOP), a National Cancer Institute–sponsored network for conducting cancer prevention and treatment clinical trials by community medical practitioners. Eligible CCOP institutions elected to participate in the QOL substudy and indicated the completion of their institutional review board approval by submitting a substudy initiation form to the NSABP. From each institution's initiation date to the date that accrual to the substudy was closed, all participants enrolled at participating institutions were considered enrolled in the substudy.
Questionnaires were administered in the office at clinic visits, although telephone or mail administration was also allowed when necessary. Clinical staff were instructed to allow the participant to complete the questionnaire on her own without help interpreting the items. Institutions were also instructed to submit a QOL Missing Data form in lieu of a QOL form for any assessment that was not obtained. The QOL Missing Data form is completed by clinical staff and requests information about the reason(s) for the missed assessment. Both the symptom checklist and QOL questionnaires were administered at baseline (before treatment), every 6 months until 60 months, and at 72 months. However, at the time of data file closure for the present analysis (December 31, 2005), only a small number of study participants had reached the 72-month assessment. Therefore, the present analysis is restricted to assessments performed through 60 months on study.
For the QOL study, the protocol-specified sample size of 1670 evaluable participants provides a power of greater than 99% for the repeated-measures analysis of variance of 2 primary end points at a 2-sided significance level of .025, assuming a mean treatment difference equal to one half of an SD. (The study was overpowered for the primary analyses to allow adequate power for secondary analyses.) It was estimated that an accrual goal of 2000 participants would yield adequate data, allowing for study attrition or missing data.
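A rough check of the stated power is possible with a simplified calculation. The sketch below uses a two-sample normal approximation with equal allocation (835 per group, an assumption) rather than the repeated-measures analysis of variance specified in the protocol, so it only indicates the order of magnitude. The function name is hypothetical.

```python
from statistics import NormalDist


def two_sample_power(n_per_group: int, effect_size: float, alpha: float) -> float:
    """Approximate power of a 2-sided two-sample z test.

    effect_size is the mean difference in SD units. Repeated measures and
    baseline covariates are ignored in this simplification.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5  # expected z statistic
    # Probability of rejecting in either tail
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# 1670 evaluable participants split evenly, half-SD difference, alpha = .025
power = two_sample_power(n_per_group=835, effect_size=0.5, alpha=0.025)
```

Under these assumptions the approximate power exceeds 99%, in line with the protocol statement.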
All analyses were performed using 2-sided tests with an intent-to-treat approach including all women with follow-up assessments available. Major analyses were also repeated once restricted to those assessments performed before treatment discontinuation and a second time with the same restriction and including only participants who discontinued treatment for nonprotocol reasons.
Using χ2 tests, participant characteristics were compared between the QOL substudy participants and contemporaneously accrued participants not in the QOL substudy. The time from randomization to treatment noncompliance (discontinuation of treatment in the absence of cancer diagnosis, stroke, or other event that was listed in the protocol as mandating discontinuation) was compared between treatment groups with Kaplan-Meier and log-rank methods. Primary QOL end points were the SF-36 MCS and PCS. For these 2 primary end points, a P value of less than .025 was the significance threshold. All secondary analyses were performed at significance level of .05. The NSABP policy does not require adjusting for multiple comparisons in secondary analyses, particularly for end points related to safety. Analyses were performed using SAS version 8.2 (SAS Institute Inc, Cary, NC).
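The Kaplan-Meier method mentioned above can be sketched in a few lines. The analyses in this report were run in SAS; the Python function below is only an illustration of the product-limit estimator, and the toy data are hypothetical. Here an event would correspond to treatment noncompliance, and a censored observation to protocol-mandated discontinuation or the end of follow-up.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Product-limit estimate of the probability of remaining event-free.

    times:  follow-up time for each participant
    events: 1 if the event occurred at that time, 0 if censored
    Returns (time, estimate) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    estimate = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all participants who share this follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            estimate *= 1 - n_events / n_at_risk
            curve.append((t, estimate))
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy data: four participants, one censored at time 2
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```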
The PCS and MCS of the SF-36 and the CES-D scores were compared between treatment groups in repeated measures (mixed effects) analysis. Baseline scores were included as covariates. The proportion sexually active was compared by treatment group and age using logistic mixed-effects modeling. The severity scores for sexual function items were averaged for all time points after baseline, then compared by treatment group using analysis of covariance, controlling for baseline scores and age. Based on results from the NSABP Project BCPT Study and on literature about menopause, we had a priori hypotheses that vasomotor symptoms would be worse in women younger than 60 years and that sexual activity would be less common among women older than 60 years. Therefore, we systematically included age relative to 60 years in our analyses of sexual function and symptoms.
For the analysis of the symptom assessment, the following subscales were selected based on previous psychometric validation of the BCPT symptom checklist in several settings28,29 and in research by David Cella, PhD, and colleagues (unpublished data, 2006): (1) musculoskeletal: joint pain, muscle stiffness, general aches and pains; (2) vasomotor: night sweats, hot flashes, and cold sweats; (3) gastrointestinal: vomiting, nausea; (4) dyspareunia: vaginal dryness, pain with intercourse; (5) bladder: difficulty with bladder control (when laughing or crying) and difficulty with bladder control (at other times); (6) gynecological: vaginal discharge, genital itching or irritation, and vaginal bleeding or spotting. The subscales form more robust measures of each symptom cluster than would be available from the individual items.
For each symptom assessment subscale and 3 single items (forgetfulness, weight gain, and leg cramps), severity scores for each patient were averaged over all time points after baseline and through 60 months. Regression analyses were performed to compare the average severity between treatment groups and age groups (<60 vs ≥60 years), controlling for the baseline severity value. Effect sizes for scales were computed as the mean treatment difference divided by the SD at baseline.30
For the 4 symptom domains that showed the largest treatment differences in the primary analyses, further exploration was performed with multivariable logistic regression to determine the effect of participant characteristics (age at baseline, hysterectomy status, and race), baseline symptom severity and treatment on the proportion of participants who experienced an increase in symptom checklist scale severity of at least 1 point from baseline to 6 months.
The trial opened on July 1, 1999. Accrual was completed November 4, 2004, at which time 19 747 women had been enrolled. Almost 200 clinical centers throughout North America participated in this study. The full study population of 19 747 participants was eligible for the symptom checklist assessment. The QOL study enrolled 1983 participants between January 4, 2000, and May 31, 2001, with 973 assigned to receive tamoxifen and 1010 assigned to receive raloxifene. As of December 31, 2005, the median potential follow-up time was 4.6 years in the full cohort and 5.4 years among the QOL participants. Among QOL participants, there were no significant differences in participant characteristics by treatment group (Table 1). Characteristics of those participating in QOL substudy were comparable with women accrued concurrently at nonparticipating institutions with the exception of 2 small differences: a 3% excess of women with atypical hyperplasia among the nonparticipants and a 3% excess of women with hysterectomy among the participants. The characteristics of all participants who provided a baseline symptom assessment are also provided in Table 1. Characteristics in the full population by treatment group are provided elsewhere.10
The mean duration of treatment up to the time of data file closure in the randomized STAR population was 3.03 years (range, 0-5 years) and 3.14 years (range, 0-5 years) for the tamoxifen and raloxifene groups. The Kaplan-Meier estimate of the 5-year treatment adherence rate (in the absence of cancer diagnosis, stroke, or other serious event requiring discontinuation, or censoring due to study closure) was 70.8% for tamoxifen and 73.9% for raloxifene, indicating a significant treatment difference (P<.001). Of participants in the tamoxifen group, 6576 (67.49%) vs 6910 (70.73%) in the raloxifene group continued their protocol-assigned therapy up to the time of analysis. Of the submitted symptom checklist (QOL) forms, 12.9% (15.0%) and 10.6% (11.7%) were completed after treatment discontinuation for tamoxifen and raloxifene, respectively.
Quality-of-life forms completion was high, with 95% at baseline, a range of 76% to 86% at all time points from 6 to 60 months, and no systematic difference between treatment groups. Symptom-checklist form completion was also high, with 99% submission of the baseline form, submission rates ranging from 83% to 95% for the other time points, and no difference between treatment groups greater than 1% at any point. Forms were not expected after death or consent withdrawal, which occurred at some point during follow-up for 197 women (1.0%) in the tamoxifen and 1352 (6.8%) in the raloxifene groups. Quality of life Missing Data forms were submitted for 41% of the assessments that were missed. The reasons for the missed assessments given were inadvertent staff error (61% of submitted QOL Missing Data forms), participant refused or objected to burden of completing form (13%), participant failed to appear for scheduled follow-up visit (8%), participant had withdrawn informed consent (2%), and participant failed to respond to telephone or mail request (25%). (The percentages are not additive because multiple reasons were permitted.) There was no difference in the distributions of reasons by treatment group.
Mean MCS and PCS scores (Figure 1) declined modestly over the 60 months of assessments with no significant difference between tamoxifen and raloxifene (P = .23 for MCS and P = .21 for PCS). There were significant differences in favor of raloxifene in 2 of the SF-36 subscales, but of small magnitude: role physical (P = .03; mean difference, 2.4; effect size, 0.1) and social function (P = .02; mean difference, 1.0; effect size, 0.1). Mean CES-D scores worsened slightly after study initiation in both treatment groups (Figure 1) but with no significant difference between tamoxifen and raloxifene (P = .61).
Figure 2 displays the percentage reporting being sexually active over time by treatment and age groups. Age was significant (P<.001) with an odds ratio of 0.55 (95% confidence interval, 0.46-0.66), indicating a decrease among women older than 60 years. Treatment was also significant (P = .04), with an odds ratio of 1.22 (95% confidence interval, 1.01-1.46), indicating a slightly higher percentage sexually active among women in the tamoxifen group. The maximum difference was 7% at the 30-month assessment among women younger than 60 years. Among those who reported being sexually active, participants in the raloxifene group experienced significantly greater difficulty with sexual interest (P = .009, mean difference of 0.096 on the scale of 0-3); greater difficulty with sexual arousal (P = .028, mean difference 0.081); greater difficulty with sexual enjoyment (P = .032, mean difference 0.078); but no significant difference in the ability to experience an orgasm (P = .21). These results were unchanged after controlling for age (data not shown).
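A reported odds ratio and its confidence interval can be checked for internal consistency with the accompanying P value. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is an assumption about how the intervals were constructed; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def p_from_odds_ratio_ci(odds_ratio: float, lower: float, upper: float,
                         level: float = 0.95) -> tuple[float, float]:
    """Recover an approximate z statistic and 2-sided P value from an OR and its CI."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)
    # Standard error of log(OR), from the CI width on the log scale
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p


# Treatment effect on being sexually active: OR 1.22 (95% CI, 1.01-1.46)
z, p = p_from_odds_ratio_ci(1.22, 1.01, 1.46)
```

For the treatment odds ratio above, this gives a P value between .03 and .05, consistent with the reported significance at the .05 level; small discrepancies are expected because the published figures are rounded.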
Statistically significant differences were noted between the tamoxifen and raloxifene groups for average severity of symptoms after baseline (Table 2). The raloxifene group experienced significantly greater musculoskeletal problems (P = .002), dyspareunia (P<.001), and weight gain (P<.001). Tamoxifen participants experienced significantly greater vasomotor symptoms (P<.001), bladder problems (P<.001), gynecological problems (P<.001), and leg cramps (P<.001). These treatment differences were unrelated to age for all symptoms except vasomotor symptoms. Overall, vasomotor problems diminished with age and younger (<60 years) raloxifene participants had less severe vasomotor symptoms. However, among those older than 60 years, the treatment difference was muted.
The magnitude of the mean differences in symptom severity between treatment groups was small, typically less than 0.2 on a scale of 0 to 4 (Table 2), with the largest differences seen in vasomotor symptoms and leg cramps (Figure 3). The effect sizes for the significant effects ranged from below 0.1 to 0.3 (Table 2). However, when examined in terms of the percentages of participants at least moderately bothered by their symptoms at 6 months, there was a larger difference in vasomotor symptoms among women younger than 60 years: 32% in the tamoxifen group vs 23% in the raloxifene group (Table 3). Leg cramps also showed differences, with 32% at least moderately bothered for participants in the tamoxifen group vs 24% in the raloxifene group. The raloxifene group reported being bothered at least moderately by bladder problems 5% less often than those in the tamoxifen group.
The proportion of women who experienced a unit increase in severity (at least 1 point on the scale of 0-4) in vasomotor symptoms was significantly greater among those in the tamoxifen group (P<.001), and the effect of tamoxifen was significantly greater among those younger than 60 years (P = .002 for interaction) and without a hysterectomy (P = .006 for interaction; Table 4). The effect of tamoxifen on leg cramps was slightly stronger among younger women (P = .049), white women (P = .01), and those without a hysterectomy (P = .03). In these analyses, which adjust for baseline severity and participant characteristics, there was no significant treatment effect on bladder problems. A total of 1646 (17.95%) in the tamoxifen group vs 1086 (11.83%) in the raloxifene group experienced a unit increase in bladder problems. For dyspareunia, only treatment (P = .03) and age (P = .004) were significant. A total of 1153 (12.66%) participants in the tamoxifen group vs 1387 (15.20%) in the raloxifene group experienced a unit increase in dyspareunia.
The major findings (for MCS, PCS, CES-D, percentage sexually active, and symptom scales) did not change when the analyses were restricted to assessments before treatment discontinuation (data not shown). When the analyses were further restricted to only those women who discontinued therapy for nonprotocol reasons, the findings for the MCS, PCS, and CES-D were unchanged, but the treatment differences in rates of sexual activity were larger (odds ratio, 4.67; 95% confidence interval, 1.27-17.08; P = .02). For symptom scales in this subset, the P values and most of the effect sizes were unchanged, except that the benefit for raloxifene was increased for gynecological symptoms (effect size, 0.4) and leg cramps (effect size, 0.3).
There were no significant differences between tamoxifen and raloxifene in patient-reported outcomes for physical and mental health or depressive symptoms, and scores on all of these measures were well within the normal ranges for healthy women of this age. The significant differences in the role physical and social function scales were small relative to established standards for minimal clinically important differences in these scales.31
There were, however, significant differences in sexual function in favor of tamoxifen. A greater percentage of the tamoxifen group was sexually active at nearly every assessment time point. That effect was quite large when comparing participants who had discontinued therapy early. There were also significant but small differences (in favor of tamoxifen) among those who were sexually active in terms of sexual interest, arousal, and ability to enjoy sex. These differences are likely related to the associated reports of increased vaginal discharge and decreased vaginal dryness among women treated with tamoxifen in this trial. Women in both groups of the trial were provided with the opportunity to use vaginal lubricants and low-dose vaginal estrogen preparations. Future evaluation of these differences in sexual functioning will explore whether women assigned to raloxifene used these preparations at a different frequency.
Although symptom severity was generally low in this postmenopausal sample, we demonstrated significantly less severe gynecological problems, vasomotor symptoms, bladder control problems, and leg cramps among raloxifene-treated women. However, the effect sizes for the differences in mean severity ranged from 0.2 to 0.3. Effect sizes were as large as 0.4 when comparing only the subset of participants who discontinued therapy early. A systematic literature review covering a variety of patient-reported outcomes demonstrated that the minimal treatment difference that is clinically significant is typically found to be an effect size of 0.5, although this ranged as low as 0.2.32 That is consistent with conventional standards, in which 0.5 is a moderate effect size and 0.2 is a small one.33 Therefore, the mean symptom differences found in this study in favor of raloxifene can be considered at or just below minimal clinical significance. Nevertheless, the differences in the percentages of women who were bothered by these 4 symptoms demonstrated that the treatment difference was clinically apparent for a nonnegligible proportion of the participants. In addition, the proportions experiencing at least a 1-unit increase in severity were substantial. There were also significant differences in favor of tamoxifen in the average severity of musculoskeletal symptoms, dyspareunia, and weight gain, but the effect sizes did not exceed 0.1, and therefore are not likely to be clinically significant. The treatment differences in the percentages of women who were bothered by these 3 symptoms were small. The slight benefit of raloxifene in terms of symptoms is consistent with the increased treatment adherence that was observed in that group.
There was no subgroup of patients based on age, race, or hysterectomy status for whom the treatment difference in vasomotor symptoms and leg cramps was not substantial. Treatment effects were found to be greatest among the youngest participants without a hysterectomy. The reason for this finding is uncertain, although it is possible that women without a hysterectomy (and with intact ovaries) would have more residual hormonal secretion of androgens and estrogens, and thus be more susceptible to the antiestrogen effects of tamoxifen. In contrast, those women with a prior hysterectomy may have already experienced diminished ovarian androgen and estrogen secretion for some time. For leg cramps, the effect of tamoxifen was strongest among white women.
The observation in STAR that vasomotor symptoms increased initially across age groups in each treatment group is largely consistent with results from other trials. The association between vasomotor symptoms and tamoxifen has been well-established and was even stronger in the BCPT due to the younger age of participants.3,34 Vasomotor symptoms were also associated with raloxifene in the MORE study.4 In addition, a pooled analysis of 8 randomized trials performed by Eli Lilly and Co found a consistent increase in hot flashes with raloxifene relative to placebo, hormone replacement, or unopposed estrogen.35 In contrast, however, a recent study found no increase in hot flashes with 12 weeks of raloxifene vs placebo after discontinuation of combined estrogen-progestin therapy.36 Another study compared hot flashes between raloxifene and placebo within early (<6 years) and later postmenopausal groups separately, with evaluations at 2 and 8 months of treatment, and found that raloxifene did increase hot flashes in the early postmenopausal group but not in the later postmenopausal group.37 Vasomotor symptoms in STAR diminished during the course of treatment in all age and treatment groups. This could be partially explained by aging. It is probably not related to discontinuation of therapy in those participants experiencing the most severe symptoms because the same trends were seen in the during-treatment assessments of participants who discontinued therapy early. Future analyses will examine the changes in symptoms after treatment discontinuation.
The present evidence regarding bladder control, together with evidence from the MORE and BCPT, indicates that raloxifene has no effect on urinary incontinence while tamoxifen increases urinary incontinence.3,38 In the QOL substudy of the Royal Marsden Hospital Tamoxifen Chemoprevention Trial and the International Breast Cancer Intervention (IBIS-I) studies, there was no difference over time between tamoxifen and placebo with respect to sexual function.34 Together with the BCPT and STAR results, these findings suggest that tamoxifen has no major effect on sexual function,3,9 while raloxifene causes a slight decrement compared with tamoxifen.
The increase in leg cramps among participants treated with tamoxifen vs those treated with raloxifene was an unexpected finding. Leg cramps were associated with raloxifene in the MORE study and in the Lilly pooled analysis.4,35 However, we found no prior reports linking leg cramps with tamoxifen. For example, the 1995 Physicians' Desk Reference (PDR) did not list leg cramps as an anticipated adverse effect of tamoxifen.39 The BCPT symptom checklist did not include leg cramps, although the consent for that protocol did anticipate leg cramps as a rare adverse effect.
Together with the BCPT, Royal Marsden, and IBIS-I trials, STAR confirmed that tamoxifen does not impair mental health. In the Royal Marsden and IBIS-I QOL substudy, longitudinal measures of psychological morbidity (General Health Questionnaire [GHQ-30]) and anxiety (State-Trait Anxiety Inventory) slightly favored tamoxifen over placebo, with only marginal significance for the GHQ-30 and nonsignificance for the anxiety scores. The 48-month symptom checklist in the Royal Marsden and IBIS-I QOL substudy revealed a numerical difference in favor of tamoxifen in depression, mood swings, anxiety, and irritability.34 It is noteworthy that longitudinal data for several mental health–related measures in all treatment groups in the BCPT, Royal Marsden–IBIS-I, and STAR studies showed a decline initially after the start of therapy, followed by a partial return to baseline levels. The change was small (eg, about 1.5 points in the CES-D scores in both treatment groups in both the BCPT and STAR). Possible reasons for this phenomenon, including the effects of enrollment screening or the impact of participating in a prevention trial, have been explored elsewhere.8
Cognitive change in STAR will be evaluated in much greater detail in NSABP protocol Co-STAR. This ancillary study recruited 1510 women to undergo annual neuropsychological batteries focusing on verbal and nonverbal memory, other cognitive abilities, and mood. The primary hypotheses are that the cognitive changes with age will not differ between tamoxifen and raloxifene and that the changes with tamoxifen will be similar to those seen for placebo in the Women's Health Initiative Study of Cognitive Aging. The present analysis of the symptom checklist is consistent with these hypotheses: we found that self-reported forgetfulness did not change significantly over time, nor did it differ by treatment (data not shown). This observation is also consistent with past data. Two double-blind randomized trials demonstrated a protective effect of raloxifene against cognitive decline but only for raloxifene given at 120 mg/d and not at the 60 mg/d used in STAR.40,41 The BCPT symptom checklist included 3 cognitive items: forgetfulness, difficulty concentrating, and easily distracted. No treatment differences were reported for these items in the BCPT.3,7,9
Patient-reported outcomes are particularly useful in the setting of prevention, where individuals must make a choice between an agent with possible adverse effects and an abstract risk of cancer. A woman's physician can help guide her based on anecdotal evidence from the physician's own clinical practice. However, anecdotal evidence has at times been misleading.7,8 Since the 1970s, anecdotal evidence regarding clinical outcomes has been replaced with rigorously standardized clinical trials. Symptom and QOL evidence must also be rigorously evaluated. The NSABP's STAR trial, with its large-scale symptom evaluation and well-powered QOL substudy, provides a comprehensive, detailed view of the patient experience using raloxifene and tamoxifen. Both of these agents are indicated for prevention in large populations, so these results can be widely used as tools in decision making or in helping a woman anticipate and cope with the sequelae of her chosen agent.
Corresponding Author: Stephanie R. Land, PhD, 201 N Craig St, Suite 350, Pittsburgh, PA 15213 (email@example.com).
Published Online: June 5, 2006 (doi:10.1001/jama.295.23.joc60075).
Author Contributions: Dr Land had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Wickerham, Costantino, Wolmark, Ganz.
Acquisition of data: Land, Costantino, Ritter, Vogel, Pajon, Wade, Dakhil, Lockhart.
Analysis and interpretation of data: Land, Vogel, Lee, Wolmark, Ganz.
Drafting of the manuscript: Land, Costantino, Vogel, Lee, Wade, Ganz.
Critical revision of the manuscript for important intellectual content: Wickerham, Ritter, Vogel, Pajon, Dakhil, Lockhart, Wolmark, Ganz.
Statistical analysis: Land, Lee.
Obtained funding: Costantino.
Administrative, technical, or material support: Land, Wickerham, Costantino, Ritter, Vogel, Pajon, Wade, Lockhart, Wolmark.
Study supervision: Costantino, Vogel, Pajon, Wolmark, Ganz.
Financial Disclosures: Dr Wickerham has reported serving as a consultant for and on the speaker's bureau of AstraZeneca Pharmaceuticals; Dr Vogel has reported serving on the speaker's bureau of AstraZeneca Pharmaceuticals and Eli Lilly; and Dr Wolmark has reported receiving honorarium from Eli Lilly. No other authors reported disclosures.
Funding/Support: This study was supported by Public Health Service grants U10-CA-37377, U10-CA-69974, U10CA-12027, and U10CA-69651 from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services, and AstraZeneca Pharmaceuticals and Eli Lilly and Company.
Role of the Sponsor: The study sponsors had no role in any aspect of study design, data collection, analysis and interpretation of data, or in the development of the manuscript. Per contractual arrangement, the manuscript was submitted to AstraZeneca and Eli Lilly before submission.
Acknowledgment: We thank Barbara C. Good, PhD, Director of Scientific Publications for the National Surgical Adjuvant Breast and Bowel Project, and Wendy Rea, BA, for editorial assistance. Dr Good and Ms Rea are employees of the NSABP. They were not compensated beyond their normal salaries for this work. We also thank the courageous participants without whom this trial could not have been accomplished.