Figure 1. Study participant flow diagram.
Figure 2. Mean Roland-Morris Disability Questionnaire scores (A) and symptom bothersomeness scores (B) and 95% confidence intervals by treatment group and time since randomization.
Figure 3. Participants with improvement. Percentage of participants improving by at least 3 points on the Roland-Morris Disability Questionnaire scale (A) and by at least 2 points on the symptom bothersomeness scale (B).
Cherkin DC, Sherman KJ, Avins AL, Erro JH, Ichikawa L, Barlow WE, Delaney K, Hawkes R, Hamilton L, Pressman A, Khalsa PS, Deyo RA. A Randomized Trial Comparing Acupuncture, Simulated Acupuncture, and Usual Care for Chronic Low Back Pain. Arch Intern Med. 2009;169(9):858-866. doi:10.1001/archinternmed.2009.65
Acupuncture is a popular complementary and alternative treatment for chronic back pain. Recent European trials suggest similar short-term benefits from real and sham acupuncture needling. This trial addresses the importance of needle placement and skin penetration in eliciting acupuncture effects for patients with chronic low back pain.
A total of 638 adults with chronic mechanical low back pain were randomized to individualized acupuncture, standardized acupuncture, simulated acupuncture, or usual care. Ten treatments were provided over 7 weeks by experienced acupuncturists. The primary outcomes were back-related dysfunction (Roland-Morris Disability Questionnaire score; range, 0-23) and symptom bothersomeness (0-10 scale). Outcomes were assessed at baseline and after 8, 26, and 52 weeks.
At 8 weeks, mean dysfunction scores for the individualized, standardized, and simulated acupuncture groups improved by 4.4, 4.5, and 4.4 points, respectively, compared with 2.1 points for those receiving usual care (P < .001). Participants receiving real or simulated acupuncture were more likely than those receiving usual care to experience clinically meaningful improvements on the dysfunction scale (60% vs 39%; P < .001). Symptoms improved by 1.6 to 1.9 points in the treatment groups compared with 0.7 points in the usual care group (P < .001). After 1 year, participants in the treatment groups were more likely than those receiving usual care to experience clinically meaningful improvements in dysfunction (59% to 65% vs 50%, respectively; P = .02) but not in symptoms (P > .05).
Although acupuncture was found effective for chronic low back pain, tailoring needling sites to each patient and penetration of the skin appear to be unimportant in eliciting therapeutic benefits. These findings raise questions about acupuncture's purported mechanisms of action. It remains unclear whether acupuncture or our simulated method of acupuncture provides physiologically important stimulation or represents placebo or nonspecific effects.
clinicaltrials.gov Identifier: NCT00065585
Americans spend at least $37 billion annually on medical care for back pain,1,2 and our economy suffers another $19.8 billion in lost worker productivity.3 There is no evidence that escalating expenses for spine care have improved self-assessed health status.2
Many patients with back pain are dissatisfied with medical care4 and seek care from complementary and alternative medical providers, including acupuncturists.5,6 Back pain is the leading reason for visits to licensed acupuncturists,7 and medical acupuncturists consider acupuncture an effective treatment for back pain.8
Several recent, well-designed European trials have suggested that real acupuncture and “sham” acupuncture (eg, shallow needling of points considered ineffective) are equally effective9,10 and that both are superior to best-practice medical care,10 usual care,11-13 and a wait-list control.9
Our trial expands on the findings of the European studies by (1) including a noninsertive method of stimulating acupuncture points, which permitted assessment of the need for needle insertion to achieve therapeutic benefit, (2) including both individualized and standardized forms of acupuncture, and (3) following patient outcomes for longer than most of the European trials. Thus, this trial was designed to address the following questions about the value of acupuncture for chronic low back pain:
Is acupuncture more effective than usual medical care alone?
Is real acupuncture more effective than simulated (noninsertive) acupuncture?
Is individualized acupuncture more effective than standardized acupuncture?
We conducted a 4-arm randomized controlled trial comparing the effectiveness of individualized acupuncture, standardized acupuncture, simulated acupuncture, and usual care. Study design details are described elsewhere.14 This trial was approved by the institutional review boards of Group Health Cooperative, Seattle, Washington, and Kaiser Permanente of Northern California, Oakland. All participants gave written informed consent.
Patients aged 18 to 70 years who were receiving care for a back problem from an integrated health care delivery system in western Washington and another in northern California within the prior year were potentially eligible. We used electronic records to identify persons with diagnosis codes consistent with uncomplicated chronic low back pain within the prior 3 to 12 months. We excluded persons with (1) specific causes of back pain (eg, cancer, fractures, spinal stenosis, infections), (2) complicated back problems (eg, sciatica, prior back surgery, medicolegal issues), (3) possible contraindications for acupuncture (eg, coagulation disorders, cardiac pacemakers, pregnancy, seizure disorder), (4) conditions making treatment difficult (eg, paralysis, psychoses), and (5) conditions that might confound treatment effects or interpretation of results (eg, severe fibromyalgia, rheumatoid arthritis, concurrent care from other providers). Persons with less than 3 months of back pain or previous acupuncture treatment for any condition were excluded.
Recruitment occurred from March 2004 through August 2006. Three to 12 months after back-related visits, potential participants were mailed invitation letters. Study staff telephoned respondents to determine final eligibility, which required a severity rating of at least 3 on the 0 to 10 back pain bothersomeness scale. We also mailed letters to members without recent visits for back pain and advertised in clinics and newsletters.
Those found eligible were administered a baseline questionnaire and randomly allocated to 1 of 4 treatment groups, using a centrally generated variable-sized block design. Treatments began within 2 weeks of randomization. The study was described only as a comparison of 3 methods of stimulating acupuncture points without information about how treatments differed.
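The allocation scheme described above can be sketched as follows. This is a minimal illustration of permuted-block randomization with variable block sizes, not the trial's actual procedure: the block sizes (4 or 8), the arm labels, and the function name `variable_block_allocation` are assumptions, since the source gives no further detail than "centrally generated variable-sized block design."

```python
import random

# Arm labels are illustrative names for the 4 treatment groups.
ARMS = ["individualized", "standardized", "simulated", "usual_care"]


def variable_block_allocation(n_participants, block_multiples=(1, 2), seed=None):
    """Build an allocation list from randomly sized permuted blocks.

    Each block holds every arm the same number of times (block sizes of
    4 or 8 with the default, assumed, multiples), so group sizes stay
    nearly balanced while the next assignment is hard to predict.
    """
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        multiple = rng.choice(block_multiples)  # pick this block's size
        block = ARMS * multiple
        rng.shuffle(block)                      # permute within the block
        allocation.extend(block)
    return allocation[:n_participants]          # last block may be truncated


allocation = variable_block_allocation(638, seed=2004)
counts = {arm: allocation.count(arm) for arm in ARMS}
```

Because only the final block can be truncated, the group counts in this sketch differ by at most the largest block multiple.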
Participants assigned to a real or simulated acupuncture treatment were treated twice weekly for 3 weeks and then weekly for 4 weeks (10 treatments total). Participants were asked to wear eye masks and lie prone with their heads in a face cradle. Electrostimulation, moxibustion, herbs, and other nonneedle adjuncts were proscribed.
One of 5 diagnostician acupuncturists with 7 to 18 years’ experience evaluated participants at each visit using traditional Chinese medical diagnostic techniques and prescribed individualized traditional Chinese medical treatments to be used only for participants randomized to individualized acupuncture. A therapist acupuncturist then delivered the assigned treatments, interacting minimally with participants and the diagnostician, who remained masked to treatment. Treatments were performed in research clinics at the 2 sites by 6 licensed acupuncturists with 4 to 19 years of experience. All acupuncturists were experienced in using traditional Chinese medical acupuncture for musculoskeletal pain. Of the 11 study acupuncturists, 9 had at least 3 years of formal training and the 2 others had practiced for over 15 years.
Acupuncturists used sterile disposable 32-gauge needles (0.25 mm) at least 1.5 inches in length. Needling depth varied slightly, depending on the acupuncture point, but was generally between 1 and 3 cm.
Individualized acupuncture was the treatment prescribed by the diagnostician at the beginning of each visit. It could include any acupuncture points that could be needled with the participant lying prone. There were no constraints on number of needles, depth of insertion, or needle manipulation. Treatments averaged 10.8 needles (range, 5-20) retained for 18 minutes (range, 15-20 minutes). Seventy-four distinct points were used, half on the “Bladder meridian” that includes points on the back and legs.
We used a standardized acupuncture prescription considered effective by experts for chronic low back pain.15 This included 8 acupuncture points commonly used for chronic low back pain (Du 3, Bladder 23–bilateral, low back ashi point, Bladder 40–bilateral, Kidney 3–bilateral) on the low back and lower leg.14 All acupuncture points were needled for 20 minutes, with stimulation by twirling the needles at 10 minutes and again just prior to needle removal. Therapists manipulated the needles to elicit “de qi,” which they perceive as a biomechanical response in tissue as it tightens around the inserted needle and constricts its movement.16
We developed a simulated acupuncture technique using a toothpick in a needle guide tube, which was found to be a credible acupuncture treatment by acupuncture-naïve patients with back pain.14,17 Simulating insertion involved holding the skin taut around each acupuncture point and placing a standard acupuncture needle guide tube containing a toothpick against the skin. The acupuncturist tapped the toothpick gently, twisting it slightly to simulate an acupuncture needle grabbing the skin, and then quickly withdrew the toothpick and guide tube while keeping his or her fingers against the skin for a few additional seconds to imitate the process of inserting the needle to the proper depth. All acupuncture points were stimulated with toothpicks at 10 minutes (ie, the acupuncturist touched each acupuncture point with the tip of a toothpick without the guide tube and rotated the toothpick clockwise and then counterclockwise less than 30°) and again at 20 minutes just before they were “removed.” To simulate withdrawal of the needle, the acupuncturist tightly stretched the skin around each acupuncture point, pressed a cotton ball firmly on the stretched skin, then momentarily touched the skin with a toothpick (without the guide tube) and quickly pulled the toothpick away using the same hand movements as in regular needle withdrawal. The acupuncturists simulated insertion and removal of needles at the 8 acupuncture points used in the standardized treatment.
Participants in the usual care group received no study-related care—just the care, if any, they and their physicians chose (mostly medications, primary care, and physical therapy visits). All participants received a self-care book with information on managing flare-ups, exercise, and lifestyle modifications.18
Outcomes were measured at baseline and after 8, 26, and 52 weeks using computer-assisted telephone interviews by interviewers masked to treatment. Prespecified primary outcomes were back-related dysfunction and symptom bothersomeness at end of treatment (8 weeks). Dysfunction was measured using the modified Roland-Morris Disability Questionnaire (RMDQ), a reliable, valid, and sensitive measure19 appropriate for telephone administration. Participants were also asked to rate how bothersome their pain had been during the past week on a scale of 0 (“not at all bothersome”) to 10 (“extremely bothersome”). This measure demonstrates substantial construct validity.20-22
Secondary outcomes included the following: (1) 26- and 52-week outcomes for the primary outcome measures, (2) proportion of participants with clinically meaningful improvements23 in dysfunction (≥3 point decrease on the RMDQ scale) and back pain (≥2 point decrease in symptom bothersomeness), (3) self-reported medication use for back pain in the prior week, (4) physical and mental health component summary scores of the Medical Outcomes Study Short-Form 36 Health Survey (SF-36),24 and (5) number of days spent in bed, number of days lost from work or school, or cutting down on usual activities due to back problems during the past month.25 Finally, participants' use of health services for back pain during the year following randomization was measured using interview data and, for the Washington site, automated health plan use data.
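The responder definition in item (2) can be expressed as a short sketch. The thresholds (a decrease of at least 3 RMDQ points or at least 2 bothersomeness points) come from the text; the function names and the example scores are hypothetical and are not trial data.

```python
RMDQ_THRESHOLD = 3            # points of decrease on the 0-23 RMDQ scale
BOTHERSOMENESS_THRESHOLD = 2  # points of decrease on the 0-10 scale


def is_clinically_improved(baseline, follow_up, threshold):
    """True if the score fell by at least `threshold` (lower scores are better)."""
    return (baseline - follow_up) >= threshold


def proportion_improved(score_pairs, threshold):
    """Share of (baseline, follow_up) pairs that meet the improvement threshold."""
    improved = sum(
        is_clinically_improved(baseline, follow_up, threshold)
        for baseline, follow_up in score_pairs
    )
    return improved / len(score_pairs)


# Hypothetical RMDQ scores for three participants (not trial data).
example_pairs = [(10, 6), (12, 11), (8, 5)]
share = proportion_improved(example_pairs, RMDQ_THRESHOLD)
```

Note that a decrease of exactly 3 points counts as improved under this definition, so two of the three hypothetical participants qualify.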
We also asked questions to determine if participants in the acupuncture groups perceived different experiences and to assess efforts to mask the diagnostician acupuncturists and outcomes assessors to study treatment.
Finally, participants were asked about adverse experiences at each visit and during the 8-week telephone follow-up assessment. This trial was monitored by the National Center for Complementary and Alternative Medicine Data Safety Monitoring Board.
Analyses were based on an intent-to-treat approach using randomized group assignment. The primary outcomes were analyzed as continuous measures. Analysis of covariance was used to test for treatment differences at the follow-up assessment, adjusting for the baseline measure. We also adjusted for site, age group (18-29, 30-39, 40-49, 50-59, and ≥60 years), and sex. Interactions of treatment group with age and site were added to test for effect modification by each of these covariates. Because expectations may influence outcomes, we also examined models that included expectation of acupuncture helpfulness and expectation of low back pain improvement as covariates and tested for effect modification with treatment groups dichotomized as treatment (real or simulated acupuncture) vs usual care.
Separate analyses were performed for each follow-up time. Results from the 8-week follow-up assessment were the primary end point. Pairwise comparisons using Tukey-Kramer adjustment were performed if the global test for differences among groups was significant at the P < .05 (2-sided) level. The study was powered to detect mean differences of 2.0 points on the RMDQ scale and 1.5 points on the symptom bothersomeness scale, using variance estimates from our pilot study.14 Previous studies have suggested that these cutoff values are at the lower end of clinically important differences,21-23 so their use should result in ample statistical power. We had 99% power to detect such differences in the overall analysis and approximately 80% power to detect pairwise differences after adjustment for multiple comparisons. To test for overall differences in our secondary outcomes, a χ² test was used for categorical outcomes and analysis of covariance was used for continuous outcomes. We used SAS/STAT statistical software (version 9.1; SAS Institute Inc, Cary, NC).26
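How sample size, effect size, and multiple-comparison adjustment interact in a power statement like the one above can be illustrated with a normal-approximation sketch. This is not the authors' calculation: the pilot-study variance estimates are not reported here, so `sd` is a hypothetical input, and dividing alpha across the 6 possible pairwise comparisons (a Bonferroni-style correction) is a simple stand-in for the Tukey-Kramer adjustment.

```python
from statistics import NormalDist


def two_sample_power(delta, sd, n1, n2, alpha=0.05):
    """Approximate power of a 2-sided z-test for a difference in means.

    delta: true mean difference to detect (eg, 2.0 RMDQ points)
    sd:    common standard deviation (hypothetical; not given in the text)
    """
    z = NormalDist()
    standard_error = sd * (1 / n1 + 1 / n2) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / standard_error
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Illustrative inputs only: sd=5.0 is an assumed value, not a trial estimate.
unadjusted = two_sample_power(delta=2.0, sd=5.0, n1=160, n2=160)
adjusted = two_sample_power(delta=2.0, sd=5.0, n1=160, n2=160, alpha=0.05 / 6)
```

The adjusted power is always lower than the unadjusted power for the same inputs, which is why pairwise power in a 4-arm trial falls below the power of a single comparison.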
We evaluated 2605 potential participants for eligibility; 641 (25%) were eligible and randomized (Figure 1). The main reasons for ineligibility were less than 3 months of back pain, sciatica, previous acupuncture, and inability to attend treatment visits. Three participants were excluded after randomization when we learned they had had exclusionary criteria when randomized (previous acupuncture treatment, involvement in litigation, and fibromyalgia). Therefore, analyses included 638 participants randomized to individualized acupuncture (n = 157), standardized acupuncture (n = 158), simulated acupuncture (n = 162), or usual care (n = 161). Follow-up rates were 95%, 91%, and 91% at 8, 26, and 52 weeks, respectively, and were similar across groups.
Study participants had a mean age of 47 years, and 62% were female, 68% were white, and 53% were college graduates (Table 1). Overall mean scores of 10.6 on the RMDQ scale and 5.1 for symptom bothersomeness indicated moderately severe chronic back problems. Two-thirds of participants reported at least 1 year of pain and current use of low back pain medication. Overall, participants were moderately optimistic that acupuncture would help (mean of 6.7 on a 0-10 scale).
A priori, we defined treatment adherence as completion of 8 or more of the 10 possible visits. By this definition, 84%, 87%, and 90% of participants were adherent in the individualized, standardized, and simulated acupuncture groups, respectively. At 8 weeks, 18% of participants reported having read more than two-thirds of the self-care book with no differences among groups (P = .42).
All groups showed improved function and decreased symptoms at the primary end point of 8 weeks (Table 2 and Figure 2) in both the adjusted and unadjusted analyses. However, as seen in Table 2, unadjusted mean dysfunction scores for the individualized, standardized, and simulated acupuncture groups improved 4.4 to 4.5 points, compared with 2.1 points for those receiving usual care. There was a statistically significant difference in function among all 4 groups (P < .001 after adjustment for covariates) and statistically significant differences between usual care and each of the acupuncture groups adjusted for covariates and multiple comparisons (Table 3). However, there were no significant pairwise differences among the 3 acupuncture groups: individualized acupuncture was not significantly better than standardized acupuncture and real acupuncture was not significantly better than simulated acupuncture.
Mean values of the primary outcomes remained relatively stable from 8 to 52 weeks (Figure 2). The usual care group continued to have greater dysfunction than the real or simulated acupuncture groups through 52 weeks (P = .001). The real and simulated acupuncture groups did not differ significantly from one another, accounting for multiple comparisons (P > .05). The results for symptom bothersomeness were generally similar, but the differences among the 4 groups were smaller and no longer statistically significant at 52 weeks. Inclusion of the expectation measures did not alter the results, and these measures were not kept in the final models. There was no significant interaction between group and either age or site.
At 8 weeks, the proportion of participants improving at least 3 points on the RMDQ scale was about 60% in the real and simulated acupuncture groups, compared with only 39% in the usual care group (global test, P < .001) (Figure 3A). These superior outcomes in function for the real and simulated acupuncture groups remained significant at 26 weeks (P = .01) and 52 weeks (P = .02). Similar results were found for improvements of at least 2 points on the symptom bothersomeness score at 8 weeks (P < .001) (Figure 3B). However, overall differences were no longer significant at 26 or 52 weeks.
The use of medications for back pain in the past week (mostly nonsteroidal anti-inflammatory drugs) was similar across groups at baseline (ie, 62% to 65%), but by 8 weeks, it had decreased to 47% in the real and simulated acupuncture groups vs 59% in the usual care group (P = .01). This difference persisted at 26 and 52 weeks.
There was an overall group difference (favoring real and simulated acupuncture) at 8 weeks in both the SF-36 mental (P = .03) and physical (P < .001) component scores, but these differences were small (<4 points) and no longer significant at 52 weeks. At 52 weeks, significantly more participants reported cutting down on activities for more than 7 days in the past month in the usual care group (18%) than in the real or simulated acupuncture groups (5%-7%) (P < .001). Similarly, more participants in the usual care group missed work or school for more than a day in the past month (16%) than in the real or simulated acupuncture groups (5%-10%) (P = .01).
Use of nonstudy treatments for back pain reported at the 8-week interview was similar across the real and simulated acupuncture groups, so pooled results are reported. Participants in the usual care group were twice as likely as those receiving real or simulated acupuncture to report a physician or physical therapist visit (21% vs 11%; P = .001) or to have visited a complementary and alternative medicine provider (18% vs 8%; P < .001).
At the Washington site, mean total costs of back-related health services for the year after randomization were similar in the 4 treatment groups (range, $160-$221; P = .65). This excludes costs of the study's real and simulated acupuncture treatments and the cost of the 1 spine operation in the usual care group.
Participants rated the acupuncture and simulated acupuncture treatments almost identically with regard to provider skills and caring. The diagnostician acupuncturists rated the acupuncture and simulated acupuncture groups very similarly with regard to apparent efficacy and likelihood of receiving individualized treatment.
Of the 477 participants receiving real or simulated acupuncture, 11 reported a moderate adverse experience possibly related to treatment (mostly short-term pain) and 1 reported a severe experience (pain lasting 1 month). One participant reported dizziness and another, back spasms. Rates of adverse experiences differed by treatment group: 6 of 157 participants for individualized acupuncture, 6 of 158 for standardized acupuncture, and 0 of 162 for simulated acupuncture (P = .04).
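The group comparison of adverse experience rates can be checked with a short sketch. It assumes a Pearson χ² test on the 3 × 2 table of counts given above; the text does not state which test produced the reported P value, and with expected event counts below 5 in each group an exact test might be preferred in practice. With 2 degrees of freedom, the upper-tail probability of the χ² distribution has the closed form exp(−x/2), so no statistical library is needed.

```python
import math


def chi_square_events(events, totals):
    """Pearson chi-square statistic for events vs non-events across groups."""
    total_events = sum(events)
    total_n = sum(totals)
    statistic = 0.0
    for observed_events, n in zip(events, totals):
        expected_events = n * total_events / total_n
        expected_non_events = n - expected_events
        observed_non_events = n - observed_events
        statistic += (observed_events - expected_events) ** 2 / expected_events
        statistic += (observed_non_events - expected_non_events) ** 2 / expected_non_events
    return statistic


def p_value_2_df(statistic):
    """Upper-tail probability for a chi-square variable with exactly 2 df."""
    return math.exp(-statistic / 2)


# Counts from the text: individualized, standardized, simulated acupuncture.
statistic = chi_square_events(events=[6, 6, 0], totals=[157, 158, 162])
p_value = p_value_2_df(statistic)
```

Under these assumptions the statistic is about 6.3 and the P value rounds to .04, in line with the value reported above.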
Compared with usual care, individualized acupuncture, standardized acupuncture, and simulated acupuncture had beneficial and persisting effects on chronic back pain. These treatments resulted in clinically meaningful improvements in function. Substantial adverse experiences with needle insertion were infrequent (1 of 315 participants). Self-reported medication use in the real and simulated acupuncture groups decreased significantly more than in the usual care group and remained lower through the 1-year follow-up. However, the 8 to 10 acupuncture treatments received by most participants (which would cost between $600 and $1200) did not result in cost savings to the health plan during the year after randomization.
This trial differs from our earlier study (which found similar effects for acupuncture and a more rigorous educational intervention) by including a usual care group, participants with more chronic pain, and participants who were all acupuncture naïve.27 However, our findings are consistent with those of recent high-quality trials. One German trial found that both real acupuncture and sham acupuncture (superficial needling at nonacupuncture points) had similar effects that were superior to those of guideline-based conventional medical treatment.10 A second German trial found that both real and sham acupuncture were superior to a wait list control group but not significantly different from each other.9 Finally, a British trial found that traditional acupuncture care delivered in a primary care setting had modestly superior results compared with usual care after 2 years.12 Our trial extends the findings from these studies by demonstrating that needle insertion is not necessary to achieve therapeutic benefits and by measuring longer-term outcomes.
Collectively, these recent trials provide strong and consistent evidence that real acupuncture needling using the Chinese meridian system is no more effective for chronic back pain than various purported forms of sham acupuncture. However, both real and sham acupuncture appear superior to usual care. Possible explanations for these findings include the following: (1) superficial acupuncture point stimulation directly stimulates physiological processes that ultimately lead to improved pain and function, or (2) participants' improved functioning resulted from nonspecific effects such as therapist conviction, patient enthusiasm, or receiving a treatment believed to be helpful.
The appropriateness of using minimal, superficial, or sham control groups in trials of acupuncture remains controversial.28 In fact, the use of blunt needles that did not penetrate the skin was described 2000 years ago in the classic book on acupuncture needling.29 A study using functional magnetic resonance imaging found that superficial and deep needling of an acupuncture point elicited similar blood oxygen level–dependent responses.30 Another study demonstrated that lightly touching the skin can stimulate mechanoreceptors that induce emotional and hormonal reactions, which in turn alleviate the affective component of pain.31 This could explain why trials evaluating acupuncture for pain have failed to find that real acupuncture is superior to sham or superficial control treatments and raises questions about whether sham treatments truly serve as inactive controls.
The possibility that an acupuncture treatment “experience” could be beneficial because of nonspecific effects is also credible.32 A recent acupuncture trial for irritable bowel syndrome reported that nonspecific effects (especially the patient-clinician relationship) produced statistically and clinically significant outcomes.33,34 The potency of nonspecific effects has also been noted in placebo-controlled randomized trials of surgical interventions for pain conditions.35,36
The main strengths of this trial are its size, high compliance and follow-up rates, long-term follow-up, inclusion of a simulated acupuncture control, and effective masking. Limitations include restricting treatment to a single component (needling) of normal traditional Chinese medicine acupuncture,37 prespecification of the number and duration of treatments, limited conversation between acupuncturists and participants, and exclusion of a medical attention control group. However, a recent trial using a similar number and duration of visits for both the acupuncture and medical care control groups also found that acupuncture was superior.10
Our results have important implications for key stakeholders. For clinicians and patients seeking a relatively safe38,39 and effective treatment for a condition for which conventional treatments are often ineffective, various methods of acupuncture point stimulation appear to be reasonable options, even though the mechanism of action remains unclear. Furthermore, the reduction in long-term exposure to the potential adverse effects of medications is an important benefit that may enhance the safety of conventional medical care. The number of patients who would need to be treated with insertive or superficial acupuncture stimulation to result in 1 person achieving meaningful improvement in function ranges from 5 (for short-term benefits) to 8 (for persisting benefits).
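The number needed to treat (NNT) is the reciprocal of the absolute difference in response rates between treated and control participants. The sketch below applies that standard formula to the percentages reported earlier for clinically meaningful improvement in function (about 60% vs 39% at 8 weeks; 59% to 65% vs 50% at 1 year). The function name is illustrative, and because the pooled 1-year treatment rate is not given in the text, only a range is computed for that time point.

```python
def number_needed_to_treat(rate_treated, rate_control):
    """Reciprocal of the absolute difference in response rates (unrounded)."""
    absolute_difference = rate_treated - rate_control
    if absolute_difference <= 0:
        raise ValueError("treated response rate must exceed control rate")
    return 1 / absolute_difference


# 8 weeks: about 60% improved with real or simulated acupuncture vs 39% with usual care.
nnt_short_term = number_needed_to_treat(0.60, 0.39)

# 52 weeks: 59% to 65% vs 50%, so the NNT lies between these two bounds.
nnt_long_term_low = number_needed_to_treat(0.65, 0.50)
nnt_long_term_high = number_needed_to_treat(0.59, 0.50)
```

The 8-week value is just under 5, and the 1-year bounds (roughly 7 to 11) bracket the figure of 8, consistent with the range stated above.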
In conclusion, acupuncturelike treatments significantly improved function in persons with chronic low back pain. However, the finding that benefits of real acupuncture needling were no greater than those of noninsertive stimulation raises questions about acupuncture's purported mechanism of action. Future research is needed to determine the relative contributions of the physiologic effects of noninsertive stimulation, patient expectations, and other nonspecific effects.
Correspondence: Daniel C. Cherkin, PhD, Center for Health Studies, 1730 Minor Ave, Ste 1600, Seattle, WA 98101.
Accepted for Publication: December 17, 2008.
Author Contributions: The principal investigator, Dr Cherkin, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Cherkin, Sherman, Barlow, and Deyo. Acquisition of data: Cherkin, Sherman, Avins, Erro, Barlow, Delaney, Hawkes, Hamilton, and Pressman. Analysis and interpretation of data: Cherkin, Sherman, Avins, Ichikawa, Barlow, Delaney, Hamilton, Pressman, Khalsa, and Deyo. Drafting of the manuscript: Cherkin, Sherman, Ichikawa, and Barlow. Critical revision of the manuscript for important intellectual content: Cherkin, Sherman, Avins, Erro, Ichikawa, Barlow, Delaney, Hawkes, Hamilton, Pressman, Khalsa, and Deyo. Statistical analysis: Sherman, Ichikawa, Barlow, Delaney, and Pressman. Obtained funding: Cherkin and Avins. Administrative, technical, and material support: Cherkin, Avins, Erro, Delaney, Hawkes, Hamilton, Pressman, Khalsa, and Deyo. Study supervision: Cherkin, Sherman, Erro, Barlow, Hawkes, and Hamilton.
Financial Disclosure: None reported.
Funding/Support: This trial was funded through a National Institutes of Health (NIH) Cooperative Agreement (U01 AT 001110) with the National Center for Complementary and Alternative Medicine (NCCAM). The sponsor (NIH), through its project officer, Dr Khalsa, was involved in the analysis and interpretation of data and review and approval of the manuscript. Lhasa OMS Inc, Weymouth, Massachusetts, donated the Seirin acupuncture needles used in this study.
Disclaimer: The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of NCCAM.
Previous Presentation: Portions of the results described in the article were presented at the Ninth International Forum for Primary Care Research on Low Back Pain; October 6, 2007; Majorca, Spain.
Additional Contributions: The following people provided assistance with this study: Sara Bayer, LAc, Ramey Fair, LAc, Larry Forsberg, LAc, Roxanne Geller, LAc, Caryn Goldman, LAc, Paul Griffin, LAc, Susan M. Kaetz, MPH, LAc, Ken Morris, LAc, Sabi Inderkum, LAc, Sachiko Nakano, LAc, Deborah Stanfill, LAc, and Eliot Wagner, LAc (diagnostician and therapist acupuncturists in private practice); Zoe Bermet, Marissa Brooks, John Ewing, Erika Holden, Danielle Huston, Christel Kratohvil, and Melissa Parson (clinic research assistants at Group Health Center for Health Studies); Olivia Anaya, Pete Bogdanos, Cynthia H. Huynh, Rebecca Rogot, and Caroline M. Sison (clinic research assistants at Kaiser Permanente Northern California Division of Research); Kabba Anand, DAc, LAc (consultant acupuncturist in private practice); Harley Goldberg, DO (physician with Kaiser Permanente Northern California); Juanita Jackson (administrative assistant, Group Health Center for Health Studies); and John Maio (programmer, Kaiser Permanente Northern California Division of Research). All of these individuals were compensated for their roles in the project. Our original NCCAM project officer, Richard Nahin, PhD, MPH, provided helpful advice.