EHR indicates electronic health record; MRI, magnetic resonance imaging; CT, computed tomography.
(a) Of the 4 patients who died with cancer: 1 had early imaging likely prompted by a red flag since the history on the imaging study indicated known cancer; 1 had early imaging (both plain films and computed tomography) that showed no evidence of cancer; 2 had no imaging at any time.
(b) Participants who had neither early radiographs nor early MRI/CT and were matched controls for comparisons.
eTable 1. Baseline diagnostic categories and ICD-9-CM diagnosis codes included in each
eText. Use and verification of natural language processing algorithm used to identify malignancy in imaging reports
eTable 2. Type of imaging by CPT code. Note: includes any spine imaging that occurred between day 0 and day 42 (inclusive). Some patients had >1 image
eTable 3. No-Early Imaging vs. Early Radiograph Health Care Costs and Falls
eTable 4. No-Early Imaging vs. Early MRI/CT Health Care Costs
eTable 5. Baseline Measures. Unmatched Early Imaging vs. No Early Imaging Patients
Jarvik JG, Gold LS, Comstock BA, Heagerty PJ, Rundell SD, Turner JA, Avins AL, Bauer Z, Bresnahan BW, Friedly JL, James K, Kessler L, Nedeljkovic SS, Nerenz DR, Shi X, Sullivan SD, Chan L, Schwalb JM, Deyo RA. Association of Early Imaging for Back Pain With Clinical Outcomes in Older Adults. JAMA. 2015;313(11):1143-1153. doi:10.1001/jama.2015.1871
Copyright 2015 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Importance
In contrast to the recommendations for younger adults, many guidelines allow for older adults with back pain to undergo imaging without waiting 4 to 6 weeks. However, early imaging may precipitate interventions that do not improve outcomes.
Objective
To compare function and pain at the 12-month follow-up visit among older adults who received early imaging with those who did not receive early imaging after a new primary care visit for back pain without radiculopathy.
Design, Setting, and Participants
Prospective cohort of 5239 patients 65 years or older with a new primary care visit for back pain (2011-2013) in 3 US health care systems. We matched controls 1:1 using propensity score matching of demographic and clinical characteristics, including diagnosis, pain severity, pain duration, functional status, and prior resource use.
Exposures
Diagnostic imaging (plain films, computed tomography [CT], magnetic resonance imaging [MRI]) of the lumbar or thoracic spine within 6 weeks of the index visit.
Main Outcome and Measures
Primary outcome: back or leg pain–related disability measured by the modified Roland-Morris Disability Questionnaire (score range, 0-24; higher scores indicate greater disability) 12 months after enrollment.
Results
Among the 5239 patients, 1174 had early radiographs and 349 had early MRI/CT. At 12 months, neither the early radiograph group nor the early MRI/CT group differed significantly from controls on the disability questionnaire. The mean score for patients who underwent early radiography was 8.54 vs 8.74 among the control group (difference, −0.10 [95% CI, −0.71 to 0.50]; mixed model, P = .36). The mean score for the early MRI/CT group was 9.81 vs 10.50 for the control group (difference, −0.51 [95% CI, −1.62 to 0.60]; mixed model, P = .18).
Conclusions and Relevance
Among older adults with a new primary care visit for back pain, early imaging was not associated with better 1-year outcomes. The value of early diagnostic imaging in older adults for back pain without radiculopathy is uncertain.
When to image older adults with back pain remains controversial. Most guidelines regarding acute or chronic back pain focus on younger age groups. Many guidelines recommend that older adults undergo early imaging because of the higher prevalence of serious underlying conditions.1-4 However, there is not strong evidence to support this recommendation.5 A Cochrane review of back pain in older adults concluded that there was “under-representation of the older population in the back pain literature.”6 Adverse consequences of early imaging are more substantial in an older population because the prevalence of incidental findings on spine imaging increases with age.7-11 Given the high prevalence of incidental findings in this age group, imaging older adults soon after initial presentation may lead to a cascade of subsequent interventions that increase costs without benefits. This phenomenon has been observed in workers’ compensation populations.12,13
We used data from a prospective cohort of patients aged 65 years or older who presented to primary or urgent care for a new episode of care for low back pain, defined as no prior visits for low back pain within the previous 6 months, as part of the Back Pain Outcomes Using Longitudinal Data (BOLD) project.14 We hypothesized that older adults who had lumbar spine imaging within 6 weeks of their index visit (early imaging), compared with those who did not, would have worse outcomes and greater health care use 1 year later.
We used a prospective observational cohort to compare, using propensity score matching of demographic and clinical characteristics, outcomes of patients who received vs those who did not receive early imaging.
We previously described the BOLD cohort.14,15 In brief, we prospectively enrolled 5239 patients 65 years or older initiating a new episode of care for back pain (Figure). We recruited patients presenting to primary or urgent care at 3 integrated health care systems: Harvard Vanguard, Henry Ford Health System, and Kaiser Permanente Northern California. The visit for back pain that qualified the patient for entry into the cohort was the index visit. We enrolled patients from March 2011 through March 2013 and categorized them by International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes as axial back pain alone, back and leg pain or herniated disk, lumbar spinal stenosis, and other (eTable 1 in the Supplement).
At baseline, interviewers administered the patient-reported measures in person or by telephone within 3 weeks of a patient’s index visit. We collected information on demographics; duration of current episode of back or leg pain (<1 month, 1-3 months, 3-6 months, 6-12 months, 1-5 years, >5 years); and recovery expectations (confidence that their pain would be completely gone or much better in 3 months, on a scale from 0 “not at all confident” to 10 “extremely confident”).
Our primary outcome was the Roland-Morris Disability Questionnaire (RMDQ),16 a measure of physical limitations due to back pain (range, 0, no pain-related limitations, to 24, maximal pain-related limitations). We slightly modified this measure to indicate limitations due to back or leg pain (sciatica), which is a widely used modification.16-18 The questionnaire contains 24 yes or no items. A minimal clinically important difference is 2 to 5 points.17,19 We administered the following 6 secondary outcome measures at baseline and at 3, 6, and 12 months (primary end point): (1) a 0 to 10 numerical rating scale of average back pain intensity in the past week (0, no pain; 10, pain as bad as can be imagined); (2) a 0 to 10 numerical rating scale of average leg pain intensity in the past week; (3) the Brief Pain Inventory (BPI) interference scale, which represents the mean of 7 ratings of back pain interference with general activity; mood; and ability to walk, perform normal work, engage in relations with other people, sleep, and enjoy life (0, no pain interference; 10, maximum pain interference)20,21; (4) the Patient Health Questionnaire (PHQ-4), a 4-item screen for depression and anxiety (score range, 0-12; higher scores indicate more severe symptoms)22; (5) the EuroQol health status measure (EuroQol 5D), which consists of an index score that ranges from 0 (death) to 1 (perfect health) and reflects mobility, self-care, usual activities, pain and discomfort, and anxiety and depression, together with a visual analog scale that ranges from 0 (the worst imaginable health state) to 100 (the best imaginable health state)23; and (6) a falls measure for which patients report the number of falls they experienced in the past 3 weeks and how many resulted in injury.24
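As an illustration of how a 24-item yes or no instrument with a 0 to 24 range can be scored, the following minimal sketch counts the items answered yes. The function name and the rule of rejecting incomplete questionnaires are assumptions made for this example; the article does not describe how missing items were handled.

```python
def rmdq_score(responses):
    """Score a 24-item yes/no disability questionnaire.

    responses: sequence of 24 booleans, True meaning the patient endorsed
    the limitation. Returns the count of endorsed items (0 to 24).
    Rejecting incomplete questionnaires is an assumption of this sketch.
    """
    if len(responses) != 24:
        raise ValueError("expected exactly 24 item responses")
    return sum(1 for answered_yes in responses if answered_yes)


# Hypothetical patient who endorses 9 of the 24 limitations.
example_score = rmdq_score([True] * 9 + [False] * 15)
```

Under this scoring, the 2- to 5-point minimal clinically important difference corresponds to a change in 2 to 5 endorsed items.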
We used electronic health record (EHR) data to calculate relative value units (RVUs),25-27 assess resource use, and estimate Quan comorbidity scores,28 a weighted score derived from the Charlson comorbidity index29 based on 12 comorbidities. The Quan score and prior year RVUs were used in propensity matching. We obtained patient data for the 365 days before and after the index visit or until a patient either withdrew or died. The data included Current Procedural Terminology (CPT) codes for each procedure and filled prescription data. One site used ICD-9-CM procedure codes rather than CPT codes, so we converted ICD-9-CM procedure codes to the closest corresponding CPT codes. We captured CPT and ICD-9-CM codes for clinic visits, hospitalizations, and imaging tests. The RVU totals did not account for medications. We linked CPT codes to year-specific RVUs, without including geographic modifiers. For applying costs to resource use, we used the Marketscan data warehouse to obtain 2012 payer and patient reimbursement amounts for CPT-based procedures and medications.30 Remaining outcomes were post hoc and exploratory.
Using data from the 12 months before the index visit, we calculated the Quan comorbidity score.28 To avoid altering the association between early imaging and RVUs, we summed all RVUs beginning with the day after the image (or the analogous day for the matched controls) until 365 days after the index visit for each individual. We performed similar summations for back pain–related RVUs (spine-related RVUs31; see eTable 1 in the Supplement for codes used), summarized as RVUs for physical therapy, injection therapy, imaging, and surgery.
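The follow-up window described above can be sketched as follows. This is a minimal illustration, not the study's code: the record layout (pairs of service date and RVU), the function name, and the choice to treat both window boundaries as inclusive are assumptions.

```python
from datetime import date, timedelta


def sum_followup_rvus(procedures, index_date, anchor_date):
    """Sum RVUs from the day after anchor_date through day 365 after index_date.

    procedures: iterable of (service_date, rvu) pairs (hypothetical layout).
    anchor_date: the imaging date for an imaged patient, or the analogous
                 date assigned to a matched control.
    Both boundaries are treated as inclusive (an assumption of this sketch).
    """
    start = anchor_date + timedelta(days=1)
    end = index_date + timedelta(days=365)
    return sum(rvu for service_date, rvu in procedures
               if start <= service_date <= end)


index_date = date(2012, 1, 10)
imaging_date = date(2012, 1, 20)
procedures = [
    (date(2012, 1, 20), 1.5),  # the early image itself: excluded
    (date(2012, 2, 1), 2.0),   # after the image: counted
    (date(2013, 1, 9), 0.5),   # day 365 after the index visit: counted
    (date(2013, 1, 10), 4.0),  # day 366: excluded
]
total = sum_followup_rvus(procedures, index_date, imaging_date)
```

Starting the window the day after the image keeps the RVUs of the exposure itself out of the outcome.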
If CPT codes were ambiguously spine-related, we only counted procedures as spine-related if they took place on the same date as unambiguous spine-related CPT codes. We classified all procedures on the index date as spine-related. Additionally, because 1 site used generic codes rather than CPT codes for all physical therapy encounters, we imputed RVUs at this site using year-appropriate RVUs for physical therapy from the other 2 sites. We included all procedures. For minor procedures, we imputed a year-appropriate 5-minute evaluation and management RVU (CPT code 99211).
We used natural language processing to help identify early imaging studies that reported cancer (see eText in the Supplement). We also examined the proportion of patients in each group diagnosed after the imaging date (or matched date for controls) with a serious condition that could be detected by spine imaging (cancer, spine infections, spine fractures, cauda equina compression). If patients in the early imaging group were more likely to be diagnosed with these conditions, it would raise the question of whether clinicians were missing these diagnoses in patients who did not receive early imaging.
Based on existing guidelines,32 we defined patients who underwent lumbar spine imaging within 6 weeks of their index visit as having received early imaging. Although guidelines differ in the exact length of time recommended to wait prior to imaging, 6 weeks is used by a variety of organizations33-35 and was the consensus optimal approach in 1 study.36 We defined 2 separate early imaging cohorts: (1) patients undergoing early plain film imaging (radiographs) and (2) patients undergoing early advanced imaging (magnetic resonance imaging [MRI] or computed tomography [CT]) (see eTable 2 in the Supplement). We assigned patients who had multiple imaging procedures within the first 6 weeks to the radiograph or advanced imaging group based on their first study. Some patients assigned to the early radiograph group could also have received early MRI/CT, but only if the imaging occurred after their radiograph. Patients who had radiographs and an MRI/CT on the same day were assigned to the advanced imaging group.
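The assignment rules in this paragraph can be expressed as a short sketch. The modality labels ("XR", "MRI", "CT"), the group names, and the function name are hypothetical; the day 0 to day 42 inclusive window follows the note to eTable 2.

```python
from datetime import date

ADVANCED = {"MRI", "CT"}  # hypothetical modality labels


def assign_imaging_group(index_date, studies, window_days=42):
    """Assign a patient to an imaging cohort based on the first early study.

    studies: list of (study_date, modality) pairs, modality in
             {"XR", "MRI", "CT"} (assumed labels).
    A radiograph and an MRI/CT on the same first day count as advanced imaging.
    """
    early = [(d, m) for d, m in studies
             if 0 <= (d - index_date).days <= window_days]
    if not early:
        return "no_early_imaging"
    first_day = min(d for d, _ in early)
    first_modalities = {m for d, m in early if d == first_day}
    if first_modalities & ADVANCED:
        return "early_mri_ct"
    return "early_radiograph"


index_date = date(2012, 3, 1)
# Radiograph on day 3, MRI on day 10: classified by the first study.
group_a = assign_imaging_group(
    index_date, [(date(2012, 3, 4), "XR"), (date(2012, 3, 11), "MRI")])
# Radiograph and MRI on the same day: advanced imaging group.
group_b = assign_imaging_group(
    index_date, [(date(2012, 3, 4), "XR"), (date(2012, 3, 4), "MRI")])
# Only study falls on day 50, outside the 6-week window.
group_c = assign_imaging_group(index_date, [(date(2012, 4, 20), "MRI")])
```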
We propensity-matched patients who underwent early imaging with a BOLD cohort patient who did not have any spine imaging within 6 weeks of the index visit. We constructed a propensity score as the logit function of the probability of receiving early imaging for a patient with specific characteristics or prognostic factors.37
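For reference, the logit of a propensity score can be written as below. The symbols are assumptions introduced for this note and do not appear in the article: Z = 1 denotes receipt of early imaging, X the vector of baseline covariates, and e(x) the estimated probability of early imaging.

```latex
e(x) = \Pr(Z = 1 \mid X = x), \qquad
\operatorname{logit}\bigl(e(x)\bigr) = \ln\!\left(\frac{e(x)}{1 - e(x)}\right)
```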
Because the patient sample characteristics differed across study sites,15 we stratified the propensity score matching algorithm by site. All regressions included sex; self-assessed race/ethnicity; age; educational, smoking, and marital status; Quan comorbidity score; baseline diagnosis category (axial back pain alone, back and leg pain or herniated disk, lumbar spinal stenosis, and other); back pain duration; receipt of spine imaging in the year prior to index visit; days between index visit and interview; total RVUs in prior year; baseline back pain intensity; leg pain intensity; Roland-Morris Disability Questionnaire, EuroQol 5D, and PHQ-4 scores; and recovery expectations. We separately matched patients receiving early radiographs or early MRI/CT to the closest control using a greedy algorithm, which finds the closest match of nonimaged to imaged patients without replacement until no further matches can be identified.38 Nonimaged patients could serve as controls in both the early radiograph and the early MRI/CT analyses. After matching at each site, we combined data from all 3 sites for further analysis.
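A greedy nearest-neighbor match without replacement, run within a single site, might look like the sketch below. It is not the authors' SAS implementation: the processing order of imaged patients, the optional caliper, and all names are assumptions. The article does not state whether a caliper was used.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def greedy_match(treated, controls, caliper=None):
    """Greedy 1:1 matching without replacement on the logit propensity score.

    treated, controls: dicts mapping patient id -> estimated probability of
    early imaging, for one site (call once per site to stratify).
    caliper: optional maximum allowed logit distance (an assumption here).
    Returns a dict mapping treated id -> matched control id.
    """
    available = {cid: logit(p) for cid, p in controls.items()}
    pairs = {}
    for tid, p in treated.items():  # processing order is an assumption
        if not available:
            break
        score = logit(p)
        best = min(available, key=lambda cid: abs(available[cid] - score))
        if caliper is not None and abs(available[best] - score) > caliper:
            continue  # no acceptable control; patient stays unmatched
        pairs[tid] = best
        del available[best]  # without replacement
    return pairs


treated = {"t1": 0.30, "t2": 0.60, "t3": 0.05}
controls = {"c1": 0.32, "c2": 0.58, "c3": 0.90}
pairs = greedy_match(treated, controls, caliper=0.2)
```

In this toy example the third imaged patient has no control within the caliper and is left unmatched, which mirrors how some imaged patients in a study of this kind can go unmatched.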
The study was approved by the institutional review boards (IRBs) of all the participating institutions. We obtained written or verbal consent from all patients with a waiver of documentation of consent having been granted by the IRBs.
We report summary descriptive statistics for the early radiograph and early MRI/CT groups. To compare the baseline characteristics of patients who received early imaging with those of matched patients who did not, we used McNemar tests for categorical variables and paired t tests for continuous variables. We also calculated standardized differences between the matched groups for all variables. Finally, we used linear mixed-effects models to obtain adjusted differences between those who received early imaging and those who did not on total RVUs, spine-specific RVUs (further subdivided into those for physical therapy, injection therapy, imaging, and surgery), patient-reported outcome measures at 3, 6, and 12 months, and reimbursement estimates. Each model was adjusted for sex, age, baseline back or leg pain diagnosis category, baseline back pain duration, and RVUs in the 12 months before the index visit. We used Bonferroni-corrected significance thresholds (2-sided α of .05/3 = .017) for some exploratory comparisons (overall RVUs, spine-specific RVUs, and back pain intensity) for each 12-month analysis.
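Two of the quantities in this paragraph are simple to illustrate. The sketch below computes a standardized difference for a continuous variable and the Bonferroni threshold for 3 comparisons. The pooled-standard-deviation formula is a commonly used definition and is assumed here; the article does not specify which formula was applied, and the data are made up.

```python
import math
from statistics import mean, variance


def standardized_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation.

    Uses sample variances and the common pooled form
    sqrt((s1^2 + s2^2) / 2); this choice is an assumption of the sketch.
    """
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


# Made-up baseline scores for illustration only.
std_diff = standardized_difference([1, 2, 3], [2, 3, 4])

# Bonferroni-corrected 2-sided threshold for 3 exploratory comparisons.
alpha_bonferroni = 0.05 / 3
```

Unlike a P value, the standardized difference does not shrink merely because the sample is large, which is why it is often reported alongside significance tests when checking balance after matching.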
Given the sample sizes in the early radiograph (n = 1174) and early MRI/CT (n = 349) groups, the study had more than 90% power to detect small between-group differences in Roland-Morris Disability Questionnaire score (radiograph, 1-point difference; MRI/CT, 2-point difference), overall RVUs (radiograph, 18-RVU difference; MRI/CT, 11-RVU difference), and spine-specific RVUs (radiograph, 60-RVU difference; MRI/CT, 45-RVU difference). Power calculations were post hoc.
We performed all analyses using SAS statistical software version 9.3 (SAS Institute Inc).
Of the 5239 participants, we excluded 84 patients for whom EHR data were not available, 228 patients who withdrew before completing 1 year of follow-up, 34 who had a cancer diagnosis in the year prior to the index visit, 34 who died, 5 who had lumbar spine surgery in the year prior to the index visit, and 1 patient who had a bone scan and no other imaging within 6 weeks after the index visit (Figure). Of the 1264 patients (26%) who received early radiographs, 1174 were matched, and of the 366 patients (7.5%) who received early MRI/CT, 349 were matched. The baseline characteristics of the propensity-matched participants who underwent early diagnostics did not differ statistically or clinically from those who did not (Table 1 and Table 2).
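The reported percentages can be checked with a few lines of arithmetic. This sketch assumes the exclusion categories are mutually exclusive and that the percentages use the post-exclusion cohort as the denominator; neither is stated explicitly in the text, but both are consistent with the reported figures.

```python
enrolled = 5239
exclusions = {
    "no EHR data": 84,
    "withdrew before 1 year": 228,
    "cancer in prior year": 34,
    "died": 34,
    "lumbar surgery in prior year": 5,
    "bone scan only": 1,
}

# Assumes no patient falls into more than one exclusion category.
analyzed = enrolled - sum(exclusions.values())

early_radiograph = 1264
early_mri_ct = 366
pct_radiograph = 100 * early_radiograph / analyzed
pct_mri_ct = 100 * early_mri_ct / analyzed
total_early_imaging = early_radiograph + early_mri_ct
```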
Table 3 summarizes the 3-, 6-, and 12-month patient-reported outcome measures. Twelve-month cumulative RVUs in the early radiograph and their matched controls are available in eTable 3 in the Supplement. Table 4 and eTable 4 in the Supplement show analogous data for the early MRI/CT group and their matched controls. Follow-up rates of patient-reported outcome measures ranged from 88% to 91% across groups and time points. There was neither a statistically significant nor clinically meaningful difference in the primary outcome, the Roland-Morris Disability Questionnaire, between the early and not early imaging groups at any time point (eg, 3-month no early radiograph vs early radiograph, mixed-model difference −0.02 [95% CI, −0.46 to 0.42]). Patient-reported outcomes were not different between the groups except for leg pain numerical rating scale scores. Patients receiving early radiography had lower numerical scores at months 3 (mean difference, 0.31), 6 (mean difference, 0.30), and 12 (mean difference, 0.26) for leg pain than did those who did not receive early imaging. Although statistically significant, these differences were clinically unimportant. The 12-month differences between early radiograph patients and controls for other secondary outcomes were extremely small and not statistically significant: −0.071 (95% CI, −0.29 to 0.15) for BPI; 0.29 (95% CI, −1.42 to 2.01) for the EuroQol 5D visual analog scale; and −0.062 (95% CI, −0.29 to 0.17) for PHQ-4. Patients with early MRI/CT vs controls had statistically significant but not clinically meaningful differences on 2 measures: the early MRI/CT group had lower 6-month leg pain numerical rating scale scores (difference, −0.58 [95% CI, −1.07 to −0.089]) and higher 12-month EuroQol 5D-visual analog scale scores (difference, 4.04 [95% CI, 0.92 to 7.15]).
In contrast, there were marked differences in 1-year resource use and costs. Mean total RVUs were approximately 40% higher (P < .001) in the early radiograph group and 50% higher (P = .01) in the early MRI/CT group than in the no early imaging or no imaging groups, and overall costs were 27% (P < .001) and 30% (P < .04) higher, respectively. Estimated monetary differences in 1-year total payments (payer and patient contributions) were $1380 higher (95% CI, $692 to $2060) for patients with early radiographs and $1430 higher (95% CI, $36.8 to $2820) for patients with early MRI/CTs (eTable 3 and eTable 4 in the Supplement). Early imaging cohorts incurred significantly greater mean RVUs, overall expenditures, and spine-related expenditures in most utilization categories. Spine-related, CPT-based expenditures as a percentage of overall expenditures were 17% in the early radiograph group vs 7% for the no early or no radiograph group, and 29% for the early MRI/CT group vs 11% for the no early or no MRI/CT group.
We did not observe differences in proportions of patients with cancer diagnoses in the next year among patients receiving early imaging vs controls (Table 5). Among patients who underwent early imaging, only 1 of 1630 (0.06%) had cancer (lymphoma) diagnosed on the early imaging study (lymphadenopathy seen on MRI). In contrast, patients who underwent imaging diagnostics early had more fractures detected (2% in the early radiograph group vs 0.6% in the no early or no radiograph group; 0.9% in the early MRI/CT group vs 0% in the no early or no MRI/CT group).
Our study demonstrates that older adults who had spine imaging within 6 weeks of a new primary care visit for back pain had pain and disability over the following year that was not different from matched patients who did not undergo early imaging. Patients receiving early imaging had small, clinically unimportant improvement in leg pain intensity and EuroQol 5D scores. We had hypothesized that patients undergoing early imaging would have worse outcomes, due to incidental findings leading to unnecessary and potentially harmful interventions. This was not the case. However, patients who had early imaging had substantially higher resource use and reimbursement expenditures than did matched controls, as reflected by greater RVUs, overall costs, and spine-specific costs. Overall, spine-specific, spine injection, and spine imaging RVUs and associated payer and patient expenditures were all greater in the early imaging groups than in the no early or no imaging groups.
Approximately 90% of older adults have incidental findings on spine imaging.11 These findings can lead to adverse labeling as well as unnecessary interventions with associated morbidity.39 Most guidelines exclude older patients from imaging restrictions. Prior studies suggested an association between early imaging and subsequent interventions13,40 and our results are concordant with a recent study of injured workers.41
Despite the lack of evidence supporting routine imaging for older adults with back pain, guidelines commonly recommend that older patients with back pain undergo imaging. Chou and colleagues1,42 recommended considering plain radiography for patients older than 50 years. The American College of Radiology’s guidelines state that early imaging with MRI is appropriate for patients older than 70 years and may be appropriate for patients older than 50 years with osteoporosis.32 The European guidelines for nonspecific low back pain classify patients older than 55 years as being in the red flag category for justifying imaging.43 Our study results support an alternative position that regardless of age, early imaging should not be performed routinely.
Because the rationale for undergoing early imaging is to avoid missing infrequent but serious diagnoses (cancer, infection, etc), we examined the proportion in each group that subsequently received these diagnoses over the ensuing year. We found that the proportion of cancer diagnoses was similar for both groups. Our data suggest that absence of early imaging is not associated with a higher incidence of missed cancer diagnoses. Only 1 case of cancer (lymphoma) was detected by early imaging, and this was not located in the spine but rather in adjacent adenopathy.
Our study has limitations. First, there is the potential for confounding by indication; ie, patients receiving early imaging may have had worse prognoses than patients not getting early imaging. We tried to minimize confounding through propensity matching. However, residual confounding by unmeasured attributes could exist. Confounding by health care site could also exist, since patient characteristics varied by site,15 as did patterns of care. Therefore, we adjusted for site in each analysis. Second, our data on pain duration are limited by the overlap of the pain duration categories. Third, our baseline measures were administered up to 3 weeks after the index visit and thus could reflect responses to therapy since the index visit. We assumed that all index-day procedures were related to the patient’s back pain, but patients’ index visits may have been for multiple problems, thus leading us to overattribute index-day procedures to back pain. Fourth, patients who are more likely to ask for early imaging might also be more likely to use resources subsequently. We attempted to control for this phenomenon by propensity matching for prior year RVUs and also controlled for prior year RVUs in our data analyses. Fifth, we assessed CPT-based and medication use but were not able to capture out-of-system use or indirect costs.
Among older adults with a new primary care visit for back pain, early imaging was not associated with better 1-year outcomes.
Corresponding Author: Jeffrey G. Jarvik, MD, MPH, Comparative Effectiveness, Cost and Outcomes Research Center (CECORC), Department of Radiology, University of Washington, PO Box 359728, 325 Ninth Ave, Seattle, WA 98104-2499 (firstname.lastname@example.org).
Author Contributions: Dr Jarvik and Mr Comstock had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Jarvik, Heagerty, Turner, Bresnahan, Friedly, Nerenz, Sullivan, Chan, Deyo.
Acquisition, analysis, or interpretation of data: Jarvik, Gold, Comstock, Heagerty, Rundell, Turner, Avins, Bauer, Bresnahan, Friedly, James, Kessler, Nedeljkovic, Nerenz, Shi, Chan, Schwalb, Deyo.
Drafting of the manuscript: Jarvik, Gold, Heagerty, Rundell, Bresnahan, Chan.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Jarvik, Gold, Comstock, Heagerty, Rundell, Avins, Kessler, Shi.
Obtained funding: Jarvik, Turner, Bresnahan, Friedly.
Administrative, technical, or material support: Jarvik, Gold, Comstock, Turner, Avins, Bauer, Bresnahan, Friedly, James, Nedeljkovic, Sullivan, Chan.
Study supervision: Jarvik, Heagerty, Avins, Nedeljkovic, Nerenz.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Jarvik reported that he served on the Comparative Effectiveness Advisory Board for GE Healthcare through October 2012; is a cofounder and stockholder of PhysioSonics, a high-intensity focused ultrasound company; receives royalties for intellectual property; and is also a consultant for HealthHelp, a radiology benefits management company. Dr Deyo reported that he has received honoraria as a member of the board of directors of the Informed Medical Decisions Foundation, a nonprofit organization; receives royalties from UpToDate for authoring topics on acute low back pain; and his university has received an endowment from Kaiser Permanente that supports part of his salary. He has current and pending grants from US federal agencies. Dr Bresnahan reported that he owns stock in and was an employee of Johnson & Johnson. No other disclosures are reported.
Funding/Support: This work was supported by grants 1R01HS01922201 and 1R01HS022972-01 from the Agency for Healthcare Research and Quality (AHRQ) and from the NIH Intramural Research Program (Dr Chan’s time).
Role of the Funder/Sponsor: The study sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Additional Contributions: We thank the research staff for data collection and their overall dedication to this study: Brigham and Women’s Hospital and Harvard Vanguard Brian Orrick, MBA; Andrew Aherrera, MS; and Courtney Rust, BS; Kaiser Permanente Northern California Luisa Hamilton, MD; Jennifer Ireland, MA; Rebecca Rogot, BA; Karen Hansen, BA; Cynthia Huynh, MN; and Daniel Fernandez, BS; Henry Ford Health System Lisa Pietrantoni, BS; Ashkhen Movsisyan, BS; Paramjit Octain, MA; Lori Monia-Allen, BS; and Ryan Shelters, BS. We also thank Kathrine Tan, BS, PhD candidate, Department of Biostatistics, the University of Washington, Seattle, for help with the natural language processing (NLP). None mentioned herein received compensation for their contributions other than their usual salary support.
Correction: This article was corrected March 25, 2015, to align and add data in Tables 1 through 4.