Figure. Density plot for reduced model.
Mallen CD, Thomas E, Belcher J, Rathod T, Croft P, Peat G. Point-of-care prognosis for common musculoskeletal pain in older adults. JAMA Intern Med. Published online May 13, 2013. doi:10.1001/jamainternmed.2013.962.
eFigure 1. Current pain intensity, stratified by patient rating of change at 6 months.
eFigure 2. Current pain intensity profile plots extended to 3-year follow-up.
eFigure 3. Calibration plots for full model (median P value for Hosmer-Lemeshow test = .49).
eFigure 4. Calibration plots for reduced model (median P value for Hosmer-Lemeshow test = .61).
Mallen CD, Thomas E, Belcher J, Rathod T, Croft P, Peat G. Point-of-Care Prognosis for Common Musculoskeletal Pain in Older Adults. JAMA Intern Med. 2013;173(12):1119-1125. doi:10.1001/jamainternmed.2013.962
Author Affiliations: Arthritis Research UK Primary Care Centre, Keele University, Staffordshire, England.
Importance: Many site-specific, multivariable risk models for predicting the outcome of musculoskeletal pain problems have been published. The overlapping content in these models suggests a common set of generic indicators suitable for use in primary care.
Objective: To investigate whether a brief set of generic prognostic indicators can predict the outcome of musculoskeletal pain in older patients presenting to general practitioners.
Design, Setting, and Participants: A prospective observational cohort study conducted from September 1, 2006, through March 31, 2007, of consecutive patients 50 years or older presenting with noninflammatory musculoskeletal pain to 1 of the 5 participating general practices in the United Kingdom.
Main Outcome Measures: During consultation, the treating physician assessed and recorded 5 brief generic items (duration of present pain episode, current pain intensity, pain interference with daily activities, presence of multiple-site pain, and ultrashort depression screen) and recorded their overall prognostic judgment. The primary outcome was patient-rated improvement, which was measured 6 months after consultation and cross-validated with repeated measures up to 3 years.
Results: A total of 194 (48.1%) of 403 participants were classified as having an unfavorable outcome at 6 months. Inclusion of 3 generic prognostic indicators (duration of present pain episode, pain interference with daily activities, and presence of multiple-site pain) in the prognostic model improved on reliance on physicians' prognostic judgment alone (C statistic = 0.72 vs 0.62; net reclassification index = 0.136; proportion correctly classified = 69%). The improvement in prognostic accuracy was attributable to correcting physicians' tendency toward overoptimistic expectations of outcome.
Conclusions and Relevance: Three easy-to-obtain pieces of information followed by systematic recording of the general practitioners' prognostic judgment provide a simple generic assessment of prognosis at point of care in older persons presenting with musculoskeletal problems to primary care practices in the United Kingdom. Such an assessment offers a common foundation for investigating the usefulness of prognostic stratification for guiding management in the consultation across a range of common painful conditions.
Musculoskeletal conditions are the most common cause of severe long-term pain and long-term disability worldwide,1 accounting for a large and growing share of health care use.2 They are responsible for 20% to 30% of primary care consultations.3 In this setting, caseload is dominated by clinical syndromes of osteoarthritis, nonspecific spinal pain, and other noninflammatory regional pain.4 Most cases are managed within primary care practices and often, after exclusion of rare but serious causes, are treated in the absence of a specific diagnosis.
During the last 20 years, many studies of the prognosis of patients with these common conditions have been undertaken. These studies are designed to provide patients with more accurate information about the future course of their problem, to assist practitioners in predicting individual patient outcome, and to support more efficient targeting of treatment.5
To date these studies have been organized largely by anatomical location of pain, resulting in a proliferation of site-specific prediction models.6 This approach to prognosis research is likely to have limited clinical credibility7 for the general practitioner (GP) in a primary care setting: single-site musculoskeletal pain is the exception rather than the rule,8 a different prognostic model for each anatomical site is cumbersome to use in routine practice, and prognostic indicators included in research studies have rarely been collected at the point of care itself, often being too lengthy to fit within the time-limited primary care consultation.
The consistency of certain prognostic indicators for unfavorable outcome across site-specific prediction models9,10 suggests the potential value of simple, brief, generic prognostic indicators at the point of care. Among these common indicators are the occurrence of previous episodes of musculoskeletal pain, current episode duration, multiple-site pain, severity at presentation, and the presence of psychosocial obstacles to recovery. Physicians are aware of these common indicators,11 but the concept of generic prognosis has yet to be fully evaluated at the point of care. In this study, we investigate whether a set of brief generic potential indicators of prognosis, assessed by the GP at the point of care, followed by and combined with systematic recording of the GP's overall prognostic judgment predicted the outcome of musculoskeletal pain in older patients.
The Prognosis Research Strategy (PROG-RES) study is a prospective cohort of older adults with musculoskeletal pain presenting in primary care. Full details of this study are available in the open-access protocol12 and recruitment report.13 Ethical approval was granted by the Central Cheshire Local Research Ethics Committee (reference 06/Q1503/60). Consecutive patients 50 years and older presenting to 5 general practices in the United Kingdom (44 GPs) with noninflammatory musculoskeletal pain were eligible to participate. In all eligible consultations, a specially designed electronic template for data collection was activated when the GP entered an appropriate morbidity code, and a study tag was added to the patient record. Each practice recruited patients for a 3- to 4-month period. Patients were excluded if they had evidence of “red flags” (eg, significant traumatic injury and red, hot, swollen joints), had inflammatory arthropathy, or were deemed vulnerable by their GP (significant cognitive impairment or terminal illness). The GPs granted permission for eligible patients to be identified from weekly searches of the clinical databases and mailed a questionnaire from the practice within 1 week of consultation. Responses to this questionnaire were used to define cohort inclusion because this was the first opportunity to obtain consent from patients to access and analyze the information recorded by the GPs during the consultation and to contact them for follow-up. Follow-up questionnaires were mailed at 3, 6, 12, 24, and 36 months.
All prognostic information for this analysis was gathered by the treating GP at the point-of-care consultation. Outcome was evaluated by self-complete questionnaires mailed to the patients at 6 months. Data gathered from questionnaires sent within 1 week of consultation and at 3 and 6 months were used for obtaining patient consent, describing the sample, cross-validating the choice of outcome, and determining auxiliary variables in the imputation model. The self-complete questionnaires at all 3 postconsultation time points included the following: sociodemographic information; the nature of problem onset; previous consultations for this problem; standardized, validated measures of pain intensity and pain interference with daily activities14; a pain manikin15; a 2-item ultrabrief depression screen16; and the Hospital Anxiety and Depression Scale score.17
The questions at the point of care that were included in the brief generic prognostic assessment are given in Table 1. They were developed after a systematic review of the literature,6 a GP survey,11 and secondary analysis of a large data set.9
The GPs asked patients the questions during the consultation and recorded their responses within the electronic template. The GPs then recorded their own prediction of the likely outcome of this pain in 6 months' time in response to the prompt, “What do you think the outcome of this patient's pain will be in 6 months' time?” (response options: completely recovered, much better, better, same, worse, or much worse).
The primary outcome measure was patient global rating of change at 6 months—a recommended core outcome measure in chronic pain and osteoarthritis clinical trials21,22 (“Compared with when you first saw your doctor with this pain 6 months ago, how do you feel your pain is now?” response options: completely recovered, much better, better, same, worse, or much worse).23 Before data analysis, we defined unfavorable outcome as the same, worse, or much worse and favorable outcome as completely recovered, much better, or better. The validity of this definition was confirmed by producing profile plots for current pain intensity at each time point (point of care and postconsultation at 2 weeks, 3 months, and 6 months) by patient global rating of change and calculating the proportion of patients achieving moderate, substantial, and extensive pain relief at 6 months according to predefined criteria.24 The primary end point was specified at 6 months in the study protocol and permitted GP prediction to be matched exactly to the observed outcome. We used extended follow-up to 3 years after consultation to confirm that our primary outcome at 6 months was indicative of longer-term outcome.
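The dichotomization described above can be sketched in a few lines of Python. This is an illustration only: the function name `is_unfavorable` and the example responses are hypothetical, and only the six response options and their grouping come from the text.

```python
# Six response options of the patient global rating of change, grouped as in the text.
FAVORABLE = {"completely recovered", "much better", "better"}
UNFAVORABLE = {"same", "worse", "much worse"}


def is_unfavorable(rating: str) -> bool:
    """Return True for an unfavorable outcome, False for a favorable one."""
    key = rating.strip().lower()
    if key in UNFAVORABLE:
        return True
    if key in FAVORABLE:
        return False
    # Anything else (including a missing answer) is not silently classified.
    raise ValueError(f"unrecognized rating: {rating!r}")


# Made-up responses, for illustration only.
ratings = ["Much better", "same", "worse", "completely recovered"]
print([is_unfavorable(r) for r in ratings])  # [False, True, True, False]
```

Raising an error on unrecognized values keeps missing outcomes out of either category, which matters here because missing outcomes were handled separately by multiple imputation.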
A sequence of 3 prognostic models was fitted using multivariable logistic regression with the binary outcome of favorable or unfavorable outcome at 6 months. In the first model, only GP prediction was included. In the second model, we added the 5 generic items gathered by the GP in the consultation (full model). Because the form of the models and categorization of predictors were prespecified, no statistical selection procedure for variables was involved in these 2 models. In the third model, we aimed to reduce the second model using a backward elimination procedure based on P < .10 (reduced model).
The percentage of missing values for predictors revealed an order effect, with missing data being least for the first item in the template (duration of present episode, 2.5% missing) and increasing with each subsequent item up to the final item (GP judgment, 14.4% missing). Outcome at 6 months was missing in 18.9%. Multiple imputation used all 6 predictors, plus outcome, plus the following auxiliary variables: practice, duration of present episode, current pain intensity, pain interference with daily activities, multiple-site pain, positive depression screen result, presence of mild or worse depression (Hospital Anxiety and Depression Scale score >7),17 and patient outcome expectation. With the exception of practice, all auxiliary variables were gathered in the postconsultation questionnaire.
Ninety imputed data sets were created because 49% of study participants had complete data and we required the ratio of fraction of missing information to the number of multiple imputations to be less than 1%.25 Imputations were performed using the -ice- package in STATA statistical software (release 11; StataCorp LP).26
A multivariable logistic regression model was fitted at each stage to each of the 90 imputed data sets. Model parameters were estimated by combining across imputed data sets using Rubin's rules.27 The significance of each prognostic factor was assessed using the Wald test. Backward elimination was performed on the pooled coefficients to select a final model of prognostic factors with statistical significance set at P < .10.
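As a rough illustration of how Rubin's rules pool a single coefficient across imputed data sets, the sketch below uses only the Python standard library. The function name `pool_rubin` and the input numbers are hypothetical. The study itself used Stata, so this is not the authors' code.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: per-imputation point estimates (e.g., log odds ratios)
    variances: per-imputation squared standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least 2 imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical log odds ratios and squared standard errors from 3 imputations.
log_or, se = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
wald_z = log_or / se  # Wald statistic on the pooled coefficient
print(round(log_or, 3), round(se, 3), round(wald_z, 2))
```

The pooled standard error is larger than any single-imputation standard error would suggest because the between-imputation term carries the extra uncertainty due to missing data. The Wald statistic here uses a simple normal reference; Rubin's rules also define adjusted degrees of freedom, which this sketch omits.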
The heuristic shrinkage factor and C statistic were calculated for each of the models and averaged across the 90 data sets. Predictions from each imputation were combined using Rubin's rules. Calibration plots and the median P value of the Hosmer-Lemeshow tests were used to assess model fit. We produced density plots28 to visually illustrate model discrimination by showing the distribution of predicted probability of unfavorable outcome among those who did and did not improve. We also calculated the net reclassification index29 with 95% CIs, comparing GP overall prediction (favorable vs unfavorable outcome) with the reduced model that combined individual generic items and GP prediction (model 3; predicted risk of unfavorable outcome <50% vs ≥50%). C statistics, density plots, and net reclassification index were reported for the models without bias correction.
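For readers less familiar with the C statistic, the following minimal Python sketch computes it as a pairwise concordance. The function name `c_statistic` and the toy data are hypothetical, and the sketch handles a single data set only, whereas the study averaged the statistic across imputed data sets.

```python
def c_statistic(predicted, outcome):
    """Concordance: share of (event, non-event) pairs in which the event
    has the higher predicted risk; tied predictions count as half."""
    events = [p for p, y in zip(predicted, outcome) if y == 1]
    non_events = [p for p, y in zip(predicted, outcome) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Toy data: predicted risk of unfavorable outcome and observed outcome (1 = unfavorable).
risk = [0.8, 0.6, 0.6, 0.3, 0.2]
unfavorable = [1, 1, 0, 0, 1]
print(c_statistic(risk, unfavorable))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The quadratic pair loop is adequate for a cohort of a few hundred patients.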
It was anticipated that the recruitment of consecutive consulters would selectively enroll patients who had previously consulted for their problem. The sequence of prognostic model fitting was, therefore, repeated in the subset of first-time consulters.
From September 1, 2006, through March 31, 2007, participating GPs completed prognostic assessments on 650 patients. A total of 502 completed postconsultation questionnaires were received (77.2%). The mean time between the GP consultation and return of the questionnaire was 16 days. A total of 403 participants gave consent to both medical record review and further contact: the 99 nonconsenting respondents to the baseline questionnaire were older (17.0% compared with 8.2% were ≥80 years), but no other substantial differences by sex, pain severity, general health, anxiety, or depression were found. Completed follow-up questionnaires were returned by 358 (88.8%) and 327 (81.1%) participants at 3 and 6 months, respectively.
Of the 403 cohort participants, 245 (60.8%) were female; the mean (SD) patient age was 64.8 (10.1) years (age range, 50-97 years) (Table 2). The presenting symptom, as recorded by the GP during the consultation, was typically coded under nonspecific symptom codes (255 in total, such as low back pain and knee arthralgia); 48 participants received a diagnosis of osteoarthritis, and 83 had a range of other diagnoses (plantar fasciitis [n = 14] and shoulder/subacromial impingement [n = 11] were the most common).
On average, participants presented with moderate pain at the time of consultation (current pain intensity as measured by a 0- to 10-point numerical rating scale; mean [SD] score, 6.1 [2.2]), which reduced during follow-up (mean [SD] scores: 5.5 [2.6] two weeks after consultation, 4.5 [2.8] at 3 months, and 4.2 [2.9] at 6 months). The proportions of patients who had experienced moderate (30% reduction in current pain intensity), substantial (50% reduction), and extensive (70% reduction) reductions in pain intensity at 6 months compared with the time of consultation were 49.9%, 39.9%, and 25.3%, respectively. On the separate global rating of change completed by participants at 6 months, 194 (48.1%) manifested an unfavorable outcome. Global rating of change was closely related to changes in pain intensity (eFigure 1). On average, participants with an unfavorable outcome at 6 months continued to experience pain well above patient acceptable symptom states, defined as pain below 4 on a 0- to 10-point numerical rating scale29 up to 3 years after initial consultation (eFigure 2).
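The relief thresholds quoted above are cumulative, so a patient with a 50% reduction counts toward both the moderate and the substantial proportions. A minimal Python sketch of that classification follows. The names `THRESHOLDS` and `relief_flags` are hypothetical, and the sketch assumes each threshold means "at least" the stated percentage reduction, which the text implies but does not spell out.

```python
# Percentage reductions in current pain intensity, as named in the text.
THRESHOLDS = {"moderate": 30, "substantial": 50, "extensive": 70}


def relief_flags(baseline, follow_up):
    """Flag which relief thresholds are met for one patient.

    baseline, follow_up: integer pain scores on the 0-10 numerical rating scale.
    Assumption: a threshold is met when the reduction is at least that percentage.
    """
    if baseline <= 0:
        raise ValueError("baseline pain must be positive to compute a percentage")
    drop = baseline - follow_up
    # Integer arithmetic avoids floating-point surprises at the exact cut points.
    return {name: 100 * drop >= pct * baseline for name, pct in THRESHOLDS.items()}


print(relief_flags(6, 3))
# {'moderate': True, 'substantial': True, 'extensive': False}
```

A patient with no pain at baseline has no defined percentage reduction, so the sketch rejects that case rather than guessing.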
The GPs' overall prediction of outcome at 6 months was correct in 251 patients (62.3%) (odds ratio [OR], 2.78; 95% CI, 1.69-4.57; C statistic = 0.62) (Table 3), although the prediction tended to be overoptimistic (predicted rate of unfavorable outcome, 0.37; 95% CI, 0.32-0.42; compared with an observed rate of 0.48).
The combined model, which included the brief list of individual items recorded during the consultation by the GP together with the GP's overall judgment, improved on the ability of GP judgment alone to discriminate between favorable and unfavorable outcome at 6 months (C statistic = 0.72 vs 0.62; Table 3).
The reduced model retained duration of present episode, pain interference with daily activities, presence of multiple-site pain, and GP overall judgment and showed calibration comparable with the full model (see eFigure 3 and eFigure 4 for calibration plots). Discrimination was also similar to the full model (C index = 0.72) and is illustrated in the Figure. Compared with GP judgment alone, the use of these 3 generic prognostic indicators together with the subsequent judgment of the GPs correctly reclassified an estimated net proportion of 0.165 (P < .001) of patients who had not improved at 6 months and 0.029 (P = .34) of those who had improved at 6 months, yielding a net reclassification improvement of 0.136 (95% CI, 0.086-0.186; P = .004) (Table 4). This model correctly classified 68.7% of patients.
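To make the reclassification figures concrete, here is a minimal Python sketch of a two-category net reclassification calculation. The function name `net_reclassification` and the toy data are hypothetical, and the sketch ignores imputation and confidence intervals. Note also that the reported total equals 0.165 − 0.029 = 0.136, which would be consistent with the second component being a net loss among patients who improved. That reading is inferred from the arithmetic alone and is not stated by the authors.

```python
def net_reclassification(old_high, new_high, event):
    """Two-category net reclassification improvement.

    old_high, new_high: 1 if the old/new approach predicts an unfavorable outcome
    event: 1 if the unfavorable outcome was observed
    Returns (component among events, component among non-events, total).
    """
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for old, new, y in zip(old_high, new_high, event):
        if y == 1:
            n_e += 1
            up_e += new > old    # correctly moved to high risk
            down_e += new < old  # wrongly moved to low risk
        else:
            n_n += 1
            up_n += new > old    # wrongly moved to high risk
            down_n += new < old  # correctly moved to low risk
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_e - down_e) / n_e
    nri_non_events = (down_n - up_n) / n_n
    return nri_events, nri_non_events, nri_events + nri_non_events


# Toy data for 6 patients (first 3 had an unfavorable outcome).
old = [0, 0, 1, 0, 0, 1]
new = [1, 1, 1, 0, 1, 0]
outcome = [1, 1, 1, 0, 0, 0]
print(net_reclassification(old, new, outcome))
```

In the toy data, two of three patients with an unfavorable outcome move correctly to the high-risk category, while among the other three one moves up and one moves down, so the non-event component is zero.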
Fitting the reduced model in the subset of patients presenting for the first time with their problem resulted in essentially the same performance as the whole sample (C statistic = 0.72; median P value for Hosmer-Lemeshow test for goodness of fit = .65). Duration of present episode (OR, 3.26; 95% CI, 1.15-9.29), pain interference (OR, 1.64; 95% CI, 0.62-4.33), multiple-site pain (OR, 2.75; 95% CI, 0.91-8.33), and overall GP judgment (adjusted OR, 2.80; 95% CI, 0.95-8.30) remained important prognostic indicators in this subset.
Our study addressed 3 challenges in this field: to reduce the bewildering proliferation of tools developed for each of the many different syndromes and sites of musculoskeletal pain, to focus prognostic research on the point of care, and to investigate how to incorporate GPs' own judgments into structured prognostic tools and models. We found that 3 simple generic questions, incorporated into routine consultations, combined with the GPs' own subsequent recorded prognoses successfully predicted 6-month outcome in 7 of every 10 older patients presenting to primary care practices with noninflammatory musculoskeletal pain.
Our study implicitly assumed that prognostic indicators ought to be evaluated in the context of physician judgment. There has been mixed evidence from previous studies on the accuracy of such judgment: findings of good accuracy in early studies30 have not been reproduced in more recent studies of GPs31,32 and emergency physicians.33 The model we have derived in our study supports the incorporation of clinician judgment in point-of-care generic prognosis. However, it was incorporated in a particular way—the judgment was made in response to a systematic question and followed after the GP had obtained answers to the individual prognostic items. The GPs may well have used the information supplied by those answers in coming to their judgment. We found in our analysis that the retention of 3 of the individual items improves on the accuracy of the GP's judgment taken on its own and corrects the generally overoptimistic predictions by the GP, particularly in patients whose pain has lasted longer than 3 months. Participants enrolled in this study did not have an unusually poor prognosis compared with previous studies, and a similar systematic bias in clinicians' predictions has been noted in other settings.34 This finding may reflect the perceived helpfulness of an upbeat attitude and the benefits of reinforcing positive prospects.35,36
A systematic review6 found only 4 previously published studies37-40 and 1 pilot study41 of the prognosis of general (as opposed to anatomical site-specific) musculoskeletal pain or illness in primary care. A further series of studies derived a risk score for chronic back pain,42 which has been externally validated43 and found to perform well in patients with headache, orofacial pain, and knee pain.10,44 These studies have not, however, involved practitioners gathering prognostic indicators at the point of care, a unique feature of the current study but one that highlights continuing challenges in optimal brief assessment despite the convergence of findings around a core set of domains. A feature from previous studies has been the support for psychological prognostic indicators. Yet these are often based on multiple-item self-complete questionnaires or diagnostic interviews, which may be difficult to implement at the point of care in routine primary care settings. Our study did not support the prognostic usefulness of ultrashort depression screening questions.16 Misclassification of depressive symptoms is one possible explanation.45 Alternatives developed by item selection from parent scales for low back pain34,46 may prove to be of generic value. Our choice of items for pain persistence and diffuseness was strongly guided by feasibility. Although other items, notably pain days in the last 6 months and a count of comorbid pain symptoms from a predetermined checklist, have been more extensively studied in research settings,10,14,39,42 they were deemed unlikely to be administered by practitioners at the point of care. Our study demonstrates the feasibility of simple binary items on pain elsewhere and episode duration at the point of care, but greater prognostic discrimination may still be possible with better items.
Although the prespecification of all prognostic indicators included in our model and the lack of univariable selection are positive features in our study, a number of limitations should be noted. Our decision to categorize several indicators, despite being made before data analysis and using recognized cut points, resulted in a loss of information. We did not specifically examine the interaction between site of presenting pain and model performance, and recent secondary analysis from Dutch cohorts confirms that any such interaction is likely to be small and inconsistent.47
The level of discrimination that was possible at the point of care (C = 0.72) was in the expected range for prognostic models48 and, given the very simple and brief nature of the indicators gathered and the fact that this was done by practitioners during the consultation, compares well with previous site-specific musculoskeletal pain prognostic models that have tended to use more elaborate measures gathered from patients outside the consultation.49-54 However, our model (and nearly all of the preceding efforts in primary care musculoskeletal prognosis) will almost certainly be overfitted and perform less well in a new sample of patients. The heuristic shrinkage factors we presented give some measure of this. Irrespective of internal validation procedures, prediction models ought to undergo temporal and external validation before evaluating their effect on patient outcomes,55,56 and the current one is no exception.
In an increasing number of clinical areas, including cardiovascular disease, GPs are able to calculate a prognostic risk score to help guide shared clinical decisions with their patients. There are no accepted clinically meaningful decision thresholds for common musculoskeletal pain that could guide targeted management and no information to date about the associated costs and benefits of doing so. However, the prognostic accuracy achieved by our approach compares favorably with the accuracy of other such risk scores and was done with the increased efficiency of a generic instrument that could be applied across a range of conditions. The performance of this generic approach to prognostic assessment of musculoskeletal pain in older people, as well as its brevity and practicality, provides the basis for investigating its effect and usefulness at point of care in the GP consultation. Improved identification of patients at risk for unfavorable outcome by GPs can facilitate more effective targeting of interventions and lead to improved clinical and health economic outcomes for patients with musculoskeletal pain.57 A plethora of site-specific prognostic models is an impediment to extending this work to the breadth of common musculoskeletal problems treated by primary care physicians. We hope that our findings signal a change in direction for prognosis research in this field. We hope research shifts away from a multitude of site-specific models to a single platform of generic indicators, akin to the notion of established risk indicators that exist in other fields.58
Correspondence: Christian D. Mallen, PhD, Keele University, Arthritis Research United Kingdom Primary Care Centre, Primary Care Sciences Research Centre, Keele, Staffordshire ST5 5BG, England (firstname.lastname@example.org).
Accepted for Publication: February 8, 2013.
Published Online: May 13, 2013. doi:10.1001/jamainternmed.2013.962
Author Contributions: Study concept and design: Mallen, Thomas, Croft, and Peat. Acquisition of data: Mallen and Thomas. Analysis and interpretation of data: All authors. Drafting of the manuscript: Mallen, Thomas, Belcher, Rathod, and Peat. Critical revision of the manuscript for important intellectual content: Mallen, Thomas, Belcher, Croft, and Peat. Statistical analysis: Mallen, Thomas, Belcher, Rathod, and Peat. Obtained funding: Mallen, Croft, and Peat. Study supervision: Mallen, Thomas, Belcher, Croft, and Peat.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported by Arthritis Research UK Primary Care Research Fellowship 16037 awarded to Dr Mallen. Dr Croft is a National Institute for Health Research senior investigator and member of the Medical Research Council Prognosis ReSearch Strategy (PROG-RES) Partnership.
Role of the Sponsor: Keele University had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Previous Presentation: This study was presented at the 40th Annual Meeting of the North American Primary Care Research Group; December 4, 2012; New Orleans, Louisiana.
Additional Contributions: Elaine Hay, MD, Jonathan Hill, PhD, and Kate Dunn, PhD, provided comments on the draft of the manuscript. We thank the participating patients and practices, the Keele General Practice Research Partnership, and the administration and informatics team for assisting in the conduct of this study.