Flowchart of article selection.
Summary receiver operating characteristic curve of magnetic resonance imaging performance in 16 studies. Bubble size represents sample size.
Head-to-head performance of magnetic resonance imaging (MRI) and technetium Tc 99m bone scanning (A), MRI and plain radiography (B), and MRI and white blood cell (WBC) scanning (C). SROC indicates summary receiver operating characteristic curve.
Kapoor A, Page S, LaValley M, Gale DR, Felson DT. Magnetic Resonance Imaging for Diagnosing Foot Osteomyelitis: A Meta-analysis. Arch Intern Med. 2007;167(2):125-132. doi:10.1001/archinte.167.2.125
Uncertainty exists regarding the optimal workup of patients with suspected osteomyelitis of the foot, many of whom have diabetes mellitus. We conducted a meta-analysis to determine the diagnostic test performance of magnetic resonance imaging (MRI) for osteomyelitis of the foot and compared this performance with that of technetium Tc 99m bone scanning, plain radiography, and white blood cell studies.
We searched MEDLINE (from 1966 to week 3 of June 2006) and EMBASE (from 1980 to week 3 of June 2006) for English-language studies in which adults suspected of having osteomyelitis of the foot or ankle were evaluated by MRI. We then extracted data using a standard form derived from the Cochrane Methods Group. To summarize the performance of diagnostic tests, we used the summary receiver operating characteristic curve analysis, which relies on the calculation of the diagnostic odds ratio (DOR). We also examined subsets of studies defined by the presence or absence of particular design flaws or populations.
Sixteen studies met inclusion criteria. In all studies combined, the DOR for MRI was 42.1 (95% confidence interval, 14.8-119.9), and the specificity at a 90% sensitivity cut point was 82.5%. The DOR did not vary greatly among subsets of studies. In studies in which a direct comparison could be made with other technologies, the DOR for MRI was consistently better than that for bone scanning (7 studies—149.9 vs 3.6), plain radiography (9 studies—81.5 vs 3.3), and white blood cell studies (3 studies—120.3 vs 3.4).
We found that MRI performs well in the diagnosis of osteomyelitis of the foot and ankle and can be used to rule in or rule out the diagnosis. Magnetic resonance imaging performance was markedly superior to that of technetium Tc 99m bone scanning, plain radiography, and white blood cell studies.
Osteomyelitis of the foot and ankle is the primary or secondary reason for 75 000 hospitalizations in the United States each year.1 By far the most common group at risk is persons with diabetes mellitus. In terms of diagnostic evaluations, history and routine laboratory tests, including the erythrocyte sedimentation rate, are not particularly informative.2,3 Although bone biopsy serves as a gold standard diagnostic test and is generally safe, the fear of introducing infection and the need for a surgical practitioner to perform the biopsy make development of diagnostic algorithms using noninvasive imaging strategies attractive.
Plain radiography is the traditional and often the initial modality used for evaluating bone infections in the foot. Radiographic changes are often not visible until 2 to 4 weeks after onset of infection, accounting in part for the low sensitivity of plain radiography.4-6 The specificity of plain radiography tends to be higher than its sensitivity but can be compromised by posttraumatic reactions, nonspecific periosteal reactions as seen in chronic venous stasis, and, most commonly, Charcot osteoarthropathy. Charcot osteoarthropathy, or Charcot foot, is a disruption in foot architecture that results from microtrauma to an insensate foot. It is often indistinguishable from osteomyelitis.
Bone scanning with technetium Tc 99m [99mTc]–labeled diphosphonate can detect early changes of osteomyelitis but suffers from lack of specificity. White blood cell (WBC) scanning, usually with indium In 111, is more specific but lacks sensitivity.5 In addition, WBC scanning requires the inconvenient and time-consuming process of drawing and incubating patient blood before reinjecting and obtaining images.
Diagnostic findings of pedal osteomyelitis on MRI include a focal area of decreased marrow signal intensity on T1-weighted images and a focally increased signal intensity on fat-suppressed T2-weighted or short tau inversion recovery images.7 The MRI changes seen in osteomyelitis may be confused with changes seen in bony infarcts, fractures, and Charcot foot.
Accurate estimates of MRI test performance for osteomyelitis of the foot are difficult to establish. Most studies have reported on small cohorts, combined persons with suspected osteomyelitis of the foot and those with suspected osteomyelitis of other body sites, included persons both with and without diabetes, and left unstated the prevalence of Charcot foot. Studies have drawn different conclusions about the value of MRI alone (or compared with other technologies) and have reported vastly different estimates of diagnostic specificity (0%-100%). Previous systematic reviews8-10 were limited by the number of publications they analyzed or the lack of foot-specific information.
We conducted a comprehensive meta-analysis of the test performance of MRI for the diagnosis of osteomyelitis of the foot and ankle. We then conducted subset analyses to explore the reason for variability among included studies. We also compared the accuracy of MRI with 99mTc bone scanning, plain radiography, and WBC scanning.
We searched MEDLINE (from 1966 to week 3 of June 2006) and EMBASE (from 1980 to week 3 of June 2006) for English-language articles. (The complete search strategy is available from the authors on request.) We also searched the bibliographies of included studies and asked specialists within the fields of surgery and radiology to recommend citations.
We included studies that evaluated the diagnostic test performance of MRI in adult patients suspected of having osteomyelitis of the foot or ankle or who had foot infection and were systematically examined for osteomyelitis. Specifically, studies were enrolled when information from the usual diagnostic performance 2 × 2 table (index diagnostic test result positive or negative vs true disease state present or absent) could be extracted about discrete foot and ankle cases, when 80% or more of the patients were 16 years or older, and when at least one site with the disease and one without were identified by the reference standard (eg, bone biopsy). Two authors (A.K. and S.P.) evaluated each study for inclusion, and a third author (D.T.F.) refereed ties.
Once studies were selected, we used a data extraction instrument derived from the Cochrane Methods Group checklist on Systematic Review of Screening and Diagnostic Tests.11 Two independent reviewers (A.K. and S.P.) extracted data that pertained to study population characteristics. The prevalence of diabetes in each study was noted. To better understand the quality of the sensitivity and specificity estimates reported in each study, we also extracted information about blinding and the type of reference standard used. Specifically, we calculated the frequency with which a bone biopsy–based reference standard was used to determine or exclude osteomyelitis. For a positive disease determination, positive histologic analysis or culture results (however they were determined by the study authors) from a bone specimen were recorded. For the negative reference standard, we recorded the percentage of patients in whom osteomyelitis was excluded by negative histologic analysis results (however they were determined by the authors). If an individual study characteristic was not explicitly documented, no determination was made regarding the study status and a “not specified” label was assigned to the study for that characteristic. No attempt was made to contact study authors except to determine whether the same patients were enrolled more than once by authors with multiple publications.
We extracted data on the diagnostic performance of bone scan, plain radiography, and WBC studies if a 2 × 2 diagnostic performance table could be derived from a study that was already included for its MRI data.
For each study, we constructed a 2 × 2 contingency table that consisted of true-positive, false-positive, false-negative, and true-negative results according to the reference standard used in each case. We then calculated the sensitivity and specificity in the usual fashion and the diagnostic odds ratio (DOR), as determined by the formula (true positive × true negative)/(false positive × false negative). We then conducted a summary receiver operating characteristic curve analysis as our meta-analytic method. This method has been described before.12,13 We repeated this analysis in 13 subsets that represented different study populations (eg, low or unspecified prevalence of Charcot foot) and the presence or absence of design flaws (eg, no blinding).
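As an illustration of the per-study calculation described above, the following minimal Python sketch derives sensitivity, specificity, and the DOR from a single 2 × 2 table. The function name and the counts are hypothetical and are not taken from any included study; the authors performed their analysis in SAS, and this sketch does not reproduce their code or handle tables with empty cells.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return (sensitivity, specificity, DOR) for one 2 x 2 table.

    tp, fp, fn, tn are the true-positive, false-positive,
    false-negative, and true-negative counts against the
    reference standard. Assumes fp and fn are both nonzero.
    """
    sensitivity = tp / (tp + fn)      # diseased sites correctly detected
    specificity = tn / (tn + fp)      # disease-free sites correctly cleared
    dor = (tp * tn) / (fp * fn)       # (TP x TN) / (FP x FN)
    return sensitivity, specificity, dor


# Hypothetical counts, for illustration only.
sens, spec, dor = diagnostic_summary(tp=18, fp=3, fn=2, tn=17)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} DOR={dor:.1f}")
```

A DOR of 1 indicates a test that does not discriminate between diseased and disease-free sites; larger values indicate better discrimination.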
We compared the head-to-head test performance of MRI with 3 other imaging tests. Because we had collected data on all MRI diagnostic studies, we evaluated these other technologies compared with MRI, which was our focus in this study. We included only studies in which 1 of the 3 diagnostic modalities was compared with MRI. To make comparisons, we used the same summary receiver operating characteristic method mentioned earlier. In certain studies not every patient underwent each test being compared, perhaps because a diagnosis was reached when the patient underwent the first diagnostic test, making the next one unnecessary. To account for this bias, we also measured the performance of the subset of studies in which all (or nearly all) patients underwent both diagnostic tests being compared. All statistical procedures were performed using SAS statistical software, version 9.1.3 (SAS Institute Inc, Cary, NC).
Our search strategy yielded 2070 titles with and without abstracts. One author (A.K.) reviewed them and requested 110 articles for full-text review. After eliminating those that did not meet the inclusion criteria (Figure 1), we were left with the 17 studies described in Table 1.14-31 One study27 examined only patients with suspected osteomyelitis in or around a Charcot joint. Although this was not an a priori exclusion criterion, we believed the study was not consistent with the intent of our analysis. We chose to provide a summary estimate of the performance of MRI in a typical patient population at risk rather than one in which the population is artificially enriched with problem cases, and so the study was eliminated from analysis.
Eleven of the 16 studies involved almost exclusively diabetic patients. Nine of 16 recruited patients prospectively (ie, study authors enrolled patients before any imaging tests were recorded). Most studies did not specify or standardize the exact reason for diagnostic suspicion of osteomyelitis. In many cases, it was implied by the presence of a complicated or infected foot ulcer. Indeed, foot ulcer was required or uniformly present in 6 studies. In most instances, the number of cases with Charcot disease was not reported.
Most studies judged an MRI scan to be positive by the same criterion: a lesion in the bone that showed focally decreased marrow signal intensity in T1-weighted images and a focally increased signal intensity in fat-suppressed T2-weighted or short tau inversion recovery images. Eight studies14,16,18,20,22-24,28 evaluated other diagnostic signs that were sometimes termed secondary signs, including cortical disruption, adjacent cutaneous ulcer, soft tissue mass, presence of a sinus tract, and, in some cases, adjacent soft tissue inflammation or edema. See Table 2 for further details.14-31
The prevalence of criterion standard–defined osteomyelitis averaged approximately 50%, with a range of 32% to 89%. Most authors reported results according to the number of sites with potential osteomyelitis or number of at-risk bones imaged. We calculated a ratio of the number of patients to number of sites and compared MRI performance in studies with low and high ratios. Magnetic resonance imaging sensitivity was usually high and ranged from 77% to 100%; MRI specificity ranged from 40% to 100%.
Among all studies, the DOR for MRI was 42.1 (95% confidence interval [CI], 14.8-119.9). The specificity at a clinically relevant cut point of 90% sensitivity was 82.5%. We present the curve for diagnostic performance in Figure 2. We found no substantial or statistically significant differences in estimates of MRI diagnostic test performance among subsets of studies (available from the authors on request). The number of studies in certain subgroups was small (eg, subgroup with Charcot prevalence >10%), with small numbers of patients represented in each. Small numbers in subsets prohibit robust conclusions. We therefore only discuss the subsets for which 8 or more studies were available for analysis. Studies that did not use bone histologic analysis to exclude disease tended to have higher performance (DOR, 67.4; 95% CI, 18.3-248.0). Studies published in 1998 or afterward reported lower performance (DOR, 25.3; 95% CI, 5.5-116.8). Most of the later studies had a prospective design and documented assessment of MRI blinded to other results.
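The link between a summary DOR and the specificity at a fixed sensitivity can be sketched as follows. This assumes a symmetric summary curve, that is, a DOR that is constant along the curve, so that DOR = [sensitivity/(1 − sensitivity)] × [specificity/(1 − specificity)]. That symmetry is an assumption of this sketch and not necessarily the model the authors fitted, so results need not match the reported figures exactly; the function name is hypothetical.

```python
def specificity_at_sensitivity(dor, sensitivity):
    """Specificity implied by a constant DOR at a chosen sensitivity.

    Assumes a symmetric summary ROC curve (DOR constant along the curve).
    """
    sens_odds = sensitivity / (1.0 - sensitivity)   # odds of a positive test in disease
    spec_odds = dor / sens_odds                     # odds of a negative test without disease
    return spec_odds / (1.0 + spec_odds)


# Overall MRI DOR reported in the text, evaluated at 90% sensitivity.
spec_mri = specificity_at_sensitivity(42.1, 0.90)
print(f"implied specificity: {spec_mri:.3f}")
```

Under this assumption the overall DOR of 42.1 implies a specificity of roughly 82% at 90% sensitivity, close to the 82.5% reported.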
We compared the diagnostic performance of 4 technologies in studies that compared MRI with another imaging test (Table 3). We found 7 studies that directly compared MRI with 99mTc bone scanning, all using the triple-phase technique. Magnetic resonance imaging performance was markedly superior (DOR, 149.9; 95% CI, 54.6-411.3) vs bone scan (DOR, 3.6; 95% CI, 1.0-13.3) (Figure 3). At the 90% sensitivity cut point, the specificity for MRI was 98% compared with 28.5% for technetium. Similarly, in 9 studies that compared plain radiography with MRI, MRI outperformed plain radiography (DOR, 81.5; 95% CI, 14.2-466.1 compared with DOR, 3.3; 95% CI, 2.2-5.0). In 3 studies in which MRI was compared with WBC study, the DOR for MRI was 120.3 (95% CI, 61.8-234.3) compared with 3.4 (95% CI, 0.2-62.2) for WBC studies.
This meta-analysis demonstrated that MRI performs well in the diagnosis of osteomyelitis of the foot and ankle in adults. Good diagnostic performance was consistent across subsets of studies with different designs and patient populations. Moreover, MRI outperformed technetium, plain radiography, and WBC studies.
Although the performance of MRI was strong, our review revealed many flaws in the published literature concerning imaging tests for osteomyelitis in the foot or ankle. Few studies prospectively followed up a cohort of patients in which assessment of MRI was blinded to other imaging tests and reference standard results, and few verified the diagnosis in all cases with a biopsy. Although the estimate of performance did not change substantially within study subsets, the relatively small number of studies did not permit exploring the combined effect of multiple design issues.
In addition, the frequency of Charcot foot was not typically documented in our studies. Performance estimates could vary significantly among studies of varying prevalence of Charcot foot. In the 13 studies in which prevalence was not documented, the prevalence of Charcot foot was probably low. However, in diabetic patients with coexistent diabetic foot infection, prevalence is uncertain. Although it may be uncommon in the general diabetic population, Charcot foot is likely much more prevalent among patients with peripheral neuropathy.32 Given that the management strategies for Charcot foot and osteomyelitis are vastly different (offloading and contact casting vs long-term antibiotics), making an accurate diagnosis is essential.
This meta-analysis has 2 major implications. First, this study confirms that MRI is a strong test to aid in both confirming and excluding osteomyelitis of the foot. Using the clinically relevant cut point of 90% sensitivity, the positive likelihood ratio is 5.1 and the negative likelihood ratio is 0.12. Assuming a pretest probability of 50% (not far from the 55% calculated from all studies), a patient with a positive MRI would have an 84% chance of having the diagnosis (Table 4). If any other features or examination findings favor the diagnosis of osteomyelitis, such as substantial depth of ulcer or positive probe to bone, the addition of a positive MRI virtually clinches the diagnosis. A negative MRI study in our baseline hypothetical patient results in a posttest probability of 11%. Combined with absence of substantial ulcer depth or a negative probe to bone, MRI effectively rules out osteomyelitis.
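The likelihood ratios and posttest probabilities quoted above follow from the 90% sensitivity cut point, the corresponding 82.5% specificity, and a 50% pretest probability. A minimal Python sketch of that arithmetic is given below; the function names are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def posttest_probability(pretest, lr):
    """Convert a pretest probability to a posttest probability via odds."""
    pre_odds = pretest / (1.0 - pretest)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# MRI at the 90% sensitivity cut point (specificity 82.5%), pretest probability 50%.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.825)
p_pos = posttest_probability(0.50, lr_pos)   # after a positive MRI
p_neg = posttest_probability(0.50, lr_neg)   # after a negative MRI
print(f"LR+={lr_pos:.1f} LR-={lr_neg:.2f} "
      f"post+={p_pos:.0%} post-={p_neg:.0%}")
```

With these inputs the sketch gives a positive likelihood ratio of about 5.1, a negative likelihood ratio of about 0.12, and posttest probabilities of about 84% and 11%.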
The second major implication is that there should be a diminished use of 99mTc bone scanning in the diagnosis of osteomyelitis of the foot. Although bone scanning has been proposed for ruling out the disease (given its purported high sensitivity), the lack of adequate specificity creates many false-positive results.4,6,33 Using a hypothetical prevalence of osteomyelitis of 25%, for every 100 patients subjected to bone scanning, 12 of 13 with a negative result would have the diagnosis correctly excluded (negative predictive value, 91%), but only 24 of 87 with a positive result would be correctly identified as having osteomyelitis (positive predictive value, 27%). A diagnostic algorithm that includes bone scanning would thereby result in the ordering of numerous second imaging tests or biopsies. At a lower prevalence of osteomyelitis (eg, 5%-15%), 99mTc scanning may successfully rule out disease (Table 4). However, clinicians often underestimate the prevalence of osteomyelitis, particularly in patients with a diabetic foot infection; this finding suggests that an assumption of low prevalence may be risky.34 In addition, MRI permits detection of deep collections of pus or necrotic tissue and visualization of foot anatomy, which helps the surgeon plan surgery when indicated. The Infectious Diseases Society of America has already recognized that MRI is the preferred advanced imaging test for suspected osteomyelitis but recommends performing serial plain radiography before ordering an MRI.35 We are unaware of any study that has formally evaluated serial plain radiography vs early MRI. Such an investigation and/or cost-effectiveness analysis would likely better clarify the place for MRI in the diagnostic algorithm of osteomyelitis of the foot.35 Clinicians should, of course, consider history, physical examination findings, and imaging test results before deciding on therapeutic interventions.
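The dependence of predictive values on prevalence, which underlies the argument about bone scanning above, can be sketched in Python as follows. The sensitivity (90%) and specificity (28.5%) used here are the head-to-head cut point values for bone scanning reported in the Results and serve only as illustrative inputs; the patient counts quoted in the paragraph above appear to rest on a different operating point, so this sketch is not expected to reproduce them exactly. The function name is hypothetical.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test at a given disease prevalence."""
    tp = prevalence * sensitivity                   # true-positive fraction
    fn = prevalence * (1.0 - sensitivity)           # false-negative fraction
    tn = (1.0 - prevalence) * specificity           # true-negative fraction
    fp = (1.0 - prevalence) * (1.0 - specificity)   # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative inputs (assumptions): bone scan at 90% sensitivity, 28.5% specificity.
ppv, npv = predictive_values(0.25, 0.90, 0.285)
ppv_low, npv_low = predictive_values(0.10, 0.90, 0.285)
print(f"prevalence 25%: PPV={ppv:.2f} NPV={npv:.2f}")
print(f"prevalence 10%: PPV={ppv_low:.2f} NPV={npv_low:.2f}")
```

With these inputs the positive predictive value stays low at both prevalences, whereas the negative predictive value rises as prevalence falls, which is consistent with the point that bone scanning is mainly useful for ruling out disease when the pretest probability is low.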
In this meta-analysis, we chose to focus on osteomyelitis of the foot and ankle, because disease of the foot and ankle is a distinct entity that affects a particular patient population, that is, patients with diabetes and/or peripheral neuropathy. We did not analyze non–English-language articles. We are unaware of any evidence of bias in English-language studies that assessed technology.
We did not exclude studies on the basis of date of publication or advent of innovation or variation in interpretation of MRI. Although MRI evaluation of osteomyelitis has evolved, with gadolinium now often used, subset analysis based on gadolinium use did not reveal any substantial variation in performance. Secondary diagnostic signs (such as cortical breaks) appeared to be incorporated into the diagnostic algorithm for osteomyelitis more frequently in recent publications. Ahmadi et al31 recently published a retrospective analysis of additional criteria for use in assessing osteomyelitis superimposed on Charcot foot, but this work is still largely untested.
As for the comparison of diagnostic tests, we focused on comparing MRI with plain radiography and radionuclide scanning, making 3 discrete head-to-head comparisons with MRI. A recent review by the Health Technology Assessment group supports this approach, suggesting that heterogeneity of diagnostic test comparisons will be less of a problem in head-to-head comparisons.36 We did not analyze other imaging modalities, such as combined bone scanning and WBC study, computed tomography, immunoglobulin tagged tracer scanning, or positron emission tomography, because 2 or fewer studies directly compared them to MRI. We did not compare the performance of biopsy with MRI. A biopsy affords information regarding the exact pathogen responsible for infection, something that imaging tests cannot do.
According to 2006 figures, Medicare reimburses $288 for a 3-phase bone scan and $416 for a lower-extremity MRI without contrast ($451 with contrast).37,38 We calculated these values on the basis of how our center bills Medicare, which is the sum of the reimbursement to our facility when providing the service to an outpatient and the reimbursement to the radiologist interpreting the film (the professional component alone). For an inpatient, Medicare reimburses the relevant diagnosis-related code; therefore, unique MRI payment information is not available. Given the small difference in cost (which is approximated by Medicare reimbursement) between MRI and 99mTc bone scan and the large difference in diagnostic performance between these technologies, MRI would be more cost-effective except when the probability of disease was low. Local availability and cost of each test must also be considered when selecting the appropriate test. Formal decision modeling is needed to fully characterize the place of MRI in the diagnostic algorithm of osteomyelitis of the foot and ankle.
In summary, MRI has a strong performance in the diagnosis of osteomyelitis of the foot and ankle in adults. It outperforms 3-phase 99mTc bone scanning and plain radiography. The role of bone scanning is probably eclipsed by that of MRI except in cases in which MRI is contraindicated or the probability of disease is low.
Correspondence: Alok Kapoor, MD, Division of General Internal Medicine, Boston University, 91 E Concord St, MAT 200, Second Floor, Boston, MA 02118 (firstname.lastname@example.org).
Accepted for Publication: September 15, 2006.
Author Contributions: Dr Kapoor had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Kapoor and Felson. Acquisition of data: Kapoor. Analysis and interpretation of data: Kapoor, Page, LaValley, Gale, and Felson. Drafting of the manuscript: Kapoor, Page, and Felson. Critical revision of the manuscript for important intellectual content: Kapoor, Page, LaValley, and Gale. Statistical analysis: LaValley. Obtained funding: Felson. Administrative, technical, and material support: Kapoor, Page, Gale, and Felson. Study supervision: Felson.
Financial Disclosure: None reported.
Funding/Support: This study was supported by National Research Service Award T-32 HP 10028-08 and by grant AR47785 from the National Institutes of Health.
Acknowledgment: We thank Gary Gibbons, MD, Charles Foster, MD, and Jorge Medina, MD, for their consultation on this project. Special thanks also to Louise Falzon, MLIS, and Joseph Harzbecker, MLS, for the guidance in preparation of search strategies.