Figure 1. Distribution of percentage of standards satisfied in our survey compared with that of Shaneyfelt et al.15
Figure 2. Median number ± interquartile range of Institute of Medicine standards satisfied by guidelines according to the year they were published.
Kung J, Miller RR, Mackowiak PA. Failure of Clinical Practice Guidelines to Meet Institute of Medicine Standards: Two More Decades of Little, If Any, Progress. Arch Intern Med. 2012;172(21):1628-1633. doi:10.1001/2013.jamainternmed.56
Author Affiliations: Department of Medicine, University of Maryland School of Medicine, Baltimore (Drs Kung and Mackowiak); Medical Care Clinical Center, VA Maryland Health Care System, Baltimore (Dr Mackowiak); and GlaxoSmithKline, Research Triangle Park, North Carolina (Dr Miller).
Background In March 2011, the Institute of Medicine (IOM) issued a new set of standards for clinical practice guidelines intended to enhance the quality of guidelines being produced. To our knowledge, no systematic review of adherence to such standards has been undertaken since one published over a decade ago.
Methods Two reviewers independently screened 130 guidelines selected at random from the National Guideline Clearinghouse (NGC) website for compliance with 18 of 25 IOM standards.
Results The overall median number (percentage) of IOM standards satisfied (out of 18) was 8 (44.4%), with an interquartile range of 6.5 (36.1%) to 9.5 (52.8%). Fewer than half of the guidelines surveyed met more than 50% of the IOM standards. Barely a third of the guidelines produced by subspecialty societies satisfied more than 50% of the IOM standards surveyed. Information on conflicts of interest (COIs) was given in fewer than half of the guidelines surveyed. Of those guidelines including such information, COIs were present in over two-thirds of committee chairpersons (71.4%) and 90.5% of co-chairpersons. Except for US government agency–produced guidelines, criteria used to select committee members and the selection process were rarely described. Committees developing guidelines rarely included an information scientist or a patient or patient representative. Non-English literature, unpublished data, and/or abstracts were rarely considered in developing guidelines; differences of opinion among committee members generally were not aired in guidelines; and benefits of recommendations were enumerated more often than potential harms. Guidelines published from 2006 through 2011 varied little with regard to average number of IOM standards satisfied.
Conclusion Analysis of a random sample of clinical practice guidelines archived on the NGC website as of June 2011 demonstrated poor compliance with IOM standards, with little if any improvement over the past 2 decades.
Over the past 2 decades, clinical practice guidelines have played an increasingly prominent role in dictating the practice of medicine in the United States and other developed countries. The organizations creating such guidelines have proliferated over time, as have the guidelines themselves. Some 2700 clinical practice guidelines are archived in the Agency for Healthcare Research and Quality's National Guideline Clearinghouse (NGC). Over 6800 reside in the Guidelines International Network.1
It has been hoped that, if properly developed and widely applied, clinical practice guidelines would enhance the practice of medicine by helping physicians and patients synthesize the dizzying array of clinical information, which, like piling Ossa on Pelion, expands year after year.1,2 Unfortunately, while some studies suggest that clinical practice guidelines help to reduce inappropriate practice variation, accelerate the translation of research into clinical practice, and improve the quality and safety of health care, many have come to question the validity and reliability of such guidelines.3-6 Their concerns have focused on the quality of the evidence on which clinical practice guidelines are based,7 the tendency of guidelines to promote more care rather than more effective care,8,9 their narrow focus and use as marketing and opinion-based pieces rather than road maps to improved medical care,5 and the difficulties involved in customizing population-based recommendations to individual patients.10 Also of concern has been the lack of transparency in the process by which clinical practice guidelines are created and the potential conflicts of interest (COIs) that might bias those preparing them.1,6,11-13 In response to these concerns, the Institute of Medicine (IOM) issued a new set of standards for clinical practice guidelines in March 2011, intended to enhance the transparency and objectivity of guidelines being produced and to standardize the format by which they are developed.14
The purpose of our study was to systematically examine adherence to the IOM standards by guidelines archived on the NGC website. To our knowledge, this is the first such comprehensive analysis since one published in 1999 by Shaneyfelt et al,15 with which we compare our findings.
The NGC was the source of clinical practice guideline data analyzed in this study.16 Five clinical practice guidelines were selected at random in June 2011 from each of the 26 Medical Subject Headings (MeSH) under the general MeSH topic of “Diseases.” Sixteen of the 130 guidelines selected at random were encountered under 2 separate MeSHs. These were considered only once in the analysis, resulting in a total of 114 individual guidelines actually analyzed. Fifteen of these were selected at random for comparison with the original versions of the guidelines published on the sponsoring organization's website. Four were not accessible, leaving 11 for this comparison.
A set of 18 standards for the development and reporting of clinical practice guidelines was selected from the IOM recommendations in Clinical Practice Guidelines We Can Trust.14 Seven IOM standards (or recommendations), dealing with how recommendations should be articulated, external reviews, and regular monitoring of the literature after publication of guidelines, were excluded because we felt they were too vague and subjective to be analyzed. Each clinical practice guideline summary was independently evaluated for compliance by 2 of us (J.K. and P.A.M.), and discrepancies in evaluations were resolved through open discussion. In evaluating each guideline summary, we were as liberal as possible, considering a standard met whenever the summary provided any information pertaining to it; this allowed for differences in subject matter, data format, and presentation among guidelines. When the NGC guideline summary contained no information on adherence to a standard, that standard was considered not to have been met.
Clinical guidelines were divided into specific subgroups by developing agency (US developers, non-US developers, US government agencies, and medical specialty societies) and by disease category focus (infectious diseases, oncology, obstetrics-gynecology [OB/GYN], and all other categories). For the overall sample and for each subgroup, we determined the median number (interquartile range) of IOM standards met (possible values, 0-18) and calculated the number of guidelines satisfying at least 9 (≥50%) of the 18 IOM standards.
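The scoring summary described above can be sketched in a few lines of Python. This is an illustrative sketch only: the scores below are hypothetical, not the study's data, and the median-of-halves quartile convention is an assumption, since the paper does not state how quartiles were computed.

```python
# Illustrative sketch of the scoring summary. Each guideline receives one
# point per IOM standard satisfied, for a possible score of 0 to 18.
# The scores below are hypothetical, not the study's data.
from statistics import median

def quartiles(scores):
    """Q1, median, and Q3 via the median-of-halves method (an assumption;
    the paper does not specify its quartile convention)."""
    s = sorted(scores)
    mid = len(s) // 2
    lower = s[:mid]
    upper = s[mid + 1:] if len(s) % 2 else s[mid:]
    return median(lower), median(s), median(upper)

scores = [4, 6, 6, 7, 8, 8, 9, 10, 11, 12]  # standards met, out of 18
q1, med, q3 = quartiles(scores)
print(f"median {med} ({med / 18:.1%}), IQR {q1}-{q3}")

# Number of guidelines meeting at least 9 of the 18 standards (>=50%)
n_half = sum(s >= 9 for s in scores)
print(f"{n_half} of {len(scores)} guidelines met at least half the standards")
```

With these hypothetical scores the sketch reports a median of 8.0 (44.4%), an IQR of 6 to 10, and 4 of 10 guidelines meeting at least half the standards.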
Comparisons of adherence to IOM standards between subgroups were performed as follows: (1) guidelines involving US developers vs guidelines involving non-US developers, (2) guidelines developed by US government agencies vs all other guidelines, (3) guidelines developed by medical specialty societies vs all other guidelines, and (4) a comparison among all 4 subspecialty categories (infectious diseases, oncology, OB/GYN, and all other categories). For comparisons of the median number of IOM standards met, Mann-Whitney and Kruskal-Wallis tests were performed. For comparisons of the proportion of guidelines meeting at least 50% of standards, Fisher exact tests and χ2 tests were performed. The level of statistical significance was established at a 2-sided P < .05. All analyses were performed using SPSS statistical software (version 17.0.0; IBM Corp).
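The study ran these tests in SPSS; as a hedged illustration of one of them, the following is a minimal standard-library implementation of the two-sided Fisher exact test used for comparing the proportion of guidelines meeting at least 50% of standards between two subgroups. The 2×2 table in the example is hypothetical, not the study's data.

```python
# Minimal two-sided Fisher exact test for a 2x2 table, stdlib only.
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact P for the table [[a, b], [c, d]]: sum the
    hypergeometric probabilities of every table with the same margins
    that is no more probable than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# Hypothetical example: 3 of 4 guidelines in one subgroup met >=50% of
# standards vs 1 of 4 in another (the margins of Fisher's classic example).
p = fisher_exact_p(3, 1, 1, 3)
print(f"P = {p:.4f}")  # 34/70, approximately 0.4857
```

With such small hypothetical counts the test is, as expected, far from significance at the 2-sided P < .05 threshold the study used.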
The overall median number of IOM standards satisfied (out of 18) was 8 (44.4%), with an interquartile range of 7 (38.9%) to 10 (55.6%) (Figure 1). Fewer than half of the guidelines surveyed met more than 50% of the IOM standards (Table 1). Subspecialty societies were the worst in this regard, with barely a third of their guidelines satisfying more than 50% of the IOM standards surveyed. Only a quarter of guidelines concerned with OB/GYN conditions met this target. However, in neither case were these poor performances significantly different from the performances of the other groups with which they were compared.
With regard to IOM standards for “Guideline Development and Format” (Table 2), adherence was again poor, with information on COIs given in fewer than 50% of the guidelines surveyed. In those guidelines including such information, COIs were present in over two-thirds of committee chairpersons (71.4%) and 90.5% of co-chairpersons. Guidelines produced by medical specialty societies and non-US developers were less likely to include information on COIs than those produced by other organizations, and non-US developers performed worse with respect to COI standards than US developers. Except for US government agency–produced guidelines, criteria used to select committee members and the selection process were rarely described. Committees developing guidelines rarely included either an information scientist or a patient or patient representative.
Adherence to IOM standards for “Evidence Identification and Summary” (Table 3) was substantially better, except for the use of non-English literature, unpublished data, and/or abstracts, which was rare. As anticipated, non-US developers were more likely to include non-English literature in formulating their guidelines than US developers, but such literature tended to involve only 1 additional language (eg, French). Adherence of US government agencies (in particular, the Centers for Disease Control and Prevention [CDC]) to standards regarding “Data Collection Method Given” and “Quality of Evidence Rated” was poorer than that of the other developers.
Adherence to IOM standards for “Formulation of Recommendations” (Table 4) was also reasonably good, except that differences of opinion were rarely described. The benefits of recommendations were enumerated more often than potential harms. Free public access to guidelines was nearly universal.
Guidelines published from 2006 through 2011 varied little with regard to the average number of IOM standards satisfied (Figure 2). When the results of the present survey were compared with those of Shaneyfelt et al15 published in 1999, the distributions of the mean number of standards satisfied were similar. Fifteen of the 26 guidelines (58.0%) that could be evaluated had not been updated in 5.5 years or less. Sixteen (14.0%) of the guidelines examined had a US Food and Drug Administration (FDA) alert added after their publication. Comparison of versions of 11 randomly selected guidelines archived on the NGC website with those appearing on the developers' websites showed only minor differences in adherence to IOM standards, with the developers' versions meeting 0.36 ± 1.57 fewer standards than the NGC versions.
Clinical practice guidelines are intended to help clinicians analyze and assimilate the ever-expanding and often contradictory body of medical information into clinical practice. Many believe they are the most appropriate vehicle for addressing concerns over quality of care, inappropriate variation in physician practice, and the upwardly spiraling cost of medical care.1,6 In the courts and in health policy debates, they are frequently the final arbiters of medical care.17,18 This is in no small part due to the cachet they derive from the organizations under whose imprimatur they are formulated, as well as the prestige of the journals in which they are published. Sadly, even the best are far from perfect, and these imperfections have caused some to wonder if clinical practice guidelines actually serve the purpose for which they are intended.
In their review of clinical practice guidelines published between 1985 and 1997, Shaneyfelt et al15 found generally poor compliance of guidelines with methodological standards promulgated by the American Medical Association, the IOM, and the Canadian Medical Association. These standards have since been refined by a special committee of the IOM, which on March 23, 2011, released 2 complementary sets of standards, for conducting systematic reviews and for creating clinical practice guidelines.14 The present study provides a snapshot of the state of compliance with this new set of standards just after its publication. The overall mean number of standards satisfied in our analyses (out of 18) was 8.37, or 46.5%. This proportion is nearly identical to that of Shaneyfelt et al,15 who reported an overall mean adherence of 10.77 out of 25 standards, or 43.1%. However, whereas they found significant improvement in adherence to standards over time, we detected no such improvement.
Of the specific areas in which clinical practice guidelines need to be improved, none is more pressing than that having to do with the composition of committees developing the guidelines. As pointed out by Sniderman and Furberg,17(p430) “what is to be decided [by committees producing guidelines] is often already decided with the selection of the deciders.” Nevertheless, in our investigation, rarely did guidelines contain information on the criteria used to select committee members or the process by which selections were made. Fewer than half of the guidelines surveyed addressed COIs, and among those that did, such COIs were pervasive. Particularly troubling was the finding that over two-thirds of committee chairpersons for whom information was provided had COIs. In addition, less than a third of guidelines prepared under the aegis of subspecialty organizations—whose recommendations carry added weight because of their special expertise and whose members stand to profit directly from such recommendations—included information on COIs. Guidelines developed by specialty societies previously have been cited for poor quality and questionable validity.19 Although information scientists and patient representatives could enhance the quality of clinical practice guidelines by providing their unique perspectives and also possibly by mitigating the influence of COIs among clinician members, they were rarely included as committee members.
Adherence to IOM standards in identifying and summarizing evidence used to formulate guidelines was relatively good. However, rarely were non-English articles, unpublished data, or abstracts included in literature searches. Surprisingly, compliance of US government agencies, in particular the CDC, with these standards was substantially worse than that of other organizations. Although the actual compliance of US government agencies might have been better than indicated by our analysis, this was not evident in their guidelines archived on the NGC website.
Adherence to the IOM standards for Formulation of Recommendations was also reasonably good. However, guidelines were nearly always written in such a way as to suggest that recommendations were unanimously supported by committee members; rarely were differences of opinion aired in the published documents. The role of opinion (as opposed to evidence) in formulating recommendations was typically covered under the heading “expert consensus” rather than explicitly addressed. Although, with rare exceptions, both the benefits and harms of recommendations were described, these descriptions were frequently general rather than specific, and benefits almost invariably received greater attention than harms.
In Clinical Practice Guidelines We Can Trust,14(p100) the IOM states simply that guidelines should “Be reconsidered and revised as appropriate when important evidence warrants modifications.” Given that high-quality systematic reviews directly relevant to clinical practice have a survival of 5.5 years,20 it has been suggested that clinical practice guidelines be updated at least every 5 years.5 Fewer than half of the guidelines we reviewed had been updated in 5.5 years or less. The need for such regular updating was underscored by our observation that FDA alerts were added to 14% of the guidelines we reviewed following their publication.
There are several limitations to our study. First, we focused on versions of the guidelines archived on the NGC website, which might have differed from those published on the developers' websites. However, when we compared compliance with the IOM standards as reflected by guidelines archived on the NGC website with a random sample of those published on the developers' websites, we found a mean difference in scores of only 1.81 (10.0%).
Second, like Shaneyfelt et al,15 we used a yes or no format in determining adherence to standards and did not assess the relative quality of a guideline's compliance with given standards. For example, the criterion on “evidence supporting individual recommendations given” was met when some, but not necessarily all, recommendations were accompanied by supporting evidence. This approach of holding guidelines to only the simplest criteria was used for all of the IOM standards examined, making the poor performance of the guidelines reviewed of even greater concern. Had they been held to more stringent criteria (eg, credit given for satisfying that standard only if all recommendations were accompanied by supporting information), the results would have been substantially worse.
Third, because we relied on material reported in NGC versions of the guidelines, our findings would have been influenced not only by the quality of the guidelines themselves, but also by the quality of the reporting process. It is possible that in some cases organizations used different (perhaps more appropriate) techniques in developing their guidelines from those described in either the original versions or those archived on the NGC website. In some cases, for example, NGC guidelines did not distinguish between the roles of “guideline committees” and “groups that authored the guidelines,” which complicated our efforts to properly assign COIs. Be that as it may, as with all medical reports, documentation of methods used is fundamental in that the validity of conclusions can only be determined if the methods used to develop them are explicitly stated.
Finally, we monitored only 18 of the 25 IOM standards for clinical practice guidelines, and these differed slightly from those monitored by Shaneyfelt et al15 in their investigation. The standards we selected for our analysis were some of the most objective in the IOM report. Compliance might have been better (or worse) overall if all of the standards had been examined, although we believe that the 18 standards we examined were representative of the group as a whole. Likewise, although our standards differed slightly from those of Shaneyfelt et al,15 the similarity of our results to theirs suggests that little, if any, progress has been made over the past quarter century in improving the quality of clinical practice guidelines. Although the practice guidelines reviewed were developed prior to publication of the IOM standards analyzed herein, we believe our results provide a useful baseline for monitoring the impact of these standards on the future quality, transparency, and objectivity of clinical practice guidelines.
Of the possible approaches that might be taken to bring about such improvement, vigorous enforcement of the IOM standards is the most logical. This would require a concerted effort by medical societies and government agencies to ensure that producers of guidelines familiarize themselves with the standards and incorporate them into the process by which clinical practice guidelines are formulated. Medical journals—particularly those belonging to subspecialty societies, which publish society guidelines without review—should insist, before considering guidelines for publication, that they meet criteria related to the IOM standards no less stringent than those applied to clinical trials. These criteria should pay particular attention to the problem of COIs, the process by which committees are selected, and the need for patient representatives and information scientists on these committees.
Correspondence: Philip A. Mackowiak, MD, Medical Care Clinical Center, VA Maryland Health Care System, Medical Service-111, 10 N Greene St, Baltimore, MD 21201 (firstname.lastname@example.org).
Accepted for Publication: July 9, 2012.
Published Online: October 22, 2012. doi:10.1001/2013.jamainternmed.56
Author Contributions: Study concept and design: Kung and Mackowiak. Acquisition of data: Kung and Mackowiak. Analysis and interpretation of data: Kung, Miller, and Mackowiak. Drafting of the manuscript: Kung, Miller, and Mackowiak. Critical revision of the manuscript for important intellectual content: Kung, Miller, and Mackowiak. Statistical analysis: Kung and Miller. Administrative, technical, and material support: Mackowiak. Study supervision: Mackowiak.
Financial Disclosure: Dr Miller is employed by GlaxoSmithKline.