Objective Because of the pressure for timely, informed decisions in public health
and clinical practice and the explosion of information in the scientific literature,
research results must be synthesized. Meta-analyses are increasingly used
to address this problem, and they often evaluate observational studies. A
workshop was held in Atlanta, Ga, in April 1997, to examine the reporting
of meta-analyses of observational studies and to make recommendations to aid
authors, reviewers, editors, and readers.
Participants Twenty-seven participants were selected by a steering committee, based
on expertise in clinical practice, trials, statistics, epidemiology, social
sciences, and biomedical editing. Deliberations of the workshop were open
to other interested scientists. Funding for this activity was provided by
the Centers for Disease Control and Prevention.
Evidence We conducted a systematic review of the published literature on the
conduct and reporting of meta-analyses in observational studies using MEDLINE,
Educational Resources Information Center (ERIC), PsycLIT, and the Current Index
to Statistics. We also examined reference lists of the 32 studies retrieved
and contacted experts in the field. Participants were assigned to small-group
discussions on the subjects of bias, searching and abstracting, heterogeneity,
study categorization, and statistical methods.
Consensus Process From the material presented at the workshop, the authors developed a
checklist summarizing recommendations for reporting meta-analyses of observational
studies. The checklist and supporting evidence were circulated to all conference
attendees and additional experts. All suggestions for revisions were addressed.
Conclusions The proposed checklist contains specifications for reporting of meta-analyses
of observational studies in epidemiology, including background, search strategy,
methods, results, discussion, and conclusion. Use of the checklist should
improve the usefulness of meta-analyses for authors, reviewers, editors, readers,
and decision makers. An evaluation plan is suggested and research areas are
explored.
Because of pressure for timely and informed decisions in public health
and medicine and the explosion of information in the scientific literature,
research results must be synthesized to answer urgent questions.1-4
Principles of evidence-based methods to assess the effectiveness of health
care interventions and set policy are cited increasingly.5
Meta-analysis, a systematic approach to identifying, appraising, synthesizing,
and (if appropriate) combining the results of relevant studies to arrive at
conclusions about a body of research, has been applied with increasing frequency
to randomized controlled trials (RCTs), which are considered to provide the
strongest evidence regarding an intervention.6,7
However, in many situations randomized controlled designs are not feasible,
and only data from observational studies are available.8
Here, we define an observational study as an etiologic or effectiveness study
using data from an existing database, a cross-sectional study, a case series,
a case-control design, a design with historical controls, or a cohort design.9 Observational designs lack the experimental element of random allocation to an intervention and rely instead on studies of association
between changes or differences in 1 characteristic (eg, an exposure or intervention)
and changes or differences in an outcome of interest. These designs have long
been used in the evaluation of educational programs10
and exposures that might cause disease or injury.11
Studies of risk factors generally cannot be randomized because they relate
to inherent human characteristics or practices, and exposing subjects to harmful
risk factors is unethical.12 At times, clinical
data may be summarized in order to design a randomized comparison.13 Observational data may also be needed to assess the
effectiveness of an intervention in a community as opposed to the special
setting of a controlled trial.14 Thus, a clear
understanding of the advantages and limitations of statistical syntheses of
observational data is needed.15
Although meta-analysis restricted to RCTs is usually preferred to meta-analysis
of observational studies,16-18
the number of published meta-analyses concerning observational studies in
health has increased substantially during the past 4 decades (678 in 1955-1992,
525 in 1992-1995, and more than 400 in 1996 alone).19
While guidelines for meta-analyses have been proposed, many are written
from the meta-analyst's (author's) rather than from the reviewer's, editor's,
or reader's perspective20 and restrict attention
to reporting of meta-analyses of RCTs.21,22
Meta-analyses of observational studies present particular challenges because
of inherent biases and differences in study designs23;
yet, they may provide a tool for helping to understand and quantify sources
of variability in results across studies.24
We describe here the results of a workshop held in Atlanta, Ga, in April
1997, to examine concerns regarding the reporting of Meta-analysis Of Observational
Studies in Epidemiology (MOOSE). This article summarizes the deliberations of the 27 participants (the MOOSE group) on evidence leading to recommendations regarding the reporting of meta-analyses. Meta-analysis of individual-level data from
different studies, sometimes called "pooled analysis" or "meta-analysis of
individual patient data,"25,26
has unique challenges that we will not address here. We propose a checklist
of items for reporting that builds on similar activities for RCTs22 and is intended for use by authors, reviewers, editors,
and readers of meta-analyses of observational studies.
We conducted a systematic review of the published literature on the
conduct and reporting of meta-analyses in observational studies. Databases
searched included MEDLINE, Educational Resources Information Center, PsycLIT
(http://www.wesleyan.edu/libr), and the Current Index to Statistics.
In addition, we examined reference lists and contacted experts in the field.
We used the 32 articles retrieved to generate the conference agenda and set
topics of bias, searching and abstracting, heterogeneity, study categorization,
and statistical methods. We invited experts in meta-analysis from the fields
of clinical practice, trials, statistics, epidemiology, social sciences, and
biomedical editing.
The workshop included an overview of the quality of reporting of meta-analyses
in education and the social sciences. Plenary talks were given on the topics
set by the conference agenda. For each of 2 sessions, workshop participants
were assigned to 1 of 5 small discussion groups, organized around the topic
areas. For each group, 1 of the authors served as facilitator, and a recorder
summarized points of discussion for issues to be presented to all participants.
Time was provided for the 2 recorders and 2 facilitators for each topic to
meet and prepare plenary presentations given to the entire group. We proposed
a checklist for meta-analyses of observational studies based on the deliberation
of the independent groups. Finally, we circulated the checklist for comment
to all conference attendees and representatives of several constituencies
who would use the checklist.
The checklist resulting from workgroup deliberations is organized around
recommendations for reporting background, search strategy, methods, results,
discussion, and conclusions (Table 1).
Reporting of the background should include the definition of the problem
under study, statement of hypothesis, description of the study outcome(s)
considered, type of exposure or intervention used, type of study design used,
and complete description of the study population. When combining observational
studies, heterogeneity of populations (eg, US vs international studies), design
(eg, case-control vs cohort studies), and outcome (eg, different studies yielding
different relative risks that cannot be accounted for by sampling variation)
is expected.8
Reporting of the search strategy should include qualifications of the
searchers, specification of databases used, search strategy and index terms,
use of any special features (eg, "explosion"), search software used, use of
hand searching and contact with authors, use of materials in languages other
than English, use of unpublished material, and exclusion criteria used. Published
research shows that use of electronic databases may find only half of all
relevant studies, and contacting authors may be useful,27
although this result may not be true for all topic areas.28
For example, a meta-analysis of depression in elderly medical inpatients29 used 2 databases for the search. In addition, bibliographies
of retrieved papers were searched. However, the authors did not report their
search strategy in enough detail to allow replication. An example of a thorough
"reject log" can be found in the report of a meta-analysis of electrical and
magnetic field exposure and leukemia.30 Examples
of a table characterizing studies included can be found in Franceschi et al31 and Saag et al.32
Complete specification of the search strategy is not uniform in published reports; a review of 103
published meta-analyses in education showed that search procedures were described
inadequately in the majority of the articles.10
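One way to make a search replicable is to script and date it. The following minimal sketch (the query terms are hypothetical and stand in for a fully reported strategy; it assumes the Biopython package and NCBI Entrez access to MEDLINE/PubMed) records the exact query alongside the count retrieved:

```python
# Sketch of a scripted, replicable literature search. Assumes the
# Biopython package (pip install biopython); the query terms below are
# hypothetical and stand in for a real, fully reported strategy.
from Bio import Entrez

Entrez.email = "searcher@example.org"  # NCBI asks for a contact address

# Record the exact query string, field tags, limits, and date of search.
query = ('"depressive disorder"[MeSH Terms] AND "aged"[MeSH Terms] '
         'AND "inpatients"[MeSH Terms]')
handle = Entrez.esearch(db="pubmed", term=query, retmax=500)
record = Entrez.read(handle)
handle.close()

print(f"{record['Count']} citations retrieved for: {query}")
```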
Items in the methods section of the checklist are concerned with the appropriateness
of any quantitative summary of the data; degree to which coding of data from
the articles was specified and objective; assessment of confounding, study
quality, and heterogeneity; use of statistical methods; and display of results.
Empirical evidence shows that reporting of procedures for classification and
coding and quality assessment is often incomplete: fewer than half of the
meta-analyses reported details of classifying and coding the primary study
data, and only 22% assessed quality of the primary studies.10
We recognize that the use of quality scoring in meta-analyses of observational
studies is controversial, as it is for RCTs,16,33
because scores constructed in an ad hoc fashion may lack demonstrated validity,
and results may not be associated with quality.34
Nevertheless, some particular aspects of study quality have been shown to
be associated with effect: eg, adequate concealment of allocation in randomized
trials.35 Thus, key components of design, rather
than aggregate scores themselves, may be important. For example, in a study
of blinding (masking) of readers participating in meta-analyses, masking essentially
made no difference in the summary odds ratios across the 5 meta-analyses.36 We recommend the reporting of quality scoring if
it has been done and also recommend subgroup or sensitivity analysis rather
than using quality scores as weights in the analysis.37,38
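As a minimal sketch of this recommendation (all study data are hypothetical), a sensitivity analysis might compare the pooled estimate from all studies with the estimate restricted to studies having a key design feature, instead of weighting by an aggregate quality score:

```python
import math

# Hypothetical studies: (log odds ratio, standard error, key design feature?)
studies = [
    (0.40, 0.15, True),
    (0.55, 0.20, False),
    (0.30, 0.10, True),
    (0.70, 0.25, False),
]

def pooled_log_or(subset):
    """Fixed-effect inverse-variance pooled log odds ratio and its SE."""
    weights = [1 / se ** 2 for _, se, _ in subset]
    estimate = sum(w * lor for w, (lor, _, _) in zip(weights, subset)) / sum(weights)
    return estimate, math.sqrt(1 / sum(weights))

est_all, se_all = pooled_log_or(studies)
est_hq, se_hq = pooled_log_or([s for s in studies if s[2]])

print(f"All studies:     OR = {math.exp(est_all):.2f} (SE of log OR = {se_all:.2f})")
print(f"Feature present: OR = {math.exp(est_hq):.2f} (SE of log OR = {se_hq:.2f})")
```

A large shift between the two pooled estimates would flag that design component as a source of heterogeneity worth reporting.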
While some control over heterogeneity of design may be accomplished
through the use of exclusion rules, we recommend using broad inclusion criteria
for studies, and then performing analyses relating design features to outcome.8 In cases when heterogeneity of outcomes is particularly
problematic, a single summary measure may well be inappropriate.39
Analyses that stratify by study feature or regression analysis with design
features as predictors can be useful in assessing whether study outcomes indeed
vary systematically with these features.40
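A minimal sketch of such a regression (hypothetical effect sizes; inverse-variance weighted least squares with a single design indicator) might look like this:

```python
import numpy as np

# Hypothetical studies: log relative risk, its variance, and a design
# indicator (1 = cohort, 0 = case-control).
log_rr = np.array([0.25, 0.40, 0.10, 0.55, 0.30])
variance = np.array([0.04, 0.09, 0.02, 0.12, 0.05])
is_cohort = np.array([1, 0, 1, 0, 1])

# Inverse-variance weighted least squares: regress effect size on the
# design feature so its coefficient measures the design-related shift.
X = np.column_stack([np.ones_like(log_rr), is_cohort])
W = np.diag(1 / variance)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_rr)

print(f"Pooled log RR, case-control studies: {beta[0]:.3f}")
print(f"Shift associated with cohort design: {beta[1]:.3f}")
```

A coefficient on the design indicator that is large relative to its uncertainty would suggest that study outcomes indeed vary systematically with that feature.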
Investigating heterogeneity was a key feature of a meta-analysis of
observational studies of asbestos exposure and risk of gastrointestinal cancer.41 The authors of the meta-analysis hypothesized that
studies allowing for a latent period between the initiation of exposure and
any increases in risk should show, on average, appropriately higher standardized
mortality ratios than studies that ignored latency. In other words, the apparent
effect of exposure would be attenuated by including the latent period in the
calculation of time at risk (the "denominator"), since exposure-related deaths
(the "numerator") would, by definition, not occur during that latent period
(Figure 1).
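Hypothetical numbers make the attenuation concrete. A standardized mortality ratio is observed deaths divided by expected deaths, and expected deaths grow with the person-time counted, so counting the latent period inflates the denominator while adding nothing to the numerator:

```python
# Hypothetical cohort: 20 exposure-related deaths, all occurring after a
# 10-year latent period; the reference population's death rate implies
# 1 expected death per 1000 person-years counted.
observed = 20
py_after_latency = 15_000  # person-years at risk once latency has elapsed
py_latent = 10_000         # person-years accrued during the latent period
rate = 0.001               # expected deaths per person-year

smr_allowing_latency = observed / (py_after_latency * rate)
smr_ignoring_latency = observed / ((py_after_latency + py_latent) * rate)

print(f"SMR allowing for latency: {smr_allowing_latency:.2f}")  # 1.33
print(f"SMR ignoring latency:     {smr_ignoring_latency:.2f}")  # 0.80
```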
In fact, the data suggested that studies allowing for latent periods
found on average somewhat higher standardized mortality ratios than studies
ignoring latency. This example shows that sources of bias and heterogeneity
can be hypothesized prior to analysis and subsequently confirmed by the analysis.
Recommendations for reporting of results include graphical summaries
of study estimates and any combined estimate, a table listing descriptive
information for each study, results of sensitivity testing and any subgroup
analysis, and an indication of statistical uncertainty of findings.
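Study-level and combined estimates are conventionally displayed together in a forest plot. A minimal sketch (hypothetical odds ratios and confidence intervals, drawn with matplotlib) follows:

```python
import matplotlib.pyplot as plt

# Hypothetical estimates: (label, odds ratio, 95% CI lower, 95% CI upper)
rows = [
    ("Study A", 1.5, 1.1, 2.0),
    ("Study B", 1.2, 0.8, 1.8),
    ("Study C", 1.8, 1.2, 2.7),
    ("Combined", 1.4, 1.1, 1.8),
]

fig, ax = plt.subplots()
for i, (label, est, lo, hi) in enumerate(reversed(rows)):
    ax.plot([lo, hi], [i, i], color="black")   # confidence interval
    ax.plot(est, i, "ks")                      # point estimate (black square)
ax.axvline(1.0, linestyle="--", color="gray")  # null value (OR = 1)
ax.set_yticks(range(len(rows)))
ax.set_yticklabels([r[0] for r in reversed(rows)])
ax.set_xscale("log")                           # ratio measures on a log scale
ax.set_xlabel("Odds ratio (95% CI, log scale)")
plt.show()
```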
The discussion should include issues related to bias, including publication
bias, confounding, and quality. Bias can occur in the original studies (resulting
from flaws in the study design that tend to distort the magnitude or direction
of associations in the data) or from the way in which studies are selected
for inclusion.42 Publication bias, the selective
publication of studies based on the magnitude and direction of their findings (larger effects are more likely to be published), represents a particular threat to the validity of meta-analysis
of observational studies.43-45
Thorough specification of quality assessment can contribute to understanding some of the variation among the observational studies themselves. Methods should
be used to aid in the detection of publication bias, eg, fail-safe procedures
or funnel plots.46
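As one illustration, Rosenthal's fail-safe procedure43 asks how many unpublished null studies would have to exist to pull an apparently significant combined result below significance; a minimal sketch with hypothetical z statistics:

```python
import math

def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: unpublished null studies needed to pull
    the Stouffer combined z statistic below the significance threshold."""
    total = sum(z_scores)
    return max(0, math.ceil(total ** 2 / z_alpha ** 2 - len(z_scores)))

# Hypothetical per-study z statistics from the included studies
z = [2.1, 1.8, 2.5, 1.2, 2.9]
print(fail_safe_n(z))  # 36 additional null studies needed for these data
```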
Schlesselman47 comments on such biases
in assessing the possible association between endometrial cancer and oral
contraceptives. This meta-analysis combined both cohort and case-control studies
and used a sensitivity analysis to illustrate the influence of specific studies,
such as those published in English.
Because of these biases in observational studies, the conclusion of the
report should contain consideration of alternative explanations for observed
results and appropriate generalizations of the conclusion. A carefully conducted
meta-analysis can reveal areas warranting further research. Finally, since
funding source has been shown to be an important source of heterogeneity,48 the sponsoring organization should be disclosed and
any effect on analysis should be examined.
Taking stock of what is known in any field involves reviewing the existing
literature, summarizing it in appropriate ways, and exploring the implications
of heterogeneity of population and study for heterogeneity of study results.
Meta-analysis provides a systematic way of performing this research synthesis,
while indicating when more research is necessary.
The application of formal meta-analytic methods to observational studies
has been controversial.42 One reason for this
has been that potential biases in the original studies, relative to the biases
in RCTs, make the calculation of a single summary estimate of effect of exposure
potentially misleading. Similarly, the extreme diversity of study designs
and populations in epidemiology makes the interpretation of simple summaries
problematic, at best. In addition, methodologic issues related specifically
to meta-analysis, such as publication bias, could have particular impact when
combining results of observational studies.44,47
Despite these challenges, meta-analysis of observational studies remains one of the few methods for assessing efficacy and effectiveness in many settings, and such analyses are being published in increasing numbers. Our goal is to improve the reporting
of these meta-analyses so that readers can understand what was done in a given
analysis, who did it, and why it was done. If bias is a problem, we suggest
that an informative approach is to use broad inclusion criteria for studies
and then to perform analyses (when the data permit) relating suspected sources
of bias and variability to study findings.
Methodologic and interpretational concerns make the clear and thorough
reporting of meta-analyses of observational studies absolutely essential.
Our workshop was convened to address the problem of increasing diversity and
variability that exist in reporting meta-analyses of observational studies.
In constructing the checklist, we have attempted, where possible, to provide
references to literature justifying the inclusion of particular items.
Assessment of the usefulness of recommendations for reporting is dependent
on a well-designed and effectively conducted evaluation. The workshop participants
proposed a 3-pronged approach to determine usefulness and implementation of
these recommendations.
First, further comments should be incorporated into revisions of the
checklist, to ensure its usefulness to journal reviewers and editors. The
US Food and Drug Administration (FDA) receives and reviews petitions and applications
for approval of regulated products and/or their labeling. The FDA's Center
for Food Safety and Applied Nutrition is now receiving applications that use
results of meta-analyses in support of the requested action. The revised checklist
should be tested during the review of an application. One might randomly assign
FDA reviewers who encounter systematic reviews of observational studies to
use the checklist or not. Since the requirements for reporting for regulatory
purposes might not completely coincide with those in the checklist and since
sample size (the number of formal systematic reviews received by the FDA)
might be small, this evaluation should document any potential incompatibility
between requirements for regulatory reporting and the checklist.
Second, we will work with the Cochrane Collaboration to promote the
use of these recommendations by Cochrane collaborative review groups.49 Members of the Cochrane Collaboration are involved
routinely in performing systematic reviews. Some are now incorporating nonrandomized
studies out of necessity. A trial of use of the checklist could be compared
with the FDA experience.
Third, an evaluation of the checklist by authors, reviewers, readers,
and editors could compare objective measures of the quality of articles written
with and without the formal use of the guidelines. A challenge to the use
of quality measures would be arriving at a valid measure of quality. More important end points for trials in journals are process measures. Questions of interest include whether the use of the checklist makes preparation and evaluation of manuscripts easier or is otherwise helpful. Again, defining the constructs of interest presents a crucial challenge to this research.
Less formal evaluations, based on comments from users in any of the
above groups, would certainly be helpful, as well. One would need to be concerned
about contamination of the control groups when evaluating the checklist, as
journals, for example, might adopt the checklist even in the absence of evidence
of its efficacy from randomized trials.
In conclusion, the conference participants noted that meta-analyses
are themselves observational studies, even when applied to RCTs.50
If a role for meta-analyses of observational studies in setting policy is
to be achieved,51 standards of reporting must
be maintained to allow proper evaluation of the quality and completeness of
meta-analyses.
1. Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiol Rev. 1987;9:1-30.
2. Chalmers TC, Lau J. Meta-analytic stimulus for changes in clinical trials. Stat Methods Med Res. 1993;2:161-172.
3. Badgett RG, O'Keefe M, Henderson MC. Using systematic reviews in clinical education. Ann Intern Med. 1997;126:886-891.
4. Ohlsson A. Systematic reviews. Scand J Clin Lab Invest Suppl. 1994;219:25-32.
5. Bero LA, Jadad AR. How consumers and policymakers can use systematic reviews for decision making. Ann Intern Med. 1997;127:37-42.
7. Petitti D. Meta-Analysis, Decision Analysis, and Cost-Effectiveness Analysis. New York, NY: Oxford University Press; 1994.
9. Peipert JF, Phipps MG. Observational studies. Clin Obstet Gynecol. 1998;41:235-244.
10. Sipe TA, Curlette WL. A meta-synthesis of factors related to educational achievement. Int J Educ Res. 1997;25:583-598.
11. Ioannidis JP, Lau J. Pooling research results. Jt Comm J Qual Improv. 1999;25:462-469.
12. Lipsett M, Campleman S. Occupational exposure to diesel exhaust and lung cancer: a meta-analysis. Am J Public Health. 1999;89:1009-1017.
13. Vickers A, Cassileth B, Ernst E, et al. How should we research unconventional therapies? Int J Technol Assess Health Care. 1997;13:111-121.
15. Blettner M, Sauerbrei W, Schlehofer B, et al. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol. 1999;28:1-9.
17. Lau J, Ioannidis JP, Schmid CH. Summing up evidence. Lancet. 1998;351:123-127.
18. Shapiro S. Meta-analysis/shmeta-analysis. Am J Epidemiol. 1994;140:771-778.
19. Stroup DF, Thacker SB, Olson CM, Glass RM. Characteristics of meta-analyses submitted to a medical journal. Presented at: International Congress on Biomedical Peer Review and Global Communications; September 17-21, 1997; Prague, Czech Republic.
20. Lang TA, Secic M. How to Report Statistics in Medicine. Philadelphia, Pa: American College of Physicians; 1997.
21. Cook DJ, Sackett DL, Spitzer WO. Methodologic guidelines for systematic reviews of randomized control trials in health care from the Potsdam consultation on meta-analysis. J Clin Epidemiol. 1995;48:167-171.
22. Moher D, Cook DJ, Eastwood S, et al. Improving the quality of reports of meta-analyses of randomised controlled trials. Lancet. 1999;354:1896-1900.
24. Egger M, Schneider M, Davey Smith G. Meta-analysis. BMJ. 1998;316:140-144.
25. Stewart LA, Parmar MK. Meta-analysis of the literature or of individual patient data? Lancet. 1993;341:418-422.
26. Steinberg K, Smith SF, Lee N, et al. Comparison of effect estimates from a meta-analysis of summary data from published studies and from a meta-analysis using individual patient data for ovarian cancer studies. Am J Epidemiol. 1997;145:917-925.
27. McManus RJ, Wilson S, Delaney BC, et al. Review of the usefulness of contacting other experts when conducting a literature search for systematic reviews. BMJ. 1998;317:1562-1563.
28. Hetherington J, Dickersin K, Chalmers I, Meinert CL. Retrospective and prospective identification of unpublished controlled trials. Pediatrics. 1989;84:374-380.
29. Cole MG, Bellavance F. Depression in elderly medical inpatients. CMAJ. 1997;157:1055-1060.
30. Kheifets LI, Afifi AA, Buffler PA, et al. Occupational electric and magnetic field exposure and leukemia. J Occup Environ Med. 1997;39:1074-1091.
31. Franceschi S, La Vecchia C, Talamini R. Oral contraceptives and cervical neoplasia. Tumori. 1986;72:21-30.
32. Saag KG, Criswell LA, Sems KM, et al. Low-dose corticosteroids in rheumatoid arthritis. Arthritis Rheum. 1996;39:1818-1825.
33. Emerson JD, Burdick E, Hoaglin DC, et al. An empirical study of the possible relation of treatment differences to quality scores in controlled randomized clinical trials. Control Clin Trials. 1990;11:339-352.
34. Jüni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999;282:1054-1060.
35. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. JAMA. 1995;273:408-412.
36. Berlin JA, for the University of Pennsylvania Meta-analysis Blinding Study Group. Does blinding of readers affect the results of meta-analyses? Lancet. 1997;350:185-186.
37. Hasselblad V, Eddy DM, Kotchmar DJ. Synthesis of environmental evidence. J Air Waste Manag Assoc. 1992;42:662-671.
38. Friedenreich CM, Brant RF, Riboli E. Influence of methodologic factors in a pooled analysis of 13 case-control studies of colorectal cancer and dietary fiber. Epidemiology. 1994;5:66-67.
39. Berlin JA, Rennie D. Measuring the quality of trials. JAMA. 1999;282:1083-1085.
40. Colditz GA, Burdick E, Mosteller F. Heterogeneity in meta-analysis of data from epidemiologic studies. Am J Epidemiol. 1995;142:371-382.
41. Frumkin H, Berlin J. Asbestos exposure and gastrointestinal malignancy: review and meta-analysis. Am J Ind Med. 1988;14:79-95 [published correction appears in Am J Ind Med. 1988;14:493].
42. Blettner M, Sauerbrei W, Schlehofer B, et al. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol. 1999;28:1-9.
43. Rosenthal R. The file drawer problem and tolerance for null results. Psychol Bull. 1979;86:638-641.
44. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet. 1991;337:867-872.
45. Dickersin K, Min YI. NIH clinical trials and publication bias. Online J Curr Clin Trials [serial online]. April 28, 1993: Doc No 50.
46. Hedges LV, Olkin I. Statistical Methods for Meta-analysis. Boston, Mass: Academic Press; 1985.
47. Schlesselman JJ. Risk of endometrial cancer in relation to use of combined oral contraceptives. Hum Reprod. 1997;12:1851-1863.
48. Jadad A, Sullivan C, Luo D, et al. Patients' preferences for Turbuhaler or pressurized metered dose inhalers (pMDIs) in the treatment. Presented at: Annual Meeting of the American Academy of Allergy, Asthma, and Immunology; March 3-8, 2000; San Diego, Calif.
49. Huston P. Cochrane Collaboration helping unravel tangled web woven by international research. CMAJ. 1996;154:1389-1392.
50. Moher D, Pham B, Jones A, et al. Does the quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet. 1998;352:609-613.
51. Berlin JA, Colditz GA. The role of meta-analysis in the regulatory process for foods, drugs, and devices. JAMA. 1999;281:830-834.