From the Institute of Psychiatry (Drs Joyce, Wessely, and Rabe-Hesketh), King's College School of Medicine, King's College, and Maudsley Hospitals (Dr Wessely), London, England.
Objective.— To test the hypothesis that the selection of literature in review articles
is unsystematic and is influenced by the authors' discipline and country of residence.
Data Sources.— Reviews in English published between 1980 and March 1996 in MEDLINE,
EMBASE (BIDS), PSYCHLIT, and Current Contents were searched.
Study Selection.— Reviews of chronic fatigue syndrome (CFS) were selected. Articles explicitly
concerned with a specialty aspect of CFS and unattributed, unreferenced, or
insufficiently referenced articles were discarded.
Data Extraction.— Record of data sources in each review was noted as was the departmental
specialty of the first author and his or her country of residence. The references
cited in each index paper were tabulated by assigning them to 6 specialty
categories, by article title, and by assigning them to 8 categories, by country
of journal publication.
Data Synthesis.— Of 89 reviews, 3 (3.4%) reported on literature search and described
search method. Authors from laboratory-based disciplines preferentially cited
laboratory references, while psychiatry-based disciplines preferentially cited
psychiatric literature (P=.01). A total of 71.6%
of references cited by US authors were from US journals, while 54.9% of references
cited by United Kingdom authors were published in United Kingdom journals
Conclusion.— Citation of the literature is influenced by review authors' discipline
MANY narrative reviews and reviews that describe themselves as systematic
have been shown to be nonreproducible and to be of low mean scientific quality.1-3 A lack of clearly specified
methods of identifying, selecting, and validating included information has
been among the problems noted.4 Experts could
not agree, even among themselves, about whether other experts who wrote review
articles had conducted a competent search or generated a bias-free list of
Few things are certain about chronic fatigue syndrome (CFS) other than
that it is controversial. Both public and professional opinions are often
debated passionately. In such circumstances both physicians and interested
members of the public may turn for guidance and information to review articles.
Such articles fulfill an important function for professionals, journalists,
and patients unable to locate or evaluate primary sources of information.
This is particularly important in CFS, since potentially relevant research
spans many disciplines, with important contributions coming from specialties
as diverse as immunology, virology, internal medicine, psychiatry, psychology,
and neurology. Thus, our aim was to examine the quality of current reviews
of CFS. Our hypothesis was that use of the literature would show the following
biases: the identification and selection of literature for review is unsystematic,
it fails to reflect the broad range of literature, and it is influenced by
the author's discipline and country of residence.
All reviews of CFS between 1980 and 1996 from English-language journals
were eligible. We defined a review as an article that made a claim, either
implicit or explicit, referring to the range of knowledge available at the time
of publication and that represented itself as being able to reach general
conclusions about CFS. Articles explicitly labeled as dealing with a specialty
aspect of CFS, such as "psychiatric aspects of CFS" or "immunological findings
in CFS," were excluded. Reviews with fewer than 15 references (we considered
a citation list of at least 15 as necessary evidence of a serious attempt
to review the subject) or without any details about authors were excluded.
Seventy-three foreign language reviews in 14 different languages found in
the same search were excluded due to lack of linguistic expertise and small
numbers per country of journal publication.
We searched MEDLINE, EMBASE (BIDS), PSYCHLIT, and Current Contents. We conducted a free-text search using the terms "chronic fatigue syndrome," "neurasthenia," "myalgic encephalomyelitis," and "tiredness" and the truncated terms "chronic fatigue" and "postviral." More than 4000 references
were checked in title and abstract by one of the authors (J.J.). All possible
reviews were then confirmed by one of the authors (S.W.).
We used 4 phases in extracting and categorizing data. First, using the
first 3 of the 10 criteria recommended by Oxman and Guyatt7 for assessing
the scientific quality of research overviews, which concern how literature
is selected for review, we noted the comments the
authors made on their search methods, each article's comprehensiveness, and
the review article's inclusion criteria. Second, the departmental specialty
of the first author and his or her country of residence were tabulated (J.J.)
and checked (S.W.), and no discrepancies were found. Third,
the references cited in each index paper were tabulated and assigned to 6
specialty or subject categories by article title alone as shown in Table 1. Fourth, the references cited were
also tabulated and assigned to 8 categories representing the countries in
which they were published, including the United States, United Kingdom, Europe
(excluding the United Kingdom), Australia, Canada, New Zealand, other (which
included South Africa, Israel, India, Japan, and China), and not listed. The
place of publication of each journal title was ascertained by consulting Libertas'
list of serials.
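The tabulation in the third and fourth phases reduces each review to a row of percentages. A minimal sketch of that step is shown here for the country categories, which are taken from the text above. The function name `category_percentages` and the example reference list are hypothetical and are not the authors' data.

```python
from collections import Counter

# The 8 country-of-publication categories named in the text.
COUNTRY_CATEGORIES = [
    "United States",
    "United Kingdom",
    "Europe (excluding United Kingdom)",
    "Australia",
    "Canada",
    "New Zealand",
    "Other",
    "Not listed",
]


def category_percentages(reference_labels, categories):
    """Return the percentage of a review's references in each category."""
    if not reference_labels:
        raise ValueError("a review must cite at least one reference")
    unknown = set(reference_labels) - set(categories)
    if unknown:
        raise ValueError(f"unrecognized categories: {sorted(unknown)}")
    counts = Counter(reference_labels)
    total = len(reference_labels)
    return {c: 100.0 * counts.get(c, 0) / total for c in categories}


# Invented example: one review citing 20 references.
example_labels = (
    ["United States"] * 12
    + ["United Kingdom"] * 5
    + ["Europe (excluding United Kingdom)"] * 2
    + ["Not listed"]
)
example_row = category_percentages(example_labels, COUNTRY_CATEGORIES)
```

Stacking one such row per review would give a matrix of percentages with one column per category.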
The data consist of a set of percentages for each review, representing
the proportion of that review's references that fall into each subject or country of publication category.
We wished to test how the author's discipline, country of residence, and the
country in which the article was published had affected the use of references.
Since the subject or country of publication categories may be regarded as
repeated measurements within each review and the data are approximately normally
distributed, this could be done by repeated-measures multivariate analysis
of variance (MANOVA) using the matrix of percentages. In order to display
the multivariate data in 2 dimensions, we made a series of biplots,8 ie, plots of the first 2 principal components of the
matrix of percentages together with a set of axes for the reference subject
or country of publication categories. The repeated-measures MANOVA was carried out using
SPSS statistical software (SPSS Inc, Chicago, Ill) and the biplots were made
using S-Plus (MathSoft, Seattle, Wash).
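The biplot construction described above can be sketched with a singular value decomposition of the column-centered matrix of percentages. This is an illustration under stated assumptions, not the authors' S-Plus procedure: the article does not state which biplot scaling was used, so the sketch uses one common convention (review points as principal component scores, category axes as loadings). The function name `biplot_coordinates` and the small example matrix are invented.

```python
import numpy as np


def biplot_coordinates(percentages):
    """Return review points, category axes, and variance shares for PC1 and PC2.

    Assumed convention: rows (reviews) are plotted at their principal
    component scores, and columns (categories) are drawn as loading vectors.
    """
    X = np.asarray(percentages, dtype=float)
    centered = X - X.mean(axis=0)          # center each category column
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :2] * s[:2]              # one 2-D point per review
    axes = Vt[:2].T                        # one 2-D direction per category
    explained = (s ** 2) / np.sum(s ** 2)  # share of variance per component
    return scores, axes, explained[:2]


# Invented data: 4 reviews by 3 reference categories, each row summing to 100.
example = np.array(
    [
        [70.0, 20.0, 10.0],
        [65.0, 25.0, 10.0],
        [20.0, 70.0, 10.0],
        [25.0, 60.0, 15.0],
    ]
)
scores, axes, explained = biplot_coordinates(example)
```

Plotting `scores` as points and `axes` as arrows on the same 2-dimensional plane gives the biplot. The two entries of `explained` correspond to the kind of variance percentages reported for each figure panel.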
One hundred eighty-six reviews were found on our preliminary search.
Eighty-nine of these were counted as eligible for analysis. All were checked
by one of the authors (S.W.) and there were no disagreements between the 2
reviewers. Fifty-five of the articles concerned a specialty aspect of CFS,
despite a general title, or were primary research, an audit, a case report,
or a first-person account (26 of these were reviews specifically dealing with
treatment only). Thirteen were not attributed to an author, 7 were without
any references, and 18 had fewer than 15 references. Four were unobtainable
despite a thorough interlibrary search.
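The exclusion counts reported above can be checked against the number of eligible reviews with simple arithmetic. The short sketch below uses only the figures given in the text. The dictionary keys are paraphrased labels.

```python
# Counts as reported in the text.
reviews_found = 186
exclusions = {
    "specialty aspect, primary research, audit, case report, or first-person account": 55,
    "not attributed to an author": 13,
    "without any references": 7,
    "fewer than 15 references": 18,
    "unobtainable": 4,
}

total_excluded = sum(exclusions.values())
eligible = reviews_found - total_excluded
```

The exclusions sum to 97, leaving the 89 eligible reviews stated in the text.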
Only 3 (3.4%) of the 89 reviews reported the database source or sources
used to conduct their literature searches; none of the 3 was written by any of
the current authors. In 2 articles, the authors merely described the search,
without further elaboration, as "relevant published research literature,"
and in 1 article, the author reported the databases used for the search. One
of the 3 specifically reported its inclusion criteria.
After data inspection, 1 review (by a neurologist) was excluded as an
outlier, as its large distance from all other data would exert a disproportionate influence.
Because the numbers of nurse-therapists, pharmacologists,
and authors in the "other" group were too small for testing, we avoided the problems of inference
that too many small groups would cause by including for analysis
only those groups with a sufficient number of authors. These authors were mostly physicians:
general practitioners or specialists in infectious diseases or psychiatry.
There was a highly significant interaction between reference disciplines and
specialty of author (P=.01; F(12,185)=2.84). Figure 1 shows the pattern of reference use
by infectious diseases specialists and psychiatrists. Those working in infectious
diseases quote the laboratory category most often, followed by physicians,
nurses, and general practitioners. Infectious diseases specialists and general
practitioners quote psychiatric articles least. Psychiatrists and pharmacologists
quote the laboratory category and psychiatric categories about equally.
The biplot in the left panel of Figure 2 shows the spread of reviews by specialty of author (in this case,
psychiatrist or infectious diseases specialist) according to reference disciplines.
Those reviews authored by infectious diseases experts cluster on the left
around the laboratory category, and those authored by psychiatrists cluster
on the right around general psychiatry. The first 2 principal components accounted for
73.5% and 14.4% of the variance, respectively.
The distribution of countries of reference publication was compared
by the review author's country of residence using repeated-measures MANOVA.
Only references from the United States and United Kingdom were included because
of the small numbers of articles published in other countries. The interaction
between where the author lived and where the reference was published was significant
The 2 principal components accounted for 92.1% and 5% of the variance,
respectively (97.1% in total). The right panel of Figure 2 shows the spread of reviews according to where the author
resides and in what country the references were published. Those reviews with
US authors cluster to the right and those with United Kingdom authors cluster
to the left. We have not included the 6 reviews by Australian authors in this
biplot in order to preserve the clarity of presentation. However, Table 2 shows that the Australian reviews
were situated midway between the United States and the United Kingdom in terms
of citation distribution by country.
Despite the recent emphasis on the necessity for quality in medical
reviews, our results show that in the area of CFS the vast majority of reviews
are not based on systematic literature searches and do not use objective criteria
for inclusion or exclusion.
One might reasonably expect that reviews of a multidisciplinary subject,
such as CFS, are able to integrate findings from many sources; instead,
it is possible that they perpetuate preexisting disciplinary biases. We have
shown that the choice of articles to cite is influenced by the author's discipline
and the country in which he or she resides. We emphasize that we have only
looked at review articles that claimed to be comprehensive and have not included
any review articles that made such a bias explicit. A reader consulting any
of the review articles we have studied expects them to be an objective synthesis
of a complex subject. Instead, most display biases toward particular disciplines,
usually the one in which the author practices. Reference bias has been previously
reported in drug trials9 but not, to our knowledge,
in the area of reviews. Such biases are not unexpected but are important.
Similarly, it is a staple of academic gossip that Americans only cite US literature,
Europeans European literature, and so on. We have confirmed that both US and
United Kingdom authors are more likely to cite literature published in their
own countries. Also of note is the underuse of Continental European literature
by US or United Kingdom authors. We are aware of only 1 previous confirmation
of this intuition.10
Exclusion of references because of language has been shown to introduce
bias in randomized controlled trials.11-13
We acknowledge our findings can only be generalized to the English-language
literature. For fairly obvious reasons, language bias alone cannot explain
our findings. American English and United Kingdom English may not sound the
same, but they do read the same.
John Joyce, Sophia Rabe-Hesketh, Simon Wessely. Reviewing the Reviews: The Example of Chronic Fatigue Syndrome. JAMA. 1998;280(3):264–266. doi:10.1001/jama.280.3.264