Peer Review Congress
July 15, 1998

Does Masking Author Identity Improve Peer Review Quality? A Randomized Controlled Trial

Author Affiliations

From the Division of General Internal Medicine and Program for Health Care Research, Department of Veterans Affairs Medical Center and University Hospitals of Cleveland and Case Western Reserve University, Cleveland, Ohio (Dr Justice); the Center for Bioethics (Dr Cho) and the Division of Biostatistics, Department of Biostatistics and Epidemiology and Center for Clinical Epidemiology and Biostatistics, School of Medicine, University of Pennsylvania, Philadelphia (Dr Berlin); and the Institute for Health Policy Studies, University of California, San Francisco (Dr Rennie). Dr Winker is Senior Editor and Dr Rennie is Deputy Editor (West), JAMA.

JAMA. 1998;280(3):240-242. doi:10.1001/jama.280.3.240
Abstract

Context.— All authors may not be equal in the eyes of reviewers. Specifically, well-known authors may receive less objective (poorer quality) reviews. One study at a single journal found a small improvement in review quality when reviewers were masked to author identity.

Objectives.— To determine whether masking reviewers to author identity is generally associated with higher quality of review at biomedical journals, and to determine the success of routine masking techniques.

Design and Setting.— A randomized controlled trial performed on external reviews of manuscripts submitted to Annals of Emergency Medicine, Annals of Internal Medicine, JAMA, Obstetrics & Gynecology, and Ophthalmology.

Interventions.— Two peers reviewed each manuscript. In one study arm, both peer reviewers received the manuscript according to usual masking practice. In the other arm, one reviewer was randomized to receive a manuscript with author identity masked, and the other reviewer received an unmasked manuscript.

Main Outcome Measure.— Review quality on a 5-point Likert scale as judged by manuscript author and editor. A difference of 0.5 or greater was considered important.

Results.— A total of 118 manuscripts were randomized, 26 to usual practice and 92 to intervention. In the intervention arm, editor quality assessment was complete for 77 (84%) of 92 manuscripts. Author quality assessment was complete on 40 (54%) of 74 manuscripts. Editors perceived no significant difference in quality between masked and unmasked reviews (mean difference, 0.1; 95% confidence interval [CI], −0.2 to 0.4), nor did authors (mean difference, −0.1; 95% CI, −0.5 to 0.4). We also found no difference in the degree to which the review influenced the editorial decision (mean difference, −0.1; 95% CI, −0.3 to 0.3). Masking was often unsuccessful (overall, 68% successfully masked; 95% CI, 58%-77%), although 1 journal had significantly better masking success than others (90% successfully masked; 95% CI, 73%-98%). Manuscripts by generally known authors were less likely to be successfully masked (odds ratio, 0.3; 95% CI, 0.1-0.8). When analysis was restricted to manuscripts that were successfully masked, review quality as assessed by editors and authors still did not differ.

Conclusions.— Masking reviewers to author identity as commonly practiced does not improve quality of reviews. Since manuscripts of well-known authors are more difficult to mask, and those manuscripts may be more likely to benefit from masking, the inability to mask reviewers to the identity of well-known authors may have contributed to the lack of effect.

IT HAS BEEN suggested that masking reviewers to author identity would improve the fairness and the quality of peer review1 because well-known authors' work may be reviewed less critically. Yet only a small fraction of journals routinely mask reviewers.2,3 When editors are asked why they do not mask, they cite an "overwhelming burden" associated with masking.2,3 Some question whether it is possible to mask successfully.2,4

One study, conducted at a single journal,5 demonstrated that the quality of masked reviews was statistically higher than that of unmasked reviews, although that difference was small. We tested the hypothesis that masking peer reviewers to author identity improves the quality of peer review at 5 biomedical journals. To increase the generalizability of our study, we used a masking procedure that is commonly practiced.

Methods

Journals

Five journals participated in the study: Annals of Emergency Medicine, Annals of Internal Medicine, JAMA, Obstetrics & Gynecology, and Ophthalmology. Only 1 of these journals, Annals of Emergency Medicine, routinely masks reviewers to author identity.

Manuscript Enrollment. Eligible manuscripts were submitted between November 1995 and March 1996 and met the following inclusion criteria: (1) the manuscript reported original research, including meta-analyses but excluding case reports or letters, (2) the manuscript was sent for external peer review, and (3) the authors did not object to having their manuscripts enrolled. Authors were notified that their manuscripts would be included in a study of the peer review process unless they declined, and that declining to have their manuscripts enrolled would not affect any editorial decisions regarding their manuscript. No authors objected.

Masking Procedure. Each journal followed a standardized masking procedure that involved removing author and institutional identity from the title page, running headers or footers, and acknowledgments of the manuscripts. Self-references in the text were not removed. In addition, the managing editor at Annals of Internal Medicine removed names and journal identification (but not titles or other reference information) from self-references in the text and reference section. Annals of Emergency Medicine also stated in their "Information for Authors" that authors should not include author names in the running heads.

Study Design. At each journal, the editor followed the journal's usual procedure in selecting manuscripts for review and identified 2 reviewers for each manuscript. Once reviewers had agreed to review the manuscript, 2 randomizations were performed. The first was weighted to assign 25% of manuscripts to usual practice. These manuscripts were reviewed according to the journal's usual practice (ie, all but those at Annals of Emergency Medicine were unmasked reviews). The manuscripts randomized to the intervention arm had 1 of 2 reviewers randomly selected to receive a manuscript from which the author identity and institution had been removed. The other reviewer received the manuscript with author and institution identified. Randomization was performed using random number tables. In all cases, both reviewers were sent a questionnaire along with a statement that the manuscript was part of a study of peer review (described herein). The reviewers returned the manuscripts, reviews, and questionnaire to the journal. Before sending the reviews to editors and authors, the managing editors at each journal removed from the reviews any information that would reveal whether the manuscript had been masked. The manuscript editor and the corresponding author rated the quality of each review, unaware of the group to which the review had been assigned.
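The two-stage allocation described above can be sketched in a few lines. This is only an illustration: the study used random number tables, whereas the sketch substitutes a seeded pseudorandom generator, and the function name `randomize_manuscript`, the dictionary keys, and the seed are all hypothetical.

```python
import random

def randomize_manuscript(rng, usual_practice_share=0.25):
    """Two-stage allocation for one manuscript with two agreed reviewers."""
    # Stage 1: weighted assignment, about 25% of manuscripts to usual practice.
    if rng.random() < usual_practice_share:
        return {"arm": "usual_practice", "masked_reviewer": None}
    # Stage 2: within the intervention arm, pick which of the 2 reviewers
    # receives the copy with author and institution removed.
    return {"arm": "intervention", "masked_reviewer": rng.choice([1, 2])}

rng = random.Random(1998)  # fixed seed only so the sketch is repeatable
allocations = [randomize_manuscript(rng) for _ in range(118)]
n_usual = sum(a["arm"] == "usual_practice" for a in allocations)
print(n_usual, len(allocations) - n_usual)
```

Because stage 1 is a weighted draw rather than a fixed quota, the realized split varies around 25%, which is consistent with the study's 26 of 118 manuscripts (22%) in the usual practice arm.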

Questionnaires. Questionnaires were completed by editors, authors, and reviewers. Because of a miscommunication with the managing editor, no authors at the Annals of Emergency Medicine were sent surveys. Editor and author surveys each included 4 specific ratings of review quality: how well the review addressed the clinical or research importance of the study, how well it identified the study's strengths and weaknesses, whether reviewers were courteous, and whether reviewers supplied evidence to support their statements. Authors and editors responded to these questions using a 5-point Likert scale for which a score of 5 represented the best quality. Authors and editors also were asked to provide an overall rating of review quality on a 5-point Likert scale. Editors were asked how much the review had influenced their decision. These questions were chosen to parallel those used in the prior study of masking.5

To determine masking success, reviewers in the masked group were asked whether they thought they could identify any of the authors or their institutions, and if so, to list the authors and institutions. Reviewers in both groups were asked whether they were familiar with the authors, their previous work, or the reviewed work.

Analysis. The primary outcome was the difference in quality as assessed by the editor and author between the masked and unmasked review for each manuscript. Because this analysis uses a comparison between the quality of a masked and an unmasked review for each manuscript, a paired t test was used. The sample size for this analysis was the total number of manuscripts randomized to the intervention for which masking status and editor's quality score were complete. A positive difference means that masked reviews were of better quality than unmasked reviews; a negative difference means that masked reviews were of lesser quality than unmasked reviews. The Wilcoxon matched-pairs signed rank test was used to verify the result of the paired t test. A difference of 0.5 or greater on the 5-point Likert scale was considered editorially important.
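As a rough illustration of the paired comparison, the sketch below computes the mean within-manuscript difference (masked minus unmasked) and an approximate 95% CI. Two assumptions should be stressed: the ratings are invented toy values, not study data, and the interval uses a normal critical value (about 1.96) in place of the t critical value that a paired t test would use. With the 77 pairs analyzed here the two differ little (the t value for 76 degrees of freedom is about 1.99), but with only 10 toy pairs the normal interval is too narrow.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def paired_difference_ci(masked, unmasked, level=0.95):
    """Mean of (masked - unmasked) with a normal-approximation CI."""
    diffs = [m - u for m, u in zip(masked, unmasked)]
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))       # standard error of the mean difference
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return d_bar, (d_bar - z * se, d_bar + z * se)

# Hypothetical overall quality ratings (1-5), one pair per manuscript.
masked = [4, 3, 5, 4, 2, 4, 3, 5, 4, 3]
unmasked = [4, 4, 4, 3, 3, 4, 3, 4, 4, 3]

d_bar, (low, high) = paired_difference_ci(masked, unmasked)
excludes_important_gain = high < 0.5  # does the CI rule out a gain of 0.5?
print(round(d_bar, 2), round(low, 2), round(high, 2), excludes_important_gain)
```

The final comparison mirrors the study's decision rule: masking is judged not to produce an editorially important improvement when the upper confidence limit falls below 0.5.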

The secondary outcome was masking success. We defined a review as successfully masked if the reviewer did not guess the author's identity or if the reviewer guessed incorrectly. Exact 95% confidence intervals (CIs) were calculated using the binomial distribution. This analysis did not require a pair of reviews per manuscript. Thus, the sample size for this analysis was all reviews that were randomized to be masked, either because of the usual practice at that journal or because that reviewer was randomized to receive the masking intervention. We analyzed masking success overall and by journal and tested for differences across the 5 journals using the χ² test with 4 degrees of freedom. We subsequently identified 1 journal with a significantly higher success rate by comparing each journal with all other journals using a χ² test with 1 degree of freedom. Author renown was determined by whether the randomly unmasked reviewer was familiar with the author, their prior work, or the current work. Logistic regression, with variance estimates adjusted for multiple observations per manuscript, was used to explore whether the higher success of masking at the Annals of Emergency Medicine could be explained by differences in author renown and to test whether author renown was associated with reduced masking success.
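An exact binomial interval of the kind described above is commonly computed with the Clopper-Pearson method; the text does not name the specific method, so treating it as Clopper-Pearson is an assumption. The sketch below finds the limits by bisection on the binomial distribution function. The count of 67 successes out of 99 reviews is also an assumption: it is consistent with the reported 68% of 99 evaluable reviews but is not stated in the text.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve_decreasing(f, target):
    """Bisection for f(p) = target on [0, 1], where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, level=0.95):
    """Clopper-Pearson interval for k successes in n trials."""
    alpha = 1 - level
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

low, high = exact_ci(67, 99)  # assumed count, see the note above
print(f"{67 / 99:.0%} (95% CI, {low:.0%}-{high:.0%})")
```

Unlike the familiar normal-approximation interval, this one is not symmetric around the observed proportion and never extends below 0% or above 100%, which matters for rates near the boundary such as the 90% success rate reported for one journal.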

Results

A total of 118 manuscripts were randomized, 26 to usual practice and 92 to the intervention. Of those randomized to the intervention, 77 (84%) of 92 had sufficient data to compare review quality based on the editor's judgment. Only 40 (54%) of 74 had sufficient data to compare review quality based on the author's judgment (the denominator for author judgment is smaller because no authors received surveys from Annals of Emergency Medicine). Of those reviewers randomized to receive masked manuscripts, 99 (93%) of 106 had sufficiently complete data to evaluate the success of masking reviewers to author identity.

Review Quality. Editors perceived no significant difference in quality between masked and unmasked reviews (mean difference, 0.1; 95% CI, −0.2 to 0.4). Results were similar when editors rated the degree to which the review influenced their decision (mean difference, −0.1; 95% CI, −0.3 to 0.3). Authors also perceived no overall difference in quality between masked and unmasked reviews (mean difference, −0.1; 95% CI, −0.5 to 0.4). Differences in quality did not vary substantially by journal (Table 1). When the analysis of review quality was restricted to pairs for which masking was successful, no difference in quality was found. There were also no significant quality differences between masked and unmasked reviews for the 4 specific components of quality (data not shown). All results were confirmed when tested using the nonparametric signed rank test.

Table 1.—Mean Difference in Quality Between Masked and Unmasked Reviews*

Masking Success. Success in masking reviewers to author identity was generally low (68%; 95% CI, 58%-77%) and fairly consistent across all but 1 participating journal, Annals of Emergency Medicine, which achieved a masking success rate of 90% (95% CI, 73%-98%; Table 2). This corresponds to odds of success 6.5 times that of the other journals (95% CI, 1.8-23.7). When this journal was excluded from the calculation of overall masking success, the rate dropped to 58% (95% CI, 45%-70%), and a χ² test found no significant differences among the remaining journals.

Table 2.—Success of Masking Reviewers to Author Identity*

Author Renown. Manuscripts by authors with whom the unmasked reviewer was familiar (n=43) were less likely to be successfully masked (53%) (that is, the masked reviewer was more likely to correctly guess author identity) than those of authors who were not known to the unmasked reviewer (79%; P=.008; odds ratio [OR], 0.3; 95% CI, 0.1-0.8).
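The odds ratios quoted in the Results can be checked approximately from the rounded percentages alone. The sketch below is only a back-of-envelope calculation: it computes crude odds ratios from the reported success rates and does not reproduce the logistic regression or its confidence intervals. The helper names `odds` and `odds_ratio` are illustrative.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

def odds_ratio(p_group, p_reference):
    """Crude odds ratio comparing two proportions."""
    return odds(p_group) / odds(p_reference)

# Masking success: authors familiar to the unmasked reviewer (53%)
# vs authors not known to that reviewer (79%).
renown_or = odds_ratio(0.53, 0.79)

# Masking success: Annals of Emergency Medicine (90%)
# vs the remaining journals combined (58%).
journal_or = odds_ratio(0.90, 0.58)

print(f"{renown_or:.1f} {journal_or:.1f}")
```

Run as written, this prints `0.3 6.5`, matching the reported OR of 0.3 for author renown and the 6.5-fold odds of masking success at Annals of Emergency Medicine.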

We also considered whether author renown explained the apparent discrepancy in masking success rates between the Annals of Emergency Medicine and the other participating journals. When an indicator variable for Annals of Emergency Medicine was used to predict masking success in a logistic model, the unadjusted OR was 5.9 (95% CI, 1.6-21.1). When a multivariate logistic model was used to adjust for author renown, the OR was reduced but remained significant (OR, 4.8; 95% CI, 1.2-19.3).

Comment

McNutt and colleagues5 found that reviews of masked manuscripts were of marginally higher quality than reviews of unmasked manuscripts (3.5 vs 3.1 on a 5-point scale). However, they recognized that this difference was small. We believed it was too small to justify the added time and cost of masking. We sought to determine whether the difference in quality due to masking, studied among several journals, might be larger. In this first multijournal study of peer review at biomedical journals, our 95% CI excluded an overall improvement in peer review quality of 0.5 or greater on a 5-point Likert scale, the difference we considered editorially significant.

Poor overall masking success, in combination with the observation that an author's renown is strongly associated with masking failure, is a possible explanation for this finding. Notably, the average masking success in our study was similar to that achieved by McNutt et al5 and Yankauer6 and was obtained using commonly used procedures that are generalizable to standard journal practices.

The participation of multiple journals, sufficient overall sample size, and a feasible method of masking author identity make it likely that these findings are valid for most biomedical journals. However, our study was limited to biomedical journals and our sample size was too small to eliminate the possibility of an editorially important difference in quality for individual journals. Although we achieved a good response rate for editor's quality ratings (84%), the rate for authors was low (54%). Thus, our conclusions for author evaluations may not generalize well. Additionally, if the major improvement in review quality provided by masking would be expected to occur for manuscripts of renowned authors, then we cannot exclude the possibility that increasing the rate of successfully masking renowned authors could improve review quality. Finally, reviewers were aware that they were participating in a study. The effect of such knowledge on the quality of review is unknown.

Our study did not directly address the question of whether masking improves fairness. However, if masking is frequently unsuccessful, especially among well-known authors, it is not likely to improve fairness, no matter how fairness might be defined. The only potential benefit to a policy of masking that is largely unsuccessful is the appearance of fairness.

We conclude that masking reviewers to author identity as commonly practiced does not improve review quality. Further, masking as commonly practiced fails to hide the identity of renowned authors and therefore may also fail to improve the fairness of review. Techniques to improve masking success are needed to determine whether masking the identity of renowned authors improves review quality or fairness.

References
1. Fletcher RH, Fletcher SW. Evidence for the effectiveness of peer review. Sci Eng Ethics. 1997;3:35-43.
2. Cleary JD, Alexander B. Blind versus nonblind review: survey of selected medical journals. Drug Intell Clin Pharm. 1988;22:601-602.
3. Pitkin RM. Blinded manuscript review: an idea whose time has come? Obstet Gynecol. 1995;85:781-782.
4. Moossy J, Moossy YR. Anonymous authors, anonymous referees: an editorial exploration. J Neuropathol Exp Neurol. 1985;44:225-228.
5. McNutt RA, Evans AT, Fletcher RH, Fletcher SW. The effects of blinding on the quality of peer review. JAMA. 1990;263:1371-1376.
6. Yankauer A. How blind is blind review? Am J Public Health. 1991;81:843-845.