Flow diagram of included studies.
Langan S, Schmitt J, Coenraads P, Svensson Å, von Elm E, Williams H, for the European Dermato-Epidemiology Network (EDEN). The Reporting of Observational Research Studies in Dermatology Journals: A Literature-Based Study. Arch Dermatol. 2010;146(5):534-541. doi:10.1001/archdermatol.2010.87
Michael Bigby, MD; Olivier Chosidow, MD, PhD; Robert P. Dellavalle, MD, PhD, MSPH; Daihung Do, MD; Urbà González, MD, PhD; Catalin M. Popescu, MD, PhD; Hywel Williams, MSc, PhD, FRCP
To assess the quality of reporting in observational studies in dermatology.
Five dermatology journals—the Archives of Dermatology, the British Journal of Dermatology, the Journal of the American Academy of Dermatology, the Journal of Investigative Dermatology, and Acta Dermato-Venereologica.
Cohort, case-control, and cross-sectional studies published as original articles during the period January 2005 through December 2007. Studies were identified with a literature search of PubMed combining the journal title and the term epidemiological studies (free text) and by hand searching all of the issues of each journal to identify relevant articles.
All articles were extracted by 2 reviewers independently using standardized checklists based on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) recommendations.
The number and proportion of reported STROBE items were analyzed for each article. The proportion of studies with good reporting for each item was also assessed.
A total of 138 articles were included and analyzed. Reporting quality was very mixed. Key areas that were infrequently reported included sample size calculations (n = 10 [7%]), missing data (n = 8 [6%]), losses to follow-up (n = 17 [12%]), and statistical methods (n = 19 [14%]). Only 13 studies (9%) explained the role of funders in the research. The quality of reporting was similar across study designs for “critical” questions with the exception of reporting of participant details, which was better reported in cohort studies (96%) compared with cross-sectional (80%) and case-control (70%) studies.
It is difficult to judge the quality of dermatological research unless it is reported well. This study has identified a clear need to improve the quality of reporting of observational studies in dermatology. We recommend that dermatology journals adopt the STROBE criteria.
Interpretation of data from observational studies is often limited by poor quality of reporting. Poor reporting limits the assessment of a study's strengths and weaknesses and generalizability.1 It also limits the use of observational data for secondary analyses. The clinical and scientific utility of research data may be lost in poorly reported studies.
The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) criteria have recently been developed to try to improve the quality of reporting of observational studies in medical research.2 The STROBE checklist aims to assess and improve the reporting of epidemiological studies, but it is not a measure of the quality of the research itself. Poor reporting may also constitute an ethical problem if the necessary information is collected but is not described in research articles, thus reducing its usefulness. Similar initiatives for improving the quality of reporting of particular study types have already shown benefits, for example in the reporting of randomized controlled trials using the Consolidated Standards of Reporting Trials (CONSORT) statement.3
From our collective experience in performing systematic reviews within the field of dermatology, we hypothesized that the quality of reporting of observational studies in dermatology was “hit and miss,” with many published articles failing to mention key aspects that would allow the reader to judge the validity of the study findings and conclusions. We set out to systematically assess the quality of reporting of observational studies in dermatology using the STROBE checklist as a guideline and to highlight specific areas that could be improved.
A study protocol was devised prior to the study, and all methods were defined and piloted a priori.
We selected the 4 journals with the highest impact factor (IF) in the category “Dermatology” in the Journal Citation Report 20074 that have a section for the reporting of epidemiological studies, namely, the Archives of Dermatology (IF, 2.84), the British Journal of Dermatology (IF, 3.50), the Journal of the American Academy of Dermatology (IF, 2.90), and the Journal of Investigative Dermatology (IF, 4.83). Acta Dermato-Venereologica (IF, 1.93) was also included because this journal publishes a high proportion of epidemiological studies. The journal articles were reviewed over a 3-year period, from January 2005 through December 2007.
We included epidemiological studies that corresponded to a relevant STROBE checklist, namely, cohort, case-control, and cross-sectional studies published during the study period. Studies that were labeled as a randomized controlled trial (RCT) or laboratory research that were in fact observational epidemiological studies were included in this study under the appropriate category.
Epidemiological studies other than cohort, case-control, or cross-sectional studies such as ecological studies were excluded. Studies in abstract format only were also excluded. Genetic epidemiological studies were not included because this study subtype has its own reporting guideline (Strengthening the Reporting of Genetic Associations [STREGA]).5
Studies were identified with a literature search of PubMed combining the journal title and the term epidemiological studies (free text) and by hand searching all of the issues of each journal to identify relevant articles. This process was carried out independently by 2 reviewers (S.L. and J.S.), and disagreements were resolved by discussion and arbitration by a third person (H.W.) if needed. Articles were then numbered consecutively and randomly allocated to each reviewer using a random numbers table.
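The allocation step can be illustrated with a minimal Python sketch. It is not the procedure the authors ran: a seeded pseudo-random generator stands in for the printed random numbers table, and the function name, seed, and reviewer labels are hypothetical. The count of 138 articles is used only as a convenient example.

```python
import random

def allocate_articles(article_ids, reviewers, seed=2007):
    """Randomly deal consecutively numbered articles to reviewers.

    Assumption: a seeded pseudo-random shuffle replaces the random
    numbers table described in the text.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(article_ids)
    rng.shuffle(shuffled)
    allocation = {reviewer: [] for reviewer in reviewers}
    # Deal the shuffled articles in turn so group sizes differ by at most one.
    for position, article_id in enumerate(shuffled):
        reviewer = reviewers[position % len(reviewers)]
        allocation[reviewer].append(article_id)
    return allocation

article_ids = range(1, 139)      # articles numbered consecutively, 1 to 138
groups = ["pair_A", "pair_B"]    # hypothetical labels for the reviewer groups
allocation = allocate_articles(article_ids, groups)
```

Dealing the shuffled list in turn keeps the workload balanced while leaving the assignment of any single article to chance.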
To determine whether further clarification was needed to define the questions when scoring items from the STROBE checklist, 3 articles representing each main study design underwent a pilot data abstraction by all reviewers (S.L., J.S., P.-J.C., and Å.S.) and by the European Dermato-Epidemiology Network (EDEN) steering group (10 members including S.L., J.S., P.-J.C., and Å.S.). Disagreements were resolved by discussion with all reviewers. Following the pilot phase, all articles were extracted by 2 reviewers independently. Disagreements were resolved between pairs (S.L. and J.S.; Å.S. and P.-J.C.), and an independent arbitrator (H.W.) was involved for persisting differences.
Relevant data from included articles were summarized in an Excel spreadsheet (Microsoft Corp, Redmond, Washington) containing information on the number and proportion of the items in the checklist that were reported against the STROBE checklist.
The STROBE checklist of 22 items was operationalized into a series of questions for each study design, each of which could be answered “yes,” “partly,” “no,” “unclear,” or “not applicable.” One author (E.v.E.) was also involved in the development of the STROBE checklist. This helped ensure that the operationalization, and our research in general, was consistent with the aims of the STROBE initiative. The checklists were piloted by all investigators on the 3 selected study types as previously described. Each investigator independently submitted their analysis to their paired author. A panel discussion by e-mail followed to resolve the differences in analyses and to obtain a consensus on how the responses were defined by individual investigators.
A glossary of rules was developed throughout the study, which will help inform the STROBE initiative and other researchers. Briefly, a number of key decisions were made to translate the STROBE reporting items into data that could be assessed reliably.
The number and proportion of reported items (“yes” responses) and not reported items (all responses except “yes” or “not applicable”) were analyzed for each study. We then examined the median and range of reported items by study design. The proportion of studies with good reporting for each item was also assessed. Items that were not applicable were excluded from the analysis. A sensitivity analysis was carried out, in which “partly” responses were analyzed as “yes” responses to assess the effect on study conclusions.
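The scoring rules in this paragraph can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' spreadsheet: the function names and the example responses are hypothetical, and only the rules themselves come from the text ("yes" counts as reported, "not applicable" items are excluded, and the sensitivity analysis recodes "partly" as "yes").

```python
from statistics import median

def reported_proportion(responses, partly_counts_as_yes=False):
    """Return (n_reported, n_applicable, proportion) for one article."""
    # Items that were not applicable are excluded from the denominator.
    applicable = [r for r in responses if r != "not applicable"]
    # Primary analysis: only "yes" counts. Sensitivity analysis: "partly" too.
    reported_values = {"yes", "partly"} if partly_counts_as_yes else {"yes"}
    n_reported = sum(1 for r in applicable if r in reported_values)
    proportion = n_reported / len(applicable) if applicable else None
    return n_reported, len(applicable), proportion

def median_and_range(counts):
    """Summarize reported-item counts across articles of one study design."""
    return median(counts), min(counts), max(counts)

# Hypothetical responses for a single article (not data from the study).
example = ["yes", "yes", "partly", "no", "unclear",
           "not applicable", "yes", "partly"]
primary = reported_proportion(example)
sensitivity = reported_proportion(example, partly_counts_as_yes=True)

# Hypothetical per-article counts for one study design.
summary = median_and_range([7, 13, 21])
```

In this made-up example the article has 3 of 7 applicable items reported in the primary analysis and 5 of 7 once "partly" is recoded, which shows how the sensitivity analysis can shift the apparent completeness of reporting.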
To inform future studies, we believed it would be useful to identify a priori which questions within the STROBE checklist should be critical to complete when reporting observational studies. To achieve this, each author independently identified what they perceived to be 10 essential questions, and responses were then pooled to determine a critical list of items (Table 1). An item was considered essential if at least 4 of 5 reviewers (S.L., J.S., P.-J.C., Å.S., and H.W.) agreed on the item for inclusion, and these are given in Table 2. Of the 22 items in the STROBE checklist, 4 were considered essential by the panel of authors prior to completion of the study. The items were as follows: question 1 (title and abstract), question 6 (detail of participants), question 7 (study variables), and question 12 (statistical methods). Items that none of the reviewers considered essential were question 11 (handling of quantitative variables) and question 17 (other analyses).
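The consensus rule described above (an item is essential when at least 4 of 5 reviewers select it) can be sketched in Python. The individual votes below are invented for illustration, since the paper reports only the outcome. They were chosen so that the result matches the reported essential items (1, 6, 7, and 12) and the two items no reviewer selected (11 and 17). The reviewer labels and function name are hypothetical.

```python
from collections import Counter

def essential_items(selections, threshold=4):
    """Return the items chosen by at least `threshold` reviewers."""
    # set() guards against a reviewer listing the same item twice.
    votes = Counter(item for chosen in selections.values() for item in set(chosen))
    return sorted(item for item, n in votes.items() if n >= threshold)

# Hypothetical votes: each reviewer lists 10 of the 22 STROBE items.
selections = {
    "reviewer_1": [1, 2, 4, 5, 6, 7, 12, 13, 16, 19],
    "reviewer_2": [1, 3, 5, 6, 7, 8, 12, 14, 16, 20],
    "reviewer_3": [1, 2, 6, 7, 9, 12, 13, 15, 19, 21],
    "reviewer_4": [1, 4, 6, 7, 8, 10, 12, 14, 18, 22],
    "reviewer_5": [1, 3, 5, 6, 9, 12, 15, 16, 20, 21],
}

essential = essential_items(selections)

# Items that no reviewer selected.
selected_by_anyone = {item for chosen in selections.values() for item in chosen}
never_selected = sorted(set(range(1, 23)) - selected_by_anyone)
```

With a threshold of 4 of 5, an item survives one dissenting reviewer but not two, which is why only a small core of items emerges as essential.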
The search strategy identified 291 studies, of which 138 were relevant for this study (a list is provided in an online repository available at: http://eden.dermis.net/content/e02eden/e02projects/e88/index_ger.html). The Figure shows the flow diagram of included studies: 53 articles (38%) were cohort studies, 44 (32%) were case-control studies, and 41 (30%) were cross-sectional studies.
The median (range) number of reported items per article was comparable across the different study designs. For cohort studies, the median number of reported items per article was 13 (7-21), and similarly, the median was 12 (7-17) for cross-sectional designs and 12 (4-18) for case-control studies.
Of the 138 studies, 137 (99%) reported key results in relation to study objectives (Table 2). Similarly, 134 studies (97%) reported the scientific background and rationale for carrying out the study. Most studies (n = 129 [93%]) provided a reasonable summary of the research in the abstract. The majority of studies reported the study objectives (n = 123 [89%]), design (n = 120 [87%]), and outcomes (n = 122 [88%]).
Sample size calculations were reported by only 10 (7%) of included studies. Other areas that were very poorly described relate to the issues of management of missing data (n = 8 [6%]) and losses to follow-up (n = 17 [12%]). In addition, infrequently addressed areas included the number of individuals at each stage of the study (n = 54 [39%]) and the statistical methods (n = 19 [14%]). Thirteen studies (9%) explained the role of funders in the research.
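As an arithmetic check, the percentages in this paragraph can be recomputed from the counts. The sketch below assumes that each percentage uses all 138 included articles as its denominator. That is an assumption: the methods exclude items that were not applicable, so the true denominators for some items may be smaller. The dictionary keys are shorthand labels, not the paper's wording.

```python
TOTAL_ARTICLES = 138  # assumed denominator for every item

def percent(count, total=TOTAL_ARTICLES):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)

# Counts as reported in the text.
reported_counts = {
    "sample size calculation": 10,
    "missing data": 8,
    "losses to follow-up": 17,
    "numbers at each stage": 54,
    "statistical methods": 19,
    "role of funders": 13,
}

recomputed = {item: percent(n) for item, n in reported_counts.items()}
```

Under this assumption the recomputed values are 7%, 6%, 12%, 39%, 14%, and 9%, which agree with the figures given in the text.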
Of those questions considered essential (Table 2), questions 1 (98%) and 6 (83%) were frequently reported, whereas questions 7 (53%) and 12 (14%) were infrequently described. Examining the critical items by study design, item 1 was adequately completed by at least 95% of studies in all categories. Item 6 was reported in 96% of cohort studies compared with 80% of cross-sectional studies and 70% of case-control studies. Completion rates were similar for items 7 (52%-54%) and 12 (10%-18%) across all study designs.
Recoding “partly” responses as “yes” responses altered the proportion of completely reported items as given in Table 2. Specifically, in relation to question 7 (study variables), this increased completion rates from 53% (n = 73) to 83% (n = 114). Similarly for question 14 (details of study participants), completion rate increased from 43% (n = 60) to 67% (n = 92). Significant increases were also observed for questions 16 (reporting of crude and confounder adjusted estimates), 19 (study limitations), 20 (cautious interpretation), and 21 (generalizability of the findings).
This study has highlighted areas where reporting of observational, epidemiological studies is good in the dermatology literature, along with other areas where improvements are required. In particular, there is a need for studies to improve their reporting of sample size calculations, statistical methods, and details of numbers and characteristics of participants. Sample size might be determined by limited time, resources, and eligible patients. We do not believe that researchers should be forced into post hoc sample size justifications, but they should say how they obtained the number of participants. Sample size estimates are useful to see what magnitude of effect the authors were looking for and also to see if there were problems recruiting the target sample owing to participant dropout.6 Missing data and losses to follow-up are important potential sources of information and attrition bias. Two other critical areas of weakness included the description of study variables and statistical methods, both of which were identified a priori by the authors as being critical items in the reporting of observational studies.
The quality of reporting of observational studies has been previously assessed in other fields of medical research7,8 but not, to our knowledge, in dermatology. We only identified studies assessing the quality of reporting of observational studies using the STROBE criteria in ophthalmology and sexual health. Both specialties have identified similar problems with incomplete reporting, with particularly poor reporting of management of missing data and confounding.9,10
Similar findings have also been shown in the dermatology literature in relation to the reporting of RCTs and the CONSORT guidelines.11-13 In 1985, Bigby et al13 reported (n = 61 trials) similar findings in relation to reporting of RCTs, with only 3% of studies reporting power and 76% addressing losses to follow-up. The situation for dermatology RCTs appears to have improved following the introduction of CONSORT guidelines.11,12,14-16
This is a novel study that has systematically highlighted important areas requiring improvement in the reporting of observational studies in dermatology. The results have direct relevance for readers and authors of epidemiological studies. The use of pilot methods to devise the study checklists, independent data extraction, and arbitration has helped improve the methodological rigor of our study. One key limitation is that interrater agreement could not be formally assessed, because new problems emerged as the study progressed and had to be resolved through discussion between authors and the creation of rules. A degree of subjectivity is reflected by the findings of the sensitivity analysis.
This study has highlighted important deficiencies in the reporting of observational studies in dermatology journals. On the basis of these findings, we believe it would be useful for authors if dermatology journals adopted the STROBE criteria. Authors of epidemiological studies might also start using the guidance regardless of which dermatology journals they submit their article to. This study has defined a number of key rules that will be useful for researchers. These include the conclusion that settings and locations should always be defined, even when readers are referred to previous publications. Another important conclusion was that for studies based on disease registries or databases, a number of the checklist items are not applicable, for example, the dates of recruitment, numbers eligible at each stage of the study, reasons for nonparticipation, or flow diagrams.
Full explanatory notes on the use of STROBE are on the STROBE Web site (http://www.strobe-statement.org [accessed August 26, 2009]), and a fuller list of other reporting guidelines is on the EQUATOR (Enhancing the Quality and Transparency of Health Research) Web site (http://www.equator-network.org/home/ [accessed August 26, 2009]). Use of STROBE is likely to lead to better quality reporting, as was found with the adoption of the CONSORT statement.12 Better reporting is essential to maintain an undistorted scientific record, which can be used for synthesis of existing evidence, clinical decision making, and health policy. Further research could then review the quality of reporting to assess improvement after this important change.
Correspondence: Sinéad Langan, PhD, Centre of Evidence-Based Dermatology, C Floor, South Block, Queen's Medical Centre, Nottingham NG 2UH, England (firstname.lastname@example.org).
Accepted for Publication: December 10, 2009.
Author Contributions: Dr Langan had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Langan, Schmitt, Svensson, von Elm, and Williams. Acquisition of data: Langan, Schmitt, and Coenraads. Analysis and interpretation of data: Langan, Schmitt, Coenraads, Svensson, and von Elm. Drafting of the manuscript: Langan, Schmitt, and Svensson. Critical revision of the manuscript for important intellectual content: Coenraads, Svensson, von Elm, and Williams. Statistical analysis: Langan and von Elm. Study supervision: Svensson and Williams.
Financial Disclosure: None reported.
EDEN Group Members: Jan-Nico Bouwes Bavinck, MD, PhD; Pieter-Jan Coenraads, MD, PhD, MPH; Thomas Diepgen, MD, PhD; Peter Elsner, MD, PhD; Ignacio García-Doval, MD, MSc, PhD; Jean Jacques Grob, MD, PhD; Sinéad Langan, MD, MSc, PhD; Luigi Naldi, MD; Tamar Nijsten, MD, PhD; Jochen Schmitt, MD, MPH; Åke Svensson, MD, PhD; and Hywel Williams, MSc, PhD, FRCP.
Additional Contributions: The EDEN steering group provided useful discussions and reviewed the study checklists.
Section Editor's Note: The STROBE instrument was reviewed in a Commentary published in the Evidence-Based Dermatology section in the September 2008 issue of the Archives of Dermatology (Nijsten T, Spuls P, Stern RS. STROBE: a beacon for observational studies. Arch Dermatol. 2008;144:1200-1204).