Progression of a mock trial defining data age, data collection period, and publication time. The black lines indicate an individual patient’s own timeline throughout the study.
The ends of the boxes indicate the upper and lower quartiles, so the box spans the interquartile range. The middle line indicates the median, the whiskers are the 2 lines outside the box that extend to the highest and lowest observations, and the circles indicate the extreme values of the observations.
eFigure 1. PRISMA Flowchart
eFigure 2. Distribution of Data Age for Randomized Trials Published in 6 High-Impact Journals in 2015
eFigure 3. Distribution of Enrollment Time for Randomized Trials Published in 6 High-Impact Journals in 2015
eFigure 4. Distribution of Publication Time for Randomized Trials Published in 6 High-Impact Journals in 2015
Welsh J, Lu Y, Dhruva SS, et al. Age of Data at the Time of Publication of Contemporary Clinical Trials. JAMA Netw Open. 2018;1(4):e181065. doi:10.1001/jamanetworkopen.2018.1065
What is the age of clinical trial data at the time of publication?
This cross-sectional analysis of clinical trials published in 2015 in 6 journals with a high impact factor found that by the time of publication, the median data age was nearly 3 years. The median publication time was more than 1.2 years, with 18.5% of trials taking 2 or more years to be published.
Collectively, these findings suggest opportunities to adjust various processes related to clinical trials to allow dissemination of the final results in a more timely manner.
As medical knowledge and clinical practice rapidly evolve over time, there is an imperative to publish results of clinical trials in a timely way and reduce unnecessary delays.
To characterize the age of clinical trial data at the time of publication in journals with a high impact factor and highlight the time from final data collection to publication.
Design and Setting
A cross-sectional analysis was conducted of all randomized clinical trials published from January 1 through December 31, 2015, in the Annals of Internal Medicine, BMJ, JAMA, JAMA Internal Medicine, Lancet, and New England Journal of Medicine. Multivariable linear regression analyses were conducted to assess whether data age (adjusted for follow-up duration) and publication time were associated with trial characteristics.
Main Outcomes and Measures
The outcome measures were the midpoint of data collection until publication (data age), the time from first participant enrollment to last participant enrollment (enrollment time), and the time from final data collection to publication (publication time).
There were 341 clinical trials published in 2015 by the 6 journals. For assessment of the primary end point, 37 trials (10.9%) had a follow-up period of less than 1 month, 172 trials (50.4%) had a follow-up period of 1 month to 1 year, and 132 trials (38.7%) had a follow-up period of more than 1 year. For all trials, the median data age at publication was 33.9 months (interquartile range, 23.5-46.3 months). Among trials with a follow-up period of 1 month or less, the median data age was 30.6 months (interquartile range, 18.6-39.0 months). A total of 68 trials (19.9%) required more than 4 years to complete enrollment. The median time from the completion of data collection to publication was 14.8 months (interquartile range, 7.4-22.2 months); publication time was 2 or more years in 63 trials (18.5%). In multivariable regression analyses adjusted for follow-up time, inconclusive or unfavorable trial results were significantly associated with older data age (>235 days). Compared with trials funded only by private industry, trials funded by government were associated with a significantly longer time to publication (>180 days).
Conclusions and Relevance
Clinical trials in journals with a high impact factor were published with a median data age of nearly 3 years. For a substantial proportion of studies, time for enrollment and time from completion of data collection to publication were quite long, indicating marked opportunities for improvement in clinical trials to reduce data age.
Clinical trials require time to generate and disseminate new knowledge. The time lag is subject to fixed constraints, such as the follow-up period for the primary end point, and modifiable factors, such as participant enrollment time and time to publication after completion of data collection. Furthermore, time to publication could be affected by the time needed for data entry, adjudication, cleaning, analysis and interpretation, manuscript preparation, peer reviews, and actual publication by the journal after acceptance. Some of these tasks could be conducted, in large part, in parallel with the conduct of the trial. As medical knowledge and clinical practice rapidly evolve,1 the faster the variable aspects of a trial are accomplished, the more relevant the results are to current practice.2,3 There is an imperative to publish clinical trial results in a timely way and reduce unnecessary delays.
Previous studies of published clinical trials showed that it took, on average, 2 years for these trials to be published after completion (ie, time to publication).4,5 Little is known about the overall age of data at publication, the contribution of the time to publication to the data age, or the time spent enrolling participants. Such information might identify opportunities to accelerate the timeliness of the clinical trial process and reporting. Accordingly, we sought to characterize data age, enrollment time, and publication time of all clinical trials published in 2015 by the medical journals with the highest impact factors.
We screened all original research articles published (either online or in print) from January 1 through December 31, 2015, in 6 general and internal medicine journals with high impact factors6: the New England Journal of Medicine (NEJM), Lancet, JAMA, BMJ, Annals of Internal Medicine, and JAMA Internal Medicine. We identified only trials with a randomized comparison of an intervention with a control and excluded those that did not represent a primary analysis or that did not report the time of starting enrollment or ending data collection in a specific month (eFigure 1 in the Supplement). Approval of this study was waived by the Yale University institutional review board as it did not meet the definition of human participants research. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.
All data elements were independently abstracted by 1 of us (J.W., L.B., C.O.Z., or L.M.) and were then checked for accuracy by a different investigator from our group. For 335 of the 341 trials (98.2%), the 2 abstractions were in agreement. All discrepancies were resolved through discussion with a third investigator (J.W., J.S.R., or H.M.K.).
The outcome measures of interest were the midpoint of data collection until publication (data age), the time from first participant enrollment to last participant enrollment (enrollment time), and the time from final data collection to publication (publication time). We defined the data collection period as the start of enrollment to the end of follow-up. Data age was defined this way to convey the mean age of the entire sample: the time since the first patient was enrolled could be viewed as the age of the data, as could the time since the last patient follow-up was collected, but neither marker captures the sample as well as the midpoint of these 2 time points (Figure 1). We extracted the start and end dates of enrollment, as well as the final data collection date, from the main text of each article. When those dates were not available, we checked the trial protocol or appendix and, if necessary, the trial's registration on ClinicalTrials.gov. For trials that provided only a month as a start or end date, we assumed that the start date was the first day of that month and that the end date was the last day of that month. Only 27 of 341 trials (7.9%) were missing the start or end date, and this assumption could have overestimated a trial's duration by at most 2 months. We extracted the publication date based on the e-publication ahead of print date when it was listed; otherwise, we used the print publication date.
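As a minimal sketch, the 3 time measures defined above can be computed from a trial's key dates. The dates, function names, and month-length convention below are illustrative assumptions, not taken from the study data.

```python
from datetime import date


def months_between(start: date, end: date) -> float:
    """Approximate months between two dates (average month = 30.44 days)."""
    return (end - start).days / 30.44


def trial_timelines(enrollment_start: date, last_enrollment: date,
                    final_data_collection: date, publication: date) -> dict:
    """Data age, enrollment time, and publication time as defined in the text.

    Data age runs from the midpoint of the data collection period
    (start of enrollment to end of follow-up) to publication.
    """
    midpoint = enrollment_start + (final_data_collection - enrollment_start) / 2
    return {
        "data_age_months": months_between(midpoint, publication),
        "enrollment_time_months": months_between(enrollment_start, last_enrollment),
        "publication_time_months": months_between(final_data_collection, publication),
    }


# Hypothetical trial: enrollment 2011 to mid-2012, follow-up ended the start
# of 2013, results published at the start of 2014
timelines = trial_timelines(date(2011, 1, 1), date(2012, 7, 1),
                            date(2013, 1, 1), date(2014, 1, 1))
```

For this hypothetical trial the midpoint of data collection falls at the start of 2012, giving a data age of about 24 months at publication even though publication time itself is only about 12 months, which is exactly why the midpoint conveys the sample's age better than either endpoint alone.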
Our independent variables included important trial features to characterize our sample and features that may be associated with data age at publication. Specifically, we collected data on the type of intervention (drug, device, or other), early study termination for any reason, number of patients enrolled, trial location (United States only, United States and outside of United States, or outside of United States only), number of trial centers, number of manuscript authors, author affiliation (government or private industry), funding source (government, nonprofit, private industry, or combination), whether the trial was registered on ClinicalTrials.gov or registered on another website, and whether results were posted on ClinicalTrials.gov. We also determined the favorability of the findings by designating trials with primary end point results that yielded statistically significant better outcomes for the treatment population compared with the control population as “favorable,” statistically significant worse outcome for the treatment population compared with the control population as “unfavorable,” and all others as “inconclusive.”
We calculated the median values and interquartile ranges (IQRs) for continuous variables because we presumed the distributions of these variables were not normal; we calculated the total counts and percentages for categorical variables. We reported the data age, enrollment time, and publication time, overall and by follow-up duration (<1 month, 1 month to 1 year, and >1 year). We conducted bivariate analysis to test whether data age, enrollment time, and publication time were associated with each of the following variables: type of intervention, early study termination for any reason, number of patients enrolled, trial location, number of trial centers, number of manuscript authors, author affiliation, funding source, favorability of the findings, trial registration on ClinicalTrials.gov, and results posted on ClinicalTrials.gov. To identify trial characteristics associated with data age, enrollment time, and publication time, we further developed multivariable linear regression models. We adjusted for variables that reached statistical significance at 2-sided P < .05 in bivariate analyses. Because longer follow-up duration was associated with older data age of the trials, we also adjusted for follow-up time in the model to account for the different follow-up times in these studies. In a post hoc analysis, we characterized the 10 studies with the longest data age and the 10 studies with the shortest data age. All data were analyzed with R, version 3.3.2 (The R Foundation for Statistical Computing).
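The descriptive and regression steps described above can be sketched as follows. The authors used R; this Python/NumPy version with made-up numbers only illustrates the structure of the analysis (medians with IQRs, and a linear model for data age adjusted for follow-up time), not the study's actual models or data.

```python
import numpy as np

# Made-up illustrative data: data age (days) for 6 hypothetical trials, with
# follow-up duration (days) and an indicator for inconclusive results
data_age = np.array([700.0, 900.0, 1100.0, 1300.0, 1500.0, 1700.0])
followup = np.array([30.0, 90.0, 180.0, 365.0, 540.0, 730.0])
inconclusive = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

# Median and interquartile range, as reported for continuous variables
median_age = np.median(data_age)
q1, q3 = np.percentile(data_age, [25, 75])

# Multivariable linear model: data_age ~ intercept + followup + inconclusive,
# fit by ordinary least squares; the inconclusive coefficient is the adjusted
# difference in data age, analogous to the adjustment for follow-up time
X = np.column_stack([np.ones_like(followup), followup, inconclusive])
coefficients, *_ = np.linalg.lstsq(X, data_age, rcond=None)
```

In the study itself the models also included the other covariates that were significant in bivariate analyses, and confidence intervals accompanied each coefficient; this sketch shows only the skeleton of that approach.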
Our search identified 979 original research articles, of which 566 were excluded because they were not randomized clinical trials, 23 because they did not represent a primary analysis, 46 because they had missing data on outcomes of interest, and 3 for other reasons; thus, our final analysis included 341 trials (eFigure 1 in the Supplement). The 341 trials assessed drugs (206 [60.4%]), devices (21 [6.2%]), and other interventions (114 [33.4%]) (Table 1). Among these trials, 37 (10.9%) had a follow-up period of less than 1 month, 172 (50.4%) had a follow-up period between 1 month and 1 year, and 132 (38.7%) had a follow-up period of more than 1 year. The median number of enrollees was 467 (IQR, 212-1260), the median number of trial centers was 23 (IQR, 6-62), and the median number of authors was 16 (IQR, 11-22).
The median data age (midpoint of data collection to publication) was 33.9 months (IQR, 23.5-46.3 months; range [minimum to maximum], 2.2-131.8 months). A total of 88 trials (25.8%) reported data age of less than 2 years; 31 trials (9.1%) reported data age of 5 years or more (Figure 2 and eFigure 2 in the Supplement). The median data age was 30.6 months (IQR, 18.6-39.0 months) for trials with a follow-up period of less than 1 month, 31.8 months (IQR, 21.0-41.7 months) for trials with a follow-up period of 1 month to 1 year, and 40.1 months (IQR, 30.3-51.9 months) for trials with a follow-up period of more than 1 year.
The median enrollment time was 26.2 months (IQR, 14.2-42.3 months; range, 0.3-141.1 months), and the median of the mean enrollment time per participant across trials was 1.4 days (IQR, 0.5-3.8 days; range, 0.0002-69.0 days). Sixty-four of 313 trials (20.4%) completed enrollment within 1 year, 251 of 313 trials (80.2%) completed enrollment within 4 years, and 68 trials (19.9%) required more than 4 years to complete enrollment. A total of 257 of 313 trials (82.1%) required fewer than 5 days per enrolled participant; 32 of 313 trials (10.2%) required 9 days or more (Figure 2 and eFigure 3 in the Supplement).
The median time to publication was 14.8 months (IQR, 7.4-22.2 months; range, 0.5-90.3 months) (Figure 2). A total of 138 trials (40.5%) were published within 1 year after completing the final data collection, and 63 trials (18.5%) were published more than 2 years after completing the final data collection (eFigure 4 in the Supplement). Overall, the time to publication accounted for a median of 43.2% (IQR, 26.8%-61.6%) of the data age.
In multivariable analyses, some factors were associated with older data age, adjusted for follow-up time (Table 2). Compared with favorable trials, inconclusive or unfavorable trials had a median data age that was 235 days longer (95% CI, 108-362 days). Each additional day of follow-up duration was also associated with an additional 0.6 days (95% CI, 0.5-0.8) of data age.
We also found several characteristics associated with significantly longer enrollment time (Table 3). Specifically, trials that had no authors affiliated with private industry were associated with a longer enrollment time (by 566 days; 95% CI, 306-827 days) than those with at least 1 author affiliated with industry. Compared with trials that had only government funding, trials that had funding from both private industry and government or from both private industry and nonprofit agencies (by 460 days; 95% CI, 152-768 days) or all 3 sources together (by 676 days; 95% CI, 322-1031 days) were also associated with a longer enrollment time. Compared with trials that were funded only by private industry, trials that were only government funded were associated with an additional 180 days (95% CI, 18-343 days) to publish (Table 4).
In a post hoc analysis, we characterized the 10 studies with the longest data age: 7 were drug trials, 6 were based outside the United States, and the time from end of follow-up to publication ranged from 7.8 to 91.5 months. In contrast, among the 10 studies with the shortest data age, 9 were drug trials, all had no or relatively short follow-up, the time from end of follow-up to publication ranged from 0.5 to 6 months, the trials involved larger numbers of trial centers, and the area of study was predominantly hepatitis C virus or Ebola.
In our review of clinical trials published in 2015 in 6 journals with high impact factors, we found that by the time of publication, the median data age was nearly 3 years and the median publication time was more than 1.2 years, with 63 trials (18.5%) taking 2 years or more to be published. For some trials, enrollment required as many as 9 days per participant. In multivariable analyses, inconclusive or unfavorable trial results (vs favorable results) were significantly associated with older data age after adjusting for follow-up time. Government-funded trials took approximately 6 months longer to reach publication than trials funded only by private industry. Collectively, these findings suggest opportunities to adjust various processes related to clinical trials to improve the timeliness of dissemination of the final results.
Our study extends the current literature in 2 important ways. First, previous studies have shown delays in publication, which we confirm with a comprehensive assessment, in addition to elucidating data age and enrollment time as important time markers of clinical trials.4,5 Our descriptive analysis of the data age, enrollment time, and time to publication of randomized trials in medical journals with the highest impact factors provides benchmarks and indicates leverage points to improve the timeliness of research dissemination. As medical knowledge rapidly evolves, an old data age and a long delay in publication time can result in the knowledge generated from trials being less relevant to contemporary clinical practice.7-9 In addition, there could be implications for research: researchers trying to apply the trial findings and advance the science will be delayed in adopting the new knowledge.
Our study also adds to the literature on which trial characteristics are associated with the time needed for each phase of a trial, from the start of enrollment to the dissemination of results to other investigators, clinicians, policy makers, and patients. We found that trials with more centers and more authors were significantly associated with shorter times to publication, but the effect sizes were small. Of greater importance, government funding was associated with substantially longer times to publication. Researchers funded by private companies may have shorter publication times because of greater incentives to produce and distribute findings compared with researchers funded by government grants or nonprofit foundations. Private funders may impose greater accountability on the clinical trial process to match the performance of industry; they may also provide more resources, better staffing, and larger infrastructure, and share knowledge of patented drugs, devices, or other interventions to improve timeliness. Many other factors may be responsible for the differences between trials funded by private industry and those funded by the government, including resources, the use of contract research organizations, and motivation. The ultimate goal is to identify best practices and spread them.

For some trials, particularly in the area of prevention, which depend on the accumulation of hard end points over time, a longer follow-up time is required, and it is inevitable that these trials will have older data age at the time of publication. However, our study shows that, aside from follow-up time, multiple areas contribute to older data age, including enrollment and publication times. Our findings reveal many opportunities in these areas where the clinical trial process can be accelerated and the time from data collection to publication (ie, the data age) can be shortened.
First, because there is variability in enrollment rates, investigators conducting trials with relatively slow enrollment (>9 days per participant) may need to consider innovative strategies. Enrollment time might be shortened by integrating the randomization process into clinical practice, such as by the use of already existing clinical registries.10 Examples include the SAFE-PCI for Women (Study of Access Site for Enhancement of PCI for Women) trial11 using the National Cardiovascular Research Infrastructure as the platform for randomization and data collection, and the ADAPTABLE (Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness) trial12 using the National Patient-Centered Clinical Research Network to support rapid and efficient randomization of patients. In this way, participants could be enrolled (and data generated) more quickly. Another approach might be preregistration of participants (ie, creating a pool of people who are amenable to enrollment in trials), so that such individuals are easier to identify and invite to participate. It may also be useful to make enrollment less reliant on clinicians and pursue more direct-to-participant strategies. The idea of participant-partnered research is growing and could provide opportunities to disrupt the current approaches.13,14
Another opportunity for improvement is publication time, which might be shortened by accelerating the aggregation and analysis of data, the writing of manuscripts, and the submission and revision processes before publication. Of the 6 journals that published the work examined in this study, only BMJ publicly provides this information, and it shows that the peer review process contributes a substantial proportion of the publication time. Some manuscripts may also be reviewed and rejected at other journals before they are accepted by a different journal, which could add delays. An implication of this study is that more of the manuscript could be prepared a priori, even before the final data are analyzed: authors can draft the Introduction and Methods before the final results are known, and discussions with journal editors can proceed before the calculation of results. In the end, there needs to be an imperative to report the results of a completed trial quickly and comprehensively. The recently published CANTOS (Canakinumab Anti-inflammatory Thrombosis Outcomes Study) trial demonstrated that even a complex trial can have a very short time from completion to publication (58 days).15
Our study has several limitations. First, this study, by design, focuses on trials from 6 general medicine journals with the highest impact factors in 1 year, not a complete sample of trials published across all general medicine journals. We selected these journals because of their prominence and because they publish the trials that are most likely to inform clinical practice, and thus they should be the best examples of how quickly trials are conducted and published. Because the general medical journals with high impact factors analyzed in this study are likely to publish more quickly than other general medical journals, we might expect longer delays if all general medicine journals were included. Second, we used the midpoint of data collection until publication as the definition of data age, but data might not be collected evenly over time. Therefore, our results may not precisely reflect the true data age, although we expect this difference to be small. Third, we do not have information about the duration of the peer review process except for 1 journal; therefore, we are unable to determine whether longer times to publication were caused by submission delays or by the time required for peer review, acceptance, and publication. Fourth, this study cannot determine the effect of older data age in some trials. However, practice is changing rapidly, and it could be that the clinical care patterns at the end of a trial were different from those at the beginning; such interactions of effect are rarely tested. Also, all things being equal, more recent data and a more quickly completed study are preferable. Therefore, data age does seem to be a relevant metric worthy of more attention and study. Finally, this study did not evaluate delays in the translation of trial findings into practice. Often, even well-done trials experience delays after publication. This issue, which was beyond the scope of our present article, also deserves attention, along with more timely knowledge generation in the course of trials.
Clinical trials in 6 journals with high impact factors were published with a median data age of nearly 3 years. For a substantial proportion of these trials, there were extended times for enrollment and publication that led to markedly older data at the time of publication. There are seemingly many opportunities for improvement in the clinical trial process and in the work of trialists with journal editors.
Accepted for Publication: May 11, 2018.
Published: August 10, 2018. doi:10.1001/jamanetworkopen.2018.1065
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2018 Welsh J et al. JAMA Network Open.
Corresponding Author: Harlan M. Krumholz, MD, SM, Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, One Church Street, Ste 200, New Haven, CT 06510 (email@example.com).
Author Contributions: Dr Krumholz had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Welsh, Bikdeli, Benchetrit, Mu, Krumholz.
Acquisition, analysis, or interpretation of data: Welsh, Lu, Dhruva, Bikdeli, Desai, Benchetrit, Zimmerman, Mu, Ross.
Drafting of the manuscript: Welsh, Benchetrit.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Welsh, Lu.
Administrative, technical, or material support: Welsh, Krumholz.
Supervision: Welsh, Desai, Krumholz.
Conflict of Interest Disclosures: Drs Desai, Ross, and Krumholz reported being recipients of a research agreement from Johnson & Johnson and Medtronic, through Yale University, to develop methods of clinical trial data sharing. Drs Ross and Krumholz reported receiving research support through a grant from the US Food and Drug Administration and Medtronic to develop methods for postmarket surveillance of medical devices. Dr Ross reported receiving research grant support from the Blue Cross Blue Shield Association. Dr Bikdeli reported receiving grants from the National Heart, Lung, and Blood Institute during the conduct of the study and serving as an expert (on behalf of the plaintiff) for litigation related to inferior vena caval filters; the content of this article is not directly related to that litigation. Dr Ross reported receiving grants from the US Food and Drug Administration, the Center for Medicare & Medicaid Services, Medtronic Inc, Johnson & Johnson, and Blue Cross Blue Shield Association outside the submitted work. Dr Krumholz reported receiving grants from Medtronic, Johnson & Johnson, and the US Food and Drug Administration; receiving contracts, through Yale, from the Centers for Medicare & Medicaid Services to develop performance measures that are publicly reported; receiving personal fees from UnitedHealthcare; receiving personal fees from IBM Watson Health; receiving personal fees from Element Science; receiving personal fees from Aetna; being the founder and owner of Hugo, a personal health information platform; serving as chair for a cardiac scientific advisory board for UnitedHealth; serving as a member of the Advisory Board for Element Science and the Physician Advisory Board for Aetna; and serving as a participant/participant representative of the IBM Watson Health Life Sciences Advisory Board. No other disclosures were reported.
Funding/Support: Dr Bikdeli is supported by grant T32 HL007854 from the National Heart, Lung, and Blood Institute, National Institutes of Health.
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Meeting Presentation: This article was presented at the Eighth International Congress on Peer Review and Scientific Publication; September 11, 2017; Chicago, Illinois.