Assessment of Trends in the Design, Accrual, and Completion of Trials Registered in ClinicalTrials.gov by Sponsor Type, 2000-2019

Key Points
Question: What are the characteristics and trends of clinical trials registered in ClinicalTrials.gov over time, and how do they differ by sponsor type?
Findings: In this cross-sectional study of ClinicalTrials.gov registration data on 245 999 interventional studies started between 2000 and 2019 that were sponsored by the National Institutes of Health or other US government agencies, industry, or other sources (foundations, universities, hospitals, clinics, and others), most trials were small, single-site studies that did not have US Food and Drug Administration–defined phases and were sponsored by other sources. Median sample sizes and years to trial completion decreased over time.
Meaning: The findings suggest that the composition and design of trials changed between 2000 and 2019 and differed substantially by sponsor type; increased funding toward larger randomized clinical trials may be warranted to inform clinical decision-making and guide future research.


Introduction
Since ClinicalTrials.gov was launched in 2000, more than 345 000 interventional and observational studies have been registered. [1][2][3] ClinicalTrials.gov is managed by the National Library of Medicine and is an online resource through which health care professionals, researchers, patients, and the general public can view and access clinical trials registration data. Analyzing clinical trials metadata can illuminate important trends over time, such as the composition, size, design, and types of trials being funded.
Clinical trials registration and reporting requirements have been updated several times since implementation of the US Food and Drug Administration (FDA) Modernization Act of 1997, which mandated clinical trials registration and led to the establishment of ClinicalTrials.gov. 4,5 In 2005, the International Committee of Medical Journal Editors (ICMJE) required registration of clinical trials as a prerequisite for publication. 6 Subsequently, the FDA Amendments Act (FDAAA 801) of 2007 expanded the requirements for the types of trials being registered, key data elements being entered, and basic results being reported. 7 The Final Rule became effective in January 2017, further clarifying and expanding on the registration and reporting requirements of FDAAA 801. 8 The changes include the types of trials subject to the requirements, the data elements that must be submitted at registration, and additional results reporting requirements for trials. 8 Simultaneously, the National Institutes of Health (NIH) issued a policy requiring registration and results reporting for all NIH-funded trials, regardless of whether the trials are covered by the FDAAA 801 requirements of the Final Rule. 8
Availability of the Clinical Trials Transformation Initiative Aggregate Analysis of ClinicalTrials.gov (CTTI AACT) database has facilitated and improved the ability to analyze ClinicalTrials.gov registration data. 9 In 2017, the CTTI AACT database was upgraded to a cloud-based platform that allows for open access to the complete set of trials registered in ClinicalTrials.gov for download and analysis. Its restructured and relational format facilitates analysis and provides access to additional fields that are not readily available in direct exports from ClinicalTrials.gov.
Previous reports of ClinicalTrials.gov registration data have focused analyses on specific funders, such as the NIH; on a single condition; or on a particular registration element within ClinicalTrials.gov. [10][11][12] To our knowledge, no studies have characterized trials by sponsor type during this 20-year time span. Thus, our objective was to assess the characteristics and trends of clinical trials started from January 1, 2000, through December 31, 2019, and to compare trends by sponsor type.

Variables of Interest
ClinicalTrials.gov registration fields, as coded in the CTTI AACT database, included the following: trial start and completion dates; study type (interventional or observational); overall status (completed, withdrawn, terminated, suspended, open to enrollment, recruiting, not yet recruiting, or status unknown); enrollment number; study phase (early phase 1, phase 1, phase 1 to 2, phase 2, phase 3, phase 4, and trials that do not have an FDA-defined phase [phase not applicable (NA)]); treatment assignment (randomized or not randomized); masking (open label or masked); facilities (single center or multicenter); posted results; and lead sponsor (NIH or other US government agency, industry, and all other sponsors). Lead sponsor is defined in ClinicalTrials.gov as the "organization or person who initiates the study and who has authority and control over the study." 1 This variable is not the same as funder type, which is derived from multiple data elements in ClinicalTrials.gov and is not available as a discrete field in the database download. Additional calculated variables included time to completion (calculated as the difference between actual completion date and start date for completed trials) and times to posted results. Anticipated and actual enrollment counts were also assessed by comparing target sample size provided at trial registration with the sample size provided on trial completion. A description of each variable as defined in ClinicalTrials.gov and used for the purpose of this article is available in eTable 1 in the Supplement.
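The derived time-to-completion variable described above can be illustrated with a minimal sketch. The function name, the status string, and the 365.25-day year are assumptions made for illustration; they are not taken from the eAppendix code.

```python
from datetime import date
from typing import Optional

# Assumption: years are approximated as 365.25 days; the source does not
# state which day-to-year conversion was used.
DAYS_PER_YEAR = 365.25


def years_to_completion(
    start: date, completion: Optional[date], overall_status: str
) -> Optional[float]:
    """Years from start date to actual completion date for a completed trial.

    Returns None when the trial is not completed, when the completion date
    is missing, or when the dates are implausible (completion before start).
    """
    if overall_status.strip().lower() != "completed" or completion is None:
        return None
    if completion < start:
        return None
    return (completion - start).days / DAYS_PER_YEAR


# Hypothetical trial record, not drawn from the study data.
example = years_to_completion(date(2012, 3, 1), date(2014, 3, 1), "Completed")
not_applicable = years_to_completion(date(2018, 1, 1), None, "Recruiting")
```

Trials that are still recruiting or that lack a completion date yield no value in this sketch, which mirrors the restriction of this variable to completed trials.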

Statistical Analysis
Results were grouped by lead sponsor and start date in 5-year periods: 2000 to 2004, 2005 to 2009, 2010 to 2014, and 2015 to 2019. These year groupings align with changes in the registration and reporting regulations, including the launch of ClinicalTrials.gov, the ICMJE edict, and implementation of FDAAA 801. Trial start dates were used to classify periods because registration dates can be entered retrospectively and thus are more likely to be inaccurate or lead to time misclassification.
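The period grouping can be expressed as a simple lookup on start year. The following is a minimal sketch; the function name and the label format are hypothetical, and only the four period boundaries come from the text.

```python
from typing import Optional

# The four 5-year start-date periods named in the text.
PERIODS = [(2000, 2004), (2005, 2009), (2010, 2014), (2015, 2019)]


def start_period(start_year: int) -> Optional[str]:
    """Map a trial start year to its 5-year period label.

    Returns None for start years outside 2000 through 2019, which fall
    outside the study window.
    """
    for first, last in PERIODS:
        if first <= start_year <= last:
            return f"{first}-{last}"
    return None


labels = [start_period(year) for year in (2000, 2004, 2005, 2017, 2020)]
```

Classifying on start year rather than registration year follows the rationale given above: registration dates can be entered retrospectively.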
Multivariable regression models were fitted to evaluate the association between sample size and sponsor type and were adjusted for start year and other trial design characteristics. An interaction term between start year and lead sponsor was included in the model; a significant interaction would indicate that the association between start year and sample size differed by sponsor type. Anticipated and actual sample sizes were compared across sponsor types for trials started and completed between 2010 and 2019 using the available CTTI AACT archived databases for each year. Median times to completion were calculated from start date to actual completion date for completed trials. A 2-sided P < .05 was considered to be statistically significant. All tabulations and analyses were duplicated (A.G.G. and J.L.M.) using PostgreSQL and SAS. The PostgreSQL code used to generate tables is available in the eAppendix in the Supplement. Additional analyses were performed using Stata, version 15 (StataCorp LLC).
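The grouped summary of times to completion (median and IQR by lead sponsor and period) could be sketched as follows. This is an illustration under stated assumptions, not the authors' PostgreSQL or SAS code: the record layout and sample values are hypothetical, and the "inclusive" quartile method is one of several conventions, so quartiles may differ slightly from those produced by other software.

```python
from collections import defaultdict
from statistics import median, quantiles
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (lead sponsor, start period, years to completion).
Record = Tuple[str, str, float]


def summarize(trials: Iterable[Record]) -> Dict[Tuple[str, str], Tuple[float, float, float]]:
    """Return (median, Q1, Q3) of years to completion per sponsor and period."""
    groups: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for sponsor, period, years in trials:
        groups[(sponsor, period)].append(years)

    summary = {}
    for key, values in groups.items():
        if len(values) >= 2:
            # Assumption: inclusive quartile method; other tools may differ.
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            q1 = q3 = values[0]  # a single trial has no spread
        summary[key] = (median(values), q1, q3)
    return summary


# Made-up values for illustration only.
sample = [
    ("Industry", "2010-2014", 1.0),
    ("Industry", "2010-2014", 2.0),
    ("Industry", "2010-2014", 3.0),
    ("Industry", "2010-2014", 4.0),
    ("Industry", "2010-2014", 5.0),
    ("Other", "2010-2014", 2.5),
]
result = summarize(sample)
```

Medians and IQRs are used here rather than means because completion times and sample sizes are typically right-skewed.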

Results
There were 325 860 registrations on ClinicalTrials.gov as of January 1, 2020, of which 245 999 were clinical trials (interventional studies) started between 2000 and 2019; 135 144 trials (54.9%) were completed (Figure). Overall, there were 8023 NIH- or US government-sponsored trials (3.3%), 70 329 industry-sponsored trials (28.5%), and 167 647 trials sponsored by other funding sources (68.1%). Among the NIH- and US government-sponsored trials, 63.7% were completed, 11.4% were incomplete, 20.2% were active, and 4.6% had unknown status (Table 1). Industry-sponsored trials had the highest percentage of completed trials (69.2%) and the lowest percentage of active trials (16.8%), whereas trials sponsored by other sources had the lowest completion rates (48.5%) and the highest percentage of active trials (29.8%), including trials that were not yet recruiting, were recruiting, were enrolling by invitation, or were active and not recruiting. The number of NIH- and US government-sponsored trials started each year decreased over time, in contrast to the number of trials started per year that were sponsored by industry and other funding sources, which increased over time (eTable 2 in the Supplement).
Design characteristics of completed trials ordered by lead sponsors and start year are given in Table 2. Median sample sizes for completed trials by sponsor and phase over time are given in Table 3. Table 4 shows median times to completion and IQRs for completed trials by sponsor type and start year, which decreased over time (eFigure 2 in the Supplement). For example, median years to completion for phase 3 to 4 trials sponsored by the NIH and other US government agencies

Discussion
ClinicalTrials.gov is an important resource that can be used to characterize the state and nature of trials. We describe trends and characteristics of 245 999 trials that were registered in ClinicalTrials.gov and started between 2000 and 2019. We found that trials had smaller sample sizes and were being completed in less time and that most trials were sponsored by other sources (foundations, universities, hospitals, clinics, and others) from 2000 to 2019.

Observed Trends in Clinical Trial Characteristics Over Time
The   centers involved over time. The percentages of phase 3 to 4 trials and drug trials decreased over time, whereas the percentages of nondrug trials and trials without an FDA-defined phase increased. The difference in rates between early-phase trials (phase 1 to 2) and phase 3 trials decreased by almost half by the end of 2019. This shift may be explained by the increased uptake in registration for these trials, expansion of the clinical trial definition, or increasing interest in other intervention types (eg, behavioral interventions, imaging, biologics, and devices) in recent years. [15][16][17] The decreasing percentages of trials that involved drugs may also be associated with increasing costs and complexity of conducting phase 3 to 4 drug trials.

JAMA Network Open | Statistics and Research Methods
Overall, trial sample sizes decreased over time, and trials took less time to complete. Median times to trial completion varied by sponsor type and phase. Industry-sponsored trials were completed faster than trials sponsored by the NIH and US government or by other funders, possibly in association with more efficient trial startup processes and higher recruitment rates. Reasons for this trend may include changes in the types of outcomes being used (eg, surrogate outcomes and biomarkers as well as patient-reported outcomes), increasing trial-associated costs, and greater budget constraints. With an overall median sample size of 60 persons per trial, the ability to generate meaningful, reproducible differences remains questionable. 10,18 Reports from almost 10 years ago reached similar conclusions, without any evidence of change or improvement since. 10,12,19 The originally planned sample sizes were often not met; the actual sample size when the trial was completed was often smaller than planned. 20 Reasons for not achieving the planned sample size, other than meeting the scientific goals of the trial, include recruitment and retention difficulties, business decisions, and unavailability or discontinuation of funding. [21][22][23] At time of analysis, there were 21 455 trials started in 2019 and registered in ClinicalTrials.gov. If we assume that registrations in ClinicalTrials.gov account for 70% of all registered trials (implying about 30 650 trials overall) and that the median cost (direct and indirect) per trial from start to finish is $1 000 000 at a minimum, the total cost for trials in 2019 would be approximately $31 billion. The median cost for comparative efficacy trials (phase 3-4) is closer to $19 million per trial, with larger trials ranging up to $53 million per trial. 24,25 Thus, increased funding for larger randomized clinical trials may be warranted to inform clinical decision-making and answer important clinical and health policy questions.
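The back-of-the-envelope cost estimate above can be reproduced with a few lines of arithmetic. The three inputs are those stated in the text; the 70% share and the $1 000 000 median cost are the article's assumptions, and the function and variable names are hypothetical.

```python
# Inputs as stated in the text.
REGISTERED_2019 = 21_455          # trials started in 2019 and registered in ClinicalTrials.gov
ASSUMED_SHARE = 0.70              # assumed share of all registered trials captured by ClinicalTrials.gov
ASSUMED_COST_USD = 1_000_000      # assumed minimum median cost per trial, direct and indirect


def estimated_total_cost(registered: int, share: float, cost_per_trial: float) -> float:
    """Scale registered trials up to all trials, then multiply by cost per trial."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    all_trials = registered / share
    return all_trials * cost_per_trial


total_usd = estimated_total_cost(REGISTERED_2019, ASSUMED_SHARE, ASSUMED_COST_USD)
total_billions = round(total_usd / 1e9)  # rounds to about 31 (billion US dollars)
```

The estimate is linear in both assumptions, so a different share or per-trial cost rescales the total proportionally.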

Opportunities for Improvement
The findings suggest that registration and reporting systems could be further improved. There appears to be a need for ClinicalTrials.gov to modify its registration system to accommodate the broader range of trials being conducted and the collaborative arrangements involved. For instance, there is currently no explicit data element for funding source in ClinicalTrials.gov. Thus, the lead sponsor variable was used to estimate trends by the different agencies and organizations as classified in ClinicalTrials.gov. The term lead sponsor refers to the primary organization that oversees study implementation and is responsible for conducting data analysis and is further used to determine the primary funding source for the study. 1 The impetus for ClinicalTrials.gov was the FDA. That history is reflected in the emphasis on drug trials, US funding, and the intermixing of sponsor as funder and as holder of Investigational New Drug applications. Because trials are largely collaborative and can involve several funders, there is potential for misclassification, underreporting, or overreporting of estimates of funding sources. Although methods have been described to estimate the probable funding source from ClinicalTrials.gov, this information cannot be easily exported or analyzed using the publicly available data set or derived without extensive data manipulation and assumptions. 10 Thus, an
Additional updates that may be beneficial to the analysis of ClinicalTrials.gov registration data include further subdivisions for trial phases beyond the FDA-defined phases to reduce the number of trials lacking an FDA-defined phase, expansion of the trial typology as listed in ClinicalTrials.gov to better capture the different types of trial designs, discrete elements for the outcome types (eg, time to event, surrogate, composite, and patient reported) and for specific time points for the primary outcomes for analysis purposes, and reduction of free-text entries to prevent formatting issues when analyzing the database. This approach would allow for future comparisons of specific outcome types and adjustment of analyses by outcome duration. Improvements to the system for tracking publications related to the trial are needed, including a separate field for the publication associated with the primary outcome, to determine the fraction of trials registered that are published.

Publication is the sine qua non of trials, but only a fraction of completed trials are published. Thus, continued efforts to enforce the timely and complete reporting of results are important to reduce the reporting biases associated with delayed publication or failure to publish. [26][27][28][29][30]

Limitations
This study has limitations. The analysis was limited to the available data as registered in standards, it is difficult to obtain a single comprehensive evaluation of all registered trials. It also remains unclear how many duplicate registrations may exist, especially for non-US studies that may be registered in both ClinicalTrials.gov and a second or third registry. 31 Thus, whether trials registered in ClinicalTrials.gov are representative of trials registered elsewhere may remain unknown until there is a system for merging registries into one or for establishing a universal and standardized data system for harvesting trial information from all registries through the World Health Organization International Clinical Trials Registry Platform.

Conclusions
Even with its limitations, ClinicalTrials.gov registration provides valuable insights into the massive clinical trials research enterprise. The findings suggest that the composition and design of trials changed over time and differed substantially by sponsor type. Increased funding toward larger randomized clinical trials may be warranted to inform clinical decision-making and guide future research.