July 31, 2020

Weighing the Benefits and Risks of Proliferating Observational Treatment Assessments: Observational Cacophony, Randomized Harmony

Author Affiliations
  • 1 Verily Life Sciences (Alphabet), South San Francisco, California
  • 2 Duke Clinical Research Institute, Durham, North Carolina
  • 3 Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, North Carolina
  • 4 Nuffield Department of Population Health, University of Oxford, Headington, Oxford, United Kingdom
JAMA. 2020;324(7):625-626. doi:10.1001/jama.2020.13319

Amid the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, substantial effort is being directed toward mining databases and publishing case series and reports that may provide insights into the epidemiology and clinical management of coronavirus disease 2019 (COVID-19). However, there is growing concern about whether attempts to infer causation about the benefits and risks of potential therapeutics from nonrandomized studies are providing insights that improve clinical knowledge and accelerate the search for needed answers, or whether these reports just add noise, confusion, and false confidence. Most of these studies include a caveat indicating that “randomized clinical trials are needed.” But disclaimers aside, does this approach help make the case for well-designed randomized clinical trials (RCTs) and accelerate their delivery?1 Or do observational studies reduce the likelihood of a properly designed trial being performed, thereby delaying the discovery of reliable truth?

The growth of structured registries and organization of claims and electronic health record data have greatly expedited sophisticated comparisons of therapies provided in clinical practice settings (ie, observational “real-world” evidence). Large troves of administrative and clinical data can be accessed and tabulated, using programs to construct propensity scores and inverse probability weighted estimates.

The benefits of this approach, if well done, are obvious: by sifting potential treatments and measuring outcomes and safety signals, qualified investigators and funding agencies can choose the most promising therapies for testing in rigorous RCTs. Sample sizes and expected event rates can be calculated, and communities and health care systems with relevant patient populations identified. The risks, however, are also clear: aggregating information about diagnosis, comorbidities, treatment, and outcomes can lend a patina of technical excellence that obscures the influence of systematic bias (patients who receive a given treatment are not the same as those who do not), leading to erroneous estimates of treatment effects. These risks are often unclear to the public when observational findings are widely disseminated by the lay media.
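The systematic bias described above can be illustrated with a small simulation. The following is a minimal sketch, not an analysis of any real data set: all probabilities are invented for illustration, the treatment is assumed to have no effect whatsoever, and the function name `mortality_difference` is hypothetical. Sicker patients are assumed to be more likely to receive the treatment in routine care, whereas in the randomized arm treatment is assigned by a coin flip.

```python
import random

# Illustrative assumptions only (none of these numbers come from the article):
P_SEVERE = 0.5            # share of patients with severe disease
P_TREAT_IF_SEVERE = 0.8   # routine care: sicker patients are treated more often
P_TREAT_IF_MILD = 0.2
P_DEATH_IF_SEVERE = 0.30  # mortality depends on severity only;
P_DEATH_IF_MILD = 0.05    # the treatment itself has NO effect


def mortality_difference(n, randomized, seed):
    """Return (treated mortality) - (untreated mortality) in a simulated cohort."""
    rng = random.Random(seed)
    counts = {True: 0, False: 0}
    deaths = {True: 0, False: 0}
    for _ in range(n):
        severe = rng.random() < P_SEVERE
        if randomized:
            # Coin-flip assignment: treatment is independent of severity.
            treated = rng.random() < 0.5
        else:
            # Observational setting: assignment depends on severity.
            p_treat = P_TREAT_IF_SEVERE if severe else P_TREAT_IF_MILD
            treated = rng.random() < p_treat
        died = rng.random() < (P_DEATH_IF_SEVERE if severe else P_DEATH_IF_MILD)
        counts[treated] += 1
        deaths[treated] += died
    return deaths[True] / counts[True] - deaths[False] / counts[False]


observational_diff = mortality_difference(200_000, randomized=False, seed=2020)
randomized_diff = mortality_difference(200_000, randomized=True, seed=2020)

print(f"Observational comparison, apparent risk difference: {observational_diff:+.3f}")
print(f"Randomized comparison, risk difference:             {randomized_diff:+.3f}")
```

Under these assumed probabilities, the expected mortality is 0.25 among treated patients and 0.10 among untreated patients in the observational cohort, so an inert treatment appears harmful simply because it was given preferentially to sicker patients. With randomization, both groups have the same expected mortality and the difference is close to zero. Statistical adjustment can remove such bias only to the extent that the factors driving treatment choice are measured.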

Anxious, frightened patients, as well as clinicians and health systems with a strong desire to prevent morbidity and mortality, are all susceptible to cognitive biases.2 Furthermore, profit motives in the medical products industry, academic hubris, interests related to increasing the valuation of data platforms, and revenue generated by billing for these products in care delivery can all tempt investigators to make claims their methods cannot fully support, and these claims often are taken up by traditional media and further amplified on social media. Politicians have been directly involved in discourse about treatments they assert are effective. The natural desire of all elements of society to find effective therapies can obscure the difference between a proven fact and an exaggerated guess. Nefarious motives are not necessary for these problems to occur.

The role of regulators in this context is crucial. In the United States, the 21st Century Cures Act and user fee agreements require industry, academia, and regulators to advance the use of data and evidence from clinical settings.3 This legislation directed the US Food and Drug Administration (FDA) and the National Institutes of Health (NIH) to work with the clinical research ecosystem to develop robust methods for generating such evidence and clear guidance for applying it. Historically, the FDA has insisted on high-quality evidence as a condition for granting marketing approval for drugs and devices, and for specific marketing claims.

Considerable progress has been made in defining appropriate methods for improving the quality of observational treatment comparisons. Both NIH- and FDA-funded work fosters transparency by publishing study protocols, reporting results, and ensuring methodological rigor in this treacherous field. Methods for ensuring data quality are also evolving rapidly. In the context of COVID-19, the FDA has worked through the Evidence Accelerator to advance observational research methods and characterize quality and bias of newly available data sets.

This approach addresses valid concerns about veracity and data quality in observational research. However, this approach also should accelerate and prioritize the development and delivery of RCTs, not be viewed as a substitute for them. In fact, the most important data and evidence will accrue from applying randomized designs (individual, cluster, adaptive) within the context of data from clinical practice settings.4 The exigencies of the pandemic have created an understandable temptation to rush toward therapeutic options without the usual rigor, but the conclusions of reports must include appropriate caveats about the degree of uncertainty. Care must be taken to eschew “pandemic exceptionalism”5 to produce reliable evidence to guide intervention.

Academic leaders and clinicians also have critical responsibilities. The pressure to issue newsworthy pronouncements often fuels communications efforts by universities and companies that can promote unwarranted expectations in an era of social media “virality.” Clinicians must find the balance between supporting optimism in their patients and being truthful about the quality and uncertainties of therapeutic evidence. When given the option of using an unproven treatment or enrolling patients in appropriate, well-designed trials, the choice of advancing reliable knowledge should be far preferable.

Several recent experiences in the public arena exemplify concerns about a cacophony of scientific claims regarding candidate therapeutics. In the case of hydroxychloroquine, initial reports of benefit were followed by the initiation of multiple clinical trials using randomization across the spectrum of relevant populations. While these trials were accruing, multiple observational studies were published, claiming to show either no benefit or harm, and one very large published study received sharp criticism from experts and immediate calls for retraction due to methodological flaws and concerns about data provenance.6

However, despite the refrain that RCTs are needed, the lay and scientific press amplified various estimates of treatment effect, while at the same time hydroxychloroquine was promoted in the global political arena. The fact that a high-profile study incorporating observational data was later retracted6 is in some ways less relevant: during the brief interval when the study data were thought to be valid, many (including some international regulators) were duped by the method, turning the conclusion of “evidence from RCTs is needed” into a movement of “RCTs should cease.” However, several pragmatic RCTs were conducted, and definitive findings of no benefit for hydroxychloroquine in hospitalized patients with COVID-19 have been announced.7,8

Meanwhile, a venerable candidate for treating acute lung injury, the corticosteroid dexamethasone, was also being examined. In a preliminary report, low-dose dexamethasone resulted in a mortality reduction in patients with COVID-19 requiring oxygen or ventilator support,9 showing that this inexpensive, generic, lifesaving treatment is beneficial for relevant patients.

Another recent study of 20 000 patients treated with plasma infusions from recovering COVID-19 patients10 claimed evidence of safety and expressed optimism for benefit based on low reported event rates, although there was no control group to anchor the observed event rates. If a fraction of these patients had been enrolled in RCTs, the answer for whether this intervention was effective would now be known. Ongoing US RCTs are slowly accruing patients in the face of massive public plasma donation and uncertainty regarding benefits or risk in the treatment of COVID-19 patients.

Ideally, robust ongoing evaluation would be applied to the use of treatments and clinical outcomes. Continuing quality improvement in electronic health record and claims data; development of multiple registries to evaluate technologies, medical procedures, and quality of care; and ongoing methodological refinements all contribute to making a system of continuous learning feasible. In some situations, observational findings about treatment effects associated with specific interventions merit adoption in practice, but in most cases this learning system should identify promising treatments and approaches for designing proper large-scale trials or should supplement RCT findings by modeling effects seen in RCTs in broader populations. Rather than promoting inconclusive observational findings in medical journals and the press, a repository could be created to register results in a manner less apt to inappropriately influence practice. In addition, it seems prudent to place a moratorium on reporting observational studies that could mislead the public.

Once promising treatments are identified, the system should be aligned to optimize enrollment in well-designed RCTs with sufficient power to provide definitive answers. This will require reimagining the entire system to remove unjustified barriers, such as onerous bureaucratic steps, excessive costly monitoring, and data collection that is cumbersome and far exceeds the needs of the trial.4 Mechanisms must be in place to make it as easy or preferable for potential participants to enroll in a trial of a potentially worthwhile treatment as it is to prescribe the same unproven treatment. The former approach ensures the rapid advance of reliable clinical knowledge and benefits future patients; the latter means clinicians and researchers will remain ignorant. But if leaders, commentators, academics, and clinicians cannot restrain the rush to judgment in the absence of reliable evidence, the proliferation of observational treatment comparisons will hinder the goal of finding effective treatments for COVID-19—and a great many other diseases.

Article Information

Corresponding Author: Robert M. Califf, MD, Verily Life Sciences, 269 E Grand Ave, South San Francisco, CA 94080 (robertcaliff@verily.com).

Published Online: July 31, 2020. doi:10.1001/jama.2020.13319

Conflict of Interest Disclosures: Dr Califf reported being head of clinical policy and strategy at Verily Life Sciences and Google Health, an adjunct professor of medicine at Duke University and Stanford University, a board member for Cytokinetics, and former commissioner for the FDA. Dr Hernandez reported receipt of grants and personal fees from AstraZeneca, Amgen, Boehringer Ingelheim, Novartis, and Merck, personal fees from Bayer, and grants from Janssen and Verily, as well as being the principal investigator for the Healthcare Worker Exposure & Outcomes Research (HEROES) Program funded by the Patient-Centered Outcomes Research Institute. Dr Landray reported receipt of grants from Boehringer Ingelheim, Novartis, The Medicines Company, Merck, Sharp & Dohme, and UK Biobank and being co–chief investigator for the RECOVERY trial of potential treatments for hospitalized patients with COVID-19, funded by UK Research & Innovation and the National Institute for Health Research (NIHR).

Funding/Support: Dr Landray is supported by Health Data Research UK, the NIHR Oxford Biomedical Research Centre, and the Medical Research Council Population Health Research Unit.

Role of the Funder/Sponsor: Supporters had no role in the preparation, review, or approval of the manuscript or decision to submit the manuscript for publication.

Additional Contributions: We thank Jonathan McCall, MS (Duke Forge, Duke University), for editorial assistance. No compensation other than usual salary was received.

1. Kalil AC. Treating COVID-19—off-label drug use, compassionate use, and randomized clinical trials during pandemics. JAMA. 2020;323(19):1897-1898. doi:10.1001/jama.2020.4742
2. Halpern SD, Truog RD, Miller FG. Cognitive bias and public health policy during the COVID-19 pandemic. JAMA. Published online June 29, 2020. doi:10.1001/jama.2020.11623
3. Corrigan-Curay J, Sacks L, Woodcock J. Real-world evidence and real-world data for evaluating drug safety and effectiveness. JAMA. 2018;320(9):867-868. doi:10.1001/jama.2018.10136
4. Collins R, Bowman L, Landray M, Peto R. The magic of randomization versus the myth of real-world evidence. N Engl J Med. 2020;382(7):674-678. doi:10.1056/NEJMsb1901642
5. London AJ, Kimmelman J. Against pandemic research exceptionalism. Science. 2020;368(6490):476-477. doi:10.1126/science.abc1731
6. Mehra MR, Ruschitzka F, Patel AN. Retraction—hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. Lancet. 2020;395(10240):1820. doi:10.1016/S0140-6736(20)31324-6
7. Horby P, Mafham M, Linsell L, et al. Effect of hydroxychloroquine in hospitalized patients with COVID-19: preliminary results from a multi-centre, randomized, controlled trial. medRxiv. Preprint posted July 15, 2020. doi:10.1101/2020.07.15.20151852
8. Q&A: Hydroxychloroquine and COVID-19. World Health Organization website. Published June 19, 2020. Accessed July 5, 2020. https://www.who.int/news-room/q-a-detail/q-a-hydroxychloroquine-and-covid-19
9. Horby P, Lim WS, Emberson J, et al. Effect of dexamethasone in hospitalized patients with COVID-19: preliminary report. medRxiv. Preprint posted June 22, 2020. doi:10.1101/2020.06.22.20137273
10. Joyner MJ, Bruno KA, Klassen SA, et al. Safety update: COVID-19 convalescent plasma in 20,000 hospitalized patients. Mayo Clin Proc. Published online July 19, 2020. doi:10.1016/j.mayocp.2020.06.028
4 Comments for this article
Observational Studies and Clinical Trials
Sylvia Smoller, PhD | Albert Einstein College of Medicine
I agree with Dr. Califf that clinical trials are the gold standard, but they are not always possible. Let's not forget that the tobacco-lung cancer and the tobacco-CVD evidence of harm came from observational studies and had enormous impact on public health. There are good observational studies and bad observational studies, just as there are well-powered or under-powered clinical trials. Good observational studies can be extremely useful.
Randomized Clinical Trials. Necessary but with Limitations
Jeanne Dobrzynski, BA; John B. Kostis, MD, DPhil | Cardiovascular Institute, Rutgers Robert Wood Johnson Medical School
We recently itemized “Limitations of Randomized Clinical Trials” in the American Journal of Cardiology (1). Clinical trials are necessary to establish efficacy of interventions, but by themselves do not describe the effectiveness when the intervention comes to general practice. In particular as it pertains to COVID-19, there may be bias in choosing the hypothesis to be examined, especially when randomized clinical trials are sponsored by industry. Examining 370 randomized drug trials, Als-Nielsen (2) reported that trials funded by for-profit organizations were more likely to report positive results, although equipoise dictates that in 50% of the cases an intervention would be beneficial and in 50% not beneficial. In addition, favorable presentation of clinical trials by emphasizing relative risk reduction rather than absolute risk reduction or number needed to treat may be misleading. In our opinion, clinical decisions on individual patients should be made by considering evidence from clinical trials as well as from observational studies.


1. Kostis JB, Dobrzynski JM. Limitations of randomized clinical trials. Am J Cardiol. 2020 May 16:S0002-9149(20)30486-0. doi: 10.1016/j.amjcard.2020.05.011. Online
2. Als-Nielsen B, Chen W, Gluud C, Kjaergard LL. Association of funding and conclusions in randomized drug trials. A reflection of treatment effects or adverse events? JAMA. 2003;290:921-928.

John B. Kostis, MD, DPhil
Jeanne M. Dobrzynski, BA
Cardiovascular Institute
Rutgers Robert Wood Johnson Medical School
Pragmatic Realities
Steven Scott, MD | CORE Physician Resources
COVID-19 is still racing far ahead of our ability to perform enough research to develop meaningful treatment guidelines. The guidelines now in place rely more on expert opinion than on evidence. Scientific studies, regardless of their quality, provide guidance only at the population level, which is why evidence-based medicine produces guidelines rather than cookbooks.

We all know that observational assessment of treatments has its pitfalls. But until the day that we have a large enough body of randomized trials to offer a viable alternative, wasting our collective energy stating the obvious smacks of parochial protectionism.
Our reputation as a profession would be much better served if that energy were spent in coordinating study design between various competing entities. Randomized trials using different designs and criteria produce as much chaos as observational ones do. They lead to constant conclusions of "more study needed." Meanwhile, patients are dying, those in the trenches are flying by the seat of their pants, and politicians and journalists are having a heyday manipulating public opinion.
Well-Designed Studies Come in all Flavors
Daniel Waxman, MD, PhD | UCLA David Geffen School of Medicine
There are well-designed and poorly-designed RCTs and the same holds true for observational studies. With regard to COVID-19, a recent RCT of remdesivir is uninterpretable because it is open-label and underpowered for clinically-important outcomes (1). At the same time, a recent observational study of convalescent plasma made wonderful use of a natural experiment--varying antibody titers measured post-hoc--to provide compelling evidence for a small but significant treatment effect (2). As others have mentioned, RCTs are not always feasible or appropriate. Placebo-controlled trials for patients at high risk of death, where there is strong theoretical basis for treatment efficacy (e.g. convalescent plasma), are particularly problematic. Innovative trial design and critical review are needed all around, but RCTs must not have a monopoly on evidence.


1. Li, Ling, Wei Zhang, Yu Hu, Xunliang Tong, Shangen Zheng, Juntao Yang, Yujie Kong et al. "Effect of Convalescent Plasma Therapy on Time to Clinical Improvement in Patients With Severe and Life-threatening COVID-19: A Randomized Clinical Trial." JAMA (2020).

2. Joyner, Michael J., Jonathon W. Senefeld, Stephen A. Klassen, John R. Mills, Patrick W. Johnson, Elitza S. Theel, Chad C. Wiggins et al. "Effect of Convalescent Plasma on Mortality among Hospitalized Patients with COVID-19: Initial Three-Month Experience." medRxiv (2020)