The observed number of deaths is indicated by the solid line, and the expected number of deaths, adjusting for seasonality, influenza epidemics, and reporting delays, is indicated by the dashed line. The area between these 2 lines represents the total number of excess deaths: blue-gray (bottom), deaths recorded as due to COVID-19; orange (narrow middle section), additional pneumonia and influenza excess deaths not coded as due to COVID-19; and beige (top), deaths that were not attributed to COVID-19, pneumonia, or influenza.
The observed numbers of deaths are indicated by solid lines, and the expected numbers of deaths, adjusting for seasonality, influenza epidemics, and reporting delays, are indicated by dashed lines. The area between these 2 lines represents the total number of excess deaths: blue-gray (bottom), deaths recorded as due to COVID-19; orange (narrow middle section), additional pneumonia and influenza excess deaths not coded as due to COVID-19; and beige (top), deaths that were not attributed to COVID-19, pneumonia, or influenza.
Excess deaths due to all causes are indicated by solid black lines; shaded areas are 95% prediction intervals. Reported deaths due to COVID-19 are indicated by dotted black lines. Dashed blue lines indicate the volume of tests performed per 1000 population in that week.
eFigure 1. Estimated Reporting Delays by State
eFigure 2. Excess Deaths for Additional States From March 1, 2020-May 30, 2020
eFigure 3. Map of Excess All-Cause Deaths by State, and COVID-19 Deaths Reported by NCHS
eFigure 4. Trends in Excess Mortality Due to All Causes or Reported Deaths Due to COVID-19 for March 1, 2020 to May 30, 2020
eFigure 5. Relative Increase (Observed/Expected) for Influenza-like Illness and All-Cause Deaths in Select States With Large Epidemics
eFigure 6. Comparison of Excess Deaths Per Week Estimated From the Main Regression Model or Using an Empirical Baseline
eFigure 7. Observed Deaths Per Week, Compared to the Model Fitted Deaths +/- 95% Prediction Intervals, Through end of February 2020
eFigure 8. Evaluation of How Reporting Delay Adjustments Influence the Magnitude of Deaths at the National Scale
eTable. Comparison of Baselines That Are or Are Not Adjusted for Influenza
eAppendix. Supplemental Methods
Weinberger DM, Chen J, Cohen T, et al. Estimation of Excess Deaths Associated With the COVID-19 Pandemic in the United States, March to May 2020. JAMA Intern Med. 2020;180(10):1336–1344. doi:10.1001/jamainternmed.2020.3391
Did more all-cause deaths occur during the first months of the coronavirus disease 2019 (COVID-19) pandemic in the United States compared with the same months during previous years?
In this cohort study, the number of deaths due to any cause increased by approximately 122 000 from March 1 to May 30, 2020, which is 28% higher than the reported number of COVID-19 deaths.
Official tallies of deaths due to COVID-19 underestimate the full increase in deaths associated with the pandemic in many states.
Efforts to track the severity and public health impact of coronavirus disease 2019 (COVID-19) in the United States have been hampered by state-level differences in diagnostic test availability, differing strategies for prioritization of individuals for testing, and delays between testing and reporting. Evaluating unexplained increases in deaths due to all causes or attributed to nonspecific outcomes, such as pneumonia and influenza, can provide a more complete picture of the burden of COVID-19.
To estimate the burden of all deaths related to COVID-19 in the United States from March to May 2020.
Design, Setting, and Population
This observational study evaluated the numbers of US deaths from any cause and deaths from pneumonia, influenza, and/or COVID-19 from March 1 through May 30, 2020, using public data of the entire US population from the National Center for Health Statistics (NCHS). These numbers were compared with those from the same period of previous years. All data analyzed were accessed on June 12, 2020.
Main Outcomes and Measures
Increases in weekly deaths due to any cause or deaths due to pneumonia/influenza/COVID-19 above a baseline, which was adjusted for time of year, influenza activity, and reporting delays. These estimates were compared with reported deaths attributed to COVID-19 and with testing data.
There were approximately 781 000 total deaths in the United States from March 1 to May 30, 2020, representing 122 300 (95% prediction interval, 116 800-127 000) more deaths than would typically be expected at that time of year. There were 95 235 reported deaths officially attributed to COVID-19 from March 1 to May 30, 2020. The number of excess all-cause deaths was 28% higher than the official tally of COVID-19–reported deaths during that period. In several states, these deaths occurred before increases in the availability of COVID-19 diagnostic tests and were not counted in official COVID-19 death records. There was substantial variability between states in the difference between official COVID-19 deaths and the estimated burden of excess deaths.
Conclusions and Relevance
Excess deaths provide an estimate of the full COVID-19 burden and indicate that official tallies likely undercount deaths due to the virus. The mortality burden and the completeness of the tallies vary markedly between states.
The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) first emerged in December 2019 in Wuhan, China, and rapidly grew into a global pandemic.1 Without adequate capacity to test for SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), during the early part of the pandemic, laboratory-confirmed cases captured only an estimated 10% to 15% of all infections.2 As a result, estimating the number of deaths caused by COVID-19 is a challenge.
Questions have been raised about the reported tallies of deaths related to COVID-19 in the United States. Some officials have raised concerns that deaths not caused by the virus were improperly attributed to COVID-19, inflating the reported tolls. However, given the limited availability of viral testing and the imperfect sensitivity of the tests,3,4 there have likely been a number of deaths caused by the virus that were not counted. Furthermore, if patients with chronic conditions turn away from the health care system because of concerns about potential COVID-19 infection, there could be increases in certain categories of deaths unrelated to COVID-19. In the midst of a large outbreak, there is also an unavoidable delay in the compilation of death certificates and ascertainment of causes of death. Overall, the degree of testing, criteria for attributing deaths to COVID-19, and the length of reporting delays are expected to vary between states, further complicating efforts to obtain an accurate count of deaths related to the pandemic.
To estimate the mortality burden of a new infectious agent when there is a lack of comprehensive testing, it is common to assess increases in rates of death beyond what would be expected if the pathogen had not circulated.5-7 The “excess death” approach can be applied to specific causes of death directly related to the pathogen (eg, pneumonia or other respiratory conditions), or this approach can be applied to other categories of deaths that may be directly or indirectly influenced by viral circulation or pandemic interventions (eg, cardiac conditions, traffic injuries, or all causes). The excess deaths methodology has been used to quantify official undercounting of deaths for many pathogens, including pandemic influenza viruses and HIV.7-9
In this study, we estimate the excess deaths due to any cause in each week of the COVID-19 pandemic across the United States. We compare these estimates of excess deaths with the reported numbers of deaths due to COVID-19 in different states and evaluate the timing of these increases in relation to testing and pandemic intensity. These analyses provide insights into the burden of COVID-19 in the early months of the outbreak in the United States and serve as a surveillance platform that can be updated as new data accrue.
Data on deaths due to pneumonia, influenza, and COVID-19 (International Statistical Classification of Diseases and Related Health Problems, Tenth Revision codes U07.1 or J09-J18) and on deaths due to all causes were obtained from the National Center for Health Statistics (NCHS) mortality surveillance system.10 Data were stratified by state and week.
Data on all-cause deaths in previous years were obtained from https://data.cdc.gov/resource/pp7x-dyj2 and https://data.cdc.gov/resource/muzy-jte6. Data on all-cause deaths and pneumonia/influenza/COVID-19 deaths since January 26, 2020, were obtained from https://data.cdc.gov/resource/r8kw-7aab. The NCHS data are based on the state where the death occurred rather than the state of residence.
The NCHS reports deaths as they are received from the states and processed; counts of deaths from recent weeks are highly incomplete, reflecting delays in reporting. These “provisional” counts are updated regularly for past weeks, and the counts are not finalized until more than a year after the deaths occur.
Historical data on the proportion of deaths due to pneumonia and influenza in previous years were obtained from Centers for Disease Control and Prevention (CDC) weekly influenza death reports (https://gis.cdc.gov/grasp/fluview/mortality.html) via the cdcfluview package in R (R Foundation), and these were used to determine the number of pneumonia and influenza deaths in the baseline period. All data were accessed June 12, 2020.
Connecticut and North Carolina were missing mortality data for recent months and were therefore excluded from the analyses and from the baseline numbers.
We also compiled data on COVID-19–related morbidity to gauge the timing and intensity of the pandemic in different locations. We used CDC data on influenza-like illness,11 a long-standing indicator of morbidity due to acute respiratory infections, which has been used to monitor COVID-19. We also obtained information on influenza virus circulation to adjust baseline estimates.12 See the eAppendix in the Supplement for details.
To compare our excess mortality estimates with official COVID-19 tallies, we compiled weekly numbers of reported deaths due to COVID-19 in each state from the NCHS,13 and these data were supplemented with data from the COVID Tracking Project.14 State-specific testing information was obtained from the COVID Tracking Project.14
These analyses use publicly available aggregate data and were deemed exempt from human subjects review by the Yale institutional review board (protocol 1411014890).
To calculate the number of excess deaths, we first needed to estimate the baseline number of deaths in the absence of COVID-19. We then subtracted the expected number of deaths in each week from the observed number of deaths for the period March 1, 2020, to May 30, 2020.
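The subtraction described above can be illustrated with a minimal Python sketch. The function name `weekly_excess` and all counts below are hypothetical and are not taken from the study data or from the authors' R scripts.

```python
def weekly_excess(observed, expected):
    """Return observed minus expected deaths for each week."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same weeks")
    return [obs - exp for obs, exp in zip(observed, expected)]


# Hypothetical weekly counts for one state (illustrative values only).
observed = [1050, 1200, 1480, 1390]
expected = [1000.0, 1010.0, 1005.0, 995.0]

excess_by_week = weekly_excess(observed, expected)
total_excess = sum(excess_by_week)  # excess deaths summed over the period

print(excess_by_week)
print(total_excess)
```

Summing the weekly differences over the study period gives the total excess for that location.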
Each of the 48 states (excluding North Carolina and Connecticut) and the District of Columbia was analyzed individually. We fit Poisson regression models to the weekly state-level death counts from January 5, 2015, to January 25, 2020 (see the eAppendix in the Supplement for details). The baseline was then projected forward until May 30, 2020, to generate baseline deaths; excess mortality was defined as the observed mortality minus the baseline for the pandemic period March 1, 2020, to May 30, 2020. The baseline model was adjusted for seasonality, year-to-year baseline variation, influenza epidemics, and reporting delays. The model for pneumonia/influenza/COVID-19 mortality used all-cause deaths as a denominator and did not have a separate adjustment for reporting delays. Poisson 95% prediction intervals were estimated by sampling from the uncertainty distributions for the estimated model parameters.15 Pennsylvania was not highlighted in the data despite having a large number of excess deaths because the data were incomplete during March 2020. Deaths for New York City are reported separately by the NCHS, and we report estimates for New York City and the rest of New York State separately. To obtain national-level estimates, the observed count and predicted counts (median estimate from the model) for each state were summed for each week and compared. Estimates for excess all-cause deaths were rounded to the nearest 100 and for excess pneumonia/influenza/COVID-19 deaths to the nearest 10. Medians and 95% prediction intervals are presented.
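To make the idea of a seasonal Poisson baseline concrete, the following Python sketch fits a log-linear Poisson model with a single annual sine/cosine pair by Newton-Raphson and projects it forward. It is a simplified illustration, not the authors' model: the study used R, and its baseline also adjusts for year-to-year variation, influenza epidemics, and reporting delays, all of which are omitted here. The function names, the `WEEKS_PER_YEAR` constant, and the synthetic training data are assumptions made for this example.

```python
import math

WEEKS_PER_YEAR = 52.1775  # assumed average number of weeks per year


def design_row(week):
    """Intercept plus one annual sine/cosine pair for a week index."""
    angle = 2.0 * math.pi * week / WEEKS_PER_YEAR
    return [1.0, math.sin(angle), math.cos(angle)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_poisson(rows, counts, iterations=25):
    """Maximum-likelihood Poisson regression (log link) via Newton-Raphson."""
    p = len(rows[0])
    beta = [math.log(sum(counts) / len(counts))] + [0.0] * (p - 1)
    for _ in range(iterations):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in rows]
        score = [sum((y - mi) * row[j] for y, mi, row in zip(counts, mu, rows))
                 for j in range(p)]
        info = [[sum(mi * row[j] * row[k] for mi, row in zip(mu, rows))
                 for k in range(p)] for j in range(p)]
        step = solve_linear(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def predict(beta, week):
    """Expected deaths in a given week under the fitted model."""
    return math.exp(sum(b * x for b, x in zip(beta, design_row(week))))


# Synthetic training data (hypothetical; about 5 years of weekly counts).
true_beta = [math.log(1000.0), 0.10, 0.05]
train_weeks = range(260)
train_counts = [round(predict(true_beta, w)) for w in train_weeks]

beta_hat = fit_poisson([design_row(w) for w in train_weeks], train_counts)

# Project the baseline forward over 13 weeks that were not used for fitting.
baseline = [predict(beta_hat, w) for w in range(260, 273)]
```

Fitting only on the prepandemic weeks and then projecting forward is what lets the projected values serve as a counterfactual baseline for the pandemic period.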
Reporting delays make it challenging to estimate excess deaths for recent weeks. To adjust for incomplete data in recent weeks, we adjusted the baseline based on an estimate for data completeness in that week. The estimate of completeness is based on the number of weeks that passed between the week in which the data set was obtained and the week in which the death occurred. We used a modified version of the NobBS package in R to estimate the proportion of deaths that were reported for each date and incorporated that as an adjustment in the main analysis16 (eAppendix in the Supplement). For instance, if we estimated that the data were 75% complete for a particular week, we multiplied the baseline by 0.75. These reporting delays were estimated using provisional data for deaths that occurred since March 29, 2020, and thus reflect changes in reporting that might have occurred during the pandemic. The completeness of the data varied markedly between states (eFigure 1 in the Supplement).
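The completeness adjustment can be sketched in Python as follows. The function name and all numbers are hypothetical; in the study, the completeness proportions were estimated with a modified version of the NobBS package in R, which this sketch does not reproduce. Here they are simply supplied as inputs.

```python
def adjust_baseline_for_completeness(baseline, completeness):
    """Scale each week's expected deaths by the estimated reporting completeness."""
    adjusted = []
    for expected, fraction in zip(baseline, completeness):
        if not 0.0 < fraction <= 1.0:
            raise ValueError("completeness must be in the interval (0, 1]")
        adjusted.append(expected * fraction)
    return adjusted


# Hypothetical values: the most recent week is estimated to be 75% complete.
baseline = [1000.0, 1000.0, 1000.0]
completeness = [1.0, 0.9, 0.75]
observed = [1300, 1150, 900]  # provisional (incomplete) counts as reported

adjusted_baseline = adjust_baseline_for_completeness(baseline, completeness)
excess = [obs - exp for obs, exp in zip(observed, adjusted_baseline)]

print(adjusted_baseline)
print(excess)
```

Scaling the baseline down, rather than scaling the provisional counts up, keeps the comparison on the scale of the deaths actually reported so far.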
A study by Woolf et al17 of excess deaths in the US used the same database and a related harmonic regression method. The main differences in methodology are that Woolf et al did not adjust for reporting delays, the study period ended on April 25, 2020, and that study controlled for time trends using an adjustment for calendar year rather than epidemiological year.
The analyses were run using R version 3.6.1. All analysis scripts and archives of the data are available at https://doi.org/10.5281/zenodo.3893882 and the current version of the repository is available at https://github.com/weinbergerlab/excess_pi_covid. More details about the data and methods are in the eAppendix in the Supplement.
Across the United States, there were 95 235 reported deaths officially attributed to COVID-19 from March 1 to May 30, 2020. In comparison, there were an estimated 122 300 (95% prediction interval, 116 800-127 000) excess deaths during the same period (Table). The deaths officially attributed to COVID-19 accounted for 78% of the excess all-cause deaths, leaving 22% unattributed to COVID-19. The proportion of excess deaths that were attributed to COVID-19 varied between states and increased over time (Table and Figure 1).
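The national percentages follow directly from the two totals reported above. The short Python check below uses only those two figures; the variable names are chosen for this illustration.

```python
reported_covid_deaths = 95_235      # official COVID-19 deaths, March 1 to May 30, 2020
excess_all_cause_deaths = 122_300   # estimated excess all-cause deaths, same period

# Share of excess deaths that were officially attributed to COVID-19.
attributed_share = reported_covid_deaths / excess_all_cause_deaths
unattributed_share = 1 - attributed_share

# How much larger the excess estimate is than the official tally.
relative_excess = excess_all_cause_deaths / reported_covid_deaths - 1

print(f"attributed: {attributed_share:.0%}")
print(f"unattributed: {unattributed_share:.0%}")
print(f"excess above official tally: {relative_excess:.0%}")
```

The 78% and 28% figures use different denominators: the first divides by the excess estimate, and the second divides by the official tally.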
The changes in mortality that occurred during the pandemic varied by state and region. In New York City, all-cause mortality rose 7-fold above baseline at the peak of the pandemic, for a total of 25 100 (95% prediction interval, 24 800-25 400) excess deaths, of which 26% were unattributed to COVID-19 (Table and Figure 2). In contrast, in the rest of New York State, the increase was more moderate, rising 2-fold above baseline and resulting in 12 300 (95% prediction interval, 11 900-12 700) excess deaths. There were notable per capita increases in rates of death due to any cause in many other states, including New Jersey, Massachusetts, Louisiana, Illinois, and Michigan, where the number of deaths greatly exceeded the expected levels (Table, Figure 2, and Figure 3; eFigure 2 in the Supplement for additional states). Other states, particularly smaller states in the central United States and northern New England, had some COVID-19 deaths reported in official tallies but small or no detectable increases in all-cause deaths above expected levels (Table).
The gap between the reported COVID-19 deaths and the estimated all-cause excess deaths varied among states (Table; eFigure 3 in the Supplement). For instance, California had 4046 reported deaths due to COVID-19 and 6800 (95% prediction interval, 6100-7500) excess all-cause deaths, leaving 41% of the excess deaths unattributed to COVID-19 (Table). Texas and Arizona had even wider gaps, with approximately 55% and 53% of the excess deaths unattributed to COVID-19, respectively. In contrast, there was better agreement between the reported COVID-19 deaths and the excess all-cause deaths in Minnesota, with 12% unattributed to COVID-19 (Table).
Some of the discrepancy between reported COVID-19 deaths and excess deaths could be related to the intensity and timing of increases in testing. In some states (eg, Texas, California), excess all-cause mortality preceded the widespread adoption of testing for SARS-CoV-2 by several weeks (Figure 4; eFigure 4 in the Supplement for additional states). In other states (eg, Massachusetts, Minnesota), testing intensity increased prior to or with the increase in excess deaths, and the gap between COVID-19 deaths and excess deaths was smaller (Figure 4).
The increase in excess deaths in many states trailed an increase in outpatient visits due to influenza-like illness by several weeks (eFigure 5 in the Supplement).
We performed several sensitivity analyses. We refit the seasonal baseline without adjusting for influenza activity (eTable in the Supplement). Excluding influenza pulled the baseline upward and led to smaller excess estimates in some states. Furthermore, we created an empirical baseline by averaging the number of deaths in corresponding weeks of the previous years. This yielded weekly estimates of excess death that aligned closely with estimates from our model in April 2020. The estimates of excess deaths based on the empirical baseline were slightly higher than those calculated with the modeled baseline in March 2020 and much lower in May (eFigure 6 in the Supplement). The difference in the estimates for May is driven by reporting delays, which are adjusted for in the modeling approach but not in the empirical baseline. This suggests that our modeling approach provides robust estimates of excess mortality while allowing for formal quantification of uncertainty and more timely estimates than other empirical approaches. Finally, we explored the accuracy of our adjustment for reporting lags (eFigure 8 in the Supplement). The reporting delay correction underestimates deaths by 5% to 8% 2 weeks after the deaths at the national level but then stabilizes after 3 weeks or more. Therefore, our excess mortality estimates for the most recent week are modestly conservative.
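The empirical baseline used in this sensitivity analysis can be illustrated with a minimal Python sketch that averages deaths in corresponding weeks of earlier years. The function name, the choice of 3 prior years, and all counts are hypothetical and are not drawn from the study data.

```python
from statistics import mean


def empirical_baseline(history):
    """Average deaths for each week position across the supplied prior years.

    history maps a year to a list of weekly death counts covering the same
    calendar weeks in every year.
    """
    years = sorted(history)
    n_weeks = len(history[years[0]])
    if any(len(history[y]) != n_weeks for y in years):
        raise ValueError("every year must cover the same number of weeks")
    return [mean(history[y][w] for y in years) for w in range(n_weeks)]


# Hypothetical counts for the same 3 calendar weeks in 3 earlier years.
history = {
    2017: [1000, 990, 980],
    2018: [1020, 1010, 1000],
    2019: [1010, 1000, 990],
}
observed_2020 = [1015, 1300, 1100]

baseline = empirical_baseline(history)
excess = [obs - exp for obs, exp in zip(observed_2020, baseline)]

print(baseline)
print(excess)
```

Because this baseline is not scaled for incomplete reporting, excess estimates computed this way for the most recent weeks would be expected to run low, consistent with the May comparison described above.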
Mortality data are released regularly, and updated analyses, along with additional figures, are available at https://weinbergerlab.github.io/excess_pi_covid/.
Monitoring excess deaths has been used as a method for tracking influenza mortality for more than a century. Herein, we used a similar strategy to capture COVID-19 deaths that had not been attributed specifically to the pandemic coronavirus. Monitoring trends in broad mortality outcomes, like changes in all-cause and pneumonia/influenza/COVID-19 mortality, provides a window into the magnitude of the mortality burden missed in official tallies of COVID-19 deaths. Given the variability in testing intensity between states and over time, this type of monitoring provides key information on the severity of the pandemic and the degree to which viral testing might be missing deaths caused by COVID-19. These findings demonstrate that estimates of the death toll of COVID-19 based on excess all-cause mortality may be more reliable than those relying only on reported deaths, particularly in places that lack widespread testing.
Syndromic end points, such as deaths due to pneumonia/influenza/COVID-19, outpatient visits for influenza-like illness, and emergency department visits for fever, can provide a crude but informative measure of the progression of the outbreak.18 These measures themselves can be biased by changes in health-seeking behavior and how conditions are recorded. However, in the absence of widespread and systematic testing for COVID-19, they provide a useful measure of pandemic progression and the impact of interventions.
The gap between reported COVID-19 deaths and excess deaths can be influenced by several factors, including the intensity of testing; guidelines on the recording of deaths that are suspected to be related to COVID-19 but do not have a laboratory confirmation; and the location of death (eg, hospital, nursing home, or unattended death at home). For instance, deaths that occur in nursing homes might be more likely to be recognized as part of an epidemic and correctly recorded as due to COVID-19. As the pandemic has progressed, official statistics have become better aligned with excess mortality estimates, perhaps due to enhanced testing and increased recognition of the clinical features of COVID-19. In New York City, official COVID-19 death counts were revised after careful inspection of death certificates, adding an extra 5048 probable deaths to the 13 831 laboratory-confirmed deaths.19 As a result, the all-cause excess mortality burden from March 11 to May 2, 2020, is only 27% higher than official COVID-19 statistics.19 This aligns well with our estimate of 26% for a similar period in New York City, using a slightly different modeling approach.
Many European countries have experienced sharp increases in all-cause deaths associated with the pandemic. Real-time all-cause mortality data from the EuroMomo project (https://www.euromomo.eu/) demonstrate gaps between the official COVID-19 death toll and excess deaths that echo findings in our study. These gaps are more pronounced in countries that were affected more and earlier by the pandemic and had weak testing. Very limited excess mortality information is available from Asia, Africa, the Middle East, and South America thus far; these data will be important to fully capture the heterogeneity of death rates related to the COVID-19 pandemic across the world. Prior work on the 1918 and 2009 pandemics has shown substantial heterogeneity in mortality burden between countries, in part related to health care.8,20
These analyses are all based on provisional data, which are incomplete for recent weeks in some states because of reporting delays. We have attempted to correct for these reporting delays in the analysis. Sensitivity analyses suggest that these corrections might result in estimates that are conservative (smaller estimates of excess) in the most recent week (eFigure 8 in the Supplement) at the national level, but the correction might overestimate excess deaths in the most recent week in some states. Since several months of data have accrued, and pandemic activity is currently low nationally, any inaccuracies in correcting for reporting delays in recent weeks would likely have a minor impact on the overall estimates of excess deaths.
An alternative approach to the one presented here would be to simply compare the observed number of deaths with the average number of deaths in the corresponding weeks from previous years (eFigure 6 in the Supplement). While this would yield similar answers during certain periods (particularly in April 2020), using an empirical baseline would ignore secular trends in death rates, the potential impact of influenza epidemics in the early part of the COVID-19 pandemic, and reporting delays in more recent weeks. While it would be ideal to wait until the pandemic is over and analyze complete data, there is a need for timely data and analysis during public health emergencies, so the trade-off between data completeness and timeliness is warranted.
The number of excess deaths reported herein could reflect increases in rates of death directly caused by the virus, increases indirectly related to the pandemic response (eg, due to avoidance of health care), as well as declines in certain causes (eg, deaths due to motor vehicle collisions or triggered by air pollution). Further work is needed to determine the relative importance of these different forces on the overall estimates of excess deaths.
The national estimates do not include data from Connecticut and North Carolina. Together, these account for only 4.5% of the US population and are unlikely to have a large influence on the national-level estimates.
We used a Poisson regression model for analysis. While there was modest overdispersion in some of the larger states, the 95% prediction intervals provide adequate coverage during the prepandemic period (eFigure 7 in the Supplement).
We present a comparison of excess deaths with influenza-like illness. Influenza activity declined to historically low levels starting in March 2020. At the same time, health care–seeking behavior changed drastically. Therefore, analyses of influenza and influenza-like illness need to be interpreted with caution. Regardless, this analysis demonstrates the expected time lag between outpatient visits for influenza-like illness and excess deaths (eFigure 5 in the Supplement).
Monitoring syndromic causes of death can provide crucial additional information on the severity and progression of the COVID-19 pandemic. Estimates of excess deaths will be less biased by variations in viral testing, but reporting lags need to be properly accounted for. Even in situations of ample testing, deaths due to viral pathogens, including SARS-CoV-2, can occur indirectly via secondary bacterial infections or exacerbation of comorbidities. There can also be secondary effects on mortality due to changes in population behavior brought about by strict lockdown measures and avoidance of the health care system. Together with information on official tallies of COVID-19 deaths, monitoring excess mortality provides a key tool in evaluating the effects of an ongoing pandemic.
Corresponding Author: Daniel M. Weinberger, PhD, PO Box 208034, New Haven, CT 06520 (firstname.lastname@example.org).
Accepted for Publication: June 15, 2020.
Published Online: July 1, 2020. doi:10.1001/jamainternmed.2020.3391
Author Contributions: Dr Weinberger had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Weinberger, Chen, Cohen, Pitzer, Reich, Russi, Simonsen, Viboud.
Acquisition, analysis, or interpretation of data: Weinberger, Crawford, Mostashari, Olson, Reich, Russi, Watkins, Viboud.
Drafting of the manuscript: Weinberger, Russi, Watkins, Viboud.
Critical revision of the manuscript for important intellectual content: Weinberger, Chen, Cohen, Crawford, Mostashari, Olson, Pitzer, Reich, Simonsen, Watkins, Viboud.
Statistical analysis: Weinberger, Crawford, Reich, Russi, Viboud.
Obtained funding: Weinberger.
Administrative, technical, or material support: Chen, Olson, Russi, Viboud.
Conflict of Interest Disclosures: Dr Weinberger reported receipt of consulting fees from Pfizer, Merck, GlaxoSmithKline, and Affinivax for topics unrelated to this work and being principal investigator on a research grant from Pfizer on an unrelated topic. Dr Pitzer reported having received reimbursement from Merck and Pfizer for travel expenses to scientific input engagements unrelated to the topic of this work and being a member of the World Health Organization Immunization and Vaccine-related Implementation Research Advisory Committee (IVIR-AC). No other disclosures were reported.
Funding/Support: This study was supported by grants R01AI123208 (Dr Weinberger), R01AI137093 (Drs Weinberger and Pitzer), R01AI112970 (Dr Pitzer), and R01AI146555 (Dr Cohen) from the National Institute of Allergy and Infectious Diseases/National Institutes of Health; by grant 1DP2HD091799-01 (Dr Crawford) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development; by grant R35GM119582 (Dr Reich) from the National Institute of General Medical Sciences/National Institutes of Health; by grant 1U01IP001122 (Dr Reich) from the CDC; and by grant CF20-0046 (Dr Simonsen) from the Carlsberg Foundation.
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Disclaimer: This study does not necessarily represent the views of the National Institutes of Health or the US government. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the New York City Department of Health and Mental Hygiene, or the CDC.
Additional Contributions: We thank Andrew Ba Tran, BA, The Washington Post, for feedback on the analysis code. No compensation was received.