The COVID-19 pandemic’s early death toll in the United States was higher than officials recorded, according to a new study that used machine learning to estimate COVID-19 deaths that may have been missed when people died outside hospitals. The findings, published in Science Advances, focus on the gap between deaths that appeared in official counts and deaths that may have gone unrecognized in state records.
The analysis centered on 2020 and 2021, when the virus swept through the U.S. The study authors said about 840,000 COVID-19 deaths were reported on death certificates in those years. They estimated that as many as 155,000 additional deaths may have occurred outside hospitals without being recognized as COVID-19, which by their calculation would mean roughly 16% of COVID-19 deaths went uncounted in that span.
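The 16% figure can be checked with a few lines of arithmetic. The sketch below assumes the share is the estimated missed deaths divided by reported plus missed deaths. The article does not state the denominator, but that reading reproduces the rounded figure, while dividing by reported deaths alone gives a little over 18%.

```python
# Figures quoted in the article for 2020 and 2021.
reported = 840_000          # COVID-19 deaths recorded on death certificates
estimated_missed = 155_000  # upper estimate of unrecognized deaths outside hospitals

# Assumption: the uncounted share is measured against all COVID-19 deaths,
# i.e. reported plus estimated missed deaths.
total = reported + estimated_missed
share = estimated_missed / total

print(f"Estimated total COVID-19 deaths: {total:,}")
print(f"Share uncounted: {share:.1%}")
```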
The paper also aimed to identify which deaths were most likely missing from official tallies. It found that the undiagnosed dead were more likely to be Hispanic people and other people of color, and more likely to have died in the first few months of the pandemic. The authors also said these deaths were more likely to have occurred in certain Southern and Southwestern states, including Alabama, Oklahoma and South Carolina.
The study builds on earlier efforts that estimated pandemic death tolls, while trying to go beyond broad totals by evaluating where and how COVID-19 deaths might not have been captured. The authors used machine learning to examine death certificates of infected patients who died in hospitals, then applied patterns from those records to assess death certificates for people who died outside hospitals and whose causes were listed as conditions such as pneumonia or diabetes.
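The general idea of learning from in-hospital records and applying the result to out-of-hospital records can be illustrated with a toy example. The article does not say which model the authors used, so the sketch below substitutes a simple naive Bayes classifier. Every record in it is invented for illustration, and the names `HOSPITAL`, `train` and `covid_probability` are hypothetical.

```python
import math
from collections import Counter

# Invented in-hospital certificates: (conditions listed, COVID-19 recorded?).
# These are NOT study data; they only make the sketch runnable.
HOSPITAL = [
    ({"pneumonia", "respiratory failure"}, True),
    ({"pneumonia", "diabetes"}, True),
    ({"respiratory failure", "diabetes"}, True),
    ({"heart disease", "diabetes"}, False),
    ({"cancer"}, False),
    ({"heart disease", "stroke"}, False),
]

def train(records):
    """Count how often each condition appears with and without COVID-19."""
    counts = {True: Counter(), False: Counter()}
    totals = Counter()
    for conditions, is_covid in records:
        totals[is_covid] += 1
        counts[is_covid].update(conditions)
    vocab = set(counts[True]) | set(counts[False])
    return counts, totals, vocab

def covid_probability(model, conditions):
    """Naive Bayes estimate that a certificate reflects a COVID-19 death."""
    counts, totals, vocab = model
    n_records = sum(totals.values())
    log_score = {}
    for label in (True, False):
        score = math.log(totals[label] / n_records)  # class prior
        n_tokens = sum(counts[label].values())
        for cond in conditions:
            if cond in vocab:  # ignore conditions never seen in training
                # Laplace smoothing avoids zero probabilities.
                score += math.log((counts[label][cond] + 1) / (n_tokens + len(vocab)))
        log_score[label] = score
    top = max(log_score.values())
    weight = {k: math.exp(v - top) for k, v in log_score.items()}
    return weight[True] / (weight[True] + weight[False])

model = train(HOSPITAL)

# Invented out-of-hospital certificates with no COVID-19 listed.
for listed in ({"pneumonia"}, {"diabetes"}, {"heart disease"}):
    print(sorted(listed), round(covid_probability(model, listed), 2))
```

In this toy setup, a certificate listing only pneumonia scores as more likely than not to be a missed COVID-19 death, while one listing only heart disease does not. A real analysis would need far richer features and careful validation.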
Steven Woolf, a researcher at Virginia Commonwealth University who was not involved in the study, said in an email that barriers remain for the populations most likely to be missed in the data. He pointed to persistent disparities in access to care. “People on the margins continue to die at disproportionate rates because they can’t access care,” he said, according to an Associated Press report.
Elizabeth Wrigley-Field of the University of Minnesota, one of the study’s authors, said the problem was not limited to hospital settings. She said many people who grew sick and died outside hospitals were never tested for COVID-19, in part because at-home testing was not readily available early in the pandemic, as described in the Associated Press report.
The AP report also noted that death investigation practices vary by place. In some areas, death investigations are handled by elected coroners, who may lack the specialized training that medical examiners have. The Associated Press story said some research has suggested that partisan attitudes could influence whether families sought COVID-19 testing and whether coroners pursued postmortem coronavirus testing. It also reported that some coroners said families pressed them not to list COVID-19 as a cause of death.
Andrew Stokes, the study’s senior author and a Boston University researcher, said the death investigation system itself is part of the reason official counts fell short. “Our antiquated death investigation system is one key reason why we fell short of accurate counts, particularly outside of big metropolitan areas,” he said, according to the report.
The study comes amid continued debate over how many Americans died from COVID-19. The Associated Press story said Centers for Disease Control and Prevention data show more than 1.2 million COVID-19 deaths since the pandemic began, with more than two-thirds of those deaths occurring in 2020 and 2021. The report also said the counting debate has been fueled by political disputes, including false claims circulating on social media and a post by President Donald Trump in August 2020 that was later removed from Twitter after being retweeted.
The AP report added that pandemic death tolls can include other kinds of harm beyond infection, such as people who died from other medical conditions because they could not get care at hospitals overwhelmed with COVID-19 patients, and people who died of drug overdoses amid social isolation and lost access to treatment. The researchers behind the Science Advances study, however, focused specifically on deaths of people infected by the coronavirus.
Scientists said machine learning–based methods have strengths and weaknesses, but Woolf described the approach as “intriguing.” In the Associated Press report, he was one of several voices framing the new results as a step toward understanding why some COVID-19 deaths may have been missed early on, especially among groups that faced greater barriers to testing and care.