What Does and Does Not Correlate with COVID-19 Death Rates
Christopher R. Knittel and Bora Ozaltun
We correlate county-level COVID-19 death rates with key variables using both linear regression and negative binomial mixed models, although we focus on the linear regression models. We include four sets of variables: socio-economic variables, county-level health variables, modes of commuting, and climate and pollution patterns. Our analysis studies daily death rates from April 4, 2020 to May 27, 2020. We estimate correlation patterns both across and within states. For both models, we find that higher shares of African American residents in a county are correlated with higher death rates. However, when we restrict ourselves to correlation patterns within a given state, the statistical significance of the correlation between death rates and the share of African American residents wanes, although the correlation remains positive. We find similar results for the share of elderly residents in the county. We find that more commuting via public transportation, relative to telecommuting, is correlated with higher death rates. The correlation between driving to work, relative to telecommuting, and death rates is also positive across both models, but it is statistically significant only when we look across states and counties. We also find that a higher share of people who are not working, and thus not commuting, whether because they are elderly, children, or unemployed, is correlated with higher death rates. Counties with higher home values, higher summer temperatures, and lower winter temperatures have higher death rates. Contrary to past work, we do not find a correlation between pollution and death rates. Importantly, we also do not find that death rates are correlated with obesity rates, ICU beds per capita, or poverty rates. Finally, our model that looks within states yields estimates of how a given state’s death rate compares to those of other states after controlling for the variables included in our model; this may be interpreted as a measure of how states are doing relative to one another.
We find that death rates in the Northeast are substantially higher than in other states, even when we control for the four sets of variables above. Death rates are also statistically significantly higher in Michigan, Louisiana, Iowa, Indiana, and Colorado. California’s death rate is the lowest of all states.
It is important to understand that this research, like other observational analyses, identifies only correlations: these relationships are not necessarily causal. However, these correlations may help policy makers identify variables that are potentially causally related to COVID-19 death rates and, once the causal relationships are understood, adopt appropriate policies.
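The distinction between correlations estimated across states and those estimated within states can be illustrated with a minimal sketch. It is not the authors' specification. The data are synthetic, and the names `ols_slope`, `within_slope`, `state_effect`, and `true_slope` are hypothetical. It assumes a single covariate and treats the within-state estimate as a linear regression run after subtracting each state's mean from both variables, which is one common way to absorb state-level intercepts.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a least-squares fit of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def within_slope(x, y, groups):
    """Slope after removing each group's mean from x and y (group fixed effects)."""
    x_dm = x.astype(float)  # astype returns a copy, so the inputs are not modified
    y_dm = y.astype(float)
    for g in np.unique(groups):
        mask = groups == g
        x_dm[mask] -= x_dm[mask].mean()
        y_dm[mask] -= y_dm[mask].mean()
    return ols_slope(x_dm, y_dm)

# Synthetic data (an assumption for illustration only): 20 states, 50 counties each.
rng = np.random.default_rng(0)
n_states, n_counties = 20, 50
state = np.repeat(np.arange(n_states), n_counties)

# A state-level factor that shifts both the covariate and the outcome.
state_effect = rng.normal(0.0, 2.0, n_states)
x = state_effect[state] + rng.normal(0.0, 1.0, state.size)

true_slope = 0.5
y = true_slope * x + 3.0 * state_effect[state] + rng.normal(0.0, 0.1, state.size)

pooled = ols_slope(x, y)            # uses variation across and within states
within = within_slope(x, y, state)  # uses only variation within each state

print(f"across-state (pooled) slope: {pooled:.2f}")
print(f"within-state slope:          {within:.2f}")
```

In this construction the pooled slope absorbs the state-level factor, while the within-state slope stays close to `true_slope`. This shows how a correlation can be strong across states yet weaker within them.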
Keywords: Coronavirus, COVID-19