Covid mortality in America

Mean Covid mortality per 100,000 people in red and blue states across the USA

Matthew John Bentham (S3923076), John Fergus Murrowood (S3923075)

Last updated: 14 October, 2021

Introduction.

Each state or territory government in the USA has considerable power over the policies and public health measures used to manage the Covid-19 pandemic. Throughout the pandemic, the two major political parties have often approached this differently and imposed public health orders of varying severity, e.g. the length and strictness of lockdowns. This report therefore investigates whether one party's policies were associated with a lower mean Covid-19 mortality rate in its states than the other party's. To do so, the number of Covid-19 deaths per 100,000 people is calculated for each state, the states are grouped by whether the state executive is Republican or Democratic, and the mean death rate per 100,000 people is compared between the two groups.

map of political affiliation of state executives

Problem Statement

This report investigates whether the party of each state's elected executive, and by extension the general differences in public health response between the two political parties, had any effect on the Covid-19 death rate per 100,000 people.

Covid-19 deaths per state

Data

  1. Data is imported (population and Covid-19 death counts):
library(readr)   # read_csv()
library(readxl)  # read_excel()
library(dplyr)   # group_by(), summarise(), left_join(), mutate(), filter()
# Columns 3 and 632 hold the State code and the cumulative deaths on 2021-10-10
covid_deaths_usafacts <- read_csv("covid_deaths_usafacts.csv")[c(3, 632)]
# Rows 6 to 56 hold the 50 states and DC; columns 1 and 13 hold the name and the 2019 population
censusdata <- read_excel("censusdata.xlsx", skip = 3)[6:56, c(1, 13)]
  2. Death count per state is calculated:
covid_death <- covid_deaths_usafacts %>% group_by(State) %>% summarise(deaths = sum(`2021-10-10`))
  3. State names are converted to their abbreviations and the datasets are joined by state:
states <- append(state.abb, "DC", after = 8)
censusdata$...1 <- censusdata$...1 %>% factor(labels = states)
data <- censusdata %>% left_join(covid_death, by = c('...1' = "State"))
colnames(data) <- c('State', "population (2019)", "No. of covid deaths\n(22/01/2020-10/10/2021)")

Data Cont.

  1. Covid-19 deaths per 100,000 people for each state is calculated:
data <- data %>% mutate(`Deaths per 100,000 people` = .data[["No. of covid deaths\n(22/01/2020-10/10/2021)"]] / `population (2019)` * 100000)
  2. Political affiliation of state executive variable is generated:
dem <- c('CA', 'CO', 'CT', 'DE', 'DC', 'HI', 'IL', 'KS', 'LA', 'ME', 'MI', 'MN', 'NV', 'NJ', 'NM', 'NY', 'NC', 'OR', 'PA', 'RI', 'VA', 'WA', 'WI', 'KY')
# Montana ('MT') is dropped; every remaining state not listed in `dem` is labelled Republican
data <- data %>%
  filter(State != 'MT') %>%
  mutate(`Political affiliation of state executive` =
           if_else(State %in% dem, 'Democratic Party', 'Republican Party'))

NOTE: Montana was removed from the dataset as it had both a Democratic and a Republican state executive during the data interval.
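As a quick illustration of the per-100,000 rate and the party labelling described above, the following base R sketch applies both steps to a toy data frame. The state codes `AA`, `BB` and `CC`, the population and death figures, and the `dem_toy` vector are all invented for illustration and are not taken from the report's data.

```r
# Toy data: hypothetical states and figures, for illustration only
toy <- data.frame(
  State      = c("AA", "BB", "CC"),
  population = c(250000, 1000000, 4000000),
  deaths     = c(500, 1500, 10000)
)

# Deaths per 100,000 people = deaths / population * 100,000
toy$rate <- toy$deaths / toy$population * 100000

# States listed in `dem_toy` are labelled Democratic, all others Republican
dem_toy <- c("AA", "CC")
toy$party <- ifelse(toy$State %in% dem_toy, "Democratic Party", "Republican Party")

print(toy)
```

Scaling by 100,000 puts small and large states on the same footing, which is what allows the state-level rates to be averaged within each party group.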

Data Summary

Variables:

Boxplot visualisation

library(ggplot2)
pal <- c('#457b9d', '#e63946')  # blue for Democratic, red for Republican
plot <- ggplot(data = data, aes(x = `Deaths per 100,000 people`, y = `Political affiliation of state executive`, fill = `Political affiliation of state executive`)) +
  geom_boxplot(show.legend = FALSE) + coord_flip() +
  labs(title = "Covid deaths in Democratic and Republican states") +
  theme(panel.background = element_rect(fill = '#f1faee'), plot.background = element_rect(fill = '#f1faee')) +
  scale_fill_manual(values = pal)
plot

Initial inspection shows a slight difference in median deaths per 100,000 people between red and blue states.

Outlier removal.

# Outliers are identified within the Republican group only
data_rep <- data %>% subset(`Political affiliation of state executive` == "Republican Party")
# Flag values more than 1.5 * IQR below Q1 or above Q3
is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}
loc <- which(is_outlier(data_rep$`Deaths per 100,000 people`))
out_state <- as.character(data_rep$State[loc])
# %in% handles zero, one or several flagged states correctly
data <- data %>% filter(!State %in% out_state)
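The 1.5 × IQR rule used above can be checked on a small vector. This is a minimal sketch: the vector `x_toy` is made up for illustration, and `is_outlier_toy` simply repeats the logic of the report's `is_outlier` under a different name.

```r
# Same rule as is_outlier(): flag values beyond 1.5 * IQR from the quartiles
is_outlier_toy <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * IQR(x)
  x < q[1] - fence | x > q[2] + fence
}

# Hypothetical values: five similar observations and one extreme one
x_toy <- c(10, 12, 11, 13, 12, 50)

flags <- is_outlier_toy(x_toy)
which(flags)    # position of the flagged value
x_toy[!flags]   # data with the flagged value removed
```

Here the quartiles are 11.25 and 12.75, so the fences sit at 9 and 15 and only the value 50 is flagged.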

Descriptive Statistics

  1. Check for missing and special values:
# Flag NA, NaN and infinite entries in every column
special <- sapply(data, function(x) is.na(x) | is.nan(x) | is.infinite(x))
which(special)
## integer(0)

No missing/special values found

  2. Descriptive statistics:
library(kableExtra)  # kbl(), kable_classic(), kable_styling()
# All statistics, including Min, are computed within each party group
data %>% group_by(`Political affiliation of state executive`) %>%
  summarise(Min = min(`Deaths per 100,000 people`, na.rm = TRUE),
            Q1 = quantile(`Deaths per 100,000 people`, probs = .25, na.rm = TRUE),
            Median = median(`Deaths per 100,000 people`, na.rm = TRUE),
            Q3 = quantile(`Deaths per 100,000 people`, probs = .75, na.rm = TRUE),
            Max = max(`Deaths per 100,000 people`, na.rm = TRUE),
            Mean = mean(`Deaths per 100,000 people`, na.rm = TRUE),
            SD = sd(`Deaths per 100,000 people`, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(`Deaths per 100,000 people`))) -> table1
kbl(table1) %>% kable_classic(html_font = "Timesnewroman") %>% kable_styling(font_size = 16)
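Political affiliation of state executive   Min        Q1         Median     Q3         Max        Mean       SD         n    Missing
Democratic Party                           58.33861   151.6586   202.8300   234.3376   304.8748   191.6956   68.15866   24   0
Republican Party                           58.33861   182.5347   221.0928   254.5051   329.6542   213.8196   60.80173   26   0

Note: the Min column as printed is the overall minimum across all states (58.33861), which is why it is identical for both groups.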

Hypothesis Testing

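\(H_0: \mu_1 - \mu_2 = 0\)

\(H_A: \mu_1 - \mu_2 \ne 0\)

\(\mu_1\) : mean Covid-19 deaths per 100,000 people in Republican states

\(\mu_2\) : mean Covid-19 deaths per 100,000 people in Democratic states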

Assumptions:

Normality testing

library(qqplotr)  # stat_qq_line(), stat_qq_point(), stat_qq_band()
data_rep <- data %>% subset(`Political affiliation of state executive` == "Republican Party")
# theme_light() is applied first so the custom background colours are not overwritten
plot1 <- ggplot(data = data_rep, mapping = aes(sample = `Deaths per 100,000 people`)) +
           stat_qq_line() + stat_qq_point() +
           stat_qq_band(alpha = 0.5, distribution = "norm", fill = '#e63946') +
           theme_light() +
           theme(panel.background = element_rect(fill = '#f1faee'),
                 plot.background = element_rect(fill = '#f1faee')) +
           labs(title = "Q-Q plot for Republican data", x = "Theoretical quantiles", y = "Deaths per 100,000 people")
plot1

Non-normality is generally characterised by a defined S-shape. As only two points (with no continuing trend) fall outside the 95% CI for the normal quantiles, the data show only a very minor departure from normality in the lower tail of the distribution. A two-sample t-test can still be performed, as it is generally robust to minor departures from normality and tends to maintain the desired significance level (e.g. 0.05) even when normality is not strictly met.

Normality testing cont.

library(qqplotr)  # stat_qq_line(), stat_qq_point(), stat_qq_band()
data_dem <- data %>% subset(`Political affiliation of state executive` == "Democratic Party")
# theme_light() is applied first so the custom background colours are not overwritten
plot2 <- ggplot(data = data_dem, mapping = aes(sample = `Deaths per 100,000 people`)) +
           stat_qq_line() + stat_qq_point() +
           stat_qq_band(alpha = 0.5, distribution = "norm", fill = '#457b9d') +
           theme_light() +
           theme(panel.background = element_rect(fill = '#f1faee'),
                 plot.background = element_rect(fill = '#f1faee')) +
           labs(title = "Q-Q plot for Democratic data", x = "Theoretical quantiles", y = "Deaths per 100,000 people")
plot2

As seen in the Q-Q plot above, all points of the Democratic data lie within the 95% confidence band of the normal distribution, so normality can be assumed and a two-sample t-test can be performed on the datasets.

Homogeneity of Variance

library(car)  # leveneTest()
# The grouping variable is wrapped in factor() so it is not coerced implicitly
tab2 <- leveneTest(data$`Deaths per 100,000 people` ~ factor(data$`Political affiliation of state executive`))
kbl(tab2) %>% kable_classic(html_font = "Timesnewroman") %>% kable_styling(font_size = 16)
        Df   F value     Pr(>F)
group   1    0.6740034   0.4157182
        48   NA          NA

The p-value for Levene's test of equal variance of Covid-19 deaths per 100,000 people between red and blue states was 0.416. This value is greater than 0.05; therefore, we fail to reject \(H_0\) and assume equal variances.
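With its default `center = median`, `leveneTest()` amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The base R sketch below illustrates that idea on simulated data. The vectors `y_toy` and `g_toy` are hypothetical and are not the report's data, so the F value and p-value will differ from the table above.

```r
set.seed(1)  # reproducible toy data

# Hypothetical rates for two groups of 24 and 26 states
g_toy <- factor(rep(c("Democratic Party", "Republican Party"), times = c(24, 26)))
y_toy <- c(rnorm(24, mean = 190, sd = 65), rnorm(26, mean = 215, sd = 60))

# Absolute deviation of each value from its own group's median
z <- abs(y_toy - ave(y_toy, g_toy, FUN = median))

# One-way ANOVA on the deviations: the F test compares spread between groups
lev <- anova(lm(z ~ g_toy))
lev
```

With 50 observations in two groups, the ANOVA has 1 and 48 degrees of freedom, which matches the Df column reported above.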

Two-sample t-test - Assuming Equal Variance

t.test(`Deaths per 100,000 people` ~ `Political affiliation of state executive`, data = data,
       var.equal = TRUE, alternative = "two.sided")
## 
##  Two Sample t-test
## 
## data:  Deaths per 100,000 people by Political affiliation of state executive
## t = -1.213, df = 48, p-value = 0.2311
## alternative hypothesis: true difference in means between group Democratic Party and group Republican Party is not equal to 0
## 95 percent confidence interval:
##  -58.79523  14.54738
## sample estimates:
## mean in group Democratic Party mean in group Republican Party 
##                       191.6956                       213.8196

p = 0.2311 > 0.05, therefore we fail to reject \(H_0\).

There is no statistically significant difference between the mean Covid-19 mortality of red and blue states.
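The test statistic can be reproduced by hand from the group means, standard deviations and sample sizes in the descriptive statistics table. The sketch below uses the pooled-variance formula for an equal-variance two-sample t-test. The variable names are chosen for illustration, and the inputs are the rounded values printed in this report, so the results agree with the `t.test()` output only up to rounding.

```r
# Summary statistics as printed in the descriptive statistics table
m_dem <- 191.6956; s_dem <- 68.15866; n_dem <- 24
m_rep <- 213.8196; s_rep <- 60.80173; n_rep <- 26

dof <- n_dem + n_rep - 2

# Pooled standard deviation under the equal-variance assumption
s_pooled <- sqrt(((n_dem - 1) * s_dem^2 + (n_rep - 1) * s_rep^2) / dof)

# Standard error of the difference in means
se <- s_pooled * sqrt(1 / n_dem + 1 / n_rep)

t_stat <- (m_dem - m_rep) / se                       # Democratic minus Republican
p_val  <- 2 * pt(-abs(t_stat), dof)                  # two-sided p-value
ci     <- (m_dem - m_rep) + c(-1, 1) * qt(0.975, dof) * se

round(c(t = t_stat, df = dof, p = p_val), 4)
round(ci, 3)
```

The 95% confidence interval for the difference in means contains 0, which is consistent with failing to reject \(H_0\).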

Discussion - Results

Overall, the investigation found no statistically significant evidence that the political affiliation of a state's executive affects Covid-19 mortality in that state.

Discussion - Strengths and Limitations

Strengths:

Limitations:

Discussion - Future investigations

Conclusion

Despite Republican states showing a higher mean death rate on initial inspection, the difference between the mean Covid-19 deaths per 100,000 people in Republican-run states and in states run by Democratic governors was not statistically significant. The analysis therefore gives no insight into the effectiveness of either party's ability to reduce Covid-19 mortality.

References