Rough Draft

Due 5/7 at 11 pm: Published R Markdown document with link submitted to Moodle

Each group should submit ONE assignment.

On May 9th, we will spend time peer-reviewing each other’s visualizations during class. Choose ONE visualization that you feel would benefit most from peer review and be ready to present it to the class.

In a published R Markdown document, your rough draft must include:

In this project, we analyze and discuss how COVID-19 cases and deaths varied by state based on each state’s vote in the 2020 presidential election. We want to compare the rate of deaths between states that voted for the Republican candidate and states that voted for the Democratic candidate, and see whether there is a correlation between the vote outcome and the rate of COVID-19 deaths. We also hope to explore the reasons behind these trends. This analysis is important because it could help identify the states that are most vulnerable in extreme situations (such as pandemics), so that some of the factors that put those populations at risk can be pinpointed and potentially alleviated. This topic has been thoroughly researched, but we hope to introduce different perspectives so that research on this important subject takes a more holistic view.

The data is from the CDC WONDER public database and was collected by the National Center for Health Statistics, a division of the CDC. It is derived from death certificates submitted by the 57 vital statistics jurisdictions in the United States. Data collection began in January 2020 and has continued since. The key variables are underlying cause of death, contributing cause of death, time of death, age group, sex, race, place of death, population, and COVID-19 deaths.

  1. At least two figures, ONE of which is ready for peer-review during class time
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.5
## ✔ ggplot2   3.5.1     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.2.1
## ✔ purrr     1.0.4     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(lubridate)
library(viridis)
## Loading required package: viridisLite
library(ggplot2)
covid_data <- read_csv("us-states.csv")
## Rows: 61942 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): state, fips
## dbl  (2): cases, deaths
## date (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
voting_data <- read_csv("voting.csv")
## Rows: 51 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): state, state_abr
## dbl (6): trump_pct, biden_pct, trump_vote, biden_vote, trump_win, biden_win
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
population_data <- read_csv("PEPPOP2021.NST_EST2021_POP-2025-05-07T224127.csv")
## Rows: 57 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Geographic Area Name (NAME)
## num (3): Estimates Base Population, April 1, 2020 (POP_BASE2020), Population...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
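The plots that follow use a data frame named `merged_data_2021`, whose construction is not shown in this draft. Below is a minimal sketch of one way such a table could be built. It assumes that `cases` is a cumulative count, that the population table has already been reduced to `state` and `population` columns, and that `daily_case_rate` means average daily new cases per 100 residents. The helper name `build_merged_2021` and all toy values are hypothetical, and the actual cleaning steps may differ.

```r
library(dplyr)

# Hypothetical helper: collapse daily state counts to one row per state for
# 2021, then attach population and 2020 vote shares.
build_merged_2021 <- function(covid_tbl, voting_tbl, population_tbl) {
  covid_tbl %>%
    filter(date >= as.Date("2021-01-01"), date <= as.Date("2021-12-31")) %>%
    group_by(state) %>%
    summarise(
      # Assumption: `cases` is cumulative, so new cases in the window are the
      # largest count minus the smallest count.
      new_cases = max(cases) - min(cases),
      days = as.numeric(difftime(max(date), min(date), units = "days")),
      .groups = "drop"
    ) %>%
    mutate(avg_daily_cases = new_cases / days) %>%
    left_join(population_tbl, by = "state") %>%
    left_join(voting_tbl, by = "state") %>%
    # Assumed definition: average daily new cases per 100 residents.
    mutate(daily_case_rate = avg_daily_cases / population * 100)
}

# Toy inputs (made-up numbers, not the project data).
toy_covid <- tibble(
  date  = as.Date(c("2020-12-31", "2021-01-01", "2021-01-02", "2021-01-03",
                    "2021-01-01", "2021-01-03")),
  state = c("A", "A", "A", "A", "B", "B"),
  cases = c(90, 100, 150, 200, 10, 30)
)
toy_population <- tibble(state = c("A", "B"), population = c(1000, 2000))
toy_voting <- tibble(state = "A", trump_pct = 60, biden_pct = 38)

toy_merged <- build_merged_2021(toy_covid, toy_voting, toy_population)
print(toy_merged)
```

In the toy input, state B has no voting row, so its `trump_pct` is `NA` after the left join. ggplot2 drops rows with missing aesthetics and reports them in a "Removed ... rows" warning, which is one way a warning like that can arise.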
# Scatter plot: one point per state, Trump vote share (x) vs. average daily case rate (y)
ggplot(merged_data_2021, aes(x = trump_pct, y = daily_case_rate, color = state)) +
  geom_point(size = 3, show.legend = FALSE) +
  labs(
    title = "Trump Vote Share vs. Daily COVID-19 Case Rate by State (2021)",
    x = "Trump Vote Percentage (2020)",
    y = "Average Daily COVID-19 Case Rate",
    caption = "Sources: https://www.eac.gov and https://www.cdc.gov/covid/index.html"
  ) +
  # Colorblind-friendly discrete palette, one color per state
  scale_color_viridis_d(option = "cividis") +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    axis.title = element_text(face = "bold"),
    axis.text = element_text(face = "bold"),
    legend.position = "none"  # a per-state legend would crowd the plot
  )
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).

Figure 1. States with a higher percentage of votes for Trump tend to have a higher average daily COVID-19 case rate. The case data cover daily COVID-19 cases for all of 2021 and were collected by the CDC. The state voting data were collected by the U.S. Election Assistance Commission. We cleaned the data by combining the two separate datasets and making them a little easier to work with; otherwise, most of the data was used raw. The graph is a scatter plot showing the relationship between Trump’s 2020 vote percentage and the average daily COVID-19 case rate in each U.S. state for 2021. Each point represents a state, colored uniquely. The x-axis shows Trump’s vote share (in percent), ranging from 0% to 80%, and the y-axis shows the average daily COVID-19 case rate, ranging from 0.03 to 0.15. The trend shows that as Trump’s vote percentage in a state increases, the average daily COVID-19 case rate also increases.

Graph 1 Alt Text:

This graph is a scatter plot showing the relationship between Trump’s 2020 vote percentage and the average daily COVID-19 case rate in each U.S. state for 2021. Each point represents a state, colored uniquely. The x-axis shows Trump’s vote share (in percent), ranging from 0% to 80%, and the y-axis shows the average daily COVID-19 case rate, ranging from 0.03 to 0.15. The trend shows that as Trump’s vote percentage in a state increases, the average daily COVID-19 case rate also increases.
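One way to keep alt text like this attached to the plot object itself is the `alt` argument of `labs()`, which recent ggplot2 releases (including the 3.5.1 version loaded above) provide. The sketch below uses a hypothetical toy data frame in place of `merged_data_2021` and a shortened alt string; both are illustrative assumptions.

```r
library(ggplot2)

# Toy stand-in for merged_data_2021 (made-up values).
toy_states <- data.frame(
  trump_pct = c(35, 50, 65),
  daily_case_rate = c(0.05, 0.08, 0.12)
)

alt_text <- paste(
  "Scatter plot of Trump's 2020 vote percentage against the average daily",
  "COVID-19 case rate in each U.S. state for 2021."
)

p <- ggplot(toy_states, aes(x = trump_pct, y = daily_case_rate)) +
  geom_point(size = 3) +
  labs(alt = alt_text)

# get_alt_text() returns the text stored on the plot object.
get_alt_text(p)
```

In an R Markdown document, the knitr chunk option `fig.alt` is another route for supplying the same text to the published HTML.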

# Scatter plot: one point per state, Biden vote share (x) vs. average daily case rate (y)
ggplot(merged_data_2021, aes(x = biden_pct, y = daily_case_rate, color = state)) +
  geom_point(size = 3, show.legend = FALSE) +
  labs(
    title = "Biden Vote Share vs. Daily COVID-19 Case Rate by State (2021)",
    x = "Biden Vote Percentage (2020)",
    y = "Average Daily COVID-19 Case Rate",
    caption = "Sources: https://www.eac.gov and https://www.cdc.gov/covid/index.html"
  ) +
  # Colorblind-friendly discrete palette, one color per state
  scale_color_viridis_d(option = "cividis") +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    axis.title = element_text(face = "bold"),
    axis.text = element_text(face = "bold"),
    legend.position = "none"  # a per-state legend would crowd the plot
  )
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).

Figure 2. States with a higher percentage of votes for Biden tend to have a lower average daily COVID-19 case rate. The case data cover daily COVID-19 cases for all of 2021 and were collected by the CDC. The state voting data were collected by the U.S. Election Assistance Commission. We cleaned the data by combining the two separate datasets and making them a little easier to work with; otherwise, most of the data was used raw. The graph is a scatter plot showing the relationship between Biden’s 2020 vote percentage and the average daily COVID-19 case rate in each U.S. state for 2021. Each point on the graph represents one of the 50 states. The x-axis shows Biden’s vote share (in percent), ranging from 0% to 80%, and the y-axis shows the average daily COVID-19 case rate, ranging from 0.03 to 0.15. The trend shows that as Biden’s vote percentage in a state increases, the average daily COVID-19 case rate decreases.

Graph 2 Alt Text:

This graph is a scatter plot showing the relationship between Biden’s 2020 vote percentage and the average daily COVID-19 case rate in each U.S. state for 2021. Each point on the graph represents one of the 50 states. The x-axis shows Biden’s vote share (in percent), ranging from 0% to 80%, and the y-axis shows the average daily COVID-19 case rate, ranging from 0.03 to 0.15. The trend shows that as Biden’s vote percentage in a state increases, the average daily COVID-19 case rate decreases.
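The trends described for both figures could also be checked numerically rather than by eye. The following is a minimal sketch that computes a Pearson correlation and a least-squares slope. It uses a hypothetical toy data frame in place of `merged_data_2021`, with made-up values that are not project results. On the real data, the sign of each correlation would indicate the direction of the trend and its size the strength; with one point per state, this describes an association and not a cause.

```r
# Toy stand-in for merged_data_2021 (made-up values, not project results).
toy_states <- data.frame(
  trump_pct = c(30, 40, 50, 60, 70),
  biden_pct = c(68, 58, 48, 38, 28),
  daily_case_rate = c(0.04, 0.06, 0.07, 0.10, 0.12)
)

# Pearson correlation: positive means the two variables rise together.
# use = "complete.obs" skips rows with missing values.
r_trump <- cor(toy_states$trump_pct, toy_states$daily_case_rate,
               use = "complete.obs")
r_biden <- cor(toy_states$biden_pct, toy_states$daily_case_rate,
               use = "complete.obs")

# Least-squares slope: change in case rate per percentage point of vote share.
fit <- lm(daily_case_rate ~ trump_pct, data = toy_states)
slope_trump <- unname(coef(fit)["trump_pct"])

print(c(r_trump = r_trump, r_biden = r_biden, slope_trump = slope_trump))
```

Because the two vote shares in the toy data sum to a constant, the two correlations are equal in size and opposite in sign, which mirrors how the second scatter plot reads as a reflection of the first.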