Joe Biden, the Democratic candidate, won the 2020 election by flipping 5 states (Michigan, Wisconsin, Arizona, Pennsylvania, and Georgia) that the incumbent, Donald Trump, had carried in 2016. It is therefore interesting to see how changes in county-level results contributed to the former vice president’s victory.
Data for the 2020 election were obtained through web scraping: https://github.com/charlottetse33/portfolio/blob/main/NBC%20US%20election/web%20scrapping.R
The 5 flipped states are Arizona, Georgia, Michigan, Pennsylvania, and Wisconsin. The following code stores their names in a character vector, in the lowercase form used by the map data:
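# Packages used throughout: tidyverse (dplyr, tidyr, stringr, readr, ggplot2) and maps (county.fips, map polygons)
library(tidyverse)
library(maps)
flipped_states <- c('arizona', 'georgia', 'michigan', 'pennsylvania', 'wisconsin')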
We’re going to zoom in on the counties of these 5 states to inspect how the vote changed between the 2016 and 2020 elections. We start with the county polygons and the FIPS code lookup table:
(county <- as_tibble(map_data("county")))
## # A tibble: 87,949 x 6
## long lat group order region subregion
## <dbl> <dbl> <dbl> <int> <chr> <chr>
## 1 -86.5 32.3 1 1 alabama autauga
## 2 -86.5 32.4 1 2 alabama autauga
## 3 -86.5 32.4 1 3 alabama autauga
## 4 -86.6 32.4 1 4 alabama autauga
## 5 -86.6 32.4 1 5 alabama autauga
## 6 -86.6 32.4 1 6 alabama autauga
## 7 -86.6 32.4 1 7 alabama autauga
## 8 -86.6 32.4 1 8 alabama autauga
## 9 -86.6 32.4 1 9 alabama autauga
## 10 -86.6 32.4 1 10 alabama autauga
## # ... with 87,939 more rows
(county_fips <- as_tibble(county.fips))
## # A tibble: 3,085 x 2
## fips polyname
## <int> <chr>
## 1 1001 alabama,autauga
## 2 1003 alabama,baldwin
## 3 1005 alabama,barbour
## 4 1007 alabama,bibb
## 5 1009 alabama,blount
## 6 1011 alabama,bullock
## 7 1013 alabama,butler
## 8 1015 alabama,calhoun
## 9 1017 alabama,chambers
## 10 1019 alabama,cherokee
## # ... with 3,075 more rows
county_fips <- county_fips %>% mutate(state_county = polyname, .keep = "unused")
election_result_temp <- read_csv("https://raw.githubusercontent.com/charlottetse33/portfolio/main/NBC%20US%20election/USPresidential08-16.csv") %>%
  select(fips = fips_code, total_2016, dem_2016, gop_2016) %>%
  mutate(fips = as.integer(fips))
## Rows: 3112 Columns: 14
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (2): fips_code, county
## dbl (12): total_2008, dem_2008, gop_2008, oth_2008, total_2012, dem_2012, go...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
election_res_2016_inner.join <- election_result_temp %>% inner_join(county_fips, by = "fips")
election_res_2016 <- election_res_2016_inner.join %>%
  filter(str_extract(state_county, "[a-z]+") %in% flipped_states) %>%
  select(-fips)
election_res_2016
## # A tibble: 396 x 4
## total_2016 dem_2016 gop_2016 state_county
## <dbl> <dbl> <dbl> <chr>
## 1 18467 6431 11112 michigan,delta
## 2 6743 904 5676 pennsylvania,fulton
## 3 284832 169169 106559 pennsylvania,delaware
## 4 3572 2695 841 georgia,hancock
## 5 3577 1186 2343 georgia,seminole
## 6 33848 16050 15871 wisconsin,sauk
## 7 96945 34436 58941 pennsylvania,washington
## 8 14757 6774 7239 michigan,leelanau
## 9 6285 619 5561 georgia,brantley
## 10 3486 1156 2158 michigan,baraga
## # ... with 386 more rows
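The state filter above relies on `str_extract(state_county, "[a-z]+")`, which returns only the first run of lowercase letters in each key. The sketch below, on made-up keys, shows why that is enough for the 5 one-word states used here and how it would behave on a two-word state. The vector `keys` and the comma-splitting alternative are illustrative assumptions, not part of the original analysis.

```r
library(stringr)

# Toy keys in the same "state,county" format as maps::county.fips
keys <- c("michigan,delta", "georgia,de kalb", "new york,albany")
flipped_states <- c("arizona", "georgia", "michigan", "pennsylvania", "wisconsin")

# "[a-z]+" stops at the first non-letter, so it yields the first word only
first_word <- str_extract(keys, "[a-z]+")

# Dropping everything from the comma onward keeps multi-word state names intact
state_full <- str_remove(keys, ",.*$")

# Either version selects the same rows for the 5 flipped states
keep <- first_word %in% flipped_states
```

Because none of the flipped states has a space in its name, the first-word pattern is safe in this analysis; the comma-based version would be the more general choice if states such as New York were added.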
In the 5 flipped states, the names of 4 counties differ between election_res_2020 (the scraped 2020 results, read in below) and election_res_2016:
In election_res_2020, where the names come from NBC News webpages, they are "georgia,dekalb", "michigan,st. clair", "michigan,st. joseph", and "wisconsin,st. croix".
In election_res_2016, where the names come from maps::county.fips, they are "georgia,de kalb", "michigan,st clair", "michigan,st joseph", and "wisconsin,st croix".
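One way to find such mismatches before joining is an anti-join in both directions. The following is a minimal sketch on toy tables; `res_2016` and `res_2020` are hypothetical stand-ins that carry only the join key, not the real data frames.

```r
library(dplyr)
library(tibble)

# Toy stand-ins for the two tables; only the join key matters here
res_2016 <- tibble(state_county = c("georgia,de kalb", "michigan,st clair", "wisconsin,sauk"))
res_2020 <- tibble(state_county = c("georgia,dekalb", "michigan,st. clair", "wisconsin,sauk"))

# Keys that appear in one table but have no partner in the other
only_2016 <- anti_join(res_2016, res_2020, by = "state_county")
only_2020 <- anti_join(res_2020, res_2016, by = "state_county")
```

Rows left in either result are the names that an inner join would silently drop, so they are the ones that need harmonising.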
The following code builds election_res_1620, a data frame that combines the 2016 and 2020 county-level results for the 5 states:
# Read the scraped 2020 results, harmonise the county names, and join them to the 2016 results
election_res_2020 <- read_csv("https://raw.githubusercontent.com/charlottetse33/portfolio/main/NBC%20US%20election/election_res_2020.csv")
## New names:
## * `` -> ...1
## Rows: 4588 Columns: 5
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (1): state_county
## dbl (4): ...1, trump, biden, others
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
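# NOTE: dem_2020 is taken from the `trump` column and gop_2020 from the `biden` column.
# This looks inverted, yet the state totals printed further down show the Democratic
# candidate ahead in all 5 states in 2020, which suggests the two columns are swapped
# in the scraped CSV. Check the CSV before changing this mapping.
temp_2020 <- election_res_2020[str_extract(election_res_2020$state_county, "[a-z]+") %in% flipped_states, ] %>%
  mutate(state_county = state_county, total_2020 = trump + biden + others, dem_2020 = trump, gop_2020 = biden, .keep = "none")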
# Rename the 4 inconsistent counties to the spelling used by maps::county.fips
temp_2020[temp_2020[,1] == "georgia,dekalb", 1] <- "georgia,de kalb"
temp_2020[temp_2020[,1] == "michigan,st. clair", 1] <- "michigan,st clair"
temp_2020[temp_2020[,1] == "michigan,st. joseph", 1] <- "michigan,st joseph"
temp_2020[temp_2020[,1] == "wisconsin,st. croix", 1] <- "wisconsin,st croix"
election_res_1620 <- inner_join(election_res_2016, temp_2020, by = "state_county") %>% select(state_county, total_2016, dem_2016, gop_2016, total_2020, dem_2020, gop_2020) %>% arrange(state_county)
election_res_1620
## # A tibble: 396 x 7
## state_county total_2016 dem_2016 gop_2016 total_2020 dem_2020 gop_2020
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arizona,apache 18659 12196 5315 35183 23293 11442
## 2 arizona,cochise 43147 15291 25036 60473 23732 35557
## 3 arizona,coconino 44929 25308 16573 73346 44698 27052
## 4 arizona,gila 21398 6746 13672 27678 8943 18377
## 5 arizona,graham 11939 3301 8025 14996 4034 10749
## 6 arizona,greenlee 3243 1092 1892 3688 1182 2433
## 7 arizona,la paz 4931 1318 3381 7460 2236 5129
## 8 arizona,maricopa 1201934 549040 590465 2069475 1040774 995665
## 9 arizona,mohave 74189 16485 54656 104705 24831 78535
## 10 arizona,navajo 35409 15362 18165 51783 23383 27657
## # ... with 386 more rows
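The manual name replacements used when building election_res_1620 could also be written as a single lookup, which keeps every old/new pair in one place. This is only a sketch in base R: `name_fixes` and `keys` are hypothetical names introduced for illustration, and the pairs are the ones listed in the text above.

```r
# Hypothetical lookup: name in the 2020 data -> name used by maps::county.fips
name_fixes <- c(
  "georgia,dekalb"      = "georgia,de kalb",
  "michigan,st. clair"  = "michigan,st clair",
  "michigan,st. joseph" = "michigan,st joseph",
  "wisconsin,st. croix" = "wisconsin,st croix"
)

# Toy keys: two need fixing, one is already consistent
keys <- c("georgia,dekalb", "wisconsin,sauk", "wisconsin,st. croix")

# Replace a key only when it has an entry in the lookup; leave the rest untouched
fixed <- ifelse(keys %in% names(name_fixes), name_fixes[keys], keys)
```

With a lookup like this, adding a fifth inconsistent county would mean adding one entry rather than another assignment line.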
Based on election_res_1620, we create a data frame that summarizes the total number of votes received by each party at the state level.
# Sum the county votes by state, then reshape to one row per state, party, and year
election_res_1620_state <- election_res_1620 %>%
  mutate(state = str_to_title(str_extract(state_county, "[a-z]+"))) %>%
  group_by(state) %>%
  summarise(dem_2016 = sum(dem_2016), dem_2020 = sum(dem_2020), gop_2016 = sum(gop_2016), gop_2020 = sum(gop_2020)) %>%
  pivot_longer(cols = dem_2016:gop_2020, names_to = "party_year", values_to = "vote") %>%
  mutate(party = str_extract(party_year, "[a-z]+"), year = str_extract(party_year, "[0-9]+"), .keep = "unused") %>%
  select(state, party, year, vote)
election_res_1620_state
## # A tibble: 20 x 4
## state party year vote
## <chr> <chr> <chr> <dbl>
## 1 Arizona dem 2016 936250
## 2 Arizona dem 2020 1672143
## 3 Arizona gop 2016 1021154
## 4 Arizona gop 2020 1661686
## 5 Georgia dem 2016 1837300
## 6 Georgia dem 2020 2473633
## 7 Georgia gop 2016 2068623
## 8 Georgia gop 2020 2461854
## 9 Michigan dem 2016 2267373
## 10 Michigan dem 2020 2804040
## 11 Michigan gop 2016 2279210
## 12 Michigan gop 2020 2649852
## 13 Pennsylvania dem 2016 2844705
## 14 Pennsylvania dem 2020 3459923
## 15 Pennsylvania gop 2016 2912941
## 16 Pennsylvania gop 2020 3378263
## 17 Wisconsin dem 2016 1382210
## 18 Wisconsin dem 2020 1630673
## 19 Wisconsin gop 2016 1409467
## 20 Wisconsin gop 2020 1610065
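To read the flips directly off this table, the long format can be widened again and the two parties subtracted. The sketch below copies the Arizona and Georgia rows from the output above into a small tibble; `state_votes` and `margins` are names introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Arizona and Georgia rows copied from election_res_1620_state above
state_votes <- tribble(
  ~state,    ~party, ~year,  ~vote,
  "Arizona", "dem",  "2016",  936250,
  "Arizona", "dem",  "2020", 1672143,
  "Arizona", "gop",  "2016", 1021154,
  "Arizona", "gop",  "2020", 1661686,
  "Georgia", "dem",  "2016", 1837300,
  "Georgia", "dem",  "2020", 2473633,
  "Georgia", "gop",  "2016", 2068623,
  "Georgia", "gop",  "2020", 2461854
)

# One row per state and year, with the Democratic minus Republican margin
margins <- state_votes %>%
  pivot_wider(names_from = party, values_from = vote) %>%
  mutate(margin = dem - gop)
```

In both states the margin is negative in 2016 and positive in 2020, which is the flip the rest of the analysis visualises.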
With all the necessary data ready, we create 2 choropleth maps, one for the 2016 and one for the 2020 election results.
# Compute each county's margin, join it to the county polygons, and draw one map per election year
state <- as_tibble(map_data("state"))
election_res_1620 <- election_res_1620 %>%
  mutate(result_2016 = (dem_2016 - gop_2016) / total_2016, result_2020 = (dem_2020 - gop_2020) / total_2020)
county_temp <- county %>% filter(region %in% flipped_states) %>% unite(state_county, region, subregion, sep = ",")
election_result_agg <- inner_join(election_res_1620, county_temp, by = "state_county") %>%
  select(state_county, result_2016, result_2020, long, lat, group, order) %>%
  pivot_longer(cols = result_2016:result_2020, names_to = "year", names_prefix = "result_", values_to = "result")
p <- ggplot(election_result_agg, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = result)) +
  facet_grid(~ year) +
  geom_polygon(data = state, aes(long, lat, group = group), colour = "black", fill = NA, size = 0.6)
p + ggtitle("Flipped States: 2016 VS. 2020 Presidential Election \n") + theme_bw() +
scale_fill_gradient2(name=NULL, limits = c(-1, 1),
low = "#e41a1c", high = "#377eb8",
breaks = c(-1, 1), labels = c("Republican Won ", "Democrat Won")) +
labs(x = NULL, y = NULL) +
theme(legend.position = "bottom",
strip.background = element_rect(fill="lightgray", size= 0.8),
plot.title = element_text(size = 20, face = "bold"),
strip.text.x = element_text(size = 16, face = "bold.italic"),
legend.text = element_text(size = 16, face = "bold"),
legend.spacing.x = unit(0.5, "line"),
legend.key.size = unit(0.9, "cm")) +
guides(fill = guide_legend(title.position = "top", title.hjust = 0.5))
As you can see, only the 5 flipped states are colored. The color shows each county's margin, (dem - gop) / total votes: blue where the Democratic candidate led and red where the Republican candidate led, with stronger color for larger margins.
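As a worked example of the quantity mapped to color, the sketch below recomputes the margin for a single county, using the Maricopa row printed in election_res_1620 above. The one-row tibble `maricopa` is introduced here for illustration only.

```r
library(dplyr)
library(tibble)

# Maricopa County figures copied from the election_res_1620 output above
maricopa <- tibble(
  state_county = "arizona,maricopa",
  total_2016 = 1201934, dem_2016 = 549040,  gop_2016 = 590465,
  total_2020 = 2069475, dem_2020 = 1040774, gop_2020 = 995665
)

# Same formula as in the map code: vote-share difference, dem minus gop
maricopa <- maricopa %>%
  mutate(
    result_2016 = (dem_2016 - gop_2016) / total_2016,
    result_2020 = (dem_2020 - gop_2020) / total_2020
  )
```

The 2016 value is about -0.034 and the 2020 value about 0.022, so the county moves from light red to light blue between the two panels.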
The state-level totals in election_res_1620_state can be compared with a grouped bar chart:
h <- ggplot(election_res_1620_state, aes(x = state, y = vote, fill = party)) + geom_col(position = "dodge", width = 0.5) + facet_grid(~ year)
h + ggtitle("Flipped States: 2016 VS. 2020 Presidential Election \n") + theme_bw() +
scale_y_continuous(breaks = c(0, 1000000, 2000000, 3000000),
labels = c("0", "1,000", "2,000", "3,000")) +
scale_fill_manual(name=NULL, values = c("dem" = "#377eb8", "gop" = "#e41a1c"),
labels = c("gop"= "Republican Party", "dem"="Democratic Party " )) +
labs(x = NULL, y = "No. of votes\n(in thousands)") +
theme(legend.position = "bottom",
strip.background = element_rect(fill="lightgray", size= 0.8),
plot.title = element_text(size = 20, face = "bold"),
strip.text.x = element_text(size = 16, face = "bold.italic"),
legend.text = element_text(size = 16, face = "bold"),
legend.spacing.x = unit(0.5, "line"),
legend.key.size = unit(0.9, "cm"),
axis.title.y = element_text(face="bold.italic", size=18),
axis.text.x = element_text(size = 14, face="italic"),
axis.text.y = element_text(size = 14, face="italic")
) + guides(fill = guide_legend(title.position = "top", title.hjust = 0.5))