── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(spdep)
Loading required package: sp
Loading required package: spData
To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`
library(tigris)
To enable caching of data, set `options(tigris_use_cache = TRUE)`
in your R script or .Rprofile.
library(readr)
# Read in election data
votesdata <- read_csv("county level presidential data 2020.csv")
Rows: 257 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (13): Office, State, RaceDate, CensusPop, Area, RedistrictedDate, TotalV...
dbl (1): FIPPS
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Rename FIPPS column to FIPS
names(votesdata)[names(votesdata) == "FIPPS"] <- "FIPS"
names(votesdata)
    Office             State             RaceDate          CensusPop
 Length:257         Length:257         Length:257         Length:257
 Class :character   Class :character   Class :character   Class :character
 Mode  :character   Mode  :character   Mode  :character   Mode  :character

      FIPS           Area           RedistrictedDate    TotalVotes
 Min.   :48001   Length:257         Length:257         Length:257
 1st Qu.:48128   Class :character   Class :character   Class :character
 Median :48254   Mode  :character   Mode  :character   Mode  :character
 Mean   :48254
 3rd Qu.:48381
 Max.   :48507
 NA's   :3

   RepVotes         RepCandidate        RepStatus           DemVotes
 Length:257         Length:257         Length:257         Length:257
 Class :character   Class :character   Class :character   Class :character
 Mode  :character   Mode  :character   Mode  :character   Mode  :character

 DemCandidate        DemStatus
 Length:257         Length:257
 Class :character   Class :character
 Mode  :character   Mode  :character
# na.omit(votesdata)
library(dplyr)

# Remove the last three rows from the dataframe
votesdata <- votesdata %>% slice(1:(n() - 3))

# Calculate total number of votes in each county
votesdata$county_total_votes <- as.numeric(gsub(",", "", votesdata$TotalVotes))

# Calculate proportion of Republican votes in each county
votesdata$prop_rep <- as.numeric(gsub(",", "", votesdata$RepVotes)) / votesdata$county_total_votes

# Calculate proportion of Democratic votes in each county
votesdata$prop_dem <- as.numeric(gsub(",", "", votesdata$DemVotes)) / votesdata$county_total_votes

# Calculate overall proportion of Republican votes across all counties
total_rep_votes <- sum(as.numeric(gsub(",", "", votesdata$RepVotes)))
total_dem_votes <- sum(as.numeric(gsub(",", "", votesdata$DemVotes)))
prop_total_rep <- total_rep_votes / (total_rep_votes + total_dem_votes)

# Calculate dissimilarity index for each county
votesdata$dissimilarity_index <- abs(votesdata$prop_rep - prop_total_rep)

# Calculate overall dissimilarity index
overall_dissimilarity_index <- sum(votesdata$county_total_votes * votesdata$dissimilarity_index) / (2 * total_rep_votes * total_dem_votes)
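For comparison, here is a small sketch of the textbook two-group index of dissimilarity, D = 0.5 * sum(|a_i/A - b_i/B|), computed on made-up counts rather than the election file. The vectors `rep_toy` and `dem_toy` and the function name `index_of_dissimilarity` are hypothetical and exist only for this illustration. The second form uses two-party county totals and divides by 2 * T * P * (1 - P), which is not the same as the `2 * total_rep_votes * total_dem_votes` denominator used in the code above, so the two calculations should not be expected to agree.

```r
# Toy vote counts for three hypothetical counties (not real data)
rep_toy <- c(100, 300, 50)
dem_toy <- c(200, 100, 150)

# Textbook form: half the summed absolute difference between each
# county's share of group A and its share of group B
index_of_dissimilarity <- function(a, b) {
  0.5 * sum(abs(a / sum(a) - b / sum(b)))
}
d_shares <- index_of_dissimilarity(rep_toy, dem_toy)

# Equivalent proportion form, using two-party totals throughout
t_i <- rep_toy + dem_toy              # two-party votes per county
p_i <- rep_toy / t_i                  # Republican share per county
P   <- sum(rep_toy) / sum(t_i)        # overall Republican share
d_props <- sum(t_i * abs(p_i - P)) / (2 * sum(t_i) * P * (1 - P))

# Both forms should give the same value, bounded between 0 and 1
print(c(d_shares, d_props))
```

Both forms rely on the county proportions and the overall proportion being built from the same totals (here, Republican plus Democratic votes only).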
Mapping the Dissimilarity Index: SEE HERE
Attempt #1
# library(sf)
# library(ggplot2)
#
# # First I loaded the shapefile for Texas
# tx_shp <- st_read("new.shp")
#
# # Then, I tried to merge shapefile with the election data, but I failed :(
# m_tx_shape <- merge(sf = tx_shape, y = votesdata, by = "Area")
#
# # Generate choropleth map --- but I can't
# ggplot() +
#   geom_sf(data = tx_map, aes(fill = variable_to_plot)) +
#   scale_fill_gradient(low = "white", high = "red") +
#   theme_void()
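Two things in the commented code above would stop the merge regardless of the data: the shapefile is stored as `tx_shp` but referenced as `tx_shape` and `tx_map`, and `merge()` takes `x` and `y`, not `sf`. Beyond that, joining on a numeric county code is usually safer than joining on a name column such as `Area`. The sketch below shows the join pattern on plain data frames so it runs without a shapefile. The column name `GEOID`, the objects `county_shapes` and `votes_toy`, and the index values are assumptions for illustration only; I have not checked what columns `new.shp` actually contains.

```r
# Toy stand-in for the shapefile's attribute table (assumed column names)
county_shapes <- data.frame(
  GEOID = c("48001", "48003", "48005"),
  NAME  = c("Anderson", "Andrews", "Angelina"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the vote table; index values are made up
votes_toy <- data.frame(
  FIPS = c(48001, 48003),
  dissimilarity_index = c(0.10, 0.25)
)

# Convert the numeric FIPS code to a 5-character key so both sides match
votes_toy$GEOID <- sprintf("%05d", as.integer(votes_toy$FIPS))

# all.x = TRUE keeps every county, with NA where no vote row matched
merged <- merge(county_shapes, votes_toy, by = "GEOID", all.x = TRUE)

# Counties without a match are the first thing to inspect
unmatched <- merged$NAME[is.na(merged$dissimilarity_index)]
print(unmatched)
```

The same pattern should carry over when `x` is an sf object read with `st_read()`, though that is untested here.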
Attempt #2
For the second attempt, I downloaded the shapefile with the tigris and sf packages instead of reading a local file.