Using data from Social Explorer, I wanted to map the number of people across the United States who reported their health as poor or fair. The 2017 data includes only people over 18 years old. Health in the United States has always been a popular topic of discussion, and being health-conscious and practicing healthy habits are being instilled in today's youth. With this data I want to see how reports of fair or poor health are distributed across the United States.
Loading the required packages and importing the data set I am going to use from Social Explorer.
library(tidyverse)
library(sf)
library(tmap)
library(tigris)
library(spdep)
library(tmaptools)
library(dplyr)
library(tidyr)
library(readr)
options(tigris_use_cache = TRUE)
options(tigris_progress_bar = FALSE)
options(tidycensus_progress_bar = FALSE)
ct_map <- st_read('C:/Users/Jessica/Desktop/712/tl_2016_us_county.shp', stringsAsFactors = FALSE)
## Reading layer `tl_2016_us_county' from data source `C:\Users\Jessica\Desktop\712\tl_2016_us_county.shp' using driver `ESRI Shapefile'
## Simple feature collection with 3233 features and 17 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -179.2311 ymin: -14.60181 xmax: 179.8597 ymax: 71.44106
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
names(ct_map)
## [1] "STATEFP" "COUNTYFP" "COUNTYNS" "GEOID" "NAME" "NAMELSAD"
## [7] "LSAD" "CLASSFP" "MTFCC" "CSAFP" "CBSAFP" "METDIVFP"
## [13] "FUNCSTAT" "ALAND" "AWATER" "INTPTLAT" "INTPTLON" "geometry"
am_data <- read.csv("C:/Users/Jessica/Desktop/712/health.csv", stringsAsFactors = FALSE)
names(am_data)
## [1] "Geo_FIPS"
## [2] "Geo_NAME"
## [3] "Geo_QNAME"
## [4] "Geo_STATE"
## [5] "Geo_COUNTY"
## [6] "adults_that_report_fair_or_poor_health"
am_data <- am_data %>%
  mutate(fips = as.numeric(as.character(Geo_FIPS)))  # numeric join key
ct_map <- ct_map %>%
  mutate(fips = parse_integer(GEOID))  # GEOID is stored as text, e.g. "01001"
comb_data <- ct_map %>%
  left_join(am_data, by = "fips")  # keep every county shape, matched or not
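One thing worth checking after a join like this is whether every county found a match, since FIPS codes with leading zeros (for example "01001") are easy to mangle when they are converted to numbers. The sketch below uses small made-up tables rather than the Social Explorer file, and the names `shapes`, `health`, and `pad_fips` are hypothetical. It pads the codes back to five characters and lists the counties that did not match.

```r
# Toy stand-ins for ct_map and am_data (all values are made up for illustration)
shapes <- data.frame(GEOID = c("01001", "01003", "02013"),
                     NAME  = c("Autauga", "Baldwin", "Aleutians East"),
                     stringsAsFactors = FALSE)
health <- data.frame(Geo_FIPS = c(1001, 1003),
                     adults_that_report_fair_or_poor_health = c(18, 16))

# Hypothetical helper: turn a numeric FIPS code into a 5-character string
pad_fips <- function(x) sprintf("%05d", as.integer(x))

health$GEOID <- pad_fips(health$Geo_FIPS)

# all.x = TRUE keeps every county shape, as left_join() does
joined <- merge(shapes, health, by = "GEOID", all.x = TRUE)

# Counties with no health record show up as NA after the join
unmatched <- joined$NAME[is.na(joined$adults_that_report_fair_or_poor_health)]
print(unmatched)
```

Joining on the padded character code is only one option; the numeric key used above works as well, as long as both sides are converted the same way.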
options(tigris_class = "sf")
t_county <- counties(cb = TRUE)
This map is extremely small because it includes Hawaii and Alaska.
tm_shape(comb_data) + tm_polygons("adults_that_report_fair_or_poor_health")
When we take out Alaska, Hawaii, and the U.S. territories, the map enlarges.
# Drop Alaska (02), Hawaii (15), and the territories: American Samoa (60),
# Guam (66), Northern Mariana Islands (69), Puerto Rico (72), U.S. Virgin Islands (78)
comb_data_sub <- comb_data %>%
  filter(!STATEFP %in% c("02", "15", "60", "66", "69", "72", "78"))
tm_shape(comb_data_sub, projection = 2163) + tm_polygons("adults_that_report_fair_or_poor_health")
tm_shape(comb_data_sub, projection = 2163) + tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50)
The map here shows two clusters of dark blue in the Midwestern region of the United States, where only 5-10 people reported fair or poor health. I would imagine that maintaining rural land is more physically demanding, which may help explain this. There are some scattered white regions in the South and Southeast, and some noticeable white regions in the southernmost parts of Texas as well.
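To read a diverging palette correctly it helps to know where the data sit relative to the `midpoint`. The following is a minimal sketch with made-up values (not the real county figures), and `vals` and `side` are hypothetical names. It counts how many observations fall below or at/above a midpoint of 50; if nearly everything lands on one side, only half of the palette is actually in use.

```r
# Made-up values standing in for adults_that_report_fair_or_poor_health
vals <- c(7, 12, 18, 25, 33, 41, NA)
midpoint <- 50

# Label each value by the side of the midpoint it falls on
side <- cut(vals,
            breaks = c(-Inf, midpoint, Inf),
            labels = c("below midpoint", "at or above midpoint"),
            right = FALSE)

print(range(vals, na.rm = TRUE))    # smallest and largest observed value
print(table(side, useNA = "ifany")) # counts per side, plus missing values
```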
us_states <- comb_data_sub %>%
  aggregate_map(by = "STATEFP")
tm_shape(comb_data_sub, projection = 2163) +
  tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50) +
  tm_shape(us_states) +
  tm_borders(lwd = .36, col = "black", alpha = 1)
tm_shape(comb_data_sub, projection = 2163) +
  tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50,
              border.col = "grey", border.alpha = .4) +
  tm_shape(us_states) +
  tm_borders(lwd = .36, col = "black", alpha = 1)
Adding the state lines helps separate the states better than before. The contrast is now strong enough to tell the county borders from the state borders.
ggplot(data = comb_data_sub, aes(adults_that_report_fair_or_poor_health)) +
  geom_histogram() +
  labs(title = "Adults that report fair or poor health") +
  ylab("Count") +
  xlab("Number of Adults that report fair or poor health")
The strengths and weaknesses of the spatial and non-spatial approaches depend on what each one is used for. The spatial approach gives a more detailed visual of location, so it is more useful when we want to single out a particular county or state. The non-spatial approach shows quantity and is easier to compare when the area of interest is described in more general terms.
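As one example of the non-spatial side, the county values can be collapsed into a per-state summary that is easy to rank and compare. This sketch uses a small made-up table in place of `comb_data_sub`; the numbers and the names `toy` and `state_means` are hypothetical.

```r
# Made-up county values for two states (01 = Alabama, 48 = Texas)
toy <- data.frame(
  STATEFP = c("01", "01", "48", "48", "48"),
  adults_that_report_fair_or_poor_health = c(18, 22, 30, NA, 24)
)

# Mean per state; the formula interface drops rows with missing values
state_means <- aggregate(
  adults_that_report_fair_or_poor_health ~ STATEFP,
  data = toy, FUN = mean
)

# Sort so the state with the highest mean comes first
state_means <- state_means[order(-state_means$adults_that_report_fair_or_poor_health), ]
print(state_means)
```

A table like this loses the within-state pattern that the map shows, which is the trade-off between the two approaches.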
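The command t_county <- counties(cb = TRUE) downloads the generalized (1:500k) cartographic boundary file, which is clipped to the shoreline and draws more cleanly on a national map. Setting cb = FALSE, as shown below, downloads the full TIGER/Line file instead, which has more detailed boundaries that extend over water, so it does not look as clean. Note that the maps in this post are drawn from ct_map, the shapefile read in with st_read, so the code below reproduces the same maps as above.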
t_county <- counties(cb = FALSE)
options(tigris_class = "sf")
t_county2 <- counties(cb = FALSE)
am_data <- am_data %>%
  mutate(fips = as.numeric(as.character(Geo_FIPS)))
ct_map <- ct_map %>%
  mutate(fips = parse_integer(GEOID))
comb_data <- ct_map %>%
  left_join(am_data, by = "fips")
tm_shape(comb_data) + tm_polygons("adults_that_report_fair_or_poor_health")
comb_data_sub <- comb_data %>%
  filter(!STATEFP %in% c("02", "15", "60", "66", "69", "72", "78"))
tm_shape(comb_data_sub, projection = 2163) + tm_polygons("adults_that_report_fair_or_poor_health")
tm_shape(comb_data_sub, projection = 2163) + tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50)
us_states <- comb_data_sub %>%
  aggregate_map(by = "STATEFP")
tm_shape(comb_data_sub, projection = 2163) +
  tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50) +
  tm_shape(us_states) +
  tm_borders(lwd = .36, col = "black", alpha = 1)
tm_shape(comb_data_sub, projection = 2163) +
  tm_polygons("adults_that_report_fair_or_poor_health", palette = "-RdBu", midpoint = 50,
              border.col = "grey", border.alpha = .4) +
  tm_shape(us_states) +
  tm_borders(lwd = .36, col = "black", alpha = 1)