In recent years, political figures have debated what type of health insurance American citizens should have. The debate often centers on whether health insurance should remain private, paid out of pocket or through paycheck deductions, or be public and offered through a government marketplace. The maps I will be looking at show the change in the percentage of uninsured people aged 18 through 64 between the end of President Obama’s first term (2012) and the end of his second term (2016). I want to see how the percentage of uninsured people changed after the enactment of the Affordable Care Act, which President Obama signed into law on March 23, 2010. The county-level data on the percentage of uninsured people is from Social Explorer.
library(tidyverse)
library(dplyr)
library(sf)
library(tmap)
library(tigris)
library(spdep)
library(tidycensus)
options(tigris_class = "sf")
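# Local TIGER/Line shapefile; this object is overwritten on the next line.
counties<-st_read("C:/R-3.5.2/tl_2016_us_county.shp", stringsAsFactors=FALSE)
# Generalized cartographic boundary file downloaded through tigris.
counties=counties(cb = TRUE)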
Comparing the maps for the two years, I see that the 2016 map is lighter (on the legend, 0-10 is the lightest class), showing that a smaller share of people were uninsured in 2016 than in 2012.
tm_shape(combined_data2, projection = 2163) +
  tm_polygons("health2012", palette = "Reds", midpoint = 50, border.col = "beige") +
  tm_shape(states) +
  tm_borders(lwd = .36, col = "black", alpha = 1) +
  tm_layout(panel.show = TRUE, panel.labels = c("2012"),
            legend.position = c("left", "bottom"), frame = FALSE,
            inner.margins = c(0.1, 0.1, 0.05, 0.05))
tm_shape(combined_data2, projection = 2163) +
  tm_polygons("health2016", palette = "Reds", midpoint = 50, border.col = "beige") +
  tm_shape(states) +
  tm_borders(lwd = .36, col = "black", alpha = 1) +
  tm_layout(panel.show = TRUE, panel.labels = c("2016"),
            legend.position = c("left", "bottom"), frame = FALSE,
            inner.margins = c(0.1, 0.1, 0.05, 0.05))
I am taking the mean of the 2012 and 2016 county uninsured percentages to compare the two years non-spatially.
According to the bar graph below, the mean uninsured percentage was about 0.85 percentage points higher in 2012 than in 2016 (21.8% versus 21.0%). This isn’t a large difference, and the bar graph doesn’t show where in the country the changes occurred.
library(ggplot2)
# Attach the mean of each year's county-level uninsured percentage.
combined_data2a<-combined_data2%>%
dplyr::mutate(health2016mean=mean(health2016,na.rm=TRUE),health2012mean=mean(health2012,na.rm=TRUE))
# Hard-coded values of the two means, used only for the bar chart.
temp=data.frame(name=c("Health 2012","Health 2016"),value=c(21.82687,20.97545))
ggplot(temp,aes(name,value))+geom_col(fill="pink")+xlab("Year")+ylab("Mean % of people uninsured")
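The two means in the bar chart are typed in by hand. A minimal sketch of computing them directly is below, assuming a plain data frame with the same health2012 and health2016 columns. The toy values are invented for illustration and are not the Social Explorer data.

```r
# Toy stand-in for combined_data2; the values are invented for illustration.
toy <- data.frame(
  health2012 = c(20, 25, NA, 18),
  health2016 = c(19, 22, 30, 17)
)

# Column means, skipping missing counties, in a shape ready for geom_col().
year_means <- data.frame(
  name  = c("Health 2012", "Health 2016"),
  value = c(mean(toy$health2012, na.rm = TRUE),
            mean(toy$health2016, na.rm = TRUE)),
  stringsAsFactors = FALSE
)

# Change in the mean, in percentage points (2016 minus 2012).
mean_change <- year_means$value[2] - year_means$value[1]
```

Because na.rm = TRUE drops missing counties separately for each year, the two means can be based on slightly different sets of counties, which is worth keeping in mind when the difference is this small.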
Here, I am subtracting the 2012 value from the 2016 value to see how large the differences are by county and state. A negative difference means the uninsured percentage fell.
library(dplyr)
combined_data3=combined_data2%>%
group_by(GEOID)%>%
dplyr::mutate(Uninsured_Diff=(health2016-health2012))
head(combined_data3)
## Simple feature collection with 6 features and 33 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -102.042 ymin: 37.38839 xmax: -84.79633 ymax: 43.49961
## epsg (SRID): 4269
## proj4string: +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 6 x 34
## # Groups: GEOID [6]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD ALAND AWATER fips
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <int>
## 1 19 107 00465242 0500000~ 19107 Keok~ 06 1.50e9 1.93e6 19107
## 2 19 189 00465283 0500000~ 19189 Winn~ 06 1.04e9 3.18e6 19189
## 3 20 093 00485011 0500000~ 20093 Kear~ 06 2.25e9 1.13e6 20093
## 4 20 123 00485026 0500000~ 20123 Mitc~ 06 1.82e9 4.50e7 20123
## 5 20 187 00485055 0500000~ 20187 Stan~ 06 1.76e9 1.79e5 20187
## 6 21 005 00516849 0500000~ 21005 Ande~ 06 5.23e8 6.31e6 21005
## # ... with 24 more variables: Geo_FIPS.x <int>, Geo_NAME.x <chr>,
## # Geo_QNAME.x <chr>, Geo_STATE.x <chr>, Geo_COUNTY.x <dbl>,
## # SE_T006_001.x <dbl>, SE_T006_002.x <dbl>, SE_T006_003 <dbl>,
## # SE_NV005_001 <dbl>, SE_NV005_002 <dbl>, SE_NV005_003 <dbl>,
## # health2016 <dbl>, Geo_FIPS.y <int>, Geo_NAME.y <chr>,
## # Geo_QNAME.y <chr>, Geo_STATE.y <chr>, Geo_COUNTY.y <dbl>,
## # SE_T006_001.y <dbl>, SE_T006_002.y <dbl>, SE_NV007_001 <dbl>,
## # SE_NV007_002 <dbl>, health2012 <dbl>, geometry <MULTIPOLYGON [°]>,
## # Uninsured_Diff <dbl>
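To make the sign convention of Uninsured_Diff concrete, here is a small sketch on three hypothetical counties. The GEOIDs and percentages are invented, and the direction column is an illustrative addition that is not part of the original analysis.

```r
# Hypothetical counties; the percentages are invented for illustration.
toy <- data.frame(
  GEOID      = c("00001", "00002", "00003"),
  health2012 = c(30, 12, 20),
  health2016 = c(14, 23, 20),
  stringsAsFactors = FALSE
)

# 2016 minus 2012: negative = fewer uninsured, positive = more uninsured.
toy$Uninsured_Diff <- toy$health2016 - toy$health2012

# Label the direction of change for each county.
toy$direction <- ifelse(toy$Uninsured_Diff < 0, "decrease",
                 ifelse(toy$Uninsured_Diff > 0, "increase", "no change"))
```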
The histogram below shows the non-spatial distribution of the differences in the uninsured percentage between 2012 and 2016. Some counties changed much more than others: in some, the uninsured percentage rose by 10 percentage points, and in others it fell by 15. However, most counties show little to no change.
library(ggplot2)
ggplot(combined_data3, aes(Uninsured_Diff, fill="Uninsured_Diff")) + geom_histogram() + xlab("Difference in % of People Without Health Insurance in USA Between 2012 and 2016")
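The claim that most counties changed very little can also be checked with a quick count instead of reading it off the histogram. This is a sketch on invented differences, and the 5-point cutoff is an assumption chosen for illustration.

```r
# Invented differences in percentage points, standing in for Uninsured_Diff.
diff_pts <- c(-16, -4, -2, -1, 0, 1, 3, 11, NA)

# Share of counties that moved by less than 5 points in either direction.
share_small <- mean(abs(diff_pts) < 5, na.rm = TRUE)

# Count counties per band; right = FALSE makes each band closed on the left.
bins <- cut(diff_pts, breaks = c(-Inf, -5, 5, Inf),
            labels = c("down more than 5", "-5 to under 5", "up 5 or more"),
            right = FALSE)
band_counts <- table(bins)
```

With the real data, the same two calls could be applied to combined_data3$Uninsured_Diff.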
However, the non-spatial visuals are limited since they only show these differences numerically. I want to see where these differences appear by county and state, and perhaps find some trends on the maps.
I want to see how each county changed between 2012 and 2016. Looking at the maps, I see that most counties had small changes. The largest extremes appear to be in the west (darkest greens and darkest reds).
tm_shape(combined_data3, projection = 2163) +
  tm_polygons("Uninsured_Diff", midpoint = -3, border.col = "grey", border.alpha = .5,
              title = "Percentage Difference by County") +
  tm_shape(counties) +
  tm_borders(lwd = .36, col = "grey", alpha = .6) +
  tm_layout(inner.margins = c(0.1, 0.1, 0.05, 0.05))
Most of the country shows small changes by county (between -5 and 5 percentage points). However, some states west of the Mississippi show different trends than states in the East. Texas and Idaho appear to show the greatest extremes.
tm_shape(combined_data3, projection = 2163) +
  tm_polygons("Uninsured_Diff", midpoint = 0, border.col = "grey", border.alpha = .5,
              title = "Percentage Difference by County") +
  tm_shape(states) +
  tm_borders(lwd = .36, col = "black", alpha = 1) +
  tm_layout(inner.margins = c(0.1, 0.1, 0.05, 0.05))
Several counties in Texas show large differences. Among the darkest red areas, the largest differences appear to be in Kenedy County, Karnes County, and Concho County, where the uninsured percentage decreased by roughly 15 to 20 percentage points.
# Keep only Texas counties (state FIPS code 48; STATEFP is a character column).
tx<-combined_data3%>%
subset(STATEFP=="48")
# Build the map, then view it interactively so county names show on hover.
tx<-tm_shape(tx, projection = 2163) +
  tm_polygons("Uninsured_Diff", id = "Geo_QNAME.x", midpoint = 30, border.col = "grey",
              border.alpha = 1, title = "Uninsured Difference %") +
  tm_text("NAME", size = "AREA") +
  tm_borders(lwd = 1, col = "black", alpha = .5) +
  tm_layout(legend.text.size = .5)
tmap_leaflet(tx)
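Reading the largest changes off a map by colour is approximate. A sorted table gives the exact ranking. The sketch below uses invented county names and values rather than the Texas data.

```r
# Invented counties and differences, standing in for the Texas subset.
toy <- data.frame(
  NAME           = c("County A", "County B", "County C", "County D"),
  Uninsured_Diff = c(-3.2, -17.5, 1.1, -15.8),
  stringsAsFactors = FALSE
)

# Sort ascending so the largest decreases in the uninsured share come first.
largest_drops <- toy[order(toy$Uninsured_Diff), ]

# The two counties with the biggest decreases.
top_two <- head(largest_drops, 2)
```

With the real sf object, sf::st_drop_geometry() could be applied first so that the result prints as a plain table without the geometry column.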
Idaho is the only state with a dark green county: Clark County, which shows a 10 to 15 percentage point increase in people uninsured.
idaho<-combined_data3%>%
subset(STATEFP==16)
idaho<-tm_shape(idaho)+tm_polygons( "Uninsured_Diff",id="Geo_QNAME.x",midpoint=30,border.col = "grey", border.alpha = 1,title="Uninsured Difference %") +tm_text("NAME",size = "AREA")+ tm_borders(lwd = .28, col = "black", alpha = 1)+ tm_layout(legend.text.size =2)
tmap_leaflet(idaho)
Considering the United States as a whole, the change is relatively small. However, examining the maps county by county reveals changes that cannot be seen on a non-spatial plot. The maps let me locate these changes, see where they occur, and evaluate their significance.
The non-spatial plots summarize the overall change, but they are limited to numerical evaluations and leave out where the changes happen. The maps fill in that part of the story.
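What happens when I change cb to FALSE?
I see that the map below looks much less like the familiar outline of the country: county polygons extend into bodies of water such as the Great Lakes, and the Long Island Sound is covered because Suffolk County extends over it. This happens because cb = FALSE returns the full TIGER/Line county boundaries, which include water area, while cb = TRUE returns the generalized cartographic boundary file, which is clipped to the shoreline. Therefore, I can conclude that cb = FALSE makes the coastlines on the map less recognizable than cb = TRUE.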
library(tidyverse)
library(dplyr)
library(sf)
library(tmap)
library(tigris)
library(spdep)
library(tidycensus)
options(tigris_class = "sf")
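# Local TIGER/Line shapefile; this object is overwritten on the next line.
counties<-st_read("C:/R-3.5.2/tl_2016_us_county.shp", stringsAsFactors=FALSE)
# Full TIGER/Line county boundaries downloaded through tigris.
counties <- counties(cb = FALSE)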
library(readr)
healthinsurance2016 <- read_csv("C:/Users/abbys/Downloads/R12140862_SL050.csv",
col_types = cols(Geo_COUNTY = col_number(),
Geo_FIPS = col_integer(), STATEFP = col_number()))%>%
dplyr::mutate(fips = Geo_FIPS, health2016=SE_T006_002)
healthinsurance2012 <- read_csv("C:/Users/abbys/Downloads/R12140759_SL050.csv",
col_types = cols(Geo_COUNTY = col_number(),
Geo_FIPS = col_integer(), STATEFP = col_number()))%>%
dplyr::mutate(fips = Geo_FIPS,health2012=SE_T006_001)
combo_health<-left_join(healthinsurance2016,healthinsurance2012, by="fips")
counties <- counties %>%
dplyr::mutate(fips = parse_integer(GEOID))
combined_data <- counties %>%
left_join(combo_health, by = "fips")
# Drop Alaska, Hawaii, and the territories to keep the contiguous states.
combined_data2 =combined_data %>%
filter(!STATEFP %in% c("02", "15", "60", "66", "69", "72", "78"))
library(tmaptools)
states<-combined_data2%>%
aggregate_map(by="STATEFP")
combined_data3=combined_data2%>%
group_by(GEOID)%>%
dplyr::mutate(Uninsured_Diff=health2016-health2012)
tm_shape(combined_data3,projection=2163)+tm_polygons("Uninsured_Diff",midpoint=0, border.col="grey")+tm_shape(states)+tm_borders(lwd=.36,col="black",alpha=.4)
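The join on fips above depends on the character GEOID from the county file matching the integer Geo_FIPS from the insurance tables. Here is a small sketch of that conversion and of a check for counties left without data after the join. It uses base R's merge() on invented rows instead of the left_join() call in the analysis, and the shapes and health tables are hypothetical.

```r
# Hypothetical county table with character GEOIDs, as in the county file.
shapes <- data.frame(GEOID = c("01001", "01003", "48261"),
                     stringsAsFactors = FALSE)

# Hypothetical insurance table keyed by integer FIPS; one county is absent.
health <- data.frame(fips = c(1001L, 48261L),
                     health2016 = c(12.5, 30.1))

# Converting GEOID to integer drops the leading zero ("01001" becomes 1001),
# which is what lets it match an integer FIPS column.
shapes$fips <- as.integer(shapes$GEOID)

# all.x = TRUE keeps every county, like a left join.
joined <- merge(shapes, health, by = "fips", all.x = TRUE)

# Counties with no insurance record get NA and would be drawn as missing.
unmatched <- joined$GEOID[is.na(joined$health2016)]
```

Running a check like this before mapping shows whether blank counties on the map come from missing source data or from a key that failed to match.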