Hypothesis and Background

The goal of this analysis is to investigate the politicization of adherence to stay-at-home orders in Pennsylvania during the COVID-19 pandemic. It is not absurd to think that, in today’s increasingly polarized climate, political beliefs have begun to influence decisions that are not overtly political.

There is data showing a difference in willingness to comply with stay-at-home orders based on political party affiliation. One thing that might contribute to this split is the current president’s response to the pandemic, and specifically to the stay-at-home orders. He has used public platforms to endorse protests against the stay-at-home orders and called for states to be “liberated” from the shutdowns ordered by governors.

This report focuses specifically on Pennsylvania because of the political nature of the state. It has recently become a more evenly split state, but it currently has a Democratic governor. This distinction could provide interesting results because the stay-at-home orders are issued by governors. I think that testing this hypothesis in a state with the political climate of PA might provide richer results than looking at a less dynamic state.

This report tests the hypothesis that counties in Pennsylvania with a higher Republican population, specifically those where a high percentage of voters chose Donald Trump in the 2016 election, are less likely to adhere to stay-at-home orders.

Programs and Data

1. R: tidytext, tidyverse packages

2. Pennsylvania Voter Registration Data (by county)

3. Pennsylvania COVID-19 Mobility Reports (by county)

Methodology

After choosing Pennsylvania for the reasons stated above, the next step was to look at the voter data. My goal is to compare areas with a high Republican population and high support for the president to areas with a lower Republican population, and to see whether these two groups respond differently to the stay-at-home orders. When looking into the situation, it became clear that Republicans are not always supporters of Trump. He is a very polarizing figure, which is why I chose to first look at the counties with the highest share of registered Republicans, and then to also identify the counties with the highest share of votes cast for Trump in the 2016 election. The point of this was to see whether the two measures differ before moving on with the analysis.

Counties with the Highest Percent of Registered Republicans

The graph shows the top 15 counties in PA by percent of registered Republicans, and the chart after it lists the top 10 counties. My initial plan was to compare the 5 counties with the highest share of Republicans to the 5 counties with the lowest share. I created a new column in the dataset called “RepPercent” and ordered by percent of registered Republicans instead of raw numbers because some counties are much larger, have more people in total, and therefore have more registered Republicans. Using percentages makes the counties comparable regardless of their size.
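As a small illustration of why the ranking uses shares rather than counts, the sketch below uses two made-up counties (the names and numbers are hypothetical and are not taken from the Pennsylvania data). The larger county has more registered Republicans in total, but the smaller county has the higher Republican share.

```r
# Hypothetical counties; these numbers are invented for illustration only
toy <- data.frame(
  County     = c("Large", "Small"),
  Republican = c(100000, 6000),
  Total      = c(400000, 9000)
)

# Share of registered voters who are Republican
toy$RepPercent <- toy$Republican / toy$Total

# Ranking by raw count puts the large county first
rank_by_count <- as.character(toy$County[order(-toy$Republican)])

# Ranking by share puts the small county first
rank_by_share <- as.character(toy$County[order(-toy$RepPercent)])

print(rank_by_count)
print(rank_by_share)
```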

library(tidytext)
library(tidyverse)
library(textdata)

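# Read county-level voter registration counts
PAvoterreg <- readr::read_csv("/Users/Genna/Desktop/PAvoterreg.csv")

# Share of registered voters in each county who are Republican
PAvoterreg <- transform(PAvoterreg, RepPercent = Republican / Total)

# Sort counties from highest to lowest Republican share
PAvoterreg %>%
  arrange(desc(RepPercent)) -> arrangebyRep

# Plot the 15 counties with the highest Republican share
arrangebyRep %>%
  head(15) %>%
  ggplot(aes(reorder(County, RepPercent), RepPercent)) +
  labs(y = "Percent of Registered Republicans", x = "County") +
  geom_col() +
  coord_flip()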

County Percent of Registered Republicans
Potter .6724
Bedford .6713
Fulton .6693
Juniata .6481
Perry .6430
Snyder .6342
Tioga .6314
Mifflin .6286
Huntingdon .6144
Bradford .6124

Counties with the Highest Percent of Trump Voters in 2016

The graph shows the top 15 counties in PA by percentage of Trump voters in the 2016 election, and the chart after it lists the top 10 counties. I used percentages for the same reasons as above; here the percentage is Trump’s share of the combined Clinton and Trump vote in each county. The goal was to identify the counties with the highest percentage of Trump supporters and not just the highest percentage of Republicans.

library(tidytext)
library(tidyverse)
library(textdata)

debatesgeneral2016 <- readr::read_csv("/Users/Genna/Desktop/PA2016General.csv")
debatesgeneral2016 <- transform (debatesgeneral2016, total= CLINTON + TRUMP)
debatesgeneral2016 <-transform (debatesgeneral2016, TrumpPercent = TRUMP/ total)

debatesgeneral2016 %>% 
  arrange(desc(TrumpPercent)) -> arrangebyTRUMP

arrangebyTRUMP %>% 
  head(15) %>% 
  ggplot (aes(reorder(CountyName,TrumpPercent), TrumpPercent)) +
  labs (y="Percentage of Trump Voters", x="County")+
  geom_col()+
  coord_flip()

County Percent of Trump Voters
Fulton .8619
Bedford .8429
Potter .8276
Juniata .8196
Jefferson .8063
Somerset .7878
Mifflin .7843
Tioga .7773
Perry .7712
Armstrong .7659

This analysis did show some differences. For the most part the counties just changed rank, but 3 counties in the top 10 for percentage of Trump voters were not in the top 10 for percentage of registered Republicans. Jefferson, Somerset, and Armstrong counties all had a high percentage of Trump voters, but not as high a percentage of registered Republicans. This shows that there is a slight difference between the two metrics and justifies why I chose to look specifically at Trump voters. The rhetoric and information about COVID-19 coming directly from the president has been highly negative, and since the examples come from Trump specifically and not from the whole Republican party, I believe it is most representative to look at his supporters rather than the whole party. Having explained why it makes more sense to analyze the Trump voter data, I next determined the counties with the lowest percentage of Trump voters to have something to compare against.
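The comparison of the two top-10 lists can be sketched with set operations. The county names below are copied from the two charts above; the vector names are only illustrative.

```r
# Top 10 counties by percent of registered Republicans (from the first chart)
rep_top10 <- c("Potter", "Bedford", "Fulton", "Juniata", "Perry",
               "Snyder", "Tioga", "Mifflin", "Huntingdon", "Bradford")

# Top 10 counties by percent of Trump voters in 2016 (from the second chart)
trump_top10 <- c("Fulton", "Bedford", "Potter", "Juniata", "Jefferson",
                 "Somerset", "Mifflin", "Tioga", "Perry", "Armstrong")

# Counties that appear only in the Trump-voter list
only_trump <- setdiff(trump_top10, rep_top10)

# Counties that appear only in the registered-Republican list
only_rep <- setdiff(rep_top10, trump_top10)

# Counties that appear in both lists
shared <- intersect(trump_top10, rep_top10)

print(only_trump)
print(only_rep)
print(length(shared))
```

Seven counties appear in both lists, and three are unique to each list.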

Counties with the Lowest Percent of Trump Voters in 2016

The graph shows the 15 PA counties with the lowest percentage of Trump voters in the 2016 election. The chart then lists the 10 lowest along with their percentages. The 5 lowest of these counties will be compared to the counties with the most Trump support when it comes to adherence to stay-at-home orders.

library(tidytext)
library(tidyverse)
library(textdata)

debatesgeneral2016 <- readr::read_csv("/Users/Genna/Desktop/PA2016General.csv")
debatesgeneral2016 <- transform (debatesgeneral2016, total= CLINTON + TRUMP)
debatesgeneral2016 <-transform (debatesgeneral2016, TrumpPercent = TRUMP/ total)

# Sort ascending so the counties with the lowest Trump share come first
debatesgeneral2016 %>% 
  arrange(TrumpPercent) -> arrangebyTRUMP

arrangebyTRUMP %>% 
  head(15) %>% 
  ggplot (aes(reorder(CountyName,TrumpPercent), TrumpPercent)) +
  labs (y="Percentage of Trump Voters", x="County")+
  geom_col()+
  coord_flip()

County Percent of Trump Voters
Philadelphia .1570
Delaware .3842
Montgomery .3886
Allegheny .4138
Chester .4504
Lehigh .4753
Lackawanna .4821
Dauphin .4847
Centre .4875
Bucks .4959

This initial analysis was conducted to determine which counties should be compared. I chose the 5 counties with the highest percentage of Trump voters and compared their mobility reports to those of the 5 counties with the lowest percentage of Trump voters.

Five Counties with Highest Percentage

County Percent of Trump Voters
Fulton .8619
Bedford .8429
Potter .8276
Juniata .8196
Jefferson .8063

Five Counties with Lowest Percentage

County Percent of Trump Voters
Philadelphia .1570
Delaware .3842
Montgomery .3886
Allegheny .4138
Chester .4504

These are the ten counties that I will be looking at for the rest of this report.

Mobility Reports

Google releases mobility reports that can be broken down by county. The categories included in the reports are retail and recreation, grocery and pharmacy, parks, transit stations, workplaces, and residential. Each value is reported as a percent change from a previously observed baseline. The data includes observations from February 15, 2020 to May 6, 2020. I chose to look at the percentages from April 19, 2020, the day I associate with the president openly supporting protests against stay-at-home orders in the tweets referenced above.

I chose to look specifically at the reports for grocery and pharmacy, transit stations, and workplaces. One issue with the mobility dataset is that there are not observations for every county in every category. I looked at the ten counties I was interested in and chose the three categories that had observations for the majority of them.
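To make the "percent change from baseline" measure concrete, here is a minimal sketch of the arithmetic. Google's documentation describes the baseline as the median value for the same day of the week over January 3 to February 6, 2020; the visit counts below are hypothetical, since the reports publish only the percent change, and the function name is an assumption for illustration.

```r
# Percent change of an observed value relative to a baseline value
percent_change <- function(observed, baseline) {
  100 * (observed - baseline) / baseline
}

# Hypothetical example: 200 baseline visits, 118 visits on the day of interest
example_change <- percent_change(observed = 118, baseline = 200)
print(example_change)
```

A value of -41, for example, means activity in that category was 41 percent below the baseline for that day of the week.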

library(tidytext)
library(tidyverse)
library(textdata)
library(ggplot2)

mobility_total <- readr::read_csv("/Users/Genna/Desktop/Global_Mobility_Report.csv", col_types = cols(sub_region_2=col_character()))

mobility_total %>% 
  filter (country_region_code=="US") -> US_mobility

US_mobility %>% 
  filter(sub_region_1=="Pennsylvania") ->PA_mobility

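# Keep only April 19, 2020; this comparison assumes the date column is stored as text in m/d/yy form
PA_mobility %>% 
  filter (date =="4/19/20") ->PA_mobility_19april

Workplaces Mobility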

The workplaces mobility change is described as the “mobility trends for places of work”.

When looking at these numbers, it is important to remember that a more negative number means a larger drop in mobility, which I interpret as better adherence to stay-at-home policies.

County Percent of Trump Voters Mobility Change
Philadelphia .1570 -41
Delaware .3842 -39
Montgomery .3886 -41
Allegheny .4138 -41
Chester .4504 -38
Fulton .8619 N/A
Bedford .8429 -32
Potter .8276 -25
Juniata .8196 -37
Jefferson .8063 -21

In this category, there does seem to be a connection between a low percentage of Trump voters in a county and a larger mobility change. Counties with a lower percentage of Trump voters for the most part have a more negative mobility change.

PA_mobility_19april %>%
  arrange (desc(workplaces_percent_change_from_baseline)) ->arrangebyWork

arrangebyWork %>% 
  filter(sub_region_2 %in% c("Fulton_County", "Bedford_County", "Potter_County", 
  "Juniata_County", "Jefferson_County","Philadelphia_County", "Delaware_County", 
  "Montgomery_County", "Allegheny_County", "Chester_County" )) ->TrumpSupportByWork


TrumpSupportByWork%>% 
  ggplot (aes(reorder(sub_region_2,workplaces_percent_change_from_baseline), 
              workplaces_percent_change_from_baseline)) +
  labs (y="Workplaces Percent Change from Baseline", x="County")+
  coord_flip()+
  geom_col()

Transit Stations Mobility

The transit stations mobility change is described as the “mobility trends for places like public transport hubs such as subway, bus, and train stations.”

County Percent of Trump Voters Mobility Change
Philadelphia .1570 -54
Delaware .3842 -36
Montgomery .3886 -52
Allegheny .4138 -58
Chester .4504 -62
Fulton .8619 -66
Bedford .8429 -48
Potter .8276 N/A
Juniata .8196 N/A
Jefferson .8063 -22

In this category, there does not seem to be a connection between the percentage of Trump voters in a county and its mobility change.

PA_mobility_19april %>%
  arrange (desc(transit_stations_percent_change_from_baseline)) ->arrangebyTransit

arrangebyTransit %>% 
  filter(sub_region_2 %in% c("Fulton_County", "Bedford_County", "Potter_County", "Juniata_County", "Jefferson_County","Philadelphia_County", "Delaware_County", "Montgomery_County", "Allegheny_County", "Chester_County" )) ->TrumpSupportByTransit


TrumpSupportByTransit%>% 
  ggplot (aes(reorder(sub_region_2,transit_stations_percent_change_from_baseline), transit_stations_percent_change_from_baseline)) +
  labs (y="Transit Stations Percent Change from Baseline", x="County")+
  geom_col()+
  coord_flip()

Grocery and Pharmacy Mobility

The grocery and pharmacy mobility change is described as the “mobility trends for places like grocery markets, food warehouses, farmers markets, specialty food shops, drug stores, and pharmacies.”

County Percent of Trump Voters Mobility Change
Philadelphia .1570 -23
Delaware .3842 -21
Montgomery .3886 -27
Allegheny .4138 -25
Chester .4504 -28
Fulton .8619 N/A
Bedford .8429 -8
Potter .8276 N/A
Juniata .8196 N/A
Jefferson .8063 -5

In this category, there does seem to be a connection between the percentage of Trump voters in a county and its mobility change; however, it is not extremely clear because several of the high-Trump counties have no data.

PA_mobility_19april %>%
  arrange (desc(grocery_and_pharmacy_percent_change_from_baseline)) ->arrangebyGrocery

arrangebyGrocery %>% 
  filter(sub_region_2 %in% c("Fulton_County", "Bedford_County", "Potter_County", "Juniata_County", "Jefferson_County","Philadelphia_County", "Delaware_County", "Montgomery_County", "Allegheny_County", "Chester_County" )) ->TrumpSupportByGrocery


TrumpSupportByGrocery%>% 
  ggplot (aes(reorder(sub_region_2, grocery_and_pharmacy_percent_change_from_baseline) ,grocery_and_pharmacy_percent_change_from_baseline)) +
  labs (y="Grocery and Pharmacy Percent Change from Baseline", x="County")+
  geom_col()+
  coord_flip()
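One way to summarize the three tables above is to count the available observations and compare the average mobility change in the five lowest-Trump counties with the average in the five highest-Trump counties. The sketch below retypes the table values by hand; the data frame name, the "low"/"high" labels, and the 0.5 cutoff used to split the two groups are assumptions made for this illustration.

```r
# Values copied from the three mobility tables above (NA where the table shows N/A)
mobility <- data.frame(
  County = c("Philadelphia", "Delaware", "Montgomery", "Allegheny", "Chester",
             "Fulton", "Bedford", "Potter", "Juniata", "Jefferson"),
  TrumpPercent = c(.1570, .3842, .3886, .4138, .4504,
                   .8619, .8429, .8276, .8196, .8063),
  workplace = c(-41, -39, -41, -41, -38, NA, -32, -25, -37, -21),
  transit   = c(-54, -36, -52, -58, -62, -66, -48, NA, NA, -22),
  grocery   = c(-23, -21, -27, -25, -28, NA, -8, NA, NA, -5)
)

# Label the two groups of counties (the 0.5 cutoff is an illustrative choice)
mobility$group <- ifelse(mobility$TrumpPercent > 0.5, "high", "low")

categories <- c("workplace", "transit", "grocery")

# Number of counties with data in each category
n_available <- colSums(!is.na(mobility[categories]))

# Mean mobility change per group and category, ignoring missing values
group_means <- sapply(mobility[categories], function(col) {
  tapply(col, mobility$group, mean, na.rm = TRUE)
})

print(n_available)
print(round(group_means, 2))
```

With these values, the low-Trump group averages -40 for workplaces, -52.4 for transit stations, and -24.8 for grocery and pharmacy, while the high-Trump group averages -28.75, about -45.3, and -6.5. The high-Trump averages rest on only four, three, and two counties respectively.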

Looking for Correlation

The next step in this analysis was to fit equations that describe the relationship between the two variables (percent change in mobility in each category and percentage of Trump voters) and to interpret their significance.

To do this, I created a new data set in Excel that contains the information shown above in the three result charts. Then I fit a simple linear regression model for each of the mobility categories I investigated, with the percentage of Trump voters as the predictor variable.
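As a sketch of an alternative to the Excel step, the same data set can be typed directly into R and the three regressions fit in a loop. This assumes the Excel file held exactly the values in the three result charts above; the column names mirror the ones used in this report, and the `models` list is an illustrative name.

```r
# Values copied from the three mobility tables (NA where the table shows N/A)
TrumpandMobility <- data.frame(
  County = c("Philadelphia", "Delaware", "Montgomery", "Allegheny", "Chester",
             "Fulton", "Bedford", "Potter", "Juniata", "Jefferson"),
  TrumpPercent = c(.1570, .3842, .3886, .4138, .4504,
                   .8619, .8429, .8276, .8196, .8063),
  transit   = c(-54, -36, -52, -58, -62, -66, -48, NA, NA, -22),
  grocery   = c(-23, -21, -27, -25, -28, NA, -8, NA, NA, -5),
  workplace = c(-41, -39, -41, -41, -38, NA, -32, -25, -37, -21)
)

# Fit one simple linear regression per mobility category;
# lm() drops the counties with missing values for that category
outcomes <- c(transit = "transit", grocery = "grocery", workplace = "workplace")
models <- lapply(outcomes, function(outcome) {
  lm(reformulate("TrumpPercent", response = outcome), data = TrumpandMobility)
})

# Intercepts and slopes, and the number of counties used in each model
print(round(sapply(models, coef), 2))
print(sapply(models, nobs))
```

Fitting on these values should give the same intercepts, slopes, and sample sizes as the model summaries reported in this section.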

library(tidytext)
library(tidyverse)
library(textdata)
library(ggplot2)
library(Hmisc)
library(psych)
library(jtools)
library(huxtable)

TrumpandMobility <- readr::read_csv(
  "/Users/Genna/Desktop/trumpandmobility.csv",
  col_types = cols(
    transit = col_double(),
    grocery = col_double(),
    workplace = col_double()
  )
)
TransitMod <- lm(formula= transit~TrumpPercent, data= TrumpandMobility) 
export_summs(TransitMod)
Model 1
(Intercept) -54.92 **
(13.10)  
TrumpPercent 9.61   
(22.13)  
N 8      
R2 0.03   
*** p < 0.001; ** p < 0.01; * p < 0.05.

Transit Mobility

This model predicts the transit mobility percentage change from the Trump percentage variable. Because the Trump percentage is stored as a proportion between 0 and 1, a one-unit increase means going from no Trump voters to all Trump voters; the model associates that change with a 9.61 point increase in the transit mobility percentage change from baseline.

Again, it is important to remember that a more negative mobility percentage means better adherence to stay-at-home policies, so a positive slope means less adherence as Trump support rises. However, this slope (9.61) is not statistically significant and the model explains very little of the variation (R2 = 0.03, N = 8), so it is not a trustworthy description of this relationship.
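To put the slope on a more intuitive scale, the sketch below plugs the reported transit coefficients into the fitted line and compares two hypothetical counties, one with a 40 percent Trump share and one with 80 percent. The function name and the two example shares are assumptions for illustration.

```r
# Coefficients as reported in the transit model summary
intercept <- -54.92
slope <- 9.61

# Predicted transit mobility change for a given Trump share (a proportion from 0 to 1)
predict_transit <- function(trump_share) {
  intercept + slope * trump_share
}

# Predictions for hypothetical counties at 40% and 80% Trump share
predicted <- predict_transit(c(0.40, 0.80))

# Change in the prediction for a 10 percentage point increase in Trump share
per_10_points <- slope * 0.10

print(predicted)
print(per_10_points)
```

Under this model a 10 percentage point increase in Trump share corresponds to roughly a 1 point smaller drop in transit mobility, which is small relative to the standard error of the slope.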

Grocery and Pharmacy Mobility

This model predicts the grocery and pharmacy mobility percentage change from the Trump percentage variable. Based on this model, a one-unit increase in the Trump percentage variable (from 0 to 1) is associated with a 31.53 point increase in the grocery and pharmacy mobility percentage change from baseline.

This slope (31.53) is statistically significant at the 0.05 level, but the model is based on only 7 counties, so it is only somewhat trustworthy.

GroceryMod <-lm(formula= grocery~TrumpPercent, data=TrumpandMobility)
export_summs(GroceryMod)
Model 1
(Intercept) -35.08 ***
(4.98)   
TrumpPercent 31.53 *  
(9.18)   
N 7       
R2 0.70    
*** p < 0.001; ** p < 0.01; * p < 0.05.

WorkplaceMod <-lm(formula= workplace~TrumpPercent, data=TrumpandMobility)
export_summs(WorkplaceMod)
Model 1
(Intercept) -47.47 ***
(4.29)   
TrumpPercent 22.05 *  
(6.96)   
N 9       
R2 0.59    
*** p < 0.001; ** p < 0.01; * p < 0.05.

Workplace Mobility

This model predicts the workplace mobility percentage change from the Trump percentage variable.

Based on this model, a one-unit increase in the Trump percentage variable (from 0 to 1) is associated with a 22.05 point increase in the workplace mobility percentage change from baseline.

This slope (22.05) is statistically significant at the 0.05 level, but the model is based on only 9 counties, so it is only somewhat trustworthy.
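The significance stars in the three model summaries can be checked by hand from the reported estimates and standard errors. The sketch below assumes the usual two-sided t-test for a regression slope with N - 2 degrees of freedom; the data frame and column names are illustrative.

```r
# Slope estimates, standard errors, and sample sizes as reported in the model summaries
reported <- data.frame(
  model     = c("transit", "grocery", "workplace"),
  estimate  = c(9.61, 31.53, 22.05),
  std_error = c(22.13, 9.18, 6.96),
  n         = c(8, 7, 9)
)

# t statistic and residual degrees of freedom (two parameters estimated per model)
reported$t_value <- reported$estimate / reported$std_error
reported$df <- reported$n - 2

# Two-sided p-value for the null hypothesis that the slope is zero
reported$p_value <- 2 * pt(-abs(reported$t_value), df = reported$df)

print(reported)
```

The grocery and workplace slopes fall between the 0.01 and 0.05 thresholds, which matches their single star, and the transit slope is well above 0.05.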

Conclusion

As in a lot of analyses, this investigation did not reach a blanket conclusion. There are no absolutes, especially with data this new, but the analysis showed some interesting results.

Overall, when comparing the 5 counties in PA with the highest percent of Trump voters in the 2016 election to the 5 counties with the lowest percent of Trump voters, there seems to be some connection to adherence to stay-at-home orders. Workplace mobility showed the greatest connection, grocery and pharmacy mobility showed some connection but it was not clear across the board, and transit mobility showed very little connection.

When looking at the linear models, all of the models connecting Trump voters to mobility had a positive slope, which supports my initial hypothesis. However, only two of these slopes are statistically significant, and only at the 0.05 level. With at most nine counties in each model, the positive slopes are suggestive but the models cannot be considered reliable predictors of the relationship.

Some limitations of this study are that I only looked at Pennsylvania, only compared ten counties, and only considered one date in the mobility data. For a larger-scale analysis I would look at multiple states and pull data from multiple dates in the mobility reports. I purposefully chose Pennsylvania and April 19th for the reasons stated above; however, for a more robust study that could produce clearer results, it would be interesting to look at a wider range of states and dates.

My final conclusion is that there is some connection between Trump support in an area and adherence to stay-at-home orders, but the scale and significance of that connection remain very unclear.