This analysis seeks to answer two questions.
Question 1: Is a person’s residential area related to their feelings towards police?
Question 2: Is the area where a person lives related to their belief that the killings of African American men by police are isolated incidents or part of a broader pattern of how police treat African Americans?
The Democracy Fund Voter Study Group conducts research designed to help policy makers and thought leaders listen more closely, and respond more powerfully, to the views of American voters. Their goal is to have more productive conversations where voters feel like they are truly heard. Their hope is that the study group’s research and analysis helps us understand each other and make our democracy more functional.
The VOTER Survey (Views of the Electorate Research Survey) is the study group’s first original research.
This report will analyze responses to the VOTER Survey in order to answer the research questions outlined above.
To answer these questions, we will look at respondents' answers to three survey questions:
ResidenceArea: How would you describe the place where you live?
PoliceThreat: Do you think the recent killings of African American men by police in recent years are isolated incidents, or are they part of a broader pattern of how police treat African Americans?
FTPolice: We’d like to get your feelings towards [Police]. Ratings between 50 degrees and 100 degrees mean that you feel favorable and warm toward the group. Ratings between 0 degrees and 50 degrees mean that you don’t feel favorable toward the group and that you don’t care too much for that group. You would rate the group at the 50 degree mark if you don’t feel particularly warm or cold toward the group.
By looking at how the average rating of police varies based on a person’s area of residence, we will be able to answer question #1.
Looking at the differences in average FTPolice among residential areas such as city, suburb, town, and rural areas will give us an idea of how their ratings vary.
Looking at the distribution will give us some more detailed insight.
Plotting a sampling distribution will allow us to forecast whether or not a t-test is likely to yield statistical significance.
By looking at a person’s belief on the killings of African American men by police, we will be able to answer question #2.
A crosstab will allow us to investigate how people are distributed across categories.
A chi-squared test will allow us to determine whether the two variables (ResidenceArea and PoliceThreat) are independent of one another or are associated.
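The sampling-distribution step in the plan above can be sketched on simulated data. This is a minimal illustration, not the report's own code: the `population` vector is made up, and only the sample size of 40 mirrors the analysis that follows. It shows that the means of repeated samples spread out by roughly the standard deviation of the ratings divided by the square root of the sample size, which is why two groups whose sampling distributions barely overlap are likely to differ significantly in a t-test.

```r
# Simulated 0-100 ratings; these values are NOT from the VOTER Survey.
set.seed(123)
population <- pmin(pmax(round(rnorm(5000, mean = 75, sd = 25)), 0), 100)

n    <- 40     # respondents per sample, as in the report
reps <- 2000   # number of repeated samples (the report uses 10000)

# Draw a sample, take its mean, and repeat
samp_means <- replicate(reps, mean(sample(population, n)))

# Spread of the simulated means vs. the textbook standard error sd / sqrt(n)
observed_se    <- sd(samp_means)
theoretical_se <- sd(population) / sqrt(n)
c(observed = observed_se, theoretical = theoretical_se)
```

The two numbers should be close; the small gap comes from simulation noise and from sampling without replacement.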
knitr::opts_chunk$set(warning = FALSE)
library(dplyr)
library(ggplot2)
library(readr)
library(knitr)
library(plotly)
Voter<-read_csv("VoterData2017(1).csv")
## Warning: 13 parsing failures.
## row col expected actual file
## 1418 religpew_muslim_baseline 1/0/T/F/TRUE/FALSE 90 'VoterData2017(1).csv'
## 1531 child_age7_1_baseline 1/0/T/F/TRUE/FALSE 6 'VoterData2017(1).csv'
## 1531 child_age8_1_baseline 1/0/T/F/TRUE/FALSE 4 'VoterData2017(1).csv'
## 1531 child_age9_1_baseline 1/0/T/F/TRUE/FALSE 2 'VoterData2017(1).csv'
## 2947 religpew_muslim_baseline 1/0/T/F/TRUE/FALSE 2 'VoterData2017(1).csv'
## .... ........................ .................. ...... ......................
## See problems(...) for more details.
# Recode numeric survey codes into readable labels; unlisted codes become NA
VoterData <- Voter %>%
  mutate(urbancity_baseline = ifelse(urbancity_baseline == 1, "City",
                              ifelse(urbancity_baseline == 2, "Suburb",
                              ifelse(urbancity_baseline == 3, "Town",
                              ifelse(urbancity_baseline == 4, "Rural", NA)))),
         police_threat_2016 = ifelse(police_threat_2016 == 1, "Isolated incidents",
                              ifelse(police_threat_2016 == 2, "Part of a broader pattern",
                              ifelse(police_threat_2016 == 8, "Don't know", NA))),
         # 997 is treated as a missing rating on the feeling thermometer
         ft_police_2017 = ifelse(ft_police_2017 == 997, NA, ft_police_2017)) %>%
  select(ft_police_2017, police_threat_2016, urbancity_baseline) %>%
  rename("ResidenceArea" = urbancity_baseline, "PoliceThreat" = police_threat_2016, "FTPolice" = ft_police_2017)

When respondents were asked to rate their feelings towards police on a scale of 0 to 100, with 0 being negative, 50 being neutral, and 100 being positive, how did their average rating differ according to residential area?
VoterData1 <- VoterData%>%
filter(!is.na(ResidenceArea))%>%
group_by(ResidenceArea)%>%
summarize(AvgFTPolice = mean(FTPolice, na.rm =TRUE))
kable(VoterData1)

| ResidenceArea | AvgFTPolice |
|---|---|
| City | 71.16739 |
| Rural | 79.57765 |
| Suburb | 76.07696 |
| Town | 78.79226 |
The chart below presents the same information as the above table, in a more visual format.
VoterData1%>%
#summarise(FTPolice = mean(FTPolice, na.rm = TRUE))%>%
ggplot()+
geom_col(aes(x=ResidenceArea, y=AvgFTPolice, fill=ResidenceArea))+
geom_label(aes(x=ResidenceArea, y=AvgFTPolice,
label = round(AvgFTPolice)))+
theme(legend.position="none")

Looking at the distribution of ratings may give us more insight into how respondents in each residential area rated their feelings towards police.
We can see that the most common rating given by respondents in every residential area (town, city, suburb, and rural) is 100.
Among city and suburb respondents, the second most common rating is 76.
VoterData%>%
ggplot()+
geom_histogram(aes(x=FTPolice, fill=ResidenceArea))+
facet_wrap(~ResidenceArea)+
theme(legend.position="none")

#City Sampling Distribution
city_data<-VoterData%>%
filter(ResidenceArea=="City")
City_Samp_Dist<-replicate(10000,
mean(sample(city_data$FTPolice,40),na.rm=TRUE))%>%
data.frame()%>%
rename("mean"=1)
#Rural Area Sampling Distribution
rural_data<-VoterData%>%
filter(ResidenceArea=="Rural")
Rural_Samp_Dist<-replicate(10000,
mean(sample(rural_data$FTPolice,40),na.rm=TRUE))%>%
data.frame()%>%
rename("mean"=1)
#Suburb Sampling Distribution
suburb_data<-VoterData%>%
filter(ResidenceArea=="Suburb")
Suburb_Samp_Dist<-replicate(10000,
mean(sample(suburb_data$FTPolice,40),na.rm=TRUE))%>%
data.frame()%>%
rename("mean"=1)
#Town Sampling Distribution
town_data<-VoterData%>%
filter(ResidenceArea=="Town")
Town_Samp_Dist<-replicate(10000,
mean(sample(town_data$FTPolice,40),na.rm=TRUE))%>%
data.frame()%>%
rename("mean"=1)

When these distributions are plotted on the same graph, we can see clearly that the sampling distributions for rural, suburb, and town residents overlap around a mean of about 80. In contrast, the sampling distribution for city residents peaks at a lower mean and stands apart from the others.
ggplotly(ggplot()+
geom_histogram(data=City_Samp_Dist, aes(x=mean),fill="Red")+
geom_histogram(data=Rural_Samp_Dist, aes(x=mean),fill="Green")+
geom_histogram(data=Suburb_Samp_Dist, aes(x=mean),fill="Blue")+
geom_histogram(data=Town_Samp_Dist, aes(x=mean),fill="Yellow"))

A p-value of 1.694e-15 indicates that there is a statistically significant difference in the mean feeling towards police between rural and city residents.
VoterData2 <- VoterData%>%
filter(ResidenceArea%in%c("City", "Rural"))
t.test(FTPolice~ResidenceArea, data=VoterData2)
##
## Welch Two Sample t-test
##
## data: FTPolice by ResidenceArea
## t = -8.0243, df = 2065.7, p-value = 1.694e-15
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -10.465698 -6.354833
## sample estimates:
## mean in group City mean in group Rural
## 71.16739 79.57765
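To make the Welch output above easier to read, here is a minimal sketch of how the statistic, degrees of freedom, and p-value are computed. It uses simulated ratings, not the VOTER data; the vectors `city` and `rural` and their parameters are assumptions chosen only for illustration. The hand calculation is then compared with `t.test()`, which uses the Welch (unequal variance) form by default.

```r
# Simulated ratings; these values are NOT from the VOTER Survey.
set.seed(42)
city  <- rnorm(200, mean = 71, sd = 25)
rural <- rnorm(180, mean = 80, sd = 22)

# Squared standard error of each group mean
v_city  <- var(city)  / length(city)
v_rural <- var(rural) / length(rural)

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean(city) - mean(rural)) / sqrt(v_city + v_rural)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (v_city + v_rural)^2 /
  (v_city^2 / (length(city) - 1) + v_rural^2 / (length(rural) - 1))

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df)

# Built-in test for comparison
res <- t.test(city, rural)
c(t = t_stat, df = df, p = p_val)
```

The fractional degrees of freedom in the report's output (2065.7) come from this same approximation.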
Do residents of the different residential areas think the recent killings of African American men by police are isolated incidents, or part of a broader pattern of how police treat African Americans?
50% of city residents view the killings of African American men by police as part of a broader pattern.
61% of rural area residents view the killings of African American men by police as isolated incidents.
| | City | Rural | Suburb | Town |
|---|---|---|---|---|
| Don’t know | 0.09 | 0.10 | 0.09 | 0.10 |
| Isolated incidents | 0.41 | 0.61 | 0.50 | 0.56 |
| Part of a broader pattern | 0.50 | 0.30 | 0.40 | 0.34 |
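Each column in a crosstab like this one is the share of responses within one residential area. A minimal sketch of that calculation, assuming a tiny made-up data frame `toy` in place of the survey data, uses `table()` and `prop.table()` with `margin = 2` so that every column sums to 1:

```r
# Made-up responses; these rows are NOT from the VOTER Survey.
toy <- data.frame(
  PoliceThreat = c("Isolated incidents", "Part of a broader pattern",
                   "Isolated incidents", "Don't know",
                   "Part of a broader pattern", "Isolated incidents"),
  ResidenceArea = c("City", "City", "Rural", "Rural", "City", "Rural")
)

# Counts of each response within each area (NA values are dropped by default)
tab <- table(toy$PoliceThreat, toy$ResidenceArea)

# Column proportions: the share of each response within an area
props <- prop.table(tab, margin = 2)
round(props, 2)
```

Because each proportion is rounded separately, a column of rounded values can sum to 0.99 or 1.01 instead of exactly 1.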
The bar chart below presents the same information as the above table, in a more visual format.
VoterData%>%
group_by(ResidenceArea, PoliceThreat)%>%
summarize(n=n())%>%
mutate(percent=n/sum(n))%>%
ggplot()+
geom_col(aes(x=ResidenceArea, y=percent, fill=PoliceThreat))

Expected counts: the number of respondents who would fall in each response category if the two variables were completely unrelated to each other.
Observed counts: the number of respondents actually in each category.
A p-value <2.2e-16 indicates that there is a statistically significant relationship between these two variables.
##
## Pearson's Chi-squared test
##
## data: VoterData$PoliceThreat and VoterData$ResidenceArea
## X-squared = 184.2, df = 6, p-value < 2.2e-16
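The comparison of expected and observed counts behind this test can be sketched as follows. The counts in `obs` are hypothetical and do not come from the VOTER Survey. Under independence, each expected count is the row total times the column total divided by the grand total, and the statistic sums the squared gaps between observed and expected counts, scaled by the expected counts.

```r
# Hypothetical observed counts; NOT taken from the VOTER Survey.
obs <- matrix(c(40, 60,
                55, 45,
                70, 30),
              nrow = 2,
              dimnames = list(PoliceThreat  = c("Isolated", "Pattern"),
                              ResidenceArea = c("City", "Suburb", "Rural")))

# Expected counts if the two variables were unrelated
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Pearson chi-squared statistic, degrees of freedom, and p-value by hand
chi_sq <- sum((obs - expected)^2 / expected)
df     <- (nrow(obs) - 1) * (ncol(obs) - 1)
p_val  <- pchisq(chi_sq, df, lower.tail = FALSE)

# Built-in test for comparison
res <- chisq.test(obs)
res$expected
```

The same rule explains the report's `df = 6`: three response categories and four residential areas give (3 - 1) × (4 - 1) = 6.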
Question 1.
Is a person’s residential area related to their feelings towards police?
Rural area residents, on average, rate their feelings towards police as 80.
City residents, on average, rate their feelings towards police as 71.
A t-test confirms that there is a statistically significant difference in the average feeling towards police between city and rural area residents.
Question 2.
Is the area where a person lives related to their belief that the killings of African American men by police are isolated incidents or part of a broader pattern of how police treat African Americans?
50% of city residents view the killings of African American men by police as part of a broader pattern of how police treat African Americans.
61% of rural area residents view the killings of African American men by police as isolated incidents.
A chi-squared test for independence indicates that residence area (ResidenceArea) and views on police killings (PoliceThreat) are not independent of one another.