In this report, I test whether Party Identification has a statistically significant relationship with three measures: views on Immigrant Contributions, views on Immigrant Naturalization, and feelings toward immigrants (ft_immig_2017).
dataset<-read.csv("/Users/apple/Downloads/Abbreviated Voter Dataset Labeled.csv")
dataset<- dataset[,c("PartyIdentification", "ImmigrantContributions", "ImmigrantNaturalization", "ft_immig_2017")]
dataset<-na.omit(dataset)
head(dataset, 10)
## PartyIdentification ImmigrantContributions ImmigrantNaturalization
## 1 Democrat Mostly Contribute Favor
## 2 Republican Mostly a Drain Not Sure
## 3 Republican Mostly Contribute Favor
## 5 Republican Mostly a Drain Not Sure
## 6 Democrat Mostly Contribute Favor
## 7 Democrat Mostly Contribute Favor
## 8 Independent Mostly a Drain Oppose
## 9 Democrat Mostly Contribute Favor
## 10 Democrat Mostly Contribute Favor
## 12 Independent Mostly Contribute Favor
## ft_immig_2017
## 1 95
## 2 96
## 3 77
## 5 91
## 6 100
## 7 100
## 8 1
## 9 90
## 10 80
## 12 75
suppressMessages(suppressWarnings(library(gmodels)))
table(dataset$PartyIdentification, dataset$ImmigrantContributions)
##
##               Mostly a Drain Mostly Contribute Neither Not Sure
##   Democrat               559               770     243      207
##   Independent            799               396     200      115
##   Not Sure                38                 9       3        5
##   Other                   31                 9       8        5
##   Republican            1049               107     133       59
round(prop.table(table(dataset$PartyIdentification, dataset$ImmigrantContributions))*100,2)
##
##               Mostly a Drain Mostly Contribute Neither Not Sure
##   Democrat             11.78             16.23    5.12     4.36
##   Independent          16.84              8.35    4.21     2.42
##   Not Sure              0.80              0.19    0.06     0.11
##   Other                 0.65              0.19    0.17     0.11
##   Republican           22.11              2.26    2.80     1.24
chisq.test(dataset$PartyIdentification, dataset$ImmigrantContributions)
## Warning in chisq.test(dataset$PartyIdentification,
## dataset$ImmigrantContributions): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: dataset$PartyIdentification and dataset$ImmigrantContributions
## X-squared = 740.94, df = 12, p-value < 2.2e-16
barplot(prop.table(table(dataset$ImmigrantContributions, dataset$PartyIdentification)) * 100,
        ylab = "Percentage", xlab = "Party", legend.text = TRUE,
        # One colour per ImmigrantContributions level (four levels).
        col = c("lightgreen", "pink", "lightblue", "yellow"),
        args.legend = list(x = "topright", inset = c(0.23, 0)))
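According to the chi-squared results, there is a statistically significant relationship between Immigrant Contributions and Party Identification, as the p-value is less than .05. Democrats most often say that immigrants mostly contribute (770 of 1,779, about 43%), while most Republicans (1,049 of 1,348, about 78%) and a small majority of Independents (799 of 1,510, about 53%) say they are mostly a drain. R warns that the chi-squared approximation may be incorrect, most likely because the Not Sure and Other party groups are small, so the exact p-value should be read with some caution.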
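The percentages printed above are shares of the whole sample, which makes it hard to compare parties of different sizes. A minimal sketch of the within-party view follows. It rebuilds the count table by hand from the output above (the object names `counts`, `row_pct`, and `expected` are illustrative and not part of the original analysis), computes row percentages, and inspects the expected counts that usually sit behind the chi-squared warning.

```r
# Counts copied from the Party x Immigrant Contributions table above.
counts <- matrix(
  c( 559, 770, 243, 207,
     799, 396, 200, 115,
      38,   9,   3,   5,
      31,   9,   8,   5,
    1049, 107, 133,  59),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    Party = c("Democrat", "Independent", "Not Sure", "Other", "Republican"),
    Contributions = c("Mostly a Drain", "Mostly Contribute", "Neither", "Not Sure")
  )
)

# Row percentages: share of each party giving each answer (rows sum to 100).
row_pct <- round(prop.table(counts, margin = 1) * 100, 1)
print(row_pct)

# Expected counts under independence. Cells below 5 are the usual reason
# chisq.test() warns that the approximation may be incorrect.
expected <- suppressWarnings(chisq.test(counts)$expected)
print(round(expected, 1))
print(sum(expected < 5))
```

If small expected counts are a concern, one option is to rerun the test with `chisq.test(counts, simulate.p.value = TRUE)`, which estimates the p-value by Monte Carlo simulation instead of relying on the chi-squared approximation.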
table(dataset$PartyIdentification, dataset$ImmigrantNaturalization)
##
##               Favor Not Sure Oppose
##   Democrat     1079      326    374
##   Independent   626      283    601
##   Not Sure       20       12     23
##   Other          20       15     18
##   Republican    285      276    787
round(prop.table(table(dataset$PartyIdentification, dataset$ImmigrantNaturalization))*100,2)
##
##               Favor Not Sure Oppose
##   Democrat    22.74     6.87   7.88
##   Independent 13.19     5.96  12.67
##   Not Sure     0.42     0.25   0.48
##   Other        0.42     0.32   0.38
##   Republican   6.01     5.82  16.59
chisq.test(dataset$PartyIdentification, dataset$ImmigrantNaturalization)
##
## Pearson's Chi-squared test
##
## data: dataset$PartyIdentification and dataset$ImmigrantNaturalization
## X-squared = 570.35, df = 8, p-value < 2.2e-16
barplot(prop.table(table(dataset$ImmigrantNaturalization, dataset$PartyIdentification)) * 100,
        ylab = "Percentage", xlab = "Party", legend.text = TRUE,
        # One colour per ImmigrantNaturalization level (three levels).
        col = c("lightgreen", "pink", "lightblue"),
        args.legend = list(x = "topright", inset = c(0.23, 0)))
According to the chi-squared results, there is a statistically significant relationship between Party Identification and Immigrant Naturalization, as the p-value is less than .05. According to the table, most Democrats favor naturalization (1,079 of 1,779, about 61%) and most Republicans oppose it (787 of 1,348, about 58%). Independents are almost evenly split, with 626 in favor (about 42%) and 601 opposed (about 40%).
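With a sample this large, almost any association will be statistically significant, so an effect size helps show how strong the relationships are. The sketch below computes Cramér's V directly from the chi-squared statistics reported above. The helper `cramers_v` is a hypothetical function written for this illustration, and the sample size of 4,745 is taken as the sum of the counts in the tables.

```r
# Cramer's V from a reported chi-squared statistic:
# V = sqrt(chisq / (n * (min(rows, cols) - 1)))
cramers_v <- function(chisq, n, n_rows, n_cols) {
  sqrt(chisq / (n * (min(n_rows, n_cols) - 1)))
}

n <- 4745  # total respondents, summed from the tables above

# Party (5 levels) x Immigrant Contributions (4 levels)
v_contrib <- cramers_v(740.94, n, n_rows = 5, n_cols = 4)

# Party (5 levels) x Immigrant Naturalization (3 levels)
v_natural <- cramers_v(570.35, n, n_rows = 5, n_cols = 3)

print(round(c(contributions = v_contrib, naturalization = v_natural), 3))
```

Cramér's V runs from 0 (no association) to 1 (perfect association), so it gives a scale-free way to compare the two tables even though they have different numbers of columns.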
suppressMessages(suppressWarnings(library(psych)))
suppressMessages(suppressWarnings(library(ggpubr)))
describeBy(dataset$ft_immig_2017, dataset$PartyIdentification)
##
## Descriptive statistics by group
## group: Democrat
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1779 70.27 24.79 76 73.21 25.2 0 100 100 -0.89 0.25 0.59
## ------------------------------------------------------------
## group: Independent
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1510 61.28 26.87 61 63.32 28.17 0 100 100 -0.51 -0.45 0.69
## ------------------------------------------------------------
## group: Not Sure
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 55 48.13 29.8 50 47.71 35.58 0 100 100 -0.04 -1.03 4.02
## ------------------------------------------------------------
## group: Other
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 53 59.62 27 63 60.93 26.69 0 100 100 -0.33 -0.75 3.71
## ------------------------------------------------------------
## group: Republican
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1348 52.42 26.93 51 53.09 31.13 0 100 100 -0.22 -0.74 0.73
summary(aov(ft_immig_2017 ~ PartyIdentification, data = dataset))
## Df Sum Sq Mean Sq F value Pr(>F)
## PartyIdentification 4 256973 64243 93.83 <2e-16 ***
## Residuals 4740 3245209 685
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(dataset, aes(x=PartyIdentification, y=ft_immig_2017, fill=PartyIdentification)) +
geom_boxplot(alpha=0.3) +
theme(legend.position="none") +
scale_fill_brewer(palette="Dark2")
According to the one-way ANOVA table, mean scores on ft_immig_2017 differ across the Party Identification groups, F(4, 4740) = 93.83, p < 2e-16. Because the p-value is less than .05, the difference is statistically significant.
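The F test says the group means differ but not how much of the variation in scores party explains. As a rough sketch, eta squared can be computed from the sums of squares in the ANOVA table above. The variable names are illustrative, and the numbers are copied from the printed output rather than recomputed from the data.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_between  <- 256973   # PartyIdentification
ss_residual <- 3245209  # Residuals
df_between  <- 4
df_residual <- 4740

# Eta squared: proportion of total variance in ft_immig_2017
# associated with party identification.
eta_sq <- ss_between / (ss_between + ss_residual)
print(round(eta_sq, 3))

# Sanity check: rebuild the F statistic from the same table.
f_check <- (ss_between / df_between) / (ss_residual / df_residual)
print(round(f_check, 2))
```

To see which specific pairs of parties differ, a post-hoc comparison such as `TukeyHSD(aov(ft_immig_2017 ~ PartyIdentification, data = dataset))` could be run on the original data.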
- There is a statistically significant relationship between Party Identification and both Immigrant Contributions and Immigrant Naturalization.
- There is a statistically significant difference in feelings toward immigrants (ft_immig_2017) among the Party Identification groups. The highest average was for Democrats (70.27) and the lowest was for the Not Sure group (48.13, n = 55); among the three large groups, Republicans had the lowest average (52.42).