The following table shows the correlation of each variable with GG_Level; only the first 10 rows are shown to save space.
In the correlation plot, positive correlations are marked in green: these variables are associated with viewing Muslim Americans as having a dual identity. Negative correlations are marked in red: these variables are associated with viewing Muslim Americans as having only one identity.
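library(dplyr)    # %>%, select(), rename(), mutate(), recode()
library(corrr)    # correlate(), focus()
library(ggplot2)  # ggplot() and its geoms
library(plotly)   # ggplotly()
library(ggmosaic) # geom_mosaic(), product()
corre <- read.csv("~/Downloads/GG_identification_Nov_2018__Muslim_Americans_Correlational - 20190716.csv") %>%
  dplyr::select(GG_Level:Gender)
x <- corre %>%
  correlate() %>%
  focus(GG_Level) %>%
  rename(Correlation_with_GG_Level = GG_Level)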
##
## Correlation method: 'pearson'
## Missing treated using: 'pairwise.complete.obs'
x
## # A tibble: 31 x 2
## rowname Correlation_with_GG_Level
## <chr> <dbl>
## 1 Simon_Dual_identity_SCL 0.619
## 2 Angst_SCL -0.470
## 3 Stereotype_SCL -0.540
## 4 Symb_Threat_SCL -0.659
## 5 Real_Threat_SCL -0.489
## 6 ITT -0.621
## 7 GG_Dehumanization -0.627
## 8 Peception_of_GC_mediator 0.200
## 9 Peception_of_GC_bridge 0.291
## 10 Peception_of_GC_fifth_column -0.387
## # … with 21 more rows
x$color <- ifelse(x$Correlation_with_GG_Level > 0, "positive", "negative")
x %>%
  # mutate(rowname = factor(rowname, levels = rowname[order(GG_Level)])) %>% # Order by correlation strength
  ggplot(aes(x = rowname, y = Correlation_with_GG_Level, fill = color)) +
  scale_fill_manual(values = c(positive = "green", negative = "red")) +
  geom_bar(stat = "identity") +
  ylab("Correlation with GG_Level") +
  xlab("Variable") +
  geom_hline(yintercept = 0, size = 1) +
  coord_flip()
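As a rough cross-check on what `correlate()` and `focus()` report, the same kind of table can be sketched in base R. The data frame `toy` and its columns `Var_A` and `Var_B` are made-up values for illustration only, not the survey data; the sketch assumes Pearson correlations with pairwise-complete observations, as stated in the console message above.

```r
# Toy data (hypothetical): three numeric columns with a few missing values.
toy <- data.frame(
  GG_Level = c(10, 20, 30, 40, 50, NA),
  Var_A    = c(1, 2, 3, 4, 6, 7),
  Var_B    = c(9, 7, NA, 4, 2, 1)
)

# Pearson correlation of every column with GG_Level. Each pair of columns
# uses all rows in which both values are present.
r <- cor(toy, use = "pairwise.complete.obs", method = "pearson")[, "GG_Level"]

# Drop the correlation of GG_Level with itself.
r <- r[names(r) != "GG_Level"]

# Label each correlation by its sign, as done for the bar colors.
toy_corr <- data.frame(
  rowname = names(r),
  Correlation_with_GG_Level = unname(r),
  color = unname(ifelse(r > 0, "positive", "negative")),
  stringsAsFactors = FALSE
)
print(toy_corr)
```

In this toy example `Var_A` rises with `GG_Level` and is labeled positive, while `Var_B` falls and is labeled negative.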
Next, the variables are sorted by their correlation with GG_Level, with the highest on top. The plot is interactive: hovering over a bar displays the variable name, its correlation with GG_Level, and whether that correlation is positive or negative. Simon_Dual_identity_SCL ranks highest and Symb_Threat_SCL ranks lowest.
y <- x %>%
  mutate(rowname =
           factor(rowname, levels = rowname[order(Correlation_with_GG_Level)])) %>% # Order by correlation strength
  ggplot(aes(x = rowname, y = Correlation_with_GG_Level, fill = color)) +
  scale_fill_manual(values = c(positive = "green", negative = "red")) +
  geom_bar(stat = "identity") +
  ylab("Correlation with GG_Level") +
  xlab("Variable") +
  geom_hline(yintercept = 0, size = 1) +
  coord_flip() # + geom_text(aes(label = GG_Level))
ggplotly(y)
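The sorting relies on factor levels: ggplot2 draws a discrete axis in level order, so listing the levels from lowest to highest correlation puts the highest bar on top once the axes are flipped. A minimal sketch with made-up names and values (`toy_names` and `toy_values` are hypothetical) shows the reordering on its own, together with `reorder()` from base R as one possible alternative.

```r
# Hypothetical variable names and correlations, for illustration only.
toy_names  <- c("Var_A", "Var_B", "Var_C")
toy_values <- c(0.2, -0.6, 0.5)

# Same idiom as in the plot code: levels follow the ascending order
# of the correlation values.
sorted <- factor(toy_names, levels = toy_names[order(toy_values)])
print(levels(sorted))

# reorder() builds the same level order from the values directly.
alt <- reorder(toy_names, toy_values)
print(levels(alt))
```

Both calls give the level order `Var_B`, `Var_A`, `Var_C`, that is, from the most negative to the most positive value.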
Finally, I am interested in whether participants' age, gender, and GG_Level are related. Before drawing the mosaic plot, we need to process the raw data.
According to the data legend file, gender code 1 is male and 2 is female, so I recoded the values accordingly.
Age is divided into three groups: 19-35 as young adults, 36-55 as middle-aged adults, and over 55 as elder adults. Ages 18 and under fall into a separate group labeled 0.
GG_Level is divided into three levels: -25 to 0 for participants who view Muslim Americans as having a single identity, 0 to 25 for those with a neutral attitude, and 25 to 50 for those who view Muslim Americans as having a dual identity. Each interval includes its upper bound.
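The binning described above can be tried on a few boundary values to see where each one lands. This is only a sketch: the vectors `ages` and `gg` and the label `Under_19` are made up for illustration, and whether the real data contain a GG_Level of exactly -25 is not known from this report. By default `cut()` uses right-closed intervals, so the lowest break itself is excluded unless `include.lowest = TRUE` is set.

```r
# Hypothetical ages sitting on and next to the break points.
ages <- c(18, 19, 35, 36, 55, 56)
age_group <- cut(ages, breaks = c(-Inf, 18, 35, 55, Inf),
                 labels = c("Under_19", "Young", "Middle_Aged", "Elder"))
print(data.frame(ages, age_group))

# Hypothetical GG_Level scores, including both extremes.
gg <- c(-25, -10, 0, 10, 25, 50)

# Default: intervals are (-25,0], (0,25], (25,50], so -25 becomes NA.
view_default <- cut(gg, breaks = c(-25, 0, 25, 50),
                    labels = c("single", "neutral", "dual"))

# include.lowest = TRUE closes the first interval on the left: [-25,0].
view_closed <- cut(gg, breaks = c(-25, 0, 25, 50),
                   labels = c("single", "neutral", "dual"),
                   include.lowest = TRUE)
print(data.frame(gg, view_default, view_closed))
```

If scores of exactly -25 do occur, the default binning would turn them into missing values, and a later `na.omit()` would drop those rows.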
The mosaic plot yields some interesting findings:
In general, there are more males than females in all three age groups (young, middle-aged, and elder). For both genders, the proportion of young participants is higher than the proportion of elder participants.
Elder participants are more likely to view Muslim Americans as having a single identity, while young adults usually view them as having a dual identity.
corree <- corre %>%
  mutate(Gender = recode(Gender, '1' = "Male", '2' = "Female")) %>%
  # cut() intervals are right-closed: (-Inf,18], (18,35], (35,55], (55,Inf)
  mutate(Age = cut(Age, breaks = c(-Inf, 18, 35, 55, Inf),
                   labels = c("0", "Young(19-35)", "Middle_Aged(36-55)", "Elder(55+)"))) %>%
  mutate(Viewas_dual = cut(GG_Level, breaks = c(-25, 0, 25, 50),
                           labels = c("single", "neutral", "dual")))
gg <- corree %>%
  na.omit() %>% # drops every row with a missing value in any column
  ggplot() +
  geom_mosaic(aes(x = product(Viewas_dual, Age, Gender), fill = Gender)) +
  theme(axis.text.x = element_text(angle = 10, hjust = 1))
ggplotly(gg)
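Tile areas in a mosaic plot can be hard to compare by eye, so one way to back up such readings is a plain contingency table. The sketch below uses a small made-up data frame (`toy`, with invented rows) rather than the survey data, and shows how counts and row-wise shares could be computed in base R.

```r
# Hypothetical recoded data, for illustration only.
toy <- data.frame(
  Gender      = c("Male", "Male", "Male", "Female", "Female", "Male"),
  Age         = c("Young", "Young", "Elder", "Young", "Elder", "Elder"),
  Viewas_dual = c("dual", "dual", "single", "dual", "single", "neutral"),
  stringsAsFactors = FALSE
)

# Number of participants per gender.
gender_counts <- table(toy$Gender)
print(gender_counts)

# Counts of each identity view within each age group.
counts <- table(toy$Age, toy$Viewas_dual)

# Share of each view within an age group (each row sums to 1).
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Applied to the real `corree` data frame, the same two calls would give the numbers behind the tile sizes.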