rm(list=ls())## Install "pacman" package if not installed# (remove the # symbol from the line below):# install.packages("pacman")## Load R packages:pacman::p_load(data.table, tidyverse, haven, labelled, vtable, psych, scales, weights, clipr, forcats, stargazer, ggthemes, ggcharts, geomtextpath, corrplot, tm)## Import dataset:ds <-read_sav("~/Desktop/oxford/data/pakistan/Pakistan Data.sav")options(digits =2)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0 0 2 2 3 10 34
Display code
ds %>%drop_na(children)%>%ggplot(aes(x = children))+geom_bar(color ="black",fill ="gray",width =0.75)+geom_textvline(label ="Mean = 2.00", xintercept =2.00, vjust =1.3, lwd =1.05, linetype =2)+labs(x ="Number of children", y ="Frequency", title ="Number of children: Pakistan")+theme_bw()
Variable: Ethnicity
Note: I did my best to clean up the variable, but the categories still need more work. Some of the categories (like “Christian” or “Hindu” or “Urdu speaking”) do not seem like ethnic groups.
Section 2. Factor Analysis: Group fusion/identification
Display code
## Four ingroup fusion items:# I have a deep emotional bond with the [ingroup].ds$IGF01 <-as.numeric(ds$Q11_1)# I am strong because of the [ingroup].ds$IGF02 <-as.numeric(ds$Q11_2)# I make the [ingroup] strong. ds$IGF03 <-as.numeric(ds$Q11_3)# I am one with the [ingroup]## TWO IDENTICAL ITEMS!!! ds$IGF04a <-as.numeric(ds$Q11_4)ds$IGF04b <-as.numeric(ds$Q11_5)table(ds$IGF04a)
1 2 3 4 5 6 7
4 6 8 31 80 172 203
Display code
table(ds$IGF04b)
1 2 3 4 5 6 7
35 60 46 101 115 101 46
Display code
corr.test(ds$IGF04a, ds$IGF04b)
Call:corr.test(x = ds$IGF04a, y = ds$IGF04b)
Correlation matrix
[1] 0.08
Sample Size
[1] 504
These are the unadjusted probability values.
The probability values adjusted for multiple tests are in the p.adj object.
[1] 0.07
To see confidence intervals of the correlations, print with the short=FALSE option
Display code
## Four outgroup fusion items:# I have a deep emotional bond with the [outgroup].# DOES NOT EXIST!!!## Assuming Q11_5 is first outgroup fusion item ## (instead of duplicate of 4th ingroup fusion item):ds$IGF04 <-as.numeric(ds$Q11_4)ds$OGF01 <-as.numeric(ds$Q11_5)# I am strong because of the [outgroup].ds$OGF02 <-as.numeric(ds$Q11_6)# I make the [outgroup] strong. ds$OGF03 <-as.numeric(ds$Q11_7)# I am one with the [outgroup].ds$OGF04 <-as.numeric(ds$Q11_8)## Four ingroup identification items:# I identify with the [ingroup].ds$IGI01 <-as.numeric(ds$Q11_9)# I have a lot in common with the [ingroup].ds$IGI02 <-as.numeric(ds$Q11_10)# I connect with the values of the [ingroup].ds$IGI03 <-as.numeric(ds$Q11_11)# I feel a sense of belonging with the [ingroup].ds$IGI04 <-as.numeric(ds$Q11_12)## Four outgroup identification items:# I identify with the [outgroup]. ds$OGI01 <-as.numeric(ds$Q11_13)# I have a lot in common with the [outgroup]. ds$OGI02 <-as.numeric(ds$Q11_14)# I connect with the values of the [outgroup]. ds$OGI03 <-as.numeric(ds$Q11_15)# I feel a sense of belonging with the [outgroup]. ds$OGI04 <-as.numeric(ds$Q11_16)
Shorthand for item names: IG = Ingroup, OG = Outgroup F = Fusion, I = Identification So for example: IGF01 = “ingroup fusion: item 1” OGI04 = “outgroup identification: item 4”, and so on
The correlation plot above shows the correlation coefficients between the sixteen different items (eight for ingroup fusion/identification and eight for outgroup fusion/identification).
The general criteria is that KMO needs to be greater than 0.60. Based on the results (overall MSA = 0.9), factor analysis is appropriate.
Bartlett’s test of sphericity
Display code
cortest.bartlett(bonds)
$chisq
[1] 4939
$p.value
[1] 0
$df
[1] 120
The test is significant, again suggesting that factor analysis is appropriate.
Parallel test
Display code
parallel <-fa.parallel(bonds)
Parallel analysis suggests that the number of factors = 2 and the number of components = 2
Based on the scree plot, factor analysis with two factors is the most appropriate. We will proceed with promax rotation, which assumes that the items are inter-correlated (that is, not independent from each other).
Factor analysis: Two factor model with promax rotation
Call:
factanal(x = bonds, factors = 2, rotation = "promax")
Uniquenesses:
IGF01 IGF02 IGF03 IGF04 IGI01 IGI02 IGI03 IGI04 OGF01 OGF02 OGF03 OGF04 OGI01
0.46 0.39 0.42 0.40 0.61 0.37 0.31 0.61 0.67 0.49 0.36 0.38 0.63
OGI02 OGI03 OGI04
0.40 0.28 0.23
Loadings:
Factor1 Factor2
IGF01 0.727
IGF02 0.779
IGF03 0.756
IGF04 0.774
IGI01 0.618
IGI02 0.794
IGI03 0.834
IGI04 0.629
OGF01 0.579
OGF02 0.711
OGF03 0.801
OGF04 0.782
OGI01 0.592 -0.197
OGI02 0.774
OGI03 0.846
OGI04 0.873
Factor1 Factor2
SS loadings 4.53 4.46
Proportion Var 0.28 0.28
Cumulative Var 0.28 0.56
Factor Correlations:
Factor1 Factor2
Factor1 1.000 -0.083
Factor2 -0.083 1.000
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 602 on 89 degrees of freedom.
The p-value is 0.000000000000000000000000000000000000000000000000000000000000000000000000000043
The promax rotated FA output suggests that there are two factors:
Factor 1 consisting of the eight outgroup fusion/identification items
Factor 2 consisting of the eight ingroup fusion/identification items
The two factors are negatively correlated (coefficient = -0.082)
The two factors cumulatively explain 56% of the variance in the data. Factor 1 explains 28% of the variance, and factor 2 explains 28% of the variance.
Three factor solution looks reasonable. We will split the factor solution into: two factors for ingroup fusion and identification, and one factor for outgroup fusion/identification (called outgroup bonds).
## BCL and BBL items:# BCL_01:# Seek out opportunities to bridge social divisions with their opponents, enemies, opposition groups, or other outgroups. # Variables: Q6.1, Not found# BCL_02:# Demonstrate willingness to compromise with their opponents, enemies, opposition groups, or other outgroups. # Variables: Q6.2, Q7.1# BCL_03:# Try to understand and empathize with their opponents, enemies, opposition groups, or other outgroups. # Variables: Q6.3, Q7.2# BBL_01:# Represent the interests of the communities and groups that they belong to even at the cost of other groups.# Variables: Q6.4, Q7.3# BBL_02:# Focus on building stronger connections/relationships within the communities and groups they belong to rather than building stronger relationships with other groups across boundaries.# Variables: Q6.5, (Q7.4 and 7.5)# BBL_03:# Try to gain benefits for the communities and groups they belong to even at the expense of other groups.# Variables: Q6.6, Q7.6### PROBLEM ##### there is no item for EXP_BCL_01 (experience of BCL, item 01): # (Seek out opportunities to bridge social...)## BUt, EXP_BBL_02 (endorsement of BBL, item 2) has two nearly identically worded items: Q7_4 and Q7_5:# (Focus on building stronger relationships /connections.... )## Going over this with research partners in Pakistan confirmed this. ## So this scale for Pakistan will have a missing item (EXP_BCL_01)## while one item (EXP_BBL_02) was asked twice, and only one will be retainedds$ENDBCL01 <-as.numeric(ds$Q6_1)ds$ENDBCL02 <-as.numeric(ds$Q6_2)ds$ENDBCL03 <-as.numeric(ds$Q6_3)ds$ENDBBL01 <-as.numeric(ds$Q6_4)ds$ENDBBL02 <-as.numeric(ds$Q6_5)ds$ENDBBL03 <-as.numeric(ds$Q6_6)ds$EXPBCL01 <-NAds$EXPBCL02 <-as.numeric(ds$Q7_1)ds$EXPBCL03 <-as.numeric(ds$Q7_2)ds$EXPBBL01 <-as.numeric(ds$Q7_3)## For EXP_BBL_02, # Q7.4: Focus on building stronger relationships...# Q7.5: Focus on building stronger connections...## Q7.4 was dropped and Q7.5 was used:ds$EXPBBL02 <-as.numeric(ds$Q7_5)ds$EXPBBL03 <-as.numeric(ds$Q7_6)leadership <-cbind.data.frame(ds$ENDBCL01, ds$ENDBCL02, ds$ENDBCL03, ds$ENDBBL01, ds$ENDBBL02, ds$ENDBBL03, ds$EXPBCL02, ds$EXPBCL03, ds$EXPBBL01, ds$EXPBBL02, ds$EXPBBL03)names(leadership) <-sub('ds\\$', '', names(leadership))leadership <-na.omit(leadership)mtx1 <-cor(leadership[, c(1:11)])
Shorthand for item names: END = Endorse, EXP = Experience BCL = BCL, BBL = BBL So for example: ENDBCL01 = “Endorsement of BCL, item 1” EXPBBL02 = “Experience of BBL, item 2”, and so on
## Four regression models predicting endorsement and experience of BCL / BBL:lm01 <-lm(Endorse_BCL~IG_Fusion+IG_Identification+OG_Bonds+Age+Female+Married+`SES-`,data = ds)lm02 <-lm(Experience_BCL~IG_Fusion+IG_Identification+OG_Bonds+Age+Female+Married+`SES-`, data = ds)lm03 <-lm(Endorse_BBL~IG_Fusion+IG_Identification+OG_Bonds+Age+Female+Married+`SES-`, data = ds)lm04 <-lm(Experience_BBL~IG_Fusion+IG_Identification+OG_Bonds+Age+Female+Married+`SES-`, data = ds)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.0 3.0 4.0 4.4 6.0 7.0
Display code
ds %>%drop_na(history_discrimination)%>%ggplot(aes(x = history_discrimination))+geom_histogram(color ="black",fill ="gray",bins =20)+geom_textvline(label ="Mean = 4.40", xintercept =4.40, vjust =1.1, lwd =1.05, linetype =2)+labs(x ="Perceived history of discrimination score", y ="Frequency", title ="Perceived history of discrimination")+theme_bw()
This set of regression models have additional predictor variables compared to the last set. It adds empathic concern, perspective taking, and perceived history of discrimination as additional predictors.
Display code
## Four regression models predicting endorsement and experience of BCL / BBL:lm01 <-lm(Endorse_BCL~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)lm02 <-lm(Experience_BCL~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)lm03 <-lm(Endorse_BBL~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)lm04 <-lm(Experience_BBL~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)
Section 9. Alternative regression models with different outcomes
This set of regression models have different outcome variables which are: outgroup cooperation, outgroup hostility, and willingness to fight outgroup.
Display code
lm01 <-lm(og_cooperation~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)lm02 <-lm(og_hostility~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)lm03 <-lm(fight_outgroup~IG_Fusion+IG_Identification+OG_Bonds+empathic_concern+perspective_taking+history_discrimination+Age+Female+Married+`SES-`, data = ds)
Religion Exp_religious_freedom
1 Christian (Catholic) 5.3
2 Christian (Protestant) 5.5
3 Muslim (Sunni) 5.9
4 Muslim (Shia) 4.8
5 Hindu 5.2
6 Sikh 5.6
Display code
ds %>%drop_na(religion, exp_religious_freedom)%>%ggplot(aes(y = exp_religious_freedom, x = religion))+geom_boxplot()+labs(x ="", y ="Experience of religious freedom score", title ="Experience of religious freedom: Pakistan")+coord_flip()+theme_bw()
Section 11. Positive/Negative contact with outgroup
Never Very rarely Rarely Sometimes Often Very Often
115 116 108 90 33 33
Always
9
Display code
# never, very rarely, rarely# sometimes, often, very often, alwaysds$nc02 <-as.numeric(ds$nc01)summary(ds$nc02)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.0 2.0 3.0 2.9 4.0 7.0
Display code
ds %>%drop_na(pc02)%>%ggplot(aes(x = pc02))+geom_histogram(color ="black",fill ="gray",bins =30)+geom_textvline(label ="Mean = 2.90", xintercept =2.90, vjust =1.1, lwd =1.05, linetype =2)+labs(x ="Negative contact score", y ="Frequency", title ="Negative contact with outgroup: Pakistan")+theme_bw()
Section 12. Intergroup marriage
No data for Pakistan.
Section 13. Outgroup affect
Outgroup affect: Scale construction
Display code
## Feel outgroup: negative to positive:ds$ogaf1 <-as.character(haven::as_factor(ds$Q13_1))## convert first character only to upper case:ds$ogaf1 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf1, perl =TRUE)ds$og_aff_01 <-factor(ds$ogaf1, levels=c("Very negative", "Moderately negative","A little negative", "Neutral", "A little positive", "Moderately positive", "Very positive"))table(ds$og_aff_01)
Very negative Moderately negative A little negative Neutral
9 22 58 121
A little positive Moderately positive Very positive
74 125 95
Display code
## Feel Outgroup: Hostile to friendly:ds$ogaf2 <-as.character(haven::as_factor(ds$Q13_2))## convert first character only to upper case:ds$ogaf2 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf2, perl =TRUE)ds$og_aff_02 <-factor(ds$ogaf2, levels =c("Very hostile", "Moderately hostile", "A little hostile", "Neutral", "A little friendly", "Moderately friendly","Very friendly"))table(ds$og_aff_02)
Very hostile Moderately hostile A little hostile Neutral
6 10 43 94
A little friendly Moderately friendly Very friendly
99 152 100
Display code
## Feel Outgroup: Suspicious to trusting:ds$ogaf3 <-as.character(haven::as_factor(ds$Q13_3))## convert first character only to upper case:ds$ogaf3 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf3, perl =TRUE)ds$og_aff_03 <-factor(ds$ogaf3, levels =c("Very suspicious", "Moderately suspicious", "A little suspicious", "Neutral","A little trusting", "Moderately trusting","Very trusting"))table(ds$og_aff_03)
Very suspicious Moderately suspicious A little suspicious
15 25 67
Neutral A little trusting Moderately trusting
102 96 134
Very trusting
65
Display code
## Feel Outgroup: Contempt to respect:ds$ogaf4 <-as.character(haven::as_factor(ds$Q13_4))## convert first character only to upper case:ds$ogaf4 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf4, perl =TRUE)ds$og_aff_04 <-factor(ds$ogaf4, levels =c("A lot of contempt", "Moderate contempt","A little contempt", "Neutral", "A little respect", "Moderate respect","A lot of respect"))table(ds$og_aff_04)
A lot of contempt Moderate contempt A little contempt Neutral
8 14 33 89
A little respect Moderate respect A lot of respect
74 163 123
Display code
## Feel Outgroup: Concerned to Unconcerned:ds$ogaf5 <-as.character(haven::as_factor(ds$Q13_5))## convert first character only to upper case:ds$ogaf5 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf5, perl =TRUE)ds$og_aff_05 <-factor(ds$ogaf5, levels =c("Very concerned", "Moderately concerned", "A little concerned", "Neutral","A little unconcerned", "Moderately unconcerned","Very unconcerned"))table(ds$og_aff_05)
Very concerned Moderately concerned A little concerned
33 106 96
Neutral A little unconcerned Moderately unconcerned
121 62 62
Very unconcerned
24
Display code
## Feel Outgroup: Threatened to Relaxed:ds$ogaf6 <-as.character(haven::as_factor(ds$Q13_6))## convert first character only to upper case:ds$ogaf6 <-gsub("(\\D)(\\D+)", "\\U\\1\\L\\2", ds$ogaf6, perl =TRUE)ds$og_aff_06 <-factor(ds$ogaf6, levels =c("Very threatened", "Moderately threatened", "A little threatened", "Neutral", "A little relaxed", "Moderately relaxed", "Very relaxed"))table(ds$og_aff_06)
Very threatened Moderately threatened A little threatened
14 24 83
Neutral A little relaxed Moderately relaxed
124 60 113
Very relaxed
85
Call:
factanal(x = og_affect, factors = 1, rotation = "promax")
Uniquenesses:
NegativeToPositive HostileToFriendly SuspiciousToTrusting
0.32 0.22 0.35
ContemptToRespect ConcernedToUnconcerned ThreatenedToRelaxed
0.53 0.96 0.47
Loadings:
Factor1
NegativeToPositive 0.83
HostileToFriendly 0.88
SuspiciousToTrusting 0.81
ContemptToRespect 0.69
ConcernedToUnconcerned -0.20
ThreatenedToRelaxed 0.73
Factor1
SS loadings 3.16
Proportion Var 0.53
Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 47 on 9 degrees of freedom.
The p-value is 0.00000047
Display code
## Reliability analysispsych::alpha(og_affect)
Some items ( ConcernedToUnconcerned ) were negatively correlated with the total scale and
probably should be reversed.
To do this, run the function again with the 'check.keys=TRUE' option
Based on the output of factor analysis as well as reliability, it seems like one item (“concerned to unconcerned”) is problematic. Here is the scale without that item:
Based on the above output for factor analysis and reliability analysis, the “outgroup affect” scale is highly reliable if we drop the problematic item (“concerned to unconcerned”). Here is the visualization for the scale after dropping the problematic item:
This set of regression models have additional predictors: frequency of positive and negative contact, and interaction between perspective taking and perceived history of discrimination
Display code
ds$freq_positive_contact <- ds$pc02ds$freq_negative_contact <- ds$nc02ds$perspectiveXdiscrimination <- ds$perspective_taking*ds$history_discrimination## Four regression models predicting endorsement and experience of BCL / BBL:lm01 <-lm(Endorse_BCL~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm02 <-lm(Experience_BCL~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm03 <-lm(Endorse_BBL~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm04 <-lm(Experience_BBL~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)
Same predictors as section 14 above, but with different outcomes. This set of regression models has an additional outcome variable: affect towards outgroup.
Display code
## Four regression models:lm01 <-lm(og_cooperation~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm02 <-lm(og_hostility~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm03 <-lm(fight_outgroup~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)lm04 <-lm(og_affect~IG_Fusion+IG_Identification+OG_Bonds+freq_positive_contact+freq_negative_contact+empathic_concern+perspective_taking+history_discrimination+perspectiveXdiscrimination+Age+Female+Married+`SES-`, data = ds)
Call:
factanal(x = imagistic, factors = 2, rotation = "promax")
Uniquenesses:
negative_affect positive_affect episodic_recall_01
0.21 0.15 0.71
episodic_recall_02 shared_perception_01 shared_perception_02
0.68 0.43 0.48
event_reflection_01 event_reflection_02 transformative_indiv_01
0.61 0.61 0.70
transformative_indiv_02 transformative_group_01 transformative_group_02
0.56 0.45 0.58
Loadings:
Factor1 Factor2
negative_affect 0.106 -0.876
positive_affect 0.925
episodic_recall_01 0.540
episodic_recall_02 0.565
shared_perception_01 0.733 -0.143
shared_perception_02 0.715
event_reflection_01 0.626
event_reflection_02 0.623
transformative_indiv_01 0.540
transformative_indiv_02 0.653 0.161
transformative_group_01 0.741
transformative_group_02 0.650
Factor1 Factor2
SS loadings 4.14 1.70
Proportion Var 0.34 0.14
Cumulative Var 0.34 0.49
Factor Correlations:
Factor1 Factor2
Factor1 1.0000 0.0585
Factor2 0.0585 1.0000
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 569 on 43 degrees of freedom.
The p-value is 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000064
The factor structure above makes sense but is a little incoherent. There’s a seperate factor for positive affect, and the remaining items all go together into a single factor. Here is what happens if we “force” six factors:
The factor breakdown above is much more aligned with theory. There are six factors that represent the six different measures that we want. However, the item transformative_indiv_02 is quite weird.
Imagistic experience: Reliability
Display code
## Reliability:alph01 <- psych::alpha(imagistic)
Some items ( positive_affect ) were negatively correlated with the total scale and
probably should be reversed.
To do this, run the function again with the 'check.keys=TRUE' option
The “imagistic experience” scale (which comprises of the six different subscales) is quite reliable. The Cronbach’s alpha value of 0.79 is close to 0.80.
If we conceptualize “imagistic experience” as a single measure that is the sum of the seven scales, here is the visualization of the measure:
Below are the individual Cronbach’s alpha for each of the seven subscales, followed by their individual visualizations, followed by the correlation between the six sub-scales
Social perception of religious freedom
Display code
Display code