Untitled

Data wrangling & exploration

The data are loaded from the file ofek_massaged.xlsx.

Most metadata are omitted, but duration (in seconds) is retained for analysis.

Reverse items

  • Item Q6_4 was reversed and recoded into efficacy_small_letgo.

  • Item Q12_3 was reversed and recoded into post_eff_easy_offense.
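A minimal sketch of the reversal, assuming the items use a 1-5 response scale (as the response frequencies for the Q6 items suggest). The helper `reverse_item()` and the toy data are hypothetical, not the study's actual code.

```r
# Reverse a Likert item so that low and high responses swap places.
# Assumption: responses run from 1 to 5; `reverse_item` is an illustrative helper.
reverse_item <- function(x, min_val = 1, max_val = 5) {
  (min_val + max_val) - x
}

# Toy data standing in for the raw Q6_4 responses
toy <- data.frame(q6_4 = c(1, 2, 3, 4, 5, NA))

# Missing responses stay missing after reversal
toy$efficacy_small_letgo <- reverse_item(toy$q6_4)
toy$efficacy_small_letgo
```

The same helper would apply to Q12_3 when producing post_eff_easy_offense.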

Recoding

To produce more readable tables, I recoded some of the variables to their original meanings.

Variable renaming

For convenience, I renamed the variables used in this study. The table below maps the old column names to the new ones:

old_name new_name
duration_in_seconds duration
gender demo_gender
age demo_age
education demo_education
religion demo_religion
religiousness demo_religiosness
children demo_children
residency demo_residency
status_23 demo_status
periphery demo_periphery
born_il demo_born_il
work demo_work
income demo_income
q1 ever_victim
q2 complaint_filed
q4_1 previous_fair_result
q4_2 previous_cop_repsectful
q4_3 previous_equal_trt
q4_4 previous_cop_listen
q4_5 previous_really_try
q4_6 previous_deserve_trust
q4_7 previous_provide_service
q5_1 police_all_equal
q5_2 police_keep_promise
q5_3 police_account_needs
q5_4 police_really_try
q5_5 police_explain_actions
q5_6 police_respectg_citizen
q5_7 police_listen_citizen
q6_1 efficacy_police_eff
q6_2 efficacy_fast_respond
q6_3 efficacy_prepared_ass
q6_4 efficacy_small_letgo
q7_1 trust_police
q7_2 trust_cops
q7_3 trust_boss
q8_1 halukati
q9_1 tend_vig_unrelated
q9_2 tend_vig_inappropriate
q9_3 tend_vig_impotent
q10_1 tviolent
q11_1 post_pj_all_equal
q11_2 post_pj_promise_keep
q11_3 post_pj_my_needs
q11_4 post_pj_really_try
q11_5 post_pj_explain_actions
q11_6 post_pj_trt_respect
q11_7 post_pj_listen
q12_1 post_eff_fast_response
q12_2 post_eff_prepared_help
q12_3 post_eff_easy_offense
q13_1 post_trust_cop_story
q14_1 post_believe_fair_result
q15_1 post_self_investigate
q15_2 post_self_locate
q15_3 post_self_can_do
q15_4 post_self_intend
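A renaming step like the one tabulated above can be sketched with a named lookup vector. This is only an illustration in base R: `rename_map` holds three of the pairs from the table, and the toy data frame (including the `other` column) is hypothetical.

```r
# Lookup vector: names are old column names, values are new column names.
# Only three pairs from the table are shown here.
rename_map <- c(
  duration_in_seconds = "duration",
  gender              = "demo_gender",
  q1                  = "ever_victim"
)

# Toy data frame; `other` is a hypothetical column that is not renamed
toy <- data.frame(duration_in_seconds = 324, gender = "female",
                  q1 = "victim", other = 1)

# Rename only the columns that appear in the lookup, leave the rest untouched
hits <- match(names(toy), names(rename_map))
names(toy)[!is.na(hits)] <- unname(rename_map[hits[!is.na(hits)]])
names(toy)
```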

Missing value analysis

  • We disregard variables that may be missing by design.

  • The most frequent missingness was found in post_self_ (that is Q15_) items.

  • We omit observations with fewer than three complete Q15 items.
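The exclusion rule can be sketched as follows, assuming the Q15 items are the four post_self_ columns from the renaming table. The toy data are hypothetical.

```r
# Toy data: four Q15 items with some missing responses
toy <- data.frame(
  post_self_investigate = c(3, NA, 2, NA),
  post_self_locate      = c(4, 2, NA, NA),
  post_self_can_do      = c(2, 3, 1, NA),
  post_self_intend      = c(1, 5, 4, 2)
)

# Select the Q15 columns by their shared prefix
q15_items <- grep("^post_self_", names(toy), value = TRUE)

# Count non-missing Q15 items per respondent
n_complete <- rowSums(!is.na(toy[q15_items]))

# Keep respondents with at least three complete Q15 items
kept <- toy[n_complete >= 3, , drop = FALSE]
nrow(kept)
```

In this toy example the last respondent answered only one Q15 item and is dropped.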

Scale construction

  • score_previous_encounter - average of Q4.

  • score_police_pj - average of Q5.

  • score_police_effi - average of Q6_1 to Q6_3.

  • score_q6_4 - the raw value of Q6_4.

  • score_police_trust - average of Q7.

  • score_vigilant_tendencies - average of Q9.

  • score_violent - the raw value of Q10.

  • bbr_pj - average of Q11.

  • bbr_effi - average of Q12_1 and Q12_2.

  • bbr_q12_3 - raw value of Q12_3.

  • bbr_self - average of Q15.
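As an illustration of how such scores could be built, the sketch below computes score_police_effi and score_q6_4 on toy data. It assumes that item averages ignore missing responses (`na.rm = TRUE`); the report does not state how missing items were handled, so treat that as an assumption.

```r
# Toy responses for the four Q6 items (renamed as in the table above)
toy <- data.frame(
  efficacy_police_eff   = c(2, 4, NA),
  efficacy_fast_respond = c(3, 4, 1),
  efficacy_prepared_ass = c(4, 1, 2),
  efficacy_small_letgo  = c(5, 2, 3)
)

# Scale score: mean of Q6_1 to Q6_3 (assumption: missing items are skipped)
effi_items <- c("efficacy_police_eff", "efficacy_fast_respond", "efficacy_prepared_ass")
toy$score_police_effi <- rowMeans(toy[effi_items], na.rm = TRUE)

# Single-item score: the raw value of Q6_4
toy$score_q6_4 <- toy$efficacy_small_letgo

toy[c("score_police_effi", "score_q6_4")]
```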

Final dataset

Reliability analysis

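The outputs below report Cronbach’s \(\alpha\) for each scale, with the corresponding bootstrap 95% confidence intervals (CI). Most scales show high reliability (\(\alpha > 0.75\)); the exceptions are the Q6 (\(\alpha = 0.633\)) and Q12 (\(\alpha = 0.563\)) items.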
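For readers who want to see what these numbers mean, here is a minimal base-R sketch of Cronbach’s \(\alpha\) with a percentile bootstrap CI. It is not the routine that produced the output in this report; the function names and the simulated data are hypothetical.

```r
# Cronbach's alpha from the item variances and the variance of the sum score
cronbach_alpha <- function(items) {
  items <- as.matrix(items[complete.cases(items), , drop = FALSE])
  k <- ncol(items)
  item_vars <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Percentile bootstrap CI: resample respondents with replacement
boot_alpha_ci <- function(items, n_boot = 1000, level = 0.95) {
  items <- items[complete.cases(items), , drop = FALSE]
  boot_vals <- replicate(n_boot, {
    idx <- sample(nrow(items), replace = TRUE)
    cronbach_alpha(items[idx, , drop = FALSE])
  })
  quantile(boot_vals, c((1 - level) / 2, 1 - (1 - level) / 2))
}

# Simulated three-item scale sharing one latent factor (hypothetical data)
set.seed(1)
latent <- rnorm(200)
toy <- data.frame(a = latent + rnorm(200),
                  b = latent + rnorm(200),
                  c = latent + rnorm(200))

alpha_hat <- cronbach_alpha(toy)
ci <- boot_alpha_ci(toy)
```

Rows with any missing item are dropped before the computation, which is one common choice but not the only one.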

Previous encounter - Q4

Cronbach’s alpha for the ‘Q4’ data-set

Items: 7 Sample units: 79 alpha: 0.911

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.868 0.939

Procedural justice - Q5

Cronbach’s alpha for the ‘Q5’ data-set

Items: 7 Sample units: 474 alpha: 0.946

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.938 0.954

Efficacy of Police - Q6

Cronbach’s alpha for the ‘Q6’ data-set

Items: 4 Sample units: 475 alpha: 0.633

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.568 0.685

Reliability analysis
Call: psych::alpha(x = Q6)

 raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
      0.63      0.67     0.7      0.33   2 0.029  2.4 0.78     0.35

95% confidence boundaries
         lower alpha upper
Feldt     0.58  0.63  0.68
Duhachek  0.58  0.63  0.69

Reliability if an item is dropped:
                      raw_alpha std.alpha G6(smc) average_r  S/N alpha se   var.r med.r
efficacy_police_eff        0.42      0.46    0.52      0.22 0.85    0.049 0.15343 0.025
efficacy_fast_respond      0.42      0.46    0.51      0.22 0.85    0.049 0.13771 0.053
efficacy_prepared_ass      0.45      0.49    0.51      0.24 0.95    0.046 0.12182 0.053
efficacy_small_letgo       0.85      0.85    0.79      0.65 5.66    0.012 0.00023 0.647

Item statistics
                        n raw.r std.r r.cor r.drop mean  sd
efficacy_police_eff   475  0.81  0.83 0.781  0.621  2.3 1.0
efficacy_fast_respond 475  0.81  0.83 0.790  0.614  2.3 1.1
efficacy_prepared_ass 475  0.78  0.81 0.769  0.574  2.5 1.1
efficacy_small_letgo  475  0.43  0.37 0.023  0.016  2.7 1.3

Non missing response frequency for each item
                         1    2    3    4    5 miss
efficacy_police_eff   0.27 0.29 0.31 0.12 0.02    0
efficacy_fast_respond 0.26 0.32 0.25 0.14 0.02    0
efficacy_prepared_ass 0.22 0.30 0.31 0.14 0.03    0
efficacy_small_letgo  0.23 0.28 0.21 0.18 0.10    0

Trust in the police - Q7

Cronbach’s alpha for the ‘Q7’ data-set

Items: 4 Sample units: 478 alpha: 0.856

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.826 0.879

Distributive justice - Q8

Only one question, so no Cronbach \(\alpha\). We look at the distribution instead.

We conduct a G-test for independence between the distribution of Q8 and being a victim of an offense in the past. The association is statistically significant (G = 12.56, df = 4, p = 0.014; see the table below). Note that victims scored 1 considerably more often, and 3 considerably less often, than non-victims.

statistic p.value df method
12.56 0.01361 4 Log likelihood ratio (G-test) test of independence with Williams’ correction
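To make the test concrete, the sketch below computes a G-test of independence with Williams’ correction in base R. The function `g_test_williams()` is an illustrative helper and the counts are hypothetical, not the study’s Q8 data.

```r
# G-test of independence with Williams' correction for an r x c table
g_test_williams <- function(tab) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  expected <- outer(rowSums(tab), colSums(tab)) / n
  nz <- tab > 0  # empty cells contribute zero to G
  g <- 2 * sum(tab[nz] * log(tab[nz] / expected[nz]))
  df <- (nrow(tab) - 1) * (ncol(tab) - 1)
  # Williams' correction factor q; the corrected statistic is G / q
  q <- 1 + (n * sum(1 / rowSums(tab)) - 1) *
           (n * sum(1 / colSums(tab)) - 1) / (6 * n * df)
  g_adj <- g / q
  list(statistic = g_adj, df = df,
       p.value = pchisq(g_adj, df, lower.tail = FALSE))
}

# Hypothetical counts: victim status (rows) by a five-point response (columns)
toy_counts <- rbind(
  victim     = c(30, 35, 25, 30, 21),
  not_victim = c(45, 90, 110, 70, 33)
)
res <- g_test_williams(toy_counts)
res$df
```

The corrected statistic is compared against a \(\chi^2\) distribution with \((r-1)(c-1)\) degrees of freedom, which gives df = 4 for a 2 x 5 table, as in the result above.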

Tendency to vigilantism - Q9

Cronbach’s alpha for the ‘Q9’ data-set

Items: 3 Sample units: 468 alpha: 0.769

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.718 0.809

Violent vigilantism - Q10

Procedural justice - Q11

Cronbach’s alpha for the ‘Q11’ data-set

Items: 7 Sample units: 474 alpha: 0.932

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.921 0.941

Efficiency - Q12

Cronbach’s \(\alpha\) is somewhat low for the efficiency (Q12) items.

Cronbach’s alpha for the ‘Q12’ data-set

Items: 3 Sample units: 480 alpha: 0.563

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.471 0.637

The problematic item is post_eff_easy_offense (Q12_3). Removing it yields a better value:

Cronbach’s alpha for the ‘Q12a’ data-set

Items: 2 Sample units: 480 alpha: 0.84

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.801 0.874

This suggests that Q12_3 does not measure the same construct as Q12_1 and Q12_2.

What should we do?

Trust - Q13

Fair result - Q14

Vigilantism - Q15

Cronbach’s alpha for the ‘Q15’ data-set

Items: 4 Sample units: 464 alpha: 0.85

Bootstrap 95% CI based on 1000 samples 2.5% 97.5% 0.820 0.873

Descriptive statistics

Demographics


    Pearson's Chi-squared test with Yates' continuity correction

data:  table(dat$demo_children, dat$ever_victim)
X-squared = 10.705, df = 1, p-value = 0.001068

We present the demographic data in the table below, stratified by victim status. Note that victims were generally younger (Fisher’s exact test, p < .0001) and were less likely to have children than non-victims (55% vs. 71%; \(\chi^2(1) = 10.705, p = 0.001\)). A larger share of victims than of non-victims were Jewish (85% vs. 77%).

Characteristic Overall, N = 489¹ victim, N = 141¹ not victim, N = 348¹ p-value²
age <0.001
    18-24 78 (16%) 31 (22%) 47 (14%)
    25-34 95 (19%) 38 (27%) 57 (16%)
    35-44 88 (18%) 30 (21%) 58 (17%)
    45-54 83 (17%) 27 (19%) 56 (16%)
    55-64 62 (13%) 7 (5.0%) 55 (16%)
    65 + 83 (17%) 8 (5.7%) 75 (22%)
children <0.001
    has children 318 (66%) 76 (55%) 242 (71%)
    no children 163 (34%) 63 (45%) 100 (29%)
    Unknown 8 2 6
status_23 0.014
    single 159 (33%) 59 (42%) 100 (29%)
    married 272 (56%) 63 (45%) 209 (60%)
    divorced 47 (9.7%) 16 (11%) 31 (9.0%)
    widowed 8 (1.6%) 2 (1.4%) 6 (1.7%)
    Unknown 3 1 2
religion 0.043
    jewish 384 (80%) 120 (85%) 264 (77%)
    muslim 78 (16%) 15 (11%) 63 (18%)
    christian 10 (2.1%) 2 (1.4%) 8 (2.3%)
    druze 8 (1.7%) 2 (1.4%) 6 (1.8%)
    other 2 (0.4%) 2 (1.4%) 0 (0%)
    Unknown 7 0 7
residency 0.4
    North & valley 142 (29%) 35 (25%) 107 (31%)
    TLV & center 206 (42%) 66 (47%) 140 (40%)
    Jerusalem 61 (12%) 21 (15%) 40 (11%)
    Beer-Sheva & south 49 (10%) 11 (7.8%) 38 (11%)
    Other 31 (6.3%) 8 (5.7%) 23 (6.6%)
religiousness 0.4
    secular 255 (52%) 78 (56%) 177 (51%)
    traditional 169 (35%) 46 (33%) 123 (35%)
    religious 52 (11%) 11 (7.9%) 41 (12%)
    very religious 10 (2.0%) 4 (2.9%) 6 (1.7%)
    orthodox 2 (0.4%) 1 (0.7%) 1 (0.3%)
    Unknown 1 1 0
born_il 0.4
    Born in IL 426 (87%) 125 (89%) 301 (87%)
    Born elsewhere 61 (13%) 15 (11%) 46 (13%)
    Unknown 2 1 1
gender 0.8
    male 232 (49%) 70 (50%) 162 (49%)
    female 238 (51%) 69 (50%) 169 (51%)
    Unknown 19 2 17
education >0.9
    high school 282 (58%) 78 (57%) 204 (59%)
    non-academic 50 (10%) 14 (10%) 36 (10%)
    academic 152 (31%) 45 (33%) 107 (31%)
    Unknown 5 4 1
duration 324 (229, 449) 329 (233, 433) 321 (228, 452) >0.9
periphery >0.9
    periphery 184 (38%) 53 (38%) 131 (38%)
    non-periphery 305 (62%) 88 (62%) 217 (62%)
¹ n (%); Median (IQR)
² Pearson’s Chi-squared test; Fisher’s exact test; Wilcoxon rank sum test
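The chi-squared test of children by victim status reported at the top of this section can be checked from the counts in the table above (respondents with unknown values excluded). This is a sketch using base R’s `chisq.test()`; the object name `children_by_victim` is illustrative.

```r
# Counts taken from the "children" rows of the demographics table
children_by_victim <- matrix(
  c(76, 242,
    63, 100),
  nrow = 2, byrow = TRUE,
  dimnames = list(children = c("has children", "no children"),
                  victim   = c("victim", "not victim"))
)

# Pearson's chi-squared test with Yates' continuity correction
res <- chisq.test(children_by_victim, correct = TRUE)
round(unname(res$statistic), 3)
res$p.value
```

The resulting statistic should agree with the reported X-squared = 10.705 on 1 degree of freedom.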

Scores

In this section we examine the correlation between the various scales described above and study differences in the distribution of these scores between various subgroups of our sample.

Correlations

Correlation matrix of the scores:

The plot shows where the score distributions differ, notably in police_effi, police_trust, pj and effi, where non-victims scored higher more often than victims.

By condition

Characteristic Overall, N = 489¹ Control, N = 98¹ MNN, N = 97¹ MNP, N = 93¹ MPN, N = 101¹ MPP, N = 100¹ p-value²
post_self_investigate 2.58 (1.24) 2.55 (1.20) 2.62 (1.16) 2.65 (1.30) 2.57 (1.36) 2.51 (1.19) >0.9
    Unknown 14 4 2 1 3 4
post_self_locate 2.87 (1.34) 2.98 (1.29) 2.80 (1.32) 3.05 (1.39) 2.78 (1.35) 2.76 (1.35) 0.4
    Unknown 7 2 1 0 2 2
post_self_can_do 2.27 (1.20) 2.37 (1.13) 2.33 (1.19) 2.40 (1.33) 2.20 (1.24) 2.08 (1.09) 0.3
    Unknown 3 0 0 1 2 0
post_self_intend 2.38 (1.30) 2.51 (1.28) 2.51 (1.26) 2.45 (1.28) 2.23 (1.37) 2.24 (1.30) 0.3
    Unknown 4 1 0 2 1 0
¹ Mean (SD)
² One-way ANOVA
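A one-way ANOVA of a single Q15 item across the five conditions could be run as sketched below. The data are simulated with no true group differences, so only the mechanics carry over; the condition labels are taken from the table above.

```r
# Simulated responses: 20 respondents per condition, answers on a 1-5 scale
set.seed(42)
toy <- data.frame(
  condition = factor(rep(c("Control", "MNN", "MNP", "MPN", "MPP"), each = 20)),
  post_self_intend = sample(1:5, 100, replace = TRUE)
)

# One-way ANOVA: does the mean response differ between conditions?
fit <- aov(post_self_intend ~ condition, data = toy)
anova_tab <- summary(fit)[[1]]

# Degrees of freedom (between, within) and the p-value of the F test
anova_tab$Df
p_value <- anova_tab[["Pr(>F)"]][1]
```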

Regression model

We select the subjects from the four experimental condition groups (n = 391) and fit a linear regression model to predict bbr_self using the two experimental conditions, victim status and the scores for police efficiency, the subject’s vigilant_tendencies and q6_4. A second regression model contains all the above terms alongside interaction terms of ever_victim with the scores.

service
Control     Low    High 
     98     190     201 

The coefficient tables for both models:


Call:
lm(formula = bbr_self ~ ever_victim + score_police_effi + score_vigilant_tendencies + 
    score_q6_4, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3657 -0.7560 -0.1223  0.7360  2.9501 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                2.00218    0.18087  11.070   <2e-16 ***
ever_victimnot victim     -0.13601    0.09653  -1.409   0.1595    
score_police_effi         -0.08362    0.04645  -1.800   0.0725 .  
score_vigilant_tendencies  0.49436    0.04581  10.792   <2e-16 ***
score_q6_4                -0.05974    0.03370  -1.773   0.0769 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9454 on 470 degrees of freedom
  (14 observations deleted due to missingness)
Multiple R-squared:  0.2146,    Adjusted R-squared:  0.2079 
F-statistic: 32.11 on 4 and 470 DF,  p-value: < 2.2e-16
Model 1
Characteristic Beta 95% CI¹ p-value
ever_victim
    victim
    not victim -0.14 -0.33, 0.05 0.2
score_police_effi -0.08 -0.17, 0.01 0.072
score_vigilant_tendencies 0.49 0.40, 0.58 <0.001
score_q6_4 -0.06 -0.13, 0.01 0.077
¹ CI = Confidence Interval
Model 2
Characteristic Beta 95% CI¹ p-value
procedural
    Control
    Low -0.14 -0.40, 0.11 0.3
    High -0.18 -0.44, 0.08 0.2
service
    Control
    Low 0.12 -0.09, 0.33 0.3
    High
ever_victim
    victim
    not victim -0.05 -0.64, 0.54 0.9
bbr_effi -0.02 -0.18, 0.13 0.8
score_vigilant_tendencies 0.49 0.31, 0.66 <0.001
score_q6_4 -0.07 -0.14, 0.00 0.046
ever_victim * score_vigilant_tendencies
    not victim * score_vigilant_tendencies 0.00 -0.20, 0.21 >0.9
ever_victim * bbr_effi
    not victim * bbr_effi -0.05 -0.22, 0.13 0.6
¹ CI = Confidence Interval

The second model (\(R^2 = 23.5\%\)) did not significantly improve on the variance explained by the first model (\(R^2 = 22.72\%\)) (F(3) = 1.647, p = .178).
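A comparison of nested linear models of this kind is typically done with an F test via `anova()`. The sketch below uses simulated data and a reduced set of predictors, so it illustrates the procedure only and does not reproduce the figures above.

```r
# Simulated data with variable names borrowed from the report
set.seed(7)
n <- 200
toy <- data.frame(
  ever_victim = factor(sample(c("victim", "not victim"), n, replace = TRUE),
                       levels = c("victim", "not victim")),
  score_vigilant_tendencies = runif(n, 1, 5),
  bbr_effi = runif(n, 1, 5)
)
toy$bbr_self <- 2 + 0.5 * toy$score_vigilant_tendencies + rnorm(n)

# Model 1: main effects only
m1 <- lm(bbr_self ~ ever_victim + score_vigilant_tendencies + bbr_effi, data = toy)

# Model 2: adds the interactions of ever_victim with both scores
m2 <- update(m1, . ~ . + ever_victim:score_vigilant_tendencies + ever_victim:bbr_effi)

# F test for the added terms; the second row holds the comparison
cmp <- anova(m1, m2)
cmp$Df[2]
cmp[["Pr(>F)"]][2]
```

The `Df` entry counts the extra parameters in the larger model, and a large p-value indicates that the added interaction terms do not improve the fit significantly.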

See the automated text generated below:

We fitted a linear model (estimated using OLS) to predict bbr_self with ever_victim, score_police_effi, score_vigilant_tendencies and score_q6_4 (formula: bbr_self ~ ever_victim + score_police_effi + score_vigilant_tendencies + score_q6_4). The model explains a statistically significant and moderate proportion of variance (R2 = 0.21, F(4, 470) = 32.11, p < .001, adj. R2 = 0.21). The model’s intercept, corresponding to ever_victim = victim, score_police_effi = 0, score_vigilant_tendencies = 0 and score_q6_4 = 0, is at 2.00 (95% CI [1.65, 2.36], t(470) = 11.07, p < .001). Within this model:

  • The effect of ever victim [not victim] is statistically non-significant and negative (beta = -0.14, 95% CI [-0.33, 0.05], t(470) = -1.41, p = 0.160; Std. beta = -0.13, 95% CI [-0.31, 0.05])
  • The effect of score police effi is statistically non-significant and negative (beta = -0.08, 95% CI [-0.17, 7.66e-03], t(470) = -1.80, p = 0.072; Std. beta = -0.07, 95% CI [-0.16, 6.79e-03])
  • The effect of score vigilant tendencies is statistically significant and positive (beta = 0.49, 95% CI [0.40, 0.58], t(470) = 10.79, p < .001; Std. beta = 0.44, 95% CI [0.36, 0.52])
  • The effect of score q6 4 is statistically non-significant and negative (beta = -0.06, 95% CI [-0.13, 6.48e-03], t(470) = -1.77, p = 0.077; Std. beta = -0.07, 95% CI [-0.15, 7.89e-03])

Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation.

We fitted a linear model (estimated using OLS) to predict bbr_self with procedural, service, ever_victim, bbr_effi, score_vigilant_tendencies and score_q6_4 (formula: bbr_self ~ procedural + service + ever_victim + bbr_effi + score_vigilant_tendencies + score_q6_4 + ever_victim * score_vigilant_tendencies + ever_victim * bbr_effi + ever_victim * bbr_effi). The model explains a statistically significant and moderate proportion of variance (R2 = 0.22, F(9, 461) = 14.50, p < .001, adj. R2 = 0.21). The model’s intercept, corresponding to procedural = Control, service = Control, ever_victim = victim, bbr_effi = 0, score_vigilant_tendencies = 0 and score_q6_4 = 0, is at 2.00 (95% CI [1.43, 2.57], t(461) = 6.87, p < .001). Within this model:

  • The effect of procedural [Low] is statistically non-significant and negative (beta = -0.14, 95% CI [-0.40, 0.11], t(461) = -1.09, p = 0.278; Std. beta = -0.13, 95% CI [-0.37, 0.11])
  • The effect of procedural [High] is statistically non-significant and negative (beta = -0.18, 95% CI [-0.44, 0.08], t(461) = -1.36, p = 0.176; Std. beta = -0.17, 95% CI [-0.41, 0.08])
  • The effect of service [Low] is statistically non-significant and positive (beta = 0.12, 95% CI [-0.09, 0.33], t(461) = 1.13, p = 0.259; Std. beta = 0.11, 95% CI [-0.08, 0.31])
  • The effect of ever victim [not victim] is statistically non-significant and negative (beta = -0.05, 95% CI [-0.64, 0.54], t(461) = -0.17, p = 0.868; Std. beta = -0.14, 95% CI [-0.32, 0.04])
  • The effect of bbr effi is statistically non-significant and negative (beta = -0.02, 95% CI [-0.18, 0.13], t(461) = -0.29, p = 0.769; Std. beta = -0.02, 95% CI [-0.18, 0.13])
  • The effect of score vigilant tendencies is statistically significant and positive (beta = 0.49, 95% CI [0.31, 0.66], t(461) = 5.57, p < .001; Std. beta = 0.44, 95% CI [0.28, 0.59])
  • The effect of score q6 4 is statistically significant and negative (beta = -0.07, 95% CI [-0.14, -1.07e-03], t(461) = -2.00, p = 0.046; Std. beta = -0.08, 95% CI [-0.16, -1.30e-03])
  • The effect of ever victim [not victim] × score vigilant tendencies is statistically non-significant and positive (beta = 4.83e-03, 95% CI [-0.20, 0.21], t(461) = 0.05, p = 0.962; Std. beta = 4.34e-03, 95% CI [-0.18, 0.19])
  • The effect of ever victim [not victim] × bbr effi is statistically non-significant and negative (beta = -0.05, 95% CI [-0.22, 0.13], t(461) = -0.53, p = 0.597; Std. beta = -0.05, 95% CI [-0.23, 0.13])

Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation.

Key findings from this analysis

  • The procedural justice condition had no significant effect on the outcome.

  • The service availability had a marginally significant (p = .067) effect, indicating an average reduction of 0.18 points in the high condition.

  • The significance of the police efficiency score (model 1) is a result of a moderation effect. In the second model its main effect is non-significant but the interaction with ever_victim is significant. This means that high police efficiency scores reduce the self-justice outcome only in those who were victims.