Introduction

Every person who leaves for work in the morning, or simply goes for a walk outside after dinner, should expect to return home at night safely and in good health. A lot of factors affect how safe people feel walking alone outside during night time. Walking alone in the dark could be dangerous in locations where there is higher chances of petty crimes or less police protection.

This report contains an assessment of how safe people feel walking at night in the town of Somerville, New Jersey. The factors studied are subject’s sex (SecRec: 0 = male, 1 = female) and how much they trust the local police (Trust_in_police: ranging from 0 to 10). Both of these factors are individual level data. To find out if there is any variation across wards, the subjects are grouped at ward level. In the dataset, there are data on people from 7 different wards in Somerville. The sample dataset has been derived from data.gov website. (source: https://catalog.data.gov/dataset/somerville-happiness-survey-responses-2011-2013-2015?fbclid=IwAR3imtJKQvwRlW4z6-rB3OIh0HLj2NslhxG0Rs1aoDI7N_9PtnC5XC7TE-s)

Importing packages and dataset

library(nlme)
library(dplyr)
library(magrittr)
library(tidyr)
library(haven)
library(lmerTest)
library(ggplot2)
library(texreg)
library(readr)
Somerville <- read_csv("C:/Users/Nusrat/Desktop/MA - 3rd semester, Spring 19/SOC 712 - Advanced Analytics (R)/Assignment 8 - Randon Effect Model/Somer.csv")
Somer <- na.omit(Somerville)
length(unique(Somer$Ward))
## [1] 7
Somer %>% 
  group_by(Ward) %>% 
  summarise(n_ward = n())
## # A tibble: 7 x 2
##    Ward n_ward
##   <dbl>  <int>
## 1     1    442
## 2     2    576
## 3     3    645
## 4     4    362
## 5     5    650
## 6     6    512
## 7     7    411

The above table displays how many people in the dataset belong to each neighborhood.

Complete Pooling Model

cpooling <- lm(Safe_at_night ~ SexRec, data = Somer)
summary(cpooling)
## 
## Call:
## lm(formula = Safe_at_night ~ SexRec, data = Somer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.7359 -0.7629  0.2641  1.2641  2.2641 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.73586    0.03595  215.19   <2e-16 ***
## SexRec      -5.97294    0.06437  -92.78   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.789 on 3596 degrees of freedom
## Multiple R-squared:  0.7054, Adjusted R-squared:  0.7053 
## F-statistic:  8609 on 1 and 3596 DF,  p-value: < 2.2e-16

Result: On average, people in Somerville feel pretty safe (with a rating of 7.7) walking alone in the streets at night. Sexual identity has a significant impact on how safe they feel walking alone during that time of the day.

No-pooling Model

The Intercept

dcoef <- Somer %>% 
    group_by(Ward) %>% 
    do(mod = lm(Safe_at_night ~ SexRec, data = .))
coef <- dcoef %>% do(data.frame(intc = coef(.$mod)[1]))
ggplot(coef, aes(x = intc)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Result: Above is the distribution of how safe males feel walking alone in the street during night time in their respective wards. The rating ranges from 7.5 to 7.85 (not a great deal of between ward variation). In other words, male feel pretty same about their safety walking outside at night across all wards.

The Slope

dcoef <- Somer %>% 
    group_by(Ward) %>% 
    do(mod = lm(Safe_at_night ~ SexRec, data = .))
coef <- dcoef %>% do(data.frame(sexc = coef(.$mod)[2]))
ggplot(coef, aes(x = sexc)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Result: Above is the distribution of gender difference for how safe males feel walking alone in the street at night in across wards in Somerville. The rating ranges from -6.2 to -5.8 (a little bit of between ward variation). Interestingly, female rated their safety approximately 6 units less than male on average. Therefore, it is evident that female feel a lot insecure about their safety walking outside at night compared to males across all wards.

Partial-pooling

Model 1

m1 <- lmer(Safe_at_night ~ SexRec + (1|Ward), data = Somer)
summary(m1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Safe_at_night ~ SexRec + (1 | Ward)
##    Data: Somer
## 
## REML criterion at convergence: 14402.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7718 -0.4266  0.1444  0.7066  1.2735 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev.
##  Ward     (Intercept) 0.0006479 0.02545 
##  Residual             3.1991694 1.78862 
## Number of obs: 3598, groups:  Ward, 7
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)    7.73545    0.03726   10.95414  207.63   <2e-16 ***
## SexRec        -5.97308    0.06437 3595.99756  -92.79   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##        (Intr)
## SexRec -0.539

Model 2

m2 <- lmer(Safe_at_night ~ SexRec + (SexRec|Ward), data = Somer)
## singular fit
summary(m2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Safe_at_night ~ SexRec + (SexRec | Ward)
##    Data: Somer
## 
## REML criterion at convergence: 14401.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7818 -0.4270  0.1345  0.7022  1.2951 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  Ward     (Intercept) 0.003135 0.05599       
##           SexRec      0.002566 0.05065  -1.00
##  Residual             3.197883 1.78826       
## Number of obs: 3598, groups:  Ward, 7
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  7.73534    0.04187  5.68731  184.74 5.77e-12 ***
## SexRec      -5.97254    0.06723 21.63918  -88.84  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##        (Intr)
## SexRec -0.607
## convergence code: 0
## singular fit

Comparison of Models

htmlreg(list(m1, m2), doctype = FALSE)
Statistical models
Model 1 Model 2
(Intercept) 7.74*** 7.74***
(0.04) (0.04)
SexRec -5.97*** -5.97***
(0.06) (0.07)
AIC 14410.19 14413.93
BIC 14434.94 14451.06
Log Likelihood -7201.09 -7200.96
Num. obs. 3598 3598
Num. groups: Ward 7 7
Var: Ward (Intercept) 0.00 0.00
Var: Residual 3.20 3.20
Var: Ward SexRec 0.00
Cov: Ward (Intercept) SexRec -0.00
p < 0.001, p < 0.01, p < 0.05

Result: The AIC and BIC comparison suggests that model 1 fits the sample dataset better than model 2.

Model 3 and 4 (adding independent variable: trust in local police)

m3 <- lmer(Safe_at_night ~ SexRec + (SexRec|Ward) + Trust_in_police, data = Somer)
m4 <- lmer(Safe_at_night ~ SexRec*Trust_in_police + (SexRec|Ward), data = Somer)
htmlreg(list(m1, m2, m3, m4), doctype = FALSE)
Statistical models
Model 1 Model 2 Model 3 Model 4
(Intercept) 7.74*** 7.74*** 7.71*** 7.69***
(0.04) (0.04) (0.08) (0.08)
SexRec -5.97*** -5.97*** -5.96*** -5.78***
(0.06) (0.07) (0.08) (0.16)
Trust_in_police 0.00 0.01
(0.01) (0.01)
SexRec:Trust_in_police -0.09
(0.07)
AIC 14410.19 14413.93 14422.54 14426.36
BIC 14434.94 14451.06 14465.86 14475.87
Log Likelihood -7201.09 -7200.96 -7204.27 -7205.18
Num. obs. 3598 3598 3598 3598
Num. groups: Ward 7 7 7 7
Var: Ward (Intercept) 0.00 0.00 0.00 0.00
Var: Residual 3.20 3.20 3.20 3.20
Var: Ward SexRec 0.00 0.00 0.00
Cov: Ward (Intercept) SexRec -0.00 -0.00 -0.00
p < 0.001, p < 0.01, p < 0.05

Result: AIC and BIC comparison between the four models suggest that model 1 best explains the variation in the given dataset the best.

Centering Covariates

Somer %<>% mutate(ctrust = Trust_in_police - mean(Trust_in_police))

Refitting the Models

m5 <- lmer(Safe_at_night ~ SexRec + Trust_in_police + (SexRec|Ward), data = Somer)
m6 <- lmer(Safe_at_night ~ SexRec*Trust_in_police + (SexRec|Ward), data = Somer)
htmlreg(list(m1, m2, m4, m5), doctype = FALSE)
Statistical models
Model 1 Model 2 Model 3 Model 4
(Intercept) 7.74*** 7.74*** 7.69*** 7.71***
(0.04) (0.04) (0.08) (0.08)
SexRec -5.97*** -5.97*** -5.78*** -5.96***
(0.06) (0.07) (0.16) (0.08)
Trust_in_police 0.01 0.00
(0.01) (0.01)
SexRec:Trust_in_police -0.09
(0.07)
AIC 14410.19 14413.93 14426.36 14422.54
BIC 14434.94 14451.06 14475.87 14465.86
Log Likelihood -7201.09 -7200.96 -7205.18 -7204.27
Num. obs. 3598 3598 3598 3598
Num. groups: Ward 7 7 7 7
Var: Ward (Intercept) 0.00 0.00 0.00 0.00
Var: Residual 3.20 3.20 3.20 3.20
Var: Ward SexRec 0.00 0.00 0.00
Cov: Ward (Intercept) SexRec -0.00 -0.00 -0.00
p < 0.001, p < 0.01, p < 0.05

Result: Although both the AIC and BIC lowered for Model 4 after centering the covariates, model 1 still turns out to be the best fit for the sample dataset.

Intra-class Correlation

m7 <- lmer(Safe_at_night ~ 1 + (1|Ward), data = Somer)
## singular fit
summary(m7)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Safe_at_night ~ 1 + (1 | Ward)
##    Data: Somer
## 
## REML criterion at convergence: 18794.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4790 -0.8720  0.3420  0.9489  1.2524 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev. 
##  Ward     (Intercept) 1.235e-13 3.514e-07
##  Residual             1.086e+01 3.295e+00
## Number of obs: 3598, groups:  Ward, 7
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept) 5.873e+00  5.493e-02 3.597e+03   106.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## convergence code: 0
## singular fit

Result Intraclass correlation = 3.514e-07/(3.514e-07+3.295e+00) = 1.06646423e-7

GGPlot

ggplot(Somer, aes(fill=Safe_at_night, y=Safe_at_night, x=Sex)) + 
    geom_bar(position="dodge", stat="identity")

Result: The bar chart displays that there is a huge difference in rating for feeling safe walking outside at night across male and female subjects. On one hand, female subjects feel a lot less safe, with most ratings are below 5). On the other hand, male subjects feel very safe, with most ratings above 7.5)

Conclusion

In general, male feel a lot more safer than female while walking alone at night in Somerville, NJ. However, their trust level on local police or the ward that they live in do not tend have any impact on their feeling of safety. Further research on the type of crimes that usually happen in Somerville might help explain why female feel so unsafe walking alone in the streets of Somerville at night.