REGRESSION MODEL

LBB REGRESSION MODEL

1. Introduction

This RMarkdown is intended for analyzing the linear regression models that can be applied to the data that we have : crime reports in several states in the US in 1960. The results of linear regression model is expected to show the effect of socio-economic condition variables on the target variable crime rate or crime_rate so that it can offer solutions that can reduce the crime rate in the US.

Library

We are going to use these libraries.

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.1.1     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(GGally)

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

library(lmtest)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

library(car)

## Loading required package: carData

## 
## Attaching package: 'car'

## The following object is masked from 'package:dplyr':
## 
##     recode

## The following object is masked from 'package:purrr':
## 
##     some

library(MLmetrics)

## 
## Attaching package: 'MLmetrics'

## The following object is masked from 'package:base':
## 
##     Recall

2. Data Explanatory

2.1 Data Input & Structure

crime <- read.csv("crime.csv")
head(crime)

##   X   M So  Ed Po1 Po2  LF  M.F Pop  NW  U1 U2 GDP Ineq     Prob    Time    y
## 1 1 151  1  91  58  56 510  950  33 301 108 41 394  261 0.084602 26.2011  791
## 2 2 143  0 113 103  95 583 1012  13 102  96 36 557  194 0.029599 25.2999 1635
## 3 3 142  1  89  45  44 533  969  18 219  94 33 318  250 0.083401 24.3006  578
## 4 4 136  0 121 149 141 577  994 157  80 102 39 673  167 0.015801 29.9012 1969
## 5 5 141  0 121 109 101 591  985  18  30  91 20 578  174 0.041399 21.2998 1234
## 6 6 121  0 110 118 115 547  964  25  44  84 29 689  126 0.034201 20.9995  682

Futhermore we will inspect our data

tail(crime)

##     X   M So  Ed Po1 Po2  LF  M.F Pop  NW  U1 U2 GDP Ineq     Prob    Time    y
## 42 42 141  0 109  56  54 523  968   4   2 107 37 489  170 0.088904 12.1996  542
## 43 43 162  1  99  75  70 522  996  40 208  73 27 496  224 0.054902 31.9989  823
## 44 44 136  0 121  95  96 574 1012  29  36 111 37 622  162 0.028100 30.0001 1030
## 45 45 139  1  88  46  41 480  968  19  49 135 53 457  249 0.056202 32.5996  455
## 46 46 126  0 104 106  97 599  989  40  24  78 25 593  171 0.046598 16.6999  508
## 47 47 130  0 121  90  91 623 1049   3  22 113 40 588  160 0.052802 16.0997  849

dim(crime)

## [1] 47 17

names(crime)

##  [1] "X"    "M"    "So"   "Ed"   "Po1"  "Po2"  "LF"   "M.F"  "Pop"  "NW"  
## [11] "U1"   "U2"   "GDP"  "Ineq" "Prob" "Time" "y"

Now we want to exclude X from out Dataset because we will not use it.

crime <- crime %>% 
  select(-X) %>% 
  setNames(c("percent_m", "is_south", "mean_education", "police_exp60", 
             "police_exp59", "labour_participation", "m_per1000f", "state_pop", 
             "nonwhites_per1000", "unemploy_m24", "unemploy_m39", "gdp", "inequality", 
             "prob_prison", "time_prison", "crime_rate"))

3. Exploratory Data Analysis

We would like to know the type of our data.

str(crime)

## 'data.frame':    47 obs. of  16 variables:
##  $ percent_m           : int  151 143 142 136 141 121 127 131 157 140 ...
##  $ is_south            : int  1 0 1 0 0 0 1 1 1 0 ...
##  $ mean_education      : int  91 113 89 121 121 110 111 109 90 118 ...
##  $ police_exp60        : int  58 103 45 149 109 118 82 115 65 71 ...
##  $ police_exp59        : int  56 95 44 141 101 115 79 109 62 68 ...
##  $ labour_participation: int  510 583 533 577 591 547 519 542 553 632 ...
##  $ m_per1000f          : int  950 1012 969 994 985 964 982 969 955 1029 ...
##  $ state_pop           : int  33 13 18 157 18 25 4 50 39 7 ...
##  $ nonwhites_per1000   : int  301 102 219 80 30 44 139 179 286 15 ...
##  $ unemploy_m24        : int  108 96 94 102 91 84 97 79 81 100 ...
##  $ unemploy_m39        : int  41 36 33 39 20 29 38 35 28 24 ...
##  $ gdp                 : int  394 557 318 673 578 689 620 472 421 526 ...
##  $ inequality          : int  261 194 250 167 174 126 168 206 239 174 ...
##  $ prob_prison         : num  0.0846 0.0296 0.0834 0.0158 0.0414 ...
##  $ time_prison         : num  26.2 25.3 24.3 29.9 21.3 ...
##  $ crime_rate          : int  791 1635 578 1969 1234 682 963 1555 856 705 ...

we would want to change “is_south” into factor

crime <- crime %>% 
  mutate_at("is_south", function(x) ifelse(x==0, "No", "Yes")) %>%
  mutate (is_south = factor(is_south))

now we have to check if there is NA in our data

colSums(is.na(crime))

##            percent_m             is_south       mean_education 
##                    0                    0                    0 
##         police_exp60         police_exp59 labour_participation 
##                    0                    0                    0 
##           m_per1000f            state_pop    nonwhites_per1000 
##                    0                    0                    0 
##         unemploy_m24         unemploy_m39                  gdp 
##                    0                    0                    0 
##           inequality          prob_prison          time_prison 
##                    0                    0                    0 
##           crime_rate 
##                    0

GREAT! No missing values! Then by using ggcorr, we will check the correlation of each variable to see which predictor variables are possible to determine the target variable crime_rate.

ggcorr(crime, label=T, label_size = 2)

## Warning in ggcorr(crime, label = T, label_size = 2): data in column(s)
## 'is_south' are not numeric and were ignored

Based on the the result above, it can be seen that the nonwhites_per1000 and crime_rate variables show a correlation value of 0 so that it can be said that the nonwhites_per1000 variable is not suitable to be used as a predictor variable.

4. Regression Linear Model

Before creating the model, we go deeper into our dataset with dividing data_train and data_test first so that we can train the model first via data train and then evaluate the model via data test.

RNGkind(sample.kind = "Rounding")

## Warning in RNGkind(sample.kind = "Rounding"): non-uniform 'Rounding' sampler
## used

set.seed(123)
row_data <- nrow(crime)
index <- sample(row_data, row_data*0.8)

data_train <- crime[ index, ]
data_test <- crime[ -index, ]

based on the correlation between variables in the previous chunk, we create a simple linear model using one of the variables that has a high correlation with the crime_rate variable, namely the police_exp60 variable.

model_crime <- lm(crime_rate ~ police_exp60, data_train)

summary(model_crime)

## 
## Call:
## lm(formula = crime_rate ~ police_exp60, data = data_train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -587.13 -138.38   25.91  150.55  566.20 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   164.930    150.015   1.099    0.279    
## police_exp60    8.775      1.616   5.431 4.35e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 306.9 on 35 degrees of freedom
## Multiple R-squared:  0.4573, Adjusted R-squared:  0.4418 
## F-statistic:  29.5 on 1 and 35 DF,  p-value: 4.346e-06

From the model that we have created, it has an R-Squared value of 0.4573, meaning that the model can explain 45.73% of the data, while the remaining 54.27% is explained by variables or other factors that we do not include in the model.

R-Squared itself has a range between 0-1 (0-100%) and a good R-Squared is close to 1. However, finding the highest R-square does not necessarily guarantee to give the best model because R-square only considers how well the model follows existing data patterns and do not reflect the model’s ability to predict new data. A more appropriate measure if you want to find a trade-off between the model’s ability to follow data patterns and its predictive ability is to use the Akaike Information Criterion (AIC), and a good model has a low AIC.

Let’s try to do the selection of predictor variables using the stepwise method with direction = both and starting from the model with the complete predictor variable, to get the lowest possible AIC.

model_full <- lm(crime_rate ~., data_train)
model_step <- step(model_full, direction="both")

## Start:  AIC=414.49
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     police_exp59 + labour_participation + m_per1000f + state_pop + 
##     nonwhites_per1000 + unemploy_m24 + unemploy_m39 + gdp + inequality + 
##     prob_prison + time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - labour_participation  1       653 1143026 412.52
## - time_prison           1      1921 1144294 412.56
## - police_exp59          1      2875 1145248 412.59
## - state_pop             1      3010 1145383 412.59
## - is_south              1      8757 1151130 412.78
## - nonwhites_per1000     1     13116 1155489 412.92
## - gdp                   1     13142 1155515 412.92
## - police_exp60          1     28035 1170408 413.39
## - m_per1000f            1     45287 1187660 413.93
## <none>                              1142373 414.49
## - prob_prison           1     70404 1212777 414.71
## - percent_m             1     97608 1239981 415.53
## - unemploy_m24          1    103796 1246169 415.71
## - mean_education        1    200844 1343218 418.49
## - unemploy_m39          1    228152 1370525 419.23
## - inequality            1    244314 1386687 419.67
## 
## Step:  AIC=412.52
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     police_exp59 + m_per1000f + state_pop + nonwhites_per1000 + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison + 
##     time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - police_exp59          1      2351 1145376 410.59
## - time_prison           1      2678 1145704 410.60
## - state_pop             1      4131 1147157 410.65
## - is_south              1      8388 1151414 410.79
## - gdp                   1     12638 1155664 410.92
## - nonwhites_per1000     1     13220 1156246 410.94
## - police_exp60          1     27439 1170465 411.39
## - m_per1000f            1     52705 1195731 412.18
## <none>                              1143026 412.52
## - prob_prison           1     71104 1214130 412.75
## - percent_m             1    103075 1246101 413.71
## - unemploy_m24          1    121084 1264110 414.24
## + labour_participation  1       653 1142373 414.49
## - mean_education        1    200303 1343329 416.49
## - unemploy_m39          1    227521 1370547 417.23
## - inequality            1    246070 1389096 417.73
## 
## Step:  AIC=410.59
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     m_per1000f + state_pop + nonwhites_per1000 + unemploy_m24 + 
##     unemploy_m39 + gdp + inequality + prob_prison + time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - state_pop             1      3598 1148975 408.71
## - time_prison           1      6058 1151435 408.79
## - is_south              1      8688 1154064 408.87
## - nonwhites_per1000     1     11027 1156404 408.95
## - gdp                   1     11041 1156418 408.95
## <none>                              1145376 410.59
## - prob_prison           1     68965 1214341 410.76
## - m_per1000f            1     88286 1233663 411.34
## - percent_m             1    102660 1248036 411.77
## - unemploy_m24          1    122681 1268058 412.36
## + police_exp59          1      2351 1143026 412.52
## + labour_participation  1       129 1145248 412.59
## - mean_education        1    203773 1349149 414.65
## - unemploy_m39          1    233268 1378645 415.45
## - inequality            1    245177 1390554 415.77
## - police_exp60          1    358986 1504363 418.68
## 
## Step:  AIC=408.71
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     m_per1000f + nonwhites_per1000 + unemploy_m24 + unemploy_m39 + 
##     gdp + inequality + prob_prison + time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - time_prison           1      3983 1152958 406.84
## - gdp                   1      9092 1158067 407.00
## - is_south              1      9896 1158871 407.03
## - nonwhites_per1000     1     12118 1161093 407.10
## <none>                              1148975 408.71
## - prob_prison           1     68307 1217282 408.84
## - percent_m             1    105640 1254614 409.96
## + state_pop             1      3598 1145376 410.59
## + police_exp59          1      1818 1147157 410.65
## + labour_participation  1       789 1148186 410.68
## - unemploy_m24          1    142101 1291075 411.02
## - m_per1000f            1    152198 1301173 411.31
## - mean_education        1    206104 1355079 412.81
## - unemploy_m39          1    246722 1395697 413.91
## - inequality            1    274184 1423159 414.63
## - police_exp60          1    402260 1551235 417.81
## 
## Step:  AIC=406.84
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     m_per1000f + nonwhites_per1000 + unemploy_m24 + unemploy_m39 + 
##     gdp + inequality + prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - is_south              1     10049 1163007 405.16
## - gdp                   1     10978 1163937 405.19
## - nonwhites_per1000     1     17994 1170952 405.41
## <none>                              1152958 406.84
## - percent_m             1    111548 1264506 408.25
## + police_exp59          1      4347 1148611 408.70
## + time_prison           1      3983 1148975 408.71
## + state_pop             1      1523 1151435 408.79
## + labour_participation  1      1016 1151943 408.80
## - prob_prison           1    142080 1295039 409.14
## - unemploy_m24          1    144421 1297379 409.20
## - m_per1000f            1    148215 1301173 409.31
## - mean_education        1    202307 1355265 410.82
## - unemploy_m39          1    251866 1404825 412.15
## - inequality            1    271850 1424809 412.67
## - police_exp60          1    398984 1551942 415.83
## 
## Step:  AIC=405.16
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     nonwhites_per1000 + unemploy_m24 + unemploy_m39 + gdp + inequality + 
##     prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - gdp                   1     11334 1174342 403.52
## - nonwhites_per1000     1     13901 1176909 403.60
## <none>                              1163007 405.16
## - percent_m             1    103127 1266134 406.30
## + is_south              1     10049 1152958 406.84
## + police_exp59          1      4698 1158310 407.01
## + time_prison           1      4136 1158871 407.03
## + state_pop             1      2286 1160721 407.08
## + labour_participation  1       100 1162907 407.15
## - unemploy_m24          1    135954 1298961 407.25
## - m_per1000f            1    148204 1311211 407.60
## - prob_prison           1    213993 1377000 409.41
## - mean_education        1    231928 1394935 409.89
## - unemploy_m39          1    242747 1405754 410.17
## - inequality            1    262973 1425981 410.70
## - police_exp60          1    389276 1552284 413.84
## 
## Step:  AIC=403.52
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     nonwhites_per1000 + unemploy_m24 + unemploy_m39 + inequality + 
##     prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - nonwhites_per1000     1      9466 1183808 401.81
## <none>                              1174342 403.52
## - percent_m             1     95312 1269653 404.40
## + gdp                   1     11334 1163007 405.16
## + is_south              1     10405 1163937 405.19
## + time_prison           1      6091 1168250 405.32
## + police_exp59          1      3274 1171067 405.41
## + state_pop             1       540 1173802 405.50
## + labour_participation  1       455 1173887 405.50
## - unemploy_m24          1    143492 1317833 405.78
## - m_per1000f            1    177963 1352305 406.74
## - mean_education        1    227959 1402300 408.08
## - prob_prison           1    242508 1416850 408.46
## - unemploy_m39          1    249410 1423751 408.64
## - inequality            1    286285 1460626 409.59
## - police_exp60          1    616530 1790872 417.13
## 
## Step:  AIC=401.81
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + inequality + prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## <none>                              1183808 401.81
## + time_prison           1     10289 1173518 403.49
## + nonwhites_per1000     1      9466 1174342 403.52
## + gdp                   1      6899 1176909 403.60
## + is_south              1      6684 1177123 403.60
## + labour_participation  1      1948 1181860 403.75
## + police_exp59          1      1540 1182267 403.76
## + state_pop             1       647 1183160 403.79
## - unemploy_m24          1    134893 1318700 403.81
## - percent_m             1    149388 1333195 404.21
## - m_per1000f            1    169039 1352847 404.75
## - mean_education        1    220608 1404416 406.14
## - prob_prison           1    237275 1421082 406.57
## - unemploy_m39          1    239975 1423783 406.64
## - inequality            1    461989 1645796 412.00
## - police_exp60          1   1025300 2209107 422.90

summary(model_step)

## 
## Call:
## lm(formula = crime_rate ~ percent_m + mean_education + police_exp60 + 
##     m_per1000f + unemploy_m24 + unemploy_m39 + inequality + prob_prison, 
##     data = data_train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -394.74 -125.65   27.95  128.24  465.39 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -7289.928   1383.644  -5.269 1.33e-05 ***
## percent_m          7.636      4.062   1.880   0.0706 .  
## mean_education    16.603      7.269   2.284   0.0301 *  
## police_exp60       9.219      1.872   4.925 3.41e-05 ***
## m_per1000f         3.712      1.857   2.000   0.0553 .  
## unemploy_m24      -7.495      4.196  -1.786   0.0849 .  
## unemploy_m39      22.563      9.471   2.382   0.0242 *  
## inequality         5.625      1.702   3.306   0.0026 ** 
## prob_prison    -3837.725   1619.981  -2.369   0.0250 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 205.6 on 28 degrees of freedom
## Multiple R-squared:  0.8052, Adjusted R-squared:  0.7495 
## F-statistic: 14.46 on 8 and 28 DF,  p-value: 4.236e-08

Therefore, we will further analyze model_step:

crime_rate = -7289.92 + 7.63 percent_m + 16.60 mean_education + 9.21 police_exp60 + 3.71 m_per1000f - 7.49 unemploy_m24 + 22.56 unemploy_m39 + 5.62 inequality - 3837.72 prob_prison

5. Prediction and Model Evaluation

The predictions and evaluations carried out in this section are aimed to check whether the performance of the model in our train data or test data shows one of the following conditions:

Overfitting: The model is too good at following patterns in the data train, causing the model to be less able to predict new data
Underfitting: The model is less able to capture patterns in the data train
Optimum: The model is able to follow the train data pattern but still has a good ability to predict new data

Let’s predict the model’s performance in the data train:

pred_train <- predict(model_step, data_train)

data.frame(MAE = MAE(pred_train, data_train$crime_rate),
           MAPE = MAPE(pred_train, data_train$crime_rate),
           RMSE = RMSE(pred_train, data_train$crime_rate)
           )

##       MAE      MAPE     RMSE
## 1 146.433 0.1835156 178.8709

Then we predict the performance of the model in the test data:

pred_test <- predict(model_step, data_test)

data.frame(MAE = MAE(pred_test, data_test$crime_rate),
           MAPE = MAPE(pred_test, data_test$crime_rate),
           RMSE = RMSE(pred_test, data_test$crime_rate)
           )

##        MAE      MAPE     RMSE
## 1 164.2806 0.2299896 189.1708

From the two performance models above, it can be seen that the performance of the train data and test data is similar or not much different. So we can conclude that the model is neither overfit nor underfitted.

5.1 Assumptions of linear regression

Normality

The model assumes that this error has a normal distribution centered on 0. To check the distribution of the residual value whether it is normally distributed (forming a bell curve) or not, we can use a histogram.

hist(model_step$residuals)

Another alternative is to perform statistical tests using the Shapiro-Wilk test,

shapiro.test(model_step$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  model_step$residuals
## W = 0.98774, p-value = 0.9498

Based on the histogram, it can be seen at a glance that the model_step residuals are normally distributed. Furthermore, the Shapiro-Wilk test shows a p-value > 0.05 so that H0 is accepted, which means that the residuals or errors are normally distributed.

5.2 Constant Variance of Error (Homocesdasticity) using Breusch-Pagan

bptest(model_step)

## 
##  studentized Breusch-Pagan test
## 
## data:  model_step
## BP = 11.85, df = 8, p-value = 0.158

The Breusch-Pagan statistical test shows p-value > 0.05 so that H0 is accepted and it can be interpreted that the residual does not form a certain pattern so that we can conclude that the model already has a constant error.

5.3 Multicollinearity (Variance Inflation Factor)

Multicollinearity occurs when the predictor variables used in the model have a strong relationship. The presence or absence of multicollinearity can be seen from the value of VIF (Variance Inflation Factor). VIF is a measure that explains how much the coefficient variance increases due to multicollinearity. When the VIF value is more than 10, it means that there is multicollinearity.

vif(model_step)

##      percent_m mean_education   police_exp60     m_per1000f   unemploy_m24 
##       2.298626       5.609506       2.990798       2.276976       4.760707 
##   unemploy_m39     inequality    prob_prison 
##       5.522660       4.416847       1.392422

From the multicollinearity test above, there is no VIF value equal to or more than 10 so that means there is no multicollinearity between variables (the predictor variables are mutually independent).

Conclusion

model_step has good criteria as a linear regression model. Then, in addition to the model_step interpretation that has been described in the tests carried out, here are some other interpretations:

When all predictor variables are 0, then crime_rate is -7289.92

Every 1 percent increase in percent_m will be followed by an increase in crime_rate value of 7.63
Every 1 year increase in mean_education will be followed by an increase in crime_rate value of 16.60
Every 1 dollar increase in police_exp60 will be followed by an increase in the crime_rate value of 9.21
Every 1 person increase in m_per1000f will be followed by an increase in crime_rate value of 3.71
Every 1 percent increase in unemploy_m24 will be followed by a decrease in crime_rate value of 7.49
Every 1 percent increase in unemploy_m39 will be followed by an increase in crime_rate value of 22.56
Every 1 dollar increase in inequality will be followed by an increase in the crime_rate value of 5.62
Every 1 percent increase in probability value will be followed by a decrease in crime_rate value of 3837.72

Through the interpretations described above, to reduce the crime rate or crime_rate, the US government can do:

The decrease in police budget spending and the possibility of using other predictor variables outside of the existing data, which are related to the police and are deemed more appropriate to use.

Based on the model generated from the step model, it can be seen that an increase of 1 dollar in police_exp60 will be followed by an increase in the value of crime_rate. Because it can be said that the police spending budget should not increase or maybe it can be lowered to reduce the crime rate.

Furthermore, based on a study conducted by researchers especially Wilson (1968, https://www.ncjrs.gov/pdffiles1/Digitization/58531NCJRS.pdf page 10 paragraph 3), to be able to reduce crime_rate more precisely, the variables that need to be considered it is not the number of police recruits, the amount of expenditure made, or the number of police patrolling but the behavior of the police patrolling. The first behavior that can be shown by the police can be “aggressive” or active observation and checking. The second behavior that can be shown is “passive” or non-active, such as: 1) The police do not stop and give sanctions to motorists who violate traffic regulations, 2) The police do not check suspicious people or vehicles, 3) The police rarely check or pretend - pretending to be a “decoy” in an area with a high crime rate.

One of the studies that supports Wilson’s opinion is the research conducted by Boydstun (1975, https://www.ncjrs.gov/pdffiles1/Digitization/58531NCJRS.pdf page 10 paragraph 4 - page 11 paragraph 1). Boydstun’s research shows that areas in the city of San Diego where police conduct field interrogations or street stops experienced a significant reduction in certain types of crime. The types of crimes in question are robbery, theft, rape or other immoral acts, and dangerous fraud. Meanwhile, areas in the city of San Diego where the police did not implement field interrogations or street stops, did not experience a change in the crime rate.