Tingkat Kejahatan di suatu daerah, dipengaruhi oleh banyak faktor. Berikut adalah analisisi Tingkat Kejahatan di US berdasarkan beberapa variable sosio-demografi.

Menggunakan Analisis Regresi Linier untuk menganalisis dan memprediksi Tingkat Kejahatan, sehingga bisa mengurangi Tingkat Kejahatan dengan memperbaiki variabel-variabel yang memiliki pengaruh signifikan.

Data Preparation

Load Library yang dibutuhkan

library(GGally)

## Loading required package: ggplot2

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

library(lmtest)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

library(car)

## Loading required package: carData

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:car':
## 
##     recode

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(MLmetrics)

## 
## Attaching package: 'MLmetrics'

## The following object is masked from 'package:base':
## 
##     Recall

Load dataset

crime <- read.csv("Data Input/crime.csv")

Berikut adalah keterangan dari setiap kolom:

percent_m: Persentase laki-laki berusia 14-24
is_south: Apakah berada di negara bagian selatan? 1 untuk Iya, 0 untuk Tidak.
mean_education: Rata-rata tahun sekolah
police_exp60: Pengeluaran polisi pada tahun 1960
police_exp59: Pengeluaran polisi pada tahun 1959
labour_participation: Tingkat partisipasi kerja
m_per1000f: Jumlah laki-laki dibandingkan setiap 1000 perempuan
state_pop: Populasi negara bagian
nonwhites_per1000: Jumlah penduduk non-kulit putih per 1000 orang
unemploy_m24: Tingkat pengangguran laki-laki perkotaan berusia 14-24
unemploy_m39: Tingkat pengangguran laki-laki perkotaan berusia 35-39
gdp: Gross domestic product per kepala
inequality: Ketimpangan pendapatan
prob_prison: Kemungkinan dipenjara
time_prison: Rata-rata waktu menjalani hukuman di Penjara
crime_rate: Tingkat kejahatan dalam kategori yang tidak ditentukan

Explanatory Data Analysis

Cek Data

glimpse(crime)

## Rows: 47
## Columns: 16
## $ percent_m            <int> 151, 143, 142, 136, 141, 121, 127, 131, 157, 140,…
## $ is_south             <int> 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0…
## $ mean_education       <int> 91, 113, 89, 121, 121, 110, 111, 109, 90, 118, 10…
## $ police_exp60         <int> 58, 103, 45, 149, 109, 118, 82, 115, 65, 71, 121,…
## $ police_exp59         <int> 56, 95, 44, 141, 101, 115, 79, 109, 62, 68, 116, …
## $ labour_participation <int> 510, 583, 533, 577, 591, 547, 519, 542, 553, 632,…
## $ m_per1000f           <int> 950, 1012, 969, 994, 985, 964, 982, 969, 955, 102…
## $ state_pop            <int> 33, 13, 18, 157, 18, 25, 4, 50, 39, 7, 101, 47, 2…
## $ nonwhites_per1000    <int> 301, 102, 219, 80, 30, 44, 139, 179, 286, 15, 106…
## $ unemploy_m24         <int> 108, 96, 94, 102, 91, 84, 97, 79, 81, 100, 77, 83…
## $ unemploy_m39         <int> 41, 36, 33, 39, 20, 29, 38, 35, 28, 24, 35, 31, 2…
## $ gdp                  <int> 394, 557, 318, 673, 578, 689, 620, 472, 421, 526,…
## $ inequality           <int> 261, 194, 250, 167, 174, 126, 168, 206, 239, 174,…
## $ prob_prison          <dbl> 0.084602, 0.029599, 0.083401, 0.015801, 0.041399,…
## $ time_prison          <dbl> 26.2011, 25.2999, 24.3006, 29.9012, 21.2998, 20.9…
## $ crime_rate           <int> 791, 1635, 578, 1969, 1234, 682, 963, 1555, 856, …

Mengubah Tipe Data

crime <- crime %>% 
  mutate(is_south = as.factor(is_south))
glimpse(crime)

## Rows: 47
## Columns: 16
## $ percent_m            <int> 151, 143, 142, 136, 141, 121, 127, 131, 157, 140,…
## $ is_south             <fct> 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0…
## $ mean_education       <int> 91, 113, 89, 121, 121, 110, 111, 109, 90, 118, 10…
## $ police_exp60         <int> 58, 103, 45, 149, 109, 118, 82, 115, 65, 71, 121,…
## $ police_exp59         <int> 56, 95, 44, 141, 101, 115, 79, 109, 62, 68, 116, …
## $ labour_participation <int> 510, 583, 533, 577, 591, 547, 519, 542, 553, 632,…
## $ m_per1000f           <int> 950, 1012, 969, 994, 985, 964, 982, 969, 955, 102…
## $ state_pop            <int> 33, 13, 18, 157, 18, 25, 4, 50, 39, 7, 101, 47, 2…
## $ nonwhites_per1000    <int> 301, 102, 219, 80, 30, 44, 139, 179, 286, 15, 106…
## $ unemploy_m24         <int> 108, 96, 94, 102, 91, 84, 97, 79, 81, 100, 77, 83…
## $ unemploy_m39         <int> 41, 36, 33, 39, 20, 29, 38, 35, 28, 24, 35, 31, 2…
## $ gdp                  <int> 394, 557, 318, 673, 578, 689, 620, 472, 421, 526,…
## $ inequality           <int> 261, 194, 250, 167, 174, 126, 168, 206, 239, 174,…
## $ prob_prison          <dbl> 0.084602, 0.029599, 0.083401, 0.015801, 0.041399,…
## $ time_prison          <dbl> 26.2011, 25.2999, 24.3006, 29.9012, 21.2998, 20.9…
## $ crime_rate           <int> 791, 1635, 578, 1969, 1234, 682, 963, 1555, 856, …

Cek Missing Value

sum(is.na(crime))

## [1] 0

Karena tidak ada missing value, maka bisa dilanjutkan ke tahapan selanjutnya.

Cek Korelasi antar Variabel

ggcorr(crime, label = TRUE, hjust = 1, layout.exp = 2)

## Warning in ggcorr(crime, label = TRUE, hjust = 1, layout.exp = 2): data in
## column(s) 'is_south' are not numeric and were ignored

Dari grafik korelasi diatas, variabel-variabel yang ada, memiliki pengaruh positif dan negatif terhadap crime_rate. Untuk variabel police_exp59 dan police_exp60 memiliki korelasi paling tinggi diantara yang lain dan variabel nonwhites_per1000 bisa dikatakan tidak memiliki korelasi sama sekali. Akan tetapi dari variabel police_exp59 dan police_exp60 keduanya memiliki korelasi yang sangat kuat, maka cukup diambil salah satu dari kedua variabel tersebut.

crime <- crime %>% 
  select(-police_exp59)

ggcorr(crime, label = TRUE, hjust = 1, layout.exp = 2)

## Warning in ggcorr(crime, label = TRUE, hjust = 1, layout.exp = 2): data in
## column(s) 'is_south' are not numeric and were ignored

Pembuatan Model Regresi Linear

Tahapan selanjutnya adalah pembuatan model regresi linear dengan variabel-variabel prediktor yang ada. Model yang dibuat terdiri dari beberapa model yang menggunakan prediktor berbeda.

Model dengan 1 Prediktor

Untuk membuat model dengan 1 prediktor, kita gunakan variabel dengan korelasi paling tinggi dan yang dipakai adalah variabel police_exp60.

m1 <- lm(crime_rate~police_exp60, crime)
m1

## 
## Call:
## lm(formula = crime_rate ~ police_exp60, data = crime)
## 
## Coefficients:
##  (Intercept)  police_exp60  
##      144.464         8.948

summary(m1)

## 
## Call:
## lm(formula = crime_rate ~ police_exp60, data = crime)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -586.91 -155.63   32.52  139.58  568.84 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   144.464    126.693   1.140     0.26    
## police_exp60    8.948      1.409   6.353 9.34e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 283.9 on 45 degrees of freedom
## Multiple R-squared:  0.4728, Adjusted R-squared:  0.4611 
## F-statistic: 40.36 on 1 and 45 DF,  p-value: 9.338e-08

plot(crime$police_exp60, crime$crime_rate)
abline(m1$coefficients[1],m1$coefficients[2], col="red")

Dari model diatas, dapat dilihat nilai adjuster R-squared adalah 0.4611.

Model dengan Semua Variable Prediktor

Dari model pertama, nilai dari ajusted R-squared masih cukup rendah, sehingga diperlukan model lain untuk mendapat nilai adjusted R-squared yang lebih baik.

m2 <- lm(crime_rate ~ ., crime)
m2

## 
## Call:
## lm(formula = crime_rate ~ ., data = crime)
## 
## Coefficients:
##          (Intercept)             percent_m             is_south1  
##           -6379.0457                8.9861                5.6688  
##       mean_education          police_exp60  labour_participation  
##              17.7304                9.6526               -0.2801  
##           m_per1000f             state_pop     nonwhites_per1000  
##               1.8218               -0.7836                0.2446  
##         unemploy_m24          unemploy_m39                   gdp  
##              -5.4163               16.9362                0.9072  
##           inequality           prob_prison           time_prison  
##               7.2708            -4285.1992               -1.1276

summary(m2)

## 
## Call:
## lm(formula = crime_rate ~ ., data = crime)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -442.55 -116.46    8.86  118.26  473.49 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          -6379.0457  1568.9382  -4.066 0.000291 ***
## percent_m                8.9861     4.1571   2.162 0.038232 *  
## is_south1                5.6688   148.0997   0.038 0.969705    
## mean_education          17.7304     6.0824   2.915 0.006445 ** 
## police_exp60             9.6526     2.3921   4.035 0.000317 ***
## labour_participation    -0.2801     1.4079  -0.199 0.843538    
## m_per1000f               1.8218     2.0293   0.898 0.376026    
## state_pop               -0.7836     1.2857  -0.609 0.546523    
## nonwhites_per1000        0.2446     0.6187   0.395 0.695239    
## unemploy_m24            -5.4163     4.1784  -1.296 0.204164    
## unemploy_m39            16.9362     8.2148   2.062 0.047441 *  
## gdp                      0.9072     1.0329   0.878 0.386292    
## inequality               7.2708     2.2564   3.222 0.002921 ** 
## prob_prison          -4285.1992  2183.8679  -1.962 0.058484 .  
## time_prison             -1.1276     6.6919  -0.168 0.867251    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 208.6 on 32 degrees of freedom
## Multiple R-squared:  0.7976, Adjusted R-squared:  0.709 
## F-statistic: 9.006 on 14 and 32 DF,  p-value: 1.673e-07

Dari model menggunakan semua variabel prediktor, diperoleh nilai adjuster R-squared yang lebih baik yaitu 0.709.

Model dengan Menggunakan Step-Wise backward

Model dengan semua variabel prediktor memang lebih baik dibandingkan dengan 1 variabel prediktor, akan tetapi perlu dipastikan juga model lain yang mungkin memiliki adjusted R-squared yang lebih baik.

m_step <- step(m2, direction = "backward")

## Start:  AIC=513.95
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     labour_participation + m_per1000f + state_pop + nonwhites_per1000 + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison + 
##     time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - is_south              1        64 1392929 511.95
## - time_prison           1      1236 1394101 511.99
## - labour_participation  1      1723 1394588 512.00
## - nonwhites_per1000     1      6802 1399667 512.18
## - state_pop             1     16168 1409033 512.49
## - gdp                   1     33582 1426447 513.07
## - m_per1000f            1     35080 1427945 513.12
## <none>                              1392865 513.95
## - unemploy_m24          1     73136 1466001 514.35
## - prob_prison           1    167590 1560455 517.29
## - unemploy_m39          1    185009 1577874 517.81
## - percent_m             1    203389 1596254 518.35
## - mean_education        1    369864 1762729 523.01
## - inequality            1    451937 1844802 525.15
## - police_exp60          1    708738 2101603 531.28
## 
## Step:  AIC=511.95
## crime_rate ~ percent_m + mean_education + police_exp60 + labour_participation + 
##     m_per1000f + state_pop + nonwhites_per1000 + unemploy_m24 + 
##     unemploy_m39 + gdp + inequality + prob_prison + time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - time_prison           1      1319 1394247 509.99
## - labour_participation  1      2646 1395574 510.04
## - nonwhites_per1000     1      8949 1401878 510.25
## - state_pop             1     16166 1409095 510.49
## - gdp                   1     36125 1429054 511.15
## - m_per1000f            1     36467 1429396 511.16
## <none>                              1392929 511.95
## - unemploy_m24          1     86999 1479928 512.80
## - prob_prison           1    171381 1564310 515.40
## - unemploy_m39          1    196372 1589301 516.15
## - percent_m             1    206121 1599050 516.43
## - mean_education        1    371159 1764088 521.05
## - inequality            1    534611 1927540 525.22
## - police_exp60          1    728570 2121499 529.72
## 
## Step:  AIC=509.99
## crime_rate ~ percent_m + mean_education + police_exp60 + labour_participation + 
##     m_per1000f + state_pop + nonwhites_per1000 + unemploy_m24 + 
##     unemploy_m39 + gdp + inequality + prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - labour_participation  1      3019 1397266 508.09
## - nonwhites_per1000     1      7996 1402243 508.26
## - state_pop             1     19634 1413881 508.65
## - gdp                   1     35276 1429524 509.17
## - m_per1000f            1     40680 1434928 509.34
## <none>                              1394247 509.99
## - unemploy_m24          1     85946 1480194 510.80
## - unemploy_m39          1    195095 1589343 514.15
## - percent_m             1    206909 1601157 514.50
## - prob_prison           1    223309 1617557 514.98
## - mean_education        1    381593 1775840 519.36
## - inequality            1    537046 1931294 523.31
## - police_exp60          1    764536 2158784 528.54
## 
## Step:  AIC=508.09
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     state_pop + nonwhites_per1000 + unemploy_m24 + unemploy_m39 + 
##     gdp + inequality + prob_prison
## 
##                     Df Sum of Sq     RSS    AIC
## - nonwhites_per1000  1      6963 1404229 506.33
## - state_pop          1     23381 1420648 506.87
## - gdp                1     34787 1432053 507.25
## - m_per1000f         1     41289 1438555 507.46
## <none>                           1397266 508.09
## - unemploy_m24       1     84385 1481652 508.85
## - unemploy_m39       1    197849 1595115 512.32
## - prob_prison        1    221812 1619078 513.02
## - percent_m          1    226201 1623468 513.15
## - mean_education     1    395884 1793150 517.82
## - inequality         1    534370 1931637 521.32
## - police_exp60       1    834362 2231628 528.10
## 
## Step:  AIC=506.33
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     state_pop + unemploy_m24 + unemploy_m39 + gdp + inequality + 
##     prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## - state_pop       1     22345 1426575 505.07
## - gdp             1     32142 1436371 505.39
## - m_per1000f      1     36808 1441037 505.54
## <none>                        1404229 506.33
## - unemploy_m24    1     86373 1490602 507.13
## - unemploy_m39    1    205814 1610043 510.76
## - prob_prison     1    218607 1622836 511.13
## - percent_m       1    307001 1711230 513.62
## - mean_education  1    389502 1793731 515.83
## - inequality      1    608627 2012856 521.25
## - police_exp60    1   1050202 2454432 530.57
## 
## Step:  AIC=505.07
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## - gdp             1     26493 1453068 503.93
## <none>                        1426575 505.07
## - m_per1000f      1     84491 1511065 505.77
## - unemploy_m24    1     99463 1526037 506.24
## - prob_prison     1    198571 1625145 509.20
## - unemploy_m39    1    208880 1635455 509.49
## - percent_m       1    320926 1747501 512.61
## - mean_education  1    386773 1813348 514.35
## - inequality      1    594779 2021354 519.45
## - police_exp60    1   1127277 2553852 530.44
## 
## Step:  AIC=503.93
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + inequality + prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## <none>                        1453068 503.93
## - m_per1000f      1    103159 1556227 505.16
## - unemploy_m24    1    127044 1580112 505.87
## - prob_prison     1    247978 1701046 509.34
## - unemploy_m39    1    255443 1708511 509.55
## - percent_m       1    296790 1749858 510.67
## - mean_education  1    445788 1898855 514.51
## - inequality      1    738244 2191312 521.24
## - police_exp60    1   1672038 3125105 537.93

m_step

## 
## Call:
## lm(formula = crime_rate ~ percent_m + mean_education + police_exp60 + 
##     m_per1000f + unemploy_m24 + unemploy_m39 + inequality + prob_prison, 
##     data = crime)
## 
## Coefficients:
##    (Intercept)       percent_m  mean_education    police_exp60      m_per1000f  
##      -6426.101           9.332          18.012          10.265           2.234  
##   unemploy_m24    unemploy_m39      inequality     prob_prison  
##         -6.087          18.735           6.133       -3796.032

summary(m_step)

## 
## Call:
## lm(formula = crime_rate ~ percent_m + mean_education + police_exp60 + 
##     m_per1000f + unemploy_m24 + unemploy_m39 + inequality + prob_prison, 
##     data = crime)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -444.70 -111.07    3.03  122.15  483.30 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -6426.101   1194.611  -5.379 4.04e-06 ***
## percent_m          9.332      3.350   2.786  0.00828 ** 
## mean_education    18.012      5.275   3.414  0.00153 ** 
## police_exp60      10.265      1.552   6.613 8.26e-08 ***
## m_per1000f         2.234      1.360   1.642  0.10874    
## unemploy_m24      -6.087      3.339  -1.823  0.07622 .  
## unemploy_m39      18.735      7.248   2.585  0.01371 *  
## inequality         6.133      1.396   4.394 8.63e-05 ***
## prob_prison    -3796.032   1490.646  -2.547  0.01505 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 195.5 on 38 degrees of freedom
## Multiple R-squared:  0.7888, Adjusted R-squared:  0.7444 
## F-statistic: 17.74 on 8 and 38 DF,  p-value: 1.159e-10

Dengan menggunakan metode step-wise regression, kita dapat memperoleh model terbaik berdasarkan nilai AIC yang paling rendah.

Dari ketiga model yang telah dibuat, nilai adjusted R dari model yang menggunakan step-wise reggresion memberikan nilai paling besar yaitu 0.7444 dibandingkan 0.4611 pada model 1 prediktor dan 0.709 pada model dengan semua prediktor.

Evaluasi Model

Dari ketiga model yang telah dibuat, kita akan membandingkan 2 model m2 dan m_step.

data_eval <- data.frame(aktual = crime$crime_rate, pred_m1 = m1$fitted.values, pred_m_step = m_step$fitted.values)

Mean Absolute Error (MAE)

MAE dari model m1

MAE(y_pred = data_eval$pred_m1 , y_true = data_eval$aktual)

## [1] 218.298

MAE dari model m_step

MAE(y_pred = data_eval$pred_m_step , y_true = data_eval$aktual)

## [1] 138.6674

Dari kedua MAE yang telah dihitung, MAE model m_step lebih kecil dibandingkan model m1, akan tetapi kita tetap perlu membandingkan dengan range dari variabel target.

range(crime$crime_rate)

## [1]  342 1993

Mean Absolute Percentage Error (MAPE)

MAPE dari model m1

MAPE(y_pred = data_eval$pred_m1 , y_true = data_eval$aktual)*100

## [1] 27.24177

MAPE dari model m_step

MAPE(y_pred = data_eval$pred_m_step , y_true = data_eval$aktual)*100

## [1] 17.48751

Mean Squared Error (MSE)

MSE dari model m1

MSE(y_pred = data_eval$pred_m1 , y_true = data_eval$aktual)

## [1] 77183.53

MSE dari model m_step

MSE(y_pred = data_eval$pred_m_step , y_true = data_eval$aktual)

## [1] 30916.34

Root Mean Squared Error (RMSE)

RMSE dari model m1

RMSE(y_pred = data_eval$pred_m1 , y_true = data_eval$aktual)

## [1] 277.8192

RMSE dari model m_step

RMSE(y_pred = data_eval$pred_m_step , y_true = data_eval$aktual)

## [1] 175.8304

Dari semua perhitungan error, model m_step memiliki error yang lebih kecil dibandingkan model m1

Uji Asumsi

Dari evaluasi yang telah dilakukan, maka model m_step merupakan model terbaik yang bisa diperoleh. Akan tetap perlu dilakukan uji asumsi untuk memastikan bahwa model m_step memenuhi Best Linear Unbiased Estimator (BLUE) model.

Normality of Residuals

Visualisasi histogram residual

hist(m_step$residuals)

Uji statistik

Shapiro-Wilk hypothesis test:

H0: error berdistribusi normal
H1: error TIDAK berdistribusi normal

Kondisi yang diharapkan: H0

shapiro.test(m_step$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  m_step$residuals
## W = 0.98511, p-value = 0.8051

Nilai p-value > 0.05. Sehingga gagal tolak H0, disimpulkan bahwa residual dari model m_step sudah berdistribusi normal. Asumsi normality terpenuhi.

Homoscedasticity of Residuals

Visualisasi scatter plot: `fitted.values` vs `residuals`

plot(m_step$fitted.values, m_step$residuals)
abline(h=0, col="red")

Uji statistik

Breusch-Pagan hypothesis test:

H0: error menyebar konstan atau homoscedasticity
H1: error menyebar TIDAK konstan atau heteroscedasticity

Kondisi yang diharapkan: H0

bptest(m_step)

## 
##  studentized Breusch-Pagan test
## 
## data:  m_step
## BP = 13.51, df = 8, p-value = 0.09546

Dari pengujian di atas, diketahui nilai p-value > 0.05. Sehingga gagal tolak H0, disimpulkan bahwa residual m_step sudah memenuhi asumsi homoskedastisitas.

No Multicollinearity

Uji VIF (Variance Inflation Factor): * nilai VIF > 10: terjadi multicollinearity pada model * nilai VIF < 10: tidak terjadi multicollinearity pada model

vif(m_step)

##      percent_m mean_education   police_exp60     m_per1000f   unemploy_m24 
##       2.131963       4.189684       2.560496       1.932367       4.360038 
##   unemploy_m39     inequality    prob_prison 
##       4.508106       3.731074       1.381879

Dari nilai VIF di atas, dihasilkan nilai VIF <10 untuk keseluruhan variabel prediktor, sehingga asumsi no multicollinearity terpenuhi

Dari uji asumsi yang telah dilakukan, model m_step mememnuhi semua uji asumsi. Oleh karena itu model m_step merupakan model yang baik untuk digunakan,

Kesimpulan dan Saran

summary(m_step)$call

## lm(formula = crime_rate ~ percent_m + mean_education + police_exp60 + 
##     m_per1000f + unemploy_m24 + unemploy_m39 + inequality + prob_prison, 
##     data = crime)

Dari model m_step yang telah dibuat, bisa dilihat variabel-variabel yang mempengaruhi Tingkat Kejahatan yang terjadi. Diharapkan nilai-nilai pada variabel prediktor tersebut diperbaiki, sehingga bisa menekan angka tingkat kejahatan yang terjadi.

LBB_RM

Fikri K Rizal

2022-08-22

Data Preparation

Load Library yang dibutuhkan

Load dataset

Explanatory Data Analysis

Cek Data

Mengubah Tipe Data

Cek Missing Value

Cek Korelasi antar Variabel

Pembuatan Model Regresi Linear

Model dengan 1 Prediktor

Model dengan Semua Variable Prediktor

Model dengan Menggunakan Step-Wise backward

Evaluasi Model

Mean Absolute Error (MAE)

Mean Absolute Percentage Error (MAPE)

Mean Squared Error (MSE)

Root Mean Squared Error (RMSE)

Uji Asumsi

Normality of Residuals

Visualisasi histogram residual

Uji statistik

Homoscedasticity of Residuals

Visualisasi scatter plot: `fitted.values` vs `residuals`

Uji statistik

No Multicollinearity

Kesimpulan dan Saran

LBB_RM

Fikri K Rizal

2022-08-22

Data Preparation

Load Library yang dibutuhkan

Load dataset

Explanatory Data Analysis

Cek Data

Mengubah Tipe Data

Cek Missing Value

Cek Korelasi antar Variabel

Pembuatan Model Regresi Linear

Model dengan 1 Prediktor

Model dengan Semua Variable Prediktor

Model dengan Menggunakan Step-Wise backward

Evaluasi Model

Mean Absolute Error (MAE)

Mean Absolute Percentage Error (MAPE)

Mean Squared Error (MSE)

Root Mean Squared Error (RMSE)

Uji Asumsi

Normality of Residuals

Visualisasi histogram residual

Uji statistik

Homoscedasticity of Residuals

Visualisasi scatter plot: fitted.values vs residuals

Uji statistik

No Multicollinearity

Kesimpulan dan Saran

Visualisasi scatter plot: `fitted.values` vs `residuals`