Library

This section lists the libraries used in this assignment.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(GGally)
## Loading required package: ggplot2
## 
## Attaching package: 'GGally'
## The following object is masked from 'package:dplyr':
## 
##     nasa
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode

Dataset

This section reads the CSV file and assigns a descriptive name to each column of the crime dataset.

crime <- read.csv("crime.csv") %>%
  dplyr::select(-X)
names(crime) <- c(
  "percent_m", "is_south", "mean_education", "police_exp60",
  "police_exp59", "labour_participation", "m_per1000f", "state_pop",
  "nonwhites_per1000", "unemploy_m24", "unemploy_m39", "gdp",
  "inequality", "prob_prison", "time_prison", "crime_rate"
)
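
A column named X typically appears when a CSV was written with row names and then read back with read.csv(), and assigning names() by position can mislabel columns if their order differs from what is expected. The sketch below reproduces this on a tiny temporary file and adds a length check before renaming. The two-column toy data and the variable names tmp, raw, toy and new_names are made up for illustration and are not part of the original analysis.

```r
# Toy CSV written with row names, which produces an unnamed index column
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(M = c(151, 143), So = c(1, 0)), tmp)

raw <- read.csv(tmp)
names(raw)  # the unnamed index column is read back as "X"

# Drop the index column, then rename with an explicit length check
toy <- raw[, setdiff(names(raw), "X"), drop = FALSE]
new_names <- c("percent_m", "is_south")
stopifnot(ncol(toy) == length(new_names))
names(toy) <- new_names
toy
```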

Dataset description

This section describes the crime dataset.

Dataset overview

This section inspects the contents and structure of the crime dataset.

summary(crime)
##    percent_m        is_south      mean_education   police_exp60  
##  Min.   :119.0   Min.   :0.0000   Min.   : 87.0   Min.   : 45.0  
##  1st Qu.:130.0   1st Qu.:0.0000   1st Qu.: 97.5   1st Qu.: 62.5  
##  Median :136.0   Median :0.0000   Median :108.0   Median : 78.0  
##  Mean   :138.6   Mean   :0.3404   Mean   :105.6   Mean   : 85.0  
##  3rd Qu.:146.0   3rd Qu.:1.0000   3rd Qu.:114.5   3rd Qu.:104.5  
##  Max.   :177.0   Max.   :1.0000   Max.   :122.0   Max.   :166.0  
##   police_exp59    labour_participation   m_per1000f       state_pop     
##  Min.   : 41.00   Min.   :480.0        Min.   : 934.0   Min.   :  3.00  
##  1st Qu.: 58.50   1st Qu.:530.5        1st Qu.: 964.5   1st Qu.: 10.00  
##  Median : 73.00   Median :560.0        Median : 977.0   Median : 25.00  
##  Mean   : 80.23   Mean   :561.2        Mean   : 983.0   Mean   : 36.62  
##  3rd Qu.: 97.00   3rd Qu.:593.0        3rd Qu.: 992.0   3rd Qu.: 41.50  
##  Max.   :157.00   Max.   :641.0        Max.   :1071.0   Max.   :168.00  
##  nonwhites_per1000  unemploy_m24     unemploy_m39        gdp       
##  Min.   :  2.0     Min.   : 70.00   Min.   :20.00   Min.   :288.0  
##  1st Qu.: 24.0     1st Qu.: 80.50   1st Qu.:27.50   1st Qu.:459.5  
##  Median : 76.0     Median : 92.00   Median :34.00   Median :537.0  
##  Mean   :101.1     Mean   : 95.47   Mean   :33.98   Mean   :525.4  
##  3rd Qu.:132.5     3rd Qu.:104.00   3rd Qu.:38.50   3rd Qu.:591.5  
##  Max.   :423.0     Max.   :142.00   Max.   :58.00   Max.   :689.0  
##    inequality     prob_prison       time_prison      crime_rate    
##  Min.   :126.0   Min.   :0.00690   Min.   :12.20   Min.   : 342.0  
##  1st Qu.:165.5   1st Qu.:0.03270   1st Qu.:21.60   1st Qu.: 658.5  
##  Median :176.0   Median :0.04210   Median :25.80   Median : 831.0  
##  Mean   :194.0   Mean   :0.04709   Mean   :26.60   Mean   : 905.1  
##  3rd Qu.:227.5   3rd Qu.:0.05445   3rd Qu.:30.45   3rd Qu.:1057.5  
##  Max.   :276.0   Max.   :0.11980   Max.   :44.00   Max.   :1993.0
str(crime)
## 'data.frame':    47 obs. of  16 variables:
##  $ percent_m           : int  151 143 142 136 141 121 127 131 157 140 ...
##  $ is_south            : int  1 0 1 0 0 0 1 1 1 0 ...
##  $ mean_education      : int  91 113 89 121 121 110 111 109 90 118 ...
##  $ police_exp60        : int  58 103 45 149 109 118 82 115 65 71 ...
##  $ police_exp59        : int  56 95 44 141 101 115 79 109 62 68 ...
##  $ labour_participation: int  510 583 533 577 591 547 519 542 553 632 ...
##  $ m_per1000f          : int  950 1012 969 994 985 964 982 969 955 1029 ...
##  $ state_pop           : int  33 13 18 157 18 25 4 50 39 7 ...
##  $ nonwhites_per1000   : int  301 102 219 80 30 44 139 179 286 15 ...
##  $ unemploy_m24        : int  108 96 94 102 91 84 97 79 81 100 ...
##  $ unemploy_m39        : int  41 36 33 39 20 29 38 35 28 24 ...
##  $ gdp                 : int  394 557 318 673 578 689 620 472 421 526 ...
##  $ inequality          : int  261 194 250 167 174 126 168 206 239 174 ...
##  $ prob_prison         : num  0.0846 0.0296 0.0834 0.0158 0.0414 ...
##  $ time_prison         : num  26.2 25.3 24.3 29.9 21.3 ...
##  $ crime_rate          : int  791 1635 578 1969 1234 682 963 1555 856 705 ...
head(crime)
##   percent_m is_south mean_education police_exp60 police_exp59
## 1       151        1             91           58           56
## 2       143        0            113          103           95
## 3       142        1             89           45           44
## 4       136        0            121          149          141
## 5       141        0            121          109          101
## 6       121        0            110          118          115
##   labour_participation m_per1000f state_pop nonwhites_per1000 unemploy_m24
## 1                  510        950        33               301          108
## 2                  583       1012        13               102           96
## 3                  533        969        18               219           94
## 4                  577        994       157                80          102
## 5                  591        985        18                30           91
## 6                  547        964        25                44           84
##   unemploy_m39 gdp inequality prob_prison time_prison crime_rate
## 1           41 394        261    0.084602     26.2011        791
## 2           36 557        194    0.029599     25.2999       1635
## 3           33 318        250    0.083401     24.3006        578
## 4           39 673        167    0.015801     29.9012       1969
## 5           20 578        174    0.041399     21.2998       1234
## 6           29 689        126    0.034201     20.9995        682

Correlation table

This section shows the pairwise correlations between the columns of the crime dataset.

ggcorr(crime, label = TRUE, label_size = 3)
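
ggcorr() draws the correlation matrix as a plot. To see the same information as numbers, or to list only the strongly correlated pairs, base R's cor() can be used. The sketch below is illustrative only: the built-in mtcars data stands in for crime, and the 0.8 cutoff is an arbitrary assumption.

```r
# Stand-in data: four numeric columns from the built-in mtcars dataset
cor_mat <- cor(mtcars[, c("mpg", "wt", "hp", "disp")])

# Keep each pair once (upper triangle) and only those with |r| above the cutoff
cutoff <- 0.8
idx <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r    = cor_mat[idx]
)
high_pairs
```

On the crime data the same approach would use cor(crime) in place of the mtcars subset, since every column is numeric.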

AIC Values

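The AIC values are used to decide which variables to include in the model: starting from the full model, backward elimination drops one variable at a time as long as doing so lowers the AIC.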

crime.all <- lm(crime_rate ~ ., data = crime)
step(crime.all, direction="backward")
## Start:  AIC=514.65
## crime_rate ~ percent_m + is_south + mean_education + police_exp60 + 
##     police_exp59 + labour_participation + m_per1000f + state_pop + 
##     nonwhites_per1000 + unemploy_m24 + unemploy_m39 + gdp + inequality + 
##     prob_prison + time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - is_south              1        29 1354974 512.65
## - labour_participation  1      8917 1363862 512.96
## - time_prison           1     10304 1365250 513.00
## - state_pop             1     14122 1369068 513.14
## - nonwhites_per1000     1     18395 1373341 513.28
## - m_per1000f            1     31967 1386913 513.74
## - gdp                   1     37613 1392558 513.94
## - police_exp59          1     37919 1392865 513.95
## <none>                              1354946 514.65
## - unemploy_m24          1     83722 1438668 515.47
## - police_exp60          1    144306 1499252 517.41
## - unemploy_m39          1    181536 1536482 518.56
## - percent_m             1    193770 1548716 518.93
## - prob_prison           1    199538 1554484 519.11
## - mean_education        1    402117 1757063 524.86
## - inequality            1    423031 1777977 525.42
## 
## Step:  AIC=512.65
## crime_rate ~ percent_m + mean_education + police_exp60 + police_exp59 + 
##     labour_participation + m_per1000f + state_pop + nonwhites_per1000 + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison + 
##     time_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - time_prison           1     10341 1365315 511.01
## - labour_participation  1     10878 1365852 511.03
## - state_pop             1     14127 1369101 511.14
## - nonwhites_per1000     1     21626 1376600 511.39
## - m_per1000f            1     32449 1387423 511.76
## - police_exp59          1     37954 1392929 511.95
## - gdp                   1     39223 1394197 511.99
## <none>                              1354974 512.65
## - unemploy_m24          1     96420 1451395 513.88
## - police_exp60          1    144302 1499277 515.41
## - unemploy_m39          1    189859 1544834 516.81
## - percent_m             1    195084 1550059 516.97
## - prob_prison           1    204463 1559437 517.26
## - mean_education        1    403140 1758114 522.89
## - inequality            1    488834 1843808 525.13
## 
## Step:  AIC=511.01
## crime_rate ~ percent_m + mean_education + police_exp60 + police_exp59 + 
##     labour_participation + m_per1000f + state_pop + nonwhites_per1000 + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison
## 
##                        Df Sum of Sq     RSS    AIC
## - labour_participation  1     10533 1375848 509.37
## - nonwhites_per1000     1     15482 1380797 509.54
## - state_pop             1     21846 1387161 509.75
## - police_exp59          1     28932 1394247 509.99
## - gdp                   1     36070 1401385 510.23
## - m_per1000f            1     41784 1407099 510.42
## <none>                              1365315 511.01
## - unemploy_m24          1     91420 1456735 512.05
## - police_exp60          1    134137 1499452 513.41
## - unemploy_m39          1    184143 1549458 514.95
## - percent_m             1    186110 1551425 515.01
## - prob_prison           1    237493 1602808 516.54
## - mean_education        1    409448 1774763 521.33
## - inequality            1    502909 1868224 523.75
## 
## Step:  AIC=509.37
## crime_rate ~ percent_m + mean_education + police_exp60 + police_exp59 + 
##     m_per1000f + state_pop + nonwhites_per1000 + unemploy_m24 + 
##     unemploy_m39 + gdp + inequality + prob_prison
## 
##                     Df Sum of Sq     RSS    AIC
## - nonwhites_per1000  1     11675 1387523 507.77
## - police_exp59       1     21418 1397266 508.09
## - state_pop          1     27803 1403651 508.31
## - m_per1000f         1     31252 1407100 508.42
## - gdp                1     35035 1410883 508.55
## <none>                           1375848 509.37
## - unemploy_m24       1     80954 1456802 510.06
## - police_exp60       1    123896 1499744 511.42
## - unemploy_m39       1    190746 1566594 513.47
## - percent_m          1    217716 1593564 514.27
## - prob_prison        1    226971 1602819 514.54
## - mean_education     1    413254 1789103 519.71
## - inequality         1    500944 1876792 521.96
## 
## Step:  AIC=507.77
## crime_rate ~ percent_m + mean_education + police_exp60 + police_exp59 + 
##     m_per1000f + state_pop + unemploy_m24 + unemploy_m39 + gdp + 
##     inequality + prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## - police_exp59    1     16706 1404229 506.33
## - state_pop       1     25793 1413315 506.63
## - m_per1000f      1     26785 1414308 506.66
## - gdp             1     31551 1419073 506.82
## <none>                        1387523 507.77
## - unemploy_m24    1     83881 1471404 508.52
## - police_exp60    1    118348 1505871 509.61
## - unemploy_m39    1    201453 1588976 512.14
## - prob_prison     1    216760 1604282 512.59
## - percent_m       1    309214 1696737 515.22
## - mean_education  1    402754 1790276 517.74
## - inequality      1    589736 1977259 522.41
## 
## Step:  AIC=506.33
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     state_pop + unemploy_m24 + unemploy_m39 + gdp + inequality + 
##     prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## - state_pop       1     22345 1426575 505.07
## - gdp             1     32142 1436371 505.39
## - m_per1000f      1     36808 1441037 505.54
## <none>                        1404229 506.33
## - unemploy_m24    1     86373 1490602 507.13
## - unemploy_m39    1    205814 1610043 510.76
## - prob_prison     1    218607 1622836 511.13
## - percent_m       1    307001 1711230 513.62
## - mean_education  1    389502 1793731 515.83
## - inequality      1    608627 2012856 521.25
## - police_exp60    1   1050202 2454432 530.57
## 
## Step:  AIC=505.07
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + gdp + inequality + prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## - gdp             1     26493 1453068 503.93
## <none>                        1426575 505.07
## - m_per1000f      1     84491 1511065 505.77
## - unemploy_m24    1     99463 1526037 506.24
## - prob_prison     1    198571 1625145 509.20
## - unemploy_m39    1    208880 1635455 509.49
## - percent_m       1    320926 1747501 512.61
## - mean_education  1    386773 1813348 514.35
## - inequality      1    594779 2021354 519.45
## - police_exp60    1   1127277 2553852 530.44
## 
## Step:  AIC=503.93
## crime_rate ~ percent_m + mean_education + police_exp60 + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + inequality + prob_prison
## 
##                  Df Sum of Sq     RSS    AIC
## <none>                        1453068 503.93
## - m_per1000f      1    103159 1556227 505.16
## - unemploy_m24    1    127044 1580112 505.87
## - prob_prison     1    247978 1701046 509.34
## - unemploy_m39    1    255443 1708511 509.55
## - percent_m       1    296790 1749858 510.67
## - mean_education  1    445788 1898855 514.51
## - inequality      1    738244 2191312 521.24
## - police_exp60    1   1672038 3125105 537.93
## 
## Call:
## lm(formula = crime_rate ~ percent_m + mean_education + police_exp60 + 
##     m_per1000f + unemploy_m24 + unemploy_m39 + inequality + prob_prison, 
##     data = crime)
## 
## Coefficients:
##    (Intercept)       percent_m  mean_education    police_exp60  
##      -6426.101           9.332          18.012          10.265  
##     m_per1000f    unemploy_m24    unemploy_m39      inequality  
##          2.234          -6.087          18.735           6.133  
##    prob_prison  
##      -3796.032
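
The AIC values that step() prints for a linear model can be reproduced by hand from the RSS column. As a sketch, assuming the usual form for lm fits, n * log(RSS / n) + 2 * k, where n is the number of observations and k the number of estimated coefficients including the intercept, the figures from the first and last steps above work out as follows. The helper name step_aic is hypothetical.

```r
# AIC as reported by step() for a linear model: n * log(RSS / n) + 2 * k
step_aic <- function(rss, n, k) n * log(rss / n) + 2 * k

n_obs <- 47  # number of rows in crime, per str(crime)

# Full model: 15 predictors + intercept; RSS from the "<none>" row of the first step
aic_full <- step_aic(rss = 1354946, n = n_obs, k = 16)

# Final model: 8 predictors + intercept; RSS from the "<none>" row of the last step
aic_final <- step_aic(rss = 1453068, n = n_obs, k = 9)

round(c(full = aic_full, final = aic_final), 2)
```

Both values should agree with the Start and final Step lines of the output (514.65 and 503.93). The final model has a larger RSS than the full one, but the penalty of 2 per coefficient more than compensates, which is why backward elimination prefers it.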

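Selecting the variables and building the regression

This section fits a linear regression of crime_rate on the eight predictors retained by the backward selection above (percent_m, mean_education, m_per1000f, unemploy_m24, unemploy_m39, inequality, prob_prison and police_exp60), then inspects a summary of the fitted model.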

model.crime <- lm(formula = crime_rate ~ percent_m + mean_education + m_per1000f + 
    unemploy_m24 + unemploy_m39 + inequality + prob_prison + 
    police_exp60, data = crime)

summary(model.crime)
## 
## Call:
## lm(formula = crime_rate ~ percent_m + mean_education + m_per1000f + 
##     unemploy_m24 + unemploy_m39 + inequality + prob_prison + 
##     police_exp60, data = crime)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -444.70 -111.07    3.03  122.15  483.30 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -6426.101   1194.611  -5.379 4.04e-06 ***
## percent_m          9.332      3.350   2.786  0.00828 ** 
## mean_education    18.012      5.275   3.414  0.00153 ** 
## m_per1000f         2.234      1.360   1.642  0.10874    
## unemploy_m24      -6.087      3.339  -1.823  0.07622 .  
## unemploy_m39      18.735      7.248   2.585  0.01371 *  
## inequality         6.133      1.396   4.394 8.63e-05 ***
## prob_prison    -3796.032   1490.646  -2.547  0.01505 *  
## police_exp60      10.265      1.552   6.613 8.26e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 195.5 on 38 degrees of freedom
## Multiple R-squared:  0.7888, Adjusted R-squared:  0.7444 
## F-statistic: 17.74 on 8 and 38 DF,  p-value: 1.159e-10

The Multiple R-squared (0.7888) and Adjusted R-squared (0.7444) are both reasonably high: the model accounts for roughly three quarters of the variation in crime_rate.
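
The summary figures are linked by simple formulas, and checking them by hand is a quick sanity test. The sketch below uses only numbers printed above (the final RSS of 1453068 from the step() output, n = 47 observations, p = 8 predictors, and the reported R-squared); the variable names are illustrative.

```r
n   <- 47        # observations
p   <- 8         # predictors in model.crime
rss <- 1453068   # residual sum of squares of the final model
r2  <- 0.7888    # Multiple R-squared as printed by summary()

df_resid <- n - p - 1                        # residual degrees of freedom

rse    <- sqrt(rss / df_resid)               # residual standard error
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid  # adjusted R-squared

round(c(df_resid = df_resid, rse = rse, adj_r2 = adj_r2), 4)
```

The results should match the reported 38 degrees of freedom, residual standard error (195.5) and Adjusted R-squared (0.7444), up to rounding of the inputs.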

Results obtained from model.crime

plot(model.crime)

Mean education (across all genders) and unemployment among males aged 35-39 (unemploy_m39) both have a clear effect on the crime rate. The coefficient for prob_prison looks very large only because that variable takes small values, between about 0.007 and 0.12 in this dataset, so its magnitude cannot be compared directly with the coefficients of the other predictors. The model also shows that the number of males per 1000 females (m_per1000f) contributes the least among the predictors: it has the largest p-value (0.109) and is not significant at the 5% level.
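
The point about prob_prison can be illustrated with a small sketch: rescaling a predictor changes the size of its coefficient but not its t value, so coefficient magnitudes alone say little about importance. The built-in mtcars data is used here purely as a stand-in (it is not the crime dataset), and the factor of 100 is arbitrary.

```r
# Fit on the original scale
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)

# Same model, but with wt divided by 100 so that it takes much smaller values
cars_scaled <- transform(mtcars, wt = wt / 100)
fit_scaled  <- lm(mpg ~ wt + hp, data = cars_scaled)

coef_raw    <- unname(coef(fit_raw)["wt"])
coef_scaled <- unname(coef(fit_scaled)["wt"])
t_raw       <- summary(fit_raw)$coefficients["wt", "t value"]
t_scaled    <- summary(fit_scaled)$coefficients["wt", "t value"]

# The coefficient grows by the scaling factor; the t value is unchanged
c(coef_raw = coef_raw, coef_scaled = coef_scaled,
  t_raw = t_raw, t_scaled = t_scaled)
```

For comparing predictors, the t value or p-value columns of summary(model.crime) are therefore more informative than the raw estimates.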

Multicollinearity (checked with VIF values)

This section checks for multicollinearity among the predictors by looking at their variance inflation factors (VIF).

vif(model.crime)
##      percent_m mean_education     m_per1000f   unemploy_m24   unemploy_m39 
##       2.131963       4.189684       1.932367       4.360038       4.508106 
##     inequality    prob_prison   police_exp60 
##       3.731074       1.381879       2.560496
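
For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. A minimal base-R sketch of that definition follows; vif_manual is a hypothetical helper written for illustration, and mtcars again stands in for the crime data.

```r
# VIF by definition: regress each predictor on the others and use 1 / (1 - R^2)
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    fit <- lm(reformulate(others, response = p), data = data)
    1 / (1 - summary(fit)$r.squared)
  })
}

preds <- c("wt", "hp", "disp")
vifs <- vif_manual(mtcars, preds)
round(vifs, 3)
```

Applied to crime with the eight predictors of model.crime, this should give the same numbers as car::vif(). All of the values reported above are below 5, and common rules of thumb treat only values above 5 or 10 as a warning sign, so multicollinearity does not appear to be a serious problem for this model.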