Generating the data

Scenario

Y  : Decision to reject or accept a job applicant at PT A for position B
X1 : Length of previous work experience (months)
X2 : Current employment status (0: employed, 1: not employed)
X3 : Education level (0: secondary school graduate, 1: university graduate)
X4 : GPA (4-point scale)

Generating X1

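X1: Length of previous work experience (months). X1 is generated for 100 applicants and is intended to range over 0-60 months with a central value of 12. The draw uses inverse-transform sampling: -log(1-u)/12 is an exponential variate with rate 12, and multiplying by 60 gives an exponential variate with mean 60/12 = 5 months. The simulated values therefore center near 5 rather than 12, and they are not strictly capped at 60.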

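set.seed(1)
n <- 100          # number of applicants
u <- runif(n)     # uniform draws for inverse-transform sampling

# exponential with mean 60/12 = 5 months, rounded to whole months
x1 <- round(60*(-(log(1-u)/12)))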
x1
##   [1]  2  2  4 12  1 11 14  5  5  0  1  1  6  2  7  3  6 24  2  8 14  1  5  1  2
##  [26]  2  0  2 10  2  3  5  3  1  9  6  8  1  6  3  9  5  8  4  4  8  0  3  7  6
##  [51]  3 10  3  1  0  1  2  4  5  3 12  2  3  2  5  1  3  7  0 10  2  9  2  2  3
##  [76] 11 10  2  8 16  3  6  3  2  7  1  6  1  1  1  1  0  5 10  8  8  3  3  8  5
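
The formula above is inverse-transform sampling for an exponential distribution. As a minimal sketch, assuming the goal is a mean of 12 months with a hard cap at 60, the same idea can be written with an explicit mean parameter. The helper `r_exp_capped` and the name `x1_alt` are hypothetical and not part of the original analysis; capping with `pmin` is one simple choice and slightly distorts the distribution at the upper end.

```r
# Hypothetical helper: inverse-transform sampling from an exponential
# distribution with a given mean, rounded to whole months and capped.
r_exp_capped <- function(n, mean_months, max_months) {
  u <- runif(n)
  draws <- -mean_months * log(1 - u)  # inverse CDF of Exp(rate = 1 / mean_months)
  pmin(round(draws), max_months)      # whole months, never above max_months
}

# The original expression 60 * (-(log(1 - u) / 12)) is the special case mean = 5.
u <- c(0.1, 0.5, 0.9)
same_as_original <- isTRUE(all.equal(60 * (-(log(1 - u) / 12)), -5 * log(1 - u)))

set.seed(1)
x1_alt <- r_exp_capped(100, mean_months = 12, max_months = 60)
summary(x1_alt)
```

Base R's `rexp(n, rate = 1 / mean_months)` draws from the same distribution without the explicit uniform step.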

Generating X2

X2: Current employment status, coded 0 = employed and 1 = not employed.

set.seed(12)
x2 <- round(runif(n))  # 0 or 1, each with probability 0.5
x2
##   [1] 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 0 0 1 0 1 1 1 1 1 1
##  [38] 1 0 0 1 1 1 0 1 0 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 0 1 1 0 0 0 1 0 1 0 1 1 0
##  [75] 0 1 1 1 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 1 0 1 0 0 0
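
Rounding a uniform draw gives 1 whenever the draw is above 0.5, so `round(runif(n))` behaves like a fair coin. A sketch of the same idea with an explicit probability, assuming one might want a share of non-employed applicants other than 50%. The name `x2_alt` and the value 0.3 are illustrative assumptions, not values from the scenario.

```r
n <- 100
set.seed(12)
# Bernoulli draw with an explicit success probability.
# prob = 0.3 is an illustrative assumption, not a value from the scenario.
x2_alt <- rbinom(n, size = 1, prob = 0.3)
table(x2_alt)   # counts of 0 (employed) and 1 (not employed)
mean(x2_alt)    # observed share coded 1
```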

Generating X3

X3: Education level, coded 0 = high school graduate (no university degree) and 1 = university graduate.

set.seed(123)
x3 <- round(runif(n))  # 0 or 1, each with probability 0.5
x3
##   [1] 0 1 0 1 1 0 1 1 1 0 1 0 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0 0 1
##  [38] 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 1 0 1 0 0 0 1 0 1 1 1 0 1 1 1 0
##  [75] 0 0 0 1 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1
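
For reporting, the 0/1 codes can be given readable labels. A minimal sketch, assuming the coding described above; the vector `x3_demo` and the label strings are illustrative, and the numeric 0/1 version is still what enters the model.

```r
# Illustrative codes; in the analysis this would be the simulated x3.
x3_demo <- c(0, 1, 0, 1, 1)
x3_label <- factor(x3_demo, levels = c(0, 1),
                   labels = c("High school", "University"))
table(x3_label)
```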

Generating X4

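X4 is the applicant's GPA on a 4-point scale. It is drawn from a normal distribution with mean 3 and standard deviation 0.5 and is not truncated, so a few simulated values exceed 4.00 (for example 4.21 and 4.27).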

set.seed(1234)
x4 <- round(rnorm(n,3,0.5),2)  # normal with mean 3, sd 0.5, two decimals
x4
##   [1] 2.40 3.14 3.54 1.83 3.21 3.25 2.71 2.73 2.72 2.55 2.76 2.50 2.61 3.03 3.48
##  [16] 2.94 2.74 2.54 2.58 4.21 3.07 2.75 2.78 3.23 2.65 2.28 3.29 2.49 2.99 2.53
##  [31] 3.55 2.76 2.65 2.75 2.19 2.42 1.91 2.33 2.85 2.77 3.72 2.47 2.57 2.86 2.50
##  [46] 2.52 2.45 2.37 2.74 2.75 2.10 2.71 2.45 2.49 2.92 3.28 3.82 2.61 3.80 2.42
##  [61] 3.33 4.27 2.98 2.67 3.00 3.89 2.43 3.68 3.66 3.17 3.00 2.77 2.82 3.32 4.04
##  [76] 2.92 2.30 2.64 3.13 2.84 2.91 2.92 2.31 2.91 3.43 3.35 3.27 2.80 2.90 2.40
##  [91] 2.97 3.13 3.85 3.50 2.75 3.18 2.43 3.44 3.49 4.06
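
Because a normal draw is unbounded, one way to keep simulated GPAs on the 4-point scale is to clip them. This is a sketch of one option, not the method used above; clipping piles a little probability mass at the bounds, and `x4_raw` and `x4_clipped` are hypothetical names.

```r
n <- 100
set.seed(1234)
x4_raw <- round(rnorm(n, mean = 3, sd = 0.5), 2)
# Clip to the valid GPA range [0, 4].
x4_clipped <- pmin(pmax(x4_raw, 0), 4)
range(x4_clipped)
sum(x4_raw > 4)  # number of draws that were clipped from above
```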

Generating Y

Setting the coefficients
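
Written out, the data-generating model implied by the coefficients and code in this section is a logistic regression. The symbols $\eta_i$ and $p_i$ are notation introduced here for the linear predictor (`datapendukung` in the code) and the success probability (`p` in the code):

```latex
\[
\eta_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i},
\qquad
p_i = \frac{e^{\eta_i}}{1 + e^{\eta_i}},
\qquad
Y_i \sim \mathrm{Bernoulli}(p_i),
\]
\[
(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4) = (-11,\ 3.5,\ 0.5,\ 2.7,\ 2.2).
\]
```

For the first applicant ($x_1 = 2$, $x_2 = 0$, $x_3 = 0$, $x_4 = 2.40$) this gives $\eta_1 = -11 + 7 + 0 + 0 + 5.28 = 1.28$, matching the first value of `datapendukung`.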

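b0 <- -11   # intercept
b1 <- 3.5   # x1: experience (months)
b2 <- 0.5   # x2: not employed
b3 <- 2.7   # x3: university graduate
b4 <- 2.2   # x4: GPA
set.seed(12345)
# linear predictor (log-odds); deterministic given x1..x4
datapendukung <- b0+(b1*x1)+(b2*x2)+(b3*x3)+(b4*x4)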
datapendukung
##   [1]  1.280  6.108 11.288 37.726  2.262 34.650 46.662 15.706 15.184 -5.390
##  [11]  1.272 -1.500 18.442  5.366 21.156  8.668 16.028 79.088  2.176 28.962
##  [21] 47.454  1.750 15.316  2.806  4.530  3.716 -0.562  4.178 30.578  2.066
##  [31] 10.010 15.772  8.530  1.750 25.818 15.824 24.402 -1.874 16.270  5.594
##  [41] 29.184 12.434 23.154  9.292  9.000 22.544 -5.610  4.714 19.528 19.250
##  [51]  4.120 29.962  7.590 -2.022 -1.376  0.216  4.904 11.942 18.060  5.324
##  [61] 41.526  5.394  6.556  2.374 15.800  1.058  7.546 24.796 -0.248 31.474
##  [71]  5.300 29.794  5.404  3.304  8.388 34.424 29.560  5.008 23.886 51.748
##  [81]  6.402 19.624  4.582  5.102 21.546 -0.130 19.894  1.860  1.580 -2.220
##  [91] -0.466 -1.414 14.970 34.400 23.550 23.996  8.046  7.068 24.678 18.132
# inverse logit: convert log-odds to probabilities
p <- exp(datapendukung)/(1+exp(datapendukung))
p
##   [1] 0.782449776 0.997779943 0.999987478 1.000000000 0.905680616 1.000000000
##   [7] 1.000000000 0.999999849 0.999999746 0.004541256 0.781084923 0.182425524
##  [13] 0.999999990 0.995348948 0.999999999 0.999828027 0.999999891 1.000000000
##  [19] 0.898073505 1.000000000 1.000000000 0.851952802 0.999999777 0.942999193
##  [25] 0.989334307 0.976246843 0.363084825 0.984902299 1.000000000 0.887554373
##  [31] 0.999955054 0.999999859 0.999802584 0.851952802 1.000000000 0.999999866
##  [37] 1.000000000 0.133079567 0.999999914 0.996293670 1.000000000 0.999996019
##  [43] 1.000000000 0.999907850 0.999876605 1.000000000 0.003647715 0.991110894
##  [49] 0.999999997 0.999999996 0.984015152 1.000000000 0.999494774 0.116912345
##  [55] 0.201652186 0.553791023 0.992637749 0.999993489 0.999999986 0.995150411
##  [61] 1.000000000 0.995476790 0.998580457 0.914823065 0.999999863 0.742308157
##  [67] 0.999472060 1.000000000 0.438315828 1.000000000 0.995033198 1.000000000
##  [73] 0.995521596 0.964565780 0.999772470 1.000000000 1.000000000 0.993360124
##  [79] 1.000000000 1.000000000 0.998344508 0.999999997 0.989869275 0.993952233
##  [85] 1.000000000 0.467545694 0.999999998 0.865296948 0.829204518 0.097968804
##  [91] 0.385563426 0.195603918 0.999999685 1.000000000 1.000000000 1.000000000
##  [97] 0.999679722 0.999148790 1.000000000 0.999999987
set.seed(123456)
# draw each Y as a Bernoulli trial with success probability p
y <- rbinom(n,1,p)
y
##   [1] 0 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
##  [38] 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
##  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1
# combine the response and the predictors into one data frame
datagab <- data.frame(y,x1,x2,x3,x4)
datagab
##     y x1 x2 x3   x4
## 1   0  2  0  0 2.40
## 2   1  2  1  1 3.14
## 3   1  4  1  0 3.54
## 4   1 12  0  1 1.83
## 5   1  1  0  1 3.21
## 6   1 11  0  0 3.25
## 7   1 14  0  1 2.71
## 8   1  5  1  1 2.73
## 9   1  5  0  1 2.72
## 10  0  0  0  0 2.55
## 11  0  1  0  1 2.76
## 12  0  1  1  0 2.50
## 13  1  6  0  1 2.61
## 14  1  2  0  1 3.03
## 15  1  7  0  0 3.48
## 16  1  3  0  1 2.94
## 17  1  6  0  0 2.74
## 18  1 24  1  0 2.54
## 19  1  2  1  0 2.58
## 20  1  8  0  1 4.21
## 21  1 14  0  1 3.07
## 22  1  1  1  1 2.75
## 23  1  5  0  1 2.78
## 24  1  1  1  1 3.23
## 25  1  2  0  1 2.65
## 26  1  2  0  1 2.28
## 27  0  0  1  1 3.29
## 28  1  2  0  1 2.49
## 29  1 10  0  0 2.99
## 30  1  2  1  0 2.53
## 31  1  3  0  1 3.55
## 32  1  5  1  1 2.76
## 33  1  3  1  1 2.65
## 34  1  1  1  1 2.75
## 35  1  9  1  0 2.19
## 36  1  6  1  0 2.42
## 37  1  8  1  1 1.91
## 38  0  1  1  0 2.33
## 39  1  6  0  0 2.85
## 40  1  3  0  0 2.77
## 41  1  9  1  0 3.72
## 42  1  5  1  0 2.47
## 43  1  8  1  0 2.57
## 44  1  4  0  0 2.86
## 45  1  4  1  0 2.50
## 46  1  8  0  0 2.52
## 47  0  0  0  0 2.45
## 48  1  3  0  0 2.37
## 49  1  7  0  0 2.74
## 50  1  6  1  1 2.75
## 51  1  3  0  0 2.10
## 52  1 10  0  0 2.71
## 53  1  3  0  1 2.45
## 54  0  1  0  0 2.49
## 55  0  0  1  1 2.92
## 56  0  1  1  0 3.28
## 57  1  2  1  0 3.82
## 58  1  4  1  1 2.61
## 59  1  5  1  1 3.80
## 60  1  3  1  0 2.42
## 61  1 12  1  1 3.33
## 62  1  2  0  0 4.27
## 63  1  3  1  0 2.98
## 64  1  2  1  0 2.67
## 65  1  5  0  1 3.00
## 66  1  1  0  0 3.89
## 67  1  3  0  1 2.43
## 68  1  7  1  1 3.68
## 69  0  0  0  1 3.66
## 70  1 10  1  0 3.17
## 71  1  2  0  1 3.00
## 72  1  9  1  1 2.77
## 73  1  2  1  1 2.82
## 74  1  2  0  0 3.32
## 75  1  3  0  0 4.04
## 76  1 11  1  0 2.92
## 77  1 10  1  0 2.30
## 78  1  2  1  1 2.64
## 79  1  8  0  0 3.13
## 80  1 16  1  0 2.84
## 81  1  3  1  0 2.91
## 82  1  6  1  1 2.92
## 83  1  3  0  0 2.31
## 84  1  2  0  1 2.91
## 85  1  7  1  0 3.43
## 86  1  1  0  0 3.35
## 87  1  6  0  1 3.27
## 88  1  1  1  1 2.80
## 89  0  1  0  1 2.90
## 90  1  1  0  0 2.40
## 91  0  1  1  0 2.97
## 92  0  0  0  1 3.13
## 93  1  5  0  0 3.85
## 94  1 10  0  1 3.50
## 95  1  8  1  0 2.75
## 96  1  8  0  0 3.18
## 97  1  3  1  1 2.43
## 98  1  3  0  0 3.44
## 99  1  8  0  0 3.49
## 100 1  5  0  1 4.06
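
The expression `exp(x)/(1+exp(x))` used above to compute `p` is the inverse logit, which base R provides as `plogis()`. A small sketch, using four linear-predictor values taken from the `datapendukung` output above (the names `eta`, `p_manual`, and `p_plogis` are introduced for the example), shows that the two agree and that `plogis()` stays finite where the explicit formula overflows:

```r
eta <- c(-5.39, 1.28, 6.108, 37.726)   # values 10, 1, 2, 4 of datapendukung
p_manual <- exp(eta) / (1 + exp(eta))
p_plogis <- plogis(eta)
all.equal(p_manual, p_plogis)

# For very large inputs, exp() overflows to Inf and the manual formula gives NaN.
exp(800) / (1 + exp(800))
plogis(800)
```

The linear predictors in this simulation stay well below the overflow range, so the manual formula is safe here; `plogis()` is simply the more robust habit.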

Logistic Regression Analysis

# fit a logistic regression (binomial GLM with logit link)
modelreglog <- glm(y~x1+x2+x3+x4, family = binomial(link = "logit"), data=datagab)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(modelreglog)
## 
## Call:
## glm(formula = y ~ x1 + x2 + x3 + x4, family = binomial(link = "logit"), 
##     data = datagab)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept) -14.2927     6.7321  -2.123  0.03375 * 
## x1            4.2842     1.4175   3.022  0.00251 **
## x2            0.7724     1.1423   0.676  0.49894   
## x3            1.9310     1.2230   1.579  0.11435   
## x4            2.9125     1.7503   1.664  0.09611 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 80.993  on 99  degrees of freedom
## Residual deviance: 22.213  on 95  degrees of freedom
## AIC: 32.213
## 
## Number of Fisher Scoring iterations: 10
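
Since the data were simulated, the estimates can be set against the coefficients used to generate them. The sketch below rebuilds the data with the same seeds and formulas as this section, refits the model, and tabulates true values, estimates, odds ratios, and Wald intervals. The object names `beta_true`, `fit`, `wald_ci`, and `comparison` are introduced here for illustration. The warning about fitted probabilities of 0 or 1 is expected, because many linear-predictor values are large and the corresponding probabilities are numerically 1.

```r
n <- 100
set.seed(1);    x1 <- round(60 * (-(log(1 - runif(n)) / 12)))
set.seed(12);   x2 <- round(runif(n))
set.seed(123);  x3 <- round(runif(n))
set.seed(1234); x4 <- round(rnorm(n, 3, 0.5), 2)

beta_true <- c(-11, 3.5, 0.5, 2.7, 2.2)           # b0..b4 from the text
eta <- drop(cbind(1, x1, x2, x3, x4) %*% beta_true)
p <- exp(eta) / (1 + exp(eta))
set.seed(123456); y <- rbinom(n, 1, p)
datagab <- data.frame(y, x1, x2, x3, x4)

# suppressWarnings() hides the "fitted probabilities numerically 0 or 1" warning.
fit <- suppressWarnings(
  glm(y ~ x1 + x2 + x3 + x4, family = binomial(link = "logit"), data = datagab)
)

wald_ci <- confint.default(fit)                   # Wald 95% intervals
comparison <- data.frame(
  true       = beta_true,
  estimate   = coef(fit),
  odds_ratio = exp(coef(fit)),
  lower      = wald_ci[, 1],
  upper      = wald_ci[, 2]
)
round(comparison, 3)
```

With only 100 observations and most outcomes equal to 1, the standard errors are large, so wide intervals around the true coefficients are to be expected.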