Generating the Data

Scenario

Y : Decision to reject or accept a job applicant for position B at PT A
X1 : Length of previous work experience (months)
X2 : Current employment status (0: not employed, 1: employed)
X3 : Education level (0: secondary school graduate, 1: university graduate)
X4 : GPA (IPK, 4-point scale)

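Generating X1

X1 : Length of previous work experience (months). X1 is generated for 100 applicants, with a target range of 0-60 months and a central value of 12. The code draws from an exponential distribution by inverse-transform sampling (rate 12, scaled by 60), so the theoretical mean of the unrounded values is 60/12 = 5 months and the values are not strictly capped at 60.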

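set.seed(447)
n <- 100          # number of applicants
u <- runif(n)     # uniform draws on (0, 1)

# Inverse-transform sampling: -log(1-u)/12 is exponential with rate 12, then scaled by 60 and rounded to whole months
x1 <- round(60*(-(log(1-u)/12)))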
x1
##   [1]  2  6  5  6  4  3  1  1  1 19  8  9  1  2  5  0  6  1  4  6  2  1  6  1  1
##  [26]  0  0  4  2  7  8  2  2  3 11  1 11  2  1  2  1  2  4 10  5  7 10  7  5  1
##  [51]  2 28  1  4 12 17  6  0  7  7  4  7  5  1  1  5 12  0  7  0  2  1  1  1  1
##  [76]  7  3  2 11  2  6  3  5  3  8  1  2  7  5  4  5 13  3  1  3  1  7  3  3  3
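As a cross-check on the X1 step, the inverse-transform formula can be compared with R's built-in exponential quantile function. The sketch below is illustrative only: the names `x1_raw`, `x1_qexp`, and `x1_alt` are hypothetical, and the last step shows one possible way to obtain a mean of 12 months capped at 60, assuming that is what the scenario intends.

```r
set.seed(447)
n <- 100
u <- runif(n)

# Formula from the text, before rounding: 60 * Exp(rate = 12),
# so the theoretical mean is 60 / 12 = 5 months
x1_raw <- 60 * (-(log(1 - u) / 12))

# The same transform written with the exponential quantile function
x1_qexp <- 60 * qexp(u, rate = 12)
all.equal(x1_raw, x1_qexp)

# Hypothetical alternative (an assumption, not the method used in the text):
# exponential with mean 12 months, capped at 60 months
x1_alt <- pmin(round(-12 * log(1 - u)), 60)
range(x1_alt)
```

Because `x1_alt` reuses the same uniform draws, it is simply a rescaled version of the original variable; using it would change every downstream result.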

Generating X2

X2 : Employment status. Coding used: 0 = not employed, 1 = employed.

set.seed(217)
x2 <- round(runif(n))
x2
##   [1] 0 0 0 1 0 1 1 1 1 0 0 1 1 0 0 0 1 0 0 1 0 0 1 1 1 1 0 1 0 1 1 1 1 0 0 1 1
##  [38] 0 0 1 1 1 0 1 1 1 1 0 1 1 0 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 0 1 0 1 1 0 0
##  [75] 0 0 0 0 0 1 0 0 0 1 1 1 0 0 1 0 1 0 0 1 1 0 0 0 1 0
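Rounding a uniform draw is one way to produce a fair 0/1 variable. The sketch below, using the hypothetical names `u2`, `x2_round`, and `x2_thresh`, shows that it is equivalent to thresholding the same draws at 0.5, which makes the implied success probability of 0.5 explicit.

```r
set.seed(217)
n <- 100
u2 <- runif(n)

# Approach from the text: round the uniform draw to 0 or 1
x2_round <- round(u2)

# Equivalent threshold form: 1 when the draw exceeds 0.5, else 0
x2_thresh <- as.integer(u2 > 0.5)

all(x2_round == x2_thresh)

# Frequency of each category
table(x2_round)
```

A direct `rbinom(n, 1, 0.5)` call would express the same distribution, though it would give a different sequence of values for the same seed.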

Generating X3

X3 : Education level. Coding used: 0 = secondary school graduate (no university degree), 1 = university graduate.

set.seed(427)
x3 <- round(runif(n))
x3
##   [1] 0 1 0 0 1 1 1 0 0 0 1 0 1 1 1 0 1 1 1 1 1 1 0 0 1 0 0 0 0 0 1 1 0 1 0 0 1
##  [38] 1 1 0 1 1 1 1 1 0 1 0 0 0 1 1 1 0 0 1 1 1 1 1 0 1 0 1 0 1 0 0 0 1 1 1 1 1
##  [75] 0 0 1 1 1 1 1 0 1 0 1 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0
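For reporting, the 0/1 codes can be given readable labels. This is a minimal sketch; the name `x3_label` and the English label strings are assumptions introduced here for illustration.

```r
set.seed(427)
n <- 100
x3 <- round(runif(n))

# Attach labels to the 0/1 education codes (label text is illustrative)
x3_label <- factor(x3, levels = c(0, 1),
                   labels = c("Secondary school", "University"))

# Cross-tabulate codes against labels to confirm the mapping
table(x3, x3_label)
```

The numeric `x3` can still be used in the model; the factor is only a convenience for tables and plots.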

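Generating X4

X4 is the applicant's GPA (IPK) on a 4-point scale, drawn from a normal distribution with mean 3 and standard deviation 0.5 and rounded to two decimals.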

set.seed(123)
x4 <- round(rnorm(n,3,0.5),2)
x4
##   [1] 2.72 2.88 3.78 3.04 3.06 3.86 3.23 2.37 2.66 2.78 3.61 3.18 3.20 3.06 2.72
##  [16] 3.89 3.25 2.02 3.35 2.76 2.47 2.89 2.49 2.64 2.69 2.16 3.42 3.08 2.43 3.63
##  [31] 3.21 2.85 3.45 3.44 3.41 3.34 3.28 2.97 2.85 2.81 2.65 2.90 2.37 4.08 3.60
##  [46] 2.44 2.80 2.77 3.39 2.96 3.13 2.99 2.98 3.68 2.89 3.76 2.23 3.29 3.06 3.11
##  [61] 3.19 2.75 2.83 2.49 2.46 3.15 3.22 3.03 3.46 4.03 2.75 1.85 3.50 2.65 2.66
##  [76] 3.51 2.86 2.39 3.09 2.93 3.00 3.19 2.81 3.32 2.89 3.17 3.55 3.22 2.84 3.57
##  [91] 3.50 3.27 3.12 2.69 3.68 2.70 4.09 3.77 2.88 2.49
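# Alternative GPA draw with a lower mean (2.7); stored as x44 and not used in the data frame or model below
set.seed(222)
x44 <- round(rnorm(n,2.7,0.5),2)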
x44
##   [1] 3.44 2.70 3.39 2.51 2.79 2.58 2.09 3.48 2.91 2.10 3.23 2.05 2.35 3.00 2.60
##  [16] 2.11 1.70 2.70 2.96 2.33 3.06 3.06 2.37 3.45 1.98 1.62 2.90 2.50 2.55 3.37
##  [31] 2.29 3.04 2.59 2.64 2.60 2.90 3.03 2.75 2.61 3.17 2.80 2.95 2.42 3.26 3.80
##  [46] 2.86 2.23 3.11 2.51 2.87 3.00 2.96 2.22 2.09 2.60 3.23 2.89 3.32 2.86 2.18
##  [61] 2.13 3.32 3.09 3.07 2.73 3.12 2.80 3.43 2.47 1.31 2.73 2.67 2.11 1.44 3.11
##  [76] 2.83 2.67 3.04 2.71 2.97 3.04 2.10 2.08 2.81 1.97 2.64 2.97 3.06 1.91 3.25
##  [91] 2.53 3.01 2.95 3.54 2.89 2.82 2.91 2.11 2.38 2.73
summary(x44)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.310   2.410   2.795   2.711   3.040   3.800
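A normal draw is unbounded, and the x4 output above includes values such as 4.08 and 4.09 that exceed the 4-point scale. One possible remedy, shown here as a sketch and not as the method used in the text, is to clamp the values to the interval [0, 4]; the name `x4_clamped` is hypothetical.

```r
set.seed(123)
n <- 100
x4 <- round(rnorm(n, 3, 0.5), 2)

# Count simulated GPAs that fall outside the 4-point scale
sum(x4 > 4)

# Assumed fix: clamp every value to the interval [0, 4]
x4_clamped <- pmin(pmax(x4, 0), 4)
range(x4_clamped)
```

Clamping piles a few observations up at exactly 4.00; redrawing out-of-range values would avoid that, at the cost of slightly more code.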

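Generating Y

Setting the coefficients

# True coefficients of the logistic model: intercept, then x1 to x4
b0 <- -11
b1 <- 3.5
b2 <- 0.5
b3 <- 2.7
b4 <- 2.2
# No random draw happens before the next set.seed() call, so this seed has no effect on the results
set.seed(1)
# Linear predictor (log-odds that y = 1) for each applicant
datapendukung <- b0+(b1*x1)+(b2*x2)+(b3*x3)+(b4*x4)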
datapendukung
##   [1]  1.984 19.036 14.816 17.188 12.432 11.192  2.806 -1.786 -1.148 61.616
##  [11] 27.642 27.996  2.740  5.432 15.184 -2.442 20.350 -0.356 13.070 19.272
##  [21]  4.134  1.558 15.978 -1.192  1.618 -5.748 -3.476 10.276  1.346 21.986
##  [31] 27.262  5.470  4.090  9.768 35.002  0.348 37.916  5.234  1.470  2.682
##  [41]  1.530  5.580 10.914 36.176 17.620 19.368 33.360 19.594 14.458 -0.488
##  [51]  5.586 96.778  2.256 11.096 37.858 59.972 18.106 -1.062 23.432 23.042
##  [61] 10.518 22.750 13.226  0.678 -1.588 16.130 38.584 -4.334 21.612  0.566
##  [71]  5.250 -0.230  2.900  1.030 -1.648 21.222  8.492  3.958 36.998  5.646
##  [81] 19.300  6.518 15.382  7.304 26.558  2.674  6.510 20.584 13.248 10.854
##  [91] 14.700 41.694  6.364 -1.082 10.796  1.140 22.498 10.494  6.336  4.978
# Inverse logit: convert the linear predictor into the probability that y = 1
p <- exp(datapendukung)/(1+exp(datapendukung))
p
##   [1] 0.879106919 0.999999995 0.999999632 0.999999966 0.999996011 0.999986216
##   [7] 0.942999193 0.143563836 0.240854581 1.000000000 1.000000000 1.000000000
##  [13] 0.939346097 0.995644713 0.999999746 0.080025546 0.999999999 0.411928197
##  [19] 0.999997892 0.999999996 0.984233877 0.826066179 0.999999885 0.232901428
##  [25] 0.834519121 0.003179014 0.030002872 0.999965551 0.793474908 1.000000000
##  [31] 1.000000000 0.995806428 0.983536355 0.999942749 1.000000000 0.586132500
##  [37] 1.000000000 0.994696127 0.813057386 0.935956113 0.822006314 0.996241613
##  [43] 0.999981799 1.000000000 0.999999978 0.999999996 1.000000000 0.999999997
##  [49] 0.999999474 0.380364830 0.996264012 1.000000000 0.905166828 0.999984827
##  [55] 1.000000000 1.000000000 0.999999986 0.256927438 1.000000000 1.000000000
##  [61] 0.999972956 1.000000000 0.999998197 0.663292172 0.169665469 0.999999901
##  [67] 1.000000000 0.012945206 1.000000000 0.637839685 0.994779874 0.442752145
##  [73] 0.947846437 0.736915896 0.161379439 0.999999999 0.999794939 0.981256742
##  [79] 1.000000000 0.996480813 0.999999996 0.998525558 0.999999791 0.999327610
##  [85] 1.000000000 0.935474899 0.998513733 0.999999999 0.999998236 0.999980673
##  [91] 0.999999587 1.000000000 0.998280499 0.253127722 0.999979519 0.757679639
##  [97] 1.000000000 0.999972299 0.998231759 0.993159293
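The inverse-logit step can also be written with base R's `plogis()`, which gives the same probabilities and stays stable for very large linear predictors. The sketch below uses three linear predictor values copied from the output above; the names `eta`, `p_manual`, and `p_plogis` are hypothetical.

```r
# Three linear predictor values taken from the printed datapendukung output
eta <- c(-5.748, 1.984, 96.778)

# Manual inverse logit, as in the text
p_manual <- exp(eta) / (1 + exp(eta))

# Built-in logistic distribution function
p_plogis <- plogis(eta)
all.equal(p_manual, p_plogis)

# For a very large predictor exp() overflows to Inf, so the manual
# form evaluates Inf / Inf, while plogis() still returns a valid probability
exp(800) / (1 + exp(800))
plogis(800)
```

None of the simulated values here is large enough to overflow, so the results in the text are unaffected.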
set.seed(2)
# One Bernoulli draw per applicant, each with its own success probability p
y <- rbinom(n,1,p)
y
##   [1] 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1
##  [38] 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1
##  [75] 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# Combine the response and the four predictors into one data frame
datagab <- data.frame(y,x1,x2,x3,x4)
datagab
##     y x1 x2 x3   x4
## 1   1  2  0  0 2.72
## 2   1  6  0  1 2.88
## 3   1  5  0  0 3.78
## 4   1  6  1  0 3.04
## 5   1  4  0  1 3.06
## 6   1  3  1  1 3.86
## 7   1  1  1  1 3.23
## 8   0  1  1  0 2.37
## 9   0  1  1  0 2.66
## 10  1 19  0  0 2.78
## 11  1  8  0  1 3.61
## 12  1  9  1  0 3.18
## 13  1  1  1  1 3.20
## 14  1  2  0  1 3.06
## 15  1  5  0  1 2.72
## 16  0  0  0  0 3.89
## 17  1  6  1  1 3.25
## 18  1  1  0  1 2.02
## 19  1  4  0  1 3.35
## 20  1  6  1  1 2.76
## 21  1  2  0  1 2.47
## 22  1  1  0  1 2.89
## 23  1  6  1  0 2.49
## 24  1  1  1  0 2.64
## 25  1  1  1  1 2.69
## 26  0  0  1  0 2.16
## 27  0  0  0  0 3.42
## 28  1  4  1  0 3.08
## 29  1  2  0  0 2.43
## 30  1  7  1  0 3.63
## 31  1  8  1  1 3.21
## 32  1  2  1  1 2.85
## 33  1  2  1  0 3.45
## 34  1  3  0  1 3.44
## 35  1 11  0  0 3.41
## 36  1  1  1  0 3.34
## 37  1 11  1  1 3.28
## 38  1  2  0  1 2.97
## 39  0  1  0  1 2.85
## 40  1  2  1  0 2.81
## 41  1  1  1  1 2.65
## 42  1  2  1  1 2.90
## 43  1  4  0  1 2.37
## 44  1 10  1  1 4.08
## 45  1  5  1  1 3.60
## 46  1  7  1  0 2.44
## 47  1 10  1  1 2.80
## 48  1  7  0  0 2.77
## 49  1  5  1  0 3.39
## 50  0  1  1  0 2.96
## 51  1  2  0  1 3.13
## 52  1 28  1  1 2.99
## 53  1  1  1  1 2.98
## 54  1  4  0  0 3.68
## 55  1 12  1  0 2.89
## 56  1 17  1  1 3.76
## 57  1  6  1  1 2.23
## 58  0  0  0  1 3.29
## 59  1  7  1  1 3.06
## 60  1  7  0  1 3.11
## 61  1  4  1  0 3.19
## 62  1  7  1  1 2.75
## 63  1  5  1  0 2.83
## 64  1  1  0  1 2.49
## 65  0  1  1  0 2.46
## 66  1  5  0  1 3.15
## 67  1 12  1  0 3.22
## 68  0  0  0  0 3.03
## 69  1  7  1  0 3.46
## 70  1  0  0  1 4.03
## 71  1  2  1  1 2.75
## 72  0  1  1  1 1.85
## 73  1  1  0  1 3.50
## 74  1  1  0  1 2.65
## 75  0  1  0  0 2.66
## 76  1  7  0  0 3.51
## 77  1  3  0  1 2.86
## 78  1  2  0  1 2.39
## 79  1 11  0  1 3.09
## 80  1  2  1  1 2.93
## 81  1  6  0  1 3.00
## 82  1  3  0  0 3.19
## 83  1  5  0  1 2.81
## 84  1  3  1  0 3.32
## 85  1  8  1  1 2.89
## 86  1  1  1  1 3.17
## 87  1  2  0  1 3.55
## 88  1  7  0  0 3.22
## 89  1  5  1  0 2.84
## 90  1  4  0  0 3.57
## 91  1  5  1  0 3.50
## 92  1 13  0  0 3.27
## 93  1  3  0  0 3.12
## 94  1  1  1  0 2.69
## 95  1  3  1  1 3.68
## 96  1  1  0  1 2.70
## 97  1  7  0  0 4.09
## 98  1  3  0  1 3.77
## 99  1  3  1  0 2.88
## 100 1  3  0  0 2.49
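Instead of printing all 100 rows, a compact inspection is often enough. The sketch below re-creates the data with the same seeds and formulas as the text, then looks at the structure and the class balance of `y`; the name `eta` is a hypothetical stand-in for `datapendukung`.

```r
# Re-create the simulated data with the same seeds and formulas as in the text
n <- 100
set.seed(447); x1 <- round(60 * (-(log(1 - runif(n)) / 12)))
set.seed(217); x2 <- round(runif(n))
set.seed(427); x3 <- round(runif(n))
set.seed(123); x4 <- round(rnorm(n, 3, 0.5), 2)

# Linear predictor and probability that y = 1
eta <- -11 + 3.5 * x1 + 0.5 * x2 + 2.7 * x3 + 2.2 * x4
p <- exp(eta) / (1 + exp(eta))

set.seed(2); y <- rbinom(n, 1, p)
datagab <- data.frame(y, x1, x2, x3, x4)

# Compact views of the data
str(datagab)
head(datagab)

# Class balance of the response
table(datagab$y)
```

The printed `y` vector in the text contains 12 zeros and 88 ones, so the response is heavily imbalanced, which is worth keeping in mind when reading the model output.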

Logistic Regression Analysis

# Fit a binary logistic regression of y on the four predictors
modelreglog <- glm(y~x1+x2+x3+x4, family = binomial(link = "logit"), data=datagab)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(modelreglog)
## 
## Call:
## glm(formula = y ~ x1 + x2 + x3 + x4, family = binomial(link = "logit"), 
##     data = datagab)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -16.9249     6.8875  -2.457   0.0140 *
## x1            5.9382     2.5365   2.341   0.0192 *
## x2            0.6319     1.4786   0.427   0.6691  
## x3            3.3549     1.5380   2.181   0.0292 *
## x4            3.6201     1.6520   2.191   0.0284 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 73.385  on 99  degrees of freedom
## Residual deviance: 19.478  on 95  degrees of freedom
## AIC: 29.478
## 
## Number of Fisher Scoring iterations: 11
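Since the data are simulated, the estimates can be compared with the true coefficients used to generate Y. The sketch below copies the estimates and standard errors from the summary output above and builds approximate 95% Wald intervals; all object names are hypothetical. The warning about fitted probabilities of 0 or 1 suggests near-separation, so these intervals should be read as rough guides only.

```r
# True coefficients from the data-generating step
true_coef <- c(b0 = -11, b1 = 3.5, b2 = 0.5, b3 = 2.7, b4 = 2.2)

# Estimates and standard errors copied from the summary output
est <- c(-16.9249, 5.9382, 0.6319, 3.3549, 3.6201)
se  <- c(6.8875, 2.5365, 1.4786, 1.5380, 1.6520)

# Approximate 95% Wald intervals
z <- qnorm(0.975)
lower <- est - z * se
upper <- est + z * se

# Does each interval contain the true value?
covered <- true_coef >= lower & true_coef <= upper
data.frame(true_coef, est, lower, upper, covered)

# Odds ratios implied by the slope estimates (intercept dropped)
exp(est[-1])
```

All five intervals contain the corresponding true value, although they are wide, which is consistent with the small number of `y = 0` cases in the sample.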