Generating the Data

Scenario

Y: decision to reject or accept a job applicant at PT A for position B
X1: length of previous work experience (months)
X2: current employment status (0: employed, 1: unemployed)
X3: education level (0: high school graduate, 1: university graduate)
X4: GPA (4-point scale)

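Generating X1

X1: length of previous work experience (months). The variable x1 is generated for 100 applicants, with experience intended to range from 0 to 60 months and a central value of 12. As coded below, x1 is an inverse-transform draw from an exponential distribution with rate 12, multiplied by 60 and rounded, so its theoretical mean is 60/12 = 5 months and values are not strictly capped at 60.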

set.seed(100) ## fix the random seed for reproducibility
n <- 100
u <- runif(n)

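## 60 = scale factor (the scenario's maximum experience, in months)
## 12 = rate of the exponential draw (the "central value" in the scenario)
## 100 = number of applicants (n)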
x1 <- round(60*(-(log(1-u)/12)))
x1
##   [1]  2  1  4  0  3  3  8  2  4  1  5 11  2  3  7  6  1  2  2  6  4  6  4  7  3
##  [26]  1  7 11  4  2  3 13  2 15  6 11  1  5 23  1  2 10  8  9  5  3  8 11  1  2
##  [51]  2  1  1  2  4  1  1  1  5  1  3  5 16  6  3  2  3  3  1  6  3  2  4 17  5
##  [76]  5 10  7  9  0  3  5 13 20  0  4  7  1  2  7 12  1  2  3 12  2  4  1  0  7
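
The x1 formula can be read as inverse-transform sampling: if u is uniform on (0, 1), then -log(1 - u)/rate follows an exponential distribution with that rate. The sketch below is only a check of that reading, not part of the original analysis; the names n_chk, u_chk, manual and builtin are hypothetical and chosen so they do not overwrite the variables above.

```r
## Sketch: compare the hand-written formula with R's exponential quantile function.
set.seed(100)
n_chk <- 100                 # hypothetical name, mirrors n in the text
u_chk <- runif(n_chk)        # same uniform draws as u, given the same seed

manual  <- 60 * (-(log(1 - u_chk) / 12))  # expression from the text, before rounding
builtin <- 60 * qexp(u_chk, rate = 12)    # same transform via stats::qexp

all.equal(manual, builtin)   # agreement up to floating-point error
60 / 12                      # theoretical mean of the scaled variable, in months
```

If a mean of 12 months were the goal, one option (an assumption about intent, not something the source states) would be to draw from an exponential with mean 12, for example qexp(u_chk, rate = 1/12).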

Generating X2

X2: current employment status (0: employed, 1: unemployed)

set.seed(322) ## fix the random seed for reproducibility
x2 <- round(runif(n))
x2
##   [1] 0 0 0 1 0 1 1 0 0 1 0 1 0 1 1 1 0 0 0 1 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 1
##  [38] 0 1 0 0 0 0 0 1 0 1 1 1 0 1 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1 0 1 0 1 0 0 1
##  [75] 1 1 0 1 1 1 1 0 0 1 1 1 0 1 1 0 0 1 1 0 0 0 1 1 1 1

Generating X3

X3: education level (0: high school graduate, 1: university graduate)

set.seed(100) ## fix the random seed (same seed as for x1, so runif(n) reproduces the same draws as u)
x3 <- round(runif(n))
x3
##   [1] 0 0 1 0 0 0 1 0 1 0 1 1 0 0 1 1 0 0 0 1 1 1 1 1 0 0 1 1 1 0 0 1 0 1 1 1 0
##  [38] 1 1 0 0 1 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 1 0 0 0 0 0 1 0 0 1 1
##  [75] 1 1 1 1 1 0 0 1 1 1 0 1 1 0 0 1 1 0 0 0 1 0 1 0 0 1
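
Because x3 reuses seed 100, it is built from the same uniform draws as x1 and is therefore not independent of it. The following sketch illustrates that dependence; the names u_a, u_b, x1_chk and x3_chk are hypothetical and used only for this check.

```r
## Sketch: the same seed gives the same uniforms, so x3 is a function of x1's draws.
set.seed(100)
u_a <- runif(100)            # draws used for x1
set.seed(100)
u_b <- runif(100)            # draws used for x3

identical(u_a, u_b)          # same seed, same sequence

x1_chk <- round(60 * (-(log(1 - u_a) / 12)))
x3_chk <- round(u_b)

## Cross-tabulate "x1 at least 4 months" against x3.
table(x1_ge_4 = x1_chk >= 4, x3 = x3_chk)
```

Under this construction every applicant with x1 of 4 or more has x3 = 1, and every applicant with x1 of 2 or less has x3 = 0. If independent predictors were intended, a different seed for x3 would be one way to get them, although the outputs shown in this document would then change.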

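Generating X4

X4: GPA (4-point scale), drawn from a normal distribution with mean 3 and standard deviation 0.5 and rounded to two decimals. The draw is not truncated, so a few values (4.17 and 4.13 in the output below) exceed 4.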

set.seed(11)
x4 <- round(rnorm(n,3,0.5),2)
x4
##   [1] 2.70 3.01 2.24 2.32 3.59 2.53 3.66 3.31 2.98 2.50 2.59 2.83 2.23 2.87 2.43
##  [16] 3.01 2.89 3.44 2.70 2.67 2.66 2.99 2.78 3.18 3.04 3.00 2.91 2.62 2.89 2.51
##  [31] 2.45 2.53 3.34 2.21 2.57 3.24 2.91 3.77 2.69 2.83 2.18 3.01 3.45 2.56 3.45
##  [46] 2.83 1.91 3.44 3.36 3.11 3.39 2.89 2.59 3.25 3.08 3.27 2.92 3.22 3.74 3.03
##  [61] 2.58 4.17 2.94 2.02 3.27 3.85 2.60 2.46 2.70 3.38 3.23 2.94 2.62 3.11 3.56
##  [76] 3.08 2.66 3.23 2.47 3.20 2.97 3.16 2.70 2.55 4.13 2.70 2.35 3.25 2.57 2.25
##  [91] 3.60 2.49 3.47 2.73 3.26 2.82 3.66 2.43 3.71 2.70
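
Since a GPA on a 4-point scale cannot exceed 4, one possible adjustment is to clamp the simulated values to the interval [0, 4]. This is a minimal sketch of that idea and not the approach used in the original analysis; x4_raw and x4_clamped are hypothetical names.

```r
## Sketch: clamp simulated GPAs to the valid range of a 4-point scale.
set.seed(11)
x4_raw <- round(rnorm(100, mean = 3, sd = 0.5), 2)  # same call as in the text

x4_clamped <- pmin(pmax(x4_raw, 0), 4)  # values below 0 become 0, above 4 become 4

sum(x4_raw > 4)      # how many raw draws fall outside the scale
range(x4_clamped)    # range after clamping
```

Clamping piles the out-of-range draws onto the boundary value 4. Redrawing values until they fall inside the range is another option if that point mass is undesirable.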

Generating Y

Setting the coefficients

b0 <- -10
b1 <- 3.5
b2 <- 0.5
b3 <- 2.7 
b4 <- 1.2
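set.seed(2) ## no random draw occurs before the next set.seed() call, so this has no effect
datapendukung <- b0+(b1*x1)+(b2*x2)+(b3*x3)+(b4*x4) ## linear predictor (log-odds)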
datapendukung
##   [1]  0.240 -2.888  9.388 -6.716  4.808  4.036 25.592  0.972 10.276 -3.000
##  [11] 13.308 35.096 -0.324  4.444 20.616 17.812 -3.032  1.128  0.240 17.404
##  [21] 10.392 17.788 10.536 21.516  4.148 -2.900 21.192 34.844 10.668  0.012
##  [31]  3.440 41.236  1.508 48.352 16.784 35.088 -2.508 14.724 76.928 -3.104
##  [41] -0.384 31.312 24.840 27.272 14.840  3.896 23.492 35.828 -1.968  0.732
##  [51]  1.568 -3.032 -2.892  1.400 10.396 -2.576 -2.496 -2.636 15.188 -2.864
##  [61]  3.596 15.204 52.728 16.624  4.424  2.120  4.120  3.452 -2.760 17.756
##  [71]  4.876  0.528  9.844 56.432 14.972 14.396 30.892 21.576 27.664 -5.660
##  [81]  4.564 13.992 41.440 66.260 -4.544 10.440 20.020 -2.100  0.584 19.900
##  [91] 39.020 -3.012  1.664  3.776 38.612  0.384 11.592 -3.084 -5.048 20.940
p <- exp(datapendukung)/(1+exp(datapendukung)) ## inverse logit: log-odds to probability
p
##   [1] 0.559713649 0.052749964 0.999916284 0.001209908 0.991901942 0.982638736
##   [7] 1.000000000 0.725517961 0.999965551 0.047425873 0.999998339 1.000000000
##  [13] 0.419701228 0.988387584 0.999999999 0.999999982 0.046000978 0.755469617
##  [19] 0.559713649 0.999999972 0.999969324 0.999999981 0.999973438 1.000000000
##  [25] 0.984449656 0.052153563 0.999999999 1.000000000 0.999976722 0.502999964
##  [31] 0.968931516 1.000000000 0.818764618 1.000000000 0.999999949 1.000000000
##  [37] 0.075299250 0.999999597 1.000000000 0.042942560 0.405162509 1.000000000
##  [43] 1.000000000 1.000000000 0.999999641 0.980081758 1.000000000 1.000000000
##  [49] 0.122603869 0.675244005 0.827498306 0.046000978 0.052550451 0.802183889
##  [55] 0.999969446 0.070699083 0.076139071 0.066857153 0.999999747 0.053962135
##  [61] 0.973299252 0.999999751 1.000000000 0.999999940 0.988155776 0.892831930
##  [67] 0.984015152 0.969290729 0.059524366 0.999999981 0.992430275 0.629016523
##  [73] 0.999946938 1.000000000 0.999999685 0.999999440 1.000000000 1.000000000
##  [79] 1.000000000 0.003470431 0.989687168 0.999999162 1.000000000 1.000000000
##  [85] 0.010518973 0.999970762 0.999999998 0.109096821 0.641987286 0.999999998
##  [91] 1.000000000 0.046886688 0.840774225 0.977599132 1.000000000 0.594837491
##  [97] 0.999990760 0.043772085 0.006381184 0.999999999
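
To make the two steps concrete, the calculation can be traced by hand for the first applicant, whose values in the vectors above are x1 = 2, x2 = 0, x3 = 0 and x4 = 2.70. The sketch below repeats that arithmetic; eta1 and p1 are hypothetical names used only here.

```r
## Sketch: linear predictor and probability for the first applicant.
b0 <- -10; b1 <- 3.5; b2 <- 0.5; b3 <- 2.7; b4 <- 1.2

## -10 + 3.5*2 + 0.5*0 + 2.7*0 + 1.2*2.70 = 0.24
eta1 <- b0 + b1 * 2 + b2 * 0 + b3 * 0 + b4 * 2.70
eta1                                 # matches datapendukung[1] = 0.240

p1 <- exp(eta1) / (1 + exp(eta1))    # inverse logit, as in the text
p1                                   # matches p[1], about 0.5597

all.equal(p1, plogis(eta1))          # plogis() is R's built-in inverse logit
```

With b1 = 3.5 per month of experience, the linear predictor grows quickly, which is why most of the probabilities printed above are very close to 1.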
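set.seed(3) ## fix the random seed for reproducibility
y <- rbinom(n,1,p) ## one Bernoulli draw per applicant with success probability p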
y
##   [1] 1 0 1 0 1 1 1 1 1 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0
##  [38] 1 1 0 0 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1
##  [75] 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 0 1
datagab <- data.frame(y,x1,x2,x3,x4) ## combine the response and predictors into one data frame
datagab
##     y x1 x2 x3   x4
## 1   1  2  0  0 2.70
## 2   0  1  0  0 3.01
## 3   1  4  0  1 2.24
## 4   0  0  1  0 2.32
## 5   1  3  0  0 3.59
## 6   1  3  1  0 2.53
## 7   1  8  1  1 3.66
## 8   1  2  0  0 3.31
## 9   1  4  0  1 2.98
## 10  0  1  1  0 2.50
## 11  1  5  0  1 2.59
## 12  1 11  1  1 2.83
## 13  0  2  0  0 2.23
## 14  1  3  1  0 2.87
## 15  1  7  1  1 2.43
## 16  1  6  1  1 3.01
## 17  0  1  0  0 2.89
## 18  1  2  0  0 3.44
## 19  0  2  0  0 2.70
## 20  1  6  1  1 2.67
## 21  1  4  1  1 2.66
## 22  1  6  1  1 2.99
## 23  1  4  1  1 2.78
## 24  1  7  1  1 3.18
## 25  1  3  0  0 3.04
## 26  0  1  0  0 3.00
## 27  1  7  1  1 2.91
## 28  1 11  1  1 2.62
## 29  1  4  1  1 2.89
## 30  0  2  0  0 2.51
## 31  1  3  0  0 2.45
## 32  1 13  0  1 2.53
## 33  1  2  1  0 3.34
## 34  1 15  1  1 2.21
## 35  1  6  0  1 2.57
## 36  1 11  0  1 3.24
## 37  0  1  1  0 2.91
## 38  1  5  0  1 3.77
## 39  1 23  1  1 2.69
## 40  0  1  0  0 2.83
## 41  0  2  0  0 2.18
## 42  1 10  0  1 3.01
## 43  1  8  0  1 3.45
## 44  1  9  0  1 2.56
## 45  1  5  1  1 3.45
## 46  1  3  0  0 2.83
## 47  1  8  1  1 1.91
## 48  1 11  1  1 3.44
## 49  0  1  1  0 3.36
## 50  1  2  0  0 3.11
## 51  1  2  1  0 3.39
## 52  0  1  0  0 2.89
## 53  0  1  1  0 2.59
## 54  1  2  1  0 3.25
## 55  1  4  0  1 3.08
## 56  0  1  0  0 3.27
## 57  1  1  1  0 2.92
## 58  0  1  0  0 3.22
## 59  1  5  1  1 3.74
## 60  0  1  0  0 3.03
## 61  1  3  0  0 2.58
## 62  1  5  0  1 4.17
## 63  1 16  1  1 2.94
## 64  1  6  1  1 2.02
## 65  1  3  0  0 3.27
## 66  1  2  1  0 3.85
## 67  1  3  1  0 2.60
## 68  1  3  0  0 2.46
## 69  0  1  1  0 2.70
## 70  1  6  0  1 3.38
## 71  1  3  1  0 3.23
## 72  1  2  0  0 2.94
## 73  1  4  0  1 2.62
## 74  1 17  1  1 3.11
## 75  1  5  1  1 3.56
## 76  1  5  1  1 3.08
## 77  1 10  0  1 2.66
## 78  1  7  1  1 3.23
## 79  1  9  1  1 2.47
## 80  0  0  1  0 3.20
## 81  1  3  1  0 2.97
## 82  1  5  0  1 3.16
## 83  1 13  0  1 2.70
## 84  1 20  1  1 2.55
## 85  0  0  1  0 4.13
## 86  1  4  1  1 2.70
## 87  1  7  0  1 2.35
## 88  0  1  1  0 3.25
## 89  0  2  1  0 2.57
## 90  1  7  0  1 2.25
## 91  1 12  0  1 3.60
## 92  0  1  1  0 2.49
## 93  0  2  1  0 3.47
## 94  1  3  0  0 2.73
## 95  1 12  0  1 3.26
## 96  1  2  0  0 2.82
## 97  1  4  1  1 3.66
## 98  0  1  1  0 2.43
## 99  0  0  1  0 3.71
## 100 1  7  1  1 2.70

Logistic Regression Analysis

modelreglog <- glm(y~x1+x2+x3+x4, family = binomial(link = "logit"), data=datagab)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(modelreglog)
## 
## Call:
## glm(formula = y ~ x1 + x2 + x3 + x4, family = binomial(link = "logit"), 
##     data = datagab)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -17.6497     6.0340  -2.925 0.003444 ** 
## x1             4.1203     1.1436   3.603 0.000315 ***
## x2            -0.3883     1.2880  -0.302 0.763030    
## x3            11.6475  3735.6328   0.003 0.997512    
## x4             3.5062     1.7119   2.048 0.040553 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 114.611  on 99  degrees of freedom
## Residual deviance:  22.672  on 95  degrees of freedom
## AIC: 32.672
## 
## Number of Fisher Scoring iterations: 21
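
The warning about fitted probabilities of 0 or 1, the very large standard error for x3 (3735.6) and the 21 Fisher scoring iterations are typical signs of quasi-complete separation. In the data frame printed earlier, every applicant with x3 = 1 has y = 1, which is a likely cause. The sketch below rebuilds the data with the same seeds and cross-tabulates x3 against y; all names ending in _chk, and tab, are hypothetical and used only for this check.

```r
## Sketch: rebuild the simulated data and look for an empty cell in x3 vs y.
n_chk <- 100
set.seed(100); u_chk  <- runif(n_chk)
x1_chk <- round(60 * (-(log(1 - u_chk) / 12)))
set.seed(322); x2_chk <- round(runif(n_chk))
set.seed(100); x3_chk <- round(runif(n_chk))
set.seed(11);  x4_chk <- round(rnorm(n_chk, 3, 0.5), 2)

## Same coefficients and inverse-logit expression as in the text.
eta_chk <- -10 + 3.5 * x1_chk + 0.5 * x2_chk + 2.7 * x3_chk + 1.2 * x4_chk
p_chk   <- exp(eta_chk) / (1 + exp(eta_chk))
set.seed(3);   y_chk  <- rbinom(n_chk, 1, p_chk)

## A zero count for (x3 = 1, y = 0) means x3 = 1 predicts acceptance perfectly.
tab <- table(x3 = x3_chk, y = y_chk)
tab
```

When such a cell is empty, the maximum likelihood estimate for that coefficient is not finite, so the reported estimate and p-value for x3 should not be interpreted at face value.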