Generating the Data

Scenario

Y: decision to reject or accept a job applicant at PT A for position B
X1: length of previous work experience (months)
X2: current employment status (0: not working, 1: working)
X3: education level (0: high school graduate, 1: university graduate)
X4: GPA (IPK, 4-point scale)

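Generating X1

X1: length of previous work experience (months). X1 is generated for 100 applicants, with an intended range of 0 to 60 months and a central value of 12. The expression used below is an inverse-transform draw from an exponential distribution, -log(1-u)/12, multiplied by 60, so the simulated values have a mean of about 60/12 = 5 months and are not capped at 60.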

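set.seed(100)
n <- 100        # number of applicants
u <- runif(n)   # uniform draws used for inverse-transform sampling

# Exponential draw via inverse transform, scaled by 60 and rounded to whole months
X1 <- round(60*(-log(1-u)/12))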
X1
##   [1]  2  1  4  0  3  3  8  2  4  1  5 11  2  3  7  6  1  2  2  6  4  6  4  7  3
##  [26]  1  7 11  4  2  3 13  2 15  6 11  1  5 23  1  2 10  8  9  5  3  8 11  1  2
##  [51]  2  1  1  2  4  1  1  1  5  1  3  5 16  6  3  2  3  3  1  6  3  2  4 17  5
##  [76]  5 10  7  9  0  3  5 13 20  0  4  7  1  2  7 12  1  2  3 12  2  4  1  0  7
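The inverse-transform step can be checked on a larger sample. The sketch below is only an illustration: `n_sim`, `x_sim` and `x_alt` are hypothetical names, and the alternative generator at the end is an assumption about what "a central value of 12, capped at 60" could look like, not the method used above.

```r
# Inverse-transform sampling: if U ~ Uniform(0, 1), then -log(1 - U) / rate
# follows an exponential distribution with mean 1 / rate.
set.seed(100)
n_sim <- 1e5                      # hypothetical large sample, used only to check the mean
u_sim <- runif(n_sim)

rate   <- 12                      # divisor in the original expression
scale_ <- 60                      # multiplier in the original expression
x_sim  <- scale_ * (-log(1 - u_sim) / rate)

mean(x_sim)                       # close to scale_ / rate = 5

# Hypothetical alternative: exponential with mean 12 months, capped at 60
x_alt <- pmin(round(rexp(n_sim, rate = 1 / 12)), 60)
mean(x_alt)
max(x_alt)
```

With the original expression, the sample mean sits near 5 months, which matches the small values printed for X1.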

Generating X2

X2: current employment status, coded 0 = not working and 1 = working.

set.seed(157)
X2 <- round(runif(n))
X2
##   [1] 1 0 1 1 1 1 1 0 1 1 0 0 0 1 0 0 1 0 1 1 0 1 1 0 1 1 1 0 0 1 1 1 0 1 0 0 1
##  [38] 0 1 0 1 0 1 1 1 0 1 0 1 1 0 0 0 0 1 0 0 1 0 1 1 1 0 1 0 0 1 0 0 1 0 0 0 0
##  [75] 1 1 1 1 0 0 1 1 0 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 0 0
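Rounding a uniform draw gives 0 or 1 with probability 0.5 each. A minimal sketch of the same idea with the probability made explicit is shown below; `prob_employed` and `X2_alt` are hypothetical names, and the draws will differ from the values printed above even with the same seed.

```r
set.seed(157)
n <- 100
prob_employed <- 0.5   # assumed; this is what round(runif(n)) implies

# One Bernoulli draw per applicant: 1 = working, 0 = not working
X2_alt <- rbinom(n, size = 1, prob = prob_employed)
table(X2_alt)
```

Writing the probability as a named value makes it easy to simulate an unbalanced share of employed applicants later.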

Generating X3

X3: education level, coded 0 = high school graduate (no university degree) and 1 = university graduate.

set.seed(111)
X3 <- round(runif(n))
X3
##   [1] 1 1 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 1 0 1 0 1 1 0 1 0 0 0 1 0
##  [38] 1 1 1 1 0 1 1 1 0 0 1 1 1 1 1 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0
##  [75] 1 0 0 1 0 1 0 1 0 0 0 0 1 1 1 1 1 1 1 1 1 0 1 0 1 1

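Generating X4

X4 is the applicant's GPA (IPK) on a 4-point scale, drawn from a normal distribution with mean 3 and standard deviation 0.5 and rounded to two decimals. The draw is not truncated, so a few simulated values exceed 4 (for example 4.08).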

set.seed(123)
X4 <- round(rnorm(n,3,0.5),2)
X4
##   [1] 2.72 2.88 3.78 3.04 3.06 3.86 3.23 2.37 2.66 2.78 3.61 3.18 3.20 3.06 2.72
##  [16] 3.89 3.25 2.02 3.35 2.76 2.47 2.89 2.49 2.64 2.69 2.16 3.42 3.08 2.43 3.63
##  [31] 3.21 2.85 3.45 3.44 3.41 3.34 3.28 2.97 2.85 2.81 2.65 2.90 2.37 4.08 3.60
##  [46] 2.44 2.80 2.77 3.39 2.96 3.13 2.99 2.98 3.68 2.89 3.76 2.23 3.29 3.06 3.11
##  [61] 3.19 2.75 2.83 2.49 2.46 3.15 3.22 3.03 3.46 4.03 2.75 1.85 3.50 2.65 2.66
##  [76] 3.51 2.86 2.39 3.09 2.93 3.00 3.19 2.81 3.32 2.89 3.17 3.55 3.22 2.84 3.57
##  [91] 3.50 3.27 3.12 2.69 3.68 2.70 4.09 3.77 2.88 2.49
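Because `rnorm` is unbounded, one simple way to keep a simulated GPA inside the 0 to 4 scale is to clamp it. The sketch below is an assumption about how that could be done, not part of the original simulation; `gpa_raw` and `X4_capped` are hypothetical names.

```r
set.seed(123)
n <- 100
gpa_raw <- rnorm(n, mean = 3, sd = 0.5)

# Clamp to the valid GPA range [0, 4], then round to two decimals
X4_capped <- round(pmin(pmax(gpa_raw, 0), 4), 2)
range(X4_capped)
```

Clamping piles the out-of-range draws onto exactly 4.00. A truncated normal would avoid that but needs either rejection sampling or an extra package.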

Generating Y

Setting the coefficients

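# Coefficients of the linear predictor: intercept b0 and slopes b1..b4
b0 <- -11
b1 <- 3.5
b2 <- 0.5
b3 <- 2.7
b4 <- 3.2
set.seed(1)  # has no effect here: nothing random is drawn before set.seed(2)
# Linear predictor (log-odds of y = 1) for each applicant
datapendukung <- b0+(b1*X1)+(b2*X2)+(b3*X3)+(b4*X4)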
datapendukung
##   [1]  7.904  4.416 15.596  1.928  9.792 12.352 27.836  6.284 12.012  1.896
##  [11] 20.752 40.376  6.240  9.792 22.204 22.448  3.400  5.164  7.220 22.032
##  [21] 10.904 19.748 11.468 21.948 11.308 -0.088 27.644 37.356 13.476 10.816
##  [31] 10.272 46.820  7.040 53.008 20.912 40.888  3.496 18.704 81.820  4.192
##  [41]  7.680 33.280 27.784 36.756 21.220  7.308 26.460 39.064  6.548  8.672
##  [51]  8.716  4.768  2.036  7.776 15.448  7.232  2.336  3.528 16.292  5.652
##  [61] 10.208 15.800 54.056 18.468 10.072  6.080 10.304  9.196  3.572 23.396
##  [71] 11.000  4.620 14.200 56.980 18.212 18.232 33.652 24.348 30.388  1.076
##  [81]  9.600 19.908 43.492 70.124 -1.252 13.644 28.060  6.004  8.288 27.624
##  [91] 44.900  5.664  8.684 10.808 45.476  4.640 19.288  5.064  0.916 24.168
# Inverse logit: convert the log-odds into probabilities of y = 1
p <- exp(datapendukung)/(1+exp(datapendukung))
p
##   [1] 0.9996309 0.9880618 0.9999998 0.8730279 0.9999441 0.9999957 1.0000000
##   [8] 0.9981376 0.9999939 0.8694381 1.0000000 1.0000000 0.9980539 0.9999441
##  [15] 1.0000000 1.0000000 0.9677045 0.9943137 0.9992687 1.0000000 0.9999816
##  [22] 1.0000000 0.9999895 1.0000000 0.9999877 0.4780142 1.0000000 1.0000000
##  [29] 0.9999986 0.9999799 0.9999654 1.0000000 0.9991246 1.0000000 1.0000000
##  [36] 1.0000000 0.9705737 1.0000000 1.0000000 0.9851091 0.9995382 1.0000000
##  [43] 1.0000000 1.0000000 1.0000000 0.9993303 1.0000000 1.0000000 0.9985691
##  [50] 0.9998287 0.9998361 0.9915742 0.8845253 0.9995805 0.9999998 0.9992774
##  [57] 0.9118150 0.9714740 0.9999999 0.9965018 0.9999631 0.9999999 1.0000000
##  [64] 1.0000000 0.9999578 0.9977170 0.9999665 0.9998986 0.9726684 1.0000000
##  [71] 0.9999833 0.9902433 0.9999993 1.0000000 1.0000000 1.0000000 1.0000000
##  [78] 1.0000000 1.0000000 0.7457363 0.9999323 1.0000000 1.0000000 1.0000000
##  [85] 0.2223541 0.9999988 1.0000000 0.9975372 0.9997485 1.0000000 1.0000000
##  [92] 0.9965434 0.9998308 0.9999798 1.0000000 0.9904347 1.0000000 0.9937195
##  [99] 0.7142264 1.0000000
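The inverse logit written with `exp()` works for the values above, but it breaks down for very large linear predictors because `exp()` overflows. Base R's `plogis()` computes the same function in a numerically stable way. In the sketch below, the first four values of `eta` are taken from the linear predictor printed above, and 800 is a hypothetical extreme added only to show the overflow.

```r
eta <- c(-1.252, 0.916, 7.904, 81.82, 800)

p_manual <- exp(eta) / (1 + exp(eta))   # manual inverse logit
p_stable <- plogis(eta)                 # built-in logistic distribution function

all.equal(p_manual[1:4], p_stable[1:4]) # the two agree for moderate values
is.nan(p_manual[5])                     # exp(800) is Inf, and Inf / Inf is NaN
p_stable[5]                             # plogis stays well defined
```

A `NaN` probability makes `rbinom()` return `NA`, so `plogis()` is the safer choice whenever the linear predictor can become large.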
set.seed(2)
# Draw one Bernoulli outcome per applicant with success probability p
y <- rbinom(n,1,p)
y
##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [75] 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1
datagab <- data.frame(y,X1,X2,X3,X4)
datagab
##     y X1 X2 X3   X4
## 1   1  2  1  1 2.72
## 2   1  1  0  1 2.88
## 3   1  4  1  0 3.78
## 4   1  0  1  1 3.04
## 5   1  3  1  0 3.06
## 6   1  3  1  0 3.86
## 7   1  8  1  0 3.23
## 8   1  2  0  1 2.37
## 9   1  4  1  0 2.66
## 10  1  1  1  0 2.78
## 11  1  5  0  1 3.61
## 12  1 11  0  1 3.18
## 13  1  2  0  0 3.20
## 14  1  3  1  0 3.06
## 15  1  7  0  0 2.72
## 16  1  6  0  0 3.89
## 17  1  1  1  0 3.25
## 18  1  2  0  1 2.02
## 19  1  2  1  0 3.35
## 20  1  6  1  1 2.76
## 21  1  4  0  0 2.47
## 22  1  6  1  0 2.89
## 23  1  4  1  0 2.49
## 24  1  7  0  0 2.64
## 25  1  3  1  1 2.69
## 26  0  1  1  0 2.16
## 27  1  7  1  1 3.42
## 28  1 11  0  0 3.08
## 29  1  4  0  1 2.43
## 30  1  2  1  1 3.63
## 31  1  3  1  0 3.21
## 32  1 13  1  1 2.85
## 33  1  2  0  0 3.45
## 34  1 15  1  0 3.44
## 35  1  6  0  0 3.41
## 36  1 11  0  1 3.34
## 37  1  1  1  0 3.28
## 38  1  5  0  1 2.97
## 39  1 23  1  1 2.85
## 40  1  1  0  1 2.81
## 41  1  2  1  1 2.65
## 42  1 10  0  0 2.90
## 43  1  8  1  1 2.37
## 44  1  9  1  1 4.08
## 45  1  5  1  1 3.60
## 46  1  3  0  0 2.44
## 47  1  8  1  0 2.80
## 48  1 11  0  1 2.77
## 49  1  1  1  1 3.39
## 50  1  2  1  1 2.96
## 51  1  2  0  1 3.13
## 52  1  1  0  1 2.99
## 53  0  1  0  0 2.98
## 54  1  2  0  0 3.68
## 55  1  4  1  1 2.89
## 56  1  1  0  1 3.76
## 57  1  1  0  1 2.23
## 58  1  1  1  0 3.29
## 59  1  5  0  0 3.06
## 60  1  1  1  1 3.11
## 61  1  3  1  0 3.19
## 62  1  5  1  0 2.75
## 63  1 16  0  0 2.83
## 64  1  6  1  0 2.49
## 65  1  3  0  1 2.46
## 66  1  2  0  0 3.15
## 67  1  3  1  0 3.22
## 68  1  3  0  0 3.03
## 69  1  1  0  0 3.46
## 70  1  6  1  0 4.03
## 71  1  3  0  1 2.75
## 72  1  2  0  1 1.85
## 73  1  4  0  0 3.50
## 74  1 17  0  0 2.65
## 75  1  5  1  1 2.66
## 76  1  5  1  0 3.51
## 77  1 10  1  0 2.86
## 78  1  7  1  1 2.39
## 79  1  9  0  0 3.09
## 80  1  0  0  1 2.93
## 81  1  3  1  0 3.00
## 82  1  5  1  1 3.19
## 83  1 13  0  0 2.81
## 84  1 20  1  0 3.32
## 85  0  0  1  0 2.89
## 86  1  4  1  0 3.17
## 87  1  7  1  1 3.55
## 88  1  1  1  1 3.22
## 89  1  2  1  1 2.84
## 90  1  7  0  1 3.57
## 91  1 12  0  1 3.50
## 92  1  1  0  1 3.27
## 93  1  2  0  1 3.12
## 94  1  3  0  1 2.69
## 95  1 12  0  1 3.68
## 96  1  2  0  0 2.70
## 97  1  4  1  1 4.09
## 98  1  1  1  0 3.77
## 99  0  0  0  1 2.88
## 100 1  7  0  1 2.49

Logistic Regression Analysis

modelreglog <- glm(y ~ X1 + X2 + X3 + X4, family = binomial(link="logit"),data=datagab)
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(modelreglog)
## 
## Call:
## glm(formula = y ~ X1 + X2 + X3 + X4, family = binomial(link = "logit"), 
##     data = datagab)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -2437.7   158954.0  -0.015    0.988
## X1             422.5    21440.9   0.020    0.984
## X2             371.6   103370.5   0.004    0.997
## X3             698.3   105741.9   0.007    0.995
## X4             598.8    30550.1   0.020    0.984
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3.3589e+01  on 99  degrees of freedom
## Residual deviance: 1.2846e-06  on 95  degrees of freedom
## AIC: 10
## 
## Number of Fisher Scoring iterations: 25
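The two warnings, the near-zero residual deviance and the very large standard errors are what complete separation looks like. Only 4 of the 100 responses are 0, and the predictors split them perfectly from the rest, so the maximum likelihood estimates do not exist as finite values. The sketch below shows one way to obtain a fit that converges: simulate with much smaller coefficients so that the probabilities are not pinned near 1. All numbers in it (the sample size and every coefficient) are hypothetical choices for illustration, and the `sim` data frame is a made-up name.

```r
set.seed(1)
n_sim <- 500                                        # hypothetical sample size

sim <- data.frame(
  X1 = round(60 * (-log(1 - runif(n_sim)) / 12)),  # same X1 generator as above
  X2 = rbinom(n_sim, 1, 0.5),
  X3 = rbinom(n_sim, 1, 0.5),
  X4 = round(rnorm(n_sim, 3, 0.5), 2)
)

# Hypothetical, much milder coefficients than b0..b4 above
eta_sim <- -2 + 0.1 * sim$X1 + 0.5 * sim$X2 + 0.7 * sim$X3 + 0.5 * sim$X4
sim$y   <- rbinom(n_sim, 1, plogis(eta_sim))

table(sim$y)                                        # both classes are well represented
fit <- glm(y ~ X1 + X2 + X3 + X4,
           family = binomial(link = "logit"), data = sim)
fit$converged
summary(fit)$coefficients
```

With both outcomes well represented, the estimates and standard errors stay on a plausible scale and can be compared with the coefficients used to generate the data.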