STATISTICAL MODELING AND SIMULATION

Simulating and Predicting Employee Performance


GENERATING THE DATA


set.seed(123)
n <- 1000

Generate the predictor variables with base R random-number functions

pengalaman <- sample(1:10, n, replace = TRUE)
jam_meeting <- rpois(n, lambda = 3)
jam_fokus   <- rnorm(n, mean = 5, sd = 1)
skor_stres  <- rnorm(n, mean = 4, sd = 1)

Compute the response variable Y (work output) as a simple linear combination of the predictors

Formula: 10 + (2 * focus) + (1 * experience) - (1.5 * stress) + a small random error drawn from N(0, 1)

output_pr <- 10 + (2 * jam_fokus) + (1 * pengalaman) - (1.5 * skor_stres) + rnorm(n, 0, 1)
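
Because every input distribution is known, the theoretical mean and variance of the output can be derived and compared with the simulated values. The following is a minimal sketch of that check; the names `nilai_harapan` and `varians_teori` are illustrative and do not appear in the original analysis, and the terms are assumed to be independent, as they are in the simulation.

```r
# Sketch: compare the simulated output with its theoretical mean and variance.
set.seed(123)
n <- 1000

# Same generation order as the analysis, so the random stream matches
pengalaman  <- sample(1:10, n, replace = TRUE)
jam_meeting <- rpois(n, lambda = 3)   # not used in the formula, kept for RNG order
jam_fokus   <- rnorm(n, mean = 5, sd = 1)
skor_stres  <- rnorm(n, mean = 4, sd = 1)
output_pr   <- 10 + (2 * jam_fokus) + (1 * pengalaman) - (1.5 * skor_stres) + rnorm(n, 0, 1)

# E[Y] = 10 + 2*E[fokus] + 1*E[pengalaman] - 1.5*E[stres]
# pengalaman is uniform on 1..10, so its mean is 5.5
nilai_harapan <- 10 + 2 * 5 + 1 * mean(1:10) - 1.5 * 4

# Var[Y] = 2^2*Var[fokus] + 1^2*Var[pengalaman] + 1.5^2*Var[stres] + Var[error]
# (hypothetical helper quantity; assumes independent terms)
var_pengalaman <- mean((1:10 - mean(1:10))^2)   # population variance of uniform 1..10
varians_teori  <- 2^2 * 1 + 1^2 * var_pengalaman + 1.5^2 * 1 + 1

cat("Theoretical mean:", nilai_harapan, " simulated mean:", mean(output_pr), "\n")
cat("Theoretical var :", varians_teori, " simulated var :", var(output_pr), "\n")
```

With n = 1000 the simulated values should land close to the theoretical ones, which is a quick way to confirm that the generating formula was typed as intended.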

Combine everything into a single data frame (Week 2)

data_tugas <- data.frame(
  PR = output_pr,
  Fokus = jam_fokus,
  Pengalaman = pengalaman,
  Stres = skor_stres,
  Meeting = jam_meeting
)

Display the generated data

head(data_tugas, 10)
##          PR    Fokus Pengalaman    Stres Meeting
## 1  12.88461 3.705186          3 3.676400       1
## 2  12.89682 3.933967          3 5.648590       1
## 3  23.38187 5.163432         10 4.370831       1
## 4  17.65214 5.915459          2 4.081219       4
## 5  18.15865 4.706236          6 5.067810       3
## 6  22.71264 5.829991          5 2.685822       3
## 7  14.00201 3.740678          4 4.975007       2
## 8  23.57910 6.003169          6 3.139392       3
## 9  26.19364 5.762974          9 2.215193       2
## 10 22.01385 4.466761         10 4.229506       4
tail(data_tugas, 10)
##            PR    Fokus Pengalaman    Stres Meeting
## 991  19.86068 4.811801          7 3.371361       3
## 992  15.06841 3.421740          2 2.390454       2
## 993  28.63820 6.190113         10 3.044008       3
## 994  21.22652 6.780184          5 4.214192       1
## 995  17.97547 6.055045          3 4.543186       0
## 996  17.66502 6.443759          2 4.518524       0
## 997  14.56406 4.408353          6 5.277077       1
## 998  20.70227 6.414192          2 3.288898       2
## 999  16.12683 5.446844          3 4.518906       4
## 1000 12.50198 3.791128          3 4.487838       4

PARAMETER ESTIMATION


Compute the mean and its 95% confidence interval manually, using the t distribution

Note: the prefix ki in the variable names stands for "interval kepercayaan" (confidence interval)

rata_pr <- mean(data_tugas$PR)
sd_pr   <- sd(data_tugas$PR)
se      <- sd_pr / sqrt(n)
t_tabel <- qt(0.975, df = n - 1)
ki_bawah <- rata_pr - t_tabel * se
ki_atas  <- rata_pr + t_tabel * se
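
The manual calculation above can be cross-checked against base R's `t.test()`, which reports the same t-based interval for a mean. The sketch below wraps the manual steps in a hypothetical helper, `ki_rata2()`, and runs it on stand-in data rather than on `data_tugas$PR`; both the helper name and the stand-in vector are assumptions made for illustration.

```r
# Sketch: the manual t interval should match the one reported by t.test().
set.seed(123)
x <- rnorm(1000, mean = 19.5, sd = 4)   # stand-in data, NOT the document's PR column

# Hypothetical helper that repeats the manual steps for any numeric vector
ki_rata2 <- function(x, tingkat = 0.95) {
  n       <- length(x)
  se      <- sd(x) / sqrt(n)                        # standard error of the mean
  t_tabel <- qt(1 - (1 - tingkat) / 2, df = n - 1)  # two-sided critical value
  c(bawah = mean(x) - t_tabel * se,
    atas  = mean(x) + t_tabel * se)
}

manual <- ki_rata2(x)
bawaan <- t.test(x, conf.level = 0.95)$conf.int    # built-in interval

print(manual)
print(as.numeric(bawaan))
print(all.equal(unname(manual), as.numeric(bawaan)))
```

In the analysis itself, `t.test(data_tugas$PR)$conf.int` would be the corresponding one-line check.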

Display the estimation results

print("Estimated mean of PR Approved:")
## [1] "Estimated mean of PR Approved:"
print(rata_pr)
## [1] 19.62597
print("Estimated standard error:")
## [1] "Estimated standard error:"
print(se)
## [1] 0.1240028
print("95% confidence interval:")
## [1] "95% confidence interval:"
cat("[", ki_bawah, "to", ki_atas, "]")
## [ 19.38263 to 19.8693 ]

MODELING (Linear Regression)


Fit the model with lm() (linear model); Meeting is left out because it does not enter the formula for PR

model <- lm(PR ~ Fokus + Pengalaman + Stres, data = data_tugas)

Display the model summary

summary(model)
## 
## Call:
## lm(formula = PR ~ Fokus + Pengalaman + Stres, data = data_tugas)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.06109 -0.68065 -0.00816  0.68049  2.83085 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.69248    0.21614   44.84   <2e-16 ***
## Fokus        2.00498    0.03056   65.60   <2e-16 ***
## Pengalaman   1.01618    0.01080   94.07   <2e-16 ***
## Stres       -1.46250    0.03217  -45.46   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9781 on 996 degrees of freedom
## Multiple R-squared:  0.938,  Adjusted R-squared:  0.9378 
## F-statistic:  5020 on 3 and 996 DF,  p-value: < 2.2e-16
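
Since the data were simulated, the true coefficients (10, 2, 1 and -1.5) are known, so the fit can be judged by checking whether each one falls inside its 95% confidence interval. The following sketch rebuilds the data and the model so that it runs on its own; the names `nilai_benar`, `ik` and `perbandingan` are illustrative and not part of the original analysis.

```r
# Sketch: compare estimated coefficients with the true simulation values.
set.seed(123)
n <- 1000
pengalaman  <- sample(1:10, n, replace = TRUE)
jam_meeting <- rpois(n, lambda = 3)
jam_fokus   <- rnorm(n, mean = 5, sd = 1)
skor_stres  <- rnorm(n, mean = 4, sd = 1)
output_pr   <- 10 + (2 * jam_fokus) + (1 * pengalaman) - (1.5 * skor_stres) + rnorm(n, 0, 1)

data_tugas <- data.frame(PR = output_pr, Fokus = jam_fokus,
                         Pengalaman = pengalaman, Stres = skor_stres,
                         Meeting = jam_meeting)
model <- lm(PR ~ Fokus + Pengalaman + Stres, data = data_tugas)

# True values used in the generating formula (hypothetical variable name)
nilai_benar <- c("(Intercept)" = 10, Fokus = 2, Pengalaman = 1, Stres = -1.5)
ik <- confint(model, level = 0.95)   # 95% intervals for each coefficient

perbandingan <- data.frame(
  benar    = nilai_benar,
  estimasi = coef(model)[names(nilai_benar)],
  bawah    = ik[names(nilai_benar), 1],
  atas     = ik[names(nilai_benar), 2]
)
# TRUE when the interval covers the true value
perbandingan$tercakup <- with(perbandingan, benar >= bawah & benar <= atas)
print(perbandingan)
```

The residual standard error is worth reading the same way: the simulation adds N(0, 1) noise, so a value near 1 is what a correctly specified model should report.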

BASIC VISUALIZATION


hist(data_tugas$PR, main = "Distribution of PR", col = "lightblue", xlab = "Number of PRs")
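
As an optional extension, the histogram can be tied back to the estimation step by marking the sample mean and the 95% confidence bounds on it. This is only a sketch: it regenerates the PR values so that it runs standalone, and the line colors and styles are arbitrary choices rather than anything specified in the original.

```r
# Sketch: histogram of PR with the sample mean and 95% CI bounds marked.
set.seed(123)
n <- 1000
pengalaman  <- sample(1:10, n, replace = TRUE)
jam_meeting <- rpois(n, lambda = 3)   # kept only to preserve the RNG order
jam_fokus   <- rnorm(n, mean = 5, sd = 1)
skor_stres  <- rnorm(n, mean = 4, sd = 1)
pr <- 10 + (2 * jam_fokus) + (1 * pengalaman) - (1.5 * skor_stres) + rnorm(n, 0, 1)

# Same manual interval as in the estimation section
rata_pr  <- mean(pr)
se       <- sd(pr) / sqrt(n)
t_tabel  <- qt(0.975, df = n - 1)
ki_bawah <- rata_pr - t_tabel * se
ki_atas  <- rata_pr + t_tabel * se

h <- hist(pr, main = "Distribution of PR", col = "lightblue", xlab = "Number of PRs")
abline(v = rata_pr, col = "red", lwd = 2)                 # sample mean
abline(v = c(ki_bawah, ki_atas), col = "darkred", lty = 2) # 95% CI bounds
```

The dashed lines sit very close to the mean because the interval describes uncertainty about the average, not the spread of individual observations.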