# ============================================================# LOGIT DAN PROBIT — TEORI & VISUALISASI LENGKAP# Siap salin dan jalankan di R / RStudio# ============================================================# Paket yang dibutuhkan:# install.packages(c("ggplot2", "dplyr", "gridExtra",# "margins", "pROC", "tidyr"))# ============================================================library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.5.3
library(dplyr)
Warning: package 'dplyr' was built under R version 4.5.3
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(gridExtra)
Warning: package 'gridExtra' was built under R version 4.5.3
Attaching package: 'gridExtra'
The following object is masked from 'package:dplyr':
combine
library(margins)
Warning: package 'margins' was built under R version 4.5.3
library(pROC)
Warning: package 'pROC' was built under R version 4.5.3
Type 'citation("pROC")' for a citation.
Attaching package: 'pROC'
The following objects are masked from 'package:stats':
cov, smooth, var
library(tidyr)
Warning: package 'tidyr' was built under R version 4.5.3
# ─────────────────────────────────────────────# 0. TEMA GRAFIK SERAGAM# ─────────────────────────────────────────────tema <-theme_minimal(base_size =13) +theme(plot.title =element_text(face ="bold", size =14),plot.subtitle =element_text(color ="grey50", size =11),axis.title =element_text(size =11),legend.position ="bottom",panel.grid.minor =element_blank() )# ============================================================# BAGIAN 1: FUNGSI LINK — LOGIT vs PROBIT# ============================================================z <-seq(-5, 5, length.out =300)df_link <-data.frame(z =rep(z, 2),P =c(plogis(z), # fungsi logistikpnorm(z)), # CDF normal standarModel =rep(c("Logit (Logistik)", "Probit (Normal CDF)"), each =300))p1 <-ggplot(df_link, aes(x = z, y = P, color = Model, linetype = Model)) +geom_line(linewidth =1.2) +geom_hline(yintercept =c(0, 1), linetype ="dotted", color ="grey70") +geom_vline(xintercept =0, linetype ="dotted", color ="grey70") +scale_color_manual(values =c("Logit (Logistik)"="#5046A8","Probit (Normal CDF)"="#0D6B50")) +scale_linetype_manual(values =c("Logit (Logistik)"="solid","Probit (Normal CDF)"="dashed")) +labs(title ="Fungsi Link: Logit vs Probit",subtitle ="Kedua kurva sangat mirip di tengah; Logit memiliki ekor yang sedikit lebih tebal",x ="Indeks linear z = β₀ + β₁X₁ + …",y ="P(Y = 1)",color =NULL, linetype =NULL ) + temaprint(p1)
# ============================================================# BAGIAN 2: PERBANDINGAN EKOR (TAIL BEHAVIOR)# ============================================================df_tail <-data.frame(z =rep(z, 2),pdf =c(dlogis(z), # PDF distribusi logistikdnorm(z)), # PDF normal standarModel =rep(c("Logistik (Logit)", "Normal (Probit)"), each =300))p2 <-ggplot(df_tail, aes(x = z, y = pdf, color = Model, linetype = Model)) +geom_line(linewidth =1.2) +scale_color_manual(values =c("Logistik (Logit)"="#5046A8","Normal (Probit)"="#0D6B50")) +scale_linetype_manual(values =c("Logistik (Logit)"="solid","Normal (Probit)"="dashed")) +labs(title ="PDF Distribusi Error Asumsi",subtitle ="Distribusi logistik memiliki varians π²/3 ≈ 3.29; normal standar = 1",x ="z", y ="Densitas",color =NULL, linetype =NULL ) + temaprint(p2)
# ============================================================# BAGIAN 3: SIMULASI DATA & ESTIMASI MODEL# ============================================================set.seed(42)n <-500X1 <-rnorm(n, mean =0, sd =1) # variabel kontinuX2 <-rbinom(n, 1, 0.5) # variabel dummy (0/1)# Probabilitas sejati menggunakan model logistikz_true <--0.5+1.2* X1 +0.8* X2p_true <-plogis(z_true)Y <-rbinom(n, 1, p_true) # variabel dependen binerdf <-data.frame(Y = Y, X1 = X1, X2 =factor(X2, labels =c("Grup A", "Grup B")))cat("\n===== RINGKASAN DATA =====\n")
# ============================================================# BAGIAN 4: GRAFIK KOEFISIEN (COEFFICIENT PLOT)# ============================================================df_koef <-data.frame(term =rep(names(coef(model_logit))[-1], 2), # hapus interceptest =c(coef(model_logit)[-1], coef(model_probit)[-1]),lower =c(confint(model_logit)[-1, 1], confint(model_probit)[-1, 1]),upper =c(confint(model_logit)[-1, 2], confint(model_probit)[-1, 2]),Model =rep(c("Logit", "Probit"), each =2))
Waiting for profiling to be done...
Waiting for profiling to be done...
Waiting for profiling to be done...
Waiting for profiling to be done...
p3 <-ggplot(df_koef, aes(x = est, y = term, color = Model, xmin = lower, xmax = upper)) +geom_vline(xintercept =0, linetype ="dashed", color ="grey60") +geom_errorbarh(height =0.2, linewidth =0.8, position =position_dodge(0.5)) +geom_point(size =3.5, position =position_dodge(0.5)) +scale_color_manual(values =c(Logit ="#5046A8", Probit ="#0D6B50")) +labs(title ="Estimasi Koefisien + Interval Kepercayaan 95%",subtitle ="Skala berbeda karena distribusi error berbeda (Logit ≈ 1.8 × Probit)",x ="Nilai Koefisien", y =NULL, color =NULL ) + tema
Warning: `geom_errorbarh()` was deprecated in ggplot2 4.0.0.
ℹ Please use the `orientation` argument of `geom_errorbar()` instead.
print(p3)
`height` was translated to `width`.
# ============================================================# BAGIAN 5: MARGINAL EFFECTS# ============================================================# Average Marginal Effects (AME)ame_logit <-margins(model_logit)ame_probit <-margins(model_probit)cat("\n===== AVERAGE MARGINAL EFFECTS — LOGIT =====\n")
===== AVERAGE MARGINAL EFFECTS — LOGIT =====
print(summary(ame_logit))
factor AME SE z p lower upper
X1 0.2199 0.0174 12.6080 0.0000 0.1857 0.2540
X2Grup B 0.0861 0.0402 2.1432 0.0321 0.0074 0.1649
cat("\n===== AVERAGE MARGINAL EFFECTS — PROBIT =====\n")
===== AVERAGE MARGINAL EFFECTS — PROBIT =====
print(summary(ame_probit))
factor AME SE z p lower upper
X1 0.2208 0.0171 12.9478 0.0000 0.1874 0.2542
X2Grup B 0.0879 0.0401 2.1922 0.0284 0.0093 0.1665
# Visualisasi: Marginal Effect X1 di berbagai nilai X1x1_seq <-seq(-3, 3, length.out =200)x2_fix <-0# X2 dikunci pada nilai referensiz_logit <-coef(model_logit)["(Intercept)"] +coef(model_logit)["X1"] * x1_seq +coef(model_logit)["X2Grup B"] * x2_fixz_probit <-coef(model_probit)["(Intercept)"] +coef(model_probit)["X1"] * x1_seq +coef(model_probit)["X2Grup B"] * x2_fixme_logit <-coef(model_logit)["X1"] *dlogis(z_logit)me_probit <-coef(model_probit)["X1"] *dnorm(z_probit)df_me <-data.frame(X1 =rep(x1_seq, 2),ME =c(me_logit, me_probit),Model =rep(c("Logit", "Probit"), each =200))p4 <-ggplot(df_me, aes(x = X1, y = ME, color = Model, linetype = Model)) +geom_line(linewidth =1.2) +scale_color_manual(values =c(Logit ="#5046A8", Probit ="#0D6B50")) +scale_linetype_manual(values =c(Logit ="solid", Probit ="dashed")) +labs(title ="Marginal Effect X1 terhadap P(Y=1)",subtitle ="Efek tidak konstan — terbesar di tengah (P ≈ 0.5), mengecil di ekor",x ="Nilai X1", y ="∂P(Y=1)/∂X1",color =NULL, linetype =NULL ) + temaprint(p4)
# ============================================================# BAGIAN 6: PREDIKSI & PROBABILITAS FITTED# ============================================================df$prob_logit <-fitted(model_logit)df$prob_probit <-fitted(model_probit)# Scatter: prediksi logit vs probitp5 <-ggplot(df, aes(x = prob_logit, y = prob_probit)) +geom_point(alpha =0.35, color ="#5046A8", size =1.5) +geom_abline(slope =1, intercept =0, linetype ="dashed", color ="grey50") +labs(title ="Prediksi Probabilitas: Logit vs Probit",subtitle ="Titik hampir sejajar dengan garis 45° — prediksi sangat mirip",x ="P̂ dari Logit", y ="P̂ dari Probit" ) + temaprint(p5)
# Distribusi probabilitas fitted, dibedakan Y=0 dan Y=1df_long <- df %>%pivot_longer(cols =c(prob_logit, prob_probit),names_to ="Model",values_to ="Probabilitas") %>%mutate(Model =recode(Model,prob_logit ="Logit",prob_probit ="Probit"),Y =factor(Y, labels =c("Y = 0", "Y = 1")))p6 <-ggplot(df_long, aes(x = Probabilitas, fill = Y)) +geom_histogram(bins =30, alpha =0.7, position ="identity") +facet_wrap(~ Model) +scale_fill_manual(values =c("Y = 0"="#C8C3F0", "Y = 1"="#5046A8")) +labs(title ="Distribusi Probabilitas Fitted",subtitle ="Kedua model memisahkan Y=0 dan Y=1 secara serupa",x ="Probabilitas Prediksi", y ="Frekuensi", fill =NULL ) + temaprint(p6)
# ============================================================# BAGIAN 9: PANEL RINGKASAN (4 GRAFIK SEKALIGUS)# ============================================================grid.arrange(p1, p2, p4, p7,ncol =2,top ="Logit & Probit — Ringkasan Visual")
# ============================================================# BAGIAN 10: PREDIKSI UNTUK OBSERVASI BARU# ============================================================data_baru <-data.frame(X1 =c(-2, -1, 0, 1, 2),X2 =factor(c("Grup A", "Grup A", "Grup B", "Grup B", "Grup A"),levels =c("Grup A", "Grup B")))pred_baru <- data_barupred_baru$P_logit <-round(predict(model_logit, newdata = data_baru, type ="response"), 4)pred_baru$P_probit <-round(predict(model_probit, newdata = data_baru, type ="response"), 4)pred_baru$Selisih <-round(abs(pred_baru$P_logit - pred_baru$P_probit), 4)cat("\n===== PREDIKSI DATA BARU =====\n")
===== PREDIKSI DATA BARU =====
print(pred_baru)
X1 X2 P_logit P_probit Selisih
1 -2 Grup A 0.0913 0.0809 0.0104
2 -1 Grup A 0.2295 0.2302 0.0007
3 0 Grup B 0.5743 0.5734 0.0009
4 1 Grup B 0.7999 0.8012 0.0013
5 2 Grup A 0.8857 0.8933 0.0076
cat("\n✓ Selesai! Semua grafik dan output telah ditampilkan.\n")
✓ Selesai! Semua grafik dan output telah ditampilkan.
cat(" Gunakan Ctrl+Z (undo) di RStudio jika ingin kembali ke grafik sebelumnya.\n")
Gunakan Ctrl+Z (undo) di RStudio jika ingin kembali ke grafik sebelumnya.