Pendahuluan

Dataset: https://www.kaggle.com/datasets/zkyfauzi/work-wellbeing-dataset

Variabel Dependen (Y): Burnout_Risk — Ordinal dengan tiga kategori: Low < Medium < High

Variabel Independen (X):

Variabel	Tipe	Keterangan
`Work_Location`	Kategorik (Home / Office / Coworking)	Lokasi kerja utama: `Home, Office, Coworking`
`Seniority_Level`	Kategorik (Junior / Mid / Senior)	Level jabatan: `Junior, Mid, Senior`
`Avg_Working_Hours`	Numerik	Rata-rata jam kerja per hari (rentang dibatasi 6.0-13.0)
`Meeting_Intensity`	Numerik	Rata-rata jam meeting/call per hari (0-10)
`Sentiment_Score`	Numerik	Skor sentimen pada rentang -1.0 sampai 1.0

Tujuan: Memodelkan faktor-faktor yang memengaruhi tingkat risiko burnout karyawan menggunakan Ordinal Logistic Regression.

Install & Load Package

# install.packages(c("MASS", "brant", "car", "dplyr", "tidyr", "ggplot2"), quiet = TRUE)

library(MASS)
library(brant)
library(car)
library(dplyr)
library(tidyr)
library(ggplot2)

Load Data

df <- read.csv("work_wellbeing_dataset.csv")
head(df, 10)

Employee_ID	Work_Location	Avg_Working_Hours	Meeting_Intensity	Internet_Reliability	Seniority_Level	Work_Life_Balance	Daily_Mood_Note	Sentiment_Score	Burnout_Risk
1	Home	8.14	2	Fair	Junior	4	Fokus bagus hari ini, meeting berjalan efisien dan tugas utama tuntas.	0.685	Low
2	Office	9.66	4	Good	Mid	2	Hari cukup padat, beberapa meeting memecah fokus tapi masih terkendali. Fokus tambahan: Organized leadingedge info-mediaries.	-0.463	Medium
3	Home	8.21	1	Excellent	Senior	2	Progress ada, walau ritme kerja naik turun sepanjang hari.	-0.226	Low
4	Home	9.71	4	Good	Mid	5	Hari produktif, pekerjaan selesai tepat waktu dan energi masih stabil.	0.897	Low
5	Home	11.26	5	Excellent	Mid	4	Progress ada, walau ritme kerja naik turun sepanjang hari. Fokus tambahan: Persistent high-level Graphical User Interface.	0.365	Medium
6	Office	9.44	3	Excellent	Mid	4	Hari produktif, pekerjaan selesai tepat waktu dan energi masih stabil.	0.797	Low
7	Coworking	10.68	4	Good	Senior	1	Koneksi dan tekanan deadline membuat hari ini cukup berat.	-1.000	High
8	Home	8.12	1	Fair	Junior	4	Ritme kerja nyaman, bisa selesai tanpa lembur berlebihan.	0.675	Low
9	Office	11.72	3	Fair	Mid	4	Tugas selesai sebagian, perlu atur ulang prioritas untuk besok.	0.304	Low
10	Home	10.46	5	Good	Senior	5	Hari produktif, pekerjaan selesai tepat waktu dan energi masih stabil.	0.612	Medium

Info data

str(df)

## 'data.frame':    30000 obs. of  10 variables:
##  $ Employee_ID         : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Work_Location       : chr  "Home" "Office" "Home" "Home" ...
##  $ Avg_Working_Hours   : num  8.14 9.66 8.21 9.71 11.26 ...
##  $ Meeting_Intensity   : int  2 4 1 4 5 3 4 1 3 5 ...
##  $ Internet_Reliability: chr  "Fair" "Good" "Excellent" "Good" ...
##  $ Seniority_Level     : chr  "Junior" "Mid" "Senior" "Mid" ...
##  $ Work_Life_Balance   : int  4 2 2 5 4 4 1 4 4 5 ...
##  $ Daily_Mood_Note     : chr  "Fokus bagus hari ini, meeting berjalan efisien dan tugas utama tuntas." "Hari cukup padat, beberapa meeting memecah fokus tapi masih terkendali. Fokus tambahan: Organized leadingedge info-mediaries." "Progress ada, walau ritme kerja naik turun sepanjang hari." "Hari produktif, pekerjaan selesai tepat waktu dan energi masih stabil." ...
##  $ Sentiment_Score     : num  0.685 -0.463 -0.226 0.897 0.365 0.797 -1 0.675 0.304 0.612 ...
##  $ Burnout_Risk        : chr  "Low" "Medium" "Low" "Low" ...

Statistika Deskriptif

summary(df)

##   Employee_ID    Work_Location      Avg_Working_Hours Meeting_Intensity
##  Min.   :    1   Length:30000       Min.   : 6.000    Min.   : 1.000   
##  1st Qu.: 7501   Class :character   1st Qu.: 8.680    1st Qu.: 2.000   
##  Median :15000   Mode  :character   Median : 9.570    Median : 3.000   
##  Mean   :15000                      Mean   : 9.586    Mean   : 3.631   
##  3rd Qu.:22500                      3rd Qu.:10.470    3rd Qu.: 5.000   
##  Max.   :30000                      Max.   :13.000    Max.   :10.000   
##  Internet_Reliability Seniority_Level    Work_Life_Balance Daily_Mood_Note   
##  Length:30000         Length:30000       Min.   :1.00      Length:30000      
##  Class :character     Class :character   1st Qu.:3.00      Class :character  
##  Mode  :character     Mode  :character   Median :4.00      Mode  :character  
##                                          Mean   :3.61                        
##                                          3rd Qu.:4.00                        
##                                          Max.   :5.00                        
##  Sentiment_Score   Burnout_Risk      
##  Min.   :-1.0000   Length:30000      
##  1st Qu.:-0.2040   Class :character  
##  Median : 0.5125   Mode  :character  
##  Mean   : 0.3042                     
##  3rd Qu.: 0.8410                     
##  Max.   : 1.0000

Cek Missing Values

colSums(is.na(df))

##          Employee_ID        Work_Location    Avg_Working_Hours 
##                    0                    0                    0 
##    Meeting_Intensity Internet_Reliability      Seniority_Level 
##                    0                    0                    0 
##    Work_Life_Balance      Daily_Mood_Note      Sentiment_Score 
##                    0                    0                    0 
##         Burnout_Risk 
##                    0

Eksplorasi Data

Distribusi Variabel Numerik

df %>%
  select(Avg_Working_Hours, Meeting_Intensity, Sentiment_Score) %>%
  pivot_longer(cols = everything()) %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 20, fill = "steelblue", color = "white") +
  facet_wrap(~name, scales = "free") +
  theme_minimal(base_size = 12) +
  labs(title = "Distribusi Variabel Numerik", x = NULL, y = "Frekuensi")

Distribusi Variabel Kategorik

df %>%
  select(Work_Location, Seniority_Level, Burnout_Risk) %>%
  pivot_longer(cols = everything()) %>%
  ggplot(aes(x = value)) +
  geom_bar(fill = "orange", color = "white") +
  facet_wrap(~name, scales = "free") +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Distribusi Variabel Kategorik", x = NULL, y = "Frekuensi")

Persiapan Data

Pilih Variabel yang digunakan

ordinaldf <- df[, c("Burnout_Risk", "Work_Location", "Seniority_Level",
                    "Avg_Working_Hours", "Meeting_Intensity", "Sentiment_Score")]

Konversi Y ke ordered factor

ordinaldf$Burnout_Risk <- factor(
  ordinaldf$Burnout_Risk,
  levels  = c("Low", "Medium", "High"),
  ordered = TRUE
)

Konversi variabel kategorik ke faktor

ordinaldf$Work_Location <- factor(
  ordinaldf$Work_Location,
  levels = c("Home", "Office", "Coworking")
)

ordinaldf$Seniority_Level <- factor(
  ordinaldf$Seniority_Level,
  levels = c("Junior", "Mid", "Senior")
)

cat("Dimensi data:", dim(ordinaldf), "\n")

## Dimensi data: 30000 6

head(ordinaldf)

Burnout_Risk	Work_Location	Seniority_Level	Avg_Working_Hours	Meeting_Intensity	Sentiment_Score
Low	Home	Junior	8.14	2	0.685
Medium	Office	Mid	9.66	4	-0.463
Low	Home	Senior	8.21	1	-0.226
Low	Home	Mid	9.71	4	0.897
Medium	Home	Mid	11.26	5	0.365
Low	Office	Mid	9.44	3	0.797

Uji Asumsi

Asumsi 1: Variabel Dependen Bersifat Ordinal

cat("Class:", class(ordinaldf$Burnout_Risk), "\n")

## Class: ordered factor

cat("Is ordered:", is.ordered(ordinaldf$Burnout_Risk), "\n")

## Is ordered: TRUE

cat("Levels:", paste(levels(ordinaldf$Burnout_Risk), collapse = " < "), "\n\n")

## Levels: Low < Medium < High

tbl <- table(ordinaldf$Burnout_Risk)
data.frame(
  Kategori  = names(tbl),
  Frekuensi = as.integer(tbl),
  Proporsi  = paste0(round(prop.table(tbl) * 100, 2), "%")
)

Kategori	Frekuensi	Proporsi
Low	15584	51.95%
Medium	7581	25.27%
High	6835	22.78%

Terpenuhi Burnout_Risk adalah ordered factor dengan urutan Low < Medium < High.

Asumsi 2: Independensi Observasi

cat("Jumlah baris     :", nrow(ordinaldf), "\n")

## Jumlah baris     : 30000

cat("Jumlah ID unik   :", length(unique(df$Employee_ID)), "\n")

## Jumlah ID unik   : 30000

cat("Independen?      :", nrow(ordinaldf) == length(unique(df$Employee_ID)), "\n")

## Independen?      : TRUE

Terpenuhi — Setiap baris merepresentasikan satu karyawan unik.

Asumsi 3: Tidak Ada Multikolinearitas

Matriks Korelasi variabel numerik

round(cor(ordinaldf[, c("Avg_Working_Hours", "Meeting_Intensity", "Sentiment_Score")]), 4)

##                   Avg_Working_Hours Meeting_Intensity Sentiment_Score
## Avg_Working_Hours            1.0000             0.520         -0.3115
## Meeting_Intensity            0.5200             1.000         -0.1740
## Sentiment_Score             -0.3115            -0.174          1.0000

VIF

ordinaldf$Y_num <- as.numeric(ordinaldf$Burnout_Risk)

model_ols <- lm(
  Y_num ~ Work_Location + Seniority_Level +
          Avg_Working_Hours + Meeting_Intensity +
          Sentiment_Score,
  data = ordinaldf
)

round(vif(model_ols), 3)

##                    GVIF Df GVIF^(1/(2*Df))
## Work_Location     1.031  2           1.008
## Seniority_Level   1.324  2           1.073
## Avg_Working_Hours 1.596  1           1.263
## Meeting_Intensity 1.494  1           1.222
## Sentiment_Score   1.115  1           1.056

ordinaldf$Y_num <- NULL

Terpenuhi — Semua |r| < 0.8 dan VIF < 10.

Asumsi 4: Tidak Ada Outlier Ekstrem

detect_outlier_iqr <- function(x, varname) {
  Q1    <- quantile(x, 0.25)
  Q3    <- quantile(x, 0.75)
  IQR_v <- Q3 - Q1
  lo    <- Q1 - 1.5 * IQR_v
  hi    <- Q3 + 1.5 * IQR_v
  idx   <- x < lo | x > hi
  data.frame(
    Variabel      = varname,
    Q1            = round(Q1, 3),
    Q3            = round(Q3, 3),
    Batas_Bawah   = round(lo, 3),
    Batas_Atas    = round(hi, 3),
    N_Outlier     = sum(idx),
    Pct_Outlier   = paste0(round(mean(idx) * 100, 2), "%")
  )
}

do.call(rbind, list(
  detect_outlier_iqr(ordinaldf$Avg_Working_Hours, "Avg_Working_Hours"),
  detect_outlier_iqr(ordinaldf$Meeting_Intensity, "Meeting_Intensity"),
  detect_outlier_iqr(ordinaldf$Sentiment_Score,   "Sentiment_Score")
))

	Variabel	Q1	Q3	Batas_Bawah	Batas_Atas	N_Outlier	Pct_Outlier
25%	Avg_Working_Hours	8.680	10.470	5.995	13.155	0	0%
25%1	Meeting_Intensity	2.000	5.000	-2.500	9.500	75	0.25%
25%2	Sentiment_Score	-0.204	0.841	-1.771	2.409	0	0%

par(mfrow = c(1, 3))
boxplot(ordinaldf$Avg_Working_Hours, main = "Avg Working Hours",
        col = "lightblue",   ylab = "Jam")
boxplot(ordinaldf$Meeting_Intensity, main = "Meeting Intensity",
        col = "lightyellow", ylab = "Skor")
boxplot(ordinaldf$Sentiment_Score,   main = "Sentiment Score",
        col = "lightgreen",  ylab = "Skor")

par(mfrow = c(1, 1))

ordinaldf$Y_num <- as.numeric(ordinaldf$Burnout_Risk)

model_ols2 <- lm(
  Y_num ~ Work_Location + Seniority_Level +
          Avg_Working_Hours + Meeting_Intensity +
          Sentiment_Score,
  data = ordinaldf
)
n         <- nrow(ordinaldf)
cooksd    <- cooks.distance(model_ols2)
threshold <- 4 / n

data.frame(
  Threshold       = round(threshold, 6),
  N_Berpengaruh   = sum(cooksd > threshold, na.rm = TRUE),
  Pct_Berpengaruh = paste0(round(mean(cooksd > threshold, na.rm = TRUE) * 100, 2), "%")
)

Threshold	N_Berpengaruh	Pct_Berpengaruh
0.000133	1567	5.22%

ordinaldf$Y_num <- NULL

Terpenuhi — Outlier IQR wajar.

Estimasi Model

Fit Model Ordinal Logistic Regression

model_polr <- polr(
  Burnout_Risk ~ Work_Location + Seniority_Level +
                 Avg_Working_Hours + Meeting_Intensity +
                 Sentiment_Score,
  data   = ordinaldf,
  method = "logistic",
  Hess   = TRUE
)

Summary Model

summary(model_polr)

## Call:
## polr(formula = Burnout_Risk ~ Work_Location + Seniority_Level + 
##     Avg_Working_Hours + Meeting_Intensity + Sentiment_Score, 
##     data = ordinaldf, Hess = TRUE, method = "logistic")
## 
## Coefficients:
##                            Value Std. Error  t value
## Work_LocationOffice    -0.006156   0.030520  -0.2017
## Work_LocationCoworking -0.006986   0.034416  -0.2030
## Seniority_LevelMid      0.032234   0.030816   1.0460
## Seniority_LevelSenior   0.071556   0.038569   1.8553
## Avg_Working_Hours       0.630927   0.013819  45.6564
## Meeting_Intensity       0.160553   0.007213  22.2595
## Sentiment_Score        -2.298132   0.024485 -93.8588
## 
## Intercepts:
##             Value    Std. Error t value 
## Low|Medium    5.8553   0.1258    46.5411
## Medium|High   7.9673   0.1309    60.8603
## 
## Residual Deviance: 41910.35 
## AIC: 41928.35

model_null <- polr(
  Burnout_Risk ~ 1,
  data   = ordinaldf,
  method = "logistic",
  Hess   = TRUE
)

Uji Serentak — Likelihood Ratio Test

H₀: Semua koefisien prediktor = 0
H₁: Minimal satu koefisien ≠ 0
Tolak H₀ jika p-value < 0.05

LL_full <- as.numeric(logLik(model_polr))
LL_null <- as.numeric(logLik(model_null))
G2      <- -2 * (LL_null - LL_full)
df_lrt  <- length(coef(model_polr))
p_lrt   <- pchisq(G2, df = df_lrt, lower.tail = FALSE)

data.frame(
  Statistik      = round(G2, 4),
  df             = df_lrt,
  p_value        = format(p_lrt, scientific = TRUE, digits = 4),
  Keputusan      = ifelse(p_lrt < 0.05, "TOLAK H0", "GAGAL TOLAK H0"),
  Kesimpulan     = ifelse(p_lrt < 0.05,
    "Model signifikan secara serentak",
    "Model tidak signifikan secara serentak")
)

Statistik	df	p_value	Keputusan	Kesimpulan
19579.19	7	0e+00	TOLAK H0	Model signifikan secara serentak

Uji Parsial — Wald Test

H₀: Koefisien variabel ke-j = 0
H₁: Koefisien variabel ke-j ≠ 0
Tolak H₀ jika p-value < 0.05

coef_tbl <- coef(summary(model_polr))
z_vals   <- coef_tbl[, "t value"]
p_vals   <- 2 * pnorm(abs(z_vals), lower.tail = FALSE)

wald_result <- data.frame(
  Variabel  = rownames(coef_tbl),
  Koefisien = round(coef_tbl[, "Value"],      4),
  Std_Error = round(coef_tbl[, "Std. Error"], 4),
  z_value   = round(z_vals, 4),
  p_value   = round(p_vals, 6),
  Sig       = case_when(
    p_vals < 0.001 ~ "***",
    p_vals < 0.01  ~ "**",
    p_vals < 0.05  ~ "*",
    p_vals < 0.1   ~ ".",
    TRUE           ~ "ns"
  )
)
wald_result

	Variabel	Koefisien	Std_Error	z_value	p_value	Sig
Work_LocationOffice	Work_LocationOffice	-0.0062	0.0305	-0.2017	0.840135	ns
Work_LocationCoworking	Work_LocationCoworking	-0.0070	0.0344	-0.2030	0.839158	ns
Seniority_LevelMid	Seniority_LevelMid	0.0322	0.0308	1.0460	0.295561	ns
Seniority_LevelSenior	Seniority_LevelSenior	0.0716	0.0386	1.8553	0.063555	.
Avg_Working_Hours	Avg_Working_Hours	0.6309	0.0138	45.6564	0.000000	***
Meeting_Intensity	Meeting_Intensity	0.1606	0.0072	22.2595	0.000000	***
Sentiment_Score	Sentiment_Score	-2.2981	0.0245	-93.8588	0.000000	***
Low\|Medium	Low\|Medium	5.8553	0.1258	46.5411	0.000000	***
Medium\|High	Medium\|High	7.9673	0.1309	60.8603	0.000000	***

Goodness of Fit

McFadden R²	Interpretasi
0.00 – 0.10	Lemah
0.10 – 0.20	Cukup
0.20 – 0.40	Baik
> 0.40	Sangat Baik

n           <- nrow(ordinaldf)
r2_mcfadden <- 1 - (LL_full / LL_null)
r2_cox      <- 1 - exp((2 / n) * (LL_null - LL_full))
r2_nag      <- r2_cox / (1 - exp((2 / n) * LL_null))

data.frame(
  Metrik = c("McFadden Pseudo R²", "Cox & Snell R²", "Nagelkerke R²",
             "AIC Full Model", "AIC Null Model", "Residual Deviance"),
  Nilai  = round(c(r2_mcfadden, r2_cox, r2_nag,
                   AIC(model_polr), AIC(model_null),
                   model_polr$deviance), 4)
)

Metrik	Nilai
McFadden Pseudo R²	0.3184
Cox & Snell R²	0.4793
Nagelkerke R²	0.5502
AIC Full Model	41928.3535
AIC Null Model	61493.5434
Residual Deviance	41910.3535

Prediksi

Prediksi Kelas & Probabilitas

ordinaldf$pred_class <- predict(model_polr, newdata = ordinaldf, type = "class")

head(data.frame(
  Aktual   = ordinaldf$Burnout_Risk,
  Prediksi = ordinaldf$pred_class
), 10)

Aktual	Prediksi
Low	Low
Medium	High
Low	Low
Low	Low
Medium	Medium
Low	Low
High	High
Low	Low
Low	Medium
Medium	Low

prob_pred <- predict(model_polr, newdata = ordinaldf, type = "probs")
head(round(prob_pred, 4), 10)

##       Low Medium   High
## 1  0.8779 0.1056 0.0166
## 2  0.1222 0.4128 0.4650
## 3  0.4810 0.4035 0.1155
## 4  0.7532 0.2086 0.0381
## 5  0.2235 0.4806 0.2959
## 6  0.7726 0.1930 0.0344
## 7  0.0201 0.1247 0.8552
## 8  0.8931 0.0926 0.0143
## 9  0.2061 0.4760 0.3178
## 10 0.4472 0.4227 0.1301

Evaluasi Model

Confusion Matrix

cm <- table(Aktual = ordinaldf$Burnout_Risk, Prediksi = ordinaldf$pred_class)
print(cm)

##         Prediksi
## Aktual     Low Medium  High
##   Low    13532   1807   245
##   Medium  4078   2134  1369
##   High     662   1123  5050

Akurasi & Metrik Per-Kelas

akurasi <- sum(diag(cm)) / sum(cm)
cat(sprintf("Akurasi Model: %.2f%%\n\n", akurasi * 100))

## Akurasi Model: 69.05%

metrics <- do.call(rbind, lapply(rownames(cm), function(kelas) {
  tp   <- cm[kelas, kelas]
  fp   <- sum(cm[, kelas]) - tp
  fn   <- sum(cm[kelas, ]) - tp
  prec <- ifelse((tp + fp) > 0, tp / (tp + fp), NA)
  rec  <- ifelse((tp + fn) > 0, tp / (tp + fn), NA)
  data.frame(Kelas = kelas, Precision = round(prec, 3), Recall = round(rec, 3))
}))
print(metrics)

##    Kelas Precision Recall
## 1    Low     0.741  0.868
## 2 Medium     0.421  0.281
## 3   High     0.758  0.739

Visualisasi Confusion Matrix

cm_df <- as.data.frame(cm)
ggplot(cm_df, aes(x = Prediksi, y = Aktual, fill = Freq)) +
  geom_tile(color = "white") +
  geom_text(aes(label = Freq), size = 7, fontface = "bold") +
  scale_fill_gradient(low = "#f0f7ff", high = "#2171b5") +
  theme_minimal(base_size = 13) +
  labs(title = "Confusion Matrix — Ordinal Logistic Regression",
       x = "Prediksi", y = "Aktual", fill = "Jumlah")

Interpretasi

Odds Ratio

or_vals <- exp(coef(model_polr))
se_vals <- sqrt(diag(vcov(model_polr))[names(coef(model_polr))])

or_tbl <- data.frame(
  Variabel   = names(or_vals),
  OR         = round(or_vals, 4),
  CI_Lower   = round(exp(coef(model_polr) - 1.96 * se_vals), 4),
  CI_Upper   = round(exp(coef(model_polr) + 1.96 * se_vals), 4)
)
or_tbl

	Variabel	OR	CI_Lower	CI_Upper
Work_LocationOffice	Work_LocationOffice	0.9939	0.9362	1.0551
Work_LocationCoworking	Work_LocationCoworking	0.9930	0.9283	1.0623
Seniority_LevelMid	Seniority_LevelMid	1.0328	0.9722	1.0971
Seniority_LevelSenior	Seniority_LevelSenior	1.0742	0.9960	1.1585
Avg_Working_Hours	Avg_Working_Hours	1.8794	1.8291	1.9309
Meeting_Intensity	Meeting_Intensity	1.1742	1.1577	1.1909
Sentiment_Score	Sentiment_Score	0.1004	0.0957	0.1054

Forest Plot Odds Ratio

ggplot(or_tbl, aes(x = OR, y = reorder(Variabel, OR))) +
  geom_point(size = 3.5, color = "steelblue") +
  geom_errorbarh(aes(xmin = CI_Lower, xmax = CI_Upper), height = 0.25, color = "steelblue") +
  geom_vline(xintercept = 1, linetype = "dashed", color = "red", linewidth = 0.8) +
  theme_minimal(base_size = 12) +
  labs(title = "Odds Ratio dengan 95% Confidence Interval",
       subtitle = "Garis merah = OR 1 (tidak ada efek)",
       x = "Odds Ratio", y = "Variabel")

Pengaruh Avg Working Hours terhadap Probabilitas Burnout Risk

# Buat grid nilai Avg_Working_Hours
grid <- data.frame(
  Avg_Working_Hours = seq(
    min(ordinaldf$Avg_Working_Hours),
    max(ordinaldf$Avg_Working_Hours),
    length.out = 100
  ),
  Meeting_Intensity = mean(ordinaldf$Meeting_Intensity),
  Sentiment_Score   = mean(ordinaldf$Sentiment_Score),
  Work_Location     = factor("Office", levels = c("Home", "Office", "Coworking")),
  Seniority_Level   = factor("Mid", levels = c("Junior", "Mid", "Senior"))
)

# Prediksi probabilitas
prob <- predict(model_polr, newdata = grid, type = "probs")

# Gabungkan ke data
prob_df <- cbind(grid, prob)

# Ubah ke long format
prob_long <- prob_df %>%
  pivot_longer(cols = c("Low", "Medium", "High"),
               names_to = "Kategori",
               values_to = "Probabilitas")

# Plot
ggplot(prob_long, aes(x = Avg_Working_Hours, y = Probabilitas, color = Kategori)) +
  geom_line(size = 1.2) +
  theme_minimal(base_size = 13) +
  labs(
    title = "Pengaruh Jam Kerja terhadap Probabilitas Burnout",
    x = "Avg Working Hours",
    y = "Probabilitas",
    color = "Kategori"
  )

Ordinal Logistic Regression: Analisis Burnout Risk Karyawan

Najwa Ayu Ramadhani (24031554168), Hidayah Atma Negara (24031554169), Alvida Lewiana (24031554198)

03 May 2026