Dataset link: https://uci-ics-mlr-prod.aws.uci.edu/dataset/320/student%2Bperformance
Ordinal logistic regression is a statistical method for modeling the relationship between an ordinal dependent variable and one or more independent variables. An ordinal variable is a categorical variable whose categories have a natural order, although the distances between adjacent categories need not be equal (Williams, 2019). The method extends binary logistic regression to response variables with more than two ordered categories. Compared with multinomial regression, ordinal logistic regression is more efficient because it exploits the ordering information in the data (Wang et al., 2025). The most commonly used ordinal logistic model is the proportional odds model (ordered logit), which applies a cumulative logit link to model the cumulative probability of each category.
The model yields parameters that can be expressed as odds ratios, each describing how the odds of an observation falling in a higher category rather than a lower one change with a predictor.
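For reference, the proportional odds model can be written as follows. This is the standard textbook form under the sign convention used by `polr()` in the MASS package (threshold minus linear predictor); the symbols $\theta_j$, $\boldsymbol{\beta}$, and $J$ are notation introduced here for illustration.

```latex
\[
\operatorname{logit} P(Y \le j \mid \mathbf{x})
  = \ln\frac{P(Y \le j \mid \mathbf{x})}{P(Y > j \mid \mathbf{x})}
  = \theta_j - \mathbf{x}^{\top}\boldsymbol{\beta},
  \qquad j = 1, \dots, J - 1 .
\]
```

Under this convention, $\exp(\beta_k)$ is the factor by which the odds of $Y > j$ are multiplied for a one-unit increase in $x_k$, and that factor is the same for every threshold $j$ (the proportional odds assumption).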
set.seed(123)
# Load the dataset
data <- read.csv("D:/coding/coding sem 4/Analisis Multivariat/tugas harian/modul 4/student-mat.csv",sep = ";")
# Check the data types
str(data)
## 'data.frame': 395 obs. of 33 variables:
## $ school : chr "GP" "GP" "GP" "GP" ...
## $ sex : chr "F" "F" "F" "F" ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : chr "U" "U" "U" "U" ...
## $ famsize : chr "GT3" "GT3" "LE3" "GT3" ...
## $ Pstatus : chr "A" "T" "T" "T" ...
## $ Medu : int 4 1 1 4 3 4 2 4 3 3 ...
## $ Fedu : int 4 1 1 2 3 3 2 4 2 4 ...
## $ Mjob : chr "at_home" "at_home" "at_home" "health" ...
## $ Fjob : chr "teacher" "other" "other" "services" ...
## $ reason : chr "course" "course" "other" "home" ...
## $ guardian : chr "mother" "father" "mother" "mother" ...
## $ traveltime: int 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : int 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : int 0 0 3 0 0 0 0 0 0 0 ...
## $ schoolsup : chr "yes" "no" "yes" "no" ...
## $ famsup : chr "no" "yes" "no" "yes" ...
## $ paid : chr "no" "no" "yes" "yes" ...
## $ activities: chr "no" "no" "no" "yes" ...
## $ nursery : chr "yes" "no" "yes" "yes" ...
## $ higher : chr "yes" "yes" "yes" "yes" ...
## $ internet : chr "no" "yes" "yes" "yes" ...
## $ romantic : chr "no" "no" "no" "yes" ...
## $ famrel : int 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : int 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : int 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : int 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : int 1 1 3 1 2 2 1 1 1 1 ...
## $ health : int 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ G1 : int 5 5 7 15 6 15 12 6 16 14 ...
## $ G2 : int 6 5 8 14 10 15 12 5 18 15 ...
## $ G3 : int 6 6 10 15 10 15 11 6 19 15 ...
summary(data)
## school sex age address
## Length:395 Length:395 Min. :15.0 Length:395
## Class :character Class :character 1st Qu.:16.0 Class :character
## Mode :character Mode :character Median :17.0 Mode :character
## Mean :16.7
## 3rd Qu.:18.0
## Max. :22.0
## famsize Pstatus Medu Fedu
## Length:395 Length:395 Min. :0.000 Min. :0.000
## Class :character Class :character 1st Qu.:2.000 1st Qu.:2.000
## Mode :character Mode :character Median :3.000 Median :2.000
## Mean :2.749 Mean :2.522
## 3rd Qu.:4.000 3rd Qu.:3.000
## Max. :4.000 Max. :4.000
## Mjob Fjob reason guardian
## Length:395 Length:395 Length:395 Length:395
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## traveltime studytime failures schoolsup
## Min. :1.000 Min. :1.000 Min. :0.0000 Length:395
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:0.0000 Class :character
## Median :1.000 Median :2.000 Median :0.0000 Mode :character
## Mean :1.448 Mean :2.035 Mean :0.3342
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:0.0000
## Max. :4.000 Max. :4.000 Max. :3.0000
## famsup paid activities nursery
## Length:395 Length:395 Length:395 Length:395
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## higher internet romantic famrel
## Length:395 Length:395 Length:395 Min. :1.000
## Class :character Class :character Class :character 1st Qu.:4.000
## Mode :character Mode :character Mode :character Median :4.000
## Mean :3.944
## 3rd Qu.:5.000
## Max. :5.000
## freetime goout Dalc Walc
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.000
## Median :3.000 Median :3.000 Median :1.000 Median :2.000
## Mean :3.235 Mean :3.109 Mean :1.481 Mean :2.291
## 3rd Qu.:4.000 3rd Qu.:4.000 3rd Qu.:2.000 3rd Qu.:3.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## health absences G1 G2
## Min. :1.000 Min. : 0.000 Min. : 3.00 Min. : 0.00
## 1st Qu.:3.000 1st Qu.: 0.000 1st Qu.: 8.00 1st Qu.: 9.00
## Median :4.000 Median : 4.000 Median :11.00 Median :11.00
## Mean :3.554 Mean : 5.709 Mean :10.91 Mean :10.71
## 3rd Qu.:5.000 3rd Qu.: 8.000 3rd Qu.:13.00 3rd Qu.:13.00
## Max. :5.000 Max. :75.000 Max. :19.00 Max. :19.00
## G3
## Min. : 0.00
## 1st Qu.: 8.00
## Median :11.00
## Mean :10.42
## 3rd Qu.:14.00
## Max. :20.00
Multicollinearity was checked with the Variance Inflation Factor (VIF). All independent variables have a VIF below 10; the largest value, 8.27, belongs to G2. That value reflects the strong correlation of G2 with G1 and G3 and is not a concern here, because of the three grades only G3 enters the ordinal model (as the response). We therefore conclude that the model does not suffer from multicollinearity.
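As a reminder of what the statistic measures, the VIF of predictor j equals 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. The sketch below computes it by hand on simulated data; the data frame `toy` and its columns are hypothetical and are not part of the student dataset.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)  # deliberately correlated with x1
x3 <- rnorm(n)                       # independent of the others
toy <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the rest
vif_manual <- sapply(names(toy), function(v) {
  others <- setdiff(names(toy), v)
  r2 <- summary(lm(reformulate(others, response = v), data = toy))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 2))
```

The correlated pair `x1` and `x2` should show clearly inflated values, while `x3` should stay close to 1.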
library(car)
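# Drop absences before computing the VIFs. The original exclusion list also
# held " age" (with a leading space), which matches no column, so age stays in,
# consistent with the output below.
df <- data[, !(names(data) %in% c("absences"))]
# Convert character columns to factors so model.matrix() can dummy-code them
df[] <- lapply(df, function(x) {
  if (is.character(x)) as.factor(x) else x
})
# Build the dummy-coded design matrix and drop the intercept column
X <- as.data.frame(model.matrix(~ ., data = df))
X <- X[, -1]
# The response is only a placeholder; the VIFs are driven by the predictor columns
model_vif <- lm(X[[1]] ~ ., data = X)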
vif_values <- vif(model_vif)
print(vif_values)
## schoolMS sexM age addressU
## 1.482242 1.466355 1.754832 1.379284
## famsizeLE3 PstatusT Medu Fedu
## 1.152438 1.136190 2.898504 2.144507
## Mjobhealth Mjobother Mjobservices Mjobteacher
## 2.295378 2.766610 2.876523 3.171680
## Fjobhealth Fjobother Fjobservices Fjobteacher
## 2.109250 6.144457 5.362759 2.690161
## reasonhome reasonother reasonreputation guardianmother
## 1.406715 1.309893 1.510129 1.487145
## guardianother traveltime studytime failures
## 1.734035 1.322127 1.380014 1.567424
## schoolsupyes famsupyes paidyes activitiesyes
## 1.259080 1.305836 1.335596 1.167254
## nurseryyes higheryes internetyes romanticyes
## 1.152167 1.315340 1.238608 1.168176
## famrel freetime goout Dalc
## 1.168325 1.319708 1.494915 2.036256
## Walc health G1 G2
## 2.382947 1.181238 4.794857 8.265748
## G3
## 6.275588
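# Bin G3 into three ordered grade categories. Intervals are right-closed and
# include.lowest is not set, so G3 == 0 falls outside (0, 10] and becomes NA;
# polr() later drops those rows.
data$grade_cat <- cut(data$G3,
                      breaks = c(0,10,14,20),
                      labels = c("rendah","sedang","tinggi"),
                      ordered_result = TRUE)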
num_cols <- sapply(data, is.numeric)
num_names <- names(data)[num_cols]
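# Convert every column with fewer than 7 unique values to a factor
data[] <- lapply(data, function(x) {
  if (length(unique(x)) < 7) {
    return(as.factor(x))
  } else {
    return(x)
  }
})
# Treat the ordinal predictors as ordered factors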
data$Medu <- ordered(data$Medu)
data$Fedu <- ordered(data$Fedu)
data$traveltime <- ordered(data$traveltime)
data$studytime <- ordered(data$studytime)
data$failures <- ordered(data$failures)
data$famrel <- ordered(data$famrel)
data$freetime <- ordered(data$freetime)
data$goout <- ordered(data$goout)
data$Dalc <- ordered(data$Dalc)
data$Walc <- ordered(data$Walc)
data$health <- ordered(data$health)
data <- data[, !(names(data) %in% c("G1","G2","G3"))]
str(data)
## 'data.frame': 395 obs. of 31 variables:
## $ school : Factor w/ 2 levels "GP","MS": 1 1 1 1 1 1 1 1 1 1 ...
## $ sex : Factor w/ 2 levels "F","M": 1 1 1 1 1 2 2 1 2 2 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : Factor w/ 2 levels "R","U": 2 2 2 2 2 2 2 2 2 2 ...
## $ famsize : Factor w/ 2 levels "GT3","LE3": 1 1 2 1 1 2 2 1 2 1 ...
## $ Pstatus : Factor w/ 2 levels "A","T": 1 2 2 2 2 2 2 1 1 2 ...
## $ Medu : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 5 2 2 5 4 5 3 5 4 4 ...
## $ Fedu : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 5 2 2 3 4 4 3 5 3 5 ...
## $ Mjob : Factor w/ 5 levels "at_home","health",..: 1 1 1 2 3 4 3 3 4 3 ...
## $ Fjob : Factor w/ 5 levels "at_home","health",..: 5 3 3 4 3 3 3 5 3 3 ...
## $ reason : Factor w/ 4 levels "course","home",..: 1 1 3 2 2 4 2 2 2 2 ...
## $ guardian : Factor w/ 3 levels "father","mother",..: 2 1 2 2 1 2 2 2 2 2 ...
## $ traveltime: Ord.factor w/ 4 levels "1"<"2"<"3"<"4": 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : Ord.factor w/ 4 levels "1"<"2"<"3"<"4": 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : Ord.factor w/ 4 levels "0"<"1"<"2"<"3": 1 1 4 1 1 1 1 1 1 1 ...
## $ schoolsup : Factor w/ 2 levels "no","yes": 2 1 2 1 1 1 1 2 1 1 ...
## $ famsup : Factor w/ 2 levels "no","yes": 1 2 1 2 2 2 1 2 2 2 ...
## $ paid : Factor w/ 2 levels "no","yes": 1 1 2 2 2 2 1 1 2 2 ...
## $ activities: Factor w/ 2 levels "no","yes": 1 1 1 2 1 2 1 1 1 2 ...
## $ nursery : Factor w/ 2 levels "no","yes": 2 1 2 2 2 2 2 2 2 2 ...
## $ higher : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ internet : Factor w/ 2 levels "no","yes": 1 2 2 2 1 2 2 1 2 2 ...
## $ romantic : Factor w/ 2 levels "no","yes": 1 1 1 2 1 1 1 1 1 1 ...
## $ famrel : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 3 1 2 2 1 1 1 1 ...
## $ health : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ grade_cat : Ord.factor w/ 3 levels "rendah"<"sedang"<..: 1 1 1 3 1 3 2 1 3 3 ...
barplot(
table(data$grade_cat),
col = c("green", "yellow", "red"),
border = "white",
main = "Distribusi Grade Category",
xlab = "Kategori",
ylab = "Jumlah"
)
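The effect of the interval boundaries in the binning step can be checked on a few hand-picked values. The sketch below is illustrative only: the vector `g3_demo` is hypothetical, and `include.lowest = TRUE` is shown as one possible way to keep a grade of 0 in the lowest bin, not as the setting used in this analysis.

```r
g3_demo <- c(0, 5, 10, 11, 14, 15, 20)  # hypothetical grades, including 0
labs <- c("rendah", "sedang", "tinggi")

# Same breaks as the analysis: intervals (0,10], (10,14], (14,20]
bins_default <- cut(g3_demo, breaks = c(0, 10, 14, 20),
                    labels = labs, ordered_result = TRUE)

# With include.lowest = TRUE the first interval becomes [0,10]
bins_closed <- cut(g3_demo, breaks = c(0, 10, 14, 20),
                   labels = labs, include.lowest = TRUE,
                   ordered_result = TRUE)

print(data.frame(g3_demo, bins_default, bins_closed))
```

In the first column of bins a grade of 0 is returned as `NA`, whereas the second variant assigns it to `rendah`.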
The Box-Tidwell test is applied to the two numeric predictors, age and absences.
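In general terms, the Box-Tidwell check augments the linear predictor with an $x \ln x$ term for each continuous predictor and tests whether its coefficient is zero. The expression below is a sketch using the `polr()` sign convention (threshold minus linear predictor); $\theta_j$, $\beta$, and $\gamma$ are notation introduced here for illustration.

```latex
\[
\operatorname{logit} P(Y \le j \mid x)
  = \theta_j - \bigl(\beta\,x + \gamma\,x \ln x\bigr),
  \qquad H_0 : \gamma = 0 .
\]
```

A significant $\gamma$ (for example, $p < 0.05$) suggests that the logit is not linear in $x$.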
library(MASS)
# 1. Make sure Y is ordinal
data$grade_cat <- ordered(data$grade_cat)
# 2. Shift by 1 to avoid log(0)
data$age_shift <- data$age + 1
data$abs_shift <- data$absences + 1
# 3. Build the Box-Tidwell interaction terms x * log(x)
data$age_inter <- data$age_shift * log(data$age_shift)
data$abs_inter <- data$abs_shift * log(data$abs_shift)
# 4. Fit the ordinal logistic model
model_bt <- polr(grade_cat ~
                   age + age_inter +
                   absences + abs_inter,
                 data = data, Hess = TRUE)
# 5. Extract the coefficient table and two-sided p-values
ctable <- coef(summary(model_bt))
p_values <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
result <- cbind(ctable, "p value" = p_values)
# 6. Print the result
print(result)
## Value Std. Error t value p value
## age -1.15283126 0.27485153 -4.194378 2.736211e-05
## age_inter 0.25249938 0.09182576 2.749766 5.963778e-03
## absences -0.17697153 0.07363778 -2.403271 1.624915e-02
## abs_inter 0.03729613 0.01938541 1.923927 5.436369e-02
## rendah|sedang -7.21714137 0.08286975 -87.090179 0.000000e+00
## sedang|tinggi -5.43369201 0.14906481 -36.451877 6.424033e-291
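The interaction term for age is significant (p ≈ 0.006), so age does not satisfy the linearity assumption, and it still fails after a logarithmic transformation. Age was therefore converted into a categorical variable with two levels: adolescent (remaja, 15-18 years) and young adult (dewasa_muda, 19-22 years). The interaction term for absences is not significant at the 5% level (p ≈ 0.054), so absences is kept as a numeric predictor. Variables stored as factors were then tested against the grade category with the chi-square test.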
library(MASS)
# Make sure Y is ordinal
data$grade_cat <- ordered(data$grade_cat)
# Log-transform age
data$age_log <- log(data$age)
# Shift absences by 1 to avoid log(0)
data$abs_shift <- data$absences + 1
# Box-Tidwell interaction terms
data$age_log_inter <- data$age_log * log(data$age_log)
data$abs_inter <- data$abs_shift * log(data$abs_shift)
# Refit the model with log(age)
model_bt2 <- polr(grade_cat ~
                    age_log + age_log_inter +
                    absences + abs_inter,
                  data = data, Hess = TRUE)
# Coefficient table and two-sided p-values
ctable <- coef(summary(model_bt2))
p_values <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
result <- cbind(ctable, "p value" = p_values)
print(result)
## Value Std. Error t value p value
## age_log 25.22413319 0.76023616 33.179339 2.138514e-241
## age_log_inter -13.81052681 1.05823014 -13.050589 6.305784e-39
## absences -0.17695681 0.07363303 -2.403226 1.625115e-02
## abs_inter 0.03728814 0.01938313 1.923742 5.438691e-02
## rendah|sedang 29.94485164 0.92434001 32.395927 3.132235e-230
## sedang|tinggi 31.72793453 0.91934977 34.511277 5.433993e-261
data$age_cat <- cut(data$age,
breaks = c(14,18,22),
labels = c("remaja","dewasa_muda"))
data$age_cat <- ordered(data$age_cat,
levels = c("remaja","dewasa_muda"))
cat_vars <- names(data)[sapply(data, is.factor)]
cat_vars <- setdiff(cat_vars, "grade_cat")
results <- data.frame(
fitur = character(),
p_val = numeric(),
stringsAsFactors = FALSE
)
for (v in cat_vars) {
tbl <- table(data[[v]], data$grade_cat)
test <- chisq.test(tbl)
results <- rbind(results,
data.frame(
fitur = v,
p_val = test$p.value
))
}
# Sort from most to least significant
results <- results[order(results$p_val), ]
results
## fitur p_val
## 14 failures 1.634259e-06
## 6 Medu 3.681927e-04
## 15 schoolsup 4.556858e-04
## 8 Mjob 2.750955e-03
## 27 Walc 4.411656e-03
## 7 Fedu 1.026479e-02
## 21 internet 3.795270e-02
## 20 higher 3.908556e-02
## 26 Dalc 5.210178e-02
## 3 address 7.098624e-02
## 25 goout 8.751608e-02
## 9 Fjob 1.134143e-01
## 22 romantic 1.182430e-01
## 1 school 1.691299e-01
## 24 freetime 2.488823e-01
## 19 nursery 2.617798e-01
## 13 studytime 2.646791e-01
## 2 sex 2.679207e-01
## 29 age_cat 3.033189e-01
## 12 traveltime 3.091617e-01
## 28 health 3.757013e-01
## 23 famrel 3.979044e-01
## 17 paid 6.708436e-01
## 11 guardian 6.817288e-01
## 16 famsup 6.885765e-01
## 5 Pstatus 7.508726e-01
## 4 famsize 8.297718e-01
## 18 activities 8.418499e-01
## 10 reason 9.963376e-01
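The chi-square approximation is usually considered reliable only when the expected cell counts are not too small, which can matter for sparse factor levels. The sketch below shows one way to check this on simulated data and fall back to a Monte Carlo p-value; the variables `grp` and `y` are hypothetical, and the threshold of 5 is a common rule of thumb rather than something taken from this analysis.

```r
set.seed(42)
grp <- factor(sample(c("a", "b", "c"), 120, replace = TRUE))
y   <- factor(sample(c("rendah", "sedang", "tinggi"), 120, replace = TRUE))
tbl <- table(grp, y)

# Standard Pearson chi-square test of independence
test <- suppressWarnings(chisq.test(tbl))

# If any expected count is below 5, use a simulated p-value instead
if (min(test$expected) < 5) {
  test <- chisq.test(tbl, simulate.p.value = TRUE, B = 2000)
}
print(test$p.value)
```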
data <- data[, !grepl("_shift|_inter|_log", names(data))]
str(data)
## 'data.frame': 395 obs. of 32 variables:
## $ school : Factor w/ 2 levels "GP","MS": 1 1 1 1 1 1 1 1 1 1 ...
## $ sex : Factor w/ 2 levels "F","M": 1 1 1 1 1 2 2 1 2 2 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : Factor w/ 2 levels "R","U": 2 2 2 2 2 2 2 2 2 2 ...
## $ famsize : Factor w/ 2 levels "GT3","LE3": 1 1 2 1 1 2 2 1 2 1 ...
## $ Pstatus : Factor w/ 2 levels "A","T": 1 2 2 2 2 2 2 1 1 2 ...
## $ Medu : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 5 2 2 5 4 5 3 5 4 4 ...
## $ Fedu : Ord.factor w/ 5 levels "0"<"1"<"2"<"3"<..: 5 2 2 3 4 4 3 5 3 5 ...
## $ Mjob : Factor w/ 5 levels "at_home","health",..: 1 1 1 2 3 4 3 3 4 3 ...
## $ Fjob : Factor w/ 5 levels "at_home","health",..: 5 3 3 4 3 3 3 5 3 3 ...
## $ reason : Factor w/ 4 levels "course","home",..: 1 1 3 2 2 4 2 2 2 2 ...
## $ guardian : Factor w/ 3 levels "father","mother",..: 2 1 2 2 1 2 2 2 2 2 ...
## $ traveltime: Ord.factor w/ 4 levels "1"<"2"<"3"<"4": 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : Ord.factor w/ 4 levels "1"<"2"<"3"<"4": 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : Ord.factor w/ 4 levels "0"<"1"<"2"<"3": 1 1 4 1 1 1 1 1 1 1 ...
## $ schoolsup : Factor w/ 2 levels "no","yes": 2 1 2 1 1 1 1 2 1 1 ...
## $ famsup : Factor w/ 2 levels "no","yes": 1 2 1 2 2 2 1 2 2 2 ...
## $ paid : Factor w/ 2 levels "no","yes": 1 1 2 2 2 2 1 1 2 2 ...
## $ activities: Factor w/ 2 levels "no","yes": 1 1 1 2 1 2 1 1 1 2 ...
## $ nursery : Factor w/ 2 levels "no","yes": 2 1 2 2 2 2 2 2 2 2 ...
## $ higher : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ internet : Factor w/ 2 levels "no","yes": 1 2 2 2 1 2 2 1 2 2 ...
## $ romantic : Factor w/ 2 levels "no","yes": 1 1 1 2 1 1 1 1 1 1 ...
## $ famrel : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 3 1 2 2 1 1 1 1 ...
## $ health : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ grade_cat : Ord.factor w/ 3 levels "rendah"<"sedang"<..: 1 1 1 3 1 3 2 1 3 3 ...
## $ age_cat : Ord.factor w/ 2 levels "remaja"<"dewasa_muda": 1 1 1 1 1 1 1 1 1 1 ...
The logit model is built from the independent variables that have significant parameters in the ordinal logistic regression: absences, failures, Medu, schoolsup, Mjob, and Walc, which jointly affect the student grade category (grade_cat). The fitted model below also includes Fedu and internet, although none of their coefficients is significant at the 5% level.
The resulting model is a cumulative logit (proportional odds) model consisting of two logit functions, g1(x) and g2(x). They represent the boundary between the low (rendah) and medium (sedang) grade categories and between the medium and high (tinggi) categories, respectively.
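Written out with the `polr()` sign convention (threshold minus linear predictor), the two cumulative logits take the form below. The threshold estimates 5.1330 and 7.2940 are the intercepts reported in the model summary; $\eta(\mathbf{x})$ is shorthand introduced here for the linear predictor built from the reported coefficients.

```latex
\[
\begin{aligned}
g_1(\mathbf{x}) &= \operatorname{logit} P(Y \le \text{rendah} \mid \mathbf{x}) = 5.1330 - \eta(\mathbf{x}),\\
g_2(\mathbf{x}) &= \operatorname{logit} P(Y \le \text{sedang} \mid \mathbf{x}) = 7.2940 - \eta(\mathbf{x}),
\end{aligned}
\qquad
\eta(\mathbf{x}) = \mathbf{x}^{\top}\hat{\boldsymbol{\beta}} .
\]
```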
library(MASS)
model_final <- polr(grade_cat ~ absences + failures + Medu +
schoolsup + Mjob + Walc + Fedu + internet,
data = data, Hess = TRUE)
summary(model_final)
## Call:
## polr(formula = grade_cat ~ absences + failures + Medu + schoolsup +
## Mjob + Walc + Fedu + internet, data = data, Hess = TRUE)
##
## Coefficients:
## Value Std. Error t value
## absences -0.04102 0.01593 -2.5754
## failures.L -20.33725 0.34535 -58.8880
## failures.Q -13.37482 0.37160 -35.9923
## failures.C -5.53378 0.57031 -9.7030
## Medu.L -9.14861 0.28307 -32.3192
## Medu.Q 8.35401 0.23616 35.3741
## Medu.C -4.97163 0.25688 -19.3539
## Medu^4 1.79694 0.23781 7.5561
## schoolsupyes -1.59728 0.35471 -4.5030
## Mjobhealth 0.87178 0.54445 1.6012
## Mjobother 0.30444 0.38663 0.7874
## Mjobservices 1.09601 0.42189 2.5978
## Mjobteacher -0.17229 0.52383 -0.3289
## Walc.L -0.78048 0.31922 -2.4450
## Walc.Q 0.50259 0.30046 1.6727
## Walc.C 0.44830 0.28619 1.5664
## Walc^4 0.23686 0.25801 0.9180
## Fedu.L 0.12201 1.05964 0.1151
## Fedu.Q 0.26773 0.89814 0.2981
## Fedu.C 0.13029 0.56561 0.2303
## Fedu^4 0.36689 0.29236 1.2549
## internetyes 0.45226 0.31005 1.4586
##
## Intercepts:
## Value Std. Error t value
## rendah|sedang 5.1330 0.5342 9.6086
## sedang|tinggi 7.2940 0.5588 13.0541
##
## Residual Deviance: 634.3528
## AIC: 682.3528
## (38 observations deleted due to missingness)
# Null model (no predictors)
model_null <- polr(grade_cat ~ 1, data = data, Hess = TRUE)
# Compare the two models with a likelihood ratio test
anova(model_null, model_final)
## Likelihood ratio tests of ordinal regression models
##
## Response: grade_cat
## Model
## 1 1
## 2 absences + failures + Medu + schoolsup + Mjob + Walc + Fedu + internet
## Resid. df Resid. Dev Test Df LR stat. Pr(Chi)
## 1 355 754.8793
## 2 333 634.3528 1 vs 2 22 120.5265 1.44329e-15
ctable <- coef(summary(model_final))
p_values <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
result <- cbind(ctable, "p value" = p_values)
print(result)
## Value Std. Error t value p value
## absences -0.04102313 0.01592891 -2.5753889 1.001274e-02
## failures.L -20.33724667 0.34535491 -58.8879609 0.000000e+00
## failures.Q -13.37482075 0.37160183 -35.9923432 1.102224e-283
## failures.C -5.53377927 0.57031415 -9.7030369 2.926556e-22
## Medu.L -9.14861247 0.28307026 -32.3192283 3.755869e-229
## Medu.Q 8.35400843 0.23616188 35.3740763 4.276394e-274
## Medu.C -4.97162895 0.25688032 -19.3538725 1.890870e-83
## Medu^4 1.79693665 0.23781361 7.5560716 4.154253e-14
## schoolsupyes -1.59727661 0.35471402 -4.5029983 6.700142e-06
## Mjobhealth 0.87177627 0.54444549 1.6012186 1.093285e-01
## Mjobother 0.30443985 0.38662690 0.7874254 4.310329e-01
## Mjobservices 1.09601031 0.42189221 2.5978444 9.381100e-03
## Mjobteacher -0.17229068 0.52382640 -0.3289080 7.422253e-01
## Walc.L -0.78047697 0.31921903 -2.4449575 1.448692e-02
## Walc.Q 0.50258773 0.30045948 1.6727305 9.438036e-02
## Walc.C 0.44829904 0.28619454 1.5664137 1.172518e-01
## Walc^4 0.23685787 0.25801073 0.9180156 3.586107e-01
## Fedu.L 0.12201376 1.05963714 0.1151467 9.083288e-01
## Fedu.Q 0.26773024 0.89814244 0.2980933 7.656320e-01
## Fedu.C 0.13028631 0.56561093 0.2303462 8.178228e-01
## Fedu^4 0.36689342 0.29236256 1.2549261 2.095056e-01
## internetyes 0.45225615 0.31005293 1.4586418 1.446637e-01
## rendah|sedang 5.13303277 0.53420979 9.6086461 7.350916e-22
## sedang|tinggi 7.29403273 0.55875310 13.0541247 6.019789e-39
Based on the odds ratios, absences, failures, Medu, schoolsup, Mjob (services category), and Walc affect the odds that a student falls in a higher grade category. Absences and Walc have odds ratios below one, so higher values of these variables lower the odds of a higher grade category; for Walc, which enters as an ordered factor with polynomial contrasts, this refers to its linear trend term (Walc.L). Conversely, Mjob (services) has an odds ratio above one, which indicates higher odds of a higher grade category relative to the reference category at_home.
OR <- exp(coef(model_final))
print(OR)
## absences failures.L failures.Q failures.C Medu.L Medu.Q
## 9.598069e-01 1.471113e-09 1.553779e-06 3.951029e-03 1.063673e-04 4.247171e+03
## Medu.C Medu^4 schoolsupyes Mjobhealth Mjobother Mjobservices
## 6.931847e-03 6.031144e+00 2.024471e-01 2.391154e+00 1.355865e+00 2.992204e+00
## Mjobteacher Walc.L Walc.Q Walc.C Walc^4 Fedu.L
## 8.417345e-01 4.581874e-01 1.652993e+00 1.565647e+00 1.267261e+00 1.129770e+00
## Fedu.Q Fedu.C Fedu^4 internetyes
## 1.306995e+00 1.139154e+00 1.443244e+00 1.571855e+00
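Point estimates of odds ratios are easier to judge alongside an interval. The sketch below shows one way to attach Wald 95% confidence intervals to the odds ratios of a `polr()` fit. It uses the `housing` example data shipped with MASS rather than the student data, so the numbers it prints are unrelated to the results above.

```r
library(MASS)  # polr() and the housing example data

# Example proportional odds fit on the MASS housing data
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
            data = housing, Hess = TRUE)

beta <- coef(fit)                          # slope coefficients only
se   <- sqrt(diag(vcov(fit)))[names(beta)] # matching standard errors
z    <- qnorm(0.975)                       # two-sided 95% Wald interval

or_table <- cbind(
  OR    = exp(beta),
  lower = exp(beta - z * se),
  upper = exp(beta + z * se)
)
print(round(or_table, 3))
```

An interval that excludes 1 corresponds to a coefficient that is significant at the 5% level under the same Wald approximation used for the p-values above.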
The goodness-of-fit assessment gives a Nagelkerke pseudo-R² of 0.3259, which indicates that the independent variables in the model explain about 32.59% of the variability in the student grade category, while the remaining 67.41% is attributable to factors outside the model.
deviance <- deviance(model_final)
df <- df.residual(model_final)
p_value <- pchisq(deviance, df, lower.tail = FALSE)
deviance
## [1] 634.3528
df
## [1] 333
p_value
## [1] 4.905002e-21
library(pscl)
pR2(model_final)
## fitting null model for pseudo-r2
## llh llhNull G2 McFadden r2ML r2CU
## -317.1764243 -377.4396544 120.5264602 0.1596632 0.2865259 0.3258541
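The three pseudo-R² values can be reproduced by hand from the two log-likelihoods printed above. The sketch below assumes n = 357 observations, inferred from the 395 rows minus the 38 dropped for missingness; the variable names are illustrative.

```r
ll_full <- -317.1764243  # log-likelihood of the fitted model (llh)
ll_null <- -377.4396544  # log-likelihood of the null model (llhNull)
n <- 357                 # assumption: 395 rows minus 38 dropped rows

mcfadden   <- 1 - ll_full / ll_null                  # McFadden
cox_snell  <- 1 - exp(2 * (ll_null - ll_full) / n)   # r2ML
nagelkerke <- cox_snell / (1 - exp(2 * ll_null / n)) # r2CU

print(round(c(McFadden = mcfadden,
              CoxSnell = cox_snell,
              Nagelkerke = nagelkerke), 4))
```

The Nagelkerke value rescales the Cox-Snell statistic by its maximum attainable value, which is why it is the largest of the three.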
Linear Discriminant Analysis (LDA) is a classification method based on a probabilistic approach. It assumes that, within each group, the predictors follow a multivariate normal distribution with a common covariance matrix. In this study, LDA is used to classify students' academic performance into the low (Rendah), medium (Sedang), and high (Tinggi) categories defined from the final grade.
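Under these assumptions, the classification rule can be summarised by the linear discriminant score below. This is the standard textbook form; the class means $\boldsymbol{\mu}_k$, the common covariance matrix $\boldsymbol{\Sigma}$, and the prior probabilities $\pi_k$ are notation introduced here for illustration.

```latex
\[
\delta_k(\mathbf{x})
  = \mathbf{x}^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_k
  - \tfrac{1}{2}\,\boldsymbol{\mu}_k^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_k
  + \ln \pi_k ,
\qquad
\hat{y}(\mathbf{x}) = \arg\max_k \, \delta_k(\mathbf{x}) .
\]
```

Each observation is assigned to the class with the largest score, and because the score is linear in $\mathbf{x}$, the boundaries between classes are hyperplanes.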
The independent variables for the LDA model are chosen on the basis of the preceding ordinal logistic regression analysis. Variables found to be significant, namely absences, failures, Medu, schoolsup, Mjob, and Walc, are the main candidates because they contribute meaningfully to students' academic performance.
In addition, G2 is included as an indicator of prior academic performance, which in theory is strongly related to the final grade. Keeping the variable selection consistent across the two methods is intended to make the comparison of model performance fair, so that differences in results reflect the methods themselves rather than the variables used.
# Load data
df <- read.csv("D:/coding/coding sem 4/Analisis Multivariat/tugas harian/modul 4/student-mat.csv",sep = ";")
head(df)
## school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason
## 1 GP F 18 U GT3 A 4 4 at_home teacher course
## 2 GP F 17 U GT3 T 1 1 at_home other course
## 3 GP F 15 U LE3 T 1 1 at_home other other
## 4 GP F 15 U GT3 T 4 2 health services home
## 5 GP F 16 U GT3 T 3 3 other other home
## 6 GP M 16 U LE3 T 4 3 services other reputation
## guardian traveltime studytime failures schoolsup famsup paid activities
## 1 mother 2 2 0 yes no no no
## 2 father 1 2 0 no yes no no
## 3 mother 1 2 3 yes no yes no
## 4 mother 1 3 0 no yes yes yes
## 5 father 1 2 0 no yes yes no
## 6 mother 1 2 0 no yes yes yes
## nursery higher internet romantic famrel freetime goout Dalc Walc health
## 1 yes yes no no 4 3 4 1 1 3
## 2 no yes yes no 5 3 3 1 1 3
## 3 yes yes yes no 4 3 2 2 3 3
## 4 yes yes yes yes 3 2 2 1 1 5
## 5 yes yes no no 4 3 2 1 2 5
## 6 yes yes yes no 5 4 2 1 2 5
## absences G1 G2 G3
## 1 6 5 6 6
## 2 4 5 5 6
## 3 10 7 8 10
## 4 2 15 14 15
## 5 4 6 10 10
## 6 10 15 15 15
# Check the data types
str(df)
## 'data.frame': 395 obs. of 33 variables:
## $ school : chr "GP" "GP" "GP" "GP" ...
## $ sex : chr "F" "F" "F" "F" ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : chr "U" "U" "U" "U" ...
## $ famsize : chr "GT3" "GT3" "LE3" "GT3" ...
## $ Pstatus : chr "A" "T" "T" "T" ...
## $ Medu : int 4 1 1 4 3 4 2 4 3 3 ...
## $ Fedu : int 4 1 1 2 3 3 2 4 2 4 ...
## $ Mjob : chr "at_home" "at_home" "at_home" "health" ...
## $ Fjob : chr "teacher" "other" "other" "services" ...
## $ reason : chr "course" "course" "other" "home" ...
## $ guardian : chr "mother" "father" "mother" "mother" ...
## $ traveltime: int 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : int 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : int 0 0 3 0 0 0 0 0 0 0 ...
## $ schoolsup : chr "yes" "no" "yes" "no" ...
## $ famsup : chr "no" "yes" "no" "yes" ...
## $ paid : chr "no" "no" "yes" "yes" ...
## $ activities: chr "no" "no" "no" "yes" ...
## $ nursery : chr "yes" "no" "yes" "yes" ...
## $ higher : chr "yes" "yes" "yes" "yes" ...
## $ internet : chr "no" "yes" "yes" "yes" ...
## $ romantic : chr "no" "no" "no" "yes" ...
## $ famrel : int 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : int 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : int 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : int 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : int 1 1 3 1 2 2 1 1 1 1 ...
## $ health : int 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ G1 : int 5 5 7 15 6 15 12 6 16 14 ...
## $ G2 : int 6 5 8 14 10 15 12 5 18 15 ...
## $ G3 : int 6 6 10 15 10 15 11 6 19 15 ...
# Check the number of rows and columns
cat("Jumlah baris:", nrow(df), "\nJumlah kolom:", ncol(df))
## Jumlah baris: 395
## Jumlah kolom: 33
# Check each variable for missing values
colSums(is.na(df))
## school sex age address famsize Pstatus Medu
## 0 0 0 0 0 0 0
## Fedu Mjob Fjob reason guardian traveltime studytime
## 0 0 0 0 0 0 0
## failures schoolsup famsup paid activities nursery higher
## 0 0 0 0 0 0 0
## internet romantic famrel freetime goout Dalc Walc
## 0 0 0 0 0 0 0
## health absences G1 G2 G3
## 0 0 0 0 0
# Descriptive statistics
summary(df)
## school sex age address
## Length:395 Length:395 Min. :15.0 Length:395
## Class :character Class :character 1st Qu.:16.0 Class :character
## Mode :character Mode :character Median :17.0 Mode :character
## Mean :16.7
## 3rd Qu.:18.0
## Max. :22.0
## famsize Pstatus Medu Fedu
## Length:395 Length:395 Min. :0.000 Min. :0.000
## Class :character Class :character 1st Qu.:2.000 1st Qu.:2.000
## Mode :character Mode :character Median :3.000 Median :2.000
## Mean :2.749 Mean :2.522
## 3rd Qu.:4.000 3rd Qu.:3.000
## Max. :4.000 Max. :4.000
## Mjob Fjob reason guardian
## Length:395 Length:395 Length:395 Length:395
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## traveltime studytime failures schoolsup
## Min. :1.000 Min. :1.000 Min. :0.0000 Length:395
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:0.0000 Class :character
## Median :1.000 Median :2.000 Median :0.0000 Mode :character
## Mean :1.448 Mean :2.035 Mean :0.3342
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:0.0000
## Max. :4.000 Max. :4.000 Max. :3.0000
## famsup paid activities nursery
## Length:395 Length:395 Length:395 Length:395
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## higher internet romantic famrel
## Length:395 Length:395 Length:395 Min. :1.000
## Class :character Class :character Class :character 1st Qu.:4.000
## Mode :character Mode :character Mode :character Median :4.000
## Mean :3.944
## 3rd Qu.:5.000
## Max. :5.000
## freetime goout Dalc Walc
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.000
## Median :3.000 Median :3.000 Median :1.000 Median :2.000
## Mean :3.235 Mean :3.109 Mean :1.481 Mean :2.291
## 3rd Qu.:4.000 3rd Qu.:4.000 3rd Qu.:2.000 3rd Qu.:3.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## health absences G1 G2
## Min. :1.000 Min. : 0.000 Min. : 3.00 Min. : 0.00
## 1st Qu.:3.000 1st Qu.: 0.000 1st Qu.: 8.00 1st Qu.: 9.00
## Median :4.000 Median : 4.000 Median :11.00 Median :11.00
## Mean :3.554 Mean : 5.709 Mean :10.91 Mean :10.71
## 3rd Qu.:5.000 3rd Qu.: 8.000 3rd Qu.:13.00 3rd Qu.:13.00
## Max. :5.000 Max. :75.000 Max. :19.00 Max. :19.00
## G3
## Min. : 0.00
## 1st Qu.: 8.00
## Median :11.00
## Mean :10.42
## 3rd Qu.:14.00
## Max. :20.00
Based on the descriptive statistics, the ranges (minimum to maximum) of the numeric variables are comparable enough that we did not standardize the data.
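In theory, the classes predicted by LDA do not change when the predictors are linearly rescaled, which is one reason skipping standardization can be acceptable. The sketch below checks this on R's built-in `iris` data (not the student data), assuming the MASS package is available.

```r
library(MASS)  # lda()

# LDA on the raw measurements
fit_raw  <- lda(Species ~ ., data = iris)
pred_raw <- predict(fit_raw, iris)$class

# LDA after standardising the four numeric predictors
iris_std <- cbind(as.data.frame(scale(iris[1:4])), Species = iris$Species)
fit_std  <- lda(Species ~ ., data = iris_std)
pred_std <- predict(fit_std, iris_std)$class

# Share of observations given the same label by both fits
agreement <- mean(pred_raw == pred_std)
print(agreement)
```

Standardization still changes the scale of the discriminant coefficients, so it can be useful when those coefficients are to be compared across variables.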
# Create a three-class target from G3 (the student's final grade)
df$G3_class <- cut(df$G3,
breaks = c(-Inf, 10, 14, Inf),
labels = c("Rendah", "Sedang", "Tinggi"),
ordered_result = TRUE)
df$G3_class <- as.factor(df$G3_class)
table(df$G3_class)
##
## Rendah Sedang Tinggi
## 186 136 73
# Recode every variable with fewer than 7 unique values as integer codes (via its factor levels)
num_cols <- sapply(df, is.numeric)
num_names <- names(df)[num_cols]
df[] <- lapply(df, function(x) {
if (length(unique(x)) < 7) {
return(as.numeric(as.factor(x)))
} else {
return(x)
}
})
X <- df[, !(names(df) %in% c("G1","G2","G3","G3_class"))]
str(X)
## 'data.frame': 395 obs. of 30 variables:
## $ school : num 1 1 1 1 1 1 1 1 1 1 ...
## $ sex : num 1 1 1 1 1 2 2 1 2 2 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : num 2 2 2 2 2 2 2 2 2 2 ...
## $ famsize : num 1 1 2 1 1 2 2 1 2 1 ...
## $ Pstatus : num 1 2 2 2 2 2 2 1 1 2 ...
## $ Medu : num 5 2 2 5 4 5 3 5 4 4 ...
## $ Fedu : num 5 2 2 3 4 4 3 5 3 5 ...
## $ Mjob : num 1 1 1 2 3 4 3 3 4 3 ...
## $ Fjob : num 5 3 3 4 3 3 3 5 3 3 ...
## $ reason : num 1 1 3 2 2 4 2 2 2 2 ...
## $ guardian : num 2 1 2 2 1 2 2 2 2 2 ...
## $ traveltime: num 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : num 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : num 1 1 4 1 1 1 1 1 1 1 ...
## $ schoolsup : num 2 1 2 1 1 1 1 2 1 1 ...
## $ famsup : num 1 2 1 2 2 2 1 2 2 2 ...
## $ paid : num 1 1 2 2 2 2 1 1 2 2 ...
## $ activities: num 1 1 1 2 1 2 1 1 1 2 ...
## $ nursery : num 2 1 2 2 2 2 2 2 2 2 ...
## $ higher : num 2 2 2 2 2 2 2 2 2 2 ...
## $ internet : num 1 2 2 2 1 2 2 1 2 2 ...
## $ romantic : num 1 1 1 2 1 1 1 1 1 1 ...
## $ famrel : num 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : num 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : num 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : num 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : num 1 1 3 1 2 2 1 1 1 1 ...
## $ health : num 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
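Note that `as.numeric(as.factor(x))` returns the position of each value among the sorted factor levels, not the original value. That is why Medu, coded 0 to 4 in the raw data, appears as 1 to 5 in the output above. A small sketch with a hypothetical vector `medu_demo`:

```r
medu_demo <- c(0, 4, 2, 1, 3)  # hypothetical codes starting at 0

# Level positions: the smallest value maps to 1, the next to 2, and so on
codes <- as.numeric(as.factor(medu_demo))
print(codes)

# Going through as.character() recovers the original numeric values
restored <- as.numeric(as.character(as.factor(medu_demo)))
print(restored)
```

The same mechanism turns the labelled target G3_class into the codes 1, 2, 3, which is why its labels have to be restored before the normality test.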
# Multivariate normality test - Henze-Zirkler (HZ)
df$G3_class <- factor(
df$G3_class,
levels = c(1,2,3),
labels = c("Rendah","Sedang","Tinggi")
)
str(df$G3_class)
## Factor w/ 3 levels "Rendah","Sedang",..: 1 1 1 3 1 3 2 1 3 3 ...
levels(df$G3_class)
## [1] "Rendah" "Sedang" "Tinggi"
# Test normality within each class, because the LDA assumption applies to each class's distribution
library(MVN)
kelas <- levels(df$G3_class)
hasil_hz <- lapply(kelas, function(k){
  # Subset the data for this class
  data_k <- X[df$G3_class == k, ]
  # Drop variables with zero variance
  data_k <- data_k[, apply(data_k, 2, var) != 0]
  # Run the HZ test
mvn(
data = data_k,
mvn_test = "hz",
descriptives = FALSE,
tidy = TRUE
)
})
names(hasil_hz) <- kelas
str(hasil_hz)
## List of 3
## $ Rendah:List of 5
## ..$ multivariate_normality:'data.frame': 1 obs. of 5 variables:
## .. ..$ Test : chr "Henze-Zirkler"
## .. ..$ Statistic: num 1
## .. ..$ p.value : chr "<0.001"
## .. ..$ Method : chr "asymptotic"
## .. ..$ MVN : chr "✗ Not normal"
## ..$ univariate_normality :'data.frame': 30 obs. of 5 variables:
## .. ..$ Test : chr [1:30] "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" ...
## .. ..$ Variable : chr [1:30] "school" "sex" "age" "address" ...
## .. ..$ Statistic: num [1:30] 55.66 33.99 5.22 42.86 42.86 ...
## .. ..$ p.value : chr [1:30] "<0.001" "<0.001" "<0.001" "<0.001" ...
## .. ..$ Normality: chr [1:30] "✗ Not normal" "✗ Not normal" "✗ Not normal" "✗ Not normal" ...
## ..$ data :'data.frame': 186 obs. of 30 variables:
## .. ..$ school : num [1:186] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ sex : num [1:186] 1 1 1 1 1 1 1 2 2 1 ...
## .. ..$ age : int [1:186] 18 17 15 16 17 15 16 17 16 15 ...
## .. ..$ address : num [1:186] 2 2 2 2 2 2 2 2 2 1 ...
## .. ..$ famsize : num [1:186] 1 1 2 1 1 1 1 1 2 1 ...
## .. ..$ Pstatus : num [1:186] 1 2 2 2 1 2 2 2 2 2 ...
## .. ..$ Medu : num [1:186] 5 2 2 4 5 5 4 4 5 3 ...
## .. ..$ Fedu : num [1:186] 5 2 2 4 5 5 4 3 4 5 ...
## .. ..$ Mjob : num [1:186] 1 1 1 3 3 5 3 4 2 4 ...
## .. ..$ Fjob : num [1:186] 5 3 3 3 5 2 3 4 3 2 ...
## .. ..$ reason : num [1:186] 1 1 3 2 2 4 4 1 2 1 ...
## .. ..$ guardian : num [1:186] 2 1 2 1 2 2 2 2 1 2 ...
## .. ..$ traveltime: num [1:186] 2 1 1 1 2 1 3 1 1 1 ...
## .. ..$ studytime : num [1:186] 2 2 2 2 2 2 2 1 1 3 ...
## .. ..$ failures : num [1:186] 1 1 4 1 1 1 1 4 1 1 ...
## .. ..$ schoolsup : num [1:186] 2 1 2 1 2 1 2 1 1 2 ...
## .. ..$ famsup : num [1:186] 1 2 1 2 2 2 2 2 1 2 ...
## .. ..$ paid : num [1:186] 1 1 2 2 1 2 1 1 2 2 ...
## .. ..$ activities: num [1:186] 1 1 1 1 1 1 2 2 2 2 ...
## .. ..$ nursery : num [1:186] 2 1 2 2 2 2 2 2 2 2 ...
## .. ..$ higher : num [1:186] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ internet : num [1:186] 1 2 2 1 1 2 1 2 2 2 ...
## .. ..$ romantic : num [1:186] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ famrel : num [1:186] 4 5 4 4 4 3 5 5 3 4 ...
## .. ..$ freetime : num [1:186] 3 3 3 3 1 3 3 5 1 3 ...
## .. ..$ goout : num [1:186] 4 3 2 2 4 3 2 5 3 2 ...
## .. ..$ Dalc : num [1:186] 1 1 2 1 1 1 1 2 1 1 ...
## .. ..$ Walc : num [1:186] 1 1 3 2 1 2 1 4 3 1 ...
## .. ..$ health : num [1:186] 3 3 3 5 1 2 4 5 5 5 ...
## .. ..$ absences : int [1:186] 6 4 10 4 6 0 4 16 4 2 ...
## ..$ subset : NULL
## ..$ outlierMethod : chr "none"
## ..- attr(*, "class")= chr "mvn"
## $ Sedang:List of 5
## ..$ multivariate_normality:'data.frame': 1 obs. of 5 variables:
## .. ..$ Test : chr "Henze-Zirkler"
## .. ..$ Statistic: num 1
## .. ..$ p.value : chr "<0.001"
## .. ..$ Method : chr "asymptotic"
## .. ..$ MVN : chr "✗ Not normal"
## ..$ univariate_normality :'data.frame': 30 obs. of 5 variables:
## .. ..$ Test : chr [1:30] "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" ...
## .. ..$ Variable : chr [1:30] "school" "sex" "age" "address" ...
## .. ..$ Statistic: num [1:30] 45.1 24.3 4.7 35.6 29.8 ...
## .. ..$ p.value : chr [1:30] "<0.001" "<0.001" "<0.001" "<0.001" ...
## .. ..$ Normality: chr [1:30] "✗ Not normal" "✗ Not normal" "✗ Not normal" "✗ Not normal" ...
## ..$ data :'data.frame': 136 obs. of 30 variables:
## .. ..$ school : num [1:136] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ sex : num [1:136] 2 1 2 2 1 1 2 2 2 2 ...
## .. ..$ age : int [1:136] 16 15 15 15 16 16 16 15 16 16 ...
## .. ..$ address : num [1:136] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ famsize : num [1:136] 2 1 2 1 1 1 2 1 2 1 ...
## .. ..$ Pstatus : num [1:136] 2 2 2 2 2 2 2 2 1 2 ...
## .. ..$ Medu : num [1:136] 3 3 5 5 5 5 3 3 4 5 ...
## .. ..$ Fedu : num [1:136] 3 2 5 4 5 5 3 3 5 5 ...
## .. ..$ Mjob : num [1:136] 3 4 2 5 2 4 3 3 4 5 ...
## .. ..$ Fjob : num [1:136] 3 3 4 3 3 4 3 3 3 5 ...
## .. ..$ reason : num [1:136] 2 4 1 1 2 4 4 2 2 2 ...
## .. ..$ guardian : num [1:136] 2 1 1 2 2 2 2 2 2 2 ...
## .. ..$ traveltime: num [1:136] 1 3 1 2 1 1 2 1 1 1 ...
## .. ..$ studytime : num [1:136] 2 3 1 2 1 3 2 1 2 2 ...
## .. ..$ failures : num [1:136] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ schoolsup : num [1:136] 1 1 1 1 1 1 1 1 2 1 ...
## .. ..$ famsup : num [1:136] 1 2 2 2 2 2 2 2 2 2 ...
## .. ..$ paid : num [1:136] 1 1 2 2 1 2 1 2 1 2 ...
## .. ..$ activities: num [1:136] 1 2 2 1 1 2 2 1 2 2 ...
## .. ..$ nursery : num [1:136] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ higher : num [1:136] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ internet : num [1:136] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ romantic : num [1:136] 1 1 1 1 1 1 1 1 1 2 ...
## .. ..$ famrel : num [1:136] 4 5 4 5 4 3 5 4 5 4 ...
## .. ..$ freetime : num [1:136] 4 2 3 4 4 2 4 2 3 4 ...
## .. ..$ goout : num [1:136] 4 2 3 3 4 3 4 2 3 5 ...
## .. ..$ Dalc : num [1:136] 1 1 1 1 1 1 2 1 1 5 ...
## .. ..$ Walc : num [1:136] 1 1 3 2 2 2 4 2 1 5 ...
## .. ..$ health : num [1:136] 3 4 5 3 2 2 5 5 5 5 ...
## .. ..$ absences : int [1:136] 0 4 2 2 4 6 0 2 4 16 ...
## ..$ subset : NULL
## ..$ outlierMethod : chr "none"
## ..- attr(*, "class")= chr "mvn"
## $ Tinggi:List of 5
## ..$ multivariate_normality:'data.frame': 1 obs. of 5 variables:
## .. ..$ Test : chr "Henze-Zirkler"
## .. ..$ Statistic: num 1
## .. ..$ p.value : chr "<0.001"
## .. ..$ Method : chr "asymptotic"
## .. ..$ MVN : chr "✗ Not normal"
## ..$ univariate_normality :'data.frame': 29 obs. of 5 variables:
## .. ..$ Test : chr [1:29] "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" "Anderson-Darling" ...
## .. ..$ Variable : chr [1:29] "school" "sex" "age" "address" ...
## .. ..$ Statistic: num [1:29] 24.8 13.2 3.2 22.2 15 ...
## .. ..$ p.value : chr [1:29] "<0.001" "<0.001" "<0.001" "<0.001" ...
## .. ..$ Normality: chr [1:29] "✗ Not normal" "✗ Not normal" "✗ Not normal" "✗ Not normal" ...
## ..$ data :'data.frame': 73 obs. of 29 variables:
## .. ..$ school : num [1:73] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ sex : num [1:73] 1 2 2 2 2 2 2 2 2 2 ...
## .. ..$ age : int [1:73] 15 16 15 15 15 15 15 16 15 15 ...
## .. ..$ address : num [1:73] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ famsize : num [1:73] 1 2 2 1 1 1 1 2 1 1 ...
## .. ..$ Pstatus : num [1:73] 2 2 1 2 1 2 2 2 2 2 ...
## .. ..$ Medu : num [1:73] 5 5 4 4 3 5 5 5 5 5 ...
## .. ..$ Fedu : num [1:73] 3 4 3 5 3 4 5 3 3 5 ...
## .. ..$ Mjob : num [1:73] 2 4 4 3 3 5 2 5 2 4 ...
## .. ..$ Fjob : num [1:73] 4 3 3 3 3 3 2 3 4 4 ...
## .. ..$ reason : num [1:73] 2 4 2 2 2 4 3 1 3 4 ...
## .. ..$ guardian : num [1:73] 2 2 2 2 3 2 1 2 2 2 ...
## .. ..$ traveltime: num [1:73] 1 1 1 1 1 1 1 1 1 2 ...
## .. ..$ studytime : num [1:73] 3 2 2 2 3 2 1 2 1 2 ...
## .. ..$ failures : num [1:73] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ schoolsup : num [1:73] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ famsup : num [1:73] 2 2 2 2 2 1 2 1 1 2 ...
## .. ..$ paid : num [1:73] 2 2 2 2 1 1 2 1 2 1 ...
## .. ..$ activities: num [1:73] 2 2 1 2 1 1 1 2 1 2 ...
## .. ..$ nursery : num [1:73] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ internet : num [1:73] 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ romantic : num [1:73] 2 1 1 1 2 1 1 1 1 1 ...
## .. ..$ famrel : num [1:73] 3 5 4 5 4 4 5 4 2 4 ...
## .. ..$ freetime : num [1:73] 2 4 2 5 5 4 4 5 2 3 ...
## .. ..$ goout : num [1:73] 2 2 2 1 2 1 2 1 4 1 ...
## .. ..$ Dalc : num [1:73] 1 1 1 1 1 1 1 1 2 1 ...
## .. ..$ Walc : num [1:73] 1 2 1 1 1 1 1 3 4 1 ...
## .. ..$ health : num [1:73] 5 5 1 5 3 1 5 5 1 5 ...
## .. ..$ absences : int [1:73] 2 10 0 0 0 0 0 2 4 0 ...
## ..$ subset : NULL
## ..$ outlierMethod : chr "none"
## ..- attr(*, "class")= chr "mvn"
The multivariate normality assumption was checked with the Henze-Zirkler test within each class of the response variable, namely the students' final grade (G3) divided into low (Rendah), medium (Sedang), and high (Tinggi) classes. The predictors are all variables other than G1 and G2; these two are excluded because they are likely to be strongly collinear with the response G3.
The test gives a p-value < 0.001 for all three classes (low, medium, and high), so the null hypothesis (H0) that the data follow a multivariate normal distribution is rejected. We therefore conclude that the data within each class do not satisfy the multivariate normality assumption. The univariate Anderson-Darling tests also give p-values < 0.05 for most variables, so many variables are not normally distributed individually either.
Although the multivariate normality assumption is not met, we decided to proceed with the LDA because the method is fairly robust to violations of normality, particularly when each group has an adequate sample size and no extreme deviations dominate. The violation is probably due to the nature of the data: most variables are categorical or ordinal and have been coded numerically, which makes it hard for them to follow a multivariate normal distribution.
# Homogeneity of covariance matrices test (Box's M)
library(biotools)
boxM(X, df$G3_class)
##
## Box's M-test for Homogeneity of Covariance Matrices
##
## data: X
## Chi-Sq (approx.) = Inf, df = 930, p-value < 2.2e-16
# Check for outliers
boxplot(X,
main = "Boxplot of Predictor Variables (Outlier Detection)",
col = "lightblue",
las = 2)
The Box's M test for homogeneity of covariance matrices gives a p-value < 2.2 × 10⁻¹⁶ (with the approximate chi-square statistic reported as Inf), so H0 is rejected: the covariance matrices are not homogeneous across classes, and the homogeneity assumption of LDA is not fully met.
Although both main assumptions are violated, we still proceed with LDA because it is empirically fairly robust to such violations, especially when the sample size is adequate. This is supported by Mardia (1971), who explains that multivariate procedures such as MANOVA and discriminant analysis remain reasonably stable under certain departures from normality. Several studies report that Box's M, by contrast, is very sensitive to non-normal data and therefore often rejects homogeneity even when the discriminant model still works well. Other studies also show that LDA can still classify well on non-normal data.
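A note on the `Inf` statistic in the Box's M output: one plausible cause, offered here as an assumption rather than something the analysis confirms, is a predictor that is constant within one class (the Henze-Zirkler step had to drop a zero-variance variable in the Tinggi class, leaving 29 of 30 variables). A class covariance matrix with a zero-variance column is singular, so its log-determinant is `-Inf`. The sketch below reproduces this on hypothetical toy data; `toy`, `x1`, `x2`, and `cls` are illustrative names, not part of the original analysis.

```r
# Hypothetical toy data: predictor x2 is constant within class "c"
set.seed(1)
toy <- data.frame(
  x1  = rnorm(30),
  x2  = rnorm(30),
  cls = factor(rep(c("a", "b", "c"), each = 10))
)
toy$x2[toy$cls == "c"] <- 1

# Variance of each predictor within each class
# (rows: predictors, columns: classes)
per_class_var <- sapply(
  split(toy[, c("x1", "x2")], toy$cls),
  function(d) sapply(d, var)
)
print(per_class_var)

# The covariance matrix of class "c" has a zero row and column,
# so it is singular and its log-determinant is -Inf
S_c <- cov(toy[toy$cls == "c", c("x1", "x2")])
log_det_c <- log(det(S_c))
print(log_det_c)
```

Running the same per-class variance check on `X` and `df$G3_class` would show whether this explanation applies to the student data.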
The boxplot of the predictor variables (X) shows that most predictors are reasonably well spread. The variable absences, however, has many extreme outliers compared with the other variables, which could distort the LDA model. We therefore consider transforming absences so that the classification results are more stable.
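As a small illustration of the transformation considered here, the sketch below applies log(1 + x) to the first ten `absences` values shown in the `str()` output. The use of `log1p()` is an assumption of this sketch; it is numerically equivalent to `log(x + 1)` and maps zero absences to zero. The names `absences_head` and `absences_log_head` are illustrative.

```r
# First ten absences values as printed by str(); used only for illustration
absences_head <- c(6, 4, 10, 2, 4, 10, 0, 6, 0, 0)

# log(1 + x) compresses large counts and keeps 0 at 0
absences_log_head <- log1p(absences_head)

print(round(absences_log_head, 2))
print(range(absences_head))      # spread on the original scale
print(range(absences_log_head))  # spread after the transformation
```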
# 1A. Build the response variable Y (split G3 into 3 classes)
df$G3_class <- cut(
df$G3,
breaks = c(-Inf, 9, 14, Inf),
labels = c("Rendah", "Sedang", "Tinggi")
)
df$G3_class <- as.factor(df$G3_class)
table(df$G3_class)
##
## Rendah Sedang Tinggi
## 130 192 73
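To make the class boundaries explicit, the sketch below applies the same `breaks` to a few hypothetical grades (`g3_example` is illustrative only). Because `cut()` uses right-closed intervals by default, integer grades fall into G3 ≤ 9 (Rendah), 10 to 14 (Sedang), and ≥ 15 (Tinggi).

```r
# Hypothetical grades placed on and around the cut points
g3_example <- c(0, 9, 10, 14, 15, 20)

# Same breaks and labels as in step 1A; intervals are (-Inf, 9], (9, 14], (14, Inf)
g3_class_example <- cut(
  g3_example,
  breaks = c(-Inf, 9, 14, Inf),
  labels = c("Rendah", "Sedang", "Tinggi")
)

print(data.frame(G3 = g3_example, class = g3_class_example))
```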
# 1B. Log transformation of the absences variable
df$absences_log <- log(df$absences + 1)
# 1C. Select the predictor variables
X <- df[, setdiff(
names(df),
c("G1", "G2", "G3", "G3_class", "absences")
)]
str(X)
## 'data.frame': 395 obs. of 30 variables:
## $ school : num 1 1 1 1 1 1 1 1 1 1 ...
## $ sex : num 1 1 1 1 1 2 2 1 2 2 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address : num 2 2 2 2 2 2 2 2 2 2 ...
## $ famsize : num 1 1 2 1 1 2 2 1 2 1 ...
## $ Pstatus : num 1 2 2 2 2 2 2 1 1 2 ...
## $ Medu : num 5 2 2 5 4 5 3 5 4 4 ...
## $ Fedu : num 5 2 2 3 4 4 3 5 3 5 ...
## $ Mjob : num 1 1 1 2 3 4 3 3 4 3 ...
## $ Fjob : num 5 3 3 4 3 3 3 5 3 3 ...
## $ reason : num 1 1 3 2 2 4 2 2 2 2 ...
## $ guardian : num 2 1 2 2 1 2 2 2 2 2 ...
## $ traveltime : num 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : num 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : num 1 1 4 1 1 1 1 1 1 1 ...
## $ schoolsup : num 2 1 2 1 1 1 1 2 1 1 ...
## $ famsup : num 1 2 1 2 2 2 1 2 2 2 ...
## $ paid : num 1 1 2 2 2 2 1 1 2 2 ...
## $ activities : num 1 1 1 2 1 2 1 1 1 2 ...
## $ nursery : num 2 1 2 2 2 2 2 2 2 2 ...
## $ higher : num 2 2 2 2 2 2 2 2 2 2 ...
## $ internet : num 1 2 2 2 1 2 2 1 2 2 ...
## $ romantic : num 1 1 1 2 1 1 1 1 1 1 ...
## $ famrel : num 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : num 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : num 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : num 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : num 1 1 3 1 2 2 1 1 1 1 ...
## $ health : num 3 3 3 5 5 5 3 1 1 5 ...
## $ absences_log: num 1.95 1.61 2.4 1.1 1.61 ...
# 1D. Split the predictors by class
X_rendah <- X[df$G3_class == "Rendah", ]
X_sedang <- X[df$G3_class == "Sedang", ]
X_tinggi <- X[df$G3_class == "Tinggi", ]
# 1E. Compute the overall mean vector
mean_total <- colMeans(X)
# 2. Mean vector and covariance matrix of each class
mean_rendah <- colMeans(X_rendah)
mean_sedang <- colMeans(X_sedang)
mean_tinggi <- colMeans(X_tinggi)
S_rendah <- cov(X_rendah)
S_sedang <- cov(X_sedang)
S_tinggi <- cov(X_tinggi)
# 3. Class sample sizes and pooled covariance matrix
n1 <- nrow(X_rendah)
n2 <- nrow(X_sedang)
n3 <- nrow(X_tinggi)
S_pooled <- (
((n1 - 1) * S_rendah) +
((n2 - 1) * S_sedang) +
((n3 - 1) * S_tinggi)
) / (n1 + n2 + n3 - 3)
S_pooled
## school sex age address famsize
## school 0.103389194 -0.001181664 0.151784880 -0.036918552 0.0098798727
## sex -0.001181664 0.248911132 -0.006825934 -0.007892597 0.0193731273
## age 0.151784880 -0.006825934 1.572753890 -0.069628929 0.0278273137
## address -0.036918552 -0.007892597 -0.069628929 0.172720252 0.0128659196
## famsize 0.009879873 0.019373127 0.027827314 0.012865920 0.2063366021
## Pstatus 0.004320950 0.004239198 0.007956633 -0.004966988 -0.0205080939
## Medu -0.043174192 0.032320275 -0.180376402 0.053949671 -0.0265453745
## Fedu -0.025609331 0.012007792 -0.191590173 0.027106549 -0.0326840865
## Mjob -0.020500220 0.113113887 -0.088909894 0.051009219 0.0377302818
## Fjob 0.006517957 0.033706362 -0.025878638 -0.003769736 -0.0346890613
## reason -0.033780904 -0.062729386 0.004757335 -0.027158287 -0.0129886096
## guardian 0.001423256 -0.017419484 0.187772104 -0.008936065 0.0003260683
## traveltime 0.053467113 0.023849538 0.050124829 -0.093136253 0.0215558536
## studytime -0.023243474 -0.132988515 0.015231886 -0.010513013 -0.0300421855
## failures 0.011000813 0.027433088 0.168285779 -0.016842506 0.0002557069
## schoolsup -0.016037396 -0.020859626 -0.120325179 0.005628021 -0.0032047250
## famsup -0.026386431 -0.035856602 -0.095527656 0.005825186 -0.0244404466
## paid -0.002650385 -0.033156741 -0.015118974 0.010965473 -0.0036794460
## activities -0.018686916 0.024545745 -0.063613523 -0.011283570 -0.0002897563
## nursery -0.011384329 -0.002245065 -0.043438039 0.009382754 0.0185889596
## higher -0.001177190 -0.018322173 -0.049484481 0.002702487 -0.0014243197
## internet -0.015296767 0.006114186 -0.043994254 0.031918223 -0.0008873241
## romantic 0.008401695 -0.021670933 0.086168431 0.003053580 0.0086705573
## famrel -0.013169184 0.024429022 0.073811361 0.003709934 -0.0103886655
## freetime 0.010907488 0.119414407 0.020298746 0.013810540 0.0079598199
## goout -0.005351719 0.051383878 0.128757877 0.038429878 0.0163163382
## Dalc 0.030645911 0.125755346 0.125267529 -0.029407289 0.0438674930
## Walc 0.024149920 0.184689423 0.165663244 -0.047165694 0.0639637902
## health -0.020716790 0.105073191 -0.136890330 -0.019804119 -0.0159936301
## absences_log -0.027625982 -0.002629754 0.189071056 -0.013658135 0.0323796419
## Pstatus Medu Fedu Mjob Fjob
## school 0.0043209504 -0.043174192 -0.025609331 -0.020500220 0.0065179574
## sex 0.0042391981 0.032320275 0.012007792 0.113113887 0.0337063624
## age 0.0079566333 -0.180376402 -0.191590173 -0.088909894 -0.0258786381
## address -0.0049669883 0.053949671 0.027106549 0.051009219 -0.0037697358
## famsize -0.0205080939 -0.026545374 -0.032684087 0.037730282 -0.0346890613
## Pstatus 0.0935274661 -0.038823938 -0.027588444 -0.019481469 0.0096773540
## Medu -0.0388239383 1.151122106 0.714217540 0.587356175 0.1331980232
## Fedu -0.0275884440 0.714217540 1.168267906 0.312254631 0.1797884695
## Mjob -0.0194814695 0.587356175 0.312254631 1.499446562 0.2004506736
## Fjob 0.0096773540 0.133198023 0.179788469 0.200450674 0.7438532432
## reason 0.0003475732 0.139557393 0.042022931 0.027904059 -0.0325806613
## guardian -0.0186269852 -0.007882929 -0.072327040 0.011168980 -0.0364119522
## traveltime 0.0053437527 -0.116876569 -0.111954867 -0.086783282 0.0386038573
## studytime 0.0073607960 0.042330698 -0.020319971 -0.031868221 -0.0610558330
## failures -0.0044620596 -0.150265124 -0.168649221 -0.052074989 0.0050617611
## schoolsup -0.0050089817 -0.001380307 0.021306066 -0.011194943 0.0002478721
## famsup 0.0024196808 0.103950119 0.103152674 0.032458363 -0.0114415242
## paid 0.0076003574 0.086913560 0.044374051 0.061034955 -0.0167623131
## activities 0.0150959035 0.056784047 0.059978283 0.059525994 0.0131615253
## nursery -0.0111800435 0.082185192 0.067952988 0.045344533 -0.0122798070
## higher -0.0022082270 0.033798363 0.036620429 0.023798895 -0.0104246386
## internet 0.0085597874 0.071787454 0.045612631 0.094308982 0.0080599213
## romantic -0.0066315610 0.031905301 0.016032625 -0.031893288 0.0028890452
## famrel 0.0075996350 -0.013109821 -0.008115977 0.038915382 0.0081946042
## freetime 0.0117913627 0.029899228 -0.014830220 0.137600110 -0.0430530448
## goout -0.0018456676 0.114538450 0.081305512 0.017200022 0.0187213520
## Dalc -0.0096753884 0.049435600 0.019274336 0.099590387 0.0701225483
## Walc 0.0009350857 -0.026580816 0.003151492 0.003605377 0.0922005402
## health 0.0080433334 -0.050972563 0.037452972 0.120434316 -0.0161836209
## absences_log -0.0340055963 0.139289052 0.009619896 0.088560509 -0.0092400921
## reason guardian traveltime studytime
## school -0.0337809044 0.0014232557 0.053467113 -0.023243474
## sex -0.0627293857 -0.0174194843 0.023849538 -0.132988515
## age 0.0047573350 0.1877721044 0.050124829 0.015231886
## address -0.0271582869 -0.0089360645 -0.093136253 -0.010513013
## famsize -0.0129886096 0.0003260683 0.021555854 -0.030042185
## Pstatus 0.0003475732 -0.0186269852 0.005343753 0.007360796
## Medu 0.1395573930 -0.0078829286 -0.116876569 0.042330698
## Fedu 0.0420229314 -0.0723270400 -0.111954867 -0.020319971
## Mjob 0.0279040594 0.0111689802 -0.086783282 -0.031868221
## Fjob -0.0325806613 -0.0364119522 0.038603857 -0.061055833
## reason 1.4527439374 0.0061012129 -0.052569300 0.141059959
## guardian 0.0061012129 0.2865098331 -0.001526378 0.008496054
## traveltime -0.0525692996 -0.0015263785 0.484842429 -0.054591165
## studytime 0.1410599593 0.0084960538 -0.054591165 0.701466008
## failures -0.0298058106 0.0588712138 0.036902684 -0.090084420
## schoolsup 0.0119256396 -0.0132054284 -0.005461005 0.014793275
## famsup 0.0574373849 -0.0027383484 -0.002513471 0.062004603
## paid 0.0755629807 0.0142969372 -0.023284143 0.068786200
## activities 0.0695822850 -0.0060390166 -0.001929646 0.037023285
## nursery 0.0275696580 -0.0178558324 -0.008323796 0.026858809
## higher 0.0113095238 -0.0005420918 -0.011017219 0.029570578
## internet 0.0195088995 -0.0108492627 -0.026067366 0.015175368
## romantic 0.0166387103 0.0215063251 0.004250681 0.025538743
## famrel -0.0214802728 0.0220398486 -0.008113401 0.026396834
## freetime -0.0791404128 0.0235167658 -0.010631122 -0.121153801
## goout -0.0438053304 0.0229762196 0.012964120 -0.044542971
## Dalc -0.0453268030 -0.0051827850 0.077924920 -0.137974430
## Walc -0.0637275722 -0.0298308437 0.109586437 -0.264129288
## health -0.2538933751 -0.0554504540 0.001795610 -0.080562268
## absences_log 0.1723540357 0.0906036601 -0.023012989 -0.033463033
## failures schoolsup famsup paid
## school 0.0110008135 -0.0160373959 -2.638643e-02 -0.002650385
## sex 0.0274330878 -0.0208596265 -3.585660e-02 -0.033156741
## age 0.1682857795 -0.1203251789 -9.552766e-02 -0.015118974
## address -0.0168425057 0.0056280208 5.825186e-03 0.010965473
## famsize 0.0002557069 -0.0032047250 -2.444045e-02 -0.003679446
## Pstatus -0.0044620596 -0.0050089817 2.419681e-03 0.007600357
## Medu -0.1502651237 -0.0013803075 1.039501e-01 0.086913560
## Fedu -0.1686492209 0.0213060662 1.031527e-01 0.044374051
## Mjob -0.0520749890 -0.0111949429 3.245836e-02 0.061034955
## Fjob 0.0050617611 0.0002478721 -1.144152e-02 -0.016762313
## reason -0.0298058106 0.0119256396 5.743738e-02 0.075562981
## guardian 0.0588712138 -0.0132054284 -2.738348e-03 0.014296937
## traveltime 0.0369026842 -0.0054610049 -2.513471e-03 -0.023284143
## studytime -0.0900844202 0.0147932751 6.200460e-02 0.068786200
## failures 0.4888957781 -0.0109349967 -2.788662e-02 -0.060022987
## schoolsup -0.0109349967 0.1105998505 1.591310e-02 -0.002986282
## famsup -0.0278866231 0.0159131042 2.382104e-01 0.072818803
## paid -0.0600229867 -0.0029862821 7.281880e-02 0.247306319
## activities -0.0237394508 0.0084083933 -9.519005e-05 -0.005373232
## nursery -0.0300318222 0.0070196015 1.188602e-02 0.021449466
## higher -0.0397401148 0.0057344813 1.197119e-02 0.019573767
## internet -0.0094008815 0.0012157389 2.000065e-02 0.028715215
## romantic 0.0195574591 -0.0155331225 1.296987e-03 0.002713663
## famrel -0.0179374503 0.0018106689 -7.591795e-03 -0.001200019
## freetime 0.0660198336 -0.0146340827 4.914853e-03 -0.030458608
## goout 0.0489995763 -0.0231159287 -1.491208e-02 0.013962105
## Dalc 0.0704477877 -0.0130034110 -1.632594e-02 0.027055069
## Walc 0.1160540818 -0.0464205127 -5.743743e-02 0.036001438
## health 0.0425439019 -0.0209636677 1.690808e-02 -0.051479073
## absences_log 0.0233663878 0.0110810499 2.228335e-02 0.022029410
## activities nursery higher internet
## school -1.868692e-02 -0.0113843290 -0.0011771896 -0.0152967673
## sex 2.454575e-02 -0.0022450654 -0.0183221726 0.0061141858
## age -6.361352e-02 -0.0434380387 -0.0494844813 -0.0439942544
## address -1.128357e-02 0.0093827535 0.0027024872 0.0319182232
## famsize -2.897563e-04 0.0185889596 -0.0014243197 -0.0008873241
## Pstatus 1.509590e-02 -0.0111800435 -0.0022082270 0.0085597874
## Medu 5.678405e-02 0.0821851921 0.0337983631 0.0717874543
## Fedu 5.997828e-02 0.0679529881 0.0366204294 0.0456126308
## Mjob 5.952599e-02 0.0453445334 0.0237988946 0.0943089822
## Fjob 1.316153e-02 -0.0122798070 -0.0104246386 0.0080599213
## reason 6.958228e-02 0.0275696580 0.0113095238 0.0195088995
## guardian -6.039017e-03 -0.0178558324 -0.0005420918 -0.0108492627
## traveltime -1.929646e-03 -0.0083237961 -0.0110172194 -0.0260673662
## studytime 3.702329e-02 0.0268588086 0.0295705782 0.0151753681
## failures -2.373945e-02 -0.0300318222 -0.0397401148 -0.0094008815
## schoolsup 8.408393e-03 0.0070196015 0.0057344813 0.0012157389
## famsup -9.519005e-05 0.0118860151 0.0119711947 0.0200006479
## paid -5.373232e-03 0.0214494659 0.0195737670 0.0287152147
## activities 2.516816e-01 0.0003441766 0.0102970876 0.0085549095
## nursery 3.441766e-04 0.1637683391 0.0046742134 0.0003559707
## higher 1.029709e-02 0.0046742134 0.0470530400 0.0003162202
## internet 8.554910e-03 0.0003559707 0.0003162202 0.1379523620
## romantic 5.256747e-03 0.0057253081 -0.0090295493 0.0176912246
## famrel 1.787607e-02 -0.0016002120 0.0030824830 0.0092095914
## freetime 4.485435e-02 -0.0108100218 -0.0133025085 0.0182791824
## goout 2.764606e-02 0.0025003131 -0.0019478104 0.0380034698
## Dalc -2.818918e-02 -0.0282309649 -0.0103183461 0.0183761197
## Walc -2.209519e-02 -0.0482293196 -0.0247262968 0.0141760364
## health 1.780818e-02 -0.0097272578 -0.0010655825 -0.0377608171
## absences_log 2.167098e-02 0.0139810840 0.0098020346 0.0333222552
## romantic famrel freetime goout Dalc
## school 0.0084016954 -0.013169184 0.010907488 -0.005351719 0.0306459107
## sex -0.0216709329 0.024429022 0.119414407 0.051383878 0.1257553461
## age 0.0861684314 0.073811361 0.020298746 0.128757877 0.1252675290
## address 0.0030535802 0.003709934 0.013810540 0.038429878 -0.0294072893
## famsize 0.0086705573 -0.010388665 0.007959820 0.016316338 0.0438674930
## Pstatus -0.0066315610 0.007599635 0.011791363 -0.001845668 -0.0096753884
## Medu 0.0319053006 -0.013109821 0.029899228 0.114538450 0.0494355996
## Fedu 0.0160326245 -0.008115977 -0.014830220 0.081305512 0.0192743363
## Mjob -0.0318932881 0.038915382 0.137600110 0.017200022 0.0995903874
## Fjob 0.0028890452 0.008194604 -0.043053045 0.018721352 0.0701225483
## reason 0.0166387103 -0.021480273 -0.079140413 -0.043805330 -0.0453268030
## guardian 0.0215063251 0.022039849 0.023516766 0.022976220 -0.0051827850
## traveltime 0.0042506814 -0.008113401 -0.010631122 0.012964120 0.0779249203
## studytime 0.0255387429 0.026396834 -0.121153801 -0.044542971 -0.1379744298
## failures 0.0195574591 -0.017937450 0.066019834 0.048999576 0.0704477877
## schoolsup -0.0155331225 0.001810669 -0.014634083 -0.023115929 -0.0130034110
## famsup 0.0012969871 -0.007591795 0.004914853 -0.014912076 -0.0163259398
## paid 0.0027136625 -0.001200019 -0.030458608 0.013962105 0.0270550689
## activities 0.0052567474 0.017876072 0.044854350 0.027646065 -0.0281891759
## nursery 0.0057253081 -0.001600212 -0.010810022 0.002500313 -0.0282309649
## higher -0.0090295493 0.003082483 -0.013302509 -0.001947810 -0.0103183461
## internet 0.0176912246 0.009209591 0.018279182 0.038003470 0.0183761197
## romantic 0.2212775778 -0.024651128 -0.005226825 -0.006778478 0.0006146765
## famrel -0.0246511276 0.805923977 0.135762089 0.074435339 -0.0576686747
## freetime -0.0052268253 0.135762089 1.001392253 0.316527540 0.1899712898
## goout -0.0067784777 0.074435339 0.316527540 1.201084002 0.2492814699
## Dalc 0.0006146765 -0.057668675 0.189971290 0.249281470 0.7800655021
## Walc -0.0131085611 -0.126284359 0.196441032 0.588367485 0.7224693561
## health 0.0118232952 0.122653836 0.105589357 -0.036105712 0.0856288177
## absences_log 0.0276405152 -0.089331399 -0.006535981 0.142484165 0.1403156375
## Walc health absences_log
## school 0.0241499203 -0.020716790 -0.027625982
## sex 0.1846894226 0.105073191 -0.002629754
## age 0.1656632440 -0.136890330 0.189071056
## address -0.0471656944 -0.019804119 -0.013658135
## famsize 0.0639637902 -0.015993630 0.032379642
## Pstatus 0.0009350857 0.008043333 -0.034005596
## Medu -0.0265808156 -0.050972563 0.139289052
## Fedu 0.0031514919 0.037452972 0.009619896
## Mjob 0.0036053772 0.120434316 0.088560509
## Fjob 0.0922005402 -0.016183621 -0.009240092
## reason -0.0637275722 -0.253893375 0.172354036
## guardian -0.0298308437 -0.055450454 0.090603660
## traveltime 0.1095864370 0.001795610 -0.023012989
## studytime -0.2641292884 -0.080562268 -0.033463033
## failures 0.1160540818 0.042543902 0.023366388
## schoolsup -0.0464205127 -0.020963668 0.011081050
## famsup -0.0574374269 0.016908085 0.022283354
## paid 0.0360014379 -0.051479073 0.022029410
## activities -0.0220951930 0.017808183 0.021670980
## nursery -0.0482293196 -0.009727258 0.013981084
## higher -0.0247262968 -0.001065582 0.009802035
## internet 0.0141760364 -0.037760817 0.033322255
## romantic -0.0131085611 0.011823295 0.027640515
## famrel -0.1262843586 0.122653836 -0.089331399
## freetime 0.1964410320 0.105589357 -0.006535981
## goout 0.5883674853 -0.036105712 0.142484165
## Dalc 0.7224693561 0.085628818 0.140315638
## Walc 1.6323256013 0.154254901 0.249994700
## health 0.1542549012 1.932189986 -0.097716310
## absences_log 0.2499946997 -0.097716310 1.099661850
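In formula form, the pooled covariance matrix computed in step 3 is the degrees-of-freedom-weighted average of the class covariance matrices. The symbols are notation introduced here for illustration: S_k is the sample covariance matrix of class k, n_k its sample size, and K = 3 the number of classes.

```latex
S_{\text{pooled}}
  = \frac{\sum_{k=1}^{K} (n_k - 1)\, S_k}{\sum_{k=1}^{K} n_k - K}
  = \frac{(n_1 - 1)\, S_{\text{Rendah}} + (n_2 - 1)\, S_{\text{Sedang}} + (n_3 - 1)\, S_{\text{Tinggi}}}
         {n_1 + n_2 + n_3 - 3}
```

With the class counts reported by `table(df$G3_class)` (130, 192, and 73), the denominator is 395 − 3 = 392.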
# 4. Compute the between-class scatter matrix (SB)
SB <-
n1 * (as.matrix(mean_rendah - mean_total) %*%
t(as.matrix(mean_rendah - mean_total))) +
n2 * (as.matrix(mean_sedang - mean_total) %*%
t(as.matrix(mean_sedang - mean_total))) +
n3 * (as.matrix(mean_tinggi - mean_total) %*%
t(as.matrix(mean_tinggi - mean_total)))
str(SB)
## num [1:30, 1:30] 0.114 -0.314 1.475 -0.28 -0.149 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:30] "school" "sex" "age" "address" ...
## ..$ : chr [1:30] "school" "sex" "age" "address" ...
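The between-class scatter matrix from step 4 and the eigenproblem solved in step 5 can be written as follows. The notation is introduced here for illustration: \bar{x}_k is the mean vector of class k, \bar{x} the overall mean, n_k the class size, and w a discriminant direction.

```latex
S_B = \sum_{k=1}^{K} n_k\, (\bar{x}_k - \bar{x})(\bar{x}_k - \bar{x})^{\top},
\qquad
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_{\text{pooled}}\, w},
\qquad
S_{\text{pooled}}^{-1} S_B\, w = \lambda\, w
```

Because the weighted deviations n_k(\bar{x}_k − \bar{x}) sum to zero, S_B has rank at most K − 1 = 2, so at most two eigenvalues λ are non-zero and at most two discriminant directions carry information.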
# 5. Compute Fisher's linear discriminant
fisher_matriks <- solve(S_pooled) %*% SB
eig <- eigen(fisher_matriks)
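Because `solve(S_pooled) %*% SB` is not a symmetric matrix, `eigen()` can return complex-valued eigenvectors and numerical noise in the trailing components. A minimal sketch of one common alternative is shown below: it whitens with the inverse square root of the pooled covariance, solves a symmetric eigenproblem (which is real by construction), and keeps only the first K − 1 directions. It runs on the built-in `iris` data as a stand-in; all object names (`X_toy`, `S_w`, `S_b`, `ld_coef`, and so on) are assumptions of this sketch, not part of the original analysis.

```r
# Stand-in data: built-in iris (4 predictors, 3 classes)
X_toy <- as.matrix(iris[, 1:4])
g     <- iris$Species
K     <- nlevels(g)
p     <- ncol(X_toy)

grand_mean <- colMeans(X_toy)
S_w <- matrix(0, p, p)  # accumulates the pooled within-class covariance
S_b <- matrix(0, p, p)  # accumulates the between-class scatter
for (k in levels(g)) {
  X_k <- X_toy[g == k, , drop = FALSE]
  d   <- colMeans(X_k) - grand_mean
  S_w <- S_w + (nrow(X_k) - 1) * cov(X_k)
  S_b <- S_b + nrow(X_k) * outer(d, d)
}
S_w <- S_w / (nrow(X_toy) - K)

# Inverse square root of S_w from its (symmetric) eigendecomposition
e_w <- eigen(S_w, symmetric = TRUE)
S_w_inv_sqrt <- e_w$vectors %*% diag(1 / sqrt(e_w$values)) %*% t(e_w$vectors)

# Symmetric eigenproblem: real eigenvalues and eigenvectors
M <- S_w_inv_sqrt %*% S_b %*% S_w_inv_sqrt
e <- eigen((M + t(M)) / 2, symmetric = TRUE)

# Map back to the original predictor space and keep K - 1 directions
n_ld    <- K - 1
ld_coef <- S_w_inv_sqrt %*% e$vectors[, seq_len(n_ld), drop = FALSE]
rownames(ld_coef) <- colnames(X_toy)
colnames(ld_coef) <- paste0("LD", seq_len(n_ld))
print(ld_coef)
```

The columns of `ld_coef` satisfy the same eigen-equation as the eigenvectors of `solve(S_w) %*% S_b`, so they span the same discriminant directions, up to scaling and sign.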
# 6. LD (linear discriminant) coefficients
eig$vectors
## [,1] [,2] [,3]
## [1,] -9.046407e-02+0i -0.082979326+0i -0.5495534313+0i
## [2,] -2.414889e-01+0i 0.091516281+0i -0.1268679695+0i
## [3,] 1.216451e-01+0i 0.055980140+0i -0.0348529904+0i
## [4,] -1.246899e-01+0i -0.035411595+0i -0.3138571677+0i
## [5,] -1.347039e-01+0i 0.173735224+0i -0.0276003189+0i
## [6,] 1.782931e-01+0i 0.077160825+0i 0.0199368940+0i
## [7,] -1.201225e-01+0i 0.280361163+0i 0.0466531974+0i
## [8,] -1.297039e-02+0i -0.127239179+0i 0.0003282617+0i
## [9,] 6.989801e-02+0i -0.001847978+0i 0.0575980813+0i
## [10,] -8.064102e-02+0i 0.150598734+0i -0.0380356005+0i
## [11,] -6.284752e-02+0i -0.042398691+0i -0.0244877870+0i
## [12,] -1.454448e-02+0i 0.213034012+0i -0.0446660802+0i
## [13,] 9.294993e-05+0i -0.103274303+0i 0.0198260467+0i
## [14,] -1.219824e-01+0i 0.022013800+0i -0.0736667329+0i
## [15,] 3.447018e-01+0i 0.283350980+0i -0.3993851375+0i
## [16,] 5.671256e-01+0i -0.265205792+0i 0.2919331995+0i
## [17,] 2.305889e-01+0i 0.139495923+0i 0.1880834835+0i
## [18,] -4.916489e-03+0i -0.398133884+0i -0.2175094707+0i
## [19,] 8.174659e-02+0i 0.005433910+0i -0.0527498513+0i
## [20,] 1.116828e-01+0i 0.242963944+0i 0.1366018273+0i
## [21,] -3.558379e-01+0i 0.161849132+0i 0.2121066771+0i
## [22,] -2.369417e-01+0i 0.421249076+0i 0.0455325689+0i
## [23,] 2.101870e-01+0i -0.146667924+0i 0.0144148362+0i
## [24,] -5.428874e-02+0i -0.104307053+0i -0.1592176724+0i
## [25,] -8.865486e-02+0i 0.024732408+0i -0.0782454766+0i
## [26,] 1.975362e-01+0i 0.149620693+0i 0.3691283927+0i
## [27,] 1.164606e-01+0i -0.127564872+0i -0.0017046413+0i
## [28,] -5.546168e-02+0i -0.151633490+0i 0.0014995778+0i
## [29,] 4.454529e-02+0i 0.012895625+0i 0.0421291323+0i
## [30,] 2.815915e-05+0i -0.286857098+0i -0.0043910125+0i
## [,4] [,5] [,6]
## [1,] -0.151501669+0.201363573i -0.151501669-0.201363573i -0.872937389+0i
## [2,] -0.674356233+0.000000000i -0.674356233+0.000000000i -0.269526358+0i
## [3,] -0.066181589-0.094682281i -0.066181589+0.094682281i -0.007833751+0i
## [4,] -0.136699361+0.130302683i -0.136699361-0.130302683i 0.025958153+0i
## [5,] 0.124488189-0.172965297i 0.124488189+0.172965297i -0.051369563+0i
## [6,] 0.043810951-0.094655719i 0.043810951+0.094655719i -0.021549479+0i
## [7,] 0.075645913-0.131507950i 0.075645913+0.131507950i 0.062281660+0i
## [8,] 0.096163051+0.035842052i 0.096163051-0.035842052i -0.020683082+0i
## [9,] -0.034238227+0.061940903i -0.034238227-0.061940903i 0.007062406+0i
## [10,] -0.081805189-0.034346902i -0.081805189+0.034346902i -0.059153960+0i
## [11,] -0.072969835-0.018488804i -0.072969835+0.018488804i 0.013343091+0i
## [12,] 0.050866665+0.047839322i 0.050866665-0.047839322i -0.018430976+0i
## [13,] -0.016238733-0.057678958i -0.016238733+0.057678958i 0.024088305+0i
## [14,] -0.158976840+0.123552083i -0.158976840-0.123552083i 0.091400977+0i
## [15,] -0.019656902-0.034351849i -0.019656902+0.034351849i 0.060100427+0i
## [16,] 0.044517341+0.144224777i 0.044517341-0.144224777i -0.024285014+0i
## [17,] -0.096898076+0.099139949i -0.096898076-0.099139949i 0.028802263+0i
## [18,] 0.001149198-0.056706582i 0.001149198+0.056706582i 0.129700108+0i
## [19,] -0.036783744-0.037237215i -0.036783744+0.037237215i -0.069976170+0i
## [20,] 0.168396065-0.060566330i 0.168396065+0.060566330i -0.065320031+0i
## [21,] -0.370478209-0.012591756i -0.370478209+0.012591756i 0.243674276+0i
## [22,] 0.042744960-0.121726962i 0.042744960+0.121726962i -0.059546599+0i
## [23,] 0.039470865+0.065318891i 0.039470865-0.065318891i 0.058253904+0i
## [24,] -0.020109608-0.021595957i -0.020109608+0.021595957i -0.016783669+0i
## [25,] 0.015084515-0.019828401i 0.015084515+0.019828401i 0.052816662+0i
## [26,] -0.107103025+0.055248757i -0.107103025-0.055248757i 0.049560170+0i
## [27,] -0.007220469-0.127142554i -0.007220469+0.127142554i -0.157650322+0i
## [28,] 0.004492089-0.003795447i 0.004492089+0.003795447i 0.127991609+0i
## [29,] 0.006231269+0.011580690i 0.006231269-0.011580690i -0.009376801+0i
## [30,] -0.036383817+0.012014479i -0.036383817-0.012014479i 0.002282953+0i
## [,7] [,8] [,9] [,10]
## [1,] 0.2920024431+0i -0.9893413964+0i -0.997460284+0i 0.145318390+0i
## [2,] 0.5881742211+0i -0.0679625821+0i -0.016028771+0i 0.257104223+0i
## [3,] -0.0858723733+0i 0.0078084816+0i 0.004025859+0i -0.238531795+0i
## [4,] 0.2294094525+0i 0.0304010732+0i -0.004997912+0i -0.171015277+0i
## [5,] -0.3521601241+0i -0.0546536713+0i -0.013806499+0i -0.513235867+0i
## [6,] -0.1648390136+0i -0.0055542038+0i 0.006094493+0i -0.368924113+0i
## [7,] -0.0409489515+0i -0.0084779666+0i -0.017286413+0i -0.069418619+0i
## [8,] -0.0485344287+0i -0.0080392303+0i 0.004380121+0i -0.076447480+0i
## [9,] 0.0480152251+0i 0.0072299969+0i 0.003662241+0i -0.033417741+0i
## [10,] -0.0456488595+0i -0.0235699852+0i -0.010113054+0i -0.027908777+0i
## [11,] 0.0022175205+0i -0.0052231990+0i -0.001545481+0i -0.120651387+0i
## [12,] 0.0283394938+0i 0.0032137775+0i -0.009195368+0i 0.107042536+0i
## [13,] -0.0303864848+0i 0.0027198968+0i 0.004100459+0i -0.097431275+0i
## [14,] 0.1823178260+0i 0.0214246184+0i -0.007136288+0i -0.094429037+0i
## [15,] -0.0123837321+0i 0.0170047447+0i 0.006461689+0i -0.029227072+0i
## [16,] 0.2172051415+0i 0.0423151961+0i 0.039637013+0i 0.127235220+0i
## [17,] 0.1519236865+0i 0.0239430774+0i 0.006307543+0i -0.159877662+0i
## [18,] -0.0756432502+0i 0.0553015125+0i 0.015536885+0i 0.039756779+0i
## [19,] -0.0126139425+0i -0.0154649923+0i 0.003981826+0i -0.025125320+0i
## [20,] -0.1279488838+0i -0.0120150680+0i -0.003901128+0i -0.097455431+0i
## [21,] 0.2485949753+0i 0.0089140927+0i -0.024689370+0i 0.060028536+0i
## [22,] -0.2985992007+0i -0.0422956524+0i -0.028871945+0i -0.404015218+0i
## [23,] 0.2336250094+0i 0.0430901682+0i 0.016608779+0i 0.323525802+0i
## [24,] 0.0002616939+0i -0.0001432936+0i 0.001349157+0i -0.026624103+0i
## [25,] -0.0507216604+0i 0.0045655114+0i -0.005532881+0i -0.009379734+0i
## [26,] 0.1064576684+0i -0.0112983950+0i 0.004208902+0i 0.045670210+0i
## [27,] -0.0316369386+0i -0.0266302944+0i 0.011038747+0i -0.174839332+0i
## [28,] -0.0208855207+0i 0.0250186455+0i 0.003165819+0i -0.103381554+0i
## [29,] 0.0002475226+0i -0.0033635839+0i 0.001775785+0i 0.013443490+0i
## [30,] 0.0255435972+0i 0.0042892253+0i 0.011377719+0i -0.001750600+0i
## [,11] [,12]
## [1,] -0.218119258-0.245296066i -0.218119258+0.245296066i
## [2,] -0.290601526+0.048840765i -0.290601526-0.048840765i
## [3,] 0.007840448-0.098628895i 0.007840448+0.098628895i
## [4,] -0.069912136+0.231170197i -0.069912136-0.231170197i
## [5,] -0.022845557-0.277572143i -0.022845557+0.277572143i
## [6,] -0.085048710-0.036136550i -0.085048710+0.036136550i
## [7,] 0.032385444-0.051951133i 0.032385444+0.051951133i
## [8,] 0.011429464-0.010849088i 0.011429464+0.010849088i
## [9,] -0.013933354-0.033755577i -0.013933354+0.033755577i
## [10,] -0.026398483-0.082109978i -0.026398483+0.082109978i
## [11,] -0.063858507-0.032049450i -0.063858507+0.032049450i
## [12,] -0.016188623+0.125940878i -0.016188623-0.125940878i
## [13,] -0.056770631-0.045491172i -0.056770631+0.045491172i
## [14,] -0.129474920+0.005705209i -0.129474920-0.005705209i
## [15,] -0.008390251-0.046497316i -0.008390251+0.046497316i
## [16,] -0.474259357+0.000000000i -0.474259357+0.000000000i
## [17,] -0.161522703+0.179958209i -0.161522703-0.179958209i
## [18,] 0.099803868+0.061194717i 0.099803868-0.061194717i
## [19,] 0.089047324-0.006588534i 0.089047324+0.006588534i
## [20,] 0.104004419-0.120909272i 0.104004419+0.120909272i
## [21,] -0.359926279-0.017226267i -0.359926279+0.017226267i
## [22,] -0.249619403-0.124032313i -0.249619403+0.124032313i
## [23,] 0.048848902+0.171295751i 0.048848902-0.171295751i
## [24,] -0.004208472+0.018194763i -0.004208472-0.018194763i
## [25,] 0.004108366-0.060372579i 0.004108366+0.060372579i
## [26,] -0.060062313+0.050022625i -0.060062313-0.050022625i
## [27,] -0.041850441+0.004790713i -0.041850441-0.004790713i
## [28,] 0.048453446-0.087294935i 0.048453446+0.087294935i
## [29,] -0.008683301+0.002788011i -0.008683301-0.002788011i
## [30,] -0.033957260-0.001228736i -0.033957260+0.001228736i
## [,13] [,14]
## [1,] -0.750154027+0.000000000i -0.750154027+0.000000000i
## [2,] 0.086817529-0.085297960i 0.086817529+0.085297960i
## [3,] 0.014879437-0.029040881i 0.014879437+0.029040881i
## [4,] -0.104425277+0.029680385i -0.104425277-0.029680385i
## [5,] 0.117139111+0.152537634i 0.117139111-0.152537634i
## [6,] 0.059523349+0.120337681i 0.059523349-0.120337681i
## [7,] -0.021970345+0.035869324i -0.021970345-0.035869324i
## [8,] 0.071074586+0.008236321i 0.071074586-0.008236321i
## [9,] -0.071106015-0.030979048i -0.071106015+0.030979048i
## [10,] 0.013170184-0.017381210i 0.013170184+0.017381210i
## [11,] -0.052767331+0.003458438i -0.052767331-0.003458438i
## [12,] -0.045192035-0.024758503i -0.045192035+0.024758503i
## [13,] 0.018101940-0.095270535i 0.018101940+0.095270535i
## [14,] -0.143312480-0.069469625i -0.143312480+0.069469625i
## [15,] 0.052331591+0.039061901i 0.052331591-0.039061901i
## [16,] -0.244640173+0.093638080i -0.244640173-0.093638080i
## [17,] -0.204413174-0.112275370i -0.204413174+0.112275370i
## [18,] -0.098427686-0.038411104i -0.098427686+0.038411104i
## [19,] -0.003758693+0.065167566i -0.003758693-0.065167566i
## [20,] -0.001949441-0.151233653i -0.001949441+0.151233653i
## [21,] -0.020127145-0.319983662i -0.020127145+0.319983662i
## [22,] 0.057678643-0.039555459i 0.057678643+0.039555459i
## [23,] -0.067523355-0.051234251i -0.067523355+0.051234251i
## [24,] -0.002893053+0.074540332i -0.002893053-0.074540332i
## [25,] 0.053961633-0.019191976i 0.053961633+0.019191976i
## [26,] -0.034858581-0.033456360i -0.034858581+0.033456360i
## [27,] 0.080681147-0.035653713i 0.080681147+0.035653713i
## [28,] 0.021832252+0.006505473i 0.021832252-0.006505473i
## [29,] -0.024081005+0.020093819i -0.024081005-0.020093819i
## [30,] 0.005656509+0.007424102i 0.005656509-0.007424102i
## [,15] [,16] [,17]
## [1,] 0.5225381173+0.000000000i 0.5225381173+0.000000000i -0.518252299+0i
## [2,] 0.1295338454-0.165654847i 0.1295338454+0.165654847i -0.047160542+0i
## [3,] -0.0052119766+0.049485968i -0.0052119766-0.049485968i 0.054821241+0i
## [4,] -0.2051229963-0.059605298i -0.2051229963+0.059605298i 0.237924354+0i
## [5,] 0.1463122661+0.302970781i 0.1463122661-0.302970781i 0.042493533+0i
## [6,] 0.0467743046+0.007320136i 0.0467743046-0.007320136i -0.374020662+0i
## [7,] 0.0159057682+0.031334176i 0.0159057682-0.031334176i 0.074712607+0i
## [8,] -0.0077460295+0.026899136i -0.0077460295-0.026899136i -0.065297697+0i
## [9,] 0.0386695598-0.003795966i 0.0386695598+0.003795966i -0.012135469+0i
## [10,] 0.0343909882+0.067408524i 0.0343909882-0.067408524i -0.148581177+0i
## [11,] 0.0100385213+0.028288179i 0.0100385213-0.028288179i -0.041660218+0i
## [12,] -0.1138959474-0.142359694i -0.1138959474+0.142359694i -0.048240821+0i
## [13,] 0.0730482787+0.015675713i 0.0730482787-0.015675713i 0.026855289+0i
## [14,] 0.0765760493-0.051412121i 0.0765760493+0.051412121i 0.051820532+0i
## [15,] 0.0186450457+0.067979542i 0.0186450457-0.067979542i 0.053430181+0i
## [16,] 0.1608002335-0.180024359i 0.1608002335+0.180024359i 0.329635667+0i
## [17,] 0.0351868748-0.211639645i 0.0351868748+0.211639645i 0.117204498+0i
## [18,] -0.1841992891-0.149010532i -0.1841992891+0.149010532i 0.167450997+0i
## [19,] -0.0395463881+0.110256579i -0.0395463881-0.110256579i 0.029773551+0i
## [20,] 0.1758435348+0.165283014i 0.1758435348-0.165283014i -0.126920947+0i
## [21,] 0.0080288987-0.131550039i 0.0080288987+0.131550039i -0.383128256+0i
## [22,] 0.3089562886-0.066971041i 0.3089562886+0.066971041i 0.061535104+0i
## [23,] -0.2432126327-0.138158065i -0.2432126327+0.138158065i -0.294843887+0i
## [24,] -0.0005477353-0.029240612i -0.0005477353+0.029240612i 0.180893206+0i
## [25,] -0.0204579654+0.062375083i -0.0204579654-0.062375083i 0.018716723+0i
## [26,] 0.0342706926-0.066972465i 0.0342706926+0.066972465i 0.008022737+0i
## [27,] 0.0820745427+0.010488088i 0.0820745427-0.010488088i -0.117576250+0i
## [28,] 0.0133318085+0.093323854i 0.0133318085-0.093323854i -0.089597866+0i
## [29,] -0.0117872565-0.010350076i -0.0117872565+0.010350076i -0.001065177+0i
## [30,] 0.0444667729+0.004331016i 0.0444667729-0.004331016i 0.159867902+0i
## [,18] [,19]
## [1,] -0.780106741+0.00000000i -0.780106741+0.00000000i
## [2,] -0.087537499-0.04464240i -0.087537499+0.04464240i
## [3,] 0.029381461+0.02023571i 0.029381461-0.02023571i
## [4,] -0.015428576+0.01083392i -0.015428576-0.01083392i
## [5,] -0.141727344-0.15099543i -0.141727344+0.15099543i
## [6,] 0.076583055-0.11184865i 0.076583055+0.11184865i
## [7,] -0.039669211+0.02791934i -0.039669211-0.02791934i
## [8,] 0.051092762-0.01415725i 0.051092762+0.01415725i
## [9,] -0.012580261-0.01531808i -0.012580261+0.01531808i
## [10,] 0.054321216-0.09312340i 0.054321216+0.09312340i
## [11,] 0.016350246-0.06483362i 0.016350246+0.06483362i
## [12,] 0.094411705-0.02420940i 0.094411705+0.02420940i
## [13,] 0.030105755-0.02719242i 0.030105755+0.02719242i
## [14,] -0.052961119+0.05070803i -0.052961119-0.05070803i
## [15,] 0.003955959+0.00850441i 0.003955959-0.00850441i
## [16,] -0.271929813+0.10701209i -0.271929813-0.10701209i
## [17,] -0.173171907+0.09726963i -0.173171907-0.09726963i
## [18,] -0.062056488+0.04788691i -0.062056488-0.04788691i
## [19,] -0.090476254-0.02860118i -0.090476254+0.02860118i
## [20,] -0.162509521-0.12578723i -0.162509521+0.12578723i
## [21,] 0.190772506+0.06820504i 0.190772506-0.06820504i
## [22,] -0.058964436+0.04260241i -0.058964436-0.04260241i
## [23,] 0.136319161-0.13425634i 0.136319161+0.13425634i
## [24,] -0.018407072+0.01685103i -0.018407072-0.01685103i
## [25,] -0.006287293+0.05742129i -0.006287293-0.05742129i
## [26,] -0.035974238-0.00712357i -0.035974238+0.00712357i
## [27,] 0.065037106-0.04521776i 0.065037106+0.04521776i
## [28,] -0.007012019-0.03474875i -0.007012019+0.03474875i
## [29,] 0.010428771-0.01872565i 0.010428771+0.01872565i
## [30,] -0.019691800+0.04210375i -0.019691800-0.04210375i
## [,20] [,21]
## [1,] 0.826397072+0.000000000i 0.826397072+0.000000000i
## [2,] 0.068935245+0.028510960i 0.068935245-0.028510960i
## [3,] -0.017342605-0.005297994i -0.017342605+0.005297994i
## [4,] 0.049066413+0.029176264i 0.049066413-0.029176264i
## [5,] 0.124055239+0.137802683i 0.124055239-0.137802683i
## [6,] -0.072249291-0.036746663i -0.072249291+0.036746663i
## [7,] 0.045124090-0.012891096i 0.045124090+0.012891096i
## [8,] -0.063117505+0.002890766i -0.063117505-0.002890766i
## [9,] 0.014261030+0.011912773i 0.014261030-0.011912773i
## [10,] -0.082625735+0.059714701i -0.082625735-0.059714701i
## [11,] -0.022138906+0.047260083i -0.022138906-0.047260083i
## [12,] -0.085720303+0.011208044i -0.085720303-0.011208044i
## [13,] -0.011449884+0.046795867i -0.011449884-0.046795867i
## [14,] 0.094513670-0.045655958i 0.094513670+0.045655958i
## [15,] -0.005136119+0.003514604i -0.005136119-0.003514604i
## [16,] 0.257732641-0.026304744i 0.257732641+0.026304744i
## [17,] 0.179029814-0.087498932i 0.179029814+0.087498932i
## [18,] 0.097872851-0.014448594i 0.097872851+0.014448594i
## [19,] 0.054757941+0.028429990i 0.054757941-0.028429990i
## [20,] 0.171325446+0.057680969i 0.171325446-0.057680969i
## [21,] -0.193517100-0.074750760i -0.193517100+0.074750760i
## [22,] 0.036164258-0.022407505i 0.036164258+0.022407505i
## [23,] -0.116794897+0.016186714i -0.116794897-0.016186714i
## [24,] 0.024291626+0.017176474i 0.024291626-0.017176474i
## [25,] 0.009849086-0.024239864i 0.009849086+0.024239864i
## [26,] 0.033708195+0.010434581i 0.033708195-0.010434581i
## [27,] -0.091144444+0.033186653i -0.091144444-0.033186653i
## [28,] 0.003000434+0.015161263i 0.003000434-0.015161263i
## [29,] -0.025496089+0.016982319i -0.025496089-0.016982319i
## [30,] 0.022955212-0.026109974i 0.022955212+0.026109974i
## [,22] [,23]
## [1,] -0.1999755252+0.303915736i -0.1999755252-0.303915736i
## [2,] 0.0085821465+0.018300548i 0.0085821465-0.018300548i
## [3,] -0.0306507784-0.008766224i -0.0306507784+0.008766224i
## [4,] 0.0694991822-0.109322324i 0.0694991822+0.109322324i
## [5,] -0.1699253217-0.230173018i -0.1699253217+0.230173018i
## [6,] -0.0272691527+0.284538731i -0.0272691527-0.284538731i
## [7,] -0.0135726771+0.035410934i -0.0135726771-0.035410934i
## [8,] 0.0174694610-0.042501623i 0.0174694610+0.042501623i
## [9,] -0.0119808170+0.063807117i -0.0119808170-0.063807117i
## [10,] 0.0061165730+0.049768193i 0.0061165730-0.049768193i
## [11,] 0.0075817694-0.079608906i 0.0075817694+0.079608906i
## [12,] 0.0828086114-0.176071890i 0.0828086114+0.176071890i
## [13,] -0.1071881966+0.095968393i -0.1071881966-0.095968393i
## [14,] -0.0063060197+0.031079482i -0.0063060197-0.031079482i
## [15,] -0.0252593109+0.002410129i -0.0252593109-0.002410129i
## [16,] -0.0772327027-0.063383771i -0.0772327027+0.063383771i
## [17,] -0.3611928018-0.197905303i -0.3611928018+0.197905303i
## [18,] 0.1706419282+0.084799169i 0.1706419282-0.084799169i
## [19,] -0.0509019745-0.123735754i -0.0509019745+0.123735754i
## [20,] -0.1123245861-0.281587282i -0.1123245861+0.281587282i
## [21,] -0.0531256172-0.012726429i -0.0531256172+0.012726429i
## [22,] -0.1767791535+0.096979285i -0.1767791535-0.096979285i
## [23,] 0.4282261137+0.000000000i 0.4282261137+0.000000000i
## [24,] -0.0483038931+0.082200870i -0.0483038931-0.082200870i
## [25,] 0.1138052499-0.154446708i 0.1138052499+0.154446708i
## [26,] 0.0115815908+0.046901660i 0.0115815908-0.046901660i
## [27,] -0.0260106709-0.046731141i -0.0260106709+0.046731141i
## [28,] -0.0008769366+0.008054664i -0.0008769366-0.008054664i
## [29,] 0.0363903797+0.021704497i 0.0363903797-0.021704497i
## [30,] -0.0366664109-0.044222469i -0.0366664109+0.044222469i
## [,24] [,25]
## [1,] -0.374033889+0.128223790i -0.374033889-0.128223790i
## [2,] 0.155232335-0.029131310i 0.155232335+0.029131310i
## [3,] -0.042442237+0.009930314i -0.042442237-0.009930314i
## [4,] 0.106976012-0.028668937i 0.106976012+0.028668937i
## [5,] -0.286514656+0.030373491i -0.286514656-0.030373491i
## [6,] -0.277026310+0.157804284i -0.277026310-0.157804284i
## [7,] 0.004501063+0.020473758i 0.004501063-0.020473758i
## [8,] -0.007606337-0.006559975i -0.007606337+0.006559975i
## [9,] -0.035683722+0.002132080i -0.035683722-0.002132080i
## [10,] 0.075046233+0.047416126i 0.075046233-0.047416126i
## [11,] 0.043457628-0.019787681i 0.043457628+0.019787681i
## [12,] 0.105943483-0.072733734i 0.105943483+0.072733734i
## [13,] -0.006332101+0.045598437i -0.006332101-0.045598437i
## [14,] 0.018061574+0.044271737i 0.018061574-0.044271737i
## [15,] -0.049957663+0.018051487i -0.049957663-0.018051487i
## [16,] 0.027355670-0.028841253i 0.027355670+0.028841253i
## [17,] -0.016005802-0.060244236i -0.016005802+0.060244236i
## [18,] 0.303033998-0.026064705i 0.303033998+0.026064705i
## [19,] 0.033985058-0.039282250i 0.033985058+0.039282250i
## [20,] -0.023598920+0.042737585i -0.023598920-0.042737585i
## [21,] -0.492346262+0.000000000i -0.492346262+0.000000000i
## [22,] -0.316679939+0.033453180i -0.316679939-0.033453180i
## [23,] 0.281953108-0.127294323i 0.281953108+0.127294323i
## [24,] -0.145345446+0.039904267i -0.145345446-0.039904267i
## [25,] 0.073238554-0.063532360i 0.073238554+0.063532360i
## [26,] 0.039729096+0.008505963i 0.039729096-0.008505963i
## [27,] -0.063153417+0.008609378i -0.063153417-0.008609378i
## [28,] -0.011315166+0.015304998i -0.011315166-0.015304998i
## [29,] -0.031133259+0.028637472i -0.031133259-0.028637472i
## [30,] -0.039768317+0.005915141i -0.039768317-0.005915141i
## [,26] [,27]
## [1,] -0.300931750-0.100911783i -0.300931750+0.100911783i
## [2,] 0.097440154+0.077955232i 0.097440154-0.077955232i
## [3,] -0.036519815-0.034774385i -0.036519815+0.034774385i
## [4,] 0.107326657+0.124004656i 0.107326657-0.124004656i
## [5,] -0.214296923-0.148844176i -0.214296923+0.148844176i
## [6,] -0.526785078+0.000000000i -0.526785078+0.000000000i
## [7,] -0.052608687-0.013302907i -0.052608687+0.013302907i
## [8,] 0.009080232-0.001947615i 0.009080232+0.001947615i
## [9,] -0.051853060-0.012398649i -0.051853060+0.012398649i
## [10,] -0.061722635-0.031169543i -0.061722635+0.031169543i
## [11,] 0.034450638+0.018453289i 0.034450638-0.018453289i
## [12,] 0.206316758+0.031755907i 0.206316758-0.031755907i
## [13,] -0.124302556-0.038959964i -0.124302556+0.038959964i
## [14,] -0.017356997-0.033213389i -0.017356997+0.033213389i
## [15,] -0.047198292-0.005156031i -0.047198292+0.005156031i
## [16,] 0.070784465+0.039222106i 0.070784465-0.039222106i
## [17,] 0.154976623-0.243892578i 0.154976623+0.243892578i
## [18,] 0.053052053+0.084776930i 0.053052053-0.084776930i
## [19,] 0.112848237-0.168302885i 0.112848237+0.168302885i
## [20,] 0.179292630+0.073936993i 0.179292630-0.073936993i
## [21,] -0.080838951-0.112583701i -0.080838951+0.112583701i
## [22,] -0.196395774-0.113631716i -0.196395774+0.113631716i
## [23,] 0.344462912+0.157033367i 0.344462912-0.157033367i
## [24,] -0.099726713-0.108001114i -0.099726713+0.108001114i
## [25,] 0.057930119+0.034018805i 0.057930119-0.034018805i
## [26,] -0.015279288+0.014936247i -0.015279288-0.014936247i
## [27,] -0.011440587-0.039489048i -0.011440587+0.039489048i
## [28,] -0.046124114-0.024852084i -0.046124114+0.024852084i
## [29,] -0.091446537-0.012883014i -0.091446537+0.012883014i
## [30,] 0.001863452+0.001423799i 0.001863452-0.001423799i
## [,28] [,29] [,30]
## [1,] -0.629893607+0.000000000i -0.629893607+0.000000000i 0.6727786960+0i
## [2,] -0.008756183+0.023531894i -0.008756183-0.023531894i -0.0001119101+0i
## [3,] 0.017400925-0.006131775i 0.017400925+0.006131775i -0.0138847261+0i
## [4,] 0.122613737+0.007008301i 0.122613737-0.007008301i -0.1011826962+0i
## [5,] 0.146931456-0.103278698i 0.146931456+0.103278698i -0.0763563326+0i
## [6,] -0.190599425+0.319603355i -0.190599425-0.319603355i 0.2756819699+0i
## [7,] -0.065611999+0.001506942i -0.065611999-0.001506942i 0.0630558734+0i
## [8,] 0.014657888-0.003027662i 0.014657888+0.003027662i -0.0163136302+0i
## [9,] 0.006106032-0.010807366i 0.006106032+0.010807366i -0.0016255267+0i
## [10,] 0.031183109-0.006462185i 0.031183109+0.006462185i -0.0325869700+0i
## [11,] -0.060767002+0.061951713i -0.060767002-0.061951713i 0.0421557690+0i
## [12,] 0.264976935-0.008086504i 0.264976935+0.008086504i -0.2963669320+0i
## [13,] 0.015070098-0.031208054i 0.015070098+0.031208054i -0.0070839346+0i
## [14,] -0.172689166+0.070515992i -0.172689166-0.070515992i 0.1618743080+0i
## [15,] -0.005073039-0.011630173i -0.005073039+0.011630173i 0.0173198974+0i
## [16,] 0.007162799-0.003632178i 0.007162799+0.003632178i -0.0380321666+0i
## [17,] 0.030507742+0.047392497i 0.030507742-0.047392497i -0.0864093270+0i
## [18,] -0.068018844+0.055305618i -0.068018844-0.055305618i 0.0402317542+0i
## [19,] 0.223151564-0.208936275i 0.223151564+0.208936275i -0.3495597303+0i
## [20,] -0.339653183+0.139246457i -0.339653183-0.139246457i 0.4076278174+0i
## [21,] 0.075431315-0.079953188i 0.075431315+0.079953188i -0.0567785252+0i
## [22,] -0.052936795-0.034310372i -0.052936795+0.034310372i 0.0871546560+0i
## [23,] -0.099267795+0.097283576i -0.099267795-0.097283576i 0.0545640824+0i
## [24,] 0.032471487+0.035153410i 0.032471487-0.035153410i -0.0252169713+0i
## [25,] 0.061764758+0.028204985i 0.061764758-0.028204985i -0.0853659333+0i
## [26,] -0.073197061+0.016642097i -0.073197061-0.016642097i 0.0661492527+0i
## [27,] 0.024166131-0.011869720i 0.024166131+0.011869720i -0.0201605589+0i
## [28,] -0.017340462-0.005572476i -0.017340462+0.005572476i 0.0230414252+0i
## [29,] -0.084940309+0.015797497i -0.084940309-0.015797497i 0.0844364602+0i
## [30,] 0.034850695-0.001841565i 0.034850695+0.001841565i -0.0242845680+0i
library(MASS)
lda_model <- lda(
  G3_class ~ school + sex + age + address + famsize + Pstatus + Medu +
    Fedu + Mjob + Fjob + reason + guardian + traveltime + studytime +
    failures + schoolsup + famsup + paid + activities + nursery + higher +
    internet + romantic + famrel + freetime + goout + Dalc + Walc + health +
    absences_log,
  data = df
)
lda_model
## Call:
## lda(G3_class ~ school + sex + age + address + famsize + Pstatus +
## Medu + Fedu + Mjob + Fjob + reason + guardian + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## nursery + higher + internet + romantic + famrel + freetime +
## goout + Dalc + Walc + health + absences_log, data = df)
##
## Prior probabilities of groups:
## Rendah Sedang Tinggi
## 0.3291139 0.4860759 0.1848101
##
## Group means:
## school sex age address famsize Pstatus Medu Fedu
## Rendah 1.130769 1.423077 17.02308 1.746154 1.261538 1.915385 3.569231 3.353846
## Sedang 1.119792 1.473958 16.61458 1.765625 1.291667 1.890625 3.692708 3.536458
## Tinggi 1.082192 1.561644 16.32877 1.863014 1.328767 1.876712 4.219178 3.780822
## Mjob Fjob reason guardian traveltime studytime failures
## Rendah 3.123077 3.246154 2.084615 1.930769 1.492308 1.946154 1.692308
## Sedang 3.104167 3.244792 2.333333 1.812500 1.468750 2.041667 1.203125
## Tinggi 3.424658 3.438356 2.356164 1.821918 1.315068 2.178082 1.041096
## schoolsup famsup paid activities nursery higher internet
## Rendah 1.176923 1.653846 1.392308 1.500000 1.800000 1.900000 1.800000
## Sedang 1.135417 1.598958 1.510417 1.505208 1.776042 1.963542 1.817708
## Tinggi 1.027397 1.575342 1.438356 1.534247 1.835616 2.000000 1.931507
## romantic famrel freetime goout Dalc Walc health
## Rendah 1.400000 3.884615 3.261538 3.400000 1.553846 2.346154 3.684615
## Sedang 1.322917 3.958333 3.197917 3.005208 1.536458 2.401042 3.526042
## Tinggi 1.246575 4.013699 3.287671 2.863014 1.205479 1.904110 3.397260
## absences_log
## Rendah 1.345458
## Sedang 1.496027
## Tinggi 1.084583
##
## Coefficients of linear discriminants:
## LD1 LD2
## school 1.762872e-01 -0.139843702
## sex 4.705890e-01 0.154230893
## age -2.370496e-01 0.094342414
## address 2.429830e-01 -0.059678583
## famsize 2.624972e-01 0.292793134
## Pstatus -3.474394e-01 0.130037878
## Medu 2.340824e-01 0.472488086
## Fedu 2.527538e-02 -0.214434109
## Mjob -1.362101e-01 -0.003114366
## Fjob 1.571450e-01 0.253801585
## reason 1.224708e-01 -0.071453821
## guardian 2.834280e-02 0.359022739
## traveltime -1.811314e-04 -0.174046495
## studytime 2.377069e-01 0.037099497
## failures -6.717198e-01 0.477526775
## schoolsup -1.105157e+00 -0.446946988
## famsup -4.493481e-01 0.235090198
## paid 9.580752e-03 -0.670968528
## activities -1.592994e-01 0.009157679
## nursery -2.176360e-01 0.409463163
## higher 6.934207e-01 0.272761697
## internet 4.617279e-01 0.709924183
## romantic -4.095911e-01 -0.247177055
## famrel 1.057924e-01 -0.175786972
## freetime 1.727616e-01 0.041681123
## goout -3.849384e-01 0.252153308
## Dalc -2.269466e-01 -0.214982994
## Walc 1.080781e-01 -0.255545442
## health -8.680532e-02 0.021732787
## absences_log -5.487369e-05 -0.483435578
##
## Proportion of trace:
## LD1 LD2
## 0.7654 0.2346
The coefficients obtained from the manual LDA computation differ from those returned by the MASS library because the two approaches use different scalings and estimation methods; MASS, for example, relies on a singular value decomposition (SVD). In theory, a discriminant function is defined only up to a proportionality constant, so rescaling it does not change the resulting classification. Therefore, even though the coefficient values differ, the LDA functions remain representative as long as the discriminant directions produced by the two methods are consistent.
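The proportionality argument can be checked numerically. The sketch below is an illustration only: it uses the built-in iris data rather than the student data, and the names W, B, v_manual, and v_mass are introduced here for the example. It compares the leading eigenvector of W^-1 B with the first column of the scaling matrix returned by MASS::lda.

```r
library(MASS)

# Illustration on the built-in iris data (an assumption; not the student data).
X <- as.matrix(iris[, 1:4])
g <- iris$Species
grand_mean <- colMeans(X)

# Within-group (W) and between-group (B) sums of squares and cross-products.
W <- matrix(0, ncol(X), ncol(X))
B <- matrix(0, ncol(X), ncol(X))
for (k in levels(g)) {
  Xk <- X[g == k, , drop = FALSE]
  mk <- colMeans(Xk)
  Xc <- sweep(Xk, 2, mk)                 # center each group at its own mean
  W <- W + crossprod(Xc)
  B <- B + nrow(Xk) * tcrossprod(mk - grand_mean)
}

# Manual direction: leading eigenvector of W^-1 B.
# Re() drops the zero imaginary parts that eigen() may attach.
v_manual <- Re(eigen(solve(W) %*% B)$vectors[, 1])

# Direction reported by MASS::lda.
v_mass <- lda(X, grouping = g)$scaling[, 1]

# The element-wise ratio should be (nearly) constant ...
ratio <- v_manual / v_mass
print(ratio)

# ... so the two sets of scores are perfectly correlated up to sign.
score_cor <- abs(cor(drop(X %*% v_manual), drop(X %*% v_mass)))
print(score_cor)
```

A constant ratio means the two vectors point in the same direction and differ only in length (and possibly sign), which is the situation the paragraph above describes.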
Because there are 3 classes (Rendah = low, Sedang = medium,
Tinggi = high), the maximum number of discriminant functions is g - 1
with g = 3, which yields 2 discriminant functions, LD1 and
LD2. The resulting discriminant functions are:
D1(x)=0.1763(school)+0.4706(sex)−0.2370(age)+0.2430(address)+0.2625(famsize)−0.3474(Pstatus)+0.2341(Medu)+0.0253(Fedu)−0.1362(Mjob)+0.1571(Fjob)+0.1225(reason)+0.0283(guardian)−0.0002(traveltime)+0.2377(studytime)−0.6717(failures)−1.1052(schoolsup)−0.4493(famsup)+0.0096(paid)−0.1593(activities)−0.2176(nursery)+0.6934(higher)+0.4617(internet)−0.4096(romantic)+0.1058(famrel)+0.1728(freetime)−0.3849(goout)−0.2269(Dalc)+0.1081(Walc)−0.0868(health)−0.00005(absences_log).
D2(x)=−0.1398(school)+0.1542(sex)+0.0943(age)−0.0597(address)+0.2928(famsize)+0.1300(Pstatus)+0.4725(Medu)−0.2144(Fedu)−0.0031(Mjob)+0.2538(Fjob)−0.0715(reason)+0.3590(guardian)−0.1740(traveltime)+0.0371(studytime)+0.4775(failures)−0.4469(schoolsup)+0.2351(famsup)−0.6710(paid)+0.0092(activities)+0.4095(nursery)+0.2728(higher)+0.7099(internet)−0.2472(romantic)−0.1758(famrel)+0.0417(freetime)+0.2522(goout)−0.2150(Dalc)−0.2555(Walc)+0.0217(health)−0.4834(absences_log).
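As a hedged illustration of how such functions are evaluated, the sketch below reproduces discriminant scores from the coefficients. It uses the built-in iris data as a stand-in (an assumption), and it relies on the fact that predict() for an lda object centers the predictors at the prior-weighted mean of the group means before multiplying by the coefficients. Applying the printed coefficients to raw values therefore gives scores that differ from predict()$x by a constant shift. The names n_ld, center, and manual_scores are introduced for the example.

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)     # stand-in data, not the student data
X <- as.matrix(iris[, 1:4])

# Number of discriminant functions: min(g - 1, p).
g <- nlevels(iris$Species)
p <- ncol(X)
n_ld <- min(g - 1, p)

# Scores by hand: center at the prior-weighted mean of the group means,
# then multiply by the coefficient matrix.
center <- colSums(fit$prior * fit$means)
manual_scores <- scale(X, center = center, scale = FALSE) %*% fit$scaling

# Largest absolute difference from the scores returned by predict().
max_diff <- max(abs(manual_scores - predict(fit)$x))
print(c(n_ld = n_ld, max_diff = max_diff))
```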
Interpretation of the Discriminant Functions:
Fisher's Linear Discriminant Analysis produces two discriminant
functions, LD1 and LD2, from the three student grade groups (low,
medium, and high). The first function (LD1) has the greatest separating
power, accounting for 76.54% of the between-group variation (proportion
of trace), while LD2 contributes the remaining 23.46%. On LD1, the
variables with the largest absolute coefficients are
schoolsup, higher, failures,
sex, and internet (followed by famsup and romantic). A large
coefficient indicates that the variable plays a strong role in
determining whether a student tends to fall into the low, medium, or
high performance category. On LD2, the dominant variables
are internet, paid, absences_log,
failures, and Medu. This shows that
internet access, extra paid classes, number of absences, past class
failures, and mother's education still add to the separation of the
groups, although less than the variables on LD1 do.
Overall, these results suggest that academic factors and the
learning environment are the main components that distinguish student
performance.
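Because raw coefficients depend on the measurement scale of each variable, magnitudes are easier to compare after standardization. The sketch below shows one common way to do this, multiplying each raw coefficient by the pooled within-group standard deviation of its variable. It runs on the built-in iris data as an assumption, and pooled_sd and std_coef are names introduced for the example.

```r
library(MASS)

X <- as.matrix(iris[, 1:4])              # stand-in data, not the student data
g <- iris$Species
fit <- lda(X, grouping = g)

# Pooled within-group standard deviation of each predictor.
n <- nrow(X)
k <- nlevels(g)
within_ss <- sapply(colnames(X), function(v) {
  sum(tapply(X[, v], g, function(x) sum((x - mean(x))^2)))
})
pooled_sd <- sqrt(within_ss / (n - k))

# Standardized coefficients: raw coefficient times pooled within-group SD.
# The vector is recycled down the rows, so row i is multiplied by pooled_sd[i].
std_coef <- fit$scaling * pooled_sd
print(round(std_coef, 3))

# Rank predictors by absolute standardized coefficient on LD1.
print(sort(abs(std_coef[, "LD1"]), decreasing = TRUE))
```

The ranking based on standardized coefficients can differ from the ranking based on raw coefficients whenever predictors are measured on different scales.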
# Group means and prior probabilities
means <- lda_model$means
prior <- lda_model$prior
# Covariance matrix of the predictors.
# Note: cov(X) is the total covariance of X, not the pooled within-group
# covariance on which Fisher's classification functions are usually built.
S <- cov(X)
S_inv <- solve(S)
# Intercept of the Fisher classification function
compute_ak <- function(mu, S_inv, prior) {
  mu <- as.matrix(mu)
  ak <- -0.5 * t(mu) %*% S_inv %*% mu + log(prior)
  return(as.numeric(ak))
}
# Compute the intercept for each class
a_rendah <- compute_ak(
  means["Rendah", ],
  S_inv,
  prior["Rendah"]
)
a_sedang <- compute_ak(
  means["Sedang", ],
  S_inv,
  prior["Sedang"]
)
a_tinggi <- compute_ak(
  means["Tinggi", ],
  S_inv,
  prior["Tinggi"]
)
# Fisher coefficients for each class (wk = S^-1 * mu_k)
w_rendah <- S_inv %*% as.matrix(means["Rendah", ])
w_sedang <- S_inv %*% as.matrix(means["Sedang", ])
w_tinggi <- S_inv %*% as.matrix(means["Tinggi", ])
# Table of Fisher classification functions
tabel_fisher <- data.frame(
  Variabel = c("Intercept", colnames(X)),
  Rendah = c(a_rendah, w_rendah),
  Sedang = c(a_sedang, w_sedang),
  Tinggi = c(a_tinggi, w_tinggi)
)
tabel_fisher
## Variabel Rendah Sedang Tinggi
## 1 Intercept -332.2668545 -327.0096067 -329.7840191
## 2 school 0.1753342 0.3602901 0.3586583
## 3 sex 11.5777974 11.8176178 12.1851465
## 4 age 16.3140860 16.1104798 16.0466189
## 5 address 16.9886441 17.1795259 17.2710861
## 6 famsize 7.0557034 7.0899595 7.4407255
## 7 Pstatus 22.0688340 21.7743772 21.6749698
## 8 Medu 2.8579000 2.7867038 3.2485181
## 9 Fedu 0.8870767 1.0071541 0.8698900
## 10 Mjob -2.1644747 -2.2538772 -2.3309825
## 11 Fjob 5.6494172 5.6321323 5.8974719
## 12 reason 1.8306675 1.9467915 1.9637742
## 13 guardian 2.3971531 2.2432662 2.5119453
## 14 traveltime 6.2438606 6.3275103 6.2047172
## 15 studytime 1.9616019 2.1023822 2.2592642
## 16 failures 4.1764698 3.4983499 3.4655631
## 17 schoolsup 26.9055818 26.3831623 25.4602990
## 18 famsup 7.4432682 7.0302385 6.9488419
## 19 paid -0.5666617 -0.2373229 -0.7050508
## 20 activities 2.9754986 2.8647805 2.7836282
## 21 nursery 12.3037423 11.9614207 12.1303797
## 22 higher 62.2468143 62.5782939 63.1519285
## 23 internet 13.1206071 13.0870523 13.8414417
## 24 romantic 2.3633359 2.2089594 1.8094550
## 25 famrel 1.8555919 2.0108021 1.9450630
## 26 freetime 2.5639543 2.6591873 2.7835818
## 27 goout -1.5756857 -1.9539435 -1.9878888
## 28 Dalc -3.4741458 -3.5221277 -3.7984902
## 29 Walc 2.4254229 2.6205472 2.4998397
## 30 health 3.0940964 3.0257055 2.9932867
## 31 absences_log -2.3898923 -2.1572458 -2.4980721
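For comparison, a minimal sketch of Fisher's classification functions built on the pooled within-group covariance matrix is given below. It is an illustration on the built-in iris data (an assumption), and S_pooled, w, a, and agree are names introduced here. Each observation is assigned to the class with the largest value of a_k + w_k'x, and the result is compared with the class returned by predict() for the MASS fit.

```r
library(MASS)

X <- as.matrix(iris[, 1:4])              # stand-in data, not the student data
g <- iris$Species
fit <- lda(Species ~ ., data = iris)

# Pooled within-group covariance: sum of the group SSCP matrices over (n - g).
sscp <- lapply(split(as.data.frame(X), g), function(d) {
  crossprod(scale(as.matrix(d), center = TRUE, scale = FALSE))
})
S_pooled <- Reduce(`+`, sscp) / (nrow(X) - nlevels(g))
S_inv <- solve(S_pooled)

# Classification function of class k: a_k + w_k' x, with
#   w_k = S^-1 mu_k   and   a_k = -0.5 * mu_k' S^-1 mu_k + log(prior_k).
w <- S_inv %*% t(fit$means)                          # p x g matrix
a <- -0.5 * colSums(t(fit$means) * w) + log(fit$prior)

# Score every observation on every class and pick the largest.
scores <- sweep(X %*% w, 2, a, "+")
pred_manual <- levels(g)[apply(scores, 1, which.max)]

# Share of observations on which the manual rule and predict() agree.
agree <- mean(pred_manual == as.character(predict(fit)$class))
print(agree)
```

Only the differences between the class columns matter for the assignment, so a variable with large coefficients in every class does not necessarily influence the classification strongly.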
The critical cutting score in Linear Discriminant Analysis (LDA) is the boundary value used to assign an observation to a group. It is obtained from the mean discriminant scores (centroids) of the groups and serves as the separating point between the categories being analyzed. With a cutoff, classification becomes systematic and consistent, because every observation is assigned according to its position relative to that boundary in the discriminant space.
# LDA prediction
pred <- predict(lda_model)
# Extract the LD scores
lda_scores <- pred$x
df_lda <- data.frame(
  LD1 = lda_scores[, 1],
  LD2 = lda_scores[, 2],
  Actual = df$G3_class
)
# Compute the centroid of each class
centroid <- aggregate(
  cbind(LD1, LD2) ~ Actual,
  data = df_lda,
  mean
)
# Extract the LD1 centroids
c_rendah <- centroid$LD1[centroid$Actual == "Rendah"]
c_sedang <- centroid$LD1[centroid$Actual == "Sedang"]
c_tinggi <- centroid$LD1[centroid$Actual == "Tinggi"]
# Critical cutting scores (cutoffs): midpoints between adjacent centroids
cutoff_rendah_sedang <- (c_rendah + c_sedang) / 2
cutoff_sedang_tinggi <- (c_sedang + c_tinggi) / 2
# Results
cutoff_rendah_sedang
## [1] -0.286551
cutoff_sedang_tinggi
## [1] 0.5221081
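The midpoint rule generalizes to any number of groups along a single discriminant axis. The sketch below uses hypothetical helpers (midpoint_cutoffs and classify_by_cutoff are names introduced here, not part of MASS), demonstrated on made-up scores. A rule of this kind uses only one axis and ignores the prior probabilities, so its predictions can differ from predict(lda_model)$class, which by default uses all discriminant functions and the priors.

```r
# Hypothetical helpers for a one-dimensional midpoint cutting-score rule.
midpoint_cutoffs <- function(scores, groups) {
  # Group centroids on the discriminant axis, sorted from lowest to highest.
  centroids <- sort(vapply(split(scores, groups), mean, numeric(1)))
  # Midpoint between each pair of adjacent centroids.
  cutoffs <- unname((head(centroids, -1) + tail(centroids, -1)) / 2)
  list(centroids = centroids, cutoffs = cutoffs)
}

classify_by_cutoff <- function(scores, rule) {
  # findInterval() returns 0 below the first cutoff, 1 between the first and
  # second, and so on; adding 1 turns that into an index of the sorted groups.
  names(rule$centroids)[findInterval(scores, rule$cutoffs) + 1]
}

# Made-up scores for three groups (illustration only).
toy_scores <- c(-2, -1, 0, 1, 3, 4)
toy_groups <- factor(c("A", "A", "B", "B", "C", "C"))

rule <- midpoint_cutoffs(toy_scores, toy_groups)
print(rule$cutoffs)
print(classify_by_cutoff(toy_scores, rule))
```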
Based on the Fisher classification functions for the three student grade groups, low (Rendah), medium (Sedang), and high (Tinggi), each group has its own constant (intercept) and variable coefficients. The low group has the intercept with the largest magnitude, -332.267, while the medium and high groups have intercepts of -327.010 and -329.784, respectively. Several variables, such as sex, age, address, Pstatus, and schoolsup, have fairly large coefficients in all groups, which suggests that they contribute substantially to classifying students into the low, medium, and high categories. In addition, the differences in coefficients between groups point to distinct characteristic patterns in each academic performance category.
The critical cutting scores on the first discriminant function (LD1) are the midpoints between adjacent group centroids. The cutoff between the low and medium groups is -0.2866, and the cutoff between the medium and high groups is 0.5221. Accordingly, observations with an LD1 score below -0.2866 are classified as low, those with scores from -0.2866 up to 0.5221 as medium, and those with scores of 0.5221 or above as high.
In general, these results show that Fisher's Linear Discriminant Analysis forms linear combinations of the predictors that maximize the differences between groups while minimizing the variation within groups. The centroid and midpoint-cutoff approach, which sets the classification boundaries from the mean discriminant score of each class, provides representative boundaries for separating the groups. This suggests that the model is able to discriminate between students' academic performance categories.
# Predicted class from the LD1 cutoffs
df_lda$Predicted <- ifelse(
  df_lda$LD1 < cutoff_rendah_sedang,
  "Rendah",
  ifelse(
    df_lda$LD1 < cutoff_sedang_tinggi,
    "Sedang",
    "Tinggi"
  )
)
# Fix the level order so the confusion matrix is always square
df_lda$Predicted <- factor(
  df_lda$Predicted,
  levels = c("Rendah", "Sedang", "Tinggi")
)
# Confusion matrix
conf_matrix <- table(
  Aktual = df_lda$Actual,
  Prediksi = df_lda$Predicted
)
conf_matrix
## Prediksi
## Aktual Rendah Sedang Tinggi
## Rendah 82 34 14
## Sedang 63 49 80
## Tinggi 4 24 45
# Check accuracy
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
accuracy
## [1] 0.4455696
APER = number of misclassified observations (off-diagonal entries) / total number of observations in the confusion matrix
# Total number of misclassified observations
error <- sum(conf_matrix) - sum(diag(conf_matrix))
# APER
APER <- error / sum(conf_matrix)
APER
## [1] 0.5544304
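The APER is computed on the same data used to fit the model, so it tends to be optimistic. As a hedged sketch, the block below computes the APER from the classes returned by predict() and a leave-one-out estimate using the CV = TRUE argument of MASS::lda. It runs on the built-in iris data (an assumption), and the helper name aper is introduced here.

```r
library(MASS)

# Apparent error rate: off-diagonal share of a confusion matrix.
aper <- function(conf) 1 - sum(diag(conf)) / sum(conf)

fit <- lda(Species ~ ., data = iris)     # stand-in data, not the student data

# Resubstitution: classify the training data with the fitted model.
conf_resub <- table(Actual = iris$Species, Predicted = predict(fit)$class)
aper_resub <- aper(conf_resub)

# Leave-one-out: with CV = TRUE, lda() returns the cross-validated classes.
fit_loo <- lda(Species ~ ., data = iris, CV = TRUE)
conf_loo <- table(Actual = iris$Species, Predicted = fit_loo$class)
aper_loo <- aper(conf_loo)

print(c(resubstitution = aper_resub, leave_one_out = aper_loo))
```

Comparing the two numbers gives a rough sense of how much the resubstitution estimate understates the error on new observations.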
Based on the confusion matrix, the Linear Discriminant Analysis (LDA) model achieves a classification accuracy of 44.56%, meaning that about 44.56% of observations are correctly classified into the low, medium, and high academic performance categories. The Apparent Error Rate (APER) of 55.44% shows that the model's misclassification rate is still fairly high.
Overall, the low group has the highest number of correct classifications, 82 observations, while the medium and high groups have 49 and 45 correctly classified observations, respectively. Even so, the classifications overlap between groups, particularly for the medium group, of which 80 observations are predicted as high (and a further 63 as low). This indicates that students in the medium and high categories have similar characteristics, which makes them difficult for the discriminant model to separate clearly.
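Per-class rates make this overlap explicit. The sketch below rebuilds the confusion matrix from the counts reported in the output and computes, for each class, the share of actual members that are classified correctly (recall) and the share of predictions that are correct (precision). The names classes, conf, recall, and precision are introduced for the example.

```r
# Confusion matrix copied from the reported output
# (rows = actual class, columns = predicted class).
classes <- c("Rendah", "Sedang", "Tinggi")
conf <- matrix(
  c(82, 34, 14,
    63, 49, 80,
     4, 24, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(Aktual = classes, Prediksi = classes)
)

recall <- diag(conf) / rowSums(conf)      # correct / actual class size
precision <- diag(conf) / colSums(conf)   # correct / predicted class size

print(round(rbind(recall, precision), 3))
print(sum(diag(conf)) / sum(conf))        # overall accuracy
```

With these counts, recall is about 63% for Rendah, 26% for Sedang, and 62% for Tinggi, so most of the misclassification comes from the middle category.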
library(ggplot2)
# Density plot of LD1 by actual class
ggplot(df_lda,
       aes(x = LD1,
           fill = Actual)) +
  geom_density(alpha = 0.4) +
  # Cutoff lines
  geom_vline(
    xintercept = cutoff_rendah_sedang,
    linetype = "dashed",
    linewidth = 1
  ) +
  geom_vline(
    xintercept = cutoff_sedang_tinggi,
    linetype = "dashed",
    linewidth = 1
  ) +
  theme_minimal() +
  labs(
    title = "Distribution of the Linear Discriminant (LD1)",
    subtitle = "With Critical Cutting Score Lines",
    x = "LD1",
    y = "Density"
  )
# Scatter plot of LD1 against LD2 with the discriminant cutoff lines
ggplot(df_lda,
       aes(x = LD1,
           y = LD2,
           color = Actual)) +
  geom_point(size = 3, alpha = 0.8) +
  # Vertical separators at the two LD1 cutoffs
  geom_vline(
    xintercept = cutoff_rendah_sedang,
    linetype = "dashed",
    linewidth = 1
  ) +
  geom_vline(
    xintercept = cutoff_sedang_tinggi,
    linetype = "dashed",
    linewidth = 1
  ) +
  theme_minimal() +
  labs(
    title = "LDA Classification Visualization",
    subtitle = "With Separating Lines Between Classes",
    x = "LD1",
    y = "LD2"
  )
In the LD1 distribution plot, the first discriminant function separates the low group from the other groups reasonably well, as shown by the low group's centroid lying mostly toward the left side of the distribution. The distributions of the medium and high groups, however, still overlap substantially. The scatter plot of LD1 against LD2 likewise shows that the groups are not yet clearly separated, because the observation points still overlap.
Although the model's accuracy is not high, the result is still acceptable given that most of the predictor variables are categorical and that the assumptions of multivariate normality and homogeneity of covariance matrices checked earlier were not fully met. The combination of predictor variables used can discriminate to some extent: it shows a pattern of separation between the academic performance groups and gives a general picture of the characteristics of each category, although it is not yet optimal at separating all groups.