1 Load Libraries

library(lavaan)
library(semPlot)
library(psych)
library(dplyr)
library(knitr)
library(semTools)
library(car)

2 Load Dataset

The dataset contains 504 student observations and 25 variables. It covers demographic information, socioeconomic factors (parental income, socioeconomic level), academic activity (attendance, grades), and student retention.

There are 2 missing values, which are handled in the preprocessing step.

Purpose: Load the raw data and inspect its structure and size before further analysis.

df <- read.csv("Academic performance retention dataset.csv", stringsAsFactors = FALSE)

cat("Number of respondents :", nrow(df), "\n")
## Number of respondents : 504
cat("Number of columns     :", ncol(df), "\n")
## Number of columns     : 25
cat("Missing values        :", sum(is.na(df)), "\n\n")
## Missing values        : 2
dim(df)
## [1] 504  25
str(df)
## 'data.frame':    504 obs. of  25 variables:
##  $ Student_ID              : chr  "STUD_496" "STUD_191" "STUD_125" "STUD_304" ...
##  $ Age                     : int  23 20 23 21 26 20 22 22 21 25 ...
##  $ Gender                  : chr  "Male" "Male" "Female" "Male" ...
##  $ Marital_Status          : chr  "Single" "Married" "Married" "Divorced" ...
##  $ Course_Chosen           : chr  "Technologies" "Education" "Agronomy" "Agronomy" ...
##  $ Application_Mode        : chr  "Online" "Online" "In-person" "Online" ...
##  $ Residence_Location      : chr  "Suburban" "Suburban" "Suburban" "Suburban" ...
##  $ Parental_Education      : chr  "Master" "High School" "Bachelor" "Bachelor" ...
##  $ Parental_Income_Level   : num  65990 65052 69334 36497 97292 ...
##  $ Employment_Status       : chr  "Part-time" "Employed" "Unemployed" "Employed" ...
##  $ Semester_Enrolled_Units : int  2 5 6 2 4 8 1 6 8 3 ...
##  $ Semester_Credited_Units : int  0 3 4 2 0 1 3 2 4 5 ...
##  $ Semester_Evaluated_Units: int  6 5 6 8 0 7 3 1 0 3 ...
##  $ Semester_Approved_Units : int  8 1 0 0 4 7 7 3 4 3 ...
##  $ Semester_Average_Grade  : num  2.2 3.97 2.61 2.33 3.71 0.41 0.81 1.55 0.8 0.61 ...
##  $ Retention               : int  1 0 0 0 0 0 1 0 1 0 ...
##  $ Unemployment_Rate       : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Inflation_Rate          : num  2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 ...
##  $ Regional_GDP            : int  25000 25000 25000 25000 25000 25000 25000 25000 25000 25000 ...
##  $ Year                    : int  2019 2020 2019 2019 2018 2021 2021 2017 2021 2018 ...
##  $ Attendance              : num  95.1 65 81.2 82.1 90.3 ...
##  $ Grade_Average           : num  2.65 2.24 2.47 3.4 3.51 ...
##  $ Study_Career            : chr  "Engineering" "Science" "Arts" "Arts" ...
##  $ Residence_Type          : chr  "Urban" "Urban" "Rural" "Urban" ...
##  $ Socioeconomic_Level     : chr  "Low" "Medium" "Low" "Medium" ...
head(df, 5) %>% kable(caption = "Preview Data")
Preview Data
Student_ID Age Gender Marital_Status Course_Chosen Application_Mode Residence_Location Parental_Education Parental_Income_Level Employment_Status Semester_Enrolled_Units Semester_Credited_Units Semester_Evaluated_Units Semester_Approved_Units Semester_Average_Grade Retention Unemployment_Rate Inflation_Rate Regional_GDP Year Attendance Grade_Average Study_Career Residence_Type Socioeconomic_Level
STUD_496 23 Male Single Technologies Online Suburban Master 65990 Part-time 2 0 6 8 2.20 1 5 2.5 25000 2019 95.13080 2.651185 Engineering Urban Low
STUD_191 20 Male Married Education Online Suburban High School 65052 Employed 5 3 5 1 3.97 0 5 2.5 25000 2020 64.97535 2.240700 Science Urban Medium
STUD_125 23 Female Married Agronomy In-person Suburban Bachelor 69334 Unemployed 6 4 6 0 2.61 0 5 2.5 25000 2019 81.19863 2.467673 Arts Rural Low
STUD_304 21 Male Divorced Agronomy Online Suburban Bachelor 36497 Employed 2 2 8 0 2.33 0 5 2.5 25000 2019 82.14729 3.398225 Arts Urban Medium
STUD_357 26 Male Single Management Online Suburban Bachelor 97292 Employed 4 0 0 4 3.71 0 5 2.5 25000 2018 90.28942 3.508945 Engineering Urban Low
summary(df) %>% kable(digits = 2)   
Student_ID Age Gender Marital_Status Course_Chosen Application_Mode Residence_Location Parental_Education Parental_Income_Level Employment_Status Semester_Enrolled_Units Semester_Credited_Units Semester_Evaluated_Units Semester_Approved_Units Semester_Average_Grade Retention Unemployment_Rate Inflation_Rate Regional_GDP Year Attendance Grade_Average Study_Career Residence_Type Socioeconomic_Level
Length:504 Min. :18.00 Length:504 Length:504 Length:504 Length:504 Length:504 Length:504 Min. : 10360 Length:504 Min. :0.000 Min. :0.000 Min. :0.00 Min. :0.000 Min. :0.000 Min. :0.0000 Min. :5 Min. :2.5 Min. :25000 Min. :2017 Min. :60.20 Min. :1.000 Length:504 Length:504 Length:504
Class :character 1st Qu.:21.00 Class :character Class :character Class :character Class :character Class :character Class :character 1st Qu.: 43568 Class :character 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.00 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:0.0000 1st Qu.:5 1st Qu.:2.5 1st Qu.:25000 1st Qu.:2018 1st Qu.:70.94 1st Qu.:2.455 Class :character Class :character Class :character
Mode :character Median :24.00 Mode :character Mode :character Mode :character Mode :character Mode :character Mode :character Median : 68966 Mode :character Median :5.000 Median :4.000 Median :4.00 Median :4.000 Median :2.080 Median :0.0000 Median :5 Median :2.5 Median :25000 Median :2019 Median :79.54 Median :2.921 Mode :character Mode :character Mode :character
NA Mean :24.59 NA NA NA NA NA NA Mean : 69791 NA Mean :4.421 Mean :4.016 Mean :4.05 Mean :3.996 Mean :2.103 Mean :0.4782 Mean :5 Mean :2.5 Mean :25000 Mean :2019 Mean :80.00 Mean :2.993 NA NA NA
NA 3rd Qu.:27.00 NA NA NA NA NA NA 3rd Qu.: 95091 NA 3rd Qu.:6.000 3rd Qu.:6.000 3rd Qu.:6.00 3rd Qu.:6.000 3rd Qu.:3.058 3rd Qu.:1.0000 3rd Qu.:5 3rd Qu.:2.5 3rd Qu.:25000 3rd Qu.:2020 3rd Qu.:89.60 3rd Qu.:3.520 NA NA NA
NA Max. :48.00 NA NA NA NA NA NA Max. :179712 NA Max. :8.000 Max. :8.000 Max. :8.00 Max. :8.000 Max. :5.906 Max. :1.0000 Max. :5 Max. :2.5 Max. :25000 Max. :2021 Max. :99.99 Max. :5.999 NA NA NA
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA’s :2 NA NA NA NA NA NA NA NA NA NA

3 Preprocessing

This step selects the relevant variables, converts data types, removes rows with missing values, and builds three standardized composite scores (SocioEconomic, NonAcademic, AcademicPerf). These are observed z-scores computed with scale(), not latent variables; the latent constructs of the same names are defined later in the CFA and SEM model syntax. After preprocessing, 502 complete cases remain.

Purpose: Clean and prepare the data so that it meets the requirements of SEM (no missing values and comparable scales).

df_clean <- df %>%
  select(
    Parental_Income_Level,
    Socioeconomic_Level,
    Attendance,
    Grade_Average,
    Semester_Average_Grade,
    Retention
  ) %>%
  mutate(
    across(c(Parental_Income_Level, Attendance, Grade_Average, Semester_Average_Grade), as.numeric),
    # Convert the character levels to numeric codes
    Socioeconomic_Level = case_when(
      Socioeconomic_Level == "Low"    ~ 1,
      Socioeconomic_Level == "Medium" ~ 2,
      Socioeconomic_Level == "High"   ~ 3,
      TRUE ~ NA_real_
    )
  ) %>%
  na.omit() %>%
  mutate(
    # === CHANGE: SocioEconomic now combines two variables ===
    SocioEconomic = scale(Parental_Income_Level + Socioeconomic_Level),
    NonAcademic   = scale(Attendance),
    AcademicPerf  = scale(Grade_Average)
  )

summary(df_clean) %>% kable(digits = 3)
Parental_Income_Level Socioeconomic_Level Attendance Grade_Average Semester_Average_Grade Retention SocioEconomic.V1 NonAcademic.V1 AcademicPerf.V1
Min. : 10360 Min. :1.000 Min. :60.20 Min. :1.000 Min. :0.000 Min. :0.0000 Min. :-1.8733512900000 Min. :-1.74785928604e+00 Min. :-2.744683137730
1st Qu.: 43750 1st Qu.:1.000 1st Qu.:70.89 1st Qu.:2.451 1st Qu.:1.000 1st Qu.:0.0000 1st Qu.:-0.8221302741070 1st Qu.:-8.02944221646e-01 1st Qu.:-0.743937757344
Median : 68966 Median :2.000 Median :79.54 Median :2.915 Median :2.080 Median :0.0000 Median :-0.0282345245206 Median :-3.83372948686e-02 Median :-0.104175607455
Mean : 69863 Mean :1.972 Mean :79.98 Mean :2.991 Mean :2.103 Mean :0.4761 Mean : 0.0000000000000 Mean :-1.00000000000e-16 Mean : 0.000000000000
3rd Qu.: 95091 3rd Qu.:3.000 3rd Qu.:89.53 3rd Qu.:3.519 3rd Qu.:3.058 3rd Qu.:1.0000 3rd Qu.: 0.7942083205850 3rd Qu.: 8.44128058850e-01 3rd Qu.: 0.728337284238
Max. :179712 Max. :3.000 Max. :99.99 Max. :5.999 Max. :5.906 Max. :1.0000 Max. : 3.4583835222800 Max. : 1.76843507565e+00 Max. : 4.147743359170
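The composite `SocioEconomic` above adds an income measured in the tens of thousands to a code that only ranges from 1 to 3, so after `scale()` the result is almost a copy of standardized income. One alternative, assuming equal weighting of the two components is acceptable, is to standardize each component before summing. The sketch below illustrates this on simulated data (not the real dataset), and the helper name `z_composite` is hypothetical:

```r
set.seed(1)
n <- 500
income <- runif(n, 10000, 180000)          # simulated income, not the real data
level  <- sample(1:3, n, replace = TRUE)   # simulated 1-3 socioeconomic code

# Raw sum, as in the preprocessing step: income dominates the scale
raw_sum <- as.numeric(scale(income + level))

# Hypothetical helper: z-score each component, sum, then re-standardize
z_composite <- function(...) {
  z <- sapply(list(...), function(x) as.numeric(scale(x)))
  as.numeric(scale(rowSums(z)))
}
balanced <- z_composite(income, level)

# Correlation of each composite with its two components
cor_raw      <- c(income = cor(raw_sum, income),  level = cor(raw_sum, level))
cor_balanced <- c(income = cor(balanced, income), level = cor(balanced, level))
print(round(rbind(cor_raw, cor_balanced), 3))
```

Wrapping `scale()` in `as.numeric()` also avoids the one-column matrices that produce the `.V1` suffixes in the summary above.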

3.1 Descriptive Statistics

describe(df_clean[, c("Parental_Income_Level", "Attendance", "Grade_Average", 
                     "Semester_Average_Grade", "Retention")]) %>% 
  kable(digits = 3, caption = "Descriptive Statistics")
Descriptive Statistics
vars n mean sd median trimmed mad min max range skew kurtosis se
Parental_Income_Level 1 502 69863.350 31763.358 68966.000 69043.622 38077.616 10360.500 179712.000 169351.500 0.413 0.093 1417.668
Attendance 2 502 79.978 11.314 79.545 79.987 13.659 60.203 99.987 39.785 0.032 -1.148 0.505
Grade_Average 3 502 2.991 0.725 2.915 2.976 0.799 1.000 5.999 4.999 0.677 2.677 0.032
Semester_Average_Grade 4 502 2.103 1.326 2.080 2.030 1.542 0.000 5.906 5.906 0.486 -0.205 0.059
Retention 5 502 0.476 0.500 0.000 0.470 0.000 0.000 1.000 1.000 0.095 -1.995 0.022

3.2 Data Distribution Plots

par(mfrow = c(1, 3))  # three histograms side by side
for (var in c("Parental_Income_Level", "Attendance", "Grade_Average")) {
  hist(df_clean[[var]], main = paste("Histogram of", var), 
       col = "lightblue", xlab = var)
}
par(mfrow = c(1, 1))

## Assumption Tests

3.3 Normal Q-Q Plot

par(mfrow = c(2, 2))
for (var in c("SocioEconomic", "NonAcademic", "AcademicPerf")) {
  qqnorm(df_clean[[var]], main = paste("Q-Q Plot", var))
  qqline(df_clean[[var]], col = "red")
}
par(mfrow = c(1, 1))

3.4 Multivariate Normality (Mardia Test)

mardia(df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")])

## Call: mardia(x = df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")])
## 
## Mardia tests of multivariate skew and kurtosis
## Use describe(x) the to get univariate tests
## n.obs = 502   num.vars =  3 
## b1p =  1.13   skew =  94.75  with probability  <=  6.1e-16
##  small sample skew =  95.6  with probability <=  4.1e-16
## b2p =  17.89   kurtosis =  5.92  with probability <=  3.3e-09

3.5 KMO & Bartlett Test

KMO(df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")])
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")])
## Overall MSA =  0.5
## MSA for each item = 
## SocioEconomic   NonAcademic  AcademicPerf 
##          0.50          0.56          0.50
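# Note: stats::bartlett.test() tests homogeneity of variances, not sphericity.
# On standardized columns (all variances equal to 1) it is uninformative,
# which is why the statistic below is essentially zero with p = 1.
bartlett.test(df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")])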
## 
##  Bartlett test of homogeneity of variances
## 
## data:  df_clean[, c("SocioEconomic", "NonAcademic", "AcademicPerf")]
## Bartlett's K-squared = -5.5573e-14, df = 2, p-value = 1
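The `bartlett.test()` call above compares variances across columns. Suitability for factor analysis is usually checked with Bartlett's test of sphericity instead, which asks whether the correlation matrix differs from the identity matrix. A minimal base-R sketch of that statistic follows, run on simulated data rather than `df_clean`; the function name `bartlett_sphericity` is hypothetical, and `psych::cortest.bartlett()` is a packaged alternative:

```r
# Bartlett's test of sphericity: H0 is that the correlation matrix is identity
bartlett_sphericity <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  R <- cor(x)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Simulated example: three variables sharing one common factor
set.seed(42)
f <- rnorm(300)
sim <- data.frame(a = f + rnorm(300), b = f + rnorm(300), c = f + rnorm(300))
res <- bartlett_sphericity(sim)
print(res)
```

A small p-value here indicates that the variables are correlated enough for factoring to be meaningful; a large one suggests there is little shared variance to model.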

3.6 Multicollinearity (VIF)

model_vif <- lm(AcademicPerf ~ SocioEconomic + NonAcademic, data = df_clean)
vif(model_vif)
## SocioEconomic   NonAcademic 
##      1.000019      1.000019
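With exactly two predictors, each VIF equals 1 / (1 - r²), where r is the correlation between the two predictors. The short sketch below inverts that formula for the value reported above to show how weakly SocioEconomic and NonAcademic are correlated; the variable names are illustrative only:

```r
vif_reported <- 1.000019              # value printed by vif() above
r_squared <- 1 - 1 / vif_reported     # invert VIF = 1 / (1 - r^2)
r_abs <- sqrt(r_squared)              # absolute correlation between predictors
cat("r^2 =", signif(r_squared, 3), " |r| =", signif(r_abs, 3), "\n")
```

A VIF this close to 1 means multicollinearity is not a concern between these two composites.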

4 Confirmatory Factor Analysis (CFA)

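CFA is used to test measurement validity (the outer model). The output below does not support a good fit: chi-square = 9.603 (df = 3, p = 0.022), CFI = 0.787, TLI = 0.289, and RMSEA = 0.066, although SRMR = 0.036 is acceptable. The solution itself is also not trustworthy: the optimizer needed 1589 iterations, every standard error is NA, and the standardized covariances of SocioEconomic with the other two factors (94.227 and -1678.218) lie far outside the admissible range of -1 to 1. Several standardized loadings are close to zero (Parental_Income_Level = 0.000, Socioeconomic_Level = 0.000, Semester_Average_Grade = 0.075), which indicates that these indicators represent their latent constructs poorly. The loading of 1.000 for Attendance arises because it is the only indicator of NonAcademic and its residual variance is fixed at 0.

Purpose: Confirm that the chosen indicators are theoretically and statistically appropriate for measuring the latent constructs.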

model_cfa <- '
  SocioEconomic =~ Parental_Income_Level + Socioeconomic_Level
  NonAcademic   =~ Attendance
  AcademicPerf  =~ Grade_Average + Semester_Average_Grade
'

fit_cfa <- cfa(model_cfa, data = df_clean, std.lv = TRUE)
summary(fit_cfa, fit.measures = TRUE, standardized = TRUE)
## lavaan 0.6-21 ended normally after 1589 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        12
## 
##   Number of observations                           502
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 9.603
##   Degrees of freedom                                 3
##   P-value (Chi-square)                           0.022
## 
## Model Test Baseline Model:
## 
##   Test statistic                                40.956
##   Degrees of freedom                                10
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.787
##   Tucker-Lewis Index (TLI)                       0.289
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9855.167
##   Loglikelihood unrestricted model (H1)      -9850.366
##                                                       
##   Akaike (AIC)                               19734.334
##   Bayesian (BIC)                             19784.957
##   Sample-size adjusted Bayesian (SABIC)      19746.868
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.066
##   90 Percent confidence interval - lower         0.022
##   90 Percent confidence interval - upper         0.115
##   P-value H_0: RMSEA <= 0.050                    0.228
##   P-value H_0: RMSEA >= 0.080                    0.369
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.036
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate        Std.Err  z-value  P(>|z|)   Std.lv      
##   SocioEconomic =~                                                         
##     Prntl_Incm_Lvl          9.731       NA                            9.731
##     Socioecnmc_Lvl          0.000       NA                            0.000
##   NonAcademic =~                                                           
##     Attendance             11.303       NA                           11.303
##   AcademicPerf =~                                                          
##     Grade_Average          -0.335       NA                           -0.335
##     Smstr_Avrg_Grd          0.099       NA                            0.099
##   Std.all 
##           
##      0.000
##      0.000
##           
##      1.000
##           
##     -0.462
##      0.075
## 
## Covariances:
##                    Estimate        Std.Err  z-value  P(>|z|)   Std.lv      
##   SocioEconomic ~~                                                         
##     NonAcademic            94.227       NA                           94.227
##     AcademicPerf        -1678.218       NA                        -1678.218
##   NonAcademic ~~                                                           
##     AcademicPerf           -0.016       NA                           -0.016
##   Std.all 
##           
##     94.227
##  -1678.218
##           
##     -0.016
## 
## Variances:
##                    Estimate        Std.Err  z-value  P(>|z|)   Std.lv      
##    .Prntl_Incm_Lvl 1006900838.098       NA                   1006900838.098
##    .Socioecnmc_Lvl          0.696       NA                            0.696
##    .Attendance              0.000                                     0.000
##    .Grade_Average           0.414       NA                            0.414
##    .Smstr_Avrg_Grd          1.746       NA                            1.746
##     SocioEconomic           1.000                                     1.000
##     NonAcademic             1.000                                     1.000
##     AcademicPerf            1.000                                     1.000
##   Std.all 
##      1.000
##      1.000
##      0.000
##      0.786
##      0.994
##      1.000
##      1.000
##      1.000

5 Structural Equation Modeling (SEM)

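The overall SEM shows mixed fit: chi-square = 10.723 (df = 6, p = 0.097), RMSEA = 0.040, and SRMR = 0.032 are acceptable, but CFI = 0.843 and TLI = 0.607 are well below the usual thresholds. The largest structural path is the positive one from SocioEconomic to AcademicPerf (estimate = 0.468, standardized = 0.424). The remaining paths, including those to Retention, have very small coefficients. Because all standard errors are NA in the output below, none of the paths can be tested for significance.

Purpose: Test the hypothesized causal relationships among the latent constructs simultaneously, as specified by the underlying theory.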

model_sem <- '
  SocioEconomic =~ Parental_Income_Level + Socioeconomic_Level
  NonAcademic   =~ Attendance
  AcademicPerf  =~ Grade_Average + Semester_Average_Grade

  NonAcademic ~ SocioEconomic
  AcademicPerf ~ SocioEconomic + NonAcademic
  Retention ~ AcademicPerf + NonAcademic
'

fit <- sem(model_sem, data = df_clean, std.lv = TRUE)

summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
## lavaan 0.6-21 ended normally after 21 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        15
## 
##   Number of observations                           502
## 
## Model Test User Model:
##                                                       
##   Test statistic                                10.723
##   Degrees of freedom                                 6
##   P-value (Chi-square)                           0.097
## 
## Model Test Baseline Model:
## 
##   Test statistic                                45.076
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.843
##   Tucker-Lewis Index (TLI)                       0.607
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -10217.440
##   Loglikelihood unrestricted model (H1)     -10212.078
##                                                       
##   Akaike (AIC)                               20464.880
##   Bayesian (BIC)                             20528.159
##   Sample-size adjusted Bayesian (SABIC)      20480.547
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.040
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.077
##   P-value H_0: RMSEA <= 0.050                    0.623
##   P-value H_0: RMSEA >= 0.080                    0.037
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.032
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate       Std.Err  z-value  P(>|z|)   Std.lv     
##   SocioEconomic =~                                                       
##     Prntl_Incm_Lvl     22468.107       NA                       22468.107
##     Socioecnmc_Lvl        -0.066       NA                          -0.066
##   NonAcademic =~                                                         
##     Attendance            11.300       NA                          11.300
##   AcademicPerf =~                                                        
##     Grade_Average          0.514       NA                           0.567
##     Smstr_Avrg_Grd        -0.058       NA                          -0.064
##   Std.all
##          
##     0.708
##    -0.079
##          
##     1.000
##          
##     0.783
##    -0.048
## 
## Regressions:
##                    Estimate       Std.Err  z-value  P(>|z|)   Std.lv     
##   NonAcademic ~                                                          
##     SocioEconomic         -0.003       NA                          -0.003
##   AcademicPerf ~                                                         
##     SocioEconomic          0.468       NA                           0.424
##     NonAcademic            0.017       NA                           0.016
##   Retention ~                                                            
##     AcademicPerf           0.022       NA                           0.024
##     NonAcademic           -0.032       NA                          -0.032
##   Std.all
##          
##    -0.003
##          
##     0.424
##     0.016
##          
##     0.048
##    -0.064
## 
## Variances:
##                    Estimate       Std.Err  z-value  P(>|z|)   Std.lv     
##    .Prntl_Incm_Lvl 503450562.115       NA                   503450562.115
##    .Socioecnmc_Lvl         0.692       NA                           0.692
##    .Attendance             0.000                                    0.000
##    .Grade_Average          0.203       NA                           0.203
##    .Smstr_Avrg_Grd         1.751       NA                           1.751
##    .Retention              0.248       NA                           0.248
##     SocioEconomic          1.000                                    1.000
##    .NonAcademic            1.000                                    1.000
##    .AcademicPerf           1.000                                    0.820
##   Std.all
##     0.499
##     0.994
##     0.000
##     0.387
##     0.998
##     0.994
##     1.000
##     1.000
##     0.820
## 
## R-Square:
##                    Estimate     
##     Prntl_Incm_Lvl         0.501
##     Socioecnmc_Lvl         0.006
##     Attendance             1.000
##     Grade_Average          0.613
##     Smstr_Avrg_Grd         0.002
##     Retention              0.006
##     NonAcademic            0.000
##     AcademicPerf           0.180

6 Model Evaluation

# Model Fit Indices
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr")) %>% 
  kable(digits = 3, caption = "Model Fit Indices")
Model Fit Indices
x
chisq 10.72271199
df 6.00000000
pvalue 0.09733381
cfi 0.84297595
tli 0.60743988
rmsea 0.03959754
srmr 0.03155162

6.1 Loading Factor (Outer Model)

parameterEstimates(fit, standardized = TRUE) %>%
  filter(op == "=~") %>%
  select(Latent = lhs, Indicator = rhs, Loading = std.all, pvalue) %>%
  kable(digits = 3, caption = "Outer Model - Standardized Loadings")
Outer Model - Standardized Loadings
Latent Indicator Loading pvalue
SocioEconomic Parental_Income_Level 0.708 NA
SocioEconomic Socioeconomic_Level -0.079 NA
NonAcademic Attendance 1.000 NA
AcademicPerf Grade_Average 0.783 NA
AcademicPerf Semester_Average_Grade -0.048 NA

The outer model is used to assess the validity and reliability of the indicators in measuring the latent constructs.

  • Loading factors: Attendance has a loading of 1.000, but only because it is the sole indicator of NonAcademic. Parental_Income_Level (0.708) and Grade_Average (0.783) load adequately, whereas Socioeconomic_Level (-0.079) and Semester_Average_Grade (-0.048) have near-zero loadings. Each two-indicator construct is therefore effectively measured by a single indicator. All p-values are NA because standard errors were not computed.
  • Outer model conclusion: Overall the measurement model is weak, particularly for the SocioEconomic and AcademicPerf constructs. Adding further indicators (for example Parental Education or regional economic conditions) is recommended to strengthen the measurement.

Purpose of the outer model: Ensure that the indicators genuinely represent the latent constructs they are meant to measure.

6.2 Evaluasi Inner Model (Structural Model)

parameterEstimates(fit, standardized = TRUE) %>%
  filter(op == "~") %>%
  select(Dependent = lhs, Independent = rhs, Coefficient = est, 
         Std_Coefficient = std.all, pvalue) %>%
  kable(digits = 3, caption = "Inner Model - Path Coefficients")
Inner Model - Path Coefficients
Dependent Independent Coefficient Std_Coefficient pvalue
NonAcademic SocioEconomic -0.003 -0.003 NA
AcademicPerf SocioEconomic 0.468 0.424 NA
AcademicPerf NonAcademic 0.017 0.016 NA
Retention AcademicPerf 0.022 0.048 NA
Retention NonAcademic -0.032 -0.064 NA

Interpretation of the inner model:

The inner model tests the hypothesized causal relationships among the constructs.

  • SocioEconomic has a positive effect on AcademicPerf (standardized coefficient = 0.424). Its significance cannot be assessed because the p-value is NA.
  • NonAcademic has a negligible effect on AcademicPerf (standardized coefficient = 0.016).
  • The effects on Retention are very small (0.048 from AcademicPerf, -0.064 from NonAcademic), and R² for Retention is only 0.006.
  • R² for AcademicPerf is 0.180, so the model explains about 18% of the variance in academic performance.

Purpose of the inner model: Test the hypothesized cause-and-effect relationships among the latent variables, as specified by the underlying theory.

7 Path Diagram

semPaths(fit, 
         what = "std",           # standardized coefficients
         whatLabels = "std", 
         layout = "tree2",       # tree2 is tidier than tree
         residuals = FALSE,
         edge.label.cex = 1.0,   # edge label size
         label.cex = 1.1,        # node label size
         sizeMan = 10,           # size of observed-variable nodes
         sizeLat = 12,           # size of latent-variable nodes
         nCharNodes = 20,
         fade = FALSE,
         color = list(lat = "#1f78b4", man = "#a6cee3"),  # blue palette
         edge.color = "#2c3e50",
         edge.label.color = "black",
         esize = 1.2,            # edge width
         rotation = 2)

title("Path Diagram Structural Equation Modeling (SEM)", 
      line = 2.5, cex.main = 1.4, font.main = 2)


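8 Conclusion

The SEM shows mixed goodness of fit: RMSEA = 0.040 and SRMR = 0.032 are acceptable, but CFI = 0.843 and TLI = 0.607 fall well below the usual thresholds. The outer model still needs improvement because several loading factors are very low.

  • Students' socioeconomic factors show the largest structural effect on academic performance (standardized coefficient = 0.424), although its significance could not be tested because standard errors were not computed.
  • Class attendance (Attendance) is not supported as a dominant factor: its standardized paths to AcademicPerf (0.016) and Retention (-0.064) are close to zero.
  • The recommendation that universities improve retention by raising academic performance and supporting students from low-income families remains a hypothesis, since this model explains less than 1% of the variance in Retention (R² = 0.006).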