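In linear regression analysis, the total variation in the data (Total Sum of Squares, SST) can be decomposed into two components: the regression sum of squares (SSR), which is the variation explained by the model, and the error sum of squares (SSE), which is the residual variation the model leaves unexplained.
Mathematically: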
\[ SST = SSR + SSE \]
This decomposition is the basis for measuring how well a regression model explains the data: the larger the share of SST accounted for by SSR, the better the fit.
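The identity can be seen by splitting each deviation from the mean into a residual part and an explained part. The following is a sketch of that argument; the notation \(y_i\) for the observations, \(\hat{y}_i\) for the fitted values, and \(\bar{y}\) for the mean of y is assumed here rather than taken from the original text.

\[ y_i - \bar{y} = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}) \]

Squaring both sides and summing over all observations gives

\[ \sum_i (y_i - \bar{y})^2 = \sum_i (y_i - \hat{y}_i)^2 + \sum_i (\hat{y}_i - \bar{y})^2 + 2 \sum_i (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) \]

For a least-squares fit that includes an intercept, the residuals sum to zero and are orthogonal to the fitted values, so the cross term vanishes. The three remaining sums are SST, SSE, and SSR respectively.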
# Set the seed so the results are reproducible
set.seed(123)
# Create the data
x <- 1:10
y <- 2 + 1.5 * x + rnorm(10, mean = 0, sd = 2) # Linear model + noise
# Fit the regression model
model <- lm(y ~ x)
# Show the model summary
summary(model)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.2695 -1.1248 -0.2785  0.7707  3.3628
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   3.0509     1.3346   2.286 0.051577 .
## x             1.3361     0.2151   6.212 0.000256 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.954 on 8 degrees of freedom
## Multiple R-squared: 0.8283, Adjusted R-squared: 0.8068
## F-statistic: 38.59 on 1 and 8 DF, p-value: 0.0002561
# Mean of y
y_mean <- mean(y)
# Fitted (predicted) values from the model
y_hat <- fitted(model)
# Compute SSE, SSR, and SST
SSE <- sum((y - y_hat)^2)      # observations vs. fitted values
SSR <- sum((y_hat - y_mean)^2) # fitted values vs. mean of y
SST <- sum((y - y_mean)^2)     # observations vs. mean of y
# Show the results (Komponen = component, Nilai = value)
data.frame(
  Komponen = c("SSE", "SSR", "SST", "SSE + SSR"),
  Nilai = round(c(SSE, SSR, SST, SSE + SSR), 2)
)
##    Komponen  Nilai
## 1       SSE  30.53
## 2       SSR 147.27
## 3       SST 177.80
## 4 SSE + SSR 177.80
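The three sums of squares also tie back to the figures reported by `summary(model)`. The block below is a minimal sketch of that link, assuming the same simulated data as above; the names `r2_manual`, `rse_manual`, `f_manual`, and `df_resid` are illustrative and do not appear in the original. It recomputes R-squared, the residual standard error, and the F-statistic from SSE, SSR, and SST, then compares them with what R reports.

```r
# Recreate the data and model from the example above
set.seed(123)
x <- 1:10
y <- 2 + 1.5 * x + rnorm(10, mean = 0, sd = 2)
model <- lm(y ~ x)

# Sums of squares, computed the same way as in the example
y_hat <- fitted(model)
SSE <- sum((y - y_hat)^2)
SSR <- sum((y_hat - mean(y))^2)
SST <- sum((y - mean(y))^2)

# Quantities derived from the decomposition (illustrative names)
df_resid   <- df.residual(model)           # n - 2 residual degrees of freedom
r2_manual  <- SSR / SST                    # share of variation explained
rse_manual <- sqrt(SSE / df_resid)         # residual standard error
f_manual   <- (SSR / 1) / (SSE / df_resid) # F-statistic with one predictor

# Compare with the values reported by summary() and anova()
s <- summary(model)
all.equal(r2_manual, s$r.squared)
all.equal(rse_manual, s$sigma)
all.equal(f_manual, unname(s$fstatistic["value"]))
all.equal(c(SSR, SSE), anova(model)[["Sum Sq"]])
```

Each `all.equal()` call should return `TRUE`, which shows that the summary statistics are functions of the same decomposition; the "Sum Sq" column of `anova(model)` lists SSR and SSE directly.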
# Plot the original data and the regression model
plot(x, y, pch = 16, col = "blue",
     main = "Visualization of SST, SSR, and SSE",
     xlab = "x", ylab = "y")
# Add the regression line and the mean of y
abline(model, col = "red", lwd = 2)            # Regression line
abline(h = y_mean, col = "darkgreen", lty = 2) # Mean of y
# Add a vertical segment for each component
segments(x, y, x, y_hat, col = "orange", lwd = 2)      # SSE: observation to fitted value
segments(x, y_hat, x, y_mean, col = "purple", lwd = 2) # SSR: fitted value to mean
segments(x, y, x, y_mean, col = "grey", lty = 3)       # SST: observation to mean
# Add a legend
legend("topleft",
       legend = c("Regression line", "Mean of y", "SSE", "SSR", "SST"),
       col = c("red", "darkgreen", "orange", "purple", "grey"),
       lty = c(1, 2, 1, 1, 3), lwd = 2)