Multiple regression analysis is a statistical method for modeling the relationship between one dependent (response) variable and two or more independent (predictor) variables. It quantifies how much each explanatory variable contributes to changes in the response.
This analysis uses economic data from 41 countries. The variables are as follows:
| Role | Variable | Description | Unit |
|---|---|---|---|
| Y (Response) | PNB | Gross National Income (Pendapatan Nasional Bruto) | Thousand US$ |
| X1 (Predictor) | PDB | Gross Domestic Product (Produk Domestik Bruto) | Thousand US$ |
| X2 (Predictor) | Ekspor | Exports of goods and services | Thousand US$ |
# Install any package that is not yet available, then load it
packages <- c("corrplot", "lmtest", "car", "knitr", "e1071")
for (pkg in packages) {
  if (!require(pkg, character.only = TRUE, quietly = TRUE)) {
    install.packages(pkg)
    library(pkg, character.only = TRUE)
  }
}
# Data entered inline (no xlsx file required)
data <- data.frame(
No = 1:41,
Negara = c("Argentina","Bahamas","Bahrain","Barbados","Brunei Darussalam",
"Bulgaria","Canada","Chile","Costa Rica","Croatia",
"Czechia","Estonia","France","Germany","Greece",
"Guyana","Hungary","Israel","Italy","Kazakhstan",
"Korea Selatan","Latvia","Lithuania","Mexico","New Zealand",
"Oman","Palau","Panama","Poland","Portugal",
"Romania","Saudi Arabia","Serbia","Seychelles","Slovak Republic",
"Slovenia","Spain","Turkiye","United Arab Emirates","United Kingdom",
"Uruguay"),
PNB = c(12.2,32.8,28.2,21.3,34.0,14.3,54.1,15.7,14.3,20.4,
28.2,27.9,45.1,53.6,22.2,19.9,20.2,54.0,38.4,11.0,
34.4,21.9,25.3,12.3,48.6,21.7,13.7,17.6,20.3,26.3,
16.8,28.5,9.8,16.6,23.0,30.9,32.3,11.6,53.4,48.3,20.2),
PDB = c(13.7,34.7,29.1,22.7,33.4,15.8,53.4,17.1,16.6,21.5,
30.4,29.8,44.5,52.7,23.0,20.6,22.1,52.3,38.4,13.1,
33.1,23.2,27.1,13.9,48.5,23.3,14.6,18.7,22.1,27.3,
18.4,28.9,11.4,17.9,24.5,32.2,32.7,13.0,53.0,48.9,22.6),
Ekspor = c(82.8,5.7,35.2,2.0,11.6,61.8,717.7,104.5,32.3,44.7,
238.3,31.9,990.5,2.1,106.9,0.7,172.5,157.4,790.4,9.4,
753.6,27.9,61.1,647.6,60.2,46.3,0.0,36.7,469.0,136.2,
137.3,371.0,45.0,1.8,121.4,57.3,615.8,357.5,335.2,1.1,21.2)
)
# Display the full data set
kable(data, caption = "Data Komlan (41 Negara)", align = "c")

| No | Negara | PNB | PDB | Ekspor |
|---|---|---|---|---|
| 1 | Argentina | 12.2 | 13.7 | 82.8 |
| 2 | Bahamas | 32.8 | 34.7 | 5.7 |
| 3 | Bahrain | 28.2 | 29.1 | 35.2 |
| 4 | Barbados | 21.3 | 22.7 | 2.0 |
| 5 | Brunei Darussalam | 34.0 | 33.4 | 11.6 |
| 6 | Bulgaria | 14.3 | 15.8 | 61.8 |
| 7 | Canada | 54.1 | 53.4 | 717.7 |
| 8 | Chile | 15.7 | 17.1 | 104.5 |
| 9 | Costa Rica | 14.3 | 16.6 | 32.3 |
| 10 | Croatia | 20.4 | 21.5 | 44.7 |
| 11 | Czechia | 28.2 | 30.4 | 238.3 |
| 12 | Estonia | 27.9 | 29.8 | 31.9 |
| 13 | France | 45.1 | 44.5 | 990.5 |
| 14 | Germany | 53.6 | 52.7 | 2.1 |
| 15 | Greece | 22.2 | 23.0 | 106.9 |
| 16 | Guyana | 19.9 | 20.6 | 0.7 |
| 17 | Hungary | 20.2 | 22.1 | 172.5 |
| 18 | Israel | 54.0 | 52.3 | 157.4 |
| 19 | Italy | 38.4 | 38.4 | 790.4 |
| 20 | Kazakhstan | 11.0 | 13.1 | 9.4 |
| 21 | Korea Selatan | 34.4 | 33.1 | 753.6 |
| 22 | Latvia | 21.9 | 23.2 | 27.9 |
| 23 | Lithuania | 25.3 | 27.1 | 61.1 |
| 24 | Mexico | 12.3 | 13.9 | 647.6 |
| 25 | New Zealand | 48.6 | 48.5 | 60.2 |
| 26 | Oman | 21.7 | 23.3 | 46.3 |
| 27 | Palau | 13.7 | 14.6 | 0.0 |
| 28 | Panama | 17.6 | 18.7 | 36.7 |
| 29 | Poland | 20.3 | 22.1 | 469.0 |
| 30 | Portugal | 26.3 | 27.3 | 136.2 |
| 31 | Romania | 16.8 | 18.4 | 137.3 |
| 32 | Saudi Arabia | 28.5 | 28.9 | 371.0 |
| 33 | Serbia | 9.8 | 11.4 | 45.0 |
| 34 | Seychelles | 16.6 | 17.9 | 1.8 |
| 35 | Slovak Republic | 23.0 | 24.5 | 121.4 |
| 36 | Slovenia | 30.9 | 32.2 | 57.3 |
| 37 | Spain | 32.3 | 32.7 | 615.8 |
| 38 | Turkiye | 11.6 | 13.0 | 357.5 |
| 39 | United Arab Emirates | 53.4 | 53.0 | 335.2 |
| 40 | United Kingdom | 48.3 | 48.9 | 1.1 |
| 41 | Uruguay | 20.2 | 22.6 | 21.2 |
## Jumlah Observasi : 41
## Jumlah Variabel : 5
## 'data.frame': 41 obs. of 5 variables:
## $ No : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Negara: chr "Argentina" "Bahamas" "Bahrain" "Barbados" ...
## $ PNB : num 12.2 32.8 28.2 21.3 34 14.3 54.1 15.7 14.3 20.4 ...
## $ PDB : num 13.7 34.7 29.1 22.7 33.4 15.8 53.4 17.1 16.6 21.5 ...
## $ Ekspor: num 82.8 5.7 35.2 2 11.6 ...
stats_table <- data.frame(
  Variabel = c("PNB (Y)", "PDB (X1)", "Ekspor (X2)"),
  N = rep(nrow(data), 3),
  Min = c(min(data$PNB), min(data$PDB), min(data$Ekspor)),
  Max = c(max(data$PNB), max(data$PDB), max(data$Ekspor)),
  Mean = c(round(mean(data$PNB),3), round(mean(data$PDB),3), round(mean(data$Ekspor),3)),
  Median = c(round(median(data$PNB),3), round(median(data$PDB),3), round(median(data$Ekspor),3)),
  SD = c(round(sd(data$PNB),3), round(sd(data$PDB),3), round(sd(data$Ekspor),3)),
  Skewness = c(round(skewness(data$PNB),3), round(skewness(data$PDB),3), round(skewness(data$Ekspor),3))
)
kable(stats_table, caption = "Statistik Deskriptif Variabel Penelitian", align = "c")

| Variabel | N | Min | Max | Mean | Median | SD | Skewness |
|---|---|---|---|---|---|---|---|
| PNB (Y) | 41 | 9.8 | 54.1 | 26.861 | 22.2 | 13.176 | 0.806 |
| PDB (X1) | 41 | 11.4 | 53.4 | 27.810 | 23.3 | 12.398 | 0.776 |
| Ekspor (X2) | 41 | 0.0 | 990.5 | 192.722 | 61.1 | 263.714 | 1.513 |
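Based on the descriptive statistics table above, all three variables have a mean above the median and positive skewness (0.806 for PNB, 0.776 for PDB, 1.513 for Ekspor), so their distributions are right-skewed. Ekspor is the most skewed and the most dispersed: its standard deviation (263.714) exceeds its mean (192.722), and its values range from 0.0 to 990.5.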
cor_matrix <- cor(data[, c("PNB", "PDB", "Ekspor")])
kable(round(cor_matrix, 4), caption = "Matriks Korelasi Pearson", align = "c")

| | PNB | PDB | Ekspor |
|---|---|---|---|
| PNB | 1.0000 | 0.9986 | 0.3194 |
| PDB | 0.9986 | 1.0000 | 0.3037 |
| Ekspor | 0.3194 | 0.3037 | 1.0000 |
corrplot(cor_matrix,
         method = "color",
         type = "upper",
         addCoef.col = "black",
         tl.col = "black",
         title = "Matriks Korelasi",
         mar = c(0, 0, 2, 0))

Figure: Correlation matrix of the variables
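Based on the correlation matrix, PNB and PDB are almost perfectly correlated (r = 0.9986), whereas Ekspor is only weakly correlated with PNB (r = 0.3194). The two predictors are also weakly correlated with each other (r = 0.3037), which suggests that multicollinearity is unlikely to be a problem.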
The general multiple regression model is:
\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon \]
where \(Y\) is the response variable, \(X_1, \ldots, X_k\) are the predictors, \(\beta_0\) is the intercept, \(\beta_1, \ldots, \beta_k\) are the regression coefficients, and \(\varepsilon\) is the random error term.
The model built in this study is therefore:
\[ Y_{\text{PNB}} = \beta_0 + \beta_1 X_{\text{PDB}} + \beta_2 X_{\text{Ekspor}} + \varepsilon \]
The parameters are estimated by Ordinary Least Squares (OLS), which minimizes the sum of squared residuals \(e_i = Y_i - \hat{Y}_i\):
\[ \text{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]
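As a rough illustration of what OLS does, the sketch below solves the normal equations \((X^\top X)\,b = X^\top y\) by hand and compares the result with `lm()`. To stay short it uses only the first 10 rows of the data listed above, which is an assumption of this sketch, so the numbers differ from the full 41-country fit. The names `d`, `X`, `y`, `beta_hat`, `e`, `sse` and `fit` are introduced here for illustration only.

```r
# Illustrative subset: the first 10 rows of the data set listed above
# (assumption: a subset keeps the sketch short; it is not the full analysis).
d <- data.frame(
  PNB    = c(12.2, 32.8, 28.2, 21.3, 34.0, 14.3, 54.1, 15.7, 14.3, 20.4),
  PDB    = c(13.7, 34.7, 29.1, 22.7, 33.4, 15.8, 53.4, 17.1, 16.6, 21.5),
  Ekspor = c(82.8, 5.7, 35.2, 2.0, 11.6, 61.8, 717.7, 104.5, 32.3, 44.7)
)

# Design matrix with a leading column of ones for the intercept
X <- cbind(Intercept = 1, PDB = d$PDB, Ekspor = d$Ekspor)
y <- d$PNB

# Solve the normal equations (X'X) b = X'y
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# Residuals and the minimized sum of squared errors
e   <- y - drop(X %*% beta_hat)
sse <- sum(e^2)

# Cross-check against lm(), which solves the same least-squares problem
fit <- lm(PNB ~ PDB + Ekspor, data = d)
print(cbind(normal_eq = beta_hat, lm = coef(fit)))
```

Solving the normal equations directly is fine for a small, well-conditioned problem like this one; `lm()` uses a QR decomposition, which is numerically more stable in general.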
# Fit the multiple regression model
model <- lm(PNB ~ PDB + Ekspor, data = data)
# Model summary
summary(model)

##
## Call:
## lm(formula = PNB ~ PDB + Ekspor, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.43528 -0.30446 0.00104 0.33059 1.45910
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.6625869 0.2611441 -10.196 1.99e-12 ***
## PDB 1.0554966 0.0090181 117.043 < 2e-16 ***
## Ekspor 0.0008845 0.0004240 2.086 0.0437 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6737 on 38 degrees of freedom
## Multiple R-squared: 0.9975, Adjusted R-squared: 0.9974
## F-statistic: 7630 on 2 and 38 DF, p-value: < 2.2e-16
Based on the output above, the fitted multiple regression model is:
\[ \hat{Y}_{\text{PNB}} = -2.6626 + 1.0555 \cdot X_{\text{PDB}} + 0.0008845 \cdot X_{\text{Ekspor}} \]
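To make the fitted equation concrete, the following sketch plugs one observation into it. The coefficients are copied from the `summary(model)` output above; `predict_pnb` is a hypothetical helper introduced here, not part of the original analysis.

```r
# Coefficients copied from the summary(model) output above
b0 <- -2.6625869
b1 <-  1.0554966
b2 <-  0.0008845

# Hypothetical helper: fitted PNB for given PDB and Ekspor values
predict_pnb <- function(pdb, ekspor) b0 + b1 * pdb + b2 * ekspor

# Argentina in the data table: PDB = 13.7, Ekspor = 82.8, observed PNB = 12.2
fitted_arg   <- predict_pnb(13.7, 82.8)
residual_arg <- 12.2 - fitted_arg
print(round(c(fitted = fitted_arg, residual = residual_arg), 3))
```

For Argentina this gives a fitted value of about 11.871 and a residual of about 0.329, which lies inside the residual range reported in the summary output. With the fitted `model` object available, `predict(model, newdata = ...)` would return the same fitted value up to rounding of the coefficients.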
coef_df <- data.frame(
  Parameter = c("Intercept (β₀)", "PDB (β₁)", "Ekspor (β₂)"),
  Estimasi = round(coef(model), 4),
  Std.Error = round(summary(model)$coefficients[, 2], 4),
  t.value = round(summary(model)$coefficients[, 3], 4),
  p.value = round(summary(model)$coefficients[, 4], 6)
)
kable(coef_df, caption = "Tabel Estimasi Parameter Regresi Berganda",
      align = "c", row.names = FALSE)

| Parameter | Estimasi | Std.Error | t.value | p.value |
|---|---|---|---|---|
| Intercept (β₀) | -2.6626 | 0.2611 | -10.1959 | 0.00000 |
| PDB (β₁) | 1.0555 | 0.0090 | 117.0427 | 0.00000 |
| Ekspor (β₂) | 0.0009 | 0.0004 | 2.0861 | 0.04373 |
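Interpretation: holding Ekspor constant, a one-unit increase in PDB is associated with an average increase of 1.0555 units in PNB. Holding PDB constant, a one-unit increase in Ekspor is associated with an average increase of about 0.0009 units in PNB. The intercept (-2.6626) is the predicted PNB when both predictors are zero, a point outside the observed PDB range (11.4 to 53.4), so it has no direct economic interpretation.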
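For the OLS estimator to be BLUE (Best Linear Unbiased Estimator) and for the t and F tests to be valid, four classical assumptions are checked: normality of the residuals, homoskedasticity, no autocorrelation, and no multicollinearity. Normality is not needed for the BLUE property itself; it underpins the exact t and F inference.
Hypotheses (normality): \(H_0\): the residuals are normally distributed; \(H_1\): the residuals are not normally distributed.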
##
## Shapiro-Wilk normality test
##
## data: residual
## W = 0.98089, p-value = 0.7092
##
## Exact one-sample Kolmogorov-Smirnov test
##
## data: residual
## D = 0.081106, p-value = 0.9301
## alternative hypothesis: two-sided
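The code that produced the two test outputs above is not shown in the text. A minimal sketch of how such output could be obtained with base R is given below. It assumes the first 10 rows of the data as an illustrative subset, so its statistics will not match the values printed above. The names `residual` and `sw_result` follow their later use in the document; `ks_result`, `d` and `fit` are assumptions of this sketch.

```r
# Illustrative subset: the first 10 rows of the data set listed above
# (assumption: used only to keep the sketch self-contained).
d <- data.frame(
  PNB    = c(12.2, 32.8, 28.2, 21.3, 34.0, 14.3, 54.1, 15.7, 14.3, 20.4),
  PDB    = c(13.7, 34.7, 29.1, 22.7, 33.4, 15.8, 53.4, 17.1, 16.6, 21.5),
  Ekspor = c(82.8, 5.7, 35.2, 2.0, 11.6, 61.8, 717.7, 104.5, 32.3, 44.7)
)
fit <- lm(PNB ~ PDB + Ekspor, data = d)

# Residuals of the fitted model
residual <- residuals(fit)

# Shapiro-Wilk test of normality
sw_result <- shapiro.test(residual)

# One-sample Kolmogorov-Smirnov test against a normal distribution whose
# mean and standard deviation are estimated from the residuals themselves
ks_result <- ks.test(residual, "pnorm",
                     mean = mean(residual), sd = sd(residual))

print(sw_result)
print(ks_result)
```

Because the Kolmogorov-Smirnov reference distribution uses parameters estimated from the same residuals, its p-value tends to be conservative; the Shapiro-Wilk result is usually the more informative of the two for a sample of this size.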
qqnorm(residual,
       main = "Normal Q-Q Plot Residual",
       col = "steelblue",
       pch = 19)
qqline(residual, col = "red", lwd = 2)

Figure: Normal Q-Q plot of the residuals
hist(residual,
     breaks = 12,
     main = "Histogram Residual",
     xlab = "Residual",
     col = "steelblue",
     border = "white",
     freq = FALSE)
x_seq <- seq(min(residual) - 5, max(residual) + 5, length.out = 200)
lines(x_seq, dnorm(x_seq, mean(residual), sd(residual)),
      col = "red", lwd = 2, lty = 2)
legend("topright", legend = "Kurva Normal", col = "red", lty = 2, lwd = 2)

Figure: Histogram of the residual distribution
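Conclusion: both the Shapiro-Wilk test (W = 0.98089, p-value = 0.7092) and the Kolmogorov-Smirnov test (D = 0.081106, p-value = 0.9301) give p-values above 0.05, so \(H_0\) is not rejected and the residuals can be regarded as normally distributed.
Hypotheses (homoskedasticity): \(H_0\): the residual variance is constant (homoskedasticity); \(H_1\): the residual variance is not constant (heteroskedasticity).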
##
## studentized Breusch-Pagan test
##
## data: model
## BP = 1.912, df = 2, p-value = 0.3844
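The studentized Breusch-Pagan statistic can also be computed by hand, which shows what the test measures: regress the squared residuals on the predictors and take \(n \cdot R^2\) of that auxiliary regression, compared against a chi-square distribution with \(k\) degrees of freedom. The sketch below assumes the first 10 rows of the data as an illustrative subset, so it will not reproduce BP = 1.912. The names `d`, `fit`, `aux`, `bp_stat` and `bp_pval` are introduced here.

```r
# Illustrative subset: the first 10 rows of the data set listed above
# (assumption: results differ from the full 41-row analysis).
d <- data.frame(
  PNB    = c(12.2, 32.8, 28.2, 21.3, 34.0, 14.3, 54.1, 15.7, 14.3, 20.4),
  PDB    = c(13.7, 34.7, 29.1, 22.7, 33.4, 15.8, 53.4, 17.1, 16.6, 21.5),
  Ekspor = c(82.8, 5.7, 35.2, 2.0, 11.6, 61.8, 717.7, 104.5, 32.3, 44.7)
)
fit <- lm(PNB ~ PDB + Ekspor, data = d)

# Auxiliary regression of the squared residuals on the same predictors
d$e2 <- residuals(fit)^2
aux  <- lm(e2 ~ PDB + Ekspor, data = d)

n <- nrow(d)  # number of observations
k <- 2        # number of predictors in the auxiliary regression

# Studentized (Koenker) Breusch-Pagan statistic and its chi-square p-value
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = k, lower.tail = FALSE)
print(c(BP = bp_stat, df = k, p.value = bp_pval))

# Optional cross-check, run only if the lmtest package is installed
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(fit))
}
```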
plot(model$fitted.values, model$residuals,
     main = "Residual vs Fitted Values",
     xlab = "Fitted Values",
     ylab = "Residual",
     pch = 19, col = "steelblue")
abline(h = 0, col = "red", lwd = 2, lty = 2)

Figure: Residuals vs fitted values
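Conclusion: BP = 1.912 with p-value = 0.3844 > 0.05, so \(H_0\) is not rejected; there is no evidence of heteroskedasticity.
Hypotheses (autocorrelation): \(H_0\): there is no autocorrelation among the residuals; \(H_1\): there is positive autocorrelation among the residuals.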
##
## Durbin-Watson test
##
## data: model
## DW = 1.9043, p-value = 0.3965
## alternative hypothesis: true autocorrelation is greater than 0
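For reference, the Durbin-Watson statistic is built from successive differences of the residuals \(e_t\), and it is approximately related to the first-order residual autocorrelation \(\hat{\rho}\). Applying that standard approximation to the reported value gives a rough sense of scale:

\[ DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \approx 2(1 - \hat{\rho}), \qquad \hat{\rho} \approx 1 - \frac{1.9043}{2} \approx 0.048 \]

A value close to 2 therefore corresponds to an autocorrelation close to zero. Note that the statistic depends on the row order, and the observations here are countries listed alphabetically rather than a time series, so the result should be read with that in mind.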
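Conclusion: DW = 1.9043 with p-value = 0.3965 > 0.05, so \(H_0\) is not rejected; there is no evidence of positive autocorrelation in the residuals.
Multicollinearity is detected using the Variance Inflation Factor (VIF).
Criteria: VIF < 5 indicates no multicollinearity, 5 ≤ VIF < 10 indicates moderate multicollinearity, and VIF ≥ 10 indicates serious multicollinearity.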
vif_values <- vif(model)
vif_df <- data.frame(
  Variabel = names(vif_values),
  VIF = round(vif_values, 4),
  Keterangan = ifelse(vif_values < 5, "Tidak Ada Multikolinieritas",
                      ifelse(vif_values < 10, "Multikolinieritas Moderat",
                             "Multikolinieritas Serius"))
)
kable(vif_df, caption = "Nilai VIF Variabel Prediktor", align = "c", row.names = FALSE)

| Variabel | VIF | Keterangan |
|---|---|---|
| PDB | 1.1016 | Tidak Ada Multikolinieritas |
| Ekspor | 1.1016 | Tidak Ada Multikolinieritas |
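The two VIF values are identical because, with only two predictors, \(R_j^2\) in the VIF formula is simply the squared correlation between them. Using the PDB-Ekspor correlation of 0.3037 from the correlation matrix above, the table values can be checked by hand:

\[ \text{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \text{VIF}_{\text{PDB}} = \text{VIF}_{\text{Ekspor}} = \frac{1}{1 - 0.3037^2} \approx 1.1016 \]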
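Conclusion: both predictors have VIF = 1.1016, well below 5, so there is no multicollinearity between PDB and Ekspor.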
asumsi_df <- data.frame(
Uji = c("Normalitas (Shapiro-Wilk)",
"Homoskedastisitas (Breusch-Pagan)",
"Non-Autokorelasi (Durbin-Watson)",
"Non-Multikolinieritas (VIF)"),
Statistik = c(round(sw_result$statistic, 4),
round(bp_test$statistic, 4),
round(dw_test$statistic, 4),
paste(round(vif_values, 3), collapse = " / ")),
p.value = c(round(sw_result$p.value, 4),
round(bp_test$p.value, 4),
round(dw_test$p.value, 4),
"-"),
Kesimpulan = c(
ifelse(sw_result$p.value > 0.05, "✓ Terpenuhi", "✗ Tidak Terpenuhi"),
ifelse(bp_test$p.value > 0.05, "✓ Terpenuhi", "✗ Tidak Terpenuhi"),
ifelse(dw_test$p.value > 0.05, "✓ Terpenuhi", "✗ Tidak Terpenuhi"),
ifelse(all(vif_values < 10), "✓ Terpenuhi", "✗ Tidak Terpenuhi")
)
)
kable(asumsi_df, caption = "Ringkasan Hasil Uji Asumsi Klasik", align = "c", row.names = FALSE)

| Uji | Statistik | p.value | Kesimpulan |
|---|---|---|---|
| Normalitas (Shapiro-Wilk) | 0.9809 | 0.7092 | ✓ Terpenuhi |
| Homoskedastisitas (Breusch-Pagan) | 1.912 | 0.3844 | ✓ Terpenuhi |
| Non-Autokorelasi (Durbin-Watson) | 1.9043 | 0.3965 | ✓ Terpenuhi |
| Non-Multikolinieritas (VIF) | 1.102 / 1.102 | - | ✓ Terpenuhi |
The F test checks whether PDB and Ekspor jointly have a significant effect on PNB.
Hypotheses:
\[H_0: \beta_1 = \beta_2 = 0 \quad \text{(the model is not significant)}\]
\[H_1: \text{at least one } \beta_j \neq 0 \quad \text{(the model is significant)}\]
Decision rule: reject \(H_0\) if \(F_{\text{stat}} > F_{\alpha,k,n-k-1}\) or if the p-value is below \(\alpha = 0.05\).
f_stat <- summary(model)$fstatistic
f_value <- f_stat[1]
df1 <- f_stat[2]
df2 <- f_stat[3]
p_val_f <- pf(f_value, df1, df2, lower.tail = FALSE)
cat("F-statistik :", round(f_value, 4), "\n")

## F-statistik : 7629.566
## df1 (Regresi) : 2
## df2 (Residual) : 38
## p-value : 0
anova_model <- anova(model)
kable(round(anova_model, 4), caption = "Tabel ANOVA Regresi Berganda", align = "c")

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| PDB | 1 | 6924.5728 | 6924.5728 | 15254.780 | 0.0000 |
| Ekspor | 1 | 1.9755 | 1.9755 | 4.352 | 0.0437 |
| Residuals | 38 | 17.2493 | 0.4539 | NA | NA |
F test conclusion:
The test gives \(F_{\text{stat}} = 7629.5659\) with a p-value below 2.2e-16 (printed as 0 after rounding). Because the p-value is below 0.05, \(H_0\) is rejected: at the 5% significance level, PDB and Ekspor jointly have a significant effect on PNB.
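The overall F statistic can be recovered from the sequential ANOVA table above. The regression sum of squares is the sum of the PDB and Ekspor rows, \(SSR = 6924.5728 + 1.9755 = 6926.5483\), and the residual sum of squares is \(SSE = 17.2493\), with \(k = 2\) and \(n - k - 1 = 38\):

\[ F = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{6926.5483 / 2}{17.2493 / 38} \approx 7629.6 \]

This agrees with the reported value of 7629.566 up to rounding of the inputs. The per-variable F values in the ANOVA table are sequential tests, so they are not the same quantity as this overall F.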
The t test checks the effect of each predictor on PNB individually.
Hypotheses for each \(\beta_j\):
\[H_0: \beta_j = 0 \quad \text{(variable } j \text{ has no significant effect)}\]
\[H_1: \beta_j \neq 0 \quad \text{(variable } j \text{ has a significant effect)}\]
Decision rule: reject \(H_0\) if \(|t_{\text{stat}}| > t_{\alpha/2, n-k-1}\) or if the p-value is below \(\alpha = 0.05\).
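The decision rule can be applied by hand from the estimates and standard errors in the `summary(model)` output. The sketch below does this for the two slopes; the vectors `est` and `se` hold values copied from that output, and all variable names are introduced here for illustration.

```r
n <- 41                 # observations
k <- 2                  # predictors
df_resid <- n - k - 1   # 38 residual degrees of freedom, as in the summary

# Two-sided critical value at alpha = 0.05
t_crit <- qt(1 - 0.05 / 2, df = df_resid)

# Estimates and standard errors copied from the summary(model) output
est <- c(PDB = 1.0554966, Ekspor = 0.0008845)
se  <- c(PDB = 0.0090181, Ekspor = 0.0004240)

# t statistic = estimate / standard error; two-sided p-value from the t distribution
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

print(round(t_crit, 4))
print(data.frame(t_stat = round(t_stat, 4),
                 p_value = signif(p_val, 4),
                 reject_H0 = abs(t_stat) > t_crit))
```

The Ekspor statistic (about 2.086) exceeds the critical value only narrowly, which is why its p-value sits just under 0.05.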
coef_summary <- summary(model)$coefficients
t_df <- data.frame(
  Parameter = rownames(coef_summary),
  Estimasi = round(coef_summary[, 1], 4),
  Std.Error = round(coef_summary[, 2], 4),
  t.hitung = round(coef_summary[, 3], 4),
  p.value = round(coef_summary[, 4], 6),
  Keputusan = ifelse(coef_summary[, 4] < 0.05,
                     "Tolak H₀ (Signifikan)",
                     "Gagal Tolak H₀ (Tidak Signifikan)")
)
kable(t_df, caption = "Hasil Uji t (Uji Parsial) Parameter Regresi",
      align = "c", row.names = FALSE)

| Parameter | Estimasi | Std.Error | t.hitung | p.value | Keputusan |
|---|---|---|---|---|---|
| (Intercept) | -2.6626 | 0.2611 | -10.1959 | 0.00000 | Tolak H₀ (Signifikan) |
| PDB | 1.0555 | 0.0090 | 117.0427 | 0.00000 | Tolak H₀ (Signifikan) |
| Ekspor | 0.0009 | 0.0004 | 2.0861 | 0.04373 | Tolak H₀ (Signifikan) |
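t test conclusion: at the 5% level, \(H_0\) is rejected for both predictors. PDB is highly significant (t = 117.0427, p-value < 2e-16), while Ekspor is only marginally significant (t = 2.0861, p-value = 0.0437).
The coefficient of determination (\(R^2\)) measures the proportion of variation in the response variable that the regression model explains.
\[R^2 = 1 - \frac{SSE}{SST} = \frac{SSR}{SST}\]
where \(SSE\) is the residual (error) sum of squares, \(SSR\) is the regression sum of squares, and \(SST = SSR + SSE\) is the total sum of squares.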
r_squared <- summary(model)$r.squared
adj_r_squared <- summary(model)$adj.r.squared
SST <- sum((data$PNB - mean(data$PNB))^2)
SSE <- sum(model$residuals^2)
SSR <- SST - SSE
MSR <- SSR / 2
MSE <- SSE / (nrow(data) - 3)
ss_df <- data.frame(
  Sumber = c("Regresi (SSR)", "Residual/Galat (SSE)", "Total (SST)"),
  Derajat.Bebas = c(2, nrow(data) - 3, nrow(data) - 1),
  Jumlah.Kuadrat = round(c(SSR, SSE, SST), 4),
  Kuadrat.Tengah = c(round(MSR, 4), round(MSE, 4), NA)
)
kable(ss_df, caption = "Tabel Dekomposisi Ragam (Sum of Squares)", align = "c")

| Sumber | Derajat.Bebas | Jumlah.Kuadrat | Kuadrat.Tengah |
|---|---|---|---|
| Regresi (SSR) | 2 | 6926.5483 | 3463.2741 |
| Residual/Galat (SSE) | 38 | 17.2493 | 0.4539 |
| Total (SST) | 40 | 6943.7976 | NA |
##
## R-squared : 0.997516
##
## Adjusted R-squared : 0.997385
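Both values can be checked by hand from the sum-of-squares table above, with \(n = 41\) and \(k = 2\). The adjusted version uses the standard degrees-of-freedom correction:

\[ R^2 = 1 - \frac{17.2493}{6943.7976} \approx 0.997516, \qquad R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1} = 1 - 0.002484 \cdot \frac{40}{38} \approx 0.997385 \]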
pie(c(r_squared, 1 - r_squared),
    labels = c(paste0("Dijelaskan Model\n(R² = ", round(r_squared * 100, 2), "%)"),
               paste0("Tidak Dijelaskan\n(", round((1 - r_squared) * 100, 2), "%)")),
    col = c("steelblue", "lightgray"),
    main = "Koefisien Determinasi (R²)")

Figure: Proportion of variation explained by the model
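Interpretation: \(R^2 = 0.9975\) means the model explains 99.75% of the variation in PNB, leaving 0.25% unexplained. The adjusted \(R^2\) (0.9974) is nearly identical, so the fit is not inflated by the number of predictors. Most of this explanatory power comes from PDB, whose correlation with PNB is 0.9986.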
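Based on all stages of the multiple regression analysis, the conclusions are as follows:
1. Fitted multiple regression model:
\[\hat{Y}_{\text{PNB}} = -2.6626 + 1.0555 \cdot X_{\text{PDB}} + 0.0008845 \cdot X_{\text{Ekspor}}\]
2. Classical assumption tests:
All four assumptions are satisfied: normality (Shapiro-Wilk p-value = 0.7092), homoskedasticity (Breusch-Pagan p-value = 0.3844), no autocorrelation (Durbin-Watson p-value = 0.3965), and no multicollinearity (VIF = 1.1016 for both predictors).
3. Simultaneous significance test (F test):
The model as a whole is significant at \(\alpha = 5\%\), with \(F_{\text{stat}} = 7629.5659\) and a p-value below 2.2e-16. PDB and Ekspor jointly have a significant effect on PNB.
4. Partial significance test (t test):
PDB (p-value < 2e-16) and Ekspor (p-value = 0.0437) each have a significant effect on PNB at \(\alpha = 5\%\), although the evidence for Ekspor is much weaker.
5. Coefficient of determination:
The model explains 99.75% of the total variation in PNB (\(R^2 = 0.9975\), \(R^2_{\text{adj}} = 0.9974\)). The remainder is attributable to factors outside the model.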