Practicum Report 2: Statistical Computing B
Undergraduate Program in Statistics | Department of Statistics
Faculty of Science, Technology, and Mathematics, Universitas Brawijaya
Teaching assistants: Muhammad Ad'hiya Hartono · M Adika Rosyad Bhakti Putra · Joyce Abigail Gracia Zebua · Ghina Azziqra


CHAPTER I: Case Study

The dataset comes from an open source (Kaggle) and contains car specifications and prices. The first 50 observations are used, with four variables:

  • price: dependent variable (Y), car price (USD)
  • horsepower: X₁, engine power (hp)
  • carheight: X₂, car height (inch)
  • citympg: X₃, city fuel efficiency (mpg)


CHAPTER II: Literature Review

Multiple Linear Regression

Multiple linear regression is a statistical method for modeling the relationship between one dependent variable (Y) and two or more independent variables (X). The general model is:

Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + ε

The coefficients are estimated by Ordinary Least Squares (OLS), which minimizes the sum of squared residuals (Gujarati & Porter, 2009).
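
As a small illustration of what OLS computes, the sketch below solves the normal equations (X'X)b = X'y directly and compares the result with `lm()`. It uses the built-in `mtcars` data as a stand-in (an assumption made only so the block runs on its own; it is not the car-price data of this report), and the names `X`, `y`, and `beta_hat` are illustrative.

```r
# Stand-in data: mtcars ships with R (NOT the report's car-price data)
X <- cbind(Intercept = 1, hp = mtcars$hp, wt = mtcars$wt)  # design matrix
y <- mtcars$mpg                                            # response

# Solve the normal equations (X'X) b = X'y for b
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# lm() minimizes the same sum of squared residuals
fit <- lm(mpg ~ hp + wt, data = mtcars)

# The two columns should agree up to numerical precision
print(cbind(normal_eq = beta_hat, lm = coef(fit)))
```

In practice `lm()` is preferable because it uses a QR decomposition, which is numerically more stable than inverting X'X.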

Significance Tests

  • F test (simultaneous): tests H₀: β₁ = β₂ = … = βₖ = 0 against the alternative that at least one coefficient is nonzero (p-value < 0.05 → reject H₀)
  • t test (partial): tests whether each predictor is individually significant, given the other predictors in the model

Classical Assumption Tests

| Assumption | Test | Decision rule |
|---|---|---|
| Normality | Shapiro-Wilk | p-value > 0.05 → normal |
| Homoscedasticity | Breusch-Pagan | p-value > 0.05 → homoscedastic |
| No autocorrelation | Durbin-Watson | p-value > 0.05 → no autocorrelation |
| No multicollinearity | VIF | VIF < 10 → no problem |

CHAPTER III: Data Exploration

Data Import and Preparation

# Import the data; adjust the path to the location of your CSV file
# car_price <- read.csv(file.choose())

# Recreate the data used in this report (50 observations) by direct entry
car_price <- data.frame(
  price      = c(13495,16500,16500,13950,17450,15250,17710,18920,23875,17859.17,
                 16430,16925,20970,21105,24565,30760,41315,36880,5151,6295,
                 6575,5572,6377,7957,6229,6692,7609,8558,8921,12964,
                 6479,6855,5399,6529,7129,7295,7295,7895,9095,8845,
                 10295,12945,10345,6785,8916.5,8916.5,11048,32250,35550,36000),
  horsepower = c(111,111,154,102,115,110,110,110,140,160,
                 101,101,121,121,121,182,182,182,48,70,
                 70,68,68,102,68,68,68,102,88,145,
                 58,76,60,76,76,76,76,86,86,86,
                 86,101,100,78,70,70,90,176,176,262),
  carheight  = c(48.8,48.8,52.4,54.3,54.3,53.1,55.7,55.7,55.9,52.0,
                 54.3,54.3,54.3,54.3,55.7,55.7,53.7,56.3,53.2,52.0,
                 52.0,50.8,50.8,50.8,50.6,50.6,50.6,50.6,59.8,50.2,
                 50.8,50.8,52.6,52.6,52.6,54.5,58.3,53.3,53.3,54.1,
                 54.1,54.1,51.0,53.5,52.0,52.0,51.4,52.8,52.8,47.8),
  citympg    = c(21,21,19,24,18,19,19,19,17,16,
                 23,23,21,21,20,16,16,15,47,38,
                 38,37,31,24,31,31,31,24,24,19,
                 49,31,38,30,30,30,30,27,27,27,
                 27,24,25,24,38,38,24,15,15,13)
)

data_mobil <- car_price[1:50, c("price","horsepower","carheight","citympg")]

Data Preview

knitr::kable(data_mobil,
             caption = "Table 1. Data for the 50 car observations",
             align = "cccc",
             digits = 2)
Table 1. Data for the 50 car observations
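| price | horsepower | carheight | citympg |
|:---:|:---:|:---:|:---:|
| 13495.00 | 111 | 48.8 | 21 |
| 16500.00 | 111 | 48.8 | 21 |
| 16500.00 | 154 | 52.4 | 19 |
| 13950.00 | 102 | 54.3 | 24 |
| 17450.00 | 115 | 54.3 | 18 |
| 15250.00 | 110 | 53.1 | 19 |
| 17710.00 | 110 | 55.7 | 19 |
| 18920.00 | 110 | 55.7 | 19 |
| 23875.00 | 140 | 55.9 | 17 |
| 17859.17 | 160 | 52.0 | 16 |
| 16430.00 | 101 | 54.3 | 23 |
| 16925.00 | 101 | 54.3 | 23 |
| 20970.00 | 121 | 54.3 | 21 |
| 21105.00 | 121 | 54.3 | 21 |
| 24565.00 | 121 | 55.7 | 20 |
| 30760.00 | 182 | 55.7 | 16 |
| 41315.00 | 182 | 53.7 | 16 |
| 36880.00 | 182 | 56.3 | 15 |
| 5151.00 | 48 | 53.2 | 47 |
| 6295.00 | 70 | 52.0 | 38 |
| 6575.00 | 70 | 52.0 | 38 |
| 5572.00 | 68 | 50.8 | 37 |
| 6377.00 | 68 | 50.8 | 31 |
| 7957.00 | 102 | 50.8 | 24 |
| 6229.00 | 68 | 50.6 | 31 |
| 6692.00 | 68 | 50.6 | 31 |
| 7609.00 | 68 | 50.6 | 31 |
| 8558.00 | 102 | 50.6 | 24 |
| 8921.00 | 88 | 59.8 | 24 |
| 12964.00 | 145 | 50.2 | 19 |
| 6479.00 | 58 | 50.8 | 49 |
| 6855.00 | 76 | 50.8 | 31 |
| 5399.00 | 60 | 52.6 | 38 |
| 6529.00 | 76 | 52.6 | 30 |
| 7129.00 | 76 | 52.6 | 30 |
| 7295.00 | 76 | 54.5 | 30 |
| 7295.00 | 76 | 58.3 | 30 |
| 7895.00 | 86 | 53.3 | 27 |
| 9095.00 | 86 | 53.3 | 27 |
| 8845.00 | 86 | 54.1 | 27 |
| 10295.00 | 86 | 54.1 | 27 |
| 12945.00 | 101 | 54.1 | 24 |
| 10345.00 | 100 | 51.0 | 25 |
| 6785.00 | 78 | 53.5 | 24 |
| 8916.50 | 70 | 52.0 | 38 |
| 8916.50 | 70 | 52.0 | 38 |
| 11048.00 | 90 | 51.4 | 24 |
| 32250.00 | 176 | 52.8 | 15 |
| 35550.00 | 176 | 52.8 | 15 |
| 36000.00 | 262 | 47.8 | 13 |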

Descriptive Statistics

summary(data_mobil)
##      price         horsepower      carheight        citympg     
##  Min.   : 5151   Min.   : 48.0   Min.   :47.80   Min.   :13.00  
##  1st Qu.: 7170   1st Qu.: 76.0   1st Qu.:50.85   1st Qu.:19.00  
##  Median :10320   Median :100.5   Median :52.80   Median :24.00  
##  Mean   :14305   Mean   :105.3   Mean   :52.92   Mean   :25.70  
##  3rd Qu.:17645   3rd Qu.:119.5   3rd Qu.:54.30   3rd Qu.:30.75  
##  Max.   :41315   Max.   :262.0   Max.   :59.80   Max.   :49.00

| Variable | Min | Median | Mean | Max |
|---|---|---|---|---|
| Price (USD) | 5,151 | 10,320 | 14,305 | 41,315 |
| Horsepower (hp) | 48 | 100.5 | 105.3 | 262 |
| Car height (inch) | 47.80 | 52.80 | 52.92 | 59.80 |
| City MPG (mpg) | 13 | 24 | 25.7 | 49 |

Exploratory Scatter Plots

library(ggplot2)

# theme_pink() is called by every plot in this report, but its definition is
# not shown. The minimal stand-in below is an assumption so the code runs.
theme_pink <- function() {
  theme_minimal(base_size = 12) +
    theme(plot.title = element_text(face = "bold", colour = "#8b3a52"))
}

ggplot(data_mobil, aes(x = horsepower, y = price)) +
  geom_point(color = "#e8658a", size = 3, alpha = 0.8, shape = 19) +
  geom_smooth(method = "lm", se = TRUE,
              color = "#8b3a52", fill = "#ffd6e7", alpha = 0.35, linewidth = 1.2) +
  labs(title    = "Horsepower vs Price",
       subtitle = "Strong positive relationship between engine power and price",
       x = "Horsepower (hp)", y = "Price (USD)") +
  scale_y_continuous(labels = scales::comma) +
  theme_pink()

Figure 1. Scatter plot of horsepower vs. price

ggplot(data_mobil, aes(x = carheight, y = price)) +
  geom_point(color = "#c94c70", size = 3, alpha = 0.8, shape = 19) +
  geom_smooth(method = "lm", se = TRUE,
              color = "#8b3a52", fill = "#ffd6e7", alpha = 0.35, linewidth = 1.2) +
  labs(title    = "Car Height vs Price",
       subtitle = "Weak relationship; points scattered without a clear pattern",
       x = "Car Height (inch)", y = "Price (USD)") +
  scale_y_continuous(labels = scales::comma) +
  theme_pink()

Figure 2. Scatter plot of car height vs. price

ggplot(data_mobil, aes(x = citympg, y = price)) +
  geom_point(color = "#b5657a", size = 3, alpha = 0.8, shape = 19) +
  geom_smooth(method = "lm", se = TRUE,
              color = "#8b3a52", fill = "#ffd6e7", alpha = 0.35, linewidth = 1.2) +
  labs(title    = "City MPG vs Price",
       subtitle = "Negative relationship: more fuel-efficient cars tend to be cheaper",
       x = "City MPG (mpg)", y = "Price (USD)") +
  scale_y_continuous(labels = scales::comma) +
  theme_pink()

Figure 3. Scatter plot of city MPG vs. price

Interpretation of the visual exploration:
- Horsepower → strong positive linear relationship with car price
- City MPG → fairly clear negative linear relationship with price
- Car Height → points scattered without a clear pattern; weak relationship with price


CHAPTER IV: Building the Multiple Linear Regression Model

Model Form

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + ε

where Y = price, X₁ = horsepower, X₂ = carheight, X₃ = citympg

Model Estimation

model_regresi <- lm(price ~ horsepower + carheight + citympg,
                    data = data_mobil)
model_regresi
## 
## Call:
## lm(formula = price ~ horsepower + carheight + citympg, data = data_mobil)
## 
## Coefficients:
## (Intercept)   horsepower    carheight      citympg  
##   -51922.13       214.87       785.07        80.14

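The fitted model:
Ŷ = −51,922.13 + 214.87 X₁ + 785.07 X₂ + 80.14 X₃

Interpretation of the coefficients:

  • Intercept (−51,922.13): when all predictors equal zero, the estimated price is −$51,922.13. This has no practical meaning because zero lies far outside the range of the data.
  • Horsepower (+214.87): each additional 1 hp is associated with an estimated average price increase of $214.87 (ceteris paribus).
  • Car Height (+785.07): each additional 1 inch of height is associated with an estimated average price increase of $785.07 (ceteris paribus).
  • City MPG (+80.14): each additional 1 mpg is associated with an estimated average price increase of $80.14 (ceteris paribus). The sign is opposite to the negative pattern in the scatter plot because horsepower is held constant here, and the coefficient is not statistically significant (p = 0.498), so it should not be over-interpreted.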
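
To make the ceteris paribus reading concrete, the sketch below plugs a hypothetical car into the fitted equation using the rounded coefficients printed above. The car (100 hp, 53 inch, 25 mpg) and the names `coefs`, `new_car`, and `predict_price` are illustrative assumptions, not values from the report.

```r
# Rounded coefficients as printed in the model output above
coefs <- c(intercept = -51922.13, horsepower = 214.87,
           carheight = 785.07, citympg = 80.14)

# Hypothetical car (an assumption for illustration only)
new_car <- c(horsepower = 100, carheight = 53, citympg = 25)

# Fitted value: intercept plus the sum of coefficient * predictor value
predict_price <- function(x) {
  coefs[["intercept"]] + sum(coefs[names(x)] * x)
}

pred <- predict_price(new_car)
pred  # 13177.08

# Raising horsepower by 1 while holding the rest fixed changes the
# prediction by exactly the horsepower coefficient
one_more_hp <- new_car
one_more_hp["horsepower"] <- one_more_hp["horsepower"] + 1
predict_price(one_more_hp) - pred  # 214.87, up to floating-point error
```

With the fitted object available, `predict(model_regresi, newdata = data.frame(horsepower = 100, carheight = 53, citympg = 25))` should give the same value up to coefficient rounding.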

CHAPTER V: Model Significance Tests

summary(model_regresi)
## 
## Call:
## lm(formula = price ~ horsepower + carheight + citympg, data = data_mobil)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7327.8 -1550.5   208.1  1949.1 10690.7 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -51922.13   15219.81  -3.411  0.00136 ** 
## horsepower     214.87      22.60   9.506 1.99e-12 ***
## carheight      785.07     240.98   3.258  0.00211 ** 
## citympg         80.14     117.37   0.683  0.49820    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3728 on 46 degrees of freedom
## Multiple R-squared:  0.8548, Adjusted R-squared:  0.8453 
## F-statistic: 90.23 on 3 and 46 DF,  p-value: < 2.2e-16

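F Test (Simultaneous)

| Statistic | Value |
|---|---|
| F-statistic | 90.23 (df = 3 and 46) |
| p-value | < 2.2 × 10⁻¹⁶ |
| Decision | Reject H₀ |

Because the p-value < α (0.05), H₀ is rejected: the three predictors jointly have a significant effect on car price, that is, at least one coefficient is nonzero.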
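
The F statistic can be recovered from R² as F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below applies this to the values reported in the summary output (R² = 0.8548, n = 50, k = 3). Because R² is rounded to four decimals, the result is close to, but not exactly, the printed 90.23; the variable names are illustrative.

```r
# Values taken from the summary() output above
r2 <- 0.8548   # Multiple R-squared (rounded)
n  <- 50       # number of observations
k  <- 3        # number of predictors

# F statistic from R-squared
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# Upper-tail probability of the F distribution with (k, n - k - 1) df
p_val <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

f_stat  # close to the reported 90.23 (differs slightly due to rounding of R-squared)
p_val   # far below 0.05
```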

t Test (Partial)

| Variable | Coefficient | p-value | Decision |
|---|---|---|---|
| Horsepower (X₁) | 214.87 | 1.99 × 10⁻¹² | Significant ✓ |
| Car Height (X₂) | 785.07 | 0.00211 | Significant ✓ |
| City MPG (X₃) | 80.14 | 0.49820 | Not significant ✗ |

Goodness of fit (R-squared):
- Multiple R² = 0.8548
- Adjusted R² = 0.8453
→ The model explains about 85.48% of the variation in car price (R²); after adjusting for the number of predictors the figure is 84.53%. The remainder is attributable to factors outside the model.
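
The adjusted R² follows from R² through the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1). The short check below uses only the numbers reported in the output above; the variable names are illustrative.

```r
# Values taken from the summary() output above
r2 <- 0.8548   # Multiple R-squared
n  <- 50       # number of observations
k  <- 3        # number of predictors

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
round(adj_r2, 4)  # 0.8453
```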


CHAPTER VI: Regression Assumption Tests

1. Normality of Residuals

shapiro.test(residuals(model_regresi))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(model_regresi)
## W = 0.95939, p-value = 0.08383

# Q-Q plot of the residuals
residuals_df <- data.frame(residuals = residuals(model_regresi))

ggplot(residuals_df, aes(sample = residuals)) +
  stat_qq(color = "#e8658a", size = 2.5, alpha = 0.8) +
  stat_qq_line(color = "#8b3a52", linewidth = 1.2, linetype = "dashed") +
  labs(title    = "Normal Q-Q Plot Residual",
       subtitle = "Points follow the line → residuals are approximately normal",
       x = "Theoretical Quantiles", y = "Sample Quantiles") +
  theme_pink()

Figure 4. Q-Q plot of the residuals

| Statistic | Value |
|---|---|
| W | 0.95939 |
| p-value | 0.08383 |
| Decision | Fail to reject H₀ |

✓ SATISFIED: because the p-value (0.084) > 0.05, H₀ is not rejected; there is no evidence that the residuals deviate from a normal distribution.

2. Homoscedasticity

library(lmtest)
bptest(model_regresi)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_regresi
## BP = 18.505, df = 3, p-value = 0.0003459

fitted_df <- data.frame(
  fitted    = fitted(model_regresi),
  residuals = residuals(model_regresi)
)

ggplot(fitted_df, aes(x = fitted, y = residuals)) +
  geom_point(color = "#e8658a", size = 2.5, alpha = 0.8) +
  geom_hline(yintercept = 0, color = "#8b3a52",
             linewidth = 1.1, linetype = "dashed") +
  geom_smooth(se = FALSE, color = "#c94c70",
              linewidth = 0.9, method = "loess") +
  labs(title    = "Residuals vs Fitted Values",
       subtitle = "Fanning-out pattern → indication of heteroscedasticity",
       x = "Fitted Values", y = "Residuals") +
  scale_x_continuous(labels = scales::comma) +
  theme_pink()

Figure 5. Residuals vs. fitted values

| Statistic | Value |
|---|---|
| BP | 18.505 |
| p-value | 0.0003459 |
| Decision | Reject H₀ |

✗ NOT SATISFIED: because the p-value (0.0003) < 0.05, H₀ is rejected; the residuals are heteroscedastic. The coefficient estimates remain unbiased, but their standard errors, and therefore the t and F tests above, may be unreliable.
Possible remedies: transform the variables or use Weighted Least Squares (WLS).
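
As a rough sketch of how WLS can be set up in R, the block below estimates the weights from the OLS residuals (a "feasible" WLS in which the log of the squared residuals is regressed on the fitted values). This is one common choice, not necessarily the approach the report intends. It uses the built-in `mtcars` data as a stand-in so that it runs on its own, and the names `ols`, `aux`, `w`, and `wls` are illustrative.

```r
# Stand-in data: mtcars ships with R (NOT the report's car-price data)
ols <- lm(mpg ~ hp + wt, data = mtcars)

# Step 1: model the error variance, here as a log-linear function
# of the OLS fitted values
aux <- lm(log(residuals(ols)^2) ~ fitted(ols))

# Step 2: weights are the inverse of the estimated variances
w <- 1 / exp(fitted(aux))

# Step 3: refit the same model with those weights
wls <- lm(mpg ~ hp + wt, data = mtcars, weights = w)

# Compare coefficient estimates from the two fits
print(cbind(OLS = coef(ols), WLS = coef(wls)))
```

For this report, the same three steps would be applied with `price ~ horsepower + carheight + citympg` and `data = data_mobil`, followed by rerunning `bptest()` on the new fit to check whether the heteroscedasticity has been reduced.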

3. No Autocorrelation

dwtest(model_regresi)
## 
##  Durbin-Watson test
## 
## data:  model_regresi
## DW = 1.612, p-value = 0.052
## alternative hypothesis: true autocorrelation is greater than 0
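
| Statistic | Value |
|---|---|
| DW | 1.612 |
| p-value | 0.052 |
| Decision | Fail to reject H₀ |

✓ SATISFIED: because the p-value (0.052) > 0.05, H₀ is not rejected; there is no significant evidence of positive autocorrelation at the 5% level, although the result is borderline.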
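
The Durbin-Watson statistic itself is simple to compute: it is the sum of squared differences between consecutive residuals divided by the sum of squared residuals, and it lies between 0 and 4, with values near 2 indicating no first-order autocorrelation. The sketch below computes it by hand on the built-in `mtcars` data (a stand-in so the block is self-contained; not the report's data). Note that the statistic depends on the order of the rows.

```r
# Stand-in data: mtcars ships with R (NOT the report's car-price data)
fit <- lm(mpg ~ hp + wt, data = mtcars)
e   <- residuals(fit)

# DW = sum of squared successive differences / sum of squared residuals
dw <- sum(diff(e)^2) / sum(e^2)
dw

# Approximate link to the lag-1 autocorrelation r1: DW is roughly 2 * (1 - r1)
r1 <- sum(e[-1] * e[-length(e)]) / sum(e^2)
2 * (1 - r1)
```

Applied to `residuals(model_regresi)`, the first formula should reproduce the DW value reported by `dwtest()` above.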

4. No Multicollinearity

library(car)
vif(model_regresi)
## horsepower  carheight    citympg 
##   3.234877   1.125351   3.374615

# Take the VIF values from the model instead of typing them in by hand
vif_values <- vif(model_regresi)
vif_df <- data.frame(
  variabel = names(vif_values),
  vif      = vif_values
)

ggplot(vif_df, aes(x = reorder(variabel, vif), y = vif, fill = variabel)) +
  geom_col(width = 0.55, show.legend = FALSE,
           color = "#c94c70", linewidth = 0.6) +
  geom_hline(yintercept = 10, color = "#721c24",
             linetype = "dashed", linewidth = 1) +
  geom_text(aes(label = round(vif, 2)), hjust = -0.2,
            color = "#8b3a52", fontface = "bold", size = 4.5) +
  scale_fill_manual(values = c("#ffb3d1","#e8658a","#c94c70")) +
  scale_y_continuous(limits = c(0, 12)) +
  coord_flip() +
  labs(title    = "Variance Inflation Factor (VIF)",
       subtitle = "All VIF values are well below the threshold of 10",
       x = NULL, y = "VIF value") +
  annotate("text", x = 0.6, y = 10.3, label = "VIF threshold = 10",
           color = "#721c24", size = 3.5, fontface = "italic") +
  theme_pink()

Figure 6. VIF values

| Variable | VIF | Status |
|---|---|---|
| Horsepower | 3.235 | OK ✓ |
| Car Height | 1.125 | OK ✓ |
| City MPG | 3.375 | OK ✓ |

✓ SATISFIED: all VIF values are < 10 → no multicollinearity problem.
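
For reference, the VIF of predictor j is 1/(1 − R²ⱼ), where R²ⱼ comes from regressing that predictor on all the other predictors. The sketch below computes it by hand on the built-in `mtcars` data (a stand-in so the block runs without the `car` package or the report's data); the predictor set and the name `vif_manual` are illustrative.

```r
# Stand-in data: mtcars ships with R (NOT the report's car-price data)
predictors <- c("hp", "wt", "disp")

vif_manual <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  # Auxiliary regression: predictor v on the remaining predictors
  aux <- lm(reformulate(others, response = v), data = mtcars)
  1 / (1 - summary(aux)$r.squared)
})

vif_manual
```

Read in reverse, the reported VIF of 3.23 for horsepower means that about 69% of its variation (1 − 1/3.23) is explained by carheight and citympg together.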


CHAPTER VII: Conclusions

1. Data Exploration
Horsepower has a strong positive linear relationship with price; City MPG has a negative relationship; Car Height shows only a weak relationship.

2. Regression Model
Ŷ = −51,922.13 + 214.87X₁ + 785.07X₂ + 80.14X₃
Horsepower and carheight have a positive and statistically significant effect on price.

3. Significance Tests
- F test: the three predictors are jointly significant (p < 2.2×10⁻¹⁶)
- t test: only horsepower and carheight are individually significant; citympg is not (p = 0.498)
- R² = 0.8548 (adjusted R² = 0.8453) → the model explains about 85% of the variation in price

4. Assumption Tests

| Assumption | Result | Status |
|---|---|---|
| Normality | p = 0.084 > 0.05 | ✓ Satisfied |
| Homoscedasticity | p = 0.0003 < 0.05 | ✗ Not satisfied |
| No autocorrelation | p = 0.052 > 0.05 | ✓ Satisfied |
| No multicollinearity | VIF < 10 | ✓ Satisfied |

The model needs a remedy for heteroscedasticity, either through variable transformation or Weighted Least Squares (WLS).


References

  • Gujarati, D. N., & Porter, D. C. (2009). Basic Econometrics (5th ed.). McGraw-Hill.
  • Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate Data Analysis (8th ed.). Cengage Learning.
  • Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). Introduction to Linear Regression Analysis (6th ed.). John Wiley & Sons.
  • Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.