Describe the Data

# Information about the mtcars dataset
help(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

The mean fuel economy (mpg) is 20.09 miles per gallon.

The mean vehicle weight (wt) is 3.22 thousand pounds (wt is recorded in units of 1000 lbs).
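
The two averages quoted above can be read directly from the data. The following is a minimal sketch that assumes only base R and the built-in mtcars data frame; the name `avg` is introduced here for illustration and does not appear in the original analysis.

```r
# Column means for the two variables used in the regression
avg <- colMeans(mtcars[, c("mpg", "wt")])
round(avg, 2)

# wt is recorded in units of 1000 lbs, so convert the mean to pounds
avg["wt"] * 1000

# Number of cars (rows) and variables (columns) in the dataset
dim(mtcars)
```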

Regression Model

Parameter Estimation

General form of a linear regression with two predictors: \[ y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon \]

In this case, we model the effect of vehicle weight (wt) on fuel economy (mpg) with a single predictor, so the \(\beta_2 X_2\) term drops out:

model = lm(mpg ~ wt, data=mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

The fitted model is:

\[ \widehat{\text{MPG}} = 37.2851 - 5.3445 \times \text{Weight} \]

Interpretation: For every 1000 lbs increase in vehicle weight, fuel economy (mpg) decreases by 5.3445 miles per gallon on average.
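
As a worked illustration of the fitted equation, the sketch below predicts mpg for a hypothetical car weighing 3000 lbs (wt = 3), first by hand from the coefficients and then with `predict()`. The car weight and the names `b`, `manual`, and `pred` are assumptions made for this example only.

```r
# Refit the simple regression so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)

# Extract the intercept and slope by name
b <- coef(model)

# Manual prediction for a hypothetical car with wt = 3 (3000 lbs)
manual <- unname(b["(Intercept)"] + b["wt"] * 3)

# Same prediction through predict(), with a 95% confidence interval
# for the mean response at that weight
pred <- predict(model, newdata = data.frame(wt = 3), interval = "confidence")

manual
pred
```

With the rounded coefficients reported above, the hand calculation is 37.2851 - 5.3445 × 3 = 21.2516 miles per gallon.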

Hypothesis Test

Residual Normality Test

error = residuals(model)
# Shapiro-Wilk test for normality (commonly used for small samples)
shapiro.test(error)
## 
##  Shapiro-Wilk normality test
## 
## data:  error
## W = 0.94508, p-value = 0.1044
# Alternatively, a KS test can be used
ks.test(error, "pnorm", mean(error), sd(error))
## 
##  Exact one-sample Kolmogorov-Smirnov test
## 
## data:  error
## D = 0.082516, p-value = 0.9687
## alternative hypothesis: two-sided
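
Neither output above comes with a stated decision rule. The sketch below shows one way to read the Shapiro-Wilk result, assuming a 5% significance level (`alpha` is an assumption; the analysis does not state one), and adds a normal Q-Q plot as a visual check. Note that `ks.test()` expects the reference distribution's parameters to be specified in advance; since the mean and standard deviation here are estimated from the residuals themselves, its p-value is best treated as a rough indication only.

```r
# Refit the model and extract residuals so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)
error <- residuals(model)

alpha <- 0.05  # assumed significance level, not stated in the analysis

sw <- shapiro.test(error)
sw$p.value

# TRUE means normality of the residuals is not rejected at level alpha
sw$p.value > alpha

# Visual check: points close to the line suggest approximately normal residuals
qqnorm(error)
qqline(error)
```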

Autocorrelation Test

library(lmtest)  # provides dwtest(), resettest(), bptest(), and gqtest()
dwtest(model)
## 
##  Durbin-Watson test
## 
## data:  model
## DW = 1.2517, p-value = 0.0102
## alternative hypothesis: true autocorrelation is greater than 0
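
To make the DW value less of a black box, the sketch below recomputes the statistic from the residuals in base R. The names `e`, `dw`, and `r1` are introduced for illustration. Because the statistic depends on the order of the observations, and the rows of mtcars are not a time series, a small p-value here may reflect how the rows happen to be arranged rather than serial dependence in any time sense.

```r
# Refit the model so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)
e <- residuals(model)

# Durbin-Watson statistic: sum of squared successive differences
# divided by the sum of squared residuals
dw <- sum(diff(e)^2) / sum(e^2)
dw

# Lag-1 sample autocorrelation of the residuals; DW is roughly 2 * (1 - r1)
r1 <- sum(e[-1] * e[-length(e)]) / sum(e^2)
r1
```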

Linearity Test

# Linearity check with Ramsey's RESET test (resettest() is provided by the lmtest package)
library(car)
## Loading required package: carData
resettest(model, power=2:3, type="fitted")
## 
##  RESET test
## 
## data:  model
## RESET = 5.1315, df1 = 2, df2 = 28, p-value = 0.01263
# Visualize linearity
plot(mtcars$wt, mtcars$mpg,
     main="Scatterplot with Regression Line",
     xlab="Weight (1000 lbs)", ylab="MPG",
     pch=19, col="blue")
abline(model, col="red", lwd=2)
lines(lowess(mtcars$wt, mtcars$mpg), col="green", lty=2, lwd=2)
legend("topright", legend=c("Linear regression", "Lowess"),
       col=c("red", "green"), lty=1:2, lwd=2)
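
The RESET statistic reported above can be reproduced without lmtest, which helps show what the test is doing: with `power=2:3` and `type="fitted"`, it asks whether the squared and cubed fitted values add explanatory power to the model. A minimal sketch in base R follows; the names `yhat`, `model_aug`, `reset_tab`, and `reset_F` are introduced for illustration.

```r
# Refit the model so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)

# Augment the model with the squared and cubed fitted values
yhat <- fitted(model)
model_aug <- lm(mpg ~ wt + I(yhat^2) + I(yhat^3), data = mtcars)

# F-test of the two added terms against the original model
reset_tab <- anova(model, model_aug)
reset_tab

# The F value in the second row corresponds to the RESET statistic
reset_F <- reset_tab$F[2]
reset_F
```

A significant F here suggests that a straight line in wt misses some curvature, which is consistent with the lowess curve bending away from the regression line in the scatterplot.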

Homoskedasticity Test

# Breusch-Pagan test for heteroskedasticity
bptest(model)
## 
##  studentized Breusch-Pagan test
## 
## data:  model
## BP = 0.040438, df = 1, p-value = 0.8406
# Alternative: Goldfeld-Quandt test
gqtest(model)
## 
##  Goldfeld-Quandt test
## 
## data:  model
## GQ = 3.8655, df1 = 14, df2 = 14, p-value = 0.008189
## alternative hypothesis: variance increases from segment 1 to 2
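
The two tests above disagree (p = 0.8406 versus p = 0.008189), so it helps to see what each one measures. The studentized Breusch-Pagan statistic can be sketched in base R as the sample size times the R-squared of an auxiliary regression of the squared residuals on the regressor; the names `e2`, `aux`, `bp`, and `bp_p` are introduced for illustration. The Goldfeld-Quandt test instead compares residual variance between two segments of the data in their given order, so when no ordering variable is supplied its result depends on how the rows of mtcars happen to be arranged.

```r
# Refit the model so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)

# Auxiliary regression: squared residuals on the model's regressor
e2 <- residuals(model)^2
aux <- lm(e2 ~ wt, data = mtcars)

# Studentized (Koenker) Breusch-Pagan statistic: n times the auxiliary R-squared
n <- nrow(mtcars)
bp <- n * summary(aux)$r.squared

# Compare against a chi-squared distribution with 1 degree of freedom
# (one regressor in the auxiliary model)
bp_p <- pchisq(bp, df = 1, lower.tail = FALSE)
c(BP = bp, p.value = bp_p)
```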

Plots

Scatterplot Weight vs MPG

[Figure: Weight vs MPG Scatterplot]

Diagnostic Plots

[Figure: Diagnostic Plots]
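
The code behind the diagnostic plots is not shown in this report. One common way to produce such a panel, offered here as an assumption rather than as the method actually used, is the default `plot()` method for `lm` objects arranged in a 2 x 2 grid:

```r
# Refit the model so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)

# Arrange the four default lm diagnostic plots in a 2 x 2 grid:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage
old_par <- par(mfrow = c(2, 2))
plot(model)
par(old_par)  # restore the previous plotting layout
```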

Additional Analysis: Multiple Regression

# Add another variable (horsepower)
model_multiple = lm(mpg ~ wt + hp, data=mtcars)
summary(model_multiple)
## 
## Call:
## lm(formula = mpg ~ wt + hp, data = mtcars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.941 -1.600 -0.182  1.050  5.854 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
## wt          -3.87783    0.63273  -6.129 1.12e-06 ***
## hp          -0.03177    0.00903  -3.519  0.00145 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared:  0.8268, Adjusted R-squared:  0.8148 
## F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12
# Model comparison
anova(model, model_multiple)
## Analysis of Variance Table
## 
## Model 1: mpg ~ wt
## Model 2: mpg ~ wt + hp
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
## 1     30 278.32                                
## 2     29 195.05  1    83.274 12.381 0.001451 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
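
The F value in the ANOVA table can be reproduced from the two residual sums of squares, which makes the comparison explicit: it is the drop in RSS per added parameter, relative to the residual variance of the larger model. A minimal sketch in base R follows; the names `rss1`, `rss2`, `df1`, `df2`, `F_stat`, and `p_val` are introduced for illustration.

```r
# Refit both models so this block runs on its own
model <- lm(mpg ~ wt, data = mtcars)
model_multiple <- lm(mpg ~ wt + hp, data = mtcars)

# Residual sum of squares and residual degrees of freedom for each model
rss1 <- deviance(model)
df1 <- df.residual(model)
rss2 <- deviance(model_multiple)
df2 <- df.residual(model_multiple)

# Partial F statistic for the added hp term
F_stat <- ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)
p_val <- pf(F_stat, df1 - df2, df2, lower.tail = FALSE)
c(F = F_stat, p.value = p_val)
```

Using the rounded values from the table, this is (278.32 - 195.05) / (195.05 / 29) ≈ 12.38.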

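Conclusion: The regression model shows a significant negative relationship between vehicle weight (wt) and fuel economy (mpg). The R-squared of 0.7528 indicates that vehicle weight explains 75.28% of the variation in fuel economy. The Durbin-Watson (p = 0.0102), RESET (p = 0.01263), and Goldfeld-Quandt (p = 0.008189) tests all have p-values below 0.05, however, so the simple linear specification should be interpreted with some caution. Adding horsepower (hp) significantly improves the fit (F = 12.381, p = 0.001451) and raises the adjusted R-squared from 0.7446 to 0.8148.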