Najah Muchsin Sanin

11 March 2022

Lecturer: Prof. Dr. Suhartono, M.Kom

Institution: Universitas Islam Negeri Maulana Malik Ibrahim Malang

Department: Informatics Engineering (Teknik Informatika)

1 Definition of Linear Regression

Linear regression is one of the simplest and most commonly taught models in statistics. It fits a straight line to the data by minimizing the residual sum of squares, that is, the sum of squared differences between the observed and fitted values. The fitted model yields two values: a constant (the y-intercept) and the slope of the line. The following example applies simple linear regression to the relationship between a car's speed and its travel time.
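To make the least-squares principle concrete, the sketch below computes the slope and intercept from their closed-form expressions and compares them with `lm()`. It uses R's built-in `cars` dataset as a stand-in, since the contents of `dataSetCar.xlsx` are not reproduced here; the names `x`, `y`, `slope`, and `intercept` are illustrative only.

```r
# Stand-in data: built-in `cars` (speed in mph, stopping distance in ft),
# not the author's dataSetCar.xlsx.
x <- cars$speed
y <- cars$dist

# Closed-form least-squares estimates for y = intercept + slope * x
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# lm() minimizes the same residual sum of squares, so the estimates agree
fit <- lm(dist ~ speed, data = cars)
rbind(manual = c(intercept, slope), lm = unname(coef(fit)))
```

The two rows of the printed matrix should match, which shows that `lm()` does nothing more exotic here than solve this minimization.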

2 Car Dataset

library(readxl)
Data <- read_excel(path = "dataSetCar.xlsx")
Data

3 Simple Linear Regression

# Fit time (s) as a linear function of speed (km/h)
regrelinearcar <- lm(Data$`Time(s)` ~ Data$`Speed(km/h)`, data = Data)
regrelinearcar
## 
## Call:
## lm(formula = Data$`Time(s)` ~ Data$`Speed(km/h)`, data = Data)
## 
## Coefficients:
##        (Intercept)  Data$`Speed(km/h)`  
##            -11.411               0.309
anova(regrelinearcar)
summary(regrelinearcar)
## 
## Call:
## lm(formula = Data$`Time(s)` ~ Data$`Speed(km/h)`, data = Data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.699 -2.764 -1.229  1.391  8.341 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        -11.41091    3.24389  -3.518  0.00654 ** 
## Data$`Speed(km/h)`   0.30900    0.03771   8.194 1.83e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.955 on 9 degrees of freedom
## Multiple R-squared:  0.8818, Adjusted R-squared:  0.8687 
## F-statistic: 67.15 on 1 and 9 DF,  p-value: 1.826e-05
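The figures in this summary can be cross-checked by hand. Taking the reported estimates at face value, and assuming n = 11 observations (9 residual degrees of freedom plus 2 estimated coefficients), the fitted line and the main statistics work out as follows:

```latex
\begin{align*}
\widehat{\text{Time}} &= -11.411 + 0.309 \times \text{Speed} \\
t_{\text{slope}} &= \frac{0.30900}{0.03771} \approx 8.194 \\
F &= t_{\text{slope}}^{2} \approx 8.194^{2} \approx 67.14 \\
R^{2}_{\text{adj}} &= 1 - (1 - 0.8818)\,\frac{11 - 1}{11 - 2} \approx 0.8687
\end{align*}
```

The hand-computed F agrees with the reported 67.15 up to rounding of the inputs. In a simple regression with one predictor, the F-statistic is the square of the slope's t value, which is why both carry the same p-value (1.83e-05). The slope says that each additional km/h of speed is associated with roughly 0.309 s more time in this sample.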
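# Scatter plot of time against speed
plot(Data$`Speed(km/h)`, Data$`Time(s)`)

# Diagnostic plots for the fitted model
# (residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage)
plot(regrelinearcar)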

# H0: the errors are normally distributed
# H1: the errors are not normally distributed
shapiro.test(residuals(regrelinearcar))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(regrelinearcar)
## W = 0.88195, p-value = 0.1101
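With W = 0.88195 and a p-value of 0.1101, the test does not reject normality of the residuals at the 5% level. A minimal sketch of that decision rule follows, run on the built-in `cars` data as a stand-in for the author's file; `alpha` and `decision` are illustrative names, and the 5% level is an assumption.

```r
# Stand-in model on built-in data, not the author's dataSetCar.xlsx
fit <- lm(dist ~ speed, data = cars)

# Shapiro-Wilk test on the residuals; H0: residuals are normally distributed
sw <- shapiro.test(residuals(fit))

alpha <- 0.05  # assumed significance level
decision <- if (sw$p.value < alpha) {
  "reject H0: residuals do not look normal"
} else {
  "fail to reject H0: no evidence against normality"
}
decision
```

Note that with only 11 observations, as in the source model, the test has little power, so a non-significant result is weak evidence of normality rather than proof of it.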
# H0: the error variance is constant
# H1: the error variance is not constant
library(lmtest)
library(car)
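The packages lmtest and car are loaded for the constant-variance check, but no test call is shown. One common option is the Breusch-Pagan test; which test the author intended is an assumption. The sketch below computes Koenker's studentized version by hand in base R on the stand-in `cars` data and, if lmtest happens to be installed, prints `lmtest::bptest()` for comparison. The names `aux`, `bp_stat`, `bp_df`, and `bp_p` are illustrative.

```r
# Stand-in data: R's built-in `cars` dataset, not the author's dataSetCar.xlsx
fit <- lm(dist ~ speed, data = cars)

# Studentized Breusch-Pagan test by hand:
# regress the squared residuals on the predictor and use n * R^2
aux <- lm(residuals(fit)^2 ~ speed, data = cars)
bp_stat <- nrow(cars) * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1  # number of predictors in the auxiliary model
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
c(statistic = bp_stat, df = bp_df, p.value = bp_p)

# If lmtest is installed, bptest() should report the same statistic
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(fit))
}
```

A small p-value would lead to rejecting the hypothesis of constant variance; a large one leaves it standing.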
# Outlier detection (|standardized residual| > 2)
sres <- rstandard(regrelinearcar)
sres[which(abs(sres) > 2)] # observation numbers of the outliers
##      11 
## 2.55407
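Observation 11 is the only one flagged. For reference, `rstandard()` divides each residual e_i by its estimated standard deviation, s * sqrt(1 - h_ii), where s is the residual standard error and h_ii is the leverage of observation i. The sketch below reproduces that calculation on the stand-in `cars` data; `sres_manual` is an illustrative name.

```r
# Stand-in model on built-in data, not the author's dataSetCar.xlsx
fit <- lm(dist ~ speed, data = cars)

e <- residuals(fit)   # raw residuals
h <- hatvalues(fit)   # leverages h_ii
s <- sigma(fit)       # residual standard error

# Internally studentized (standardized) residuals, computed by hand
sres_manual <- e / (s * sqrt(1 - h))

# Should agree with R's built-in helper
all.equal(sres_manual, rstandard(fit))

# Same outlier rule as in the text: |standardized residual| > 2
sres_manual[abs(sres_manual) > 2]
```

The cutoff of 2 is a rule of thumb: under the model assumptions roughly 5% of standardized residuals fall outside that range, so a flagged point deserves inspection rather than automatic removal.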
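# Influential observations:
# flag any observation whose Cook's distance exceeds the 50th percentile
# of the F(p, n - p) distribution.
# NOTE: df2 = 560 - 2 presumes n = 560; this model has 9 residual degrees
# of freedom, for which df2 = df.residual(regrelinearcar) would apply.
# A result of TRUE means at least one observation exceeds the cutoff.
cooksD <- cooks.distance(regrelinearcar)
p50 <- qf(0.5, df1=2, df2=560-2)
any(cooksD>p50)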
## [1] TRUE
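The cutoff used for Cook's distance is the median of an F distribution with p and n - p degrees of freedom, where p is the number of estimated coefficients. A minimal sketch that derives both values from the fitted model instead of hard-coding them is given below, again on the stand-in `cars` data; `cutoff` and `influential` are illustrative names. For the 11-observation model in this report the same approach would give `qf(0.5, df1 = 2, df2 = 9)`.

```r
# Stand-in model on built-in data, not the author's dataSetCar.xlsx
fit <- lm(dist ~ speed, data = cars)

p <- length(coef(fit))  # number of estimated coefficients
cutoff <- qf(0.5, df1 = p, df2 = df.residual(fit))  # median of F(p, n - p)

cooksD <- cooks.distance(fit)

# Indices of observations whose Cook's distance exceeds the cutoff
influential <- which(cooksD > cutoff)
influential
```

Using `which()` instead of `any()` reports which observations are influential, not merely whether one exists.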