Lembaga : Universitas Islam Negeri Maulana Malik Ibrahim Malang
Fakultas : Sains dan Teknologi
Program Studi : Teknik Informatika
Mata Kuliah : Linear Algebra (b)
Analisis Regresi merupakan salah satu analisis yang bertujuan untuk mengetahui pengaruh suatu variable terhadap variable lainnya. Dalam analisis regresi variable yang mempengaruhi disebut independent variable dan variable yang dipengaruhi disebut dependent variable. Jika dalam persamaan regresi hanya terdapat satu variable bebas dan satu variable terikat maka disebut persamaan regresi sederhana sedangkan jjika variable bebasnya lebih dari satu maka disebut persamaan regresi berganda.
library(readxl)
## Warning: package 'readxl' was built under R version 4.1.2
mobilityindex <- read.csv(file = '2020_ID_Region_Mobility_Report.csv')
mobilityindex
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.2
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.2
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'readr' was built under R version 4.1.2
## Warning: package 'purrr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
## Warning: package 'stringr' was built under R version 4.1.2
## Warning: package 'forcats' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
mobilJK <- mobilityindex %>%
filter(iso_3166_2_code== "ID-JK")
mobilJK
mobilJK$date <- as.Date(mobilJK$date)
mobilJK
mobilJTup <- filter(mobilJK, "2020-12-01" < date & date < "2020-12-31")
mobilJTup
library(ggplot2)
library(reshape2)
## Warning: package 'reshape2' was built under R version 4.1.2
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
x <- mobilJTup$date
retail <- mobilJTup$retail_and_recreation_percent_change_from_baseline
grocery <- mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
park <- mobilJTup$parks_percent_change_from_baseline
station <- mobilJTup$transit_stations_percent_change_from_baseline
workplace <- mobilJTup$workplaces_percent_change_from_baseline
residental <- mobilJTup$residential_percent_change_from_baseline
df <- data.frame(x, retail, grocery, park, station, workplace,residental )
# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")
# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable))+
geom_point() +
geom_line() +
theme(legend.justification = "top") +
labs(title = "Google Mobility Index",
subtitle = "Propinsi DKI Jakarta Indonesia 2020",
y = "Mobility", x = "Date") +
theme(axis.text.x = element_text(angle = -90))
dim(mobilJTup)
## [1] 29 15
Pada pembahasan pembentukan model linier sederhana menggunakan dataset google mobility index dengan jumlah observasi sebesar 29 observasi. Pada contoh kali ini kita akan mencoba membentuk model dengan variabel dependen berupa medv (median harga rumah) dan variabel independen berupa retail_and_recreation_percent_change_from_baseline dengan mobilJTup$grocery_and_pharmacy_percent_change_from_baseline. Berikut adalah sintaks untuk membentuk model tersebut:
library(MASS)
## Warning: package 'MASS' was built under R version 4.1.2
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
lm.fit <- lm(mobilJTup$retail_and_recreation_percent_change_from_baseline~mobilJTup$grocery_and_pharmacy_percent_change_from_baseline, data=mobilJTup)
anova(lm.fit)
summary(lm.fit)
##
## Call:
## lm(formula = mobilJTup$retail_and_recreation_percent_change_from_baseline ~
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline,
## data = mobilJTup)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9066 -0.7599 0.4757 0.6816 3.0638
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -21.9172 0.4776
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 0.7941 0.0590
## t value Pr(>|t|)
## (Intercept) -45.89 < 2e-16
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 13.46 1.72e-13
##
## (Intercept) ***
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.31 on 27 degrees of freedom
## Multiple R-squared: 0.8703, Adjusted R-squared: 0.8655
## F-statistic: 181.1 on 1 and 27 DF, p-value: 1.723e-13
plot(lm.fit)
# error berdistribusi normal
# (data tidak berdistribusi normal)
shapiro.test(residuals(lm.fit))
##
## Shapiro-Wilk normality test
##
## data: residuals(lm.fit)
## W = 0.9521, p-value = 0.2075
# varians bersifat konstan
# (varians tidak konstan)
library(lmtest)
## Warning: package 'lmtest' was built under R version 4.1.2
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.1.2
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
bptest(lm.fit)
##
## studentized Breusch-Pagan test
##
## data: lm.fit
## BP = 0.047053, df = 1, p-value = 0.8283
# error bersifat independen
# (error tidak bersifat independen)
dwtest(lm.fit, alternative = "two.sided")
##
## Durbin-Watson test
##
## data: lm.fit
## DW = 1.5026, p-value = 0.1467
## alternative hypothesis: true autocorrelation is not 0
# deteksi outlier (stdres > 2)
sres <- rstandard(lm.fit)
sres[which(abs(sres)>2)] # nomor observasi outlier
## 8 29
## 2.390562 -2.279139
# influential observation
# observasi > percentil 50
# tidak ada observasi dengan jarak cook yang extrim
cooksD <- cooks.distance(lm.fit)
p50 <- qf(0.5, df1=2, df2=560-2)
any(cooksD>p50)
## [1] FALSE