Membangun Regresi Linier Sederhana (Simple Linear Regression) Pada Data Google Mobility Index

Dosen Pengampu : Prof. Dr. Suhartono, M.Kom

Mata Kuliah : Linier Algebra

Prodi : Teknik Informatika

Lembaga : Universitas Islam Negeri Maulana Malik Ibrahim Malang

0.1 Apa Itu Google Mobility Index?

Pemerintah menggunakan sejumlah data dari raksasa teknologi untuk melakukan pemantauan mobilitas masyarakat saat gelombang kasus Covid-19 meningkat di Indonesia. Salah satu yang digunakan adalah laporan dari Google. Isi laporan tersebut akan menampilkan grafik tren mobilitas seiring waktu. Yakni berisi per wilayah untuk kategori tempat relevan sebagai cara merespon pandemi yakni untuk sejumlah tempat dari retail dan rekreasi, toko bahan makanan dan apotek, taman, pusat transportasi umum, tempat kerja dan area pemukiman.

0.2 Regresi Linear Sederhana pada Data Google Mobility Index

Pada pembahasan pembentukan model linier sederhana menggunakan dataset google mobility index dengan jumlah observasi sebesar 29 observasi. Pada contoh kali ini kita akan mencoba membentuk model dengan variabel dependen berupa medv dan variabel independen berupa retail_and_recreation_percent_change_from_baseline dengan mobilJTup$grocery_and_pharmacy_percent_change_from_baseline.

Berikut adalah sintaks untuk membentuk model tersebut:

library(readr)

## Warning: package 'readr' was built under R version 4.1.2

X2020_ID_Region_Mobility_Report <- read_csv("2020_ID_Region_Mobility_Report.csv")

## Rows: 11235 Columns: 15
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr  (5): country_region_code, country_region, sub_region_1, iso_3166_2_code...
## dbl  (6): retail_and_recreation_percent_change_from_baseline, grocery_and_ph...
## lgl  (3): sub_region_2, metro_area, census_fips_code
## date (1): date
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.

X2020_ID_Region_Mobility_Report

library(tidyverse)

## Warning: package 'tidyverse' was built under R version 4.1.2

## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.5     v dplyr   1.0.8
## v tibble  3.1.6     v stringr 1.4.0
## v tidyr   1.2.0     v forcats 0.5.1
## v purrr   0.3.4

## Warning: package 'ggplot2' was built under R version 4.1.3

## Warning: package 'tibble' was built under R version 4.1.2

## Warning: package 'tidyr' was built under R version 4.1.2

## Warning: package 'purrr' was built under R version 4.1.2

## Warning: package 'dplyr' was built under R version 4.1.2

## Warning: package 'stringr' was built under R version 4.1.2

## Warning: package 'forcats' was built under R version 4.1.2

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

mobilJK <- X2020_ID_Region_Mobility_Report %>%
       filter(iso_3166_2_code== "ID-JK")
mobilJK

mobilJK$date <- as.Date(mobilJK$date)
mobilJK

mobilJTup <- filter(mobilJK, "2020-12-01" < date & date < "2020-12-31")
mobilJTup

library(ggplot2)
library(reshape2)

## Warning: package 'reshape2' was built under R version 4.1.3

## 
## Attaching package: 'reshape2'

## The following object is masked from 'package:tidyr':
## 
##     smiths

x <- mobilJTup$date
retail <- mobilJTup$retail_and_recreation_percent_change_from_baseline
grocery <- mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
park <- mobilJTup$parks_percent_change_from_baseline
station <- mobilJTup$transit_stations_percent_change_from_baseline
workplace <- mobilJTup$workplaces_percent_change_from_baseline
residental <- mobilJTup$residential_percent_change_from_baseline
df <- data.frame(x, retail, grocery, park, station, workplace,residental  )

# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")

# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable))+
  geom_point() +
  geom_line() + 
  theme(legend.justification = "top") +
  labs(title = "Google Mobility Index", 
         subtitle = "Propinsi DKI Jakarta Indonesia 2020", 
         y = "Mobility", x = "Date") +
theme(axis.text.x = element_text(angle = -90))

dim(mobilJTup)

## [1] 29 15

library(MASS)

## Warning: package 'MASS' was built under R version 4.1.2

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:dplyr':
## 
##     select

lm.fit <- lm(mobilJTup$retail_and_recreation_percent_change_from_baseline~mobilJTup$grocery_and_pharmacy_percent_change_from_baseline, data=mobilJTup)
anova(lm.fit)

summary(lm.fit)

## 
## Call:
## lm(formula = mobilJTup$retail_and_recreation_percent_change_from_baseline ~ 
##     mobilJTup$grocery_and_pharmacy_percent_change_from_baseline, 
##     data = mobilJTup)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9066 -0.7599  0.4757  0.6816  3.0638 
## 
## Coefficients:
##                                                             Estimate Std. Error
## (Intercept)                                                 -21.9172     0.4776
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline   0.7941     0.0590
##                                                             t value Pr(>|t|)
## (Intercept)                                                  -45.89  < 2e-16
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline   13.46 1.72e-13
##                                                                
## (Intercept)                                                 ***
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.31 on 27 degrees of freedom
## Multiple R-squared:  0.8703, Adjusted R-squared:  0.8655 
## F-statistic: 181.1 on 1 and 27 DF,  p-value: 1.723e-13

plot(lm.fit)

# error berdistribusi normal 
# (data tidak berdistribusi normal)
shapiro.test(residuals(lm.fit))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(lm.fit)
## W = 0.9521, p-value = 0.2075

# varians bersifat konstan 
# (varians tidak konstan)
library(lmtest)

## Warning: package 'lmtest' was built under R version 4.1.2

## Loading required package: zoo

## Warning: package 'zoo' was built under R version 4.1.2

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

bptest(lm.fit)

## 
##  studentized Breusch-Pagan test
## 
## data:  lm.fit
## BP = 0.047053, df = 1, p-value = 0.8283

# error bersifat independen
# (error tidak bersifat independen)
dwtest(lm.fit, alternative = "two.sided")

## 
##  Durbin-Watson test
## 
## data:  lm.fit
## DW = 1.5026, p-value = 0.1467
## alternative hypothesis: true autocorrelation is not 0

# deteksi outlier (stdres > 2)
sres <- rstandard(lm.fit)
sres[which(abs(sres)>2)] # nomor observasi outlier

##         8        29 
##  2.390562 -2.279139

# influential observation
# observasi > percentil 50
# tidak ada observasi dengan jarak cook yang extrim
cooksD <- cooks.distance(lm.fit)
p50 <- qf(0.5, df1=2, df2=560-2)
any(cooksD>p50)

## [1] FALSE

Membangun Regresi Linier Sederhana (Simple Linear Regression) Pada Data Google Mobility Index

Dwi Putri Adilah, 210605110031

3/19/2022

0.1 Apa Itu Google Mobility Index?

0.2 Regresi Linear Sederhana pada Data Google Mobility Index

0.3 Referensi