Membangun Regresi Linier Berganda (Multiple Linier Regression) pada Data Google Mobility Index
3/19/2022
Dosen Pengampu : Prof. Dr. Suhartono, M.Kom
Mata Kuliah : Linier Algebra
Prodi : Teknik Informatika
Lembaga : Universitas Islam Negeri Maulana Malik Ibrahim Malang
Apa Itu Google Mobility Index?
Pemerintah menggunakan sejumlah data dari raksasa teknologi untuk melakukan pemantauan mobilitas masyarakat saat gelombang kasus Covid-19 meningkat di Indonesia. Salah satu yang digunakan adalah laporan dari Google. Isi laporan tersebut akan menampilkan grafik tren mobilitas seiring waktu. Yakni berisi per wilayah untuk kategori tempat relevan sebagai cara merespon pandemi yakni untuk sejumlah tempat dari retail dan rekreasi, toko bahan makanan dan apotek, taman, pusat transportasi umum, tempat kerja dan area pemukiman.
Regresi Linear Berganda (Multiple Linier Regression)
Pada pembahasan ini kita akan membuat tiga buah model regresi linier. Model pertama akan menambahkan variabel
grocery_and_pharmacy_percent_change_from_baselinepada model sebelumnya, model kedua akan menggunakan seluruh ariabel yang ada, dan model ketiga akan melakukan pembaharuan dengan mengeluarkan variabel dengan VIF paling tinggi dari model kedua. Berikut adalah sintaks untuk membentuk ketiag model tersebut:
library(readr)## Warning: package 'readr' was built under R version 4.1.2
X2020_ID_Region_Mobility_Report <- read_csv("D:/DOWNLOAD/2020_ID_Region_Mobility_Report.csv")## Rows: 11235 Columns: 15
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (5): country_region_code, country_region, sub_region_1, iso_3166_2_code...
## dbl (6): retail_and_recreation_percent_change_from_baseline, grocery_and_ph...
## lgl (3): sub_region_2, metro_area, census_fips_code
## date (1): date
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
X2020_ID_Region_Mobility_Report## # A tibble: 11,235 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia <NA> NA NA
## 2 ID Indonesia <NA> NA NA
## 3 ID Indonesia <NA> NA NA
## 4 ID Indonesia <NA> NA NA
## 5 ID Indonesia <NA> NA NA
## 6 ID Indonesia <NA> NA NA
## 7 ID Indonesia <NA> NA NA
## 8 ID Indonesia <NA> NA NA
## 9 ID Indonesia <NA> NA NA
## 10 ID Indonesia <NA> NA NA
## # ... with 11,225 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
library(tidyverse)## Warning: package 'tidyverse' was built under R version 4.1.2
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v dplyr 1.0.8
## v tibble 3.1.6 v stringr 1.4.0
## v tidyr 1.2.0 v forcats 0.5.1
## v purrr 0.3.4
## Warning: package 'ggplot2' was built under R version 4.1.2
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'purrr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
## Warning: package 'stringr' was built under R version 4.1.2
## Warning: package 'forcats' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
mobilJK <- X2020_ID_Region_Mobility_Report %>%
filter(iso_3166_2_code== "ID-JK")
mobilJK## # A tibble: 321 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
mobilJK$date <- as.Date(mobilJK$date)
mobilJK## # A tibble: 321 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
mobilJTup <- filter(mobilJK, "2020-12-01" < date & date < "2020-12-31")
mobilJTup## # A tibble: 29 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 19 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
library(ggplot2)
library(reshape2)## Warning: package 'reshape2' was built under R version 4.1.3
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
x <- mobilJTup$date
retail <- mobilJTup$retail_and_recreation_percent_change_from_baseline
grocery <- mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
park <- mobilJTup$parks_percent_change_from_baseline
station <- mobilJTup$transit_stations_percent_change_from_baseline
workplace <- mobilJTup$workplaces_percent_change_from_baseline
residental <- mobilJTup$residential_percent_change_from_baseline
df <- data.frame(x, retail, grocery, park, station, workplace,residental )
# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")
# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable))+
geom_point() +
geom_line() +
theme(legend.justification = "top") +
labs(title = "Google Mobility Index",
subtitle = "Propinsi DKI Jakarta Indonesia 2020",
y = "Mobility", x = "Date") +
theme(axis.text.x = element_text(angle = -90))dim(mobilJTup)## [1] 29 15
library(car)## Warning: package 'car' was built under R version 4.1.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.1.3
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
# Model pertama
lm.fit1 <- lm(retail_and_recreation_percent_change_from_baseline ~ mobilJTup$grocery_and_pharmacy_percent_change_from_baseline + mobilJTup$parks_percent_change_from_baseline, data=mobilJTup)
anova(lm.fit1)## Analysis of Variance Table
##
## Response: retail_and_recreation_percent_change_from_baseline
## Df Sum Sq Mean Sq
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 1 310.837 310.837
## mobilJTup$parks_percent_change_from_baseline 1 6.148 6.148
## Residuals 26 40.188 1.546
## F value Pr(>F)
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 201.0989 9.518e-14
## mobilJTup$parks_percent_change_from_baseline 3.9774 0.0567
## Residuals
##
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline ***
## mobilJTup$parks_percent_change_from_baseline .
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm.fit1)##
## Call:
## lm(formula = retail_and_recreation_percent_change_from_baseline ~
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline +
## mobilJTup$parks_percent_change_from_baseline, data = mobilJTup)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.4524 -0.6142 0.2511 0.7691 2.5200
##
## Coefficients:
## Estimate
## (Intercept) -16.57523
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 0.73740
## mobilJTup$parks_percent_change_from_baseline 0.12064
## Std. Error t value
## (Intercept) 2.71664 -6.101
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 0.06279 11.744
## mobilJTup$parks_percent_change_from_baseline 0.06049 1.994
## Pr(>|t|)
## (Intercept) 1.90e-06 ***
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 6.76e-12 ***
## mobilJTup$parks_percent_change_from_baseline 0.0567 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.243 on 26 degrees of freedom
## Multiple R-squared: 0.8875, Adjusted R-squared: 0.8788
## F-statistic: 102.5 on 2 and 26 DF, p-value: 4.633e-13
vif(lm.fit1)## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
## 1.257481
## mobilJTup$parks_percent_change_from_baseline
## 1.257481
Berdasarkan hasil perhitungan diketahui nilai VIF dari model
< 10, sehingga asumsi multikolinearitas terpenuhi.
plot(lm.fit1)# Model 2
lm.fit2 <- lm(mobilJTup$retail_and_recreation_percent_change_from_baseline ~ mobilJTup$grocery_and_pharmacy_percent_change_from_baseline + mobilJTup$parks_percent_change_from_baseline + mobilJTup$transit_stations_percent_change_from_baseline + mobilJTup$workplaces_percent_change_from_baseline + mobilJTup$residential_percent_change_from_baseline , data=mobilJTup)
anova(lm.fit2)## Analysis of Variance Table
##
## Response: mobilJTup$retail_and_recreation_percent_change_from_baseline
## Df Sum Sq Mean Sq
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 1 310.837 310.837
## mobilJTup$parks_percent_change_from_baseline 1 6.148 6.148
## mobilJTup$transit_stations_percent_change_from_baseline 1 8.768 8.768
## mobilJTup$workplaces_percent_change_from_baseline 1 0.041 0.041
## mobilJTup$residential_percent_change_from_baseline 1 3.334 3.334
## Residuals 23 28.044 1.219
## F value Pr(>F)
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 254.9289 6.14e-14
## mobilJTup$parks_percent_change_from_baseline 5.0420 0.03465
## mobilJTup$transit_stations_percent_change_from_baseline 7.1912 0.01332
## mobilJTup$workplaces_percent_change_from_baseline 0.0338 0.85566
## mobilJTup$residential_percent_change_from_baseline 2.7347 0.11178
## Residuals
##
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline ***
## mobilJTup$parks_percent_change_from_baseline *
## mobilJTup$transit_stations_percent_change_from_baseline *
## mobilJTup$workplaces_percent_change_from_baseline
## mobilJTup$residential_percent_change_from_baseline
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm.fit2)##
## Call:
## lm(formula = mobilJTup$retail_and_recreation_percent_change_from_baseline ~
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline +
## mobilJTup$parks_percent_change_from_baseline + mobilJTup$transit_stations_percent_change_from_baseline +
## mobilJTup$workplaces_percent_change_from_baseline + mobilJTup$residential_percent_change_from_baseline,
## data = mobilJTup)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.0547 -0.7000 0.2875 0.5940 1.5111
##
## Coefficients:
## Estimate
## (Intercept) -25.45532
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 0.84427
## mobilJTup$parks_percent_change_from_baseline 0.01550
## mobilJTup$transit_stations_percent_change_from_baseline -0.05457
## mobilJTup$workplaces_percent_change_from_baseline 0.06710
## mobilJTup$residential_percent_change_from_baseline 0.39523
## Std. Error t value
## (Intercept) 3.86015 -6.594
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 0.06991 12.076
## mobilJTup$parks_percent_change_from_baseline 0.06915 0.224
## mobilJTup$transit_stations_percent_change_from_baseline 0.13453 -0.406
## mobilJTup$workplaces_percent_change_from_baseline 0.06734 0.996
## mobilJTup$residential_percent_change_from_baseline 0.23900 1.654
## Pr(>|t|)
## (Intercept) 9.96e-07 ***
## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline 1.95e-11 ***
## mobilJTup$parks_percent_change_from_baseline 0.825
## mobilJTup$transit_stations_percent_change_from_baseline 0.689
## mobilJTup$workplaces_percent_change_from_baseline 0.329
## mobilJTup$residential_percent_change_from_baseline 0.112
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.104 on 23 degrees of freedom
## Multiple R-squared: 0.9215, Adjusted R-squared: 0.9044
## F-statistic: 53.99 on 5 and 23 DF, p-value: 5.993e-12
vif(lm.fit2)## mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
## 1.976090
## mobilJTup$parks_percent_change_from_baseline
## 2.083029
## mobilJTup$transit_stations_percent_change_from_baseline
## 14.775100
## mobilJTup$workplaces_percent_change_from_baseline
## 17.585806
## mobilJTup$residential_percent_change_from_baseline
## 14.276831
Berdasarkan hasil perhitungan diperoleh nilai VIF untuk varaibel prediktor dalam model
< 10hanya ada 2 variavel yaitumobilJTupgroceryandpharmacypercentchangefrombaselinedanmobilJTupparks_percent_change_from_baseline, sehingga asumsi multikolinearitas terpenuhi.Sedangkan tiga yang lain nya di asumsikan tidak multikolinearitas karena> 10yaitumobilJTuptransitstationspercentchangefrombaseline,mobilJTup workplaces_percent_change_from_baselinedanmobilJTup$residential_percent_change_from_baseline.
plot(lm.fit2)