Membangun Model Linier dengan Interaksi Antar Variabel Prediktor pada data Google Mobility Index
3/19/2022
Dosen Pengampu : Prof. Dr. Suhartono, M.Kom
Mata Kuliah : Linier Algebra
Prodi : Teknik Informatika
Lembaga : Universitas Islam Negeri Maulana Malik Ibrahim Malang
Model Linier dengan Interaksi Antar Variabel Prediktor Interaksi antar variabel
pada model linier dapat dengan mudah dimasukkan kedalam fungsi lm(). Terdapat dua buah cara untuk melakukannya. Cara pertama dengan menggunakan tanda : pada formula ( contoh: y1 x1 + x2 + x1:x2 ). Tanda : menyatakan formula persamaan linier memasukkan interaksi antar variabel prediktor di dalamnya. Cara kedua adalah dengan menggunakan tanda *. Cara ini lebih sederhana, dimana fungsi lm() akan secara otomatis menerjemahkannya sebagai serangkaian variabel tunggal dan interaksinya. Berikut adalah contoh penerapannya menggunakan kedua cara tersebut:
library(readr)## Warning: package 'readr' was built under R version 4.1.2
X2020_ID_Region_Mobility_Report <- read_csv("D:/DOWNLOAD/2020_ID_Region_Mobility_Report.csv")## Rows: 11235 Columns: 15
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (5): country_region_code, country_region, sub_region_1, iso_3166_2_code...
## dbl (6): retail_and_recreation_percent_change_from_baseline, grocery_and_ph...
## lgl (3): sub_region_2, metro_area, census_fips_code
## date (1): date
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
X2020_ID_Region_Mobility_Report## # A tibble: 11,235 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia <NA> NA NA
## 2 ID Indonesia <NA> NA NA
## 3 ID Indonesia <NA> NA NA
## 4 ID Indonesia <NA> NA NA
## 5 ID Indonesia <NA> NA NA
## 6 ID Indonesia <NA> NA NA
## 7 ID Indonesia <NA> NA NA
## 8 ID Indonesia <NA> NA NA
## 9 ID Indonesia <NA> NA NA
## 10 ID Indonesia <NA> NA NA
## # ... with 11,225 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
library(tidyverse)## Warning: package 'tidyverse' was built under R version 4.1.2
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v dplyr 1.0.8
## v tibble 3.1.6 v stringr 1.4.0
## v tidyr 1.2.0 v forcats 0.5.1
## v purrr 0.3.4
## Warning: package 'ggplot2' was built under R version 4.1.2
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'purrr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
## Warning: package 'stringr' was built under R version 4.1.2
## Warning: package 'forcats' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
mobilJK <- X2020_ID_Region_Mobility_Report %>%
filter(iso_3166_2_code== "ID-JK")
mobilJK## # A tibble: 321 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
mobilJK$date <- as.Date(mobilJK$date)
mobilJK## # A tibble: 321 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
mobilJTup <- filter(mobilJK, "2020-12-01" < date & date < "2020-12-31")
mobilJTup## # A tibble: 29 x 15
## country_region_code country_region sub_region_1 sub_region_2 metro_area
## <chr> <chr> <chr> <lgl> <lgl>
## 1 ID Indonesia Jakarta NA NA
## 2 ID Indonesia Jakarta NA NA
## 3 ID Indonesia Jakarta NA NA
## 4 ID Indonesia Jakarta NA NA
## 5 ID Indonesia Jakarta NA NA
## 6 ID Indonesia Jakarta NA NA
## 7 ID Indonesia Jakarta NA NA
## 8 ID Indonesia Jakarta NA NA
## 9 ID Indonesia Jakarta NA NA
## 10 ID Indonesia Jakarta NA NA
## # ... with 19 more rows, and 10 more variables: iso_3166_2_code <chr>,
## # census_fips_code <lgl>, place_id <chr>, date <date>,
## # retail_and_recreation_percent_change_from_baseline <dbl>,
## # grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## # parks_percent_change_from_baseline <dbl>,
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>, ...
library(ggplot2)
library(reshape2)## Warning: package 'reshape2' was built under R version 4.1.3
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
x <- mobilJTup$date
retail <- mobilJTup$retail_and_recreation_percent_change_from_baseline
grocery <- mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
park <- mobilJTup$parks_percent_change_from_baseline
station <- mobilJTup$transit_stations_percent_change_from_baseline
workplace <- mobilJTup$workplaces_percent_change_from_baseline
residental <- mobilJTup$residential_percent_change_from_baseline
df <- data.frame(x, retail, grocery, park, station, workplace,residental )
# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")
# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable))+
geom_point() +
geom_line() +
theme(legend.justification = "top") +
labs(title = "Google Mobility Index",
subtitle = "Propinsi DKI Jakarta Indonesia 2020",
y = "Mobility", x = "Date") +
theme(axis.text.x = element_text(angle = -90))dim(mobilJTup)## [1] 29 15
# cara 1
lm.inter <- lm(retail_and_recreation_percent_change_from_baseline~grocery_and_pharmacy_percent_change_from_baseline+parks_percent_change_from_baseline+grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline, data=mobilJTup)
anova(lm.inter)## Analysis of Variance Table
##
## Response: retail_and_recreation_percent_change_from_baseline
## Df
## grocery_and_pharmacy_percent_change_from_baseline 1
## parks_percent_change_from_baseline 1
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 1
## Residuals 25
## Sum Sq
## grocery_and_pharmacy_percent_change_from_baseline 310.837
## parks_percent_change_from_baseline 6.148
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.182
## Residuals 40.006
## Mean Sq
## grocery_and_pharmacy_percent_change_from_baseline 310.837
## parks_percent_change_from_baseline 6.148
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.182
## Residuals 1.600
## F value
## grocery_and_pharmacy_percent_change_from_baseline 194.2417
## parks_percent_change_from_baseline 3.8418
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.1134
## Residuals
## Pr(>F)
## grocery_and_pharmacy_percent_change_from_baseline 2.726e-13
## parks_percent_change_from_baseline 0.06123
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.73908
## Residuals
##
## grocery_and_pharmacy_percent_change_from_baseline ***
## parks_percent_change_from_baseline .
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm.inter)##
## Call:
## lm(formula = retail_and_recreation_percent_change_from_baseline ~
## grocery_and_pharmacy_percent_change_from_baseline + parks_percent_change_from_baseline +
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline,
## data = mobilJTup)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.5336 -0.6497 0.4277 0.7500 2.4945
##
## Coefficients:
## Estimate
## (Intercept) -18.26069
## grocery_and_pharmacy_percent_change_from_baseline 0.54858
## parks_percent_change_from_baseline 0.08411
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline -0.00403
## Std. Error
## (Intercept) 5.71701
## grocery_and_pharmacy_percent_change_from_baseline 0.56427
## parks_percent_change_from_baseline 0.12472
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.01196
## t value
## (Intercept) -3.194
## grocery_and_pharmacy_percent_change_from_baseline 0.972
## parks_percent_change_from_baseline 0.674
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline -0.337
## Pr(>|t|)
## (Intercept) 0.00377
## grocery_and_pharmacy_percent_change_from_baseline 0.34027
## parks_percent_change_from_baseline 0.50625
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline 0.73908
##
## (Intercept) **
## grocery_and_pharmacy_percent_change_from_baseline
## parks_percent_change_from_baseline
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.265 on 25 degrees of freedom
## Multiple R-squared: 0.888, Adjusted R-squared: 0.8746
## F-statistic: 66.07 on 3 and 25 DF, p-value: 5.075e-12
plot(lm.inter)