Membangun Model Linier dengan Interaksi Antar Variabel Prediktor pada data Google Mobility Index

3/19/2022

Mutiara Aprillia Dzakiroh, 210605110032

Dosen Pengampu : Prof. Dr. Suhartono, M.Kom

Mata Kuliah : Linier Algebra

Prodi : Teknik Informatika

Lembaga : Universitas Islam Negeri Maulana Malik Ibrahim Malang

Model Linier dengan Interaksi Antar Variabel Prediktor Interaksi antar variabel

pada model linier dapat dengan mudah dimasukkan kedalam fungsi lm(). Terdapat dua buah cara untuk melakukannya. Cara pertama dengan menggunakan tanda : pada formula ( contoh: y1 x1 + x2 + x1:x2 ). Tanda : menyatakan formula persamaan linier memasukkan interaksi antar variabel prediktor di dalamnya. Cara kedua adalah dengan menggunakan tanda *. Cara ini lebih sederhana, dimana fungsi lm() akan secara otomatis menerjemahkannya sebagai serangkaian variabel tunggal dan interaksinya. Berikut adalah contoh penerapannya menggunakan kedua cara tersebut:

library(readr)
## Warning: package 'readr' was built under R version 4.1.2
X2020_ID_Region_Mobility_Report <- read_csv("D:/DOWNLOAD/2020_ID_Region_Mobility_Report.csv")
## Rows: 11235 Columns: 15
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr  (5): country_region_code, country_region, sub_region_1, iso_3166_2_code...
## dbl  (6): retail_and_recreation_percent_change_from_baseline, grocery_and_ph...
## lgl  (3): sub_region_2, metro_area, census_fips_code
## date (1): date
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
X2020_ID_Region_Mobility_Report
## # A tibble: 11,235 x 15
##    country_region_code country_region sub_region_1 sub_region_2 metro_area
##    <chr>               <chr>          <chr>        <lgl>        <lgl>     
##  1 ID                  Indonesia      <NA>         NA           NA        
##  2 ID                  Indonesia      <NA>         NA           NA        
##  3 ID                  Indonesia      <NA>         NA           NA        
##  4 ID                  Indonesia      <NA>         NA           NA        
##  5 ID                  Indonesia      <NA>         NA           NA        
##  6 ID                  Indonesia      <NA>         NA           NA        
##  7 ID                  Indonesia      <NA>         NA           NA        
##  8 ID                  Indonesia      <NA>         NA           NA        
##  9 ID                  Indonesia      <NA>         NA           NA        
## 10 ID                  Indonesia      <NA>         NA           NA        
## # ... with 11,225 more rows, and 10 more variables: iso_3166_2_code <chr>,
## #   census_fips_code <lgl>, place_id <chr>, date <date>,
## #   retail_and_recreation_percent_change_from_baseline <dbl>,
## #   grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## #   parks_percent_change_from_baseline <dbl>,
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>, ...
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.2
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v dplyr   1.0.8
## v tibble  3.1.6     v stringr 1.4.0
## v tidyr   1.2.0     v forcats 0.5.1
## v purrr   0.3.4
## Warning: package 'ggplot2' was built under R version 4.1.2
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'purrr' was built under R version 4.1.2
## Warning: package 'dplyr' was built under R version 4.1.2
## Warning: package 'stringr' was built under R version 4.1.2
## Warning: package 'forcats' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
mobilJK <- X2020_ID_Region_Mobility_Report %>%
       filter(iso_3166_2_code== "ID-JK")
mobilJK
## # A tibble: 321 x 15
##    country_region_code country_region sub_region_1 sub_region_2 metro_area
##    <chr>               <chr>          <chr>        <lgl>        <lgl>     
##  1 ID                  Indonesia      Jakarta      NA           NA        
##  2 ID                  Indonesia      Jakarta      NA           NA        
##  3 ID                  Indonesia      Jakarta      NA           NA        
##  4 ID                  Indonesia      Jakarta      NA           NA        
##  5 ID                  Indonesia      Jakarta      NA           NA        
##  6 ID                  Indonesia      Jakarta      NA           NA        
##  7 ID                  Indonesia      Jakarta      NA           NA        
##  8 ID                  Indonesia      Jakarta      NA           NA        
##  9 ID                  Indonesia      Jakarta      NA           NA        
## 10 ID                  Indonesia      Jakarta      NA           NA        
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## #   census_fips_code <lgl>, place_id <chr>, date <date>,
## #   retail_and_recreation_percent_change_from_baseline <dbl>,
## #   grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## #   parks_percent_change_from_baseline <dbl>,
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>, ...
mobilJK$date <- as.Date(mobilJK$date)
mobilJK
## # A tibble: 321 x 15
##    country_region_code country_region sub_region_1 sub_region_2 metro_area
##    <chr>               <chr>          <chr>        <lgl>        <lgl>     
##  1 ID                  Indonesia      Jakarta      NA           NA        
##  2 ID                  Indonesia      Jakarta      NA           NA        
##  3 ID                  Indonesia      Jakarta      NA           NA        
##  4 ID                  Indonesia      Jakarta      NA           NA        
##  5 ID                  Indonesia      Jakarta      NA           NA        
##  6 ID                  Indonesia      Jakarta      NA           NA        
##  7 ID                  Indonesia      Jakarta      NA           NA        
##  8 ID                  Indonesia      Jakarta      NA           NA        
##  9 ID                  Indonesia      Jakarta      NA           NA        
## 10 ID                  Indonesia      Jakarta      NA           NA        
## # ... with 311 more rows, and 10 more variables: iso_3166_2_code <chr>,
## #   census_fips_code <lgl>, place_id <chr>, date <date>,
## #   retail_and_recreation_percent_change_from_baseline <dbl>,
## #   grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## #   parks_percent_change_from_baseline <dbl>,
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>, ...
mobilJTup <- filter(mobilJK, "2020-12-01" < date & date < "2020-12-31")
mobilJTup
## # A tibble: 29 x 15
##    country_region_code country_region sub_region_1 sub_region_2 metro_area
##    <chr>               <chr>          <chr>        <lgl>        <lgl>     
##  1 ID                  Indonesia      Jakarta      NA           NA        
##  2 ID                  Indonesia      Jakarta      NA           NA        
##  3 ID                  Indonesia      Jakarta      NA           NA        
##  4 ID                  Indonesia      Jakarta      NA           NA        
##  5 ID                  Indonesia      Jakarta      NA           NA        
##  6 ID                  Indonesia      Jakarta      NA           NA        
##  7 ID                  Indonesia      Jakarta      NA           NA        
##  8 ID                  Indonesia      Jakarta      NA           NA        
##  9 ID                  Indonesia      Jakarta      NA           NA        
## 10 ID                  Indonesia      Jakarta      NA           NA        
## # ... with 19 more rows, and 10 more variables: iso_3166_2_code <chr>,
## #   census_fips_code <lgl>, place_id <chr>, date <date>,
## #   retail_and_recreation_percent_change_from_baseline <dbl>,
## #   grocery_and_pharmacy_percent_change_from_baseline <dbl>,
## #   parks_percent_change_from_baseline <dbl>,
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>, ...
library(ggplot2)
library(reshape2)
## Warning: package 'reshape2' was built under R version 4.1.3
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
x <- mobilJTup$date
retail <- mobilJTup$retail_and_recreation_percent_change_from_baseline
grocery <- mobilJTup$grocery_and_pharmacy_percent_change_from_baseline
park <- mobilJTup$parks_percent_change_from_baseline
station <- mobilJTup$transit_stations_percent_change_from_baseline
workplace <- mobilJTup$workplaces_percent_change_from_baseline
residental <- mobilJTup$residential_percent_change_from_baseline
df <- data.frame(x, retail, grocery, park, station, workplace,residental  )

# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")

# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable))+
  geom_point() +
  geom_line() + 
  theme(legend.justification = "top") +
  labs(title = "Google Mobility Index", 
         subtitle = "Propinsi DKI Jakarta Indonesia 2020", 
         y = "Mobility", x = "Date") +
theme(axis.text.x = element_text(angle = -90))

dim(mobilJTup)
## [1] 29 15
# cara 1
lm.inter <- lm(retail_and_recreation_percent_change_from_baseline~grocery_and_pharmacy_percent_change_from_baseline+parks_percent_change_from_baseline+grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline, data=mobilJTup)
anova(lm.inter)
## Analysis of Variance Table
## 
## Response: retail_and_recreation_percent_change_from_baseline
##                                                                                      Df
## grocery_and_pharmacy_percent_change_from_baseline                                     1
## parks_percent_change_from_baseline                                                    1
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline  1
## Residuals                                                                            25
##                                                                                       Sum Sq
## grocery_and_pharmacy_percent_change_from_baseline                                    310.837
## parks_percent_change_from_baseline                                                     6.148
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline   0.182
## Residuals                                                                             40.006
##                                                                                      Mean Sq
## grocery_and_pharmacy_percent_change_from_baseline                                    310.837
## parks_percent_change_from_baseline                                                     6.148
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline   0.182
## Residuals                                                                              1.600
##                                                                                       F value
## grocery_and_pharmacy_percent_change_from_baseline                                    194.2417
## parks_percent_change_from_baseline                                                     3.8418
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline   0.1134
## Residuals                                                                                    
##                                                                                         Pr(>F)
## grocery_and_pharmacy_percent_change_from_baseline                                    2.726e-13
## parks_percent_change_from_baseline                                                     0.06123
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline   0.73908
## Residuals                                                                                     
##                                                                                         
## grocery_and_pharmacy_percent_change_from_baseline                                    ***
## parks_percent_change_from_baseline                                                   .  
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline    
## Residuals                                                                               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm.inter)
## 
## Call:
## lm(formula = retail_and_recreation_percent_change_from_baseline ~ 
##     grocery_and_pharmacy_percent_change_from_baseline + parks_percent_change_from_baseline + 
##         grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline, 
##     data = mobilJTup)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5336 -0.6497  0.4277  0.7500  2.4945 
## 
## Coefficients:
##                                                                                       Estimate
## (Intercept)                                                                          -18.26069
## grocery_and_pharmacy_percent_change_from_baseline                                      0.54858
## parks_percent_change_from_baseline                                                     0.08411
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline  -0.00403
##                                                                                      Std. Error
## (Intercept)                                                                             5.71701
## grocery_and_pharmacy_percent_change_from_baseline                                       0.56427
## parks_percent_change_from_baseline                                                      0.12472
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline    0.01196
##                                                                                      t value
## (Intercept)                                                                           -3.194
## grocery_and_pharmacy_percent_change_from_baseline                                      0.972
## parks_percent_change_from_baseline                                                     0.674
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline  -0.337
##                                                                                      Pr(>|t|)
## (Intercept)                                                                           0.00377
## grocery_and_pharmacy_percent_change_from_baseline                                     0.34027
## parks_percent_change_from_baseline                                                    0.50625
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline  0.73908
##                                                                                        
## (Intercept)                                                                          **
## grocery_and_pharmacy_percent_change_from_baseline                                      
## parks_percent_change_from_baseline                                                     
## grocery_and_pharmacy_percent_change_from_baseline:parks_percent_change_from_baseline   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.265 on 25 degrees of freedom
## Multiple R-squared:  0.888,  Adjusted R-squared:  0.8746 
## F-statistic: 66.07 on 3 and 25 DF,  p-value: 5.075e-12
plot(lm.inter)

Referensi