Analisa Data dengan Regresi Linear dengan tiga model dengan Google Mobility Index dan Data Covid-19 Jakarta Januari 2022

Nama Moch Hasbi Ashidqy
NIM 210605110104
Dosen Pengampu Prof. Dr M. Suhartono, M.Kom
Program S1 Teknik Informatika

1. Overview

Analisa Regresi Linear Tiga Model Linear Model, SS Model, Smooth Spline Model untuk data Covid-19 DKI Jakarta bulan Januari 2022 pada data status penderita yang Dirawat.

Tujuannya adalah dapat memprediksi jumlah pasien yang Dirawat berdasarkan informasi data tanggal dari data covid 19 DKI Jakarta pada bulan Januari 2022

Tiga model ini akan menyajikan prediksi terdekat jumlah pasien yang seharusnya dirawat sebagai Learning Machine

Smoothing splines

Smoothing splines adalah pendekatan yang ampuh untuk memperkirakan hubungan fungsional antara prediktor X dan respons Y. Smoothing splines dapat ditampung baik menggunakan fungsi smooth.spline (dalam paket stats) atau fungsi ss (dalam paket npreg). Dokumen ini memberikan latar belakang teoritis tentang smoothing splines, serta contoh-contoh yang menggambarkan bagaimana menggunakan fungsi smooth.spline dan ss.

Seperti yang saya tunjukkan dalam tutorial ini, kedua fungsi memiliki sintaks yang sangat mirip, tetapi fungsi ss menawarkan beberapa opsi tambahan. Dibandingkan dengan fungsi smooth.spline, fungsi ss memiliki

lebih banyak metode pemilihan parameter penghalusan lebih banyak jenis spline (linier, kubik, quintic) opsi untuk batasan periodisitas opsi untuk nilai simpul yang ditentukan pengguna ringkasan yang sesuai dan metode plot

Untuk melihat bagaimana kinerja fungsi-fungsi ini dalam praktiknya, mari kita lihat contoh simulasinya. Secara khusus, mari simulasikan beberapa data dengan hubungan fungsional (berkala) yang memiliki beberapa gangguan.

2. Aktivasi Library

Pada Analisa regrresi tiga model memerlukan sejumlah library. Library yang digunakan dalam analisa studi kasus ini antara lain:

  1. readr : library ini digunakan untuk membaca file CSV/TASV

  2. tidyverse : library yang berisi kumpulan fungsi untuk analisa data.

  3. dplyr : library untuk membuat ringkasan data.

  4. npreg : library ini digunakan untuk membuat model ss

  5. stats : library ini digunakan untuk membuat model smooth spline

3. Import Dataset

library(readr)
mobility_DKI_Jakarta <- read_csv("DataCovid19 DKI Jakarta.1.csv")
## Rows: 31 Columns: 12
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl  (11): Dirawat, retail_and_recreation_percent_change_from_baseline, groc...
## date  (1): date
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
mobility_DKI_Jakarta
## # A tibble: 31 x 12
##    date       Dirawat retail_and_recreation_p~ grocery_and_pha~ parks_percent_c~
##    <date>       <dbl>                    <dbl>            <dbl>            <dbl>
##  1 2022-01-01     205                        3               24               46
##  2 2022-01-02     116                       11               35               28
##  3 2022-01-03     140                       14               36               20
##  4 2022-01-04     141                       11               33               15
##  5 2022-01-05     168                       12               34               17
##  6 2022-01-06     214                       10               30               12
##  7 2022-01-07     221                        6               25                6
##  8 2022-01-08     257                        8               31               10
##  9 2022-01-09     299                        7               28               13
## 10 2022-01-10     322                        7               27                6
## # ... with 21 more rows, and 7 more variables:
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>,
## #   residential_percent_change_from_baseline <dbl>, POSITIF <dbl>,
## #   Sembuh <dbl>, Meninggal <dbl>, `Self Isolation` <dbl>

View Dataset

library(DT)
## Warning: package 'DT' was built under R version 4.1.3
datatable(mobility_DKI_Jakarta)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(rmdformats)
## Warning: package 'rmdformats' was built under R version 4.1.3
tab <- table(mobility_DKI_Jakarta$Dirawat, mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline>25)
pilltabs(
  tab,
  count = TRUE,
  rows = TRUE,
  cols = TRUE,
  chisq = TRUE,
  resid = TRUE,
  row.names = TRUE
)
## Warning in stats::chisq.test(tab): Chi-squared approximation may be incorrect
## Warning in stats::chisq.test(tab): Chi-squared approximation may be incorrect
FALSE
116 1
140 1
141 1
168 1
205 1
214 1
221 1
257 1
299 1
322 1
402 1
506 1
583 1
669 1
758 1
781 1
809 1
981 1
1125 1
1222 1
1445 1
1710 1
2011 1
2242 1
2721 1
3232 1
3989 1
4709 1
5545 1
6348 1
6809 1
FALSE Total n
116 100 100 1
140 100 100 1
141 100 100 1
168 100 100 1
205 100 100 1
214 100 100 1
221 100 100 1
257 100 100 1
299 100 100 1
322 100 100 1
402 100 100 1
506 100 100 1
583 100 100 1
669 100 100 1
758 100 100 1
781 100 100 1
809 100 100 1
981 100 100 1
1125 100 100 1
1222 100 100 1
1445 100 100 1
1710 100 100 1
2011 100 100 1
2242 100 100 1
2721 100 100 1
3232 100 100 1
3989 100 100 1
4709 100 100 1
5545 100 100 1
6348 100 100 1
6809 100 100 1
All 100 100 31
FALSE All
116 3.2 3.2
140 3.2 3.2
141 3.2 3.2
168 3.2 3.2
205 3.2 3.2
214 3.2 3.2
221 3.2 3.2
257 3.2 3.2
299 3.2 3.2
322 3.2 3.2
402 3.2 3.2
506 3.2 3.2
583 3.2 3.2
669 3.2 3.2
758 3.2 3.2
781 3.2 3.2
809 3.2 3.2
981 3.2 3.2
1125 3.2 3.2
1222 3.2 3.2
1445 3.2 3.2
1710 3.2 3.2
2011 3.2 3.2
2242 3.2 3.2
2721 3.2 3.2
3232 3.2 3.2
3989 3.2 3.2
4709 3.2 3.2
5545 3.2 3.2
6348 3.2 3.2
6809 3.2 3.2
Total 100.0 100.0
n 31.0 31.0
x
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0
20 0
21 0
22 0
23 0
24 0
25 0
26 0
27 0
28 0
29 0
30 0
31 0

X-squared = 0, df = 30, p = 1

Cek Struktur Data

cek<-str(mobility_DKI_Jakarta)
## spec_tbl_df [31 x 12] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ date                                              : Date[1:31], format: "2022-01-01" "2022-01-02" ...
##  $ Dirawat                                           : num [1:31] 205 116 140 141 168 214 221 257 299 322 ...
##  $ retail_and_recreation_percent_change_from_baseline: num [1:31] 3 11 14 11 12 10 6 8 7 7 ...
##  $ grocery_and_pharmacy_percent_change_from_baseline : num [1:31] 24 35 36 33 34 30 25 31 28 27 ...
##  $ parks_percent_change_from_baseline                : num [1:31] 46 28 20 15 17 12 6 10 13 6 ...
##  $ transit_stations_percent_change_from_baseline     : num [1:31] -15 -4 -9 -9 -10 -11 -13 -6 -6 -13 ...
##  $ workplaces_percent_change_from_baseline           : num [1:31] -49 -4 -10 -8 -7 -5 -3 1 2 -4 ...
##  $ residential_percent_change_from_baseline          : num [1:31] 6 3 4 4 4 5 5 3 4 4 ...
##  $ POSITIF                                           : num [1:31] 865415 865518 865690 865805 866064 ...
##  $ Sembuh                                            : num [1:31] 851280 851386 851408 851449 851520 ...
##  $ Meninggal                                         : num [1:31] 13588 13588 13588 13588 13589 ...
##  $ Self Isolation                                    : num [1:31] 342 428 554 627 787 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   date = col_date(format = ""),
##   ..   Dirawat = col_double(),
##   ..   retail_and_recreation_percent_change_from_baseline = col_double(),
##   ..   grocery_and_pharmacy_percent_change_from_baseline = col_double(),
##   ..   parks_percent_change_from_baseline = col_double(),
##   ..   transit_stations_percent_change_from_baseline = col_double(),
##   ..   workplaces_percent_change_from_baseline = col_double(),
##   ..   residential_percent_change_from_baseline = col_double(),
##   ..   POSITIF = col_double(),
##   ..   Sembuh = col_double(),
##   ..   Meninggal = col_double(),
##   ..   `Self Isolation` = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
cek
## NULL

Ringkasan Data

ringkasan<-summary(mobility_DKI_Jakarta)
ringkasan
##       date               Dirawat    
##  Min.   :2022-01-01   Min.   : 116  
##  1st Qu.:2022-01-08   1st Qu.: 278  
##  Median :2022-01-16   Median : 781  
##  Mean   :2022-01-16   Mean   :1635  
##  3rd Qu.:2022-01-23   3rd Qu.:2126  
##  Max.   :2022-01-31   Max.   :6809  
##  retail_and_recreation_percent_change_from_baseline
##  Min.   : 1.000                                    
##  1st Qu.: 6.000                                    
##  Median : 7.000                                    
##  Mean   : 7.452                                    
##  3rd Qu.: 9.000                                    
##  Max.   :14.000                                    
##  grocery_and_pharmacy_percent_change_from_baseline
##  Min.   :20.00                                    
##  1st Qu.:25.00                                    
##  Median :26.00                                    
##  Mean   :27.55                                    
##  3rd Qu.:29.50                                    
##  Max.   :36.00                                    
##  parks_percent_change_from_baseline
##  Min.   :-1.00                     
##  1st Qu.: 5.50                     
##  Median : 9.00                     
##  Mean   :10.94                     
##  3rd Qu.:14.00                     
##  Max.   :46.00                     
##  transit_stations_percent_change_from_baseline
##  Min.   :-17.00                               
##  1st Qu.:-15.00                               
##  Median :-14.00                               
##  Mean   :-12.65                               
##  3rd Qu.:-10.00                               
##  Max.   : -4.00                               
##  workplaces_percent_change_from_baseline
##  Min.   :-49.000                        
##  1st Qu.: -4.000                        
##  Median :  0.000                        
##  Mean   : -1.677                        
##  3rd Qu.:  3.000                        
##  Max.   :  8.000                        
##  residential_percent_change_from_baseline    POSITIF           Sembuh      
##  Min.   :3.000                            Min.   :865415   Min.   :851280  
##  1st Qu.:4.500                            1st Qu.:867106   1st Qu.:851783  
##  Median :5.000                            Median :870929   Median :853522  
##  Mean   :5.258                            Mean   :876297   Mean   :855268  
##  3rd Qu.:6.000                            3rd Qu.:880304   3rd Qu.:856934  
##  Max.   :7.000                            Max.   :913355   Max.   :867519  
##    Meninggal     Self Isolation 
##  Min.   :13588   Min.   :  342  
##  1st Qu.:13589   1st Qu.: 1456  
##  Median :13591   Median : 3035  
##  Mean   :13598   Mean   : 5749  
##  3rd Qu.:13597   3rd Qu.: 7646  
##  Max.   :13666   Max.   :25361

Deskripsi Data

library(jmv)
## Warning: package 'jmv' was built under R version 4.1.3
# Use the descritptives function to get the descritptive data
descriptives(mobility_DKI_Jakarta, vars = vars(Dirawat, retail_and_recreation_percent_change_from_baseline), freq = TRUE)
## 
##  DESCRIPTIVES
## 
##  Descriptives                                                                             
##  ---------------------------------------------------------------------------------------- 
##                          Dirawat     retail_and_recreation_percent_change_from_baseline   
##  ---------------------------------------------------------------------------------------- 
##    N                           31                                                    31   
##    Missing                      0                                                     0   
##    Mean                  1634.839                                              7.451613   
##    Median                781.0000                                              7.000000   
##    Standard deviation    1930.584                                              3.085868   
##    Minimum               116.0000                                              1.000000   
##    Maximum               6809.000                                              14.00000   
##  ----------------------------------------------------------------------------------------

4. Distribusi Data

Distribusi Data ditampilkan melalui Histogram dan Grafik menunjukkan trend pasien yang dirawat di RS. Pada Histogram menampilkan mean Variable dalam bentuk Vertikal.

Histogram pasien Covid-19 yang dirawat di RS (Frequency)

hist(mobility_DKI_Jakarta$Dirawat, main="Distribusi Dirawat di RS untuk Covid 19", xlab="Dirawat di RS Untuk Covid 19", xlim=c(100,7000),ylim=c(0,15)) # histogram variabel wt
mean=mean(mobility_DKI_Jakarta$Dirawat) # Menghitung nilai mean variabel wt
abline(v=mean, col="red",lwd=2) # Menambahkan garis vertical pada plot

Grafik pasien Covid-19 yang dirawat di RS (Density Variabel)

d<-density(mobility_DKI_Jakarta$Dirawat) # menghitung density variabel wt
plot(d,  xlab="Dirawat di RS untuk Covid 19", main="Distribusi Dirawat di RS untuk Covid 19")

5. Model

library(npreg)
## Warning: package 'npreg' was built under R version 4.1.3

Model SS antara Date dengan Dirawat

# fit using ss
mod.ss <- ss( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$Dirawat, nknots = 10)
mod.ss
## 
## Call:
## ss(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$Dirawat, 
##     nknots = 10)
## 
## Smoothing Parameter  spar = 0.01776397   lambda = 8.009766e-08
## Equivalent Degrees of Freedom (Df) 10.76019
## Penalized Criterion (RSS) 55189.81
## Generalized Cross-Validation (GCV) 4176.454

Model smooth.spline antara Date dengan Dirawat

# fit using smooth.spline
mod.smsp <- smooth.spline( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$Dirawat, nknots = 10)
mod.smsp
## Call:
## smooth.spline(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$Dirawat, 
##     nknots = 10)
## 
## Smoothing Parameter  spar= -0.05618678  lambda= 1.0263e-06 (16 iterations)
## Equivalent Degrees of Freedom (Df): 11.7885
## Penalized Criterion (RSS): 20919.79
## GCV: 1757.101

RMSE

Root Mean Squared Error (RMSE) merupakan salah satu cara untuk mengevaluasi model regresi linear dengan mengukur tingkat akurasi hasil perkiraan suatu model. RMSE dihitung dengan mengkuadratkan error (prediksi – observasi) dibagi dengan jumlah data (= rata-rata), lalu diakarkan. RMSE tidak memiliki satuan.

RMSE antara model SS dengan smooth.spline

# rmse between solutions
sqrt(mean(( mod.ss$y - mod.smsp$y )^2))
## [1] 32.00289

RMSE antara Dirawat dengan model ss

# rmse between solutions and f(x)
sqrt(mean(( mobility_DKI_Jakarta$Dirawat - mod.ss$y )^2))
## [1] 42.1938

Menambahkan Coloum model SS dan Smooth.spline Dirawat

mobility_DKI_Jakarta$prediksi_ss <- mod.ss$y
mobility_DKI_Jakarta$prediksi_smsp <- mod.smsp$y
datatable(mobility_DKI_Jakarta)

Model SS Date dengan Retail

library(npreg)
# fit using ss
mod.ss1 <- ss( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, nknots = 10)
mod.ss1
## 
## Call:
## ss(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, 
##     nknots = 10)
## 
## Smoothing Parameter  spar = 0.2344568   lambda = 2.945537e-06
## Equivalent Degrees of Freedom (Df) 8.355361
## Penalized Criterion (RSS) 96.48915
## Generalized Cross-Validation (GCV) 5.833233

Model Smooth.Spline Date dengan Retail

# fit using smooth.spline
mod.smsp1 <- smooth.spline(  mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, nknots = 10)
mod.smsp1
## Call:
## smooth.spline(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, 
##     nknots = 10)
## 
## Smoothing Parameter  spar= -0.004828107  lambda= 2.409043e-06 (15 iterations)
## Equivalent Degrees of Freedom (Df): 11.55074
## Penalized Criterion (RSS): 55.17046
## GCV: 4.52129

Menambahkan Coloum Prediksi Model SS dan Smooth.Spline Retail

mobility_DKI_Jakarta$prediksi_ss1 <- mod.ss1$y
mobility_DKI_Jakarta$prediksi_smsp1 <- mod.smsp1$y
datatable(mobility_DKI_Jakarta)

6. Plot Regresi Linear Tiga Model

Hasil plot untuk Dirawat dengan Date beserta model SS dan Smooth.Spline

# plot results
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat,
     xlab = "Tangggal",
     ylab = "Dirawat di RS")
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_ss , lty = 2, col = 'black', lwd = 3)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_smsp, lty = 5, col = 'red', lwd = 3)

# plot method
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat, lty = 10, col = 'black', lwd =5)
#plot(mod.ss)
# add lm fit
abline(coef(lm( mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$date , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
rug(mobility_DKI_Jakarta$date)  # add rug to plot
#points(mobilityjakarta$Tanggal, mobilityjakarta$DIRAWAT, lty = 2, col = 2, lwd = 2)
points(mobility_DKI_Jakarta$date,mod.ss$y , lty = 2, col = 'red', lwd = 4)
#points(mobilityjakarta$Tanggal,mod.smsp$y , lty = 2, col = 4, lwd = 2)
legend("topright", 
       legend = c("Real", "Model SS", "Trends"), 
       lty = 1:3, col = c("Black","Red","Blue"), lwd = 3, bty = "p")

Hasil plot untuk Date dengan Retail beserta model SS dan Smooth.Spline

# plot results
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, lty = 10, col = 'black', lwd =5,
     xlab = "Tanggal",
     ylab = "Retail and Recreation")
# add lm fit
abline(coef(lm( mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline ~ mobility_DKI_Jakarta$date  , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_ss1 , lty = 2, col = 'black', lwd = 4)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_smsp1, lty = 4, col = 'red', lwd = 3)

Korelasi Pearson

Korelasi Pearson Untuk melakukan analisis korelasi Pearson cukup sederhana. Semisal kita ingin mengetahui hubungan antara mobilityjakarta\(DIRAWAT dan mobilityjakarta\)retail_and_recreation,

cor.test(mobility_DKI_Jakarta$Dirawat,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline)
## 
##  Pearson's product-moment correlation
## 
## data:  mobility_DKI_Jakarta$Dirawat and mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline
## t = 1.6015, df = 29, p-value = 0.1201
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.07707823  0.58073124
## sample estimates:
##       cor 
## 0.2850471

Dari output tersebut dapat kita simpulkan bahwa tidak ada hubungan yang signifikan antara mobilityjakarta DIRAWAT dan mobilityjakarta retail_and_recreation (p-value<0,01). Nilai p-value dalam output R dituliskan 0,4065 Nilai koefisien korelasi r adalah sebesar -0,1572981 yang menunjukkan hubungan yang sedang dan terbalik.

model <- lm(mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, data = mobility_DKI_Jakarta)
summary(model)
## 
## Call:
## lm(formula = mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, 
##     data = mobility_DKI_Jakarta)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2662.6 -1193.6  -388.6   503.0  4615.4 
## 
## Coefficients:
##                                                                         Estimate
## (Intercept)                                                                306.0
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline    178.3
##                                                                         Std. Error
## (Intercept)                                                                  896.0
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline      111.4
##                                                                         t value
## (Intercept)                                                               0.342
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline   1.601
##                                                                         Pr(>|t|)
## (Intercept)                                                                0.735
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline    0.120
## 
## Residual standard error: 1882 on 29 degrees of freedom
## Multiple R-squared:  0.08125,    Adjusted R-squared:  0.04957 
## F-statistic: 2.565 on 1 and 29 DF,  p-value: 0.1201

Rumus linear regresi menjadi y = 306 + (178) mobilityjakarta$retail_and_recreation

model$coefficients
##                                                             (Intercept) 
##                                                                305.9820 
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline 
##                                                                178.3314
model$fitted.values
##         1         2         3         4         5         6         7         8 
##  840.9763 2267.6276 2802.6218 2267.6276 2445.9590 2089.2962 1375.9705 1732.6334 
##         9        10        11        12        13        14        15        16 
## 1554.3019 1554.3019  484.3135 1197.6391 1732.6334 1375.9705 1197.6391 1375.9705 
##        17        18        19        20        21        22        23        24 
## 1197.6391  484.3135 1375.9705 1375.9705 1554.3019 1375.9705 1554.3019 1732.6334 
##        25        26        27        28        29        30        31 
## 1732.6334 1910.9648 2089.2962 1554.3019 1910.9648 1732.6334 2802.6218

Menambahkan Coloum Prediksi Liniear

mobility_DKI_Jakarta$prediksi_linear <- model$fitted.values
mobility_DKI_Jakarta
## # A tibble: 31 x 17
##    date       Dirawat retail_and_recreation_p~ grocery_and_pha~ parks_percent_c~
##    <date>       <dbl>                    <dbl>            <dbl>            <dbl>
##  1 2022-01-01     205                        3               24               46
##  2 2022-01-02     116                       11               35               28
##  3 2022-01-03     140                       14               36               20
##  4 2022-01-04     141                       11               33               15
##  5 2022-01-05     168                       12               34               17
##  6 2022-01-06     214                       10               30               12
##  7 2022-01-07     221                        6               25                6
##  8 2022-01-08     257                        8               31               10
##  9 2022-01-09     299                        7               28               13
## 10 2022-01-10     322                        7               27                6
## # ... with 21 more rows, and 12 more variables:
## #   transit_stations_percent_change_from_baseline <dbl>,
## #   workplaces_percent_change_from_baseline <dbl>,
## #   residential_percent_change_from_baseline <dbl>, POSITIF <dbl>,
## #   Sembuh <dbl>, Meninggal <dbl>, `Self Isolation` <dbl>, prediksi_ss <dbl>,
## #   prediksi_smsp <dbl>, prediksi_ss1 <dbl>, prediksi_smsp1 <dbl>,
## #   prediksi_linear <dbl>
data1<- data.frame(mobility_DKI_Jakarta)
data1
##          date Dirawat retail_and_recreation_percent_change_from_baseline
## 1  2022-01-01     205                                                  3
## 2  2022-01-02     116                                                 11
## 3  2022-01-03     140                                                 14
## 4  2022-01-04     141                                                 11
## 5  2022-01-05     168                                                 12
## 6  2022-01-06     214                                                 10
## 7  2022-01-07     221                                                  6
## 8  2022-01-08     257                                                  8
## 9  2022-01-09     299                                                  7
## 10 2022-01-10     322                                                  7
## 11 2022-01-11     402                                                  1
## 12 2022-01-12     506                                                  5
## 13 2022-01-13     583                                                  8
## 14 2022-01-14     669                                                  6
## 15 2022-01-15     758                                                  5
## 16 2022-01-16     781                                                  6
## 17 2022-01-17     809                                                  5
## 18 2022-01-18     981                                                  1
## 19 2022-01-19    1125                                                  6
## 20 2022-01-20    1222                                                  6
## 21 2022-01-21    1445                                                  7
## 22 2022-01-22    1710                                                  6
## 23 2022-01-23    2011                                                  7
## 24 2022-01-24    2242                                                  8
## 25 2022-01-25    2721                                                  8
## 26 2022-01-26    3232                                                  9
## 27 2022-01-27    3989                                                 10
## 28 2022-01-28    4709                                                  7
## 29 2022-01-29    5545                                                  9
## 30 2022-01-30    6348                                                  8
## 31 2022-01-31    6809                                                 14
##    grocery_and_pharmacy_percent_change_from_baseline
## 1                                                 24
## 2                                                 35
## 3                                                 36
## 4                                                 33
## 5                                                 34
## 6                                                 30
## 7                                                 25
## 8                                                 31
## 9                                                 28
## 10                                                27
## 11                                                20
## 12                                                25
## 13                                                29
## 14                                                25
## 15                                                26
## 16                                                26
## 17                                                23
## 18                                                21
## 19                                                26
## 20                                                26
## 21                                                25
## 22                                                25
## 23                                                25
## 24                                                26
## 25                                                28
## 26                                                27
## 27                                                28
## 28                                                26
## 29                                                32
## 30                                                28
## 31                                                34
##    parks_percent_change_from_baseline
## 1                                  46
## 2                                  28
## 3                                  20
## 4                                  15
## 5                                  17
## 6                                  12
## 7                                   6
## 8                                  10
## 9                                  13
## 10                                  6
## 11                                 -1
## 12                                  6
## 13                                  7
## 14                                  5
## 15                                  6
## 16                                 14
## 17                                  6
## 18                                  3
## 19                                  5
## 20                                  3
## 21                                  5
## 22                                  5
## 23                                 14
## 24                                 11
## 25                                  8
## 26                                 10
## 27                                  9
## 28                                  4
## 29                                 10
## 30                                 16
## 31                                 20
##    transit_stations_percent_change_from_baseline
## 1                                            -15
## 2                                             -4
## 3                                             -9
## 4                                             -9
## 5                                            -10
## 6                                            -11
## 7                                            -13
## 8                                             -6
## 9                                             -6
## 10                                           -13
## 11                                           -15
## 12                                           -14
## 13                                           -12
## 14                                           -13
## 15                                           -10
## 16                                           -10
## 17                                           -15
## 18                                           -15
## 19                                           -16
## 20                                           -15
## 21                                           -15
## 22                                           -14
## 23                                           -14
## 24                                           -17
## 25                                           -16
## 26                                           -17
## 27                                           -15
## 28                                           -15
## 29                                            -9
## 30                                           -13
## 31                                           -16
##    workplaces_percent_change_from_baseline
## 1                                      -49
## 2                                       -4
## 3                                      -10
## 4                                       -8
## 5                                       -7
## 6                                       -5
## 7                                       -3
## 8                                        1
## 9                                        2
## 10                                      -4
## 11                                      -4
## 12                                      -3
## 13                                      -2
## 14                                       1
## 15                                       4
## 16                                       5
## 17                                      -1
## 18                                      -2
## 19                                       0
## 20                                       1
## 21                                       4
## 22                                       7
## 23                                       7
## 24                                       1
## 25                                       0
## 26                                       0
## 27                                       2
## 28                                       4
## 29                                       8
## 30                                       8
## 31                                      -5
##    residential_percent_change_from_baseline POSITIF Sembuh Meninggal
## 1                                         6  865415 851280     13588
## 2                                         3  865518 851386     13588
## 3                                         4  865690 851408     13588
## 4                                         4  865805 851449     13588
## 5                                         4  866064 851520     13589
## 6                                         5  866331 851572     13589
## 7                                         5  866631 851648     13589
## 8                                         3  866909 851727     13589
## 9                                         4  867302 851839     13589
## 10                                        4  867662 851944     13589
## 11                                        5  868199 852126     13590
## 12                                        5  868611 852269     13590
## 13                                        4  869089 852563     13590
## 14                                        5  869643 852727     13591
## 15                                        5  870363 853003     13591
## 16                                        6  870929 853522     13591
## 17                                        5  871422 853987     13591
## 18                                        6  872092 854204     13591
## 19                                        6  873104 854589     13591
## 20                                        7  874259 855026     13591
## 21                                        6  875743 855676     13591
## 22                                        6  877568 856137     13591
## 23                                        7  879307 856653     13597
## 24                                        6  881300 857215     13597
## 25                                        6  883490 857688     13606
## 26                                        6  886999 859305     13612
## 27                                        6  891148 861203     13615
## 28                                        7  895706 861203     13617
## 29                                        5  901471 864447     13627
## 30                                        6  908093 866477     13639
## 31                                        6  913355 867519     13666
##    Self.Isolation prediksi_ss prediksi_smsp prediksi_ss1 prediksi_smsp1
## 1             342    156.5343      199.2593     6.328426       3.446112
## 2             428    179.6720      131.8356     8.844735      10.802947
## 3             554    161.4757      123.8707    10.880841      12.848971
## 4             627    130.7884      146.2245    11.900411      12.216316
## 5             787    149.0095      174.4261    11.527913      11.091114
## 6             956    178.0956      202.6801    10.214749       9.875506
## 7            1173    212.5600      229.8601     8.573121       8.525637
## 8            1336    282.6753      256.3473     7.096048       7.055274
## 9            1575    332.0818      288.5539     5.983496       5.708680
## 10           1807    340.1791      334.3997     5.316243       4.787744
## 11           2081    362.9232      401.8045     5.113832       4.594352
## 12           2246    476.0741      493.7075     5.305100       5.250904
## 13           2353    604.9379      593.1250     5.639764       6.161846
## 14           2656    694.6248      678.0930     5.776836       6.552138
## 15           3011    747.5863      734.8096     5.456892       5.908637
## 16           3035    765.9703      782.1234     4.930473       4.765789
## 17           3035    809.2660      847.0453     4.529686       3.919938
## 18           3316    956.4404      951.0056     4.485771       3.965421
## 19           3799   1134.7027     1093.1121     4.810214       4.688551
## 20           4420   1290.7398     1266.8917     5.413629       5.673631
## 21           5031   1447.7952     1465.8717     6.145400       6.504966
## 22           6130   1667.4409     1688.8265     6.836032       6.899117
## 23           7046   1934.8948     1955.5211     7.424228       7.101652
## 24           8246   2273.7034     2290.9675     7.829814       7.490397
## 25           9475   2741.6358     2719.2264     7.987474       8.279395
## 26          10850   3303.6823     3260.5504     8.075024       9.027557
## 27          12341   3959.0559     3934.2404     8.285141       9.130014
## 28          14710   4732.2194     4735.9866     8.742077       8.334571
## 29          17852   5518.9656     5567.0349     9.480094       7.799720
## 30          21629   6240.3370     6305.0205    10.465034       9.036631
## 31          25361   6893.9327     6827.5786    11.601501      13.556472
##    prediksi_linear
## 1         840.9763
## 2        2267.6276
## 3        2802.6218
## 4        2267.6276
## 5        2445.9590
## 6        2089.2962
## 7        1375.9705
## 8        1732.6334
## 9        1554.3019
## 10       1554.3019
## 11        484.3135
## 12       1197.6391
## 13       1732.6334
## 14       1375.9705
## 15       1197.6391
## 16       1375.9705
## 17       1197.6391
## 18        484.3135
## 19       1375.9705
## 20       1375.9705
## 21       1554.3019
## 22       1375.9705
## 23       1554.3019
## 24       1732.6334
## 25       1732.6334
## 26       1910.9648
## 27       2089.2962
## 28       1554.3019
## 29       1910.9648
## 30       1732.6334
## 31       2802.6218

Cek Struktur Data beserta Model

str(mobility_DKI_Jakarta)
## spec_tbl_df [31 x 17] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ date                                              : Date[1:31], format: "2022-01-01" "2022-01-02" ...
##  $ Dirawat                                           : num [1:31] 205 116 140 141 168 214 221 257 299 322 ...
##  $ retail_and_recreation_percent_change_from_baseline: num [1:31] 3 11 14 11 12 10 6 8 7 7 ...
##  $ grocery_and_pharmacy_percent_change_from_baseline : num [1:31] 24 35 36 33 34 30 25 31 28 27 ...
##  $ parks_percent_change_from_baseline                : num [1:31] 46 28 20 15 17 12 6 10 13 6 ...
##  $ transit_stations_percent_change_from_baseline     : num [1:31] -15 -4 -9 -9 -10 -11 -13 -6 -6 -13 ...
##  $ workplaces_percent_change_from_baseline           : num [1:31] -49 -4 -10 -8 -7 -5 -3 1 2 -4 ...
##  $ residential_percent_change_from_baseline          : num [1:31] 6 3 4 4 4 5 5 3 4 4 ...
##  $ POSITIF                                           : num [1:31] 865415 865518 865690 865805 866064 ...
##  $ Sembuh                                            : num [1:31] 851280 851386 851408 851449 851520 ...
##  $ Meninggal                                         : num [1:31] 13588 13588 13588 13588 13589 ...
##  $ Self Isolation                                    : num [1:31] 342 428 554 627 787 ...
##  $ prediksi_ss                                       : num [1:31] 157 180 161 131 149 ...
##  $ prediksi_smsp                                     : num [1:31] 199 132 124 146 174 ...
##  $ prediksi_ss1                                      : num [1:31] 6.33 8.84 10.88 11.9 11.53 ...
##  $ prediksi_smsp1                                    : num [1:31] 3.45 10.8 12.85 12.22 11.09 ...
##  $ prediksi_linear                                   : Named num [1:31] 841 2268 2803 2268 2446 ...
##   ..- attr(*, "names")= chr [1:31] "1" "2" "3" "4" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   date = col_date(format = ""),
##   ..   Dirawat = col_double(),
##   ..   retail_and_recreation_percent_change_from_baseline = col_double(),
##   ..   grocery_and_pharmacy_percent_change_from_baseline = col_double(),
##   ..   parks_percent_change_from_baseline = col_double(),
##   ..   transit_stations_percent_change_from_baseline = col_double(),
##   ..   workplaces_percent_change_from_baseline = col_double(),
##   ..   residential_percent_change_from_baseline = col_double(),
##   ..   POSITIF = col_double(),
##   ..   Sembuh = col_double(),
##   ..   Meninggal = col_double(),
##   ..   `Self Isolation` = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>

Ringkasan Data beserta Model

summary(mobility_DKI_Jakarta)
##       date               Dirawat    
##  Min.   :2022-01-01   Min.   : 116  
##  1st Qu.:2022-01-08   1st Qu.: 278  
##  Median :2022-01-16   Median : 781  
##  Mean   :2022-01-16   Mean   :1635  
##  3rd Qu.:2022-01-23   3rd Qu.:2126  
##  Max.   :2022-01-31   Max.   :6809  
##  retail_and_recreation_percent_change_from_baseline
##  Min.   : 1.000                                    
##  1st Qu.: 6.000                                    
##  Median : 7.000                                    
##  Mean   : 7.452                                    
##  3rd Qu.: 9.000                                    
##  Max.   :14.000                                    
##  grocery_and_pharmacy_percent_change_from_baseline
##  Min.   :20.00                                    
##  1st Qu.:25.00                                    
##  Median :26.00                                    
##  Mean   :27.55                                    
##  3rd Qu.:29.50                                    
##  Max.   :36.00                                    
##  parks_percent_change_from_baseline
##  Min.   :-1.00                     
##  1st Qu.: 5.50                     
##  Median : 9.00                     
##  Mean   :10.94                     
##  3rd Qu.:14.00                     
##  Max.   :46.00                     
##  transit_stations_percent_change_from_baseline
##  Min.   :-17.00                               
##  1st Qu.:-15.00                               
##  Median :-14.00                               
##  Mean   :-12.65                               
##  3rd Qu.:-10.00                               
##  Max.   : -4.00                               
##  workplaces_percent_change_from_baseline
##  Min.   :-49.000                        
##  1st Qu.: -4.000                        
##  Median :  0.000                        
##  Mean   : -1.677                        
##  3rd Qu.:  3.000                        
##  Max.   :  8.000                        
##  residential_percent_change_from_baseline    POSITIF           Sembuh      
##  Min.   :3.000                            Min.   :865415   Min.   :851280  
##  1st Qu.:4.500                            1st Qu.:867106   1st Qu.:851783  
##  Median :5.000                            Median :870929   Median :853522  
##  Mean   :5.258                            Mean   :876297   Mean   :855268  
##  3rd Qu.:6.000                            3rd Qu.:880304   3rd Qu.:856934  
##  Max.   :7.000                            Max.   :913355   Max.   :867519  
##    Meninggal     Self Isolation   prediksi_ss     prediksi_smsp   
##  Min.   :13588   Min.   :  342   Min.   : 130.8   Min.   : 123.9  
##  1st Qu.:13589   1st Qu.: 1456   1st Qu.: 307.4   1st Qu.: 272.5  
##  Median :13591   Median : 3035   Median : 766.0   Median : 782.1  
##  Mean   :13598   Mean   : 5749   Mean   :1634.8   Mean   :1634.8  
##  3rd Qu.:13597   3rd Qu.: 7646   3rd Qu.:2104.3   3rd Qu.:2123.2  
##  Max.   :13666   Max.   :25361   Max.   :6893.9   Max.   :6827.6  
##   prediksi_ss1    prediksi_smsp1   prediksi_linear 
##  Min.   : 4.486   Min.   : 3.446   Min.   : 484.3  
##  1st Qu.: 5.435   1st Qu.: 5.462   1st Qu.:1376.0  
##  Median : 7.096   Median : 7.055   Median :1554.3  
##  Mean   : 7.452   Mean   : 7.452   Mean   :1634.8  
##  3rd Qu.: 8.793   3rd Qu.: 9.032   3rd Qu.:1911.0  
##  Max.   :11.900   Max.   :13.556   Max.   :2802.6

Histogram Dirawat dan Retail

# Function to add histograms
panel.hist <- function(x, ...) {
    usr <- par("usr")
    on.exit(par(usr))
    par(usr = c(usr[1:2], 0, 1.5))
    his <- hist(x, plot = FALSE)
    breaks <- his$breaks
    nB <- length(breaks)
    y <- his$counts
    y <- y/max(y)
    rect(breaks[-nB], 0, breaks[-1], y, col = rgb(0, 1, 1, alpha = 0.5), ...)
    # lines(density(x), col = 2, lwd = 2) # Uncomment to add density lines
}
# Equivalent with a formula
pairs(mobility_DKI_Jakarta$Dirawat~mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, data = mobility_DKI_Jakarta,
      upper.panel = NULL,         # Disabling the upper panel
      diag.panel = panel.hist)    # Adding the histograms

Hasil Plot 3 Linier Model

Dari ketiga model tersebut kemudian di visualkan melalui Plot, dimana masing masing model menunjukkan pendekatan prediksi linearnya dengan Data Real. Seperti pada gambar dibawah ini.

# plot method
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat, lty = 10, col = 'black', lwd =4)
#plot(mod.ss)
# add lm fit
abline(coef(lm(  mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$date , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
rug(mobility_DKI_Jakarta$date)  # add rug to plot
#points(mobilityjakarta$Tanggal, mobilityjakarta$DIRAWAT, lty = 2, col = 2, lwd = 2)
lines(mobility_DKI_Jakarta$date,mod.ss$y , lty = 2, col = 'red', lwd = 2)
#points(mobilityjakarta$Tanggal,mod.smsp$y , lty = 2, col = 4, lwd = 2)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_linear, lty = 3, col = 6, lwd = 3)
legend("topright", 
       legend = c("Real", "Model SS", "Trends", "Linear Regresi"), 
       lty = 1:4, col = c("Black","Red","Blue","Pink"), lwd = 3, bty = "p")

7. Kesimpulan

Dari Visualisasi Plot untuk Regresi Linear Tiga Model dapat disimpulkan sebagai berikut :

  • Baik Model SS dan Model Smooth.Spline mempunyai kesamaan dengan Model Liniear, hanya perlu menggunakan salah satunya saja.

  • Dari Hasil plot 3 Model yang mendekati nilai Real adalah Model SS