Analisa Data dengan Regresi Linear dengan tiga model dengan Google Mobility Index dan Data Covid-19 Jakarta Januari 2022
| Nama | Moch Hasbi Ashidqy |
|---|---|
| NIM | 210605110104 |
| Dosen Pengampu | Prof. Dr M. Suhartono, M.Kom |
| Program | S1 Teknik Informatika |
1. Overview
Analisa Regresi Linear Tiga Model Linear Model, SS Model, Smooth Spline Model untuk data Covid-19 DKI Jakarta bulan Januari 2022 pada data status penderita yang Dirawat.
Tujuannya adalah dapat memprediksi jumlah pasien yang Dirawat berdasarkan informasi data tanggal dari data covid 19 DKI Jakarta pada bulan Januari 2022
Tiga model ini akan menyajikan prediksi terdekat jumlah pasien yang seharusnya dirawat sebagai Learning Machine
Smoothing splines
Smoothing splines adalah pendekatan yang ampuh untuk memperkirakan hubungan fungsional antara prediktor X dan respons Y. Smoothing splines dapat ditampung baik menggunakan fungsi smooth.spline (dalam paket stats) atau fungsi ss (dalam paket npreg). Dokumen ini memberikan latar belakang teoritis tentang smoothing splines, serta contoh-contoh yang menggambarkan bagaimana menggunakan fungsi smooth.spline dan ss.
Seperti yang saya tunjukkan dalam tutorial ini, kedua fungsi memiliki sintaks yang sangat mirip, tetapi fungsi ss menawarkan beberapa opsi tambahan. Dibandingkan dengan fungsi smooth.spline, fungsi ss memiliki
lebih banyak metode pemilihan parameter penghalusan lebih banyak jenis spline (linier, kubik, quintic) opsi untuk batasan periodisitas opsi untuk nilai simpul yang ditentukan pengguna ringkasan yang sesuai dan metode plot
Untuk melihat bagaimana kinerja fungsi-fungsi ini dalam praktiknya, mari kita lihat contoh simulasinya. Secara khusus, mari simulasikan beberapa data dengan hubungan fungsional (berkala) yang memiliki beberapa gangguan.
2. Aktivasi Library
Pada Analisa regrresi tiga model memerlukan sejumlah library. Library yang digunakan dalam analisa studi kasus ini antara lain:
readr: library ini digunakan untuk membaca file CSV/TASVtidyverse: library yang berisi kumpulan fungsi untuk analisa data.dplyr: library untuk membuat ringkasan data.npreg: library ini digunakan untuk membuat model ssstats: library ini digunakan untuk membuat model smooth spline
3. Import Dataset
library(readr)
mobility_DKI_Jakarta <- read_csv("DataCovid19 DKI Jakarta.1.csv")## Rows: 31 Columns: 12
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (11): Dirawat, retail_and_recreation_percent_change_from_baseline, groc...
## date (1): date
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
mobility_DKI_Jakarta## # A tibble: 31 x 12
## date Dirawat retail_and_recreation_p~ grocery_and_pha~ parks_percent_c~
## <date> <dbl> <dbl> <dbl> <dbl>
## 1 2022-01-01 205 3 24 46
## 2 2022-01-02 116 11 35 28
## 3 2022-01-03 140 14 36 20
## 4 2022-01-04 141 11 33 15
## 5 2022-01-05 168 12 34 17
## 6 2022-01-06 214 10 30 12
## 7 2022-01-07 221 6 25 6
## 8 2022-01-08 257 8 31 10
## 9 2022-01-09 299 7 28 13
## 10 2022-01-10 322 7 27 6
## # ... with 21 more rows, and 7 more variables:
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>,
## # residential_percent_change_from_baseline <dbl>, POSITIF <dbl>,
## # Sembuh <dbl>, Meninggal <dbl>, `Self Isolation` <dbl>
View Dataset
library(DT)## Warning: package 'DT' was built under R version 4.1.3
datatable(mobility_DKI_Jakarta)library(dplyr)##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(rmdformats)## Warning: package 'rmdformats' was built under R version 4.1.3
tab <- table(mobility_DKI_Jakarta$Dirawat, mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline>25)
pilltabs(
tab,
count = TRUE,
rows = TRUE,
cols = TRUE,
chisq = TRUE,
resid = TRUE,
row.names = TRUE
)## Warning in stats::chisq.test(tab): Chi-squared approximation may be incorrect
## Warning in stats::chisq.test(tab): Chi-squared approximation may be incorrect
| FALSE | |
|---|---|
| 116 | 1 |
| 140 | 1 |
| 141 | 1 |
| 168 | 1 |
| 205 | 1 |
| 214 | 1 |
| 221 | 1 |
| 257 | 1 |
| 299 | 1 |
| 322 | 1 |
| 402 | 1 |
| 506 | 1 |
| 583 | 1 |
| 669 | 1 |
| 758 | 1 |
| 781 | 1 |
| 809 | 1 |
| 981 | 1 |
| 1125 | 1 |
| 1222 | 1 |
| 1445 | 1 |
| 1710 | 1 |
| 2011 | 1 |
| 2242 | 1 |
| 2721 | 1 |
| 3232 | 1 |
| 3989 | 1 |
| 4709 | 1 |
| 5545 | 1 |
| 6348 | 1 |
| 6809 | 1 |
| FALSE | Total | n | |
|---|---|---|---|
| 116 | 100 | 100 | 1 |
| 140 | 100 | 100 | 1 |
| 141 | 100 | 100 | 1 |
| 168 | 100 | 100 | 1 |
| 205 | 100 | 100 | 1 |
| 214 | 100 | 100 | 1 |
| 221 | 100 | 100 | 1 |
| 257 | 100 | 100 | 1 |
| 299 | 100 | 100 | 1 |
| 322 | 100 | 100 | 1 |
| 402 | 100 | 100 | 1 |
| 506 | 100 | 100 | 1 |
| 583 | 100 | 100 | 1 |
| 669 | 100 | 100 | 1 |
| 758 | 100 | 100 | 1 |
| 781 | 100 | 100 | 1 |
| 809 | 100 | 100 | 1 |
| 981 | 100 | 100 | 1 |
| 1125 | 100 | 100 | 1 |
| 1222 | 100 | 100 | 1 |
| 1445 | 100 | 100 | 1 |
| 1710 | 100 | 100 | 1 |
| 2011 | 100 | 100 | 1 |
| 2242 | 100 | 100 | 1 |
| 2721 | 100 | 100 | 1 |
| 3232 | 100 | 100 | 1 |
| 3989 | 100 | 100 | 1 |
| 4709 | 100 | 100 | 1 |
| 5545 | 100 | 100 | 1 |
| 6348 | 100 | 100 | 1 |
| 6809 | 100 | 100 | 1 |
| All | 100 | 100 | 31 |
| FALSE | All | |
|---|---|---|
| 116 | 3.2 | 3.2 |
| 140 | 3.2 | 3.2 |
| 141 | 3.2 | 3.2 |
| 168 | 3.2 | 3.2 |
| 205 | 3.2 | 3.2 |
| 214 | 3.2 | 3.2 |
| 221 | 3.2 | 3.2 |
| 257 | 3.2 | 3.2 |
| 299 | 3.2 | 3.2 |
| 322 | 3.2 | 3.2 |
| 402 | 3.2 | 3.2 |
| 506 | 3.2 | 3.2 |
| 583 | 3.2 | 3.2 |
| 669 | 3.2 | 3.2 |
| 758 | 3.2 | 3.2 |
| 781 | 3.2 | 3.2 |
| 809 | 3.2 | 3.2 |
| 981 | 3.2 | 3.2 |
| 1125 | 3.2 | 3.2 |
| 1222 | 3.2 | 3.2 |
| 1445 | 3.2 | 3.2 |
| 1710 | 3.2 | 3.2 |
| 2011 | 3.2 | 3.2 |
| 2242 | 3.2 | 3.2 |
| 2721 | 3.2 | 3.2 |
| 3232 | 3.2 | 3.2 |
| 3989 | 3.2 | 3.2 |
| 4709 | 3.2 | 3.2 |
| 5545 | 3.2 | 3.2 |
| 6348 | 3.2 | 3.2 |
| 6809 | 3.2 | 3.2 |
| Total | 100.0 | 100.0 |
| n | 31.0 | 31.0 |
| x | |
|---|---|
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
| 11 | 0 |
| 12 | 0 |
| 13 | 0 |
| 14 | 0 |
| 15 | 0 |
| 16 | 0 |
| 17 | 0 |
| 18 | 0 |
| 19 | 0 |
| 20 | 0 |
| 21 | 0 |
| 22 | 0 |
| 23 | 0 |
| 24 | 0 |
| 25 | 0 |
| 26 | 0 |
| 27 | 0 |
| 28 | 0 |
| 29 | 0 |
| 30 | 0 |
| 31 | 0 |
X-squared = 0, df = 30, p = 1
Cek Struktur Data
cek<-str(mobility_DKI_Jakarta)## spec_tbl_df [31 x 12] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ date : Date[1:31], format: "2022-01-01" "2022-01-02" ...
## $ Dirawat : num [1:31] 205 116 140 141 168 214 221 257 299 322 ...
## $ retail_and_recreation_percent_change_from_baseline: num [1:31] 3 11 14 11 12 10 6 8 7 7 ...
## $ grocery_and_pharmacy_percent_change_from_baseline : num [1:31] 24 35 36 33 34 30 25 31 28 27 ...
## $ parks_percent_change_from_baseline : num [1:31] 46 28 20 15 17 12 6 10 13 6 ...
## $ transit_stations_percent_change_from_baseline : num [1:31] -15 -4 -9 -9 -10 -11 -13 -6 -6 -13 ...
## $ workplaces_percent_change_from_baseline : num [1:31] -49 -4 -10 -8 -7 -5 -3 1 2 -4 ...
## $ residential_percent_change_from_baseline : num [1:31] 6 3 4 4 4 5 5 3 4 4 ...
## $ POSITIF : num [1:31] 865415 865518 865690 865805 866064 ...
## $ Sembuh : num [1:31] 851280 851386 851408 851449 851520 ...
## $ Meninggal : num [1:31] 13588 13588 13588 13588 13589 ...
## $ Self Isolation : num [1:31] 342 428 554 627 787 ...
## - attr(*, "spec")=
## .. cols(
## .. date = col_date(format = ""),
## .. Dirawat = col_double(),
## .. retail_and_recreation_percent_change_from_baseline = col_double(),
## .. grocery_and_pharmacy_percent_change_from_baseline = col_double(),
## .. parks_percent_change_from_baseline = col_double(),
## .. transit_stations_percent_change_from_baseline = col_double(),
## .. workplaces_percent_change_from_baseline = col_double(),
## .. residential_percent_change_from_baseline = col_double(),
## .. POSITIF = col_double(),
## .. Sembuh = col_double(),
## .. Meninggal = col_double(),
## .. `Self Isolation` = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
cek## NULL
Ringkasan Data
ringkasan<-summary(mobility_DKI_Jakarta)
ringkasan## date Dirawat
## Min. :2022-01-01 Min. : 116
## 1st Qu.:2022-01-08 1st Qu.: 278
## Median :2022-01-16 Median : 781
## Mean :2022-01-16 Mean :1635
## 3rd Qu.:2022-01-23 3rd Qu.:2126
## Max. :2022-01-31 Max. :6809
## retail_and_recreation_percent_change_from_baseline
## Min. : 1.000
## 1st Qu.: 6.000
## Median : 7.000
## Mean : 7.452
## 3rd Qu.: 9.000
## Max. :14.000
## grocery_and_pharmacy_percent_change_from_baseline
## Min. :20.00
## 1st Qu.:25.00
## Median :26.00
## Mean :27.55
## 3rd Qu.:29.50
## Max. :36.00
## parks_percent_change_from_baseline
## Min. :-1.00
## 1st Qu.: 5.50
## Median : 9.00
## Mean :10.94
## 3rd Qu.:14.00
## Max. :46.00
## transit_stations_percent_change_from_baseline
## Min. :-17.00
## 1st Qu.:-15.00
## Median :-14.00
## Mean :-12.65
## 3rd Qu.:-10.00
## Max. : -4.00
## workplaces_percent_change_from_baseline
## Min. :-49.000
## 1st Qu.: -4.000
## Median : 0.000
## Mean : -1.677
## 3rd Qu.: 3.000
## Max. : 8.000
## residential_percent_change_from_baseline POSITIF Sembuh
## Min. :3.000 Min. :865415 Min. :851280
## 1st Qu.:4.500 1st Qu.:867106 1st Qu.:851783
## Median :5.000 Median :870929 Median :853522
## Mean :5.258 Mean :876297 Mean :855268
## 3rd Qu.:6.000 3rd Qu.:880304 3rd Qu.:856934
## Max. :7.000 Max. :913355 Max. :867519
## Meninggal Self Isolation
## Min. :13588 Min. : 342
## 1st Qu.:13589 1st Qu.: 1456
## Median :13591 Median : 3035
## Mean :13598 Mean : 5749
## 3rd Qu.:13597 3rd Qu.: 7646
## Max. :13666 Max. :25361
Deskripsi Data
library(jmv)## Warning: package 'jmv' was built under R version 4.1.3
# Use the descritptives function to get the descritptive data
descriptives(mobility_DKI_Jakarta, vars = vars(Dirawat, retail_and_recreation_percent_change_from_baseline), freq = TRUE)##
## DESCRIPTIVES
##
## Descriptives
## ----------------------------------------------------------------------------------------
## Dirawat retail_and_recreation_percent_change_from_baseline
## ----------------------------------------------------------------------------------------
## N 31 31
## Missing 0 0
## Mean 1634.839 7.451613
## Median 781.0000 7.000000
## Standard deviation 1930.584 3.085868
## Minimum 116.0000 1.000000
## Maximum 6809.000 14.00000
## ----------------------------------------------------------------------------------------
4. Distribusi Data
Distribusi Data ditampilkan melalui Histogram dan Grafik menunjukkan trend pasien yang dirawat di RS. Pada Histogram menampilkan mean Variable dalam bentuk Vertikal.
Histogram pasien Covid-19 yang dirawat di RS (Frequency)
hist(mobility_DKI_Jakarta$Dirawat, main="Distribusi Dirawat di RS untuk Covid 19", xlab="Dirawat di RS Untuk Covid 19", xlim=c(100,7000),ylim=c(0,15)) # histogram variabel wt
mean=mean(mobility_DKI_Jakarta$Dirawat) # Menghitung nilai mean variabel wt
abline(v=mean, col="red",lwd=2) # Menambahkan garis vertical pada plotGrafik pasien Covid-19 yang dirawat di RS (Density Variabel)
d<-density(mobility_DKI_Jakarta$Dirawat) # menghitung density variabel wt
plot(d, xlab="Dirawat di RS untuk Covid 19", main="Distribusi Dirawat di RS untuk Covid 19")5. Model
library(npreg)## Warning: package 'npreg' was built under R version 4.1.3
Model SS antara Date dengan Dirawat
# fit using ss
mod.ss <- ss( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$Dirawat, nknots = 10)
mod.ss##
## Call:
## ss(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$Dirawat,
## nknots = 10)
##
## Smoothing Parameter spar = 0.01776397 lambda = 8.009766e-08
## Equivalent Degrees of Freedom (Df) 10.76019
## Penalized Criterion (RSS) 55189.81
## Generalized Cross-Validation (GCV) 4176.454
Model smooth.spline antara Date dengan Dirawat
# fit using smooth.spline
mod.smsp <- smooth.spline( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$Dirawat, nknots = 10)
mod.smsp## Call:
## smooth.spline(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$Dirawat,
## nknots = 10)
##
## Smoothing Parameter spar= -0.05618678 lambda= 1.0263e-06 (16 iterations)
## Equivalent Degrees of Freedom (Df): 11.7885
## Penalized Criterion (RSS): 20919.79
## GCV: 1757.101
RMSE
Root Mean Squared Error (RMSE) merupakan salah satu cara untuk mengevaluasi model regresi linear dengan mengukur tingkat akurasi hasil perkiraan suatu model. RMSE dihitung dengan mengkuadratkan error (prediksi – observasi) dibagi dengan jumlah data (= rata-rata), lalu diakarkan. RMSE tidak memiliki satuan.
RMSE antara model SS dengan smooth.spline
# rmse between solutions
sqrt(mean(( mod.ss$y - mod.smsp$y )^2))## [1] 32.00289
RMSE antara Dirawat dengan model ss
# rmse between solutions and f(x)
sqrt(mean(( mobility_DKI_Jakarta$Dirawat - mod.ss$y )^2))## [1] 42.1938
Menambahkan Coloum model SS dan Smooth.spline Dirawat
mobility_DKI_Jakarta$prediksi_ss <- mod.ss$y
mobility_DKI_Jakarta$prediksi_smsp <- mod.smsp$y
datatable(mobility_DKI_Jakarta)Model SS Date dengan Retail
library(npreg)
# fit using ss
mod.ss1 <- ss( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, nknots = 10)
mod.ss1##
## Call:
## ss(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline,
## nknots = 10)
##
## Smoothing Parameter spar = 0.2344568 lambda = 2.945537e-06
## Equivalent Degrees of Freedom (Df) 8.355361
## Penalized Criterion (RSS) 96.48915
## Generalized Cross-Validation (GCV) 5.833233
Model Smooth.Spline Date dengan Retail
# fit using smooth.spline
mod.smsp1 <- smooth.spline( mobility_DKI_Jakarta$date,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, nknots = 10)
mod.smsp1## Call:
## smooth.spline(x = mobility_DKI_Jakarta$date, y = mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline,
## nknots = 10)
##
## Smoothing Parameter spar= -0.004828107 lambda= 2.409043e-06 (15 iterations)
## Equivalent Degrees of Freedom (Df): 11.55074
## Penalized Criterion (RSS): 55.17046
## GCV: 4.52129
Menambahkan Coloum Prediksi Model SS dan Smooth.Spline Retail
mobility_DKI_Jakarta$prediksi_ss1 <- mod.ss1$y
mobility_DKI_Jakarta$prediksi_smsp1 <- mod.smsp1$y
datatable(mobility_DKI_Jakarta)6. Plot Regresi Linear Tiga Model
Hasil plot untuk Dirawat dengan Date beserta model SS dan Smooth.Spline
# plot results
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat,
xlab = "Tangggal",
ylab = "Dirawat di RS")
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_ss , lty = 2, col = 'black', lwd = 3)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_smsp, lty = 5, col = 'red', lwd = 3)# plot method
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat, lty = 10, col = 'black', lwd =5)
#plot(mod.ss)
# add lm fit
abline(coef(lm( mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$date , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
rug(mobility_DKI_Jakarta$date) # add rug to plot
#points(mobilityjakarta$Tanggal, mobilityjakarta$DIRAWAT, lty = 2, col = 2, lwd = 2)
points(mobility_DKI_Jakarta$date,mod.ss$y , lty = 2, col = 'red', lwd = 4)
#points(mobilityjakarta$Tanggal,mod.smsp$y , lty = 2, col = 4, lwd = 2)
legend("topright",
legend = c("Real", "Model SS", "Trends"),
lty = 1:3, col = c("Black","Red","Blue"), lwd = 3, bty = "p")Hasil plot untuk Date dengan Retail beserta model SS dan Smooth.Spline
# plot results
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, lty = 10, col = 'black', lwd =5,
xlab = "Tanggal",
ylab = "Retail and Recreation")
# add lm fit
abline(coef(lm( mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline ~ mobility_DKI_Jakarta$date , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_ss1 , lty = 2, col = 'black', lwd = 4)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_smsp1, lty = 4, col = 'red', lwd = 3)Korelasi Pearson
Korelasi Pearson Untuk melakukan analisis korelasi Pearson cukup sederhana. Semisal kita ingin mengetahui hubungan antara mobilityjakarta\(DIRAWAT dan mobilityjakarta\)retail_and_recreation,
cor.test(mobility_DKI_Jakarta$Dirawat,mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline)##
## Pearson's product-moment correlation
##
## data: mobility_DKI_Jakarta$Dirawat and mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline
## t = 1.6015, df = 29, p-value = 0.1201
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.07707823 0.58073124
## sample estimates:
## cor
## 0.2850471
Dari output tersebut dapat kita simpulkan bahwa tidak ada hubungan yang signifikan antara mobilityjakarta DIRAWAT dan mobilityjakarta retail_and_recreation (p-value<0,01). Nilai p-value dalam output R dituliskan 0,4065 Nilai koefisien korelasi r adalah sebesar -0,1572981 yang menunjukkan hubungan yang sedang dan terbalik.
model <- lm(mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, data = mobility_DKI_Jakarta)
summary(model)##
## Call:
## lm(formula = mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline,
## data = mobility_DKI_Jakarta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2662.6 -1193.6 -388.6 503.0 4615.4
##
## Coefficients:
## Estimate
## (Intercept) 306.0
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline 178.3
## Std. Error
## (Intercept) 896.0
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline 111.4
## t value
## (Intercept) 0.342
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline 1.601
## Pr(>|t|)
## (Intercept) 0.735
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline 0.120
##
## Residual standard error: 1882 on 29 degrees of freedom
## Multiple R-squared: 0.08125, Adjusted R-squared: 0.04957
## F-statistic: 2.565 on 1 and 29 DF, p-value: 0.1201
Rumus linear regresi menjadi y = 306 + (178) mobilityjakarta$retail_and_recreation
model$coefficients## (Intercept)
## 305.9820
## mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline
## 178.3314
model$fitted.values## 1 2 3 4 5 6 7 8
## 840.9763 2267.6276 2802.6218 2267.6276 2445.9590 2089.2962 1375.9705 1732.6334
## 9 10 11 12 13 14 15 16
## 1554.3019 1554.3019 484.3135 1197.6391 1732.6334 1375.9705 1197.6391 1375.9705
## 17 18 19 20 21 22 23 24
## 1197.6391 484.3135 1375.9705 1375.9705 1554.3019 1375.9705 1554.3019 1732.6334
## 25 26 27 28 29 30 31
## 1732.6334 1910.9648 2089.2962 1554.3019 1910.9648 1732.6334 2802.6218
Menambahkan Coloum Prediksi Liniear
mobility_DKI_Jakarta$prediksi_linear <- model$fitted.values
mobility_DKI_Jakarta## # A tibble: 31 x 17
## date Dirawat retail_and_recreation_p~ grocery_and_pha~ parks_percent_c~
## <date> <dbl> <dbl> <dbl> <dbl>
## 1 2022-01-01 205 3 24 46
## 2 2022-01-02 116 11 35 28
## 3 2022-01-03 140 14 36 20
## 4 2022-01-04 141 11 33 15
## 5 2022-01-05 168 12 34 17
## 6 2022-01-06 214 10 30 12
## 7 2022-01-07 221 6 25 6
## 8 2022-01-08 257 8 31 10
## 9 2022-01-09 299 7 28 13
## 10 2022-01-10 322 7 27 6
## # ... with 21 more rows, and 12 more variables:
## # transit_stations_percent_change_from_baseline <dbl>,
## # workplaces_percent_change_from_baseline <dbl>,
## # residential_percent_change_from_baseline <dbl>, POSITIF <dbl>,
## # Sembuh <dbl>, Meninggal <dbl>, `Self Isolation` <dbl>, prediksi_ss <dbl>,
## # prediksi_smsp <dbl>, prediksi_ss1 <dbl>, prediksi_smsp1 <dbl>,
## # prediksi_linear <dbl>
data1<- data.frame(mobility_DKI_Jakarta)
data1## date Dirawat retail_and_recreation_percent_change_from_baseline
## 1 2022-01-01 205 3
## 2 2022-01-02 116 11
## 3 2022-01-03 140 14
## 4 2022-01-04 141 11
## 5 2022-01-05 168 12
## 6 2022-01-06 214 10
## 7 2022-01-07 221 6
## 8 2022-01-08 257 8
## 9 2022-01-09 299 7
## 10 2022-01-10 322 7
## 11 2022-01-11 402 1
## 12 2022-01-12 506 5
## 13 2022-01-13 583 8
## 14 2022-01-14 669 6
## 15 2022-01-15 758 5
## 16 2022-01-16 781 6
## 17 2022-01-17 809 5
## 18 2022-01-18 981 1
## 19 2022-01-19 1125 6
## 20 2022-01-20 1222 6
## 21 2022-01-21 1445 7
## 22 2022-01-22 1710 6
## 23 2022-01-23 2011 7
## 24 2022-01-24 2242 8
## 25 2022-01-25 2721 8
## 26 2022-01-26 3232 9
## 27 2022-01-27 3989 10
## 28 2022-01-28 4709 7
## 29 2022-01-29 5545 9
## 30 2022-01-30 6348 8
## 31 2022-01-31 6809 14
## grocery_and_pharmacy_percent_change_from_baseline
## 1 24
## 2 35
## 3 36
## 4 33
## 5 34
## 6 30
## 7 25
## 8 31
## 9 28
## 10 27
## 11 20
## 12 25
## 13 29
## 14 25
## 15 26
## 16 26
## 17 23
## 18 21
## 19 26
## 20 26
## 21 25
## 22 25
## 23 25
## 24 26
## 25 28
## 26 27
## 27 28
## 28 26
## 29 32
## 30 28
## 31 34
## parks_percent_change_from_baseline
## 1 46
## 2 28
## 3 20
## 4 15
## 5 17
## 6 12
## 7 6
## 8 10
## 9 13
## 10 6
## 11 -1
## 12 6
## 13 7
## 14 5
## 15 6
## 16 14
## 17 6
## 18 3
## 19 5
## 20 3
## 21 5
## 22 5
## 23 14
## 24 11
## 25 8
## 26 10
## 27 9
## 28 4
## 29 10
## 30 16
## 31 20
## transit_stations_percent_change_from_baseline
## 1 -15
## 2 -4
## 3 -9
## 4 -9
## 5 -10
## 6 -11
## 7 -13
## 8 -6
## 9 -6
## 10 -13
## 11 -15
## 12 -14
## 13 -12
## 14 -13
## 15 -10
## 16 -10
## 17 -15
## 18 -15
## 19 -16
## 20 -15
## 21 -15
## 22 -14
## 23 -14
## 24 -17
## 25 -16
## 26 -17
## 27 -15
## 28 -15
## 29 -9
## 30 -13
## 31 -16
## workplaces_percent_change_from_baseline
## 1 -49
## 2 -4
## 3 -10
## 4 -8
## 5 -7
## 6 -5
## 7 -3
## 8 1
## 9 2
## 10 -4
## 11 -4
## 12 -3
## 13 -2
## 14 1
## 15 4
## 16 5
## 17 -1
## 18 -2
## 19 0
## 20 1
## 21 4
## 22 7
## 23 7
## 24 1
## 25 0
## 26 0
## 27 2
## 28 4
## 29 8
## 30 8
## 31 -5
## residential_percent_change_from_baseline POSITIF Sembuh Meninggal
## 1 6 865415 851280 13588
## 2 3 865518 851386 13588
## 3 4 865690 851408 13588
## 4 4 865805 851449 13588
## 5 4 866064 851520 13589
## 6 5 866331 851572 13589
## 7 5 866631 851648 13589
## 8 3 866909 851727 13589
## 9 4 867302 851839 13589
## 10 4 867662 851944 13589
## 11 5 868199 852126 13590
## 12 5 868611 852269 13590
## 13 4 869089 852563 13590
## 14 5 869643 852727 13591
## 15 5 870363 853003 13591
## 16 6 870929 853522 13591
## 17 5 871422 853987 13591
## 18 6 872092 854204 13591
## 19 6 873104 854589 13591
## 20 7 874259 855026 13591
## 21 6 875743 855676 13591
## 22 6 877568 856137 13591
## 23 7 879307 856653 13597
## 24 6 881300 857215 13597
## 25 6 883490 857688 13606
## 26 6 886999 859305 13612
## 27 6 891148 861203 13615
## 28 7 895706 861203 13617
## 29 5 901471 864447 13627
## 30 6 908093 866477 13639
## 31 6 913355 867519 13666
## Self.Isolation prediksi_ss prediksi_smsp prediksi_ss1 prediksi_smsp1
## 1 342 156.5343 199.2593 6.328426 3.446112
## 2 428 179.6720 131.8356 8.844735 10.802947
## 3 554 161.4757 123.8707 10.880841 12.848971
## 4 627 130.7884 146.2245 11.900411 12.216316
## 5 787 149.0095 174.4261 11.527913 11.091114
## 6 956 178.0956 202.6801 10.214749 9.875506
## 7 1173 212.5600 229.8601 8.573121 8.525637
## 8 1336 282.6753 256.3473 7.096048 7.055274
## 9 1575 332.0818 288.5539 5.983496 5.708680
## 10 1807 340.1791 334.3997 5.316243 4.787744
## 11 2081 362.9232 401.8045 5.113832 4.594352
## 12 2246 476.0741 493.7075 5.305100 5.250904
## 13 2353 604.9379 593.1250 5.639764 6.161846
## 14 2656 694.6248 678.0930 5.776836 6.552138
## 15 3011 747.5863 734.8096 5.456892 5.908637
## 16 3035 765.9703 782.1234 4.930473 4.765789
## 17 3035 809.2660 847.0453 4.529686 3.919938
## 18 3316 956.4404 951.0056 4.485771 3.965421
## 19 3799 1134.7027 1093.1121 4.810214 4.688551
## 20 4420 1290.7398 1266.8917 5.413629 5.673631
## 21 5031 1447.7952 1465.8717 6.145400 6.504966
## 22 6130 1667.4409 1688.8265 6.836032 6.899117
## 23 7046 1934.8948 1955.5211 7.424228 7.101652
## 24 8246 2273.7034 2290.9675 7.829814 7.490397
## 25 9475 2741.6358 2719.2264 7.987474 8.279395
## 26 10850 3303.6823 3260.5504 8.075024 9.027557
## 27 12341 3959.0559 3934.2404 8.285141 9.130014
## 28 14710 4732.2194 4735.9866 8.742077 8.334571
## 29 17852 5518.9656 5567.0349 9.480094 7.799720
## 30 21629 6240.3370 6305.0205 10.465034 9.036631
## 31 25361 6893.9327 6827.5786 11.601501 13.556472
## prediksi_linear
## 1 840.9763
## 2 2267.6276
## 3 2802.6218
## 4 2267.6276
## 5 2445.9590
## 6 2089.2962
## 7 1375.9705
## 8 1732.6334
## 9 1554.3019
## 10 1554.3019
## 11 484.3135
## 12 1197.6391
## 13 1732.6334
## 14 1375.9705
## 15 1197.6391
## 16 1375.9705
## 17 1197.6391
## 18 484.3135
## 19 1375.9705
## 20 1375.9705
## 21 1554.3019
## 22 1375.9705
## 23 1554.3019
## 24 1732.6334
## 25 1732.6334
## 26 1910.9648
## 27 2089.2962
## 28 1554.3019
## 29 1910.9648
## 30 1732.6334
## 31 2802.6218
Cek Struktur Data beserta Model
str(mobility_DKI_Jakarta)## spec_tbl_df [31 x 17] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ date : Date[1:31], format: "2022-01-01" "2022-01-02" ...
## $ Dirawat : num [1:31] 205 116 140 141 168 214 221 257 299 322 ...
## $ retail_and_recreation_percent_change_from_baseline: num [1:31] 3 11 14 11 12 10 6 8 7 7 ...
## $ grocery_and_pharmacy_percent_change_from_baseline : num [1:31] 24 35 36 33 34 30 25 31 28 27 ...
## $ parks_percent_change_from_baseline : num [1:31] 46 28 20 15 17 12 6 10 13 6 ...
## $ transit_stations_percent_change_from_baseline : num [1:31] -15 -4 -9 -9 -10 -11 -13 -6 -6 -13 ...
## $ workplaces_percent_change_from_baseline : num [1:31] -49 -4 -10 -8 -7 -5 -3 1 2 -4 ...
## $ residential_percent_change_from_baseline : num [1:31] 6 3 4 4 4 5 5 3 4 4 ...
## $ POSITIF : num [1:31] 865415 865518 865690 865805 866064 ...
## $ Sembuh : num [1:31] 851280 851386 851408 851449 851520 ...
## $ Meninggal : num [1:31] 13588 13588 13588 13588 13589 ...
## $ Self Isolation : num [1:31] 342 428 554 627 787 ...
## $ prediksi_ss : num [1:31] 157 180 161 131 149 ...
## $ prediksi_smsp : num [1:31] 199 132 124 146 174 ...
## $ prediksi_ss1 : num [1:31] 6.33 8.84 10.88 11.9 11.53 ...
## $ prediksi_smsp1 : num [1:31] 3.45 10.8 12.85 12.22 11.09 ...
## $ prediksi_linear : Named num [1:31] 841 2268 2803 2268 2446 ...
## ..- attr(*, "names")= chr [1:31] "1" "2" "3" "4" ...
## - attr(*, "spec")=
## .. cols(
## .. date = col_date(format = ""),
## .. Dirawat = col_double(),
## .. retail_and_recreation_percent_change_from_baseline = col_double(),
## .. grocery_and_pharmacy_percent_change_from_baseline = col_double(),
## .. parks_percent_change_from_baseline = col_double(),
## .. transit_stations_percent_change_from_baseline = col_double(),
## .. workplaces_percent_change_from_baseline = col_double(),
## .. residential_percent_change_from_baseline = col_double(),
## .. POSITIF = col_double(),
## .. Sembuh = col_double(),
## .. Meninggal = col_double(),
## .. `Self Isolation` = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
Ringkasan Data beserta Model
summary(mobility_DKI_Jakarta)## date Dirawat
## Min. :2022-01-01 Min. : 116
## 1st Qu.:2022-01-08 1st Qu.: 278
## Median :2022-01-16 Median : 781
## Mean :2022-01-16 Mean :1635
## 3rd Qu.:2022-01-23 3rd Qu.:2126
## Max. :2022-01-31 Max. :6809
## retail_and_recreation_percent_change_from_baseline
## Min. : 1.000
## 1st Qu.: 6.000
## Median : 7.000
## Mean : 7.452
## 3rd Qu.: 9.000
## Max. :14.000
## grocery_and_pharmacy_percent_change_from_baseline
## Min. :20.00
## 1st Qu.:25.00
## Median :26.00
## Mean :27.55
## 3rd Qu.:29.50
## Max. :36.00
## parks_percent_change_from_baseline
## Min. :-1.00
## 1st Qu.: 5.50
## Median : 9.00
## Mean :10.94
## 3rd Qu.:14.00
## Max. :46.00
## transit_stations_percent_change_from_baseline
## Min. :-17.00
## 1st Qu.:-15.00
## Median :-14.00
## Mean :-12.65
## 3rd Qu.:-10.00
## Max. : -4.00
## workplaces_percent_change_from_baseline
## Min. :-49.000
## 1st Qu.: -4.000
## Median : 0.000
## Mean : -1.677
## 3rd Qu.: 3.000
## Max. : 8.000
## residential_percent_change_from_baseline POSITIF Sembuh
## Min. :3.000 Min. :865415 Min. :851280
## 1st Qu.:4.500 1st Qu.:867106 1st Qu.:851783
## Median :5.000 Median :870929 Median :853522
## Mean :5.258 Mean :876297 Mean :855268
## 3rd Qu.:6.000 3rd Qu.:880304 3rd Qu.:856934
## Max. :7.000 Max. :913355 Max. :867519
## Meninggal Self Isolation prediksi_ss prediksi_smsp
## Min. :13588 Min. : 342 Min. : 130.8 Min. : 123.9
## 1st Qu.:13589 1st Qu.: 1456 1st Qu.: 307.4 1st Qu.: 272.5
## Median :13591 Median : 3035 Median : 766.0 Median : 782.1
## Mean :13598 Mean : 5749 Mean :1634.8 Mean :1634.8
## 3rd Qu.:13597 3rd Qu.: 7646 3rd Qu.:2104.3 3rd Qu.:2123.2
## Max. :13666 Max. :25361 Max. :6893.9 Max. :6827.6
## prediksi_ss1 prediksi_smsp1 prediksi_linear
## Min. : 4.486 Min. : 3.446 Min. : 484.3
## 1st Qu.: 5.435 1st Qu.: 5.462 1st Qu.:1376.0
## Median : 7.096 Median : 7.055 Median :1554.3
## Mean : 7.452 Mean : 7.452 Mean :1634.8
## 3rd Qu.: 8.793 3rd Qu.: 9.032 3rd Qu.:1911.0
## Max. :11.900 Max. :13.556 Max. :2802.6
Histogram Dirawat dan Retail
# Function to add histograms
panel.hist <- function(x, ...) {
usr <- par("usr")
on.exit(par(usr))
par(usr = c(usr[1:2], 0, 1.5))
his <- hist(x, plot = FALSE)
breaks <- his$breaks
nB <- length(breaks)
y <- his$counts
y <- y/max(y)
rect(breaks[-nB], 0, breaks[-1], y, col = rgb(0, 1, 1, alpha = 0.5), ...)
# lines(density(x), col = 2, lwd = 2) # Uncomment to add density lines
}
# Equivalent with a formula
pairs(mobility_DKI_Jakarta$Dirawat~mobility_DKI_Jakarta$retail_and_recreation_percent_change_from_baseline, data = mobility_DKI_Jakarta,
upper.panel = NULL, # Disabling the upper panel
diag.panel = panel.hist) # Adding the histogramsHasil Plot 3 Linier Model
Dari ketiga model tersebut kemudian di visualkan melalui Plot, dimana masing masing model menunjukkan pendekatan prediksi linearnya dengan Data Real. Seperti pada gambar dibawah ini.
# plot method
plot(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$Dirawat, lty = 10, col = 'black', lwd =4)
#plot(mod.ss)
# add lm fit
abline(coef(lm( mobility_DKI_Jakarta$Dirawat ~ mobility_DKI_Jakarta$date , data = mobility_DKI_Jakarta)), lty = 10, col = 'blue', lwd =5)
rug(mobility_DKI_Jakarta$date) # add rug to plot
#points(mobilityjakarta$Tanggal, mobilityjakarta$DIRAWAT, lty = 2, col = 2, lwd = 2)
lines(mobility_DKI_Jakarta$date,mod.ss$y , lty = 2, col = 'red', lwd = 2)
#points(mobilityjakarta$Tanggal,mod.smsp$y , lty = 2, col = 4, lwd = 2)
lines(mobility_DKI_Jakarta$date, mobility_DKI_Jakarta$prediksi_linear, lty = 3, col = 6, lwd = 3)
legend("topright",
legend = c("Real", "Model SS", "Trends", "Linear Regresi"),
lty = 1:4, col = c("Black","Red","Blue","Pink"), lwd = 3, bty = "p")7. Kesimpulan
Dari Visualisasi Plot untuk Regresi Linear Tiga Model dapat disimpulkan sebagai berikut :
Baik Model SS dan Model Smooth.Spline mempunyai kesamaan dengan Model Liniear, hanya perlu menggunakan salah satunya saja.
Dari Hasil plot 3 Model yang mendekati nilai Real adalah Model SS
8. Referensi
https://github.com/juba/rmdformats : Format Visual Rpubs