Individual Assignment 2 MPDW

Data Description

The data used to build the linear regression model is time series data on daily website visitors. It consists of Page Loads as the explanatory variable (X) and Unique Visits as the response variable (Y). The analysis aims to determine the relationship between Page Loads and Unique Visits on a website. The full dataset has 2167 rows covering 14 September 2014 to 19 August 2020; this analysis uses only the 1327 rows from 1 January 2017 to 19 August 2020 (kaggle.com).

- Page Loads, the number of page loads the website receives per day, is the number of website pages viewed by visitors (gaebler.com).
- Unique Visits is the number of distinct visitors who visit and view the website, including both first-time visitors and visitors returning from previous days (Nesdale S, 2010).

Loading Libraries

library(dLagM) # can handle time series data automatically
## Warning: package 'dLagM' was built under R version 4.1.3
## Loading required package: nardl
## Warning: package 'nardl' was built under R version 4.1.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Loading required package: dynlm
## Warning: package 'dynlm' was built under R version 4.1.3
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(dynlm) # data must be a time series
library(MLmetrics) #MAPE
## Warning: package 'MLmetrics' was built under R version 4.1.3
## 
## Attaching package: 'MLmetrics'
## The following object is masked from 'package:dLagM':
## 
##     MAPE
## The following object is masked from 'package:base':
## 
##     Recall
library(lmtest)
## Warning: package 'lmtest' was built under R version 4.1.3
library(car)
## Loading required package: carData
library(readxl)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
## 
##     recode
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Data

Import Data

dt <- read_excel("C:/Users/user/Downloads/daily-website-visitors.xlsx")
str(dt)
## tibble [1,327 x 8] (S3: tbl_df/tbl/data.frame)
##  $ Row              : num [1:1327] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Day              : chr [1:1327] "Sunday" "Monday" "Tuesday" "Wednesday" ...
##  $ Day.Of.Week      : num [1:1327] 1 2 3 4 5 6 7 1 2 3 ...
##  $ Date             : POSIXct[1:1327], format: "2017-01-01" "2017-01-02" ...
##  $ Page.Loads       : num [1:1327] 1447 2568 3566 3941 3841 ...
##  $ Unique.Visits    : num [1:1327] 1039 1844 2527 2816 2625 ...
##  $ First.Time.Visits: num [1:1327] 832 1448 1970 2226 2058 ...
##  $ Returning.Visits : num [1:1327] 207 396 557 590 567 526 319 373 650 648 ...
t <- dt$Row
Xt <- dt$Page.Loads
Yt <- dt$Unique.Visits
cor(Xt,Yt)
## [1] 0.9845738
dt.regresi <- cbind(t,Xt,Yt)
dtreg <- as.data.frame(dt.regresi)

Split Data

train <- dtreg[1:1062,]
test <- dtreg[1063:1327,]
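
The row ranges above appear to correspond to a chronological split of roughly 80% training and 20% testing data. The 80% figure is an inference from 1062/1327 ≈ 0.80, not something stated in the original. A minimal sketch of how such indices could be derived, without shuffling so that the time order is preserved:

```r
# Sketch of a chronological split; the 0.8 proportion is an assumption
# inferred from the row ranges used above.
n <- 1327                       # rows used in this analysis
n_train <- round(0.8 * n)       # 1062
train_idx <- seq_len(n_train)   # rows 1..1062
test_idx <- (n_train + 1):n     # rows 1063..1327
c(train = length(train_idx), test = length(test_idx))
```

The test block has 265 rows, which matches the forecast horizon `h = 265` used later.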

Time Series Data

train.ts <- ts(train)
test.ts <- ts(test)
data.ts <- ts(dtreg)

Time Series Data Exploration

Correlation between Variable X (Page Loads) and Variable Y (Unique Visits)

cor(Xt,Yt)
## [1] 0.9845738

Scatter Plot of X and Y

plot(Xt, Yt, pch = 20, col = "orange", main = "Scatter Plot Unique Visits vs Page Loads")

The scatter plot above shows a positive linear relationship between variable X (Page Loads) and variable Y (Unique Visits), with a correlation of 0.98457.

Initial Linear Regression Model

model <- lm(Yt~Xt, data=dt)
summary(model)
## 
## Call:
## lm(formula = Yt ~ Xt, data = dt)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -631.28 -114.29  -21.44   98.45  606.43 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 55.647520  14.681721    3.79 0.000157 ***
## Xt           0.699793   0.003416  204.83  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 167.6 on 1325 degrees of freedom
## Multiple R-squared:  0.9694, Adjusted R-squared:  0.9694 
## F-statistic: 4.196e+04 on 1 and 1325 DF,  p-value: < 2.2e-16

Based on the output above, the linear regression model for the time series data is:

Estimated Yt = 55.647520 + 0.699793 (Xt)

The output also shows that the intercept and Page Loads both have a significant effect on Unique Visits at the 5% significance level. Page Loads explains 96.94% of the variance in Unique Visits; the remainder is attributable to factors outside the model.

The coefficient of Page Loads is positive (0.699793), so each additional unit of Page Loads is associated with an estimated increase of 0.699793 in Unique Visits.
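
As a worked illustration of the fitted line, the sketch below plugs a Page Loads value into the estimated equation. The coefficients are copied from the summary output; the value of 4000 page loads is a hypothetical input chosen only for the example.

```r
# Coefficients copied from the lm() summary above
b0 <- 55.647520
b1 <- 0.699793

page_loads <- 4000               # hypothetical input, not from the data
pred <- b0 + b1 * page_loads     # estimated Unique Visits
pred

# Estimated change in Unique Visits for 100 additional page loads
effect_100 <- b1 * 100
effect_100
```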

Autocorrelation Assumption Test

Koyck Model

# Koyck model
model.koyck <- dLagM::koyckDlm(x = train$Xt, y = train$Yt)
summary(model.koyck)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1740.9  -713.6    23.2   523.5  2473.2 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 947.0578   329.9635   2.870  0.00418 **
## Y.1           1.0948     0.3732   2.934  0.00342 **
## X.t          -0.2975     0.3382  -0.880  0.37915   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 879.8 on 1058 degrees of freedom
## Multiple R-Squared: 0.1656,  Adjusted R-squared: 0.164 
## Wald test: 374.9 on 2 and 1058 DF,  p-value: < 2.2e-16 
## 
## Diagnostic tests:
## NULL
## 
##                             alpha       beta      phi
## Geometric coefficients:  -9990.71 -0.2975253 1.094794
#AIC
AIC(model.koyck)
## [1] 17402.49
#BIC
BIC(model.koyck)
## [1] 17422.36
# Forecast
(fore.koyck <- forecast(model = model.koyck, x=test$Xt, h=265))
## $forecasts
##   [1]  3.345280e+03  3.627319e+03  3.603162e+03  2.992376e+03  2.151127e+03
##   [6]  1.232215e+03  4.067941e+02 -1.916111e+02 -3.918252e+02 -9.734042e+02
##  [11] -2.202784e+03 -3.429988e+03 -4.797326e+03 -6.138673e+03 -7.136487e+03
##  [16] -7.737673e+03 -8.590727e+03 -9.923330e+03 -1.132810e+04 -1.285801e+04
##  [21] -1.425833e+04 -1.546262e+04 -1.648176e+04 -1.765642e+04 -1.911231e+04
##  [26] -2.046612e+04 -2.190035e+04 -2.358390e+04 -2.544578e+04 -2.741037e+04
##  [31] -2.964539e+04 -3.224313e+04 -3.487319e+04 -3.769425e+04 -4.115939e+04
##  [36] -4.499764e+04 -4.894416e+04 -5.334868e+04 -5.856079e+04 -6.428363e+04
##  [41] -7.053051e+04 -7.753171e+04 -8.487942e+04 -9.268086e+04 -1.013411e+05
##  [46] -1.112794e+05 -1.220228e+05 -1.337597e+05 -1.466279e+05 -1.605249e+05
##  [51] -1.754896e+05 -1.919505e+05 -2.103045e+05 -2.306197e+05 -2.528678e+05
##  [56] -2.771481e+05 -3.033985e+05 -3.319119e+05 -3.633795e+05 -3.981972e+05
##  [61] -4.363845e+05 -4.781342e+05 -5.238460e+05 -5.736057e+05 -6.277809e+05
##  [66] -6.872341e+05 -7.528393e+05 -8.247096e+05 -9.034487e+05 -9.895230e+05
##  [71] -1.083494e+06 -1.186077e+06 -1.298582e+06 -1.422239e+06 -1.557578e+06
##  [76] -1.705766e+06 -1.868004e+06 -2.045226e+06 -2.238914e+06 -2.451313e+06
##  [81] -2.684210e+06 -2.939158e+06 -3.218379e+06 -3.523970e+06 -3.858168e+06
##  [86] -4.223682e+06 -4.624139e+06 -5.063019e+06 -5.543650e+06 -6.069712e+06
##  [91] -6.645568e+06 -7.275714e+06 -7.965258e+06 -8.720359e+06 -9.547498e+06
##  [96] -1.045311e+07 -1.144446e+07 -1.252988e+07 -1.371779e+07 -1.501799e+07
## [101] -1.644168e+07 -1.800071e+07 -1.970755e+07 -2.157619e+07 -2.362184e+07
## [106] -2.586108e+07 -2.831233e+07 -3.099609e+07 -3.393419e+07 -3.715113e+07
## [111] -4.067299e+07 -4.452869e+07 -4.874978e+07 -5.337081e+07 -5.843006e+07
## [116] -6.396915e+07 -7.003332e+07 -7.667233e+07 -8.394072e+07 -9.189797e+07
## [121] -1.006093e+08 -1.101467e+08 -1.205884e+08 -1.320200e+08 -1.445352e+08
## [126] -1.582368e+08 -1.732371e+08 -1.896589e+08 -2.076377e+08 -2.273210e+08
## [131] -2.488703e+08 -2.724623e+08 -2.982906e+08 -3.265670e+08 -3.575237e+08
## [136] -3.914150e+08 -4.285193e+08 -4.691410e+08 -5.136134e+08 -5.623014e+08
## [141] -6.156046e+08 -6.739604e+08 -7.378482e+08 -8.077925e+08 -8.843669e+08
## [146] -9.682002e+08 -1.059980e+09 -1.160461e+09 -1.270466e+09 -1.390898e+09
## [151] -1.522748e+09 -1.667096e+09 -1.825127e+09 -1.998139e+09 -2.187550e+09
## [156] -2.394917e+09 -2.621941e+09 -2.870485e+09 -3.142591e+09 -3.440490e+09
## [161] -3.766628e+09 -4.123682e+09 -4.514582e+09 -4.942537e+09 -5.411060e+09
## [166] -5.923996e+09 -6.485556e+09 -7.100347e+09 -7.773417e+09 -8.510289e+09
## [171] -9.317013e+09 -1.020021e+10 -1.116713e+10 -1.222570e+10 -1.338462e+10
## [176] -1.465340e+10 -1.604246e+10 -1.756318e+10 -1.922806e+10 -2.105077e+10
## [181] -2.304625e+10 -2.523089e+10 -2.762263e+10 -3.024108e+10 -3.310775e+10
## [186] -3.624616e+10 -3.968208e+10 -4.344369e+10 -4.756189e+10 -5.207046e+10
## [191] -5.700642e+10 -6.241028e+10 -6.832639e+10 -7.480331e+10 -8.189421e+10
## [196] -8.965728e+10 -9.815623e+10 -1.074608e+11 -1.176475e+11 -1.287997e+11
## [201] -1.410091e+11 -1.543759e+11 -1.690098e+11 -1.850309e+11 -2.025707e+11
## [206] -2.217732e+11 -2.427959e+11 -2.658115e+11 -2.910088e+11 -3.185946e+11
## [211] -3.487954e+11 -3.818590e+11 -4.180569e+11 -4.576862e+11 -5.010720e+11
## [216] -5.485705e+11 -6.005716e+11 -6.575021e+11 -7.198293e+11 -7.880647e+11
## [221] -8.627683e+11 -9.445535e+11 -1.034091e+12 -1.132117e+12 -1.239435e+12
## [226] -1.356925e+12 -1.485553e+12 -1.626375e+12 -1.780545e+12 -1.949330e+12
## [231] -2.134114e+12 -2.336415e+12 -2.557893e+12 -2.800365e+12 -3.065823e+12
## [236] -3.356444e+12 -3.674614e+12 -4.022945e+12 -4.404295e+12 -4.821795e+12
## [241] -5.278872e+12 -5.779277e+12 -6.327116e+12 -6.926888e+12 -7.583514e+12
## [246] -8.302385e+12 -9.089400e+12 -9.951019e+12 -1.089431e+13 -1.192703e+13
## [251] -1.305764e+13 -1.429542e+13 -1.565054e+13 -1.713411e+13 -1.875832e+13
## [256] -2.053649e+13 -2.248323e+13 -2.461450e+13 -2.694780e+13 -2.950229e+13
## [261] -3.229892e+13 -3.536066e+13 -3.871264e+13 -4.238236e+13 -4.639994e+13
## 
## $call
## forecast.koyckDlm(model = model.koyck, x = test$Xt, h = 265)
## 
## attr(,"class")
## [1] "forecast.koyckDlm" "dLagM"
# MAPE on the testing data
mape.koyck <- MAPE(fore.koyck$forecasts, test$Yt)

# accuracy on the training data
mape_train <- dLagM::GoF(model.koyck)["MAPE"]

c("MAPE_testing" = mape.koyck, "MAPE_training" = mape_train)
## $MAPE_testing
## [1] 887970262
## 
## $MAPE_training.MAPE
## [1] 0.2927139
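
The very large testing MAPE follows from the estimated coefficient on Y.1 (phi ≈ 1.0948) being greater than 1: each multi-step forecast feeds the previous forecast back in, so deviations grow geometrically. In the forecast output above, the ratio of the last two values (-4.639994e+13 / -4.238236e+13) is about 1.0948, which is consistent with this. The sketch below reproduces the mechanism with the reported coefficients; the constant Page Loads value and the starting value are hypothetical, and the recursion is an assumed simplification of what the forecast routine does.

```r
# Coefficients copied from the Koyck output above
intercept <- 947.0578
phi <- 1.094794      # coefficient on Y.1
beta <- -0.2975253   # coefficient on X.t

# Geometric-form alpha implied by the intercept: intercept / (1 - phi)
alpha <- intercept / (1 - phi)

x_const <- 4000   # hypothetical constant Page Loads
y_prev <- 2000    # hypothetical last observed Unique Visits
h <- 50
fc <- numeric(h)
for (i in seq_len(h)) {
  # Each forecast reuses the previous forecast as the lagged response
  fc[i] <- intercept + phi * y_prev + beta * x_const
  y_prev <- fc[i]
}
range(fc)
```

Because phi exceeds 1, the Koyck fit here does not describe a stable geometric lag structure, so its long-horizon forecasts should not be relied on.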

Regression With Distributed Lag

Regression with Distributed Lag Optimum

# Determine the optimum lag
finiteDLMauto(formula = Yt ~ Xt,
              data = data.frame(train), q.min = 1, q.max = 10 ,
              model.type = "dlm", error.type = "AIC", trace = TRUE) 
##    q - k    MASE      AIC      BIC   GMRAE    MBRAE R.Adj.Sq Ljung-Box
## 10    10 0.17852 13042.48 13106.94 0.19299  0.07700  0.98495         0
## 9      9 0.17856 13053.99 13113.50 0.18991  0.13012  0.98493         0
## 8      8 0.17944 13069.51 13124.07 0.19273  0.31033  0.98486         0
## 7      7 0.17957 13086.74 13136.36 0.19339  7.85679  0.98477         0
## 6      6 0.17954 13096.32 13140.98 0.19378 -0.67081  0.98481         0
## 5      5 0.18009 13115.45 13155.16 0.19238  0.16021  0.98469         0
## 4      4 0.18031 13128.39 13163.14 0.19498  0.30461  0.98465         0
## 3      3 0.18039 13138.30 13168.09 0.19610  0.26799  0.98466         0
## 2      2 0.18035 13147.75 13172.58 0.19597  0.17667  0.98468         0
## 1      1 0.18032 13158.53 13178.40 0.19261  0.75486  0.98469         0

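The optimum lag is 10, the value with the lowest AIC in the searched range (q = 1 to 10). Because 10 is also the upper bound set by q.max, AIC may continue to decrease at longer lags.

# DLM with the optimum lag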
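
The selection rule can be sketched directly from the search table: choose the lag with the minimum AIC and check whether it sits on the edge of the search range. The AIC values are copied from the output above; the object names are illustrative.

```r
# q and AIC copied from the finiteDLMauto() output above
lag_search <- data.frame(
  q = 10:1,
  AIC = c(13042.48, 13053.99, 13069.51, 13086.74, 13096.32,
          13115.45, 13128.39, 13138.30, 13147.75, 13158.53)
)

best_q <- lag_search$q[which.min(lag_search$AIC)]   # lag with the lowest AIC
at_boundary <- best_q == max(lag_search$q)          # TRUE if on the search limit
c(best_q = best_q, at_boundary = at_boundary)
```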
model.dlm2 = dLagM::dlm(x = train$Xt,y = train$Yt , q = 10)
summary(model.dlm2)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -560.14  -78.13   -3.44   75.63  324.41 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 60.912710  15.402561   3.955 8.18e-05 ***
## x.t          0.642086   0.010284  62.434  < 2e-16 ***
## x.1          0.030928   0.013266   2.331   0.0199 *  
## x.2         -0.002622   0.013301  -0.197   0.8438    
## x.3          0.020088   0.012592   1.595   0.1110    
## x.4         -0.001059   0.010860  -0.098   0.9223    
## x.5         -0.011872   0.010649  -1.115   0.2652    
## x.6          0.022417   0.010864   2.063   0.0393 *  
## x.7          0.022839   0.012595   1.813   0.0701 .  
## x.8         -0.010698   0.013340  -0.802   0.4228    
## x.9         -0.012987   0.013256  -0.980   0.3275    
## x.10        -0.014652   0.010322  -1.420   0.1560    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 118.3 on 1040 degrees of freedom
## Multiple R-squared:  0.9851, Adjusted R-squared:  0.9849 
## F-statistic:  6252 on 11 and 1040 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 13042.48 13106.94

The output above shows that only some terms are significant at the 5% level: the intercept, x.t, x.1, and x.6. The x.7 term is significant only at the 10% level (p = 0.0701). This suggests that the number of Unique Visits is influenced by the number of Page Loads on the same day, on the previous day, and 6 days earlier, with weaker evidence for 7 days earlier.

#AIC
AIC(model.dlm2)
## [1] 13042.48
#BIC
BIC(model.dlm2)
## [1] 13106.94
# Forecast
(fore.dlm2 <- forecast(model = model.dlm2, x=test$Xt, h=265))
## $forecasts
##   [1] 2598.048 2268.119 3051.577 4398.655 4807.161 4779.671 4382.359 3701.598
##   [9] 2702.618 3518.474 4877.548 4642.191 4652.389 4303.878 3187.506 2069.593
##  [17] 2548.440 3459.656 3335.197 3293.061 2673.613 1860.041 1165.610 1339.044
##  [25] 1757.675 1245.159 1087.128 1289.270 1287.863 1126.083 1373.321 1733.122
##  [33] 1260.849 1119.814 1955.298 2078.657 1516.648 1719.012 2570.491 2587.530
##  [41] 2571.674 2992.064 2304.713 1701.034 1970.939 2987.607 2711.756 2650.187
##  [49] 2698.642 2224.479 1641.900 1850.215 2613.067 3085.678 3113.511 2950.201
##  [57] 2211.225 1671.138 2226.551 3084.339 3272.741 3152.223 3131.004 2458.648
##  [65] 1776.291 2126.105 3291.115 3411.434 3519.510 3253.083 2623.445 1941.325
##  [73] 2400.060 3523.804 3468.156 3491.433 3484.078 2575.915 1802.690 2608.778
##  [81] 3460.959 3404.274 3639.005 3419.810 2558.909 1744.266 2421.457 3473.525
##  [89] 3812.167 3546.786 3363.331 2654.635 1878.552 2344.437 3411.325 3573.001
##  [97] 3300.677 3493.669 2586.205 1866.435 2399.686 3309.478 3370.316 3364.037
## [105] 3087.717 2318.984 1702.014 2070.270 1995.021 2657.603 2640.328 2542.799
## [113] 2286.327 1885.018 2250.584 2845.197 2923.093 2917.394 2987.049 2666.185
## [121] 2149.894 2836.522 3415.809 3444.313 3429.724 3449.310 3236.728 2392.714
## [129] 2799.398 3487.092 3704.508 3703.000 3547.352 2981.360 2549.059 2895.146
## [137] 3542.745 3828.964 3939.144 3769.888 3257.064 2830.136 3353.453 4099.404
## [145] 3950.423 4037.108 4144.313 3435.756 3013.694 3553.339 4196.134 4378.739
## [153] 4230.121 3866.003 3262.788 2910.162 3288.815 4121.907 4416.128 4594.204
## [161] 4332.311 3713.349 3206.369 3556.374 4547.521 4303.560 4080.153 4208.128
## [169] 3450.734 2557.881 2984.131 3912.470 3878.998 3632.290 3594.755 3020.798
## [177] 2388.267 2751.645 3260.926 3627.523 3639.595 3433.142 3041.707 2341.822
## [185] 2579.724 3299.039 3321.808 3476.340 3334.739 2866.647 2259.881 2717.851
## [193] 3401.869 3406.160 3482.955 3250.218 3092.231 2111.318 2463.678 3048.916
## [201] 3128.505 3097.081 3395.454 2746.768 1982.806 2433.302 3210.675 3184.149
## [209] 3236.066 2934.712 2482.228 1880.203 2426.872 2818.654 2705.909 2758.925
## [217] 2547.888 2088.470 1559.932 2071.075 2808.883 2659.409 2803.192 2760.516
## [225] 2442.890 1866.634 2423.711 2630.743 2595.023 2467.703 2627.254 2298.464
## [233] 1716.000 1978.471 2613.065 2639.731 2777.894 2693.371 2399.932 1712.705
## [241] 1998.311 2789.678 2677.535 2734.532 2419.010 2107.345 1625.225 1902.644
## [249] 2695.685 2692.207 2574.724 2557.240 2196.486 1579.129 1864.628 2558.890
## [257] 2625.413 2634.184 2544.782 2089.342 1572.574 1918.531 2433.561 2522.757
## [265] 1546.485
## 
## $call
## forecast.dlm(model = model.dlm2, x = test$Xt, h = 265)
## 
## attr(,"class")
## [1] "forecast.dlm" "dLagM"
# accuracy on the testing data
mape.dlm2 <- MAPE(fore.dlm2$forecasts, test$Yt)

# accuracy on the training data
mape_train <- GoF(model.dlm2)["MAPE"]

c("MAPE_testing" = mape.dlm2, "MAPE_training" = mape_train)
## $MAPE_testing
## [1] 0.08883024
## 
## $MAPE_training.MAPE
## [1] 0.03428635
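
The MAPE values in this report are fractions, so 0.0888 corresponds to an average absolute error of about 8.88%. A minimal base-R sketch of the calculation follows; the helper `mape_manual` and the toy vectors are hypothetical and only illustrate the formula, with the same argument order (predictions first) as the `MAPE()` calls above.

```r
# Mean absolute percentage error, returned as a fraction
mape_manual <- function(y_pred, y_true) {
  mean(abs((y_true - y_pred) / y_true))
}

actual <- c(100, 200, 400)   # hypothetical observed values
pred <- c(110, 180, 400)     # hypothetical forecasts
m <- mape_manual(pred, actual)
m          # fraction
m * 100    # percentage
```

Because the error is divided by the actual value, MAPE is undefined when an actual value is zero and becomes very large when forecasts diverge.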

Autoregressive Distributed Lag Model

model.ardl = ardlDlm(x = train$Xt, y = train$Yt, p = 1 , q = 1) # p: lags of x, q: lags of y
# model for p=1, q=1: y_t = b0 + b1*y_{t-1} + b2*x_t + b3*x_{t-1}
# model for p=2, q=3: y_t = b0 + b1*y_{t-1} + b2*y_{t-2} + b3*y_{t-3} + b4*x_t + b5*x_{t-1} + b6*x_{t-2}

summary(model.ardl)
## 
## Time series regression with "ts" data:
## Start = 2, End = 1062
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -544.61  -57.04    1.81   58.08  305.85 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 20.033403   8.981355   2.231   0.0259 *  
## X.t          0.681402   0.003044 223.821   <2e-16 ***
## X.1         -0.456747   0.015943 -28.649   <2e-16 ***
## Y.1          0.671868   0.022962  29.260   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 88.55 on 1057 degrees of freedom
## Multiple R-squared:  0.9916, Adjusted R-squared:  0.9915 
## F-statistic: 4.138e+04 on 3 and 1057 DF,  p-value: < 2.2e-16
#AIC
AIC(model.ardl)
## [1] 12531.02
BIC(model.ardl)
## [1] 12555.85
# Forecast
(fore.ardl <- forecast(model = model.ardl, x=test$Xt, h=265))
## $forecasts
##   [1] 2855.317 2452.557 3171.905 4482.385 4861.018 4845.726 4425.038 3720.518
##   [9] 2673.931 3499.077 4854.502 4583.449 4638.092 4281.891 3203.472 2076.495
##  [17] 2519.722 3431.423 3307.909 3289.654 2660.787 1906.900 1219.601 1352.232
##  [25] 1739.988 1189.814 1079.059 1337.797 1380.542 1211.499 1404.034 1749.604
##  [33] 1260.282 1126.369 1988.574 2091.831 1507.370 1699.190 2592.583 2632.189
##  [41] 2590.974 2962.966 2237.593 1681.107 1953.169 2999.437 2687.383 2630.590
##  [49] 2673.727 2236.474 1664.233 1840.819 2602.059 3109.831 3127.522 2952.643
##  [57] 2193.907 1676.449 2250.785 3091.562 3250.913 3120.500 3131.253 2478.161
##  [65] 1787.129 2111.924 3293.594 3400.682 3529.938 3235.866 2635.745 1956.259
##  [73] 2405.743 3524.915 3432.381 3479.578 3485.680 2580.542 1816.936 2616.323
##  [81] 3451.670 3392.297 3634.954 3408.254 2582.635 1748.832 2419.997 3479.927
##  [89] 3819.337 3523.057 3355.969 2672.622 1902.997 2344.034 3392.826 3547.401
##  [97] 3287.510 3515.420 2596.099 1894.563 2401.086 3296.713 3358.479 3354.557
## [105] 3068.930 2334.990 1720.534 2071.739 1933.464 2663.969 2625.843 2578.563
## [113] 2336.191 1912.800 2294.924 2868.711 2904.390 2903.024 2973.660 2675.620
## [121] 2148.707 2852.326 3411.539 3442.706 3419.018 3437.263 3255.029 2369.859
## [129] 2784.111 3468.628 3707.993 3708.953 3526.982 2974.513 2567.632 2904.459
## [137] 3542.689 3817.448 3937.001 3772.952 3272.434 2840.534 3367.562 4102.227
## [145] 3916.067 4020.852 4144.705 3443.402 3027.013 3543.055 4191.791 4387.714
## [153] 4210.191 3841.267 3261.763 2920.629 3290.712 4123.787 4412.454 4616.711
## [161] 4348.680 3732.344 3209.650 3548.357 4550.349 4252.545 4043.781 4207.277
## [169] 3458.645 2567.035 2971.756 3908.607 3889.802 3630.118 3591.282 3042.014
## [177] 2425.128 2763.582 3235.954 3634.964 3648.114 3442.263 3066.219 2343.404
## [185] 2590.632 3300.045 3295.977 3478.375 3324.293 2886.310 2276.558 2735.845
## [193] 3409.571 3394.149 3475.155 3233.106 3125.420 2097.681 2463.908 3028.885
## [201] 3120.001 3099.348 3410.351 2749.386 2001.307 2445.385 3225.922 3181.935
## [209] 3226.818 2894.079 2497.296 1894.177 2448.067 2798.201 2681.475 2754.522
## [217] 2548.944 2115.388 1567.421 2079.805 2829.896 2655.859 2817.806 2761.125
## [225] 2473.859 1876.011 2425.476 2595.262 2579.934 2443.178 2631.851 2313.881
## [233] 1718.683 1974.418 2625.464 2651.444 2799.182 2677.814 2410.237 1708.118
## [241] 2003.246 2797.364 2657.970 2731.997 2387.615 2120.240 1642.663 1907.945
## [249] 2704.832 2678.585 2574.966 2566.372 2209.480 1586.911 1858.198 2549.392
## [257] 2619.687 2638.726 2539.701 2096.943 1585.302 1926.852 2425.371 2511.147
## [265] 1477.996
## 
## $call
## forecast.ardlDlm(model = model.ardl, x = test$Xt, h = 265)
## 
## attr(,"class")
## [1] "forecast.ardlDlm" "dLagM"
# accuracy on the testing data
mape.ardl <- MAPE(fore.ardl$forecasts, test$Yt) # testing data

# accuracy on the training data
mape_train <- GoF(model.ardl)["MAPE"]

c("MAPE_testing" = mape.ardl, "MAPE_training" = mape_train)
## $MAPE_testing
## [1] 0.08740468
## 
## $MAPE_training.MAPE
## [1] 0.02583826
# Determine the optimum lags
ardlBoundOrders(data = data.frame(train.ts) , formula = Yt ~ Xt )
## $p
##   Xt
## 1 12
## 
## $q
## [1] 15
## 
## $Stat.table
##           q = 1    q = 2    q = 3    q = 4    q = 5    q = 6    q = 7    q = 8
## p = 1  12449.50 12437.22 12428.24 12390.97 12336.23 12326.97 12282.33 12269.65
## p = 2  12440.42 12380.52 12371.57 12336.83 12284.90 12275.29 12230.96 12217.48
## p = 3  12370.55 12370.55 12342.71 12307.70 12257.46 12247.87 12205.82 12192.26
## p = 4  12348.32 12317.15 12317.15 12285.86 12235.55 12225.92 12184.02 12170.53
## p = 5  12300.94 12259.35 12253.68 12253.68 12211.23 12201.64 12161.08 12147.29
## p = 6  12243.70 12210.41 12200.25 12198.79 12198.79 12153.04 12116.07 12102.30
## p = 7  12162.46 12134.59 12128.24 12119.43 12109.84 12109.84 12111.22 12097.73
## p = 8  12250.34 12202.85 12189.53 12174.91 12140.15 12095.54 12095.54 12096.34
## p = 9  12255.18 12206.33 12190.20 12173.45 12137.69 12084.07 12085.61 12085.61
## p = 10 12229.98 12196.95 12181.72 12163.51 12127.82 12074.42 12075.35 12076.60
## p = 11 12234.10 12192.53 12181.10 12161.29 12122.42 12066.08 12066.80 12067.06
## p = 12 12191.14 12152.37 12143.45 12133.16 12101.62 12049.87 12051.06 12051.96
## p = 13 12172.76 12135.50 12123.89 12111.51 12087.27 12040.08 12041.28 12042.21
## p = 14 12164.59 12121.94 12110.74 12095.88 12070.21 12031.15 12032.19 12032.58
## p = 15 12185.43 12140.52 12123.39 12105.36 12072.58 12025.71 12026.49 12026.77
##           q = 9   q = 10   q = 11   q = 12   q = 13   q = 14   q = 15
## p = 1  12259.33 12249.76 12240.76 12230.06 12219.24 12204.24 12195.05
## p = 2  12207.41 12197.37 12187.85 12177.30 12166.51 12152.43 12143.24
## p = 3  12182.30 12172.71 12163.37 12152.47 12141.48 12128.08 12118.90
## p = 4  12160.19 12150.19 12140.66 12130.32 12119.44 12106.44 12097.49
## p = 5  12137.34 12127.28 12118.06 12107.29 12096.98 12084.66 12075.61
## p = 6  12092.26 12082.11 12072.47 12062.62 12052.15 12039.54 12030.92
## p = 7  12087.78 12077.53 12068.07 12058.27 12048.07 12035.77 12027.12
## p = 8  12086.50 12076.19 12066.73 12057.00 12046.97 12034.44 12025.84
## p = 9  12086.63 12076.39 12066.99 12057.33 12047.31 12034.69 12026.05
## p = 10 12076.60 12078.38 12068.97 12059.30 12049.27 12036.64 12027.99
## p = 11 12067.75 12067.75 12063.18 12053.42 12043.47 12030.50 12021.80
## p = 12 12053.21 12054.18 12054.18 12053.24 12043.25 12030.34 12021.61
## p = 13 12043.52 12045.23 12043.69 12043.69 12045.10 12032.23 12023.50
## p = 14 12033.81 12035.55 12031.66 12032.47 12032.47 12034.20 12025.47
## p = 15 12027.75 12029.60 12024.98 12024.80 12025.28 12025.28 12023.72
## 
## $min.Stat
## [1] 12021.61
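
For the ARDL(1,1) model fitted earlier in this section, one common way to summarise the estimates is the long-run multiplier, the total effect on Y of a sustained one-unit change in X. The sketch below computes it from the reported coefficients; this summary is an illustrative addition and not part of the original analysis.

```r
# Coefficients copied from the ARDL(1,1) summary above
b_x0 <- 0.681402    # X.t
b_x1 <- -0.456747   # X.1
b_y1 <- 0.671868    # Y.1

# Long-run multiplier: sum of X coefficients divided by (1 - sum of Y coefficients)
long_run <- (b_x0 + b_x1) / (1 - b_y1)
long_run
```

The result is close to the slope of the static regression (about 0.70), and the coefficient on Y.1 is below 1, so unlike the Koyck fit the recursion is stable.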

Modeling DLM and ARDL with the dynlm Library

#library(dynlm)
# same as the DLM with p = 1
cons_lm1 <- dynlm(Yt ~ Xt+L(Xt),data = train.ts)

# same as the ARDL with p = 0, q = 1
cons_lm2 <- dynlm(Yt ~ Xt+L(Yt),data = train.ts)

# same as the ARDL with p = 1, q = 1
cons_lm3 <- dynlm(Yt ~ Xt+L(Xt)+L(Yt),data = train.ts)

# same as the DLM with p = 2
cons_lm4 <- dynlm(Yt ~ Xt+L(Xt)+L(Xt,2),data = train.ts)
# Model summaries
summary(cons_lm1)
## 
## Time series regression with "ts" data:
## Start = 2, End = 1062
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -571.04  -81.95   -0.27   79.61  317.24 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 47.898567  12.009311   3.988 7.11e-05 ***
## Xt           0.686505   0.004087 167.966  < 2e-16 ***
## L(Xt)        0.001217   0.004080   0.298    0.765    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 119.1 on 1058 degrees of freedom
## Multiple R-squared:  0.9847, Adjusted R-squared:  0.9847 
## F-statistic: 3.408e+04 on 2 and 1058 DF,  p-value: < 2.2e-16
summary(cons_lm2)
## 
## Time series regression with "ts" data:
## Start = 2, End = 1062
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Yt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -546.94  -81.73   -4.14   73.19  329.89 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 31.052835  11.954186   2.598  0.00952 ** 
## Xt           0.673631   0.004040 166.753  < 2e-16 ***
## L(Yt)        0.026058   0.005822   4.476 8.44e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 118 on 1058 degrees of freedom
## Multiple R-squared:  0.985,  Adjusted R-squared:  0.985 
## F-statistic: 3.474e+04 on 2 and 1058 DF,  p-value: < 2.2e-16
summary(cons_lm3)
## 
## Time series regression with "ts" data:
## Start = 2, End = 1062
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt) + L(Yt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -544.61  -57.04    1.81   58.08  305.85 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 20.033403   8.981355   2.231   0.0259 *  
## Xt           0.681402   0.003044 223.821   <2e-16 ***
## L(Xt)       -0.456747   0.015943 -28.649   <2e-16 ***
## L(Yt)        0.671868   0.022962  29.260   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 88.55 on 1057 degrees of freedom
## Multiple R-squared:  0.9916, Adjusted R-squared:  0.9915 
## F-statistic: 4.138e+04 on 3 and 1057 DF,  p-value: < 2.2e-16
summary(cons_lm4)
## 
## Time series regression with "ts" data:
## Start = 3, End = 1062
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt) + L(Xt, 2), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -562.86  -81.95   -2.44   78.59  313.55 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 55.016958  13.617628   4.040 5.73e-05 ***
## Xt           0.684101   0.004590 149.039  < 2e-16 ***
## L(Xt)        0.007113   0.006541   1.088    0.277    
## L(Xt, 2)    -0.005245   0.004586  -1.144    0.253    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 119.1 on 1056 degrees of freedom
## Multiple R-squared:  0.9847, Adjusted R-squared:  0.9847 
## F-statistic: 2.269e+04 on 3 and 1056 DF,  p-value: < 2.2e-16
#SSE
deviance(cons_lm1)
## [1] 14999915
deviance(cons_lm2)
## [1] 14722425
deviance(cons_lm3)
## [1] 8287338
deviance(cons_lm4)
## [1] 14980363

Model Diagnostic Tests

# Model test
if(require("lmtest")) encomptest(cons_lm1, cons_lm2)
## Encompassing test
## 
## Model 1: Yt ~ Xt + L(Xt)
## Model 2: Yt ~ Xt + L(Yt)
## Model E: Yt ~ Xt + L(Xt) + L(Yt)
##           Res.Df Df      F    Pr(>F)    
## M1 vs. ME   1057 -1 856.15 < 2.2e-16 ***
## M2 vs. ME   1057 -1 820.76 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Non-Autocorrelation Test

#Durbin Watson
dwtest(cons_lm1)
## 
##  Durbin-Watson test
## 
## data:  cons_lm1
## DW = 0.66181, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
dwtest(cons_lm2)
## 
##  Durbin-Watson test
## 
## data:  cons_lm2
## DW = 0.68876, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
dwtest(cons_lm3)
## 
##  Durbin-Watson test
## 
## data:  cons_lm3
## DW = 2.3274, p-value = 1
## alternative hypothesis: true autocorrelation is greater than 0
dwtest(cons_lm4)
## 
##  Durbin-Watson test
## 
## data:  cons_lm4
## DW = 0.65846, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
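
To read the four statistics together, the Durbin-Watson value can be converted to an approximate lag-1 residual autocorrelation using DW ≈ 2(1 - r). This is an approximation, and the test itself is known to be unreliable when a lagged response is among the regressors (as in cons_lm2 and cons_lm3), so the figures below are a rough guide only. The DW values are copied from the output above.

```r
# DW statistics copied from the dwtest() output above
dw <- c(cons_lm1 = 0.66181, cons_lm2 = 0.68876,
        cons_lm3 = 2.3274, cons_lm4 = 0.65846)

# Approximate lag-1 residual autocorrelation implied by each statistic
rho_hat <- 1 - dw / 2
round(rho_hat, 3)
```

Only cons_lm3 (X, lagged X, and lagged Y) gives a value near zero; the other three models imply strong positive autocorrelation in the residuals.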

Heteroscedasticity Test

#Breusch-Pagan
bptest(cons_lm1)
## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm1
## BP = 54.44, df = 2, p-value = 1.508e-12
bptest(cons_lm2)
## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm2
## BP = 51.497, df = 2, p-value = 6.57e-12
bptest(cons_lm3)
## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm3
## BP = 46.583, df = 3, p-value = 4.264e-10
bptest(cons_lm4)
## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm4
## BP = 65.836, df = 3, p-value = 3.322e-14

Normality Test

#Shapiro Wilk
shapiro.test(residuals(cons_lm1))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm1)
## W = 0.99554, p-value = 0.003453
shapiro.test(residuals(cons_lm2))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm2)
## W = 0.99512, p-value = 0.001736
shapiro.test(residuals(cons_lm3))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm3)
## W = 0.99208, p-value = 1.834e-05
shapiro.test(residuals(cons_lm4))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm4)
## W = 0.9959, p-value = 0.006441

Forecast Accuracy Comparison

All models are compared by their MAPE values to see which produces the best forecasts.

# Comparison
akurasi <- matrix(c(mape.koyck, mape.dlm2, mape.ardl))
row.names(akurasi)<- c("Koyck","DLM Optimum","Autoregressive")
colnames(akurasi) <- c("MAPE")
akurasi
##                        MAPE
## Koyck          8.879703e+08
## DLM Optimum    8.883024e-02
## Autoregressive 8.740468e-02
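
The ranking in the table can be checked with a short sketch that sorts the testing MAPE values (copied from the output above) and picks the smallest. The object names are illustrative.

```r
# Testing MAPE values copied from the comparison table above
mape_test <- c(Koyck = 8.879703e+08,
               `DLM Optimum` = 8.883024e-02,
               Autoregressive = 8.740468e-02)

sort(mape_test)                        # smallest first
best <- names(which.min(mape_test))    # model with the lowest testing MAPE
best
```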

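Judged by forecast accuracy (MAPE), the Autoregressive model has the smallest testing MAPE (0.0874), marginally below the DLM Optimum model (0.0888), so it is the most suitable of the three for forecasting this data. The Koyck model's MAPE is extremely large because its forecasts diverge.

Forecast vs Actual Comparison Plot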

par(mfrow=c(1,1))
plot(1:nrow(test), test$Yt, type="b", col="black",xlab="Time index", ylim=c(100,6000), main="Forecast Methods vs Actual")
points(1:nrow(test), test$Yt,col="black",pch=19)
lines(1:nrow(test), test$Yt,col="black")
points(1:nrow(test), fore.koyck$forecasts,col="blue",pch=19)
lines(1:nrow(test), fore.koyck$forecasts,col="blue")
points(1:nrow(test), fore.dlm2$forecasts,col="red",pch=19)
lines(1:nrow(test), fore.dlm2$forecasts,col="red")
points(1:nrow(test), fore.ardl$forecasts,col="green",pch=19)
lines(1:nrow(test), fore.ardl$forecasts,col="green")
legend("topleft",c("Actual data", "Koyck", "DLM Optimum", "Autoregressive"), lty=1, col=c("black", "blue", "red", "green"))

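From visual exploration, the Distributed Lag Model (DLM) appears to be a suitable method for forecasting this data, because the trend of its forecasts, built on lagged variables, follows the pattern of the actual data most closely compared with the Koyck and Autoregressive models. By testing MAPE, however, the DLM and Autoregressive models are nearly identical (0.0888 versus 0.0874), so any visual difference between those two is small.

References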

https://www.kaggle.com/datasets/bobnau/daily-website-visitors

Anna Lempereur. Small Business Website Unique Visitors vs. Page Loads. Gaebler.com. [Accessed 18 September 2022]. https://www.gramedia.com/best-seller/cara-menulis-daftar-pustaka-dari-internet/

Nesdale S. 2010. Difference Between Unique Visitors, Visitors, Visits, Page Views, Hits. And Why You Should Care. [Accessed 18 September 2022]. https://www.marketingfirst.co.nz/2010/01/difference-between-unique-visitors-visitors-visits-page-views-hits-and-why-you-should-care/