Tugas Individu 2 Metode Peramalan Deret Waktu

Data and Preparations

Package

library(dLagM)

## Loading required package: nardl

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## Loading required package: dynlm

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

library(dynlm)
library(MLmetrics)

## 
## Attaching package: 'MLmetrics'

## The following object is masked from 'package:dLagM':
## 
##     MAPE

## The following object is masked from 'package:base':
## 
##     Recall

library(lmtest)
library(car)

## Loading required package: carData

library(readxl)
library(ggplot2)
library(GGally)

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

Data

Data yang digunakan adalah data makro ekonomi di Amerika Serikat dalam bulanan dari 01 Jauari 1981 hingga 01 Oktober 2021. Data ini berasal dari kaggle dengan link https://www.kaggle.com/datasets/calven22/usa-key-macroeconomic-indicators. Variabel yang akan digunakan untuk analisis adalah Date, dspic, dan pce dengan jumlah amatan 490.

#IMPORT DATA

#membuka file data
# Mark the current directory
setwd("E:/KULIAH/SEMESTER 5/STA1341 Metode Peramalan Deret Waktu/TUGAS")

df <- read.csv("macro_monthly.csv", sep = ",")

data <- df[,-c(7, 10)]

head(data)

##         DATE unrate  psr       m2  dspic    pce       ir     ffer  indpro ccpi
## 1 1981-01-01    7.5 10.9 1612.900 4980.4 1870.0 12.56857 19.08452 51.1668 85.4
## 2 1981-02-01    7.4 10.8 1608.125 4965.0 1884.2 13.19444 15.93429 50.9509 85.9
## 3 1981-03-01    7.4 10.8 1629.400 4979.0 1902.9 13.11591 14.70387 51.2066 86.4
## 4 1981-04-01    7.2 10.9 1665.575 4965.1 1904.4 13.67952 15.71900 50.9711 87.0
## 5 1981-05-01    7.5 11.0 1655.150 4974.8 1913.8 14.09950 18.51774 51.2645 87.8
## 6 1981-06-01    7.5 10.8 1664.500 5001.9 1934.5 13.47227 19.09967 51.5247 88.6

str(data)

## 'data.frame':    490 obs. of  10 variables:
##  $ DATE  : chr  "1981-01-01" "1981-02-01" "1981-03-01" "1981-04-01" ...
##  $ unrate: num  7.5 7.4 7.4 7.2 7.5 7.5 7.2 7.4 7.6 7.9 ...
##  $ psr   : num  10.9 10.8 10.8 10.9 11 10.8 12.3 12 12.4 13 ...
##  $ m2    : num  1613 1608 1629 1666 1655 ...
##  $ dspic : num  4980 4965 4979 4965 4975 ...
##  $ pce   : num  1870 1884 1903 1904 1914 ...
##  $ ir    : num  12.6 13.2 13.1 13.7 14.1 ...
##  $ ffer  : num  19.1 15.9 14.7 15.7 18.5 ...
##  $ indpro: num  51.2 51 51.2 51 51.3 ...
##  $ ccpi  : num  85.4 85.9 86.4 87 87.8 88.6 89.8 90.7 91.8 92.1 ...

dim(data)

## [1] 490  10

data$DATE <- as.Date(data$DATE,"%y/%m/%d")

t<-data$DATE
Xt<-data$pce
Yt<-data$dspic

datareg1<-cbind(t, Xt, Yt)
datareg <- as.data.frame(datareg1)

Dilakukan splitting data dengan data train dengan jumlah amatan 80% dari jumlah seluruh amatan yaitu 392 dan data test yaitu 20% dari jumlah seluruh amatan yaitu 98 amatan.

#SPLIT DATA
train<-datareg[1:392,]
test<-datareg[393:490,]

#data time series
train.ts<-ts(train)
test.ts<-ts(test)
data.ts<-ts(datareg)

Plot Time Series

data.ts1<-ts(Yt)
plot(data.ts1, main = "Time Series Plot of Real Disposable Personal Income", xlab = "Period", ylab="Real Disposable Personal Income")
points(data.ts1)

data.ts2<-ts(Xt)
plot(data.ts2, main = "Time Series Plot of Personal Consumption Expenditures", xlab = "Period", ylab="Personal Consumption Expenditures")
points(data.ts2)

## Korelasi Korelasi Peubah Personal Consumption Expenditures (X) dan Real Disposable Personal Income (Y)

cor(Xt, Yt)

## [1] 0.9930372

Scatter Plot Peubah Personal Consumption Expenditures (X) dan Real Disposable Personal Income (Y)

plot(Xt, Yt, pch = 20, col = "black", main = "Scatter Plot Real Disposable Personal Income dan Personal Consumption Expenditures")

Berdasarkan scatter plot terlihat bahwa hubungan antara peubah Consumption Expenditures (X) dan Real Disposable Personal Income (Y) memiliki hubungan linier positif dengan nilai korelasi 0.9930372.

Model Regresi Awal

model.reg <- lm(pce ~ dspic, data = data)
anova(model.reg)

## Analysis of Variance Table
## 
## Response: pce
##            Df     Sum Sq    Mean Sq F value    Pr(>F)    
## dspic       1 7590475669 7590475669   34678 < 2.2e-16 ***
## Residuals 488  106815880     218885                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(model.reg)

## 
## Call:
## lm(formula = pce ~ dspic, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4869.2  -130.8    14.2   249.9  1601.9 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.772e+03  6.931e+01  -68.85   <2e-16 ***
## dspic        1.262e+00  6.775e-03  186.22   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 467.9 on 488 degrees of freedom
## Multiple R-squared:  0.9861, Adjusted R-squared:  0.9861 
## F-statistic: 3.468e+04 on 1 and 488 DF,  p-value: < 2.2e-16

Uji Asumsi Autokorelasi

bgtest(model.reg)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  model.reg
## LM test = 222.53, df = 1, p-value < 2.2e-16

Diperoleh nilai p-value < 0.05 (Tolak H0), artinya pada taraf nyata 5% cukup bukti untuk menyatakan bahwa terdapat autokorelasi pada model awal. Karena terdapat autokorelasi, akan dilakukan penanganan dengan menambahkan lag tertentu. Metode yang digunakan adalah Model Koyck, Distributed Lag, dan Autoregressive Distributed Lag.

Model KOYCK

Metode Koyck didasarkan asumsi bahwa semakin jauh jarak lag peubah independen dari periode sekarang maka semakin kecil pengaruh peubah lag terhadap peubah dependen

Koyck mengusulkan suatu metode untuk menduga model dinamis distributed lag dengan mengasumsikan bahwa semua koefisien 𝛽 mempunyai tanda sama.

Model Koyck merupakan jenis paling umum dari model infinite distributed lag dan juga dikenal sebagai geometric lag.

#MODEL KOYCK
model.koyck = dLagM::koyckDlm(x = data$dspic, y = data$pce)
summary(model.koyck)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2118.79   -18.67     3.42    29.37   592.47 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -503.14871   78.80556  -6.385 4.02e-10 ***
## Y.1            0.89855    0.01581  56.830  < 2e-16 ***
## X.t            0.13265    0.02010   6.599 1.09e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 122.4 on 486 degrees of freedom
## Multiple R-Squared: 0.9991,  Adjusted R-squared: 0.999 
## Wald test: 2.556e+05 on 2 and 486 DF,  p-value: < 2.2e-16 
## 
## Diagnostic tests:
## NULL
## 
##                              alpha     beta       phi
## Geometric coefficients:  -4959.595 0.132651 0.8985505

AIC(model.koyck)

## [1] 6094.152

BIC(model.koyck)

## [1] 6110.921

Forecasting Koyck

(fore.koyck <- forecast(model = model.koyck, x=test$Xt, h=98))

## $forecasts
##  [1] 15648.93 15079.08 14575.80 14127.20 13722.14 13367.68 13059.77 12788.69
##  [9] 12550.58 12344.50 12164.79 12013.88 11879.08 11767.33 11669.95 11583.67
## [17] 11502.02 11434.83 11380.97 11337.63 11307.08 11284.50 11271.80 11264.61
## [25] 11257.76 11252.39 11252.23 11256.77 11263.41 11279.22 11290.84 11311.78
## [33] 11335.70 11367.44 11398.47 11430.23 11466.32 11501.58 11536.64 11582.65
## [41] 11630.91 11676.88 11725.46 11774.18 11818.08 11863.91 11907.94 11953.16
## [49] 12010.27 12065.52 12128.54 12199.25 12265.21 12329.41 12396.08 12467.17
## [57] 12539.38 12609.68 12679.44 12749.00 12813.66 12883.55 12955.40 13001.75
## [65] 13050.55 13096.18 13153.96 13217.15 13279.61 13342.83 13407.83 13472.73
## [73] 13534.80 13596.31 13659.49 13718.74 13783.08 13842.90 13760.96 13456.48
## [81] 13320.34 13308.28 13329.27 13366.87 13428.67 13492.73 13539.85 13571.87
## [89] 13662.78 13723.46 13878.71 14039.41 14184.56 14338.55 14477.28 14624.63
## [97] 14770.37 14929.75
## 
## $call
## forecast.koyckDlm(model = model.koyck, x = test$Xt, h = 98)
## 
## attr(,"class")
## [1] "forecast.koyckDlm" "dLagM"

Akurasi Model Koyck

#mape data testing
mape.koyck <- MAPE(fore.koyck$forecasts, test$Yt)

#akurasi data training
mape_train <- dLagM::GoF(model.koyck)["MAPE"]

c("MAPE_testing" = mape.koyck, "MAPE_training" = mape_train)

## $MAPE_testing
## [1] 0.1268223
## 
## $MAPE_training.MAPE
## [1] 0.006205463

Regression with Distributed Lag

Regression with Distributed Lag (lag=2)

#REGRESSION WITH DISTRIBUTED LAG -> estimasi parameter menggunakan least square

model.dlm = dLagM::dlm(x = train$Xt,y = train$Yt , q = 2)
summary(model.dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -625.39 -111.44   -6.91   96.61  597.84 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3741.15716   25.20444 148.432  < 2e-16 ***
## x.t            1.52453    0.35385   4.308 2.09e-05 ***
## x.1            0.06344    0.49050   0.129   0.8972    
## x.2           -0.78742    0.35443  -2.222   0.0269 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 206.1 on 386 degrees of freedom
## Multiple R-squared:  0.9924, Adjusted R-squared:  0.9923 
## F-statistic: 1.675e+04 on 3 and 386 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 5268.733 5288.563

AIC(model.dlm)

## [1] 5268.733

BIC(model.dlm)

## [1] 5288.563

#ramalan

(fore.dlm <- forecast(model = model.dlm, x=test$Xt, h=98))

## $forecasts
##  [1] 12929.15 12986.00 13060.89 13066.31 12993.28 13080.15 13218.09 13231.03
##  [9] 13233.68 13293.62 13327.76 13404.95 13386.86 13432.20 13466.79 13426.68
## [17] 13361.82 13423.64 13525.94 13550.45 13610.71 13640.67 13680.36 13703.57
## [25] 13656.13 13639.90 13696.22 13747.62 13751.57 13838.12 13797.90 13858.98
## [33] 13937.90 13995.82 13999.30 13984.15 14058.02 14071.11 14066.46 14217.94
## [41] 14284.37 14231.50 14275.34 14321.47 14282.18 14325.58 14360.62 14389.20
## [49] 14564.18 14583.71 14641.70 14786.47 14741.65 14715.82 14807.13 14910.74
## [57] 14958.44 14958.25 14987.15 15037.30 15026.16 15121.85 15218.71 14943.87
## [65] 14963.48 15095.51 15246.86 15373.68 15344.86 15362.30 15425.97 15462.46
## [73] 15460.13 15489.21 15560.68 15558.96 15640.59 15654.34 14030.03 11299.87
## [81] 13574.77 16278.13 15880.60 15456.74 15598.78 15598.95 15316.74 15142.51
## [89] 15913.63 15763.28 16541.99 16958.43 16379.37 16524.66 16535.71 16656.73
## [97] 16818.75 17017.19
## 
## $call
## forecast.dlm(model = model.dlm, x = test$Xt, h = 98)
## 
## attr(,"class")
## [1] "forecast.dlm" "dLagM"

#meramalkan 98 periode ke depan

#mape data testing
mape.dlm <- MAPE(fore.dlm$forecasts, test$Yt)

#akurasi data training
mape_train <- GoF(model.dlm)["MAPE"]

c("MAPE_testing" = mape.dlm, "MAPE_training" = mape_train)

## $MAPE_testing
## [1] 0.04001644
## 
## $MAPE_training.MAPE
## [1] 0.01858939

Regression with Distributed Lag `Optimum`

#penentuan lag optimum 
lagop<-finiteDLMauto(formula=Yt ~ Xt,
              data = data.frame(train),model.type = "dlm",error.type = "AIC");lagop

##    q - k    MASE      AIC      BIC   GMRAE   MBRAE R.Adj.Sq Ljung-Box
## 10    10 3.53187 5153.743 5205.033 3.96934 1.33652  0.99239         0

#model dlm dengan lag optimum
model.dlm2 = dLagM::dlm(x = train$Xt,y = train$Yt , q = 10)
summary(model.dlm2)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -529.51 -105.57  -11.67  115.42  542.90 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3732.5235    27.2199 137.125  < 2e-16 ***
## x.t            1.2145     0.3582   3.390 0.000773 ***
## x.1            0.1514     0.4886   0.310 0.756892    
## x.2           -0.1569     0.4933  -0.318 0.750663    
## x.3            0.1117     0.4930   0.227 0.820913    
## x.4           -0.2282     0.4924  -0.463 0.643328    
## x.5           -0.2030     0.4937  -0.411 0.681159    
## x.6           -0.1866     0.4935  -0.378 0.705554    
## x.7            0.3930     0.4955   0.793 0.428177    
## x.8            0.4692     0.4957   0.946 0.344567    
## x.9           -0.1184     0.4921  -0.241 0.810062    
## x.10          -0.6535     0.3618  -1.806 0.071677 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 202.1 on 370 degrees of freedom
## Multiple R-squared:  0.9926, Adjusted R-squared:  0.9924 
## F-statistic:  4516 on 11 and 370 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 5153.743 5205.033

AIC(model.dlm2)

## [1] 5153.743

BIC(model.dlm2)

## [1] 5205.033

#ramalan
(fore.dlm2 <- forecast(model = model.dlm2, x=test$Xt, h=98))

## $forecasts
##  [1] 12916.25 12974.14 12985.38 12995.05 12992.36 13079.59 13159.29 13190.51
##  [9] 13243.41 13331.64 13370.43 13395.01 13356.54 13454.45 13527.01 13473.33
## [17] 13401.53 13433.94 13502.49 13521.54 13596.90 13633.91 13690.41 13654.89
## [25] 13607.63 13626.25 13694.41 13722.13 13737.10 13829.15 13802.55 13853.03
## [33] 13864.67 13918.49 13973.33 13991.99 14061.13 14054.74 14056.66 14177.44
## [41] 14285.06 14267.98 14306.96 14313.55 14294.55 14311.99 14338.38 14417.43
## [49] 14561.00 14546.43 14648.21 14777.42 14741.18 14749.69 14814.94 14908.40
## [57] 14997.18 15029.61 15073.71 15130.57 15055.43 15098.02 15214.81 15048.77
## [65] 15060.28 15056.85 15149.98 15275.17 15328.56 15445.12 15446.22 15354.50
## [73] 15344.16 15502.70 15605.79 15633.13 15653.47 15652.36 14387.45 12093.18
## [81] 13200.62 14510.21 14796.99 15598.61 16243.39 15865.77 14125.33 13498.24
## [89] 15640.58 16937.75 17111.77 17097.34 16866.11 16979.84 16585.39 16720.07
## [97] 16930.28 17327.32
## 
## $call
## forecast.dlm(model = model.dlm2, x = test$Xt, h = 98)
## 
## attr(,"class")
## [1] "forecast.dlm" "dLagM"

#akurasi testing
mape.dlm2 <- MAPE(fore.dlm2$forecasts, test$Yt)

#akurasi data training
mape_train <- GoF(model.dlm2)["MAPE"]

c("MAPE_testing" = mape.dlm2, "MAPE_training" = mape_train)

## $MAPE_testing
## [1] 0.04497278
## 
## $MAPE_training.MAPE
## [1] 0.01766089

Model Autoregressive / Dynamic Regression

Apabila peubah dependen dipengaruhi oleh peubah independen pada waktu sekarang, serta dipengaruhii juga oleh peubah dependen itu sendiri pada satu waktu yang lalu maka model tersebut disebut autoregressive (Gujarati, 2004)

#MODEL AUTOREGRESSIVE 
#library(dLagM)
pqop<-ardlBoundOrders(data = data.frame(datareg) , formula = Yt ~ Xt )
c(p=pqop$p$Xt, q=pqop$q)

##  p  q 
## 15 11

Diperoleh lag optimum untuk peubah Xt atau unique visits adalah 15 hari sebelumnya, dan lag optimum untuk peubah Yt atau page loads adalah 11 hari sebelumnya.

model.ardl = ardlDlm(x = train$Xt, y = train$Yt, p = 15 , q = 11)
summary(model.ardl)

## 
## Time series regression with "ts" data:
## Start = 16, End = 392
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -546.00  -27.66   -0.58   25.39  499.49 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 26.049075  83.531590   0.312   0.7553    
## X.t          0.159263   0.136542   1.166   0.2442    
## X.1         -0.016993   0.180954  -0.094   0.9252    
## X.2         -0.044166   0.184302  -0.240   0.8107    
## X.3          0.153231   0.184598   0.830   0.4071    
## X.4         -0.440602   0.185487  -2.375   0.0181 *  
## X.5          0.002609   0.185199   0.014   0.9888    
## X.6          0.036423   0.184541   0.197   0.8437    
## X.7          0.374419   0.182676   2.050   0.0411 *  
## X.8          0.134680   0.183464   0.734   0.4634    
## X.9         -0.119179   0.183311  -0.650   0.5160    
## X.10         0.048768   0.183241   0.266   0.7903    
## X.11        -0.178186   0.183400  -0.972   0.3319    
## X.12        -0.147451   0.183532  -0.803   0.4223    
## X.13        -0.042996   0.183572  -0.234   0.8150    
## X.14        -0.054736   0.180256  -0.304   0.7616    
## X.15         0.132524   0.134563   0.985   0.3254    
## Y.1          0.623868   0.053438  11.675   <2e-16 ***
## Y.2          0.050106   0.063344   0.791   0.4295    
## Y.3          0.054678   0.063229   0.865   0.3878    
## Y.4          0.090863   0.063422   1.433   0.1528    
## Y.5          0.132886   0.063648   2.088   0.0375 *  
## Y.6          0.094785   0.064028   1.480   0.1397    
## Y.7          0.005316   0.064546   0.082   0.9344    
## Y.8          0.073771   0.067792   1.088   0.2773    
## Y.9         -0.145101   0.076674  -1.892   0.0593 .  
## Y.10        -0.032075   0.076797  -0.418   0.6765    
## Y.11         0.050931   0.065470   0.778   0.4371    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 72.82 on 349 degrees of freedom
## Multiple R-squared:  0.9991, Adjusted R-squared:  0.999 
## F-statistic: 1.38e+04 on 27 and 349 DF,  p-value: < 2.2e-16

AIC(model.ardl)

## [1] 4331.936

BIC(model.ardl)

## [1] 4445.971

#ramalan
(fore.ardl <- forecast(model = model.ardl, x=test$Xt, h=98))

## $forecasts
##  [1] 12304.35 12354.99 12426.74 12432.61 12421.63 12416.28 12419.34 12449.32
##  [9] 12506.32 12561.65 12576.78 12563.07 12568.57 12605.49 12644.30 12657.42
## [17] 12691.26 12699.57 12721.31 12730.03 12785.84 12834.08 12838.26 12820.55
## [25] 12799.99 12805.41 12822.57 12852.38 12900.52 12948.18 12963.95 12953.46
## [33] 12960.34 12958.56 12994.59 13004.13 13041.52 13037.48 13045.91 13091.28
## [41] 13121.05 13152.41 13191.02 13191.81 13185.04 13183.50 13221.31 13263.33
## [49] 13315.27 13345.35 13377.05 13400.47 13381.19 13392.60 13410.19 13445.19
## [57] 13489.68 13539.23 13610.37 13623.34 13612.99 13640.07 13674.11 13684.85
## [65] 13718.60 13732.19 13725.20 13788.55 13826.98 13904.90 13866.61 13815.14
## [73] 13807.19 13820.40 13889.21 13940.14 13992.22 14035.06 13839.75 13478.17
## [81] 13475.34 13510.29 13671.59 14634.15 14829.96 14181.66 12973.79 12558.13
## [89] 13024.97 13405.47 14088.22 14710.39 14504.45 14312.73 13702.18 13808.08
## [97] 14187.61 14502.03
## 
## $call
## forecast.ardlDlm(model = model.ardl, x = test$Xt, h = 98)
## 
## attr(,"class")
## [1] "forecast.ardlDlm" "dLagM"

#akurasi testing
mape.ardl <- MAPE(fore.ardl$forecasts, test$Yt) #data testing

#akurasi data training
mape_train <- GoF(model.ardl)["MAPE"]

c("MAPE_testing" = mape.ardl, "MAPE_training" = mape_train)

## $MAPE_testing
## [1] 0.06246014
## 
## $MAPE_training.MAPE
## [1] 0.004722454

#PEMODELAN DLM dan ARDL dengan library dynlm
#library(dynlm)

#sama dengan model dlm p=1
cons_lm1 <- dynlm(Yt ~ Xt+L(Xt),data = train.ts)

#sama dengan model ardl p=0 q=1
cons_lm2 <- dynlm(Yt ~ Xt+L(Yt),data = train.ts)

#sama dengan ardl p=1 q=1
cons_lm3 <- dynlm(Yt ~ Xt+L(Xt)+L(Yt),data = train.ts)

#sama dengan dlm p=2
cons_lm4 <- dynlm(Yt ~ Xt+L(Xt)+L(Xt,2),data = train.ts)

#Ringkasan Model
summary(cons_lm1)

## 
## Time series regression with "ts" data:
## Start = 2, End = 392
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -601.67 -116.63   -6.14   93.16  604.62 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3749.5563    24.7297 151.622  < 2e-16 ***
## Xt             1.4957     0.3559   4.203 3.28e-05 ***
## L(Xt)         -0.6934     0.3565  -1.945   0.0525 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 207.4 on 388 degrees of freedom
## Multiple R-squared:  0.9923, Adjusted R-squared:  0.9922 
## F-statistic: 2.495e+04 on 2 and 388 DF,  p-value: < 2.2e-16

summary(cons_lm2)

## 
## Time series regression with "ts" data:
## Start = 2, End = 392
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Yt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -760.12  -21.26    2.99   21.54  522.56 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 202.36793   72.66910   2.785  0.00562 ** 
## Xt            0.03942    0.01555   2.536  0.01161 *  
## L(Yt)         0.95094    0.01928  49.329  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 77.3 on 388 degrees of freedom
## Multiple R-squared:  0.9989, Adjusted R-squared:  0.9989 
## F-statistic: 1.809e+05 on 2 and 388 DF,  p-value: < 2.2e-16

summary(cons_lm3)

## 
## Time series regression with "ts" data:
## Start = 2, End = 392
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt) + L(Yt), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -761.82  -20.54    3.02   21.17  520.38 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 204.67532   72.92432   2.807  0.00526 ** 
## Xt            0.10013    0.13578   0.737  0.46129    
## L(Xt)        -0.06013    0.13360  -0.450  0.65291    
## L(Yt)         0.95010    0.01939  49.004  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 77.38 on 387 degrees of freedom
## Multiple R-squared:  0.9989, Adjusted R-squared:  0.9989 
## F-statistic: 1.203e+05 on 3 and 387 DF,  p-value: < 2.2e-16

summary(cons_lm4)

## 
## Time series regression with "ts" data:
## Start = 3, End = 392
## 
## Call:
## dynlm(formula = Yt ~ Xt + L(Xt) + L(Xt, 2), data = train.ts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -625.39 -111.44   -6.91   96.61  597.84 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3741.15716   25.20444 148.432  < 2e-16 ***
## Xt             1.52453    0.35385   4.308 2.09e-05 ***
## L(Xt)          0.06344    0.49050   0.129   0.8972    
## L(Xt, 2)      -0.78742    0.35443  -2.222   0.0269 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 206.1 on 386 degrees of freedom
## Multiple R-squared:  0.9924, Adjusted R-squared:  0.9923 
## F-statistic: 1.675e+04 on 3 and 386 DF,  p-value: < 2.2e-16

#SSE
deviance(cons_lm1)

## [1] 16694326

deviance(cons_lm2)

## [1] 2318207

deviance(cons_lm3)

## [1] 2316995

deviance(cons_lm4)

## [1] 16390287

Uji Diagnostik Model

Uji Non Autokorelasi

#uji model
if(require("lmtest")) encomptest(cons_lm1, cons_lm2)

## Encompassing test
## 
## Model 1: Yt ~ Xt + L(Xt)
## Model 2: Yt ~ Xt + L(Yt)
## Model E: Yt ~ Xt + L(Xt) + L(Yt)
##           Res.Df Df         F Pr(>F)    
## M1 vs. ME    387 -1 2401.3983 <2e-16 ***
## M2 vs. ME    387 -1    0.2026 0.6529    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#Diagnostik
#durbin watson
dwtest(cons_lm1)

## 
##  Durbin-Watson test
## 
## data:  cons_lm1
## DW = 0.19782, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0

dwtest(cons_lm2)

## 
##  Durbin-Watson test
## 
## data:  cons_lm2
## DW = 2.3575, p-value = 0.9997
## alternative hypothesis: true autocorrelation is greater than 0

dwtest(cons_lm3)

## 
##  Durbin-Watson test
## 
## data:  cons_lm3
## DW = 2.3582, p-value = 0.9997
## alternative hypothesis: true autocorrelation is greater than 0

dwtest(cons_lm4)

## 
##  Durbin-Watson test
## 
## data:  cons_lm4
## DW = 0.19485, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0

Uji Heterogenitas

#heterogenitas
bptest(cons_lm1)

## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm1
## BP = 37.663, df = 2, p-value = 6.631e-09

bptest(cons_lm2)

## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm2
## BP = 10.56, df = 2, p-value = 0.005093

bptest(cons_lm3)

## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm3
## BP = 12.263, df = 3, p-value = 0.006533

bptest(cons_lm4)

## 
##  studentized Breusch-Pagan test
## 
## data:  cons_lm4
## BP = 38.633, df = 3, p-value = 2.076e-08

Uji Normalitas

#shapiro wilk
shapiro.test(residuals(cons_lm1))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm1)
## W = 0.98211, p-value = 9.169e-05

shapiro.test(residuals(cons_lm2))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm2)
## W = 0.66233, p-value < 2.2e-16

shapiro.test(residuals(cons_lm3))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm3)
## W = 0.66171, p-value < 2.2e-16

shapiro.test(residuals(cons_lm4))

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(cons_lm4)
## W = 0.98527, p-value = 0.0005309

Perbandingan Keakuratan Ramalan

#PERBANDINGAN
akurasi <- matrix(c(mape.koyck, mape.dlm, mape.dlm2, mape.ardl))
row.names(akurasi)<- c("Koyck","DLM 1","DLM 2","Autoregressive")
colnames(akurasi) <- c("MAPE")
akurasi

##                      MAPE
## Koyck          0.12682234
## DLM 1          0.04001644
## DLM 2          0.04497278
## Autoregressive 0.06246014

#PLOT
par(mfrow=c(1,1))
plot(test$Xt, test$Yt, type="b", col="black")
points(test$Xt, fore.koyck$forecasts,col="red")
lines(test$Xt, fore.koyck$forecasts,col="red")
points(test$Xt, fore.dlm$forecasts,col="blue")
lines(test$Xt, fore.dlm$forecasts,col="blue")
points(test$Xt, fore.dlm2$forecasts,col="orange")
lines(test$Xt, fore.dlm2$forecasts,col="orange")
points(test$Xt, fore.ardl$forecasts,col="green")
lines(test$Xt, fore.ardl$forecasts,col="green")
legend("topleft",c("aktual", "koyck","DLM 1","DLM 2", "autoregressive"), lty=1, col=c("black","red","blue","orange","green"), cex=0.8)

Secara eksploratif, terlihat bahwa metode Distributed Lag Model merupakan metode yang sesuai untuk peramalan karena memiliki tren data yang paling mendekati pola data aktual dibandingkan dengan metode Koyck dan Autoregressive Model.

Kesimpulan

Metode yang paling cocok untuk metode peramalan terbaik yaitu metode Distributed lag Model.

Hasil uji diagnostik menunjukkan bahwa dengan metode Distributed lag Model autokorelasi pada model regresi deret waktu belum berhasil ditangani, sehingga perlu dilakukan uji lanjut atau penanganan dengan metode lain.

Tugas Individu 2 Metode Peramalan Deret Waktu