Model ARIMA (2,2,2)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## Registered S3 methods overwritten by 'TSA':
## method from
## fitted.Arima forecast
## plot.Arima forecast
##
## Attaching package: 'TSA'
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
Generate Data
Data digenerate sebanyank \(200\), kemudian \(50\) data pertama dihapus sehingga data yang digunakan data ke \(51-200\) untuk dianalisis
set.seed(123)
n <- 200
phi1 <- 0.5
phi2 <- -0.7
teta1 <- -0.3
teta2 <- -0.5
ar <- numeric(4)
ar[1] <- phi1 + 2
ar[2] <- phi2 - 2*phi1 - 1
ar[3] <- phi1 - 2*phi2
ar[4] <- phi2
ma <- numeric(2)
ma[1] <- -teta1
ma[2] <- -teta2
et <- rnorm(n, mean = 0, sd = sqrt(1.15))
Yt <- numeric(n)
for (t in 5:n) {
Yt[t] <- ar[1] * Yt[t-1] +
ar[2] * Yt[t-2] +
ar[3] * Yt[t-3] +
ar[4] * Yt[t-4] +
et[t] +
ma[1] * et[t-1] +
ma[2] * et[t-2]
}
data <- ts(Yt[-(1:50)])
# Plot manual
plot(data, type="l", col="blue",
main="Time Series Data",
ylab="Y_t", xlab="Waktu")Split Data
Data yang akan digunakan untuk menentukan model terbaik berdasarkan kadidat model dilakukan splitting data yaitu dengan membagi data menjadi data train sebesar \(80%\) dan data testing sebesar \(20%\). Sehingga ada \(120\) data train dan \(30\) data testing.
train_size <- 120
train <- data[1:train_size]
test <- data[(train_size+1):150]
plot(train, type="l", col="blue", main="Train Data")
lines(test, col="red")
legend("topleft", legend=c("Train","Test"),
col=c("blue","red"), lty=1)Model ARIMA
1. Uji Stasioneritas (ADF)
untuk menguji kestasioneran data dilakukan dengan uji ADF
##
## Augmented Dickey-Fuller Test
##
## data: train
## Dickey-Fuller = -2.3354, Lag order = 4, p-value = 0.4373
## alternative hypothesis: stationary
Hasil uji ADF menunjukan nilai \(p-value=0.4373 < 0.05\) artinya data belum stasioner
2. Melakukan Differencing agar data train stasioner
##
## Augmented Dickey-Fuller Test
##
## data: data.dif1
## Dickey-Fuller = -1.7984, Lag order = 4, p-value = 0.6603
## alternative hypothesis: stationary
## Warning in adf.test(data.dif2, alternative = "stationary", k =
## trunc((length(train) - : p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: data.dif2
## Dickey-Fuller = -4.2162, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
Berdasarkan hasil uji ADF pada hasil differencing ordo 2, diperoleh nilai \(p-value = 0.01 < alpha = 0.05\) sehingga data yang digunakan merupakan data yang stasioner.
3. Identifikasi Kandidat Model
Berdasarkan plot ACF diperoleh kandidat model adalah \(ARIMA(0,0,1)\)
Berdasarkan plot PACF diperoleh kandidat model adalah \(ARIMA(2,0,0)\)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x o x x o x o o o o o o o o
## 1 x x x x x x o o o o o o o o
## 2 o x o o o o o o o o o o o o
## 3 x x o o o o o o o o o o o o
## 4 x x o o o o o o o o o o o o
## 5 x x o o o o o o o o o o o o
## 6 x x o o o o o o o o o o o o
## 7 x o o o o o o o o o o o o o
Berdasarkan hasil EACF di atas kandidat model yang diperoleh adalah \(ARIMA(2,0,2), ARIMA(3,0,2), ARIMA(4,0,2)\)
Karena data yang digunakan adalah stasioner maka \(d = 0\), sehingga kandidat model adalah: \(ARIMA(0,0,1), ARIMA(2,0,0), ARIMA(2,0,2), ARIMA(3,0,2), ARIMA(4,0,2)\)
#ARIMA(0,0,1)
arima001 <- arima(data.dif2, order = c(0,0,1), include.mean = TRUE, method = "ML")
arima001##
## Call:
## arima(x = data.dif2, order = c(0, 0, 1), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ma1 intercept
## 0.6555 -0.0324
## s.e. 0.0672 0.1760
##
## sigma^2 estimated as 1.342: log likelihood = -185.09, aic = 374.17
#ARIMA(2,0,0)
arima200 <- arima(data.dif2, order = c(2,0,0), include.mean = TRUE, method = "ML")
arima200##
## Call:
## arima(x = data.dif2, order = c(2, 0, 0), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 intercept
## 0.7566 -0.5476 -0.0347
## s.e. 0.0777 0.0772 0.1228
##
## sigma^2 estimated as 1.106: log likelihood = -173.87, aic = 353.73
#ARIMA(2,0,2)
arima202 <- arima(data.dif2, order = c(2,0,2), include.mean = TRUE, method = "ML")
arima202##
## Call:
## arima(x = data.dif2, order = c(2, 0, 2), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 ma1 ma2 intercept
## 0.6972 -0.7064 -0.0006 0.3732 -0.0306
## s.e. 0.1220 0.0893 0.1449 0.1329 0.1285
##
## sigma^2 estimated as 1.05: log likelihood = -170.95, aic = 351.9
#ARIMA(3,0,2)
arima302 <- arima(data.dif2, order = c(3,0,2), include.mean = TRUE, method = "ML")
arima302##
## Call:
## arima(x = data.dif2, order = c(3, 0, 2), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 intercept
## 0.8267 -0.8322 0.0977 -0.1198 0.4073 -0.0303
## s.e. 0.3237 0.2991 0.2266 0.3095 0.1522 0.1338
##
## sigma^2 estimated as 1.049: log likelihood = -170.86, aic = 353.72
#ARIMA(4,0,2)
arima402 <- arima(data.dif2, order = c(4,0,2), include.mean = TRUE, method = "ML")
arima402##
## Call:
## arima(x = data.dif2, order = c(4, 0, 2), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1 ma2 intercept
## 0.8186 -0.9603 0.2047 -0.0936 -0.1086 0.5233 -0.0316
## s.e. 0.3964 0.4600 0.3643 0.1902 0.3766 0.2669 0.1295
##
## sigma^2 estimated as 1.047: log likelihood = -170.74, aic = 355.47
AICKandidatModel <- c(374.17, 353.73, 351.9, 353.72, 355.47)
KndidatModelARIMA <- c("ARIMA(0,0,1)", "ARIMA(2,0,0)", "ARIMA(2,0,2)", "ARIMA(3,0,2)", "ARIMA(4,0,2)")
compmodelARIMA <- cbind(KndidatModelARIMA, AICKandidatModel)
colnames(compmodelARIMA) <- c("Kandidat Model", "Nilai AIC")
compmodelARIMA <- as.data.frame(compmodelARIMA)
compmodelARIMA## Kandidat Model Nilai AIC
## 1 ARIMA(0,0,1) 374.17
## 2 ARIMA(2,0,0) 353.73
## 3 ARIMA(2,0,2) 351.9
## 4 ARIMA(3,0,2) 353.72
## 5 ARIMA(4,0,2) 355.47
Model terbaik diperoleh berdasarkan nilai AIC terkecil. Oleh karena itu, model terbaik yang diperoleh adalah \(ARIMA(2,0,2)\)
Model \(ARIMA(2,0,2)\), maka \(Y_t\) diperoleh dari penjabaran operator backshift sehingga untuk model \(ARIMA(2,0,2)\):
\[ \phi(B)(1-B)^dY_t = \mu + \theta_q(B)e_t \] \[ (1-\phi_1B-\phi_2B^2)Y_t = \mu + (1+\theta_1B+\theta_2B^2)e_t \] \[ Y_t-\phi_1Y_{t-1}-\phi_2Y_{t-2} = \mu + e_t+\theta_1e_{t-1}+\theta_2e_{t-2} \] \[ Y_t = \mu + \phi_1Y_{t-1} + \phi_2Y_{t-2} + e_t + \theta_1e_{t-1} + \theta_2e_{t-2} \]
Pendugaan parameter \(\hat{\mu}\) = \(-0.0306\), \(\phi_1\) = \(0.6972\), \(\phi_2\) = \(-0.7064\), \(\theta_1\) = $-0.0006, \(\theta_2\) = \(0.3732\). Model terbaik yang diperoleh yaitu model ARIMA(2,0,2). \[\begin{equation} Y_t = - 0.0306 + 0.6972 Y_{t-1} - 0.7064 Y_{t-2} + e_t - 0.0006 e_{t-1} + 0.3732 e_{t-2} \end{equation}\]
Diagnostic model
ARIMA202diag <- stats::arima(data.dif2, order = c(2,0,2), method = "ML")
checkresiduals(ARIMA202diag)##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,0,2) with non-zero mean
## Q* = 1.5853, df = 6, p-value = 0.9536
##
## Model df: 4. Total lags used: 10
Berdasarkan plot di ACF dan PACF terlihat bahwa tidak ada lag yang signifikan. Artinya tidak ada gejala autokorelasi pada sisaan.
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: sisaan
## D = 0.036122, p-value = 0.9979
## alternative hypothesis: two-sided
Hasil pengujian meunjukan nilai \(p-value = 0.9979 > 0.05\). Maka sisaan menyebar normal.
##
## One Sample t-test
##
## data: sisaan
## t = 0.032496, df = 117, p-value = 0.9741
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.1845760 0.1907342
## sample estimates:
## mean of x
## 0.003079083
hasil Nilai Tengah Sisaan menunjukan nilai \(p-value = 0.9741 > 0.05\). Maka nilai tengah sisaan sama dengan \(0\)
##
## Box-Ljung test
##
## data: sisaan
## X-squared = 8.192, df = 20, p-value = 0.9905
Hasil uji Autokorelasi menunjukan nilai \(p-value = 0.9905 > 0.05\). Maka tidak terdapat gejala autokorelasi.
Kesimpulan
Jadi dapat disimpulkan bahwa semua asumsi terpenuhi.