Projek EAS _ TDS I
Data
Data yang digunakan dalam projek ini adalah data Daily Air Quality Index (AQI) in Jakarta from January 2010 - February 2025 yang diperoleh dari Kaggle. Data ini memiliki beberapa variabel seperti pm25, pm10, so2, co, o3, no2, max, critical, dan categori. Namun, pada projek ini variabel yang digunakan adalah variabel pm25 pada periode Februari 2024 hingga Februari 2025.
Membaca Data
# Data
data <- read.csv("C:/Users/User/OneDrive/Dokumen/SEMESTER 6/Topik Statistika I/Polutan.csv")
# Data Digunakan
datapolutan <- data$pm25[5173:5539]
tabel_polutan <- data.frame(PM25 = datapolutan)
head(tabel_polutan)## PM25
## 1 59
## 2 82
## 3 72
## 4 58
## 5 56
## 6 56
Model ARIMA
Forming Data TS
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 59 82 72 58 56 56
ts.plot(datapolutan.ts, col="blue", ylab = "Konsentrasi PM2.5", xlab = "Hari")
title(main = "Plot Time Seris Konsentrasi PM2.5", sub = "Sumber : Kaggle (https://www.kaggle.com/)",
cex.sub = 0.8)
points(datapolutan.ts, pch = 20, col = "blue")Splitting Data
Sebelum dilakukan analisis lebih lanjut, dilakukan splitting data terlebih dahulu yaitu dengan membagi data menjadi dua bagian bagian pertama yaitu sebagai data training sebesar 80% dan bagian kedua sebagai data testing sebesar 20%.
Uji Stasioner
Uji stasioner dilakukan menggunakan uji Augmented Dickey-Fuller (ADF) dengan hipotesis sebagai berikut: \(H_0\) : Data tidak stasioner
\(H_1\) : Data stasioner
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
##
## Augmented Dickey-Fuller Test
##
## data: train
## Dickey-Fuller = -3.3638, Lag order = 6, p-value = 0.06081
## alternative hypothesis: stationary
Berdasarkan hasil uji ADF diperoleh nilai \(p-value = 0.06081 > 0.05\), maka terima \(H_0\) artinya data tidak stasioner. Sehingga perlu dilakukan differencing data.
Differencing Data
## Warning in adf.test(d1): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: d1
## Dickey-Fuller = -9.9426, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
Setelah dilakukan differencing orde 1, diperoleh nilai \(p-value = 0.01 < 0.05\), maka data sudah stasioner.
Identifikasi Kandidat Model
Berdasarkan plot ACF terlihat bahwa plot cuts off setelah lag
ke-1 sehingga kandidat model yang diperoleh adalah \(ARIMA(0,1,1)\) dan \(ARIMA(0,1,2)\).
Berdasarkan plot PACF di atas kandidat model yang diperoleh adalah \(ARIMA(2,1,0)\).
##
## Attaching package: 'TSA'
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x o o o o o o o o x o o o
## 1 x x o o o o o o o o x o o o
## 2 x x o o o o o o o o o o o o
## 3 x o x x o o o o o o o o o o
## 4 x x x o o o o o o o o o o o
## 5 x x x o o o x o o o o o o o
## 6 x x x o x x x o o o o o o o
## 7 x x o x o x x o o o o o o o
Berdasarkan hasil EACF model yang diperoleh adalah \(ARIMA(2,1,1)\) dan \(ARIMA(2,1,2)\).
Karena data yang digunakan sudah stasioner setelah dilakukan differencing orde 1, maka \(d=1\). Oleh karena itu dapat ditentukan beberapa kandidat model yaitu \(ARIMA(0,1,1)\), \(ARIMA(0,1,2)\), \(ARIMA(2,1,0)\), \(ARIMA(2,1,1)\), dan \(ARIMA(2,1,2)\).
##
## Call:
## arima(x = d1, order = c(0, 0, 1), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ma1 intercept
## -0.5288 -0.0888
## s.e. 0.0680 0.4886
##
## sigma^2 estimated as 311.5: log likelihood = -1252.75, aic = 2509.5
##
## Call:
## arima(x = d1, order = c(0, 0, 2), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ma1 ma2 intercept
## -0.4650 -0.2129 -0.0712
## s.e. 0.0567 0.0620 0.3296
##
## sigma^2 estimated as 299.7: log likelihood = -1247.22, aic = 2500.44
##
## Call:
## arima(x = d1, order = c(2, 0, 0), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 intercept
## -0.3760 -0.2352 -0.0903
## s.e. 0.0569 0.0568 0.6466
##
## sigma^2 estimated as 315.8: log likelihood = -1254.69, aic = 2515.39
##
## Call:
## arima(x = d1, order = c(2, 0, 1), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 ma1 intercept
## 0.4448 0.0682 -0.9099 -0.0498
## s.e. 0.0763 0.0697 0.0484 0.1922
##
## sigma^2 estimated as 295.1: log likelihood = -1245.07, aic = 2498.15
##
## Call:
## arima(x = d1, order = c(2, 0, 2), include.mean = TRUE, method = "ML")
##
## Coefficients:
## Warning in sqrt(diag(x$var.coef)): NaNs produced
## ar1 ar2 ma1 ma2 intercept
## -0.0912 0.2472 -0.359 -0.4636 -0.0541
## s.e. NaN NaN NaN NaN 0.2170
##
## sigma^2 estimated as 296: log likelihood = -1245.48, aic = 2500.97
AICKandidatModel <- c(2509.5, 2500.44, 2515.39, 2498.15, 2500.97)
KndidatModelARIMA <- c("ARIMA(0,1,1)", "ARIMA(0,1,2)","ARIMA(2,1,0)", "ARIMA(2,1,1)", "ARIMA(2,1,2)")
compmodelARIMA <- cbind(KndidatModelARIMA, AICKandidatModel)
colnames(compmodelARIMA) <- c("Kandidat Model", "Nilai AIC")
compmodelARIMA <- as.data.frame(compmodelARIMA)
compmodelARIMA## Kandidat Model Nilai AIC
## 1 ARIMA(0,1,1) 2509.5
## 2 ARIMA(0,1,2) 2500.44
## 3 ARIMA(2,1,0) 2515.39
## 4 ARIMA(2,1,1) 2498.15
## 5 ARIMA(2,1,2) 2500.97
Model terbaik adalah model dengan nilai AIC terkecil yaitu model \(ARIMA(2,1,1)\).
Selanjutnya dilakukan uji hipotesisi pada model terbaik untuk mengetahui signifikansi parameternya, dimana: \(H_0\) : Parameter = 0
\(H_1\) : Parameter \(\ne\) 0
Pengambilan keputusan yaitu tolak \(H_0\) jika \(|t_{hitung}|>t_{\frac{\alpha}{2},(n-1)}\)
t-hitung parameter dugaan model \(ARIMA(2,1,1)\) adalah
\(t_{hitung}=\frac{parameter estimasi}{SE parameter estimasi}\)
## [1] -0.2591051
## [1] 5.82962
## [1] 0.9784792
## [1] -18.79959
prmtrdugaanarima211 <- c("mu", "phi(1)", "phi(2)", "theta(1)")
thitungprmtrarima211 <- c(abs(intercept_mu), abs(ar_phi1), abs(ar_phi2), abs(ma_theta1))
ttabel <- c("2.228139", "2.228139", "2.228139")
keputusan <- c("Tidak Signifikan", "Signifikan", "Tidak Signifikan", "Signifikan")
tksignimodelarima211 <- cbind(prmtrdugaanarima211, thitungprmtrarima211, ttabel, keputusan)## Warning in cbind(prmtrdugaanarima211, thitungprmtrarima211, ttabel, keputusan):
## number of rows of result is not a multiple of vector length (arg 3)
colnames(tksignimodelarima211) <- c("Parameter dugaan", "T-Hitung", "T-Tabel", "Keputusan")
tksignimodelarima211 <- as.data.frame(tksignimodelarima211)
tksignimodelarima211## Parameter dugaan T-Hitung T-Tabel Keputusan
## 1 mu 0.259105098855359 2.228139 Tidak Signifikan
## 2 phi(1) 5.82961992136304 2.228139 Signifikan
## 3 phi(2) 0.978479196556671 2.228139 Tidak Signifikan
## 4 theta(1) 18.7995867768595 2.228139 Signifikan
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.444751 0.076339 5.8260 5.677e-09 ***
## ar2 0.068215 0.069703 0.9787 0.3277
## ma1 -0.909932 0.048409 -18.7967 < 2.2e-16 ***
## intercept -0.049769 0.192201 -0.2589 0.7957
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Berdasarkan tabel di atas maka pengambilan keputusan dari paramater dugaan dari model \(ARIMA(2,1,1)\) diperoleh parameter yang tidak siknifikan adalah \(\mu\) dan \(\phi_2\).
Selanjutnya model ini digunak untuk forecasting
Dibandingkan dengan auto.arima
## Series: train
## ARIMA(2,1,1)
##
## Coefficients:
## ar1 ar2 ma1
## 0.4446 0.0680 -0.9095
## s.e. 0.0766 0.0698 0.0488
##
## sigma^2 = 298.2: log likelihood = -1245.11
## AIC=2498.21 AICc=2498.35 BIC=2512.92
Dibandingkan dengan Arima
## Series: d1
## ARIMA(0,0,1) with non-zero mean
##
## Coefficients:
## ma1 mean
## -0.5287 -0.0866
## s.e. 0.0680 0.4886
##
## sigma^2 = 313.7: log likelihood = -1252.75
## AIC=2511.5 AICc=2511.58 BIC=2522.53
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.02553106 17.65012 13.09682 NaN Inf 0.570706 0.09164382
## Series: d1
## ARIMA(0,0,2) with non-zero mean
##
## Coefficients:
## ma1 ma2 mean
## -0.4649 -0.2128 -0.0712
## s.e. 0.0567 0.0620 0.3298
##
## sigma^2 = 302.9: log likelihood = -1247.22
## AIC=2502.44 AICc=2502.58 BIC=2517.14
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.003162754 17.31311 12.8583 NaN Inf 0.5603122 0.0131159
## Series: d1
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 mean
## -0.3760 -0.2352 -0.0903
## s.e. 0.0569 0.0568 0.6466
##
## sigma^2 = 319.1: log likelihood = -1254.69
## AIC=2517.39 AICc=2517.53 BIC=2532.09
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.02920987 17.77149 13.11283 NaN Inf 0.5714036 -0.02543065
## Series: d1
## ARIMA(2,0,1) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 mean
## 0.4447 0.0681 -0.9099 -0.0497
## s.e. 0.0764 0.0697 0.0485 0.1923
##
## sigma^2 = 299.2: log likelihood = -1245.07
## AIC=2500.15 AICc=2500.36 BIC=2518.53
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.02489927 17.1785 12.87801 NaN Inf 0.5611713 -0.005565121
## Series: d1
## ARIMA(2,0,2) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 mean
## 1.0700 -0.2082 -1.5447 0.5658 -0.0427
## s.e. 0.3395 0.1952 0.3270 0.3017 0.1616
##
## sigma^2 = 298.8: log likelihood = -1244.39
## AIC=2500.78 AICc=2501.07 BIC=2522.84
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.1013553 17.13665 12.79929 NaN Inf 0.5577407 0.002194455
Model Terbaik
AICKandidatModel <- c(model1$aic, model2$aic, model3$aic, model4$aic, model5$aic)
AICcKandidatModel <- c(model1$aicc, model2$aicc, model3$aicc, model4$aicc, model5$aicc)
BICKandidatModel <- c(model1$bic, model2$bic, model3$bic, model4$bic, model5$bic)
KandidatModelARIMA <- c("ARIMA(0,1,1)", "ARIMA(0,1,2)", "ARIMA(2,1,0)", "ARIMA(2,1,1)", "ARIMA(2,1,2)")
compmodelARIMA <- cbind(KandidatModelARIMA, AICKandidatModel, AICcKandidatModel, BICKandidatModel)
colnames(compmodelARIMA) <- c("Kandidat Model", "Nilai AIC", "Nilai AICc", "Nilai BIC")
compmodelARIMA <- as.data.frame(compmodelARIMA)
compmodelARIMA## Kandidat Model Nilai AIC Nilai AICc Nilai BIC
## 1 ARIMA(0,1,1) 2511.5015654662 2511.58489879954 2522.53182687301
## 2 ARIMA(0,1,2) 2502.43667921045 2502.57605203275 2517.14369441952
## 3 ARIMA(2,1,0) 2517.38685295474 2517.52622577704 2532.09386816382
## 4 ARIMA(2,1,1) 2500.14775860858 2500.35754881837 2518.53152761992
## 5 ARIMA(2,1,2) 2500.77791857857 2501.07265542067 2522.83844139218
Berdasarkan hasil analisis menggunakan fungsi arima, auto.arima, dan Arima, ketiga fungsi ini menghasilkan model terbaik yang sama yaitu \(ARIMA(2,1,1)\). Oleh karena itu, model yang disarankan untuk dianalisis lebih lanjut yaitu dengan menggunakan model \(ARIMA(2,1,1)\) yang mana ini adalah model terbaik (dengan nilai AIC terkecil) dari kandidat model berdasarkan ACF, PACF, dan EACF.
Maka diperoleh \(Y_t\) dari penjabaran operator backshift dari model \(ARIMA(2,1,1)\) adalah \[ \scriptstyle \begin{aligned} \phi_p(B)(1-B)^dY_t &= \mu+\theta_q(B)e_t \\ \phi_2(B)(1-B)^1Y_t &= \mu+\theta_1(B)e_t \\ (1-\phi_1B-\phi_2B^2)(1-B)Y_t &= \mu+(1-\theta_1B)e_t \\ Y_t-(1+\phi_1)Y_{t-1}-(\phi_1+\phi_2)Y_{t-2}-\phi_2Y_{t-3} &= \mu+e_t-\theta_1e_{t-1}\\ Y_t &= \mu +e_t+(1+\phi_1)Y_{t-1}+(\phi_1+\phi_2)Y_{t-2}+\phi_2Y_{t-3}-\theta_1e_{t-1}\\ \end{aligned} \]
Penduga parameter \(\hat{\mu}=-0.0498\), \(\hat{\phi_1}=0.4448\), \(\hat{\phi_2}=0.0682\), \(\hat{\theta_1}=-0.9099\), dengan nilai dugaan \(\sigma_e^2\) sebesar 295.1. Sehingga model yang diperoleh adalah \[ \begin{aligned} Y_t &= \mu +e_t+(1+\phi_1)Y_{t-1}+(\phi_1+\phi_2)Y_{t-2}+\phi_2Y_{t-3}-\theta_1e_{t-1}\\ Y_t &= -0.0498+1.4448Y_{t-1}+0.513Y_{t-2}+0.0682Y_{t-3}+0.9099e_{t-1}+e_t\\ \end{aligned} \]
Diagnostic Model
library(forecast)
ARIMA211diag <- stats::arima(d1, order = c(2,0,1), method = "ML")
checkresiduals(ARIMA211diag)##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,0,1) with non-zero mean
## Q* = 9.3503, df = 7, p-value = 0.2285
##
## Model df: 3. Total lags used: 10
Berdasarkan plot di atas terlihat bahwa sisaan tidak mengikuti sebaran normal. Selanjutnya, ditinjau dari plot ACF terlihat bahwa tidak ada lag yang signifikan. Hal tersebut menunjukkan bahwa tidak ada gejala autokorelasi pada sisaan. Selanjutnya, untuk memastikan kembali akan dilakukan uji asumsi secara formal:
Uji Formal Normalitas Data
\(H_0\) : sisaan menyebar normal
\(H_1\) : sisaan tidak menyebar normal
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: sisaan
## D = 0.46692, p-value < 2.2e-16
## alternative hypothesis: two-sided
Berdasarkan hasil uji formal normalitas diperoleh nilai \(p-value=2.2\times10^{-16}<0.05\), maka tolak \(H_0\) artinya sisaan tidak menyebar normal.
Uji Nilai Tengah Sisaan
\(H_0\) : \(\mu=0\)
\(H_1\) : \(\mu\neq0\)
##
## One Sample t-test
##
## data: sisaan
## t = 0.025276, df = 291, p-value = 0.9799
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -1.956513 2.007419
## sample estimates:
## mean of x
## 0.02545338
Berdasarkan hasil uji nilai tengah sisaan diperoleh nilai \(p-value=0.9799>0.05\), maka terima \(H_0\) artinya nilai tengah sisaan sama dengan 0.
Uji Autokorelasi
\(H_0\) : tidak ada autokorelasi
\(H_1\) : terdapat autokorelasi
##
## Box-Ljung test
##
## data: sisaan
## X-squared = 24.663, df = 23, p-value = 0.3679
Berdasarkan hasil uji autokorelasi diperoleh nilai \(p-value=0.3679>0.05\), maka terima \(H_0\) artinya tidak ada gejala autokorelasi.
Forecasting
Selanjutnya dilakukan forecasting hasil dari pemodelan dengan menggunakan data training, yaitu sebagai berikut:
Fitted model \(ARIMA(2,1,1)\) dengan data training
Visualisasi hasil fitted model \(ARIMA(2,1,1)\) dengan data training
ts.plot(d1, col="blue", ylab = "Konsentrasi Polutan", xlab = "Hari")
title(main = "Fitted Time Series Plot", cex.sub = 0.8)
points(d1, pch = 20, col = "blue")
par(col="red")
lines(predictARIMA211)forecasting <- predict(arima211, n.ahead = 5)
hasilforcasting <- as.data.frame(forecasting)
hasilforcasting## pred se
## 1 6.5838091 17.17849
## 2 2.1535487 18.94619
## 3 1.3826724 19.09537
## 4 0.7376117 19.16267
## 5 0.3981344 19.18269
Visualisasi hasil forecasting model \(ARIMA(2,1,1)\) 5 waktu ke depan
## Warning in plot.window(...): "n.ahead" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "n.ahead" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter
## Warning in box(...): "n.ahead" is not a graphical parameter
## Warning in title(...): "n.ahead" is not a graphical parameter
## Warning in plot.window(...): "n.ahead" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "n.ahead" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "n.ahead" is not a
## graphical parameter
## Warning in box(...): "n.ahead" is not a graphical parameter
## Warning in title(...): "n.ahead" is not a graphical parameter
title(main = "Fitted Time Series Plot", cex.sub = 0.8)
points(d1, pch = 20, col = "blue")
par(col="red")
lines(predictARIMA211)
Fitted model \(ARIMA(2,1,1)\)
dengan data testing
##
## Call:
## arima(x = test, order = c(2, 1, 1), include.mean = TRUE, method = "ML")
##
## Coefficients:
## ar1 ar2 ma1
## 0.2034 0.1151 -0.9188
## s.e. 0.1386 0.1415 0.0745
##
## sigma^2 estimated as 207.4: log likelihood = -294.84, aic = 595.68
Visualisasi hasil fitted model \(ARIMA(2,1,1)\) dengan data testing
ts.plot(test, col="blue", ylab = "Konsentrasi Polutan", xlab = "Hari")
title(main = "Fitted Time Series Plot", cex.sub = 0.8)
points(test, pch = 20, col = "blue")
par(col="red")
lines(predictARIMA211test)Akurasi Model dalam Forecasting
Hasil forecasting model \(ARIMA(2,1,1)\) dibandingkan dengan data aktual.
## ME RMSE MAE MPE MAPE MASE
## Training set 0.02489927 17.17850 12.87801 NaN Inf 0.5611713
## Test set 64.33627763 66.09364 64.33628 99.77854 99.77854 2.8035125
## ACF1
## Training set -0.005565121
## Test set NA
Model SARIMA
Forming Data TS
## [1] 367
ts.plot(data2.ts, type="l", ylab="Konsentrasi PM2.5", xlab="Hari", col="blue")
title(main = "Plot Time Series Konsentrasi PM2.5", cex.sub = 0.8)
points(data2.ts, pch = 20, col = "blue")library(forecast)
seasonplot(data2.ts,7,main="Plot Musiman Konsentrasi PM2.5", ylab="Konsentrasi PM2.5",year.labels = TRUE,
col=rainbow(18))Splitting Data
Sebelum dilakukan analisis lebih lanjut, dilakukan splitting data terlebih dahulu yaitu dengan membagi data menjadi dua bagian bagian pertama yaitu sebagai data training sebesar 80% dan bagian kedua sebagai data testing sebesar 20%.
ARIMA Non-Seasonal
Identifikasi Kestasioneran Data
acf0$lag <- acf0$lag * 7
acf0.1 <- as.data.frame(cbind(acf0$acf,acf0$lag))
acf0.2 <- acf0.1[which(acf0.1$V2%%7==0),]
barplot(height = acf0.2$V1,
names.arg=acf0.2$V2, ylab="ACF", xlab="Lag")
Plot ACF menunjukkan data Konsentrasi PM2.5 tidak stasioner sehingga
perlu dilakukan proses differencing.
Differencing non-seasonal \(d=1\) jika dilihat berdasarkan plot di atas
berhasil mengatasi ketidakstasioneran dalam rataan untuk komponen
non-seasonal.
Identifikasi Kestasioneran Data setelah Differencing
acf2 <- acf1$lag <- acf1$lag * 7
acf1.1 <- as.data.frame(cbind(acf1$acf,acf1$lag))
acf1.2 <- acf1.1[which(acf1.1$V2%%7==0),]
barplot(height = acf1.2$V1, names.arg=acf1.2$V2, ylab="ACF", xlab="Lag")
Berdasarkan plot ACF data non-seasonal differencing \(d=1\) dapat dilihat pada series
non-seasonal sudah stasioner.
ARIMA Seasonal
Identifikasi Kandidat Model
Berdasarkan plot di atas diperoleh komponen Seasonal adalah
\(ARIMA(0,1,1)_7\)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x o o o o x x x x o x o o o
## 1 x x o o o o x x x o x o o o
## 2 x x x o o o x x x x o o o o
## 3 x x o o o o x o x x o o o o
## 4 x o o o o o x o x o o o o o
## 5 x x o o o o x x x o o o o o
## 6 x o o x x x x x o o o o o o
## 7 x x o x o x x x o o x o o o
Berdasarkan plot EACF identifikasi komponen non-seasonal yaitu \(ARIMA(1,1,0)\), \(ARIMA(2,1,1)\), dan \(ARIMA(3,1,2)\). Identifikasi komonen seasonal adalah \(ARIMA(0,1,1)_7\), sehingga model yang diperoleh adalah
- \(ARIMA(1,1,0)\times(0,1,1)_7\)
- \(ARIMA(2,1,1)\times(0,1,1)_7\)
- \(ARIMA(3,1,2)\times(0,1,1)_7\)
## Series: train2
## ARIMA(1,1,0)(0,1,1)[7]
##
## Coefficients:
## ar1 sma1
## -0.3101 -1.000
## s.e. 0.0564 0.049
##
## sigma^2 = 335.9: log likelihood = -1245.41
## AIC=2496.82 AICc=2496.91 BIC=2507.78
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.2926647 18.01288 13.1161 -4.962256 18.3486 0.6243684
## ACF1
## Training set -0.07197137
## Series: train2
## ARIMA(2,1,1)(0,1,1)[7]
##
## Coefficients:
## ar1 ar2 ma1 sma1
## 0.4319 0.0703 -0.8971 -1.0000
## s.e. 0.0797 0.0715 0.0548 0.0936
##
## sigma^2 = 300.4: log likelihood = -1229.34
## AIC=2468.69 AICc=2468.9 BIC=2486.95
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -1.326505 16.97228 12.59701 -7.860175 18.83492 0.599658
## ACF1
## Training set -0.01314608
## Series: train2
## ARIMA(3,1,2)(0,1,1)[7]
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 sma1
## -0.4666 0.4580 0.0605 0.0016 -0.8050 -1.0000
## s.e. 0.6100 0.2832 0.0927 0.6053 0.5499 0.0926
##
## sigma^2 = 302.5: log likelihood = -1229.34
## AIC=2472.68 AICc=2473.09 BIC=2498.25
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -1.323734 16.97168 12.59785 -7.856358 18.83684 0.5996981
## ACF1
## Training set -0.01307604
Model Terbaik
AICKandidatModel <- c(tmodel1$aic, tmodel2$aic, tmodel3$aic)
AICcKandidatModel <- c(tmodel1$aicc, tmodel2$aicc, tmodel3$aicc)
BICKandidatModel <- c(tmodel1$bic, tmodel2$bic, tmodel3$bic)
KandidatModelARIMA <- c("ARIMA(1,1,0)(0,1,1)7", "ARIMA(2,1,1)(0,1,1)7", "ARIMA(3,1,2)(0,1,1)7")
compmodelARIMA <- cbind(KandidatModelARIMA, AICKandidatModel, AICcKandidatModel, BICKandidatModel)
colnames(compmodelARIMA) <- c("Kandidat Model", "Nilai AIC", "Nilai AICc", "Nilai BIC")
compmodelARIMA <- as.data.frame(compmodelARIMA)
compmodelARIMA## Kandidat Model Nilai AIC Nilai AICc Nilai BIC
## 1 ARIMA(1,1,0)(0,1,1)7 2496.8213353872 2496.90674463987 2507.77880292801
## 2 ARIMA(2,1,1)(0,1,1)7 2468.68915059246 2468.9042043559 2486.95159649381
## 3 ARIMA(3,1,2)(0,1,1)7 2472.68192235584 2473.08625448581 2498.24934661772
Model terbaik diperoleh berdasarkan nilai AIC terkecil dari kandidat model. Oleh karena itu, model terbaik yang diperoleh adalah \(ARIMA(2,1,1)\times(0,1,1)_7\).
Bandingkan dengan Fungsi auto.arima
## Series: train2
## ARIMA(2,1,1)
##
## Coefficients:
## ar1 ar2 ma1
## 0.4446 0.0680 -0.9095
## s.e. 0.0766 0.0698 0.0488
##
## sigma^2 = 298.2: log likelihood = -1245.11
## AIC=2498.21 AICc=2498.35 BIC=2512.92
Model terbaik dari fungsi auto.arima adalah \(ARIMA(2,1,1)\times(0,0,1)_7\) dengan nilai AIC sebesar 2498.21. Jika dibandingkan dengan model terbaik dari fungsi Arima yaitu \(ARIMA(2,1,1)\times(0,1,1)_7\) dengan nilai AIC sebesar 2477.71. Oleh karena itu, model yang disarankan untuk dianalisis lebih lanjut yaitu dengan menggunakan model \(ARIMA(2,1,1)\times(0,1,1)_7\) yang mana ini adalah model terbaik (dengan nilai AIC terkecil) dari kandidat model berdasarkan ACF, PACF, dan EACF.
Maka diperoleh \(X_t\) dari penjabaran operator backshift dari model \(ARIMA(2,1,1)\times(0,1,1)_7\) adalah \[ \scriptstyle \begin{aligned} \phi_p(B)\Phi_P(1-B)^d(1-B^s)^DX_t = \theta_q(B)\Theta_Q(B^s)e_t \\ \phi_2(B)\Phi_0(B^7)(1-B)^1(1-B^7)^1X_t = \theta_1(B)\Theta_1(B^7)e_t \\ (1-\phi_1B-\phi_2B^2)(1-B)(1-B^7)X_t = (1-\theta_1B)(1-\Theta_1B^7)e_t \\ X_t-(1+\phi_1)X_{t-1}-(\phi_1+\phi_2)X_{t-2}+\phi_2X_{t-3}-X_{t-7}+(1+\phi_1)X_{t-8}-(\phi_1-\phi_2)X_{t-9}-\phi_2X_{t-10} = e_t+\Theta_1e_{t-7}-\theta_1e_{t-1}+\theta_1\Theta_1e_{t-8}\\ Y_t = e_t+(1+\phi_1)X_{t-1}+(\phi_1+\phi_2)X_{t-2}-\phi_2X_{t-3}+X_{t-7}-(1+\phi_1)X_{t-8}+(\phi_1-\phi_2)X_{t-9}+\phi_2X_{t-10}+\Theta_1e_{t-7}-\theta_1e_{t-1}+\theta_1\Theta_1e_{t-8}\\ \end{aligned} \]
Pengujian Parameter
printstatarima <- function (x, digits = 4,se=TRUE){
if (length(x$coef) > 0) {
cat("\nCoefficients:\n")
coef <- round(x$coef, digits = digits)
if (se && nrow(x$var.coef)) {
ses <- rep(0, length(coef))
ses[x$mask] <- round(sqrt(diag(x$var.coef)), digits = digits)
coef <- matrix(coef, 1, dimnames = list(NULL, names(coef)))
coef <- rbind(coef, s.e. = ses)
statt <- coef[1,]/ses
pval <- 2*pt(abs(statt), df=length(x$residuals)-1, lower.tail = FALSE)
coef <- rbind(coef, t=round(statt,digits=digits),sign.=round(pval,digits=digits))
coef <- t(coef)
}
print.default(coef, print.gap = 2)
}
}##
## Coefficients:
## s.e. t sign.
## ar1 0.4319 0.0797 5.4191 0.0000
## ar2 0.0703 0.0715 0.9832 0.3263
## ma1 -0.8971 0.0548 -16.3704 0.0000
## sma1 -1.0000 0.0936 -10.6838 0.0000
Berdasarkan hasil pengujian parameter pada model terbaik, diperoleh semua dugaan parameter model berpengaruh nyata kecuali \(\phi_2\).
Penduga parameter \(\hat{\phi_1}=0.4358\), \(\hat{\phi_2}=0.0784\), \(\hat{\theta_1}=-0.9052\), \(\hat{\Theta}=-1\) dengan nilai dugaan \(\sigma_e^2\) sebesar 300.7. Sehingga model yang diperoleh adalah \[ \scriptstyle \begin{aligned} Y_t = e_t+(1+\phi_1)X_{t-1}+(\phi_1+\phi_2)X_{t-2}-\phi_2X_{t-3}+X_{t-7}-(1+\phi_1)X_{t-8}+(\phi_1-\phi_2)X_{t-9}+\phi_2X_{t-10}+\Theta_1e_{t-7}-\theta_1e_{t-1}+\theta_1\Theta_1e_{t-8}\\ Y_t = e_t+1.4358X_{t-1}+0.5142X_{t-2}-0.0784X_{t-3}+X_{t-7}-1.4358X_{t-8}+0.3574X_{t-9}+0.0784X_{t-10}-e_{t-7}+0.9052e_{t-1}+0.9052e_{t-8}\\ \end{aligned} \]
Diagnostic Model
## lags statistic df p-value
## 5 1.497849 1 0.2210027
## 10 9.445425 6 0.1500336
## 15 17.049665 11 0.1064151
## 20 21.232141 16 0.1697589
## 25 26.310705 21 0.1948488
## 30 29.223645 26 0.3010104
Berdasarkan hasil uji LjungBox di atas terdapat autokorelasi pada sisaan, karena nilai \(p−value\) semua lag tidak signifikan atau \(p−value>\alpha=0.05\).
Selanjutnya dilakukan uji asumsi formal terhadap kenormalan sisan dengan menggunakan uji Jarque Bera.
##
## Jarque Bera Test
##
## data: residuals(tmodel2)
## X-squared = 37.477, df = 2, p-value = 7.277e-09
Berdasarkan hasil uji kenormalan dengan uji Jarque Bera sisaan tidak menyebar normal, karena nilai \(p−value=2.107\times10^{-8}<0.05\).
Forecasting
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 42.85714 45.66976 23.18881 68.15071 11.288114 80.05140
## 43.00000 44.69652 19.20249 70.19056 5.706766 83.68628
## 43.14286 45.43013 18.36490 72.49536 4.037430 86.82283
## 43.28571 48.28338 20.30633 76.26043 5.496177 91.07059
## 43.42857 45.14856 16.53263 73.76450 1.384266 88.91286
## ME RMSE MAE MPE MAPE MASE
## Training set -1.326505 16.97228 12.59701 -7.860175 18.83492 0.5996580
## Test set 18.835348 24.41990 20.36925 24.538964 29.51477 0.9696413
## ACF1 Theil's U
## Training set -0.01314608 NA
## Test set 0.30974745 1.220926
forecast_arima2 <- cbind(ramalan_arima2$mean,ramalan_arima2$lower,ramalan_arima2$upper)
forecast_arima2## Time Series:
## Start = c(42, 7)
## End = c(53, 3)
## Frequency = 7
## ramalan_arima2$mean ramalan_arima2$lower.80% ramalan_arima2$lower.95%
## 42.85714 45.66976 23.1888092 11.2881138
## 43.00000 44.69652 19.2024917 5.7067664
## 43.14286 45.43013 18.3648984 4.0374297
## 43.28571 48.28338 20.3063328 5.4961775
## 43.42857 45.14856 16.5326268 1.3842663
## 43.57143 43.21001 14.0871235 -1.3296039
## 43.71429 48.54710 18.9850859 3.3359008
## 43.85714 50.13570 20.0796194 4.1688934
## 44.00000 47.25349 16.7718725 0.6358765
## 44.14286 46.66455 15.7867497 -0.5589681
## 44.28571 48.81233 17.5556925 1.0094275
## 44.42857 45.27980 13.6552630 -3.0857567
## 44.57143 43.11987 11.1349667 -5.7968183
## 44.71429 48.33337 15.9935513 -1.1261133
## 44.85714 49.85302 17.0799418 -0.2690755
## 45.00000 46.93234 13.7700810 -3.7849603
## 45.14286 46.32193 12.7901530 -4.9604973
## 45.28571 48.45774 14.5673618 -3.3731202
## 45.42857 44.91853 10.6767661 -7.4497275
## 45.57143 42.75487 8.1669483 -10.1427900
## 45.71429 47.96628 13.0360892 -5.4548393
## 45.85714 49.48477 14.1370945 -4.5748357
## 46.00000 46.56345 10.8385626 -8.0730517
## 46.14286 45.95268 9.8704020 -9.2304007
## 46.28571 48.08828 11.6585882 -7.6261283
## 46.42857 44.54896 7.7783437 -11.6868455
## 46.57143 42.38524 5.2783201 -14.3648956
## 46.71429 47.59662 10.1567089 -9.6627830
## 46.85714 49.11509 11.2699329 -8.7640821
## 47.00000 46.19376 7.9807415 -12.2480073
## 47.14286 45.58298 7.0222767 -13.3905247
## 47.28571 47.71858 8.8194698 -11.7724757
## 47.42857 44.17926 4.9477005 -15.8202299
## 47.57143 42.01553 2.4556895 -18.4860254
## 47.71429 47.22691 7.3416400 -13.7723479
## 47.85714 48.74538 8.4649442 -12.8582306
## 48.00000 45.82405 5.1834035 -16.3304547
## 48.14286 45.21327 4.2329632 -17.4606997
## 48.28571 47.34887 6.0376170 -15.8312405
## 48.42857 43.80955 2.1728748 -19.8682479
## 48.57143 41.64582 -0.3124877 -22.5238753
## 48.71429 46.85720 4.5797392 -17.8005988
## 48.85714 48.37567 5.7114906 -16.8735625
## 49.00000 45.45434 2.4363136 -20.3360543
## 49.14286 44.84356 1.4926162 -21.4559869
## 49.28571 46.97916 3.3035426 -19.8169346
## 49.42857 43.43984 -0.5552878 -23.8449006
## 49.57143 41.27611 -3.0350541 -26.4919695
## 49.71429 46.48750 1.8624556 -21.7606136
## 49.85714 48.00596 3.0013828 -20.8226029
## 50.00000 45.08463 -0.2684307 -24.2768919
## 50.14286 44.47385 -1.2063885 -25.3880465
## 50.28571 46.60946 0.6098787 -23.7408262
## 50.42857 43.07013 -3.2439164 -27.7610913
## 50.57143 40.90641 -5.7189153 -30.4008690
## 50.71429 46.11779 -0.8169063 -25.6626321
## 50.85714 47.63625 0.3281871 -24.7151911
## 51.00000 44.71492 -2.9370550 -28.1624887
## 51.14286 44.10414 -3.8700732 -29.2660888
## 51.28571 46.23975 -2.0492095 -27.6118388
## 51.42857 42.70042 -5.8986705 -31.6254754
## 51.57143 40.53670 -8.3695659 -34.2589775
## 51.71429 45.74808 -3.4636862 -29.5148207
## 51.85714 47.26655 -2.3132416 -28.5591958
## 52.00000 44.34521 -5.5745496 -32.0004768
## 52.14286 43.73443 -6.5032764 -33.0975138
## 52.28571 45.87004 -4.6784201 -31.4371575
## 52.42857 42.33071 -8.5241165 -35.4450367
## 52.57143 40.16699 -10.9914483 -38.0730887
## 52.71429 45.37837 -6.0822093 -33.3237943
## 52.85714 46.89684 -4.9270810 -32.3610062
## 53.00000 43.97551 -8.1849751 -35.7970661
## 53.14286 43.36473 -9.1099429 -36.8883542
## 53.28571 45.50033 -7.2815903 -35.2226509
## ramalan_arima2$upper.80% ramalan_arima2$upper.95%
## 42.85714 68.15071 80.05140
## 43.00000 70.19056 83.68628
## 43.14286 72.49536 86.82283
## 43.28571 76.26043 91.07059
## 43.42857 73.76450 88.91286
## 43.57143 72.33291 87.74963
## 43.71429 78.10911 93.75830
## 43.85714 80.19177 96.10250
## 44.00000 77.73512 93.87111
## 44.14286 77.54234 93.88806
## 44.28571 80.06897 96.61523
## 44.42857 76.90434 93.64536
## 44.57143 75.10477 92.03656
## 44.71429 80.67318 97.79285
## 44.85714 82.62609 99.97511
## 45.00000 80.09461 97.64965
## 45.14286 79.85371 97.60436
## 45.28571 82.34812 100.28860
## 45.42857 79.16029 97.28678
## 45.57143 77.34279 95.65252
## 45.71429 82.89648 101.38741
## 45.85714 84.83245 103.54438
## 46.00000 82.28834 101.19996
## 46.14286 82.03495 101.13576
## 46.28571 84.51798 103.80270
## 46.42857 81.31958 100.78477
## 46.57143 79.49216 99.13537
## 46.71429 85.03653 104.85602
## 46.85714 86.96024 106.99426
## 47.00000 84.40677 104.63552
## 47.14286 84.14368 104.55648
## 47.28571 86.61769 107.20964
## 47.42857 83.41081 104.17874
## 47.57143 81.57537 102.51709
## 47.71429 87.11219 108.22617
## 47.85714 89.02582 110.34899
## 48.00000 86.46469 107.97855
## 48.14286 86.19357 107.88724
## 48.28571 88.66013 110.52898
## 48.42857 85.44622 107.48734
## 48.57143 83.60413 105.81552
## 48.71429 89.13467 111.51501
## 48.85714 91.03985 113.62491
## 49.00000 88.47237 111.24474
## 49.14286 88.19450 111.14311
## 49.28571 90.65478 113.77526
## 49.42857 87.43496 110.72458
## 49.57143 85.58728 109.04420
## 49.71429 91.11254 114.73560
## 49.85714 93.01054 116.83453
## 50.00000 90.43769 114.44616
## 50.14286 90.15409 114.33575
## 50.28571 92.60903 116.95974
## 50.42857 89.38418 113.90135
## 50.57143 87.53173 112.21368
## 50.71429 93.05248 117.89821
## 50.85714 94.94432 119.98770
## 51.00000 92.36690 117.59234
## 51.14286 92.07836 117.47437
## 51.28571 94.52870 120.09133
## 51.42857 91.29951 117.02632
## 51.57143 89.44296 115.33237
## 51.71429 94.95984 121.01098
## 51.85714 96.84633 123.09229
## 52.00000 94.26498 120.69091
## 52.14286 93.97214 120.56638
## 52.28571 96.41850 123.17723
## 52.42857 93.18554 120.10646
## 52.57143 91.32542 118.40706
## 52.71429 96.83895 124.08053
## 52.85714 98.72076 126.15468
## 53.00000 96.13599 123.74808
## 53.14286 95.83939 123.61780
## 53.28571 98.28225 126.22331