library(fpp2)
library(ggplot2)
library(tseries)
library(openxlsx)
library(tidyverse)
library(kableExtra)
a.
A series is white noise if about 95% of the spikes in the ACF lie within ±2/√T, where T is the length of the time series.
Therefore, as T gets larger, the band between the blue dashed lines around zero (the mean) gets narrower. In the first, second and third series a few spikes come close to, or touch, the blue dashed lines (the 95% bounds), but since the autocorrelations for all the series lie within the 95% range, each series is consistent with white noise.
b.
The critical values are at ±2/√T, so they depend on the length of the series. As the number of observations increases, the sampling variability of the autocorrelations decreases and the bounds shrink towards zero. Since each series has a different length T, the critical values sit at different distances from the mean of zero, even though all three series are white noise.
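To make this concrete, we can compute the bounds directly. This is a small sketch that assumes the three series contain 36, 360 and 1,000 observations, as in the exercise figure:
# 95% bounds for the ACF of a white noise series: +/- 2/sqrt(T)
T <- c(36, 360, 1000)   # assumed series lengths
round(2 / sqrt(T), 3)   # approximately 0.333, 0.105 and 0.063
The longer the series, the tighter the bounds around zero.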
Let’s plot the dataset and also ACF and PACF plots:
ggtsdisplay(ibmclose)
As we see from the above graph, there is an increasing trend followed by a downward trend. Time series data with a trend (or seasonality) is not stationary.
In the ACF and PACF plots, the autocorrelations are well beyond the 95% bounds, which clearly shows that the data is not white noise.
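As an optional numeric check (not run in the original output), ndiffs() reports how many first differences would be needed to remove the trend:
ndiffs(ibmclose) # number of differences needed to make ibmclose stationary (output not shown)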
Before beginning, recall that a time series is stationary if the following conditions hold:
The mean of the series is constant over time, i.e. there is no trend.
The variance does not increase over time.
There is no (or only a negligible) seasonal pattern.
a.
Let’s plot the series usnetelec
autoplot(usnetelec, main = "US Net Electricity Generation") +
theme(axis.title = element_blank())
As we see from the above graph, there is a linear trend. Let’s see if differencing it can make it stationary.
Since usnetelec is non-seasonal (annual) data, no seasonal differencing is required.
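As a quick sanity check (not part of the original output), the frequency of the series confirms that it is annual:
frequency(usnetelec) # annual data, so this should return 1 and nsdiffs() does not apply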
Now, let's see how many differences are needed to make it stationary:
# Make it stationary
ndiffs(usnetelec) # number of differences needed to make it stationary
## [1] 1
According to the output above, at least one difference is needed to make the data stationary.
Differencing the series twice makes it stationary:
stationaryTS <- diff(usnetelec, differences= 2)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the linear trend has been removed and the data now looks roughly stationary.
To confirm this, let's run an Augmented Dickey-Fuller (ADF) test and a KPSS test. In adf.test(), a p-value below 0.05 indicates that the series is stationary; in kpss.test(), the null hypothesis is stationarity, so a p-value above 0.05 is consistent with a stationary series.
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -5.5409, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.098532, Truncation lag parameter = 3, p-value = 0.1
As we can see, the ADF p-value is below 0.05 and the KPSS p-value is above 0.05, which confirms that the differenced data is stationary.
usnetelec_lambda <- BoxCox.lambda(usnetelec)
bc_usnetelec <- BoxCox(usnetelec, lambda = usnetelec_lambda)
print(usnetelec_lambda)
## [1] 0.5167714
plot(diff(bc_usnetelec), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of 0.5167714, the transformed series can then be differenced to make it stationary.
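A compact way to combine the transformation with the differencing check is to pipe the Box-Cox transformed series into ndiffs(). This is a sketch using the lambda computed above and the pipe from the tidyverse loaded earlier (output not shown in the original document):
usnetelec %>%
  BoxCox(lambda = usnetelec_lambda) %>%
  ndiffs() # number of differences needed after the Box-Cox transformation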
b.
Let’s plot the series usgdp
autoplot(usgdp, main = "Quarterly US Gross Domestic Product") +
theme(axis.title = element_blank())
As we see from the above graph, there is an increasing trend. Let's see if differencing can make it stationary.
Since usgdp shows no seasonal pattern, no seasonal differencing is required.
Now, let's see how many differences are needed to make it stationary:
# Make it stationary
ndiffs(usgdp) # number of differences needed to make it stationary
## [1] 2
According to the output above, two differences are needed to make the data stationary.
Differencing the series twice makes it stationary:
stationaryTS <- diff(usgdp, differences= 2)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests again (a p-value below 0.05 in adf.test() indicates stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -7.8593, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.020988, Truncation lag parameter = 4, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
usgdp_lambda <- BoxCox.lambda(usgdp)
bc_usgdp <- BoxCox(usgdp, lambda = usgdp_lambda)
print(usgdp_lambda)
## [1] 0.366352
plot(diff(bc_usgdp), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of 0.366352, the transformed series can then be differenced to make it stationary.
c.
Let’s plot the series mcopper
autoplot(mcopper, main = "Monthly Grade A Copper Prices") +
theme(axis.title = element_blank())
As we see from the above graph, there is a gradually increasing trend. Let's see if differencing can make it stationary.
Since mcopper shows no strong seasonal pattern, no seasonal differencing is required.
Now, let's see how many differences are needed to make it stationary:
# Make it stationary
ndiffs(mcopper) # number of differences needed to make it stationary
## [1] 1
According to the output above, at least one difference is needed to make the data stationary.
Differencing the series twice makes it stationary:
stationaryTS <- diff(mcopper, differences = 2)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests again (a p-value below 0.05 in adf.test() indicates stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -7.8593, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.020988, Truncation lag parameter = 4, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
mcopper_lambda <- BoxCox.lambda(mcopper)
bc_mcopper <- BoxCox(mcopper, lambda = mcopper_lambda)
print(mcopper_lambda)
## [1] 0.1919047
plot(diff(bc_mcopper), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of 0.1919047, the transformed series can then be differenced to make it stationary.
d.
Let’s plot the series enplanements
autoplot(enplanements, main = "Monthly US Domestic Enplanements") +
theme(axis.title = element_blank())
As we see from the above graph, there is an increasing trend and the series is clearly seasonal. Let's see if differencing can make it stationary.
Now, let's see how many seasonal differences are needed:
# Make it stationary
nsdiffs(enplanements) # number of seasonal differences needed
## [1] 1
According to the output above, one seasonal difference is needed.
enplanements_seasdiff <- diff(enplanements, lag=frequency(enplanements), differences=1) # seasonal differencing
plot(enplanements_seasdiff, type="l", main="Seasonally Differenced") # still not stationary!
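To check whether any further (non-seasonal) differencing is needed after the seasonal difference, we can run ndiffs() on the seasonally differenced series (a quick check; output not shown here):
ndiffs(enplanements_seasdiff) # additional first differences needed after seasonal differencing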
Now that the seasonal pattern has been removed, let's take a further first difference to make the series stationary:
stationaryTS <- diff(enplanements_seasdiff, differences= 1)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests again (a p-value below 0.05 in adf.test() indicates stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -8.215, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.062128, Truncation lag parameter = 5, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
enplanements_lambda <- BoxCox.lambda(enplanements)
bc_enplanements <- BoxCox(enplanements, lambda = enplanements_lambda)
print(enplanements_lambda)
## [1] -0.2269461
plot(diff(bc_enplanements), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of -0.2269461, the transformed series can then be differenced to make it stationary.
e.
Let’s plot the series visitors
autoplot(visitors, main = "Monthly Australian Overseas Visitors") +
theme(axis.title = element_blank())
As we see from the above graph, there is an increasing trend and the series is clearly seasonal. Let's see if differencing can make it stationary.
Now, let's see how many seasonal differences are needed:
# Make it stationary
nsdiffs(visitors) # number of seasonal differences needed
## [1] 1
According to the output above, one seasonal difference is needed.
visitors_seasdiff <- diff(visitors, lag=frequency(visitors), differences=1) # seasonal differencing
plot(visitors_seasdiff, type="l", main="Seasonally Differenced") # still not stationary!
Now that the seasonal pattern has been removed, let's take a further first difference to make the series stationary:
stationaryTS <- diff(visitors_seasdiff, differences= 1)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests again (a p-value below 0.05 in adf.test() indicates stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -8.6129, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.01541, Truncation lag parameter = 4, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
visitors_lambda <- BoxCox.lambda(visitors)
bc_visitors <- BoxCox(visitors, lambda = visitors_lambda)
print(visitors_lambda)
## [1] 0.2775249
plot(diff(bc_visitors), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of 0.2775249, the transformed series can then be differenced to make it stationary.
Loading the retail data:
file2 = "retail.xlsx"
retaildata <- read.xlsx(file2, sheet=1, startRow=2)
retail <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
Let’s plot the series retail
autoplot(retail, main = "Monthly Retail Sales") +
theme(axis.title = element_blank())
As we see from the above graph, there is an increasing trend and the series is clearly seasonal. Let's see if differencing can make it stationary.
Now, let's see how many seasonal differences are needed:
# Make it stationary
nsdiffs(retail) # number of seasonal differences needed
## [1] 1
According to the output above, one seasonal difference is needed.
retail_seasdiff <- diff(retail, lag=frequency(retail), differences=1) # seasonal differencing
plot(retail_seasdiff, type="l", main="Seasonally Differenced") # still not stationary!
Now that the seasonal pattern has been removed, let's take a further first difference to make the series stationary:
stationaryTS <- diff(retail_seasdiff, differences= 1)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests again (a p-value below 0.05 in adf.test() indicates stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -7.8988, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.014456, Truncation lag parameter = 5, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
retail_lambda <- BoxCox.lambda(retail)
bc_retail <- BoxCox(retail, lambda = retail_lambda)
print(retail_lambda)
## [1] 0.1276369
plot(diff(bc_retail), type="l", main="Differenced and Stationary") # appears to be stationary
With a Box-Cox lambda of 0.1276369, the transformed series can then be differenced to make it stationary.
a.
Using the following code with given values:
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
b.
ar <- function(theta){
set.seed(42)
y <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 2:100){
y[i] <- theta*y[i-1] + e[i]}
return(y)
}
p <- autoplot(ar(0.6))
for(phi in seq(0.1, 0.9, 0.1)){
p <- p + autolayer(ar(phi), series = paste(phi))
}
p
As shown above, a fixed random seed is used so that we can isolate the effect of phi.
As phi increases, the series wanders further from zero and becomes smoother: with a small phi the data look close to white noise, while with a larger phi successive values are more strongly related, i.e. the autocorrelation is higher (see the ACF comparison below).
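To illustrate the effect of phi on the autocorrelation, we can compare the ACFs of two simulated series using the ar() helper defined above. This is a small sketch and assumes the gridExtra package is installed for the side-by-side layout:
# ACFs for a small and a large phi: the autocorrelations die out much more
# slowly when phi is close to 1
p1 <- ggAcf(ar(0.1)) + ggtitle("AR(1), phi = 0.1")
p2 <- ggAcf(ar(0.9)) + ggtitle("AR(1), phi = 0.9")
gridExtra::grid.arrange(p1, p2, ncol = 2) # gridExtra assumed to be available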
c.
Generate data from an MA(1) model with the given values (theta1 = 0.6, sigma^2 = 1):
ma <- function(theta){
set.seed(42)
y <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 2:100){
y[i] <- theta*e[i-1] + e[i]}
return(y)
}
ma(0.6)
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.00000000 0.25787690 0.02430951 0.85073965 0.78398589 0.13643648
## [7] 1.44784729 0.81225416 1.96162829 1.14834013 1.26724119 3.06956719
## [13] -0.01687347 -1.11210519 -0.30059460 0.55595760 0.09731732 -2.82700717
## [19] -4.03434018 -0.14416681 0.48542941 -1.96529159 -1.24070242 1.11152429
## [25] 2.62399828 0.70664695 -0.51555086 -1.91752471 -0.59780050 -0.36393646
## [31] 0.07145320 0.97810741 1.45800592 0.01213574 0.13959930 -1.41403561
## [37] -1.81466422 -1.32158300 -2.92475221 -1.41240198 0.22767216 -0.23745814
## [43] 0.54152886 -0.27180689 -1.80430394 -0.38815060 -0.55170236 0.95726536
## [49] 0.43501455 0.39678016 0.71531400 -0.59068378 1.10542416 1.58833582
## [55] 0.47550023 0.33040714 0.84521926 0.49740618 -2.93919035 -1.51097110
## [61] -0.19630487 -0.03511022 0.69296207 1.74883106 0.11255004 0.86616740
## [67] 1.11737370 1.24001497 1.54383223 1.27331530 -0.61059204 -0.71605775
## [73] 0.56940633 -0.57941246 -1.11494283 0.25529921 1.11677664 0.92467483
## [79] -0.60751574 -1.63124668 0.85283847 1.16554564 0.24319309 -0.06783240
## [85] -1.26686682 -0.10460044 0.15005829 -0.31304061 0.82369230 1.38178091
## [91] 1.88518024 0.35909590 0.36464421 1.78131959 -0.27612261 -1.52726591
## [97] -1.64821423 -2.13825721 -0.79554585 0.70119387
d.
Create time plots of the simulated series for several values of theta:
autoplot(cbind(e, ma(.1), ma(.6), ma(1), ma(3)), facets = TRUE)
From the plots above, the overall pattern of each series stays broadly similar as theta changes, but the scale (the y-axis range) grows as theta increases (see the variance check below).
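This matches the theory: for an MA(1) process the variance is (1 + theta^2) * sigma^2, so a larger theta gives a larger scale while the correlation structure (only lag 1) is unchanged. A quick check using the ma() helper defined above (sample variances, so the values are only approximate):
# Empirical variances for increasing theta; the theoretical values are
# (1 + theta^2) * sigma^2 with sigma^2 = 1, i.e. 1.01, 1.36, 2 and 10
sapply(c(0.1, 0.6, 1, 3), function(theta) var(ma(theta)))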
e.
Generate data from an ARMA(1,1) model with the given values (phi1 = 0.6, theta1 = 0.6):
y <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 2:100){
y[i] <- 0.6*y[i-1] + 0.6*e[i-1] + e[i]
}
autoplot(y) +
ggtitle('ARMA(1,1)')
f.
Generate data from an AR(2) model with the given values (phi1 = -0.8, phi2 = 0.3):
y2 <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 3:100){
y2[i] <- (-0.8)*y2[i-1] + 0.3*y2[i-2] + e[i]
}
autoplot(y2) +
ggtitle('AR(2)')
g.
# par(mfrow) has no effect on ggplot objects, so the two ACF plots are shown one after the other
ggAcf(y) + ggtitle('ARMA(1,1)')
ggAcf(y2) + ggtitle('AR(2)')
Comparing the two series: the AR(2) data is non-stationary, as mentioned before; it oscillates with an amplitude that grows over time rather than showing true seasonality. The ARMA(1,1) data does not behave this way; it looks random and stationary compared with the AR(2) data.
The ACF plots tell the same story: the AR(2) ACF shows large alternating spikes, while the ARMA(1,1) ACF decays quickly with no seasonal pattern (the stationarity check below shows why the AR(2) series explodes).
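One way to see why the AR(2) series is non-stationary is to check the standard stationarity conditions for an AR(2) model, phi1 + phi2 < 1, phi2 - phi1 < 1 and |phi2| < 1, for the parameters used above:
# Stationarity conditions for an AR(2) process
phi1 <- -0.8; phi2 <- 0.3
c(phi1 + phi2 < 1, phi2 - phi1 < 1, abs(phi2) < 1)
# The second condition fails (0.3 - (-0.8) = 1.1 > 1), so the process is
# non-stationary and the simulated series oscillates with growing amplitude.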
a.
Let’s plot the time series data:
autoplot(wmurders)
From the above plot we see that from the mid-1950s to the mid-1970s there is an increasing trend, followed by some fluctuations, but there is no seasonality.
Differencing the series twice makes it stationary:
stationaryTS <- diff(wmurders, differences= 2)
plot(stationaryTS, type="l", main="Differenced and Stationary") # appears to be stationary
As we can see, the trend has been removed and the data now looks roughly stationary.
To confirm this, let's run the ADF and KPSS tests (a p-value below 0.05 in adf.test() indicates stationarity; in kpss.test() a p-value above 0.05 is consistent with stationarity).
adf.test(stationaryTS) # p-value < 0.05 indicates the TS is stationary
## Warning in adf.test(stationaryTS): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: stationaryTS
## Dickey-Fuller = -5.1646, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(stationaryTS)
## Warning in kpss.test(stationaryTS): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: stationaryTS
## KPSS Level = 0.045793, Truncation lag parameter = 3, p-value = 0.1
The ADF p-value is below 0.05 and the KPSS p-value is above 0.05, confirming that the differenced data is stationary.
Now, let’s plot ACF and PACF plots
ggtsdisplay(stationaryTS)
As we see from the plots above, the PACF is decaying, while in the ACF only the spikes at lags 1 and 2 are significant; nothing is significant beyond lag 2. This suggests an MA(2) term for the twice-differenced series.
Therefore, ARIMA(0,2,2) is preferred for modelling the data.
b.
According to the book, the constant has an important effect on the long-term forecasts obtained from these models.
The chosen model involves second-order differencing (d = 2). If a constant is included when d = 2, the long-term forecasts follow a quadratic trend, which is not desirable in forecasting.
Therefore, the constant will not be included in the model.
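To see why a constant with d = 2 implies a quadratic trend, write the model with a constant out explicitly (a brief sketch of the standard argument):
$$(1-B)^2 y_t = c + (1 + \theta_1 B + \theta_2 B^2) e_t \;\Longrightarrow\; y_t = c + 2y_{t-1} - y_{t-2} + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}$$
Ignoring the error terms, the second difference of $y_t$ equals the constant $c$, and a sequence with constant second difference grows like $\tfrac{c}{2}t^2$, i.e. the long-term forecasts would follow a quadratic trend.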
c.
The model in terms of the backshift operator:
$$(1-B)^2 y_t = (1 + \theta_1 B + \theta_2 B^2) e_t$$
d.
Fit the model
(fit <- Arima(wmurders, order=c(0,2,2)))
## Series: wmurders
## ARIMA(0,2,2)
##
## Coefficients:
## ma1 ma2
## -1.0181 0.1470
## s.e. 0.1220 0.1156
##
## sigma^2 estimated as 0.04702: log likelihood=6.03
## AIC=-6.06 AICc=-5.57 BIC=-0.15
Check residuals
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
##
## Model df: 2. Total lags used: 10
The ACF plot shows that all spikes lie within the blue dashed bounds, so the residuals can be considered white noise.
The residual histogram shows the residuals are not perfectly normal but are satisfactory. The Ljung-Box p-value (0.1621) is not significant, so we fail to reject the null hypothesis that the residuals are uncorrelated.
e.
Forecasts using forecast() method:
fc <- forecast(fit, h=3)
fc %>%
kable() %>%
kable_styling()
|      | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
|---|---|---|---|---|---|
| 2005 | 2.480525 | 2.202620 | 2.758430 | 2.055506 | 2.905544 |
| 2006 | 2.374890 | 1.985422 | 2.764359 | 1.779249 | 2.970531 |
| 2007 | 2.269256 | 1.772305 | 2.766207 | 1.509235 | 3.029276 |
fc$mean
## Time Series:
## Start = 2005
## End = 2007
## Frequency = 1
## [1] 2.480525 2.374890 2.269256
Forecasts using manual calculation:
fc$model
## Series: wmurders
## ARIMA(0,2,2)
##
## Coefficients:
## ma1 ma2
## -1.0181 0.1470
## s.e. 0.1220 0.1156
##
## sigma^2 estimated as 0.04702: log likelihood=6.03
## AIC=-6.06 AICc=-5.57 BIC=-0.15
$$(1-B)^2 y_t = (1 - 1.0181 B + 0.1470 B^2) e_t$$
$$y_t = 2y_{t-1} - y_{t-2} + e_t - 1.0181\, e_{t-1} + 0.1470\, e_{t-2}$$
years <- length(wmurders)
e <- fc$residuals
fc1 <- 2*wmurders[years] - wmurders[years - 1] - 1.0181*e[years] + 0.1470*e[years - 1]
fc2 <- 2*fc1 - wmurders[years] + 0.1470*e[years]
fc3 <- 2*fc2 - fc1
c(fc1, fc2, fc3)
## [1] 2.480523 2.374887 2.269252
Therefore, the manually calculated values match the values obtained from the forecast() function.
f.
A plot of the series with forecasts and prediction intervals for the next three periods:
autoplot(fc)
g.
Accuracy for ARIMA(0,2,2)
accuracy(fc)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0113461 0.2088162 0.1525773 -0.2403396 4.331729 0.9382785
## ACF1
## Training set -0.05094066
Accuracy for ARIMA(1,2,1)
fc_autoarima <- forecast(auto.arima(wmurders), h = 3)
accuracy(fc_autoarima)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
## ACF1
## Training set 0.02176343
Except for RMSE, all error measures show that ARIMA(0,2,2) is slightly better than ARIMA(1,2,1).
Using the auto.arima() function with the stepwise and approximation options set to FALSE gives an ARIMA(0,2,3) model:
(fc_autoarima2 <- forecast(auto.arima(wmurders, stepwise = FALSE, approximation = FALSE), h = 3))
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.408817 2.137718 2.679916 1.994206 2.823428
## 2006 2.365555 1.985092 2.746018 1.783687 2.947423
## 2007 2.290976 1.753245 2.828706 1.468588 3.113363
Accuracy for ARIMA(0,2,3)
accuracy(fc_autoarima2)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.01336585 0.2016929 0.1531053 -0.3332051 4.387024 0.9415259
## ACF1
## Training set -0.03193856
In this case, some error measures (e.g. RMSE) improved while others did not.
Residuals of ARIMA(0,2,2)
checkresiduals(fc)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
##
## Model df: 2. Total lags used: 10
Residuals of ARIMA(0,2,3)
checkresiduals(fc_autoarima2)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,3)
## Q* = 10.706, df = 7, p-value = 0.152
##
## Model df: 3. Total lags used: 10
Both models, ARIMA(0,2,2) and ARIMA(0,2,3), are very close and have similar residual diagnostics. ARIMA(0,2,2) is preferred because most of its training-set error measures are slightly lower and it is the simpler model (see the comparison sketch below).
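For easier comparison, the training-set accuracy measures of the three models can be collected into a single table. This is a convenience sketch reusing the forecast objects fitted above:
# Combine the training-set accuracy rows of the three fitted models
rbind(
  "ARIMA(0,2,2)" = accuracy(fc)[1, ],
  "ARIMA(1,2,1)" = accuracy(fc_autoarima)[1, ],
  "ARIMA(0,2,3)" = accuracy(fc_autoarima2)[1, ]
)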