Question 1
Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers
and 1,000 random numbers.
Question 2
A classic example of a non-stationary series are stock prices. Plot
the daily closing prices for Amazon stock (contained in gafa_stock),
along with the ACF and PACF. Explain how each plot shows that the series
is non-stationary and should be differenced.
gafa_stock %>%
filter(Symbol == "AMZN") %>%
gg_tsdisplay(Close, plot_type='partial')
## Warning: Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
## Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.

In the Time Series Plot, the stock prices steadily increase over
time, showing a clear trend rather than fluctuating around a constant
level. This is a sign that the series is non-stationary.In the ACF Plot,
the autocorrelations decrease slowly across many lags, meaning the
values are highly related to previous ones even at long time intervals,
which suggests non-stationarity. Lastly, in the PACF Plot, there’s a
strong spike at the first lag, meaning that the stock price today is
strongly related to yesterday’s price. This also supports the idea that
the series is non-stationary. To make this data stationary, we’d need to
apply differencing, which removes the trend and makes the values
fluctuate around a constant level.
Question 6
Simulate and plot some data from simple ARIMA models.
a)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
b) Produce a time plot for the series. How does the plot change as
you change TeX(“\(\\phi_1\)”) ?
simulate_and_plot <- function(phi, n = 100, sigma2 = 1) {
y <- numeric(n) # Initialize time series
e <- rnorm(n, mean = 0, sd = sqrt(sigma2)) # Generate white noise
# Simulate the AR(1) process
for (i in 2:n) {
y[i] <- phi * y[i - 1] + e[i]
}
# Create a tsibble and generate the plot
sim <- tsibble(idx = seq_len(n), y = y, index = idx)
sim %>%
autoplot(y) +
labs(
title = TeX(paste0("AR(1) model with $\\phi_1$ = ", phi)),
x = "Time", y = "y"
)
}
# Generate plots for different phi_1 values
phi_values <- c(0.6, 1, 0, -0.6)
# Plot all models in a 2x2 grid layout
par(mfrow = c(2, 2))
for (phi in phi_values) {
print(simulate_and_plot(phi)) # Print each plot in the loop
}




TeX(“\(\\phi_1\)”) = 0.6 -> A
moderately persistent process with some autocorrelation.
TeX(“\(\\phi_1\)”) = 1 -> This
is a random walk with high persistence with no reversion to the
mean.
TeX(“\(\\phi_1\)”) = 0 -> The
series is just white noise with no autocorrelation.
TeX(“\(\\phi_1\)”) = -0.6 -> It
shows negative autocorrelation.
c) Write your own code to generate data from an MA(1) model with
TeX(“\(\\theta_1\)”) = 0.6 and
TeX(“\(\\sigma^2\)”) =1
simulate_ma1 <- function(theta, n = 100, sigma2 = 1) {
# Generate white noise (mean 0, variance sigma2)
e <- rnorm(n + 1, mean = 0, sd = sqrt(sigma2))
y <- numeric(n)
# Simulate the MA(1) process: y_t = e_t + theta * e_(t-1)
for (t in 2:n) {
y[t] <- e[t] + theta * e[t - 1]
}
sim2 <- tsibble(idx = seq_len(n), y = y, index = idx)
return(sim)
}
# Simulate the MA(1) process with theta = 0.6
sim2 <- simulate_ma1(theta = 0.6)
d) Produce a time plot for the series. How does the plot change as
you change TeX(“\(\\theta_1\)”) ?
# Plot the simulated time series
sim2 %>%
autoplot(y) +
labs(
title = expression(paste("MA(1) Model with ", theta[1], " = 0.6")),
x = "Time", y = "y"
)

theta_values <- c(0.6, 1, 0, -0.6)
# Generate and store plots for each theta_1 value
plots <- lapply(theta_values, function(theta) {
sim <- simulate_ma1(theta)
sim %>%
autoplot(y) +
labs(
title = TeX(paste0("MA(1) Model with $\\theta_1 = ", theta, "$")),
x = "Time", y = "y"
)
})
# Display all plots in a 2x2 grid layout
grid.arrange(grobs = plots, ncol = 2)

With TeX(“\(\\theta_1\)”) = 0.6 the
series displays moderate persistence, resulting in smooth fluctuations.
When TeX(“\(\\theta_1\)”) increases to
1, the plot resembles a random walk, showing strong positive
autocorrelation. A TeX(“\(\\theta_1\)”)
of 0 eads to a purely random, uncorrelated series similar to white
noise. In contrast, a TeX(“\(\\theta_1\)”) of -0.6 introduces negative
autocorrelation. In conclusion, changing TeX(“\(\\theta_1\)”) significantly influences the
degree and nature of autocorrelation in the resulting time series.
e)
set.seed(2001)
# Initialize parameters
n <- 100
y <- numeric(n)
e <- rnorm(n, mean = 0, sd = 1) # Generate white noise
# Simulate ARMA(1,1) process
for (i in 2:n) {
y[i] <- e[i] + 0.6 * e[i - 1] + 0.6 * y[i - 1]
}
# Create a tsibble for plotting
arma1 <- tsibble(idx = seq_len(n), y = y, index = idx)
f)
set.seed(4510)
# Initialize parameters
n <- 100
y <- numeric(n)
e <- rnorm(n, mean = 0, sd = 1) # Generate white noise
# Simulate AR(2) process
for (i in 3:n) {
y[i] <- -0.8 * y[i - 1] + 0.3 * y[i - 2] + e[i]
}
# Create a tsibble for plotting
ar2 <- tsibble(idx = seq_len(n), y = y, index = idx)
g) Graph the latter two series and compare them.
arma1 %>%
gg_tsdisplay(y, plot_type='partial') +
labs(title = TeX("ARMA(1,1) model with $\\phi_1 = 0.6$, $\\theta_1 = 0.6$, and $\\sigma^2 = 1$"))

ar2 %>%
gg_tsdisplay(y, plot_type='partial') +
labs(title = TeX("AR(2) model with $\\phi_1 = -0.8$, $\\phi_2 = 0.3$, and $\\sigma^2 = 1$"))

The ARMA(1,1) model shows stable, moderate oscillations with
diminishing autocorrelations, suggesting an overall stationarity. In
contrast, the AR(2) model starts stable but quickly becomes unstable
with large, growing oscillations, reflecting a potential
non-stationarity. The autocorrelations in the AR(2) model persist longer
than in the ARMA(1,1), and the AR(2) model’s instability arises from its
parameter choices, making it less reliable for forecasting compared to
the more controlled ARMA(1,1) model.
Question 7
Consider aus_airpassengers, the total number of passengers (in
millions) from Australian air carriers for the period 1970-2011.
a) Use ARIMA() to find an appropriate ARIMA model. What model was
selected. Check that the residuals look like white noise. Plot forecasts
for the next 10 periods.
ARIMA (0,2,1) was picked and the residuals do resemble white
noise.
data("aus_airpassengers")
# Fit an ARIMA model using auto.arima()
fit <- auto.arima(aus_airpassengers)
checkresiduals(fit)

##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,1)
## Q* = 6.5774, df = 8, p-value = 0.5828
##
## Model df: 1. Total lags used: 9
forecasted_values <- forecast(fit, h = 10)
autoplot(forecasted_values)

b) Write the model in terms of the backshift operator.
plot(TeX("$y_t = -0.87 \\epsilon_{t-1} + \\epsilon_t$"))

plot(TeX("(1 - B)^2 y_t = (1 - 0.87B) \\epsilon_t"))

c) Plot forecasts from an ARIMA(0,1,0) model with drift and compare
these to part a.
The ARIMA model from part a forecasted values that were consistently
higher than the actual values, whereas the current ARIMA model forecasts
lower than the actual values. Additionally, the slope of the forecasts
in this model appears to be more gradual compared to the previous
one
fit2 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(0,1,0)))
fit2 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers) +
labs(title = "ARIMA(0,1,0)", y = "Passengers (in millions)")

fit2 %>%
gg_tsresiduals()

d) Plot forecasts from an ARIMA(2,1,2) model with drift and compare
these to parts a and c. Remove the constant and see what happens.
fit3 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(2,1,2)))
fit3 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers) +
labs(title = " ARIMA(2,1,2)", y = "Passengers (in millions)")

fit3 %>%
gg_tsresiduals() +
labs(title = "Residuals for ARIMA(2,1,2)")

# removing the constant
fit4 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))
report(fit4)
## Series: Passengers
## Model: NULL model
## NULL model
I plotted the forecasts from an ARIMA(2,1,2) model with a drift and
compared them to the ones from parts a and c. The forecasts were similar
to those from part a, and the residuals looked like white noise,
indicating a good fit. However, when I removed the constant, the model
became a null model. This change introduced a polynomial trend, making
the forecasts non-stationary. This shows that including a constant is
important for keeping the forecasts stable
e) Plot forecasts from an ARIMA(0,2,1) model with a constant. What
happens?
fit5 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend.
## This is generally discouraged, consider removing the constant or reducing the number of differences.
fit5 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers) +
labs(title = "ARIMA(0,2,1) with constant", y = "Passengers (in millions)")

fit5 %>%
gg_tsresiduals() +
labs(title = "ARIMA(0,2,1) with constant")

The slope becomes steeper, indicating a stronger trend in the
forecasts. Additionally, a warning is issued, stating that the model
specifies a higher-order polynomial trend due to the constant. This
suggests that the model may be overly complex and could benefit from the
removal of the constant to better capture the underlying data
dynamics.
Question 8
For the United States GDP series (from global_economy):
c) try some other plausible models by experimenting with the orders
chosen;
To experiment with other ARIMA models, we can manually adjust the
orders of p, d, and q in the ARIMA model.
# ARIMA(2,1,0)
fit_arima_210 <- Arima(gdp_transformed, order = c(2, 1, 0))
summary(fit_arima_210)
## Series: gdp_transformed
## ARIMA(2,1,0)
##
## Coefficients:
## ar1 ar2
## 0.6950 0.2485
## s.e. 0.1281 0.1288
##
## sigma^2 = 6719: log likelihood = -332.05
## AIC=670.1 AICc=670.55 BIC=676.23
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 15.49463 79.82007 63.01628 0.1445514 0.484831 0.281242 -0.07720077
# ARIMA(0,1,1)
fit_arima_011 <- Arima(gdp_transformed, order = c(0, 1, 1))
summary(fit_arima_011)
## Series: gdp_transformed
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.7673
## s.e. 0.0674
##
## sigma^2 = 23338: log likelihood = -367.47
## AIC=738.93 AICc=739.16 BIC=743.02
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 123.4948 150.1108 130.9554 0.9868126 1.034531 0.5844546 -0.4256826
d) choose what you think is the best model and check the residual
diagnostics;
The residuals of the ARIMA(2,1,0) model appear to oscillate around
zero, suggesting that the model fits the data fairly well without any
clear trend.The ACF plot shows several spikes, but most of them fall
within the blue dashed lines.The histogram of the residuals appears to
be roughly bell-shaped, suggesting that the residuals are normally
distributed.
For the ARIMA(0,1,1) model the residuals oscillate around zero, but
there seems to be a bit more variability compared to the ARIMA(2,1,0)
model.The ACF plot for this model shows a more significant spike at lag
1 and then decreases more slowly, suggesting that there is more
autocorrelation present in the residuals compared to the other model.The
histogram for the residuals also resembles a normal distribution but
shows more variability, and there is a bit of skewness.
In conclusion,based on the residual diagnostics and the AIC and BIC
, ARIMA(2,1,0) seems to be the better model due to its more favorable
residual behavior (random scatter) , less autocorrelation, and more
normality in the residuals.
e) produce forecasts of your fitted model. Do the forecasts look
reasonable?
forecasts <- forecast(fit_arima_210, h = 10)
# Plot the forecasts
autoplot(forecasts) +
ggtitle("Forecasts from ARIMA(2,1,0) Model") +
xlab("Year") +
ylab("Transformed GDP")

The forecast follows the upward trend observed in the data, which
suggests that the model has captured the long-term growth trend well and
the prediction line and confidence intervals seem smooth without wild
fluctuations, indicating that the model is not overfitted. Also, the
blue confidence intervals look appropriate—neither too narrow nor
excessively wide—indicating the model has captured uncertainty
reasonably. Overall, the forecasts do look reasonable.