Recall from lecture that an AR(2) model
\(X_t = \phi_1X_{t-1} + \phi_2X_{t-2} + W_t,\)
where \(W_t \sim WN(0, \Sigma_W)\), can be written in state-space form as:
\[\begin{eqnarray*} X_t &=& \begin{bmatrix} 1 & 0 \end{bmatrix}{\boldsymbol \theta_t} + \nu_t \\ {\boldsymbol \theta_t} &=& \begin{bmatrix} \phi_1 & 1 \\ \phi_2 & 0 \end{bmatrix} {\boldsymbol \theta_{t-1}} + \begin{bmatrix} W_t \\ 0 \end{bmatrix} \\ \nu_t &\stackrel{iid}{\sim}& N(0,\sigma_\nu^2) \end{eqnarray*}\]
which, in the \(dlm\) package's notation (with \(\sigma_\nu^2 = 0\), since a pure AR(2) has no observation noise), corresponds to:
\[\begin{eqnarray*} FF &=& \begin{bmatrix} 1 & 0 \end{bmatrix}\\ GG &=& \begin{bmatrix} \phi_1 & 1 \\ \phi_2 & 0 \end{bmatrix}\\ V &\equiv& \sigma_\nu^2 = 0 \\ W &\equiv& \Sigma_W \end{eqnarray*}\]
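As a quick sanity check (a minimal sketch, assuming the \(dlm\) package is installed), we can construct this AR(2) form directly with \(dlmModARMA()\) and confirm that the matrices match the notation above; the coefficients 0.3 and 0.2 are illustrative values satisfying the stationarity conditions:
library(dlm)
#Build an AR(2) with illustrative coefficients and inspect its state-space matrices
m <- dlmModARMA(ar = c(0.3, 0.2), sigma2 = 1)
m$FF #[1 0]
m$GG #[phi1 1; phi2 0]
m$V  #0, i.e. sigma_nu^2 = 0
m$W  #Sigma_W in the top-left entry, zeros elsewhere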
In this problem, the goal is to understand how to fit an autoregressive model in the \(dlm\) library and then extract the parameter estimates from the output. Using your own seed and values for \(\phi_1\) and \(\phi_2\) (just be sure your values satisfy the stationarity conditions: \(\phi_2 < 1 + \phi_1\); \(\phi_2 < 1 - \phi_1\); \(\phi_2 > -1\)), obtain 1000 simulated draws from your AR(2). In the code below, I have used a seed of 102 and \((\phi_1, \phi_2)' = (0.4, -0.2)'\). Note that the default for \(\Sigma_W\) is the identity matrix (unit innovation variance) - we will use this default. Provide the time plot, ACF plot, and PACF plot of your simulated values. Note that in an AR(2) model, we would expect to see the ACF tail off and the PACF cut off after 2 lags.
set.seed(102)
#Simulate 1000 draws from a stationary AR(2) with phi1 = 0.4, phi2 = -0.2
sim1 <- arima.sim(n = 1000, list(ar = c(0.4, -0.2)))
autoplot(sim1, ylab = "",
         main = "1000 Simulated AR(2) Draws. phi1=0.4 and phi2=-0.2")
autoplot(acf(sim1, plot = FALSE), ylab = "",
         main = "ACF: 1000 Simulated AR(2) Draws. phi1=0.4 and phi2=-0.2")
autoplot(pacf(sim1, plot = FALSE), ylab = "",
         main = "PACF: 1000 Simulated AR(2) Draws. phi1=0.4 and phi2=-0.2")
Now, we will turn to the \(dlm\) package. We must first "build" our model by creating a function containing our model elements. Here, we are only working with an AR element, so the build function only requires the \(dlmModARMA()\) function with placeholders for the AR coefficients.
build <- function(p) {
  #AR(2) with unknown coefficients p[1], p[2]; innovation variance fixed at 1
  mod <- dlmModARMA(ar = c(p[1], p[2]), sigma2 = 1)
  return(mod)
}
Once the model has been built, we apply \(dlmMLE()\) with specified initial parameter values to obtain the maximum likelihood estimates of the AR coefficients. Below, I just set the initial values to 0.
init <- rep(0,2)
fit <- dlmMLE(y = sim1, parm = init, build = build, method = "BFGS", control=list(trace = 10, maxit = 1000))
## initial value 559.632335
## iter 10 value 474.355581
## final value 474.275583
## converged
fit$par
## [1] 0.4253352 -0.2118917
The object \(fit\) now contains the optimal values of the parameters of the AR(2); printing \(fit\$par\) shows that \(\hat{\phi}_1 = 0.4253352\) and \(\hat{\phi}_2 = -0.2118917\), reasonably close to the true values \(\phi_1 = 0.4\) and \(\phi_2 = -0.2\). Try a different set of initial values to test the sensitivity of the optimization to where it starts.
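For instance, a minimal sketch of that check (the alternative starting values here are arbitrary):
#Refit from a different, arbitrary starting point
init2 <- c(0.5, 0.5)
fit_alt <- dlmMLE(y = sim1, parm = init2, build = build, method = "BFGS")
fit_alt$par #if the likelihood surface is well behaved, these should match fit$par closely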
Finally, we can obtain the state-space representation of our fitted AR(2) model by plugging the parameter estimates back into the build function. Note that \(m0\) and \(C0\) are simply the default mean and covariance of the initial state vector.
mod <- build(fit$par)
mod
## $FF
## [,1] [,2]
## [1,] 1 0
##
## $V
## [,1]
## [1,] 0
##
## $GG
## [,1] [,2]
## [1,] 0.4253352 1
## [2,] -0.2118917 0
##
## $W
## [,1] [,2]
## [1,] 1 0
## [2,] 0 0
##
## $m0
## [1] 0 0
##
## $C0
## [,1] [,2]
## [1,] 1e+07 0e+00
## [2,] 0e+00 1e+07
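As an aside, \(dlmMLE()\) passes extra arguments through to \(optim()\), so approximate standard errors for \(\hat{\phi}_1\) and \(\hat{\phi}_2\) can be obtained by requesting the numerical Hessian - a hedged sketch, assuming the Hessian is well conditioned at the optimum:
#Refit with hessian = TRUE (forwarded to optim) to get the observed information
fit_h <- dlmMLE(y = sim1, parm = init, build = build, method = "BFGS", hessian = TRUE)
#Asymptotic standard errors: square roots of the diagonal of the inverse Hessian
sqrt(diag(solve(fit_h$hessian)))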
Recall from lecture that an ARMA(1,1) model is given by
\(X_t = \phi X_{t-1} + \theta W_{t-1} + W_t,\)
where \(W_t \sim WN(0, \sigma_W^2)\).
The ARMA(1,1) can be written in state-space form as:
\[ \begin{eqnarray*} X_t &=& \begin{bmatrix} 1 & 0 \end{bmatrix}{\boldsymbol \theta_t} + \nu_t \\ {\boldsymbol \theta_t} &=& \begin{bmatrix} \phi & 1 \\ 0 & 0 \end{bmatrix} {\boldsymbol \theta_{t-1}} + \begin{bmatrix} W_t \\ \theta W_t \end{bmatrix} \\ \nu_t &\stackrel{iid}{\sim}& N(0,\sigma_\nu^2) \end{eqnarray*} \] Alternatively, the ARMA(1,1) can be written (using \(dlm\) package notation) as follows:
\[ \begin{eqnarray*} FF &=& \begin{bmatrix} 1 & 0 \end{bmatrix}\\ GG &=& \begin{bmatrix} \phi & 1 \\ 0 & 0 \end{bmatrix}\\ V &\equiv& \sigma_\nu^2 = 0 \\ W &=& \sigma_W^2\begin{bmatrix} 1 & \theta \\ \theta & \theta^2 \end{bmatrix} \quad (\text{with } \sigma_W^2 = 1 \text{ here}) \end{eqnarray*} \]
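As before, we can check this correspondence in \(dlm\) (a sketch with illustrative values \(\phi = 0.4\), \(\theta = 0.5\)); the \(W\) matrix should come out as \(\begin{bmatrix} 1 & 0.5 \\ 0.5 & 0.25 \end{bmatrix}\):
m11 <- dlmModARMA(ar = 0.4, ma = 0.5, sigma2 = 1)
m11$GG #[phi 1; 0 0]
m11$W  #[1 theta; theta theta^2]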
#Simulate 1000 draws from an ARMA(1,1) with phi = 0.4, theta = 0.5 (no seed set here, so your draws will differ)
sim2 <- arima.sim(n = 1000, list(ar = 0.4, ma = 0.5))
autoplot(sim2, ylab = "",
         main = "1000 Simulated ARMA(0.4, 0.5) Draws")
autoplot(acf(sim2, plot = FALSE), ylab = "",
         main = "ACF: 1000 Simulated ARMA(0.4, 0.5) Draws")
## Warning: Ignoring unknown parameters: ylab, main
autoplot(pacf(sim2, plot = FALSE), ylab = "",
         main = "PACF: 1000 Simulated ARMA(0.4, 0.5) Draws")
## Warning: Ignoring unknown parameters: ylab, main
build2 <- function(p) {
  #ARMA(1,1) with unknown AR coefficient p[1] and MA coefficient p[2]
  mod <- dlmModARMA(ar = p[1], ma = p[2], sigma2 = 1)
  return(mod)
}
init <- rep(0, 2)
fit2 <- dlmMLE(y = sim2, parm = init, build = build2, method = "BFGS", control = list(trace = 10, maxit = 1000))
## initial value 950.129632
## iter 10 value 519.725552
## iter 10 value 519.725552
## final value 519.725510
## converged
fit2$par
## [1] 0.4014554 0.4368376
mod2 <- build2(fit2$par)
mod2
## $FF
## [,1] [,2]
## [1,] 1 0
##
## $V
## [,1]
## [1,] 0
##
## $GG
## [,1] [,2]
## [1,] 0.4014554 1
## [2,] 0.0000000 0
##
## $W
## [,1] [,2]
## [1,] 1.0000000 0.4368376
## [2,] 0.4368376 0.1908271
##
## $m0
## [1] 0 0
##
## $C0
## [,1] [,2]
## [1,] 1e+07 0e+00
## [2,] 0e+00 1e+07
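As a cross-check (not part of the original analysis), the classical \(arima()\) fitter in base R should recover similar coefficients from the same simulated series; the estimates will not match \(dlm\)'s exactly because the optimizations differ slightly:
#Sanity check against the classical ARMA fitter
arima(sim2, order = c(1, 0, 1), include.mean = FALSE)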
We also learned in class that two state-space models may be easily combined. In our AirPassengers analysis, we assumed a second-order polynomial trend and a seasonal component, which required us to use \(dlmModPoly(order = 2)\) for the trend and \(dlmModSeas(frequency = 12)\) for the seasonality. In this question we repeat that analysis with an ARMA(1,1) element added to the model.
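As a reminder of how that combination works, component models in \(dlm\) are combined with the \(+\) operator, which block-stacks their matrices - a minimal sketch of the trend-plus-seasonal structure before any ARMA element is added:
#The "+" operator block-combines dlm components
m_comb <- dlmModPoly(order = 2) + dlmModSeas(frequency = 12)
dim(m_comb$GG) #13 x 13: 2 trend states + 11 seasonal-dummy states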
The code has already been modified to include the ARMA element - you simply need to remove the hashtags in lines 145 and 146 (where the ARMA elements are commented out in the build function). With the hashtags in place, you have the analysis we did in lecture.
Please run the code with the hashtags removed and write a sentence on any changes you see in the forecast plot compared to the one from class. Also use the model output to state the estimated values of the AR and MA coefficients (it may help to review the state-space representation of the ARMA(1,1) from lecture).
Recall that the data can be loaded and formatted in R as follows:
data("AirPassengers")
#?"AirPassengers"
#Fit a local level with seasonal dummies
plot(log(AirPassengers), ylab = "log monthly total passengers (in thousands)",
main = "Number of international air passengers")
#Keep some observations for forecast validation
y <- as.numeric(log(window(AirPassengers,1949,1959+11/12)))
n <- length(y)
time <- as.vector(time(window(AirPassengers,1949,1959+11/12)))
month <- as.factor(as.integer(time*12) %%12)
In class, we allowed for observational and process noise in both the seasonal and trend components, which is not exactly analogous to our initial analysis of the AirPassengers data in Class 2 - hence why the fit in \(dlm\) was better than our initial analysis. We will follow the same direction already taken in the \(dlm\) model with regard to our ARMA element - that is, we will allow for both observational and state-level variances.
Please rerun our analysis using the following code:
#Build the dynamic linear model - local linear trend + seasonal component (dummies) + ARMA(1,1)
build <- function(params) {
  level <- dlmModPoly(order = 2, dV = exp(params[1]), dW = exp(params[2:3]))
  seas <- dlmModSeas(frequency = 12, dV = exp(params[1]), dW = c(exp(params[4]), rep(0, 10))) #stochastic seasonality
  #Note: sigma2 enters untransformed, so it is not constrained to be positive (see the fitted W below)
  arma <- dlmModARMA(ar = c(params[5]), ma = c(params[6]), sigma2 = params[7], dV = exp(params[1]))
  mod <- level + seas + arma
  return(mod)
}
#Initial parameter values - four log-variances (observation, level, trend, seasonal), AR and MA coefficients, and the ARMA innovation variance
init <- rep(-0.2, 7)
#Fit the DLM model - numerical optimization
fit <- dlmMLE(y = y, parm = init, build = build, method = "BFGS", control=list(trace = 10, maxit = 1000))
## initial value 285.844297
## iter 10 value -191.801409
## iter 20 value -213.123427
## final value -213.241058
## converged
#Define model with estimated parameters
mod <- build(fit$par)
mod
## $FF
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 1 0 1 0 0 0 0 0 0 0 0 0 0 1
## [,15]
## [1,] 0
##
## $V
## [,1]
## [1,] 0.0001114883
##
## $GG
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 1 1 0 0 0 0 0 0 0 0 0 0 0
## [2,] 0 1 0 0 0 0 0 0 0 0 0 0 0
## [3,] 0 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
## [4,] 0 0 1 0 0 0 0 0 0 0 0 0 0
## [5,] 0 0 0 1 0 0 0 0 0 0 0 0 0
## [6,] 0 0 0 0 1 0 0 0 0 0 0 0 0
## [7,] 0 0 0 0 0 1 0 0 0 0 0 0 0
## [8,] 0 0 0 0 0 0 1 0 0 0 0 0 0
## [9,] 0 0 0 0 0 0 0 1 0 0 0 0 0
## [10,] 0 0 0 0 0 0 0 0 1 0 0 0 0
## [11,] 0 0 0 0 0 0 0 0 0 1 0 0 0
## [12,] 0 0 0 0 0 0 0 0 0 0 1 0 0
## [13,] 0 0 0 0 0 0 0 0 0 0 0 1 0
## [14,] 0 0 0 0 0 0 0 0 0 0 0 0 0
## [15,] 0 0 0 0 0 0 0 0 0 0 0 0 0
## [,14] [,15]
## [1,] 0.0000000 0
## [2,] 0.0000000 0
## [3,] 0.0000000 0
## [4,] 0.0000000 0
## [5,] 0.0000000 0
## [6,] 0.0000000 0
## [7,] 0.0000000 0
## [8,] 0.0000000 0
## [9,] 0.0000000 0
## [10,] 0.0000000 0
## [11,] 0.0000000 0
## [12,] 0.0000000 0
## [13,] 0.0000000 0
## [14,] 0.9997795 1
## [15,] 0.0000000 0
##
## $W
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0.0002456896 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [2,] 0.0000000000 8.505325e-07 0.000000e+00 0 0 0 0 0 0
## [3,] 0.0000000000 0.000000e+00 4.118324e-05 0 0 0 0 0 0
## [4,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [5,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [6,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [7,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [8,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [9,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [10,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [11,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [12,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [13,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [14,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [15,] 0.0000000000 0.000000e+00 0.000000e+00 0 0 0 0 0 0
## [,10] [,11] [,12] [,13] [,14] [,15]
## [1,] 0 0 0 0 0.000000e+00 0.000000e+00
## [2,] 0 0 0 0 0.000000e+00 0.000000e+00
## [3,] 0 0 0 0 0.000000e+00 0.000000e+00
## [4,] 0 0 0 0 0.000000e+00 0.000000e+00
## [5,] 0 0 0 0 0.000000e+00 0.000000e+00
## [6,] 0 0 0 0 0.000000e+00 0.000000e+00
## [7,] 0 0 0 0 0.000000e+00 0.000000e+00
## [8,] 0 0 0 0 0.000000e+00 0.000000e+00
## [9,] 0 0 0 0 0.000000e+00 0.000000e+00
## [10,] 0 0 0 0 0.000000e+00 0.000000e+00
## [11,] 0 0 0 0 0.000000e+00 0.000000e+00
## [12,] 0 0 0 0 0.000000e+00 0.000000e+00
## [13,] 0 0 0 0 0.000000e+00 0.000000e+00
## [14,] 0 0 0 0 -3.847128e-04 -4.521975e-05
## [15,] 0 0 0 0 -4.521975e-05 -5.315201e-06
##
## $m0
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##
## $C0
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## [1,] 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [2,] 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [3,] 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [4,] 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [5,] 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [6,] 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [7,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00 0e+00
## [8,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00 0e+00
## [9,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00 0e+00
## [10,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00 0e+00
## [11,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07 0e+00
## [12,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 1e+07
## [13,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [14,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [15,] 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00
## [,13] [,14] [,15]
## [1,] 0e+00 0e+00 0e+00
## [2,] 0e+00 0e+00 0e+00
## [3,] 0e+00 0e+00 0e+00
## [4,] 0e+00 0e+00 0e+00
## [5,] 0e+00 0e+00 0e+00
## [6,] 0e+00 0e+00 0e+00
## [7,] 0e+00 0e+00 0e+00
## [8,] 0e+00 0e+00 0e+00
## [9,] 0e+00 0e+00 0e+00
## [10,] 0e+00 0e+00 0e+00
## [11,] 0e+00 0e+00 0e+00
## [12,] 0e+00 0e+00 0e+00
## [13,] 1e+07 0e+00 0e+00
## [14,] 0e+00 1e+07 0e+00
## [15,] 0e+00 0e+00 1e+07
fit$par
## [1] -1.020020e+01 -8.311441e+00 -1.397740e+01 -1.009748e+01 9.997795e-01
## [6] 1.175416e-01 -3.847128e-04
We can extract our estimates \(\hat{\phi}\) and \(\hat{\theta}\) from the ARMA block of the fitted model (rows and columns 14-15 of \(GG\) and \(W\)), or directly from \(fit\$par\).
From entry \([14, 14]\) of \(GG\) (equivalently \(fit\$par[5]\)) we see that \(\hat{\phi} = 0.9997795\).
For the MA coefficient, recall that the ARMA block of \(W\) is \(\hat{\sigma}^2\begin{bmatrix} 1 & \hat{\theta} \\ \hat{\theta} & \hat{\theta}^2 \end{bmatrix}\), so \(\hat{\theta} = W[14,15]/W[14,14] = 0.1175416\) (equivalently \(fit\$par[6]\)). Note that \(W[14,14] = -3.847128 \times 10^{-4}\) is \(\hat{\sigma}^2\) itself, not \(\hat{\theta}\); it comes out slightly negative because \(sigma2\) was passed to \(dlmModARMA()\) untransformed and therefore was not constrained to be positive.
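Equivalently (a small sketch, with the row and column indices read off the output above), these quantities can be extracted programmatically:
phi_hat <- mod$GG[14, 14]               #AR coefficient
sigma2_hat <- mod$W[14, 14]             #ARMA innovation variance (slightly negative here)
theta_hat <- mod$W[14, 15] / sigma2_hat #MA coefficient, since the ARMA block of W is sigma2 * [1 theta; theta theta^2]
c(phi_hat, theta_hat, sigma2_hat)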
## Filtering and Forecasting
#Filtering
filtered <- dlmFilter(y, mod)
plot(AirPassengers, ylab = "monthly total passengers (in thousands)",
     main = "Number of international air passengers", ylim = c(100, 720))
lines(time, exp(c(mod$FF %*% t(filtered$m[-1, ]))), col = 2, lwd = 2) #filtered states, back-transformed to the original scale
#One-step-ahead forecasts (a linear function of the filtering mean)
forecasted <- dlmForecast(filtered, nAhead = 12)
timelo <- seq(tail(time, 1) + 1/12, by = 1/12, length = 12)
lines(timelo, exp(forecasted$f), col = 4, lwd = 2)
legend("topleft", c("Original series", "Filtered states", "One-step forecasts"),
       lty = c(1, 1, 1), lwd = c(1, 2, 2), col = c(1, 2, 4), bty = "n")
#95% forecast intervals on the log scale, then back-transformed
fsd <- sqrt(unlist(forecasted$Q))
pl <- forecasted$f + qnorm(0.025, sd = fsd)
pu <- forecasted$f + qnorm(0.975, sd = fsd)
lines(timelo, exp(pl), col = 1, lwd = 2)
lines(timelo, exp(pu), col = 1, lwd = 2)
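Since we held back the 1960 observations for validation, one optional check (not part of the original analysis) is the RMSE of the back-transformed forecasts against the held-out year:
#Compare the 12 forecasts (back-transformed from the log scale) with the held-out 1960 values
actual1960 <- as.numeric(window(AirPassengers, 1960, 1960 + 11/12))
sqrt(mean((exp(forecasted$f) - actual1960)^2))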
Comparing our new model (which adds the ARMA(1,1) component) to the original model (as seen in the video for Exam 4), we see that our new model's forecasts have smaller variance.
In the video, the old model's upper forecast band peaked at around 720; the new model's peaks at around 650. This is a noticeable gain in precision, which suggests that for this time series a model with a second-order polynomial trend + seasonal component + ARMA(1,1) forecasts more precisely than the same model without the ARMA(1,1) component.