Q2

We consider the time series for the Civilian Unemployment Rate from the FRED database.

First we load the "Quandl" package along with the other packages we need.

library("Quandl")
library("tseries")
library("urca")
library("stargazer")

The symbol is FRED/UNRATE. Let's first plot the series together with its first and second differences to see what they look like.

civ<-Quandl("FRED/UNRATE", type="zoo")
diffciv<-diff(civ)
diffciv_1<-diff(diff(civ))
par(mfrow=c(2,2))
plot(civ,xlab="", ylab="Civilian Unemployment Rate")
plot(diffciv,xlab="", ylab="First Differences CUR")
plot(diffciv_1,xlab="", ylab="Twice Differencing CUR")

Looking at the plotted series, the first differences appear to be stationary. The raw time series may be trend stationary. Let's conduct the ADF and KPSS tests on the raw data.

adf.civ<-ur.df(civ,type="trend",selectlags="BIC")
summary(adf.civ)

Augmented Dickey-Fuller Test Unit Root Test

Test regression trend

Call: lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max
-1.64394 -0.11178 -0.00623  0.10994  1.33334

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.859e-02  2.729e-02   2.147  0.03211 *
z.lag.1     -1.009e-02  4.829e-03  -2.090  0.03691 *
tt           3.258e-06  3.369e-05   0.097  0.92299
z.diff.lag   1.269e-01  3.478e-02   3.649  0.00028 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2096 on 811 degrees of freedom
Multiple R-squared: 0.021, Adjusted R-squared: 0.01738
F-statistic: 5.799 on 3 and 811 DF, p-value: 0.0006355

Value of test-statistic is: -2.0902 1.6469 2.4589

Critical values for test statistics:
      1pct  5pct 10pct
tau3 -3.96 -3.41 -3.12
phi2  6.09  4.68  4.03
phi3  8.27  6.25  5.34

kpss.test(civ, null="Trend")
KPSS Test for Trend Stationarity

data: civ
KPSS Trend = 0.49797, Truncation lag parameter = 6, p-value = 0.01

kpss.test(diff(civ), null="Level")
KPSS Test for Level Stationarity

data: diff(civ)
KPSS Level = 0.064885, Truncation lag parameter = 6, p-value = 0.1
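The decision rules for the test outputs above can be sketched in a few lines of base R. This is only an illustration: the numbers are copied by hand from the printed output, the 5% cut-off for the KPSS p-values is an assumed convention, and the variable names are made up for this sketch.

```r
# Test statistics and critical values copied from the ur.df output above
adf_stat <- c(tau3 = -2.0902, phi2 = 1.6469, phi3 = 2.4589)
crit <- matrix(c(-3.96, -3.41, -3.12,
                  6.09,  4.68,  4.03,
                  8.27,  6.25,  5.34),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("tau3", "phi2", "phi3"),
                               c("1pct", "5pct", "10pct")))

# tau3 is left-tailed: reject the unit root only if the statistic is BELOW the critical value
reject_tau3 <- adf_stat["tau3"] < crit["tau3", ]
# phi2 and phi3 are right-tailed: reject only if the statistic is ABOVE the critical value
reject_phi2 <- adf_stat["phi2"] > crit["phi2", ]
reject_phi3 <- adf_stat["phi3"] > crit["phi3", ]

adf_decisions <- rbind(tau3 = reject_tau3, phi2 = reject_phi2, phi3 = reject_phi3)
print(adf_decisions)

# KPSS: the null is stationarity, so a small p-value speaks AGAINST stationarity
kpss_p <- c(level_trend = 0.01, diff_level = 0.1)
kpss_reject <- kpss_p < 0.05   # assumed 5% significance level
print(kpss_reject)
```

Note that the two tests have opposite null hypotheses: the ADF null is a unit root, while the KPSS null is stationarity.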

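The ADF test regression is \(\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \rho \Delta y_{t-1} + \varepsilon_t\). The coefficient \(\rho\) on the lagged difference is clearly significant, and we cannot reject that \(\beta\) is zero. The unit root null \(\gamma = 0\), however, has to be judged against the Dickey-Fuller critical values rather than the usual p-value printed in the regression table: the tau3 statistic of -2.0902 lies above even the 10% critical value of -3.12, so we cannot reject that \(\gamma\) is zero. The phi2 and phi3 statistics (1.6469 and 2.4589) are likewise below their 10% critical values, so the ADF test does not support trend stationarity.

The KPSS test agrees: with a p-value of 0.01 we reject the null of trend stationarity for the raw series, while for the first differences (p-value 0.1) we cannot reject level stationarity. Taken together, the tests point to a series with a unit root whose first differences are stationary. The model building below nevertheless works with the series in levels, so its results should be read with this in mind. Next we apply the Box-Jenkins methodology to build a time series model on the data until the end of 2014, and then check it for adequacy and plot a forecast until the end of 2016.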

Box Jenkins Methodology

We truncate the data to the end of 2014 and also take first differences of the truncated series for comparison.

civ2<-window(civ, end="Dec 2014")
diffciv2<-diff(civ2)

ACF & PACF Suggestions for Estimated Model

Let's look at the ACF and PACF of our data.

par(mfrow=c(2,2))
# ACF and PACF of the levels (top row) and of the first differences (bottom row)
acf(civ2,type='correlation',na.action=na.pass,lag.max=96)
acf(civ2,type='partial',na.action=na.pass,lag.max=96)
acf(diffciv2,type='correlation',na.action=na.pass,lag.max=96)
acf(diffciv2,type='partial',na.action=na.pass,lag.max=96)

Looking at both the raw data and the differenced data, the PACF seems to point to AR(5).
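To illustrate how a PACF suggests an AR order, here is a minimal sketch on a simulated series rather than the unemployment data. The AR(2) coefficients, the seed, and the sample size are arbitrary assumptions; the point is only that the lags whose partial autocorrelations fall outside the approximate 95% band are the candidates for the AR order.

```r
set.seed(1)
# Simulated stand-in series: an AR(2) process (coefficients chosen for illustration)
x <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 500)

# Partial autocorrelations up to lag 20, without drawing the plot
p <- pacf(x, lag.max = 20, plot = FALSE)

# Approximate 95% band under the null of no partial autocorrelation
band <- qnorm(0.975) / sqrt(length(x))

# Lags whose partial autocorrelation lies outside the band
significant_lags <- which(abs(p$acf[, 1, 1]) > band)
print(significant_lags)
```

With 20 lags inspected, an occasional isolated spike outside the band is expected by chance, so the highest lag before the spikes die out is a suggestion rather than a firm rule.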

Ljung-Box Q Statistic

ARMA53 <- arima(civ2, order=c(5,0,3))
tsdiag(ARMA53,gof.lag=12)

After trying out various ARIMA specifications and checking their residuals with the Ljung-Box Q statistic, I settled on an ARMA(5,3) process.
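The Ljung-Box check that tsdiag plots can also be computed directly with Box.test from base R. The sketch below uses a simulated AR(2) series as a stand-in (seed, coefficients, and lag choice are assumptions, not the model above); fitdf is set to the number of estimated ARMA coefficients so the degrees of freedom account for the fitted model.

```r
set.seed(2)
# Simulated stand-in series and a correctly specified AR(2) fit
y <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 500)
fit <- arima(y, order = c(2, 0, 0))

# Ljung-Box Q test on the residuals up to lag 12;
# fitdf = p + q removes the degrees of freedom used by the ARMA coefficients
lb <- Box.test(residuals(fit), lag = 12, type = "Ljung-Box", fitdf = 2)
print(lb)
```

A large p-value means the residual autocorrelations up to the chosen lag are jointly indistinguishable from zero, which is what an adequate model should produce.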

Now I test the coefficients of the ARMA(5,3) model for significance.
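One common way to run such a check, sketched here on a simulated AR(1) series rather than on the model above, is to divide each estimated coefficient by its standard error and compare the ratio with the standard normal distribution. The seed and the coefficient are illustrative assumptions.

```r
set.seed(3)
# Simulated stand-in series and fit
y <- arima.sim(model = list(ar = 0.7), n = 500)
fit <- arima(y, order = c(1, 0, 0))

est <- coef(fit)                 # point estimates (ar1, intercept)
se  <- sqrt(diag(vcov(fit)))     # asymptotic standard errors
z   <- est / se                  # z-statistics
p   <- 2 * pnorm(-abs(z))        # two-sided p-values (normal approximation)

print(round(cbind(estimate = est, std.error = se, z = z, p.value = p), 4))
```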

The ARMA(5,3) coefficients are indeed significant. Next we do the BIC check.

BIC Check

BIC(ARMA53)
## [1] -277.6448
AR50<-arima(civ2, order=c(5,0,0))
BIC(AR50)
## [1] -274.7875

We see that an ARMA(5,3) model is at least better than an AR(5) model, because the former's BIC value is lower (-277.64 versus -274.79).
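As a sketch of what this comparison computes, the BIC of an arima fit is -2 times the log-likelihood plus the number of estimated parameters times log(n). The example below uses a simulated series and two small candidate orders (all assumptions for illustration); bic_manual is a hypothetical helper written for this sketch.

```r
set.seed(4)
# Simulated stand-in series and two candidate models
y <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 500)
small <- arima(y, order = c(1, 0, 0))
large <- arima(y, order = c(2, 0, 0))

# Hypothetical helper: BIC by hand for an arima fit
bic_manual <- function(fit) {
  k <- length(coef(fit)) + 1            # coefficients plus the innovation variance
  -2 * fit$loglik + k * log(fit$nobs)
}

bics <- c(AR1 = BIC(small), AR2 = BIC(large))
print(bics)
print(names(which.min(bics)))           # the model with the lowest BIC is preferred
```

Because the penalty grows with the number of parameters, a larger model only wins if its gain in log-likelihood outweighs the extra terms.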

Forecasts

Now we construct forecasts using the ARMA(5,3) model and see how close they come to the actual unemployment rate values from Jan 2015 to Nov 2015. Then I plot those values together with a forecast through the end of 2016.

library("forecast")
ARMA53fore<-forecast(ARMA53,h=24)
plot(ARMA53fore, xlim=c(2005,2016),ylim=c(2,10))
lines(ARMA53fore$mean, type="p", pch=16, col="blue")
lines(civ, type="o", pch=16)

The blue points are our forecast, and the black line with points shows the actual values. The actual unemployment rate turns out lower than the ARMA(5,3) forecast would lead us to expect.
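The visual comparison can be backed up with a numerical hold-out check. The sketch below is not run on the FRED data: it simulates a monthly series, holds out the final 12 months, and measures the forecast errors with base R's predict. The series, seed, model order, and dates are all assumptions for illustration.

```r
set.seed(5)
# Simulated monthly stand-in series, Jan 1996 to Dec 2015
sim <- as.numeric(arima.sim(model = list(ar = c(0.6, 0.25)), n = 240))
y <- ts(5 + sim, start = c(1996, 1), frequency = 12)

# Fit on data through Dec 2014, hold out 2015
train <- window(y, end = c(2014, 12))
test  <- window(y, start = c(2015, 1))
fit <- arima(train, order = c(2, 0, 0))

# Forecast the hold-out period and compare with the actual values
fc   <- predict(fit, n.ahead = length(test))
err  <- test - fc$pred
bias <- mean(err)            # negative: actual values below the forecast on average
rmse <- sqrt(mean(err^2))    # typical size of the forecast error
print(c(bias = bias, rmse = rmse))
```

A negative bias corresponds to the situation described above, where the actual values sit below the forecast path.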