Applying various time series methods, I develop several models to forecast US monthly seasonally adjusted non-farm payrolls. I first develop an ARIMA model using the auto.arima function from the forecast package. Then I develop a SARIMA model using the astsa package, where I fine-tune the number of parameters to minimize the model’s AIC. My third forecast is a regression model with ARMA errors using several covariates as leading indicators. My final forecast is a composite of the three models.
My composite forecast for the change in May 2016 non-farm payrolls was 189,000, which was wildly incorrect: the actual change for May was 36,000. I plan to update the models on a rolling basis as each month’s non-farm payrolls are released. The most recent jobs report from the BLS can be found here.
Before developing a forecast I want to look at a few plots:
The monthly change in non-farm payrolls also shows the effect of the Great Recession. There were 23 straight months with negative changes in non-farm payrolls. The worst month saw 823,000 jobs lost. Since the Great Recession ended, however, the average monthly change in non-farm payrolls has been 157,256.
The ACF and PACF can give insight into how the data correlate with themselves over time. The ACF plot shows high correlation at large lags, suggesting the data are not stationary and differencing may be necessary.
The differenced data look much easier to model and suggest an AR(1) model may be a good fit.
This forecast is based on three models discussed in more detail below: auto.arima, SARIMA, and regression with ARMA errors.
The auto.arima model is developed using the auto.arima function from the forecast package. The output of the function indicates that the best model is ARIMA(0,1,1)(2,0,0)[12]: one MA term, one round of differencing, and two seasonal AR terms at a 12-month period. This model resulted in a forecast of about 189,000 jobs for May 2016.
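As a rough illustration of the reasoning above, the sketch below compares the ACF of a non-stationary level series with the ACF and PACF of its first difference. It uses a simulated series (`sim.level`, an assumption made only for this example), not the BLS payroll data, and relies only on base R.

```r
# Simulated stand-in for a trending level series (NOT the BLS data):
# an AR(1) monthly change, cumulated into a non-stationary level.
set.seed(42)
sim.change <- arima.sim(model = list(ar = 0.6), n = 240)
sim.level  <- ts(cumsum(as.numeric(sim.change)), frequency = 12)

# ACF of the level series: correlation stays high even at large lags,
# the usual sign that differencing is needed.
acf.level <- acf(sim.level, lag.max = 24, plot = FALSE)

# One round of differencing recovers the stationary monthly change.
sim.diff  <- diff(sim.level)
acf.diff  <- acf(sim.diff, lag.max = 24, plot = FALSE)
pacf.diff <- pacf(sim.diff, lag.max = 24, plot = FALSE)

# For an AR(1)-like series, the PACF has one dominant spike at lag 1.
# Note: acf()$acf starts at lag 0, pacf()$acf starts at lag 1.
round(c(level.acf.lag12 = acf.level$acf[13],
        diff.acf.lag12  = acf.diff$acf[13],
        diff.pacf.lag1  = pacf.diff$acf[1]), 3)
```

Setting `plot = TRUE` (the default) draws the correlograms instead of returning the values silently.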
(model1 <- auto.arima(emp.change))
## Series: emp.change
## ARIMA(0,1,1)(2,0,0)[12]
##
## Coefficients:
## ma1 sar1 sar2
## -0.4662 -0.1222 -0.1955
## s.e. 0.0607 0.0734 0.0729
##
## sigma^2 estimated as 14904: log likelihood=-1206.52
## AIC=2421.05 AICc=2421.26 BIC=2434.12
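To show how a one-month-ahead forecast can be read off a model of this form, here is a minimal sketch that fits the same orders, ARIMA(0,1,1)(2,0,0)[12], with base R's `arima()` rather than the forecast package. The series `sim.change` is simulated for illustration and is not the payroll data, so the numbers it produces say nothing about actual payrolls.

```r
# Simulated monthly series standing in for emp.change (NOT the BLS data).
set.seed(1)
sim <- arima.sim(model = list(order = c(0, 1, 1), ma = -0.45),
                 n = 200, sd = 120)
sim.change <- ts(150 + as.numeric(sim), frequency = 12)

# Same orders as the auto.arima output above: ARIMA(0,1,1)(2,0,0)[12].
fit <- arima(sim.change,
             order    = c(0, 1, 1),
             seasonal = list(order = c(2, 0, 0), period = 12),
             method   = "ML")

# One-step-ahead point forecast and its standard error.
fc    <- predict(fit, n.ahead = 1)
point <- as.numeric(fc$pred)
se    <- as.numeric(fc$se)

# Approximate 95% interval, assuming normally distributed errors.
c(lower = point - 1.96 * se, forecast = point, upper = point + 1.96 * se)
```

With the forecast package loaded, `forecast(model1, h = 1)` returns the equivalent point forecast and intervals for the fitted `model1` object.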
The SARIMA model is developed using the sarima() function from the astsa package, with the number of model terms selected to minimize the model’s AIC while keeping the model relatively simple. I went with an ARIMA(0,1,1)(0,0,1) model, which resulted in a forecast of about 176,000 jobs. The model diagnostics all look reasonable: the residuals resemble white noise, there are no significant spikes in the ACF plot of the residuals, the Ljung-Box p-values are all fairly high, and the residuals appear roughly normally distributed.
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, optim.control = list(trace = trc,
## REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 sma1
## -0.4511 -1.0000
## s.e. 0.0635 0.0642
##
## sigma^2 estimated as 15883: log likelihood = -1155.3, aic = 2316.6
##
## $AIC
## [1] 10.69351
##
## $AICc
## [1] 10.70441
##
## $BIC
## [1] 9.727075
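The residual checks mentioned for this model can also be run by hand. The sketch below is an approximation using base R only: it fits a model with the same stated orders to a simulated series (not the payroll data; the 12-month seasonal period is an assumption), then applies a Ljung-Box test and a Shapiro-Wilk normality test to the residuals.

```r
# Simulated monthly series (NOT the BLS data), used only to have
# something to fit; the seasonal period of 12 is an assumption.
set.seed(7)
sim <- arima.sim(model = list(order = c(0, 1, 1), ma = -0.45),
                 n = 200, sd = 120)
sim.change <- ts(as.numeric(sim), frequency = 12)

fit <- arima(sim.change,
             order    = c(0, 1, 1),
             seasonal = list(order = c(0, 0, 1), period = 12),
             method   = "ML")
res <- residuals(fit)

# Ljung-Box test for remaining autocorrelation up to lag 20.
# fitdf = 2 removes one degree of freedom per estimated MA/SMA term.
lb <- Box.test(res, lag = 20, type = "Ljung-Box", fitdf = 2)

# Shapiro-Wilk test as a rough check of residual normality.
sw <- shapiro.test(as.numeric(res))

# High p-values mean no evidence against white-noise, normal residuals.
c(ljung.box.p = lb$p.value, shapiro.p = sw$p.value)
```

A high Ljung-Box p-value does not prove the model is correct; it only indicates that the test found no remaining autocorrelation at the lags examined.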
The regression model with ARMA errors is developed using several leading indicators:
The plot below shows the fit of the three forecasts described above.
This plot shows the forecasts of the three models.
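The composite forecast combines the three model forecasts into one number. The weighting used is not stated here, so the sketch below assumes a simple weighted average with equal weights by default; the helper `composite_forecast` and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Hypothetical helper: weighted average of several point forecasts.
# Equal weights are an assumption, not the method used in this post.
composite_forecast <- function(forecasts,
                               weights = rep(1 / length(forecasts),
                                             length(forecasts))) {
  stopifnot(length(forecasts) == length(weights),
            abs(sum(weights) - 1) < 1e-8)
  sum(forecasts * weights)
}

# Placeholder values (thousands of jobs), NOT the actual model forecasts.
example <- c(model_a = 100, model_b = 200, model_c = 300)

equal.wt  <- composite_forecast(example)                        # 200
custom.wt <- composite_forecast(example, c(0.5, 0.25, 0.25))    # 175
```

Unequal weights could, for example, favor whichever model has had the smallest recent forecast errors.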