Libraries:

library(AER)
## Warning: package 'AER' was built under R version 4.3.2
## Loading required package: car
## Loading required package: carData
## Loading required package: lmtest
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: survival
library(vars)
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: urca
library(forecast)
## Warning: package 'forecast' was built under R version 4.3.2
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(seasonal)
library(tseries)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.2
library(dynlm)
library(qcc)
## Package 'qcc' version 2.7
## Type 'citation("qcc")' for citing this R package in publications.
library(Metrics)
## 
## Attaching package: 'Metrics'
## The following object is masked from 'package:forecast':
## 
##     accuracy
library(prophet)
## Loading required package: Rcpp
## Loading required package: rlang
## 
## Attaching package: 'rlang'
## The following object is masked from 'package:Metrics':
## 
##     ll

Introduction:

We will use the employment variable from the Orange County time series data set, loaded above via the AER package. This variable measures employment in thousands and tracks how employment changes over time. The data are already stored as a time series object, so we do not need to convert them.

Reference: Baltagi, B.H. (2002). Econometrics, 3rd ed. Berlin, Springer.

data(OrangeCounty)
OC <- OrangeCounty
head(OC)
## Time Series:
## Start = 1965.1 
## End = 1966.35 
## Frequency = 4 
##         employment      gnp
## 1965.10     288.00 906.6016
## 1965.35     298.75 919.6007
## 1965.60     302.78 934.0129
## 1965.85     308.65 956.7770
## 1966.10     312.73 975.4326
## 1966.35     331.45 979.3680
emp <- OC[, "employment"]
ggtsdisplay(emp)

When we visualize the data alongside the ACF and PACF, the series looks closest to an AR(1) process: the ACF decays slowly while the PACF is dominated by the first lag, the classic AR(1) signature. The employment data itself clearly increases over time, spanning from 1965 to 1984.
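
To back up this visual impression, a quick stationarity check with the Augmented Dickey-Fuller test from the already loaded tseries package is sketched below (an optional addition, not part of the original output):

# Optional check: Augmented Dickey-Fuller test for a unit root
# (a trending series like this one is typically found non-stationary)
adf.test(emp)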

train_emp <- ts(emp, start = 1965.1, end = 1980.35, frequency = 4)
test_emp <- emp[62:length(emp)]

Before we start forecasting, we want to split the employment data into a training part and a test part. Since an 80/20 train/test split is common, we do that here by specifying the split dates and using the length function. The training data will be used to fit the forecasts, and the test data will be used to compare against those forecasts and measure the errors.
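
As a quick sanity check on the 80/20 proportion (a small sketch added here, using the objects defined above):

# Share of observations in the training and test sets
length(train_emp) / length(emp)
length(test_emp) / length(emp)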

Forecasting using a Variety of Methods:

ARIMA:

arima_forecast <- forecast(auto.arima(train_emp), h = 15)
plot(arima_forecast)

The first forecasting model we consider is the ARIMA forecast. We can see from the title of the graph above that auto.arima() selected an AR(1) term with one order of differencing to make the data stationary, together with a seasonal ARMA(1,1) component with one order of seasonal differencing. Since employment numbers may vary with the season, and this seasonal ARIMA model captures that seasonality, the ARIMA forecast may be a good candidate for our preferred method. However, we will still see how other forecast methods perform.
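
To see the exact specification auto.arima() chose (the orders quoted above are read from the plot title), we could refit and inspect the model directly; a brief sketch, with checkresiduals() as an optional diagnostic:

# Refit to inspect the automatically selected seasonal ARIMA orders
arima_fit <- auto.arima(train_emp)
summary(arima_fit)
# Residual diagnostics: residual ACF and Ljung-Box test
checkresiduals(arima_fit)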

ETS:

ets_forecast <- forecast(ets(train_emp), h = 15)
plot(ets_forecast)

Compared to the ARIMA forecast, the ETS forecast has noticeably wider prediction intervals. The ETS forecast also captures seasonality within the data, since it explicitly models the error, trend, and seasonal components. The wider intervals can be explained by several factors, such as the ARIMA forecast fitting this series more accurately, or the ARIMA model capturing smaller movements in the data through its moving-average and seasonal terms.
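
A quick way to see which error/trend/seasonal form ets() settled on, and to plot its estimated components, is sketched below (not part of the original output):

# Inspect the selected ETS specification and its estimated states
ets_fit <- ets(train_emp)
summary(ets_fit)
# Plot the level, slope, and seasonal components
autoplot(ets_fit)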

Holt-Winters:

hw_forecast <- forecast(hw(train_emp, seasonal = c("additive")), h = 8)
plot(hw_forecast)

summary(hw_forecast)
## 
## Forecast method: Holt-Winters' additive method
## 
## Model Information:
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = train_emp, seasonal = c("additive")) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.2281 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 283.1107 
##     b = 7.0893 
##     s = 1.0782 -0.1907 5.3919 -6.2794
## 
##   sigma:  8.2003
## 
##      AIC     AICc      BIC 
## 526.2346 529.6961 545.3788 
## 
## Error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.214266 7.653009 6.012858 0.04639724 1.146515 0.1617117
##                    ACF1
## Training set 0.05458785
## 
## Forecasts:
##         Point Forecast    Lo 80    Hi 80    Lo 95     Hi 95
## 1980.60       848.9093 838.4002 859.4185 832.8370  864.9817
## 1980.85       860.2974 843.6543 876.9405 834.8440  885.7508
## 1981.10       863.0589 840.4499 885.6679 828.4814  897.6364
## 1981.35       884.8488 856.1354 913.5623 840.9354  928.7623
## 1981.60       889.3882 854.3394 924.4369 835.7858  942.9906
## 1981.85       900.7762 859.1297 942.4227 837.0834  964.4690
## 1982.10       903.5378 855.0192 952.0563 829.3350  977.7405
## 1982.35       925.3277 869.6611 980.9943 840.1930 1010.4624

Next we consider the Holt-Winters forecast method. Holt-Winters is closely related to ETS in that it models the level, trend, and seasonal components of the series. It does so with smoothing parameters (alpha, beta, gamma) that control how heavily recent observations are weighted in the level, trend, and seasonal equations respectively. Since alpha is close to 1, the most recent observation has a large influence on the forecast. The beta value implies there is some trend present, and the very small gamma implies the seasonal pattern changes little over time.
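
The smoothing parameters discussed above can also be read directly from the fitted model stored inside the forecast object; a small sketch:

# Printing the stored model reports alpha, beta, and gamma
hw_forecast$model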

Neural Networks:

nnetar_forecast <- forecast(nnetar(train_emp), h = 15)
plot(nnetar_forecast, ylim = c(400, 1000))

summary(nnetar_forecast)
## 
## Forecast method: NNAR(1,1,2)[4]
## 
## Model Information:
## 
## Average of 20 networks, each of which is
## a 2-2-1 network with 9 weights
## options were - linear output units 
## 
## Error measures:
##                       ME     RMSE     MAE         MPE     MAPE      MASE
## Training set 0.004113398 9.138583 7.06932 -0.03171407 1.377916 0.1901245
##                    ACF1
## Training set 0.01951275
## 
## Forecasts:
##         Point Forecast
## 1980.60       843.7229
## 1980.85       849.2472
## 1981.10       849.2493
## 1981.35       851.6422
## 1981.60       852.1336
## 1981.85       853.1894
## 1982.10       853.4551
## 1982.35       853.9083
## 1982.60       854.1002
## 1982.85       854.3146
## 1983.10       854.4100
## 1983.35       854.5045
## 1983.60       854.5580
## 1983.85       854.6046
## 1984.10       854.6311

The nnetar forecast, i.e. the forecast using a neural network, is particularly useful for analyzing non-linear relationships, such as the one between time and employment here. Visually, the nnetar forecast is closer to a straight line than the ARIMA and ETS forecasts. Based on this observation, the nnetar forecast may be the safer choice, though that may be conditional on this particular dataset. From the summary, we see that the output is linear and that the averaged networks have a 2-2-1 structure: two input nodes (a lag-1 input and a seasonal lag-4 input), one hidden layer with two neurons, and one output neuron, for nine weights (the total number of estimated parameters). The forecast method NNAR(1,1,2)[4] says the same thing: one non-seasonal lag, one seasonal lag, two hidden-layer neurons, and a seasonal frequency of 4, which we already knew.
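
Because nnetar() averages several randomly initialised networks, the fit can vary slightly between runs; the sketch below shows how one might refit with a seed and read off the pieces of the NNAR(1,1,2)[4] label (the element names p, P, and size are assumed from the structure of the returned nnetar object):

# Refit with a seed so the averaged networks are reproducible
set.seed(123)
nn_fit <- nnetar(train_emp)
print(nn_fit)  # reports NNAR(p, P, k)[m]
nn_fit$p       # non-seasonal lagged inputs
nn_fit$P       # seasonal lagged inputs
nn_fit$size    # neurons in the hidden layer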

Prophet:

emp_date <- seq(from = 1965.1, to = 1983.85, by = 0.25)
emp_df <- data.frame(ds = emp_date, y = emp)
prophet_model <- prophet(emp_df, yearly.seasonality=TRUE)
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
future <- make_future_dataframe(prophet_model, periods = 15, freq = 4)  
prophet_forecast <- predict(prophet_model, future)
prophet_forecast <- prophet_forecast$yhat[66:80]
plot(prophet_forecast)

The Prophet model is a decomposable forecasting model that fits a trend plus seasonal components, where the seasonality is represented by Fourier series, i.e. sums of sine and cosine terms, to capture the periodic part of the data. Visually, the seasonal fluctuations in our employment data are not particularly strong, so intuitively the Prophet forecast may not be the best choice here.
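
To illustrate the Fourier idea behind this kind of seasonal modelling (a sketch only, and not the model Prophet itself fits), one could estimate a harmonic regression with forecast::fourier(), using a single sine/cosine pair (K = 1 is an arbitrary choice here):

# Harmonic regression: linear trend plus one pair of Fourier terms
fourier_fit <- tslm(train_emp ~ trend + fourier(train_emp, K = 1))
fourier_fc <- forecast(fourier_fit,
                       newdata = data.frame(fourier(train_emp, K = 1, h = 15)))
plot(fourier_fc)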

Measuring Errors with the Train and Test Data:

Now we will use a couple of error measures to see which forecast performs best against the test data.

mape(arima_forecast$mean, test_emp)
## [1] 0.06834808
mape(ets_forecast$mean, test_emp)
## [1] 0.03255725
mape(hw_forecast$mean, test_emp[1:8])
## [1] 0.02617629
mape(nnetar_forecast$mean, test_emp)
## [1] 0.01502832
mape(prophet_forecast, test_emp)
## [1] 0.04539546
rmse(arima_forecast$mean, test_emp)
## [1] 77.84847
rmse(ets_forecast$mean, test_emp)
## [1] 37.25079
rmse(hw_forecast$mean, test_emp[1:8])
## [1] 29.33354
rmse(nnetar_forecast$mean, test_emp)
## [1] 15.34284
rmse(prophet_forecast, test_emp)
## [1] 74.55366

Based on the MAPE (mean absolute percentage error, which measures proportional error) and the RMSE (root mean squared error, which measures the typical size of the error in the original units), we see that the neural network forecast is preferred, followed by Holt-Winters and then the ETS forecast. However, for combining forecasts we will not use Holt-Winters, since we only forecasted 8 steps ahead with it, compared with 15 steps ahead for all the other methods. Additionally, Holt-Winters is itself a special case of the ETS framework, so much of it is already represented by the ETS forecast.
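
For clarity, both measures can also be computed by hand; a small sketch using the ARIMA forecast as the example. (Note that Metrics documents the actual values as the first argument of mape() and rmse(); the calls above pass the forecast first, which leaves the RMSE unchanged but puts the forecast, rather than the test data, in the MAPE denominator.)

# MAPE and RMSE by hand for the ARIMA forecast,
# with the actual test values in the MAPE denominator
arima_err <- test_emp - arima_forecast$mean
mean(abs(arima_err / test_emp))  # MAPE (as a proportion)
sqrt(mean(arima_err^2))          # RMSE (thousands of employees)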

Combining Forecasts using average:

ets_nnetar_forecast <- (ets_forecast$mean + nnetar_forecast$mean)/2
plot(train_emp, xlim = c(1970, 1985), ylim = c(300, 1000), col = "red", main = "Forecast of ets and nnetar combination")
lines(ets_nnetar_forecast, col = "blue")

By combining the NNAR and ETS forecasts through a simple average, we obtain a combined forecast that sits between, and looks very similar to, the two original forecasts.
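
To see how the averaged series sits relative to its two parents, one can overlay all three forecasts on the same axes (a plotting sketch; colours are arbitrary):

# Overlay the ETS forecast, the NNAR forecast, and their average
plot(train_emp, xlim = c(1970, 1985), ylim = c(300, 1000), col = "red",
     main = "ETS, NNAR, and their average")
lines(ets_forecast$mean, col = "darkgreen")
lines(nnetar_forecast$mean, col = "purple")
lines(ets_nnetar_forecast, col = "blue")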

Combining ets and nnetar using regression:

forecast_regression <- lm(test_emp ~ ets_forecast$mean + nnetar_forecast$mean)
fitted_regression <- fitted(forecast_regression)  # in-sample fitted values of the combination regression
fitted_regression <- ts(fitted_regression, start = c(1980.35), end = c(1983.85), frequency = 4)

plot(train_emp, type = "l", col = "red", xlim = c(1970, 1985), main = "Original Data with Combined Forecast", ylim = c(400, 1000))
lines(fitted_regression, col = "blue")

In this type of combination, we regress the test data on the NNAR and ETS forecast values and use the fitted values of that regression as the combined forecast. Although the result looks similar to the simple average combination, the upward trend in this version appears flatter and more horizontal.
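
The weights this regression implicitly assigns to each forecast are just its estimated coefficients; a quick way to inspect them:

# Estimated combination weights: intercept plus one coefficient per forecast
summary(forecast_regression)$coefficients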

Measuring errors of the Combination Forecasts:

mape(fitted_regression, test_emp)
## [1] 0.01082425
rmse(fitted_regression, test_emp)
## [1] 11.43378
mape(ets_nnetar_forecast, test_emp)
## [1] 0.01698182
rmse(ets_nnetar_forecast, test_emp)
## [1] 18.06976

We see that although both combinations have low errors, the regression combination is preferred based on both the MAPE and the RMSE. Compared with the individual forecast methods, the NNAR forecast's errors are slightly lower than those of the simple average combination, but we must consider the "thief" aspect of the combined average forecast: the combination "steals" different components from the individual forecasts, drawing on information that no single method uses on its own.

Conclusion:

In this project, we used various forecasting methods, including ARIMA, ETS, Holt-Winters, NNAR, and Prophet, to predict future values of the employment data. After splitting the data into training and test sets with an 80/20 proportion, we selected a couple of the individual forecasts based on the lowest MAPE and RMSE. We then combined forecasts in two ways: a simple average and a regression-based combination. The regression combination of NNAR and ETS had the lowest errors, so we conclude that it is the preferred method among all those considered. Finally, we should acknowledge that although the average combination did not have the lowest errors, it can still hold an advantage over some individual methods because of the "thief" aspect, since it draws on distinct components of each individual forecast to produce a more unbiased combined forecast.

To improve this project in the future, we could use additional forecasting methods, such as the state space models we learned about. We could also pay more attention to the forecast horizon, since we have not studied in this class the ideal horizon for predicting our values. Lastly, we might choose a dataset better suited to these methods, since the employment data did not exhibit much seasonal variation, while many of the forecasting models we covered are designed to decompose seasonal patterns in order to produce more refined forecasts.