We are going to try several different forecasting methods for total employment. I grabbed the non-seasonally adjusted data, just to make it more interesting.
Download Employment total level Data
##########################################################
library(knitr)
library(fpp2)
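# Read the level and change series (the second column holds the values)
Emp <- read.csv("D:/predictive analytics/discussions/PAYNSA.csv", header = TRUE)
Emp.chg <- read.csv("D:/predictive analytics/discussions/PAYNSA_chg.csv", header = TRUE)
# Monthly series starting January 1939; start takes c(year, month)
Emp <- ts(Emp[,2], start=c(1939,1), frequency=12)
Emp.chg <- ts(Emp.chg[,2], start=c(1939,1), frequency=12)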
autoplot(Emp)+ylab("Total Employment (Thousands)")
##########################################################
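As a quick check on the time index, here is a minimal sketch of how `ts()` reads `start` for monthly data. The values are placeholders I made up, not the PAYNSA series.

```r
# Placeholder values only; not the PAYNSA data.
x <- ts(c(101, 102, 103, 104, 105, 106), start = c(1939, 1), frequency = 12)

start(x)   # first observation: year 1939, period 1 (January)
end(x)     # last observation: year 1939, period 6 (June)
cycle(x)   # month index (1-12) attached to each observation

# Pull out March and April by (year, month) pairs
mar_apr <- window(x, start = c(1939, 3), end = c(1939, 4))
mar_apr
```

The same `c(year, month)` pairs work in `window()`, which is handy for trimming a long series like this one down to recent years.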
Decomposition, just to see what we are working with. Additive seems to be the way to go here. I should also note that the ggsubseriesplot is drawn on the year difference in total employment (Emp.chg) rather than the level, in order to show which months are affected by seasonality. December is clearly where the seasonal jobs are lost.
Decompose
##########################################################
dec1<-decompose(Emp,type="additive") #decompose additive
dec2<-decompose(Emp,type="multiplicative") #decompose multiplicative
autoplot(dec1)
autoplot(dec2)
ggsubseriesplot(Emp.chg)
##########################################################
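To make the additive versus multiplicative distinction concrete, here is a small sketch of what `decompose()` returns in each case. It uses the built-in `AirPassengers` series as a stand-in, since the employment file is not bundled with R, so the numbers say nothing about employment.

```r
# AirPassengers ships with base R and stands in for the employment series.
y <- AirPassengers

add <- decompose(y, type = "additive")
mul <- decompose(y, type = "multiplicative")

# Rebuild the series from its components. The trend is NA for the first and
# last six months because it is a centred 12-month moving average.
rebuilt_add <- add$trend + add$seasonal + add$random
rebuilt_mul <- mul$trend * mul$seasonal * mul$random

ok <- !is.na(add$trend)
max(abs(rebuilt_add[ok] - y[ok]))   # effectively zero
max(abs(rebuilt_mul[ok] - y[ok]))   # effectively zero

# The 12 seasonal effects: additive ones sum to about 0,
# multiplicative ones average about 1.
round(add$figure, 1)
round(mul$figure, 3)
```

Additive effects are in the units of the series (thousands of jobs here), while multiplicative effects are ratios, which is worth keeping in mind when comparing the two decomposition plots.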
Simple exponential smoothing gives a flat-line forecast. Obviously, carrying the last smoothed level forward is not the best forecast for a series with this much trend and seasonality.
Simple Smoothing
fc.ses<-ses(Emp, h=12)
#forecast
autoplot(fc.ses)+
autolayer(fitted(fc.ses), series="Fitted")+
ylab("Total Employment (Thousands)")+xlab("year")+
xlim(c(2018, 2022))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 948 row(s) containing missing values (geom_path).
## Warning: Removed 948 row(s) containing missing values (geom_path).
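The flat line comes straight from the way simple exponential smoothing works. A minimal hand-rolled sketch: `ses_level`, the data, and the `alpha` value are all illustrative assumptions on my part, whereas `ses()` estimates the smoothing parameter and starting level from the data.

```r
# Simple exponential smoothing by hand. alpha and the starting level are
# illustrative choices; ses() estimates both from the data.
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    level <- alpha * obs + (1 - alpha) * level   # update the level
  }
  level
}

y <- c(100, 104, 103, 108, 110)   # made-up values
last_level <- ses_level(y, alpha = 0.9)

# Every horizon gets the same point forecast: the final smoothed level.
fc <- rep(last_level, 12)
fc
```

With `alpha` close to 1 the level sits almost on top of the last observation, so the forecast is close to "next month equals this month" for every month ahead.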
Holt's method yields a slight uptick, which we would somewhat expect given the unique times we are in. However, it fails to capture the seasonality that usually accompanies employment.
Holt
fc.holt<-holt(Emp, h=12)
#forecast
autoplot(fc.holt)+
autolayer(fitted(fc.holt), series="Fitted")+
ylab("Total Employment (Thousands)")+xlab("year")+
xlim(c(2018, 2022))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 948 row(s) containing missing values (geom_path).
## Warning: Removed 948 row(s) containing missing values (geom_path).
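Holt's method adds a trend term to the level, which is why the forecast slopes but has no seasonal wiggle. A rough sketch of the recursion follows; `holt_forecast`, the parameter values, and the starting values are my own illustrative assumptions, not what `holt()` estimated for the employment data.

```r
# Holt's linear trend method by hand. alpha, beta, and the starting values
# are illustrative; holt() estimates them from the data.
holt_forecast <- function(y, alpha, beta, h) {
  level <- y[1]
  trend <- y[2] - y[1]
  for (obs in y[-1]) {
    prev_level <- level
    level <- alpha * obs + (1 - alpha) * (prev_level + trend)
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  level + seq_len(h) * trend   # a straight line: no seasonal term anywhere
}

y <- c(100, 102, 104, 106, 108)   # made-up series rising by 2 per period
fc <- holt_forecast(y, alpha = 0.8, beta = 0.2, h = 3)
fc
```

On this toy series the forecast simply continues the 2-per-period line. Nothing in the recursion knows what month it is, so a December or January effect cannot show up.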
Holt-Winters gives us the seasonality we are looking for. It shows a relatively flat trajectory for total employment over the next two years, adding only 1.2 million jobs, or 50,000 jobs a month. That would be an extremely slow recovery (the U.S. was averaging between 100k and 200k jobs per month during the last recovery, and that was considered frustratingly slow). That being said, this is probably where extra parameters would be useful, and we should hedge to the upside of this forecast. The upper bound of the 95% interval implies an 8.2 million job increase, an average of 342,000 jobs a month, which would be a strong jobs recovery, though the labor market has done better than expected since its initial April drop.
Holt-Winters
fc.hw<-hw(Emp, seasonal="additive")
fc.hw
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Nov 2020 143583.8 142587.6 144580.0 142060.2 145107.4
## Dec 2020 143140.4 141832.5 144448.4 141140.1 145140.8
## Jan 2021 140100.7 138538.5 141662.9 137711.5 142489.9
## Feb 2021 140923.4 139139.6 142707.2 138195.3 143651.5
## Mar 2021 141338.5 139354.8 143322.2 138304.7 144372.3
## Apr 2021 139343.2 137175.3 141511.2 136027.7 142658.8
## May 2021 143262.5 140922.2 145602.7 139683.4 146841.5
## Jun 2021 144582.8 142079.8 147085.9 140754.7 148410.9
## Jul 2021 143008.5 140350.3 145666.7 138943.1 147073.8
## Aug 2021 143052.7 140245.8 145859.6 138759.9 147345.5
## Sep 2021 143256.5 140306.3 146206.8 138744.5 147768.5
## Oct 2021 144091.1 141002.2 147180.0 139367.0 148815.1
## Nov 2021 144215.9 140944.5 147487.2 139212.8 149219.0
## Dec 2021 143772.5 140371.9 147173.1 138571.7 148973.2
## Jan 2022 140732.8 137205.9 144259.6 135338.9 146126.7
## Feb 2022 141555.5 137905.0 145206.0 135972.5 147138.5
## Mar 2022 141970.6 138198.8 145742.4 136202.1 147739.1
## Apr 2022 139975.3 136084.3 143866.3 134024.5 145926.1
## May 2022 143894.5 139886.3 147902.8 137764.5 150024.6
## Jun 2022 145214.9 141091.2 149338.6 138908.2 151521.6
## Jul 2022 143640.5 139402.9 147878.2 137159.7 150121.4
## Aug 2022 143684.8 139334.7 148034.8 137032.0 150337.6
## Sep 2022 143888.6 139427.4 148349.7 137065.9 150711.3
## Oct 2022 144723.1 140152.1 149294.1 137732.4 151713.9
#forecast
autoplot(fc.hw)+
autolayer(fitted(fc.hw), series="Fitted")+
ylab("Total Employment (Thousands)")+xlab("year")+
xlim(c(2018, 2022))+ylim(c(120000, 155000))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 948 row(s) containing missing values (geom_path).
## Warning: Removed 948 row(s) containing missing values (geom_path).
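The jobs-per-month figures quoted for the Holt-Winters forecast are simple arithmetic on the 24-month horizon. The sketch below reproduces them from the numbers in the text and the Oct 2022 row of the table; the implied starting level is back-calculated by me and is not printed anywhere in the output, so treat it as approximate.

```r
# Figures quoted in the text (thousands of jobs) and the 24-month horizon.
horizon_months <- 24
point_gain <- 1200    # "1.2 million jobs"
upper_gain <- 8200    # "8.2 million job increase"

point_gain / horizon_months   # thousand jobs per month, point forecast
upper_gain / horizon_months   # thousand jobs per month, upper 95% bound

# Oct 2022 row of the Holt-Winters table above.
oct_2022_point <- 144723.1
oct_2022_hi95  <- 151713.9

# Both gains should imply roughly the same starting level; they differ
# slightly only because the gains were rounded to 0.1 million.
implied_start_point <- oct_2022_point - point_gain
implied_start_hi95  <- oct_2022_hi95 - upper_gain
c(implied_start_point, implied_start_hi95)
```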
The ETS model has an even more pessimistic point forecast than Holt-Winters, though it will likely score better on the accuracy measures.
ETS
fc.ets<-ets(Emp)
fc.ets
## ETS(M,Ad,M)
##
## Call:
## ets(y = Emp)
##
## Smoothing parameters:
## alpha = 0.927
## beta = 0.0405
## gamma = 0.073
## phi = 0.9784
##
## Initial states:
## l = 29404.6748
## b = 384.5581
## s = 1.019 1.0069 1.0079 1.0092 1.0025 0.9968
## 0.9994 0.9959 0.9943 0.9954 0.9837 0.9891
##
## sigma: 0.0066
##
## AIC AICc BIC
## 19143.09 19143.80 19231.10
forecast(fc.ets, h=24)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Nov 2020 143607.5 142386.1 144828.9 141739.5 145475.5
## Dec 2020 143179.4 141484.1 144874.7 140586.7 145772.2
## Jan 2021 140082.4 138033.8 142131.1 136949.3 143215.6
## Feb 2021 140617.9 138208.9 143026.9 136933.6 144302.2
## Mar 2021 140855.4 138112.7 143598.1 136660.8 145050.0
## Apr 2021 140082.8 137042.9 143122.8 135433.6 144732.1
## May 2021 142003.6 138616.5 145390.7 136823.4 147183.8
## Jun 2021 142460.0 138763.6 146156.3 136806.9 148113.1
## Jul 2021 140749.1 136808.5 144689.7 134722.4 146775.8
## Aug 2021 140644.4 136423.0 144865.8 134188.3 147100.5
## Sep 2021 140960.0 136448.5 145471.5 134060.2 147859.8
## Oct 2021 141835.6 137016.7 146654.4 134465.7 149205.4
## Nov 2021 142016.1 136884.5 147147.7 134168.0 149864.2
## Dec 2021 141625.4 136235.1 147015.8 133381.6 149869.3
## Jan 2022 138593.4 133052.9 144133.9 130120.0 147066.8
## Feb 2022 139154.0 133325.9 144982.2 130240.7 148067.4
## Mar 2022 139419.3 133315.5 145523.2 130084.3 148754.3
## Apr 2022 138684.2 132350.4 145018.0 128997.4 148371.0
## May 2022 140615.1 133928.2 147302.0 130388.4 150841.9
## Jun 2022 141095.9 134121.1 148070.6 130429.0 151762.8
## Jul 2022 139429.3 132276.0 146582.7 128489.2 150369.5
## Aug 2022 139353.0 131943.3 146762.6 128020.9 150685.0
## Sep 2022 139692.6 132004.8 147380.4 127935.1 151450.0
## Oct 2022 140586.8 132588.7 148584.8 128354.8 152818.8
fc.ets%>%forecast(h=24)%>%
autoplot()+
ylab("Total Employment (Thousands)")+xlab("year")+
xlim(c(2018, 2022))+ylim(c(120000, 155000))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 948 row(s) containing missing values (geom_path).
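One thing that sets the selected ETS(M,Ad,M) model apart from the Holt-Winters fit is the damped trend (`phi = 0.9784` in the output above). In the usual damped-trend formulation, the trend at horizon h is scaled by phi + phi^2 + ... + phi^h rather than by h. The sketch below only illustrates that scaling; I am not claiming it fully explains the gap between the two point forecasts.

```r
phi <- 0.9784            # damping parameter reported by ets(Emp)
h <- c(1, 12, 24)

# Cumulative trend multiplier phi + phi^2 + ... + phi^h at each horizon
damped <- sapply(h, function(k) sum(phi^seq_len(k)))
round(damped, 2)

# An undamped trend would be multiplied by h itself (1, 12, 24).
# However far ahead we go, the damped multiplier never exceeds this limit:
limit <- phi / (1 - phi)
limit
```

At the 24-month horizon the multiplier is about 18.5 instead of 24, so whatever trend the model carries at the end of the sample contributes less and less with each extra month.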
Accuracy SES
accuracy(fc.ses)
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 116.2659 1068.951 656.3999 0.155482 0.7779856 0.3057198 0.01896624
Accuracy Holt
accuracy(fc.holt)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.6707129 1062.72 609.5694 -0.003271905 0.720219 0.2839083
## ACF1
## Training set 0.02141457
Accuracy Holt-Winters
accuracy(fc.hw)
## ME RMSE MAE MPE MAPE MASE
## Training set -33.28087 770.98 265.0773 -0.06544405 0.3831697 0.1234604
## ACF1
## Training set 0.2098566
Accuracy ETS
accuracy(fc.ets)
## ME RMSE MAE MPE MAPE MASE
## Training set 32.24793 752.6262 251.8645 0.03956541 0.3301809 0.1173065
## ACF1
## Training set 0.1079829
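All four tables above are training-set accuracy, so they measure fit rather than forecasting performance. A hedged sketch of a holdout check is below. It uses the built-in `AirPassengers` series and base R's `HoltWinters()` as stand-ins for the employment data and `hw()`, and the split date is arbitrary.

```r
# AirPassengers stands in for the employment series; the split is arbitrary.
y <- AirPassengers
train <- window(y, end = c(1958, 12))
test  <- window(y, start = c(1959, 1))

# Base R's HoltWinters() is used here as a stand-in for hw().
fit  <- HoltWinters(train, seasonal = "additive")
pred <- predict(fit, n.ahead = length(test))

# Out-of-sample errors on the held-out two years
err <- as.numeric(test) - as.numeric(pred)
holdout <- c(RMSE = sqrt(mean(err^2)),
             MAE  = mean(abs(err)),
             MAPE = mean(abs(err / as.numeric(test))) * 100)
holdout
```

With the forecast package loaded, passing the held-out data as the second argument, e.g. `accuracy(hw(train, h = length(test)), test)`, should report a test-set row alongside the training-set one.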
ETS is the most accurate on the training set, with the lowest RMSE, MAE, MAPE, and MASE of the four methods. Holt-Winters is a close second, and both are well ahead of SES and Holt. The errors are still pretty high, though I have to attribute some of that to the drop in April, which the decomposition charts show was historic. In a fuller forecast, I would use a dummy or a knot to account for the decline. Going forward, it may be necessary to control for the pandemic period, when a black swan event produced extreme movements in the labor market. The BLS itself has admitted that its labor data is not completely accurate, since many workers do not fit neatly into one of its predetermined labor classifications.
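For the dummy idea, here is a minimal sketch on simulated data. The series, the size of the shock, and the plain `lm()` regression are all assumptions made for illustration, not a model of the actual PAYNSA data.

```r
set.seed(1)
n <- 60
t_idx <- seq_len(n)

# Simulated monthly series, Jan 2017 to Dec 2021: linear trend plus noise.
y <- ts(100 + 0.5 * t_idx + rnorm(n), start = c(2017, 1), frequency = 12)

# Hypothetical shock: a permanent drop of 20 from April 2020 onward.
# Observation 40 is April 2020 (Jan 2017 = 1, Jan 2020 = 37).
shock <- as.numeric(t_idx >= 40)
y <- y - 20 * shock

# Regress on the trend and the step dummy; the dummy absorbs the drop.
fit <- lm(y ~ t_idx + shock)
round(coef(fit), 2)
```

The coefficient on `shock` recovers the size of the drop, so the trend estimate is not dragged down by it. `ets()` does not take regressors, so in the forecast package a dummy like this would more likely go through `xreg` in something like `auto.arima()`.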