Write-up: 1 – Basic Forecasting Methods

Aayush Sethi

2021-09-25

Naive Forecasting

BJ Sales Data

library(forecast)  # provides naive(), snaive(), ses(), holt(), auto.arima()
naive_data <- data.frame(BJsales)
head(naive_data)
##   BJsales
## 1   200.1
## 2   199.5
## 3   199.4
## 4   198.9
## 5   199.0
## 6   200.2

Forecast

naive_mod <- naive(naive_data$BJsales, h = 12)
summary(naive_mod)
## 
## Forecast method: Naive method
## 
## Model Information:
## Call: naive(y = naive_data$BJsales, h = 12) 
## 
## Residual sd: 1.4992 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE      MAPE MASE      ACF1
## Training set 0.4201342 1.499217 1.161074 0.1804754 0.5104591    1 0.3117991
## 
## Forecasts:
##     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 151          262.7 260.7787 264.6213 259.7616 265.6384
## 152          262.7 259.9828 265.4172 258.5445 266.8555
## 153          262.7 259.3722 266.0278 257.6105 267.7895
## 154          262.7 258.8574 266.5426 256.8232 268.5768
## 155          262.7 258.4038 266.9962 256.1295 269.2705
## 156          262.7 257.9937 267.4063 255.5024 269.8976
## 157          262.7 257.6167 267.7833 254.9257 270.4743
## 158          262.7 257.2657 268.1343 254.3889 271.0111
## 159          262.7 256.9360 268.4640 253.8848 271.5152
## 160          262.7 256.6242 268.7758 253.4079 271.9921
## 161          262.7 256.3277 269.0723 252.9544 272.4456
## 162          262.7 256.0443 269.3557 252.5210 272.8790
plot(naive_mod)

Seasonal Naive Model

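The seasonal naive method repeats the value from the same season of the previous cycle. BJsales has a frequency of 1, so there is no seasonal period to look back to and snaive() returns the same output as the naive forecast.

  • A flat forecast ignores the rise in the series from about 200 to 262.7, so neither naive variant is a good fit for this data set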
snaive_fore <- snaive(naive_data$BJsales, h=4)
snaive_fore$mean
## Time Series:
## Start = 151 
## End = 154 
## Frequency = 1 
## [1] 262.7 262.7 262.7 262.7

Interpretation

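The output above shows that the naive method uses the last observed value (262.7) as the point forecast for every step of the forecasting horizon. Only the prediction intervals change, widening as the horizon grows.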
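As a rough check on this interpretation, the naive point forecasts and the 80% interval can be rebuilt in base R. This is a sketch rather than the forecast package's own code: it assumes the interval half-width is the normal quantile times the residual standard deviation times the square root of the step, which agrees with the printed table. The names `point`, `sigma`, `lo80`, and `hi80` are illustrative.

```r
# Sketch: rebuild the naive forecast by hand (base R only).
y <- as.numeric(BJsales)   # BJsales ships with base R's datasets package
h <- 12

# Point forecast: the last observation, repeated for every horizon step.
point <- rep(tail(y, 1), h)

# One-step naive residuals are the first differences of the series.
# Assumption: the residual sd is the root mean squared residual.
sigma <- sqrt(mean(diff(y)^2))

# Assumption: forecast sd grows as sigma * sqrt(step), as for a random walk.
steps <- seq_len(h)
lo80 <- point - qnorm(0.90) * sigma * sqrt(steps)
hi80 <- point + qnorm(0.90) * sigma * sqrt(steps)

round(cbind(point, lo80, hi80), 4)
```

The first rows can be compared against the Lo 80 and Hi 80 columns printed by summary(naive_mod).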

MAPE

mape <- function(actual, pred) {
  # Mean absolute percentage error, in percent
  mean(abs((actual - pred) / actual)) * 100
}
mape(naive_data$BJsales, 262.7)
## [1] 15.209

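The MAPE is 15.2%. This figure compares the constant forecast 262.7 with all 150 historical observations, so it measures how far the series has moved from its earlier values, not out-of-sample accuracy. The one-step training MAPE reported by summary(naive_mod) is 0.51%.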
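The MAPE above scores the constant 262.7 against the same observations the forecast was built from. One possible alternative is to hold out the end of the series and score the naive forecast on it. The sketch below assumes a 12-observation hold-out, chosen only to match the forecast horizon used earlier; `train`, `test`, and `holdout_mape` are illustrative names.

```r
# Sketch: out-of-sample MAPE for the naive forecast (base R only).
mape <- function(actual, pred) {
  mean(abs((actual - pred) / actual)) * 100
}

y <- as.numeric(BJsales)
h <- 12                               # hypothetical hold-out length
train <- head(y, length(y) - h)
test  <- tail(y, h)

# Naive forecast: the last training value, repeated over the hold-out window.
pred <- rep(tail(train, 1), h)

holdout_mape <- mape(test, pred)
holdout_mape
```

The same split could be reused for the other methods in this write-up so that they are compared on identical unseen data.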

Exponential Smoothing

se_model <- ses(naive_data$BJsales, h = 12)
summary(se_model)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = naive_data$BJsales, h = 12) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
## 
##   Initial states:
##     l = 200.0992 
## 
##   sigma:  1.5043
## 
##      AIC     AICc      BIC 
## 878.0858 878.2502 887.1177 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE      MAPE      MASE
## Training set 0.4173804 1.494266 1.153384 0.1792926 0.5070788 0.9933768
##                   ACF1
## Training set 0.3131158
## 
## Forecasts:
##     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 151       262.6999 260.7721 264.6278 259.7515 265.6484
## 152       262.6999 259.9737 265.4262 258.5304 266.8695
## 153       262.6999 259.3610 266.0389 257.5935 267.8064
## 154       262.6999 258.8445 266.5554 256.8035 268.5964
## 155       262.6999 258.3894 267.0105 256.1076 269.2923
## 156       262.6999 257.9780 267.4219 255.4784 269.9215
## 157       262.6999 257.5997 267.8002 254.8998 270.5001
## 158       262.6999 257.2476 268.1523 254.3613 271.0386
## 159       262.6999 256.9168 268.4831 253.8554 271.5445
## 160       262.6999 256.6040 268.7959 253.3770 272.0229
## 161       262.6999 256.3065 269.0934 252.9220 272.4779
## 162       262.6999 256.0222 269.3777 252.4872 272.9127
plot(se_model)

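The output above shows that simple exponential smoothing also produces a flat forecast: every point forecast equals the final smoothed level (262.6999). Because alpha = 0.9999 is almost 1, nearly all the weight falls on the most recent observation, so the result is practically identical to the naive forecast (262.7).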
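The effect of an alpha near 1 can be seen by running the level recursion by hand. This is a minimal sketch of the standard simple exponential smoothing update, not the internals of ses(). It plugs in the rounded alpha and initial level printed in the summary, so small differences from the fitted model are expected.

```r
# Sketch: simple exponential smoothing recursion (base R only).
y     <- as.numeric(BJsales)
alpha <- 0.9999      # rounded value from the summary output
level <- 200.0992    # rounded initial level l from the summary output

for (obs in y) {
  # New level: weighted average of the new observation and the old level.
  level <- alpha * obs + (1 - alpha) * level
}

level         # flat forecast used for every horizon step
tail(y, 1)    # last observation, i.e. the naive forecast
```

With alpha this close to 1, the old level contributes almost nothing at each step, which is why the two printed values nearly coincide.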

Holt’s Trend Method

holt_model <- holt(naive_data$BJsales, h = 12)
summary(holt_model)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = naive_data$BJsales, h = 12) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.2441 
## 
##   Initial states:
##     l = 200.1794 
##     b = -0.0859 
## 
##   sigma:  1.3752
## 
##      AIC     AICc      BIC 
## 853.1295 853.5461 868.1826 
## 
## Error measures:
##                      ME    RMSE      MAE        MPE      MAPE      MASE
## Training set 0.01019083 1.35678 1.065723 0.00891004 0.4690084 0.9178769
##                    ACF1
## Training set 0.02369031
## 
## Forecasts:
##     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 151       262.9872 261.2248 264.7497 260.2918 265.6826
## 152       263.2745 260.4614 266.0875 258.9723 267.5767
## 153       263.5617 259.7156 267.4078 257.6797 269.4437
## 154       263.8489 258.9384 268.7594 256.3390 271.3589
## 155       264.1362 258.1157 270.1567 254.9287 273.3437
## 156       264.4234 257.2428 271.6041 253.4416 275.4053
## 157       264.7107 256.3184 273.1029 251.8758 277.5455
## 158       264.9979 255.3428 274.6530 250.2317 279.7641
## 159       265.2852 254.3168 276.2535 248.5105 282.0598
## 160       265.5724 253.2414 277.9034 246.7138 284.4310
## 161       265.8596 252.1179 279.6013 244.8435 286.8758
## 162       266.1469 250.9476 281.3462 242.9015 289.3922
plot(holt_model)

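Holt's trend method extends simple exponential smoothing by adding a trend component. It uses two smoothing equations, one for the level and one for the trend, so the point forecasts rise by about 0.29 per step instead of staying flat.

  • Holt's method fits the training data better than both simple exponential smoothing and the naive forecast: RMSE 1.357 vs 1.494 and 1.499, MAPE 0.469% vs 0.507% and 0.510%, and AIC 853.1 vs 878.1 for simple exponential smoothing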
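The two smoothing equations can be sketched in base R. The version below uses the error-correction form, on the assumption that the beta printed in the summary is the coefficient applied to the one-step error. It uses the rounded parameters and initial states from the output, so it approximates the fitted model and is not a reproduction of holt().

```r
# Sketch: Holt's linear trend recursion in error-correction form (base R only).
y     <- as.numeric(BJsales)
alpha <- 0.9999      # rounded smoothing parameters from the summary output
beta  <- 0.2441
level <- 200.1794    # rounded initial states l and b from the summary output
trend <- -0.0859

for (obs in y) {
  err   <- obs - (level + trend)          # one-step forecast error
  level <- level + trend + alpha * err    # level equation
  trend <- trend + beta * err             # trend equation
}

h  <- 12
fc <- level + seq_len(h) * trend          # straight line from the last level
round(fc, 4)
```

Each forecast adds one more copy of the final trend estimate to the final level, which is why the point forecasts form a straight line.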

ARIMA

Importing GDP Data

# GDP Data
library(WDI)
gdp <- WDI(country=c("US", "CA", "GB"), indicator=c("NY.GDP.PCAP.CD", "NY.GDP.MKTP.CD"), start=1960, end=2020) 
  
# Renaming Columns
names(gdp) <- c("iso2c", "Country", "Year", "PerCapGDP", "GDP")

head(gdp)
##   iso2c Country Year PerCapGDP         GDP
## 1    CA  Canada 1960  2259.294 40461721693
## 2    CA  Canada 1961  2240.433 40934952064
## 3    CA  Canada 1962  2268.585 42227447632
## 4    CA  Canada 1963  2374.498 45029988561
## 5    CA  Canada 1964  2555.111 49377522897
## 6    CA  Canada 1965  2770.362 54515179581

Subsetting the Data to the United States Only

# get US data

us <- gdp$PerCapGDP[gdp$Country == "United States"] 


# Converting Data to Time-Series Object 
us <- ts(us, start=min(gdp$Year), end=max(gdp$Year)) ; us
## Time Series:
## Start = 1960 
## End = 2020 
## Frequency = 1 
##  [1]  3007.123  3066.563  3243.843  3374.515  3573.941  3827.527  4146.317
##  [8]  4336.427  4695.923  5032.145  5234.297  5609.383  6094.018  6726.359
## [15]  7225.691  7801.457  8592.254  9452.577 10564.948 11674.186 12574.792
## [22] 13976.110 14433.788 15543.894 17121.225 18236.828 19071.227 20038.941
## [29] 21417.012 22857.154 23888.600 24342.259 25418.991 26387.294 27694.853
## [36] 28690.876 29967.713 31459.139 32853.677 34513.562 36334.909 37133.243
## [43] 38023.161 39496.486 41712.801 44114.748 46298.731 47975.968 48382.558
## [50] 47099.980 48466.658 49882.558 51602.931 53106.537 55049.988 56863.371
## [57] 58021.400 60109.656 63064.418 65279.529 63543.578
plot(us, ylab="Per Capita GDP", xlab="Year")

Modeling

arima_model <- auto.arima(us)
summary(arima_model)
## Series: us 
## ARIMA(2,2,1) 
## 
## Coefficients:
##          ar1      ar2      ma1
##       0.5297  -0.4374  -0.8551
## s.e.  0.1740   0.1641   0.0753
## 
## sigma^2 estimated as 480076:  log likelihood=-468.89
## AIC=945.77   AICc=946.52   BIC=954.08
## 
## Training set error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE        ACF1
## Training set 105.3186 663.8716 428.7066 1.011964 1.889548 0.3863758 -0.06261455

Training MAPE = 1.890
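The training MAPE can be recomputed from the model residuals as a cross-check. The sketch below is self-contained: it copies the per-capita GDP values from the printed series and refits the same ARIMA(2,2,1) order with base R's arima() instead of auto.arima(). The coefficients and MAPE should land near the summary values (MAPE about 1.89) if the optimiser converges to the same fit, but that is an expectation and not a guarantee.

```r
# Sketch: refit ARIMA(2,2,1) with base R and recompute the training MAPE.
# Values copied from the printed US per-capita GDP series (1960 to 2020).
us <- ts(c(
   3007.123,  3066.563,  3243.843,  3374.515,  3573.941,  3827.527,  4146.317,
   4336.427,  4695.923,  5032.145,  5234.297,  5609.383,  6094.018,  6726.359,
   7225.691,  7801.457,  8592.254,  9452.577, 10564.948, 11674.186, 12574.792,
  13976.110, 14433.788, 15543.894, 17121.225, 18236.828, 19071.227, 20038.941,
  21417.012, 22857.154, 23888.600, 24342.259, 25418.991, 26387.294, 27694.853,
  28690.876, 29967.713, 31459.139, 32853.677, 34513.562, 36334.909, 37133.243,
  38023.161, 39496.486, 41712.801, 44114.748, 46298.731, 47975.968, 48382.558,
  47099.980, 48466.658, 49882.558, 51602.931, 53106.537, 55049.988, 56863.371,
  58021.400, 60109.656, 63064.418, 65279.529, 63543.578
), start = 1960)

# Order (p, d, q) = (2, 2, 1): two AR terms, two differences, one MA term.
fit <- arima(us, order = c(2, 2, 1))

# Residual = observed - fitted, so the MAPE follows from the residuals.
train_mape <- mean(abs(residuals(fit) / us)) * 100

coef(fit)
train_mape
```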

ARIMA Results

fore_arima <- forecast::forecast(arima_model, h = 12)

fore_arima
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2021       62727.80 61839.85 63615.76 61369.79 64085.81
## 2022       64127.68 62395.71 65859.65 61478.87 66776.50
## 2023       66298.82 63978.11 68619.54 62749.60 69848.05
## 2024       67909.40 65177.25 70641.55 63730.94 72087.87
## 2025       68885.68 65773.24 71998.12 64125.61 73645.74
## 2026       69771.12 66229.24 73313.00 64354.28 75187.96
## 2027       70885.89 66864.94 74906.84 64736.38 77035.40
## 2028       72161.88 67645.03 76678.72 65253.95 79069.80
## 2029       73422.96 68409.79 78436.13 65755.98 81089.94
## 2030       74605.63 69090.79 80120.47 66171.41 83039.85
## 2031       75753.28 69722.40 81784.16 66529.85 84976.71
## 2032       76916.68 70352.67 83480.68 66877.90 86955.46
plot(fore_arima)