Hwk 2 - Chapter 7&8 - Predictive Analytics

#Chapter 7 

#7.1. Consider the pigs series — the number of pigs slaughtered in Victoria each month.
#1.a. Use the ses() function in R to find the optimal values of αand  ℓ0 , and generate forecasts for the next four months.
# ?pigs = #Monthly total # of pigs slaughtered in Victoria, Aus
autoplot(pigs) + ylab("# of Pigs slaughtered in Victoria") + xlab("Time (months)")

fcses_pigs <- ses(pigs, h = 4)
summary(fcses_pigs)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

#α = 0.2971 for 95% confidence interval, which indicates slower/more balanced weighting
#ℓ = 77260.0561
fcses_pigs

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

autoplot(fcses_pigs) + ylab("# of Pigs slaughtered in Victoria") + xlab("Time (months)")

#1.b. Compute a 95% prediction interval for the first forecast...Compare your interval with the interval produced by R.
checkresiduals(fcses_pigs)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 55.356, df = 22, p-value = 0.0001057
## 
## Model df: 2.   Total lags used: 24

mean(fcses_pigs$residuals)

## [1] 385.8721

#Residuals are not perfectly normal though have consistent variance, and show some autocorrekation blips and a mean above 0 means there is bias
#Fails Ljung-box test - residuals are distinguishable from white noise autocorrelation
st_pigsfc_res <- sd(fcses_pigs$residuals)
#Calculated confidence interval = Yhat +/- 1.96 * st(residuals)
Yhat_pigs <- 98816.41
Pigs_ci_low <- Yhat_pigs - 1.96*(st_pigsfc_res)
Pigs_ci_low_high <- Yhat_pigs + 1.96*(st_pigsfc_res)
Pigs_fcses_ci <- c(Pigs_ci_low, Pigs_ci_low_high)
Pigs_fcses_ci

## [1]  78679.97 118952.85

#My calculated 95% confidence intervals (78679.97-118952.85) were very similar to the R generated intervals (78611.97-119020.8)
#The R generated intervals are slightly wider as the result of more precise Z score employed

#2. Write your own function to implement simple exponential smoothing. The function should take arguments y (the time series), alpha (the smoothing parameter α) and level (the initial level ℓ0). It should return the forecast of the next observation in the series. Does it give the same forecast as ses()?

#if statement for modeling data set
#Else statement for forecasts
ses_man_function <- function(y, alpha, level, h) 
  {for(i in 1:length(y)+h){
    if(i <= length(y)){
      level[i] <- alpha*y[i] +(1-alpha)*level[i-1]}
    else{
      level[i] <- alpha*level[i-1]+(1-alpha)*level[i-1]}}
  return(level[length(y):length(y)+h])}


#α = 0.2971 for 95% confidence interval, which indicates slower/more balanced weighting
#ℓ = 77260.0561 

ses_man_function(pigs,alpha = 0.2971, level = c(95), h = 1)

## [1] 98816.45

ses(pigs, h = 1)

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8

#Both point estmates are the same -> 98816.4~

#3. Modify your function from the previous exercise to return the sum of squared errors rather than the forecast of the next observation. Then use the optim() function to find the optimal values.Do you get the same values as the ses() function?
#ses_sse <- function(parameters = c(alpha, level), y)
#  {error <- 0
 #  sse <- 0
  # alpha <- parameters[1]
   #l0 <- parameters[2]
   #for (index in 1:length(y))
    #y_hat = l0  
   #{error <- y[index] - level
    # sse <- sse + error^2
     #y_hat <- alpha*y[index] + (1 - alpha)*y_hat }
   #return (sse)}

#ses_sse(parameters = c(0.2971, c(95)), pigs)
#optim(par = c(0.5, pigs[1]), y=pigs, fn = ses_sse)

#stuck here

#4

#5.Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
#a, Plot the series and discuss the main features of the data.
#?books - daily dales of paperback and hardcover books from same store
autoplot(books)

pb_plot <- autoplot(books[,"Paperback"]) + ggtitle("Paperback books sold") + xlab("Time (Days)") + ylab("Paperback Sales") 
hc_plot <- autoplot(books[,"Hardcover"]) + ggtitle("Hardcover books sold") + xlab("Time (Days)") + ylab("Hardcover Sales") 
grid.arrange(pb_plot, hc_plot, ncol = 2)

frequency(books)

## [1] 1

#There are two time series plots, one for paperback books and one for hardcover books
#Both time series appear to be trending upwards with time and show variability by day, sometimes hardcover and paperback move in unison and sometimes they are inversely related
#It appears that the data might be over the course of one given month

#b. Use the ses() function to forecast each series, and plot the forecasts.
pb_ses <- ses(books[,"Paperback"], h = 4)
hc_ses <- ses(books[,"Hardcover"], h = 4)

books[,"Paperback"] %>% ses(h = 4) %>% autoplot() + ylab("Paperback books Sold") +xlab("Time (Days)") + ggtitle("SES Forecast Paperback")

books[,"Hardcover"] %>% ses(h = 4) %>% autoplot() + ylab("Hardcover books Sold") +xlab("Time (Days)") + ggtitle("SES Forecast Hardcover")

#c. Compute the RMSE values for the training data in each case.
pb_ses_rmse <- accuracy(pb_ses)[2]
hc_ses_rmse <- accuracy(hc_ses)[2]
accuracy(pb_ses)

##                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522

accuracy(hc_ses)

##                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763

#RMSE paperback = 33.63769, RMSE Hardcover = 31.93101

#6. We will continue with the daily sales of paperback and hardcover books in data set books.
# a. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
#pi = FALSE, so no dampening
pb_holt <- holt(books[,"Paperback"], h = 4, pi = FALSE)
hc_holt <- holt(books[,"Hardcover"], h = 4, pi = FALSE)
pb_holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735

hc_holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

grid.arrange(autoplot(pb_holt), autoplot(hc_holt), ncol = 2)

#b. Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
accuracy(pb_holt)

##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792

accuracy(hc_holt)

##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186

pb_holt_rmse <- accuracy(pb_holt)[2]
hc_holt_rmse <- accuracy(hc_holt)[2]

#RMSE for Holt method on Paperback and Hardcover series were 31.13692 and 27.19358
#The RMSE calculation for both the Paperback and Hardcover series improved under the Holt method
#The Holt Method uses an additional parameter to match the local trend which better matches this specific data

#c. Compare the forecasts for the two series using both methods. Which do you think is best?
autoplot(books) + autolayer(pb_ses, series = "SES Paperback") + autolayer(hc_ses, series = "SES Hardcover")+
            autolayer(pb_holt, series = "Holt Paperback") + autolayer(hc_holt, series = "Holt Hardcover") + ylab("Books Sold") +xlab("Time (Days)") + ggtitle("Daily Book Sales with Forecasts")

checkresiduals(pb_ses)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 16.039, df = 4, p-value = 0.002967
## 
## Model df: 2.   Total lags used: 6

checkresiduals(hc_ses)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 10.306, df = 4, p-value = 0.03558
## 
## Model df: 2.   Total lags used: 6

checkresiduals(pb_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 15.081, df = 3, p-value = 0.001749
## 
## Model df: 4.   Total lags used: 7

checkresiduals(hc_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 9.416, df = 3, p-value = 0.02424
## 
## Model df: 4.   Total lags used: 7

#The paperback series have some small autocorrelation, but not enough to throw out predictions
#Overall the Holt method is a better model for this data.  Visually the Holt forecast matches the upward trend of the sales data and the error metrics demonstrate that the Holt model does a better job explaining the residuals.
#As shown in part D, the confidence interval is also tighter for the Holt method, giving a more precise prediction

#D. Calculate a 95% prediction interval for the first forecast for each series, using
# the RMSE values and assuming normal erros. Compare your intervals with those produced.
z <-1.96
pb_ses_ci <- c(pb_ses$mean[1] - z*pb_ses_rmse, pb_ses$mean[1] + z*pb_ses_rmse)
hc_ses_ci <- c(hc_ses$mean[1] - z*hc_ses_rmse, hc_ses$mean[1] + z*hc_ses_rmse)
pb_holt_ci <- c(pb_holt$mean[1] - z*pb_holt_rmse, pb_holt$mean[1] + z*pb_holt_rmse)
hc_holt_ci <- c(hc_holt$mean[1] - z*hc_holt_rmse, hc_holt$mean[1] + z*hc_holt_rmse)

confidence_interval_matrix <- matrix(c(pb_ses_ci, hc_ses_ci, pb_holt_ci, hc_holt_ci), nrow = 4, byrow = TRUE)
ci_range <- c("Lower 95%", "Upper 95%")
Model_CI <- c("Paperback SES", "Hardcover SES", "Paperback Holt", "Hardcover Holt")
rownames(confidence_interval_matrix)<-Model_CI
colnames(confidence_interval_matrix)<-ci_range
#Confidence interval chart shows that SES models had larger confidence intervals
confidence_interval_matrix

##                Lower 95% Upper 95%
## Paperback SES   141.1798  273.0395
## Hardcover SES   176.9753  302.1449
## Paperback Holt  148.4384  270.4951
## Hardcover Holt  196.8745  303.4733

ci_breadth <- confidence_interval_matrix[,2]-confidence_interval_matrix[,1]
ci_breadth

##  Paperback SES  Hardcover SES Paperback Holt Hardcover Holt 
##       131.8597       125.1696       122.0567       106.5988

#7. For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
#?eggs = Price of dozen eggs (constant dollar) 1900-1993
frequency(eggs)

## [1] 1

autoplot(eggs) + ggtitle("Price of dozen eggs (constant dollar) 1900-1993") + ylab("Price ($)")

#Holt no dampening or transformation -> over very long time horizon results become unrealistic
eggs_holt_simple <- holt(eggs, damped = FALSE, level = c(95), h=100)
eggs_holt_dampened <- holt(eggs, damped = TRUE, level = c(95), h=100)
eggs_holt_simple_bctrans <- holt(eggs, damped = FALSE, level = c(95), h=100, lambda = 'auto')
eggs_holt_dampened_bctrans <- holt(eggs, damped = TRUE, level = c(95), h=100, lambda = 'auto')
eggs_holt_works_biasadj <- holt(eggs, damped = TRUE, level = c(95), h=100, lambda = 'auto', biasadj = TRUE)
eggs_holt_simple_damp_biasadj <- holt(eggs, damped = TRUE, level = c(95), h=100, biasadj = TRUE)

eggs_holt_simple_plot <- autoplot(eggs_holt_simple) + ggtitle("Simple Holt") + ylab("Price of Eggs ($)")
eggs_holt_dampened_plot <- autoplot(eggs_holt_dampened) + ggtitle("Dampened Holt")+ ylab("Price of Eggs ($)")
eggs_holt_simple_bctrans_plot <- autoplot(eggs_holt_simple_bctrans) + ggtitle("Boxcox no Dampening") + ylab("Price of Eggs ($)")
eggs_holt_dampened_bctrans_plot <- autoplot(eggs_holt_simple_bctrans) + ggtitle("Boxcox with Dampening") + ylab("Price of Eggs ($)")
eggs_holt_works_biasadj_plot <- autoplot(eggs_holt_works_biasadj) + ggtitle("Simple Holt w/ Bias Adj") + ylab("Price of Eggs ($)")
eggs_holt_simple_damp_biasadj_plot <- autoplot(eggs_holt_works_biasadj) + ggtitle("Dampening and BoxCox w/ Bias Adj") + ylab("Price of Eggs ($)")

grid.arrange(eggs_holt_simple_plot, eggs_holt_dampened_plot, eggs_holt_simple_bctrans_plot, eggs_holt_dampened_bctrans_plot, eggs_holt_works_biasadj_plot, eggs_holt_simple_damp_biasadj_plot, nrow = 3)

#My impression is that some form of transformation and/or dampening is required
#The simple Holt adjustment could see values fall below 0
#BoxCox transformation seems to significantly improve the visual trajectory

eggs_holt_RMSE_compare <- c(accuracy(eggs_holt_simple)[,2], accuracy(eggs_holt_dampened)[,2],accuracy(eggs_holt_simple_bctrans)[,2], accuracy(eggs_holt_dampened_bctrans)[,2], accuracy(eggs_holt_works_biasadj)[,2], accuracy(eggs_holt_simple_damp_biasadj)[,2])
names(eggs_holt_RMSE_compare) <- c("Simple Holt", "Dampened Holt", "Boxcox no Dampening", "Boxcox with Dampening", "Simple Holt w/ Bias Adj", "Dampening and BoxCox w/ Bias Adj") 
eggs_holt_RMSE_compare

##                      Simple Holt                    Dampened Holt 
##                         26.58219                         26.54019 
##              Boxcox no Dampening            Boxcox with Dampening 
##                         26.39364                         26.53317 
##          Simple Holt w/ Bias Adj Dampening and BoxCox w/ Bias Adj 
##                         26.58953                         26.54019

#All the RMSEs were relatively similar, but the Holt with a boxcox transformation and no dampening had the lowest at 26.39


#8 Recall your retail time series data (from Exercise 3 in Section 2.10).
retaildata <- read_xlsx("retail.xlsx", skip=1)
head(retaildata)

## # A tibble: 6 x 190
##   `Series ID`         A3349335T A3349627V A3349338X A3349398A A3349468W
##   <dttm>                  <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
## 1 1982-04-01 00:00:00      303.      41.7      63.9      409.      65.8
## 2 1982-05-01 00:00:00      298.      43.1      64        405.      65.8
## 3 1982-06-01 00:00:00      298       40.3      62.7      401       62.3
## 4 1982-07-01 00:00:00      308.      40.9      65.6      414.      68.2
## 5 1982-08-01 00:00:00      299.      42.1      62.6      404.      66  
## 6 1982-09-01 00:00:00      305.      42        64.4      412.      62.3
## # ... with 184 more variables: A3349336V <dbl>, A3349337W <dbl>,
## #   A3349397X <dbl>, A3349399C <dbl>, A3349874C <dbl>, A3349871W <dbl>,
## #   A3349790V <dbl>, A3349556W <dbl>, A3349791W <dbl>, A3349401C <dbl>,
## #   A3349873A <dbl>, A3349872X <dbl>, A3349709X <dbl>, A3349792X <dbl>,
## #   A3349789K <dbl>, A3349555V <dbl>, A3349565X <dbl>, A3349414R <dbl>,
## #   A3349799R <dbl>, A3349642T <dbl>, A3349413L <dbl>, A3349564W <dbl>,
## #   A3349416V <dbl>, A3349643V <dbl>, A3349483V <dbl>, A3349722T <dbl>, ...

#My name starts with a T, so I am choosing the data column with a T in it
#a. Why is multiplicative seasonality necessary for this series?
retaildata_ts <- ts(retaildata[,"A3349335T"], frequency=12, start=c(1982,4))
frequency(retaildata_ts)

## [1] 12

autoplot(retaildata_ts)

ggseasonplot(retaildata_ts)

autoplot((diff(retaildata_ts)))

#The seasonal fluctuations clearly increase with the level
#The autoplot of the differences shows that later periods have greater seasonality
train_retail <- subset(retaildata_ts, end = length(retaildata_ts) - 36)

#b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
#Looking at a 3 year forecast
retaildata_ts_hwm <- hw(train_retail, seasonal = "multiplicative", h=36)
plot_hwm_retail <- autoplot(retaildata_ts_hwm) + ggtitle("HW Multiplicative Seasonality") + ylab("Retails Sales Volume")
retaildata_ts_hwm_damp <- hw(train_retail, seasonal = "multiplicative", damped = TRUE, h=36)
plot_hwm_damp_retail <- autoplot(retaildata_ts_hwm_damp) + ggtitle("HW Multiplicative Seasonality + Dampened") + ylab("Retails Sales Volume")
grid.arrange(plot_hwm_retail, plot_hwm_damp_retail, nrow = 2)

#c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
accuracy(retaildata_ts_hwm, retaildata_ts)

##                      ME     RMSE      MAE        MPE     MAPE      MASE
## Training set   2.658058 25.00649 18.25149  0.2228273 1.965971 0.2958851
## Test set     -63.670300 77.04807 65.91346 -2.9161237 3.023249 1.0685599
##                      ACF1 Theil's U
## Training set -0.006737509        NA
## Test set      0.447374940 0.5550438

accuracy(retaildata_ts_hwm_damp, retaildata_ts)

##                      ME     RMSE      MAE       MPE     MAPE      MASE
## Training set   4.089724 25.68057 18.92510  0.371369 2.057716 0.3068053
## Test set     -23.699364 39.60879 31.62906 -1.109011 1.448777 0.5127563
##                     ACF1 Theil's U
## Training set -0.05009924        NA
## Test set     -0.05576411 0.2844991

#The HW model without damping demonstrates a lower RMSE for the training set, but it does a much worse job tracking with the test set.  Dampening of HW method makes a better fit of the data.
#RMS HW multiplicative (no damp): Train set = 25.00649, Test set = 77.04807
#RMS HW multiplicative with Dampening: Train set = 25.68057, Test set = 39.60879


#d. Check that the residuals from the best method look like white noise.
checkresiduals(retaildata_ts_hwm_damp)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 287.08, df = 7, p-value < 2.2e-16
## 
## Model df: 17.   Total lags used: 24

#There is major autocorrelation concerns in the data, which means the residuals are not white noise and we have not done a good job capturing all explanatory factors in our forecast.
#The HW Multiplicative model with dampening also fails the Ljung-Box test, so significant to show not white noise

#e. I answered this in part c - RMSE for HW Mult with Dampening is 39.60879
retaildata_ts_snaive <- snaive(train_retail, h = 36)
accuracy(retaildata_ts_snaive, retaildata_ts)

##                     ME      RMSE       MAE      MPE     MAPE     MASE      ACF1
## Training set  61.56787  72.20702  61.68438 6.388722 6.404105 1.000000 0.6018274
## Test set     129.52500 145.46662 131.24722 5.988193 6.069098 2.127722 0.4067838
##              Theil's U
## Training set        NA
## Test set      1.058244

#Both Holt Methods outperform the SNaive model for retail data

#9.For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
#STL model
lambda_retail_train <- BoxCox.lambda(train_retail)
retaildata_ts_boxcox <- BoxCox(train_retail, lambda = lambda_retail_train)
retaildata_ts_STL <- mstl(retaildata_ts_boxcox, s.window = 12, robust = TRUE)
stl_fc_train <- forecast(retaildata_ts_STL, h=36)
autoplot(stl_fc_train, lamda = lambda_retail_train)

accuracy(stl_fc_train, retaildata_ts)

##                         ME         RMSE          MAE          MPE       MAPE
## Training set -1.450869e-03 8.088656e-02 6.264965e-02 -0.009459512  0.4644044
## Test set      2.140724e+03 2.144459e+03 2.140724e+03 99.184944770 99.1849448
##                     MASE        ACF1 Theil's U
## Training set    0.263184 -0.06890606        NA
## Test set     8992.937907  0.22659896  15.59527

#ETS model
retail_train_ets <- ets(train_retail) 
retail_train_ets_fc <- forecast(retail_train_ets, h = 36)
autoplot(retail_train_ets_fc)

summary(retail_train_ets)

## ETS(M,A,M) 
## 
## Call:
##  ets(y = train_retail) 
## 
##   Smoothing parameters:
##     alpha = 0.2248 
##     beta  = 0.007 
##     gamma = 0.1428 
## 
##   Initial states:
##     l = 303.9767 
##     b = 2.4001 
##     s = 1.0069 0.9512 0.9884 1.1711 1.0178 1.0035
##            0.9651 0.9905 0.9894 0.9605 0.9819 0.9738
## 
##   sigma:  0.0262
## 
##      AIC     AICc      BIC 
## 4193.018 4194.889 4258.358 
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 2.238717 25.19277 18.12092 0.1862117 1.934037 0.2937683 0.01490798

#Errors = Multiplicative, Trend = additive, Seasonality = Multiplicative
#alpha = 0.2248, l = 303.9767
accuracy(retail_train_ets_fc, retaildata_ts)

##                      ME     RMSE      MAE        MPE     MAPE      MASE
## Training set   2.238717 25.19277 18.12092  0.1862117 1.934037 0.2937683
## Test set     -70.732515 83.25273 72.51597 -3.2426628 3.326431 1.1755968
##                    ACF1 Theil's U
## Training set 0.01490798        NA
## Test set     0.52551110 0.6007244

#The ETS model improves upon the training set residuals but the Holt Multiplicative method with dampening continues to perform best with the test data

#10 For this exercise use data set ukcars, the quarterly UK passenger vehicle production data from 1977Q1–2005Q1.
#a. Plot the data and describe the main features of the series.
#?ukcars - quarterly uk passenger car production
autoplot(ukcars)

#Initial downward trend from the 1970s into 1980s, followed by upward tend until recession in 2000s
ggseasonplot(ukcars)

#Seasonality -> Production falls off in Q3 to bounce back in Q4

#b. Decompose the series using STL and obtain the seasonally adjusted data.
ukcars_sadj <-stl(ukcars, t.window=4,s.window='periodic',robust = TRUE)
autoplot(ukcars_sadj)

#C. Forecast the next two years of the series using an additive damped trend method applied to the seasonally adjusted data. (This can be done in one step using stlf() with arguments etsmodel="AAN", damped=TRUE.)
ukcars_stl <- stlf(ukcars, 8, s.window = 4, etsmodel = "AAN", damped = TRUE)
ukcars_stl

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       417.9810 392.3647 443.5973 378.8042 457.1578
## 2005 Q3       380.1666 345.0236 415.3097 326.4199 433.9133
## 2005 Q4       394.0142 351.3839 436.6445 328.8168 459.2116
## 2006 Q1       432.7625 383.7443 481.7808 357.7956 507.7295
## 2006 Q2       417.9529 363.2623 472.6435 334.3108 501.5950
## 2006 Q3       380.1413 320.2923 439.9903 288.6102 471.6724
## 2006 Q4       393.9914 329.3768 458.6060 295.1718 492.8109
## 2007 Q1       432.7420 363.6747 501.8093 327.1127 538.3713

autoplot(ukcars)

#D. Forecast the next two years of the series using Holt’s linear method applied to the seasonally adjusted data (as before but with damped=FALSE).
ukcars_holt <- hw(ukcars, h=8, damped = FALSE)
ukcars_holt

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       428.1044 394.1527 462.0561 376.1797 480.0290
## 2005 Q3       361.2569 321.4060 401.1078 300.3102 422.2036
## 2005 Q4       399.6350 354.6504 444.6195 330.8370 468.4329
## 2006 Q1       430.7384 381.1474 480.3295 354.8955 506.5814
## 2006 Q2       426.2693 372.4598 480.0788 343.9747 508.5639
## 2006 Q3       359.4219 301.7035 417.1402 271.1493 447.6945
## 2006 Q4       397.7999 336.4200 459.1798 303.9275 491.6724
## 2007 Q1       428.9034 364.0673 493.7395 329.7451 528.0616

autoplot(ukcars_holt)

#E. Now use ets() to choose a seasonal model for the data.
ukcars_ets_mod <- ets(ukcars)
ukcars_ets_fc <- forecast(ukcars_ets_mod, h = 8)
ukcars_ets_fc

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       427.4885 394.2576 460.7195 376.6662 478.3109
## 2005 Q3       361.3329 322.2353 400.4305 301.5383 421.1275
## 2005 Q4       404.5358 360.3437 448.7280 336.9497 472.1219
## 2006 Q1       431.8154 383.0568 480.5741 357.2455 506.3854
## 2006 Q2       427.4885 374.5571 480.4200 346.5369 508.4401
## 2006 Q3       361.3329 304.5345 418.1313 274.4672 448.1986
## 2006 Q4       404.5358 344.1174 464.9542 312.1338 496.9378
## 2007 Q1       431.8154 367.9809 495.6500 334.1890 529.4419

autoplot(ukcars_ets_fc)

#F. Compare the RMSE of the ETS model with the RMSE of the models you obtained using STL decompositions. Which gives the better in-sample fits?
RMSE_ukcars_compare <- c(accuracy(ukcars_stl)[2], accuracy(ukcars_holt)[2], accuracy(ukcars_ets_fc)[2])
names_ukcars_model <- c("STL Model", "Holt Model", "ETS Model")
names(RMSE_ukcars_compare) <- names_ukcars_model
RMSE_ukcars_compare

##  STL Model Holt Model  ETS Model 
##   19.54131   25.53765   25.23244

#G.Compare the forecasts from the three approaches? Which seems most reasonable? 
#STL Model has the lowest root mean squared error for the training data
autoplot(ukcars) + autolayer(ukcars_stl, series = "STL Model") +autolayer(ukcars_holt, series = "Holt Model") +autolayer(ukcars_ets_fc, series = "ETS Model") + ylab("UK Passengar Car Production (1000s)") + xlab("Time (Quarters)")

ukcars_stl

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       417.9810 392.3647 443.5973 378.8042 457.1578
## 2005 Q3       380.1666 345.0236 415.3097 326.4199 433.9133
## 2005 Q4       394.0142 351.3839 436.6445 328.8168 459.2116
## 2006 Q1       432.7625 383.7443 481.7808 357.7956 507.7295
## 2006 Q2       417.9529 363.2623 472.6435 334.3108 501.5950
## 2006 Q3       380.1413 320.2923 439.9903 288.6102 471.6724
## 2006 Q4       393.9914 329.3768 458.6060 295.1718 492.8109
## 2007 Q1       432.7420 363.6747 501.8093 327.1127 538.3713

ukcars_holt

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       428.1044 394.1527 462.0561 376.1797 480.0290
## 2005 Q3       361.2569 321.4060 401.1078 300.3102 422.2036
## 2005 Q4       399.6350 354.6504 444.6195 330.8370 468.4329
## 2006 Q1       430.7384 381.1474 480.3295 354.8955 506.5814
## 2006 Q2       426.2693 372.4598 480.0788 343.9747 508.5639
## 2006 Q3       359.4219 301.7035 417.1402 271.1493 447.6945
## 2006 Q4       397.7999 336.4200 459.1798 303.9275 491.6724
## 2007 Q1       428.9034 364.0673 493.7395 329.7451 528.0616

ukcars_ets_fc

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       427.4885 394.2576 460.7195 376.6662 478.3109
## 2005 Q3       361.3329 322.2353 400.4305 301.5383 421.1275
## 2005 Q4       404.5358 360.3437 448.7280 336.9497 472.1219
## 2006 Q1       431.8154 383.0568 480.5741 357.2455 506.3854
## 2006 Q2       427.4885 374.5571 480.4200 346.5369 508.4401
## 2006 Q3       361.3329 304.5345 418.1313 274.4672 448.1986
## 2006 Q4       404.5358 344.1174 464.9542 312.1338 496.9378
## 2007 Q1       431.8154 367.9809 495.6500 334.1890 529.4419

#The STL forecast seems to better align with the slightly lower trend in the more recent years

#H Check the residuals of your preferred model. 
checkresiduals(ukcars_stl)

## Warning in checkresiduals(ukcars_stl): The fitted degrees of freedom is based on
## the model used for the seasonally adjusted data.

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ETS(A,Ad,N)
## Q* = 64.027, df = 3, p-value = 8.094e-14
## 
## Model df: 5.   Total lags used: 8

#Residuals are not white noise, which shows that there is some autocorrelation in play


#12. The fets() function below returns ETS forecasts.
#Problem matches Hyndmans data camp course
#a. Apply tsCV() for a forecast horizon of  h=  4, for both ETS and seasonal naïve methods to the qcement data, (Hint: use the newly created fets() and the existing snaive() functions as your forecast function arguments.)
#Forecast function for ETS
#fets <- function(y, h) {
 # forecast(ets(y), h = h)}
#?qcement -> quarterly production of Portland cement in Austrailia
autoplot(qcement)

#Looks to have strong seasonality that is increasing with time, and an upward trend

#e_ets <- ets_tscv <- tsCV(qcement, fets, h=4)
#e_snaive <- snaive_tscv <- tsCV(qcement, snaive, h=4)

#b. Compute the MSE of the resulting  4-step-ahead errors. (Hint: make sure you remove missing values.) Why are there missing values? Comment on which forecasts are more accurate. Is this what you expected?
#Compute MSE of resulting errors
#mean(e_ets^2, na.rm = TRUE)
#mean(e_snaive^2, na.rm = TRUE)

#Because the earliest observations are not included in the cross validations set, some items are missing from that calculation.


#13. Compare ets(), snaive() and stlf() on the following six time series. For stlf(), you might need to use a Box-Cox transformation. Use a test set of three years to decide what gives the best forecasts. ausbeer, bricksq, dole, a10, h02, usmelec.

#14. a. Use ets() on the following series: bicoal, chicken, dole, usdeaths, lynx, ibmclose, eggs. Does it always give good forecasts?
bicoal %>% ets() %>% forecast() %>% autoplot()

chicken %>% ets() %>% forecast() %>% autoplot()

dole %>% ets() %>% forecast() %>% autoplot()

usdeaths %>% ets() %>% forecast() %>% autoplot()

lynx %>% ets() %>% forecast() %>% autoplot()

ibmclose %>% ets() %>% forecast() %>% autoplot()

eggs %>% ets() %>% forecast() %>% autoplot()

#Overall, these forecasts look decent, but I would question if all are good predictions for the future

#14b. The ETS model does better where there are some consistent patterns related to trend and seasonality, but for data like the IBM close, which may be more cyclical in nature, the ets smoothing falters. 
#Likewise, the USDeaths data is based on only the last 6 years, for a smaller data set, the naive, or snaive, methods will likely produce better results.


#16. Show that the forecast variance for an ETS(A,N,N) model is given by...
bicoal_mod <- ets(bicoal, model = "ANN")
bicoal_mod_fc <- forecast(bicoal_mod, h=1, level = c(80))

summary(bicoal_mod_fc)

## 
## Forecast method: ETS(A,N,N)
## 
## Model Information:
## ETS(A,N,N) 
## 
## Call:
##  ets(y = bicoal, model = "ANN") 
## 
##   Smoothing parameters:
##     alpha = 0.8047 
## 
##   Initial states:
##     l = 540.3435 
## 
##   sigma:  60.7797
## 
##      AIC     AICc      BIC 
## 597.1683 597.7017 602.8438 
## 
## Error measures:
##                     ME    RMSE      MAE        MPE     MAPE     MASE       ACF1
## Training set 0.1294209 59.5264 46.56756 -0.9692703 10.14369 1.000556 0.04590185
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80
## 1969       545.4468 467.5544 623.3391

sigma_bicoal <- 60.7797
alpha_bicoal <-  0.8047
h_bicoal <- 2
variance_form <- (sigma_bicoal^2)*(1+(alpha_bicoal^2)*(h_bicoal-1))
#It seems that 1 standard deviation with this model is equal to the 80th percent confidence interval
#The distance between the mean forecast and cofidence interval bound should equal the sqrt(variance)
bicoal_sd <- bicoal_mod_fc$mean - bicoal_mod_fc$lower
bicoal_sd

## Time Series:
## Start = 1969 
## End = 1969 
## Frequency = 1 
##      bicoal_mod_fc$lower
## [1,]            77.89235

sqrt(variance_form)

## [1] 78.01477

#Values from the formular calculated variance and the proven variance using the confidence interval are close to equal

#17. Write down 95% prediction intervals for an ETS(A,N,N) model as a function of...
#The prediction interval would be 2 standard deviations above and below the mean forecast
#Standard Dev = sqrt(sigma^2*(1+alpha^2*(h-1))
# ℓT = mean forecast
#Upper level(95%) = ℓT + 2* sqrt(sigma^2*(1+alpha^2*(h-1))
#Lower level(95%) = ℓT - 2* sqrt(sigma^2*(1+alpha^2*(h-1))

##Chapter 8
#1. Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
#a. Explain the differences among these figures. Do they all indicate that the data are white noise?
#Base on the text, all white noise series are expected to be close to zero.  The dotted blue lines represent where 95% of the ACF spikes to reside within (equal to (+/-)2/(sqrtT))
#There are no large spikes in any of the series outside of the critical value lines, nor substantially more than 5% of the spikes beyond it (closest is series 2)
#All three series appear to be white noise residuals, the differences in the series related to the small critical value lines as T increases

#b. The expected mean of the residual Acf is 0. The critical values lines are at diffent width from 0 based on the time (T) of the series.  A large T will increase the denominator in the critical value calculation, thus minimizing the range of acceptable autocorrelation.
#Random variation explains why there is some small spiking despite all three series representing white noise.

#2. A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
#Plot of IBM close price for a year
autoplot(ibmclose) + xlab("Time (Days)") + ylab("Price ($)") + ggtitle("IBM Daily Closing Price")

#Plot with ACF and PACF
ggtsdisplay(ibmclose)

#Stationary data should have no trend of seasonality - 
#The TS plot shows that the IBM data has an upward trajectory for the first 100 days followed by a smaller decline from 100 to day 200 followed by a sharp crash and slow recovery.  Trending is happening that may even be seasonally related.
#In the meantime, the ACF plot show residuals that are clearly not white noise with each lag crossing the critical value and declining only very slowly.
#The PACF shows very strong correlation at the 1 lag that is a spike that outweighs the remainder of lags falling in line.

#This data could be made stationary through differencing.
#From the text "Differencing can help stabilise the mean of a time series by removing changes in the level of a time series, and therefore eliminating (or reducing) trend and seasonality."
ndiffs(ibmclose)

## [1] 1

#1 diff required to make data stationary
#Still some spiking and apparent seasonality but more stationary overall
ggtsdisplay(diff(ibmclose))

#3. For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
#a. usnetelec
autoplot(usnetelec)

#Clear upward trend
usnetelec_lambda <-BoxCox.lambda(usnetelec)
round(usnetelec_lambda,2)

## [1] 0.52

ndiffs(BoxCox(usnetelec, usnetelec_lambda))

## [1] 2

ggtsdisplay(diff(diff(BoxCox(usnetelec, BoxCox.lambda(usnetelec)))))

#usnetelec requires two differences on a box.cox transformation with lambda 0.52
ndiffs(usnetelec)

## [1] 1

#only 1 difference required if not transforming

#b USGDP
autoplot(usgdp)

#Clear upward trend, data has some cyclicality, but not seasonal
usgdp_lambda <-BoxCox.lambda(usgdp)
round(usgdp_lambda,2)

## [1] 0.37

ndiffs(BoxCox(usgdp, usgdp_lambda))

## [1] 1

ggtsdisplay(diff((BoxCox(usgdp, usgdp_lambda))))

#usgdp requires 1 difference on a box.cox transformation with lambda 0.37

#c. mcopper
autoplot(mcopper)

#Slow upward trend, unclear on seasonality.  Likely cyclical
mcopper_lambda <-BoxCox.lambda(mcopper)
round(mcopper_lambda,2)

## [1] 0.19

ndiffs(BoxCox(mcopper, mcopper_lambda))

## [1] 1

ggtsdisplay(diff(BoxCox(mcopper, mcopper_lambda)))

#mcopper requires 1 difference on a box.cox transformation with lambda 0.19

#d. enplanements
autoplot(enplanements)

ggseasonplot(enplanements)

#Upward trend + seasonality (dips in sept)
enplanements_lambda <-BoxCox.lambda(enplanements)
round(enplanements_lambda,2)

## [1] -0.23

ndiffs(BoxCox(enplanements, enplanements_lambda))

## [1] 1

ggtsdisplay(diff(BoxCox(enplanements, enplanements_lambda)))

#enplanements requires 1 difference on a box.cox transformation with lambda -0.23 (more than a log, moving towards inverse)

#e. visitors
autoplot(visitors)

ggseasonplot(visitors)

#Upward trend + seasonality (dips in may with spike Jun/Jul and Dec/Jan)
visitors_lambda <-BoxCox.lambda(visitors)
round(visitors_lambda,2)

## [1] 0.28

ndiffs(BoxCox(visitors, visitors_lambda))

## [1] 1

ggtsdisplay(diff(BoxCox(visitors, visitors_lambda)))

#visiors requires 1 difference on a box.cox transformation with lambda 0.28
#Data seems to still have some autocorrelation even with 1 difference, spaced around the peak seasons


#4. For the enplanements data, write down the differences you chose above using backshift operator notation.
frequency(enplanements)

## [1] 12

#monthly data
#backward shift notation would include seasonal and first differencing
#(1−B)(1−B^12)Yt is the backward notation

#5. For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
#recalling retailing datats 
head(retaildata_ts)

##        Apr   May   Jun   Jul   Aug   Sep
## 1982 303.1 297.8 298.0 307.9 299.2 305.4

ggtsdisplay(retaildata_ts)

#ACF only slowly declining with each lag, which indicates strong trending
ggseasonplot(retaildata_ts)

#Slight seasonality in nov-dec, and seasonality level appears to be increasing with time.  Will transorm data.
retail_data_lambda <- BoxCox.lambda(retaildata_ts)
retail_data_lambda

## [1] 0.193853

retail_data_bc <- BoxCox(retaildata_ts, retail_data_lambda)
ggtsdisplay(retail_data_bc)

#Seasonality smoothed but still major ACF concerns, so should difference
ndiffs(retaildata_ts)

## [1] 1

ggtsdisplay(diff(retail_data_bc))

#Data appears to have more even variance with a shared mean after 1 level of differencing.  Autocorrelation remains a concern though
#Box Cox transformation of 0.19


#6.
#a. Use the following R code to generate data from an AR(1) model 
y.8.6 <- ts(numeric(100))
e.8.6 <- rnorm(100)
for(i in 2:100)
  y.8.6[i] <- 0.6*y.8.6[i-1] + e.8.6[i]

#7. Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.
#a. By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.
ggtsdisplay(wmurders)

#Data does not appear to be seasonal, but there is trending that we need to make stationary as indicated by ACF
ggtsdisplay(diff(wmurders))

#The data still appears to have a trend downward rather than a constant mean.  The variance is also inconsistent
#Will take the first difference of the first difference of wmurders data
ggtsdisplay(diff(diff(wmurders)))

#The ACF and PCD both decay to 0, but surprisingly the early ACF lags cross the critical lines which makes me think we will need both a p & q
#d = 2, number of differences
#normally would prefer on an MA or AR term. Given the initial negative noise at lag 1, assume a q =1, but the positive spike at lag 2 indicate
#I will try  an ARIMA(0,2,1) 
wmurders_arima <- arima(wmurders, order = c(0,2,1))
wmurders_arima_fc <- forecast(wmurders_arima)
autoplot(wmurders_arima_fc)

accuracy(wmurders_arima_fc)

##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01306101 0.2118445 0.1559694 -0.3151353 4.447123 0.9591385
##                    ACF1
## Training set -0.2011523

#b. Should you include a constant in the model? Explain.
#Because there have already been two differencing of the data, I would not use a constant which could have a drastic impact on the forecasts

#c.Write this model in terms of the backshift operator.
#(1 − B)(1 − B^2) yT

#d. Fit the model using R and examine the residuals. Is the model satisfactory?
checkresiduals(wmurders_arima_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,1)
## Q* = 13.04, df = 9, p-value = 0.1608
## 
## Model df: 1.   Total lags used: 10

#The residuals actually look pretty good, and are statistically significant as representing white noise.  
#The variance is not stationary but does resolve around a mean of 0
#The residuals are almost normally distributed but have a bit of a bias towards a left skew

#e. Forecast three times ahead. 
forecast(wmurders_arima, h = 3)

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.504087 2.227523 2.780652 2.081119 2.927056
## 2006       2.418792 2.007540 2.830044 1.789836 3.047748
## 2007       2.333496 1.804849 2.862144 1.524999 3.141993

#f. Create a plot of the series with forecasts and prediction intervals for the next three periods shown.
wmurders_arima %>% forecast(h=3) %>% autoplot()

#g. oes auto.arima() give the same model you have chosen? If not, which model do you think is better?
auto_a_wmurders <- auto.arima(wmurders)
auto_a_wmurders

## Series: wmurders 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.2434  -0.8261
## s.e.   0.1553   0.1143
## 
## sigma^2 estimated as 0.04632:  log likelihood=6.44
## AIC=-6.88   AICc=-6.39   BIC=-0.97

#Auto_ARIMA gives an ARIMA(1,2,1) which is not what I selected
auto_a_wmurders_fc <- forecast(auto_a_wmurders, h=3)
autoplot(auto_a_wmurders_fc)

accuracy(auto_a_wmurders_fc)

##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
##                    ACF1
## Training set 0.02176343

#My selected model ARIMA(0,2,1) has a slightly higher RMSE on the training data than the AUTOARIMA model but I actually prefer the more conservative trend employed by my model

#8. Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.
#a. Use auto.arima() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.
austa.8.a_fit <- auto.arima(austa)
austa.8.a_fit

## Series: austa 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
## 
## sigma^2 estimated as 0.03376:  log likelihood=10.62
## AIC=-15.24   AICc=-14.46   BIC=-10.57

#ARIMA (0,1,1) with drift selected, which indicates - ACF, first differencing, and no seasonality
austa.8.a_fc <- forecast(austa.8.a_fit, h=10)
autoplot(austa.8.a_fc)

checkresiduals(austa.8.a_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
## 
## Model df: 2.   Total lags used: 7

#Residuals appear to be white noise, but definitely some variance and potential bias in the residuals

#b. Plot forecasts from an ARIMA(0,1,1) model with no drift and compare these to part a. Remove the MA term and plot again.
austa.8.b_fit1 <- arima(austa, order = c(0,1,1))
austa.8.b_fc1 <- forecast(austa.8.b_fit1, h=10)
autoplot(austa.8.b_fc1)

#The ARIMA(0,1,1) model has a flat forecast with a wider confidence interval as compared to the AutoARIMA model which is a tighter forecast moving with an upward trend
#The auto selected model does seem to better match the trend of the data
austa.8.b_fit2 <- Arima(austa, order = c(0,1,0))
austa.8.b_fc2 <- forecast(austa.8.b_fit2, h=10)
autoplot(austa.8.b_fc2)

#Removing the MA term lowers the point forecast as compared to the prior model in b

#c. Plot forecasts from an ARIMA(2,1,3) model with drift. Remove the constant and see what happens.
#With drift
austa.8.c_fit1 <- Arima(austa, c(2,1,3), include.drift = TRUE)
austa.8.c_fc1 <- forecast(austa.8.c_fit1, h=10)
autoplot(austa.8.c_fc1)

austa.8.c_fc1

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2016       7.156908 6.939218 7.374599 6.823980 7.489837
## 2017       7.372474 7.058446 7.686502 6.892209 7.852738
## 2018       7.546777 7.189693 7.903861 7.000664 8.092890
## 2019       7.685798 7.312327 8.059268 7.114624 8.256971
## 2020       7.797679 7.420897 8.174462 7.221440 8.373919
## 2021       7.891704 7.515207 8.268201 7.315901 8.467506
## 2022       7.977317 7.598822 8.355811 7.398459 8.556174
## 2023       8.063289 7.678485 8.448092 7.474783 8.651795
## 2024       8.157055 7.762739 8.551370 7.554001 8.760108
## 2025       8.264270 7.859810 8.668730 7.645702 8.882838

#Without drift
#austa.8.c_fit2 <- Arima(austa, c(2,1,3))
#Nonstationary data receiving an error

#d. Plot forecasts from an ARIMA(0,0,1) model with a constant. Remove the MA term and plot again.
austa.8.d_fit1 <- Arima(austa, c(0,0,1), include.constant = TRUE)
austa.8.d_fc1 <- forecast(austa.8.d_fit1, h=10)
autoplot(austa.8.d_fc1)

austa.8.d_fc1

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2016       4.997220 3.741604 6.252836 3.076921 6.917519
## 2017       3.551491 1.799302 5.303680 0.871749 6.231232
## 2018       3.551491 1.799302 5.303680 0.871749 6.231232
## 2019       3.551491 1.799302 5.303680 0.871749 6.231232
## 2020       3.551491 1.799302 5.303680 0.871749 6.231232
## 2021       3.551491 1.799302 5.303680 0.871749 6.231232
## 2022       3.551491 1.799302 5.303680 0.871749 6.231232
## 2023       3.551491 1.799302 5.303680 0.871749 6.231232
## 2024       3.551491 1.799302 5.303680 0.871749 6.231232
## 2025       3.551491 1.799302 5.303680 0.871749 6.231232

#Large downward shift after last observation
austa.8.d_fit2 <- Arima(austa, c(0,0,0), include.constant = TRUE)
austa.8.d_fc2 <- forecast(austa.8.d_fit2, h=10)
autoplot(austa.8.d_fc2)

austa.8.d_fc2

##      Point Forecast    Lo 80    Hi 80        Lo 95    Hi 95
## 2016       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2017       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2018       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2019       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2020       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2021       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2022       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2023       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2024       3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2025       3.541428 1.223915 5.858941 -0.002902594 7.085758

#The point forecast drops even lower when remove the MA term and the forecast flattens out at the constant

#e. Plot forecasts from an ARIMA(0,2,1) model with no constant.
austa %>% Arima(order = c(0,2,1), include.constant = FALSE) %>% forecast(h=10) %>% autoplot()

#The resulting forecast almost looks like a regression line

#9. For the usgdp series:
#a.if necessary, find a suitable Box-Cox transformation for the data;
autoplot(usgdp)

ggseasonplot(usgdp)

#Pretty flat by quarter. not showing a tone of seasonality, so I don't think it's actually needed
#For good measure, I had calculated a box.cox transformation in 3b
usgdp_lambda

## [1] 0.366352

#With boxcox transformation
usgdp_bc <- BoxCox(usgdp, usgdp_lambda)
autoplot(usgdp_bc)

ggtsdisplay(usgdp_bc)

#Need to differencing in order to stabilize
usgdp_bcdif <- diff(BoxCox(usgdp, usgdp_lambda))
ggtsdisplay(usgdp_bcdif)

#Looks better - ACF decreases after 2nd lag and PACF after lag 1
#Variance does seem a bit smaller after 1980 as compared to prior
#Would lean to ARIMA(2,1,0) model with drift

#Without boxcox transformation
ndiffs(usgdp)

## [1] 2

usgdp_diffs <- diff(diff(usgdp))
ggtsdisplay(usgdp_diffs)

#Once again there are a few spikes in variance that results in higher PACF at lag 8 & 11
#Otherwise, autocorrelation drops faster than when boxcox transformed
#Resulting data suggests that an ARIMA(0,2,1) with drift model would be worth testing

#B. fit a suitable ARIMA model to the transformed data using auto.arima()
usgdp_auto.a1 <- auto.arima(usgdp_bc)
usgdp_auto.a1

## Series: usgdp_bc 
## ARIMA(2,1,0) with drift 
## 
## Coefficients:
##          ar1     ar2   drift
##       0.2795  0.1208  0.1829
## s.e.  0.0647  0.0648  0.0202
## 
## sigma^2 estimated as 0.03518:  log likelihood=61.56
## AIC=-115.11   AICc=-114.94   BIC=-101.26

#This model actually, exactly matches what I predicted based on my
usgdp_auto.a1 %>% forecast(h=4*5) %>% autoplot()

checkresiduals(usgdp_auto.a1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,0) with drift
## Q* = 6.5772, df = 5, p-value = 0.254
## 
## Model df: 3.   Total lags used: 8

accuracy(usgdp_auto.a1)

##                        ME      RMSE       MAE          MPE     MAPE      MASE
## Training set 0.0006791305 0.1859861 0.1417489 -0.004468837 0.264737 0.1789208
##                     ACF1
## Training set 0.009570738

usgdp_auto.a2 <- auto.arima(usgdp)
usgdp_auto.a2

## Series: usgdp 
## ARIMA(2,2,2) 
## 
## Coefficients:
##           ar1     ar2      ma1      ma2
##       -0.1228  0.3106  -0.5835  -0.3669
## s.e.   0.2869  0.0872   0.3004   0.2862
## 
## sigma^2 estimated as 1604:  log likelihood=-1199.57
## AIC=2409.13   AICc=2409.39   BIC=2426.43

#This model actually, exactly matches what I predicted based on my
usgdp_auto.a2 %>% forecast(h=4*5) %>% autoplot()

#This is quite different from my prediction for the untransformed data.
checkresiduals(usgdp_auto.a2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,2,2)
## Q* = 8.6247, df = 4, p-value = 0.0712
## 
## Model df: 4.   Total lags used: 8

accuracy(usgdp_auto.a2)

##                   ME     RMSE      MAE        MPE      MAPE      MASE
## Training set 4.40617 39.54446 29.69209 0.09143455 0.6916719 0.1678117
##                     ACF1
## Training set -0.01784487

#c. Testing the models that I suggested in A for the nontransfromed data
usgdp_arima.test1 <- Arima(usgdp, order = c(0,2,1), include.drift = TRUE)

## Warning in Arima(usgdp, order = c(0, 2, 1), include.drift = TRUE): No drift term
## fitted as the order of difference is 2 or more.

usgdp_arima.test1

## Series: usgdp 
## ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.7139
## s.e.   0.0758
## 
## sigma^2 estimated as 1747:  log likelihood=-1210.49
## AIC=2424.98   AICc=2425.03   BIC=2431.9

checkresiduals(usgdp_arima.test1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,1)
## Q* = 21.288, df = 7, p-value = 0.003367
## 
## Model df: 1.   Total lags used: 8

accuracy(usgdp_arima.test1)

##                    ME     RMSE      MAE        MPE      MAPE      MASE
## Training set 1.500632 41.52639 31.76404 0.03723613 0.7484553 0.1795218
##                   ACF1
## Training set 0.0685883

usgdp_arima.test1 %>% forecast(h=20) %>% autoplot()

#d. choose what you think is the best model and check the residual diagnostics;
#usgdp_auto.a1, which uses the boxcox transformed data looks to be the best model
#This model has minimal autocorrelation, the lowest RMSE of the 3 tested, and drifts upwards as would be expected for GDP over time

#e.produce forecasts of your fitted model. Do the forecasts look reasonable?
forecast(usgdp_auto.a1, h=5*4)

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2006 Q2       81.15093 80.91054 81.39132 80.78329 81.51857
## 2006 Q3       81.37885 80.98847 81.76923 80.78181 81.97589
## 2006 Q4       81.58145 81.05350 82.10939 80.77403 82.38886
## 2007 Q1       81.77527 81.12659 82.42396 80.78320 82.76735
## 2007 Q2       81.96359 81.20736 82.71982 80.80704 83.12014
## 2007 Q3       82.14931 81.29632 83.00230 80.84478 83.45384
## 2007 Q4       82.33363 81.39250 83.27477 80.89430 83.77297
## 2008 Q1       82.51726 81.49494 83.53957 80.95376 84.08075
## 2008 Q2       82.70051 81.60271 83.79831 81.02157 84.37945
## 2008 Q3       82.88358 81.71502 84.05214 81.09642 84.67074
## 2008 Q4       83.06655 81.83120 84.30190 81.17725 84.95586
## 2009 Q1       83.24948 81.95074 84.54821 81.26323 85.23573
## 2009 Q2       83.43237 82.07318 84.79157 81.35367 85.51108
## 2009 Q3       83.61526 82.19818 85.03233 81.44803 85.78249
## 2009 Q4       83.79814 82.32544 85.27083 81.54585 86.05042
## 2010 Q1       83.98101 82.45472 85.50729 81.64676 86.31526
## 2010 Q2       84.16388 82.58582 85.74194 81.75045 86.57731
## 2010 Q3       84.34675 82.71856 85.97494 81.85665 86.83685
## 2010 Q4       84.52962 82.85280 86.20644 81.96515 87.09410
## 2011 Q1       84.71249 82.98841 86.43658 82.07574 87.34925

#They do appear reasonable in terms of trajectory
(81.96359-81.15093)/81.15093

## [1] 0.01001418

#However 1% seems like a very small annual GDP growth rate, but I now realize this is still logged

#f.compare the results with what you would obtain using ets() (with no transformation).
usgdp_ets <- ets(usgdp)
usgdp_ets

## ETS(A,A,N) 
## 
## Call:
##  ets(y = usgdp) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.278 
## 
##   Initial states:
##     l = 1557.4589 
##     b = 18.6862 
## 
##   sigma:  41.8895
## 
##      AIC     AICc      BIC 
## 3072.303 3072.562 3089.643

#error = additive, trend = additive, seasonality = neither
usgdp_ets_fc <- forecast(usgdp_ets, h=20)
autoplot(usgdp_ets_fc)

usgdp_ets_fc

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2006 Q2       11508.06 11454.38 11561.75 11425.96 11590.16
## 2006 Q3       11612.53 11525.42 11699.64 11479.31 11745.76
## 2006 Q4       11717.00 11596.31 11837.69 11532.42 11901.58
## 2007 Q1       11821.47 11665.72 11977.22 11583.27 12059.67
## 2007 Q2       11925.94 11733.29 12118.59 11631.30 12220.57
## 2007 Q3       12030.41 11798.94 12261.87 11676.41 12384.40
## 2007 Q4       12134.87 11862.68 12407.07 11718.58 12551.17
## 2008 Q1       12239.34 11924.53 12554.15 11757.88 12720.80
## 2008 Q2       12343.81 11984.56 12703.07 11794.38 12893.24
## 2008 Q3       12448.28 12042.81 12853.76 11828.16 13068.40
## 2008 Q4       12552.75 12099.33 13006.17 11859.30 13246.20
## 2009 Q1       12657.22 12154.17 13160.26 11887.88 13426.56
## 2009 Q2       12761.69 12207.39 13315.98 11913.97 13609.41
## 2009 Q3       12866.16 12259.03 13473.28 11937.63 13794.68
## 2009 Q4       12970.62 12309.13 13632.12 11958.95 13982.30
## 2010 Q1       13075.09 12357.72 13792.47 11977.97 14172.22
## 2010 Q2       13179.56 12404.86 13954.27 11994.75 14364.37
## 2010 Q3       13284.03 12450.56 14117.50 12009.35 14558.71
## 2010 Q4       13388.50 12494.87 14282.13 12021.81 14755.19
## 2011 Q1       13492.97 12537.81 14448.12 12032.18 14953.75

(11821.47- 11508.06)/ 11508.06

## [1] 0.02723396

#This seems right at just below 3% per year
checkresiduals(usgdp_ets_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,A,N)
## Q* = 21.461, df = 4, p-value = 0.0002565
## 
## Model df: 4.   Total lags used: 8

#Some autocorrelation in the ets model and residuals are not normally distributed
accuracy(usgdp_ets_fc)

##                   ME     RMSE      MAE        MPE      MAPE     MASE       ACF1
## Training set 1.30185 41.53448 31.85324 0.02319607 0.7543931 0.180026 0.07596492

#RMSE was lower for the transformed ARIMA model, I think overall the AUTO.ARIMA model with the boxcox transformation is my best bet

#10 Consider austourists, the quarterly visitor nights (in millions) spent by international tourists to Australia for the period 1999–2015.
#a. Describe the timeplot
#?austourists > quarterly visitor nights in millions by tourists in Austrailia
autoplot(austourists)

ggseasonplot(austourists)

#From 1999 to 2015 there has been increasing trend of tourist visitors  overnights in Austrailia 
#There is also clear seasonality in play as tourism visitor nights seems to spike at the end of Q4 into Q1 befor declining strongly in Q2.  I imagine Q2 is a low season for tourism in Austrailia

#b/c. What can you learn from the ACF/PACF graphs
ggtsdisplay(austourists)

#The autocorrelation peaks every 4 lags which makes sense given this is quarterly data.  The ACF also shows slow decay lag 4 to lag 8 to lag 12 etc..
#The PACF likewise shows correlation at lags 4 & 8 but recedes to near 0 after that

#d. Produce plots of the seasonally differenced data  
ndiffs(austourists)

## [1] 1

austourists_diff <- diff(austourists, lag = 4) 
ggtsdisplay(austourists_diff)

#Because of the strong positive ACF to start, I would anticipate an AR 1 or 2 with an MA 1 or 2 and a d of 1

#e. Does auto.arima() give the same model that you chose? If not, which model do you think is better?
austortourists_autoa <- auto.arima(austourists)
austortourists_autoa

## Series: austourists 
## ARIMA(1,0,0)(1,1,0)[4] with drift 
## 
## Coefficients:
##          ar1     sar1   drift
##       0.4705  -0.5305  0.5489
## s.e.  0.1154   0.1122  0.0864
## 
## sigma^2 estimated as 5.15:  log likelihood=-142.48
## AIC=292.97   AICc=293.65   BIC=301.6

austortourists_autoa_fc <- forecast(austourists, h=4*3)
#auto.arima gives an ARIMA(1,0,0)(1,1,0) with drift model
#The auto.arima model is different than I had anticipated because, namely, I forgot that because we are correcting for a seasonal component as well, the model would be in the form (pdq)(PDQ)
#It makes since that the differencing is related solely to the seasonal aspect and because the ACF is positive at the seasonal lags, we'd expect an AR term, not MA

#11. Consider usmelec, the total net generation of electricity (in billion kilowatt hours) by the U.S. electric industry (monthly for the period January 1973 – June 2013). In general there are two peaks per year: in mid-summer and mid-winter.
#a.Examine the 12-month moving average of this series to see what kind of trend is involved.
usmelec_ma <- ma(usmelec, order = 12)
autoplot(usmelec_ma)

autoplot(decompose(usmelec))

#There is a strong upward trend, the seasonality already seems consistent in terms of variance but not with a mean of 0

#b. Do the data need transforming? If so, find a suitable transformation.
ndiffs(usmelec)

## [1] 1

ggtsdisplay(diff(usmelec, lag = 12))

autoplot(decompose(diff(usmelec, lag = 12)))

#After differencing the trend and seasonality are not stable around 0
#In order to make the trend and seasonal components stationary, I am adding a boxcox transformation
usmelec_bc <- BoxCox(usmelec, BoxCox.lambda(usmelec))
ggtsdisplay(usmelec_bc)

#Way too much autocorrelation, so need to difference

#Are the data stationary? If not, find an appropriate differencing which yields stationary data.
ndiffs(usmelec_bc)

## [1] 1

usmelec_bc_diff <- diff(usmelec_bc, lag = 12)
ggtsdisplay(usmelec_bc_diff)

#Because the initial ACF at lag 12 is negative, I would except in MA term

#d. Identify a couple of ARIMA models that might be useful in describing the time series. Which of your models is the best according to their AIC values?
#Model 1 -> ARIMA(0,1,1)(0,1,1)
usmelec_test1 <- Arima(usmelec_bc, order = c(0,1,1), seasonal = c(0,1,1))
usmelec_test1_fc <- forecast(usmelec_test1, h=36)
autoplot(usmelec_test1_fc)

checkresiduals(usmelec_test1_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)(0,1,1)[12]
## Q* = 71.613, df = 22, p-value = 3.678e-07
## 
## Model df: 2.   Total lags used: 24

accuracy(usmelec_test1_fc)

##                         ME        RMSE          MAE          MPE       MAPE
## Training set -3.756618e-05 0.001148661 0.0008761514 -0.002263146 0.05254106
##                   MASE      ACF1
## Training set 0.5840923 0.1214656

#Model 2
#Model 2 -> ARIMA(1,1,2)(1,1,2)
usmelec_test2 <- Arima(usmelec_bc, order = c(1,1,2), seasonal = c(1,1,2))
usmelec_test2_fc <- forecast(usmelec_test2, h=36)
autoplot(usmelec_test2_fc)

checkresiduals(usmelec_test2_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,2)(1,1,2)[12]
## Q* = 28.233, df = 18, p-value = 0.0586
## 
## Model df: 6.   Total lags used: 24

accuracy(usmelec_test2_fc)

##                         ME        RMSE         MAE          MPE       MAPE
## Training set -5.798143e-05 0.001109599 0.000841201 -0.003477622 0.05044141
##                   MASE        ACF1
## Training set 0.5607924 0.001225037

#Model 2 test performs better than model 1 in terms of RMSE on training data and autocorrelation

#e. Estimate the parameters of your best model and do diagnostic testing on the residuals. Do the residuals resemble white noise? If not, try to find another ARIMA model which fits better.
#The residuals for ARIMA(1,1,2)(1,1,2) does resemble white noise,just barely passing the Ljung-Box test

#f. Forecast the next 15 years of electricity generation by the U.S. electric industry. Get the latest figures from the EIA to check the accuracy of your forecasts.
usmelec_test2_fc_15 <- forecast(usmelec_test2, h = 12*15)
autoplot(usmelec_test2_fc_15)

#g. Eventually, the prediction intervals are so wide that the forecasts are not particularly useful. How many years of forecasts do you think are sufficiently accurate to be usable?
length(usmelec)/12

## [1] 40.5

#Given the length of the historical data, I would say 3-8 years.  I would not go beyond 20 percent of the training data.

#12. For the mcopper data:
#a. if necessary, find a suitable Box-Cox transformation for the data
autoplot(mcopper)

#Slow upward trend, unclear on seasonality.  Likely cyclical
mcopper_lambda <-BoxCox.lambda(mcopper)
ndiffs(BoxCox(mcopper, mcopper_lambda))

## [1] 1

ggtsdisplay(diff(BoxCox(mcopper, mcopper_lambda)))

#mcopper requires 1 difference on a box.cox transformation with lambda 0.19

#b.fit a suitable ARIMA model to the transformed data using auto.arima();
mcopper_arima <- auto.arima(mcopper, lambda = mcopper_lambda)
mcopper_arima <- forecast(mcopper_arima, h = 12*3)
#AutoARIMA produces an ARIMA(0,1,1) model
autoplot(mcopper_arima)

#c.
model_mcopper_test1 <- Arima(mcopper, order = c(0,1,2), lambda = mcopper_lambda)
model_mcopper_test1_fc <- forecast(model_mcopper_test1, h = 12*3)
autoplot(model_mcopper_test1_fc)

model_mcopper_test2 <- Arima(mcopper, order = c(1,1,2), lambda = mcopper_lambda)
model_mcopper_test2_fc <- forecast(model_mcopper_test2, h = 12*3)
autoplot(model_mcopper_test2_fc)

#All three models look quite similar - ARIMA(1,1,2) has the highest projection.  I will stick with the auto_arima

#d. choose what you think is the best model and check the residual diagnostics
checkresiduals(mcopper_arima)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)
## Q* = 22.913, df = 23, p-value = 0.4659
## 
## Model df: 1.   Total lags used: 24

#Residuals look great for the auto generated ARIMA model, clearly white noise

#e. produce forecasts of your fitted model. Do the forecasts look reasonable?
mcopper_arima

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2007       3387.143 3188.095 3596.115 3086.619 3710.882
## Feb 2007       3387.143 3054.898 3747.998 2889.994 3951.229
## Mar 2007       3387.143 2964.977 3856.608 2759.364 4125.610
## Apr 2007       3387.143 2893.279 3947.003 2656.458 4272.299
## May 2007       3387.143 2832.337 4026.647 2569.883 4402.686
## Jun 2007       3387.143 2778.707 4098.983 2494.387 4522.024
## Jul 2007       3387.143 2730.462 4165.940 2427.034 4633.250
## Aug 2007       3387.143 2686.396 4228.725 2365.987 4738.201
## Sep 2007       3387.143 2645.693 4288.152 2310.004 4838.118
## Oct 2007       3387.143 2607.770 4344.804 2258.202 4933.884
## Nov 2007       3387.143 2572.198 4399.111 2209.926 5026.155
## Dec 2007       3387.143 2538.644 4451.405 2164.672 5115.434
## Jan 2008       3387.143 2506.849 4501.947 2122.046 5202.116
## Feb 2008       3387.143 2476.603 4550.944 2081.731 5286.517
## Mar 2008       3387.143 2447.735 4598.569 2043.468 5368.896
## Apr 2008       3387.143 2420.103 4644.964 2007.042 5449.469
## May 2008       3387.143 2393.588 4690.247 1972.273 5528.415
## Jun 2008       3387.143 2368.088 4734.520 1939.008 5605.888
## Jul 2008       3387.143 2343.517 4777.871 1907.114 5682.019
## Aug 2008       3387.143 2319.799 4820.374 1876.478 5756.923
## Sep 2008       3387.143 2296.867 4862.096 1847.001 5830.699
## Oct 2008       3387.143 2274.664 4903.095 1818.596 5903.434
## Nov 2008       3387.143 2253.139 4943.421 1791.184 5975.204
## Dec 2008       3387.143 2232.247 4983.121 1764.699 6046.080
## Jan 2009       3387.143 2211.945 5022.235 1739.078 6116.121
## Feb 2009       3387.143 2192.198 5060.800 1714.266 6185.383
## Mar 2009       3387.143 2172.973 5098.849 1690.215 6253.916
## Apr 2009       3387.143 2154.240 5136.412 1666.878 6321.764
## May 2009       3387.143 2135.971 5173.516 1644.216 6388.968
## Jun 2009       3387.143 2118.141 5210.186 1622.190 6455.566
## Jul 2009       3387.143 2100.729 5246.444 1600.768 6521.591
## Aug 2009       3387.143 2083.712 5282.312 1579.917 6587.075
## Sep 2009       3387.143 2067.072 5317.809 1559.609 6652.047
## Oct 2009       3387.143 2050.791 5352.953 1539.818 6716.532
## Nov 2009       3387.143 2034.853 5387.759 1520.518 6780.556
## Dec 2009       3387.143 2019.242 5422.244 1501.688 6844.141

#I am concerned that the most recent spike is overly impacting the projection, our forecast remains well above 90% of data.
accuracy(mcopper_arima)

##                    ME     RMSE      MAE      MPE     MAPE      MASE        ACF1
## Training set 3.480533 77.27254 44.92858 0.166202 4.303677 0.2021433 -0.08442198

#f. compare the results with what you would obtain using ets() (with no transformation).
mcopper_ets <- ets(mcopper)
mcopper_ets_fc <- forecast(mcopper_ets, h = 36)
autoplot(mcopper_ets_fc)

checkresiduals(mcopper_ets_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,Ad,N)
## Q* = 77.585, df = 19, p-value = 4.832e-09
## 
## Model df: 5.   Total lags used: 24

accuracy(mcopper_ets)

##                    ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
## Training set 2.691483 78.87699 46.76047 0.1883633 4.503026 0.2103854 0.1052005

#Some initial autocorrelation, definitely not all white noise residuals
#But the accuracy in terms of RMSE is very similar to the ARIMA model.
#I would actually prefer the ETS model because I think the decay in our forecast makes more sense given the historicals

#13. Choose one of the following seasonal time series: hsales, auscafe, qauselec, qcement, qgas.
#?auscafe <- monthly expenditure on cafes
#a. Do the data need transforming? If so, find a suitable transformation.
autoplot(auscafe)

ggseasonplot(auscafe)

#The AUS coffee data has a strong upward trend with soem small seasonality
autoplot(decompose(auscafe))

auscafe_lambda <- BoxCox.lambda(auscafe)
#0.1 lambda is close to a full log transformation

#b.Are the data stationary? If not, find an appropriate differencing which yields stationary data.
auscafe_bc <- BoxCox(auscafe, lambda = auscafe_lambda)
ggtsdisplay(auscafe_bc)

#After the transformation, the data is still not stationary due to serious autocorrelation
ndiffs(BoxCox(auscafe, auscafe_lambda))

## [1] 1

#1 differencing of the transformed data is required
auscafe_bc_dif <- diff(auscafe_bc)
ggtsdisplay(auscafe_bc_dif)

#c.Identify a couple of ARIMA models that might be useful in describing the time series. Which of your models is the best according to their AIC values?
#Test model 1 -> ARIMA(1,0,1)(1,1,1)
auscafe_arima_t1 <- Arima(auscafe, order = c(1,0,1), seasonal = c(1,1,1), lambda = auscafe_lambda)
summary(auscafe_arima_t1)

## Series: auscafe 
## ARIMA(1,0,1)(1,1,1)[12] 
## Box Cox transformation: lambda= 0.109056 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1
##       0.9993  -0.3410  0.1251  -0.8598
## s.e.  0.0009   0.0451  0.0611   0.0346
## 
## sigma^2 estimated as 0.000582:  log likelihood=948.56
## AIC=-1887.12   AICc=-1886.97   BIC=-1866.99
## 
## Training set error measures:
##                        ME       RMSE        MAE        MPE     MAPE      MASE
## Training set -5.57067e-06 0.03699778 0.02701454 0.02639823 1.810944 0.2585459
##                      ACF1
## Training set 0.0001143763

#Test Model 2, same parameters just with drift
auscafe_arima_t2 <- Arima(auscafe, order = c(1,0,1), seasonal = c(1,1,1), lambda = auscafe_lambda, include.drift = TRUE)
summary(auscafe_arima_t2)

## Series: auscafe 
## ARIMA(1,0,1)(1,1,1)[12] with drift 
## Box Cox transformation: lambda= 0.109056 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1   drift
##       0.9711  -0.3230  0.1398  -0.8610  0.0056
## s.e.  0.0134   0.0474  0.0628   0.0357  0.0004
## 
## sigma^2 estimated as 0.000575:  log likelihood=952.56
## AIC=-1893.12   AICc=-1892.92   BIC=-1868.97
## 
## Training set error measures:
##                         ME       RMSE        MAE        MPE     MAPE      MASE
## Training set -0.0008028054 0.03670207 0.02689896 0.02251643 1.796536 0.2574396
##                      ACF1
## Training set -0.006143918

#The model with drift performed slightly better in terms of AICc

#d. Estimate the parameters of your best model and do diagnostic testing on the residuals. Do the residuals resemble white noise? If not, try to find another ARIMA model which fits better.
checkresiduals(forecast(auscafe_arima_t2))

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,1)(1,1,1)[12] with drift
## Q* = 57.921, df = 19, p-value = 8.214e-06
## 
## Model df: 5.   Total lags used: 24

#The residuals do not appear to be white noise.
#To find a better model, I will leverage the auto.arima function
auscafe_arima_t3 <- auto.arima(auscafe, lambda = auscafe_lambda)
summary(auscafe_arima_t3)

## Series: auscafe 
## ARIMA(1,0,1)(2,1,1)[12] with drift 
## Box Cox transformation: lambda= 0.109056 
## 
## Coefficients:
##          ar1      ma1    sar1     sar2     sma1   drift
##       0.9718  -0.3190  0.1270  -0.0527  -0.8423  0.0056
## s.e.  0.0131   0.0478  0.0649   0.0585   0.0431  0.0004
## 
## sigma^2 estimated as 0.0005754:  log likelihood=952.96
## AIC=-1891.92   AICc=-1891.65   BIC=-1863.74
## 
## Training set error measures:
##                         ME       RMSE        MAE        MPE    MAPE      MASE
## Training set -0.0008622186 0.03664217 0.02686113 0.01635299 1.79394 0.2570775
##                     ACF1
## Training set -0.00872097

auscafe_arima_t3_fc <- forecast(auscafe_arima_t3, h = 24)
checkresiduals((auscafe_arima_t3_fc))

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,1)(2,1,1)[12] with drift
## Q* = 56.069, df = 18, p-value = 8.692e-06
## 
## Model df: 6.   Total lags used: 24

#Still not white noise residuals, but slightly bett AICc with the ARIMA(101)(211) with drift

#e. Forecast the next 24 months of data using your preferred model.
autoplot(auscafe_arima_t3_fc)

accuracy(auscafe_arima_t3_fc)

##                         ME       RMSE        MAE        MPE    MAPE      MASE
## Training set -0.0008622186 0.03664217 0.02686113 0.01635299 1.79394 0.2570775
##                     ACF1
## Training set -0.00872097

#f. Compare the forecasts obtained using ets()
auscafe_ets <- ets(auscafe)
auscafe_ets_fc <- forecast(auscafe_ets, h = 12)
auscafe_ets_fc

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Oct 2017       3.833084 3.710682 3.955486 3.645886 4.020282
## Nov 2017       3.813389 3.669399 3.957380 3.593176 4.033603
## Dec 2017       4.217663 4.036861 4.398464 3.941150 4.494175
## Jan 2018       3.826123 3.644415 4.007830 3.548225 4.104020
## Feb 2018       3.514610 3.332694 3.696526 3.236394 3.792827
## Mar 2018       3.870005 3.654231 4.085779 3.540007 4.200003
## Apr 2018       3.783810 3.558549 4.009072 3.439302 4.128318
## May 2018       3.811589 3.570956 4.052222 3.443573 4.179605
## Jun 2018       3.678277 3.433373 3.923180 3.303729 4.052825
## Jul 2018       3.875118 3.604236 4.146000 3.460840 4.289396
## Aug 2018       3.886458 3.602300 4.170615 3.451876 4.321039
## Sep 2018       3.854391 3.560568 4.148215 3.405027 4.303755

autoplot(auscafe_ets_fc)

checkresiduals(auscafe_ets_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,A,M)
## Q* = 83.332, df = 8, p-value = 1.044e-14
## 
## Model df: 16.   Total lags used: 24

accuracy(auscafe_ets_fc)

##                       ME       RMSE        MAE       MPE     MAPE      MASE
## Training set 0.002601818 0.03997611 0.02952838 0.1121434 1.982609 0.2826048
##                   ACF1
## Training set 0.1249781

#It seems that the seasonal ARIMA model with drift actually outperforms the ETS model

#14. For the same time series you used in the previous exercise, try using a non-seasonal model applied to the seasonally adjusted data obtained from STL. The stlf() function will make the calculations easy (with method="arima"). Compare the forecasts with those obtained in the previous exercise. Which do you think is the best approach?
auscafe_stlf = stlf(auscafe, auscafe_lambda, s.window = 13, robust = TRUE, method = "arima",h = 24)
autoplot(auscafe_stlf)

checkresiduals(auscafe_stlf)

## Warning in checkresiduals(auscafe_stlf): The fitted degrees of freedom is based
## on the model used for the seasonally adjusted data.

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ARIMA(3,2,2)
## Q* = 166, df = 19, p-value < 2.2e-16
## 
## Model df: 5.   Total lags used: 24

accuracy(auscafe_stlf)

##                       ME       RMSE       MAE       MPE     MAPE      MASE
## Training set 0.003009462 0.04466553 0.0315392 0.3044353 2.377539 0.3018496
##                     ACF1
## Training set 0.001130337

#The seasonal_trend decomp actually seems to perform worse in terms of minimizing error on the training data.
#The AUTO.ARIMA model from question 13 has performed best.  All three comparison models fail the residual white noise test

#15. For your retail time series (Exercise 5 above):
#a. develop an appropriate seasonal ARIMA model;
retail_ts_arima <- auto.arima(train_retail)
retail_ts_arima

## Series: train_retail 
## ARIMA(0,1,1)(0,1,2)[12] 
## 
## Coefficients:
##           ma1     sma1     sma2
##       -0.6403  -0.3578  -0.1922
## s.e.   0.0490   0.0560   0.0511
## 
## sigma^2 estimated as 687.6:  log likelihood=-1556.12
## AIC=3120.25   AICc=3120.37   BIC=3135.47

#ARIMA(0,1,1)(0,1,2) - MA terms with 1 difference

#b. compare the forecasts with those you obtained in earlier chapters;
retail_ts_arima_fc <- forecast(retail_ts_arima, h = 12*3)
retail_ts_arima_fc

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2011       2140.367 2106.762 2173.972 2088.972 2191.762
## Feb 2011       1944.746 1909.033 1980.459 1890.128 1999.365
## Mar 2011       2124.284 2086.580 2161.987 2066.622 2181.946
## Apr 2011       2048.629 2009.036 2088.223 1988.077 2109.182
## May 2011       2088.057 2046.659 2129.454 2024.745 2151.368
## Jun 2011       1999.884 1956.759 2043.010 1933.929 2065.840
## Jul 2011       2116.059 2071.271 2160.846 2047.562 2184.556
## Aug 2011       2113.213 2066.823 2159.603 2042.265 2184.160
## Sep 2011       2077.802 2029.863 2125.741 2004.486 2151.118
## Oct 2011       2176.403 2126.964 2225.842 2100.792 2252.014
## Nov 2011       2237.904 2187.009 2288.800 2160.067 2315.742
## Dec 2011       2435.056 2382.745 2487.367 2355.053 2515.058
## Jan 2012       2236.112 2173.903 2298.321 2140.972 2331.253
## Feb 2012       2049.432 1984.133 2114.731 1949.565 2149.298
## Mar 2012       2221.620 2153.370 2289.869 2117.241 2325.998
## Apr 2012       2147.369 2076.292 2218.446 2038.666 2256.072
## May 2012       2183.488 2109.691 2257.284 2070.625 2296.350
## Jun 2012       2100.419 2024.000 2176.839 1983.546 2217.293
## Jul 2012       2200.824 2121.868 2279.779 2080.072 2321.576
## Aug 2012       2208.697 2127.285 2290.110 2084.188 2333.207
## Sep 2012       2172.870 2089.072 2256.667 2044.713 2301.026
## Oct 2012       2277.566 2191.450 2363.682 2145.863 2409.270
## Nov 2012       2335.878 2247.504 2424.252 2200.722 2471.035
## Dec 2012       2525.779 2435.203 2616.354 2387.255 2664.302
## Jan 2013       2330.999 2233.907 2428.092 2182.509 2479.490
## Feb 2013       2144.319 2043.987 2244.651 1990.874 2297.763
## Mar 2013       2316.507 2213.037 2419.977 2158.263 2474.750
## Apr 2013       2242.256 2135.741 2348.772 2079.355 2405.157
## May 2013       2278.375 2168.898 2387.851 2110.945 2445.804
## Jun 2013       2195.307 2082.947 2307.666 2023.468 2367.145
## Jul 2013       2295.711 2180.541 2410.881 2119.574 2471.848
## Aug 2013       2303.585 2185.671 2421.498 2123.251 2483.918
## Sep 2013       2267.757 2147.162 2388.351 2083.323 2452.190
## Oct 2013       2372.454 2249.236 2495.671 2184.008 2560.899
## Nov 2013       2430.765 2304.979 2556.551 2238.392 2623.138
## Dec 2013       2620.666 2492.363 2748.969 2424.443 2816.888

retail_train_ets_fc

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2011       2134.454 2062.903 2206.004 2025.027 2243.881
## Feb 2011       1955.290 1888.018 2022.561 1852.407 2058.173
## Mar 2011       2138.148 2062.647 2213.649 2022.679 2253.617
## Apr 2011       2053.044 1978.641 2127.447 1939.254 2166.834
## May 2011       2091.754 2013.959 2169.549 1972.777 2210.731
## Jun 2011       2011.179 1934.427 2087.931 1893.797 2128.560
## Jul 2011       2100.292 2018.057 2182.526 1974.525 2226.058
## Aug 2011       2127.319 2041.877 2212.761 1996.647 2257.992
## Sep 2011       2088.343 2002.318 2174.367 1956.779 2219.906
## Oct 2011       2206.706 2113.497 2299.916 2064.155 2349.258
## Nov 2011       2226.786 2130.360 2323.211 2079.316 2374.255
## Dec 2011       2458.152 2349.054 2567.250 2291.302 2625.003
## Jan 2012       2232.986 2128.582 2337.390 2073.314 2392.658
## Feb 2012       2045.206 1947.371 2143.041 1895.581 2194.831
## Mar 2012       2236.098 2126.680 2345.515 2068.758 2403.438
## Apr 2012       2146.738 2039.307 2254.169 1982.437 2311.039
## May 2012       2186.854 2074.954 2298.754 2015.718 2357.990
## Jun 2012       2102.270 1992.303 2212.238 1934.090 2270.451
## Jul 2012       2195.062 2077.711 2312.414 2015.589 2374.536
## Aug 2012       2222.950 2101.516 2344.384 2037.233 2408.667
## Sep 2012       2181.872 2060.111 2303.632 1995.655 2368.088
## Oct 2012       2305.169 2173.784 2436.554 2104.233 2506.105
## Nov 2012       2325.777 2190.422 2461.132 2118.770 2532.785
## Dec 2012       2567.026 2414.517 2719.535 2333.783 2800.269
## Jan 2013       2331.539 2187.523 2475.554 2111.286 2551.791
## Feb 2013       2135.141 2000.667 2269.614 1929.481 2340.801
## Mar 2013       2334.068 2184.210 2483.927 2104.880 2563.257
## Apr 2013       2240.452 2093.839 2387.065 2016.227 2464.677
## May 2013       2281.973 2129.804 2434.143 2049.250 2514.697
## Jun 2013       2193.381 2044.368 2342.394 1965.486 2421.277
## Jul 2013       2289.853 2131.392 2448.314 2047.508 2532.198
## Aug 2013       2318.602 2155.199 2482.004 2068.700 2568.504
## Sep 2013       2275.420 2112.144 2438.697 2025.711 2525.130
## Oct 2013       2403.653 2228.072 2579.234 2135.125 2672.181
## Nov 2013       2424.790 2244.514 2605.066 2149.081 2700.498
## Dec 2013       2675.924 2473.477 2878.370 2366.308 2985.539

autoplot(retail_train_ets_fc)

autoplot(retail_ts_arima_fc)

#The ARIMA forecast starts slightly higher than the ETS forecast but icnreases more slowly
accuracy(retail_ts_arima_fc)

##                    ME     RMSE      MAE        MPE    MAPE      MASE
## Training set 1.278019 25.60723 18.89797 0.09624258 1.95848 0.3063655
##                     ACF1
## Training set -0.01434914

checkresiduals(retail_ts_arima_fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)(0,1,2)[12]
## Q* = 115.3, df = 21, p-value = 5.107e-15
## 
## Model df: 3.   Total lags used: 24

accuracy(retail_train_ets_fc)

##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 2.238717 25.19277 18.12092 0.1862117 1.934037 0.2937683 0.01490798

#Both models have very similar error statistics, but the RMSE is slighlty lower on the ETS model and less residual autocorrelation

#16. Consider sheep, the sheep population of England and Wales from 1867–1939.
#a. Produce a time plot of the time series.
autoplot(sheep)

#b. Assume you decide to fit the following model:
#Would this be an ARIMA (3,1,0) model?
#cBy examining the ACF and PACF of the differenced data, explain why this model is appropriate.
ggtsdisplay(sheep)

#An AR model makes sense because the ACF residuals are positive and decay, while the PACF recedes below the critical value by lag 3

#e. Now fit the model in R and obtain the forecasts using forecast. How are they different from yours? Why?
frequency(sheep)

## [1] 1

sheep %>% auto.arima() %>% forecast(h=3)

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 1940       1777.996 1687.454 1868.537 1639.524 1916.467
## 1941       1718.869 1561.544 1876.193 1478.261 1959.476
## 1942       1695.985 1494.151 1897.819 1387.306 2004.664

#17

#18

Hwk 2 - Chapter 7&8 - Predictive Analytics

Thomas Laffey

11/28/2021