Discussion 3: ETS Models

I decided to use vehicle sales because they have been a hot-button topic due to supply chain issues.

# Libraries/Data ----------------------------------------------------------
library('tidyverse')

## Warning: package 'tidyverse' was built under R version 4.1.3

## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1

## Warning: package 'ggplot2' was built under R version 4.1.3

## Warning: package 'tibble' was built under R version 4.1.3

## Warning: package 'tidyr' was built under R version 4.1.3

## Warning: package 'readr' was built under R version 4.1.3

## Warning: package 'purrr' was built under R version 4.1.3

## Warning: package 'dplyr' was built under R version 4.1.3

## Warning: package 'stringr' was built under R version 4.1.3

## Warning: package 'forcats' was built under R version 4.1.3

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library('lubridate')

## Warning: package 'lubridate' was built under R version 4.1.3

## 
## Attaching package: 'lubridate'

## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

library('forecast')

## Warning: package 'forecast' was built under R version 4.1.3

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

library('fable')

## Warning: package 'fable' was built under R version 4.1.3

## Loading required package: fabletools

## Warning: package 'fabletools' was built under R version 4.1.3

## 
## Attaching package: 'fabletools'

## The following objects are masked from 'package:forecast':
## 
##     accuracy, forecast

vehicle <- read.csv("TOTALSA.csv")
view(vehicle)

colSums(is.na(vehicle))

##    DATE TOTALSA 
##       0       0

# Monthly series; assumes the first observation is January 1976
myts <- ts(vehicle$TOTALSA, frequency = 12, start = c(1976, 1))
myts %>%
  autoplot() +
  labs(title = "US Vehicle Sales",
       y = "Sales in Millions")

The plot appears to show a good amount of seasonality, but the trend is irregular, with no steady direction over the full period. The effects of the financial crisis and COVID-19 are clearly visible as sharp drops in sales.
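Seasonality is hard to judge by eye, so a rough numeric check can help. The sketch below is one possible way to do it, using a common "strength of seasonality" measure built on an STL decomposition. The function name `seasonal_strength` is hypothetical, and the demonstration uses the built-in `AirPassengers` series rather than the vehicle data; for this report one would call `seasonal_strength(myts)` instead. If TOTALSA is a seasonally adjusted series, as the "SA" suffix may indicate, a value near zero would be expected.

```r
# Strength of seasonality from an STL decomposition (a sketch, base R only):
#   F_s = max(0, 1 - Var(remainder) / Var(seasonal + remainder))
# Values near 1 suggest strong seasonality; values near 0 suggest little or none.
seasonal_strength <- function(x) {
  parts <- stl(x, s.window = "periodic")$time.series
  remainder <- as.numeric(parts[, "remainder"])
  seasonal  <- as.numeric(parts[, "seasonal"])
  max(0, 1 - var(remainder) / var(seasonal + remainder))
}

# Demonstrated on a built-in monthly series, not on the vehicle sales data
strength <- seasonal_strength(log(AirPassengers))
print(strength)
```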

# Automated ETS Model -----------------------------------------------------

auto <- ets(myts)
auto

## ETS(A,N,N) 
## 
## Call:
##  ets(y = myts) 
## 
##   Smoothing parameters:
##     alpha = 0.5583 
## 
##   Initial states:
##     l = 13.0378 
## 
##   sigma:  0.9631
## 
##      AIC     AICc      BIC 
## 3462.017 3462.061 3474.969

auto %>% 
  forecast(h = 10) %>% 
  autoplot() +
  labs(title = "US Vehicle Sales",
       y = "Sales in Millions")

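My first model used the ets function with no model specified. By default, this automatically selects the model with the lowest AICc, so I was interested in finding out what it would be. It chose ETS(A,N,N): additive errors with no trend and no seasonal component, which is simple exponential smoothing.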
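To make the ETS(A,N,N) result more concrete, here is a minimal sketch of the level recursion behind simple exponential smoothing. The values of alpha and the initial level are copied from the model output above, while the series `y` and the helper `ses_level` are hypothetical and only for illustration. It is not how `ets()` is implemented internally, just the update rule it corresponds to.

```r
# Simple exponential smoothing: the point forecasts of an ETS(A,N,N) model.
# `y` is a made-up toy series, not the TOTALSA data.
ses_level <- function(y, alpha, l0) {
  level <- numeric(length(y))
  prev <- l0
  for (t in seq_along(y)) {
    # New level = weighted average of the latest observation and the old level
    level[t] <- alpha * y[t] + (1 - alpha) * prev
    prev <- level[t]
  }
  level
}

y  <- c(13.2, 12.8, 13.5, 14.1, 13.9)  # hypothetical values
lv <- ses_level(y, alpha = 0.5583, l0 = 13.0378)

# With no trend or seasonal state, every h-step-ahead forecast is the last level
fc <- rep(tail(lv, 1), 10)
print(round(lv, 4))
print(round(fc, 4))
```

Because the model carries only a level, the 10-step forecast is a flat line at the final smoothed value, with only the prediction intervals widening over the horizon.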

# Seasonal model ----------------------------------------------------------


auto_2 <- ets(myts, 'AAA')
auto_2

## ETS(A,A,A) 
## 
## Call:
##  ets(y = myts, model = "AAA") 
## 
##   Smoothing parameters:
##     alpha = 0.5672 
##     beta  = 1e-04 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 13.4024 
##     b = 0.0012 
##     s = 0.1744 -0.1086 -0.0964 0.174 0.1568 -0.0806
##            -0.1185 -0.0303 -0.1339 -0.0465 0.1198 -0.0103
## 
##   sigma:  0.9668
## 
##      AIC     AICc      BIC 
## 3480.028 3481.169 3553.419

auto_2 %>% 
  forecast(h = 10) %>% 
  autoplot() +
  labs(title = "US Vehicle Sales",
       y = "Sales in Millions")

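My second model was an AAA model (additive error, additive trend, additive seasonality) to force a seasonal component into the fit. I wanted to see if I could almost overfit the ets model to the data for better accuracy, but the ets function knows better than me: beta and gamma were both estimated at 1e-04, so the trend and seasonal states barely update, and the AIC is higher than for the automatic model (3480.028 vs. 3462.017).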
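The comparison between the two fits can be laid out side by side. The sketch below simply re-enters the information criteria printed in the two model summaries above (no model is refit) and computes the difference for each criterion; lower values are better.

```r
# Information criteria copied from the two ets() summaries in this report
ic <- rbind(
  "ETS(A,N,N)" = c(AIC = 3462.017, AICc = 3462.061, BIC = 3474.969),
  "ETS(A,A,A)" = c(AIC = 3480.028, AICc = 3481.169, BIC = 3553.419)
)

# Positive differences mean the AAA model scores worse on that criterion
delta <- ic["ETS(A,A,A)", ] - ic["ETS(A,N,N)", ]
print(round(delta, 3))

# Model with the lowest value for each criterion
best <- rownames(ic)[apply(ic, 2, which.min)]
names(best) <- colnames(ic)
print(best)
```

The gap is largest for BIC, which penalizes the extra trend and seasonal parameters most heavily.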

Tyler Brierley

3/30/2022