Load packages and data

library(fpp3)

tsla1 <- readxl::read_excel("//Users//colinadams//Documents//GCSU//Fall 2022//Forecasting//Forecasts//Forecast 5//TSLA.xlsx")

Questions

Creating a tsibble and plotting

tsla <- tsla1 %>%
  mutate(Day = row_number()) %>%
  as_tsibble(index = Day) %>%
  select(Day, Close)

tsla %>%
  ggplot(aes(x = Day)) +
  geom_line(aes(y = Close)) +
  labs(title = "Tesla Stock Price Since IPO", y = "Close Price", x = "Days Since IPO")

By plotting this data, I see that Tesla's closing price stays under $50 for a long time, until 2021. In 2021 the stock price becomes much more volatile and increases drastically. I observe no seasonality, cycle, or consistent trend in the data. The high volatility and the lack of any perceivable pattern will make this series difficult to forecast with any confidence.
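
Because the level of the series changes so drastically, a log-scale version of the same plot can make the earlier years easier to inspect. This is just a quick sketch to support the observation above, not part of the assigned questions.

tsla %>%
  ggplot(aes(x = Day, y = Close)) +
  geom_line() +
  scale_y_log10() +
  labs(title = "Tesla Stock Price Since IPO (log scale)",
       y = "Close Price (log scale)", x = "Days Since IPO")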

Creating test and training data

tsla_2022 <- tsla %>%
  filter(Day > 2800)
tsla_train <- tsla %>%
  filter(Day < 2181)
tsla_test <- tsla %>%
  filter(Day > 2180)

I create the training and test data sets that I will use for cross-validation, along with a subset of the most recent observations (tsla_2022) that I will use for plotting the forecasts.
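
As a quick sanity check on the split (a small sketch, not part of the original questions), the number of observations in each subset can be confirmed:

# Count the rows in the full series and in each subset
nrow(tsla)
nrow(tsla_train)
nrow(tsla_test)
nrow(tsla_2022)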

Determining my method with AICc

tsla_compare <- tsla %>%
  model('Auto' = ETS(Close),
        'Simple' = ETS(Close ~ error("A") + trend("N") + season("N")),
        'Holt' = ETS(Close ~ error("A") + trend("A") + season("N")),
        'Damped Holt-Winters' = ETS(Close ~ error("A") + trend("Ad") + season("N")))

tsla_compare %>%
  glance()
## # A tibble: 4 × 9
##   .model                sigma2 log_lik    AIC   AICc    BIC   MSE  AMSE    MAE
##   <chr>                  <dbl>   <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl>  <dbl>
## 1 Auto                 0.00128 -11019. 22049. 22049. 22079.  19.1  37.0 0.0246
## 2 Simple              19.1     -17150. 34307. 34307. 34325.  19.1  36.8 1.63  
## 3 Holt                19.1     -17150. 34310. 34310. 34340.  19.1  36.8 1.64  
## 4 Damped Holt-Winters 19.1     -17149. 34311. 34311. 34347.  19.1  36.8 1.63
tsla_compare %>%
  forecast(h = 1) %>%
  autoplot(tsla_2022)

Next, I check the AICc of a one-step forecast using four different methods. I choose not to use any methods with seasonality because there is none in Tesla’s stock price. I compare the automatic ETS, simple exponential smoothing, Holt, and damped Holt methods, since these let me compare different trend specifications without including seasonality. The automatic ETS selects an (M,A,N) model, but multiplicative errors do not seem necessary for this data. Among the manually specified methods, I find simple exponential smoothing to have the lowest AICc.
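
To confirm the (M,A,N) specification that the automatic search chose, the fitted model can be inspected directly. A short sketch, using the mable estimated above:

# Print the automatically selected ETS specification and its smoothing parameters
tsla_compare %>%
  select(Auto) %>%
  report()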

Testing the accuracy (RMSE) of methods with cross validation

tsla_cv <- tsla_train %>%
  slice(1:(n() - 3)) %>%
  stretch_tsibble(.init = 500, .step = 1)

tsla_cv_compare <- tsla_cv %>%
  model('Auto' = ETS(Close),
        'Simple' = ETS(Close ~ error("A") + trend("N") + season("N")),
        'Holt-Winters' = ETS(Close ~ error("A") + trend("A") + season("N")),
        'Damped Holt-Winters' = ETS(Close ~ error("A") + trend("Ad") + season("N")))

tsla_cv_compare %>%
  forecast(h = 1) %>%
  accuracy(tsla)
## # A tibble: 4 × 10
##   .model            .type       ME  RMSE   MAE     MPE  MAPE  MASE RMSSE    ACF1
##   <chr>             <chr>    <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 Auto              Test   0.00628 0.458 0.301  0.0587  2.13  1.24  1.14 -0.0183
## 2 Damped Holt-Wint… Test   0.00708 0.459 0.302  0.0452  2.15  1.25  1.14 -0.0186
## 3 Holt-Winters      Test  -0.00193 0.459 0.301 -0.0213  2.14  1.24  1.14 -0.0137
## 4 Simple            Test   0.0106  0.458 0.301  0.0838  2.13  1.24  1.14 -0.0171

Next, I check the accuracy of each method in order to double-check my findings above. I compare the same four methods, but here I use cross-validation and the accuracy() function to obtain the RMSE of each. I find the automatic and simple exponential smoothing methods to have the lowest RMSE. I decide to use simple exponential smoothing to forecast the price of Tesla stock tomorrow; it is closely related to a naive forecast (and equivalent to one when the smoothing parameter equals one), which I believe makes the most theoretical sense in addition to having the lowest AICc and RMSE values. According to the efficient market hypothesis, the best predictor of tomorrow’s stock price is today’s price. Because the day-to-day variation is essentially random, there is no way to forecast tomorrow’s price better than by looking at today’s price. I believe the graph of Tesla’s stock price backs this up, as there is no strong trend, cycle, seasonality, or other pattern that I can base my forecast on.
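
The same comparison can be summarized more compactly by ranking the cross-validated methods on RMSE. A sketch of the comparison described above:

# Rank the cross-validated one-step forecasts by RMSE
tsla_cv_compare %>%
  forecast(h = 1) %>%
  accuracy(tsla) %>%
  arrange(RMSE) %>%
  select(.model, RMSE, MAE, MAPE)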

Plotting and calculating the simple exponential smoothing forecast, ETS(A,N,N)

tsla_naive <- tsla %>%
  model('Simple' = ETS(Close ~ error("A") + trend("N") + season("N")))

tsla_naive %>%
  forecast(h = 1) %>%
  autoplot(tsla_2022)

tsla_naive %>%
  forecast(h = 1) %>%
  hilo()
## # A tsibble: 1 x 6 [1]
## # Key:       .model [1]
##   .model   Day      Close .mean                  `80%`                  `95%`
##   <chr>  <dbl>     <dist> <dbl>                 <hilo>                 <hilo>
## 1 Simple  3121 N(187, 19)  187. [181.6193, 192.8161]80 [178.6557, 195.7797]95

Using the simple exponential smoothing technique, I forecast Tesla’s stock price to be $186.92 on November 17th. My 80% prediction interval is ($181.62, $192.82).
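
If the point forecast is needed at full precision rather than the rounded value printed in the fable, it can be pulled out as a plain number. A small convenience sketch:

# Extract the one-step point forecast as a regular tibble
tsla_naive %>%
  forecast(h = 1) %>%
  as_tibble() %>%
  select(Day, .mean)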

Testing residuals and checking for white noise

tsla_naive %>%
  gg_tsresiduals()

augment(tsla_naive) %>%
  features(.resid, ljung_box, lag = 10, dof = 0)
## # A tibble: 1 × 3
##   .model lb_stat lb_pvalue
##   <chr>    <dbl>     <dbl>
## 1 Simple    89.0  8.55e-15

In order to check whether my forecast is reasonable, I finally look at the residuals. I plot them and run a Ljung-Box test to check for white noise. The residuals appear roughly normally distributed around a mean of zero, but there is some possible correlation at several lags, so the plot does not look entirely like white noise. The Ljung-Box test agrees: the very small p-value leads me to reject the null hypothesis, meaning the residuals are distinguishable from white noise and still contain some autocorrelation. This suggests the model does not capture every pattern in the data, but given the volatility of the series and the lack of any clear structure to exploit, I still believe the simple exponential smoothing forecast is a reasonable choice for tomorrow’s price.
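
Since the Ljung-Box statistic above uses 10 lags, a direct plot of the residual autocorrelations over those lags shows where the remaining correlation lies. A short sketch using the same fitted model:

# Plot the autocorrelation of the innovation residuals
augment(tsla_naive) %>%
  ACF(.innov, lag_max = 10) %>%
  autoplot()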