library(fpp3)
tsla1 <- readxl::read_excel("//Users//colinadams//Documents//GCSU//Fall 2022//Forecasting//Forecasts//Forecast 5//TSLA.xlsx")
tsla <- tsla1 %>%
mutate(Day = row_number()) %>%
as_tsibble(index = Day) %>%
select(Day, Close)
tsla %>%
ggplot(aes(x = Day)) +
geom_line(aes(y = Close)) +
labs(title = "Tesla Stock Price Since IPO", y = "Close Price", x = "Days Since IPO")
By plotting this data, I see that Tesla's stock price stays under $50 for a long time, until 2021. In 2021 the price becomes much more volatile and increases drastically. I observe no seasonality, cycle, or consistent trend in the data. The high volatility and the lack of any perceivable pattern will make this series difficult to forecast with any confidence.
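# Most recent observations only, used as the plotting window for forecasts
tsla_2022 <- tsla %>%
  filter(Day > 2800)
# Training set: the first 2180 days
tsla_train <- tsla %>%
  filter(Day < 2181)
# Test set: every day after the training period
tsla_test <- tsla %>%
  filter(Day > 2180)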
I create my training and test data sets that I will use for cross-validation.
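# Fit four non-seasonal ETS models to the full series.
# Note: the model labeled 'Damped Holt-Winters' is a damped Holt model (no seasonal term).
tsla_compare <- tsla %>%
  model('Auto' = ETS(Close),
        'Simple' = ETS(Close ~ error("A") + trend("N") + season("N")),
        'Holt' = ETS(Close ~ error("A") + trend("A") + season("N")),
        'Damped Holt-Winters' = ETS(Close ~ error("A") + trend("Ad") + season("N")))
# Compare information criteria across the fitted models
tsla_compare %>%
  glance()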
## # A tibble: 4 × 9
## .model sigma2 log_lik AIC AICc BIC MSE AMSE MAE
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Auto 0.00128 -11019. 22049. 22049. 22079. 19.1 37.0 0.0246
## 2 Simple 19.1 -17150. 34307. 34307. 34325. 19.1 36.8 1.63
## 3 Holt 19.1 -17150. 34310. 34310. 34340. 19.1 36.8 1.64
## 4 Damped Holt-Winters 19.1 -17149. 34311. 34311. 34347. 19.1 36.8 1.63
tsla_compare %>%
  forecast(h = 1) %>%
  autoplot(tsla_2022)
Next, I compare the AICc of four fitted models and plot each model's one-step forecast. I chose not to use any methods with seasonality because there is none in Tesla's stock price. I compare the automatically selected ETS model, simple exponential smoothing, Holt's linear trend method, and the damped Holt method (labeled 'Damped Holt-Winters' in the code, although it has no seasonal component). I picked these because they let me compare different trend specifications without including seasonality. The automatic ETS selected (M,A,N). Using multiplicative error does not seem necessary for this data. Other than the automatic model, simple exponential smoothing has the lowest AICc (34307, versus 34310 for Holt and 34311 for damped Holt).
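For reference, the point forecast of the 'Simple' model follows the simple exponential smoothing recursion, where each new level is a weighted average of the latest observation and the previous level. The base R sketch below is only an illustration of that recursion: the function name `ses_forecast`, the toy `prices` vector, the `alpha` values, and the choice to start the level at the first observation are all assumptions, not values estimated by `ETS()`.

```r
# Minimal sketch of the simple exponential smoothing recursion (illustrative only).
# `alpha` and the initial level are hypothetical; ETS() estimates both from the data.
ses_forecast <- function(y, alpha, level0 = y[1]) {
  level <- level0
  for (obs in y) {
    # New level = weighted average of the latest observation and the old level
    level <- alpha * obs + (1 - alpha) * level
  }
  level  # the one-step-ahead point forecast is the final level
}

prices <- c(10, 12, 11, 13)  # toy data, not Tesla prices

ses_forecast(prices, alpha = 1)    # alpha = 1 reproduces the last observation
ses_forecast(prices, alpha = 0.5)  # smaller alpha puts more weight on older prices
```

With `alpha = 1` the forecast is simply the most recent price, which is why simple exponential smoothing with a smoothing parameter near 1 behaves like a naive forecast.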
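# Build expanding training windows: start with 500 days and grow one day at a time
tsla_cv <- tsla_train %>%
  slice(1:(n() - 3)) %>%
  stretch_tsibble(.init = 500, .step = 1)
# Refit the same four models on every window.
# Note: 'Holt-Winters' here is Holt's linear trend model (no seasonal term).
tsla_cv_compare <- tsla_cv %>%
  model('Auto' = ETS(Close),
        'Simple' = ETS(Close ~ error("A") + trend("N") + season("N")),
        'Holt-Winters' = ETS(Close ~ error("A") + trend("A") + season("N")),
        'Damped Holt-Winters' = ETS(Close ~ error("A") + trend("Ad") + season("N")))
# Score each window's one-step forecast against the actual series
tsla_cv_compare %>%
  forecast(h = 1) %>%
  accuracy(tsla)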
## # A tibble: 4 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Auto Test 0.00628 0.458 0.301 0.0587 2.13 1.24 1.14 -0.0183
## 2 Damped Holt-Wint… Test 0.00708 0.459 0.302 0.0452 2.15 1.25 1.14 -0.0186
## 3 Holt-Winters Test -0.00193 0.459 0.301 -0.0213 2.14 1.24 1.14 -0.0137
## 4 Simple Test 0.0106 0.458 0.301 0.0838 2.13 1.24 1.14 -0.0171
Next, I check the accuracy of each method to double-check my findings above. I compare the same four methods (the Holt model is labeled 'Holt-Winters' here), but this time I use time series cross-validation and the accuracy() function to obtain each model's RMSE. The automatic and simple exponential smoothing methods have the lowest RMSE (0.458), although all four models are within 0.001 of each other. I will use simple exponential smoothing, which behaves like a naive forecast when its smoothing parameter is close to 1, to forecast the price of Tesla stock tomorrow. I believe this method makes the most theoretical sense, and it has the lowest RMSE and the lowest AICc apart from the automatic model. According to the efficient market hypothesis, the best predictor of tomorrow's stock price is today's price. Because the day-to-day variation is random, there is no way to forecast tomorrow's price better than to look at the price today. I believe the graph of Tesla's stock price backs this up, as there is no strong trend, cycle, seasonality, or other pattern on which I can base tomorrow's forecast.
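The cross-validation step can be pictured as a rolling forecast origin: fit on the first `init` observations, forecast the next one, extend the window by one observation, and repeat. The base R sketch below illustrates that idea with a naive forecast. It is not how `stretch_tsibble()` or `accuracy()` are implemented, and the names `rolling_origin_rmse`, `naive_fn`, and `toy` are hypothetical.

```r
# Minimal sketch of rolling-origin (expanding window) evaluation for one-step forecasts.
# Assumption: `forecast_fn` takes a numeric training vector and returns one number.
rolling_origin_rmse <- function(y, init, forecast_fn) {
  origins <- init:(length(y) - 1)  # last index of each training window
  errors <- vapply(origins, function(end) {
    # Actual next value minus the forecast made from observations 1..end
    y[end + 1] - forecast_fn(y[1:end])
  }, numeric(1))
  sqrt(mean(errors^2))
}

# Naive forecast: tomorrow's value equals today's value
naive_fn <- function(train) train[length(train)]

toy <- cumsum(c(1, -1, 2, 0, 1, -2, 1, 1))  # toy series, not Tesla prices
rolling_origin_rmse(toy, init = 3, forecast_fn = naive_fn)
```

Averaging the errors over many forecast origins gives a more stable accuracy estimate than a single train/test split, which is the motivation for comparing RMSE this way.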
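# Fit the chosen model (simple exponential smoothing) to the full series
tsla_naive <- tsla %>%
  model('Simple' = ETS(Close ~ error("A") + trend("N") + season("N")))
# Plot the one-step forecast against recent prices
tsla_naive %>%
  forecast(h = 1) %>%
  autoplot(tsla_2022)
# Extract the 80% and 95% prediction intervals
tsla_naive %>%
  forecast(h = 1) %>%
  hilo()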
## # A tsibble: 1 x 6 [1]
## # Key: .model [1]
## .model Day Close .mean `80%` `95%`
## <chr> <dbl> <dist> <dbl> <hilo> <hilo>
## 1 Simple 3121 N(187, 19) 187. [181.6193, 192.8161]80 [178.6557, 195.7797]95
Using simple exponential smoothing, I forecast Tesla's stock price to be about $187.22 on November 17th, which is the midpoint of the prediction intervals above. My 80% prediction interval is [181.62, 192.82].
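As a sanity check, the 80% interval can be related to the normal forecast distribution reported by `hilo()`. The base R sketch below takes the interval endpoints from the output above and the rounded `sigma2` of 19.1 from the earlier `glance()` table, and assumes the forecast distribution is normal, as the `N(187, 19)` entry indicates. The variable names are illustrative.

```r
# Arithmetic check of the 80% prediction interval (values copied from the output above)
lower_80 <- 181.6193
upper_80 <- 192.8161
sigma2   <- 19.1  # rounded residual variance reported by glance()

# A normal interval is symmetric, so its midpoint is the point forecast
center     <- (lower_80 + upper_80) / 2
half_width <- (upper_80 - lower_80) / 2

# An 80% interval spans the 10th to 90th percentiles: mean +/- qnorm(0.90) * sd
implied_sd <- half_width / qnorm(0.90)
rebuilt    <- center + c(-1, 1) * qnorm(0.90) * sqrt(sigma2)

center        # point forecast implied by the interval
implied_sd^2  # should be close to the reported sigma2
rebuilt       # should be close to the reported interval (sigma2 is rounded)
```

Small differences between `rebuilt` and the reported interval are expected because `sigma2` is only shown to three significant figures.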
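# Residual diagnostics: time plot, ACF, and histogram
tsla_naive %>%
  gg_tsresiduals()
# Ljung-Box test for autocorrelation in the residuals
augment(tsla_naive) %>%
  features(.resid, ljung_box, lag = 10, dof = 0)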
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 Simple 89.0 8.55e-15
To check whether my forecast is reliable, I finally look at the residuals. I plotted the residuals and ran a Ljung-Box test to check for white noise. The residual plot shows some possible autocorrelation several lags back, although the residuals appear roughly normally distributed around a mean of 0. The plot does not look like white noise, and the Ljung-Box test agrees: the p-value is very small (8.55e-15), so I reject the null hypothesis that the residuals are uncorrelated. The residuals are therefore distinguishable from white noise, which means the model has not captured all of the information in the data. The point forecast is still reasonable given the cross-validation results, but the prediction intervals should be treated with caution.
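The reported p-value can be reproduced from the test statistic. Under the null hypothesis of no autocorrelation, the Ljung-Box statistic is approximately chi-squared distributed with `lag - dof` degrees of freedom. The base R sketch below uses the rounded statistic from the output above, so the result will only approximately match the reported value; the variable names are illustrative.

```r
# Reproduce the Ljung-Box p-value from the reported statistic (rounded in the output)
lb_stat <- 89.0
lags    <- 10
dof     <- 0

# Upper-tail probability of a chi-squared distribution with (lags - dof) degrees of freedom
p_value <- pchisq(lb_stat, df = lags - dof, lower.tail = FALSE)
p_value

# A p-value below 0.05 is evidence against the null of uncorrelated residuals
p_value < 0.05
```

A p-value this small indicates that the residual autocorrelations are jointly larger than would be expected from white noise.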