2 Analysis

This section has four parts:

a preliminary analysis where the closing stock price will be plotted and discussed
a split of the data into training and testing data and the construction of four different forecasting models
a graphical representation of the forecasted outputs
a determination of the best model using various accuracy measures

2.1 Preliminary Analysis

In this section, we first import NVIDIA's closing stock price from Yahoo Finance and then plot it to see the stock's historical closing price over roughly the past two years.

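library(quantmod) #getSymbols
library(forecast) #meanf, naive, snaive, rwf
library(pander) #pander tables

getSymbols("NVDA", from="2021-06-21", to="2023-04-14") #retrieve data from Yahoo Finance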

## [1] "NVDA"

NVDA_Close_Prices = NVDA[,4] #just look at closing price

plot(NVDA_Close_Prices, main = "NVIDIA's Closing Stock Price") #generate plot of stock data

The graph begins on June 21, 2021, the day the analyst bought the stock, and ends on April 14, 2023, the last trading day before this report was written. Over this period the closing price reached a maximum of 333.76 on 2021-11-29 and a minimum on 2022-10-14, and it closed at 264.63 on the final trading day.
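Rather than reading the extremes off the plot, they can be located programmatically. The following is a minimal base R sketch on hypothetical prices (the `dates` and `prices` vectors are made up for illustration and are not NVDA data). The same idea should carry over to the xts object through `as.numeric()` and `index()`, though that is not tested here.

```r
# Hypothetical closing prices; illustrative only, not NVDA data.
dates  <- as.Date("2023-01-02") + 0:4
prices <- c(101.5, 99.2, 104.8, 103.1, 100.4)

i_max <- which.max(prices)  # position of the highest close
i_min <- which.min(prices)  # position of the lowest close

# Collect the maximum, minimum, and final close with their dates.
summary_tbl <- data.frame(
  stat  = c("max", "min", "last"),
  date  = c(dates[i_max], dates[i_min], dates[length(dates)]),
  price = c(prices[i_max], prices[i_min], prices[length(prices)])
)
print(summary_tbl)
```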

2.2 Training, Testing, and Model Building

In this section we first split the data into training and testing sets, where the testing set is the last ten observations of the closing stock price. Next, we develop four forecasting models:

a moving average model in which all future values are set equal to the average of the historical data
a naive forecasting model in which all forecasts are set to the value of the last observation
a seasonal naive forecasting model, a modification of the naive model in which each forecast equals the last observed value from the same point in the previous season
a drift model, another variation of the naive method that allows the forecasts to increase or decrease over time by the average change seen in the historical data

The forecast horizon for each model is ten trading days, matching the length of the testing data.
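To make the four definitions concrete, here is a minimal base R sketch that computes each forecast by hand. The series `y`, the horizon `h`, and the seasonal period `m` are hypothetical values chosen for illustration. The report itself relies on the forecast package functions, so this is only meant to show the arithmetic behind them.

```r
# Hypothetical series with an assumed seasonal period of 4; not NVDA data.
y <- c(10, 12, 14, 11, 11, 13, 15, 12)
n <- length(y)
h <- 3   # forecast horizon
m <- 4   # assumed seasonal period

# Mean method: every forecast is the average of the history.
f_mean <- rep(mean(y), h)

# Naive method: every forecast is the last observed value.
f_naive <- rep(y[n], h)

# Seasonal naive: each forecast repeats the value one season earlier.
f_snaive <- y[n - m + ((seq_len(h) - 1) %% m) + 1]

# Drift: last value plus the average historical change per step.
f_drift <- y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)

print(rbind(f_mean, f_naive, f_snaive, f_drift))
```

Only the seasonal naive and drift forecasts change across the horizon. The mean and naive forecasts are flat by construction.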

training = NVDA_Close_Prices[1:447] #split training
testing = NVDA_Close_Prices[448:457] #testing

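#create time series object; with frequency = 365, start = 1 and end = 447 together
#imply (447 - 1) * 365 + 1 observations, so ts() recycles the 447 training values
NVDA.ts = ts(training, frequency = 365, start = 1, end = 447)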

pred.mv = meanf(NVDA.ts, h=10)$mean #predicted moving average
pred.naive = naive(NVDA.ts, h=10)$mean #naive forecast
pred.snaive = snaive(NVDA.ts, h=10)$mean #seasonal naive
pred.rwf = rwf(NVDA.ts, h=10, drift = TRUE)$mean #drift

pred.table = cbind( pred.mv = pred.mv,
                    pred.naive = pred.naive,
                    pred.snaive = pred.snaive,
                    pred.rwf = pred.rwf)
pander(pred.table, caption = "Forecasting Table")

Forecasting Table
pred.mv	pred.naive	pred.snaive	pred.rwf
205.6	218.6	242.7	218.6
205.6	218.6	265	218.6
205.6	218.6	265.1	218.6
205.6	218.6	245.1	218.6
205.6	218.6	236.4	218.6
205.6	218.6	233.9	218.6
205.6	218.6	223.9	218.6
205.6	218.6	237.5	218.6
205.6	218.6	241.6	218.6
205.6	218.6	243.9	218.6

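The forecasting table above shows the outputs of the four forecasting models. The seasonal naive forecasts are the only ones that visibly vary over the horizon. The moving average and naive forecasts are constant by construction, and the drift forecasts match the naive forecasts to one decimal place, which indicates that the estimated drift is negligible at this precision.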
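One detail of the `ts()` call above may be worth checking. When `start`, `end`, and `frequency` are all supplied, base R's `ts()` derives the series length from them and recycles or truncates the data to fit. With `frequency = 365`, `start = 1`, and `end = 447`, the implied length is (447 - 1) * 365 + 1 rather than 447, which would feed into all four forecasts. A minimal sketch of this behavior on hypothetical data:

```r
x <- 1:5  # hypothetical data, five observations

# start and end together fix the number of observations:
# (end - start) * frequency + 1 = (5 - 1) * 2 + 1 = 9, so x is recycled.
a <- ts(x, frequency = 2, start = 1, end = 5)

# With only start given, the series keeps its original length.
b <- ts(x, frequency = 2, start = 1)

print(length(a))
print(length(b))
```

Omitting `end` and letting `ts()` infer it from the data length is one way to keep the series at its original size.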

2.3 Graphical Representation of the Forecasting Models

The following plot shows the four forecasts together with the held-out testing data, which is represented by the black line. Based on the graph, the seasonal naive forecast is the closest approximation of the testing data. This conclusion is checked in the next section, where three accuracy measures are computed.

plot(448:457, NVDA_Close_Prices[448:457], type="l", xlim=c(448,457), ylim=c(200, 350),
     xlab = "observation sequence",
     ylab = "Stock Price",
     main = "NVDA Stock Price Forecast")
points(448:457, NVDA_Close_Prices[448:457],pch=20)
##
points(448:457, pred.mv, pch=15, col = "red")
points(448:457, pred.naive, pch=16, col = "blue")
points(448:457, pred.rwf, pch=18, col = "navy")
points(448:457, pred.snaive, pch=17, col = "purple")
##
lines(448:457, pred.mv, lty=2, col = "red")
lines(448:457, pred.snaive, lty=2, col = "purple")
lines(448:457, pred.naive, lty=2, col = "blue")
lines(448:457, pred.rwf, lty=2, col = "navy")
## 
legend("topright", c("moving average", "naive", "drift", "seasonal naive"),
       col=c("red", "blue", "navy", "purple"), pch=c(15, 16, 18, 17), lty=rep(2,4), #pch order matches the points() calls above
       bty="n", cex = 0.8)

2.4 Statistical Tests for Accuracy

In this section we assess the accuracy of each model using three accuracy measures:

Mean Absolute Percentage Error (MAPE): the average of the absolute percentage errors, where each error is the difference between the actual and forecasted value expressed as a percentage of the actual value
Mean Absolute Deviation (MAD): the average absolute difference between the actual and forecasted values; the code below reports the sum of the absolute errors over the ten test observations rather than their mean, which ranks the models identically
Mean Squared Error (MSE): the average squared difference between the actual and forecasted values
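The three measures can be sketched as a small helper in base R. The function name `forecast_errors` and the input vectors are hypothetical and used only for illustration. MAD is taken here as the mean of the absolute errors, which is an assumption about the intended definition.

```r
# Hypothetical helper; the name and signature are illustrative only.
forecast_errors <- function(actual, predicted) {
  e <- actual - predicted                  # forecast errors
  c(MAPE = mean(abs(100 * e / actual)),    # mean absolute percentage error
    MAD  = mean(abs(e)),                   # mean absolute deviation
    MSE  = mean(e^2))                      # mean squared error
}

# Hypothetical actual and forecasted values.
actual    <- c(100, 200, 400)
predicted <- c(110, 180, 400)

m <- forecast_errors(actual, predicted)
print(m)
```

Because MSE squares the errors, it penalizes a few large misses more heavily than MAD or MAPE do, so the three measures need not always agree on a ranking.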

true.value = NVDA_Close_Prices[448:457]
PE.mv =  100*(true.value - pred.mv)/true.value
PE.naive =  100*(true.value - pred.naive)/true.value
PE.snaive =  100*(true.value - pred.snaive)/true.value
PE.rwf =  100*(true.value - pred.rwf)/true.value
##
MAPE.mv = mean(abs(PE.mv))
MAPE.naive = mean(abs(PE.naive))
MAPE.snaive = mean(abs(PE.snaive))
MAPE.rwf = mean(abs(PE.rwf))
##
MAPE = c(MAPE.mv, MAPE.naive, MAPE.snaive, MAPE.rwf)
## residual-based Error
e.mv = true.value - pred.mv
e.naive = true.value - pred.naive
e.snaive = true.value - pred.snaive
e.rwf = true.value - pred.rwf
## MAD (computed here as the sum of absolute errors over the ten test points, i.e. ten times the mean)
MAD.mv = sum(abs(e.mv))
MAD.naive = sum(abs(e.naive))
MAD.snaive = sum(abs(e.snaive))
MAD.rwf = sum(abs(e.rwf))
MAD = c(MAD.mv, MAD.naive, MAD.snaive, MAD.rwf)
## MSE
MSE.mv = mean((e.mv)^2)
MSE.naive = mean((e.naive)^2)
MSE.snaive = mean((e.snaive)^2)
MSE.rwf = mean((e.rwf)^2)
MSE = c(MSE.mv, MSE.naive, MSE.snaive, MSE.rwf)
##
accuracy.table = cbind(MAPE = MAPE, MAD = MAD, MSE = MSE)
row.names(accuracy.table) = c("Moving Average", "Naive", "Seasonal Naive", "Drift")
pander(accuracy.table, caption ="Overall performance of the four forecasting methods")

Overall performance of the four forecasting methods
	MAPE	MAD	MSE
Moving Average	24.43	665.7	4454
Naive	19.66	535.8	2894
Seasonal Naive	10.56	287.1	943.8
Drift	19.66	535.8	2894

Based on the table above, the accuracy measures confirm our conclusion that the seasonal naive model is the most accurate of the four. Looking in particular at the MAPE measure, the seasonal naive forecast is off by about 10.6 percent on average, whereas the naive and drift forecasts are off by about 19.7 percent and the moving average forecast by about 24.4 percent.

Predicting NVIDIA Stock Price using Time Series Methods

Angelo Saporito

2023-03-30

1 Introduction

2 Analysis

2.1 Preliminary Analysis

2.2 Training, Testing, and Model Building

2.3 Graphical Representation of the Forecasting Models

2.4 Statistical Tests for Accuracy

3 Conclusion