This project is intended to provide an insight into stock price data for the company Nvidia (NVDA), using data compiled by members on the internet. Using time series analysis with base R functionality, as well as Meta’s Prophet model, I aim to model historical data in a way that allows us to forecast future stock prices.
Nvidia is a company that I have great interest in, having been a long-time user of their GPUs and software since getting into PC technology in 2018. As such, I have chosen to analyse their stock prices over the years. The stock price should reflect the successes and failures that Nvidia has had throughout the years, alongside various other market developments. Our dataset is taken from the website Kaggle, where data is compiled and uploaded by users.
The Kaggle dataset contains daily data from 1999 to May 2024. A dataset this large could cause Prophet to crash and, more importantly, the long earlier history would obscure the effect of recent changes in the stock price. As such, we filter the dataset so that we analyse 2020 to 2024 only.
Further to this, the data is daily but we will take monthly averages for our time series.
Upon reviewing the dataset, I also noticed that it is not completely up-to-date and ends at 23/05/2024. To avoid a partial month skewing the May 2024 average, I trim the dataset so that it ends on 30/04/2024.
Furthermore, having compared with Yahoo Finance figures for NVDA, I noticed that the Close price in the dataset was 10 times the Yahoo Finance value, so I divide it by 10 in the code.
macOS automatically converted the dates in the Kaggle data to the UK DD/MM/YYYY format when I was working on my laptop, so to avoid errors in R across devices I include a clause that checks both date formats.
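library(dplyr)  # provides %>%, mutate, filter, group_by and summarise

nvda_raw <- read.csv("Nvidia Dataset.csv")
nvda_monthly <- nvda_raw %>%
  # Parse dates in either ISO (YYYY-MM-DD) or UK (DD/MM/YYYY) format
  mutate(Date = as.Date(Date, tryFormats = c("%Y-%m-%d", "%d/%m/%Y"))) %>%
  # Rescale Close to match the Yahoo Finance figures
  mutate(Close = Close / 10) %>%
  filter(Date >= as.Date("2020-01-01") & Date <= as.Date("2024-04-30")) %>%
  # Label each row with the first day of its month, then average by month
  mutate(Month = as.Date(format(Date, "%Y-%m-01"))) %>%
  group_by(Month) %>%
  summarise(Price = mean(Close, na.rm = TRUE))
# Monthly series starting January 2020
nvda_ts <- ts(nvda_monthly$Price, start = c(2020, 1), frequency = 12)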
summary(nvda_ts)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   6.089  13.272  18.005  25.304  30.045  89.443
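The two-format date check used in the pipeline above can be tried in isolation. The following is a minimal sketch with made-up date strings (they are illustrative, not rows from the dataset). Note that as.Date picks one of the tryFormats by testing only the first non-missing element, so this approach assumes each file uses a single format throughout.

```r
# Formats to try, in order: ISO first, then UK day/month/year.
date_formats <- c("%Y-%m-%d", "%d/%m/%Y")

# Illustrative inputs only (hypothetical values, not taken from the Kaggle file).
iso_dates <- as.Date(c("2024-04-29", "2024-04-30"), tryFormats = date_formats)
uk_dates  <- as.Date(c("29/04/2024", "30/04/2024"), tryFormats = date_formats)

# Both spellings should parse to the same calendar dates.
print(iso_dates == uk_dates)

# The month label used for grouping: the first day of each date's month.
month_label <- as.Date(format(uk_dates, "%Y-%m-01"))
print(month_label)
```

If a single file ever mixed both formats, parsing each format separately and combining the results would be safer than relying on tryFormats.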
Before we use Prophet to forecast, we will examine the time series and perform a classical additive decomposition to split the data into its underlying trend, seasonal and random components (\(X_t = m_t + S_t + Y_t\)).
plot(nvda_ts,
main = "Average Monthly Closing Price of NVDA (2020-2024)",
xlab = "Year",
ylab = "Price (USD)",
col = "darkgreen",
lwd = 3)
As we can see, the stock grew steadily until 2022, when it fell until the later months of the year. In 2024, the price surged, with the monthly average rising above the $80 mark by April.
We now decompose to gain a deeper insight.
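The decomposition code is not shown in this report, so the following is only a minimal sketch of how a classical additive decomposition can be run with base R's decompose(). It uses a synthetic stand-in series, demo_ts (a hypothetical name and made-up values); in the report itself the same call would be applied to nvda_ts.

```r
# Hypothetical stand-in for nvda_ts: 52 synthetic monthly values from January 2020.
set.seed(1)
month_index <- 0:51
demo_ts <- ts(10 + 0.8 * month_index + 3 * sin(2 * pi * month_index / 12) + rnorm(52),
              start = c(2020, 1), frequency = 12)

# Classical additive decomposition: X_t = m_t + S_t + Y_t.
demo_decomp <- decompose(demo_ts, type = "additive")
plot(demo_decomp)

# Trend + seasonal + random reproduces the series wherever the moving-average
# trend is defined (it is NA for the first and last six months).
recon <- demo_decomp$trend + demo_decomp$seasonal + demo_decomp$random
defined <- !is.na(recon)
print(all.equal(as.numeric(recon[defined]), as.numeric(demo_ts[defined])))
```

Because the trend is a centred moving average, six months at each end of the series have no trend or random value, which is worth bearing in mind when reading the most recent part of the decomposition.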
Decomposition Insights:
Trend: The trend is relatively stable until 2023, after which we see the rise mentioned previously.
Seasonal: From the seasonal component, we can see that there is a regular pattern, where the stock price rises around halfway through the year, and dips near the end of the year. This may be attributable to Nvidia’s announcement and release cycles.
Random: The random component shows heteroskedasticity, due to bigger jumps in share price in 2023 and 2024.
We now use Meta’s Prophet model to perform further analysis and forecasting. We must first format our data into the ds and y columns that Prophet requires.
Our data is based on monthly averages, so there is no daily or weekly pattern to estimate and we disable daily and weekly seasonality.
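The formatting and fitting code is not shown in this report, so the following is a minimal sketch of those two steps under stated assumptions. The data frame demo_monthly and the names demo_prophet_df and demo_model are hypothetical stand-ins with synthetic values; the report's own objects are nvda_monthly and nvda_prophet. The fit only runs if the prophet package is installed.

```r
# Hypothetical stand-in for nvda_monthly (columns Month and Price); values are synthetic.
demo_monthly <- data.frame(
  Month = seq(as.Date("2020-01-01"), by = "month", length.out = 52),
  Price = 10 + 0.8 * (0:51)
)

# Prophet expects a data frame with a date column `ds` and a numeric column `y`.
demo_prophet_df <- data.frame(ds = demo_monthly$Month, y = demo_monthly$Price)

# Fit only when prophet is available. Daily and weekly seasonality are switched
# off because there is one observation per month; yearly seasonality is left
# at its default setting.
if (requireNamespace("prophet", quietly = TRUE)) {
  library(prophet)
  demo_model <- prophet(demo_prophet_df,
                        weekly.seasonality = FALSE,
                        daily.seasonality = FALSE)
}
```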
With the model fitted, we can forecast future values to predict how Nvidia’s stock might behave. Since the data only goes to April 2024, we will use 48 months (4 years) as our forecasting period, which takes the forecast to April 2028.
nvda_forecast_df <- make_future_dataframe(nvda_prophet, periods = 48, freq = "month", include_history = TRUE)
nvda_predictions <- predict(nvda_prophet, nvda_forecast_df)
tail(nvda_predictions)
##              ds     trend additive_terms additive_terms_lower
## 95 2027-11-01 96.71203 -7.95347613 -7.95347613
## 96 2027-12-01 97.68239 -7.41805476 -7.41805476
## 97 2028-01-01 98.68509 0.07118799 0.07118799
## 98 2028-02-01 99.68779 8.70709705 8.70709705
## 99 2028-03-01 100.62580 9.69826723 9.69826723
## 100 2028-04-01 101.62851 7.50989717 7.50989717
## additive_terms_upper yearly yearly_lower yearly_upper
## 95 -7.95347613 -7.95347613 -7.95347613 -7.95347613
## 96 -7.41805476 -7.41805476 -7.41805476 -7.41805476
## 97 0.07118799 0.07118799 0.07118799 0.07118799
## 98 8.70709705 8.70709705 8.70709705 8.70709705
## 99 9.69826723 9.69826723 9.69826723 9.69826723
## 100 7.50989717 7.50989717 7.50989717 7.50989717
## multiplicative_terms multiplicative_terms_lower multiplicative_terms_upper
## 95 0 0 0
## 96 0 0 0
## 97 0 0 0
## 98 0 0 0
## 99 0 0 0
## 100 0 0 0
## yhat_lower yhat_upper trend_lower trend_upper yhat
## 95 76.63815 100.9202 96.66689 96.75767 88.75856
## 96 77.25431 103.6982 97.63608 97.72959 90.26434
## 97 85.70444 111.6679 98.63733 98.73385 98.75628
## 98 96.32345 120.9269 99.63761 99.73820 108.39489
## 99 98.58875 122.2789 100.57396 100.67768 110.32407
## 100 95.78414 121.7491 101.57443 101.68202 109.13840
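The row numbers and dates in the output above can be checked by hand. This is a small base R sketch of the arithmetic only (it does not call Prophet), assuming 52 monthly observations from January 2020 to April 2024 followed by 48 forecast months; the variable names are illustrative.

```r
# Historical month starts: January 2020 to April 2024.
history_months <- seq(as.Date("2020-01-01"), as.Date("2024-04-01"), by = "month")

# 48 further month starts, beginning the month after the last observation.
future_months <- seq(as.Date("2024-05-01"), by = "month", length.out = 48)

all_months <- c(history_months, future_months)
print(length(history_months))  # number of historical rows
print(length(all_months))      # historical plus forecast rows
print(all_months[95])          # date at row 95
print(tail(all_months, 1))     # date at the final row
```

This agrees with the printed table, where row 95 is 2027-11-01 and row 100 is 2028-04-01.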
Now we plot, including our forecast data.
plot(nvda_prophet, nvda_predictions,
xlab = 'Year',
ylab = 'Price (USD)',
main = 'Prophet Forecast for NVDA')
Observations
The black dots represent our actual historical monthly averages.
The blue line represents the model’s fitted and forecast values (yhat), which show the growth continuing.
The light blue shaded region represents the uncertainty interval (yhat_lower to yhat_upper, 80% by default in Prophet), which widens as the years go on, reflecting growing uncertainty about the stock price the further ahead we forecast.
Prophet allows us to extract its own calculated components to compare against our classical decomposition.
Insights:
Trend Component: The Prophet analysis confirms what we saw earlier, with the growth rate rising past 2023.
Yearly Seasonality: The seasonality chart appears to conflict with what we found previously, where the price rises in March; Prophet instead shows a harsh drop in March. Upon researching how Prophet calculates seasonality, I found that this drop is most likely due to COVID-19, which started in March 2020.
To conclude, we have seen Nvidia’s success reflected in both our classical time series with decomposition and with Meta’s Prophet model analysis.
The Prophet model predicts that Nvidia’s stock price will keep rising, but also shows that there is some uncertainty surrounding the price.