Introductory statement

This project provides an insight into stock price data for the company Nvidia (NVDA), using a user-compiled dataset from Kaggle. Using time series analysis with base R functionality, as well as Meta’s Prophet model, I aim to model the historical data in a way that allows us to forecast future stock prices.

Required dependencies.

We need to load the dependencies required to perform the data analysis.

library(prophet)
library(dplyr)

Dataset introduction.

Nvidia is a company that I have great interest in, having been a long-time user of their GPUs and software since getting into PC technology in 2018. As such, I have chosen to analyse their stock price over the years. This data should allow us to see Nvidia's successes and failures, alongside various other market developments, reflected in the stock price. Our dataset is taken from the website Kaggle, where data is compiled and uploaded by users.

Data correction, part I.

The Kaggle dataset contains daily data from 1999 to May 2024. A dataset this large could cause performance problems when running Prophet and, more importantly, more than two decades of older prices would obscure the effect of recent changes in the stock price. As such, we filter the dataset so that we analyse only 2020 to 2024.

Data correction, part II.

Further to this, the data is daily but we will take monthly averages for our time series.

Upon reviewing the dataset, I also noticed that it is not completely up to date and ends on 23/05/2024. To avoid a partial month skewing the May 2024 average, I trim the dataset so that it ends on 30/04/2024.
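The cutoff can also be derived from the data rather than typed in by hand. The sketch below is a minimal illustration on made-up dates (the `dates` vector is a stand-in, not the Kaggle file): it takes the last observed date and steps back to the final day of the preceding month. It assumes the last month is always incomplete, so it would also drop a final month that happened to be complete.

```r
# Stand-in for the Date column: daily dates ending part-way through May 2024
dates <- seq(as.Date("2024-03-01"), as.Date("2024-05-23"), by = "day")

# First day of the month that contains the last observation
last_month_start <- as.Date(format(max(dates), "%Y-%m-01"))

# One day earlier is the last day of the previous (complete) month
cutoff <- last_month_start - 1

# Keep only the dates up to and including the cutoff
dates_trimmed <- dates[dates <= cutoff]

print(cutoff)  # [1] "2024-04-30"
```

With real trading data the same two lines would be applied to the parsed Date column before filtering.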

Furthermore, having compared with Yahoo Finance figures for NVDA, I noticed that the Close price in the dataset was larger than the Yahoo Finance figure by a factor of 10, so I divide it by 10 in the code.

macOS automatically converted the dates in the Kaggle data to the UK DD/MM/YYYY format when I was working on my laptop, so to avoid errors in R across devices I include a clause that tries both date formats.

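# Load the raw daily data downloaded from Kaggle
nvda_raw <- read.csv("Nvidia Dataset.csv")

nvda_monthly <- nvda_raw %>%
  # Accept either ISO (YYYY-MM-DD) or UK (DD/MM/YYYY) date strings
  mutate(Date = as.Date(Date, tryFormats = c("%Y-%m-%d", "%d/%m/%Y"))) %>%
  # Rescale Close to match the Yahoo Finance figures
  mutate(Close = Close / 10) %>%
  # Keep January 2020 to April 2024 (drops the partial May 2024)
  filter(Date >= as.Date("2020-01-01") & Date <= as.Date("2024-04-30")) %>%
  # Label each row with the first day of its month, then average by month
  mutate(Month = as.Date(format(Date, "%Y-%m-01"))) %>%
  group_by(Month) %>%
  summarise(Price = mean(Close, na.rm = TRUE))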


nvda_ts <- ts(nvda_monthly$Price, start = c(2020, 1), frequency = 12)

summary(nvda_ts)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   6.089  13.272  18.005  25.304  30.045  89.443
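One caveat on the date handling: as.Date() chooses a single format from tryFormats by testing the first non-missing value, and then applies that format to the whole column. That is fine when a file uses one format throughout, which is the situation described here. If a file ever mixed both formats, a per-element fallback would be needed. The helper below is a hypothetical sketch of that idea (the name `parse_mixed_date` is my own, not part of base R or dplyr).

```r
# Hypothetical helper: parse each string as ISO first, then fall back to UK format
parse_mixed_date <- function(x) {
  iso <- as.Date(x, format = "%Y-%m-%d")  # NA where the string is not YYYY-MM-DD
  uk  <- as.Date(x, format = "%d/%m/%Y")  # NA where the string is not DD/MM/YYYY
  out <- iso
  out[is.na(out)] <- uk[is.na(out)]       # fill the gaps from the UK parse
  out
}

# Toy input mixing both formats (not taken from the Kaggle file)
mixed <- c("2024-04-30", "30/04/2024", "2020-01-02")
parsed <- parse_mixed_date(mixed)
print(parsed)
```

Strings that match neither format stay NA, which makes them easy to spot with is.na() before aggregating.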

Plot and Decomposition

Before we use Prophet to forecast, we will plot the time series and perform a classical additive decomposition to split the data into its underlying trend, seasonal and random components (\(X_t = m_t + S_t + Y_t\)).

plot(nvda_ts, 
     main = "Average Monthly Closing Price of NVDA (2020-2024)", 
     xlab = "Year", 
     ylab = "Price (USD)", 
     col = "darkgreen", 
     lwd = 3)

As we can see, the stock grew steadily until 2022, when it dropped until the later months of the year. In 2024, the price surged, with the monthly average rising above the $80 mark by April, where our data ends.

We now decompose to gain a deeper insight.

nvda_decomp <- decompose(nvda_ts, type = "additive")

plot(nvda_decomp)
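To make the additive model \(X_t = m_t + S_t + Y_t\) concrete, the sketch below rebuilds the three components by hand on a synthetic monthly series (made-up numbers, not the NVDA data) and compares them with decompose(). The trend is a centred 12-month moving average, the seasonal term is the mean detrended value for each calendar month, and the remainder is whatever is left. Note the explicit stats::filter(), since dplyr masks filter() once it is loaded.

```r
set.seed(1)

# Synthetic monthly series: linear trend + yearly cycle + noise
t <- 1:48
x <- ts(10 + 0.5 * t + 3 * sin(2 * pi * t / 12) + rnorm(48),
        start = c(2020, 1), frequency = 12)

# Trend m_t: centred 12-month moving average (half weight on the two end months)
w <- c(0.5, rep(1, 11), 0.5) / 12
m <- stats::filter(x, w, sides = 2)

# Seasonal S_t: average detrended value per calendar month, centred on zero
detrended <- as.numeric(x - m)
month_idx <- as.integer(cycle(x))
fig <- tapply(detrended, month_idx, mean, na.rm = TRUE)
fig <- fig - mean(fig)
s <- as.numeric(fig[month_idx])

# Random Y_t: what remains after removing trend and seasonal parts
y <- as.numeric(x) - as.numeric(m) - s

# Compare the hand-built components with decompose()
d <- decompose(x, type = "additive")
print(all.equal(as.numeric(m), as.numeric(d$trend)))
print(all.equal(s, as.numeric(d$seasonal)))
print(all.equal(y, as.numeric(d$random)))
```

The first and last six months of the trend are NA because the moving average needs six months on either side, which is why the trend and random panels in the decomposition plot are shorter than the observed series.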

Decomposition Insights:

  • Trend: The trend is relatively stable until 2023, after which we see the rise mentioned previously.

  • Seasonal: From the seasonal component, we can see that there is a regular pattern, where the stock price rises around halfway through the year, and dips near the end of the year. This may be attributable to Nvidia’s announcement and release cycles.

  • Random: The random component shows heteroskedasticity: its spread grows in 2023 and 2024, when the jumps in share price are larger.
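One common response to a remainder whose spread grows with the price level is to decompose the logarithm of the series, which turns multiplicative effects into additive ones. This is not done in the analysis here; the sketch below only illustrates the idea on a synthetic series whose noise scales with its level (made-up numbers, not the NVDA data). The helper `spread_ratio` is my own name for comparing the remainder's standard deviation in the later half with the earlier half.

```r
set.seed(42)

# Synthetic series: exponential growth, with seasonality and noise that scale with level
t <- 1:60
level <- 10 * exp(0.04 * t)
x <- ts(level * (1 + 0.05 * sin(2 * pi * t / 12)) * exp(rnorm(60, sd = 0.1)),
        start = c(2020, 1), frequency = 12)

# Hypothetical helper: sd of the later half of the remainder over sd of the earlier half
spread_ratio <- function(remainder) {
  r <- as.numeric(na.omit(as.numeric(remainder)))
  half <- length(r) %/% 2
  sd(tail(r, half)) / sd(head(r, half))
}

# Additive decomposition on the raw scale and on the log scale
raw_ratio  <- spread_ratio(decompose(x, type = "additive")$random)
log_decomp <- decompose(log(x), type = "additive")
log_ratio  <- spread_ratio(log_decomp$random)

# A ratio well above 1 means the remainder spreads out over time
print(c(raw = raw_ratio, log = log_ratio))

# Components on the log scale add up to log(x), so exponentiating recovers x
rebuilt <- exp(log_decomp$trend + log_decomp$seasonal + log_decomp$random)
```

decompose() also accepts type = "multiplicative", which is a closely related way of letting the seasonal and random parts scale with the trend.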

Prophet Analysis

We now use Meta’s Prophet model to perform further analysis and forecasting. We must first format our data into the ds and y columns that Prophet requires.

prophet_df <- nvda_monthly %>%
  rename(ds = Month, y = Price)

Our data consists of monthly averages, so we disable daily and weekly seasonality.

nvda_prophet <- prophet(prophet_df, yearly.seasonality = TRUE, daily.seasonality = FALSE, weekly.seasonality = FALSE)

Forecasting

With the model fitted, we can forecast future values to predict how Nvidia’s stock might behave over the next four years. Since the data only goes to April 2024, we will use 48 months (4 years) as our forecasting period.

nvda_forecast_df <- make_future_dataframe(nvda_prophet, periods = 48, freq = "month", include_history = TRUE)

nvda_predictions <- predict(nvda_prophet, nvda_forecast_df)


tail(nvda_predictions)
##             ds     trend additive_terms additive_terms_lower
## 95  2027-11-01  96.71203    -7.95347613          -7.95347613
## 96  2027-12-01  97.68239    -7.41805476          -7.41805476
## 97  2028-01-01  98.68509     0.07118799           0.07118799
## 98  2028-02-01  99.68779     8.70709705           8.70709705
## 99  2028-03-01 100.62580     9.69826723           9.69826723
## 100 2028-04-01 101.62851     7.50989717           7.50989717
##     additive_terms_upper      yearly yearly_lower yearly_upper
## 95           -7.95347613 -7.95347613  -7.95347613  -7.95347613
## 96           -7.41805476 -7.41805476  -7.41805476  -7.41805476
## 97            0.07118799  0.07118799   0.07118799   0.07118799
## 98            8.70709705  8.70709705   8.70709705   8.70709705
## 99            9.69826723  9.69826723   9.69826723   9.69826723
## 100           7.50989717  7.50989717   7.50989717   7.50989717
##     multiplicative_terms multiplicative_terms_lower multiplicative_terms_upper
## 95                     0                          0                          0
## 96                     0                          0                          0
## 97                     0                          0                          0
## 98                     0                          0                          0
## 99                     0                          0                          0
## 100                    0                          0                          0
##     yhat_lower yhat_upper trend_lower trend_upper      yhat
## 95    76.63815   100.9202    96.66689    96.75767  88.75856
## 96    77.25431   103.6982    97.63608    97.72959  90.26434
## 97    85.70444   111.6679    98.63733    98.73385  98.75628
## 98    96.32345   120.9269    99.63761    99.73820 108.39489
## 99    98.58875   122.2789   100.57396   100.67768 110.32407
## 100   95.78414   121.7491   101.57443   101.68202 109.13840
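As a quick sanity check on the row numbers and dates in this output, the sketch below rebuilds the same monthly date sequence with base R's seq(). It is not Prophet's internal code, only an independent reconstruction: 52 monthly rows of history (January 2020 to April 2024) followed by 48 forecast months.

```r
# Monthly history used to fit the model: January 2020 to April 2024
history <- seq(as.Date("2020-01-01"), as.Date("2024-04-01"), by = "month")

# 48 further months, starting the month after the last observation
future <- seq(max(history), by = "month", length.out = 48 + 1)[-1]

all_months <- c(history, future)

print(length(history))     # 52 rows of history
print(length(all_months))  # 100 rows in total, matching the last row number above
print(max(all_months))     # "2028-04-01", matching the last ds value above
```

Rows 1 to 52 of nvda_predictions are therefore fitted values for the historical months, and rows 53 to 100 are the actual forecast.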

Now we plot, including our forecast data.

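# plot() on a prophet model returns a ggplot object: the axis labels are set
# with xlabel/ylabel, and the title is added with ggtitle()
plot(nvda_prophet, nvda_predictions, 
     xlabel = 'Year', 
     ylabel = 'Price (USD)') +
  ggplot2::ggtitle('Prophet Forecast for NVDA')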

Observations

  • The black dots represent our actual historical monthly averages.

  • The blue line represents the model’s predicted values (yhat), which show the growth continuing.

  • The light blue shaded region represents the model’s uncertainty interval (Prophet’s default is an 80% interval), which widens as the years go on, showing that the forecast becomes less certain the further ahead we look.

Seasonality Analysis

Prophet allows us to extract its own calculated components to compare against our classical decomposition.

prophet_plot_components(nvda_prophet, nvda_predictions)

Insights:

Trend Component: The Prophet analysis confirms what we saw earlier, with the growth rate rising past 2023.

Yearly Seasonality: The seasonality chart seems to conflict with what we found in the classical decomposition: here there is a harsh drop in March. Upon researching how Prophet calculates seasonality, I found that this drop is most likely due to COVID-19, which began to hit markets in March 2020.

  • Since the pandemic caused a global market crash, the extremely sharp drop in the NVDA stock price that year, combined with having only a few years of data, causes the yearly seasonality to appear this way.
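The mechanism can be illustrated with a toy calculation. The sketch below uses made-up numbers (not the NVDA data) and simple per-month averages. Prophet fits a smooth Fourier series rather than monthly means, so this only shows the general effect: with four years of data, a single bad March contributes a quarter of its size to the estimated March effect.

```r
years <- 4
shock <- -20  # hypothetical one-off drop in the first March

# Four years of an otherwise flat detrended series, with one shocked March
x <- rep(0, years * 12)
x[3] <- shock
month_idx <- rep(1:12, times = years)

# Average by calendar month, then centre on zero
seasonal <- tapply(x, month_idx, mean)
seasonal <- seasonal - mean(seasonal)
print(round(seasonal, 2))

# The same shock averaged over 4 years versus 25 years of data
print(shock / c(4, 25))  # -5.0 -0.8
```

With a longer history the same event would be averaged over many more Marches and would leave a much smaller mark on the seasonal pattern, which is the trade-off of restricting the analysis to 2020 onwards.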

Final statements

To conclude, we have seen Nvidia’s success reflected both in our classical time series decomposition and in our analysis with Meta’s Prophet model.

The Prophet model predicts that Nvidia’s stock price will keep rising, but also shows that there is some uncertainty surrounding the price.