Objective: The primary goal is to forecast future prices of Tesla (TSLA) stock using time series analysis techniques. This includes fitting an ARIMA model to historical stock price data to generate forecasts for the future price movement of the stock.

install.packages(c("quantmod", "forecast", "tseries", "TTR", "ggplot2", "tidyverse"))

Installs all necessary packages.

library(quantmod)
getSymbols("TSLA", from = "2023-01-01", to = "2024-12-31")
[1] "TSLA"
prices <- Cl(TSLA)

Explanation: This code block loads the quantmod package and retrieves historical stock price data for Tesla (TSLA) from January 1, 2023, to December 31, 2024. The getSymbols function downloads the data, and the Cl() function extracts the closing prices.

Benefit: By loading historical price data, we can perform time series analysis and forecasting on real-world stock prices, making the analysis more relevant and practical.

library(tseries)
prices_ts <- ts(prices, frequency = 252)

Explanation: This block loads the tseries package and converts the extracted closing prices into a time series object with ts(), which comes from base R’s stats package rather than from tseries. The frequency is set to 252, which corresponds to the approximate number of trading days in a year.

Benefit: Converting the data into a time series format allows us to apply time series analysis techniques and models, such as ARIMA, which are specifically designed for this type of data.
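As a rough illustration of what the frequency argument does, the sketch below builds a ts object from a synthetic price vector (fake_prices and its values are made up for this example and are not TSLA data). It shows that the series is indexed by period and position rather than by calendar date, which is why dates have to be rebuilt separately when plotting.

```r
# Synthetic closing prices standing in for Cl(TSLA); values are invented
set.seed(1)
fake_prices <- 200 + cumsum(rnorm(504))   # two periods of 252 observations

# frequency = 252 tells R that 252 observations make up one period ("year")
fake_ts <- ts(fake_prices, frequency = 252)

frequency(fake_ts)               # 252
start(fake_ts)                   # 1 1   (period 1, observation 1)
end(fake_ts)                     # 2 252 (period 2, observation 252)
as.numeric(time(fake_ts))[1:3]   # 1.000000 1.003968 1.007937
```

The time index advances by 1/252 per observation, so it carries no weekend or holiday information.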

library(TTR)
ma20 <- SMA(prices, n = 20)
ma50 <- SMA(prices, n = 50)

Explanation: This code calculates the 20-day and 50-day simple moving averages (SMA) using the SMA() function from the TTR package. Moving averages smooth out price data, helping to identify trends by reducing noise.

Benefit: The moving averages provide additional insight into the stock’s price trends, making it easier to spot patterns such as upward or downward trends. This is useful for both short-term and long-term forecasting.
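To make the smoothing concrete, here is a minimal base-R stand-in for a trailing simple moving average, written with stats::filter() instead of TTR::SMA(). The helper name sma_demo and the synthetic series are assumptions for illustration. It also shows why the first n - 1 values are NA: there are not yet n observations to average.

```r
# Synthetic series standing in for closing prices; values are invented
set.seed(42)
x <- 100 + cumsum(rnorm(120))

# Trailing n-point mean: each value averages the current and previous n - 1 points
sma_demo <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

ma20_demo <- sma_demo(x, 20)
ma50_demo <- sma_demo(x, 50)

sum(is.na(ma20_demo))   # 19 leading NAs
sum(is.na(ma50_demo))   # 49 leading NAs
all.equal(ma20_demo[20], mean(x[1:20]))   # TRUE
```

These leading NA counts are the same 19 and 49 rows that ggplot2 reports as removed when the moving averages are plotted.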

library(forecast)
fit <- auto.arima(prices_ts)

Explanation: This block loads the forecast package and automatically fits an ARIMA model to the time series data using the auto.arima() function.

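Benefit: ARIMA models are widely used for time series forecasting. The auto.arima() function simplifies the process by searching over candidate model orders and selecting the one with the lowest information criterion (AICc by default). This saves time, although the default stepwise search does not guarantee the best possible model.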
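The idea of choosing orders by an information criterion can be sketched with a brute-force grid in base R. This is a simplified stand-in, not the procedure auto.arima() follows: it fixes d = 1, tries only p and q from 0 to 2, and compares plain AIC on a synthetic series. All object names here are illustrative.

```r
# Synthetic random walk with drift standing in for a price series
set.seed(123)
y <- ts(100 + cumsum(rnorm(300, mean = 0.1)))

# Candidate (p, q) pairs; d is fixed at 1 for this illustration
grid <- expand.grid(p = 0:2, q = 0:2)
grid$aic <- NA_real_

for (i in seq_len(nrow(grid))) {
  # Skip any candidate whose estimation fails outright
  fit_i <- tryCatch(
    arima(y, order = c(grid$p[i], 1, grid$q[i])),
    error = function(e) NULL
  )
  if (!is.null(fit_i)) grid$aic[i] <- AIC(fit_i)
}

# The row with the lowest AIC is the selected order
best <- grid[which.min(grid$aic), ]
best
```

Lower AIC rewards fit while penalising extra parameters, so the selected order is the best trade-off among the candidates tried, not a guarantee of good forecasts.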

forecast_prices <- forecast(fit, h = 30)
print(forecast_prices)

Explanation: This code uses the fitted ARIMA model to forecast Tesla’s stock prices for the next 30 trading days. The forecast() function produces point forecasts together with prediction intervals (80% and 95% by default).

Benefit: Forecasting future prices is crucial for making informed investment decisions. This step provides us with predictive insights into Tesla’s future stock performance, including potential risks and opportunities.
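The relationship between point forecasts and interval bounds can be sketched in base R with predict() on an arima() fit. The example assumes normally distributed errors and uses a synthetic series with an ARIMA(1,1,0) order chosen only for illustration; it is not the model selected for TSLA.

```r
# Synthetic random walk standing in for a price series
set.seed(7)
y <- ts(100 + cumsum(rnorm(250)))
fit_demo <- arima(y, order = c(1, 1, 0))   # order fixed for illustration

# predict() returns point forecasts ($pred) and their standard errors ($se)
pred <- predict(fit_demo, n.ahead = 30)

# Normal-approximation 95% prediction interval: mean +/- 1.96 * se
z95 <- qnorm(0.975)
lower95 <- pred$pred - z95 * pred$se
upper95 <- pred$pred + z95 * pred$se

# Interval width at horizons 1, 10 and 30
width <- as.numeric(upper95 - lower95)
width[c(1, 10, 30)]
```

For a differenced model the standard error grows with the horizon, so the interval widens the further ahead the forecast reaches.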

# Plot the historical prices; widen the axes so the forecast horizon is visible
plot(prices_ts, main = "ARIMA Forecast for TSLA Stock Prices", xlab = "Time", ylab = "Price", col = "blue", type = "l",
     xlim = range(time(prices_ts), time(forecast_prices$mean)),
     ylim = range(prices_ts, forecast_prices$lower[,2], forecast_prices$upper[,2]))

# Add the in-sample fitted values (one-step-ahead predictions)
lines(forecast_prices$fitted, col = "red", lty = 2)

# Add the point forecasts and the 95% prediction interval bounds
lines(forecast_prices$mean, col = "green")
lines(forecast_prices$lower[,2], col = "orange")
lines(forecast_prices$upper[,2], col = "orange")

Explanation: This section of code creates a plot of the historical stock prices using the plot() function. It then adds the fitted values from the ARIMA model in red, the point forecasts in green, and the bounds of the 95% prediction interval in orange.

Benefit: Visualizing the historical and forecasted prices together helps us understand how well the model fits the historical data and what future prices might look like. The prediction intervals provide a range of likely values, helping us assess the uncertainty of the forecasts.

# Convert to data frames for ggplot2; as.numeric() drops the xts column names
# so the columns are really called Price, MA20 and MA50
historical_prices <- data.frame(Date = index(prices), Price = as.numeric(coredata(prices)))
ma20_df <- data.frame(Date = index(prices), MA20 = as.numeric(coredata(ma20)))
ma50_df <- data.frame(Date = index(prices), MA50 = as.numeric(coredata(ma50)))

# Forecast data preparation: the next 30 weekdays after the last observation
# (market holidays are not excluded)
candidate_dates <- seq(max(index(prices)) + 1, by = "day", length.out = 60)
forecast_dates <- candidate_dates[!as.POSIXlt(candidate_dates)$wday %in% c(0, 6)][1:30]
forecast_df <- data.frame(Date = forecast_dates,
                           Forecast = as.numeric(forecast_prices$mean),
                           Lower = as.numeric(forecast_prices$lower[,2]),
                           Upper = as.numeric(forecast_prices$upper[,2]))
library(ggplot2)

ggplot() +
  geom_line(data = historical_prices, aes(x = Date, y = Price), color = "blue") +
  geom_line(data = ma20_df, aes(x = Date, y = MA20), color = "purple", linetype = "dashed") +
  geom_line(data = ma50_df, aes(x = Date, y = MA50), color = "orange", linetype = "dashed") +
  geom_line(data = forecast_df, aes(x = Date, y = Forecast), color = "red") +
  geom_ribbon(data = forecast_df, aes(x = Date, ymin = Lower, ymax = Upper), fill = "green", alpha = 0.2) +
  labs(title = "ARIMA Forecast for TSLA Stock Prices with Moving Averages", x = "Date", y = "Price") +
  theme_minimal() +
  theme(legend.position = "top")
Warning: Removed 19 rows containing missing values or values outside the scale range (`geom_line()`).
Warning: Removed 49 rows containing missing values or values outside the scale range (`geom_line()`).

Explanation: This section creates a more detailed plot using ggplot2. It visualizes historical prices, moving averages (20-day and 50-day), and forecasted prices with their 95% prediction interval. The data are first converted to data frames because ggplot2 works with data frames rather than xts or ts objects.

Benefit: This plot provides a comprehensive view of Tesla’s stock performance, including historical trends, moving averages, and future forecasts. It’s particularly useful for visualizing the interaction between past performance and future predictions, allowing for better analysis and decision-making.

accuracy(forecast_prices)
                    ME     RMSE      MAE        MPE     MAPE      MASE       ACF1
Training set 0.2229182 7.143882 5.268739 0.09118097 2.555366 0.1636046 0.01290762

Understanding the Metrics:

ME (Mean Error): Value: 0.2229182 Explanation: ME represents the average of the errors (the difference between actual and predicted values). A value close to 0 suggests that the predictions are relatively unbiased. The positive value here indicates that, on average, the model slightly underestimates the actual prices, but the bias is minimal.

RMSE (Root Mean Squared Error): Value: 7.143882 Explanation: RMSE measures the square root of the average squared differences between actual and predicted values. It’s sensitive to large errors, which can heavily influence the overall accuracy. A lower RMSE value indicates better accuracy, and the value here suggests that the model has a reasonable overall error magnitude.

MAE (Mean Absolute Error): Value: 5.268739 Explanation: MAE is the average of the absolute differences between actual and predicted values. It’s a straightforward metric for understanding model accuracy, with a lower value indicating better performance. Here, the MAE indicates that the model’s fitted values deviate from the actual closing prices by about $5.27 on average.

MPE (Mean Percentage Error): Value: 0.09118097 Explanation: MPE represents the average of the percentage errors between actual and predicted values. A positive MPE suggests that the model slightly underestimates the actual values. The small positive value here shows that the model’s bias is minimal.

MAPE (Mean Absolute Percentage Error): Value: 2.555366 Explanation: MAPE is the average of the absolute percentage errors between actual and predicted values. It’s a common metric for assessing forecasting accuracy, with lower values indicating better performance. The MAPE of 2.555366% means that the model’s fitted values are, on average, within 2.56% of the actual values. Because it is computed on the training set, it measures one-step-ahead in-sample fit rather than accuracy over the 30-day forecast horizon.

MASE (Mean Absolute Scaled Error): Value: 0.1636046 Explanation: MASE compares the model’s errors to those from a naive benchmark model. A MASE value below 1 indicates that the model outperforms the naive model. Here, the value of 0.1636046 suggests that the model performs much better than the benchmark. Note that because prices_ts was created with frequency = 252, the forecast package’s default scaling uses a seasonal naive forecast (the price 252 trading days earlier), which is a far weaker benchmark than the previous day’s close.

ACF1 (Autocorrelation of Errors at Lag 1): Value: 0.01290762 Explanation: ACF1 measures the correlation between forecast errors at lag 1. If errors are independent, ACF1 should be close to 0. The value here, 0.01290762, indicates that there is minimal autocorrelation in the errors, which is desirable and suggests that the model’s errors are not patterned or predictable.
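The definitions above can be checked by hand on a tiny example. The sketch below uses five invented actual and fitted values (not TSLA data) and the textbook formulas, with MASE scaled by a one-step naive forecast. The accuracy() function may differ in details such as seasonal scaling, so this is meant only to show how each number is built.

```r
# Invented values for illustration only
actual      <- c(100, 102, 101, 105, 107)
fitted_vals <- c(99, 101, 103, 104, 106)

e  <- actual - fitted_vals   # error = actual minus predicted
pe <- 100 * e / actual       # percentage errors

metrics <- c(
  ME   = mean(e),
  RMSE = sqrt(mean(e^2)),
  MAE  = mean(abs(e)),
  MPE  = mean(pe),
  MAPE = mean(abs(pe)),
  # Scale by the in-sample MAE of a one-step naive forecast (previous value)
  MASE = mean(abs(e)) / mean(abs(diff(actual))),
  # Lag-1 autocorrelation of the errors
  ACF1 = acf(e, lag.max = 1, plot = FALSE)$acf[2]
)
round(metrics, 4)
```

Here ME is 0.4 and MAE is 1.2: positive and negative errors partly cancel in ME but not in MAE, which is why ME measures bias and MAE measures typical error size.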

Interpretation and Recommendations: Overall Model Performance: The model fits the training data well, with low RMSE and MAE values relative to the price level. The minimal positive ME and MPE suggest that the model slightly underestimates actual prices, but this bias is very small.

MAPE: The MAPE of 2.555366% indicates a close in-sample fit, as the fitted values are typically within 2.56% of the actual values.

MASE: The MASE value of 0.1636046 shows that the model clearly outperforms the naive benchmark used for scaling.

ACF1: The low ACF1 value indicates that the forecast errors are largely independent, which is a good sign for model reliability.
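Since all of the metrics above are computed on the training set, one way to check forecast accuracy directly is a holdout test. The sketch below is an assumption-laden illustration in base R: it uses a synthetic series, a fixed ARIMA(1,1,0) order, and a 30-observation holdout, and compares the model against a naive forecast that carries the last training value forward.

```r
# Synthetic prices for illustration; not TSLA data
set.seed(2024)
y <- 100 + cumsum(rnorm(300))

# Hold out the last 30 observations as a test set
h <- 30
train <- y[1:(length(y) - h)]
test  <- y[(length(y) - h + 1):length(y)]

# Fit on the training portion only, then forecast the holdout period
fit_train <- arima(train, order = c(1, 1, 0))   # order fixed for illustration
fc <- as.numeric(predict(fit_train, n.ahead = h)$pred)

# Naive benchmark: last observed training value carried forward
naive_fc <- rep(tail(train, 1), h)

mae <- function(a, f) mean(abs(a - f))
c(arima = mae(test, fc), naive = mae(test, naive_fc))
```

If the model's holdout MAE is not clearly below the naive benchmark's, the low training-set errors mostly reflect how persistent prices are from one day to the next rather than forecasting skill.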

Conclusion: These metrics show that the ARIMA model used in this project fits the historical data closely, with minimal bias and largely uncorrelated errors. Because all of them are computed on the training set, they support the model as a reasonable starting point for short-term forecasting; evaluating it on held-out data would show whether further tuning is needed.
