Prediction in R with Prophet

23/10/2024 Yufy Firdiansyah, MM, S.Sos

Short Introduction

In financial markets, accurate forecasts of price movements support better-informed investment decisions. This analysis demonstrates how to use the Prophet package in R, which is designed for forecasting time series data. Prophet simplifies time series modeling by letting users capture trends, seasonality, and holiday effects. Using historical price data, we will explore how Prophet produces forecasts that investors and analysts can use to anticipate market conditions. This introduction sets the stage for the practical implementation that follows.

Installing Packages & Loading Libraries

#Loading Libraries - Please install if necessary

library(readr)  
library(dplyr)  
library(stringr)  
library(quantmod)
library(TTR)
library(ggplot2)
library(prophet)

# Clear the environment to avoid naming conflicts
rm(list = ls())
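
The comment above asks the reader to install any missing packages. A minimal sketch of how that check might be automated follows. The helper name `missing_packages` is hypothetical (it is not part of any package), and the install call is left commented out so that nothing is downloaded unintentionally.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages used in this analysis
needed <- c("readr", "dplyr", "stringr", "quantmod", "TTR",
            "ggplot2", "prophet", "lubridate")

to_install <- missing_packages(needed)
print(to_install)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```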

Data Cleaning

Data cleaning is needed because downloads from Investing.com typically have no column named Close (which candlestick charts expect), name the volume column ‘Vol.’ instead of ‘Volume’, and store the Date, Vol., and Change columns as character strings.

First we set the working directory and load the file to be analyzed

#Set Directory and Load The File
setwd("~/RProject/Stocks_2")
stocs <- read.csv("LPKR Historical Data.csv")
head(stocs)
##         Date Price Open High Low    Vol. Change..
## 1 10/01/2024   105  100  106 100 244.35M    3.96%
## 2 09/30/2024   101  100  101  98 179.18M    2.02%
## 3 09/27/2024    99  100  101  96 170.50M    0.00%
## 4 09/26/2024    99  100  101  98 150.01M   -1.00%
## 5 09/25/2024   100  103  104  98 237.06M   -1.96%
## 6 09/24/2024   102   99  105  97 417.81M    4.08%

Then we inspect the column names and data types, and trim any stray whitespace from the column names to tidy up the data

column_names <- colnames(stocs)
print(column_names)
## [1] "Date"     "Price"    "Open"     "High"     "Low"      "Vol."     "Change.."
str(stocs)
## 'data.frame':    176 obs. of  7 variables:
##  $ Date    : chr  "10/01/2024" "09/30/2024" "09/27/2024" "09/26/2024" ...
##  $ Price   : int  105 101 99 99 100 102 98 101 107 95 ...
##  $ Open    : int  100 100 100 100 103 99 101 107 114 88 ...
##  $ High    : int  106 101 101 101 104 105 103 108 116 98 ...
##  $ Low     : int  100 98 96 98 98 97 95 101 98 88 ...
##  $ Vol.    : chr  "244.35M" "179.18M" "170.50M" "150.01M" ...
##  $ Change..: chr  "3.96%" "2.02%" "0.00%" "-1.00%" ...
colnames(stocs) <- trimws(colnames(stocs))
head(stocs)
##         Date Price Open High Low    Vol. Change..
## 1 10/01/2024   105  100  106 100 244.35M    3.96%
## 2 09/30/2024   101  100  101  98 179.18M    2.02%
## 3 09/27/2024    99  100  101  96 170.50M    0.00%
## 4 09/26/2024    99  100  101  98 150.01M   -1.00%
## 5 09/25/2024   100  103  104  98 237.06M   -1.96%
## 6 09/24/2024   102   99  105  97 417.81M    4.08%
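
The Vol. and Change.. columns are still character strings at this point. The analysis does not need them for the forecast, but if they are wanted as numbers, a conversion could look like the sketch below. The helpers `parse_volume` and `parse_percent` are hypothetical names introduced here, and the sketch assumes the only suffixes in use are K, M, and B.

```r
# Hypothetical helper: turn strings such as "244.35M" into plain numbers.
parse_volume <- function(x) {
  units  <- c(K = 1e3, M = 1e6, B = 1e9)
  suffix <- sub("^[0-9.]+", "", x)            # trailing letter, or "" if none
  value  <- as.numeric(sub("[KMB]$", "", x))  # numeric part of the string
  ifelse(suffix %in% names(units), value * units[suffix], value)
}

# Hypothetical helper: turn strings such as "3.96%" into fractions (0.0396).
parse_percent <- function(x) {
  as.numeric(sub("%$", "", x)) / 100
}

# A few values copied from the output above
sample_rows <- data.frame(
  Vol.     = c("244.35M", "179.18M", "170.50M"),
  Change.. = c("3.96%", "2.02%", "0.00%"),
  stringsAsFactors = FALSE
)

sample_rows$Volume <- parse_volume(sample_rows$Vol.)
sample_rows$Change <- parse_percent(sample_rows$Change..)
print(sample_rows)
```

Applied to the full data frame, the same two calls would add numeric Volume and Change columns to stocs.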

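The Date column is still in character/string format, so we convert it next. The dates in this file are month-first (for example 09/30/2024), so we state that order explicitly rather than letting the parser guess among several

#Convert Date to Date Format

library(lubridate)
stocs$Date <- parse_date_time(stocs$Date, orders = "mdy")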
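
The order of day and month matters for strings like these. The short base R sketch below, using two dates copied from the data, shows what happens when the same strings are read month-first and day-first. Only the month-first reading parses both values, which is how we can tell the file uses that order.

```r
raw <- c("10/01/2024", "09/30/2024")

# Month-first reading, which matches the sample rows shown earlier
as_mdy <- as.Date(raw, format = "%m/%d/%Y")

# Day-first reading of the same strings
as_dmy <- as.Date(raw, format = "%d/%m/%Y")

print(as_mdy)
print(as_dmy)  # the second element is NA because there is no month 30
```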

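Next, we create a Close column from the Open value in the next row. Note that the file is sorted newest first, so the next row is the previous trading day: lead(Open) gives each row the Open of the day before, which is only a rough stand-in for a closing price. The Price column appears to hold each day's closing price already (the Change.. values match its day-over-day change, for example 105/101 - 1 ≈ 3.96%), so it could be used directly instead

# Create the 'Close' column with the 'Open' value from the next row (the previous trading day, since rows are newest first)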
socs <- stocs %>%
  mutate(Close = lead(Open))

# Print the updated dataframe
head(socs)
##         Date Price Open High Low    Vol. Change.. Close
## 1 2024-10-01   105  100  106 100 244.35M    3.96%   100
## 2 2024-09-30   101  100  101  98 179.18M    2.02%   100
## 3 2024-09-27    99  100  101  96 170.50M    0.00%   100
## 4 2024-09-26    99  100  101  98 150.01M   -1.00%   103
## 5 2024-09-25   100  103  104  98 237.06M   -1.96%    99
## 6 2024-09-24   102   99  105  97 417.81M    4.08%   101
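
To make the effect of lead() on newest-first data concrete, here is a small base R sketch on the six rows printed above. It rebuilds the shifted column and also computes the day-over-day change in Price, which can be compared with the Change.. column. The column names Close_from_lead and Price_change_pct are introduced here for illustration only.

```r
# First six rows of the download, newest first (values from the output above)
rows <- data.frame(
  Date  = as.Date(c("2024-10-01", "2024-09-30", "2024-09-27",
                    "2024-09-26", "2024-09-25", "2024-09-24")),
  Price = c(105, 101, 99, 99, 100, 102),
  Open  = c(100, 100, 100, 100, 103, 99)
)

# Base R equivalent of dplyr::lead(Open): shift the values up by one row.
# The last row of this six-row sample gets NA because it has no row below it.
rows$Close_from_lead <- c(rows$Open[-1], NA)

# Day-over-day change in Price, in percent (each row versus the row below it)
rows$Price_change_pct <- c(
  round(100 * (rows$Price[-nrow(rows)] / rows$Price[-1] - 1), 2),
  NA
)

print(rows)
```

If the computed percentages agree with Change.., that supports reading Price as the closing price, assuming Change.. is the usual close-to-close change.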

We also have to delete the last row (the oldest date) because lead() has no following row to draw from and leaves its Close value as NA

#Delete last row (its Close is NA)

socs_data_cleaned <- na.omit(socs)
str(socs_data_cleaned)
## 'data.frame':    175 obs. of  8 variables:
##  $ Date    : POSIXct, format: "2024-10-01" "2024-09-30" ...
##  $ Price   : int  105 101 99 99 100 102 98 101 107 95 ...
##  $ Open    : int  100 100 100 100 103 99 101 107 114 88 ...
##  $ High    : int  106 101 101 101 104 105 103 108 116 98 ...
##  $ Low     : int  100 98 96 98 98 97 95 101 98 88 ...
##  $ Vol.    : chr  "244.35M" "179.18M" "170.50M" "150.01M" ...
##  $ Change..: chr  "3.96%" "2.02%" "0.00%" "-1.00%" ...
##  $ Close   : int  100 100 100 103 99 101 107 114 88 87 ...
##  - attr(*, "na.action")= 'omit' Named int 176
##   ..- attr(*, "names")= chr "176"

Predictions

First we prepare the data: Prophet expects a data frame with a date column named ds and a numeric column named y

# Prepare the data for Prophet, ensuring columns are named correctly
df_prophet <- data.frame(
  ds = socs_data_cleaned$Date,   # Date column
  y = socs_data_cleaned$Close     # Close price column
)
head(df_prophet)
##           ds   y
## 1 2024-10-01 100
## 2 2024-09-30 100
## 3 2024-09-27 100
## 4 2024-09-26 103
## 5 2024-09-25  99
## 6 2024-09-24 101

Then we create the model

# Fit the model
model <- prophet(df_prophet)

# List the components of the fitted model object
summary(model)
##                         Length Class      Mode     
## growth                    1    -none-     character
## changepoints             25    POSIXct    numeric  
## n.changepoints            1    -none-     numeric  
## changepoint.range         1    -none-     numeric  
## yearly.seasonality        1    -none-     character
## weekly.seasonality        1    -none-     character
## daily.seasonality         1    -none-     character
## holidays                  0    -none-     NULL     
## seasonality.mode          1    -none-     character
## seasonality.prior.scale   1    -none-     numeric  
## changepoint.prior.scale   1    -none-     numeric  
## holidays.prior.scale      1    -none-     numeric  
## mcmc.samples              1    -none-     numeric  
## interval.width            1    -none-     numeric  
## uncertainty.samples       1    -none-     numeric  
## specified.changepoints    1    -none-     logical  
## start                     1    POSIXct    numeric  
## y.scale                   1    -none-     numeric  
## logistic.floor            1    -none-     logical  
## t.scale                   1    -none-     numeric  
## changepoints.t           25    -none-     numeric  
## seasonalities             1    -none-     list     
## extra_regressors          0    -none-     list     
## country_holidays          0    -none-     NULL     
## stan.fit                  4    -none-     list     
## params                    6    -none-     list     
## history                   5    data.frame list     
## history.dates           175    POSIXct    numeric  
## train.holiday.names       0    -none-     NULL     
## train.component.cols      3    data.frame list     
## component.modes           2    -none-     list     
## fit.kwargs                0    -none-     list

Now we predict the next 60 days

# Create a dataframe for future dates (e.g., predicting the next 60 days)
future <- make_future_dataframe(model, periods = 60)
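
A future data frame built with daily steps covers every calendar day, weekends included, while the historical data contains trading days only. One possible way to keep weekdays only is sketched below in base R. The variable names are illustrative, and public holidays are not handled.

```r
last_obs <- as.Date("2024-10-01")   # most recent date in the sample data
future_days <- seq(last_obs + 1, by = "day", length.out = 60)

# "%u" gives the ISO weekday number: 1 = Monday ... 7 = Sunday
is_weekday <- as.integer(format(future_days, "%u")) <= 5
weekdays_only <- future_days[is_weekday]

length(future_days)
length(weekdays_only)

# The same filter could be applied to the Prophet data frame before predicting:
# future <- future[as.integer(format(future$ds, "%u")) <= 5, , drop = FALSE]
```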

# Make predictions
forecast <- predict(model, future)

# View the forecast results
head(forecast)
##           ds    trend additive_terms additive_terms_lower additive_terms_upper
## 1 2024-01-03 89.16507     -0.6878961           -0.6878961           -0.6878961
## 2 2024-01-04 88.99320     -2.4165668           -2.4165668           -2.4165668
## 3 2024-01-05 88.82132     -2.2542357           -2.2542357           -2.2542357
## 4 2024-01-08 88.30569     -0.9600463           -0.9600463           -0.9600463
## 5 2024-01-09 88.13382     -1.3154280           -1.3154280           -1.3154280
## 6 2024-01-10 87.96194     -0.6878961           -0.6878961           -0.6878961
##       weekly weekly_lower weekly_upper multiplicative_terms
## 1 -0.6878961   -0.6878961   -0.6878961                    0
## 2 -2.4165668   -2.4165668   -2.4165668                    0
## 3 -2.2542357   -2.2542357   -2.2542357                    0
## 4 -0.9600463   -0.9600463   -0.9600463                    0
## 5 -1.3154280   -1.3154280   -1.3154280                    0
## 6 -0.6878961   -0.6878961   -0.6878961                    0
##   multiplicative_terms_lower multiplicative_terms_upper yhat_lower yhat_upper
## 1                          0                          0   79.76347   97.04436
## 2                          0                          0   78.10065   95.76970
## 3                          0                          0   77.83975   94.70307
## 4                          0                          0   78.40013   95.51043
## 5                          0                          0   78.29185   95.41208
## 6                          0                          0   78.79586   96.02870
##   trend_lower trend_upper     yhat
## 1    89.16507    89.16507 88.47718
## 2    88.99320    88.99320 86.57663
## 3    88.82132    88.82132 86.56708
## 4    88.30569    88.30569 87.34565
## 5    88.13382    88.13382 86.81839
## 6    87.96194    87.96194 87.27405
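
The forecast table has many columns, but the point forecast yhat can be read as a sum of its parts. The sketch below copies the first three rows from the output above and checks that yhat equals trend plus additive_terms (the multiplicative terms are 0 here). It also computes the width of the uncertainty interval, which is an 80% interval assuming Prophet's default interval.width was left unchanged. The column names rebuilt_yhat and interval_width are introduced for illustration.

```r
# First three forecast rows, copied from the output above
fc <- data.frame(
  ds             = as.Date(c("2024-01-03", "2024-01-04", "2024-01-05")),
  trend          = c(89.16507, 88.99320, 88.82132),
  additive_terms = c(-0.6878961, -2.4165668, -2.2542357),
  yhat_lower     = c(79.76347, 78.10065, 77.83975),
  yhat_upper     = c(97.04436, 95.76970, 94.70307),
  yhat           = c(88.47718, 86.57663, 86.56708)
)

# With additive seasonality, the point forecast is trend + additive terms
fc$rebuilt_yhat <- fc$trend + fc$additive_terms

# Width of the uncertainty interval around each point forecast
fc$interval_width <- fc$yhat_upper - fc$yhat_lower

print(fc[, c("ds", "yhat", "rebuilt_yhat", "interval_width")])
```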

Then we visualize the forecast and its components

# Plot the forecast
plot(model, forecast)

# Optional: Plot components to analyze seasonality and trend
prophet_plot_components(model, forecast)

Conclusion

In conclusion, our prediction indicates an upward trend in prices. We also observed a notable weekly pattern, which is the only seasonal component Prophet fitted to this series of 175 daily observations; any monthly behaviour is reflected only through the trend, not as a separate seasonal term. These insights show how Prophet separates trend from seasonality and provide useful information for making informed investment decisions.