Stock prediction

Load necessary libraries

library(prophet)
Warning: package 'prophet' was built under R version 4.2.3
Loading required package: Rcpp
Warning: package 'Rcpp' was built under R version 4.2.3
Loading required package: rlang
Warning: package 'rlang' was built under R version 4.2.3
library(lubridate)
Warning: package 'lubridate' was built under R version 4.2.3

Attaching package: 'lubridate'
The following objects are masked from 'package:base':

    date, intersect, setdiff, union
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.2.3
library(pacman)
Warning: package 'pacman' was built under R version 4.2.3
pacman::p_load(data.table, fixest, BatchGetSymbols, finreportr, ggplot2, lubridate)
Set parameters
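first.date <- Sys.Date() - 2500   # start of the window: 2,500 calendar days before the run date
last.date <- Sys.Date()           # end of the window: the run date
freq.data <- "daily"              # daily price bars
tickers <- c("BALKRISIND.NS")     # Yahoo Finance ticker for Balkrishna Industries on the NSE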
Collecting data

Daily data for the 2,500 calendar days up to the run date. In this run, the retained observations span 2016-08-22 to 2023-06-23.

Getting Data

stocks <- BatchGetSymbols(tickers=tickers,
                          first.date = first.date,
                          last.date = last.date,
                          freq.data = freq.data,
                          do.cache = FALSE,
                          thresh.bad.data = 0)
Warning: `BatchGetSymbols()` was deprecated in BatchGetSymbols 2.6.4.
ℹ Please use `yfR::yf_get()` instead.
ℹ 2022-05-01: Package BatchGetSymbols will soon be replaced by yfR.  More
  details about the change is available at github
  <<www.github.com/msperlin/yfR> You can install yfR by executing:

remotes::install_github('msperlin/yfR')

Running BatchGetSymbols for:
   tickers =BALKRISIND.NS
   Downloading data for benchmark ticker
^GSPC | yahoo (1|1)
BALKRISIND.NS | yahoo (1|1) - Got 95% of valid prices | Nice!
data <- stocks$df.tickers
data <- na.omit(data)  # drops the first row, whose returns are NA
head(data)
  price.open price.high price.low price.close volume price.adjusted   ref.date
2    373.025      382.5   373.025     374.525  52912       348.7958 2016-08-22
3    374.500      375.0   370.000     372.625  12328       347.0265 2016-08-23
4    372.500      383.0   370.125     375.050  36094       349.2848 2016-08-24
5    381.500      385.0   374.125     383.300  75978       356.9681 2016-08-25
6    384.000      394.5   376.500     392.200  50296       365.2567 2016-08-26
7    396.950      421.0   394.500     416.900 267740       388.2597 2016-08-29
         ticker ret.adjusted.prices ret.closing.prices
2 BALKRISIND.NS         0.001069007        0.001069145
3 BALKRISIND.NS        -0.005072820       -0.005073076
4 BALKRISIND.NS         0.006507663        0.006507851
5 BALKRISIND.NS         0.021997203        0.021997068
6 BALKRISIND.NS         0.023219371        0.023219475
7 BALKRISIND.NS         0.062977860        0.062978024
str(data)
'data.frame':   1691 obs. of  10 variables:
 $ price.open         : num  373 374 372 382 384 ...
 $ price.high         : num  382 375 383 385 394 ...
 $ price.low          : num  373 370 370 374 376 ...
 $ price.close        : num  375 373 375 383 392 ...
 $ volume             : num  52912 12328 36094 75978 50296 ...
 $ price.adjusted     : num  349 347 349 357 365 ...
 $ ref.date           : Date, format: "2016-08-22" "2016-08-23" ...
 $ ticker             : chr  "BALKRISIND.NS" "BALKRISIND.NS" "BALKRISIND.NS" "BALKRISIND.NS" ...
 $ ret.adjusted.prices: num  0.00107 -0.00507 0.00651 0.022 0.02322 ...
 $ ret.closing.prices : num  0.00107 -0.00507 0.00651 0.022 0.02322 ...
 - attr(*, "na.action")= 'omit' Named int 1
  ..- attr(*, "names")= chr "1"
Dataset info

After dropping the one row with missing values, the dataset contains 1,691 observations of 10 variables.

# qplot() is deprecated since ggplot2 3.4.0, so build the scatter plot with ggplot() directly
ggplot(data, aes(x = ref.date, y = price.close)) + geom_point()

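Caution

The closing-price series is clearly not stationary. A log transformation does not make it stationary on its own (that would require differencing, for example taking log returns), but it stabilizes the variance and turns multiplicative growth into additive growth, which suits Prophet's additive model. The next step therefore models the log of the closing price.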

Log transformation

ds <- data$ref.date
y <- log(data$price.close)
df <- data.frame(ds,y)
head(df)
          ds        y
1 2016-08-22 5.925659
2 2016-08-23 5.920573
3 2016-08-24 5.927059
4 2016-08-25 5.948818
5 2016-08-26 5.971772
6 2016-08-29 6.032846
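
To illustrate what the log transformation does and does not do, here is a minimal base R sketch on a synthetic price path. The data, the seed, and all variable names are assumptions made for illustration and are unrelated to the stock above. The log series still drifts with the trend, while the first difference of the logs (the daily log return) has a roughly constant level.

```r
# Synthetic price path with multiplicative growth (hypothetical data)
set.seed(1)
n <- 500
log_returns <- rnorm(n, mean = 0.005, sd = 0.01)
price <- 100 * exp(cumsum(log_returns))

log_price <- log(price)        # variance-stabilized, but still trending
log_ret   <- diff(log_price)   # first difference of the logs = daily log return

# Differencing the logs recovers the simulated returns (all but the first one)
max_abs_err <- max(abs(log_ret - log_returns[-1]))

# Compare how far the average level moves between the two halves of each series
half <- n %/% 2
drift_log_price <- abs(mean(log_price[(half + 1):n]) - mean(log_price[1:half]))
drift_log_ret   <- abs(mean(log_ret[(half + 1):(n - 1)]) - mean(log_ret[1:half]))
```

In this sketch `drift_log_price` is large while `drift_log_ret` is close to zero, which is the usual informal sign that the differenced series, not the logged one, behaves like a stationary series.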

Stock forecasting using the prophet package

m <- prophet(df)
Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
future <- make_future_dataframe(m, periods = 30)
forecast <- predict(m, future)
f_d <- prophet(df, daily.seasonality = TRUE)
future_d <- make_future_dataframe(f_d, periods = 30)
forecast_1 <- predict(f_d, future_d)
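
One thing to watch with a 30-day future frame is that it includes weekends, when the exchange is closed and the model has no training data. In the forecast tail shown later in this document, the weekly term for Saturday 2023-07-22 (0.0658) is far larger than on the neighbouring weekdays. A possible remedy, sketched below in base R, is to keep only weekdays in the future dates. The date `last_obs` is taken from this run, the name `trading_like` is hypothetical, and exchange holidays are not handled.

```r
# Last observation date in this run (assumption: taken from the output in this document)
last_obs <- as.Date("2023-06-23")

# 30 calendar days ahead, mirroring make_future_dataframe(m, periods = 30)
future_dates <- seq(last_obs + 1, by = "day", length.out = 30)

# "%u" is the ISO weekday number: 1 = Monday ... 7 = Sunday (locale independent)
weekday_num <- as.integer(format(future_dates, "%u"))

# Keep Monday to Friday only; holidays would need a separate calendar
trading_like <- future_dates[weekday_num <= 5]
```

With prophet, the same filter could be applied to the `ds` column of `future` before calling `predict()`, so that the weekly seasonality is never evaluated on days it was not fitted on.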

Model performance & Stock Prediction

pred <- forecast$yhat[1:dim(df)[1]]
actual <- m$history$y
plot(actual, pred)

pred_1 <- forecast_1$yhat[1:dim(df)[1]]
actual_1 <- f_d$history$y
plot(actual_1, pred_1)

summary(lm(pred~actual))

Call:
lm(formula = pred ~ actual)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226360 -0.036114  0.002566  0.036161  0.306661 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.138538   0.022108   6.266 4.68e-10 ***
actual      0.980543   0.003098 316.497  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.05996 on 1689 degrees of freedom
Multiple R-squared:  0.9834,    Adjusted R-squared:  0.9834 
F-statistic: 1.002e+05 on 1 and 1689 DF,  p-value: < 2.2e-16
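Model 1

The adjusted R-squared is 98.34%, so the fitted values track the historical log prices closely. This measures in-sample fit only, not accuracy on unseen data.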
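
Because the regression above compares fitted values with the same data the model was trained on, a holdout check is a useful complement. The following is a minimal sketch, not the method used in this document: the helpers `rmse` and `mape` are hypothetical names, the series is synthetic, and a plain linear model stands in for Prophet so the example runs with base R alone.

```r
# Hypothetical helper functions; the names are illustrative, not part of prophet
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual)) * 100

# Synthetic log-price series with a linear trend plus noise (not the stock data)
set.seed(42)
n <- 300
t_idx <- seq_len(n)
y <- 6 + 0.002 * t_idx + rnorm(n, sd = 0.03)

# Time-ordered split: fit on the first 270 points, hold out the last 30
h <- 30
train <- data.frame(t = t_idx[1:(n - h)], y = y[1:(n - h)])
test  <- data.frame(t = t_idx[(n - h + 1):n], y = y[(n - h + 1):n])

fit <- lm(y ~ t, data = train)   # stand-in model; Prophet would replace this step
in_sample  <- rmse(train$y, fitted(fit))
out_sample <- rmse(test$y, predict(fit, newdata = test))
```

Comparing `in_sample` with `out_sample` shows how much of the apparent fit carries over to unseen dates. The prophet package also ships `cross_validation()` and `performance_metrics()`, which automate this kind of rolling evaluation for a fitted model.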

summary(lm(pred_1~actual_1))

Call:
lm(formula = pred_1 ~ actual_1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.225511 -0.036084  0.002886  0.035988  0.306746 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.137486   0.022030   6.241 5.49e-10 ***
actual_1    0.980691   0.003087 317.669  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.05975 on 1689 degrees of freedom
Multiple R-squared:  0.9835,    Adjusted R-squared:  0.9835 
F-statistic: 1.009e+05 on 1 and 1689 DF,  p-value: < 2.2e-16
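Model 2

The adjusted R-squared is 98.35%, essentially the same as Model 1, so enabling daily seasonality adds almost nothing here.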

Plot forecast

prophet_plot_components(m,forecast)

Predictions in R

plot(m, forecast)

Interpretation

The plot shows the upward trend continuing over the 30-day forecast horizon.

  • Let's transform the log values back to the original price scale using R's exp() function.
tail(forecast)
             ds    trend additive_terms additive_terms_lower
1716 2023-07-18 7.670333     0.02208152           0.02208152
1717 2023-07-19 7.670348     0.02195054           0.02195054
1718 2023-07-20 7.670363     0.02167228           0.02167228
1719 2023-07-21 7.670378     0.01735060           0.01735060
1720 2023-07-22 7.670394     0.09522528           0.09522528
1721 2023-07-23 7.670409     0.03139401           0.03139401
     additive_terms_upper       weekly weekly_lower weekly_upper     yearly
1716           0.02208152 -0.014177279 -0.014177279 -0.014177279 0.03625880
1717           0.02195054 -0.012706933 -0.012706933 -0.012706933 0.03465747
1718           0.02167228 -0.011292873 -0.011292873 -0.011292873 0.03296515
1719           0.01735060 -0.013856093 -0.013856093 -0.013856093 0.03120669
1720           0.09522528  0.065817643  0.065817643  0.065817643 0.02940763
1721           0.03139401  0.003800532  0.003800532  0.003800532 0.02759348
     yearly_lower yearly_upper multiplicative_terms multiplicative_terms_lower
1716   0.03625880   0.03625880                    0                          0
1717   0.03465747   0.03465747                    0                          0
1718   0.03296515   0.03296515                    0                          0
1719   0.03120669   0.03120669                    0                          0
1720   0.02940763   0.02940763                    0                          0
1721   0.02759348   0.02759348                    0                          0
     multiplicative_terms_upper yhat_lower yhat_upper trend_lower trend_upper
1716                          0   7.612457   7.771357    7.670157    7.670393
1717                          0   7.613917   7.776449    7.669935    7.670636
1718                          0   7.612278   7.774837    7.669118    7.671053
1719                          0   7.606057   7.765511    7.668348    7.671668
1720                          0   7.683131   7.843805    7.667661    7.671917
1721                          0   7.618205   7.782441    7.667075    7.672231
         yhat
1716 7.692414
1717 7.692298
1718 7.692035
1719 7.687729
1720 7.765619
1721 7.701803
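# Back-transform the log-scale forecast columns to the original price scale
forecast$yhat <- exp(forecast$yhat)
forecast$yhat_lower <- exp(forecast$yhat_lower)
forecast$yhat_upper <- exp(forecast$yhat_upper)
forecast$trend <- exp(forecast$trend)
forecast$trend_upper <- exp(forecast$trend_upper)
forecast$trend_lower <- exp(forecast$trend_lower)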
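Predicted values

After the transformation, the yhat column holds the predicted closing prices on the original scale. For example, the final log forecast of 7.701803 corresponds to a price of about 2,212.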
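
As a small worked example of the back-transformation, the sketch below rebuilds the last three rows of the log-scale forecast from the tail(forecast) output above and exponentiates the point forecast together with its interval bounds. The names `fc`, `fc_price`, and `log_cols` are hypothetical, and working on a copy is only one possible design: it keeps the log-scale frame available for later plotting.

```r
# Last three rows of the log-scale forecast, copied from the tail(forecast) output
fc <- data.frame(
  ds         = as.Date(c("2023-07-21", "2023-07-22", "2023-07-23")),
  yhat       = c(7.687729, 7.765619, 7.701803),
  yhat_lower = c(7.606057, 7.683131, 7.618205),
  yhat_upper = c(7.765511, 7.843805, 7.782441)
)

# Exponentiate every log-scale column in one step, leaving the original frame untouched
log_cols <- c("yhat", "yhat_lower", "yhat_upper")
fc_price <- fc
fc_price[log_cols] <- lapply(fc[log_cols], exp)
```

Because exp() is monotonic, the ordering lower < yhat < upper is preserved, although the interval is no longer symmetric around the point forecast on the price scale.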

Disclaimer

Consult your financial advisor before making any investment. I take no responsibility for any losses or for any trading ideas drawn from this analysis.