Bitcoin Price Forecasting (Time-Series)

Kris Kilgroe

2022-06-19

Disclaimer

This is not financial advice. This is an applied technical exercise on time-series forecasting using historical Bitcoin closing price data.

Introduction

Given the recent turbulence of Bitcoin’s value in recent months, it is easy to succumb to hype and objection heresay from speculative sources. Often, many may lose sight of the simple fundamentals that lend credibility towards an asset’s valuation. With this in mind, the topic for today addresses a more technical method for speculating projected price.

Enter Rstudio’s Prophet package, developed by Facebook’s Core Data Science team. Prophet aims to streamline the forecasting process by implementing a regression model that fits non-linear trends with seasonality and holiday effects. This technique can be applied to daily, monthly, and yearly price prediction for such assets like crypto currency and securities.1

Process

For our purposes, we will use Bitcoin as the asset in question to develop a one year forecast prediction for its closing price.

We will approach this simply and methodically by:

  • Creating an Rproject directory to easily read in data.

  • Downloading a dataset from Yahoo Finance on Bitcoin historical price data from September 2014 to June 2022.

  • Isolating the Date and Close columns, then transform them to integrate into the Prophet package.

  • Executing the fit model and forecast model to extrapolate Bitcoin’s closing price, one year from June 2022.

  • Creating a vector that estimates the standard price forecast, upper bound price forecast, and lower bound price forecast.

  • Condensing the new data into a mid-month interval for more efficient interpretation (This is not an essential step, but a personal preference).

  • Viewing the final results in graphical and tabular form.

  • Drawing a conclusion based off the forecasting estimates.

Step 1

Import libraries.

library(readxl) #to extract data from excel file
library(stringr) #to manipulate change string values for model integration 
library(dplyr) #to manipulate initial raw dataset
library(prophet) #to implement the times-series forecasting procedure

Step 2

From the Rproject directory, load the downloaded excel file into a vector; this is the preferred method to read in datasets for simplicity.2

my_data <- read_excel("BTC-USD.xlsx")

Step 3

Check the initial starting data and ending data to ensure the desired start time and end time, respectively.

knitr::kable(head(my_data))
Date Open High Low Close Adj Close Volume
2014-09-17 465.864 468.174 452.422 457.334 457.334 21056800
2014-09-18 456.860 456.860 413.104 424.440 424.440 34483200
2014-09-19 424.103 427.835 384.532 394.796 394.796 37919700
2014-09-20 394.673 423.296 389.883 408.904 408.904 36863600
2014-09-21 408.085 412.426 393.181 398.821 398.821 26580100
2014-09-22 399.100 406.916 397.130 402.152 402.152 24127600
knitr::kable(tail(my_data))
Date Open High Low Close Adj Close Volume
2022-06-14 22487.99 23018.95 20950.82 22206.79 22206.79 50913575242
2022-06-15 22196.73 22642.67 20178.38 22572.84 22572.84 54912007015
2022-06-16 22576.30 22868.92 20265.23 20381.65 20381.65 31183975654
2022-06-17 20385.72 21243.31 20326.52 20471.48 20471.48 27132421514
2022-06-18 20473.43 20736.04 17708.62 19017.64 19017.64 42009436760
2022-06-19 18936.42 19816.04 18085.06 19473.26 19473.26 40294412288

Step 4

Drop the Open, High, Low, Adj Close, and Volume columns; retain the Date & Close columns.

my_data2 = select(my_data, -2,-3,-4,-6,-7)
knitr::kable(head(my_data2))
Date Close
2014-09-17 457.334
2014-09-18 424.440
2014-09-19 394.796
2014-09-20 408.904
2014-09-21 398.821
2014-09-22 402.152

Step 5

Replace Date column title with ds and Close column title with y. This is a crucial step in order for the Prophet package to read the data correctly.

colnames(my_data2)[1] <- "ds"
colnames(my_data2)[2] <- "y"
knitr::kable(head(my_data2))
ds y
2014-09-17 457.334
2014-09-18 424.440
2014-09-19 394.796
2014-09-20 408.904
2014-09-21 398.821
2014-09-22 402.152

Step 6

Replace all null string values with 0; this is crucial for the Prophet package to read the data correctly. Note: for this particular dataset the output is unchanged, but this is a precautionary and redundant step to ensure proper data cleaning.

my_data2$y <- str_replace_all(my_data2$y, 'null', '0') #replaces all nulls to 0s in column
knitr::kable(head(my_data2))
ds y
2014-09-17 457.334015
2014-09-18 424.440002
2014-09-19 394.79599
2014-09-20 408.903992
2014-09-21 398.821014
2014-09-22 402.152008

Step 7

Call the Prophet f(x) to Fit Model the model into a vector using daily seasonality…

Model1<-prophet(my_data2, daily.seasonality = TRUE)

…then, call the make_future_dataframe f(x) into another vector, to forecast 1 year from the dataset’s end date.

Future1<-make_future_dataframe(Model1, periods = (365*1))

Step 8

Create a vector that includes the Date ds, standard predicted close price yhat, lower estimated close price yhat_lower, and upper estimated close price yhat_upper.

Forecast1<-predict(Model1,Future1)
knitr::kable(tail(Forecast1[c('ds','yhat','yhat_lower','yhat_upper')]))
ds yhat yhat_lower yhat_upper
3193 2023-06-14 56000.37 45980.83 66956.84
3194 2023-06-15 56059.31 45984.03 65940.03
3195 2023-06-16 56193.12 45449.73 66411.94
3196 2023-06-17 56337.43 45849.43 66993.55
3197 2023-06-18 56459.55 45895.33 66801.10
3198 2023-06-19 56668.71 46441.59 67916.79

Step 9

Plot the Model Estimates.

dyplot.prophet(Model1,Forecast1)

This is the interactive graphical depiction of the forecasting model, in all its glory. The thin blue line is the forecasted price trend, while the dotted black line is the historical price over time.

prophet_plot_components(Model1,Forecast1)

These individual forecasted components show the seasonal trends captured at daily, weekly, and yearly intervals.

Step 10

Subset the dataframe for the 15th day of each month/year.

d<-data.frame(Forecast1[c('ds','yhat','yhat_lower','yhat_upper')])

Price_1_Year_Table<-d[format.Date(d$ds, "%d")=="15" &
                              !is.na(d$ds),]

Filter for the range between 2022-06-15 and 2023-06-15, as we are interested in this target year range for a grainier look.

Truncated <- (subset(Price_1_Year_Table, ds>="2022-05-15"))
knitr::kable(Truncated)
ds yhat yhat_lower yhat_upper
2829 2022-06-15 45823.62 37197.39 52913.46
2859 2022-07-15 47799.84 39568.29 55599.78
2890 2022-08-15 50533.95 42918.03 57702.43
2921 2022-09-15 50191.86 42165.80 57559.06
2951 2022-10-15 52078.07 44600.84 60164.31
2982 2022-11-15 53398.11 44816.40 60532.95
3012 2022-12-15 53171.10 44832.75 61491.97
3043 2023-01-15 54677.55 46059.34 63406.97
3074 2023-02-15 56803.39 48526.19 65578.74
3102 2023-03-15 57485.30 48178.42 66705.18
3133 2023-04-15 58459.75 49116.34 68623.36
3163 2023-05-15 57232.28 47465.64 67053.69
3194 2023-06-15 56059.31 45984.03 65940.03


For extra credit, take the means of yhat, yhat lower, and yhat upper ranges.

knitr::kable(mean(Truncated$yhat))
x
53362.63

The standard forecasted average for the next year.

knitr::kable(mean(Truncated$yhat_lower))
x
44725.34

The lower bound forecasted average for the next year.

knitr::kable(mean(Truncated$yhat_upper))
x
61790.15

The upper bound forecasted average for the next year.

Conclusion

Instinctively, the forecasted results indicate a sensible spread between projected Bitcoin closing price estimates, in the coming year. However, contrary to the package title, this is not an 100% future-predicting oracle analysis and should be considered with caution, given Bitcoin’s market volatility. Also, always consider black swan events.

Considering Bitcoin’s use case as an asset, this regression model bakes in some considerations as retail and institutional buying habits over time, factoring seasons and holidays, along with network difficulty (how difficult it is to mine a Bitcoin block, which offers staggered correlation to Bitcoin price)3.

However, this forecasting model does not necessarily account for the Federal Funds Rate changes with fidelity. A multiple linear regression analysis4 might be a better technique to pinpoint the significance in Federal Funds Rate dynamics on Bitcoin’s value.

Ultimately, this exercise demonstrates one tool in how one may project future seasonal trends in securities, commodities, or other non-monetary metrics such as: quantity of shoppers in a retail store, quantity of tickets sold, number of cars sold, or projected sale prices for merchandise. The key component to any analysis is accurate and well-scrubbed data entry.

For any questions or comments on this analysis please contact .

P.S.

Trust the math!

Session Info

sessionInfo()
#> R version 4.2.0 (2022-04-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Linux Mint 20.3
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] prophet_1.0   rlang_1.0.2   Rcpp_1.0.8.3  dplyr_1.0.9   stringr_1.4.0
#> [6] readxl_1.4.0 
#> 
#> loaded via a namespace (and not attached):
#>  [1] lubridate_1.8.0      lattice_0.20-45      tidyr_1.2.0         
#>  [4] prettyunits_1.1.1    ps_1.7.1             zoo_1.8-10          
#>  [7] assertthat_0.2.1     digest_0.6.29        utf8_1.2.2          
#> [10] R6_2.5.1             cellranger_1.1.0     stats4_4.2.0        
#> [13] evaluate_0.15        ggplot2_3.3.6        highr_0.9           
#> [16] pillar_1.7.0         rstudioapi_0.13      callr_3.7.0         
#> [19] jquerylib_0.1.4      rmarkdown_2.14       labeling_0.4.2      
#> [22] htmlwidgets_1.5.4    loo_2.5.1            munsell_0.5.0       
#> [25] compiler_4.2.0       xfun_0.31            rstan_2.21.5        
#> [28] pkgconfig_2.0.3      pkgbuild_1.3.1       htmltools_0.5.2     
#> [31] tidyselect_1.1.2     tibble_3.1.7         gridExtra_2.3       
#> [34] bookdown_0.27        codetools_0.2-18     matrixStats_0.62.0  
#> [37] fansi_1.0.3          crayon_1.5.1         grid_4.2.0          
#> [40] jsonlite_1.8.0       gtable_0.3.0         lifecycle_1.0.1     
#> [43] DBI_1.1.3            magrittr_2.0.3       StanHeaders_2.21.0-7
#> [46] scales_1.2.0         rmdformats_1.0.4     RcppParallel_5.1.5  
#> [49] cli_3.3.0            stringi_1.7.6        farver_2.1.0        
#> [52] bslib_0.3.1          dygraphs_1.1.1.6     ellipsis_0.3.2      
#> [55] xts_0.12.1           generics_0.1.2       vctrs_0.4.1         
#> [58] tools_4.2.0          glue_1.6.2           purrr_0.3.4         
#> [61] processx_3.6.1       parallel_4.2.0       fastmap_1.1.0       
#> [64] yaml_2.3.5           inline_0.3.19        colorspace_2.0-3    
#> [67] extraDistr_1.9.1     knitr_1.39           sass_0.4.1

Footnotes


  1. The Prophet white paper may be viewed at: https://peerj.com/preprints/3190/ for those interested.↩︎

  2. Data pulled from Yahoo Finance: https://finance.yahoo.com/quote/BTC-USD/↩︎

  3. How Mining Difficulty Affects Bitcoin Price: https://medium.com/@datalightme/how-mining-difficulty-affects-bitcoin-price-a6a6e1eeb39d↩︎

  4. Technical case study in multiple linear regression analysis: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3049417/↩︎