Disclaimer
This is not financial advice. This is an applied
technical exercise on time-series forecasting using
historical Bitcoin closing price data.
Introduction
Given the recent turbulence of Bitcoin’s value in recent months, it
is easy to succumb to hype and
objection heresay from speculative sources. Often, many may
lose sight of the simple fundamentals that lend credibility towards an
asset’s valuation. With this in mind, the topic for today addresses a
more technical method for speculating projected price.
Enter Rstudio’s Prophet package, developed by Facebook’s
Core Data Science team. Prophet aims to streamline the
forecasting process by implementing a regression model that
fits non-linear trends with seasonality and
holiday effects. This technique can be applied to daily,
monthly, and yearly price prediction for such assets like crypto
currency and securities.1
Process
For our purposes, we will use Bitcoin as the asset in question to develop a one year forecast prediction for its closing price.
We will approach this simply and methodically by:
Creating an
Rproject directoryto easily read in data.Downloading a dataset from
Yahoo Financeon Bitcoin historical price data from September 2014 to June 2022.Isolating the
DateandClosecolumns, then transform them to integrate into the Prophet package.Executing the
fit modelandforecast modelto extrapolate Bitcoin’s closing price, one year from June 2022.Creating a vector that estimates the
standardprice forecast,upper boundprice forecast, andlower boundprice forecast.Condensing the new data into a mid-month interval for more efficient interpretation (This is not an essential step, but a
personal preference).Viewing the
final resultsin graphical and tabular form.Drawing a
conclusionbased off the forecasting estimates.
Step 1
Import libraries.
library(readxl) #to extract data from excel file
library(stringr) #to manipulate change string values for model integration
library(dplyr) #to manipulate initial raw dataset
library(prophet) #to implement the times-series forecasting procedure
Step 2
From the Rproject directory, load the downloaded
excel file into a vector; this is the preferred method to
read in datasets for simplicity.2
my_data <- read_excel("BTC-USD.xlsx")
Step 3
Check the initial starting data and ending data to ensure the desired start time and end time, respectively.
knitr::kable(head(my_data))
| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2014-09-17 | 465.864 | 468.174 | 452.422 | 457.334 | 457.334 | 21056800 |
| 2014-09-18 | 456.860 | 456.860 | 413.104 | 424.440 | 424.440 | 34483200 |
| 2014-09-19 | 424.103 | 427.835 | 384.532 | 394.796 | 394.796 | 37919700 |
| 2014-09-20 | 394.673 | 423.296 | 389.883 | 408.904 | 408.904 | 36863600 |
| 2014-09-21 | 408.085 | 412.426 | 393.181 | 398.821 | 398.821 | 26580100 |
| 2014-09-22 | 399.100 | 406.916 | 397.130 | 402.152 | 402.152 | 24127600 |
knitr::kable(tail(my_data))
| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2022-06-14 | 22487.99 | 23018.95 | 20950.82 | 22206.79 | 22206.79 | 50913575242 |
| 2022-06-15 | 22196.73 | 22642.67 | 20178.38 | 22572.84 | 22572.84 | 54912007015 |
| 2022-06-16 | 22576.30 | 22868.92 | 20265.23 | 20381.65 | 20381.65 | 31183975654 |
| 2022-06-17 | 20385.72 | 21243.31 | 20326.52 | 20471.48 | 20471.48 | 27132421514 |
| 2022-06-18 | 20473.43 | 20736.04 | 17708.62 | 19017.64 | 19017.64 | 42009436760 |
| 2022-06-19 | 18936.42 | 19816.04 | 18085.06 | 19473.26 | 19473.26 | 40294412288 |
Step 4
Drop the Open, High, Low,
Adj Close, and Volume columns; retain the
Date & Close columns.
my_data2 = select(my_data, -2,-3,-4,-6,-7)
knitr::kable(head(my_data2))
| Date | Close |
|---|---|
| 2014-09-17 | 457.334 |
| 2014-09-18 | 424.440 |
| 2014-09-19 | 394.796 |
| 2014-09-20 | 408.904 |
| 2014-09-21 | 398.821 |
| 2014-09-22 | 402.152 |
Step 5
Replace Date column title with ds and
Close column title with y. This is a
crucial step in order for the Prophet package to read the
data correctly.
colnames(my_data2)[1] <- "ds"
colnames(my_data2)[2] <- "y"
knitr::kable(head(my_data2))
| ds | y |
|---|---|
| 2014-09-17 | 457.334 |
| 2014-09-18 | 424.440 |
| 2014-09-19 | 394.796 |
| 2014-09-20 | 408.904 |
| 2014-09-21 | 398.821 |
| 2014-09-22 | 402.152 |
Step 6
Replace all null string values with 0; this
is crucial for the Prophet package to read the data
correctly. Note: for this particular dataset the output is unchanged,
but this is a precautionary and redundant step to ensure
proper data cleaning.
my_data2$y <- str_replace_all(my_data2$y, 'null', '0') #replaces all nulls to 0s in column
knitr::kable(head(my_data2))
| ds | y |
|---|---|
| 2014-09-17 | 457.334015 |
| 2014-09-18 | 424.440002 |
| 2014-09-19 | 394.79599 |
| 2014-09-20 | 408.903992 |
| 2014-09-21 | 398.821014 |
| 2014-09-22 | 402.152008 |
Step 7
Call the Prophet f(x) to Fit Model the model into a
vector using daily seasonality…
Model1<-prophet(my_data2, daily.seasonality = TRUE)
…then, call the make_future_dataframe f(x) into another
vector, to forecast 1 year from the dataset’s end date.
Future1<-make_future_dataframe(Model1, periods = (365*1))
Step 8
Create a vector that includes the Date ds, standard
predicted close price yhat, lower estimated close price
yhat_lower, and upper estimated close price
yhat_upper.
Forecast1<-predict(Model1,Future1)
knitr::kable(tail(Forecast1[c('ds','yhat','yhat_lower','yhat_upper')]))
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 3193 | 2023-06-14 | 56000.37 | 45980.83 | 66956.84 |
| 3194 | 2023-06-15 | 56059.31 | 45984.03 | 65940.03 |
| 3195 | 2023-06-16 | 56193.12 | 45449.73 | 66411.94 |
| 3196 | 2023-06-17 | 56337.43 | 45849.43 | 66993.55 |
| 3197 | 2023-06-18 | 56459.55 | 45895.33 | 66801.10 |
| 3198 | 2023-06-19 | 56668.71 | 46441.59 | 67916.79 |
Step 9
Plot the Model Estimates.
dyplot.prophet(Model1,Forecast1)
This is the interactive graphical depiction of the
forecasting model, in all its glory. The thin blue line is
the forecasted price trend, while the dotted black line is
the historical price over time.
prophet_plot_components(Model1,Forecast1)
These individual forecasted components show the seasonal trends captured at daily, weekly, and yearly intervals.
Step 10
Subset the dataframe for the 15th day of each month/year.
d<-data.frame(Forecast1[c('ds','yhat','yhat_lower','yhat_upper')])
Price_1_Year_Table<-d[format.Date(d$ds, "%d")=="15" &
!is.na(d$ds),]
Filter for the range between 2022-06-15 and 2023-06-15, as we are interested in this target year range for a grainier look.
Truncated <- (subset(Price_1_Year_Table, ds>="2022-05-15"))
knitr::kable(Truncated)
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 2829 | 2022-06-15 | 45823.62 | 37197.39 | 52913.46 |
| 2859 | 2022-07-15 | 47799.84 | 39568.29 | 55599.78 |
| 2890 | 2022-08-15 | 50533.95 | 42918.03 | 57702.43 |
| 2921 | 2022-09-15 | 50191.86 | 42165.80 | 57559.06 |
| 2951 | 2022-10-15 | 52078.07 | 44600.84 | 60164.31 |
| 2982 | 2022-11-15 | 53398.11 | 44816.40 | 60532.95 |
| 3012 | 2022-12-15 | 53171.10 | 44832.75 | 61491.97 |
| 3043 | 2023-01-15 | 54677.55 | 46059.34 | 63406.97 |
| 3074 | 2023-02-15 | 56803.39 | 48526.19 | 65578.74 |
| 3102 | 2023-03-15 | 57485.30 | 48178.42 | 66705.18 |
| 3133 | 2023-04-15 | 58459.75 | 49116.34 | 68623.36 |
| 3163 | 2023-05-15 | 57232.28 | 47465.64 | 67053.69 |
| 3194 | 2023-06-15 | 56059.31 | 45984.03 | 65940.03 |
For extra credit, take the means of yhat, yhat lower,
and yhat upper ranges.
knitr::kable(mean(Truncated$yhat))
| x |
|---|
| 53362.63 |
The standard forecasted average for the next year.
knitr::kable(mean(Truncated$yhat_lower))
| x |
|---|
| 44725.34 |
The lower bound forecasted average for the next
year.
knitr::kable(mean(Truncated$yhat_upper))
| x |
|---|
| 61790.15 |
The upper bound forecasted average for the next
year.
Conclusion
Instinctively, the forecasted results indicate a sensible spread
between projected Bitcoin closing price estimates, in the coming year.
However, contrary to the package title, this is not an 100%
future-predicting oracle analysis and should be
considered with caution, given Bitcoin’s market volatility.
Also, always consider black swan events.
Considering Bitcoin’s use case as an asset, this regression
model bakes in some considerations as
retail and institutional buying habits over time, factoring
seasons and holidays, along with network difficulty (how
difficult it is to mine a Bitcoin block, which offers staggered
correlation to Bitcoin price)3.
However, this forecasting model does not necessarily account for the
Federal Funds Rate changes with fidelity. A
multiple linear regression analysis4 might be a better
technique to pinpoint the significance in Federal Funds Rate dynamics on
Bitcoin’s value.
Ultimately, this exercise demonstrates one tool in how one may project
future seasonal trends in securities, commodities, or other
non-monetary metrics such as: quantity of shoppers in a
retail store, quantity of tickets sold, number of cars sold, or
projected sale prices for merchandise. The key component to any analysis
is accurate and well-scrubbed data entry.
For any questions or comments on this analysis please contact kris.kilgroe@sigmacode.io.
P.S.
Trust the math!
Session Info
sessionInfo()
#> R version 4.2.0 (2022-04-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Linux Mint 20.3
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] prophet_1.0 rlang_1.0.2 Rcpp_1.0.8.3 dplyr_1.0.9 stringr_1.4.0
#> [6] readxl_1.4.0
#>
#> loaded via a namespace (and not attached):
#> [1] lubridate_1.8.0 lattice_0.20-45 tidyr_1.2.0
#> [4] prettyunits_1.1.1 ps_1.7.1 zoo_1.8-10
#> [7] assertthat_0.2.1 digest_0.6.29 utf8_1.2.2
#> [10] R6_2.5.1 cellranger_1.1.0 stats4_4.2.0
#> [13] evaluate_0.15 ggplot2_3.3.6 highr_0.9
#> [16] pillar_1.7.0 rstudioapi_0.13 callr_3.7.0
#> [19] jquerylib_0.1.4 rmarkdown_2.14 labeling_0.4.2
#> [22] htmlwidgets_1.5.4 loo_2.5.1 munsell_0.5.0
#> [25] compiler_4.2.0 xfun_0.31 rstan_2.21.5
#> [28] pkgconfig_2.0.3 pkgbuild_1.3.1 htmltools_0.5.2
#> [31] tidyselect_1.1.2 tibble_3.1.7 gridExtra_2.3
#> [34] bookdown_0.27 codetools_0.2-18 matrixStats_0.62.0
#> [37] fansi_1.0.3 crayon_1.5.1 grid_4.2.0
#> [40] jsonlite_1.8.0 gtable_0.3.0 lifecycle_1.0.1
#> [43] DBI_1.1.3 magrittr_2.0.3 StanHeaders_2.21.0-7
#> [46] scales_1.2.0 rmdformats_1.0.4 RcppParallel_5.1.5
#> [49] cli_3.3.0 stringi_1.7.6 farver_2.1.0
#> [52] bslib_0.3.1 dygraphs_1.1.1.6 ellipsis_0.3.2
#> [55] xts_0.12.1 generics_0.1.2 vctrs_0.4.1
#> [58] tools_4.2.0 glue_1.6.2 purrr_0.3.4
#> [61] processx_3.6.1 parallel_4.2.0 fastmap_1.1.0
#> [64] yaml_2.3.5 inline_0.3.19 colorspace_2.0-3
#> [67] extraDistr_1.9.1 knitr_1.39 sass_0.4.1
Footnotes
The Prophet
white papermay be viewed at: https://peerj.com/preprints/3190/ for those interested.↩︎Data pulled from Yahoo Finance: https://finance.yahoo.com/quote/BTC-USD/↩︎
How Mining Difficulty Affects Bitcoin Price: https://medium.com/@datalightme/how-mining-difficulty-affects-bitcoin-price-a6a6e1eeb39d↩︎
Technical case study in
multiple linear regression analysis: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3049417/↩︎