Disclaimer
This is not financial advice. This is a
demonstration of multiple linear regression applied to
Bitcoin Price.
Introduction
Multiple linear regression is a powerful technique in
determining the correlation between factors influencing outputs such as
sales or advertising revenue,
attendance, sports performance,
medical statistics, among other items.
Today, we will apply multiple linear regression to
BTC Price valued in USD (the
dependent variable)1 , in order to determine significant
correlating factors.
Let’s consider the following independent variables:
BTC Mining Difficulty2BTC.Mining.Profitability3BTC.Atm.Count.US4Federal Funds Rateas a percentage5M1 Money supplyin billions6Global Inflation Rate7
I created a custom dataset8 compiling all of the
monthly intervals for the aforementioned data, with the data range
June 2017 to December 2021.
I chose these variables out of curiosity to see how a mix of
Bitcoin-specific variables (mining difficulty and
mining profitability), Bitcoin ATM count,
along with macroeconomic variables (including the recently publicized
Federal Funds Rate) may or may not be significant.
Although I chose these variables based on some common knowledge and
my perspective on the Bitcoin ecosystem, out of strict
objectivity I am initially agnostic to which
independent variables will or will not provide any
correlation to Bitcoin Price.
Process
We will conduct a classic
multivariable regression analysis in sequence by
importingandcleaningthe data- splitting the data into
trainingandtest(validation) sets building the modeltesting assumptionsand the model’s accuracy- drawing sensible
conclusionsfor the model’s efficacy
Objective:
Predict BTC Price for January 2022 (one month past the data
end date)
Section 1: Data Cleaning
Import libraries.
library(readxl) #to read in excel file dataset
library(tidyverse) #for required functions to transform data (stemming from dplyr & ggplot packages)
library(caret) #for training and holdout sets
library(broom) #for working with the model output
library(corrplot) #to construct the required correlation matrix
library(car) #to augment the correlation matrix assumption testing
library(scatterplot3d) #for 3D plotting
library(olsrr) #for checking residuals for normal distribution
library(lmtest) #for checking Homoscedasticity
library(kableExtra) #to enhance the kable package
Read in the data.
BTC <- read_excel("data.xlsx")
Check the row count and for n/a or null
values.
kbl(nrow(BTC))
is.na(BTC)
is.null(BTC)
The output shows 55 rows and no missing values.
To ensure the achievement of homoscedasticity (a
required model testing assumption after we build our
model), transform BTC.Price logarithmically.
BTC$BTC.Price.Log = log(BTC$BTC.Price)
#reorder columns
BTC <- BTC %>% relocate("BTC.Price.Log", .before = "BTC.Mining.Difficulty")
We will retain BTC.Price for reference, but will not
employ it in the linear regression model analysis as the independent
variable. The transformed log column for BTC.Price will serve as the
independent variable.
A good practice is to delete any outliers that may skew
the data. Employing the Cook's Distance technique will
pinpoint and delete outliers.9
mod <- lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability
+ BTC.Atm.Count.US + Fed.Fund.Rate + M1.Money.Supply
+ Global.Inflation.Rate, data = BTC)
BTC$cooksd <- cooks.distance(mod)
#define outliers based on 4/n criteria
BTC$outlier <- ifelse(BTC$cooksd < 4/nrow(BTC), "keep","delete")
#inspect the dataset, be sure the range is 1:55 for all observations
BTC[1:55, ]
## # A tibble: 55 × 11
## `Month-Year` BTC.Price BTC.Price.Log BTC.Mining.Difficulty BTC.Mining.Profit…
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Jun 2017 2547. 7.84 0.71 1.21
## 2 Jul 2017 2719. 7.91 0.86 0.88
## 3 Aug 2017 4676. 8.45 0.89 1.62
## 4 Sep 2017 4208. 8.34 1.1 0.84
## 5 Oct 2017 6131. 8.72 1.45 1.11
## 6 Nov 2017 10543. 9.26 1.35 2.08
## 7 Dec 2017 13062. 9.48 1.87 2.28
## 8 Jan 2018 9914. 9.20 2.6 0.91
## 9 Feb 2018 10903. 9.30 3.01 0.85
## 10 Mar 2018 6970. 8.85 3.46 0.46
## # … with 45 more rows, and 6 more variables: BTC.Atm.Count.US <dbl>,
## # Fed.Fund.Rate <dbl>, M1.Money.Supply <dbl>, Global.Inflation.Rate <dbl>,
## # cooksd <dbl>, outlier <chr>
This is confirmed with A tibble: 55 x 11 output.
Display the results showing the outliers.
df_BTC_cooksd <-BTC[1:55, ]
df_BTC_cooksd %>%
kbl() %>%
kable_styling() %>%
row_spec(which(df_BTC_cooksd$outlier == "delete"), bold = T,
color = "white", background = "red")
| Month-Year | BTC.Price | BTC.Price.Log | BTC.Mining.Difficulty | BTC.Mining.Profitability | BTC.Atm.Count.US | Fed.Fund.Rate | M1.Money.Supply | Global.Inflation.Rate | cooksd | outlier |
|---|---|---|---|---|---|---|---|---|---|---|
| Jun 2017 | 2547.32 | 7.842797 | 0.71 | 1.21 | 766.0 | 1.04 | 3524.9 | 3.23 | 0.1271788 | delete |
| Jul 2017 | 2718.62 | 7.907880 | 0.86 | 0.88 | 826.0 | 1.15 | 3548.6 | 3.23 | 0.0497825 | keep |
| Aug 2017 | 4676.14 | 8.450228 | 0.89 | 1.62 | 853.0 | 1.16 | 3587.8 | 3.23 | 0.0611313 | keep |
| Sep 2017 | 4207.83 | 8.344702 | 1.10 | 0.84 | 908.0 | 1.15 | 3571.2 | 3.23 | 0.0000132 | keep |
| Oct 2017 | 6130.56 | 8.721041 | 1.45 | 1.11 | 948.0 | 1.15 | 3606.3 | 3.23 | 0.0053589 | keep |
| Nov 2017 | 10542.66 | 9.263185 | 1.35 | 2.08 | 1052.0 | 1.16 | 3631.0 | 3.23 | 0.0013759 | keep |
| Dec 2017 | 13062.15 | 9.477474 | 1.87 | 2.28 | 1125.0 | 1.30 | 3610.6 | 3.23 | 0.0068056 | keep |
| Jan 2018 | 9914.47 | 9.201751 | 2.60 | 0.91 | 1216.0 | 1.41 | 3656.2 | 3.59 | 0.0234163 | keep |
| Feb 2018 | 10903.02 | 9.296795 | 3.01 | 0.85 | 1319.0 | 1.42 | 3619.0 | 3.59 | 0.0354220 | keep |
| Mar 2018 | 6970.36 | 8.849422 | 3.46 | 0.46 | 1475.0 | 1.51 | 3663.4 | 3.59 | 0.0141478 | keep |
| Apr 2018 | 9268.72 | 9.134401 | 4.02 | 0.62 | 1624.0 | 1.69 | 3655.2 | 3.59 | 0.0143643 | keep |
| May 2018 | 7363.28 | 8.904261 | 4.31 | 0.44 | 1752.0 | 1.70 | 3650.8 | 3.59 | 0.0077092 | keep |
| Jun 2018 | 6189.31 | 8.730579 | 5.08 | 0.31 | 1880.0 | 1.82 | 3652.2 | 3.59 | 0.0010366 | keep |
| Jul 2018 | 8166.39 | 9.007782 | 5.95 | 0.34 | 1982.0 | 1.91 | 3677.5 | 3.59 | 0.0062702 | keep |
| Aug 2018 | 6938.91 | 8.844900 | 6.73 | 0.28 | 1988.0 | 1.91 | 3693.2 | 3.59 | 0.0006102 | keep |
| Sep 2018 | 6576.61 | 8.791275 | 7.15 | 0.25 | 2066.0 | 1.95 | 3701.8 | 3.59 | 0.0000162 | keep |
| Oct 2018 | 6276.46 | 8.744561 | 7.18 | 0.23 | 2119.0 | 2.19 | 3728.2 | 3.59 | 0.0002029 | keep |
| Nov 2018 | 3952.45 | 8.282091 | 6.65 | 0.15 | 2183.0 | 2.20 | 3705.6 | 3.59 | 0.0196632 | keep |
| Dec 2018 | 3689.56 | 8.213263 | 5.62 | 0.17 | 2248.0 | 2.27 | 3763.4 | 3.59 | 0.0246806 | keep |
| Jan 2019 | 3441.03 | 8.143526 | 5.81 | 0.17 | 2301.0 | 2.40 | 3754.8 | 3.47 | 0.0299279 | keep |
| Feb 2019 | 3772.94 | 8.235610 | 6.07 | 0.16 | 2428.0 | 2.40 | 3762.9 | 3.47 | 0.0195361 | keep |
| Mar 2019 | 4081.22 | 8.314151 | 6.38 | 0.17 | 2487.0 | 2.41 | 3733.6 | 3.47 | 0.0142992 | keep |
| Apr 2019 | 5151.43 | 8.547030 | 6.35 | 0.20 | 2602.0 | 2.42 | 3775.0 | 3.47 | 0.0011489 | keep |
| May 2019 | 8287.04 | 9.022448 | 7.46 | 0.27 | 2760.0 | 2.39 | 3783.2 | 3.47 | 0.0100899 | keep |
| Jun 2019 | 12024.08 | 9.394667 | 7.93 | 0.37 | 2915.0 | 2.38 | 3824.4 | 3.47 | 0.0462735 | keep |
| Jul 2019 | 9572.74 | 9.166675 | 9.01 | 0.32 | 3061.0 | 2.40 | 3856.8 | 3.47 | 0.0107447 | keep |
| Aug 2019 | 9590.74 | 9.168553 | 10.18 | 0.25 | 3271.0 | 2.13 | 3858.9 | 3.47 | 0.0055603 | keep |
| Sep 2019 | 8085.71 | 8.997854 | 12.76 | 0.16 | 3332.0 | 2.04 | 3900.0 | 3.47 | 0.0016219 | keep |
| Oct 2019 | 9226.00 | 9.129781 | 13.69 | 0.16 | 3508.0 | 1.83 | 3935.3 | 3.47 | 0.0000842 | keep |
| Nov 2019 | 7729.27 | 8.952770 | 12.97 | 0.14 | 3752.0 | 1.55 | 3955.3 | 3.47 | 0.0015848 | keep |
| Dec 2019 | 7251.28 | 8.888933 | 12.95 | 0.14 | 3943.0 | 1.55 | 4007.1 | 3.47 | 0.0041594 | keep |
| Jan 2020 | 9545.08 | 9.163781 | 15.47 | 0.15 | 4199.0 | 1.55 | 3993.6 | 3.18 | 0.0053137 | keep |
| Feb 2020 | 8778.47 | 9.080057 | 15.49 | 0.13 | 4548.0 | 1.58 | 4002.9 | 3.18 | 0.0008524 | keep |
| Mar 2020 | 6483.74 | 8.777053 | 13.91 | 0.12 | 4859.0 | 0.65 | 4261.9 | 3.18 | 0.0008725 | keep |
| Apr 2020 | 8773.11 | 9.079447 | 15.96 | 0.15 | 5222.0 | 0.05 | 4779.8 | 3.18 | 0.0262619 | keep |
| May 2020 | 9688.32 | 9.178676 | 15.14 | 0.08 | 5535.0 | 0.05 | 16232.9 | 3.18 | 0.0000022 | keep |
| Jun 2020 | 9188.06 | 9.125660 | 15.78 | 0.07 | 5863.0 | 0.08 | 16563.5 | 3.18 | 0.0035458 | keep |
| Jul 2020 | 11118.92 | 9.316403 | 16.85 | 0.11 | 6320.0 | 0.09 | 16763.8 | 3.18 | 0.0004197 | keep |
| Aug 2020 | 11657.00 | 9.363662 | 17.56 | 0.09 | 6692.0 | 0.10 | 16896.7 | 3.18 | 0.0004485 | keep |
| Sep 2020 | 10764.28 | 9.283988 | 19.31 | 0.08 | 7396.0 | 0.09 | 17161.5 | 3.18 | 0.0192599 | keep |
| Oct 2020 | 13573.71 | 9.515890 | 20.00 | 0.13 | 8323.0 | 0.09 | 17365.3 | 3.18 | 0.0049156 | keep |
| Nov 2020 | 18114.41 | 9.804463 | 19.16 | 0.12 | 9096.0 | 0.09 | 17626.7 | 3.18 | 0.0159129 | keep |
| Dec 2020 | 28768.84 | 10.267048 | 18.60 | 0.22 | 26722.7 | 0.09 | 17834.5 | 3.18 | 0.1374381 | delete |
| Jan 2021 | 34622.37 | 10.452255 | 20.82 | 0.24 | 44349.4 | 0.09 | 18107.1 | 4.35 | 0.0541379 | keep |
| Feb 2021 | 46642.61 | 10.750270 | 21.72 | 0.27 | 61976.1 | 0.08 | 18367.8 | 4.35 | 0.0018537 | keep |
| Mar 2021 | 58734.48 | 10.980782 | 21.87 | 0.39 | 79602.8 | 0.07 | 18641.4 | 4.35 | 0.0024732 | keep |
| Apr 2021 | 53260.30 | 10.882946 | 23.58 | 0.36 | 97229.5 | 0.07 | 18927.7 | 4.35 | 0.0083401 | keep |
| May 2021 | 35749.66 | 10.484296 | 21.05 | 0.23 | 114856.2 | 0.06 | 19259.0 | 4.35 | 0.0125123 | keep |
| Jun 2021 | 35945.79 | 10.489767 | 19.93 | 0.24 | 132482.9 | 0.08 | 19318.8 | 4.35 | 0.0045121 | keep |
| Jul 2021 | 41157.15 | 10.625153 | 14.49 | 0.36 | 150109.6 | 0.10 | 19497.1 | 4.35 | 0.0412099 | keep |
| Aug 2021 | 47663.02 | 10.771911 | 17.62 | 0.35 | 167736.3 | 0.09 | 19746.0 | 4.35 | 0.0134071 | keep |
| Sep 2021 | 41412.61 | 10.631341 | 18.99 | 0.34 | 185363.0 | 0.08 | 19898.4 | 4.35 | 0.0001254 | keep |
| Oct 2021 | 61374.28 | 11.024746 | 21.66 | 0.36 | 202989.7 | 0.08 | 20061.8 | 4.35 | 0.0081369 | keep |
| Nov 2021 | 57834.36 | 10.965338 | 22.34 | 0.30 | 220616.4 | 0.08 | 20273.9 | 4.35 | 0.0025818 | keep |
| Dec 2021 | 47128.47 | 10.760633 | 24.27 | 0.29 | 238243.1 | 0.08 | 20423.6 | 4.35 | 0.0856546 | delete |
We find three identified outliers in the added outlier
column.
Next, we will delete the outliers from the dataset.
#redundant check of outlier count
outliers <- df_BTC_cooksd == 'delete'
sum(outliers)
## [1] 3
#remove the three outliers from the dataframe
BTC2 <- subset(BTC, outlier != "delete")
#verify that there are 52 observations remaining
nrow(BTC2)
## [1] 52
#Delete extraneous columns that resulted from the Cook's Distance formula
BTC3 <- BTC2[c(-10,-11)]
This is our complete dataset in all it’s glory.
BTC3 %>%
kbl() %>%
kable_styling()
| Month-Year | BTC.Price | BTC.Price.Log | BTC.Mining.Difficulty | BTC.Mining.Profitability | BTC.Atm.Count.US | Fed.Fund.Rate | M1.Money.Supply | Global.Inflation.Rate |
|---|---|---|---|---|---|---|---|---|
| Jul 2017 | 2718.62 | 7.907880 | 0.86 | 0.88 | 826.0 | 1.15 | 3548.6 | 3.23 |
| Aug 2017 | 4676.14 | 8.450228 | 0.89 | 1.62 | 853.0 | 1.16 | 3587.8 | 3.23 |
| Sep 2017 | 4207.83 | 8.344702 | 1.10 | 0.84 | 908.0 | 1.15 | 3571.2 | 3.23 |
| Oct 2017 | 6130.56 | 8.721041 | 1.45 | 1.11 | 948.0 | 1.15 | 3606.3 | 3.23 |
| Nov 2017 | 10542.66 | 9.263185 | 1.35 | 2.08 | 1052.0 | 1.16 | 3631.0 | 3.23 |
| Dec 2017 | 13062.15 | 9.477474 | 1.87 | 2.28 | 1125.0 | 1.30 | 3610.6 | 3.23 |
| Jan 2018 | 9914.47 | 9.201751 | 2.60 | 0.91 | 1216.0 | 1.41 | 3656.2 | 3.59 |
| Feb 2018 | 10903.02 | 9.296795 | 3.01 | 0.85 | 1319.0 | 1.42 | 3619.0 | 3.59 |
| Mar 2018 | 6970.36 | 8.849422 | 3.46 | 0.46 | 1475.0 | 1.51 | 3663.4 | 3.59 |
| Apr 2018 | 9268.72 | 9.134401 | 4.02 | 0.62 | 1624.0 | 1.69 | 3655.2 | 3.59 |
| May 2018 | 7363.28 | 8.904261 | 4.31 | 0.44 | 1752.0 | 1.70 | 3650.8 | 3.59 |
| Jun 2018 | 6189.31 | 8.730579 | 5.08 | 0.31 | 1880.0 | 1.82 | 3652.2 | 3.59 |
| Jul 2018 | 8166.39 | 9.007782 | 5.95 | 0.34 | 1982.0 | 1.91 | 3677.5 | 3.59 |
| Aug 2018 | 6938.91 | 8.844900 | 6.73 | 0.28 | 1988.0 | 1.91 | 3693.2 | 3.59 |
| Sep 2018 | 6576.61 | 8.791275 | 7.15 | 0.25 | 2066.0 | 1.95 | 3701.8 | 3.59 |
| Oct 2018 | 6276.46 | 8.744561 | 7.18 | 0.23 | 2119.0 | 2.19 | 3728.2 | 3.59 |
| Nov 2018 | 3952.45 | 8.282091 | 6.65 | 0.15 | 2183.0 | 2.20 | 3705.6 | 3.59 |
| Dec 2018 | 3689.56 | 8.213263 | 5.62 | 0.17 | 2248.0 | 2.27 | 3763.4 | 3.59 |
| Jan 2019 | 3441.03 | 8.143526 | 5.81 | 0.17 | 2301.0 | 2.40 | 3754.8 | 3.47 |
| Feb 2019 | 3772.94 | 8.235610 | 6.07 | 0.16 | 2428.0 | 2.40 | 3762.9 | 3.47 |
| Mar 2019 | 4081.22 | 8.314151 | 6.38 | 0.17 | 2487.0 | 2.41 | 3733.6 | 3.47 |
| Apr 2019 | 5151.43 | 8.547030 | 6.35 | 0.20 | 2602.0 | 2.42 | 3775.0 | 3.47 |
| May 2019 | 8287.04 | 9.022448 | 7.46 | 0.27 | 2760.0 | 2.39 | 3783.2 | 3.47 |
| Jun 2019 | 12024.08 | 9.394667 | 7.93 | 0.37 | 2915.0 | 2.38 | 3824.4 | 3.47 |
| Jul 2019 | 9572.74 | 9.166675 | 9.01 | 0.32 | 3061.0 | 2.40 | 3856.8 | 3.47 |
| Aug 2019 | 9590.74 | 9.168553 | 10.18 | 0.25 | 3271.0 | 2.13 | 3858.9 | 3.47 |
| Sep 2019 | 8085.71 | 8.997854 | 12.76 | 0.16 | 3332.0 | 2.04 | 3900.0 | 3.47 |
| Oct 2019 | 9226.00 | 9.129781 | 13.69 | 0.16 | 3508.0 | 1.83 | 3935.3 | 3.47 |
| Nov 2019 | 7729.27 | 8.952770 | 12.97 | 0.14 | 3752.0 | 1.55 | 3955.3 | 3.47 |
| Dec 2019 | 7251.28 | 8.888933 | 12.95 | 0.14 | 3943.0 | 1.55 | 4007.1 | 3.47 |
| Jan 2020 | 9545.08 | 9.163781 | 15.47 | 0.15 | 4199.0 | 1.55 | 3993.6 | 3.18 |
| Feb 2020 | 8778.47 | 9.080057 | 15.49 | 0.13 | 4548.0 | 1.58 | 4002.9 | 3.18 |
| Mar 2020 | 6483.74 | 8.777053 | 13.91 | 0.12 | 4859.0 | 0.65 | 4261.9 | 3.18 |
| Apr 2020 | 8773.11 | 9.079447 | 15.96 | 0.15 | 5222.0 | 0.05 | 4779.8 | 3.18 |
| May 2020 | 9688.32 | 9.178676 | 15.14 | 0.08 | 5535.0 | 0.05 | 16232.9 | 3.18 |
| Jun 2020 | 9188.06 | 9.125660 | 15.78 | 0.07 | 5863.0 | 0.08 | 16563.5 | 3.18 |
| Jul 2020 | 11118.92 | 9.316403 | 16.85 | 0.11 | 6320.0 | 0.09 | 16763.8 | 3.18 |
| Aug 2020 | 11657.00 | 9.363662 | 17.56 | 0.09 | 6692.0 | 0.10 | 16896.7 | 3.18 |
| Sep 2020 | 10764.28 | 9.283988 | 19.31 | 0.08 | 7396.0 | 0.09 | 17161.5 | 3.18 |
| Oct 2020 | 13573.71 | 9.515890 | 20.00 | 0.13 | 8323.0 | 0.09 | 17365.3 | 3.18 |
| Nov 2020 | 18114.41 | 9.804463 | 19.16 | 0.12 | 9096.0 | 0.09 | 17626.7 | 3.18 |
| Jan 2021 | 34622.37 | 10.452255 | 20.82 | 0.24 | 44349.4 | 0.09 | 18107.1 | 4.35 |
| Feb 2021 | 46642.61 | 10.750270 | 21.72 | 0.27 | 61976.1 | 0.08 | 18367.8 | 4.35 |
| Mar 2021 | 58734.48 | 10.980782 | 21.87 | 0.39 | 79602.8 | 0.07 | 18641.4 | 4.35 |
| Apr 2021 | 53260.30 | 10.882946 | 23.58 | 0.36 | 97229.5 | 0.07 | 18927.7 | 4.35 |
| May 2021 | 35749.66 | 10.484296 | 21.05 | 0.23 | 114856.2 | 0.06 | 19259.0 | 4.35 |
| Jun 2021 | 35945.79 | 10.489767 | 19.93 | 0.24 | 132482.9 | 0.08 | 19318.8 | 4.35 |
| Jul 2021 | 41157.15 | 10.625153 | 14.49 | 0.36 | 150109.6 | 0.10 | 19497.1 | 4.35 |
| Aug 2021 | 47663.02 | 10.771911 | 17.62 | 0.35 | 167736.3 | 0.09 | 19746.0 | 4.35 |
| Sep 2021 | 41412.61 | 10.631341 | 18.99 | 0.34 | 185363.0 | 0.08 | 19898.4 | 4.35 |
| Oct 2021 | 61374.28 | 11.024746 | 21.66 | 0.36 | 202989.7 | 0.08 | 20061.8 | 4.35 |
| Nov 2021 | 57834.36 | 10.965338 | 22.34 | 0.30 | 220616.4 | 0.08 | 20273.9 | 4.35 |
We are now ready to to make our Training and
Test sets.
Section 2: Training and Test Sets
Creating training and holdout sets is mandatory to ensure the model
is fits the data well.
We use the training set to develop the model, then we will apply the
model to the test set in order to determine the model’s efficacy.10
Create the training and test sets.
#make the procedure reproducible
set.seed(1)
#use 70% of dataset as training set and 30% as test set
sample <- sample(c(TRUE, FALSE), nrow(BTC3), replace=TRUE, prob=c(0.7,0.3))
train <- BTC3[sample, ]
test <- BTC3[!sample, ]
We will return to the test dataset after we build our model with the
training data first.
We confirm that the spread is 70% allocated to the
training set and 30% allocated to the test
set.
nrow(train)
## [1] 35
nrow(test)
## [1] 17
#Training set
35/52
## [1] 0.6730769
#Test Set
17/52
## [1] 0.3269231
Close enough. Now, let’s test for the data’s integrity.
Section 3: Multicollinearity & Variance Inflation Factor
Multicollinearity is the phenomena where multiple
independent variables are correlated, or influence each other.11 This is
a problem because predictor variables should have no correlation or as
little correlation as possible. We will test for this.
#subset the data only for the `independent variables`
train2 <- train[c(-1,-2, -3)]
par(mfrow=c(1,1))
#create a numeric vector
datamatrix <- cor(train2)
#make the correlation matrix
corrplot(datamatrix, method ='number')
On a cursory level, BTC.Mining.Difficulty,
M1.Money.Supply, BTC.Atm.Count.US and
Global.Inflation.Rate seem to have moderate to high
correlation between other variables. Should we delete any of these
predictor variables at this point?
To determine which variables to scrap, if needed, we employ
Variance Inflation Factor (VIF) for augmented analysis.12
#be sure to revert to the original training set to reincorporate the independent variable into the model
VIF<-vif(lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability + BTC.Atm.Count.US
+ Fed.Fund.Rate + M1.Money.Supply + Global.Inflation.Rate, data = train))
VIF %>%
kbl() %>%
kable_styling()
| x | |
|---|---|
| BTC.Mining.Difficulty | 5.160656 |
| BTC.Mining.Profitability | 2.311268 |
| BTC.Atm.Count.US | 4.805880 |
| Fed.Fund.Rate | 4.974582 |
| M1.Money.Supply | 8.509970 |
| Global.Inflation.Rate | 3.940872 |
The VIF should be less than 10. A VIF greater than 10 indicates
severe multicollinearity. Since all variables are under 10
(for some barely), we will retain all variables at this moment.13
On to the good stuff, let’s build the
linear regression model!
Section 4: Modeling
Calculate the linear regression model.
model<-lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability + BTC.Atm.Count.US
+ Fed.Fund.Rate + M1.Money.Supply + Global.Inflation.Rate, data = train)
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability +
## BTC.Atm.Count.US + Fed.Fund.Rate + M1.Money.Supply + Global.Inflation.Rate,
## data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.54677 -0.14387 0.03012 0.13608 0.42740
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.714e+00 6.619e-01 7.122 9.49e-08 ***
## BTC.Mining.Difficulty 7.520e-02 1.358e-02 5.538 6.38e-06 ***
## BTC.Mining.Profitability 5.905e-01 1.512e-01 3.906 0.00054 ***
## BTC.Atm.Count.US 6.591e-07 1.634e-06 0.403 0.68974
## Fed.Fund.Rate 1.090e-01 1.052e-01 1.037 0.30864
## M1.Money.Supply 1.757e-05 1.795e-05 0.979 0.33619
## Global.Inflation.Rate 8.961e-01 1.993e-01 4.496 0.00011 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2432 on 28 degrees of freedom
## Multiple R-squared: 0.9256, Adjusted R-squared: 0.9096
## F-statistic: 58.02 on 6 and 28 DF, p-value: 1.667e-14
Since the BTC.Atm.Count.US is the greatest
non-significant variable (with a p-value > 0.05), we will delete this
variable.
With the deleted predictor variable, we run the process again.
model<-lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability + Fed.Fund.Rate
+ M1.Money.Supply + Global.Inflation.Rate, data = train)
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability +
## Fed.Fund.Rate + M1.Money.Supply + Global.Inflation.Rate,
## data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.5430 -0.1556 0.0394 0.1362 0.4288
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.531e+00 4.745e-01 9.549 1.86e-10 ***
## BTC.Mining.Difficulty 7.566e-02 1.333e-02 5.674 3.91e-06 ***
## BTC.Mining.Profitability 5.987e-01 1.476e-01 4.056 0.000344 ***
## Fed.Fund.Rate 1.057e-01 1.033e-01 1.023 0.314770
## M1.Money.Supply 1.905e-05 1.732e-05 1.100 0.280297
## Global.Inflation.Rate 9.478e-01 1.504e-01 6.303 6.97e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2396 on 29 degrees of freedom
## Multiple R-squared: 0.9251, Adjusted R-squared: 0.9122
## F-statistic: 71.66 on 5 and 29 DF, p-value: 2.007e-15
Since the Federal.Fund.Rate is the next largest
non-significant variable (with a p-value > 0.05), we will delete this
variable.
With the deleted predictor variable, we run the process again.
model<-lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability + M1.Money.Supply
+ Global.Inflation.Rate, data = train)
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability +
## M1.Money.Supply + Global.Inflation.Rate, data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.51257 -0.13521 0.02273 0.14103 0.49358
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.598e+00 4.703e-01 9.777 7.71e-11 ***
## BTC.Mining.Difficulty 7.148e-02 1.270e-02 5.627 3.97e-06 ***
## BTC.Mining.Profitability 5.288e-01 1.310e-01 4.038 0.000344 ***
## M1.Money.Supply 7.292e-06 1.296e-05 0.563 0.577871
## Global.Inflation.Rate 1.012e+00 1.367e-01 7.406 2.98e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2398 on 30 degrees of freedom
## Multiple R-squared: 0.9224, Adjusted R-squared: 0.9121
## F-statistic: 89.18 on 4 and 30 DF, p-value: 3.29e-16
Since the M1.Money.Supply is the next least significant
variable (with a p-value > 0.05), we will delete this variable.
With the deleted predictor variable, we run the process again.
model<-lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability
+ Global.Inflation.Rate, data = train)
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability +
## Global.Inflation.Rate, data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4913 -0.1487 0.0283 0.1471 0.4853
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.443317 0.376956 11.787 5.50e-13 ***
## BTC.Mining.Difficulty 0.076886 0.008222 9.351 1.55e-10 ***
## BTC.Mining.Profitability 0.557522 0.119266 4.675 5.45e-05 ***
## Global.Inflation.Rate 1.051885 0.115775 9.086 3.01e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2372 on 31 degrees of freedom
## Multiple R-squared: 0.9216, Adjusted R-squared: 0.914
## F-statistic: 121.5 on 3 and 31 DF, p-value: < 2.2e-16
Executive Decision: I am not convinced that the
Global.Inflation.Rate variable is a good metric in this
particular instance. The reason being: I was not able to
capture a month-to-month assessment in keeping with the
data; instead, I scraped a blanket average for each year,
given the resources at hand.14
Global Inflation Rate may fluctuate month-to-month.
While it may have an attractive p-value for this model, the
variable is not trustworthy due to
my lack of capturing better granualarity,
inconsistent with the other remaining predictor
variables.
Thus, due to how I perceive the Global Inflation Rate
interacts with the data, I am going to delete this variable, leaving
only BTC.Mining.Difficulty and
BTC.Mining.Profitability as the star predictor
variables.
model<-lm(BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability, data = train)
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability,
## data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.62588 -0.42858 0.09613 0.35924 0.96722
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.69717 0.22158 34.738 < 2e-16 ***
## BTC.Mining.Difficulty 0.11588 0.01321 8.771 5.06e-10 ***
## BTC.Mining.Profitability 0.78244 0.21977 3.560 0.00118 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4468 on 32 degrees of freedom
## Multiple R-squared: 0.7129, Adjusted R-squared: 0.6949
## F-statistic: 39.72 on 2 and 32 DF, p-value: 2.136e-09
print(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability,
## data = train)
##
## Coefficients:
## (Intercept) BTC.Mining.Difficulty BTC.Mining.Profitability
## 7.6972 0.1159 0.7824
Thus, the significant independent variables that we will retain are:
BTC.Mining.Difficulty and
BTC.Mining.Profitability.
\({R^2}\) is 0.7129, which means that the model created explains 71.29% of the variability in the given test data set.
Our model follows the form \(Y_{0}\) = \(\beta_{0}\) + \(\beta_{1}\)\(X_{i}\) + \(\beta_{2}\)\(X_{i}\) + \(\varepsilon_{i}\)
Which transforms into:
BTC.Price.Log = 7.69 + 0.12\(X_{BTC.Mining.Difficulty}\) + 0.78\(X_{BTC.Mining.Profitability}\)
7.69 is the equation constant intercept.
A 0.12 increase in BTC.Price.Log is associated with a
BTC.Mining.Difficulty increase by a factor of 1.
A 0.78 increase in BTC.Price.Log is associated with a
BTC.Mining.Profitability increase by a factor of 1.
Section 5: Testing Assumptions
Check for homoscedasticity
#visual look
par(mfrow=c(1,1))
plot(model)
At a cursory glance, although there is not very much data provided in
the training set (35 observations), the data seems to meet the
assumption, with a seemingly
constant scatter through the residual output.
However, just to be sure, verify with the Breusch-Pagan
test
lmtest::bptest(model)
##
## studentized Breusch-Pagan test
##
## data: model
## BP = 1.4359, df = 2, p-value = 0.4878
Since the null hypothesis of the Breusch-Pagan test is
that heteroscedastiy is present, and the p value > 0.05
(not significant), we reject the null hypothesis and
determine that the model’s residuals are homoscedastic.
Check for normal distribution
ols_plot_resid_hist(model)
Most of the residuals fall around zero, and the number of observations
in the tails of the histogram is low, therefore the regression model is
fairly normally distributed. For OCD types wanting a picture perfect
bell-shaped curve spread, calm down, looks can be deceiving.
Let’s implement a Moneyball approach with some hard
math.
ols_test_normality(model)
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.9379 0.0482
## Kolmogorov-Smirnov 0.12 0.6507
## Cramer-von Mises 4.0129 0.0000
## Anderson-Darling 0.77 0.0411
## -----------------------------------------------
Since the number of observations in our test dataset are less than
50, we use Shapiro-Wilk test. Shapiro-Wilk
shows that the p-value is less than 0.05, meaning we
reject the null hypothesis that the residuals are not
normally distributed, and rather that the residuals are indeed normally
distributed. This model is normally distributed (by a hair), but that’s
good for our purposes. It gets on base.
Assess multicollinearity again.
remaining_variables <- train[c(4,5)]
par(mfrow=c(1,1))
datamatrix <- cor(remaining_variables)
corrplot(datamatrix, method ='number')
We confirm no outstanding multicollinearity with the remaining dependent variables in the model.
Additionally, they are moderately negatively correlated. A
decrease in mining difficulty indicates
an increase in mining profitability, and an
increase in mining difficulty indicates
a decrease in mining profitability.
Independence of Errors: the residuals indicate that the
noise terms are independent (possessing no autocorrelation, and
confirmed with the multicollinearity test). Furthermore, the noise terms
are independent of the explanatory variables and there is no obvious
string of positive to negative residuals, or any other definable
pattern.
However, we are not finished yet.
“I gotta lotta mo!”
Section 6: Testing Root Means Squared Predicted Error (RMSPE)
RMSPE determines how well a model fits the hold-out data. This is
where we will apply the training set model on the
test dataset, to validate where or not it fits the data.15
Use thepredict function with the training set-developed
model.
p <- predict(model, newdata = test)
error <- test$BTC.Price.Log - p
RMSPE <- sqrt(mean(error^2))
RMSPE
## [1] 0.3822073
Recall that the standard error of the model is ~0.44
(reference Residual standard error in output).
summary(model)
##
## Call:
## lm(formula = BTC.Price.Log ~ BTC.Mining.Difficulty + BTC.Mining.Profitability,
## data = train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.62588 -0.42858 0.09613 0.35924 0.96722
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.69717 0.22158 34.738 < 2e-16 ***
## BTC.Mining.Difficulty 0.11588 0.01321 8.771 5.06e-10 ***
## BTC.Mining.Profitability 0.78244 0.21977 3.560 0.00118 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4468 on 32 degrees of freedom
## Multiple R-squared: 0.7129, Adjusted R-squared: 0.6949
## F-statistic: 39.72 on 2 and 32 DF, p-value: 2.136e-09
Since RMSPE of 0.38 is smaller than the
Residual Standard error of 0.44, we determine
that
the training set model predicts the holdout set better than the training (validation) set,
which is great news.
We’re good to hook and make a prediction on
January 2022's Bitcoin Price.
Section 7: Predicting Bitcoin Price in January 2022
From Section 4: Modeling, our stellar model equation
is:
BTC.Price.Log = 7.69 + 0.12\(X_{BTC.Mining.Difficulty}\) + 0.78\(X_{BTC.Mining.Profitability}\)
In January 2022, the
BTC.Mining.Difficulty was ~25.17 and the
BTC.Mining.Profitability was ~0.19.
Calculate the price prediction, remembering to take the
inverse log function of the result to convert it to Bitcoin
in USD, as we transformed the original data logarithmically in
Section 1: Data Cleaning.
c <- 7.69 + (.12*25.17)+(0.78*0.19)
exp(c) #take inverse log function, since we transformed the original data logarithmically during the data cleaning phase
## [1] 51979.26
Calculate the 95% confidence interval.16
data(train)
sample.mean <- mean(train$BTC.Price.Log)
sample.n <- length(train$BTC.Price.Log)
sample.sd <- sd(train$BTC.Price.Log)
sample.se <- sample.sd/sqrt(sample.n)
alpha = 0.05
degrees.freedom = sample.n - 1
t.score = qt(p=alpha/2, df=degrees.freedom,lower.tail=F)
margin.error <- t.score * sample.se
lower.bound <- sample.mean - margin.error
upper.bound <- sample.mean + margin.error
print(c(lower.bound,upper.bound))
## [1] 9.011596 9.567301
Take the reverse log to re-transform the interval into USD.
exp(9.011596)
## [1] 8197.594
exp(9.567301)
## [1] 14289.8
The model predicts Bitcoin price at $51,979.26 in
January 2022, with a 95% confidence interval of
[$8,197.59, $14,289.80]. Again, \({R^2}\) is 0.7129 and Standard Error is
0.4468.
Conclusion
The actual price of Bitcoin in January 2022 was between
$37,188.11 to $38,483.13.17 Intuitively, given
the volatility of Bitcoin in the last year, a difference
between the model’s prediction and the actual price, of around
$13,496.13 to $14,791.15. Additionally, the 95% confidence
interval complements this range. The model seems to assume a sensible
range for extrapolating future near-term Bitcoin price.
However, as the model is based on 35 observations, it is
ill-equipped for any serious forecasting. See the
Bitcoin Price Forecasting (Time-Series) model in the
footnotes.18 Given the data, the analysis backing
this model is stronger at identifying which potential explanatory
variables are statistically significant, along with those
that are frivolous. Yet, for its purpose, this model has
offered some initiial insight on the actual hard factors
that correlate to its price.
Deleted variables
These variables were
statistically insignificant in correlating to Bitcoin’s
price: BTC.ATM.Count.US, M1 Money Supply, and
Federal Funds Rate. They all seemed to be good economic
barometers, or similar, that might influence Bitcoin value. However, in
context, Bitcoin operates in its own
cryptographic ecosystem, as clearly shown by the
Mining Difficulty and Mining Profitability
variables making the final cut.
BTC ATMs in the US
The increase of BTC ATM supply
nationwide does not necessarily assume that retail consumers or any
other parties are using them to buy more Bitcoin, or influence its
valuation in any way. It might be interesting to conduct a sample survey
in most metropolitan areas and suburbs as to how often
BTC ATMs (or BTMs) are actually used.
M1 Money Supply
One plausible explanation for this variable’s insignificance is due
to the fact that fiat is not the only method for procuring
Bitcoin. Bitcoin can be mined and swapped back and forth from other
currencies (albeit through expensive marketplaces). Additionally, the
size of M1 does not imply at any time that a significant portion of the
supply is used to purchase Bitcoin.
Federal Funds Rate
The Federal Funds Rate is often hyped by media as a
potential predictor on Bitcoin bear and bull
markets, as of late. Yet, it does not appear to be in any way
significant for this particular model. Future case studies involving a
more granular capture of the Federal Funds Rate and its
impact on Bitcoin might offer more insight. However, as the
Federal Funds Rate to Bitcoin Price relation
has not been clearly identified (statistically) today, one may assume
that the Federal Funds Rate has little-to-no impact on
Bitcoin’s price fluctuation in this model.
Global Inflation Rate
It would be prudent to revisit this variable. The
Global Inflation Rate appeared to have significance in
correlation to Bitcoin’s price; however, I did not trust the source data
as it did not align with the other independent variables, and for that
reason removed it preemptively.
A look at the remaining predictor variables
scatterplot3d(BTC3$BTC.Mining.Profitability,
BTC3$BTC.Mining.Difficulty,
BTC3$BTC.Price,
pch = 19, color = "black",
xlab="Mining Profitability", ylab="Mining Difficulty",
zlab="Bitcoin Price",
highlight.3d = TRUE, angle=45,
trees, type="h",
main="BTC Price against Mining Profitabilty and Mining Difficulty",
scale.y=0.7)
This figure isolates all 52 observation variables. The dataset
clearly demonstrates the effect that
Bitcoin Mining Difficulty has on
Bitcoin Price. A higher
Bitcoin Difficulty (difficulty in mining a Bitcoin block,
or for dedicated hardware to solve a mathematical or
algorithmic hash below a given target level, considering the amount of
other miners attempting the same task), correlates with a
higher Bitcoin Price. There may be a level
of cognitive dissonance associated with recognizing that
Mining Profitability appears to take a secondary placement
(visually) in determining Bitcoin Price. Within the
confines of this analysis, remember that the final
multicollinearity shows Mining Profitability
and Mining Difficulty being inversely correlated.
Contextually, the more miners employed in mining a Bitcoin block, the
less profitable it is for one single miner (or cluster of miners) to
earn sizable revenue (hence, the popularity of
mining pools).
Adjusting for Seasonality
Keep in mind that this analysis started with 55 observations and the
model was based on 35, a very small amount of data. The adage of most
data scientists and analysts is to
get more (relevant) data. A greater quantity of
observations, contingent on the accommodating date ranges of the other
explanatory variables, would allow room for
lagging the economic variables to
adjust for inertia and seasonality. This would reinforce
the model, and perhaps expose different patterns and behavior, in future
iterations.
Ultimately, while domestic and global external macroeconomic factors
influencing retail and institutional investor habits may
affect Bitcoin price secondarily, the intrinsic factors identified only
seem to have the most statistical signifiance, in
determining Bitcoin valuation.
For any questions or comments on this analysis please contact kris.kilgroe@sigmacode.io.
Session Info
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Linux Mint 20.3
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kableExtra_1.3.4 lmtest_0.9-40 zoo_1.8-10
## [4] olsrr_0.5.3 scatterplot3d_0.3-41 car_3.1-0
## [7] carData_3.0-5 corrplot_0.92 broom_0.8.0
## [10] caret_6.0-92 lattice_0.20-45 forcats_0.5.1
## [13] stringr_1.4.0 dplyr_1.0.9 purrr_0.3.4
## [16] readr_2.1.2 tidyr_1.2.0 tibble_3.1.7
## [19] ggplot2_3.3.6 tidyverse_1.3.1 readxl_1.4.0
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-3 ellipsis_0.3.2 class_7.3-20
## [4] fs_1.5.2 rstudioapi_0.13 farver_2.1.0
## [7] listenv_0.8.0 prodlim_2019.11.13 fansi_1.0.3
## [10] lubridate_1.8.0 xml2_1.3.3 codetools_0.2-18
## [13] splines_4.2.1 knitr_1.39 jsonlite_1.8.0
## [16] pROC_1.18.0 dbplyr_2.2.0 compiler_4.2.1
## [19] httr_1.4.3 backports_1.4.1 assertthat_0.2.1
## [22] Matrix_1.4-1 fastmap_1.1.0 tufte_0.12
## [25] cli_3.3.0 htmltools_0.5.2 tools_4.2.1
## [28] gtable_0.3.0 glue_1.6.2 reshape2_1.4.4
## [31] Rcpp_1.0.8.3 cellranger_1.1.0 jquerylib_0.1.4
## [34] vctrs_0.4.1 svglite_2.1.0 nlme_3.1-157
## [37] iterators_1.0.14 timeDate_3043.102 gower_1.0.0
## [40] xfun_0.31 globals_0.15.0 rvest_1.0.2
## [43] lifecycle_1.0.1 goftest_1.2-3 future_1.26.1
## [46] MASS_7.3-57 scales_1.2.0 ipred_0.9-13
## [49] hms_1.1.1 parallel_4.2.1 yaml_2.3.5
## [52] gridExtra_2.3 sass_0.4.1 rpart_4.1.16
## [55] stringi_1.7.6 highr_0.9 foreach_1.5.2
## [58] nortest_1.0-4 hardhat_1.1.0 lava_1.6.10
## [61] rlang_1.0.2 pkgconfig_2.0.3 systemfonts_1.0.4
## [64] evaluate_0.15 labeling_0.4.2 recipes_0.2.0
## [67] tidyselect_1.1.2 parallelly_1.32.0 plyr_1.8.7
## [70] magrittr_2.0.3 bookdown_0.27 R6_2.5.1
## [73] generics_0.1.2 DBI_1.1.3 pillar_1.7.0
## [76] haven_2.5.0 withr_2.5.0 survival_3.2-13
## [79] abind_1.4-5 nnet_7.3-17 future.apply_1.9.0
## [82] modelr_0.1.8 crayon_1.5.1 utf8_1.2.2
## [85] tzdb_0.3.0 rmarkdown_2.14 grid_4.2.1
## [88] data.table_1.14.2 rmdformats_1.0.4 ModelMetrics_1.2.2.2
## [91] webshot_0.5.3 reprex_2.0.1 digest_0.6.29
## [94] stats4_4.2.1 munsell_0.5.0 viridisLite_0.4.0
## [97] bslib_0.3.1
Footnotes
Link to
BTC Pricedataset: https://docs.google.com/spreadsheets/d/10GtR7HMJutJqx54iJ4SMSoq1cPnl8pTv/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
BTC Mining Difficultydataset: https://docs.google.com/spreadsheets/d/1yuuzSyCeDlxePXh10_fVgtpxVC_6sChe/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
BTC.Mining.Profitabilitydataset: https://docs.google.com/spreadsheets/d/1idsNZTugrcouyY-HGgITjXmjgIuOt26O/edit?usp=sharing&rtpof=true&sd=true↩︎Link to
BTC ATM Count USdataset: https://docs.google.com/spreadsheets/d/1XEdA-5PBidwvEAQbSvuCm9-v47QHfU-U/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
Federal Funds Ratedataset: https://docs.google.com/spreadsheets/d/13rb6HKnxUBTr1xInNGG-dKbTgMSaWDvL/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
M1 Money supplydataset: https://docs.google.com/spreadsheets/d/1ofoa4V2hxwjJLhcRyR8J_mnfAcB1DIZ8/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
Global Inflation Rate: https://docs.google.com/spreadsheets/d/1nMocnrsLW6IffI528BJYKy11mjfXKpgm/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
custom dataset: https://docs.google.com/spreadsheets/d/1iIK7zBqW9gqbwSmxVG5J9P1kIiw3vam-/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Link to
Cook's Distanceexplanation: https://towardsdatascience.com/identifying-outliers-in-linear-regression-cooks-distance-9e212e9136a↩︎Link to
Test and Validation Datasets: https://machinelearningmastery.com/difference-test-validation-datasets/↩︎Link to
Multicollinearity: https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-multiple-regression/multicollinearity.html↩︎VIFdefined here: https://online.stat.psu.edu/stat462/node/180/↩︎For an argument on what determined a “good”
VIFlevel: https://www.researchgate.net/post/Multicollinearity_issues_is_a_value_less_than_10_acceptable_for_VIF↩︎Link to
Global Inflation Rate: https://docs.google.com/spreadsheets/d/1nMocnrsLW6IffI528BJYKy11mjfXKpgm/edit?usp=sharing&ouid=116278907272817573735&rtpof=true&sd=true↩︎Explanation on
RMSPEat: https://www.sciencedirect.com/topics/engineering/root-mean-squared-error↩︎Calculating Confidence Intervals in R: https://bookdown.org/logan_kelly/r_practice/p09.html↩︎
Bitcoin Price for January 2022: https://infomediang.com/`Bitcoin`-price-for-january-2022/↩︎See
Bitcoin Price Forecasting (Time-Series)using theProphetpackage at: https://rpubs.com/KrisKilgroe/BTC_price_forecasting↩︎