Libraries & Packages

#install.packages("tidyquant")
#install.packages("tidyverse")
#install.packages("recipes")


library(tidyquant)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## ── Attaching core tidyquant packages ─────────────────────── tidyquant 1.0.11 ──
## ✔ PerformanceAnalytics 2.0.8      ✔ TTR                  0.24.4
## ✔ quantmod             0.4.27     ✔ xts                  0.14.1
## ── Conflicts ────────────────────────────────────────── tidyquant_conflicts() ──
## ✖ zoo::as.Date()                 masks base::as.Date()
## ✖ zoo::as.Date.numeric()         masks base::as.Date.numeric()
## ✖ PerformanceAnalytics::legend() masks graphics::legend()
## ✖ quantmod::summary()            masks base::summary()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::first()  masks xts::first()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ dplyr::last()   masks xts::last()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Data Preparation

# Download Yahoo Finance data: Bitcoin, Gold, VIX
symbols <- c("BTC-USD", "GC=F", "^VIX")
market_data <- tq_get(symbols, 
                      from = "2020-01-01", 
                      to = "2023-12-31") %>%
  select(symbol, date, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  rename(BTC = `BTC-USD`, Gold = `GC=F`, VIX = `^VIX`)

# Download the dollar index from FRED. DTWEXBGS is the Nominal Broad
# U.S. Dollar Index, used here as a stand-in for DXY.
getSymbols("DTWEXBGS", src = "FRED")
## [1] "DTWEXBGS"
dxy <- DTWEXBGS %>%
  as_tibble(rownames = "date") %>%
  rename(DXY = DTWEXBGS) %>%
  mutate(date = as.Date(date))

# Ensure date formats match
market_data <- market_data %>%
  mutate(date = as.Date(date))

# Merge DXY with Yahoo data
market_data <- market_data %>%
  left_join(dxy, by = "date") %>%
  drop_na(BTC, Gold, VIX, DXY)

# Save the full dataset for later reuse
saveRDS(market_data, file = "market_data.rds")

# Preview
head(market_data)
## # A tibble: 6 × 5
##   date         BTC  Gold   VIX   DXY
##   <date>     <dbl> <dbl> <dbl> <dbl>
## 1 2020-01-02 6985. 1524.  12.5  115.
## 2 2020-01-03 7345. 1549.  14.0  115.
## 3 2020-01-06 7769. 1566.  13.9  115.
## 4 2020-01-07 8164. 1572.  13.8  115.
## 5 2020-01-08 8080. 1557.  13.4  115.
## 6 2020-01-09 7879. 1552.  12.5  115.

Research question

Does Bitcoin behave like a non-traditional store of value by demonstrating a relationship with traditional financial indicators such as gold prices, the U.S. Dollar Index (DXY), and the Volatility Index (VIX)? Specifically, is there evidence of negative correlation with the dollar index and positive correlation with gold, similar to other stores of value?

Cases

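Each case is one market day between January 2, 2020 and December 29, 2023 on which all four series have a value. After merging and dropping incomplete rows, 994 cases remain (the regression reports 990 residual degrees of freedom with four estimated coefficients). Each row holds the date, the adjusted closing prices of Bitcoin and gold, and that day's values of the VIX and the dollar index.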
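
The gap between roughly 1,460 calendar days and about 1,000 cases comes from the merge: BTC-USD is quoted every calendar day, while the other series only have values on market days, so weekend and holiday rows are incomplete and get dropped. A minimal base-R sketch of that effect, using made-up toy values (the `btc` and `gold` data frames and their numbers are illustrative, not real prices), with `merge()` and `complete.cases()` standing in for the `left_join()` and `drop_na()` calls in the data preparation step:

```r
# Toy data: BTC has a value every calendar day (Friday to Monday) ...
btc <- data.frame(
  date = as.Date("2020-01-03") + 0:3,
  BTC  = c(100, 101, 102, 103)
)

# ... while gold only has values on the two market days.
gold <- data.frame(
  date = as.Date(c("2020-01-03", "2020-01-06")),
  Gold = c(10, 11)
)

# Keep every BTC row; Gold is NA on the weekend.
merged <- merge(btc, gold, by = "date", all.x = TRUE)

# Drop rows with any missing value, as drop_na() does.
complete <- merged[complete.cases(merged), ]

print(merged)
print(complete)
```

Only the Friday and Monday rows survive, which is why the preview of `market_data` jumps from January 3 to January 6.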

Data collection

The data is gathered from public financial APIs in R. Bitcoin (BTC), gold, and VIX prices come from Yahoo Finance through tidyquant's tq_get(), and the U.S. dollar index comes from FRED through getSymbols() from quantmod, which tidyquant loads. All series are daily.

Type of study

This is an observational study using historical financial market data with no experimental intervention.

Data Source

The data is not self-collected. All series come from public sources:

- Bitcoin: Yahoo Finance, BTC-USD
- Gold: Yahoo Finance, GC=F (gold futures)
- VIX: Yahoo Finance, ^VIX
- U.S. Dollar Index: FRED, DTWEXBGS (https://fred.stlouisfed.org/series/DTWEXBGS). This series is the Nominal Broad U.S. Dollar Index, a trade-weighted measure used here as a stand-in for the DXY.

Variables

All modeled variables are quantitative (numeric):

- BTC: daily adjusted closing price of Bitcoin in U.S. dollars (dependent variable)
- Gold: daily adjusted closing price of gold futures
- VIX: daily value of the volatility index
- DXY: daily value of the U.S. dollar index from FRED
- date: observation date (used for merging, not for modeling)

The dependent variable is Bitcoin price, and the others are potential explanatory variables in the regression.

Relevant summary statistics

summary(market_data)
##       date                 BTC             Gold           VIX       
##  Min.   :2020-01-02   Min.   : 4971   Min.   :1477   Min.   :12.07  
##  1st Qu.:2021-01-01   1st Qu.:17003   1st Qu.:1757   1st Qu.:17.23  
##  Median :2022-01-01   Median :27377   Median :1828   Median :21.39  
##  Mean   :2021-12-30   Mean   :28890   Mean   :1828   Mean   :22.89  
##  3rd Qu.:2022-12-29   3rd Qu.:40025   3rd Qu.:1919   3rd Qu.:26.37  
##  Max.   :2023-12-29   Max.   :67567   Max.   :2082   Max.   :82.69  
##       DXY       
##  Min.   :110.5  
##  1st Qu.:114.5  
##  Median :118.2  
##  Mean   :118.0  
##  3rd Qu.:121.1  
##  Max.   :128.5
# Correlation matrix
cor(market_data %>% select(BTC, Gold, VIX, DXY), use = "complete.obs")
##             BTC       Gold        VIX        DXY
## BTC   1.0000000  0.2171814 -0.4207926 -0.5094088
## Gold  0.2171814  1.0000000 -0.3934708 -0.1000124
## VIX  -0.4207926 -0.3934708  1.0000000  0.2739848
## DXY  -0.5094088 -0.1000124  0.2739848  1.0000000
# Scatterplot matrix
pairs(~ BTC + Gold + VIX + DXY, data = market_data)
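
Whether these correlations differ from zero can be gauged with the usual t statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below plugs in the BTC row of the correlation matrix and assumes n = 994, inferred from the 990 residual degrees of freedom that the regression output reports for the same data plus four coefficients. It also assumes independent observations, which daily price levels do not satisfy, so the p-values are best read as optimistic. `cor_t_test` is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: t statistic and two-sided p-value for a Pearson r.
cor_t_test <- function(r, n) {
  t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
  p_val  <- 2 * pt(-abs(t_stat), df = n - 2)
  c(t = t_stat, p = p_val)
}

n <- 994  # assumption: 990 residual df + 4 estimated coefficients

# Correlations of BTC with each indicator, copied from the matrix above.
r_btc <- c(Gold = 0.2171814, VIX = -0.4207926, DXY = -0.5094088)

# One row per indicator, with columns "t" and "p".
results <- t(sapply(r_btc, cor_t_test, n = n))
print(results)
```

On real data, `cor.test(market_data$BTC, market_data$Gold)` performs the same test directly and also returns a confidence interval.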

# Linear regression model
model <- lm(BTC ~ Gold + VIX + DXY, data = market_data)
summary(model)
## 
## Call:
## lm(formula = BTC ~ Gold + VIX + DXY, data = market_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -29102.7  -6778.2   -611.7   9062.0  29559.0 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 204917.874  13117.716  15.621   <2e-16 ***
## Gold             8.585      3.686   2.329     0.02 *  
## VIX           -493.619     51.200  -9.641   <2e-16 ***
## DXY          -1528.709     95.571 -15.995   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12210 on 990 degrees of freedom
## Multiple R-squared:  0.3486, Adjusted R-squared:  0.3466 
## F-statistic: 176.6 on 3 and 990 DF,  p-value: < 2.2e-16
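
As a consistency check on the output above, the adjusted R-squared and the F-statistic can be recomputed from the multiple R-squared and the degrees of freedom. This sketch uses only numbers printed in the summary; because R-squared is rounded to four decimals there, the results agree only up to rounding.

```r
r2     <- 0.3486          # Multiple R-squared from summary(model)
p      <- 3               # predictors: Gold, VIX, DXY
df_res <- 990             # residual degrees of freedom
n      <- df_res + p + 1  # number of observations

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res

# Overall F statistic: explained variance per predictor divided by
# unexplained variance per residual degree of freedom.
f_stat <- (r2 / p) / ((1 - r2) / df_res)

print(round(adj_r2, 4))
print(round(f_stat, 1))
```

An R-squared of about 0.35 means the three indicators together account for roughly a third of the variance in the Bitcoin price level over this period.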