Getting started

Load packages

library(readxl)
library(ggplot2)
library(sensemakr)

readxl: This package provides functions to easily read Excel files into R. It allows you to import data from Excel spreadsheets directly into R data frames, making it convenient for data analysis and manipulation.

ggplot2: ggplot2 is a data visualization package for R that implements the grammar of graphics. It builds static plots layer by layer with a consistent and flexible syntax, and it is highly customizable, enabling you to create publication-quality graphics for data exploration and presentation.

sensemakr: This package provides sensitivity analysis tools for regression models. It assesses how strong unobserved confounders (omitted variables) would need to be to change the conclusions drawn from a regression estimate, which lets researchers judge how robust their findings are to omitted variable bias.

Data

Cryptocurrency data was collected from two popular websites, CoinMarketCap and CoinGecko, spanning from April 29, 2018, to April 30, 2023. These websites serve as comprehensive platforms for tracking the prices, market capitalizations, trading volumes, and other relevant metrics of various cryptocurrencies. The data scraped from these sources provides insights into the historical performance and trends of cryptocurrencies over a five-year period. This dataset offers valuable information for analyzing the volatility, liquidity, and overall market dynamics of cryptocurrencies, aiding in research, investment decisions, and market analysis within the cryptocurrency ecosystem.

# read in the data
Dataf <- read.csv("G://SEM2//TEN//TEN_Finalproject//Finaldata.csv")
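Before fitting the models, it can help to confirm that the file contains every column the analysis relies on. The following is a minimal sketch in base R: the column names are the ones used by the two regressions in this document, while the temporary file and its two rows of values are hypothetical stand-ins for Finaldata.csv.

```r
# Column names used by the two regression models in this document
required_cols <- c("BTCprice", "BTCcap", "BTCvol", "ETHprice", "ETHcap", "ETHvol")

# Hypothetical two-row stand-in for Finaldata.csv, written to a temporary file
toy_path <- tempfile(fileext = ".csv")
toy <- as.data.frame(matrix(c(1, 2), nrow = 2, ncol = length(required_cols),
                            dimnames = list(NULL, required_cols)))
write.csv(toy, toy_path, row.names = FALSE)

# Read the file and stop early if a required column is missing
check_data <- read.csv(toy_path)
missing_cols <- setdiff(required_cols, names(check_data))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
str(check_data)
```

Pointing `toy_path` at the real CSV applies the same check to the project data.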

A linear regression model is fitted to the data in the data frame Dataf. It models the relationship between the dependent variable BTCprice and the independent variables BTCcap, BTCvol, and ETHprice, and the R-squared value is then extracted from the model summary.

  model <- lm(BTCprice ~ BTCcap + BTCvol + ETHprice, data = Dataf)
  summary(model)$r.squared
## [1] 0.9996794

The R-squared value is extracted from the summary of the model. The R-squared value, also known as the coefficient of determination, measures the proportion of the variance in the dependent variable (BTCprice in this case) that is predictable from the independent variables (BTCcap, BTCvol, ETHprice). It ranges from 0 to 1, where 1 indicates that all variability in the dependent variable is explained by the independent variables, and 0 indicates that none of the variability is explained. The value of 0.9997 obtained here is very close to 1, which is expected: market capitalization is defined as price times circulating supply, so BTCcap is close to a rescaled copy of BTCprice.
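To make the definition concrete, R-squared can be recomputed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses simulated data (the variables `x1`, `x2`, `y` and the data frame `toy` are hypothetical, not the cryptocurrency dataset) and compares the result with the value reported by summary().

```r
# Simulated data standing in for Dataf (hypothetical values)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n)
toy <- data.frame(y, x1, x2)

fit <- lm(y ~ x1 + x2, data = toy)

# R-squared = 1 - (residual sum of squares / total sum of squares)
rss <- sum(residuals(fit)^2)
tss <- sum((toy$y - mean(toy$y))^2)
r2_by_hand <- 1 - rss / tss

# Compare with the value reported by summary()
all.equal(r2_by_hand, summary(fit)$r.squared)
```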

Sensitivity Analysis

sensemakr(): The goal of sensemakr is to make it easier to understand the impact that omitted variables would have on a regression result.

In the context of sensitivity analysis, “q” sets the proportion of the estimated treatment effect that bias would have to remove for the result to be considered problematic. For example, with q = 1 we consider biases that reduce the absolute value of the estimate by 100%, that is, biases large enough to bring the estimated treatment effect to zero. The robustness value for q = 1 is the minimum strength of association (partial R squared) that unobserved confounders would need to have with both the treatment and the outcome to produce such a bias. A high robustness value indicates that the estimated treatment effect is relatively robust to unobserved confounding, whereas a low robustness value indicates that the estimate is more sensitive to it.

Partial R squared value - the share of the residual variance of the outcome that is explained by the treatment variable, after accounting for the other covariates in the model.

Sensitivity Analysis 1

Model : {Bitcoin Price}~{Bitcoin volume + Bitcoin Market cap} + {Ethereum Price}

Outcome variable - Bitcoin Price
Benchmark variables - Bitcoin volume + Bitcoin Market cap
Treatment variable - Ethereum Price

   # run the sensitivity analysis
   sensitivity <- sensemakr(model = model, treatment = "ETHprice",
                            benchmark_covariates = c("BTCcap", "BTCvol"),
                            kd = 0.1, ky = 0.1, q = 1, alpha = 0.05,
                            reduce = TRUE)
## Warning in ovb_partial_r2_bound.numeric(r2dxj.x = r2dxj.x[i], r2yxj.dx =
## r2yxj.dx[i], : Implied bound on r2yz.dx greater than 1, try lower kd and/or ky.
## Setting r2yz.dx to 1.
 # summarize the sensitivity analysis
   summary(sensitivity)
## Sensitivity Analysis to Unobserved Confounding
## 
## Model Formula: BTCprice ~ BTCcap + BTCvol + ETHprice
## 
## Null hypothesis: q = 1 and reduce = TRUE 
## -- This means we are considering biases that reduce the absolute value of the current estimate.
## -- The null hypothesis deemed problematic is H0:tau = 0 
## 
## Unadjusted Estimates of 'ETHprice': 
##   Coef. estimate: -0.4245 
##   Standard Error: 0.0175 
##   t-value (H0:tau = 0): -24.1961 
## 
## Sensitivity Statistics:
##   Partial R2 of treatment with outcome: 0.243 
##   Robustness Value, q = 1: 0.4283 
##   Robustness Value, q = 1, alpha = 0.05: 0.4024 
## 
## Verbal interpretation of sensitivity statistics:
## 
## -- Partial R2 of the treatment with the outcome: an extreme confounder (orthogonal to the covariates) that explains 100% of the residual variance of the outcome, would need to explain at least 24.3% of the residual variance of the treatment to fully account for the observed estimated effect.
## 
## -- Robustness Value, q = 1: unobserved confounders (orthogonal to the covariates) that explain more than 42.83% of the residual variance of both the treatment and the outcome are strong enough to bring the point estimate to 0 (a bias of 100% of the original estimate). Conversely, unobserved confounders that do not explain more than 42.83% of the residual variance of both the treatment and the outcome are not strong enough to bring the point estimate to 0.
## 
## -- Robustness Value, q = 1, alpha = 0.05: unobserved confounders (orthogonal to the covariates) that explain more than 40.24% of the residual variance of both the treatment and the outcome are strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0 (a bias of 100% of the original estimate), at the significance level of alpha = 0.05. Conversely, unobserved confounders that do not explain more than 40.24% of the residual variance of both the treatment and the outcome are not strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0, at the significance level of alpha = 0.05.
## 
## Bounds on omitted variable bias:
## 
## --The table below shows the maximum strength of unobserved confounders with association with the treatment and the outcome bounded by a multiple of the observed explanatory power of the chosen benchmark covariate(s).
## 
##  Bound Label R2dz.x R2yz.dx Treatment Adjusted Estimate Adjusted Se Adjusted T
##  0.1x BTCcap 0.6890  1.0000  ETHprice            0.6908      0.0000        Inf
##  0.1x BTCvol 0.0229  0.0026  ETHprice           -0.4187      0.0177    -23.614
##  Adjusted Lower CI Adjusted Upper CI
##             0.6908            0.6908
##            -0.4535           -0.3839
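The robustness value in the summary can be reproduced from the partial R2 alone. The sketch below assumes the closed-form expression RV_q = 0.5 * (sqrt(f_q^4 + 4 * f_q^2) - f_q^2), with f_q^2 = q^2 * R2 / (1 - R2), which is the definition of the robustness value as I understand it from the sensemakr documentation. The helper names `partial_f2` and `robustness_value_sketch` are hypothetical, and because the input 0.243 is the rounded value printed above, the result matches the reported 0.4283 only approximately.

```r
# Partial Cohen's f^2 implied by a partial R2
partial_f2 <- function(partial_r2) partial_r2 / (1 - partial_r2)

# Robustness value for a bias equal to a fraction q of the estimate
# (hypothetical helper, not part of sensemakr)
robustness_value_sketch <- function(partial_r2, q = 1) {
  fq2 <- q^2 * partial_f2(partial_r2)
  0.5 * (sqrt(fq2^2 + 4 * fq2) - fq2)
}

# Partial R2 of ETHprice with BTCprice, as printed in the summary
rv_full <- robustness_value_sketch(0.243, q = 1)    # bias of 100% of the estimate
rv_half <- robustness_value_sketch(0.243, q = 0.5)  # bias of 50% of the estimate

round(c(q_1 = rv_full, q_0.5 = rv_half), 4)
```

Lowering q asks how strong confounding must be to remove only part of the estimate, so the corresponding robustness value is smaller.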

The bounds on omitted variable bias reported above are copied into a data frame. For the 0.1x BTCcap row, the implied bound on R2yz.dx exceeded 1 and was set to 1 (see the warning above), so the adjusted standard error is 0 and the confidence interval collapses to a single point.

 # Create a data frame with the bounds on omitted variable bias
   bounds <- data.frame(
     Bound = c("0.1x BTCcap", "0.1x BTCvol"),
     R2dz.x = c(0.6890, 0.0229),
     R2yz.dx = c(1.0000, 0.0026),
     Benchmark = c("Market cap", "Volume"),
     Adjusted_Estimate = c(0.6908, -0.4187),
     Adjusted_Lower_CI = c(0.6908, -0.4535),
     Adjusted_Upper_CI = c(0.6908, -0.3839)
   )

The adjusted estimates and confidence intervals are plotted, with one bar per benchmark bound.

 # Plot the adjusted estimates and confidence intervals
   ggplot(bounds, aes(x = Bound, y = Adjusted_Estimate, ymin = Adjusted_Lower_CI, ymax = Adjusted_Upper_CI, fill = Benchmark)) +
     geom_col(position = "dodge") +
     geom_errorbar(position = position_dodge(width = 0.9), width = 0.2) +
     labs(x = "Bounds on omitted variable bias", y = "Adjusted Estimate", fill = "Benchmark") +
     scale_fill_manual(values = c("#F7931A", "#8C8C8C"), name = "") +
     theme_classic()
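The adjusted estimates in the bounds table follow from the omitted variable bias formula, and they can be approximately reproduced from the numbers printed in the summary. The sketch below assumes the bias is sqrt(R2yz.dx * R2dz.x / (1 - R2dz.x)) * se * sqrt(df), and it rewrites se * sqrt(df) as |estimate| / f, with f^2 = R2 / (1 - R2) taken from the treatment's partial R2, so the degrees of freedom are not needed. The helper `adjusted_estimate_sketch` is hypothetical, and the inputs are the rounded values from the output, so agreement is approximate.

```r
# Adjusted estimate when the bias reduces the absolute value of the estimate
# (hypothetical helper, not part of sensemakr)
adjusted_estimate_sketch <- function(estimate, partial_r2, r2dz.x, r2yz.dx) {
  f <- sqrt(partial_r2 / (1 - partial_r2))  # equals |t| / sqrt(df)
  bias <- sqrt(r2yz.dx * r2dz.x / (1 - r2dz.x)) * abs(estimate) / f
  sign(estimate) * (abs(estimate) - bias)
}

# Inputs copied from the model 1 summary: estimate -0.4245, partial R2 0.243
adj_vol <- adjusted_estimate_sketch(-0.4245, 0.243, r2dz.x = 0.0229, r2yz.dx = 0.0026)
adj_cap <- adjusted_estimate_sketch(-0.4245, 0.243, r2dz.x = 0.6890, r2yz.dx = 1)

round(c(BTCvol_bound = adj_vol, BTCcap_bound = adj_cap), 3)
```

For the BTCcap bound the bias is larger than the estimate itself, which is why the adjusted estimate changes sign in the table.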

A similar process is followed for model 2.

Sensitivity Analysis 2

Model : {Ethereum Price}~{Ethereum volume + Ethereum Market cap} + {Bitcoin Price}

Outcome variable - Ethereum Price
Benchmark variables - Ethereum volume + Ethereum Market cap
Treatment variable - Bitcoin Price

  model2 <- lm(ETHprice ~ ETHcap + ETHvol + BTCprice, data = Dataf)
  summary(model2)$r.squared
  # run the sensitivity analysis
  sensitivity_2 <- sensemakr(model = model2, treatment = "BTCprice",
                             benchmark_covariates = c("ETHcap", "ETHvol"),
                             kd = 0.1, ky = 0.1, q = 1, alpha = 0.05,
                             reduce = TRUE)
## Warning in ovb_partial_r2_bound.numeric(r2dxj.x = r2dxj.x[i], r2yxj.dx =
## r2yxj.dx[i], : Implied bound on r2yz.dx greater than 1, try lower kd and/or ky.
## Setting r2yz.dx to 1.
  # summarize the sensitivity analysis
  summary(sensitivity_2)
## Sensitivity Analysis to Unobserved Confounding
## 
## Model Formula: ETHprice ~ ETHcap + ETHvol + BTCprice
## 
## Null hypothesis: q = 1 and reduce = TRUE 
## -- This means we are considering biases that reduce the absolute value of the current estimate.
## -- The null hypothesis deemed problematic is H0:tau = 0 
## 
## Unadjusted Estimates of 'BTCprice': 
##   Coef. estimate: 0.0015 
##   Standard Error: 1e-04 
##   t-value (H0:tau = 0): 12.8357 
## 
## Sensitivity Statistics:
##   Partial R2 of treatment with outcome: 0.0828 
##   Robustness Value, q = 1: 0.2588 
##   Robustness Value, q = 1, alpha = 0.05: 0.2242 
## 
## Verbal interpretation of sensitivity statistics:
## 
## -- Partial R2 of the treatment with the outcome: an extreme confounder (orthogonal to the covariates) that explains 100% of the residual variance of the outcome, would need to explain at least 8.28% of the residual variance of the treatment to fully account for the observed estimated effect.
## 
## -- Robustness Value, q = 1: unobserved confounders (orthogonal to the covariates) that explain more than 25.88% of the residual variance of both the treatment and the outcome are strong enough to bring the point estimate to 0 (a bias of 100% of the original estimate). Conversely, unobserved confounders that do not explain more than 25.88% of the residual variance of both the treatment and the outcome are not strong enough to bring the point estimate to 0.
## 
## -- Robustness Value, q = 1, alpha = 0.05: unobserved confounders (orthogonal to the covariates) that explain more than 22.42% of the residual variance of both the treatment and the outcome are strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0 (a bias of 100% of the original estimate), at the significance level of alpha = 0.05. Conversely, unobserved confounders that do not explain more than 22.42% of the residual variance of both the treatment and the outcome are not strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0, at the significance level of alpha = 0.05.
## 
## Bounds on omitted variable bias:
## 
## --The table below shows the maximum strength of unobserved confounders with association with the treatment and the outcome bounded by a multiple of the observed explanatory power of the chosen benchmark covariate(s).
## 
##  Bound Label R2dz.x R2yz.dx Treatment Adjusted Estimate Adjusted Se Adjusted T
##  0.1x ETHcap 0.4460  1.0000  BTCprice           -0.0030       0e+00       -Inf
##  0.1x ETHvol 0.0328  0.0031  BTCprice            0.0015       1e-04    12.2104
##  Adjusted Lower CI Adjusted Upper CI
##            -0.0030           -0.0030
##             0.0012            0.0017
  # Create a data frame with the bounds on omitted variable bias
  bounds_2 <- data.frame(
    Bound = c("0.1x ETHcap", "0.1x ETHvol"),
    R2dz.x = c(0.4460, 0.0328),
    R2yz.dx = c(1.0000, 0.0031),
    Benchmark = c("Market cap", "Volume"),
    Adjusted_Estimate = c(-0.0030, 0.0015),
    Adjusted_Lower_CI = c(-0.0030, 0.0012),
    Adjusted_Upper_CI = c(-0.0030, 0.0017)
  )

  # Plot the adjusted estimates and confidence intervals
  ggplot(bounds_2, aes(x = Bound, y = Adjusted_Estimate, ymin = Adjusted_Lower_CI, ymax = Adjusted_Upper_CI, fill = Benchmark)) +
    geom_col(position = "dodge") +
    geom_errorbar(position = position_dodge(width = 0.9), width = 0.2) +
    labs(x = "Bounds on omitted variable bias", y = "Adjusted Estimate", fill = "Benchmark") +
    scale_fill_manual(values = c("#F7931A", "#8C8C8C"), name = "") +
    theme_classic()

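Plot both sensitivity plots. Called on a sensemakr object, plot() by default draws contour lines of the adjusted estimate as a function of the hypothetical confounder's partial R squared with the treatment (horizontal axis) and with the outcome (vertical axis), and marks the benchmark bounds on the same axes.

  plot(sensitivity)

  plot(sensitivity_2)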