library(readxl)
library(ggplot2)
library(sensemakr)
readxl: This package provides functions for reading Excel files (.xls and .xlsx) into R data frames. The data used below are stored as CSV and read with read.csv(), so readxl is loaded but not strictly required.
ggplot2: A data visualization package that implements the grammar of graphics. It builds static plots layer by layer with a consistent syntax and is highly customizable, which makes it suitable for both data exploration and publication-quality figures.
sensemakr: A package for sensitivity analysis of regression estimates in observational studies. It quantifies how strong unobserved confounders (omitted variables) would have to be to change a research conclusion, and it bounds their plausible strength by using observed covariates as benchmarks.
Cryptocurrency data were collected from two widely used websites, CoinMarketCap and CoinGecko, covering April 29, 2018, to April 30, 2023. Both sites track prices, market capitalizations, trading volumes, and related metrics for a large number of cryptocurrencies. The scraped data therefore describe five years of historical performance and can be used to study the volatility, liquidity, and overall market dynamics of cryptocurrencies.
# read in the data
Dataf <- read.csv("G://SEM2//TEN//TEN_Finalproject//Finaldata.csv")
A linear regression model is fitted with lm(). The dependent variable is BTCprice and the independent variables are BTCcap, BTCvol, and ETHprice, all taken from the data frame Dataf. The R-squared of the fitted model is then extracted from its summary.
model <- lm(BTCprice ~ BTCcap + BTCvol + ETHprice, data = Dataf)
summary(model)$r.squared
## [1] 0.9996794
The R-squared value is extracted from the model summary. R-squared, also known as the coefficient of determination, measures the proportion of the variance in the dependent variable (BTCprice in this case) that is explained by the independent variables (BTCcap, BTCvol, ETHprice). It ranges from 0 to 1, where 1 indicates that all variability in the dependent variable is explained by the independent variables, and 0 indicates that none of it is. Here, about 99.97% of the variance in BTCprice is explained by the model.
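To make the definition concrete, R-squared can be recomputed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses simulated data rather than the report's dataset; the variable names (x1, x2, y) are illustrative assumptions only.

```r
# Simulated data; variable names are illustrative, not the report's columns.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# R-squared = 1 - (residual sum of squares / total sum of squares)
rss <- sum(residuals(fit)^2)
tss <- sum((y - mean(y))^2)
r2_manual <- 1 - rss / tss

# The hand-computed value should agree with the one stored by summary()
all.equal(r2_manual, summary(fit)$r.squared)
```

The same two lines (rss and tss) could be applied to the fitted BTCprice model to check the value extracted above.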
Sensitivity Analysis with sensemakr()
The goal of sensemakr is to make it easier to understand the impact that omitted variables would have on a regression result.
In the context of sensitivity analysis, “q” is the proportion of the estimated treatment effect that bias would have to explain away for the result to be considered problematic. With q = 1, the analysis considers biases equal to 100% of the estimate, that is, biases large enough to bring the estimated treatment effect to zero. The robustness value for q = 1 is the minimum share of the residual variance of both the treatment and the outcome that unobserved confounders would need to explain to produce such a bias. A high robustness value therefore indicates that the estimated effect is relatively robust to unobserved confounding, whereas a low robustness value indicates that it is sensitive to such confounding.
Partial R-squared of the treatment with the outcome: the proportion of the residual variance of the outcome, after accounting for the other covariates, that is explained by the treatment variable.
Sensitivity Analysis 1
Model: {Bitcoin Price} ~ {Bitcoin volume + Bitcoin Market cap} + {Ethereum Price}
Outcome variable - Bitcoin Price
Benchmark variables - Bitcoin volume + Bitcoin Market cap
Treatment variable - Ethereum Price
# run the sensitivity analysis
sensitivity <- sensemakr(model = model, treatment = "ETHprice",
                         benchmark_covariates = c("BTCcap", "BTCvol"),
                         kd = 0.1, ky = 0.1, q = 1, alpha = 0.05, reduce = TRUE)
## Warning in ovb_partial_r2_bound.numeric(r2dxj.x = r2dxj.x[i], r2yxj.dx =
## r2yxj.dx[i], : Implied bound on r2yz.dx greater than 1, try lower kd and/or ky.
## Setting r2yz.dx to 1.
# summarize the sensitivity analysis
summary(sensitivity)
## Sensitivity Analysis to Unobserved Confounding
##
## Model Formula: BTCprice ~ BTCcap + BTCvol + ETHprice
##
## Null hypothesis: q = 1 and reduce = TRUE
## -- This means we are considering biases that reduce the absolute value of the current estimate.
## -- The null hypothesis deemed problematic is H0:tau = 0
##
## Unadjusted Estimates of 'ETHprice':
## Coef. estimate: -0.4245
## Standard Error: 0.0175
## t-value (H0:tau = 0): -24.1961
##
## Sensitivity Statistics:
## Partial R2 of treatment with outcome: 0.243
## Robustness Value, q = 1: 0.4283
## Robustness Value, q = 1, alpha = 0.05: 0.4024
##
## Verbal interpretation of sensitivity statistics:
##
## -- Partial R2 of the treatment with the outcome: an extreme confounder (orthogonal to the covariates) that explains 100% of the residual variance of the outcome, would need to explain at least 24.3% of the residual variance of the treatment to fully account for the observed estimated effect.
##
## -- Robustness Value, q = 1: unobserved confounders (orthogonal to the covariates) that explain more than 42.83% of the residual variance of both the treatment and the outcome are strong enough to bring the point estimate to 0 (a bias of 100% of the original estimate). Conversely, unobserved confounders that do not explain more than 42.83% of the residual variance of both the treatment and the outcome are not strong enough to bring the point estimate to 0.
##
## -- Robustness Value, q = 1, alpha = 0.05: unobserved confounders (orthogonal to the covariates) that explain more than 40.24% of the residual variance of both the treatment and the outcome are strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0 (a bias of 100% of the original estimate), at the significance level of alpha = 0.05. Conversely, unobserved confounders that do not explain more than 40.24% of the residual variance of both the treatment and the outcome are not strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0, at the significance level of alpha = 0.05.
##
## Bounds on omitted variable bias:
##
## --The table below shows the maximum strength of unobserved confounders with association with the treatment and the outcome bounded by a multiple of the observed explanatory power of the chosen benchmark covariate(s).
##
## Bound Label R2dz.x R2yz.dx Treatment Adjusted Estimate Adjusted Se Adjusted T
## 0.1x BTCcap 0.6890 1.0000 ETHprice 0.6908 0.0000 Inf
## 0.1x BTCvol 0.0229 0.0026 ETHprice -0.4187 0.0177 -23.614
## Adjusted Lower CI Adjusted Upper CI
## 0.6908 0.6908
## -0.4535 -0.3839
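The sensitivity statistics in the summary above can be reproduced by hand from the treatment's t-value and the residual degrees of freedom, using the closed-form expressions of the omitted variable bias framework: partial R2 = t^2 / (t^2 + df), and RV_q = 0.5 * (sqrt(f_q^4 + 4 * f_q^2) - f_q^2) with f_q = q * |t| / sqrt(df). The sketch below is a minimal base-R illustration. The helper names are hypothetical, and df = 1824 is an assumption (daily observations over the stated period minus four estimated coefficients), not a value shown in the output.

```r
# Hypothetical helpers, written for illustration only.
partial_r2_from_t <- function(t, df) t^2 / (t^2 + df)

robustness_value_from_t <- function(t, df, q = 1) {
  fq <- q * abs(t) / sqrt(df)            # partial Cohen's f, scaled by q
  0.5 * (sqrt(fq^4 + 4 * fq^2) - fq^2)
}

t_eth      <- -24.1961  # t-value of ETHprice from the summary above
df_assumed <- 1824      # ASSUMPTION: residual degrees of freedom, not printed

pr2 <- partial_r2_from_t(t_eth, df_assumed)
rv  <- robustness_value_from_t(t_eth, df_assumed, q = 1)
round(c(partial_r2 = pr2, rv_q1 = rv), 4)
```

If the assumed degrees of freedom are right, both numbers should land close to the reported 0.243 and 0.4283. With the fitted model at hand, df.residual(model) gives the exact degrees of freedom, and sensemakr's own partial_r2() and robustness_value() functions can be used instead of these helpers.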
The bounds on omitted variable bias are copied from the table above into a data frame. The first row shows that a confounder one-tenth as strong as BTCcap (with R2yz.dx capped at 1, as the warning notes) would reverse the sign of the ETHprice coefficient (adjusted estimate 0.6908), whereas a confounder one-tenth as strong as BTCvol leaves it almost unchanged (-0.4187).
# Create a data frame with the bounds on omitted variable bias
bounds <- data.frame(
  Bound = c("0.1x BTCcap", "0.1x BTCvol"),
  R2dz.x = c(0.6890, 0.0229),
  R2yz.dx = c(1.0000, 0.0026),
  Benchmark = c("Market cap", "Volume"),
  Adjusted_Estimate = c(0.6908, -0.4187),
  Adjusted_Lower_CI = c(0.6908, -0.4535),
  Adjusted_Upper_CI = c(0.6908, -0.3839)
)
The adjusted estimates of the ETHprice coefficient and their confidence intervals are plotted for each benchmark.
# Plot the adjusted estimates and confidence intervals
ggplot(bounds, aes(x = Bound, y = Adjusted_Estimate, ymin = Adjusted_Lower_CI, ymax = Adjusted_Upper_CI, fill = Benchmark)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "Bounds on omitted variable bias", y = "Adjusted estimate of ETHprice", fill = "Benchmark") +
  scale_fill_manual(values = c("#F7931A", "#8C8C8C"), name = "") +
  theme_classic()
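The adjusted estimates in the bound table can be traced back to the bias expression of this framework: |bias| = se * sqrt(df * R2yz.dx * R2dz.x / (1 - R2dz.x)). With reduce = TRUE, the bias is assumed to pull the estimate toward zero, and possibly past it. The sketch below applies that expression to the rounded values printed in the summary. The helper name is hypothetical and df = 1824 is again an assumption rather than a reported value.

```r
# Hypothetical helper: estimate after removing the bias implied by a confounder
# with partial R2 r2dz.x (with the treatment) and r2yz.dx (with the outcome).
adjusted_estimate_sketch <- function(estimate, se, df, r2dz.x, r2yz.dx) {
  bias <- se * sqrt(df * r2yz.dx * r2dz.x / (1 - r2dz.x))
  # reduce = TRUE: shrink the estimate toward zero by the size of the bias
  sign(estimate) * (abs(estimate) - bias)
}

est        <- -0.4245  # ETHprice coefficient from the summary
se         <- 0.0175   # its standard error, rounded as printed
df_assumed <- 1824     # ASSUMPTION: residual degrees of freedom

adj_vol <- adjusted_estimate_sketch(est, se, df_assumed, r2dz.x = 0.0229, r2yz.dx = 0.0026)
adj_cap <- adjusted_estimate_sketch(est, se, df_assumed, r2dz.x = 0.6890, r2yz.dx = 1.0000)
round(c(BTCvol = adj_vol, BTCcap = adj_cap), 3)
```

Because the inputs are rounded, the results should be close to, but not exactly equal to, the -0.4187 and 0.6908 in the table. The sign flip for the BTCcap bound arises because the implied bias is larger than the estimate itself. sensemakr's adjusted_estimate() performs this calculation on the fitted model directly.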
A similar process is followed for the second model.
Sensitivity Analysis 2
Model: {Ethereum Price} ~ {Ethereum volume + Ethereum Market cap} + {Bitcoin Price}
Outcome variable - Ethereum Price
Benchmark variables - Ethereum volume + Ethereum Market cap
Treatment variable - Bitcoin Price
model2 <- lm(ETHprice ~ ETHcap + ETHvol + BTCprice, data = Dataf)
# R-squared of the second model
summary(model2)$r.squared
# run the sensitivity analysis
sensitivity_2 <- sensemakr(model = model2, treatment = "BTCprice",
                           benchmark_covariates = c("ETHcap", "ETHvol"),
                           kd = 0.1, ky = 0.1, q = 1, alpha = 0.05, reduce = TRUE)
## Warning in ovb_partial_r2_bound.numeric(r2dxj.x = r2dxj.x[i], r2yxj.dx =
## r2yxj.dx[i], : Implied bound on r2yz.dx greater than 1, try lower kd and/or ky.
## Setting r2yz.dx to 1.
# summarize the sensitivity analysis
summary(sensitivity_2)
## Sensitivity Analysis to Unobserved Confounding
##
## Model Formula: ETHprice ~ ETHcap + ETHvol + BTCprice
##
## Null hypothesis: q = 1 and reduce = TRUE
## -- This means we are considering biases that reduce the absolute value of the current estimate.
## -- The null hypothesis deemed problematic is H0:tau = 0
##
## Unadjusted Estimates of 'BTCprice':
## Coef. estimate: 0.0015
## Standard Error: 1e-04
## t-value (H0:tau = 0): 12.8357
##
## Sensitivity Statistics:
## Partial R2 of treatment with outcome: 0.0828
## Robustness Value, q = 1: 0.2588
## Robustness Value, q = 1, alpha = 0.05: 0.2242
##
## Verbal interpretation of sensitivity statistics:
##
## -- Partial R2 of the treatment with the outcome: an extreme confounder (orthogonal to the covariates) that explains 100% of the residual variance of the outcome, would need to explain at least 8.28% of the residual variance of the treatment to fully account for the observed estimated effect.
##
## -- Robustness Value, q = 1: unobserved confounders (orthogonal to the covariates) that explain more than 25.88% of the residual variance of both the treatment and the outcome are strong enough to bring the point estimate to 0 (a bias of 100% of the original estimate). Conversely, unobserved confounders that do not explain more than 25.88% of the residual variance of both the treatment and the outcome are not strong enough to bring the point estimate to 0.
##
## -- Robustness Value, q = 1, alpha = 0.05: unobserved confounders (orthogonal to the covariates) that explain more than 22.42% of the residual variance of both the treatment and the outcome are strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0 (a bias of 100% of the original estimate), at the significance level of alpha = 0.05. Conversely, unobserved confounders that do not explain more than 22.42% of the residual variance of both the treatment and the outcome are not strong enough to bring the estimate to a range where it is no longer 'statistically different' from 0, at the significance level of alpha = 0.05.
##
## Bounds on omitted variable bias:
##
## --The table below shows the maximum strength of unobserved confounders with association with the treatment and the outcome bounded by a multiple of the observed explanatory power of the chosen benchmark covariate(s).
##
## Bound Label R2dz.x R2yz.dx Treatment Adjusted Estimate Adjusted Se Adjusted T
## 0.1x ETHcap 0.4460 1.0000 BTCprice -0.0030 0e+00 -Inf
## 0.1x ETHvol 0.0328 0.0031 BTCprice 0.0015 1e-04 12.2104
## Adjusted Lower CI Adjusted Upper CI
## -0.0030 -0.0030
## 0.0012 0.0017
# Create a data frame with the bounds on omitted variable bias for model 2
bounds_2 <- data.frame(
  Bound = c("0.1x ETHcap", "0.1x ETHvol"),
  R2dz.x = c(0.4460, 0.0328),
  R2yz.dx = c(1.0000, 0.0031),
  Benchmark = c("Market cap", "Volume"),
  Adjusted_Estimate = c(-0.0030, 0.0015),
  Adjusted_Lower_CI = c(-0.0030, 0.0012),
  Adjusted_Upper_CI = c(-0.0030, 0.0017)
)
# Plot the adjusted estimates and confidence intervals
ggplot(bounds_2, aes(x = Bound, y = Adjusted_Estimate, ymin = Adjusted_Lower_CI, ymax = Adjusted_Upper_CI, fill = Benchmark)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "Bounds on omitted variable bias", y = "Adjusted estimate of BTCprice", fill = "Benchmark") +
  scale_fill_manual(values = c("#F7931A", "#8C8C8C"), name = "") +
  theme_classic()
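Both plotting data frames above were typed in from the printed tables, which is easy to get wrong. A fitted sensemakr object stores the same table, so the plotting data could instead be derived programmatically. The sketch below assumes the table is available as sensitivity$bounds with columns named bound_label, adjusted_estimate, adjusted_lower_CI and adjusted_upper_CI. These names are an assumption and should be checked with names(sensitivity$bounds). To stay runnable without the fitted model, the example uses a stand-in data frame built from the values printed for the first analysis, and the helper name is hypothetical.

```r
# Stand-in for sensitivity$bounds (ASSUMED column names; values from the printed table)
bounds_raw <- data.frame(
  bound_label       = c("0.1x BTCcap", "0.1x BTCvol"),
  adjusted_estimate = c(0.6908, -0.4187),
  adjusted_lower_CI = c(0.6908, -0.4535),
  adjusted_upper_CI = c(0.6908, -0.3839)
)

# Hypothetical helper: reshape a bounds table into the columns used for plotting
tidy_bounds <- function(b) {
  data.frame(
    Bound             = b$bound_label,
    # benchmark name = label with the leading multiplier ("0.1x ") removed
    Benchmark         = sub("^[0-9.]+x ", "", b$bound_label),
    Adjusted_Estimate = b$adjusted_estimate,
    Adjusted_Lower_CI = b$adjusted_lower_CI,
    Adjusted_Upper_CI = b$adjusted_upper_CI
  )
}

plot_data <- tidy_bounds(bounds_raw)
plot_data
```

If the assumed column names hold, tidy_bounds(sensitivity$bounds) and tidy_bounds(sensitivity_2$bounds) would give the inputs for the two bar charts without any copying by hand.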
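Finally, the sensitivity contour plots are drawn for both analyses. By default, plot() on a sensemakr object shows contours of the adjusted point estimate as a function of a hypothetical confounder's partial R-squared with the treatment and with the outcome.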
plot(sensitivity)
plot(sensitivity_2)