graph_ratio <- ggplot(eth_btc, aes(x = ratio)) +
  geom_histogram(aes(y = after_stat(density)), bins = 150, fill = "lightblue", color = "black") +
  geom_density(color = "red", linewidth = 1) +
  labs(title = "Histogram of ETH/BTC Price Ratio", x = "ETH/BTC Ratio", y = "Density")

simple_ratio <- ggplot(eth_btc, aes(x = date, y = ratio)) +
  geom_line()

# qqnorm(log(eth_btc$ratio), main = "QQ Plot of log(ETH/BTC) Ratio")
# qqline(log(eth_btc$ratio), col = "red", lwd = 2)

# For example, a mixture of 2 lognormal distributions
# You'd fit to log_ratio, then exponentiate to interpret.
# mix_model <- normalmixEM(log(eth_btc$ratio)[is.finite(log(eth_btc$ratio))], k = 2)
# summary(mix_model)
# plot(mix_model, which = 2) # plot the components
Here we see the ETH/BTC ratio over time and the distribution of its observations. The histogram looks like a camel's back with two humps, which suggests a mixture of two roughly normal distributions.
simple_ratio
graph_ratio
Now, let's test for normality. In the QQ plot, the dark points are the observed quantiles and the red line is the theoretical line for a normal distribution. Since the observed quantiles deviate from the QQ line, we can conclude that the log of the ratio is not normally distributed.
qqnorm(log(eth_btc$ratio), main = "QQ Plot of log(ETH/BTC) Ratio")
qqline(log(eth_btc$ratio), col = "red", lwd = 2)
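The visual check can be complemented with a formal test. The sketch below applies the Shapiro-Wilk test from base R to simulated data that stands in for `log(eth_btc$ratio)`; the mixture parameters and the name `log_ratio_sim` are illustrative assumptions, not the real series. Daily prices are serially correlated, which violates the test's independence assumption, so the p-value is only indicative.

```r
# Simulated stand-in for log(eth_btc$ratio): a 50/50 mixture of two normals.
# The parameters are illustrative assumptions, not fitted values.
set.seed(42)
log_ratio_sim <- c(rnorm(1000, mean = -3.5, sd = 0.28),
                   rnorm(1000, mean = -2.7, sd = 0.16))

# shapiro.test() accepts between 3 and 5000 observations.
sw <- shapiro.test(log_ratio_sim)

# A small p-value is evidence against normality.
sw$p.value
reject_normality <- sw$p.value < 0.05
```

On the real data the same call would be `shapiro.test(log(eth_btc$ratio))`, provided the series has at most 5000 rows.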
Given that the histogram looked like a Bactrian camel's back, I wondered whether the data contains two normal distributions, each representing a regime. The mixture fit suggests that it does.
mix_model <-normalmixEM(log(eth_btc$ratio)[is.finite(log(eth_btc$ratio))], k =2)
number of iterations= 45
summary(mix_model)
summary of normalmixEM object:
comp 1 comp 2
lambda 0.492086 0.507914
mu -3.499291 -2.719402
sigma 0.277704 0.155276
loglik at estimate: -1189.445
plot(mix_model, which =2) # plot the components
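To make the fitting step less of a black box, here is a minimal base-R sketch of the EM iteration for a two-component normal mixture. It is not the `mixtools` implementation: the function name `em_two_normals`, the median-split starting values, and the simulated input are all assumptions made for illustration.

```r
# Minimal EM for a two-component univariate normal mixture (illustrative only).
em_two_normals <- function(x, max_iter = 500, tol = 1e-8) {
  n <- length(x)
  # Crude starting values: split the sample at its median.
  lambda <- c(0.5, 0.5)
  mu <- c(mean(x[x <= median(x)]), mean(x[x > median(x)]))
  sigma <- rep(sd(x), 2)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: weighted component densities and posterior probabilities.
    dens <- cbind(lambda[1] * dnorm(x, mu[1], sigma[1]),
                  lambda[2] * dnorm(x, mu[2], sigma[2]))
    loglik <- sum(log(rowSums(dens)))  # log-likelihood at the current parameters
    post <- dens / rowSums(dens)
    # M-step: posterior-weighted updates of weights, means and std deviations.
    n_k <- colSums(post)
    lambda <- n_k / n
    mu <- colSums(post * x) / n_k
    sigma <- sqrt(colSums(post * outer(x, mu, "-")^2) / n_k)
    # Stop once the log-likelihood has stopped improving.
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(lambda = lambda, mu = mu, sigma = sigma,
       posterior = post, loglik = loglik, iterations = iter)
}

# Simulated log ratios with assumed parameters, standing in for the real data.
set.seed(1)
x_sim <- c(rnorm(1000, -3.5, 0.28), rnorm(1000, -2.7, 0.16))
fit_sim <- em_two_normals(x_sim)
fit_sim[c("lambda", "mu", "sigma")]
```

EM results depend on the starting values, and the component order is arbitrary, so calling `set.seed()` before `normalmixEM()` and checking which component has the lower mean keeps the regime labels reproducible between runs.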
Regime Summary Interpretation
Your model identifies two distinct regimes for the log-transformed ETH/BTC ratio:
Regime 1 (49.2% probability)
Mean (μ₁): -3.499 (log ratio)
Volatility (σ₁): 0.278
Implied ETH/BTC ratio: exp(-3.499) ≈ 0.030 BTC per ETH
Interpretation: A “low ETH valuation” regime where ETH is relatively cheap compared to BTC.
Regime 2 (50.8% probability)
Mean (μ₂): -2.719 (log ratio)
Volatility (σ₂): 0.155
Implied ETH/BTC ratio: exp(-2.719) ≈ 0.066 BTC per ETH
Interpretation: A “high ETH valuation” regime where ETH is relatively expensive compared to BTC.
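The implied ratios above are plain exponentials of the fitted means. The short sketch below reproduces that arithmetic from the `summary(mix_model)` output; the variable names are hypothetical. Note that `exp(mu)` is the median of a lognormal component, while its mean is `exp(mu + sigma^2 / 2)`, which is slightly higher.

```r
# Fitted parameters copied from the summary(mix_model) output.
mu    <- c(regime_1 = -3.499291, regime_2 = -2.719402)
sigma <- c(regime_1 = 0.277704,  regime_2 = 0.155276)

# Median ETH/BTC ratio implied by each lognormal component.
implied_median <- exp(mu)

# Mean ETH/BTC ratio implied by each lognormal component.
implied_mean <- exp(mu + sigma^2 / 2)

round(rbind(implied_median, implied_mean), 4)
```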
Regime Assignment
First, classify historical data into regimes using posterior probabilities from mix_model$posterior:
Calculate how far the current ratio deviates from its regime’s mean (in standard deviations):
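The two steps above can be sketched without the fitted object by rebuilding the classification from the reported parameters. The helper names `classify_regime` and `regime_signals`, the example ratios, and the symmetric 1.5 threshold are assumptions for illustration; with the real fit, `mix_model$posterior` gives the probabilities directly.

```r
# Parameters copied from summary(mix_model); the list mimics the fitted object.
fit <- list(
  lambda = c(0.492086, 0.507914),
  mu     = c(-3.499291, -2.719402),
  sigma  = c(0.277704, 0.155276)
)

# Assign each observation to the component with the highest posterior
# probability, which is the component with the largest weighted density.
classify_regime <- function(log_ratio, fit) {
  dens <- sapply(seq_along(fit$mu), function(k) {
    fit$lambda[k] * dnorm(log_ratio, fit$mu[k], fit$sigma[k])
  })
  dens <- matrix(dens, nrow = length(log_ratio))
  max.col(dens, ties.method = "first")
}

# Regime-specific z-scores and mean-reversion signals.
regime_signals <- function(ratio, fit, upper = 1.5, lower = 1.5) {
  log_ratio <- log(ratio)
  regime <- classify_regime(log_ratio, fit)
  z_score <- (log_ratio - fit$mu[regime]) / fit$sigma[regime]
  signal <- ifelse(z_score > upper, "Short ETH/BTC (overvalued)",
            ifelse(z_score < -lower, "Long ETH/BTC (undervalued)", "Hold"))
  data.frame(ratio, regime, z_score, signal)
}

# Hypothetical ETH/BTC ratios spanning both regimes.
example <- regime_signals(c(0.018, 0.030, 0.045, 0.066, 0.090), fit)
example
```

Reading the means and standard deviations from the fit, rather than typing them in, keeps the z-scores consistent if the model is re-estimated on new data.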
# Load required libraries
library(patchwork)

# Plot 1: ETH/BTC Ratio with Regimes and Signals
plot_ratio <- ggplot(eth_btc, aes(x = date)) +
  # Plot ETH/BTC ratio
  geom_line(aes(y = ratio, color = as.factor(regime)), linewidth = 0.8) +
  # Add trading signals
  geom_point(
    data = subset(eth_btc, signal != "Hold"),
    aes(y = ratio, shape = signal, color = as.factor(regime)),
    size = 3, alpha = 0.8
  ) +
  # Custom colors and labels
  scale_color_manual(values = c("1" = "#FF6B6B", "2" = "#4ECDC4"), name = "Regime") +
  scale_shape_manual(values = c("Short ETH/BTC (overvalued)" = 6,
                                "Long ETH/BTC (undervalued)" = 2)) +
  labs(title = "ETH/BTC Ratio with Regimes and Trading Signals",
       y = "ETH/BTC Ratio", x = "Date") +
  theme_minimal() +
  theme(legend.position = "bottom")

# Plot 2: Z-Scores with Thresholds
plot_zscore <- ggplot(eth_btc, aes(x = date)) +
  geom_line(aes(y = z_score, color = "Z-Score"), linewidth = 0.8) +
  geom_hline(yintercept = c(-lower_mean_reversion_threshold, upper_mean_reversion_threshold),
             linetype = "dashed", color = "gray40") +
  annotate("text", x = min(eth_btc$date), y = upper_mean_reversion_threshold + 0.1,
           label = "Overvalued Threshold", hjust = 0, color = "gray40") +
  annotate("text", x = min(eth_btc$date), y = -lower_mean_reversion_threshold - 0.1,
           label = "Undervalued Threshold", hjust = 0, color = "gray40") +
  labs(title = "Z-Score with Mean Reversion Thresholds", y = "Z-Score", x = "Date") +
  theme_minimal() +
  theme(legend.position = "none")

# Combine both plots
combined_plot <- plot_ratio / plot_zscore + plot_layout(heights = c(2, 1))
print(combined_plot)
Now that we have two regimes, each one has its own statistical summary. I used each regime's mean and standard deviation to calculate the z-score of the observations within that regime. As a result, we have the following data.
Source Code
---
title: "PulsarPeak Capital Investment Philosophy"
author: "Aftikhar"
format:
  html:
    code-fold: true
    code-tools: true
    mainfont: "Times New Roman"
    self-contained: true
    grid:
      sidebar-width: 50px
      body-width: 1400px
      margin-width: 50px
execute:
  echo: true
CSS: styles.css
fontsize: 12pt
---

```{r include = FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE, fig.align = "center",
                      tidy = FALSE, strip.white = TRUE)
library(quantmod)
library(magrittr)
library(dplyr)
library(ggplot2)
library(plotly)
library(mixtools)
library(scales)
```

```{r}
# Download daily BTC and ETH prices from Yahoo Finance
getSymbols(c("BTC-USD", "ETH-USD"), src = "yahoo", auto.assign = TRUE)

btcp <- data.frame(data = index(`BTC-USD`), coredata(`BTC-USD`)) %>%
  transmute(date = data, BTC_price = BTC.USD.Adjusted, BTC_volume = BTC.USD.Volume)

ethp <- data.frame(data = index(`ETH-USD`), coredata(`ETH-USD`)) %>%
  transmute(date = data, ETH_price = ETH.USD.Adjusted, ETH_volume = ETH.USD.Volume)

# ETH/BTC price ratio on dates where both series are available
eth_btc <- inner_join(btcp, ethp, by = "date") %>%
  transmute(date, ratio = ETH_price / BTC_price) %>%
  na.omit()

# Sanity checks: the ratio must be finite and non-zero before taking logs
any(is.na(eth_btc$ratio) | is.infinite(eth_btc$ratio) | eth_btc$ratio == 0)
which(is.na(eth_btc$ratio) | is.infinite(eth_btc$ratio) | eth_btc$ratio == 0)

graph_ratio <- ggplot(eth_btc, aes(x = ratio)) +
  geom_histogram(aes(y = after_stat(density)), bins = 150, fill = "lightblue", color = "black") +
  geom_density(color = "red", linewidth = 1) +
  labs(title = "Histogram of ETH/BTC Price Ratio", x = "ETH/BTC Ratio", y = "Density")

simple_ratio <- ggplot(eth_btc, aes(x = date, y = ratio)) +
  geom_line()

# qqnorm(log(eth_btc$ratio), main = "QQ Plot of log(ETH/BTC) Ratio")
# qqline(log(eth_btc$ratio), col = "red", lwd = 2)

# For example, a mixture of 2 lognormal distributions
# You'd fit to log_ratio, then exponentiate to interpret.
# mix_model <- normalmixEM(log(eth_btc$ratio)[is.finite(log(eth_btc$ratio))], k = 2)
# summary(mix_model)
# plot(mix_model, which = 2) # plot the components
```

Here we see the ETH/BTC ratio over time and the distribution of its observations. The histogram looks like a camel's back with two humps, which suggests a mixture of two roughly normal distributions.

```{r}
simple_ratio
graph_ratio
```

Now, let's test for normality. In the QQ plot, the dark points are the observed quantiles and the red line is the theoretical line for a normal distribution. Since the observed quantiles deviate from the QQ line, we can conclude that the log of the ratio is not normally distributed.

```{r}
qqnorm(log(eth_btc$ratio), main = "QQ Plot of log(ETH/BTC) Ratio")
qqline(log(eth_btc$ratio), col = "red", lwd = 2)
```

Given that the histogram looked like a Bactrian camel's back, I wondered whether the data contains two normal distributions, each representing a regime. The mixture fit suggests that it does.

```{r}
mix_model <- normalmixEM(log(eth_btc$ratio)[is.finite(log(eth_btc$ratio))], k = 2)
summary(mix_model)
plot(mix_model, which = 2) # plot the components
```

### **Regime Summary Interpretation**

Your model identifies **two distinct regimes** for the log-transformed ETH/BTC ratio:

1. **Regime 1 (49.2% probability)**
   - Mean (`μ₁`): **-3.499** (log ratio)
   - Volatility (`σ₁`): **0.278**
   - Implied ETH/BTC ratio: `exp(-3.499)` ≈ **0.030 BTC per ETH**
   - *Interpretation*: A "low ETH valuation" regime where ETH is relatively cheap compared to BTC.
2. **Regime 2 (50.8% probability)**
   - Mean (`μ₂`): **-2.719** (log ratio)
   - Volatility (`σ₂`): **0.155**
   - Implied ETH/BTC ratio: `exp(-2.719)` ≈ **0.066 BTC per ETH**
   - *Interpretation*: A "high ETH valuation" regime where ETH is relatively expensive compared to BTC.

**Regime Assignment**

First, classify historical data into regimes using posterior probabilities from `mix_model$posterior`:

```{r}
# Assign regimes (prob > 0.5)
eth_btc$regime <- ifelse(mix_model$posterior[, 1] > 0.5, 1, 2)

# Regime-specific z-scores, using the means and standard deviations reported above
eth_btc <- eth_btc %>%
  mutate(
    z_score = case_when(
      regime == 1 ~ (log(ratio) - (-3.499)) / 0.278,
      regime == 2 ~ (log(ratio) - (-2.719)) / 0.155
    )
  )

upper_mean_reversion_threshold <- 1.5
lower_mean_reversion_threshold <- 1.5

# The lower threshold is stored as a positive number, so negate it here,
# matching the dashed line drawn at -lower_mean_reversion_threshold in the plot.
eth_btc <- eth_btc %>%
  mutate(
    signal = case_when(
      z_score > upper_mean_reversion_threshold ~ "Short ETH/BTC (overvalued)",
      z_score < -lower_mean_reversion_threshold ~ "Long ETH/BTC (undervalued)",
      TRUE ~ "Hold"
    )
  )
```

#### 2. **Regime-Specific Z-Scores**

Calculate how far the current ratio deviates from its regime's mean (in standard deviations):

```{r}
# Load required libraries
library(patchwork)

# Plot 1: ETH/BTC Ratio with Regimes and Signals
plot_ratio <- ggplot(eth_btc, aes(x = date)) +
  # Plot ETH/BTC ratio
  geom_line(aes(y = ratio, color = as.factor(regime)), linewidth = 0.8) +
  # Add trading signals
  geom_point(
    data = subset(eth_btc, signal != "Hold"),
    aes(y = ratio, shape = signal, color = as.factor(regime)),
    size = 3, alpha = 0.8
  ) +
  # Custom colors and labels
  scale_color_manual(values = c("1" = "#FF6B6B", "2" = "#4ECDC4"), name = "Regime") +
  scale_shape_manual(values = c("Short ETH/BTC (overvalued)" = 6,
                                "Long ETH/BTC (undervalued)" = 2)) +
  labs(title = "ETH/BTC Ratio with Regimes and Trading Signals",
       y = "ETH/BTC Ratio", x = "Date") +
  theme_minimal() +
  theme(legend.position = "bottom")

# Plot 2: Z-Scores with Thresholds
plot_zscore <- ggplot(eth_btc, aes(x = date)) +
  geom_line(aes(y = z_score, color = "Z-Score"), linewidth = 0.8) +
  geom_hline(yintercept = c(-lower_mean_reversion_threshold, upper_mean_reversion_threshold),
             linetype = "dashed", color = "gray40") +
  annotate("text", x = min(eth_btc$date), y = upper_mean_reversion_threshold + 0.1,
           label = "Overvalued Threshold", hjust = 0, color = "gray40") +
  annotate("text", x = min(eth_btc$date), y = -lower_mean_reversion_threshold - 0.1,
           label = "Undervalued Threshold", hjust = 0, color = "gray40") +
  labs(title = "Z-Score with Mean Reversion Thresholds", y = "Z-Score", x = "Date") +
  theme_minimal() +
  theme(legend.position = "none")

# Combine both plots
combined_plot <- plot_ratio / plot_zscore + plot_layout(heights = c(2, 1))
print(combined_plot)
```

Now that we have two regimes, each one has its own statistical summary. I used each regime's mean and standard deviation to calculate the z-score of the observations within that regime. As a result, we have the following data.