PulsarPeak Capital Investment Philosophy

Author

Aftikhar

Code
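library(quantmod)   # getSymbols(); also attaches xts/zoo for index() and coredata()
library(dplyr)      # transmute(), inner_join(), mutate(), case_when()
library(ggplot2)    # plotting
library(mixtools)   # normalmixEM()

getSymbols(c("BTC-USD", "ETH-USD"), src = "yahoo", auto.assign = TRUE)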
[1] "BTC-USD" "ETH-USD"
Code
btcp <- data.frame(data=index(`BTC-USD`), coredata(`BTC-USD`)) %>% transmute(date = data, BTC_price = BTC.USD.Adjusted, BTC_volume = BTC.USD.Volume)

ethp <- data.frame(data=index(`ETH-USD`), coredata(`ETH-USD`)) %>% transmute(date = data, ETH_price = ETH.USD.Adjusted, ETH_volume = ETH.USD.Volume)

eth_btc <- inner_join(btcp, ethp, by = "date") %>% transmute(date, ratio = ETH_price/BTC_price) %>% na.omit()

any(is.na(eth_btc$ratio) | is.infinite(eth_btc$ratio) | eth_btc$ratio == 0)
[1] FALSE
Code
which(is.na(eth_btc$ratio) | is.infinite(eth_btc$ratio) | eth_btc$ratio == 0)
integer(0)
Code
graph_ratio <- ggplot(eth_btc, aes(x = ratio)) +
  geom_histogram(aes(y = after_stat(density)), bins = 150, fill = "lightblue", color = "black") +
  geom_density(color = "red", linewidth = 1) +
  labs(title = "Histogram of ETH/BTC Price Ratio",
       x = "ETH/BTC Ratio", y = "Density")

# Time series of the ETH/BTC ratio
simple_ratio <- ggplot(eth_btc, aes(x = date, y = ratio)) + geom_line()

Here we see the ratio over time and the distribution of its observations. The histogram looks like a camel’s back, with two humps that each resemble a normal distribution.

Code
simple_ratio

Code
graph_ratio

Next, let’s test for normality with a QQ plot of the log ratio. The dark points are the observed quantiles and the red line is the theoretical line for a normal distribution. Because the observed quantiles deviate from the QQ line, we can conclude that the data is not normally distributed.

Code
qqnorm(log(eth_btc$ratio), main = "QQ Plot of log(ETH/BTC) Ratio")
qqline(log(eth_btc$ratio), col = "red", lwd = 2)

Given that the histogram looked like a Bactrian camel’s back, I wondered whether the data contains two normal distributions, each representing a regime. The two-component mixture fit below suggests that it does.

Code
mix_model <- normalmixEM(log(eth_btc$ratio)[is.finite(log(eth_btc$ratio))], k = 2)
number of iterations= 45 
Code
summary(mix_model)
summary of normalmixEM object:
          comp 1    comp 2
lambda  0.492086  0.507914
mu     -3.499291 -2.719402
sigma   0.277704  0.155276
loglik at estimate:  -1189.445 
Code
plot(mix_model, which = 2)  # plot the components

Regime Summary Interpretation

The model identifies two distinct regimes for the log-transformed ETH/BTC ratio:

  1. Regime 1 (49.2% probability)

    • Mean (μ₁): -3.499 (log ratio)

    • Volatility (σ₁): 0.278

    • Implied ETH/BTC ratio: exp(-3.499) ≈ 0.030 BTC per ETH

    • Interpretation: A “low ETH valuation” regime where ETH is relatively cheap compared to BTC.

  2. Regime 2 (50.8% probability)

    • Mean (μ₂): -2.719 (log ratio)

    • Volatility (σ₂): 0.155

    • Implied ETH/BTC ratio: exp(-2.719) ≈ 0.066 BTC per ETH

    • Interpretation: A “high ETH valuation” regime where ETH is relatively expensive compared to BTC.
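
As a quick check on the implied ratios above, the following sketch exponentiates the fitted log means. It assumes each component is lognormal on the ratio scale, in which case exp(μ) is the component's median, while its mean is slightly higher at exp(μ + σ²/2). The parameter values are copied by hand from the summary output; the variable names are illustrative only.

```r
# Parameters copied from the summary(mix_model) output above
mu    <- c(regime1 = -3.499291, regime2 = -2.719402)
sigma <- c(regime1 = 0.277704,  regime2 = 0.155276)

# exp(mu) is the median of a lognormal component (the "implied ratio" quoted above)
implied_median <- exp(mu)

# The mean of a lognormal component is exp(mu + sigma^2 / 2)
implied_mean <- exp(mu + sigma^2 / 2)

print(round(implied_median, 3))
print(round(implied_mean, 3))
```

The gap between median and mean is wider in regime 1 because its σ is larger.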

Regime Assignment

First, classify historical data into regimes using posterior probabilities from mix_model$posterior:

Code
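# Assign each observation to the regime with posterior probability > 0.5
eth_btc$regime <- ifelse(mix_model$posterior[,1] > 0.5, 1, 2)

# Regime-specific z-score of the log ratio
# (means and sigmas are rounded from summary(mix_model) above)
eth_btc <- eth_btc %>%
  mutate(
    z_score = case_when(
      regime == 1 ~ (log(ratio) - (-3.499)) / 0.278,
      regime == 2 ~ (log(ratio) - (-2.719)) / 0.155
    )
  )

# Both thresholds are stored as positive distances from the regime mean
upper_mean_reversion_threshold <- 1.5
lower_mean_reversion_threshold <- 1.5

# Signal when the z-score leaves the [-1.5, 1.5] band
eth_btc <- eth_btc %>%
  mutate(
    signal = case_when(
      z_score > upper_mean_reversion_threshold ~ "Short ETH/BTC (overvalued)",
      z_score < -lower_mean_reversion_threshold ~ "Long ETH/BTC (undervalued)",
      TRUE ~ "Hold"
    )
  )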
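
To make the posterior probabilities behind the regime assignment more concrete, here is a minimal sketch that recomputes them by hand for a single ratio using Bayes' rule: each component's weight times its normal density at log(ratio), normalised to sum to one. The parameters are copied from the summary output above, and `regime_posterior` is a hypothetical helper, not part of mixtools.

```r
# Mixture parameters copied from summary(mix_model)
lambda <- c(0.492086, 0.507914)    # mixing weights
mu     <- c(-3.499291, -2.719402)  # component means (log ratio)
sigma  <- c(0.277704, 0.155276)    # component standard deviations

# Hypothetical helper: posterior probability of each regime for one ETH/BTC ratio
regime_posterior <- function(ratio) {
  x <- log(ratio)
  weighted <- lambda * dnorm(x, mean = mu, sd = sigma)
  weighted / sum(weighted)
}

post_low  <- regime_posterior(0.030)  # near the regime 1 centre
post_high <- regime_posterior(0.066)  # near the regime 2 centre

print(post_low)
print(post_high)
print(which.max(post_low))   # regime with the larger posterior
print(which.max(post_high))
```

With only two components, picking the regime with the larger posterior is the same rule as the `> 0.5` cut used above.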

2. Regime-Specific Z-Scores

Calculate how far the current ratio deviates from its regime’s mean (in standard deviations):
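
For a single observation the calculation can be sketched as follows. This assumes the rounded regime parameters from the model summary and a symmetric band of ±1.5 standard deviations, as drawn in the threshold plot; `classify_ratio` is a hypothetical helper used only for illustration.

```r
# Rounded regime parameters from the model summary
regime_mu    <- c(-3.499, -2.719)
regime_sigma <- c(0.278, 0.155)
threshold    <- 1.5  # assumed symmetric band, in standard deviations

# Hypothetical helper: z-score and signal for one ratio in a known regime
classify_ratio <- function(ratio, regime) {
  z <- (log(ratio) - regime_mu[regime]) / regime_sigma[regime]
  signal <- if (z > threshold) {
    "Short ETH/BTC (overvalued)"
  } else if (z < -threshold) {
    "Long ETH/BTC (undervalued)"
  } else {
    "Hold"
  }
  list(z_score = z, signal = signal)
}

high <- classify_ratio(0.085, regime = 2)  # well above the regime 2 centre
mid  <- classify_ratio(0.066, regime = 2)  # close to the regime 2 centre
low  <- classify_ratio(0.018, regime = 1)  # well below the regime 1 centre

print(high)
print(mid)
print(low)
```

The same ratio can give a different signal depending on the regime it is assigned to, which is why the regime classification comes first.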

Code
# Plot 1: ETH/BTC Ratio with Regimes and Signals
plot_ratio <- ggplot(eth_btc, aes(x = date)) +
  # Plot ETH/BTC ratio
  geom_line(aes(y = ratio, color = as.factor(regime)), linewidth = 0.8) +
  # Add trading signals
  geom_point(data = subset(eth_btc, signal != "Hold"),
             aes(y = ratio, shape = signal, color = as.factor(regime)),
             size = 3, alpha = 0.8) +
  # Custom colors and labels
  scale_color_manual(values = c("1" = "#FF6B6B", "2" = "#4ECDC4"),
                     name = "Regime") +
  scale_shape_manual(values = c("Short ETH/BTC (overvalued)" = 6,
                                "Long ETH/BTC (undervalued)" = 2)) +
  labs(title = "ETH/BTC Ratio with Regimes and Trading Signals",
       y = "ETH/BTC Ratio",
       x = "Date") +
  theme_minimal() +
  theme(legend.position = "bottom")

# Plot 2: Z-Scores with Thresholds
plot_zscore <- ggplot(eth_btc, aes(x = date)) +
  geom_line(aes(y = z_score, color = "Z-Score"), linewidth = 0.8) +
  geom_hline(yintercept = c(-lower_mean_reversion_threshold, 
                            upper_mean_reversion_threshold),
             linetype = "dashed", color = "gray40") +
  annotate("text", x = min(eth_btc$date), 
           y = upper_mean_reversion_threshold + 0.1,
           label = "Overvalued Threshold", hjust = 0, color = "gray40") +
  annotate("text", x = min(eth_btc$date),
           y = -lower_mean_reversion_threshold - 0.1,
           label = "Undervalued Threshold", hjust = 0, color = "gray40") +
  labs(title = "Z-Score with Mean Reversion Thresholds",
       y = "Z-Score",
       x = "Date") +
  theme_minimal() +
  theme(legend.position = "none")

# Combine both plots
library(patchwork)
combined_plot <- plot_ratio / plot_zscore + 
  plot_layout(heights = c(2, 1))

print(combined_plot)

Now that we have two regimes, each regime has its own statistical summary (mean and standard deviation). I used those to calculate the z-score of each observation within its regime. As a result, we have the following data.