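alpha <- 0.0017       # estimated intercept (alpha)
beta <- 0.98          # estimated market beta
se_alpha <- 0.0020    # standard error of alpha
se_beta <- 0.17       # standard error of beta
R2_q1 <- 0.50         # R-squared of the single-factor regression
E_mkt_prem <- 0.0070  # expected monthly market risk premium
n_q1 <- 96            # number of observations
critical_t <- 1.98    # approximate two-sided 5% critical value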
t_beta <- beta / se_beta
cat("t-statistic for beta:", round(t_beta, 4), "\n")
## t-statistic for beta: 5.7647
cat("Critical value: ±", critical_t, "\n")
## Critical value: ± 1.98
cat("Reject H0 (beta = 0)?", abs(t_beta) > critical_t, "\n")
## Reject H0 (beta = 0)? TRUE
Interpretation: Since |t| = 5.7647 > 1.98, we reject H₀: β = 0 at the 5% level. β = 0.98 means the fund's excess return moves about 0.98% for every 1% move in the market's excess return, so its systematic risk is nearly identical to the market's.
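The 1.98 cutoff is a rounded value. As a minimal sketch, the exact two-sided 5% critical value can be looked up with `qt()`; the degrees of freedom `n_q1 - 2` assume the regression estimates only an intercept and one slope, which the original does not state explicitly.

```r
# Exact two-sided 5% critical value for the single-factor regression
n_q1 <- 96                            # observations, as given above
df_q1 <- n_q1 - 2                     # assumption: two estimated parameters (alpha, beta)
crit_exact <- qt(0.975, df = df_q1)   # upper 2.5% quantile of the t distribution
cat("Exact critical t (df =", df_q1, "):", round(crit_exact, 4), "\n")
```

The exact value is slightly above 1.98, which does not change any reject/fail-to-reject decision in this section.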
t_beta1 <- (beta - 1) / se_beta
cat("t-statistic for H0: beta = 1:", round(t_beta1, 4), "\n")
## t-statistic for H0: beta = 1: -0.1176
cat("Reject H0 (beta = 1)?", abs(t_beta1) > critical_t, "\n")
## Reject H0 (beta = 1)? FALSE
Interpretation: |t| = 0.1176 < 1.98 → fail to reject H₀: β = 1 at the 5% level. The fund’s systematic risk is not statistically distinguishable from the market’s.
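Both beta tests can be read off a single confidence interval. The sketch below reuses the estimate, standard error, and 1.98 critical value from above; the name `ci_beta` is introduced here for illustration.

```r
# Approximate 95% confidence interval for beta
beta <- 0.98
se_beta <- 0.17
critical_t <- 1.98
ci_beta <- beta + c(-1, 1) * critical_t * se_beta
cat("95% CI for beta: [", round(ci_beta[1], 4), ",", round(ci_beta[2], 4), "]\n")
# The interval excludes 0 (reject beta = 0) and contains 1 (fail to reject beta = 1)
```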
t_alpha <- alpha / se_alpha
cat("t-statistic for alpha:", round(t_alpha, 4), "\n")
## t-statistic for alpha: 0.85
cat("Reject H0 (alpha = 0)?", abs(t_alpha) > critical_t, "\n")
## Reject H0 (alpha = 0)? FALSE
Interpretation: |t| = 0.85 < 1.98 → α is not statistically significant. The marketing claim of “positive risk-adjusted performance” is not statistically justified.
systematic_pct <- R2_q1 * 100
diversifiable_pct <- (1 - R2_q1) * 100
cat("Systematic variation:", systematic_pct, "%\n")
## Systematic variation: 50 %
cat("Diversifiable variation:", diversifiable_pct, "%\n")
## Diversifiable variation: 50 %
Interpretation: 50% of return variation is systematic (market risk); 50% is diversifiable (idiosyncratic) risk.
expected_excess_return <- beta * E_mkt_prem
cat("Expected monthly excess return:", round(expected_excess_return * 100, 4), "%\n")
## Expected monthly excess return: 0.686 %
Result: β × E[R_m − R_f] = 0.98 × 0.70% = 0.686% per month
a <- 0.0029; se_a <- 0.0018
b <- 0.97; se_b <- 0.08
s <- 0.75; se_s <- 0.11
h <- -0.13; se_h <- 0.13
R2_q2 <- 0.92
adj_R2_q2 <- 0.918
n_q2 <- 144
t_a <- a / se_a
t_b <- b / se_b
t_s <- s / se_s
t_h <- h / se_h
results <- data.frame(
  Term = c("Intercept (alpha)", "MKT (b)", "SMB (s)", "HML (h)"),
  Estimate = c(a, b, s, h),
  Std.Error = c(se_a, se_b, se_s, se_h),
  t_stat = round(c(t_a, t_b, t_s, t_h), 4),
  Significant = abs(c(t_a, t_b, t_s, t_h)) > critical_t
)
print(results)
##                Term Estimate Std.Error  t_stat Significant
## 1 Intercept (alpha)   0.0029    0.0018  1.6111       FALSE
## 2           MKT (b)   0.9700    0.0800 12.1250        TRUE
## 3           SMB (s)   0.7500    0.1100  6.8182        TRUE
## 4           HML (h)  -0.1300    0.1300 -1.0000       FALSE
Significant at 5%: MKT (b) and SMB (s). Intercept and HML are not significant.
cat("SMB loading (s):", s, "→ Large positive → Small-cap tilt\n")
## SMB loading (s): 0.75 → Large positive → Small-cap tilt
cat("HML loading (h):", h, "→ Small negative (insignificant) → No reliable value/growth tilt\n")
## HML loading (h): -0.13 → Small negative (insignificant) → No reliable value/growth tilt
cat("Style: Small-cap fund; any growth tilt is statistically insignificant\n")
## Style: Small-cap fund; any growth tilt is statistically insignificant
cat("Alpha:", a, " | t-stat:", round(t_a, 4), "\n")
## Alpha: 0.0029 | t-stat: 1.6111
cat("Statistically significant?", abs(t_a) > critical_t, "\n")
## Statistically significant? FALSE
Interpretation: α is not significant (t = 1.6111) → we cannot conclude the manager adds value beyond the three factor exposures.
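To see how far each coefficient is from the 5% threshold, two-sided p-values can be computed from the t-statistics. This is a sketch that assumes `n_q2 - 4` residual degrees of freedom (intercept plus three factors); the names `t_stats` and `p_vals` are illustrative.

```r
# Two-sided p-values for the three-factor regression
n_q2 <- 144
k <- 3                                   # number of factors (MKT, SMB, HML)
df_q2 <- n_q2 - k - 1                    # assumption: intercept + 3 slopes estimated
t_stats <- c(alpha = 0.0029 / 0.0018,
             MKT   = 0.97 / 0.08,
             SMB   = 0.75 / 0.11,
             HML   = -0.13 / 0.13)
p_vals <- 2 * pt(-abs(t_stats), df = df_q2)
print(signif(p_vals, 3))
```

The p-values for alpha and HML come out above 0.05 and those for MKT and SMB far below it, consistent with the table.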
R2_single <- 0.75
cat("Single-factor R²:", R2_single, "\n")
## Single-factor R²: 0.75
cat("Three-factor R²:", R2_q2, "\n")
## Three-factor R²: 0.92
cat("Adjusted R²:", adj_R2_q2, "\n")
## Adjusted R²: 0.918
cat("Improvement:", R2_q2 - R2_single, "\n")
## Improvement: 0.17
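Interpretation: Adding SMB and HML raises R² from 0.75 to 0.92, so the two factors explain substantial additional variation. Plain R² never falls when predictors are added, so it cannot be used on its own to compare models of different size. Adjusted R² penalizes extra predictors and rises only when the added factors improve fit by more than the degrees of freedom they use up, which makes it the better metric for comparing models with different numbers of predictors.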
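The reported adjusted R² can be reproduced from R², the sample size, and the number of predictors using the standard formula. A small check, with `k = 3` predictors assumed for the three-factor model:

```r
# Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)
R2 <- 0.92
n <- 144
k <- 3                                   # MKT, SMB, HML
adj_R2 <- 1 - (1 - R2) * (n - 1) / (n - k - 1)
cat("Adjusted R-squared:", round(adj_R2, 3), "\n")
```

With 144 observations and only three predictors the penalty is small, which is why the adjusted value sits so close to 0.92.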
beta0 <- -0.02
beta1 <- 5.4
beta2 <- -0.38
r_lag <- 0.010
dVIX <- 1.5
threshold <- 0.5
logit_val <- beta0 + beta1 * r_lag + beta2 * dVIX
prob_up <- 1 / (1 + exp(-logit_val))
cat("Logit value:", round(logit_val, 4), "\n")
## Logit value: -0.536
cat("P(Up):", round(prob_up, 4), "\n")
## P(Up): 0.3691
cat("Predicted class:", ifelse(prob_up >= threshold, "Up", "Down"), "\n")
## Predicted class: Down
Result: P(Up) = 0.3691 < 0.5 → predicted class is “Down”.
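The classification flips where the logit crosses zero, because P(Up) = 0.5 exactly when the logit is 0. As a rough sketch, holding the lagged return at 0.010, the VIX change at that boundary can be solved for directly; `dVIX_boundary` is a name introduced here.

```r
# VIX change at which P(Up) = 0.5, holding r_lag fixed
beta0 <- -0.02
beta1 <- 5.4
beta2 <- -0.38
r_lag <- 0.010
dVIX_boundary <- -(beta0 + beta1 * r_lag) / beta2
cat("dVIX at decision boundary:", round(dVIX_boundary, 4), "\n")
# Check: plogis() is the logistic function 1 / (1 + exp(-x))
cat("P(Up) at boundary:", plogis(beta0 + beta1 * r_lag + beta2 * dVIX_boundary), "\n")
```

The observed dVIX of 1.5 is well above this boundary, which is why the model predicts "Down".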
cat("beta1 = +5.4 (lagged return): Momentum effect\n")
## beta1 = +5.4 (lagged return): Momentum effect
cat(" Positive return yesterday → higher P(Up) tomorrow\n\n")
## Positive return yesterday → higher P(Up) tomorrow
cat("beta2 = -0.38 (delta VIX): Fear effect\n")
## beta2 = -0.38 (delta VIX): Fear effect
cat(" Rising VIX (fear) → lower P(Up) tomorrow\n")
## Rising VIX (fear) → lower P(Up) tomorrow
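The size of each effect is easier to read as an odds ratio. The sketch below assumes the lagged return is measured in decimals (so 0.01 is one percentage point) and dVIX in VIX points; neither unit is stated in the original.

```r
# Odds ratios implied by the logistic coefficients
beta1 <- 5.4
beta2 <- -0.38
or_ret <- exp(beta1 * 0.01)   # odds multiplier for a 0.01 higher lagged return
or_vix <- exp(beta2)          # odds multiplier for a 1-point rise in dVIX
cat("Odds ratio, +0.01 lagged return:", round(or_ret, 4), "\n")
cat("Odds ratio, +1 dVIX:", round(or_vix, 4), "\n")
```

Under these unit assumptions, a one-point VIX increase cuts the odds of an up move by roughly a third, while a one-percentage-point higher lagged return raises them by about 5 to 6%.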
TP <- 67; FP <- 44; FN <- 33; TN <- 56; Total <- 200
accuracy <- (TP + TN) / Total
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision <- TP / (TP + FP)
metrics <- data.frame(
  Metric = c("Accuracy", "Sensitivity (TPR for Up)", "Specificity", "Precision for Up"),
  Value = round(c(accuracy, sensitivity, specificity, precision), 4),
  Pct = paste0(round(c(accuracy, sensitivity, specificity, precision)*100, 2), "%")
)
print(metrics)
##                     Metric  Value    Pct
## 1                 Accuracy 0.6150  61.5%
## 2 Sensitivity (TPR for Up) 0.6700    67%
## 3              Specificity 0.5600    56%
## 4         Precision for Up 0.6036 60.36%
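Two further summaries are often reported alongside these four: the F1 score (harmonic mean of precision and sensitivity) and balanced accuracy (mean of sensitivity and specificity). They are not part of the original answer; this is a short sketch from the same confusion matrix.

```r
# F1 score and balanced accuracy from the confusion matrix
TP <- 67; FP <- 44; FN <- 33; TN <- 56
precision <- TP / (TP + FP)
recall <- TP / (TP + FN)            # same as sensitivity
specificity <- TN / (TN + FP)
f1 <- 2 * precision * recall / (precision + recall)
bal_acc <- (recall + specificity) / 2
cat("F1 score:", round(f1, 4), "\n")
cat("Balanced accuracy:", round(bal_acc, 4), "\n")
```

Because the classes are balanced here (100 Up, 100 Down), balanced accuracy coincides with plain accuracy.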
naive_accuracy <- max(TP + FN, TN + FP) / Total  # majority-class share; classes are balanced (100 Up, 100 Down)
cat("Naïve (majority class) accuracy:", naive_accuracy * 100, "%\n")
## Naïve (majority class) accuracy: 50 %
cat("Model accuracy:", round(accuracy * 100, 2), "%\n")
## Model accuracy: 61.5 %
cat("Model beats naïve rule?", accuracy > naive_accuracy, "\n\n")
## Model beats naïve rule? TRUE
cat("Accuracy is inadequate for trading: a wrong prediction on a large-move day\n")
## Accuracy is inadequate for trading: a wrong prediction on a large-move day
cat("costs far more than a correct prediction on a flat day.\n")
## costs far more than a correct prediction on a flat day.
cat("Better criterion: Sharpe ratio or P&L of the resulting trading strategy.\n")
## Better criterion: Sharpe ratio or P&L of the resulting trading strategy.
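Before moving to a P&L-based criterion, it is worth checking that 61.5% is distinguishable from a coin flip. A minimal sketch using an exact binomial test, under the assumption that the 200 predictions are independent (which daily market data may violate):

```r
# One-sided exact binomial test: is accuracy above 50%?
correct <- 67 + 56                   # TP + TN
total <- 200
bt <- binom.test(correct, total, p = 0.5, alternative = "greater")
cat("Observed accuracy:", correct / total, "\n")
cat("p-value vs. 50%:", signif(bt$p.value, 3), "\n")
```

A small p-value says the hit rate is unlikely to be luck; it says nothing about whether the hits occur on the days that matter for returns.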
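mean_return <- 0.0070  # mean monthly return, treated here as excess over the risk-free rate
sd_return <- 0.0550    # monthly standard deviation
n_months <- 48         # sample length in months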
SR_monthly <- mean_return / sd_return
scaling <- sqrt(12)
SR_annual <- SR_monthly * scaling
cat("Monthly Sharpe Ratio:", round(SR_monthly, 4), "\n")
## Monthly Sharpe Ratio: 0.1273
cat("Scaling factor: sqrt(12) =", round(scaling, 4), "\n")
## Scaling factor: sqrt(12) = 3.4641
cat("Annualized Sharpe Ratio:", round(SR_annual, 4), "\n")
## Annualized Sharpe Ratio: 0.4409
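For a quick sense of how noisy this estimate is, a commonly used large-sample approximation for i.i.d. returns puts the standard error of the per-period Sharpe ratio at sqrt((1 + SR²/2) / T). The sketch below applies it and annualizes with the same sqrt(12) factor; it is an approximation under the i.i.d. assumption, not part of the original answer.

```r
# Approximate analytic standard error of the Sharpe ratio (i.i.d. returns assumed)
SR_monthly <- 0.0070 / 0.0550
n_months <- 48
SE_monthly <- sqrt((1 + 0.5 * SR_monthly^2) / n_months)
SE_annual <- SE_monthly * sqrt(12)
cat("Approx. SE of annualized Sharpe ratio:", round(SE_annual, 3), "\n")
```

The standard error is larger than the annualized Sharpe ratio itself, so four years of monthly data cannot distinguish this fund's Sharpe ratio from zero.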
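set.seed(42)
# Simulated stand-in sample: 48 normal draws with the reported mean and sd
monthly_returns <- rnorm(n_months, mean = mean_return, sd = sd_return)
B <- 10000  # number of bootstrap resamples
boot_SR <- numeric(B)
for (i in seq_len(B)) {
  # i.i.d. resample of months with replacement
  samp <- sample(monthly_returns, size = n_months, replace = TRUE)
  boot_SR[i] <- mean(samp) / sd(samp) * sqrt(12)  # annualized Sharpe ratio
}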
SE_boot <- sd(boot_SR)
cat("Bootstrap SE of Annualized Sharpe Ratio:", round(SE_boot, 4), "\n")
## Bootstrap SE of Annualized Sharpe Ratio: 0.516
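Why the i.i.d. bootstrap is inappropriate: Monthly returns often exhibit serial correlation and volatility clustering. Resampling individual months independently destroys that time structure, so the bootstrap standard error can misstate the true uncertainty; it typically understates it when returns are positively autocorrelated.
Fix: Use a block bootstrap, which resamples contiguous runs of months and so preserves temporal dependence within each block.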
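A moving-block bootstrap can be sketched in base R as follows. The block length of 6 months and the 2,000 resamples are arbitrary choices for illustration, and the returns are simulated from the reported mean and standard deviation, so the resulting number is not an estimate for any real fund.

```r
# Moving-block bootstrap for the annualized Sharpe ratio (illustrative sketch)
set.seed(42)
n_months <- 48
monthly_returns <- rnorm(n_months, mean = 0.0070, sd = 0.0550)  # simulated sample

block_len <- 6                                # assumed block length, not from the original
n_blocks <- ceiling(n_months / block_len)     # blocks needed to cover the sample
max_start <- n_months - block_len + 1         # last valid block start
B <- 2000
boot_SR_block <- numeric(B)
for (i in seq_len(B)) {
  # Draw block start positions, then expand each into a run of consecutive months
  starts <- sample.int(max_start, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))[seq_len(n_months)]
  samp <- monthly_returns[idx]
  boot_SR_block[i] <- mean(samp) / sd(samp) * sqrt(12)
}
SE_block <- sd(boot_SR_block)
cat("Block-bootstrap SE of annualized Sharpe ratio:", round(SE_block, 4), "\n")
```

In practice the block length should be chosen with the dependence in the data in mind; packaged implementations such as `tsboot()` in the boot package offer fixed and random block lengths.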
lambda_min_cv <- 0.030
lambda_1se <- 0.065
cat("Lambda min-CV:", lambda_min_cv, "→ 14 factors\n")
## Lambda min-CV: 0.03 → 14 factors
cat("Lambda 1-SE: ", lambda_1se, "→ 7 factors\n\n")
## Lambda 1-SE: 0.065 → 7 factors
cat("Deploy: lambda =", lambda_1se, "(1-SE rule)\n")
## Deploy: lambda = 0.065 (1-SE rule)
cat("Reason: CV error within 1 SE of the minimum with half the factors → lower overfitting risk.\n")
## Reason: CV error within 1 SE of the minimum with half the factors → lower overfitting risk.
cat("Walk-Forward (Time-Respecting) Scheme:\n\n")
## Walk-Forward (Time-Respecting) Scheme:
cat("Fold 1: Train months 1-36 | Test months 37-48\n")
## Fold 1: Train months 1-36 | Test months 37-48
cat("Fold 2: Train months 1-48 | Test months 49-60\n")
## Fold 2: Train months 1-48 | Test months 49-60
cat("Fold 3: Train months 1-60 | Test months 61-72\n")
## Fold 3: Train months 1-60 | Test months 61-72
cat("...always expand training window forward...\n\n")
## ...always expand training window forward...
cat("Rule: NEVER use future data in the training set.\n\n")
## Rule: NEVER use future data in the training set.
cat("Why standard k-fold CV is UNSAFE:\n")
## Why standard k-fold CV is UNSAFE:
cat(" - Randomly mixes past and future observations\n")
## - Randomly mixes past and future observations
cat(" - Creates look-ahead bias → inflated performance estimates\n")
## - Creates look-ahead bias → inflated performance estimates
cat(" - Walk-forward respects the arrow of time → honest OOS evaluation\n")
## - Walk-forward respects the arrow of time → honest OOS evaluation
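The expanding-window scheme described above can be generated programmatically. The helper below is a minimal sketch; the function name `walk_forward_folds` and its arguments are made up for illustration, with defaults matching the 36-month initial window and 12-month test blocks listed above.

```r
# Expanding-window walk-forward folds (illustrative helper)
walk_forward_folds <- function(n_obs, initial_train = 36, test_size = 12) {
  folds <- list()
  train_end <- initial_train
  while (train_end + test_size <= n_obs) {
    folds[[length(folds) + 1]] <- list(
      train = 1:train_end,                               # all data up to train_end
      test  = (train_end + 1):(train_end + test_size)    # the next block only
    )
    train_end <- train_end + test_size                   # expand the window forward
  }
  folds
}

folds <- walk_forward_folds(n_obs = 72)
for (f in seq_along(folds)) {
  cat("Fold", f, ": Train months", min(folds[[f]]$train), "-", max(folds[[f]]$train),
      "| Test months", min(folds[[f]]$test), "-", max(folds[[f]]$test), "\n")
}
```

Every test index is strictly later than every training index in the same fold, which is the property that random k-fold splitting fails to guarantee.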