Model: \(R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\)
Regression output (\(n = 96\) months):
| Term | Estimate | Std. Error |
|---|---|---|
| Intercept, \(\hat{\alpha}\) | 0.0017 | 0.0020 |
| Market beta, \(\hat{\beta}\) | 0.98 | 0.17 |
\(R^2 = 0.50\), \(E[R_m - R_f] = 0.70\%\) per month, critical \(|t|_{0.05} \approx 1.98\).
alpha_hat <- 0.0017; se_alpha <- 0.0020
beta_hat <- 0.98; se_beta <- 0.17
R2_capm <- 0.50
E_mkt <- 0.0070
t_crit <- 1.98
\[t_{\hat{\beta}} = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{0.98}{0.17}\]
t-statistic for H0: beta = 0 : 5.765
Critical value (two-tailed, 5%): 1.98
Reject H0? TRUE
Result: \(t_{\hat{\beta}} = 5.7647\). Since \(|t| = 5.7647 > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level.
Economic interpretation: \(\hat{\beta} = 0.98\) means the fund moves nearly one-for-one with the market. A 1 percentage point rise in the market excess return is associated with approximately a 0.98 percentage point rise in the fund’s excess return. The fund’s systematic risk is very close to, but marginally below, that of the market portfolio.
\[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} = \frac{0.98 - 1}{0.17}\]
t-statistic for H0: beta = 1 : -0.1176
Reject H0? FALSE
Result: \(t = -0.1176\). Since \(|t| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% level.
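Interpretation: There is no statistically significant evidence that the fund’s systematic risk differs from the market’s. The deviation of \(\hat{\beta} = 0.98\) from unity is well within sampling error. The data are therefore consistent with market-equivalent systematic risk, with no discernible leverage reduction or amplification relative to the benchmark. Failing to reject is not proof of equality, however: with a standard error of 0.17, betas from roughly 0.64 to 1.32 (\(0.98 \pm 1.98 \times 0.17\)) cannot be ruled out at the 5% level.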
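Both beta tests can be read off a single confidence interval. The sketch below is illustrative rather than part of the original output: it reuses the reported estimate and standard error, and it assumes the critical value comes from a t distribution with \(n - 2 = 94\) degrees of freedom.

```r
# Reported CAPM estimates (from the regression table)
beta_hat <- 0.98
se_beta  <- 0.17
n        <- 96

# Exact two-tailed 5% critical value; df = n - 2 is an assumption
# (one intercept and one slope estimated)
t_crit_exact <- qt(0.975, df = n - 2)

# 95% confidence interval for beta
ci_beta <- beta_hat + c(-1, 1) * t_crit_exact * se_beta
names(ci_beta) <- c("lower", "upper")
print(round(ci_beta, 3))

# A null value is rejected at the 5% level exactly when it lies outside the interval
contains <- function(ci, value) ci[["lower"]] <= value && value <= ci[["upper"]]
cat("Interval contains 0:", contains(ci_beta, 0), "\n")
cat("Interval contains 1:", contains(ci_beta, 1), "\n")
```

The interval excludes 0 and includes 1, which matches the two test decisions above.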
\[t_{\hat{\alpha}} = \frac{\hat{\alpha} - 0}{SE(\hat{\alpha})} = \frac{0.0017}{0.0020}\]
t-statistic for H0: alpha = 0 : 0.85
Reject H0? FALSE
Result: \(t_{\hat{\alpha}} = 0.85\). Since \(|t| = 0.85 < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% level.
Conclusion: The marketing claim of “positive risk-adjusted performance” is not statistically justified. Although the point estimate \(\hat{\alpha} = 0.0017\) (~0.17% per month) is positive, it is indistinguishable from zero given the sampling variability. The data do not provide sufficient evidence of genuine managerial skill beyond market exposure.
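To make "indistinguishable from zero" concrete, the following sketch computes a two-sided p-value for the alpha estimate and the smallest alpha that would have been significant given the reported standard error. The residual degrees of freedom (94) are an assumption based on \(n = 96\) and two estimated parameters.

```r
alpha_hat <- 0.0017
se_alpha  <- 0.0020
df_resid  <- 96 - 2   # assumed residual degrees of freedom

# t-statistic and two-sided p-value for H0: alpha = 0
t_alpha <- alpha_hat / se_alpha
p_alpha <- 2 * pt(-abs(t_alpha), df = df_resid)

# Smallest |alpha| that would be significant at 5% with this standard error
alpha_needed <- qt(0.975, df = df_resid) * se_alpha

cat("t =", round(t_alpha, 2), "| two-sided p =", round(p_alpha, 3), "\n")
cat("Alpha needed for significance:", round(100 * alpha_needed, 3), "% per month\n")
```

With this much sampling noise, the monthly alpha would need to be roughly 0.40% rather than 0.17% before the marketing claim could be supported at the 5% level.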
\[R^2 = \frac{\text{Systematic variance}}{\text{Total variance}}\]
Systematic (market-explained) variance: 50 %
Idiosyncratic (diversifiable) variance: 50 %
Interpretation: Half (50%) of the fund’s monthly excess return variance is systematic, that is, explained by co-movement with the market portfolio. The remaining 50% is idiosyncratic (diversifiable) risk arising from fund-specific positions. This relatively low \(R^2\) indicates the fund carries substantial unsystematic risk that a well-diversified investor could eliminate by combining it with other assets.
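The identity behind this decomposition, \(R^2 = \hat{\beta}^2\,\mathrm{Var}(R_m - R_f) / \mathrm{Var}(R_i - R_f)\), can be checked numerically. The sketch below uses simulated data only: the market mean and volatility are hypothetical values, and the idiosyncratic volatility is chosen so that \(R^2\) lands near 0.5.

```r
set.seed(1)
n    <- 5000                                    # large synthetic sample, not the fund's data
beta <- 0.98
mkt  <- rnorm(n, mean = 0.007, sd = 0.045)      # hypothetical market excess returns
eps  <- rnorm(n, mean = 0, sd = beta * 0.045)   # idiosyncratic shocks sized for R^2 near 0.5
fund <- beta * mkt + eps

fit <- lm(fund ~ mkt)
r2  <- summary(fit)$r.squared

# Systematic share computed directly: beta_hat^2 * Var(market) / Var(fund)
b_hat     <- coef(fit)[["mkt"]]
sys_share <- b_hat^2 * var(mkt) / var(fund)

cat("R-squared from lm():", round(r2, 4), "\n")
cat("Systematic share   :", round(sys_share, 4), "\n")
```

In a single-factor regression the two numbers agree to machine precision, which is why \(R^2\) can be read directly as the systematic share of variance.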
\[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f] = 0.98 \times 0.70\%\]
CAPM-implied monthly excess return: 0.686 %
Result: The CAPM predicts the fund should earn \(0.98 \times 0.70\% = 0.686\%\) per month above the risk-free rate, as compensation for its market risk exposure.
Model: \(R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\)
Regression output (\(n = 144\) months):
| Term | Estimate | Std. Error |
|---|---|---|
| Intercept, \(\hat{\alpha}\) | 0.0029 | 0.0018 |
| MKT, \(\hat{b}\) | 0.97 | 0.08 |
| SMB, \(\hat{s}\) | 0.75 | 0.11 |
| HML, \(\hat{h}\) | −0.13 | 0.13 |
\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t|_{0.05} \approx 1.98\).
coefs <- c(alpha = 0.0029, b_MKT = 0.97, s_SMB = 0.75, h_HML = -0.13)
ses <- c(alpha = 0.0018, b_MKT = 0.08, s_SMB = 0.11, h_HML = 0.13)
R2_ff3 <- 0.92
adjR2_ff3 <- 0.918
n_q2 <- 144
\[t_k = \frac{\hat{\theta}_k}{SE(\hat{\theta}_k)}\]
t_stats <- coefs / ses
significant <- abs(t_stats) > 1.98
results_ff3 <- data.frame(
Coefficient = c("alpha (Intercept)", "b (MKT)", "s (SMB)", "h (HML)"),
Estimate = coefs,
Std_Error = ses,
t_Statistic = round(t_stats, 4),
Significant_5pct = ifelse(significant, "Yes", "No")
)
rownames(results_ff3) <- NULL
knitr::kable(results_ff3,
col.names = c("Coefficient", "Estimate", "Std. Error",
"t-Statistic", "Significant (5%)?"))
| Coefficient | Estimate | Std. Error | t-Statistic | Significant (5%)? |
|---|---|---|---|---|
| alpha (Intercept) | 0.0029 | 0.0018 | 1.611 | No |
| b (MKT) | 0.9700 | 0.0800 | 12.125 | Yes |
| s (SMB) | 0.7500 | 0.1100 | 6.818 | Yes |
| h (HML) | -0.1300 | 0.1300 | -1.000 | No |
Summary: MKT (\(t = 12.125\)) and SMB (\(t = 6.8182\)) are highly significant. The intercept (\(t = 1.6111\)) and HML (\(t = -1\)) are not significant at the 5% level.
cat("Size loading s =", coefs["s_SMB"], "| t =", round(t_stats["s_SMB"], 4),
"| Significant:", significant["s_SMB"])
Size loading s = 0.75 | t = 6.818 | Significant: TRUE
cat("\nValue loading h =", coefs["h_HML"], "| t =", round(t_stats["h_HML"], 4),
"| Significant:", significant["h_HML"])
Value loading h = -0.13 | t = -1 | Significant: FALSE
Size tilt: \(\hat{s} = 0.75 > 0\) and statistically significant (\(t = 6.8182\)). The fund has a strong small-cap tilt — it co-moves substantially more with small-capitalisation stocks than with large-capitalisation stocks.
Value/Growth tilt: \(\hat{h} = -0.13 < 0\) but not statistically significant (\(t = -1\)). The negative sign is directionally consistent with a growth tilt, but cannot be distinguished from zero at the 5% level.
Fund classification: Small-cap oriented fund with no statistically discernible value/growth bias.
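For readers who want to see how such a table arises, here is a minimal sketch that fits the three-factor regression with `lm()`. The factor series are simulated, not real Fama-French data; the factor means and volatilities and the residual volatility are hypothetical, and the reported loadings are used only as the "true" values in the simulation.

```r
set.seed(42)
n <- 144

# Hypothetical monthly factor returns in decimals (NOT real Fama-French data)
MKT <- rnorm(n, 0.006, 0.045)
SMB <- rnorm(n, 0.002, 0.030)
HML <- rnorm(n, 0.003, 0.030)

# Simulated fund excess return, using the reported loadings as the truth
excess <- 0.0029 + 0.97 * MKT + 0.75 * SMB - 0.13 * HML + rnorm(n, 0, 0.015)

fit <- lm(excess ~ MKT + SMB + HML)
tab <- summary(fit)$coefficients            # Estimate, Std. Error, t value, Pr(>|t|)

# Two-tailed 5% critical value with n - 4 residual degrees of freedom
t_crit <- qt(0.975, df = fit$df.residual)

# Flag significant coefficients (stored as 1/0 in the numeric matrix)
tab <- cbind(tab, Significant = abs(tab[, "t value"]) > t_crit)
print(round(tab, 4))
```

The estimates will differ from the table because the data are simulated; the point is the mechanics of reading a size or value tilt from the sign and t-statistic of each loading.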
t_alpha_ff <- t_stats["alpha"]
annual_alpha <- coefs["alpha"] * 12 * 100
cat("FF3 alpha (monthly) :", coefs["alpha"])
FF3 alpha (monthly) : 0.0029
t-statistic : 1.611
Reject H0: alpha = 0? : FALSE
Annualised alpha (approx) : 3.48 %
Interpretation: \(\hat{\alpha} = 0.0029\) (~0.29% per month, approximately 3.48% annualised) is the fund’s abnormal return after controlling for market, size, and value risk factor exposures.
Result: \(t = 1.6111\). Since \(|t| < 1.98\), we fail to reject \(H_0: \alpha = 0\). The manager does not add statistically significant value beyond the three factor exposures. The positive point estimate may reflect estimation error rather than genuine skill.
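The 3.48% figure uses simple multiplication by 12. As a small illustration, the sketch below compares it with the compounded equivalent; which convention is preferred is a judgment call and is not specified in the original.

```r
alpha_m <- 0.0029                       # monthly FF3 alpha (decimal)

simple_annual   <- 12 * alpha_m         # simple annualisation, as used above
compound_annual <- (1 + alpha_m)^12 - 1 # compounded annualisation

cat("Simple annualised alpha   :", round(100 * simple_annual, 2), "%\n")
cat("Compound annualised alpha :", round(100 * compound_annual, 2), "%\n")
```

The two conventions differ by only a few basis points at this magnitude, so the conclusion about statistical significance does not depend on the choice.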
CAPM R-squared : 0.75
FF3 R-squared : 0.92
Increase in R-squared : 0.17
FF3 Adjusted R-squared: 0.918
# Formula check: adj R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), k = 3
adj_check <- 1 - (1 - R2_ff3) * (n_q2 - 1) / (n_q2 - 3 - 1)
cat("\nVerified Adjusted R^2 :", round(adj_check, 4))
Verified Adjusted R^2 : 0.9183
Rise from 0.75 to 0.92: Adding SMB and HML accounts for an additional 17 percentage points of return variance. The single market factor fails to capture the fund’s strong small-cap tilt; including SMB resolves this gap, leaving only 8% of return variation unexplained.
Why adjusted \(R^2\) is the appropriate metric: Raw \(R^2\) is non-decreasing in the number of predictors — adding any variable, including noise, cannot reduce it. The adjusted \(R^2\) applies a degrees-of-freedom penalty:
\[\bar{R}^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}\]
When comparing CAPM (\(k = 1\)) with FF3 (\(k = 3\)), adjusted \(R^2\) is the more appropriate of the two measures because it rises only when the added factors reduce residual variance by more than the lost degrees of freedom cost. Here the adjusted \(R^2 = 0.918\) is barely below the raw \(R^2 = 0.92\), which indicates that the improvement is genuine rather than mechanical. Given the coefficient tests above, the gain is driven by SMB; HML, with \(t = -1\), contributes little.
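A formal complement to the adjusted \(R^2\) comparison is an F-test of the two added factors jointly. The sketch below applies the standard \(R^2\)-based F-statistic for nested models; it assumes the CAPM \(R^2\) of 0.75 was obtained for the same fund over the same 144 months.

```r
R2_r  <- 0.75   # restricted model: CAPM
R2_ur <- 0.92   # unrestricted model: FF3
n     <- 144
k     <- 3      # regressors in the unrestricted model
q     <- 2      # restrictions tested: s = 0 and h = 0

# F-statistic for H0: the SMB and HML loadings are jointly zero
F_stat <- ((R2_ur - R2_r) / q) / ((1 - R2_ur) / (n - k - 1))
F_crit <- qf(0.95, df1 = q, df2 = n - k - 1)
p_val  <- pf(F_stat, df1 = q, df2 = n - k - 1, lower.tail = FALSE)

# Adjusted R^2 of the CAPM on the same sample, for a like-for-like comparison
adj_capm <- 1 - (1 - R2_r) * (n - 1) / (n - 1 - 1)

cat("F-statistic:", round(F_stat, 2), "| 5% critical value:", round(F_crit, 2), "\n")
cat("p-value:", signif(p_val, 3), "\n")
cat("CAPM adjusted R^2:", round(adj_capm, 4), "vs FF3 adjusted R^2: 0.9183\n")
```

Under the stated assumption, the F-statistic far exceeds its critical value, so the two factors jointly add explanatory power even though HML alone is not significant.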
Model: \(\text{logit}\,P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2\,\Delta VIX_{t-1}\)
Estimated coefficients: \(\hat{\beta}_0 = -0.02\), \(\hat{\beta}_1 = 5.4\), \(\hat{\beta}_2 = -0.38\). Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta VIX = 1.5\).
Step 1 — Linear predictor (log-odds):
\[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2\,\Delta VIX_{t-1}\]
Step 2 — Sigmoid transformation:
\[\hat{P}(\text{Up}) = \frac{1}{1 + e^{-\text{logit}}}\]
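b0 <- -0.02; b1 <- 5.4; b2 <- -0.38   # estimated coefficients
r_lag <- 0.010; dVIX <- 1.5           # today's inputs
logit_val <- b0 + b1 * r_lag + b2 * dVIX
prob_up <- 1 / (1 + exp(-logit_val))
cat("Logit = -0.02 +", b1, "*", r_lag, "+", b2, "*", dVIX)
Logit = -0.02 + 5.4 * 0.01 + -0.38 * 1.5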
= -0.536
P(Up) = 0.3691
Predicted class (threshold 0.5): Down
Result:
\[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.570 = -0.536\]
\[\hat{P}(\text{Up}) = \frac{1}{1 + e^{-0.536}} = 0.3691\]
Since \(\hat{P}(\text{Up}) = 0.3691 < 0.50\), the predicted class is Down.
beta_1 (lagged return) = 5.4 > 0
beta_2 (delta VIX) = -0.38 < 0
\(\hat{\beta}_1 = 5.4 > 0\) — Short-term return momentum: A positive lagged return increases the probability of an “Up” day. This captures the tendency of recent gains to predict further gains — consistent with short-term price momentum and positive return autocorrelation at short horizons.
\(\hat{\beta}_2 = -0.38 < 0\) — Volatility/fear signal: A rise in VIX decreases the probability of an “Up” day. VIX measures implied volatility and investor fear; a VIX spike signals heightened uncertainty, typically associated with selling pressure and negative returns. This reflects the well-documented risk-off dynamic.
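Because logit coefficients act on log-odds, their size is easier to judge as odds ratios. The following sketch converts the two slopes and compares today's prediction with a hypothetical scenario in which VIX is unchanged (\(\Delta VIX = 0\)), a scenario that is not in the original.

```r
b0 <- -0.02; b1 <- 5.4; b2 <- -0.38

# Odds ratios: multiplicative change in the odds of an "Up" day
or_vix <- exp(b2)          # per 1-point rise in delta VIX
or_ret <- exp(b1 * 0.01)   # per 1 percentage point (0.01) rise in lagged return

# Predicted probabilities; plogis() is the logistic (sigmoid) function
p_today <- plogis(b0 + b1 * 0.010 + b2 * 1.5)   # today's inputs
p_calm  <- plogis(b0 + b1 * 0.010 + b2 * 0)     # hypothetical: VIX unchanged

cat("Odds ratio per VIX point      :", round(or_vix, 3), "\n")
cat("Odds ratio per 1pp lagged ret :", round(or_ret, 3), "\n")
cat("P(Up) today                   :", round(p_today, 4), "\n")
cat("P(Up) if delta VIX were 0     :", round(p_calm, 4), "\n")
```

With these coefficients, the 1.5-point VIX rise rather than the small positive lagged return is what pushes today's prediction below the 0.5 threshold.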
Confusion matrix (0.5 threshold, \(n = 200\) hold-out days):
| | Actual Up | Actual Down | Total |
|---|---|---|---|
| Predicted Up | 67 | 44 | 111 |
| Predicted Down | 33 | 56 | 89 |
| Total | 100 | 100 | 200 |
TP <- 67; FP <- 44; FN <- 33; TN <- 56; N <- 200
accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision <- TP / (TP + FP)
cat("Accuracy = (67 + 56) / 200 =", round(accuracy, 4))
Accuracy = (67 + 56) / 200 = 0.615
Sensitivity = 67 / (67 + 33) = 0.67
Specificity = 56 / (56 + 44) = 0.56
Precision = 67 / (67 + 44) = 0.6036
\[\text{Accuracy} = \frac{TP + TN}{N} = \frac{67 + 56}{200} = 0.615\]
\[\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{67}{100} = 0.67\]
\[\text{Specificity} = \frac{TN}{TN + FP} = \frac{56}{100} = 0.56\]
\[\text{Precision} = \frac{TP}{TP + FP} = \frac{67}{111} = 0.6036\]
Naive majority-class accuracy : 0.5
Model accuracy : 0.615
Model beats naive baseline? : TRUE
Naive rule: Since the test set is balanced (100 Up, 100 Down), there is no strict majority class; always predicting either class achieves:
\[\text{Naive accuracy} = \frac{100}{200} = 0.50\]
The model achieves \(0.615 > 0.50\), so it outperforms the naive baseline.
Why accuracy alone is inadequate: In a trading context, prediction errors carry asymmetric financial consequences. A false positive (predicting “Up” when the market falls) results in a losing long position, while a false negative (predicting “Down” when the market rises) results in a missed gain. Accuracy weights both error types equally and therefore fails to capture the true economic cost of misclassification.
A more economically relevant criterion is realised P&L: the portfolio return generated by acting on the model’s signals, net of transaction costs. Precision (the fraction of “Up” predictions that are correct) directly determines the hit rate of long entry signals and is more actionable for a practitioner. The F1 score summarises precision and sensitivity at the chosen threshold, while the AUC-ROC assesses how well the model ranks days across all classification thresholds.
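As a rough illustration of these alternatives, the sketch below computes the F1 score from the confusion matrix and a cost-weighted error measure. The two cost figures are hypothetical placeholders chosen only to show asymmetric weighting; they are not estimates of actual trading costs.

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56; N <- 200

precision   <- TP / (TP + FP)
sensitivity <- TP / (TP + FN)

# F1: harmonic mean of precision and sensitivity at the 0.5 threshold
f1 <- 2 * precision * sensitivity / (precision + sensitivity)

# Hypothetical asymmetric costs per error (arbitrary units, assumptions only)
cost_fp <- 1.0   # false positive: a losing long position
cost_fn <- 0.5   # false negative: a missed gain

# Average misclassification cost per hold-out day
avg_cost <- (FP * cost_fp + FN * cost_fn) / N

cat("F1 score          :", round(f1, 4), "\n")
cat("Average cost / day:", round(avg_cost, 4), "\n")
```

Two models with the same accuracy can have different average costs once the errors are weighted unequally, which is the sense in which accuracy alone is inadequate.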
Sample statistics over \(T = 48\) months: \(\bar{r} = 0.70\%\), \(\hat{\sigma} = 5.50\%\).
\[SR_{\text{monthly}} = \frac{\bar{r}}{\hat{\sigma}} = \frac{0.0070}{0.0550}\]
\[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]
r_bar <- 0.0070; sigma <- 0.0550; T_months <- 48   # sample statistics
SR_monthly <- r_bar / sigma
scale_factor <- sqrt(12)
SR_annual <- SR_monthly * scale_factor
cat("Monthly Sharpe Ratio :", round(SR_monthly, 4))
Monthly Sharpe Ratio : 0.1273
Scaling factor (sqrt 12): 3.464
Annualised Sharpe Ratio : 0.4409
\[SR_{\text{monthly}} = \frac{0.0070}{0.0550} = 0.1273\]
\[SR_{\text{annual}} = 0.1273 \times \sqrt{12} = 0.1273 \times 3.4641 = 0.4409\]
Scaling factor: \(\sqrt{12}\) — under i.i.d. monthly returns, the annualised mean scales by 12 and the annualised standard deviation scales by \(\sqrt{12}\), so their ratio scales by \(12 / \sqrt{12} = \sqrt{12}\).
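A point estimate from only 48 months is noisy. As a rough gauge of that noise, the sketch below uses the standard large-sample approximation for the standard error of a Sharpe ratio under i.i.d. normal returns, \(SE(SR) \approx \sqrt{(1 + SR^2/2)/T}\). The i.i.d. normal assumption is a simplification and is not asserted in the original.

```r
r_bar <- 0.0070
sigma <- 0.0550
T_months <- 48

SR_monthly <- r_bar / sigma

# Large-sample standard error under i.i.d. normal returns (an assumption)
se_analytic <- sqrt((1 + 0.5 * SR_monthly^2) / T_months)

# Annualised standard error, using the same sqrt(12) scaling as the ratio
se_annual <- se_analytic * sqrt(12)

cat("Analytic SE (monthly SR) :", round(se_analytic, 4), "\n")
cat("Analytic SE (annual SR)  :", round(se_annual, 4), "\n")
```

The monthly figure is close to the bootstrap standard error of 0.1473 reported in this section, and either value is larger than the monthly Sharpe ratio itself.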
set.seed(2024)
r_synthetic <- rnorm(T_months, mean = r_bar, sd = sigma)
B <- 10000
sr_boot <- numeric(B)
for (b in seq_len(B)) {
r_b <- sample(r_synthetic, size = T_months, replace = TRUE)
sr_boot[b] <- mean(r_b) / sd(r_b)
}
SE_boot <- sd(sr_boot)
CI_95 <- quantile(sr_boot, c(0.025, 0.975))
cat("Bootstrap replications (B) :", B)
Bootstrap replications (B) : 10000
Bootstrap SE (monthly SR) : 0.1473
95% Percentile CI (monthly SR): -0.2696 to 0.3113
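Bootstrap procedure, step by step:
1. Start from the \(T = 48\) monthly returns (here a synthetic series drawn from a normal distribution with mean \(\bar{r}\) and standard deviation \(\hat{\sigma}\), as in the code above).
2. Draw \(T\) observations with replacement to form one bootstrap sample.
3. Compute the monthly Sharpe ratio of that sample as its mean divided by its standard deviation.
4. Repeat steps 2 and 3 for \(B = 10{,}000\) replications.
5. Take the standard deviation of the \(B\) replicated Sharpe ratios as the bootstrap SE, and their 2.5th and 97.5th percentiles as the 95% percentile interval.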
Why i.i.d. bootstrap is inappropriate: Monthly financial returns exhibit serial dependence. Momentum produces positive autocorrelation in returns, and volatility clustering (as modelled by GARCH processes) produces dependence in squared returns. The i.i.d. bootstrap destroys the temporal ordering and so ignores this dependence; with positively autocorrelated returns it typically understates the variance of the Sharpe ratio estimator and produces confidence intervals that are too narrow.
Recommended variant: Stationary Block Bootstrap (Politis & Romano, 1994). Rather than resampling individual observations, this procedure draws contiguous blocks of consecutive returns whose lengths are random (geometrically distributed) with a chosen mean, e.g. \(\ell = 6\) to 12 months. Blocking preserves short-run serial dependence within each block while still providing asymptotically valid inference, and the random block lengths keep the resampled series stationary. The mean block length is chosen to cover the dominant autocorrelation horizon.
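A minimal base-R sketch of the stationary bootstrap follows. The helper name `stationary_boot_index`, the mean block length of 6, and the synthetic normal return series are all illustrative assumptions; in practice the observed returns would replace `r_synthetic`.

```r
# Stationary bootstrap indices: each block starts at a random position and
# continues with probability 1 - 1/mean_block, wrapping around the sample end.
stationary_boot_index <- function(n, mean_block) {
  p <- 1 / mean_block
  idx <- integer(n)
  idx[1] <- sample.int(n, 1)
  for (i in 2:n) {
    if (runif(1) < p) {
      idx[i] <- sample.int(n, 1)      # start a new block
    } else {
      idx[i] <- idx[i - 1] %% n + 1   # continue the block (wraps n -> 1)
    }
  }
  idx
}

set.seed(2024)
T_months <- 48
r_synthetic <- rnorm(T_months, mean = 0.0070, sd = 0.0550)  # stand-in for real returns

B <- 2000
sr_boot <- numeric(B)
for (b in seq_len(B)) {
  r_b <- r_synthetic[stationary_boot_index(T_months, mean_block = 6)]
  sr_boot[b] <- mean(r_b) / sd(r_b)
}

cat("Block-bootstrap SE (monthly SR):", round(sd(sr_boot), 4), "\n")
print(round(quantile(sr_boot, c(0.025, 0.975)), 4))
```

Because the synthetic series here is i.i.d. by construction, the result should be similar to the i.i.d. bootstrap; the two methods are expected to diverge only when the data are serially dependent.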
results_lasso <- data.frame(
Rule = c("Minimum CV Error", "One-Standard-Error Rule"),
Lambda = c(0.030, 0.065),
Factors_Retained = c(14, 7),
Recommendation = c("Risk of overfitting", "Recommended for deployment")
)
knitr::kable(results_lasso,
col.names = c("Selection Rule", "Lambda", "Factors Retained", "Assessment"))
| Selection Rule | Lambda | Factors Retained | Assessment |
|---|---|---|---|
| Minimum CV Error | 0.030 | 14 | Risk of overfitting |
| One-Standard-Error Rule | 0.065 | 7 | Recommended for deployment |
Recommendation: Deploy \(\lambda = 0.065\) (one-standard-error rule), retaining 7 factors.
Overfitting risk: With 60 candidate factors and a finite backtest window, the minimum-CV solution (14 factors) likely includes spurious predictors. Multiple-testing and data-mining biases are pervasive in quantitative finance; parsimonious models generalise more reliably out-of-sample.
One-SE principle: The 1-SE rule selects the most parsimonious model whose CV error is statistically indistinguishable from the minimum (within one standard error). Since the CV error curve is itself noisy, the apparent minimum may not be a genuine optimum. Preferring the simpler model is statistically and economically defensible.
Economic durability: Factors in finance tend to decay as capital arbitrages them away. A 7-factor model is more interpretable, imposes lower transaction costs, and is more robust to regime changes than a 14-factor model loaded with marginal predictors.
Parsimony principle: When two models perform comparably in cross-validation, prefer the simpler one. Deploy \(\lambda = 0.065\).
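The one-standard-error rule itself is simple to state in code. The cross-validation curve below is entirely hypothetical: its values are made up and chosen so that the two rules select the lambdas in the table, and they are not the results of the actual backtest.

```r
# Hypothetical CV curve: mean error (cvm) and its standard error (cvsd) per lambda
cv <- data.frame(
  lambda = c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090),
  cvm    = c(1.10, 1.06, 1.00, 1.01, 1.03, 1.12),
  cvsd   = c(0.05, 0.05, 0.04, 0.04, 0.04, 0.05)
)

# Rule 1: lambda with the minimum mean CV error
i_min      <- which.min(cv$cvm)
lambda_min <- cv$lambda[i_min]

# Rule 2: largest lambda whose CV error is within one SE of the minimum
threshold  <- cv$cvm[i_min] + cv$cvsd[i_min]
lambda_1se <- max(cv$lambda[cv$cvm <= threshold])

cat("lambda (min CV error):", lambda_min, "\n")
cat("lambda (one-SE rule) :", lambda_1se, "\n")
```

In the `glmnet` package, `cv.glmnet()` reports these two choices as `lambda.min` and `lambda.1se`, so the rule does not normally need to be coded by hand.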
wf <- data.frame(
Period = paste("Period", 1:5),
Train_Window = c("Months 1–36", "Months 1–42", "Months 1–48",
"Months 1–54", "Months 1–60"),
Test_Window = c("Months 37–42", "Months 43–48", "Months 49–54",
"Months 55–60", "Months 61–66"),
Train_n = c(36, 42, 48, 54, 60),
Test_n = c(6, 6, 6, 6, 6)
)
knitr::kable(wf,
col.names = c("Period", "Training Window", "Test Window", "Train n", "Test n"))
| Period | Training Window | Test Window | Train n | Test n |
|---|---|---|---|---|
| Period 1 | Months 1–36 | Months 37–42 | 36 | 6 |
| Period 2 | Months 1–42 | Months 43–48 | 42 | 6 |
| Period 3 | Months 1–48 | Months 49–54 | 48 | 6 |
| Period 4 | Months 1–54 | Months 55–60 | 54 | 6 |
| Period 5 | Months 1–60 | Months 61–66 | 60 | 6 |
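Walk-forward (expanding-window) procedure:
1. Fit the model on months 1–36 and evaluate it on months 37–42.
2. Extend the training window by the six months just tested (months 1–42) and evaluate on the next six (months 43–48).
3. Repeat until the final split, which trains on months 1–60 and tests on months 61–66.
4. Pool or average the five sets of out-of-sample results to estimate performance.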
Why standard \(k\)-fold CV is unsafe: Random \(k\)-fold CV assigns observations to folds without regard to time, so a training fold may contain observations that chronologically post-date the test fold. This is look-ahead bias — the model is effectively trained on future data, producing unrealistically optimistic performance estimates that will not materialise in live trading.
The violation is especially severe for financial time series because: (i) returns exhibit serial dependence that random shuffling destroys, causing CV error to underestimate true prediction error; and (ii) non-stationarity and regime changes mean earlier and later periods are not exchangeable. Walk-forward CV strictly enforces the information frontier — at every step, the model may only use data prior to the test window, exactly replicating real-time constraints.
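The expanding-window splits in the table can be generated programmatically. The helper `walk_forward_splits` below is a hypothetical name used for illustration; it returns index vectors only and leaves model fitting to the caller.

```r
# Build expanding-window splits: train on 1..(s - 1), test on s..(s + n_test - 1)
walk_forward_splits <- function(n_total, n_init, n_test) {
  starts <- seq(from = n_init + 1, to = n_total - n_test + 1, by = n_test)
  lapply(starts, function(s) {
    list(train = 1:(s - 1), test = s:(s + n_test - 1))
  })
}

splits <- walk_forward_splits(n_total = 66, n_init = 36, n_test = 6)

for (i in seq_along(splits)) {
  s <- splits[[i]]
  cat(sprintf("Period %d: train 1-%d (n = %d), test %d-%d (n = %d)\n",
              i, max(s$train), length(s$train),
              min(s$test), max(s$test), length(s$test)))
}

# Sanity check: no training index is at or after the start of its test window
no_lookahead <- all(sapply(splits, function(s) max(s$train) < min(s$test)))
cat("No look-ahead in any split:", no_lookahead, "\n")
```

The final check encodes the information frontier described above: every training window ends strictly before its test window begins.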