Model estimated on \(n = 96\) months: \[R_i - R_f = \alpha + \beta\,(R_m - R_f) + \varepsilon\]
| Term | Estimate | Std. Error |
|---|---|---|
| Intercept \(\alpha\) | 0.0017 | 0.0020 |
| Market premium \(\beta\) | 0.98 | 0.17 |
\(R^2 = 0.50\), average market premium \(E[R_m - R_f] = 0.70\% = 0.0070\), critical \(|t| \approx 1.98\).
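# Inputs taken from the CAPM regression table above
alpha <- 0.0017; se_alpha <- 0.0020
beta  <- 0.98;   se_beta  <- 0.17
R2       <- 0.50
mkt_prem <- 0.0070   # average monthly market excess return
tcrit    <- 1.98     # approximate two-sided 5% critical value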
\[t_\beta = \frac{\hat\beta - 0}{SE(\hat\beta)}\]
t_beta0 <- (beta - 0) / se_beta
round(t_beta0, 4)
## [1] 5.7647
\(t_\beta = 0.98/0.17 = 5.7647\). Since \(|5.7647| > 1.98\), reject \(H_0: \beta = 0\); \(\beta\) is statistically significant at the 5% level.
Economic interpretation: \(\beta \approx 0.98\) means the fund’s excess return moves almost one-for-one with the market — a 1% market excess return is associated with about a 0.98% fund excess return. The fund carries systematic (market) risk close to that of the market itself.
\[t = \frac{\hat\beta - 1}{SE(\hat\beta)}\]
t_beta1 <- (beta - 1) / se_beta
round(t_beta1, 4)
## [1] -0.1176
\(t = (0.98 - 1)/0.17 = -0.1176\). Since \(|{-0.1176}| < 1.98\), fail to reject \(H_0: \beta = 1\).
The fund’s beta is not statistically distinguishable from 1, so its systematic risk is not significantly different from the market’s. The fund is effectively “market-like” in exposure.
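Both slope tests can be read off a single confidence interval. The following is a minimal sketch, assuming the same approximate critical value of 1.98 used in the text; the names `ci_beta` and `contains` are illustrative and not part of the original solution.

```r
# Reported CAPM slope and its standard error (from the regression table)
beta    <- 0.98
se_beta <- 0.17
tcrit   <- 1.98   # approximate two-sided 5% critical value used in the text

# 95% confidence interval: estimate +/- tcrit * SE
ci_beta <- beta + c(-1, 1) * tcrit * se_beta
round(ci_beta, 4)

# Hypothetical helper: does the interval cover a hypothesised value?
contains <- function(ci, value) value >= ci[1] && value <= ci[2]
contains(ci_beta, 0)   # FALSE: consistent with rejecting beta = 0
contains(ci_beta, 1)   # TRUE: consistent with failing to reject beta = 1
```

The interval runs from about 0.64 to 1.32, so it excludes 0 and includes 1, matching the two t-tests.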
\[t_\alpha = \frac{\hat\alpha}{SE(\hat\alpha)}\]
t_alpha <- alpha / se_alpha
round(t_alpha, 4)
## [1] 0.85
\(t_\alpha = 0.0017/0.0020 = 0.85\). Since \(|0.85| < 1.98\), fail to reject \(H_0: \alpha = 0\).
The data do not statistically justify the marketing claim of “positive risk-adjusted performance.” The point estimate of alpha is positive (0.17%/month) but it is not statistically distinguishable from zero, so there is no reliable evidence of skill beyond market exposure.
\(R^2 = 0.50\) means 50% of the variation in the fund’s excess returns is explained by the market factor (systematic risk). The remaining 50% is idiosyncratic, diversifiable (fund-specific) variation not attributable to the market.
Under CAPM (\(\alpha = 0\)): \(E[R_i - R_f] = \beta \cdot E[R_m - R_f]\).
capm_er <- beta * mkt_prem
round(capm_er, 4)
## [1] 0.0069
\(E[R_i - R_f] = 0.98 \times 0.0070 = 0.00686 \approx 0.69\%\) per month.
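To see what the CAPM restriction \(\alpha = 0\) changes, the sketch below compares the CAPM-implied expected excess return with the one obtained by keeping the estimated (statistically insignificant) intercept. The variable name `fitted_er` is an illustrative assumption; the inputs are the reported estimates.

```r
alpha    <- 0.0017   # estimated intercept (not significant)
beta     <- 0.98     # estimated market beta
mkt_prem <- 0.0070   # average monthly market excess return

capm_er   <- beta * mkt_prem           # CAPM-implied: alpha restricted to 0
fitted_er <- alpha + beta * mkt_prem   # keeps the estimated alpha

round(c(capm = capm_er, fitted = fitted_er, gap = fitted_er - capm_er), 5)
```

The gap between the two figures is exactly the estimated alpha, 0.0017 per month, which the earlier t-test could not distinguish from zero.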
Estimated on \(n = 144\) months: \[R_i - R_f = \alpha + b\cdot MKT + s\cdot SMB + h\cdot HML + \varepsilon\]
| Term | Estimate | Std. Error |
|---|---|---|
| \(\alpha\) | 0.0029 | 0.0018 |
| MKT \(b\) | 0.97 | 0.08 |
| SMB \(s\) | 0.75 | 0.11 |
| HML \(h\) | -0.13 | 0.13 |
\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t| \approx 1.98\).
ff <- data.frame(
  term     = c("alpha","MKT (b)","SMB (s)","HML (h)"),
  estimate = c(0.0029, 0.97, 0.75, -0.13),
  se       = c(0.0018, 0.08, 0.11, 0.13)
)
ff$t_stat <- round(ff$estimate / ff$se, 4)
ff$significant <- ifelse(abs(ff$t_stat) > 1.98, "Yes", "No")
ff
##      term estimate     se  t_stat significant
## 1   alpha   0.0029 0.0018  1.6111          No
## 2 MKT (b)   0.9700 0.0800 12.1250         Yes
## 3 SMB (s)   0.7500 0.1100  6.8182         Yes
## 4 HML (h)  -0.1300 0.1300 -1.0000          No
Significant at 5%: MKT and SMB.
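Style: a small-cap fund (significantly positive SMB loading, \(s = 0.75\)) with market exposure close to one (\(b = 0.97\)) and no statistically meaningful value/growth tilt (\(h = -0.13\) is a faint, insignificant growth lean).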
\(\hat\alpha = 0.0029\) (about 0.29%/month, \(\approx 3.5\%\)/year) with \(t = 1.61 < 1.98\).
The alpha is positive in point estimate but not statistically significant. We therefore cannot conclude the manager adds value beyond the three factor exposures — the apparent outperformance is within sampling noise.
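The size of the sampling noise can be made concrete with a confidence interval for the three-factor alpha. This is a rough sketch, assuming the approximate critical value 1.98 and a simple (non-compounded) annualisation by a factor of 12, as in the 3.5%/year figure above; the variable names are illustrative.

```r
alpha_m  <- 0.0029   # monthly alpha estimate
se_alpha <- 0.0018   # its standard error
tcrit    <- 1.98     # approximate two-sided 5% critical value

# 95% confidence interval for the monthly alpha
ci_alpha_m <- alpha_m + c(-1, 1) * tcrit * se_alpha

# Simple annualisation (x12), ignoring compounding
ci_alpha_a <- 12 * ci_alpha_m

round(ci_alpha_m, 5)
round(ci_alpha_a, 4)
```

Under these assumptions the annualised interval spans roughly -0.8% to 7.8%, so it includes zero as well as economically large values; the data are simply not informative enough to pin alpha down.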
The increase from CAPM’s \(R^2 = 0.75\) to the three-factor \(R^2 = 0.92\) shows the SMB and HML factors (especially the strong size factor) explain a substantial portion of the fund’s return variation that the market factor alone misses — consistent with this being a small-cap fund.
Why adjusted \(R^2\): ordinary \(R^2\) never decreases when predictors are added, even useless ones, so it favors larger models and cannot fairly compare models of different dimension. Adjusted \(R^2\) penalizes the number of predictors, rising only when an added factor improves fit by more than would be expected by chance. Hence it is the appropriate metric when comparing the one-factor and three-factor specifications.
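The penalty can be checked numerically with the usual formula \(\bar R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}\). The helper `adj_r2` below is a hypothetical name, and the one-factor line assumes the CAPM was fitted on the same 144 months, which the source does not state.

```r
# Adjusted R^2 for a model with k predictors (plus intercept) and n observations
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Three-factor model: reproduces the reported adjusted R^2 of 0.918
round(adj_r2(0.92, n = 144, k = 3), 4)

# One-factor CAPM, ASSUMING the same n = 144 (not stated in the source)
round(adj_r2(0.75, n = 144, k = 1), 4)
```

With 144 observations the penalty for two extra factors is small, so the improvement in fit survives the adjustment almost unchanged (about 0.748 versus 0.918 under the stated assumption).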
\[\text{logit }P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]
\(\beta_0 = -0.02,\ \beta_1 = 5.4,\ \beta_2 = -0.38\); inputs \(r_{t-1} = 0.010,\ \Delta VIX_{t-1} = 1.5\).
b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; dVIX <- 1.5
\[z = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1},\qquad P(\text{Up}) = \frac{1}{1+e^{-z}}\]
z <- b0 + b1*r_lag + b2*dVIX
p_up <- 1/(1 + exp(-z))
round(z, 4)
## [1] -0.536
round(p_up, 4)
## [1] 0.3691
\(z = -0.02 + 5.4(0.010) - 0.38(1.5) = -0.536\), so \(P(\text{Up}) = 1/(1+e^{0.536}) = 0.3691\).
Since \(0.3691 < 0.5\), the predicted class is “Down.”
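The same calculation can be wrapped in a small function so that other input pairs are easy to score. This is a sketch only: `predict_up` and its `cutoff` argument are hypothetical names, and the coefficients are the ones quoted in the text.

```r
# Coefficients as reported in the text
coefs <- c(b0 = -0.02, b1 = 5.4, b2 = -0.38)

# Hypothetical helper: probability of an "Up" month and the implied class
predict_up <- function(r_lag, dVIX, coefs, cutoff = 0.5) {
  z <- coefs[["b0"]] + coefs[["b1"]] * r_lag + coefs[["b2"]] * dVIX
  p <- plogis(z)   # logistic function, identical to 1 / (1 + exp(-z))
  data.frame(z = z, p_up = p, class = ifelse(p > cutoff, "Up", "Down"))
}

# Inputs from the worked example
predict_up(r_lag = 0.010, dVIX = 1.5, coefs = coefs)

# Odds ratio for a one-unit increase in the lagged VIX change
exp(coefs[["b2"]])
```

The odds ratio is about 0.68: holding the lagged return fixed, each additional unit of lagged VIX change multiplies the odds of an Up month by roughly two thirds, which is why the 1.5-unit VIX rise dominates the small positive lagged return here.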
| | Actual Up | Actual Down | Total |
|---|---|---|---|
| Pred Up | 67 (TP) | 44 (FP) | 111 |
| Pred Down | 33 (FN) | 56 (TN) | 89 |
| Total | 100 | 100 | 200 |
TP <- 67; FP <- 44; FN <- 33; TN <- 56
accuracy <- (TP + TN) / 200
sensitivity <- TP / (TP + FN) # true positive rate for "Up"
specificity <- TN / (TN + FP)
precision <- TP / (TP + FP)
round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, precision = precision), 4)
##    accuracy sensitivity specificity   precision
##      0.6150      0.6700      0.5600      0.6036
The two classes are balanced (100 Up, 100 Down), so always predicting one class gives accuracy \(= 100/200 = 0.50\).
naive_acc <- 100/200
round(naive_acc, 4)
## [1] 0.5
The model’s accuracy (0.6150) exceeds the naive benchmark (0.50) by 11.5 percentage points.
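Whether an 11.5-point edge over the naive rule is more than luck can be gauged with a one-sided test of the hit rate against 0.5. The sketch below assumes the 200 classified months are independent and were not used to fit the model, neither of which the source confirms; the variable names are illustrative.

```r
correct <- 67 + 56   # TP + TN from the confusion matrix
n_obs   <- 200
p0      <- 0.5       # accuracy of the naive rule

# Normal-approximation z statistic for H0: accuracy = 0.5
z_acc <- (correct / n_obs - p0) / sqrt(p0 * (1 - p0) / n_obs)
round(z_acc, 4)

# Exact binomial test (base R), one-sided
bt <- binom.test(correct, n_obs, p = p0, alternative = "greater")
bt$p.value
```

Under those assumptions the z statistic is about 3.25, so the hit rate is statistically above one half. That still says nothing about profitability.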
Why accuracy alone is inadequate for a trading system: it counts directional hits equally and ignores the magnitude of returns and the asymmetric P&L of trades. A handful of large losses on wrong calls can outweigh many small correct calls, so a high hit-rate can still lose money. A more economically relevant criterion is the risk-adjusted return of the realised strategy P&L (e.g. the Sharpe ratio), or expected/cumulative profit net of transaction costs — metrics that weight each decision by the money it makes or loses.
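A toy example makes the point. The trade returns below are invented purely for illustration (they are not from the source): eight small wins and two large losses.

```r
# Hypothetical per-trade returns (illustrative only)
trade_ret <- c(rep(0.01, 8), rep(-0.05, 2))

hit_rate  <- mean(trade_ret > 0)               # share of profitable trades
total_pnl <- sum(trade_ret)                    # simple cumulative return
sharpe    <- mean(trade_ret) / sd(trade_ret)   # per-trade Sharpe ratio

round(c(hit_rate = hit_rate, total_pnl = total_pnl, sharpe = sharpe), 4)
```

The hit rate is 80%, yet the cumulative return is -2% and the Sharpe ratio is negative: accuracy looks excellent while the strategy loses money.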
Sample over \(n = 48\) months: mean monthly return \(= 0.70\% = 0.0070\), sample SD \(= 5.50\% = 0.0550\).
mu <- 0.0070; sigma <- 0.0550; n <- 48
\[SR_{\text{monthly}} = \frac{\bar r}{s},\qquad SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]
sr_m <- mu / sigma
sr_a <- sr_m * sqrt(12)
round(sr_m, 4)
## [1] 0.1273
round(sqrt(12), 4)
## [1] 3.4641
round(sr_a, 4)
## [1] 0.4409
Monthly \(SR = 0.0070/0.0550 = 0.1273\). Scaling factor \(= \sqrt{12} \approx 3.4641\): assuming i.i.d. returns, the mean return scales with time and volatility with \(\sqrt{\text{time}}\), so their ratio scales with \(\sqrt{\text{time}}\). Annualized \(SR = 0.1273 \times \sqrt{12} = 0.4409\).
Procedure:
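# `returns`: numeric vector of the 48 monthly returns (not listed in the source)
set.seed(1)                                # reproducible resampling
B <- 5000                                  # number of bootstrap replications
boot_sr <- replicate(B, {
  x <- sample(returns, length(returns), replace = TRUE)  # resample months with replacement
  mean(x) / sd(x)                                        # Sharpe ratio of the resample
})
se_sr <- sd(boot_sr)                       # bootstrap standard error
quantile(boot_sr, c(0.025, 0.975))         # 95% percentile CI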
Why the ordinary i.i.d. bootstrap is inappropriate here: it assumes observations are independent, but monthly returns exhibit serial dependence (autocorrelation and volatility clustering). Resampling individual months destroys this time-series structure, so the resulting standard error is unreliable and typically understates the true sampling variability.
The fix is the block bootstrap (moving-block or stationary bootstrap), which resamples contiguous blocks of consecutive months so that short-range dependence is preserved within each block.
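A moving-block version can be sketched in base R as follows. Everything here beyond the reported mean and SD is an assumption: the `returns` vector is simulated as a stand-in because the source does not list the data, `mbb_sharpe` is a hypothetical helper, and the block length of 6 months is an arbitrary illustrative choice.

```r
# Stand-in data: simulated with the reported mean and SD, for illustration only
set.seed(1)
returns <- rnorm(48, mean = 0.0070, sd = 0.0550)

# Moving-block bootstrap of the monthly Sharpe ratio
mbb_sharpe <- function(x, block_len = 6, B = 5000) {
  n        <- length(x)
  n_blocks <- ceiling(n / block_len)
  starts   <- seq_len(n - block_len + 1)   # admissible block start positions
  replicate(B, {
    s   <- sample(starts, n_blocks, replace = TRUE)
    # Expand each start into a run of consecutive indices, blocks laid end to end
    idx <- as.vector(sapply(s, function(i) i:(i + block_len - 1)))
    xb  <- x[idx[seq_len(n)]]              # trim to the original sample length
    mean(xb) / sd(xb)
  })
}

boot_sr <- mbb_sharpe(returns)
sd(boot_sr)                          # block-bootstrap standard error
quantile(boot_sr, c(0.025, 0.975))   # 95% percentile interval
```

Because the stand-in series is simulated as i.i.d., this only demonstrates the mechanics; with real returns the block length should be chosen to cover the dependence horizon. The `boot` package's `tsboot` function provides fixed-length and geometric-length (stationary) block resampling if a library implementation is preferred.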
Deploy the one-standard-error solution (\(\lambda = 0.065\), 7 factors). The 1-SE rule keeps the most parsimonious model whose CV error is within one standard error of the minimum, so we give up no statistically meaningful accuracy. In a noisy backtest with 60 candidate factors and a high risk of fitting spurious patterns, the simpler 7-factor model is more robust to overfitting, more stable, more interpretable, and more likely to generalize out-of-sample. The minimum-CV model tends to chase noise.
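The selection logic of the 1-SE rule can be sketched as below. The cross-validation table is entirely made up for illustration and constructed so that the rule lands on \(\lambda = 0.065\) with 7 factors; the source reports only the chosen \(\lambda\) and factor count, not the CV path.

```r
# Hypothetical CV summary (invented numbers, lambda in ascending order)
cv <- data.frame(
  lambda = c(0.005, 0.010, 0.020, 0.040, 0.065, 0.100),
  cv_err = c(1.08, 1.02, 1.00, 1.01, 1.03, 1.10),
  cv_se  = c(0.05, 0.04, 0.04, 0.04, 0.04, 0.05),
  n_coef = c(31, 22, 15, 10, 7, 3)
)

i_min     <- which.min(cv$cv_err)                 # minimum-CV model
threshold <- cv$cv_err[i_min] + cv$cv_se[i_min]   # minimum error plus one SE

# Largest lambda (sparsest model) whose CV error is within the threshold
i_1se <- max(which(cv$cv_err <= threshold))

cv[c(i_min, i_1se), ]
```

In this invented example the minimum-CV model keeps 15 factors while the 1-SE model keeps 7 at a CV error that is statistically indistinguishable. If the lasso was fitted with glmnet (an assumption; the source does not name the software), `cv.glmnet` exposes the two choices as `lambda.min` and `lambda.1se`.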
Scheme:
Why random k-fold CV is unsafe here: it shuffles observations into folds, so a training fold can contain months that occur after the test months. This is look-ahead bias (data leakage): future information enters the fitted model, breaking the temporal ordering and producing optimistically inflated performance estimates. A trading model can only use the past to predict the future, so the evaluation must respect the arrow of time.
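One way to respect time ordering is an expanding-window (walk-forward) split, where each test block is preceded by all of its training data. The sketch below is illustrative only: `walk_forward_splits` is a hypothetical helper, and the sample size, initial window, and horizon are arbitrary choices rather than values from the source.

```r
# Hypothetical helper: expanding-window train/test index pairs
walk_forward_splits <- function(n_obs, initial, horizon = 1) {
  starts <- seq(initial + 1, n_obs - horizon + 1, by = horizon)
  lapply(starts, function(s) {
    list(train = seq_len(s - 1),        # all months strictly before the test block
         test  = s:(s + horizon - 1))   # the next `horizon` months
  })
}

splits <- walk_forward_splits(n_obs = 48, initial = 36, horizon = 3)
length(splits)

# Check: every training month precedes every test month in each split
all(vapply(splits, function(sp) max(sp$train) < min(sp$test), logical(1)))
```

With these settings there are four splits (test months 37-39, 40-42, 43-45, and 46-48), and the training window grows by three months each time. A rolling window of fixed length is a common alternative when older data are thought to be less relevant.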