1 Question 1. Single-Factor (Market) Model [25 points]

Model estimated on \(n = 96\) months: \[R_i - R_f = \alpha + \beta\,(R_m - R_f) + \varepsilon\]

Term	Estimate	Std. Error
Intercept \(\alpha\)	0.0017	0.0020
Market premium \(\beta\)	0.98	0.17

\(R^2 = 0.50\), average market premium \(E[R_m - R_f] = 0.70\% = 0.0070\), critical \(|t| \approx 1.98\).

alpha   <- 0.0017; se_alpha <- 0.0020
beta    <- 0.98;   se_beta  <- 0.17
R2      <- 0.50
mkt_prem<- 0.0070
tcrit   <- 1.98

1.1 (a) Test \(H_0:\ \beta = 0\)

\[t_\beta = \frac{\hat\beta - 0}{SE(\hat\beta)}\]

t_beta0 <- (beta - 0) / se_beta
round(t_beta0, 4)

## [1] 5.7647

\(t_\beta = 0.98/0.17 = 5.7647\). Since \(|5.7647| > 1.98\), reject \(H_0\): \(\beta\) is statistically significant at the 5% level.

Economic interpretation: \(\beta \approx 0.98\) means the fund’s excess return moves almost one-for-one with the market — a 1% market excess return is associated with about a 0.98% fund excess return. The fund carries systematic (market) risk close to that of the market itself.
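As a cross-check on the rounded critical value, the exact two-sided 5% cutoff and the p-value can be computed directly. This is a minimal sketch that assumes \(n - 2 = 94\) residual degrees of freedom (intercept plus one slope); the names `n_obs`, `df_resid`, `tcrit_exact` and `p_beta0` are introduced here for illustration only.

```r
# Inputs copied from the regression table above.
beta <- 0.98; se_beta <- 0.17
n_obs <- 96
df_resid <- n_obs - 2                      # assumption: intercept + one slope estimated

t_beta0     <- (beta - 0) / se_beta        # same t-statistic as in part (a)
tcrit_exact <- qt(0.975, df_resid)         # exact two-sided 5% critical value
p_beta0     <- 2 * pt(-abs(t_beta0), df_resid)  # two-sided p-value

round(tcrit_exact, 4)
p_beta0 < 0.05
```

The exact critical value is close to the 1.98 used throughout, so the conclusion does not depend on the rounding.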

1.2 (b) Test \(H_0:\ \beta = 1\)

\[t = \frac{\hat\beta - 1}{SE(\hat\beta)}\]

t_beta1 <- (beta - 1) / se_beta
round(t_beta1, 4)

## [1] -0.1176

\(t = (0.98 - 1)/0.17 = -0.1176\). Since \(|{-0.1176}| < 1.98\), fail to reject \(H_0\).

The fund’s beta is not statistically distinguishable from 1, so its systematic risk is not significantly different from the market’s. The fund is effectively “market-like” in exposure.
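Parts (a) and (b) can be read off a single confidence interval. The sketch below builds an approximate 95% interval for \(\beta\) from the tabulated estimate and standard error; `ci_beta` is a name introduced here.

```r
# Approximate 95% confidence interval for beta, using the given critical value.
beta <- 0.98; se_beta <- 0.17; tcrit <- 1.98
ci_beta <- beta + c(-1, 1) * tcrit * se_beta
round(ci_beta, 4)   # 0.6434 1.3166
```

The interval excludes 0 (reject \(H_0:\ \beta = 0\)) and contains 1 (fail to reject \(H_0:\ \beta = 1\)), which matches the two t-tests.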

1.3 (c) Jensen’s alpha — test \(H_0:\ \alpha = 0\)

\[t_\alpha = \frac{\hat\alpha}{SE(\hat\alpha)}\]

t_alpha <- alpha / se_alpha
round(t_alpha, 4)

## [1] 0.85

\(t_\alpha = 0.0017/0.0020 = 0.85\). Since \(|0.85| < 1.98\), fail to reject \(H_0\).

The data do not statistically justify the marketing claim of “positive risk-adjusted performance.” The point estimate of alpha is positive (0.17%/month) but it is not statistically distinguishable from zero, so there is no reliable evidence of skill beyond market exposure.
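To make "not distinguishable from zero" concrete, the sketch below computes an approximate 95% interval for \(\alpha\) and the smallest monthly alpha that would have been significant given this standard error. The names `ci_alpha` and `min_sig_alpha` are introduced here for illustration.

```r
# Inputs copied from the regression table above.
alpha <- 0.0017; se_alpha <- 0.0020; tcrit <- 1.98

ci_alpha      <- alpha + c(-1, 1) * tcrit * se_alpha  # approximate 95% CI
min_sig_alpha <- tcrit * se_alpha                     # |alpha| needed to reject H0

round(ci_alpha, 5)        # -0.00226  0.00566
round(min_sig_alpha, 5)   #  0.00396
```

The interval straddles zero, and the estimated 0.17% per month is less than half of the roughly 0.40% per month that would be needed for significance at this level of precision.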

1.4 (d) Interpret \(R^2\)

\(R^2 = 0.50\) means 50% of the variation in the fund’s excess returns is explained by the market factor (systematic risk). The remaining 50% is residual variation not attributable to the market: idiosyncratic (fund-specific, diversifiable) risk, possibly together with exposure to factors the single-factor model omits.
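In a single-factor regression the in-sample identity \(R^2 = \hat\beta^2\,s_m^2 / s_i^2\) holds, which is what justifies reading \(R^2\) as the systematic share of variance. The sketch below checks this on simulated data; the return series and the volatilities used to generate them are hypothetical, since the exam supplies only summary statistics.

```r
set.seed(42)
# Hypothetical data: 96 months of market and fund excess returns.
mkt  <- rnorm(96, mean = 0.007, sd = 0.04)
fund <- 0.0017 + 0.98 * mkt + rnorm(96, mean = 0, sd = 0.04)

fit       <- lm(fund ~ mkt)
r2_lm     <- summary(fit)$r.squared
# Systematic variance (beta^2 * var of market) over total variance of the fund.
r2_decomp <- unname(coef(fit)["mkt"]^2 * var(mkt) / var(fund))

all.equal(r2_decomp, r2_lm)   # TRUE
```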

1.5 (e) CAPM-implied expected monthly excess return

Under CAPM (\(\alpha = 0\)): \(E[R_i - R_f] = \beta \cdot E[R_m - R_f]\).

capm_er <- beta * mkt_prem
round(capm_er, 4)

## [1] 0.0069

\(E[R_i - R_f] = 0.98 \times 0.0070 = 0.00686 \approx 0.69\%\) per month.

2 Question 2. Fama–French Three-Factor Model [25 points]

Estimated on \(n = 144\) months: \[R_i - R_f = \alpha + b\cdot MKT + s\cdot SMB + h\cdot HML + \varepsilon\]

Term	Estimate	Std. Error
\(\alpha\)	0.0029	0.0018
MKT \(b\)	0.97	0.08
SMB \(s\)	0.75	0.11
HML \(h\)	-0.13	0.13

\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t| \approx 1.98\).

ff <- data.frame(
  term     = c("alpha","MKT (b)","SMB (s)","HML (h)"),
  estimate = c(0.0029, 0.97, 0.75, -0.13),
  se       = c(0.0018, 0.08, 0.11, 0.13)
)

2.1 (f) t-statistics and significance

ff$t_stat      <- round(ff$estimate / ff$se, 4)
ff$significant <- ifelse(abs(ff$t_stat) > 1.98, "Yes", "No")
ff

##      term estimate     se  t_stat significant
## 1   alpha   0.0029 0.0018  1.6111          No
## 2 MKT (b)   0.9700 0.0800 12.1250         Yes
## 3 SMB (s)   0.7500 0.1100  6.8182         Yes
## 4 HML (h)  -0.1300 0.1300 -1.0000          No

\(\alpha\): \(t = 1.6111\) → not significant
\(b\) (MKT): \(t = 12.125\) → significant
\(s\) (SMB): \(t = 6.8182\) → significant
\(h\) (HML): \(t = -1.000\) → not significant

Significant at 5%: MKT and SMB.
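The same conclusions can be expressed as two-sided p-values. This is a sketch that assumes \(144 - 4 = 140\) residual degrees of freedom (intercept plus three slopes); `df_resid` and `p_value` are names introduced here.

```r
# Estimates and standard errors copied from the three-factor table.
ff <- data.frame(
  term     = c("alpha", "MKT (b)", "SMB (s)", "HML (h)"),
  estimate = c(0.0029, 0.97, 0.75, -0.13),
  se       = c(0.0018, 0.08, 0.11, 0.13)
)
df_resid   <- 144 - 4                                   # assumption: 4 estimated coefficients
ff$t_stat  <- ff$estimate / ff$se
ff$p_value <- 2 * pt(-abs(ff$t_stat), df_resid)         # two-sided p-values
ff$significant <- ff$p_value < 0.05
ff[, c("term", "t_stat", "p_value", "significant")]
```

Only MKT and SMB fall below 0.05, in agreement with the comparison against the 1.98 critical value.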

2.2 (g) Investment-style classification

SMB \(s = 0.75 > 0\) and highly significant → the fund loads positively on small-minus-big, i.e. it behaves like small-cap stocks → a small-cap (size) tilt.
HML \(h = -0.13\), negative but not significant → a nominal lean toward growth (negative HML = growth), but the loading is statistically indistinguishable from zero, so there is no significant value/growth tilt.

Style: a small-cap fund with no statistically meaningful value/growth tilt (a faint, insignificant growth lean).

2.3 (h) Intercept interpretation

\(\hat\alpha = 0.0029\) (about 0.29%/month, \(\approx 3.5\%\)/year) with \(t = 1.61 < 1.98\).

The alpha is positive in point estimate but not statistically significant. We therefore cannot conclude the manager adds value beyond the three factor exposures — the apparent outperformance is within sampling noise.

2.4 (i) \(R^2\) rising 0.75 → 0.92, and why adjusted \(R^2\)

The increase from CAPM’s \(R^2 = 0.75\) to the three-factor \(R^2 = 0.92\) shows the SMB and HML factors (especially the strong size factor) explain a substantial portion of the fund’s return variation that the market factor alone misses — consistent with this being a small-cap fund.

Why adjusted \(R^2\): ordinary \(R^2\) never decreases when predictors are added, even useless ones, so it mechanically favors larger models and cannot fairly compare models of different dimension. Adjusted \(R^2\) penalizes the number of predictors and rises only when an added factor improves fit by more than the lost degree of freedom costs. Hence it is the appropriate metric when comparing the one-factor and three-factor specifications.
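The reported adjusted \(R^2\) can be reproduced from the standard formula \(\bar R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}\). The sketch below wraps it in a helper, `adj_r2`, which is a name introduced here; applying it to the one-factor model assumes that model was also fitted on \(n = 144\) months, which the source does not state.

```r
# Adjusted R^2 for a regression with k slope coefficients and an intercept.
adj_r2 <- function(r2, n_obs, k) 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)

adj_ff   <- adj_r2(0.92, n_obs = 144, k = 3)   # three-factor model
adj_capm <- adj_r2(0.75, n_obs = 144, k = 1)   # one-factor model (n assumed equal)

round(adj_ff, 3)     # 0.918
round(adj_capm, 3)   # 0.748
```

Even after the penalty for two extra factors, the three-factor model's adjusted \(R^2\) remains far above the one-factor value, so the improvement is not an artifact of model size.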

3 Question 3. Logistic Regression for Market Direction [25 points]

\[\text{logit }P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]

\(\beta_0 = -0.02,\ \beta_1 = 5.4,\ \beta_2 = -0.38\); inputs \(r_{t-1} = 0.010,\ \Delta VIX = 1.5\).

b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; dVIX <- 1.5

3.1 (j) Predicted probability and class

\[z = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX,\qquad P(\text{Up}) = \frac{1}{1+e^{-z}}\]

z <- b0 + b1*r_lag + b2*dVIX
p_up <- 1/(1 + exp(-z))
round(z, 4)

## [1] -0.536

round(p_up, 4)

## [1] 0.3691

\(z = -0.02 + 5.4(0.010) - 0.38(1.5) = -0.536\), so \(P(\text{Up}) = 1/(1+e^{0.536}) = 0.3691\).

Since \(0.3691 < 0.5\), the predicted class is “Down.”
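The calculation generalizes to any pair of inputs. The sketch below wraps it in a small helper, `predict_up` (a hypothetical name), using base R's `plogis()` for the logistic function and a 0.5 cutoff as in the answer above.

```r
# Predicted probability and class for given lagged return and VIX change.
# Default coefficients are the ones stated in the question.
predict_up <- function(r_lag, d_vix, b0 = -0.02, b1 = 5.4, b2 = -0.38,
                       cutoff = 0.5) {
  z <- b0 + b1 * r_lag + b2 * d_vix      # linear predictor (log-odds)
  p <- plogis(z)                         # same as 1 / (1 + exp(-z))
  list(z = z, p_up = p, class = ifelse(p > cutoff, "Up", "Down"))
}

pred <- predict_up(r_lag = 0.010, d_vix = 1.5)
round(pred$p_up, 4)   # 0.3691
pred$class            # "Down"
```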

3.2 (k) Sign interpretation

\(\beta_1 = 5.4 > 0\): a higher lagged return raises the probability of an up day → short-term momentum / positive return autocorrelation (an up day yesterday makes an up day today more likely).
\(\beta_2 = -0.38 < 0\): an increase in VIX lowers the probability of an up day → rising volatility/fear (“risk-off”) is associated with subsequent market declines (consistent with the leverage/volatility-feedback effect).
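The magnitudes are easier to read as odds ratios, since a logit coefficient is the change in log-odds per unit of the predictor. The sketch below evaluates them for a one-percentage-point higher lagged return and a one-point rise in VIX; the variable names are introduced here.

```r
b1 <- 5.4; b2 <- -0.38

or_return_1pct <- exp(b1 * 0.01)   # odds multiplier for a +0.01 lagged return
or_vix_1pt     <- exp(b2 * 1)      # odds multiplier for a +1 change in VIX

round(or_return_1pct, 4)   # 1.0555
round(or_vix_1pt, 4)       # 0.6839
```

A one-point VIX increase cuts the odds of an up day by roughly 32%, whereas a one-percentage-point higher lagged return raises them by only about 5.5%, so over typical ranges the VIX term dominates the prediction in part (j).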

3.3 (l) Confusion-matrix metrics

	Actual Up	Actual Down	Total
Pred Up	67 (TP)	44 (FP)	111
Pred Down	33 (FN)	56 (TN)	89
Total	100	100	200

TP <- 67; FP <- 44; FN <- 33; TN <- 56
accuracy    <- (TP + TN) / 200
sensitivity <- TP / (TP + FN)   # true positive rate for "Up"
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)
round(c(accuracy=accuracy, sensitivity=sensitivity,
        specificity=specificity, precision=precision), 4)

##    accuracy sensitivity specificity   precision 
##      0.6150      0.6700      0.5600      0.6036

Accuracy \(= (67+56)/200 = 0.6150\)
Sensitivity (TPR for Up) \(= 67/100 = 0.6700\)
Specificity \(= 56/100 = 0.5600\)
Precision (Up) \(= 67/111 = 0.6036\)
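Two further summaries follow from the same four counts and are sometimes reported alongside the metrics above. This sketch computes the F1 score for the "Up" class and the balanced accuracy; neither is required by the question.

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56

sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)

f1           <- 2 * precision * sensitivity / (precision + sensitivity)
balanced_acc <- (sensitivity + specificity) / 2

round(c(f1 = f1, balanced_acc = balanced_acc), 4)   # 0.6351 0.6150
```

Because the classes are perfectly balanced here, balanced accuracy coincides with plain accuracy.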

3.4 (m) Naive majority-class rule

The two classes are tied (100 Up, 100 Down), so always predicting one class gives accuracy \(= 100/200 = 0.50\).

naive_acc <- 100/200
round(naive_acc, 4)

## [1] 0.5

The model’s accuracy (0.6150) beats the naive rule (0.50).

Why accuracy alone is inadequate for a trading system: it counts directional hits equally and ignores the magnitude of returns and the asymmetric P&L of trades. A handful of large losses on wrong calls can outweigh many small correct calls, so a high hit-rate can still lose money. A more economically relevant criterion is the risk-adjusted return of the realised strategy P&L (e.g. the Sharpe ratio), or expected/cumulative profit net of transaction costs — metrics that weight each decision by the money it makes or loses.
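A toy illustration of the point about hit rate versus P&L: the per-trade figures below are entirely hypothetical and chosen only to show that a 70% hit rate can coexist with a losing strategy.

```r
# Hypothetical per-trade P&L in percent: seven small wins, three large losses.
pnl <- c(0.4, 0.3, 0.5, 0.2, 0.4, 0.3, 0.5, -1.5, -2.0, -1.2)

hit_rate  <- mean(pnl > 0)          # fraction of correct directional calls
total_pnl <- sum(pnl)               # cumulative P&L
sharpe    <- mean(pnl) / sd(pnl)    # per-trade Sharpe ratio (no risk-free adjustment)

hit_rate             # 0.7
round(total_pnl, 2)  # -2.1
sharpe < 0           # TRUE
```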

4 Question 4. Resampling and Regularization in a Backtest [25 points]

Sample over \(n = 48\) months: mean monthly return \(= 0.70\% = 0.0070\), sample SD \(= 5.50\% = 0.0550\).

mu <- 0.0070; sigma <- 0.0550; n <- 48

4.1 (n) Sharpe ratio (monthly and annualized)

\[SR_{\text{monthly}} = \frac{\bar r}{s},\qquad SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]

sr_m <- mu / sigma
sr_a <- sr_m * sqrt(12)
round(sr_m, 4)

## [1] 0.1273

round(sqrt(12), 4)

## [1] 3.4641

round(sr_a, 4)

## [1] 0.4409

Monthly \(SR = 0.0070/0.0550 = 0.1273\). Scaling factor \(= \sqrt{12} \approx 3.4641\) (returns scale with time, volatility with \(\sqrt{\text{time}}\), assuming i.i.d. returns). Annualized \(SR = 0.1273 \times \sqrt{12} = 0.4409\).

4.2 (o) Bootstrap standard error for the Sharpe ratio

Procedure:

1. Treat the 48 observed monthly returns as the empirical sample.
2. Draw a resample of size 48 with replacement from these returns.
3. Compute the Sharpe ratio \(\bar r / s\) on the resample.
4. Repeat steps 2–3 a large number of times \(B\) (e.g. 1,000–10,000), storing each bootstrap Sharpe.
5. The standard error is the standard deviation of the \(B\) bootstrap Sharpe ratios; percentiles (e.g. 2.5% and 97.5%) give a confidence interval.

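set.seed(1)
# The exam reports only summary statistics, so `returns` is a simulated
# placeholder here; replace it with the 48 observed monthly returns.
returns <- rnorm(n, mean = mu, sd = sigma)
B <- 5000                       # number of bootstrap resamples
boot_sr <- replicate(B, {
  x <- sample(returns, length(returns), replace = TRUE)  # i.i.d. resample
  mean(x) / sd(x)               # Sharpe ratio of the resample
})
se_sr <- sd(boot_sr)            # bootstrap standard error
quantile(boot_sr, c(.025,.975)) # 95% CI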

Why the ordinary i.i.d. bootstrap is inappropriate here: it assumes observations are independent, but monthly returns exhibit serial dependence (autocorrelation and volatility clustering). Resampling individual months destroys this time-series structure, so the resulting standard error is unreliable; with positively autocorrelated returns it typically understates the true standard error.

The fix is the block bootstrap (moving-block or stationary bootstrap), which resamples contiguous blocks of consecutive months so that the short-range dependence is preserved within each block.
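A moving-block version can be sketched in base R as follows. The return series is a simulated placeholder (the exam gives only the mean and standard deviation), and the block length of 6 months is an arbitrary illustrative choice, not a recommendation; `mbb_sharpe`, `block_len` and the other names are introduced here.

```r
set.seed(1)
n_obs <- 48
# Placeholder data: replace with the 48 observed monthly returns.
returns <- rnorm(n_obs, mean = 0.0070, sd = 0.0550)

block_len  <- 6                              # hypothetical block length (months)
n_blocks   <- ceiling(n_obs / block_len)     # blocks needed to cover n_obs months
starts_all <- 1:(n_obs - block_len + 1)      # admissible block start positions

# One moving-block resample: draw block starts with replacement, concatenate
# the blocks, trim to the original length, and compute the Sharpe ratio.
mbb_sharpe <- function(x) {
  starts <- sample(starts_all, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))[1:n_obs]
  xb <- x[idx]
  mean(xb) / sd(xb)
}

B <- 2000
boot_sr_mbb <- replicate(B, mbb_sharpe(returns))
se_sr_mbb   <- sd(boot_sr_mbb)               # block-bootstrap standard error
quantile(boot_sr_mbb, c(0.025, 0.975))       # 95% percentile interval
```

In practice the block length should be long enough to span the dependence in the data; packaged implementations such as `tsboot()` in the `boot` package offer both fixed-block and stationary variants.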

4.3 (p) Which \(\lambda\) to deploy

Minimum-CV-error: \(\lambda = 0.030\), 14 factors.
One-standard-error rule: \(\lambda = 0.065\), 7 factors.

Deploy the one-standard-error solution (\(\lambda = 0.065\), 7 factors). The 1-SE rule keeps the most parsimonious model whose CV error is within one standard error of the minimum, so we give up no statistically meaningful accuracy. In a noisy backtest with 60 candidate factors and a high risk of fitting spurious patterns, the simpler 7-factor model is more robust to overfitting, more stable, more interpretable, and more likely to generalize out-of-sample. The minimum-CV model tends to chase noise.
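The selection logic of the one-standard-error rule can be sketched on a cross-validation curve. Only the two \(\lambda\) values and their factor counts (0.030 with 14 factors, 0.065 with 7) come from the question; every other number in the table below is made up for illustration.

```r
# Hypothetical CV curve: lambda grid, mean CV error, and its standard error.
cv <- data.frame(
  lambda    = c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090),
  cv_error  = c(1.080, 1.030, 1.000, 1.010, 1.035, 1.120),
  cv_se     = c(0.050, 0.045, 0.040, 0.040, 0.042, 0.050),
  n_factors = c(25, 19, 14, 10, 7, 3)
)

i_min      <- which.min(cv$cv_error)                   # minimum-CV-error row
threshold  <- cv$cv_error[i_min] + cv$cv_se[i_min]     # min error + one SE
eligible   <- cv$cv_error <= threshold                 # models "as good as" the best
lambda_min <- cv$lambda[i_min]
lambda_1se <- max(cv$lambda[eligible])                 # most regularized eligible model

c(lambda_min = lambda_min, lambda_1se = lambda_1se)    # 0.030 0.065
```

If the model is fitted with `glmnet`, `cv.glmnet()` reports the same two candidates as `lambda.min` and `lambda.1se`.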

4.4 (q) Walk-forward (time-respecting) evaluation

Scheme:

1. Order all data chronologically.
2. Train on an initial in-sample window (months \(1\ldots T\)); choose \(\lambda\) by CV using only that window.
3. Predict / trade on the next out-of-sample block (e.g. the following \(k\) months), which the model has never seen.
4. Roll forward, using either an expanding window (add the new block, refit) or a rolling window (fixed length, drop the oldest months), and re-estimate the parameters and \(\lambda\) at each step using only past data.
5. Stitch together the out-of-sample blocks and evaluate strategy performance (Sharpe, cumulative return) on that concatenated OOS series.
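The scheme above can be sketched as an index generator that yields the train and test months for each step. The function name `walk_forward_splits`, the 24-month initial window and the 6-month test block are illustrative assumptions, not values from the question.

```r
# Build walk-forward train/test index sets over n_obs chronologically ordered months.
walk_forward_splits <- function(n_obs, initial, horizon, expanding = TRUE) {
  splits <- list()
  train_end <- initial
  while (train_end + horizon <= n_obs) {
    # Expanding window keeps all history; rolling window keeps the last `initial` months.
    train_start <- if (expanding) 1 else train_end - initial + 1
    splits[[length(splits) + 1]] <- list(
      train = train_start:train_end,
      test  = (train_end + 1):(train_end + horizon)
    )
    train_end <- train_end + horizon   # roll forward by one test block
  }
  splits
}

splits <- walk_forward_splits(n_obs = 48, initial = 24, horizon = 6)
length(splits)        # 4
range(splits[[1]]$test)   # 25 30
range(splits[[4]]$test)   # 43 48
```

Within each step, the model and \(\lambda\) would be fitted on `train` only and scored on `test`; every training index precedes every test index by construction.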

Why random k-fold CV is unsafe here: it shuffles observations into folds, so a training fold can contain months that occur after the test months — look-ahead bias / data leakage. Future information leaks into the fitted model, breaking the temporal ordering and producing optimistically inflated performance. A trading model can only use the past to predict the future, so the evaluation must respect the arrow of time.

Final Examination — Machine Learning Applications in Finance

蔣舜涵 (Angelina)

2026-06-08