This project walks through my solutions to the four problems in the Machine Learning Applications in Finance final: the single-factor market model, the Fama–French three-factor model, a logistic model for market direction, and resampling and regularisation methods for backtesting. My aim throughout is to show the reasoning behind each result as well as the calculation, testing whether an estimate is statistically and economically meaningful rather than a product of chance.


Question 1 — Single-Factor (Market) Model

A fund’s monthly excess returns are regressed on the market excess return over \(n = 96\) months:

\[R_i - R_f \;=\; \alpha + \beta\,(R_m - R_f) + \varepsilon.\]

Table 1. Regression estimates (R-squared = 0.50, n = 96).

| Term | Estimate | Std. Error |
|---|---|---|
| Intercept (α) | 0.0017 | 0.0020 |
| Market premium (β) | 0.98 | 0.17 |

Reported \(R^2 = 0.50\) and average market risk premium \(E[R_m - R_f] = 0.70\% = 0.0070\).

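alpha <- 0.0017; se_alpha <- 0.0020   # intercept estimate and its standard error
beta  <- 0.98;   se_beta  <- 0.17     # market beta and its standard error
mkt_prem <- 0.0070; t_crit <- 1.98    # mean market premium; approx. 5% two-sided critical t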

(a) Significance of \(\beta\)

\[t_\beta = \frac{\hat\beta - 0}{\operatorname{SE}(\hat\beta)} = \frac{0.98}{0.17}.\]

t_b0 <- beta / se_beta

Result. \(t_\beta = 5.7647\). Since \(|t_\beta| = 5.7647 > 1.98\), we reject \(H_0:\beta = 0\) at the 5% level.

Economic interpretation. \(\hat\beta \approx 0.98\) implies the fund moves almost one-for-one with the market: a 1% market excess return corresponds to roughly a 0.98% fund excess return. The fund therefore carries close to average market (systematic) exposure.

(b) Test of \(H_0:\beta = 1\)

\[t = \frac{\hat\beta - 1}{\operatorname{SE}(\hat\beta)} = \frac{0.98 - 1}{0.17}.\]

t_b1 <- (beta - 1) / se_beta

Result. \(t = -0.1176\), and \(|t| = 0.1176 < 1.98\), so we fail to reject \(H_0:\beta = 1\).

Interpretation. The beta is statistically indistinguishable from one; the fund’s systematic risk is not significantly different from the market’s. We cannot classify it as aggressive (\(\beta>1\)) or defensive (\(\beta<1\)).
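Parts (a) and (b) can be read off a single 95% confidence interval for \(\beta\). The sketch below assumes \(n - 2 = 94\) residual degrees of freedom (intercept plus one slope) and uses the exact quantile from `qt` instead of the rounded 1.98; the names `df_resid`, `t_exact` and `ci_beta` are introduced here for illustration only.

```r
beta <- 0.98; se_beta <- 0.17
n_obs    <- 96
df_resid <- n_obs - 2                  # assumption: two estimated parameters
t_exact  <- qt(0.975, df_resid)        # two-sided 5% critical value
ci_beta  <- beta + c(-1, 1) * t_exact * se_beta
ci_beta                                # lower and upper bound of the interval
```

The interval excludes 0 (part a) but includes 1 (part b), which is the same pair of conclusions the two t-tests deliver.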

(c) Jensen’s alpha

\[t_\alpha = \frac{\hat\alpha}{\operatorname{SE}(\hat\alpha)} = \frac{0.0017}{0.0020}.\]

t_a <- alpha / se_alpha

Result. \(t_\alpha = 0.8500\), and \(|t_\alpha| = 0.8500 < 1.98\); we fail to reject \(H_0:\alpha = 0\).

Assessment of the marketing claim. Although \(\hat\alpha\) is positive (0.17% per month), it is not statistically significant. The evidence therefore does not support advertising “positive risk-adjusted performance” — the estimated alpha cannot be distinguished from zero and may reflect chance.
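To put a number on "may reflect chance", the two-sided p-value for \(\hat\alpha\) can be computed directly. This is a minimal sketch that again assumes 94 residual degrees of freedom; `p_a` is a name introduced here.

```r
alpha <- 0.0017; se_alpha <- 0.0020
t_a <- alpha / se_alpha
p_a <- 2 * pt(-abs(t_a), df = 96 - 2)   # two-sided p-value under H0: alpha = 0
p_a > 0.05                              # TRUE means alpha is not significant at 5%
```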

(d) Interpretation of \(R^2\)

With \(R^2 = 0.50\), the market factor explains 50% of the variance of the fund’s excess returns (systematic risk); the remaining 50% is fund-specific (idiosyncratic) risk that is, in principle, diversifiable.
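For a single-factor regression, the split behind this statement can be written out explicitly. The symbols \(\sigma_i^2\), \(\sigma_m^2\) and \(\sigma_\varepsilon^2\) (variances of the fund excess return, the market excess return, and the residual) are introduced here and are not part of the exam text:

\[\sigma_i^2 = \beta^2 \sigma_m^2 + \sigma_\varepsilon^2, \qquad R^2 = \frac{\beta^2 \sigma_m^2}{\sigma_i^2} = 1 - \frac{\sigma_\varepsilon^2}{\sigma_i^2}.\]

With \(R^2 = 0.50\) and a positive \(\hat\beta\), the implied correlation between fund and market excess returns is \(\sqrt{0.50} \approx 0.71\).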

(e) CAPM-implied expected excess return

Under CAPM the intercept is zero, so

\[E[R_i - R_f] = \beta \cdot E[R_m - R_f] = 0.98 \times 0.0070.\]

capm_excess <- beta * mkt_prem

Result. \(E[R_i - R_f] = 0.00686\), i.e. about 0.69% per month.


Question 2 — Fama–French Three-Factor Model

For a managed equity fund estimated on \(n = 144\) monthly observations:

\[R_i - R_f = \alpha + b\,\text{MKT} + s\,\text{SMB} + h\,\text{HML} + \varepsilon.\]

ff <- data.frame(
  Term     = c("Intercept (alpha)", "MKT (b)", "SMB (s)", "HML (h)"),
  Estimate = c(0.0029, 0.97, 0.75, -0.13),
  SE       = c(0.0018, 0.08, 0.11, 0.13))

Reported \(R^2 = 0.92\), adjusted \(R^2 = 0.918\).

(f) Coefficient significance

\[t_j = \frac{\hat\theta_j}{\operatorname{SE}(\hat\theta_j)}.\]

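r4 <- function(x) formatC(x, format = "f", digits = 4)  # assumed helper: format to 4 decimals
ff$t       <- ff$Estimate / ff$SE                       # t-statistic for H0: coefficient = 0
ff$Signif. <- ifelse(abs(ff$t) > 1.98, "Yes", "No")     # two-sided 5% rule
disp <- data.frame(Term = ff$Term, Estimate = r4(ff$Estimate),
                   SE = r4(ff$SE), t = r4(ff$t), `Signif.` = ff$Signif.,
                   check.names = FALSE)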

Table 2. t-statistics and 5% significance.

| Term | Estimate | SE | t | Signif. |
|---|---|---|---|---|
| Intercept (alpha) | 0.0029 | 0.0018 | 1.6111 | No |
| MKT (b) | 0.9700 | 0.0800 | 12.1250 | Yes |
| SMB (s) | 0.7500 | 0.1100 | 6.8182 | Yes |
| HML (h) | -0.1300 | 0.1300 | -1.0000 | No |

Significant at 5%: MKT (\(t = 12.1250\)) and SMB (\(t = 6.8182\)). Not significant: \(\alpha\) (\(t = 1.6111\)) and HML (\(t = -1.0000\)).
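The same verdicts can be cross-checked with two-sided p-values in place of the 1.98 cut-off. The sketch assumes \(144 - 4 = 140\) residual degrees of freedom (four estimated coefficients); the columns `p` and `sig_5pct` are names introduced here.

```r
ff <- data.frame(
  Term     = c("Intercept (alpha)", "MKT (b)", "SMB (s)", "HML (h)"),
  Estimate = c(0.0029, 0.97, 0.75, -0.13),
  SE       = c(0.0018, 0.08, 0.11, 0.13))
df_resid    <- 144 - 4                           # assumption: n minus four coefficients
ff$t        <- ff$Estimate / ff$SE
ff$p        <- 2 * pt(-abs(ff$t), df = df_resid) # two-sided p-values
ff$sig_5pct <- ff$p < 0.05
print(ff)
```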

(g) Investment style

| Loading | Estimate | Significant? | Implication |
|---|---|---|---|
| SMB (\(s\)) | \(+0.75\) | Yes | Pronounced small-cap tilt |
| HML (\(h\)) | \(-0.13\) | No | Slight growth lean, but statistically negligible |

The fund is best described as a small-capitalisation fund. The negative HML loading hints at a growth orientation, but because it is statistically insignificant there is no reliable value/growth tilt.

(h) The intercept

\(\hat\alpha = 0.0029\) (0.29% per month) with \(t = 1.6111 < 1.98\). The risk-adjusted return beyond the three factor exposures is positive in point estimate but not statistically significant. We therefore cannot conclude that the manager adds value once market, size, and value exposures are controlled for.

(i) Why \(R^2\) rises and why adjusted \(R^2\) matters

Introducing SMB and HML raised explained variance from \(0.75\) to \(0.92\), showing the fund carries genuine size and value exposure that the single-factor CAPM attributed to idiosyncratic noise.

Plain \(R^2\) never decreases when predictors are added — even irrelevant ones — so it cannot fairly rank models of differing dimension. Adjusted \(R^2\) penalises the parameter count,

\[\bar R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1},\]

and here equals \(0.918\) (with \(n = 144\) and \(k = 3\)), barely below \(R^2 = 0.92\). The penalty for the two extra factors is negligible next to the gain in fit, so the improvement is substantive rather than mechanical.
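The reported adjusted \(R^2\) can be reproduced from the formula above with \(n = 144\) and \(k = 3\). The helper name `adj_r2` is introduced here for illustration.

```r
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)
adj_ff <- adj_r2(r2 = 0.92, n = 144, k = 3)
round(adj_ff, 3)   # 0.918
```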


Question 3 — Logistic Regression for Market Direction

\[\operatorname{logit} P(\text{Up}) = \beta_0 + \beta_1\,r_{t-1} + \beta_2\,\Delta \mathrm{VIX}_{t-1},\]

with \(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\), and today’s inputs \(r_{t-1} = 0.010\), \(\Delta\mathrm{VIX}_{t-1} = 1.5\).

b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; dVIX <- 1.5

(j) Predicted probability and class

\[\eta = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta\mathrm{VIX}_{t-1}, \qquad P(\text{Up}) = \frac{1}{1 + e^{-\eta}}.\]

eta  <- b0 + b1*r_lag + b2*dVIX
prob <- 1 / (1 + exp(-eta))
pred <- ifelse(prob >= 0.5, "Up", "Down")

Result. \(\eta = -0.02 + 5.4(0.010) - 0.38(1.5) = -0.5360\), hence \(P(\text{Up}) = 0.3691\). Because \(0.3691 < 0.5\), the predicted class is “Down”.
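As a cross-check on the arithmetic, base R's `plogis` evaluates the same inverse-logit function, and the implied odds should equal \(e^{\eta}\). The name `odds` is introduced here.

```r
b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; dVIX <- 1.5
eta  <- b0 + b1 * r_lag + b2 * dVIX
prob <- plogis(eta)            # identical to 1 / (1 + exp(-eta))
odds <- prob / (1 - prob)      # odds of "Up"; equals exp(eta)
c(eta = eta, prob = prob, odds = odds)
```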

(k) Sign interpretation

  • \(\beta_1 = +5.4\): a higher lagged return raises the odds of an up day, consistent with short-term momentum (positive return persistence).
  • \(\beta_2 = -0.38\): a rise in the VIX on the previous day (increasing fear and volatility) lowers the odds of an up day, so volatility spikes tend to be followed by market declines.
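The signs become easier to compare as odds ratios. The sketch below assumes returns are in decimal units (so 0.01 is one percentage point, consistent with \(r_{t-1} = 0.010\)) and that \(\Delta\mathrm{VIX}\) is measured in index points; `or_ret` and `or_vix` are names introduced here.

```r
b1 <- 5.4; b2 <- -0.38
or_ret <- exp(b1 * 0.01)   # odds multiplier for a 1 percentage-point higher lagged return
or_vix <- exp(b2 * 1)      # odds multiplier for a one-point rise in the VIX
round(c(or_ret = or_ret, or_vix = or_vix), 4)
```

Under these unit assumptions a one-point VIX increase cuts the odds of an up day by roughly a third, while a one-percentage-point higher lagged return raises them by under 6%, so at typical daily magnitudes the VIX term dominates the prediction.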

(l) Classification metrics

Table 3. Hold-out confusion matrix (n = 200).

| | Actual Up | Actual Down | Total |
|---|---|---|---|
| Predicted Up | 67 | 44 | 111 |
| Predicted Down | 33 | 56 | 89 |
| Total | 100 | 100 | 200 |

TP <- 67; FP <- 44; FN <- 33; TN <- 56; N <- 200
accuracy    <- (TP + TN) / N
sensitivity <- TP / (TP + FN)   # true-positive rate for "Up"
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)

\[\text{Accuracy} = \tfrac{TP+TN}{N},\;\; \text{Sensitivity} = \tfrac{TP}{TP+FN},\;\; \text{Specificity} = \tfrac{TN}{TN+FP},\;\; \text{Precision} = \tfrac{TP}{TP+FP}.\]

Table 4. Performance metrics.

| Metric | Value |
|---|---|
| Accuracy | 0.6150 |
| Sensitivity (TPR, Up) | 0.6700 |
| Specificity | 0.5600 |
| Precision (Up) | 0.6036 |

(m) Comparison with the naive rule

naive_acc <- 100 / N   # balanced classes: majority rule scores 100/200

The test set is balanced (100 up, 100 down), so a majority-class predictor attains accuracy \(= 0.5000\). The model’s accuracy of \(0.6150\) exceeds this by 11.5 percentage points, so it beats the benchmark.
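Whether an 11.5-point edge on 200 observations could arise by chance can be checked with a one-sided binomial test against a 50% hit rate. This is a sketch that treats the 200 hold-out predictions as independent trials, which is an assumption; `bt` and `z` are names introduced here.

```r
correct <- 67 + 56; N <- 200
bt <- binom.test(correct, N, p = 0.5, alternative = "greater")  # exact test
z  <- (correct / N - 0.5) / sqrt(0.5 * 0.5 / N)                 # normal approximation
c(p_value = bt$p.value, z = z)
```

A z-statistic above 3 indicates the hit rate is unlikely to be a coin flip, although serial dependence in daily outcomes would weaken that conclusion.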

Why accuracy is inadequate for a trading system. Accuracy weights every error equally and ignores (i) the asymmetric economic cost of false signals versus missed opportunities, (ii) the magnitude of returns — many correct calls on tiny moves cannot offset a few wrong calls on large ones — and (iii) transaction costs. A more economically relevant criterion is the Sharpe ratio (or net P&L) of the strategy implied by the signal, i.e. risk-adjusted profitability rather than hit rate.
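A rough sketch of such a criterion follows. All inputs are simulated placeholders, not exam data: `ret` stands in for daily market returns, `prob_up` for the model's predicted probabilities, and `cost` for a hypothetical per-unit turnover cost. The point is only the mechanics of turning a signal into a net-of-cost Sharpe ratio.

```r
set.seed(1)
n_days   <- 250
ret      <- rnorm(n_days, mean = 0.0003, sd = 0.01)  # hypothetical daily returns
prob_up  <- runif(n_days)                             # placeholder for model probabilities
position <- ifelse(prob_up >= 0.5, 1, -1)             # long on "Up", short on "Down"
cost     <- 0.0005                                    # hypothetical cost per unit of turnover
turnover <- c(abs(position[1]), abs(diff(position)))  # size of each position change

strat_gross <- position * ret
strat_net   <- strat_gross - cost * turnover

sharpe <- function(x, periods = 252) mean(x) / sd(x) * sqrt(periods)
c(gross = sharpe(strat_gross), net = sharpe(strat_net))
```

Because the placeholder probabilities are random, the gross Sharpe ratio here carries no information; the gap between gross and net shows how turnover costs alone can erode a signal.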


Question 4 — Resampling and Regularisation in a Backtest

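A candidate strategy earns a mean monthly return \(\bar r = 0.70\%\) with sample standard deviation \(s = 5.50\%\) over \(n = 48\) months. The mean is treated as an excess return throughout, since the Sharpe ratio below divides it by \(s\) without subtracting a risk-free rate.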

mean_ret <- 0.0070; sd_ret <- 0.0550; n <- 48

(n) Sharpe ratio, monthly and annualised

\[\mathrm{SR}_{\text{m}} = \frac{\bar r}{s}, \qquad \mathrm{SR}_{\text{ann}} = \mathrm{SR}_{\text{m}} \times \sqrt{12}.\]

SR_m <- mean_ret / sd_ret
SR_a <- SR_m * sqrt(12)

Result. \(\mathrm{SR}_{\text{m}} = 0.0070 / 0.0550 = 0.1273\) and \(\mathrm{SR}_{\text{ann}} = 0.1273 \times \sqrt{12} = 0.4409\).

Scaling factor. The factor is \(\sqrt{12}\). Under i.i.d. returns, the mean over \(T\) periods scales with \(T\) while the standard deviation scales with \(\sqrt{T}\), so their ratio scales with \(T/\sqrt{T} = \sqrt{T}\); with \(T = 12\) months per year this gives \(\sqrt{12}\).

(o) Bootstrap standard error for the Sharpe ratio

Procedure.

  1. Treat the 48 observed monthly returns as the sample.
  2. Draw a resample of size 48 with replacement.
  3. Compute \(\widehat{\mathrm{SR}}^{*} = \bar r^{*} / s^{*}\) on the resample.
  4. Repeat \(B\) times (e.g. \(B = 10{,}000\)) to build the bootstrap distribution.
  5. The standard error is the standard deviation of the \(B\) replicates; percentile or BCa intervals give a confidence interval without assuming normality.
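The steps above can be sketched in base R. The 48 actual monthly returns are not given in the problem, so `r` below is simulated placeholder data with the stated mean and standard deviation; the resulting numbers illustrate the procedure and are not results for the strategy.

```r
set.seed(42)
r <- rnorm(48, mean = 0.0070, sd = 0.0550)     # placeholder data, NOT the actual returns
sharpe <- function(x) mean(x) / sd(x)

B <- 10000
sr_star <- replicate(B, sharpe(sample(r, size = length(r), replace = TRUE)))

se_boot <- sd(sr_star)                          # bootstrap standard error
ci_pct  <- quantile(sr_star, c(0.025, 0.975))   # percentile interval
c(SR = sharpe(r), SE = se_boot, ci_pct)
```

With only 48 months, the standard error is typically of the same order as the monthly Sharpe ratio itself, which is why the interval matters more than the point estimate.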

Why the i.i.d. bootstrap fails here. Monthly returns are generally not independent: they can exhibit autocorrelation and volatility clustering. Resampling individual months independently destroys this dependence, so the bootstrap distribution describes an i.i.d. world and typically understates the true standard error when returns are positively autocorrelated.

Remedy. A block bootstrap — the moving-block or the stationary (Politis–Romano) bootstrap — resamples contiguous blocks of returns, preserving the short-range dependence structure.
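A hand-rolled moving-block version might look like the sketch below. The data are again simulated placeholders, the block length of 6 months is an arbitrary illustrative choice, and `mbb_resample` is a hypothetical helper written for this example, not a library function.

```r
set.seed(42)
r <- rnorm(48, mean = 0.0070, sd = 0.0550)    # placeholder data, NOT the actual returns
sharpe <- function(x) mean(x) / sd(x)

# Draw overlapping blocks of consecutive months and paste them together
mbb_resample <- function(x, block_len) {
  n        <- length(x)
  n_blocks <- ceiling(n / block_len)
  starts   <- sample.int(n - block_len + 1, n_blocks, replace = TRUE)
  idx      <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))
  x[idx[seq_len(n)]]                          # trim to the original length
}

B <- 5000
sr_block <- replicate(B, sharpe(mbb_resample(r, block_len = 6)))
se_block <- sd(sr_block)                      # block-bootstrap standard error
se_block
```

Since the placeholder series is i.i.d. by construction, this standard error should land close to the i.i.d. bootstrap value; a difference would be expected only on returns with real serial dependence.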

(p) Choice of the lasso penalty

Table 5. Cross-validated lasso solutions.

| Rule | Lambda | Factors kept |
|---|---|---|
| Minimum CV error | 0.030 | 14 |
| One-standard-error | 0.065 | 7 |

Deploy \(\lambda_{1\text{SE}} = 0.065\) (7 factors). The one-standard-error rule selects the most parsimonious model whose cross-validation error lies within one standard error of the minimum. With 60 candidate factors, the \(\lambda_{\min}\) solution is prone to overfitting noise and selecting spurious factors. The seven-factor model is more robust, more interpretable, less exposed to data-snooping, and far more likely to generalise out-of-sample — the property that matters for a deployable strategy.
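The mechanics of the one-standard-error rule can be shown on a small CV curve. The error values and standard errors below are invented for illustration and chosen so that the rule reproduces the two penalties in Table 5; only the \(\lambda\) values 0.030 and 0.065 come from the problem.

```r
# Hypothetical CV curve: penalty grid, mean CV error, and its standard error
lambda <- c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090, 0.120)
cv_err <- c(1.080, 1.030, 1.000, 1.010, 1.030, 1.090, 1.180)
cv_se  <- c(0.045, 0.042, 0.040, 0.040, 0.041, 0.044, 0.048)

i_min      <- which.min(cv_err)
lambda_min <- lambda[i_min]
threshold  <- cv_err[i_min] + cv_se[i_min]        # minimum error plus one SE
lambda_1se <- max(lambda[cv_err <= threshold])    # largest penalty inside the band
c(lambda_min = lambda_min, lambda_1se = lambda_1se)
```

In practice, `cv.glmnet` from the glmnet package reports these two values as `lambda.min` and `lambda.1se`.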

(q) Walk-forward evaluation

Scheme.

  1. Keep the data in chronological order.
  2. Estimate on an initial window (e.g. months 1–24).
  3. Predict / trade the next out-of-sample period (e.g. month 25).
  4. Roll forward, re-estimating on either an expanding window (all data up to \(t\)) or a fixed rolling window (the most recent \(W\) months).
  5. Repeat to the end of the sample, then evaluate performance (Sharpe, net return) on the concatenated out-of-sample predictions only.
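An expanding-window version of this scheme can be sketched as follows. The data are simulated: `x` is a hypothetical predictor assumed to be known before the return `y` of the same row is realised, and the linear model and sign-based trading rule are illustrative choices rather than the strategy from the problem.

```r
set.seed(7)
n_months <- 48
x <- rnorm(n_months)                                  # hypothetical predictor
y <- 0.002 + 0.01 * x + rnorm(n_months, sd = 0.05)    # hypothetical monthly returns
dat <- data.frame(y = y, x = x)

init     <- 24                                        # initial estimation window
oos_pred <- rep(NA_real_, n_months)
for (t in (init + 1):n_months) {
  train <- dat[1:(t - 1), ]                           # expanding window: past data only
  fit   <- lm(y ~ x, data = train)
  oos_pred[t] <- predict(fit, newdata = dat[t, , drop = FALSE])
}

oos_idx    <- (init + 1):n_months
position   <- sign(oos_pred[oos_idx])                 # long if forecast > 0, else short
oos_ret    <- position * dat$y[oos_idx]
oos_sharpe <- mean(oos_ret) / sd(oos_ret) * sqrt(12)  # annualised, out-of-sample only
oos_sharpe
```

For a fixed rolling window of `W` months, the training slice would become `dat[(t - W):(t - 1), ]`.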

Why random \(k\)-fold CV is unsafe. Random folds shuffle observations, so a training fold may contain data occurring after its test fold. This is look-ahead bias / data leakage: the model exploits future information to “predict” the past, which is impossible in live trading and inflates measured performance. Time-series structure requires that only past data inform predictions of the future, exactly what walk-forward validation enforces.
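The leakage can be made concrete by checking month indices. The sketch assigns 48 months to five random folds and asks, for each fold, whether any training month falls after the earliest test month; the same check is then applied to a walk-forward split starting at month 25. Names such as `leak_kfold` are introduced here.

```r
set.seed(3)
n_months <- 48
folds <- sample(rep(1:5, length.out = n_months))   # random 5-fold assignment

# Random k-fold: does any training month come after the earliest test month?
leak_kfold <- sapply(1:5, function(k) {
  any(which(folds != k) > min(which(folds == k)))
})

# Walk-forward: training months 1..(t-1) always precede test month t
leak_wf <- sapply(25:n_months, function(t) any(seq_len(t - 1) >= t))

c(kfold_folds_with_leak = sum(leak_kfold), walk_forward_leaks = sum(leak_wf))
```

The walk-forward count is zero by construction, whereas a random split leaks unless a fold happens to consist of exactly the final months of the sample.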