Question 1: Single-Factor (Market) Model [25 points]

Model: \(R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\)

Regression output (\(n = 96\) months):

Term	Estimate	Std. Error
Intercept, \(\hat{\alpha}\)	0.0017	0.0020
Market beta, \(\hat{\beta}\)	0.98	0.17

\(R^2 = 0.50\), \(E[R_m - R_f] = 0.70\%\) per month, critical \(|t|_{0.05} \approx 1.98\).

alpha_hat <- 0.0017;  se_alpha <- 0.0020
beta_hat  <- 0.98;    se_beta  <- 0.17
R2_capm   <- 0.50
E_mkt     <- 0.0070
t_crit    <- 1.98

Part (a): t-test for \(H_0: \beta = 0\)

\[t_{\hat{\beta}} = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{0.98}{0.17}\]

t_beta <- beta_hat / se_beta
cat("t-statistic for H0: beta = 0 :", round(t_beta, 4))

t-statistic for H0: beta = 0 : 5.765

cat("\nCritical value (two-tailed, 5%):", t_crit)


Critical value (two-tailed, 5%): 1.98

cat("\nReject H0?", abs(t_beta) > t_crit)


Reject H0? TRUE

Result: \(t_{\hat{\beta}} = 5.7647\). Since \(|t| = 5.7647 > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level.

Economic interpretation: \(\hat{\beta} = 0.98\) means the fund moves nearly one-for-one with the market. A 1 percentage point rise in the market excess return is associated with approximately a 0.98 percentage point rise in the fund’s excess return. The fund’s systematic risk is very close to, but marginally below, that of the market portfolio.

Part (b): t-test for \(H_0: \beta = 1\)

\[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} = \frac{0.98 - 1}{0.17}\]

t_beta1 <- (beta_hat - 1) / se_beta
cat("t-statistic for H0: beta = 1 :", round(t_beta1, 4))

t-statistic for H0: beta = 1 : -0.1176

cat("\nReject H0?", abs(t_beta1) > t_crit)


Reject H0? FALSE

Result: \(t = -0.1176\). Since \(|t| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% level.

Interpretation: There is no statistically significant evidence that the fund’s systematic risk differs from the market’s. The deviation of \(\hat{\beta} = 0.98\) from unity is well within sampling error. Failing to reject is not proof that \(\beta = 1\), however: with \(SE(\hat{\beta}) = 0.17\) the estimate is imprecise, so the data are consistent with market-equivalent systematic risk but cannot rule out a moderate leverage reduction or amplification relative to the benchmark.
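A 95% confidence interval for \(\beta\) makes the precision of the estimate explicit. The following is a minimal sketch built only from the reported estimate, standard error and approximate critical value; ci_beta is a name introduced here for illustration.

```r
# 95% confidence interval for beta from the reported estimate and SE
beta_hat <- 0.98
se_beta  <- 0.17
t_crit   <- 1.98   # approximate two-tailed 5% critical value used above

ci_beta <- beta_hat + c(-1, 1) * t_crit * se_beta
cat("95% CI for beta:", round(ci_beta[1], 4), "to", round(ci_beta[2], 4), "\n")

# Null values inside the interval are not rejected at the 5% level
cat("Interval contains 1?", ci_beta[1] <= 1 && 1 <= ci_beta[2], "\n")
cat("Interval contains 0?", ci_beta[1] <= 0 && 0 <= ci_beta[2], "\n")
```

The interval runs from roughly 0.64 to 1.32, so it contains 1 but not 0, which matches the decisions in Parts (a) and (b).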

Part (c): Jensen’s Alpha — \(H_0: \alpha = 0\)

\[t_{\hat{\alpha}} = \frac{\hat{\alpha} - 0}{SE(\hat{\alpha})} = \frac{0.0017}{0.0020}\]

t_alpha <- alpha_hat / se_alpha
cat("t-statistic for H0: alpha = 0 :", round(t_alpha, 4))

t-statistic for H0: alpha = 0 : 0.85

cat("\nReject H0?", abs(t_alpha) > t_crit)


Reject H0? FALSE

Result: \(t_{\hat{\alpha}} = 0.85\). Since \(|t| = 0.85 < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% level.

Conclusion: The marketing claim of “positive risk-adjusted performance” is not statistically justified. Although the point estimate \(\hat{\alpha} = 0.0017\) (~0.17% per month) is positive, it is indistinguishable from zero given the sampling variability. The data do not provide sufficient evidence of genuine managerial skill beyond market exposure.
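The same decision can be expressed as a two-sided p-value. The sketch below assumes \(n - 2 = 94\) residual degrees of freedom (an intercept and one slope estimated from 96 months); that value is an assumption, since the regression output reports only the approximate critical value.

```r
alpha_hat <- 0.0017
se_alpha  <- 0.0020
n_obs     <- 96
df_resid  <- n_obs - 2          # assumed: intercept + one slope estimated

t_alpha <- alpha_hat / se_alpha
p_alpha <- 2 * pt(-abs(t_alpha), df = df_resid)   # two-sided p-value

cat("t =", round(t_alpha, 4), "| two-sided p-value =", round(p_alpha, 4), "\n")
cat("Reject H0 at 5%?", p_alpha < 0.05, "\n")
```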

Part (d): Interpretation of \(R^2 = 0.50\)

\[R^2 = \frac{\text{Systematic variance}}{\text{Total variance}}\]

cat("Systematic (market-explained) variance:", R2_capm * 100, "%")

Systematic (market-explained) variance: 50 %

cat("\nIdiosyncratic (diversifiable) variance:", (1 - R2_capm) * 100, "%")


Idiosyncratic (diversifiable) variance: 50 %

Interpretation: An estimated 50% of the fund’s monthly excess return variance is systematic, that is, explained by co-movement with the market portfolio. The remaining 50% is idiosyncratic (diversifiable) risk arising from fund-specific positions. This relatively low \(R^2\) indicates the fund carries substantial unsystematic risk that a well-diversified investor could largely eliminate by combining it with other assets.
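In a regression with a single regressor, \(R^2\) equals the squared correlation between the fund’s and the market’s excess returns, and \(\beta = \rho \, \sigma_i / \sigma_m\). The reported figures therefore imply two further quantities. A minimal sketch, with rho and vol_ratio as names introduced here:

```r
beta_hat <- 0.98
R2_capm  <- 0.50

# Single-regressor OLS: R^2 = rho^2 and beta = rho * sd(fund) / sd(market)
rho       <- sign(beta_hat) * sqrt(R2_capm)
vol_ratio <- beta_hat / rho      # implied sd(fund) / sd(market)

cat("Implied correlation with market :", round(rho, 4), "\n")
cat("Implied fund/market volatility  :", round(vol_ratio, 4), "\n")
```

A beta near 1 combined with a correlation of about 0.71 implies the fund is roughly 1.39 times as volatile as the market, with the additional volatility being idiosyncratic.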

Part (e): CAPM-Implied Expected Return

\[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f] = 0.98 \times 0.70\%\]

E_fund <- beta_hat * E_mkt
cat("CAPM-implied monthly excess return:", round(E_fund * 100, 4), "%")

CAPM-implied monthly excess return: 0.686 %

Result: The CAPM predicts the fund should earn \(0.98 \times 0.70\% = 0.686\%\) per month above the risk-free rate, as compensation for its market risk exposure.

Question 2: Fama–French Three-Factor Model [25 points]

Model: \(R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\)

Regression output (\(n = 144\) months):

Term	Estimate	Std. Error
Intercept, \(\hat{\alpha}\)	0.0029	0.0018
MKT, \(\hat{b}\)	0.97	0.08
SMB, \(\hat{s}\)	0.75	0.11
HML, \(\hat{h}\)	−0.13	0.13

\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t|_{0.05} \approx 1.98\).

coefs <- c(alpha = 0.0029, b_MKT = 0.97, s_SMB = 0.75, h_HML = -0.13)
ses   <- c(alpha = 0.0018, b_MKT = 0.08, s_SMB = 0.11, h_HML =  0.13)
R2_ff3    <- 0.92
adjR2_ff3 <- 0.918
n_q2      <- 144

Part (f): t-statistics for All Coefficients

\[t_k = \frac{\hat{\theta}_k}{SE(\hat{\theta}_k)}\]

t_stats     <- coefs / ses
significant <- abs(t_stats) > 1.98

results_ff3 <- data.frame(
  Coefficient      = c("alpha (Intercept)", "b (MKT)", "s (SMB)", "h (HML)"),
  Estimate         = coefs,
  Std_Error        = ses,
  t_Statistic      = round(t_stats, 4),
  Significant_5pct = ifelse(significant, "Yes", "No")
)
rownames(results_ff3) <- NULL
knitr::kable(results_ff3,
             col.names = c("Coefficient", "Estimate", "Std. Error",
                           "t-Statistic", "Significant (5%)?"))

Coefficient	Estimate	Std. Error	t-Statistic	Significant (5%)?
alpha (Intercept)	0.0029	0.0018	1.611	No
b (MKT)	0.9700	0.0800	12.125	Yes
s (SMB)	0.7500	0.1100	6.818	Yes
h (HML)	-0.1300	0.1300	-1.000	No

Summary: MKT (\(t = 12.125\)) and SMB (\(t = 6.8182\)) are highly significant. The intercept (\(t = 1.6111\)) and HML (\(t = -1\)) are not significant at the 5% level.
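Two-sided p-values give the same classification without relying on the rounded critical value. This sketch assumes \(n - k - 1 = 140\) residual degrees of freedom (three factors plus an intercept); p_vals and df_ff3 are names introduced here.

```r
coefs <- c(alpha = 0.0029, b_MKT = 0.97, s_SMB = 0.75, h_HML = -0.13)
ses   <- c(alpha = 0.0018, b_MKT = 0.08, s_SMB = 0.11, h_HML =  0.13)
n_q2   <- 144
df_ff3 <- n_q2 - 3 - 1          # assumed: three factors plus an intercept

t_stats <- coefs / ses
p_vals  <- 2 * pt(-abs(t_stats), df = df_ff3)   # two-sided p-values

print(round(cbind(t = t_stats, p = p_vals), 4))
```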

Part (g): Investment Style Classification

cat("Size loading  s =", coefs["s_SMB"], "| t =", round(t_stats["s_SMB"], 4),
    "| Significant:", significant["s_SMB"])

Size loading  s = 0.75 | t = 6.818 | Significant: TRUE

cat("\nValue loading h =", coefs["h_HML"], "| t =", round(t_stats["h_HML"], 4),
    "| Significant:", significant["h_HML"])


Value loading h = -0.13 | t = -1 | Significant: FALSE

Size tilt: \(\hat{s} = 0.75 > 0\) and statistically significant (\(t = 6.8182\)). The fund has a strong small-cap tilt — it co-moves substantially more with small-capitalisation stocks than with large-capitalisation stocks.

Value/Growth tilt: \(\hat{h} = -0.13 < 0\) but not statistically significant (\(t = -1\)). The negative sign is directionally consistent with a growth tilt, but cannot be distinguished from zero at the 5% level.

Fund classification: Small-cap oriented fund with no statistically discernible value/growth bias.

Part (h): Intercept and Managerial Value-Add

t_alpha_ff   <- t_stats["alpha"]
annual_alpha <- coefs["alpha"] * 12 * 100

cat("FF3 alpha (monthly)        :", coefs["alpha"])

FF3 alpha (monthly)        : 0.0029

cat("\nt-statistic                :", round(t_alpha_ff, 4))


t-statistic                : 1.611

cat("\nReject H0: alpha = 0?     :", abs(t_alpha_ff) > 1.98)


Reject H0: alpha = 0?     : FALSE

cat("\nAnnualised alpha (approx) :", round(annual_alpha, 4), "%")


Annualised alpha (approx) : 3.48 %

Interpretation: \(\hat{\alpha} = 0.0029\) (~0.29% per month, approximately 3.48% annualised) is the fund’s abnormal return after controlling for market, size, and value risk factor exposures.

Result: \(t = 1.6111\). Since \(|t| < 1.98\), we fail to reject \(H_0: \alpha = 0\). The manager does not add statistically significant value beyond the three factor exposures. The positive point estimate may reflect estimation error rather than genuine skill.

Part (i): Rise in \(R^2\); Role of Adjusted \(R^2\)

R2_capm_q2 <- 0.75
delta_R2   <- R2_ff3 - R2_capm_q2

cat("CAPM R-squared        :", R2_capm_q2)

CAPM R-squared        : 0.75

cat("\nFF3  R-squared        :", R2_ff3)


FF3  R-squared        : 0.92

cat("\nIncrease in R-squared :", delta_R2)


Increase in R-squared : 0.17

cat("\nFF3  Adjusted R-squared:", adjR2_ff3)


FF3  Adjusted R-squared: 0.918

# Formula check: adj R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), k = 3
adj_check <- 1 - (1 - R2_ff3) * (n_q2 - 1) / (n_q2 - 3 - 1)
cat("\nVerified Adjusted R^2 :", round(adj_check, 4))


Verified Adjusted R^2 : 0.9183

Rise from 0.75 to 0.92: Adding SMB and HML accounts for an additional 17 percentage points of return variance. The single market factor fails to capture the fund’s strong small-cap tilt; including SMB resolves this gap, leaving only 8% of return variation unexplained.

Why adjusted \(R^2\) is the appropriate metric: Raw \(R^2\) is non-decreasing in the number of predictors — adding any variable, including noise, cannot reduce it. The adjusted \(R^2\) applies a degrees-of-freedom penalty:

\[\bar{R}^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}\]

When comparing CAPM (\(k = 1\)) with FF3 (\(k = 3\)), adjusted \(R^2\) is the more appropriate criterion because it rises only when the added factors reduce residual variance by more than the degrees-of-freedom penalty costs. The adjusted \(R^2 = 0.918\) remains far above the CAPM \(R^2\) of 0.75, which confirms that the improvement is genuine, not mechanical. Given the t-statistics in Part (f), the gain is attributable mainly to SMB; HML (\(|t| = 1\)) contributes little on its own.
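The comparison can be made concrete by putting both models on the adjusted scale and by testing the two added factors jointly. The sketch below assumes that both \(R^2\) values come from the same fund and the same 144 months, so that the CAPM is nested in FF3; adj_r2, F_stat and F_crit are names introduced here.

```r
R2_capm_q2 <- 0.75; k_capm <- 1
R2_ff3     <- 0.92; k_ff3  <- 3
n_q2       <- 144

adj_r2   <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)
adj_capm <- adj_r2(R2_capm_q2, n_q2, k_capm)
adj_ff3  <- adj_r2(R2_ff3,     n_q2, k_ff3)

# Joint F-test of H0: s = h = 0 (q = 2 restrictions), from the two R^2 values
q      <- k_ff3 - k_capm
df2    <- n_q2 - k_ff3 - 1
F_stat <- ((R2_ff3 - R2_capm_q2) / q) / ((1 - R2_ff3) / df2)
F_crit <- qf(0.95, df1 = q, df2 = df2)

cat("Adjusted R^2, CAPM:", round(adj_capm, 4), "| FF3:", round(adj_ff3, 4), "\n")
cat("F =", round(F_stat, 2), "| 5% critical value =", round(F_crit, 2), "\n")
```

Under these assumptions the F-statistic is 148.75, far above the 5% critical value, so SMB and HML are jointly significant even though HML is not individually significant.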

Question 3: Logistic Regression for Market Direction [25 points]

Model: \(\text{logit}\,P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2\,\Delta VIX_{t-1}\)

Estimated coefficients: \(\hat{\beta}_0 = -0.02\), \(\hat{\beta}_1 = 5.4\), \(\hat{\beta}_2 = -0.38\). Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta VIX = 1.5\).

b0    <- -0.02;  b1 <- 5.4;  b2 <- -0.38
r_lag <- 0.010;  dVIX <- 1.5

Part (j): Predicted Probability of “Up”

Step 1 — Linear predictor (log-odds):

\[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2\,\Delta VIX_{t-1}\]

Step 2 — Sigmoid transformation:

\[\hat{P}(\text{Up}) = \frac{1}{1 + e^{-\text{logit}}}\]

logit_val <- b0 + b1 * r_lag + b2 * dVIX
prob_up   <- 1 / (1 + exp(-logit_val))

cat("Logit = -0.02 +", b1, "*", r_lag, "+", b2, "*", dVIX)

Logit = -0.02 + 5.4 * 0.01 + -0.38 * 1.5

cat("\n      =", round(logit_val, 4))


      = -0.536

cat("\nP(Up) =", round(prob_up, 4))


P(Up) = 0.3691

cat("\nPredicted class (threshold 0.5):", ifelse(prob_up >= 0.5, "Up", "Down"))


Predicted class (threshold 0.5): Down

Result:

\[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.570 = -0.536\]

\[\hat{P}(\text{Up}) = \frac{1}{1 + e^{-0.536}} = 0.3691\]

Since \(\hat{P}(\text{Up}) = 0.3691 < 0.50\), the predicted class is Down.
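As a cross-check, base R’s plogis() applies the same logistic transformation, and setting the logit to zero gives the value of \(\Delta VIX\) at which the prediction would flip. A minimal sketch; dVIX_star is a name introduced here.

```r
b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; dVIX <- 1.5

eta      <- b0 + b1 * r_lag + b2 * dVIX
p_manual <- 1 / (1 + exp(-eta))
p_plogis <- plogis(eta)          # base R logistic CDF, same transformation

# Delta-VIX at which logit = 0 (P(Up) = 0.5), holding r_lag fixed
dVIX_star <- -(b0 + b1 * r_lag) / b2

cat("P(Up), manual vs plogis:", round(p_manual, 4), round(p_plogis, 4), "\n")
cat("Break-even Delta VIX   :", round(dVIX_star, 4), "\n")
```

With the lagged return fixed at 1%, any rise in VIX above roughly 0.09 points pushes the predicted probability below 0.5, so the observed rise of 1.5 points comfortably produces a Down call.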

Part (k): Economic Interpretation of Coefficient Signs

cat("beta_1 (lagged return) =", b1, " > 0")

beta_1 (lagged return) = 5.4  > 0

cat("\nbeta_2 (delta VIX)    =", b2, " < 0")


beta_2 (delta VIX)    = -0.38  < 0

\(\hat{\beta}_1 = 5.4 > 0\) — Short-term return momentum: A positive lagged return increases the probability of an “Up” day. This captures the tendency of recent gains to predict further gains — consistent with short-term price momentum and positive return autocorrelation at short horizons.

\(\hat{\beta}_2 = -0.38 < 0\) — Volatility/fear signal: A rise in VIX decreases the probability of an “Up” day. VIX measures implied volatility and investor fear; a VIX spike signals heightened uncertainty, typically associated with selling pressure and negative returns. This reflects the well-documented risk-off dynamic.
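The magnitudes are easier to read as odds multipliers, \(e^{\beta \Delta x}\). The sketch below assumes returns are measured in decimals (as in r_lag = 0.010) and VIX changes in index points; or_ret_1pp and or_vix_1pt are names introduced here.

```r
b1 <- 5.4; b2 <- -0.38

# Odds multipliers for interpretable changes in each predictor
or_ret_1pp <- exp(b1 * 0.01)   # lagged return higher by 1 percentage point (0.01)
or_vix_1pt <- exp(b2 * 1)      # VIX change higher by 1 point

cat("Odds multiplier, +1pp lagged return:", round(or_ret_1pp, 4), "\n")
cat("Odds multiplier, +1 point Delta VIX:", round(or_vix_1pt, 4), "\n")
```

On this reading, a 1 percentage point higher lagged return raises the odds of an “Up” day by about 5.5%, while an extra 1-point rise in VIX cuts the odds by about 32%.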

Part (l): Confusion Matrix Performance Metrics

Confusion matrix (0.5 threshold, \(n = 200\) hold-out days):

	Actual Up	Actual Down	Total
Predicted Up	67	44	111
Predicted Down	33	56	89
Total	100	100	200

TP <- 67; FP <- 44; FN <- 33; TN <- 56; N <- 200

accuracy    <- (TP + TN) / N
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)

cat("Accuracy    = (67 + 56) / 200 =", round(accuracy,    4))

Accuracy    = (67 + 56) / 200 = 0.615

cat("\nSensitivity = 67 / (67 + 33)  =", round(sensitivity, 4))


Sensitivity = 67 / (67 + 33)  = 0.67

cat("\nSpecificity = 56 / (56 + 44)  =", round(specificity, 4))


Specificity = 56 / (56 + 44)  = 0.56

cat("\nPrecision   = 67 / (67 + 44)  =", round(precision,   4))


Precision   = 67 / (67 + 44)  = 0.6036

\[\text{Accuracy} = \frac{TP + TN}{N} = \frac{67 + 56}{200} = 0.615\]

\[\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{67}{100} = 0.67\]

\[\text{Specificity} = \frac{TN}{TN + FP} = \frac{56}{100} = 0.56\]

\[\text{Precision} = \frac{TP}{TP + FP} = \frac{67}{111} = 0.6036\]
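Two further summaries follow directly from the same four cell counts: the F1 score (harmonic mean of precision and sensitivity) and balanced accuracy (mean of sensitivity and specificity). A minimal sketch; f1 and balanced_acc are names introduced here.

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56

precision   <- TP / (TP + FP)
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)

f1           <- 2 * precision * sensitivity / (precision + sensitivity)
balanced_acc <- (sensitivity + specificity) / 2

cat("F1 score          :", round(f1, 4), "\n")
cat("Balanced accuracy :", round(balanced_acc, 4), "\n")
```

Because the hold-out set is balanced, balanced accuracy coincides with plain accuracy (0.615).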

Part (m): Naive Benchmark and Limitations of Accuracy

naive_acc <- max(TP + FN, TN + FP) / N   # larger actual-class count over N
cat("Naive majority-class accuracy :", naive_acc)

Naive majority-class accuracy : 0.5

cat("\nModel accuracy                :", round(accuracy, 4))


Model accuracy                : 0.615

cat("\nModel beats naive baseline?   :", accuracy > naive_acc)


Model beats naive baseline?   : TRUE

Naive rule: Since the test set is balanced (100 Up, 100 Down), there is no strict majority class, and always predicting either class achieves:

\[\text{Naive accuracy} = \frac{100}{200} = 0.50\]

The model achieves \(0.615 > 0.50\), so it outperforms the naive baseline.
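Whether a margin of 11.5 percentage points over 200 days exceeds what chance would deliver can be checked with an exact binomial test. The sketch below treats the 200 hold-out predictions as independent trials, which is an assumption for daily market data.

```r
correct <- 67 + 56     # TP + TN
N       <- 200

# H0: true accuracy = 0.5 against H1: true accuracy > 0.5
bt <- binom.test(correct, N, p = 0.5, alternative = "greater")

cat("Observed accuracy :", unname(bt$estimate), "\n")
cat("One-sided p-value :", signif(bt$p.value, 3), "\n")
cat("Reject H0 at 5%?  :", bt$p.value < 0.05, "\n")
```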

Why accuracy alone is inadequate: In a trading context, prediction errors carry asymmetric financial consequences. A false positive (predicting “Up” when the market falls) results in a losing long position, while a false negative (predicting “Down” when the market rises) results in a missed gain. Accuracy weights both error types equally and therefore fails to capture the true economic cost of misclassification.

A more economically relevant criterion is realised P&L: the portfolio return generated by acting on the model’s signals, net of transaction costs. Precision (the fraction of “Up” predictions that are correct) is the hit rate of long entry signals and is therefore more actionable for a practitioner. The AUC-ROC summarises ranking performance across all classification thresholds, while the F1 score balances precision and sensitivity at the chosen threshold.

Question 4: Resampling and Regularization in a Backtest [25 points]

Sample statistics over \(T = 48\) months: \(\bar{r} = 0.70\%\), \(\hat{\sigma} = 5.50\%\).

r_bar    <- 0.0070
sigma    <- 0.0550
T_months <- 48

Part (n): Monthly and Annualised Sharpe Ratio

\[SR_{\text{monthly}} = \frac{\bar{r}}{\hat{\sigma}} = \frac{0.0070}{0.0550}\]

\[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]

SR_monthly   <- r_bar / sigma
scale_factor <- sqrt(12)
SR_annual    <- SR_monthly * scale_factor

cat("Monthly Sharpe Ratio    :", round(SR_monthly, 4))

Monthly Sharpe Ratio    : 0.1273

cat("\nScaling factor (sqrt 12):", round(scale_factor, 4))


Scaling factor (sqrt 12): 3.464

cat("\nAnnualised Sharpe Ratio :", round(SR_annual, 4))


Annualised Sharpe Ratio : 0.4409

\[SR_{\text{monthly}} = \frac{0.0070}{0.0550} = 0.1273\]

\[SR_{\text{annual}} = 0.1273 \times \sqrt{12} = 0.1273 \times 3.4641 = 0.4409\]

Scaling factor: \(\sqrt{12}\) — under i.i.d. monthly returns, the annualised mean scales by 12 and the annualised standard deviation scales by \(\sqrt{12}\), so their ratio scales by \(12 / \sqrt{12} = \sqrt{12}\).

Part (o): Bootstrap Standard Error for the Sharpe Ratio

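# Only summary statistics are given, so a synthetic series drawn from a normal
# distribution with the sample mean and SD stands in for the observed returns.
set.seed(2024)
r_synthetic <- rnorm(T_months, mean = r_bar, sd = sigma)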

B       <- 10000
sr_boot <- numeric(B)
for (b in seq_len(B)) {
  r_b       <- sample(r_synthetic, size = T_months, replace = TRUE)
  sr_boot[b] <- mean(r_b) / sd(r_b)
}

SE_boot <- sd(sr_boot)
CI_95   <- quantile(sr_boot, c(0.025, 0.975))

cat("Bootstrap replications (B)    :", B)

Bootstrap replications (B)    : 10000

cat("\nBootstrap SE (monthly SR)     :", round(SE_boot, 4))


Bootstrap SE (monthly SR)     : 0.1473

cat("\n95% Percentile CI (monthly SR):", round(CI_95[1], 4), "to", round(CI_95[2], 4))


95% Percentile CI (monthly SR): -0.2696 to 0.3113

Bootstrap procedure — step by step:

1. Begin with the observed monthly return series \(\{r_1, r_2, \ldots, r_{48}\}\).
2. Draw \(B = 10{,}000\) bootstrap samples, each of size \(T = 48\), with replacement.
3. For each bootstrap sample \(b\), compute \(SR^{(b)} = \bar{r}^{(b)} / \hat{\sigma}^{(b)}\).
4. The bootstrap standard error is \(SE_{\text{boot}} = \text{SD}(\{SR^{(1)}, \ldots, SR^{(B)}\})\).
5. The 95% confidence interval is given by the 2.5th and 97.5th percentiles of the bootstrap distribution.
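A closed-form benchmark is available for comparison: under i.i.d. normal returns, the large-sample standard error of the Sharpe ratio is approximately \(\sqrt{(1 + SR^2/2)/T}\), a result usually attributed to Lo (2002). A minimal sketch; SE_iid is a name introduced here.

```r
r_bar    <- 0.0070
sigma    <- 0.0550
T_months <- 48

SR_monthly <- r_bar / sigma

# Large-sample SE of the Sharpe ratio under i.i.d. normal returns
SE_iid <- sqrt((1 + 0.5 * SR_monthly^2) / T_months)

cat("Analytic SE (monthly SR):", round(SE_iid, 4), "\n")
```

This gives about 0.145, close to the bootstrap standard error of 0.1473, as expected given that the i.i.d. bootstrap rests on the same independence assumption.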

Why i.i.d. bootstrap is inappropriate: Monthly financial returns exhibit serial dependence: momentum produces positive autocorrelation, and volatility clusters (as modelled by GARCH processes). The i.i.d. bootstrap destroys the temporal ordering and so ignores this dependence; under positive autocorrelation it typically understates the true variance of the Sharpe ratio estimator and produces artificially narrow confidence intervals.

Recommended variant: Stationary Block Bootstrap (Politis & Romano, 1994). Rather than resampling individual observations, this procedure draws contiguous blocks of consecutive returns whose lengths are random, following a geometric distribution with mean \(\ell\) (e.g., \(\ell = 6\) to 12 months). Blocking preserves short-run serial dependence, and the random block lengths keep the resampled series stationary, while still providing asymptotically valid inference. The mean block length is chosen to cover the dominant autocorrelation horizon.
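A minimal base-R sketch of the stationary bootstrap follows. At each step a new block starts at a random position with probability \(1/\ell\); otherwise the next consecutive observation is taken, wrapping around at the end of the sample. The function name stationary_boot_indices, the mean block length of 6 and the reduced B = 2000 are choices made here for illustration, and the return series is simulated because the observed returns are not available.

```r
set.seed(2024)
T_months <- 48
# Placeholder series: simulated, NOT the fund's observed returns
r <- rnorm(T_months, mean = 0.0070, sd = 0.0550)

# Index generator: geometric block lengths with mean `mean_block`, circular wrap
stationary_boot_indices <- function(n, mean_block) {
  p      <- 1 / mean_block
  idx    <- integer(n)
  idx[1] <- sample.int(n, 1)
  for (t in 2:n) {
    if (runif(1) < p) {
      idx[t] <- sample.int(n, 1)        # start a new block
    } else {
      idx[t] <- idx[t - 1] %% n + 1     # continue the block, wrapping n -> 1
    }
  }
  idx
}

B     <- 2000
sr_sb <- replicate(B, {
  r_b <- r[stationary_boot_indices(T_months, mean_block = 6)]
  mean(r_b) / sd(r_b)
})

SE_sb <- sd(sr_sb)
cat("Stationary-bootstrap SE (monthly SR):", round(SE_sb, 4), "\n")
```

Because the placeholder series is itself i.i.d., the result should be similar to the i.i.d. bootstrap; the two are expected to diverge only on returns with genuine serial dependence.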

Part (p): LASSO \(\lambda\) Selection

results_lasso <- data.frame(
  Rule             = c("Minimum CV Error", "One-Standard-Error Rule"),
  Lambda           = c(0.030, 0.065),
  Factors_Retained = c(14, 7),
  Recommendation   = c("Risk of overfitting", "Recommended for deployment")
)
knitr::kable(results_lasso,
             col.names = c("Selection Rule", "Lambda", "Factors Retained", "Assessment"))

Selection Rule	Lambda	Factors Retained	Assessment
Minimum CV Error	0.030	14	Risk of overfitting
One-Standard-Error Rule	0.065	7	Recommended for deployment

Recommendation: Deploy \(\lambda = 0.065\) (one-standard-error rule), retaining 7 factors.

Overfitting risk: With 60 candidate factors and a finite backtest window, the minimum-CV solution (14 factors) likely includes spurious predictors. Multiple-testing and data-mining biases are pervasive in quantitative finance; parsimonious models generalise more reliably out-of-sample.
One-SE principle: The 1-SE rule selects the most parsimonious model whose CV error is statistically indistinguishable from the minimum (within one standard error). Since the CV error curve is itself noisy, the apparent minimum may not be a genuine optimum. Preferring the simpler model is statistically and economically defensible.
Economic durability: Factors in finance tend to decay as capital arbitrages them away. A 7-factor model is more interpretable, imposes lower transaction costs, and is more robust to regime changes than a 14-factor model loaded with marginal predictors.
Parsimony principle: When two models perform comparably in cross-validation, prefer the simpler one. Deploy \(\lambda = 0.065\).
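The mechanics of the one-standard-error rule can be sketched on a cross-validation curve. The grid below is hypothetical: the error values are made up for illustration and chosen only so that the two rules land on \(\lambda = 0.030\) and \(\lambda = 0.065\) as in the table; they are not output from the actual backtest. The column names cvm and cvsd mirror the fields that glmnet’s cv.glmnet() returns, where the two selections are reported as lambda.min and lambda.1se.

```r
# Hypothetical CV curve (illustrative values, not real backtest output)
cv <- data.frame(
  lambda = c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090, 0.120),
  cvm    = c(1.30,  1.22,  1.18,  1.19,  1.21,  1.27,  1.35),  # mean CV error
  cvsd   = rep(0.04, 7)                                         # SE of CV error
)

# Rule 1: lambda with the smallest mean CV error
i_min      <- which.min(cv$cvm)
lambda_min <- cv$lambda[i_min]

# Rule 2: largest lambda whose CV error is within one SE of the minimum
threshold  <- cv$cvm[i_min] + cv$cvsd[i_min]
lambda_1se <- max(cv$lambda[cv$cvm <= threshold])

cat("lambda (min CV error):", lambda_min, "\n")
cat("lambda (1-SE rule)   :", lambda_1se, "\n")
```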

Part (q): Walk-Forward Cross-Validation

wf <- data.frame(
  Period        = paste("Period", 1:5),
  Train_Window  = c("Months  1–36", "Months  1–42", "Months  1–48",
                    "Months  1–54", "Months  1–60"),
  Test_Window   = c("Months 37–42", "Months 43–48", "Months 49–54",
                    "Months 55–60", "Months 61–66"),
  Train_n       = c(36, 42, 48, 54, 60),
  Test_n        = c(6, 6, 6, 6, 6)
)
knitr::kable(wf,
             col.names = c("Period", "Training Window", "Test Window", "Train n", "Test n"))

Period	Training Window	Test Window	Train n	Test n
Period 1	Months 1–36	Months 37–42	36	6
Period 2	Months 1–42	Months 43–48	42	6
Period 3	Months 1–48	Months 49–54	48	6
Period 4	Months 1–54	Months 55–60	54	6
Period 5	Months 1–60	Months 61–66	60	6

Walk-forward (expanding-window) procedure:

1. Initialise: Fix a minimum training length \(W\) (e.g., 36 months) and forecast horizon \(h\) (e.g., 6 months).
2. Fit: Estimate the LASSO model on months \(\{1, \ldots, W\}\), applying the 1-SE rule to select \(\lambda\).
3. Forecast: Generate out-of-sample predictions for months \(\{W+1, \ldots, W+h\}\) using only information available at time \(W\).
4. Evaluate: Record realised returns and P&L for the test window.
5. Roll forward: Expand the training window to \(\{1, \ldots, W+h\}\) and repeat from Step 2.
6. Aggregate: Concatenate all out-of-sample windows; compute Sharpe ratio, maximum drawdown, and information ratio over the full out-of-sample track record.
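The window bookkeeping in the procedure above can be sketched as a small helper that returns the train and test indices for each period. The function name walk_forward_splits and its arguments are introduced here for illustration; model fitting and evaluation are left out.

```r
# Expanding-window splits: train on 1..w, test on (w+1)..(w+horizon)
walk_forward_splits <- function(n_obs, min_train, horizon) {
  train_ends <- seq(from = min_train, to = n_obs - horizon, by = horizon)
  lapply(train_ends, function(w) {
    list(train = seq_len(w), test = (w + 1):(w + horizon))
  })
}

splits <- walk_forward_splits(n_obs = 66, min_train = 36, horizon = 6)

for (i in seq_along(splits)) {
  s <- splits[[i]]
  cat(sprintf("Period %d: train 1-%d, test %d-%d\n",
              i, max(s$train), min(s$test), max(s$test)))
}
```

With 66 months, a 36-month minimum training window and a 6-month horizon, this reproduces the five periods in the table, and every training index precedes every test index in its period.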

Why standard \(k\)-fold CV is unsafe: Random \(k\)-fold CV assigns observations to folds without regard to time, so a training fold may contain observations that chronologically post-date the test fold. This is look-ahead bias — the model is effectively trained on future data, producing unrealistically optimistic performance estimates that will not materialise in live trading.

The violation is especially severe for financial time series because: (i) returns exhibit serial dependence that random shuffling destroys, causing CV error to underestimate true prediction error; and (ii) non-stationarity and regime changes mean earlier and later periods are not exchangeable. Walk-forward CV strictly enforces the information frontier — at every step, the model may only use data prior to the test window, exactly replicating real-time constraints.

Machine Learning Applications in Finance – Final Examination

Nomin Ayurzana

June 08, 2026