Question 1 — Single-Factor (Market) Model \[25 points\]

Model: \(R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\)

Given:

Term	Estimate	Std. Error
Intercept (\(\alpha\))	0.0017	0.0020
Market premium (\(\beta\))	0.98	0.17

\(R^2 = 0.50\), \(E[R_m - R_f] = 0.70\%\), \(n = 96\) months, critical \(|t| \approx 1.98\).

(a) t-statistic for \(\beta\); test \(H_0: \beta = 0\)

beta_hat <- 0.98
se_beta  <- 0.17
t_beta   <- beta_hat / se_beta
t_crit   <- 1.98

cat("t-statistic for beta:", round(t_beta, 4), "\n")

t-statistic for beta: 5.7647

cat("Critical |t|:", t_crit, "\n")

Critical |t|: 1.98

cat("Decision:", ifelse(abs(t_beta) > t_crit, "Reject H0: beta = 0", "Fail to reject H0: beta = 0"), "\n")

Decision: Reject H0: beta = 0

Formula: \[t_{\hat{\beta}} = \frac{\hat{\beta}}{SE(\hat{\beta})} = \frac{0.98}{0.17} = 5.7647\]

Decision: \(|5.7647| > 1.98\) → Reject \(H_0: \beta = 0\) at the 5% significance level.

Economic interpretation: \(\hat{\beta} \approx 0.98\) means the fund’s excess return moves almost one-for-one with the market excess return. A 1 percentage point increase in the market excess return is associated with an increase of approximately 0.98 percentage points in the fund’s excess return. The fund has near-market systematic risk.

(b) Test \(H_0: \beta = 1\)

t_beta1 <- (beta_hat - 1) / se_beta

cat("t-statistic for H0: beta = 1:", round(t_beta1, 4), "\n")

t-statistic for H0: beta = 1: -0.1176

cat("Decision:", ifelse(abs(t_beta1) > t_crit, "Reject H0: beta = 1", "Fail to reject H0: beta = 1"), "\n")

Decision: Fail to reject H0: beta = 1

Formula: \[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} = \frac{0.98 - 1}{0.17} = \frac{-0.02}{0.17} = -0.1176\]

Decision: \(|-0.1176| < 1.98\) → Fail to reject \(H_0: \beta = 1\) at the 5% level.

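Interpretation: The fund’s systematic risk is statistically indistinguishable from that of the market (\(\beta = 1\)), so the data give no basis for classifying it as a leveraged (\(\beta > 1\)) or a defensive (\(\beta < 1\)) fund. Failing to reject \(H_0\) does not prove that \(\beta = 1\), however: with a standard error of 0.17 the estimate is imprecise, and the evidence is merely consistent with market exposure equivalent to a passive index position.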
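A confidence interval conveys the same conclusion as the two tests and shows how much uncertainty surrounds the estimate. The sketch below reuses the reported estimate, standard error, and the approximate critical value 1.98 from the question; the variable name `ci_beta` is illustrative.

```r
# 95% confidence interval for beta, using the values reported in the question
beta_hat <- 0.98
se_beta  <- 0.17
t_crit   <- 1.98   # approximate two-sided 5% critical value given in the question

ci_beta <- beta_hat + c(-1, 1) * t_crit * se_beta
cat("95% CI for beta: [", round(ci_beta[1], 4), ",", round(ci_beta[2], 4), "]\n")

# The interval contains 1 (consistent with failing to reject H0: beta = 1)
# and excludes 0 (consistent with rejecting H0: beta = 0 in part (a)).
cat("Contains 1:", ci_beta[1] < 1 && 1 < ci_beta[2], "\n")
cat("Contains 0:", ci_beta[1] < 0 && 0 < ci_beta[2], "\n")
```

The interval runs from roughly 0.64 to 1.32, so betas somewhat below or above 1 remain compatible with the data.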

(c) t-statistic for \(\alpha\) (Jensen’s Alpha)

alpha_hat <- 0.0017
se_alpha  <- 0.0020
t_alpha   <- alpha_hat / se_alpha

cat("Jensen's alpha:", alpha_hat, "\n")

Jensen's alpha: 0.0017

cat("t-statistic for alpha:", round(t_alpha, 4), "\n")

t-statistic for alpha: 0.85

cat("Decision:", ifelse(abs(t_alpha) > t_crit, "Reject H0: alpha = 0", "Fail to reject H0: alpha = 0"), "\n")

Decision: Fail to reject H0: alpha = 0

Formula: \[t_{\hat{\alpha}} = \frac{\hat{\alpha}}{SE(\hat{\alpha})} = \frac{0.0017}{0.0020} = 0.8500\]

Decision: \(|0.8500| < 1.98\) → Fail to reject \(H_0: \alpha = 0\) at the 5% level.

Conclusion: The data do not statistically justify the marketing claim of “positive risk-adjusted performance.” Although the point estimate \(\hat{\alpha} = 0.0017\) (0.17% per month) is positive, the t-statistic of 0.85 corresponds to a two-sided p-value of approximately 0.40. In other words, if the true \(\alpha\) were zero, a t-statistic at least this large in absolute value would still occur in about 40% of samples. The claim is statistically unwarranted.
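The p-value quoted above can be checked directly from the Student t distribution. This is a minimal sketch that assumes the residual degrees of freedom are \(n - 2 = 94\) (one intercept and one slope); the question itself only supplies the approximate critical value.

```r
# Two-sided p-value for the alpha t-statistic
# Assumption: residual degrees of freedom = n - 2 (intercept and one slope)
n        <- 96
t_alpha  <- 0.0017 / 0.0020
df_resid <- n - 2

p_alpha <- 2 * pt(-abs(t_alpha), df = df_resid)
cat("Two-sided p-value for alpha:", round(p_alpha, 4), "\n")
```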

(d) Interpretation of \(R^2 = 0.50\)

50% of the monthly variation in the fund’s excess return is explained by market movements — this is the systematic (non-diversifiable) component of risk.

The remaining 50% is idiosyncratic (diversifiable) risk — return variation specific to the fund’s individual holdings, unrelated to market-wide movements. A fully diversified investor would not be compensated for bearing this component.

\[R^2 = \frac{\text{Systematic variance}}{\text{Total variance}} = 0.50\]
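The variance decomposition behind this ratio can be illustrated on simulated data. The sketch below is not based on the fund's actual returns, which the question does not supply; the market and residual volatilities are assumed values chosen so that the market explains roughly half of the variance.

```r
# Simulated illustration (hypothetical data, not the exam's fund)
set.seed(123)
n    <- 96
mkt  <- rnorm(n, mean = 0.007, sd = 0.045)   # assumed market excess returns
eps  <- rnorm(n, mean = 0,     sd = 0.044)   # assumed idiosyncratic shocks
fund <- 0.0017 + 0.98 * mkt + eps

fit <- lm(fund ~ mkt)
b   <- unname(coef(fit)["mkt"])

systematic_var    <- b^2 * var(mkt)          # variance explained by the market
idiosyncratic_var <- var(residuals(fit))     # variance left in the residuals
total_var         <- var(fund)

cat("R^2 from lm():              ", round(summary(fit)$r.squared, 4), "\n")
cat("Systematic / total variance:", round(systematic_var / total_var, 4), "\n")
cat("Idiosyncratic share:        ", round(idiosyncratic_var / total_var, 4), "\n")
```

In a single-factor regression with an intercept, the first two printed numbers coincide and the two shares sum to one, which is the identity the formula above expresses.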

(e) CAPM-implied expected monthly excess return

E_mkt_premium <- 0.0070  # 0.70% expressed as decimal
E_fund_excess <- beta_hat * E_mkt_premium

cat("CAPM-implied expected monthly excess return:", round(E_fund_excess * 100, 4), "%\n")

CAPM-implied expected monthly excess return: 0.686 %

Formula: \[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f] = 0.98 \times 0.70\% = \mathbf{0.6860\%}\]

Question 2 — Fama–French Three-Factor Model \[25 points\]

Model: \(R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\)

Given (\(n = 144\) months):

Term	Estimate	Std. Error
Intercept (\(\alpha\))	0.0029	0.0018
MKT (\(b\))	0.97	0.08
SMB (\(s\))	0.75	0.11
HML (\(h\))	−0.13	0.13

\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t| \approx 1.98\).

(f) t-statistics for all four coefficients

coefs <- data.frame(
  Term      = c("alpha (intercept)", "b (MKT)", "s (SMB)", "h (HML)"),
  Estimate  = c(0.0029, 0.97, 0.75, -0.13),
  Std_Error = c(0.0018, 0.08, 0.11, 0.13)
)

coefs$t_stat     <- round(coefs$Estimate / coefs$Std_Error, 4)
coefs$Significant <- ifelse(abs(coefs$t_stat) > 1.98, "Yes (5%)", "No")

print(coefs, row.names = FALSE)

              Term Estimate Std_Error  t_stat Significant
 alpha (intercept)   0.0029    0.0018  1.6111          No
           b (MKT)   0.9700    0.0800 12.1250    Yes (5%)
           s (SMB)   0.7500    0.1100  6.8182    Yes (5%)
           h (HML)  -0.1300    0.1300 -1.0000          No

Formulas: \[t_\alpha = \frac{0.0029}{0.0018} = 1.6111 \quad t_b = \frac{0.97}{0.08} = 12.1250 \quad t_s = \frac{0.75}{0.11} = 6.8182 \quad t_h = \frac{-0.13}{0.13} = -1.0000\]

Significant at 5%: \(b\) (MKT) and \(s\) (SMB) only. \(\alpha\) and \(h\) (HML) are not significant.
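The same significance decisions can be read off two-sided p-values. This sketch assumes residual degrees of freedom of \(n - 4 = 140\) (intercept plus three slopes); the vector names are illustrative.

```r
# Two-sided p-values for the three-factor coefficients
# Assumption: residual degrees of freedom = n - 4 (intercept plus three slopes)
est <- c(alpha = 0.0029, MKT = 0.97, SMB = 0.75, HML = -0.13)
se  <- c(alpha = 0.0018, MKT = 0.08, SMB = 0.11, HML = 0.13)
df_resid <- 144 - 4

t_stat <- est / se
p_val  <- 2 * pt(-abs(t_stat), df = df_resid)

print(round(cbind(t_stat, p_val), 4))
```

Coefficients with a p-value below 0.05 are the ones whose \(|t|\) exceeds the critical value, so the table should agree with the decisions above.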

(g) Investment style classification

s_hat <- 0.75
h_hat <- -0.13

cat("SMB loading (s):", s_hat, "→", ifelse(s_hat > 0, "Small-cap tilt", "Large-cap tilt"), "\n")

SMB loading (s): 0.75 → Small-cap tilt

cat("HML loading (h):", h_hat, "→", ifelse(h_hat < 0, "Growth tilt", "Value tilt"),
    "(not significant)\n")

HML loading (h): -0.13 → Growth tilt (not significant)

cat("Overall style: Small-cap blend / slight growth tilt\n")

Overall style: Small-cap blend / slight growth tilt

Size tilt: \(\hat{s} = +0.75\) (significant at 5%) → the fund co-moves strongly with small-cap stocks. It tilts toward smaller companies relative to large-caps.
Value/Growth tilt: \(\hat{h} = -0.13\) (not significant, \(|t| = 1.0\)) → slight lean toward growth stocks, but statistically indistinguishable from zero. No reliable value or growth classification can be made.

Overall classification: Small-cap blend (no statistically significant value/growth tilt).

(h) Intercept interpretation

alpha_ff <- 0.0029
se_alpha_ff <- 0.0018
t_alpha_ff <- alpha_ff / se_alpha_ff

cat("FF alpha:", alpha_ff, "(", round(alpha_ff * 100, 2), "% per month )\n")

FF alpha: 0.0029 ( 0.29 % per month )

cat("t-statistic:", round(t_alpha_ff, 4), "\n")

t-statistic: 1.6111

cat("Decision:", ifelse(abs(t_alpha_ff) > 1.98, "Significant — manager adds value",
                        "Not significant — no evidence of value-added"), "\n")

Decision: Not significant — no evidence of value-added

\(\hat{\alpha} = 0.0029\) (0.29% per month). With \(t = 1.61 < 1.98\), this is not statistically significant at the 5% level.

Conclusion: After accounting for exposure to the market, size, and value factors, there is no statistically reliable evidence that the manager generates returns beyond what the three systematic factors predict. The positive point estimate may be due to sampling variation. The manager does not demonstrably add value beyond factor exposures.

(i) R² rise from 0.75 to 0.92; role of adjusted R²

R2_capm <- 0.75
R2_ff   <- 0.92
adj_R2  <- 0.918
n       <- 144
k_capm  <- 1   # one predictor
k_ff    <- 3   # three predictors

adj_R2_capm_calc <- 1 - (1 - R2_capm) * (n - 1) / (n - k_capm - 1)
adj_R2_ff_calc   <- 1 - (1 - R2_ff)   * (n - 1) / (n - k_ff - 1)

cat("CAPM R²:", R2_capm, "| Approx Adj R²:", round(adj_R2_capm_calc, 4), "\n")

CAPM R²: 0.75 | Approx Adj R²: 0.7482

cat("FF   R²:", R2_ff,   "| Approx Adj R²:", round(adj_R2_ff_calc,   4), "\n")

FF   R²: 0.92 | Approx Adj R²: 0.9183

cat("Rise in R²:", R2_ff - R2_capm, "\n")

Rise in R²: 0.17

What the rise indicates: Adding SMB and HML increases \(R^2\) from 0.75 to 0.92, so the two extra factors explain an additional 17 percentage points of the fund’s return variation. This gain is most plausibly attributable to the fund’s significant small-cap loading (\(s = 0.75\)), since the HML loading is insignificant; the fund’s returns behave like those of small-cap stocks, a dimension the single-factor CAPM ignores.

Why adjusted \(R^2\) is appropriate for model comparison:

\[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n - k - 1}\]

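Adding any predictor, even a pure noise variable, can never lower \(R^2\) and in practice almost always raises it. Adjusted \(R^2\) penalizes additional parameters through the degrees-of-freedom correction; it rises only if the new predictor reduces the residual variance by enough to offset the lost degree of freedom (for a single added regressor, only if its \(|t|\) exceeds 1). When comparing models with different numbers of predictors (1 vs. 3 factors here), adjusted \(R^2\) is therefore the more appropriate metric for checking that the improvement is not mechanical.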
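The behaviour of the two measures when an irrelevant regressor is added can be seen in a small simulation. The data below are hypothetical and unrelated to the fund; `noise` is generated independently of `y` by construction.

```r
# Simulated illustration (hypothetical data): add a pure-noise regressor
set.seed(2026)
n     <- 144
x     <- rnorm(n)
y     <- 0.5 * x + rnorm(n)
noise <- rnorm(n)            # unrelated to y by construction

fit_small <- lm(y ~ x)
fit_big   <- lm(y ~ x + noise)

r2     <- c(small = summary(fit_small)$r.squared,
            big   = summary(fit_big)$r.squared)
adj_r2 <- c(small = summary(fit_small)$adj.r.squared,
            big   = summary(fit_big)$adj.r.squared)

cat("R^2 (small, big):         ", round(r2, 4), "\n")
cat("Adjusted R^2 (small, big):", round(adj_r2, 4), "\n")
```

The plain \(R^2\) of the larger model cannot be below that of the smaller one, whereas the adjusted figure is free to fall when the added variable contributes little.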

Question 3 — Logistic Regression for Market Direction \[25 points\]

Model: \(\text{logit}\, P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\)

Coefficients: \(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\). Inputs: \(r_{t-1} = 0.010\), \(\Delta VIX = 1.5\).

(j) Predicted probability and class

b0 <- -0.02;  b1 <- 5.4;  b2 <- -0.38
r_lag  <- 0.010;  delta_vix <- 1.5

logit_val <- b0 + b1 * r_lag + b2 * delta_vix
cat("logit =", b0, "+", b1, "*", r_lag, "+", b2, "*", delta_vix, "\n")

logit = -0.02 + 5.4 * 0.01 + -0.38 * 1.5

cat("logit =", round(logit_val, 4), "\n\n")

logit = -0.536

prob_up <- 1 / (1 + exp(-logit_val))
cat("P(Up) = 1 / (1 + exp(", round(-logit_val, 4), ")) =", round(prob_up, 4), "\n")

P(Up) = 1 / (1 + exp( 0.536 )) = 0.3691

cat("Predicted class (threshold 0.5):", ifelse(prob_up >= 0.5, "Up", "Down"), "\n")

Predicted class (threshold 0.5): Down

Formula: \[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.570 = -0.5360\]

\[P(\text{Up}) = \frac{1}{1 + e^{0.5360}} = \frac{1}{1 + 1.7092} = 0.3691\]

Since \(0.3691 < 0.50\) → Predicted class: Down
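As a cross-check on the hand calculation, base R's `plogis()` evaluates the logistic function \(1 / (1 + e^{-x})\) directly. The sketch below recomputes the probability from the coefficients and inputs given in the question.

```r
b0 <- -0.02; b1 <- 5.4; b2 <- -0.38
r_lag <- 0.010; delta_vix <- 1.5

logit_val <- b0 + b1 * r_lag + b2 * delta_vix
prob_up   <- plogis(logit_val)     # base R logistic CDF: 1 / (1 + exp(-x))

cat("P(Up) via plogis():", round(prob_up, 4), "\n")
# With a 0.5 threshold, the predicted class is "Up" exactly when the logit is >= 0
cat("Predicted class:", ifelse(logit_val >= 0, "Up", "Down"), "\n")
```

Because the logistic function equals 0.5 at a logit of zero, the sign of the logit alone determines the predicted class at this threshold.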

(k) Economic interpretation of \(\beta_1\) and \(\beta_2\)

\(\beta_1 = +5.4\) (lagged return): A positive lagged market return increases the log-odds of an “Up” day. This captures short-term momentum: a market that rose yesterday is more likely to rise today. The coefficient looks large only because returns are measured in decimals; a lagged return of 1% (0.01) raises the log-odds by just \(5.4 \times 0.01 = 0.054\), so the effect of a typical daily move is modest.

\(\beta_2 = -0.38\) (\(\Delta\)VIX): A rise in implied volatility (the “fear index”) decreases the probability of an “Up” day. This reflects the well-documented negative volatility–return relationship: when uncertainty spikes, risk-averse investors sell, driving prices down. A 1-point rise in lagged \(\Delta\)VIX reduces the log-odds by 0.38.
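Log-odds coefficients are often easier to read as odds ratios. The sketch below converts the two slopes into multiplicative effects on the odds of an “Up” day; the choice of a 0.01 change in the lagged return and a 1-point change in \(\Delta\)VIX as reference moves is an assumption made for illustration.

```r
b1 <- 5.4; b2 <- -0.38

# Odds multiplier for a 1 percentage point (0.01) higher lagged return
or_return_1pp <- exp(b1 * 0.01)
# Odds multiplier for a 1-point increase in the lagged change in VIX
or_vix_1pt <- exp(b2)

cat("Odds ratio, +0.01 lagged return:", round(or_return_1pp, 4), "\n")
cat("Odds ratio, +1 point delta VIX: ", round(or_vix_1pt, 4), "\n")
```

Holding the other variable fixed, a 1 percentage point higher lagged return raises the odds by about 5.5%, while a 1-point rise in \(\Delta\)VIX lowers them by about 32%.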

(l) Confusion matrix metrics

TP <- 67; FP <- 44; FN <- 33; TN <- 56
N  <- TP + FP + FN + TN

accuracy    <- (TP + TN) / N
sensitivity <- TP / (TP + FN)   # True Positive Rate / Recall
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)

cat("Confusion Matrix:\n")

Confusion Matrix:

cat("                 Actual Up   Actual Down\n")

                 Actual Up   Actual Down

cat("Predicted Up    ", TP, "(TP)   ", FP, "(FP)\n")

Predicted Up     67 (TP)    44 (FP)

cat("Predicted Down  ", FN, "(FN)   ", TN, "(TN)\n\n")

Predicted Down   33 (FN)    56 (TN)

cat("Accuracy    = (", TP, "+", TN, ") /", N, "=", round(accuracy, 4), "\n")

Accuracy    = ( 67 + 56 ) / 200 = 0.615

cat("Sensitivity = ", TP, "/ (", TP, "+", FN, ") =", round(sensitivity, 4), "\n")

Sensitivity =  67 / ( 67 + 33 ) = 0.67

cat("Specificity = ", TN, "/ (", TN, "+", FP, ") =", round(specificity, 4), "\n")

Specificity =  56 / ( 56 + 44 ) = 0.56

cat("Precision   = ", TP, "/ (", TP, "+", FP, ") =", round(precision, 4), "\n")

Precision   =  67 / ( 67 + 44 ) = 0.6036

Formulas:

\[\text{Accuracy} = \frac{TP + TN}{N} = \frac{67 + 56}{200} = 0.6150\]

\[\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{67}{100} = 0.6700\]

\[\text{Specificity} = \frac{TN}{TN + FP} = \frac{56}{100} = 0.5600\]

\[\text{Precision} = \frac{TP}{TP + FP} = \frac{67}{111} = 0.6036\]

(m) Naive benchmark and limitations of accuracy

# Dataset is balanced: 100 Up, 100 Down
naive_accuracy <- 100 / 200
model_accuracy <- (TP + TN) / N

cat("Naive majority-class accuracy:", naive_accuracy, "\n")

Naive majority-class accuracy: 0.5

cat("Model accuracy:               ", round(model_accuracy, 4), "\n")

Model accuracy:                0.615

cat("Model beats naive benchmark:  ", model_accuracy > naive_accuracy, "\n")

Model beats naive benchmark:   TRUE

Naive accuracy: The dataset is balanced (100 Up, 100 Down), so there is no strict majority class; always predicting either class gives accuracy = \(100/200 = \mathbf{0.50}\).

Model accuracy = 0.615 > 0.50 → The model beats the naive benchmark.
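Whether a margin of 0.615 over 0.50 could plausibly arise by luck can be checked with an exact binomial test. This is a rough sketch that assumes the 200 predictions are independent, which may not hold for serially dependent market data.

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56
correct <- TP + TN
N       <- TP + FP + FN + TN

# One-sided exact binomial test of H0: accuracy = 0.5 against accuracy > 0.5
# Assumption: the N predictions are independent Bernoulli trials
bt <- binom.test(correct, N, p = 0.5, alternative = "greater")
cat("Observed accuracy:", correct / N, "\n")
cat("One-sided p-value:", signif(bt$p.value, 3), "\n")
```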

Why accuracy alone is inadequate for a trading system: In trading, false positives (predicting Up when the market falls) and false negatives (predicting Down when the market rises) carry asymmetric economic costs depending on position sizing, transaction costs, and risk tolerance. Accuracy treats all errors equally, ignoring P&L consequences. For example, a model that is always right on small Up days but always wrong on large Down days may have high accuracy but negative expected P&L.

A more economically relevant criterion: The Sharpe ratio (or information ratio) of the resulting trading strategy’s P&L — this directly measures risk-adjusted profitability. Alternatively, for a long-only strategy, Precision is key: it measures what fraction of “Up” predictions actually result in gains.
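The gap between accuracy and profitability can be made concrete with a toy example in the spirit of the small-Up-days, large-Down-days case described above. All numbers are hypothetical, and the long-only trading rule is an assumption chosen for simplicity.

```r
# Hypothetical toy data: nine small up days and one large down day
ret  <- c(rep(0.002, 9), -0.03)          # daily market returns (assumed)
pred <- rep("Up", 10)                     # a model that always predicts "Up"

actual   <- ifelse(ret > 0, "Up", "Down")
accuracy <- mean(pred == actual)

# Long-only rule: hold the market when the prediction is "Up", else stay flat
position <- ifelse(pred == "Up", 1, 0)
pnl      <- position * ret

cat("Accuracy:         ", accuracy, "\n")
cat("Total P&L:        ", sum(pnl), "\n")
cat("Per-period Sharpe:", round(mean(pnl) / sd(pnl), 4), "\n")
```

Here the classifier is right nine times out of ten, yet the single large loss outweighs the nine small gains, so both total P&L and the Sharpe ratio are negative.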

Question 4 — Resampling and Regularization in a Backtest \[25 points\]

Strategy: \(\bar{r} = 0.70\%\) per month, \(s = 5.50\%\) per month, \(n = 48\) months.

(n) Sharpe ratio — monthly and annualized

r_bar <- 0.0070   # 0.70%
s_r   <- 0.0550   # 5.50%
n_obs <- 48

SR_monthly  <- r_bar / s_r
scale       <- sqrt(12)            # annualisation scaling factor
SR_annual   <- SR_monthly * scale

cat("Monthly Sharpe ratio  = ", round(r_bar * 100, 2), "% /", round(s_r * 100, 2), "%\n")

Monthly Sharpe ratio  =  0.7 % / 5.5 %

cat("                      =", round(SR_monthly, 4), "\n\n")

                      = 0.1273

cat("Scaling factor        = sqrt(12) =", round(scale, 4), "\n")

Scaling factor        = sqrt(12) = 3.4641

cat("Annualized Sharpe     =", round(SR_monthly, 4), "x", round(scale, 4), "\n")

Annualized Sharpe     = 0.1273 x 3.4641

cat("                      =", round(SR_annual, 4), "\n")

                      = 0.4409

Formulas:

\[SR_{\text{monthly}} = \frac{\bar{r}}{s} = \frac{0.0070}{0.0550} = 0.1273\]

\[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12} = \frac{0.0070}{0.0550} \times 3.4641 = \mathbf{0.4409}\]

Scaling factor justification: Under i.i.d. returns, \(\sigma^2_{\text{annual}} = 12 \cdot \sigma^2_{\text{monthly}}\), so \(\sigma_{\text{annual}} = \sqrt{12} \cdot \sigma_{\text{monthly}}\). Since the mean also scales by 12, the Sharpe ratio scales by \(12 / \sqrt{12} = \sqrt{12}\).

(o) Bootstrap SE for the Sharpe ratio

# Illustration of the i.i.d. bootstrap procedure
# Simulate 48 monthly returns for demonstration (the raw series is not supplied)
set.seed(42)
r_sim <- rnorm(48, mean = 0.007, sd = 0.055)

B <- 10000
sr_boot <- replicate(B, {
  r_b <- sample(r_sim, size = 48, replace = TRUE)
  mean(r_b) / sd(r_b)
})

cat("Bootstrap SE of monthly Sharpe ratio (iid, B =", B, "):", round(sd(sr_boot), 4), "\n")

Bootstrap SE of monthly Sharpe ratio (iid, B = 10000 ): 0.149

cat("95% Bootstrap CI: [", round(quantile(sr_boot, 0.025), 4), ",",
                           round(quantile(sr_boot, 0.975), 4), "]\n\n")

95% Bootstrap CI: [ -0.2097 , 0.3765 ]

cat("NOTE: This is illustrative. The iid bootstrap is biased for autocorrelated returns.\n")

NOTE: This is illustrative. The iid bootstrap is biased for autocorrelated returns.

cat("Use the stationary (block) bootstrap for valid inference.\n")

Use the stationary (block) bootstrap for valid inference.

Bootstrap procedure (step-by-step):

Step 1: Start with the observed sample of \(n = 48\) monthly returns \(\{r_1, r_2, \ldots, r_{48}\}\).
Step 2: Draw \(B = 10{,}000\) bootstrap samples of size 48 with replacement from this set.
Step 3: For each bootstrap sample \(b\), compute \(SR^*_b = \bar{r}^*_b / s^*_b\).
Step 4: The bootstrap SE is: \(\widehat{SE}_{boot} = \text{sd}(SR^*_1, \ldots, SR^*_B)\).
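A quick analytical cross-check on the bootstrap figure is the commonly used large-sample approximation \(SE(SR) \approx \sqrt{(1 + SR^2/2)/n}\). It relies on i.i.d. normally distributed returns, which is an assumption rather than something the question states, so the result is only a rough benchmark.

```r
r_bar <- 0.0070; s_r <- 0.0550; n_obs <- 48
SR_monthly <- r_bar / s_r

# Large-sample approximation under i.i.d. normal returns (an assumption):
# SE(SR) ~ sqrt((1 + SR^2 / 2) / n)
se_sr_analytic <- sqrt((1 + SR_monthly^2 / 2) / n_obs)
cat("Analytic SE of monthly Sharpe ratio:", round(se_sr_analytic, 4), "\n")
```

The value is about 0.145, the same order of magnitude as the simulated bootstrap estimate, and in either case large relative to the monthly Sharpe ratio of 0.1273.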

Why the ordinary i.i.d. bootstrap is inappropriate: Monthly financial returns can exhibit serial autocorrelation (momentum, mean reversion) and volatility clustering (GARCH effects). The i.i.d. bootstrap resamples observations independently, thereby destroying the time-series dependence structure and producing a biased standard error, typically an understated one when returns are positively autocorrelated.

Fix (block bootstrap): The stationary bootstrap (Politis & Romano, 1994) resamples contiguous blocks whose lengths are drawn from a geometric distribution, preserving local autocorrelation. Alternatively, the moving block bootstrap uses fixed-length blocks, for example \(l \approx \sqrt{n} \approx 7\) months as a rough rule of thumb. Either variant respects the temporal dependence in the return series.
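A moving block bootstrap of the Sharpe ratio might be sketched in base R as follows. The returns are simulated because the question does not supply the raw series, and the block length of 7, the number of replications, and the helper name `sharpe` are illustrative choices rather than part of the original answer.

```r
# Moving block bootstrap sketch (hypothetical simulated returns)
set.seed(42)
n     <- 48
r_sim <- rnorm(n, mean = 0.007, sd = 0.055)

block_len <- 7                         # assumed fixed block length
n_blocks  <- ceiling(n / block_len)    # enough blocks to cover n observations
B         <- 2000                      # bootstrap replications (illustrative)

sharpe <- function(x) mean(x) / sd(x)

sr_mbb <- replicate(B, {
  # Draw random block start points, then paste the contiguous blocks together
  starts <- sample.int(n - block_len + 1, size = n_blocks, replace = TRUE)
  idx    <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))
  sharpe(r_sim[idx[1:n]])              # trim to the original sample length
})

se_mbb <- sd(sr_mbb)
cat("Moving-block bootstrap SE of monthly Sharpe ratio:", round(se_mbb, 4), "\n")
```

Since the simulated series is i.i.d. by construction, the block and i.i.d. bootstraps should give similar answers here; the two would be expected to diverge on genuinely autocorrelated returns.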

(p) Choosing between \(\lambda = 0.030\) and \(\lambda = 0.065\)

lambda_min  <- 0.030; factors_min <- 14
lambda_1se  <- 0.065; factors_1se <- 7

cat("Lambda (min CV error):   ", lambda_min, "| Factors retained:", factors_min, "\n")

Lambda (min CV error):    0.03 | Factors retained: 14

cat("Lambda (1-SE rule):      ", lambda_1se, "| Factors retained:", factors_1se, "\n\n")

Lambda (1-SE rule):       0.065 | Factors retained: 7

cat("Recommended: lambda =", lambda_1se, "(1-SE rule)\n")

Recommended: lambda = 0.065 (1-SE rule)

cat("Reason: parsimony reduces overfitting with 48 months of data and 60 candidate factors.\n")

Reason: parsimony reduces overfitting with 48 months of data and 60 candidate factors.

Recommendation: Deploy \(\lambda = 0.065\) (the 1-SE rule, retaining 7 factors).

Justification:

The 1-SE rule selects the most regularized (sparsest) model whose CV error lies within one standard error of the minimum. This is the standard recommendation in Hastie, Tibshirani & Friedman (ESL).
With \(n = 48\) observations and \(p = 60\) candidate factors, the system is highly underdetermined. The 14-factor solution at \(\lambda = 0.030\) risks overfitting — some retained factors may capture noise rather than genuine return predictors.
In empirical asset pricing, parameter estimation error degrades out-of-sample performance severely in small samples. Simpler models generalize better.
The 7-factor model gives up only a statistically negligible amount of cross-validated accuracy (its CV error is within one standard error of the minimum) but substantially reduces the risk of data mining and is more interpretable.
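The mechanics of the 1-SE rule can be sketched on a made-up cross-validation curve. The error values and standard errors below are purely hypothetical, chosen so that the two selected penalties match the ones in the question; they are not the actual CV results.

```r
# Hypothetical cross-validation summary (illustrative numbers only)
lambda  <- c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090, 0.120)
cv_mean <- c(1.30,  1.18,  1.10,  1.12,  1.15,  1.24,  1.35)   # mean CV error
cv_se   <- c(0.07,  0.06,  0.06,  0.06,  0.06,  0.07,  0.08)   # SE of CV error

i_min      <- which.min(cv_mean)
lambda_min <- lambda[i_min]

# 1-SE rule: largest lambda whose CV error is within one SE of the minimum
threshold  <- cv_mean[i_min] + cv_se[i_min]
lambda_1se <- max(lambda[cv_mean <= threshold])

cat("lambda_min:", lambda_min, "| lambda_1se:", lambda_1se, "\n")
```

In the glmnet package, `cv.glmnet()` reports the corresponding values as `lambda.min` and `lambda.1se`, although its default folds are not time-respecting.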

(q) Walk-forward (time-respecting) cross-validation

cat("Walk-forward cross-validation scheme:\n\n")

Walk-forward cross-validation scheme:

cat("Step 1: Order all T observations chronologically: t = 1, 2, ..., T\n")

Step 1: Order all T observations chronologically: t = 1, 2, ..., T

cat("Step 2: Set initial training window (e.g., first 24 months)\n")

Step 2: Set initial training window (e.g., first 24 months)

cat("Step 3: Train model on t = 1..24; predict and record OOS metric at t = 25\n")

Step 3: Train model on t = 1..24; predict and record OOS metric at t = 25

cat("Step 4: Expand window by one month; retrain on t = 1..25; predict t = 26\n")

Step 4: Expand window by one month; retrain on t = 1..25; predict t = 26

cat("Step 5: Repeat until t = T; average all OOS metrics\n\n")

Step 5: Repeat until t = T; average all OOS metrics

cat("Why k-fold CV is unsafe for financial time series:\n")

Why k-fold CV is unsafe for financial time series:

cat("  - k-fold assigns observations to folds without regard to time order\n")

  - k-fold assigns observations to folds without regard to time order

cat("  - Future data can appear in training set → look-ahead bias\n")

  - Future data can appear in training set → look-ahead bias

cat("  - OOS estimates are optimistically biased and not tradeable in practice\n")

  - OOS estimates are optimistically biased and not tradeable in practice

Walk-forward scheme (expanding window):

Step	Training Set	Test Point
1	\(t = 1 \ldots 24\)	\(t = 25\)
2	\(t = 1 \ldots 25\)	\(t = 26\)
\(\vdots\)	\(\vdots\)	\(\vdots\)
\(T-24\)	\(t = 1 \ldots T-1\)	\(t = T\)

The model is retrained at each step with all available historical data, and the test point always lies strictly in the future relative to the training data.

Why standard \(k\)-fold CV is unsafe:

Standard \(k\)-fold assigns observations to folds without regard to time order (often after random shuffling), so observations from \(t = 40\) may appear in the training set while \(t = 20\) appears in the test set. This constitutes look-ahead bias: the model is trained on information it could not have had at prediction time in real deployment. The resulting OOS metrics tend to be spuriously optimistic and do not reflect true out-of-sample performance.

Walk-forward CV strictly preserves temporal ordering, mimicking live trading conditions: the model only ever uses past data to predict the future, exactly as it would in practice.
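The expanding-window scheme can be sketched as a short loop. The data are simulated, and the linear model, the 24-month initial window, and squared error as the OOS metric are all assumptions made for illustration; in the actual exercise the fitted model would be the regularized factor regression.

```r
# Walk-forward (expanding window) validation sketch on hypothetical data
set.seed(1)
T_obs <- 48
x     <- rnorm(T_obs)
y     <- 0.3 * x + rnorm(T_obs)
dat   <- data.frame(y = y, x = x)

init_window <- 24                              # assumed initial training length
test_points <- (init_window + 1):T_obs
sq_err      <- numeric(length(test_points))

for (j in seq_along(test_points)) {
  t_test <- test_points[j]
  train  <- dat[1:(t_test - 1), ]              # only data strictly before t_test
  fit    <- lm(y ~ x, data = train)
  pred   <- predict(fit, newdata = dat[t_test, , drop = FALSE])
  sq_err[j] <- (dat$y[t_test] - pred)^2
}

oos_mse <- mean(sq_err)
cat("Number of OOS predictions:", length(sq_err), "\n")
cat("Walk-forward OOS MSE:     ", round(oos_mse, 4), "\n")
```

Each forecast uses only observations that precede the test month, so the averaged error is an estimate of what a live deployment would have experienced.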

End of examination.

Final Examination — Machine Learning Applications in Finance

Egshiglen Baatar

2026-06-08