1 Question 1: Single-Factor (Market) Model [25 points]

Model: \[R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\]

Given:

Term	Estimate	Std. Error
Intercept (α)	0.0017	0.0020
Market premium (β)	0.98	0.17

\(R^2 = 0.50\), \(E[R_m - R_f] = 0.70\%\), critical \(|t| \approx 1.98\), \(n = 96\) months.

1.1 (a) t-statistic for β and test H₀: β = 0

\[t_{\hat{\beta}} = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{0.98}{0.17}\]

beta_hat  <- 0.98
se_beta   <- 0.17
t_beta    <- beta_hat / se_beta
t_crit    <- 1.98

cat("t-statistic for beta:", round(t_beta, 4), "\n")

## t-statistic for beta: 5.7647

cat("Critical value:      ", t_crit, "\n")

## Critical value:       1.98

cat("Reject H0: beta = 0?", abs(t_beta) > t_crit, "\n")

## Reject H0: beta = 0? TRUE

Interpretation: \(t = 5.7647\), which exceeds the critical value \(1.98\), so we reject \(H_0: \beta = 0\) at the 5% level. The market premium is a statistically significant driver of the fund’s excess returns.

Economic interpretation of β: A \(\beta = 0.98\) means the fund moves almost one-for-one with the market. For every 1 percentage point increase in the market excess return, the fund’s excess return rises by approximately 0.98 percentage points on average. The fund has nearly the same systematic (market) risk as the overall market portfolio.
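The critical value of 1.98 is an approximation. As a rough check, the exact two-sided critical value and p-value can be computed from the t distribution in base R. The degrees of freedom are an assumption here: \(n - 2 = 94\), since the question does not state them.

```r
# Assumption: residual df = n - 2 (intercept and slope estimated from 96 months)
n_obs    <- 96
df_resid <- n_obs - 2

beta_hat <- 0.98
se_beta  <- 0.17
t_beta   <- beta_hat / se_beta

# Exact two-sided 5% critical value and p-value from the t distribution
t_crit_exact <- qt(0.975, df = df_resid)
p_value      <- 2 * pt(-abs(t_beta), df = df_resid)

cat("Exact critical value:", round(t_crit_exact, 4), "\n")
cat("Two-sided p-value:   ", signif(p_value, 3), "\n")
```

Under this assumption the exact critical value sits very close to 1.98, so the conclusion does not depend on the rounding.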

1.2 (b) Test H₀: β = 1

\[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} = \frac{0.98 - 1}{0.17}\]

t_beta1 <- (beta_hat - 1) / se_beta
cat("t-statistic for H0: beta = 1:", round(t_beta1, 4), "\n")

## t-statistic for H0: beta = 1: -0.1176

cat("Critical value:               ", t_crit, "\n")

## Critical value:                1.98

cat("Reject H0: beta = 1?         ", abs(t_beta1) > t_crit, "\n")

## Reject H0: beta = 1?          FALSE

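Conclusion: \(t = -0.1176\), which does not exceed \(1.98\) in absolute value. We fail to reject \(H_0: \beta = 1\). The fund’s systematic risk is statistically indistinguishable from the market’s, so its market exposure is consistent with that of a passive index-mimicking fund. Failing to reject is not proof that \(\beta = 1\): with a standard error of \(0.17\), the estimate is also consistent with betas well above or below one.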
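Both tests in (a) and (b) can be read off a single confidence interval. The sketch below builds an approximate 95% interval for \(\beta\) using the critical value given in the question; variable names mirror the code above.

```r
beta_hat <- 0.98
se_beta  <- 0.17
t_crit   <- 1.98   # approximate critical value given in the question

# Approximate 95% confidence interval: estimate +/- t_crit * SE
ci_beta <- beta_hat + c(-1, 1) * t_crit * se_beta
names(ci_beta) <- c("lower", "upper")
print(round(ci_beta, 4))

# A hypothesized value is rejected at the 5% level only if it lies outside the interval
contains_one  <- ci_beta[["lower"]] < 1 && 1 < ci_beta[["upper"]]
contains_zero <- ci_beta[["lower"]] < 0 && 0 < ci_beta[["upper"]]
cat("Interval contains 1?", contains_one, "\n")
cat("Interval contains 0?", contains_zero, "\n")
```

The interval excludes 0 and includes 1, which matches the two test decisions.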

1.3 (c) t-statistic for α (Jensen’s Alpha)

\[t_{\hat{\alpha}} = \frac{\hat{\alpha} - 0}{SE(\hat{\alpha})} = \frac{0.0017}{0.0020}\]

alpha_hat <- 0.0017
se_alpha  <- 0.0020
t_alpha   <- alpha_hat / se_alpha

cat("t-statistic for alpha:", round(t_alpha, 4), "\n")

## t-statistic for alpha: 0.85

cat("Critical value:       ", t_crit, "\n")

## Critical value:        1.98

cat("Reject H0: alpha = 0?", abs(t_alpha) > t_crit, "\n")

## Reject H0: alpha = 0? FALSE

Conclusion: \(t = 0.85\), which is below the critical value \(1.98\). We fail to reject \(H_0: \alpha = 0\). The estimated alpha is positive but not statistically significant. The data do not justify the marketing claim of “positive risk-adjusted performance” — the observed alpha could easily be due to sampling variation.

1.4 (d) Interpret R²

R2 <- 0.50
systematic     <- R2
diversifiable  <- 1 - R2

cat("R-squared (systematic fraction):    ", systematic, "\n")

## R-squared (systematic fraction):     0.5

cat("1 - R-squared (diversifiable frac.):", diversifiable, "\n")

## 1 - R-squared (diversifiable frac.): 0.5

Interpretation: \(R^2 = 0.50\) means that 50% of the variance of the fund’s excess returns is explained by market movements (systematic risk), while the remaining 50% is idiosyncratic (diversifiable) risk. Half of the fund’s return variance, not half of its volatility, could in principle be eliminated through diversification, which is relatively high idiosyncratic exposure compared to a well-diversified fund.
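Because \(R^2\) is a share of variance, translating it into volatility terms requires a square root. A small illustration, assuming all idiosyncratic risk could be diversified away:

```r
R2 <- 0.50

# Share of total variance that is idiosyncratic
idio_var_share <- 1 - R2

# Volatility that would remain (as a fraction of total volatility)
# if only the systematic component were left
vol_remaining <- sqrt(R2)
vol_reduction <- 1 - vol_remaining

cat("Idiosyncratic share of variance:", idio_var_share, "\n")
cat("Remaining share of volatility:  ", round(vol_remaining, 4), "\n")
cat("Reduction in volatility:        ", round(vol_reduction, 4), "\n")
```

Removing half of the variance therefore lowers volatility by roughly 29%, not 50%.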

1.5 (e) CAPM-Implied Expected Monthly Excess Return

\[E[R_i - R_f] = \beta \times E[R_m - R_f] = 0.98 \times 0.70\%\]

mkt_premium <- 0.0070   # 0.70% in decimal
capm_exp    <- beta_hat * mkt_premium

cat("CAPM-implied expected monthly excess return:", round(capm_exp, 4), "\n")

## CAPM-implied expected monthly excess return: 0.0069

cat("As a percentage:                            ", round(capm_exp * 100, 4), "%\n")

## As a percentage:                             0.686 %

Result: The CAPM-implied expected monthly excess return for the fund is \(0.686\%\).

2 Question 2: Fama–French Three-Factor Model [25 points]

Model: \[R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\]

Given (\(n = 144\) months):

Term	Estimate	Std. Error
Intercept (α)	0.0029	0.0018
MKT (b)	0.97	0.08
SMB (s)	0.75	0.11
HML (h)	-0.13	0.13

\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t| \approx 1.98\).

2.1 (f) t-statistics for all four coefficients

\[t_j = \frac{\hat{\theta}_j}{SE(\hat{\theta}_j)}\]

estimates <- c(alpha = 0.0029, b_MKT = 0.97, s_SMB = 0.75, h_HML = -0.13)
std_errs  <- c(alpha = 0.0018, b_MKT = 0.08, s_SMB = 0.11, h_HML = 0.13)

t_stats   <- estimates / std_errs
significant <- abs(t_stats) > t_crit

results_q2 <- data.frame(
  Estimate   = estimates,
  Std_Error  = std_errs,
  t_stat     = round(t_stats, 4),
  Significant = significant
)

print(results_q2)

##       Estimate Std_Error  t_stat Significant
## alpha   0.0029    0.0018  1.6111       FALSE
## b_MKT   0.9700    0.0800 12.1250        TRUE
## s_SMB   0.7500    0.1100  6.8182        TRUE
## h_HML  -0.1300    0.1300 -1.0000       FALSE

Summary:

α: \(t = 1.6111\), not significant at the 5% level.
MKT (b): \(t = 12.1250\), significant at the 5% level.
SMB (s): \(t = 6.8182\), significant at the 5% level.
HML (h): \(t = -1.0000\), not significant at the 5% level.

2.2 (g) Investment Style Classification

cat("SMB loading (s):", estimates["s_SMB"], "— positive and large\n")

## SMB loading (s): 0.75 — positive and large

cat("HML loading (h):", estimates["h_HML"], "— negative but insignificant\n")

## HML loading (h): -0.13 — negative but insignificant

Size tilt: The SMB loading is \(s = 0.75\), which is positive and highly significant. The fund tilts toward small-cap stocks.

Value/Growth tilt: The HML loading is \(h = -0.13\), which is negative (growth tilt), but it is not statistically significant (\(|t| = 1\)). We cannot confidently classify the fund as growth-oriented; it is closer to neutral on the value/growth dimension.

Overall style: The fund is best characterized as a small-cap fund with no statistically significant value or growth tilt (a small-cap blend style).

2.3 (h) Intercept Interpretation and Manager Value Added

cat("Alpha estimate:", estimates["alpha"], "\n")

## Alpha estimate: 0.0029

cat("t-statistic:   ", round(t_stats["alpha"], 4), "\n")

## t-statistic:    1.6111

cat("Significant?   ", abs(t_stats["alpha"]) > t_crit, "\n")

## Significant?    FALSE

Interpretation: The intercept \(\hat{\alpha} = 0.0029\) (i.e., \(+0.29\%\) per month) represents the fund’s average excess return not explained by exposure to the three Fama–French factors. It is the three-factor analogue of Jensen’s alpha: the average return left over after controlling for market, size, and value exposures.

However, with \(t = 1.6111\) below the critical value \(1.98\), the alpha is not statistically significant. We cannot conclude that the manager adds value beyond systematic factor exposures; the positive alpha may simply reflect sampling noise.

2.4 (i) R² Rise from 0.75 to 0.92 and Role of Adjusted R²

R2_capm <- 0.75
R2_ff3  <- 0.92
adj_R2  <- 0.918

cat("CAPM R-squared:       ", R2_capm, "\n")

## CAPM R-squared:        0.75

cat("FF3 R-squared:        ", R2_ff3,  "\n")

## FF3 R-squared:         0.92

cat("FF3 Adjusted R-squared:", adj_R2, "\n")

## FF3 Adjusted R-squared: 0.918

cat("Improvement in R2:    ", R2_ff3 - R2_capm, "\n")

## Improvement in R2:     0.17

What the rise indicates: Going from \(R^2 = 0.75\) (CAPM) to \(R^2 = 0.92\) (FF3) shows that adding SMB and HML explains an additional 17 percentage points of return variation. The three-factor model captures important dimensions of systematic risk, particularly the fund’s small-cap tilt, that the single market factor misses.

Why Adjusted R² is appropriate: Ordinary \(R^2\) never decreases when a predictor is added, even if the predictor has no true explanatory power. Adjusted \(R^2\) penalizes for the number of predictors and rises only when the added variables reduce the residual variance by more than the lost degrees of freedom cost. Since the CAPM uses 1 factor and the FF3 uses 3, models of different size should be compared on adjusted \(R^2\). Here the adjusted \(R^2 = 0.918\) is barely below \(R^2 = 0.92\), so the penalty for the two extra factors is negligible relative to the 17-point gain, which indicates the improvement is genuine and not an artifact of adding variables.
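The adjustment can be checked with the standard formula \(\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k - 1}\), where \(k\) is the number of slope coefficients. The helper `adj_r2` below is a hypothetical name, and the CAPM figure assumes the single-factor regression was fitted on the same 144 months, which the question does not state.

```r
adj_r2 <- function(r2, n, k) {
  # n = number of observations, k = number of slope coefficients (intercept excluded)
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

n_obs <- 144

adj_ff3  <- adj_r2(0.92, n_obs, k = 3)
adj_capm <- adj_r2(0.75, n_obs, k = 1)   # assumption: CAPM fitted on the same 144 months

cat("FF3 adjusted R-squared: ", round(adj_ff3, 4), "\n")
cat("CAPM adjusted R-squared:", round(adj_capm, 4), "\n")
cat("Gain in adjusted R2:    ", round(adj_ff3 - adj_capm, 4), "\n")
```

The FF3 value reproduces the reported 0.918, and under the stated assumption the gap between the two models is essentially unchanged after the adjustment.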

3 Question 3: Logistic Regression for Market Direction [25 points]

Model: \[\text{logit}\, P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]

\(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\)

Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta VIX = 1.5\)

3.1 (j) Predicted Probability and Class

\[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX = -0.02 + 5.4(0.010) + (-0.38)(1.5)\]

\[P(\text{Up}) = \frac{e^{\text{logit}}}{1 + e^{\text{logit}}} = \frac{1}{1 + e^{-\text{logit}}}\]

beta0 <- -0.02
beta1 <-  5.40
beta2 <- -0.38

r_lag   <- 0.010
dvix    <- 1.5

log_odds <- beta0 + beta1 * r_lag + beta2 * dvix
prob_up  <- 1 / (1 + exp(-log_odds))
pred_class <- ifelse(prob_up >= 0.5, "Up", "Down")

cat("Log-odds (logit):        ", round(log_odds, 4), "\n")

## Log-odds (logit):         -0.536

cat("Predicted P(Up):         ", round(prob_up, 4), "\n")

## Predicted P(Up):          0.3691

cat("Predicted class (≥0.5): ", pred_class, "\n")

## Predicted class (≥0.5):  Down

Result: The log-odds equal \(-0.536\), giving \(P(\text{Up}) = 0.3691\). At the 0.5 threshold, the predicted class is “Down”.

3.2 (k) Economic Interpretation of β₁ and β₂

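β₁ = 5.4 (positive): A higher lagged return increases the log-odds of an “Up” day. This reflects short-run momentum: when yesterday’s market return was positive, the model predicts a higher probability of an up day today. Because \(r_{t-1}\) is measured in decimals, the effect of a typical daily move is modest: a return of 0.010 adds only \(5.4 \times 0.010 = 0.054\) to the log-odds. Positive short-horizon serial correlation in index returns has been reported in the literature, although it tends to be weak and unstable over time.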

β₂ = −0.38 (negative): A larger increase in VIX (the “fear index”) decreases the log-odds of an “Up” day. This captures risk-aversion / volatility effects — rising uncertainty or fear (as measured by VIX) is associated with lower probability of a positive return the following day, consistent with flight-to-safety behavior.
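Logit coefficients are easier to read as odds ratios. The sketch below assumes \(r_{t-1}\) is in decimal units (so 0.01 is one percentage point, as in today’s inputs) and that \(\Delta VIX\) is in index points.

```r
beta1 <-  5.40
beta2 <- -0.38

# Odds ratio for a 1 percentage point (0.01) higher lagged return
or_return <- exp(beta1 * 0.01)

# Odds ratio for a one-point larger increase in VIX
or_vix <- exp(beta2 * 1)

cat("Odds ratio, +0.01 lagged return:", round(or_return, 4), "\n")
cat("Odds ratio, +1 point VIX change:", round(or_vix, 4), "\n")
```

On these units, a one-point VIX increase cuts the odds of an “Up” day by roughly a third, while a one-point higher lagged return raises them by only a few percent, which is why the VIX term dominates today’s prediction.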

3.3 (l) Confusion Matrix Metrics

Confusion Matrix:

	Actual Up	Actual Down	Total
Predicted Up	67	44	111
Predicted Down	33	56	89
Total	100	100	200

TP <- 67   # True Positives  (Predicted Up, Actual Up)
FP <- 44   # False Positives (Predicted Up, Actual Down)
FN <- 33   # False Negatives (Predicted Down, Actual Up)
TN <- 56   # True Negatives  (Predicted Down, Actual Down)
N  <- TP + FP + FN + TN

accuracy    <- (TP + TN) / N
sensitivity <- TP / (TP + FN)        # True Positive Rate (Recall for "Up")
specificity <- TN / (TN + FP)        # True Negative Rate
precision   <- TP / (TP + FP)        # Positive Predictive Value for "Up"

cat("Accuracy:    ", round(accuracy,    4), "\n")

## Accuracy:     0.615

cat("Sensitivity: ", round(sensitivity, 4), "\n")

## Sensitivity:  0.67

cat("Specificity: ", round(specificity, 4), "\n")

## Specificity:  0.56

cat("Precision:   ", round(precision,   4), "\n")

## Precision:    0.6036

Formulas:

\[\text{Accuracy} = \frac{TP + TN}{N} = \frac{67 + 56}{200} = 0.615\]

\[\text{Sensitivity (TPR)} = \frac{TP}{TP + FN} = \frac{67}{100} = 0.67\]

\[\text{Specificity} = \frac{TN}{TN + FP} = \frac{56}{100} = 0.56\]

\[\text{Precision} = \frac{TP}{TP + FP} = \frac{67}{111} = 0.6036\]

3.4 (m) Naive Classifier and Adequacy of Accuracy

# Naive rule: always predict the majority class
# Both classes have 100 observations, so it's a 50/50 split — any majority class gives 50%
n_up   <- 100
n_down <- 100
majority_class <- "Up or Down (tied)"
naive_accuracy <- max(n_up, n_down) / N

cat("Naive classifier accuracy:", round(naive_accuracy, 4), "\n")

## Naive classifier accuracy: 0.5

cat("Model accuracy:           ", round(accuracy, 4), "\n")

## Model accuracy:            0.615

cat("Model beats naive?        ", accuracy > naive_accuracy, "\n")

## Model beats naive?         TRUE

Naive accuracy: Since both classes are balanced (\(100\) Up and \(100\) Down), the naive rule achieves \(0.5\) (50%). The model’s accuracy of 0.615 beats the naive classifier.

Why accuracy alone is inadequate for trading:

Cost asymmetry: Missing a true “Up” day (false negative) has a different economic cost than incorrectly predicting “Up” on a “Down” day (false positive). Accuracy treats both errors equally.
Class imbalance: In real markets, “Up” and “Down” days may not be 50/50 — a naive model can achieve high accuracy by always predicting the majority class, even with zero predictive power.
A more economically relevant criterion: Precision (or the F1-score) is more appropriate for a trading system. High precision means that when the model predicts “Up,” it is usually correct — directly tied to trading profitability. Alternatively, the Sharpe ratio of the resulting trading strategy is the most economically meaningful metric, as it accounts for both the returns generated and the risk taken.
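The F1-score mentioned above, along with balanced accuracy, follows directly from the confusion matrix in part (l). A minimal calculation using the same counts:

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56

precision   <- TP / (TP + FP)
recall      <- TP / (TP + FN)   # same as sensitivity
specificity <- TN / (TN + FP)

# F1 is the harmonic mean of precision and recall
f1 <- 2 * precision * recall / (precision + recall)

# Balanced accuracy averages the two class-wise hit rates
balanced_accuracy <- (recall + specificity) / 2

cat("F1-score:         ", round(f1, 4), "\n")
cat("Balanced accuracy:", round(balanced_accuracy, 4), "\n")
```

With balanced classes, balanced accuracy coincides with plain accuracy; the two diverge only when one class dominates.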

4 Question 4: Resampling and Regularization in a Backtest [25 points]

Given: Sample mean monthly return \(\bar{r} = 0.70\%\), sample standard deviation \(\hat{\sigma} = 5.50\%\), \(T = 48\) months.

4.1 (n) Monthly and Annualized Sharpe Ratio

\[SR_{\text{monthly}} = \frac{\bar{r}}{\hat{\sigma}} = \frac{0.0070}{0.0550}\]

\[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]

r_bar  <- 0.0070   # 0.70% as decimal
sigma  <- 0.0550   # 5.50% as decimal
T_obs  <- 48

SR_monthly  <- r_bar / sigma
scale_factor <- sqrt(12)
SR_annual   <- SR_monthly * scale_factor

cat("Monthly Sharpe Ratio:    ", round(SR_monthly, 4), "\n")

## Monthly Sharpe Ratio:     0.1273

cat("Scaling factor:          ", round(scale_factor, 4), "(= sqrt(12))\n")

## Scaling factor:           3.4641 (= sqrt(12))

cat("Annualized Sharpe Ratio: ", round(SR_annual,  4), "\n")

## Annualized Sharpe Ratio:  0.4409

Formulas:

\[SR_{\text{monthly}} = \frac{0.0070}{0.0550} = 0.1273\]

\[SR_{\text{annual}} = 0.1273 \times \sqrt{12} = 0.4409\]

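Scaling factor: \(\sqrt{12}\) is used because, assuming monthly returns are serially uncorrelated with constant mean and variance, the annual mean is 12 times the monthly mean and the annual standard deviation is \(\sqrt{12}\) times the monthly standard deviation; their ratio yields \(SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\). The calculation also treats \(\bar{r}\) as an excess return, since the question reports only the mean return.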
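The scaling argument can be verified numerically by annualizing the mean and standard deviation separately and comparing the result with the shortcut. This relies on the assumption of serially uncorrelated monthly returns.

```r
r_bar <- 0.0070
sigma <- 0.0550

# Under serial independence, means and variances both add across 12 months
mean_annual <- 12 * r_bar
sd_annual   <- sqrt(12) * sigma

SR_annual_direct <- mean_annual / sd_annual
SR_annual_scaled <- (r_bar / sigma) * sqrt(12)

cat("Annualized mean:         ", round(mean_annual, 4), "\n")
cat("Annualized std. dev.:    ", round(sd_annual, 4), "\n")
cat("Sharpe (direct):         ", round(SR_annual_direct, 4), "\n")
cat("Sharpe (monthly x sqrt12):", round(SR_annual_scaled, 4), "\n")
```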

4.2 (o) Bootstrap for Standard Error of the Sharpe Ratio

Step-by-step bootstrap procedure:

1. Collect the observed sample of \(T = 48\) monthly returns: \(\{r_1, r_2, \ldots, r_{48}\}\).
2. Resample with replacement: Draw \(T = 48\) observations at random (with replacement) from the sample to form a bootstrap sample \(\{r_1^*, r_2^*, \ldots, r_{48}^*\}\).
3. Compute the Sharpe ratio for the bootstrap sample: \(SR^* = \bar{r}^* / \hat{\sigma}^*\).
4. Repeat steps 2–3 a large number of times \(B\) (e.g., \(B = 1{,}000\) or \(10{,}000\)).
5. Estimate the standard error as the standard deviation of the \(B\) bootstrap Sharpe ratios: \(\widehat{SE}(SR) = \text{sd}(SR_1^*, SR_2^*, \ldots, SR_B^*)\).
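The five steps above can be sketched in a few lines of base R. The actual 48 returns are not given, so the series below is simulated from a normal distribution with the reported mean and standard deviation; it is a stand-in for illustration only, and `sharpe` is a hypothetical helper name.

```r
set.seed(123)

# Step 1 (hypothetical data): simulated returns stand in for the 48 observed months
T_obs   <- 48
returns <- rnorm(T_obs, mean = 0.0070, sd = 0.0550)

sharpe <- function(x) mean(x) / sd(x)

# Steps 2-4: resample with replacement and recompute the Sharpe ratio B times
B <- 2000
boot_sr <- replicate(B, {
  resample <- sample(returns, size = T_obs, replace = TRUE)
  sharpe(resample)
})

# Step 5: the bootstrap standard error is the sd of the resampled Sharpe ratios
se_boot <- sd(boot_sr)

cat("Sample Sharpe ratio:  ", round(sharpe(returns), 4), "\n")
cat("Bootstrap SE (i.i.d.):", round(se_boot, 4), "\n")
```

With only 48 months the standard error is of the same order as the monthly Sharpe ratio itself, so the point estimate is very imprecise.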

Why ordinary i.i.d. bootstrap is inappropriate:

Monthly financial returns exhibit serial dependence — autocorrelation in returns, volatility clustering (GARCH effects), and momentum/mean-reversion patterns. The standard i.i.d. bootstrap assumes observations are independent and identically distributed, which is violated. Re-sampling individual observations independently destroys the temporal structure of the data, leading to incorrect standard error estimates.

Which variant fixes it:

The block bootstrap (e.g., the moving block bootstrap or stationary bootstrap) preserves the serial dependence structure by sampling contiguous blocks of consecutive observations rather than individual returns. This maintains the autocorrelation structure within each block, yielding valid inference under weak serial dependence.
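A moving block bootstrap can be sketched as follows. Everything specific here is an assumption made for illustration: the returns are simulated from an AR(1) process with `arima.sim`, the block length of 4 months is arbitrary, and `mbb_sample` is a hypothetical helper.

```r
set.seed(123)

# Hypothetical data: simulated AR(1) returns stand in for the observed series
T_obs   <- 48
returns <- as.numeric(arima.sim(model = list(ar = 0.2), n = T_obs, sd = 0.055)) + 0.007

sharpe <- function(x) mean(x) / sd(x)

block_len <- 4                               # assumed block length (a tuning choice)
n_blocks  <- ceiling(T_obs / block_len)
starts_ok <- 1:(T_obs - block_len + 1)       # valid starting positions for a full block

mbb_sample <- function(x) {
  # Draw block start positions with replacement, then glue the blocks together
  starts <- sample(starts_ok, size = n_blocks, replace = TRUE)
  idx    <- as.vector(sapply(starts, function(s) s:(s + block_len - 1)))
  x[idx[1:length(x)]]                        # trim to the original sample length
}

B <- 2000
boot_sr  <- replicate(B, sharpe(mbb_sample(returns)))
se_block <- sd(boot_sr)

cat("Block bootstrap SE of Sharpe ratio:", round(se_block, 4), "\n")
```

Within each block the original ordering is kept, so short-range dependence survives the resampling; the block length trades off preserving dependence against having enough distinct blocks.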

4.3 (p) LASSO λ Selection: Minimum-CV vs. One-Standard-Error Rule

cat("lambda_min (min CV error): 0.030 -> retains 14 factors\n")

## lambda_min (min CV error): 0.030 -> retains 14 factors

cat("lambda_1se (1-SE rule):    0.065 -> retains  7 factors\n")

## lambda_1se (1-SE rule):    0.065 -> retains  7 factors

Recommended choice: \(\lambda_{\text{1SE}} = 0.065\) (7 factors)

Reasoning:

Overfitting risk in finance: With 60 candidate factors and only a finite backtest, the minimum-CV solution (\(\lambda = 0.030\), 14 factors) is likely to overfit in-sample noise. Financial data has low signal-to-noise ratios, so parsimony is especially valuable.
The one-standard-error rule selects the most regularized (simplest) model whose cross-validation error is within one standard error of the minimum. The 7-factor model is statistically indistinguishable in predictive performance from the 14-factor model, but is much simpler.
Practical interpretability and robustness: Fewer factors reduce transaction costs, data requirements, and model instability. A parsimonious model is more likely to generalize to unseen data (out-of-sample) in a non-stationary financial environment.
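The one-standard-error rule itself is a simple selection over the cross-validation curve. The table below is entirely hypothetical: the error and standard-error values are invented, and only the two \(\lambda\) values of interest are taken from the question. In practice, `cv.glmnet` from the glmnet package reports the same two quantities as `lambda.min` and `lambda.1se`.

```r
# Hypothetical cross-validation summary (illustrative numbers only)
cv <- data.frame(
  lambda   = c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090),
  cv_error = c(1.060, 1.020, 1.000, 1.010, 1.030, 1.090),
  cv_se    = c(0.040, 0.040, 0.040, 0.040, 0.040, 0.040)
)

# Minimum-CV choice: the lambda with the lowest mean CV error
i_min      <- which.min(cv$cv_error)
lambda_min <- cv$lambda[i_min]

# One-SE rule: the largest lambda whose CV error is within one SE of the minimum
threshold  <- cv$cv_error[i_min] + cv$cv_se[i_min]
lambda_1se <- max(cv$lambda[cv$cv_error <= threshold])

cat("lambda_min:", lambda_min, "\n")
cat("lambda_1se:", lambda_1se, "\n")
```

Larger \(\lambda\) means stronger shrinkage, so taking the maximum among the admissible values picks the sparsest model that is not detectably worse.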

4.4 (q) Walk-Forward (Time-Respecting) Cross-Validation

Walk-forward scheme:

1. Fix an initial training window (e.g., the first 36 months of data).
2. Train the model on the training window; record the estimated parameters.
3. Predict on the next 1 month (or a fixed hold-out period) immediately following the training window; this is the out-of-sample evaluation period.
4. Roll forward: Expand (or slide) the training window by one period, and repeat steps 2–3.
5. Aggregate all out-of-sample predictions to compute performance metrics (Sharpe ratio, accuracy, etc.).

Why standard random k-fold cross-validation is unsafe:

Random k-fold cross-validation splits data into folds randomly, meaning future observations can appear in the training set while past observations appear in the test set. This introduces look-ahead bias (information leakage from the future into the training data):

A model trained on data that includes period \(t+1\) and then evaluated on period \(t\) appears to perform well, but it had access to future information it would not have in real trading.
Financial returns exhibit non-stationarity — distributional shifts over time mean that randomly mixed training/test data does not represent the realistic deployment scenario.
The walk-forward scheme respects the arrow of time: the model is always trained on past data and evaluated on future data, exactly mimicking live trading conditions and providing an honest estimate of out-of-sample performance.
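The index bookkeeping behind a walk-forward scheme can be sketched as below. The function `walk_forward_splits` is a hypothetical helper, and the 36-month initial window and 1-month horizon follow the example in the scheme above; model fitting is left out on purpose.

```r
walk_forward_splits <- function(n_obs, initial_window, horizon = 1, expanding = TRUE) {
  splits    <- list()
  train_end <- initial_window
  while (train_end + horizon <= n_obs) {
    # Expanding window starts at 1; sliding window keeps a fixed length
    train_start <- if (expanding) 1 else train_end - initial_window + 1
    splits[[length(splits) + 1]] <- list(
      train = train_start:train_end,
      test  = (train_end + 1):(train_end + horizon)
    )
    train_end <- train_end + horizon
  }
  splits
}

splits <- walk_forward_splits(n_obs = 48, initial_window = 36)

cat("Number of out-of-sample periods:", length(splits), "\n")
cat("First split: train 1 to", max(splits[[1]]$train), ", test", splits[[1]]$test, "\n")

# Every training index precedes every test index, so no future data leaks in
no_leak <- all(sapply(splits, function(s) max(s$train) < min(s$test)))
cat("Training always precedes testing?", no_leak, "\n")
```

Setting `expanding = FALSE` gives the sliding-window variant, which can be preferable when older data are thought to be less representative.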

Machine Learning Applications in Finance – Final Examination

Your Name

2026-06-09