Question 1. Single-Factor (Market) Model

Given:

Monthly excess return regression over \(n = 96\) months
\(R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\)
\(\hat{\alpha} = 0.0017\), SE\((\hat{\alpha}) = 0.0020\)
\(\hat{\beta} = 0.98\), SE\((\hat{\beta}) = 0.17\)
\(R^2 = 0.50\)
\(E[R_m - R_f] = 0.70\%\)
Critical \(|t| \approx 1.98\) at 5% significance level

(a) t-statistic for \(\hat{\beta}\) and test \(H_0: \beta = 0\)

Formula: \[t = \frac{\hat{\beta} - \beta_0}{\text{SE}(\hat{\beta})}\]

beta_hat   <- 0.98
se_beta    <- 0.17
beta_null  <- 0       # H0: beta = 0
t_crit     <- 1.98

# t-statistic
t_beta <- (beta_hat - beta_null) / se_beta
t_beta <- round(t_beta, 4)
cat("t-statistic for H0: beta = 0:", t_beta, "\n")

## t-statistic for H0: beta = 0: 5.7647

cat("Critical |t|:", t_crit, "\n")

## Critical |t|: 1.98

cat("Reject H0?", abs(t_beta) > t_crit, "\n")

## Reject H0? TRUE

Result: \(t = \frac{0.98 - 0}{0.17} = 5.7647\)

Since \(|t| = 5.7647 > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level.

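Economic interpretation: Rejecting \(H_0\) means the fund has statistically significant exposure to the market. A \(\hat{\beta} = 0.98\) means that a 1 percentage point increase in the market excess return is associated, on average, with an increase of about 0.98 percentage points in the fund’s excess return, indicating near-market-level systematic risk. Beta measures sensitivity, not tightness of fit: with \(R^2 = 0.50\), the fund does not track the market closely.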

(b) Test \(H_0: \beta = 1\)

beta_null1 <- 1   # H0: beta = 1

t_beta1 <- (beta_hat - beta_null1) / se_beta
t_beta1 <- round(t_beta1, 4)
cat("t-statistic for H0: beta = 1:", t_beta1, "\n")

## t-statistic for H0: beta = 1: -0.1176

cat("Critical |t|:", t_crit, "\n")

## Critical |t|: 1.98

cat("Reject H0?", abs(t_beta1) > t_crit, "\n")

## Reject H0? FALSE

Result: \(t = \frac{0.98 - 1}{0.17} = -0.1176\)

Since \(|t| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% significance level.

Interpretation: The fund’s systematic risk is statistically indistinguishable from that of the market (\(\beta = 1\)). There is no evidence that the fund is levered up or de-risked relative to the market portfolio. Failing to reject is not proof that \(\beta = 1\), however: with SE\((\hat{\beta}) = 0.17\) the estimate is imprecise, so betas moderately above or below 1 cannot be ruled out.
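Parts (a) and (b) can also be read off a single confidence interval. The following is a minimal sketch that reuses the estimates above and the same approximate critical value of 1.98; the interval itself is an addition, not something the question asks for.

```r
beta_hat <- 0.98
se_beta  <- 0.17
t_crit   <- 1.98   # approximate two-sided 5% critical value used above

# 95% confidence interval: estimate +/- critical value * standard error
ci_beta <- beta_hat + c(-1, 1) * t_crit * se_beta
cat("95% CI for beta: [", round(ci_beta[1], 4), ",", round(ci_beta[2], 4), "]\n")

# A null value is rejected at the 5% level exactly when it lies outside the interval
cat("0 inside CI?", ci_beta[1] <= 0 & 0 <= ci_beta[2], "\n")
cat("1 inside CI?", ci_beta[1] <= 1 & 1 <= ci_beta[2], "\n")
```

The interval runs from roughly 0.64 to 1.32. It excludes 0 and contains 1, which matches the two test decisions.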

(c) t-statistic for \(\hat{\alpha}\) (Jensen’s Alpha)

alpha_hat <- 0.0017
se_alpha  <- 0.0020
alpha_null <- 0    # H0: alpha = 0

t_alpha <- (alpha_hat - alpha_null) / se_alpha
t_alpha <- round(t_alpha, 4)
cat("t-statistic for Jensen's alpha:", t_alpha, "\n")

## t-statistic for Jensen's alpha: 0.85

cat("Critical |t|:", t_crit, "\n")

## Critical |t|: 1.98

cat("Reject H0?", abs(t_alpha) > t_crit, "\n")

## Reject H0? FALSE

Result: \(t = \frac{0.0017 - 0}{0.0020} = 0.85\)

Since \(|t| = 0.85 < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% significance level.

Conclusion: The marketing team’s claim of “positive risk-adjusted performance” is not statistically justified. While the point estimate \(\hat{\alpha} = 0.0017\) is positive, it is statistically indistinguishable from zero at the 5% level. The observed alpha could easily be due to sampling variability rather than genuine manager skill.
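For context, the two-sided p-value behind this decision can be sketched as follows. The degrees of freedom, \(n - 2 = 94\), are an assumption based on a regression with an intercept and one slope.

```r
t_alpha <- 0.0017 / 0.0020   # t-statistic from part (c)
df_capm <- 96 - 2            # assumed: n minus two estimated coefficients

# Two-sided p-value from the Student t distribution
p_alpha <- 2 * pt(-abs(t_alpha), df = df_capm)
cat("Two-sided p-value for alpha:", round(p_alpha, 4), "\n")
cat("Below 5%?", p_alpha < 0.05, "\n")
```

A p-value well above 0.05 says the same thing as the comparison with the critical value, but it also shows how far the estimate is from conventional significance.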

(d) Interpretation of \(R^2\)

R2 <- 0.50
cat("R-squared:", R2, "\n")

## R-squared: 0.5

cat("Systematic variation (%):", R2 * 100, "\n")

## Systematic variation (%): 50

cat("Idiosyncratic / diversifiable variation (%):", (1 - R2) * 100, "\n")

## Idiosyncratic / diversifiable variation (%): 50

Interpretation: \(R^2 = 0.50\) means that 50% of the variance of the fund’s excess returns is explained by market movements (systematic risk), and the remaining 50% is idiosyncratic and, in principle, diversifiable. This relatively low \(R^2\) suggests the fund holds significant unsystematic exposures, perhaps concentrated positions or sector bets, that are not captured by the single market factor.
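The link between \(R^2\) and the two risk components can be written out explicitly. In a single-factor regression with an intercept, the sample variance of the fund’s excess return splits into a market part and a residual part:

\[\text{Var}(R_i - R_f) = \hat{\beta}^2 \,\text{Var}(R_m - R_f) + \text{Var}(\hat{\varepsilon}), \qquad R^2 = \frac{\hat{\beta}^2 \,\text{Var}(R_m - R_f)}{\text{Var}(R_i - R_f)}\]

With one regressor, \(R^2\) is the squared sample correlation, so \(R^2 = 0.50\) implies a correlation between fund and market excess returns of \(\sqrt{0.50} \approx 0.71\) (positive, since \(\hat{\beta} > 0\)).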

(e) CAPM-implied Expected Monthly Excess Return

Formula (the CAPM imposes \(\alpha = 0\), so only the market term remains): \[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f]\]

E_market_premium <- 0.0070  # 0.70% = 0.0070

E_excess_return <- beta_hat * E_market_premium
E_excess_return_pct <- round(E_excess_return * 100, 4)
cat("CAPM-implied expected monthly excess return:", E_excess_return_pct, "%\n")

## CAPM-implied expected monthly excess return: 0.686 %

Result: \(E[R_i - R_f] = 0.98 \times 0.70\% = 0.686\%\)

Question 2. Fama–French Three-Factor Model

Given:

\(n = 144\) monthly observations
\(R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\)

Term	Estimate	Std. Error
\(\hat{\alpha}\)	0.0029	0.0018
\(\hat{b}\) (MKT)	0.97	0.08
\(\hat{s}\) (SMB)	0.75	0.11
\(\hat{h}\) (HML)	−0.13	0.13

\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\)
Critical \(|t| \approx 1.98\)

(f) t-statistics for all four coefficients

Formula: \(t_k = \frac{\hat{\theta}_k}{\text{SE}(\hat{\theta}_k)}\)

# Estimates
coefs   <- c(alpha = 0.0029, MKT = 0.97, SMB = 0.75, HML = -0.13)
ses     <- c(alpha = 0.0018, MKT = 0.08, SMB = 0.11, HML =  0.13)

t_stats <- round(coefs / ses, 4)
significant <- abs(t_stats) > t_crit

results_q2f <- data.frame(
  Term       = names(coefs),
  Estimate   = coefs,
  Std_Error  = ses,
  t_stat     = t_stats,
  Significant = significant
)
rownames(results_q2f) <- NULL
print(results_q2f)

##    Term Estimate Std_Error  t_stat Significant
## 1 alpha   0.0029    0.0018  1.6111       FALSE
## 2   MKT   0.9700    0.0800 12.1250        TRUE
## 3   SMB   0.7500    0.1100  6.8182        TRUE
## 4   HML  -0.1300    0.1300 -1.0000       FALSE

Summary:

\(t_\alpha = 1.6111\): Not significant (\(|t| < 1.98\))
\(t_b = 12.1250\): Significant (\(|t| > 1.98\))
\(t_s = 6.8182\): Significant (\(|t| > 1.98\))
\(t_h = -1.0000\): Not significant (\(|t| < 1.98\))

(g) Investment Style Classification

s_hat <- 0.75   # SMB loading
h_hat <- -0.13  # HML loading

cat("SMB loading (s):", s_hat, "\n")

## SMB loading (s): 0.75

cat("HML loading (h):", h_hat, "\n")

## HML loading (h): -0.13

cat("\nSize tilt: Positive and large SMB => SMALL-CAP tilt\n")

## 
## Size tilt: Positive and large SMB => SMALL-CAP tilt

cat("Value/Growth tilt: Negative HML => GROWTH tilt\n")

## Value/Growth tilt: Negative HML => GROWTH tilt

Interpretation:

Size tilt: \(\hat{s} = 0.75 > 0\) (large and significant) → the fund has a small-cap tilt, systematically overweighting smaller companies relative to the market.
Value/Growth tilt: \(\hat{h} = -0.13 < 0\) (negative but not statistically significant) → weak evidence of a growth tilt; the fund appears to lean slightly toward growth stocks (low book-to-market), though this loading cannot be distinguished from zero at the 5% level.

Overall style: The fund is best characterized as a small-cap growth fund, though the growth dimension is not statistically confirmed.

(h) Intercept Interpretation and Manager Value-Add

alpha_ff <- 0.0029
se_alpha_ff <- 0.0018
t_alpha_ff <- round(alpha_ff / se_alpha_ff, 4)

cat("FF3 Alpha:", alpha_ff, "\n")

## FF3 Alpha: 0.0029

cat("t-statistic:", t_alpha_ff, "\n")

## t-statistic: 1.6111

cat("Significant at 5%?", abs(t_alpha_ff) > t_crit, "\n")

## Significant at 5%? FALSE

Result: \(t_\alpha = \frac{0.0029}{0.0018} = 1.6111\)

Since \(|t| = 1.6111 < 1.98\), the FF3 alpha is not statistically significant at the 5% level.

Interpretation: After controlling for market, size, and value exposures, the fund earns a monthly alpha of 0.29%, but this is indistinguishable from zero statistically. We cannot conclude that the manager adds value beyond the three Fama–French factor exposures. The positive point estimate may reflect sampling variation rather than genuine stock-selection or timing skill.
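To put an economic magnitude on that uncertainty, the following sketch builds a 95% confidence interval for the monthly alpha, expressed in percent. It reuses the estimates above and the approximate critical value of 1.98; the interval is an addition to the original answer.

```r
alpha_ff    <- 0.0029
se_alpha_ff <- 0.0018
t_crit      <- 1.98   # approximate two-sided 5% critical value used above

# 95% confidence interval for the monthly alpha, converted to percent
ci_alpha_pct <- (alpha_ff + c(-1, 1) * t_crit * se_alpha_ff) * 100
cat("95% CI for monthly alpha (%): [",
    round(ci_alpha_pct[1], 4), ",", round(ci_alpha_pct[2], 4), "]\n")
cat("Zero inside CI?", ci_alpha_pct[1] <= 0 & 0 <= ci_alpha_pct[2], "\n")
```

The interval spans roughly -0.07% to 0.65% per month, so the data are compatible both with no value-add and with an economically large alpha.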

(i) Rise in \(R^2\) from 0.75 to 0.92, and Adjusted \(R^2\)

R2_capm <- 0.75
R2_ff3  <- 0.92
adj_R2_ff3 <- 0.918

cat("CAPM R2:         ", R2_capm, "\n")

## CAPM R2:          0.75

cat("FF3 R2:          ", R2_ff3, "\n")

## FF3 R2:           0.92

cat("FF3 Adjusted R2: ", adj_R2_ff3, "\n")

## FF3 Adjusted R2:  0.918

cat("Improvement in R2:", round(R2_ff3 - R2_capm, 4), "\n")

## Improvement in R2: 0.17

Interpretation of the rise from 0.75 to 0.92: Adding the SMB and HML factors raises the explained share of the fund’s return variance by 17 percentage points. Given the coefficient estimates, most of this gain is attributable to the size factor: the SMB loading is large and significant, while the HML loading is not distinguishable from zero. The single-factor model was leaving substantial systematic variation unexplained.

Why adjusted \(R^2\) is appropriate for model comparison: Raw \(R^2\) never decreases when a predictor is added, even an irrelevant one. The adjusted \(R^2\) penalizes for the number of predictors: \[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1}\] where \(k\) is the number of regressors. Here, with \(n = 144\) and \(k = 3\), adjusted \(R^2 = 0.918\) (vs. raw \(R^2 = 0.920\)). The penalty is tiny, so the improvement in fit is not an artifact of simply adding regressors. The t-statistics indicate that the gain comes mainly from SMB, since the HML loading is not significant. A like-for-like comparison would set this figure against the adjusted \(R^2\) of the single-factor model.
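The formula can be applied to both models as a check. This is a small sketch; `adj_r2` is a helper name introduced here, and the single-factor fit is assumed to use the same 144 months as the three-factor fit.

```r
# Adjusted R-squared for n observations and k regressors (intercept not counted in k)
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

n_obs    <- 144
adj_capm <- adj_r2(0.75, n_obs, k = 1)   # assumes the CAPM fit uses the same 144 months
adj_ff3  <- adj_r2(0.92, n_obs, k = 3)

cat("Adjusted R2, CAPM:", round(adj_capm, 4), "\n")
cat("Adjusted R2, FF3 :", round(adj_ff3,  4), "\n")
cat("Difference       :", round(adj_ff3 - adj_capm, 4), "\n")
```

The three-factor value reproduces the reported 0.918, and the gap to the single-factor model stays at about 0.17 after the penalty is applied.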

Question 3. Logistic Regression for Market Direction

Model: \[\text{logit}\, P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]

Coefficients: \(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\)

Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta VIX = 1.5\)

(j) Predicted Probability and Class

Formulas: \[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX\] \[P(\text{Up}) = \frac{e^{\text{logit}}}{1 + e^{\text{logit}}} = \frac{1}{1 + e^{-\text{logit}}}\]

b0 <- -0.02
b1 <-  5.4
b2 <- -0.38

r_lag  <- 0.010
dVIX   <- 1.5

logit_val <- b0 + b1 * r_lag + b2 * dVIX
prob_up   <- 1 / (1 + exp(-logit_val))
pred_class <- ifelse(prob_up >= 0.5, "Up", "Down")

cat("Logit value:", round(logit_val, 4), "\n")

## Logit value: -0.536

cat("P(Up)      :", round(prob_up, 4), "\n")

## P(Up)      : 0.3691

cat("Predicted class (threshold = 0.5):", pred_class, "\n")

## Predicted class (threshold = 0.5): Down

Step-by-step:

\[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.57 = -0.536\]

\[P(\text{Up}) = \frac{1}{1 + e^{-(-0.536)}} = 0.3691\]

Predicted class: Down (probability \(0.3691 < 0.5\))

(k) Economic Interpretation of \(\beta_1\) and \(\beta_2\)

\(\beta_1 = 5.4 > 0\) (lagged return): A positive lagged return increases the log-odds of an “Up” day tomorrow. This reflects short-term momentum: in this model, a day of positive returns makes another positive day more likely. The sign is consistent with positive return autocorrelation at short horizons, although no standard errors are reported for this model, so the statistical significance of the effect cannot be assessed here.

\(\beta_2 = -0.38 < 0\) (\(\Delta VIX\)): A rise in the VIX (fear index) decreases the log-odds of an “Up” day. This captures risk aversion and market stress — when implied volatility spikes, investors become fearful and equity returns tend to be negative or weak the following day.
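Log-odds are hard to read directly, so the coefficients can be converted to odds ratios. The sketch below assumes the lagged return is measured in decimals (0.010 = 1%), as in today’s inputs, and that \(\Delta VIX\) is measured in VIX points.

```r
b1 <-  5.4    # coefficient on lagged return
b2 <- -0.38   # coefficient on lagged change in VIX

# Odds multiplier for a +0.01 (one percentage point) change in the lagged return
or_ret <- exp(b1 * 0.01)
# Odds multiplier for a one-point rise in the VIX
or_vix <- exp(b2 * 1)

cat("Odds ratio, +1pp lagged return:", round(or_ret, 4), "\n")
cat("Odds ratio, +1 VIX point      :", round(or_vix, 4), "\n")
```

Holding the other input fixed, a one percentage point higher lagged return raises the odds of an “Up” day by about 5.5%, while a one-point rise in the VIX lowers them by about 32%. Over typical daily moves, the VIX term therefore dominates the prediction, as it does in part (j).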

(l) Confusion Matrix Metrics

Confusion Matrix (200-day hold-out):

	Actual Up	Actual Down	Total
Predicted Up	67	44	111
Predicted Down	33	56	89
Total	100	100	200

TP <- 67   # True Positive  (Predicted Up,   Actual Up)
FP <- 44   # False Positive (Predicted Up,   Actual Down)
FN <- 33   # False Negative (Predicted Down, Actual Up)
TN <- 56   # True Negative  (Predicted Down, Actual Down)
N  <- 200

# Accuracy
accuracy <- (TP + TN) / N

# Sensitivity (True Positive Rate for "Up")
sensitivity <- TP / (TP + FN)

# Specificity (True Negative Rate for "Down")
specificity <- TN / (TN + FP)

# Precision for "Up" predictions
precision <- TP / (TP + FP)

cat("Accuracy   :", round(accuracy,    4), "\n")

## Accuracy   : 0.615

cat("Sensitivity:", round(sensitivity, 4), "\n")

## Sensitivity: 0.67

cat("Specificity:", round(specificity, 4), "\n")

## Specificity: 0.56

cat("Precision  :", round(precision,   4), "\n")

## Precision  : 0.6036

Results:

\[\text{Accuracy} = \frac{TP + TN}{N} = \frac{67 + 56}{200} = 0.615\]

\[\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{67}{67 + 33} = 0.67\]

\[\text{Specificity} = \frac{TN}{TN + FP} = \frac{56}{56 + 44} = 0.56\]

\[\text{Precision} = \frac{TP}{TP + FP} = \frac{67}{67 + 44} = 0.6036\]

(m) Naive Rule and Adequacy of Accuracy

# With balanced classes (100 Up, 100 Down), majority class = either (both 50%)
# Naive rule predicts the same class always
majority_class_count <- max(100, 100)  # Both classes are equal
naive_accuracy <- majority_class_count / N

cat("Naive classifier accuracy:", naive_accuracy, "\n")

## Naive classifier accuracy: 0.5

cat("Model accuracy:           ", round(accuracy, 4), "\n")

## Model accuracy:            0.615

cat("Model beats naive rule?   ", accuracy > naive_accuracy, "\n")

## Model beats naive rule?    TRUE

Naive rule accuracy: Since both classes are balanced (100 “Up”, 100 “Down”), a naive majority-class predictor achieves \(\frac{100}{200} = 0.5000\) (50%) accuracy.

Model accuracy: \(0.615\) — the model beats the naive rule (\(61.5\% > 50\%\)).
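Whether a 61.5% hit rate on 200 days is distinguishable from coin-flipping can be checked with an exact binomial test. This is a sketch that treats the 200 hold-out days as independent trials, which is an assumption.

```r
correct <- 67 + 56   # TP + TN from the hold-out confusion matrix
N       <- 200

# Exact one-sided binomial test of H0: accuracy = 0.5 against accuracy > 0.5
bt <- binom.test(correct, N, p = 0.5, alternative = "greater")
cat("Observed accuracy:", correct / N, "\n")
cat("One-sided p-value:", signif(bt$p.value, 3), "\n")
cat("Significantly better than 50% at the 5% level?", bt$p.value < 0.05, "\n")
```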

Why accuracy alone is inadequate for a trading system: Accuracy treats false positives and false negatives symmetrically and ignores the size of the return on each day, but the costs of trading errors are asymmetric. In a long-only trading context:

False positives (predicting Up when the market goes Down) cause direct monetary losses.
False negatives (predicting Down when the market goes Up) result in missed gains (opportunity cost).

A more economically relevant criterion is precision (the fraction of “Up” predictions that are correct, which maps directly to the trade win rate), the F1 score (which balances precision and recall), or, better still, a direct measure of strategy P&L under the predicted signals.
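The F1 score for the “Up” class follows directly from the confusion matrix in part (l). A minimal sketch:

```r
TP <- 67; FP <- 44; FN <- 33; TN <- 56

precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)   # identical to sensitivity
# F1 is the harmonic mean of precision and recall
f1        <- 2 * precision * recall / (precision + recall)

cat("F1 score for the Up class:", round(f1, 4), "\n")
```

Like accuracy, F1 still ignores how much is won or lost on each trade, which is why a P&L-based measure remains the most direct criterion.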

Question 4. Resampling and Regularization in a Backtest

Given: Sample mean monthly return \(\bar{r} = 0.70\%\), sample standard deviation \(s = 5.50\%\), \(n = 48\) months.

(n) Monthly and Annualized Sharpe Ratio

Formula (monthly): \[SR_{\text{monthly}} = \frac{\bar{r}}{s}\]

Annualization: Scale by \(\sqrt{12}\) (12 months per year): \[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]

r_bar  <- 0.0070   # 0.70%
s      <- 0.0550   # 5.50%
n_months <- 48

SR_monthly <- r_bar / s
SR_annual  <- SR_monthly * sqrt(12)

cat("Monthly Sharpe Ratio :", round(SR_monthly, 4), "\n")

## Monthly Sharpe Ratio : 0.1273

cat("Scaling factor       : sqrt(12) =", round(sqrt(12), 4), "\n")

## Scaling factor       : sqrt(12) = 3.4641

cat("Annualized Sharpe    :", round(SR_annual,  4), "\n")

## Annualized Sharpe    : 0.4409

Results:

\[SR_{\text{monthly}} = \frac{0.0070}{0.0550} = 0.1273\]

\[SR_{\text{annual}} = 0.1273 \times \sqrt{12} = 0.1273 \times 3.4641 = 0.4409\]

Scaling factor: \(\sqrt{12}\). Assuming i.i.d. monthly returns, the mean scales with \(T\) and the standard deviation with \(\sqrt{T}\), so the Sharpe ratio scales with \(\sqrt{T}\). Serial correlation in returns would make this scaling inexact.
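The scaling argument can be spelled out. Assume i.i.d. monthly returns with mean \(\mu\) and standard deviation \(\sigma\), and approximate the \(T\)-month return by the sum of the monthly returns (ignoring compounding). The sum has mean \(T\mu\) and variance \(T\sigma^2\), so

\[SR_T = \frac{T\mu}{\sqrt{T}\,\sigma} = \sqrt{T}\,\frac{\mu}{\sigma}, \qquad T = 12 \;\Rightarrow\; SR_{\text{annual}} = \sqrt{12}\; SR_{\text{monthly}}\]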

(o) Bootstrap Procedure for Sharpe Ratio Standard Error

Step-by-step bootstrap procedure:

set.seed(42)
# Simulate monthly returns for illustration
r_sim <- rnorm(48, mean = 0.0070, sd = 0.0550)

B <- 10000
SR_boot <- numeric(B)

for (b in 1:B) {
  r_b       <- sample(r_sim, size = 48, replace = TRUE)  # i.i.d. resample
  SR_boot[b] <- mean(r_b) / sd(r_b)
}

SR_monthly_original <- mean(r_sim) / sd(r_sim)
SE_boot <- sd(SR_boot)

cat("Original monthly SR (simulated):", round(SR_monthly_original, 4), "\n")

## Original monthly SR (simulated): 0.073

cat("Bootstrap SE of monthly SR:     ", round(SE_boot, 4), "\n")

## Bootstrap SE of monthly SR:      0.149

cat("95% Bootstrap CI: [",
    round(quantile(SR_boot, 0.025), 4), ",",
    round(quantile(SR_boot, 0.975), 4), "]\n")

## 95% Bootstrap CI: [ -0.2097 , 0.3765 ]

Steps:

Collect the original \(n = 48\) monthly returns \(\{r_1, r_2, \ldots, r_{48}\}\).
Draw \(B = 10{,}000\) bootstrap samples of size 48 with replacement.
For each bootstrap sample \(b\), compute \(SR^{(b)} = \bar{r}^{(b)} / s^{(b)}\).
The bootstrap SE is the standard deviation of \(\{SR^{(1)}, \ldots, SR^{(B)}\}\).
Optionally construct a percentile confidence interval using the 2.5th and 97.5th percentiles.

Why the ordinary i.i.d. bootstrap is inappropriate: Financial returns can exhibit serial dependence, both autocorrelation in the returns themselves and, more commonly, volatility clustering of the GARCH type. The i.i.d. bootstrap resamples observations independently, which destroys this time-series dependence structure and can produce misleading standard errors.

Fix (block bootstrap): Use a moving (overlapping) block bootstrap, or a stationary bootstrap with random block lengths. Both resample contiguous blocks of consecutive observations (e.g., blocks of 3–6 months), preserving local time-series dependence while still generating diverse bootstrap samples.
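A moving block bootstrap can be sketched in base R as follows. The returns are simulated stand-ins for the 48 observed months, and the function name `block_bootstrap_sr`, the block length of 4, and \(B = 2000\) are all illustrative assumptions rather than recommended settings.

```r
set.seed(42)
# Illustrative data only: simulated returns standing in for the 48 observed months
r_sim <- rnorm(48, mean = 0.0070, sd = 0.0550)

# Moving (overlapping) block bootstrap of the monthly Sharpe ratio
block_bootstrap_sr <- function(r, block_len = 4, B = 2000) {
  n        <- length(r)
  n_blocks <- ceiling(n / block_len)        # blocks needed to cover n observations
  starts   <- seq_len(n - block_len + 1)    # admissible block start indices
  sr       <- numeric(B)
  for (b in seq_len(B)) {
    # Draw block start points with replacement
    s0  <- starts[sample.int(length(starts), n_blocks, replace = TRUE)]
    # Expand each start into a run of consecutive indices, in draw order
    idx <- as.vector(sapply(s0, function(s) s:(s + block_len - 1)))
    r_b <- r[idx[1:n]]                      # trim to the original sample length
    sr[b] <- mean(r_b) / sd(r_b)
  }
  sr
}

SR_block <- block_bootstrap_sr(r_sim)
SE_block <- sd(SR_block)
cat("Block-bootstrap SE of monthly SR:", round(SE_block, 4), "\n")
```

Because the simulated series is i.i.d. by construction, this run only demonstrates the mechanics. With real, serially dependent returns the block-based standard error can differ noticeably from the i.i.d. one, and the result is sensitive to the chosen block length.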

(p) Choosing the LASSO \(\lambda\)

lambda_min_cv <- 0.030
lambda_1se    <- 0.065
factors_min   <- 14
factors_1se   <- 7

cat("Lambda (min-CV):    ", lambda_min_cv, "-> retains", factors_min, "factors\n")

## Lambda (min-CV):     0.03 -> retains 14 factors

cat("Lambda (1-SE rule): ", lambda_1se,    "-> retains", factors_1se,  "factors\n")

## Lambda (1-SE rule):  0.065 -> retains 7 factors

Recommended choice: \(\lambda = 0.065\) (the 1-standard-error rule)

Rationale:

Parsimony and interpretability: Retaining 7 factors instead of 14 produces a simpler, more transparent model that is easier to maintain and monitor in production.
Overfitting protection: The 1-SE rule selects the most regularized model whose CV error is within one standard error of the minimum. The difference in cross-validated error between \(\lambda = 0.030\) and \(\lambda = 0.065\) is therefore within the noise of the CV estimate, while \(\lambda = 0.065\) provides substantially stronger regularization.
Out-of-sample degradation: In financial applications, in-sample factor signals are prone to data-snooping and overfitting. With 60 candidate factors and only 48–144 months of data, many factors are likely spurious. The sparser model (7 factors) is more likely to generalize to new data.
Transaction costs: Fewer factors typically means fewer positions, lower turnover, and reduced transaction costs in live trading.
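The mechanics of the 1-SE rule can be shown on a small table of cross-validation results. Every number in the table below is hypothetical and chosen only so that the two candidate values from the question appear; the actual CV curve is not given.

```r
# Hypothetical CV summary: all values below are made up for illustration only
cv <- data.frame(
  lambda = c(0.010, 0.030, 0.065, 0.120),
  cvm    = c(1.060, 1.000, 1.020, 1.150),   # mean cross-validated error
  cvsd   = c(0.050, 0.040, 0.040, 0.050)    # standard error of that mean
)

# Minimum-CV choice: the lambda with the lowest mean CV error
i_min      <- which.min(cv$cvm)
lambda_min <- cv$lambda[i_min]

# 1-SE rule: the largest lambda whose CV error is within one SE of the minimum
threshold  <- cv$cvm[i_min] + cv$cvsd[i_min]
lambda_1se <- max(cv$lambda[cv$cvm <= threshold])

cat("lambda (min-CV):", lambda_min, "\n")
cat("lambda (1-SE)  :", lambda_1se, "\n")
```

In practice this step is rarely coded by hand. For example, `cv.glmnet` in the glmnet package reports both choices as `lambda.min` and `lambda.1se`.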

(q) Walk-Forward (Time-Respecting) Cross-Validation

# Illustrate walk-forward scheme schematically
T_total <- 60  # total months (illustrative)
train_window <- 36
test_window  <- 6

folds <- data.frame(
  Fold        = integer(),
  Train_Start = integer(),
  Train_End   = integer(),
  Test_Start  = integer(),
  Test_End    = integer()
)

fold <- 1
train_end <- train_window
while ((train_end + test_window) <= T_total) {
  folds <- rbind(folds, data.frame(
    Fold        = fold,
    Train_Start = 1,
    Train_End   = train_end,
    Test_Start  = train_end + 1,
    Test_End    = train_end + test_window
  ))
  train_end <- train_end + test_window
  fold <- fold + 1
}

print(folds)

##   Fold Train_Start Train_End Test_Start Test_End
## 1    1           1        36         37       42
## 2    2           1        42         43       48
## 3    3           1        48         49       54
## 4    4           1        54         55       60

Walk-Forward Scheme (Step-by-Step):

Initial training window: Use the first \(W\) months (e.g., \(W = 36\)) as the training set. Fit the LASSO, choosing \(\lambda\) using only data inside this window, and record the selected factors and coefficients.
Out-of-sample test step: Apply the fitted model to the next \(h\) months (e.g., \(h = 6\)) without refitting. Record predicted signals and actual returns.
Roll forward: Expand (or roll) the training window by \(h\) months and re-estimate the model. Repeat until the end of the data.
Aggregate OOS performance: Collect all out-of-sample predictions. Compute strategy returns, Sharpe ratio, maximum drawdown, and other metrics solely from the held-out periods.

Why standard random \(k\)-fold cross-validation is unsafe for this problem:

Random \(k\)-fold CV shuffles observations randomly and assigns them to folds without regard to time. This means the training set for any fold will contain future data relative to some of its test observations — a form of look-ahead bias (data leakage). For instance, if month 48’s return is used to train a model that is then tested on month 12, the model has been “told” information from the future. This produces optimistically biased CV estimates that will not generalize in real trading, where the model can only use information available at the time of the decision.
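The leakage can be made concrete by checking, fold by fold, whether any training month comes after the earliest test month. The sketch below uses 60 months as in the schematic above; the choice of 5 random folds is an illustrative assumption.

```r
set.seed(1)
T_total <- 60
k       <- 5   # number of random folds (illustrative)

# Random k-fold: months are shuffled into folds with no regard to time order
fold_id <- sample(rep(1:k, length.out = T_total))

# A fold "leaks" if some training month comes after its earliest test month
leaks_random <- sapply(1:k, function(f) {
  test  <- which(fold_id == f)
  train <- which(fold_id != f)
  max(train) > min(test)
})
cat("Random k-fold: folds with future data in training:",
    sum(leaks_random), "of", k, "\n")

# Walk-forward (expanding window): training ends at train_end, testing starts right after
train_end  <- seq(36, T_total - 6, by = 6)
test_start <- train_end + 1
leaks_wf   <- train_end >= test_start
cat("Walk-forward:  folds with future data in training:",
    sum(leaks_wf), "of", length(train_end), "\n")
```

By construction the walk-forward folds can never leak, whereas a random split avoids leakage only in the very unlikely case that a test fold consists of exactly the final months of the sample.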

End of Assignment

Final Exam

Temuulen Sukhbat

2026-06-08