Given:
Formula: \[t_{\hat{\beta}} = \frac{\hat{\beta} - 0}{SE(\hat{\beta})}\]
beta_hat <- 0.98
se_beta <- 0.17
t_crit <- 1.98
t_beta_0 <- beta_hat / se_beta
cat("t-statistic for H0: beta = 0:", round(t_beta_0, 4), "\n")
## t-statistic for H0: beta = 0: 5.7647
## Critical |t|: 1.98
## Reject H0? TRUE
Result: \(t = 0.98 / 0.17 = 5.7647\). Since \(|5.7647| > 1.98\), we reject \(H_0: \beta = 0\) at the 5% level.
Economic interpretation: \(\hat{\beta} = 0.98\) means the fund’s excess return moves by about 0.98 percentage points for every 1 percentage point move in the market excess return. The fund’s systematic risk is slightly below the market’s: it is nearly as market-sensitive as the index, with no amplification.
Formula: \[t_{\hat{\beta}=1} = \frac{\hat{\beta} - 1}{SE(\hat{\beta})}\]
## t-statistic for H0: beta = 1: -0.1176
## Critical |t|: 1.98
## Reject H0? FALSE
Result: \(t = (0.98 - 1) / 0.17 = -0.1176\). Since \(|-0.1176| < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% level.
Interpretation: The fund’s systematic risk is statistically indistinguishable from that of the market. It behaves like a passive market-tracking fund in terms of beta.
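The two beta tests above use the same formula and differ only in the null value. A minimal sketch, assuming a hypothetical helper `t_test_coef` (not part of the original solution) and the same critical value of 1.98:

```r
# Hypothetical helper: t-test of H0: coefficient = null_value
t_test_coef <- function(estimate, se, null_value = 0, t_crit = 1.98) {
  t_stat <- (estimate - null_value) / se
  list(t_stat = t_stat, reject = abs(t_stat) > t_crit)
}

test_beta_0 <- t_test_coef(0.98, 0.17, null_value = 0)  # H0: beta = 0
test_beta_1 <- t_test_coef(0.98, 0.17, null_value = 1)  # H0: beta = 1

cat("H0: beta = 0 -> t =", round(test_beta_0$t_stat, 4),
    "| reject:", test_beta_0$reject, "\n")
cat("H0: beta = 1 -> t =", round(test_beta_1$t_stat, 4),
    "| reject:", test_beta_1$reject, "\n")
```

Keeping the null value as an argument makes it harder to test against zero by habit when the economically relevant benchmark is \(\beta = 1\).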
Formula: \[t_{\hat{\alpha}} = \frac{\hat{\alpha} - 0}{SE(\hat{\alpha})}\]
alpha_hat <- 0.0017
se_alpha <- 0.0020
t_alpha <- alpha_hat / se_alpha
cat("t-statistic for alpha:", round(t_alpha, 4), "\n")
## t-statistic for alpha: 0.85
## Critical |t|: 1.98
## Reject H0: alpha = 0? FALSE
Result: \(t = 0.0017 / 0.0020 = 0.8500\). Since \(|0.8500| < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% level.
Marketing claim assessment: The data do not statistically justify the claim of “positive risk-adjusted performance.” The point estimate \(\hat{\alpha} = 0.0017\) is positive, but its standard error (0.0020) is larger than the estimate itself, so the true alpha could easily be zero. With \(t = 0.85\), zero alpha cannot be rejected at the 10%, 5%, or 1% level.
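The same conclusion can be read off an approximate 95% confidence interval for alpha. A short sketch, reusing the critical value of 1.98 from the tests above (the variable names here are illustrative only):

```r
alpha_est <- 0.0017
alpha_se  <- 0.0020
crit      <- 1.98   # same critical value as the t-tests above

# estimate +/- critical value * standard error
ci_alpha <- alpha_est + c(-1, 1) * crit * alpha_se

cat("Approx. 95% CI for alpha: [", round(ci_alpha[1], 5), ",",
    round(ci_alpha[2], 5), "]\n")
cat("Interval contains zero?", ci_alpha[1] < 0 && ci_alpha[2] > 0, "\n")
```

The interval runs from about -0.23% to +0.57% per month, so it comfortably includes zero as well as economically meaningful positive values; the sample is simply not informative enough to settle the question.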
R2 <- 0.50
systematic_pct <- R2 * 100
diversifiable_pct <- (1 - R2) * 100
cat("Systematic (market-explained) variation:", systematic_pct, "%\n")
## Systematic (market-explained) variation: 50 %
## Diversifiable (idiosyncratic) variation: 50 %
Interpretation: \(R^2 = 0.50\) means 50% of the fund’s monthly return variation is explained by movements in the market factor (systematic risk). The remaining 50% is diversifiable, idiosyncratic risk specific to the fund’s holdings that a well-diversified investor need not bear.
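In a single-factor regression, \(R^2\) equals \(\hat{\beta}^2 \, \text{Var}(R_m - R_f) / \text{Var}(R_i - R_f)\), which is why it can be read as the systematic share of variance. The sketch below checks this identity on simulated data; the return series and all parameter values are made-up stand-ins, not the fund's actual data.

```r
set.seed(123)
# Simulated stand-ins (assumed parameters, for illustration only)
mkt_sim  <- rnorm(120, mean = 0.007, sd = 0.045)
fund_sim <- 0.0017 + 0.98 * mkt_sim + rnorm(120, sd = 0.045)

fit_sim <- lm(fund_sim ~ mkt_sim)
b_sim   <- coef(fit_sim)[["mkt_sim"]]
r2_sim  <- summary(fit_sim)$r.squared

# Variance decomposition: systematic + idiosyncratic = total
systematic_share <- b_sim^2 * var(mkt_sim) / var(fund_sim)
idio_share       <- var(residuals(fit_sim)) / var(fund_sim)

cat("R-squared:           ", round(r2_sim, 4), "\n")
cat("Systematic share:    ", round(systematic_share, 4), "\n")
cat("Idiosyncratic share: ", round(idio_share, 4), "\n")
```

The first two printed numbers coincide, and the two shares sum to one.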
Formula (CAPM): \[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f]\]
E_market_premium <- 0.0070
E_fund_excess <- beta_hat * E_market_premium
cat("CAPM-implied expected monthly excess return:",
round(E_fund_excess, 4), "or",
round(E_fund_excess * 100, 4), "%\n")
## CAPM-implied expected monthly excess return: 0.0069 or 0.686 %
Result: \(E[R_i - R_f] = 0.98 \times 0.0070 = 0.0069\) or 0.69% per month.
Given (\(n = 144\)):
| Term | Estimate | Std. Error |
|---|---|---|
| \(\hat{\alpha}\) | 0.0029 | 0.0018 |
| \(b\) (MKT) | 0.97 | 0.08 |
| \(s\) (SMB) | 0.75 | 0.11 |
| \(h\) (HML) | -0.13 | 0.13 |
\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\)
Formula: \(t_j = \hat{\theta}_j / SE(\hat{\theta}_j)\) for each coefficient \(\theta_j\).
coef_names <- c("alpha", "MKT (b)", "SMB (s)", "HML (h)")
estimates <- c(0.0029, 0.97, 0.75, -0.13)
std_errors <- c(0.0018, 0.08, 0.11, 0.13)
t_stats <- estimates / std_errors
significant <- abs(t_stats) > t_crit
results_df <- data.frame(
Coefficient = coef_names,
Estimate = estimates,
Std_Error = std_errors,
t_stat = round(t_stats, 4),
Significant = significant
)
print(results_df)
##   Coefficient Estimate Std_Error  t_stat Significant
## 1       alpha   0.0029    0.0018  1.6111       FALSE
## 2     MKT (b)   0.9700    0.0800 12.1250        TRUE
## 3     SMB (s)   0.7500    0.1100  6.8182        TRUE
## 4     HML (h)  -0.1300    0.1300 -1.0000       FALSE
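The critical value 1.98 is a rounded figure. With \(n = 144\) and three factors plus an intercept, the regression has 140 residual degrees of freedom, so exact two-sided p-values can be computed from the t distribution. A minimal sketch using base R's `pt` and `qt` (variable names are illustrative):

```r
n_obs_ff3 <- 144                          # sample size given in the question
k_factors <- 3                            # MKT, SMB, HML
df_resid  <- n_obs_ff3 - k_factors - 1    # residual degrees of freedom

t_ff3 <- c(alpha = 0.0029 / 0.0018, MKT = 0.97 / 0.08,
           SMB   = 0.75 / 0.11,     HML = -0.13 / 0.13)

# Two-sided p-values and the exact 5% critical value
p_values   <- 2 * pt(-abs(t_ff3), df = df_resid)
t_crit_140 <- qt(0.975, df = df_resid)

print(round(p_values, 4))
cat("Exact 5% two-sided critical value:", round(t_crit_140, 4), "\n")
```

The p-values lead to the same decisions as the table: only the market and size loadings are significant at the 5% level.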
Summary:
s_hat <- 0.75
h_hat <- -0.13
cat("SMB loading (s):", s_hat, "→ Positive and significant: SMALL-CAP tilt\n")
## SMB loading (s): 0.75 → Positive and significant: SMALL-CAP tilt
## HML loading (h): -0.13 → Negative (not significant): GROWTH tilt
Style classification:
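Overall style: The fund is a small-cap fund with no statistically demonstrated value or growth tilt. The negative HML point estimate leans toward growth, but with \(t = -1.00\) it is not distinguishable from zero, so “small-cap growth” is at most a tentative label.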
alpha_ff <- 0.0029
se_alpha_ff <- 0.0018
t_alpha_ff <- alpha_ff / se_alpha_ff
cat("FF3 alpha:", alpha_ff, "\n")
## FF3 alpha: 0.0029
## t-statistic: 1.6111
## Significant at 5%? FALSE
Interpretation: \(\hat{\alpha} = 0.0029\) (0.29% per month) represents the abnormal return after accounting for the three Fama–French risk factors. This is Jensen’s alpha in the FF3 context.
\(t = 1.6111 < 1.98\): the alpha is not statistically significant at the 5% level. The manager does not demonstrate value-added beyond compensated factor exposures.
R2_capm <- 0.75
R2_ff3 <- 0.92
adj_R2_ff3 <- 0.918
improvement <- R2_ff3 - R2_capm
cat("R² improvement by adding SMB and HML:", round(improvement, 4), "\n")
## R² improvement by adding SMB and HML: 0.17
## Adjusted R²: 0.918
Explanation:
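The rise in \(R^2\) from \(0.75\) to \(0.92\) shows that adding the SMB and HML factors explains an additional 17 percentage points of the fund’s return variation that the market factor alone could not capture. Judging by the t-statistics, nearly all of this gain comes from the size factor (SMB, \(t = 6.82\)); the value factor (HML, \(t = -1.00\)) contributes little. The fund’s returns are therefore driven mainly by market and size exposure, and CAPM captures only the first of these.
Why Adjusted R²? Adding a predictor, even a random one, can never lower \(R^2\) and almost always raises it, because OLS fits noise as well as signal. The adjusted \(R^2\) penalizes each additional parameter: \[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1}\] where \(k\) is the number of regressors excluding the intercept. With \(n = 144\) and \(k = 3\), \(\bar{R}^2 = 1 - (0.08)(143)/140 = 0.918\), barely below \(R^2 = 0.92\). The penalty is small, so the jump in fit over CAPM is not an artifact of adding parameters. Adjusted \(R^2\) cannot, however, certify each factor individually: the gain is attributable to SMB, while HML’s \(|t| = 1\) means that including it leaves \(\bar{R}^2\) essentially unchanged.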
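The adjusted \(R^2\) formula can be checked directly. The sketch below uses a hypothetical helper `adj_r2`; the CAPM line additionally assumes that the single-factor regression was run on the same 144 observations, which the question does not state explicitly.

```r
# Hypothetical helper: adjusted R-squared for k regressors (excluding intercept)
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

adj_ff3  <- adj_r2(0.92, n = 144, k = 3)   # three-factor model
adj_capm <- adj_r2(0.75, n = 144, k = 1)   # assumes the same n = 144 for CAPM

cat("Adjusted R^2, FF3: ", round(adj_ff3, 4), "\n")   # 0.9183
cat("Adjusted R^2, CAPM:", round(adj_capm, 4), "\n")  # 0.7482
```

Under that assumption the gap of roughly 17 percentage points survives the degrees-of-freedom penalty almost entirely.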
Model: \[\text{logit}\,P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta\text{VIX}_{t-1}\]
\(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\)
Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta\text{VIX} = 1.5\)
Formula: \[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta\text{VIX}\] \[P(\text{Up}) = \frac{1}{1 + e^{-\text{logit}}}\]
b0 <- -0.02
b1 <- 5.4
b2 <- -0.38
r_lag <- 0.010
delta_vix <- 1.5
logit_val <- b0 + b1 * r_lag + b2 * delta_vix
prob_up <- 1 / (1 + exp(-logit_val))
cat("Logit value:", round(logit_val, 4), "\n")
## Logit value: -0.536
## P(Up): 0.3691
## Predicted class (threshold = 0.5): Down
Calculation: \[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.57 = -0.536\] \[P(\text{Up}) = \frac{1}{1 + e^{0.536}} = \frac{1}{1 + 1.7092} = 0.3691\]
Predicted class: \(P(\text{Up}) = 0.3691 < 0.5\) → predicted Down.
\(\beta_1 = 5.4 > 0\) (lagged return): A positive lagged return increases the log-odds of an “Up” day. This captures return momentum — when the market was up yesterday, it tends to continue upward. This is consistent with short-term momentum effects documented in equity markets.
\(\beta_2 = -0.38 < 0\) (ΔVIX): A rise in the VIX (increase in implied volatility) decreases the log-odds of an “Up” day. This reflects the fear gauge property of VIX: surging volatility signals investor fear and market stress, which is associated with negative returns or increased downside pressure. The inverse relationship between VIX changes and market direction is well-established empirically.
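Because the coefficients act on the log-odds, exponentiating them gives odds ratios, which are often easier to interpret. A short sketch, assuming a 1 percentage point move in the lagged return (0.01 in the model's decimal units) and a 1 point move in the VIX as the reference changes:

```r
beta_ret <- 5.4     # coefficient on lagged return
beta_vix <- -0.38   # coefficient on change in VIX

# Multiplicative effect on the odds of an "Up" day
or_ret_1pp <- exp(beta_ret * 0.01)  # lagged return higher by 1 percentage point
or_vix_1pt <- exp(beta_vix * 1)     # VIX change higher by 1 point

cat("Odds ratio, +1pp lagged return:", round(or_ret_1pp, 4), "\n")
cat("Odds ratio, +1 point VIX:      ", round(or_vix_1pt, 4), "\n")
```

A 1 percentage point higher lagged return raises the odds of an “Up” day by roughly 5.5%, while a 1 point rise in the VIX cuts the odds by roughly 32%. For inputs of typical size, the VIX term therefore dominates the prediction, as it does in the worked example above.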
Confusion matrix:
| | Actual Up | Actual Down |
|---|---|---|
| Predicted Up | 67 (TP) | 44 (FP) |
| Predicted Down | 33 (FN) | 56 (TN) |
TP <- 67
FP <- 44
FN <- 33
TN <- 56
N <- 200
accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN) # True Positive Rate
specificity <- TN / (TN + FP) # True Negative Rate
precision <- TP / (TP + FP)
cat("Accuracy: ", round(accuracy, 4), "\n")
## Accuracy: 0.615
## Sensitivity: 0.67
## Specificity: 0.56
## Precision: 0.6036
Results:
- Accuracy \(= (67 + 56)/200 = 123/200 = 0.6150\) (61.50%)
- Sensitivity \(= 67/(67+33) = 67/100 = 0.6700\) (67.00%)
- Specificity \(= 56/(56+44) = 56/100 = 0.5600\) (56.00%)
- Precision \(= 67/(67+44) = 67/111 = 0.6036\) (60.36%)
actual_up <- 100
actual_down <- 100
majority_class <- ifelse(actual_up >= actual_down, "Up", "Down")
naive_accuracy <- max(actual_up, actual_down) / N
cat("Majority class:", majority_class, "\n")
## Majority class: Up
## Naive accuracy: 0.5
## Model accuracy: 0.615
## Model beats naive rule? TRUE
Naive accuracy: The classes are perfectly balanced (100 each), so the naive rule may predict either “Up” or “Down” for all observations (the code breaks the tie in favor of “Up”): accuracy \(= 100/200 = 0.5000\) (50%).
Model beats naive? Yes — 61.50% > 50.00%.
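Whether 61.5% is distinguishable from the 50% baseline in a sample of 200 can be checked with an exact binomial test. This is a sketch only: it treats the 200 predictions as independent trials, which is an assumption that serially dependent market data may not satisfy.

```r
hits   <- 67 + 56   # correct predictions (TP + TN)
n_pred <- 200       # total predictions

# H0: hit rate = 0.5 against H1: hit rate > 0.5
bt <- binom.test(hits, n_pred, p = 0.5, alternative = "greater")

cat("Observed hit rate:", unname(bt$estimate), "\n")
cat("One-sided p-value:", signif(bt$p.value, 3), "\n")
```

Under the independence assumption the p-value is well below 1%, so the edge over the naive rule is unlikely to be pure chance, although this says nothing yet about whether it is economically exploitable.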
Why accuracy alone is inadequate for trading: Accuracy treats all errors symmetrically, but in trading they are not. A false positive (predicting Up when the market falls) and a false negative (predicting Down when the market rises) have very different P&L consequences. Furthermore:
A more suitable metric is the area under the ROC curve (AUC-ROC), which measures discrimination ability across all thresholds, or the precision-recall AUC, especially when false alarms are costly.
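AUC-ROC can be computed without any package through its rank interpretation: it is the probability that a randomly chosen “Up” observation receives a higher predicted probability than a randomly chosen “Down” observation. The sketch below uses a hypothetical helper `auc_rank` and a tiny made-up set of scores and labels, since the question supplies only the confusion matrix and not the underlying predicted probabilities.

```r
# Hypothetical helper: AUC via the Mann-Whitney rank-sum formula
auc_rank <- function(scores, labels) {
  r     <- rank(scores)          # average ranks handle ties
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up illustration data: predicted P(Up) and actual outcome (1 = Up)
scores_demo <- c(0.9, 0.8, 0.3, 0.6, 0.4, 0.7)
labels_demo <- c(1,   1,   0,   1,   0,   0)

auc_demo <- auc_rank(scores_demo, labels_demo)
cat("AUC:", round(auc_demo, 4), "\n")   # 8 of 9 positive-negative pairs ordered correctly
```

Unlike accuracy, this number does not depend on the 0.5 threshold, so it stays meaningful if the trading rule later uses a different cutoff.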
Given: \(\bar{r} = 0.0070\), \(s = 0.0550\), \(n = 48\) months.
Monthly Sharpe ratio: \[SR_{\text{monthly}} = \frac{\bar{r}}{s}\]
Annualized Sharpe ratio: \[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]
mean_r <- 0.0070
sd_r <- 0.0550
n_obs <- 48
scaling_factor <- sqrt(12)
SR_monthly <- mean_r / sd_r
SR_annual <- SR_monthly * scaling_factor
cat("Monthly Sharpe ratio:", round(SR_monthly, 4), "\n")
## Monthly Sharpe ratio: 0.1273
## Scaling factor (sqrt(12)): 3.4641
## Annualized Sharpe ratio: 0.4409
Results:
- \(SR_{\text{monthly}} = 0.0070 / 0.0550 = 0.1273\)
- Scaling factor: \(\sqrt{12} = 3.4641\)
- \(SR_{\text{annual}} = 0.1273 \times 3.4641 = 0.4409\)
The scaling factor \(\sqrt{12}\) is used because, under i.i.d. returns, the mean scales by 12 and the standard deviation scales by \(\sqrt{12}\), so their ratio scales by \(\sqrt{12}\).
Step-by-step i.i.d. bootstrap:
set.seed(42)
returns <- rnorm(48, mean = 0.0070, sd = 0.0550) # simulated stand-in
B <- 10000
sr_boot <- replicate(B, {
samp <- sample(returns, size = 48, replace = TRUE)
mean(samp) / sd(samp) * sqrt(12)
})
cat("Bootstrap SE of annualized Sharpe:", round(sd(sr_boot), 4), "\n")
## Bootstrap SE of annualized Sharpe: 0.516
cat("Bootstrap 95% CI: [",
round(quantile(sr_boot, 0.025), 4), ",",
round(quantile(sr_boot, 0.975), 4), "]\n")
## Bootstrap 95% CI: [ -0.7263 , 1.3042 ]
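As a rough cross-check on the bootstrap standard error, a commonly used large-sample approximation for i.i.d. returns puts the standard error of the per-period Sharpe ratio at \(\sqrt{(1 + SR^2/2)/n}\). The sketch below applies it to the sample statistics given in the question; it relies on the same i.i.d. assumption as the bootstrap above and is therefore subject to the same criticism.

```r
sr_m     <- 0.0070 / 0.0550   # monthly Sharpe ratio from the given statistics
n_months <- 48

# Large-sample i.i.d. approximation, then annualized with sqrt(12)
se_sr_monthly <- sqrt((1 + 0.5 * sr_m^2) / n_months)
se_sr_annual  <- se_sr_monthly * sqrt(12)

cat("Approx. SE of annualized Sharpe:", round(se_sr_annual, 4), "\n")
```

The result is about 0.50, the same order of magnitude as the bootstrap figure. Either way, four years of monthly data leave the annualized Sharpe ratio of 0.44 with an uncertainty band wider than the estimate itself.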
Why ordinary i.i.d. bootstrap is inappropriate for monthly returns:
Monthly financial returns exhibit serial dependence: autocorrelation in returns and, more prominently, volatility clustering (GARCH-type effects). The i.i.d. bootstrap resamples observations independently and so destroys this temporal structure. The resulting standard error of the Sharpe ratio (and of other time-series statistics) is misstated; with positively autocorrelated returns it is typically underestimated, because the resampled series ignores the persistence in the data.
Fix (Block Bootstrap): The stationary block bootstrap (Politis & Romano, 1994) or the circular block bootstrap resamples contiguous blocks of consecutive returns (e.g., blocks of length \(l = 3\)–\(6\) months), preserving the local dependence structure. The stationary version draws random block lengths with mean \(l\); the circular version uses a fixed \(l\) and wraps around the end of the sample. The block length is typically chosen by a data-driven criterion (e.g., the automatic block-length selection method of Politis & White).
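A minimal base-R sketch of the circular block bootstrap follows. The helper `circular_block_resample`, the block length of 4 months, and the number of replications are illustrative assumptions, and the return series is the same kind of simulated stand-in used in the i.i.d. example (so it contains no real serial dependence to preserve).

```r
set.seed(42)
ret_sim <- rnorm(48, mean = 0.0070, sd = 0.0550)  # simulated stand-in

# Hypothetical helper: one circular block bootstrap resample of x
circular_block_resample <- function(x, block_len) {
  n        <- length(x)
  n_blocks <- ceiling(n / block_len)
  starts   <- sample.int(n, n_blocks, replace = TRUE)   # random block starts
  # Each block is block_len consecutive indices, wrapping past the end
  idx <- unlist(lapply(starts, function(s) (s - 1 + 0:(block_len - 1)) %% n + 1))
  x[idx[seq_len(n)]]                                    # trim to original length
}

block_len <- 4      # assumed block length (months)
B_block   <- 2000   # assumed number of replications

sr_block <- replicate(B_block, {
  samp <- circular_block_resample(ret_sim, block_len)
  mean(samp) / sd(samp) * sqrt(12)
})

cat("Block-bootstrap SE of annualized Sharpe:", round(sd(sr_block), 4), "\n")
```

With real monthly returns, `ret_sim` would be replaced by the observed series and the block length chosen by a data-driven rule rather than fixed by hand.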
lambda_min <- 0.030
factors_min <- 14
lambda_1se <- 0.065
factors_1se <- 7
cat("Minimum-CV lambda:", lambda_min, "| Factors retained:", factors_min, "\n")
## Minimum-CV lambda: 0.03 | Factors retained: 14
## One-SE-rule lambda: 0.065 | Factors retained: 7
Recommendation: Deploy \(\lambda = 0.065\) (one-standard-error rule).
Rationale:
Overfitting risk is severe in finance. With 60 candidate factors and a limited history, the minimum-CV solution (\(\lambda = 0.030\), 14 factors) risks in-sample overfitting. Many of those 14 factors may capture historical noise rather than genuine predictors.
One-SE rule principle. The one-SE rule selects the most parsimonious model whose CV error is within one standard error of the minimum. The 7-factor model is statistically indistinguishable from the 14-factor model on cross-validated performance, yet it is substantially simpler.
Parsimony and out-of-sample robustness. Sparser models tend to generalize better to new data: they accept slightly higher bias in exchange for lower estimation variance. In a backtest context, every extra factor is another coefficient to estimate from the same limited history, which raises the chance of fitting noise.
Interpretability. A 7-factor model is easier to explain to stakeholders and monitor over time for structural breaks.
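The one-SE rule itself is a short computation on the cross-validation path. In the sketch below only the two \(\lambda\) values 0.030 and 0.065 come from the question; the other grid points, the CV errors, and their standard errors are hypothetical numbers chosen purely to illustrate the selection logic.

```r
# Hypothetical CV path (errors and SEs are made up for illustration)
cv_path <- data.frame(
  lambda = c(0.010, 0.030, 0.065, 0.100),
  cv_err = c(1.05,  1.00,  1.03,  1.12),
  cv_se  = c(0.05,  0.04,  0.04,  0.05)
)

i_min     <- which.min(cv_path$cv_err)
threshold <- cv_path$cv_err[i_min] + cv_path$cv_se[i_min]  # min error + 1 SE

lambda_min_sel <- cv_path$lambda[i_min]
# Largest (most regularized) lambda whose CV error is within the threshold
lambda_1se_sel <- max(cv_path$lambda[cv_path$cv_err <= threshold])

cat("Minimum-CV lambda:", lambda_min_sel, "\n")
cat("One-SE lambda:    ", lambda_1se_sel, "\n")
```

In practice these two values are usually reported by the fitting software; for example, `cv.glmnet` in the glmnet package returns them as `lambda.min` and `lambda.1se`.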
Walk-forward scheme:
# Conceptual illustration of walk-forward structure:
n_total <- 60 # total months of data
train0 <- 36 # initial training window
oos_preds <- numeric(n_total - train0)
cat("Walk-forward: training starts with", train0, "months\n")
## Walk-forward: training starts with 36 months
## Out-of-sample predictions: 24 months
for (t in seq(train0, n_total - 1)) {
# train_data <- returns[1:t] # use all data up to t
# fit model, predict t+1
oos_preds[t - train0 + 1] <- t + 1 # placeholder
}
cat("Last in-sample month before each prediction respected.\n")
## Last in-sample month before each prediction respected.
Why standard random k-fold cross-validation is unsafe:
Random k-fold CV shuffles all observations into \(k\) folds regardless of time order. For time-series data, this causes data leakage (look-ahead bias): when month \(t\) falls in the validation fold, the training folds generally contain months \(t+1, t+2, \ldots\). The model “sees the future” during training, which produces optimistically biased performance estimates that do not reflect live deployment. Financial time series also exhibit serial correlation and time-varying volatility, so neighboring observations in the training and validation folds are not independent, which violates the assumption that makes random k-fold CV valid.
Walk-forward CV ensures the model is always trained only on past data when predicting any future observation, exactly mirroring the live trading environment.
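An expanding-window walk-forward loop with an actual model fit might look like the sketch below. The data are simulated, and the one-lag linear regression is only a placeholder forecasting model; both are assumptions made for illustration, not the model from the question.

```r
set.seed(1)
n_total <- 60   # total months of data
train0  <- 36   # initial training window

r_sim <- rnorm(n_total, mean = 0.007, sd = 0.055)   # simulated stand-in returns

# Row i pairs the predictor from month i with the target from month i + 1
dat <- data.frame(y = r_sim[-1], x = r_sim[-n_total])

oos_pred <- numeric(n_total - train0)
for (t in train0:(n_total - 1)) {
  # Train only on pairs whose target month is <= t (no look-ahead)
  fit <- lm(y ~ x, data = dat[1:(t - 1), ])
  # Predict month t + 1 from the month-t predictor
  oos_pred[t - train0 + 1] <- predict(fit, newdata = dat[t, , drop = FALSE])
}

oos_actual <- r_sim[(train0 + 1):n_total]
oos_rmse   <- sqrt(mean((oos_actual - oos_pred)^2))
cat("Out-of-sample predictions:", length(oos_pred), "| RMSE:", round(oos_rmse, 4), "\n")
```

The important detail is the row indexing: when forecasting month \(t+1\), the last training row has month \(t\) as its target, so no observation from the forecast month or later enters the fit.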
End of Examination