Given information:
alpha_hat <- 0.0017
se_alpha <- 0.0020
beta_hat <- 0.98
se_beta <- 0.17
R2_q1 <- 0.50
E_mkt <- 0.0070 # 0.70% in decimal
t_crit <- 1.98
The t-statistic is:
\[t_{\hat{\beta}} = \frac{\hat{\beta} - 0}{SE(\hat{\beta})}\]
## t-statistic for H0: beta = 0: 5.7647
## Critical |t|: 1.98
## Decision: REJECT H0
Interpretation: Since \(|t| = 5.7647 > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level. The fund has statistically significant market exposure. Economically, \(\hat{\beta} = 0.98\) means that for every 1 percentage point increase in the market excess return, the fund’s excess return increases by approximately 0.98 percentage points, a nearly one-for-one sensitivity to systematic market movements.
\[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})}\]
## t-statistic for H0: beta = 1: -0.1176
## Decision: FAIL TO REJECT H0
Interpretation: Since \(|t| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\). The fund’s systematic risk is statistically indistinguishable from that of the market portfolio. It does not exhibit significantly amplified or dampened market exposure.
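Both beta tests use the same formula with a different null value. A minimal sketch, assuming a hypothetical helper named `coef_t_test` (not part of the original solution), reproduces the two statistics reported above:

```r
# Hypothetical helper: t-statistic for H0: coefficient = null_value,
# with a two-sided decision against a given critical value.
coef_t_test <- function(estimate, se, null_value = 0, t_crit = 1.98) {
  t_stat <- (estimate - null_value) / se
  list(t_stat = t_stat, reject = abs(t_stat) > t_crit)
}

test_beta_0 <- coef_t_test(0.98, 0.17, null_value = 0)  # H0: beta = 0
test_beta_1 <- coef_t_test(0.98, 0.17, null_value = 1)  # H0: beta = 1

round(test_beta_0$t_stat, 4)  # 5.7647
round(test_beta_1$t_stat, 4)  # -0.1176
```

Only the numerator changes between the two tests; the standard error is the same.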
\[t_{\hat{\alpha}} = \frac{\hat{\alpha}}{SE(\hat{\alpha})}\]
## t-statistic for alpha: 0.85
t_alpha <- alpha_hat / se_alpha
cat("Decision:", ifelse(abs(t_alpha) > t_crit, "REJECT H0 (alpha sig.)", "FAIL TO REJECT H0 (alpha not sig.)"), "\n")
## Decision: FAIL TO REJECT H0 (alpha not sig.)
Interpretation: Since \(|t| = 0.85 < 1.98\), we fail to reject \(H_0: \alpha = 0\). The marketing team’s claim of “positive risk-adjusted performance” is not statistically justified. Although the point estimate is positive (\(\hat{\alpha} = 0.17\%\) per month), it falls within the range of sampling noise and does not constitute reliable evidence of managerial skill.
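The same conclusion can be read off a confidence interval. As a sketch, using the given critical value of 1.98 for an approximate 95% interval (the interval itself is not part of the original solution):

```r
alpha_hat <- 0.0017
se_alpha  <- 0.0020
t_crit    <- 1.98

# Approximate 95% confidence interval: estimate +/- t_crit * SE
ci_alpha <- alpha_hat + c(-1, 1) * t_crit * se_alpha
round(ci_alpha * 100, 3)  # percent per month: -0.226  0.566
```

The interval contains zero, which matches the failure to reject \(H_0: \alpha = 0\).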
systematic <- R2_q1 * 100
diversifiable <- (1 - R2_q1) * 100
cat("Systematic (market-explained) variance:", systematic, "%\n")
## Systematic (market-explained) variance: 50 %
## Idiosyncratic (diversifiable) variance: 50 %
Interpretation: \(R^2 = 0.50\) means 50% of the fund’s return variance is explained by systematic (market) risk, while the remaining 50% is idiosyncratic (diversifiable) risk unique to this fund. A fully diversified portfolio would have \(R^2\) close to 1.0.
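The variance split behind this reading can be written out. For the single-factor regression, with \(\hat{\varepsilon}_i\) denoting the fitted residual (a symbol introduced here for illustration), the in-sample variance of the fund’s excess return separates into a market part and a residual part:

\[\widehat{\text{Var}}(R_i - R_f) = \hat{\beta}^2\,\widehat{\text{Var}}(R_m - R_f) + \widehat{\text{Var}}(\hat{\varepsilon}_i), \qquad R^2 = \frac{\hat{\beta}^2\,\widehat{\text{Var}}(R_m - R_f)}{\widehat{\text{Var}}(R_i - R_f)} = 0.50\]

so the residual share is \(1 - R^2 = 0.50\).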
\[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f]\]
capm_implied <- beta_hat * E_mkt
cat("CAPM-implied monthly excess return:", round(capm_implied * 100, 4), "%\n")
## CAPM-implied monthly excess return: 0.686 %
The CAPM-implied expected monthly excess return is 0.686%.
Given information:
library(knitr)
coef_vals <- c(0.0029, 0.97, 0.75, -0.13)
se_vals <- c(0.0018, 0.08, 0.11, 0.13)
names_v <- c("Intercept (α)", "MKT (b)", "SMB (s)", "HML (h)")
\[t_j = \frac{\hat{\theta}_j}{SE(\hat{\theta}_j)}\]
t_stats <- coef_vals / se_vals
significant <- ifelse(abs(t_stats) > t_crit, "Yes ✓", "No ✗")
results_q2 <- data.frame(
Term = names_v,
Estimate = coef_vals,
Std.Error = se_vals,
t_statistic = round(t_stats, 4),
Significant = significant
)
kable(results_q2, caption = "Fama-French Three-Factor Model: t-statistics",
col.names = c("Term", "Estimate", "Std. Error", "t-statistic", "Significant (5%)"),
align = "lrrrr")

| Term | Estimate | Std. Error | t-statistic | Significant (5%) |
|---|---|---|---|---|
| Intercept (α) | 0.0029 | 0.0018 | 1.6111 | No ✗ |
| MKT (b) | 0.9700 | 0.0800 | 12.1250 | Yes ✓ |
| SMB (s) | 0.7500 | 0.1100 | 6.8182 | Yes ✓ |
| HML (h) | -0.1300 | 0.1300 | -1.0000 | No ✗ |
Significant at 5%: MKT (\(b\)) and SMB (\(s\)) are statistically significant. Alpha (\(\alpha\)) and HML (\(h\)) are not significant.
## SMB loading (s): 0.75 -> Positive and significant
## HML loading (h): -0.13 -> Negative and not significant
##
## Style classification: Small-cap Growth Fund
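Classification: Small-cap Growth Fund. The small-cap tilt is statistically significant; the growth tilt rests only on the sign of an HML loading that is not significant at the 5% level, so it is tentative.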
## Alpha: 0.0029 ( 0.29 % per month)
## t-statistic: 1.6111
t_alpha_ff <- t_stats[1] # t-statistic of the intercept from the table above
cat("Decision:", ifelse(abs(t_alpha_ff) > t_crit,
"REJECT H0 — manager adds value",
"FAIL TO REJECT H0 — no evidence of skill"), "\n")
## Decision: FAIL TO REJECT H0 — no evidence of skill
Interpretation: \(\hat{\alpha} = 0.29\%\) per month, but \(t = 1.6111\) is below the critical value of 1.98, so we fail to reject \(H_0: \alpha = 0\). There is no statistically significant evidence that the manager adds value beyond the three factor exposures. The positive alpha may be attributable to chance.
R2_capm <- 0.75
R2_ff <- 0.92
adj_R2 <- 0.918
n <- 144
k_capm <- 1 # predictors in CAPM
k_ff <- 3 # predictors in FF3
# Manually verify Adjusted R2 for FF3
adj_R2_calc <- 1 - ((1 - R2_ff) * (n - 1)) / (n - k_ff - 1)
cat("Increase in R2:", R2_ff - R2_capm, "\n")
## Increase in R2: 0.17
## FF3 Adjusted R2 (calculated): 0.9183
## FF3 Adjusted R2 (reported): 0.918
Interpretation: The jump from \(R^2 = 0.75\) (CAPM) to \(R^2 = 0.92\) (FF3) shows that adding SMB and HML explains a further 17 percentage points of return variance beyond market exposure alone. The CAPM regression omitted the fund’s style exposures, chiefly its significant small-cap tilt.
Why Adjusted R²: Raw \(R^2\) never decreases when predictors are added, even if they are pure noise. Adjusted \(R^2\) penalizes each additional predictor:
\[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1}\]
It rises only when a new variable adds explanatory power beyond chance, making it the appropriate metric for comparing models with different numbers of predictors.
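To compare the two models on an equal footing, the formula above can be applied to both. A small sketch, assuming the CAPM regression was fitted on the same \(n = 144\) observations (the helper name `adj_r2` is hypothetical):

```r
# Adjusted R-squared for a model with k predictors fitted on n observations
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

adj_capm <- adj_r2(0.75, n = 144, k = 1)  # CAPM: market factor only
adj_ff3  <- adj_r2(0.92, n = 144, k = 3)  # FF3: MKT, SMB, HML

round(c(CAPM = adj_capm, FF3 = adj_ff3), 4)  # 0.7482 0.9183
```

The penalty is small at this sample size, so the ranking of the two models is unchanged after adjustment.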
Given: \(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\)
Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta VIX_{t-1} = 1.5\)
\[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]
\[P(\text{Up}) = \frac{1}{1 + e^{-\text{logit}}}\]
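b0 <- -0.02; b1 <- 5.4; b2 <- -0.38 # coefficients from the "Given" line
r_lag <- 0.010; dVIX <- 1.5 # today's inputs
logit_val <- b0 + b1 * r_lag + b2 * dVIX
prob_up <- 1 / (1 + exp(-logit_val))
pred_class <- ifelse(prob_up >= 0.5, "Up", "Down")
cat("logit value:", round(logit_val, 4), "\n")
## logit value: -0.536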
## P(Up): 0.3691
## Predicted class (threshold = 0.5): Down
## Verified with plogis(): 0.3691
The predicted probability is \(P(\text{Up}) = 0.3691\). Since \(0.3691< 0.5\), the predicted class is “Down”.
\(\beta_1 = +5.4\) (lagged return \(r_{t-1}\)): A positive sign means a higher lagged return increases the probability of an “Up” day tomorrow. This captures short-term momentum — recent market gains predict continued upward movement.
\(\beta_2 = -0.38\) (\(\Delta VIX_{t-1}\)): A negative sign means a rise in the VIX (increasing fear and uncertainty) decreases the probability of an “Up” day. This captures risk-off behavior — when volatility spikes, markets tend to decline.
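The sign interpretations can be made concrete as odds ratios, since each logit coefficient is a change in log-odds. A brief sketch (the variable names are illustrative, and the one-percentage-point step for the lagged return is an assumed unit of comparison):

```r
b1 <- 5.4    # coefficient on lagged return (return measured in decimals)
b2 <- -0.38  # coefficient on lagged change in VIX

# Multiplicative change in the odds of an "Up" day
or_return <- exp(b1 * 0.01)  # lagged return higher by 0.01 (one percentage point)
or_vix    <- exp(b2 * 1)     # VIX change higher by one point

round(c(or_return = or_return, or_vix = or_vix), 4)  # 1.0555 0.6839
```

A one-point rise in the VIX cuts the odds of an “Up” day by roughly a third, while a one-percentage-point higher lagged return raises them by about 5.5%. With today’s inputs the VIX term therefore dominates.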
# Confusion matrix values
TP <- 67 # Predicted Up, Actual Up
FP <- 44 # Predicted Up, Actual Down
FN <- 33 # Predicted Down, Actual Up
TN <- 56 # Predicted Down, Actual Down
N <- TP + FP + FN + TN
# Display confusion matrix
cm <- matrix(c(TP, FP, FN, TN), nrow = 2, byrow = TRUE,
dimnames = list(c("Predicted Up", "Predicted Down"),
c("Actual Up", "Actual Down")))
kable(cm, caption = "Confusion Matrix (200-day hold-out test set)")

| | Actual Up | Actual Down |
|---|---|---|
| Predicted Up | 67 | 44 |
| Predicted Down | 33 | 56 |
# Metrics
accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN) # True positive rate for "Up"
specificity <- TN / (FP + TN) # True negative rate for "Down"
precision <- TP / (TP + FP) # Precision for "Up"
metrics_df <- data.frame(
Metric = c("Accuracy", "Sensitivity (Recall for Up)",
"Specificity", "Precision (for Up)"),
Formula = c("(TP+TN)/N", "TP/(TP+FN)", "TN/(FP+TN)", "TP/(TP+FP)"),
Value = round(c(accuracy, sensitivity, specificity, precision), 4)
)
kable(metrics_df, caption = "Classification Metrics")

| Metric | Formula | Value |
|---|---|---|
| Accuracy | (TP+TN)/N | 0.6150 |
| Sensitivity (Recall for Up) | TP/(TP+FN) | 0.6700 |
| Specificity | TN/(FP+TN) | 0.5600 |
| Precision (for Up) | TP/(TP+FP) | 0.6036 |
# Balanced dataset: 100 Up, 100 Down
# Naive majority-class rule always predicts "Up" (or "Down" — same result)
naive_accuracy <- 100 / 200
cat("Naive majority-class accuracy:", naive_accuracy, "\n")
## Naive majority-class accuracy: 0.5
## Model accuracy: 0.615
## Model beats naive rule: TRUE
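Whether a 61.5% hit rate on 200 days is distinguishable from the 50% naive rate can be checked with a one-sided test of a proportion. This is a sketch that is not part of the original solution, and it assumes the 200 test-day outcomes are independent, which daily market data may violate:

```r
correct <- 67 + 56  # TP + TN from the confusion matrix
n_test  <- 200
p0      <- 0.5      # naive majority-class accuracy

acc <- correct / n_test
# Normal-approximation z-statistic for H0: accuracy = p0
z <- (acc - p0) / sqrt(p0 * (1 - p0) / n_test)
round(z, 2)  # 3.25

# Exact one-sided binomial test as a cross-check
p_exact <- binom.test(correct, n_test, p = p0, alternative = "greater")$p.value
```

A z-statistic of about 3.25 suggests the edge over the naive rule is unlikely to be pure chance under the independence assumption.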
Why accuracy alone is inadequate for a trading system:
More economically relevant criterion: The Sharpe ratio of the resulting trading strategy — it directly measures risk-adjusted profitability and reflects the actual P&L impact of each correct and incorrect prediction.
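As an illustration of this criterion, the sketch below computes the Sharpe ratio of a long/short strategy driven by model signals. All inputs are simulated and hypothetical (no returns or signals from the test set are given in the source), and the \(\sqrt{252}\) annualization assumes i.i.d. daily returns:

```r
set.seed(1)
n_days <- 200

# Hypothetical data: simulated daily returns and random model signals
# (+1 = long on a predicted "Up" day, -1 = short on a predicted "Down" day)
daily_ret <- rnorm(n_days, mean = 0.0003, sd = 0.01)
signal    <- sample(c(1, -1), n_days, replace = TRUE)

# Strategy P&L: position taken times the realized return
strat_ret <- signal * daily_ret

sharpe_daily  <- mean(strat_ret) / sd(strat_ret)
sharpe_annual <- sharpe_daily * sqrt(252)
```

Unlike accuracy, this measure weights each prediction by the size of the move it captures or misses.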
Given: \(\bar{r} = 0.70\%\), \(\hat{\sigma} = 5.50\%\), \(n = 48\) months
\[SR_{\text{monthly}} = \frac{\bar{r}}{\hat{\sigma}}\]
\[SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]
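mu_hat <- 0.0070 # mean monthly return from "Given", assumed stored in decimals
sd_hat <- 0.0550 # monthly standard deviation from "Given", in decimals
n_obs <- 48 # number of monthly observations
SR_monthly <- mu_hat / sd_hat
SR_annual <- SR_monthly * sqrt(12)
cat("Monthly Sharpe Ratio:", round(SR_monthly, 4), "\n")
## Monthly Sharpe Ratio: 0.1273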
## Scaling factor: 3.4641 (= sqrt(12))
## Annualized Sharpe Ratio: 0.4409
The scaling factor is \(\sqrt{12}\), derived from the assumption that monthly returns are i.i.d., so variance scales linearly with time and standard deviation scales with \(\sqrt{T}\).
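The scaling step can be written out explicitly. As a sketch, assuming i.i.d. monthly returns that simply add up across months (a simplification that ignores compounding), the 12-month mean is \(12\,\bar{r}\) and the 12-month variance is \(12\,\hat{\sigma}^2\), so

\[SR_{\text{annual}} = \frac{12\,\bar{r}}{\sqrt{12}\,\hat{\sigma}} = \sqrt{12} \times \frac{\bar{r}}{\hat{\sigma}} = \sqrt{12} \times SR_{\text{monthly}} = 3.4641 \times \frac{0.70}{5.50} \approx 0.4409\]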
| Metric | Value |
|---|---|
| Monthly Sharpe Ratio | 0.1273 |
| Annualized Sharpe Ratio | 0.4409 |
set.seed(42)
# Simulate monthly returns consistent with given stats
sim_returns <- rnorm(n_obs, mean = mu_hat, sd = sd_hat)
# i.i.d. bootstrap (for illustration — inappropriate for time series)
B <- 5000
boot_SR <- numeric(B)
for (i in 1:B) {
resample <- sample(sim_returns, size = n_obs, replace = TRUE)
boot_SR[i] <- mean(resample) / sd(resample)
}
SE_iid <- sd(boot_SR)
cat("i.i.d. Bootstrap SE of monthly SR:", round(SE_iid, 4), "\n")
## i.i.d. Bootstrap SE of monthly SR: 0.1495
cat("95% CI (iid): [",
round(SR_monthly - 1.96 * SE_iid, 4), ",",
round(SR_monthly + 1.96 * SE_iid, 4), "]\n")
## 95% CI (iid): [ -0.1657 , 0.4203 ]
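As a rough cross-check on the bootstrap figure, the large-sample approximation \(SE(SR) \approx \sqrt{(1 + SR^2/2)/n}\) can be evaluated directly. It holds for i.i.d. normally distributed returns and is not part of the original solution:

```r
sr_monthly <- 0.70 / 5.50  # monthly Sharpe ratio from the given statistics
n_months   <- 48

# Large-sample standard error of the Sharpe ratio under i.i.d. normal returns
se_analytic <- sqrt((1 + 0.5 * sr_monthly^2) / n_months)
round(se_analytic, 4)  # 0.1449
```

The analytic value of about 0.145 is close to the i.i.d. bootstrap estimate, as expected when both rest on the same independence assumption.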
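Step-by-step i.i.d. bootstrap procedure: (1) draw \(n = 48\) monthly returns with replacement from the sample; (2) compute the monthly Sharpe ratio of the resample; (3) repeat \(B = 5000\) times; (4) take the standard deviation of the 5000 bootstrap Sharpe ratios as the standard error; (5) form the 95% interval as the point estimate \(\pm 1.96\) standard errors.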
Why i.i.d. bootstrap is inappropriate: Monthly returns exhibit serial correlation (momentum, mean-reversion) and volatility clustering (GARCH effects). The i.i.d. bootstrap destroys the time-series dependence by sampling independently, understating true estimation uncertainty.
Fix: Block bootstrap (e.g., tsboot() in R). This resamples contiguous blocks of consecutive observations (e.g., blocks of length 4–6 months), preserving local autocorrelation while still providing non-parametric uncertainty estimates.
# Block bootstrap example (requires 'boot' package)
library(boot)
# tsboot() passes each resampled series to the statistic as its only argument
SR_func <- function(x) {
  mean(x) / sd(x)
}
# Moving block bootstrap
boot_block <- tsboot(sim_returns, SR_func, R = 5000,
l = 6, sim = "fixed")
cat("Block Bootstrap SE:", round(sd(boot_block$t), 4), "\n")
lambda_minCV <- 0.030
factors_min <- 14
lambda_1se <- 0.065
factors_1se <- 7
lasso_df <- data.frame(
Rule = c("Minimum CV error", "One-standard-error rule"),
Lambda = c(lambda_minCV, lambda_1se),
Factors = c(factors_min, factors_1se),
Recommended = c("No", "Yes ✓")
)
kable(lasso_df, caption = "LASSO Lambda Selection")

| Rule | Lambda | Factors | Recommended |
|---|---|---|---|
| Minimum CV error | 0.030 | 14 | No |
| One-standard-error rule | 0.065 | 7 | Yes ✓ |
Recommended: \(\lambda = 0.065\) (one-SE rule, 7 factors)
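The one-standard-error rule picks the largest \(\lambda\) whose cross-validated error is within one standard error of the minimum. A minimal sketch on a hypothetical CV curve follows. The error and standard-error values are invented for illustration and chosen only so that the selected values match the table above:

```r
# Hypothetical cross-validation results (cv_err and cv_se are invented)
cv <- data.frame(
  lambda = c(0.010, 0.030, 0.065, 0.100),
  cv_err = c(1.08, 1.00, 1.03, 1.15),
  cv_se  = c(0.05, 0.05, 0.05, 0.06)
)

i_min      <- which.min(cv$cv_err)
lambda_min <- cv$lambda[i_min]

# Largest lambda whose CV error is within one SE of the minimum error
threshold  <- cv$cv_err[i_min] + cv$cv_se[i_min]
lambda_1se <- max(cv$lambda[cv$cv_err <= threshold])

c(lambda_min = lambda_min, lambda_1se = lambda_1se)  # 0.030 0.065
```

With the glmnet package, `cv.glmnet()` reports these two values as `lambda.min` and `lambda.1se`.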
Reasons to prefer the more parsimonious model:
# Illustrate walk-forward splits
total_months <- 60
train_init <- 36
test_window <- 6
splits <- data.frame(
Fold = 1:4,
Train_Start = 1,
Train_End = c(36, 42, 48, 54),
Test_Start = c(37, 43, 49, 55),
Test_End = c(42, 48, 54, 60)
)
kable(splits, caption = "Walk-Forward (Expanding Window) Splits",
col.names = c("Fold", "Train Start", "Train End", "Test Start", "Test End"))

| Fold | Train Start | Train End | Test Start | Test End |
|---|---|---|---|---|
| 1 | 1 | 36 | 37 | 42 |
| 2 | 1 | 42 | 43 | 48 |
| 3 | 1 | 48 | 49 | 54 |
| 4 | 1 | 54 | 55 | 60 |
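Walk-forward procedure: (1) train on months 1 to 36 and test on months 37 to 42; (2) add the six months just tested to the training window and test on the next six; (3) repeat until month 60, giving four folds; (4) aggregate the out-of-sample results across folds.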
Why standard k-fold is unsafe: Random k-fold shuffles the data before splitting, so a training fold can contain observations dated after those in the validation fold. This is look-ahead bias: the model implicitly “sees” future data during training, producing inflated performance metrics that cannot be replicated in live trading. Walk-forward validation strictly enforces temporal ordering, so the model is always evaluated on data that come after everything it was trained on.
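The expanding-window splits in the table can also be generated rather than typed by hand. A sketch, assuming a hypothetical helper named `walk_forward_splits`:

```r
# Hypothetical helper: expanding-window walk-forward splits.
# Training always starts at month 1; each test window follows the training data.
walk_forward_splits <- function(total, train_init, test_window) {
  test_start <- seq(train_init + 1, total - test_window + 1, by = test_window)
  data.frame(
    Fold        = seq_along(test_start),
    Train_Start = 1,
    Train_End   = test_start - 1,
    Test_Start  = test_start,
    Test_End    = test_start + test_window - 1
  )
}

wf_splits <- walk_forward_splits(total = 60, train_init = 36, test_window = 6)
wf_splits$Test_End  # 42 48 54 60
```

Because every `Train_End` is strictly less than its `Test_Start`, no fold trains on data from its own test period or later.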
summary_df <- data.frame(
Question = c("Q1(a)", "Q1(b)", "Q1(c)", "Q1(e)",
"Q2(f) t(α)", "Q2(f) t(b)", "Q2(f) t(s)", "Q2(f) t(h)",
"Q3(j) logit", "Q3(j) P(Up)",
"Q3(l) Accuracy", "Q3(l) Sensitivity", "Q3(l) Specificity", "Q3(l) Precision",
"Q4(n) SR monthly", "Q4(n) SR annual"),
Result = c(
round(t_beta_0, 4), round(t_beta_1, 4), round(t_alpha, 4),
paste0(round(capm_implied * 100, 4), "%"),
round(t_stats[1], 4), round(t_stats[2], 4),
round(t_stats[3], 4), round(t_stats[4], 4),
round(logit_val, 4), round(prob_up, 4),
round(accuracy, 4), round(sensitivity, 4),
round(specificity, 4), round(precision, 4),
round(SR_monthly, 4), round(SR_annual, 4)
),
Decision = c(
"Reject H0 (sig.)", "Fail to reject H0", "Fail to reject H0 (no skill)", "—",
"Not sig.", "Significant ✓", "Significant ✓", "Not sig.",
"—", "Predict: Down",
"Beats naive (50%)", "—", "—", "—",
"—", "—"
)
)
kable(summary_df, caption = "Summary of All Key Numerical Results",
col.names = c("Question", "Result", "Decision / Note"))

| Question | Result | Decision / Note |
|---|---|---|
| Q1(a) | 5.7647 | Reject H0 (sig.) |
| Q1(b) | -0.1176 | Fail to reject H0 |
| Q1(c) | 0.85 | Fail to reject H0 (no skill) |
| Q1(e) | 0.686% | — |
| Q2(f) t(α) | 1.6111 | Not sig. |
| Q2(f) t(b) | 12.125 | Significant ✓ |
| Q2(f) t(s) | 6.8182 | Significant ✓ |
| Q2(f) t(h) | -1 | Not sig. |
| Q3(j) logit | -0.536 | — |
| Q3(j) P(Up) | 0.3691 | Predict: Down |
| Q3(l) Accuracy | 0.615 | Beats naive (50%) |
| Q3(l) Sensitivity | 0.67 | — |
| Q3(l) Specificity | 0.56 | — |
| Q3(l) Precision | 0.6036 | — |
| Q4(n) SR monthly | 0.1273 | — |
| Q4(n) SR annual | 0.4409 | — |
End of Examination