Regression model: \[R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\]
Given information:
| Term | Estimate | Std. Error |
|---|---|---|
| Intercept (α) | 0.0017 | 0.0020 |
| Market premium (β) | 0.98 | 0.17 |
\(R^2 = 0.50\), \(n = 96\) months, \(E[R_m - R_f] = 0.70\%\), critical \(|t| \approx 1.98\).
Formula: \[t_\beta = \frac{\hat{\beta} - \beta_0}{SE(\hat{\beta})} = \frac{0.98 - 0}{0.17}\]
# Given values
beta_hat <- 0.98
se_beta <- 0.17
t_crit <- 1.98
# t-statistic for H0: beta = 0
t_beta_0 <- (beta_hat - 0) / se_beta
cat("t-statistic (H0: β = 0):", round(t_beta_0, 4), "\n")
t-statistic (H0: β = 0): 5.7647
Critical value : 1.98
Reject H0? : TRUE
Result: \(t_{\hat{\beta}} = 0.98 / 0.17 = 5.7647\)
Since \(|5.7647| > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level.
Economic Interpretation: \(\hat{\beta} = 0.98\) means the fund moves approximately one-for-one with the broad market. A 1 percentage point increase in the market excess return is associated with a 0.98 percentage point increase in the fund’s excess return, on average. The fund therefore carries market-like systematic risk, with only a marginal underexposure to market swings relative to a pure index (β = 1).
Formula: \[t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} = \frac{0.98 - 1}{0.17}\]
# t-statistic for H0: beta = 1
t_beta_1 <- (beta_hat - 1) / se_beta
cat("t-statistic (H0: β = 1):", round(t_beta_1, 4), "\n")
t-statistic (H0: β = 1): -0.1176
Critical value : 1.98
Reject H0? : FALSE
Result: \(t = (0.98 - 1) / 0.17 = -0.1176\)
Since \(|-0.1176| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% level.
Interpretation: The fund’s systematic risk is statistically indistinguishable from that of the market portfolio. Despite the point estimate being slightly below 1, we cannot conclude the fund bears less or more market risk than the index. In terms of market exposure alone, the fund looks like an index fund; beta by itself says nothing about how much idiosyncratic risk the fund carries.
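Both beta tests can be read off a single confidence interval. The following is a minimal sketch that reuses the reported estimate, standard error, and the rounded critical value 1.98 from the problem; the helper `reject_h0` is an illustrative name, not part of the original solution.

```r
# 95% confidence interval for beta from the reported estimate and standard error
beta_hat <- 0.98
se_beta  <- 0.17
t_crit   <- 1.98   # rounded two-sided 5% critical value given in the problem

ci_beta <- beta_hat + c(-1, 1) * t_crit * se_beta
names(ci_beta) <- c("lower", "upper")
print(round(ci_beta, 4))

# A hypothesized value is rejected at the 5% level exactly when it lies outside the interval
reject_h0 <- function(beta_0) {
  beta_0 < ci_beta[["lower"]] || beta_0 > ci_beta[["upper"]]
}
cat("Reject beta = 0? ", reject_h0(0), "\n")
cat("Reject beta = 1? ", reject_h0(1), "\n")
```

The interval runs from roughly 0.64 to 1.32, so it excludes 0 and contains 1, which matches the two test decisions.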
Formula: \[t_\alpha = \frac{\hat{\alpha}}{SE(\hat{\alpha})} = \frac{0.0017}{0.0020}\]
alpha_hat <- 0.0017
se_alpha <- 0.0020
t_alpha <- alpha_hat / se_alpha
cat("t-statistic (H0: α = 0):", round(t_alpha, 4), "\n")
t-statistic (H0: α = 0): 0.85
Critical value : 1.98
Reject H0? : FALSE
Alpha is positive? : TRUE
Result: \(t_{\hat{\alpha}} = 0.0017 / 0.0020 = 0.85\)
Since \(|0.85| < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% level.
Marketing Claim Assessment: While the point estimate \(\hat{\alpha} = 0.0017 > 0\) is numerically positive, it is not statistically significant. The data do not justify advertising “positive risk-adjusted performance.” The estimated alpha could plausibly be zero (or even negative) due to sampling variability. Advertising such a claim would be misleading, as the evidence is consistent with zero abnormal return after adjusting for market risk.
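The same conclusion follows from a 95% confidence interval built from the reported estimate and standard error, using the rounded critical value 1.98:
\[\hat{\alpha} \pm 1.98 \times SE(\hat{\alpha}) = 0.0017 \pm 1.98(0.0020) = 0.0017 \pm 0.00396 \approx [-0.0023,\ 0.0057]\]
The interval contains zero, so monthly alphas anywhere from about −0.23% to +0.57% are compatible with the data.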
R2 <- 0.50
systematic_pct <- R2 * 100
diversifiable_pct <- (1 - R2) * 100
cat("R-squared :", R2, "\n")
R-squared : 0.5
Systematic variation (%) : 50
Diversifiable (idiosyncratic) (%): 50
Result: \(R^2 = 0.50\)
Interpretation: An \(R^2\) of 0.50 means the market factor explains half of the variance of the fund’s excess returns; the other half is idiosyncratic. That idiosyncratic component is diversifiable: an investor holding this fund alongside other assets could reduce it. An \(R^2\) this low, compared with a typical diversified equity fund, implies significant active bets or concentrated positions.
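To make the variance-share reading of \(R^2\) concrete, the sketch below fits a single-factor regression on simulated data and checks that the fitted and residual variances split the total exactly as \(R^2\) and \(1 - R^2\). All inputs are hypothetical (the simulated returns, the 4.5% market volatility, and the seed are assumptions chosen so that \(R^2\) lands near 0.5); this is not the fund’s actual data.

```r
# Simulated illustration (hypothetical data, not the fund's actual returns)
set.seed(1)
n    <- 96
beta <- 0.98
mkt  <- rnorm(n, mean = 0.007, sd = 0.045)        # assumed market excess returns
eps  <- rnorm(n, mean = 0, sd = beta * 0.045)     # residual sd chosen so R^2 is near 0.5
fund <- beta * mkt + eps

fit <- lm(fund ~ mkt)
r2  <- summary(fit)$r.squared

# Share of fund variance explained by the market vs. left in the residual
systematic_share    <- var(fitted(fit)) / var(fund)
idiosyncratic_share <- var(residuals(fit)) / var(fund)

cat("R-squared           :", round(r2, 4), "\n")
cat("Systematic share    :", round(systematic_share, 4), "\n")
cat("Idiosyncratic share :", round(idiosyncratic_share, 4), "\n")
```

In an OLS regression with an intercept, the two shares always sum to one, which is why the 50/50 split can be read directly from \(R^2 = 0.50\).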
CAPM formula: \[E[R_i - R_f] = \beta \times E[R_m - R_f]\]
E_mkt_premium <- 0.0070 # 0.70% expressed as decimal
E_fund_excess <- beta_hat * E_mkt_premium
cat("β :", beta_hat, "\n")
β : 0.98
E[Rm - Rf] : 0.7 %
CAPM-implied E[Ri - Rf] : 0.686 %
Result: \[E[R_i - R_f] = 0.98 \times 0.70\% = 0.686\%\]
The CAPM predicts the fund should earn a monthly excess return of 0.686%, reflecting the fund’s near-market beta.
Regression model: \[R_i - R_f = \alpha + b \cdot \text{MKT} + s \cdot \text{SMB} + h \cdot \text{HML} + \varepsilon\]
Given information (\(n = 144\)):
| Term | Estimate | Std. Error |
|---|---|---|
| Intercept (α) | 0.0029 | 0.0018 |
| MKT (b) | 0.97 | 0.08 |
| SMB (s) | 0.75 | 0.11 |
| HML (h) | −0.13 | 0.13 |
\(R^2 = 0.92\), Adjusted \(R^2 = 0.918\), critical \(|t| \approx 1.98\).
Formula (general): \[t_k = \frac{\hat{\theta}_k}{SE(\hat{\theta}_k)}\]
# Coefficients and standard errors
coefs <- c(alpha = 0.0029, MKT = 0.97, SMB = 0.75, HML = -0.13)
ses <- c(alpha = 0.0018, MKT = 0.08, SMB = 0.11, HML = 0.13)
t_stats <- coefs / ses
significant <- abs(t_stats) > t_crit
results_df <- data.frame(
Coefficient = names(coefs),
Estimate = coefs,
Std_Error = ses,
t_statistic = round(t_stats, 4),
`|t| > 1.98`= significant,
Significance = ifelse(significant, "★ Significant", "Not significant"),
check.names = FALSE
)
knitr::kable(results_df, row.names = FALSE,
caption = "Fama–French Three-Factor Model: t-statistics")
| Coefficient | Estimate | Std_Error | t_statistic | \|t\| > 1.98 | Significance |
|---|---|---|---|---|---|
| alpha | 0.0029 | 0.0018 | 1.6111 | FALSE | Not significant |
| MKT | 0.9700 | 0.0800 | 12.1250 | TRUE | ★ Significant |
| SMB | 0.7500 | 0.1100 | 6.8182 | TRUE | ★ Significant |
| HML | -0.1300 | 0.1300 | -1.0000 | FALSE | Not significant |
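Summary: MKT and SMB are statistically significant at the 5% level (\(|t| > 1.98\)); the intercept (α) and HML are not.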
s_loading <- 0.75 # SMB loading
h_loading <- -0.13 # HML loading
cat("SMB loading (s):", s_loading, "\n")
SMB loading (s): 0.75
HML loading (h): -0.13
Size tilt : s > 0 → Small-cap tilt
Style tilt : h < 0 → Growth tilt
Style Classification:
| Dimension | Loading | Direction | Interpretation |
|---|---|---|---|
| Size (SMB) | \(s = +0.75\) | Positive & large | Small-cap tilt: the fund behaves like a small-cap fund, with returns that load heavily on the small-minus-big factor |
| Value/Growth (HML) | \(h = -0.13\) | Negative (insignificant) | Growth tilt — negative HML loading is consistent with a growth-oriented mandate; however, this loading is not statistically significant |
Conclusion: The fund is best classified as a small-cap fund with a weak, statistically insignificant growth lean. The dominant and statistically significant style bet is on small-capitalization stocks (\(s = 0.75\), \(t = 6.82\)). The growth tilt exists directionally but cannot be distinguished from zero at the 5% level.
alpha_ff <- 0.0029
se_alpha_ff <- 0.0018
t_alpha_ff <- alpha_ff / se_alpha_ff
cat("FF3 Alpha estimate :", alpha_ff * 100, "%/month\n")
FF3 Alpha estimate : 0.29 %/month
t-statistic : 1.6111
Annualized alpha : 3.48 %/year
Significant? : FALSE
Result: \(t_{\hat{\alpha}} = 0.0029 / 0.0018 = 1.6111\)
Since \(|1.6111| < 1.98\), the intercept is not statistically significant at the 5% level.
Interpretation: The Fama–French alpha of 0.29%/month (≈ 3.48%/year) is economically meaningful in magnitude, but the estimate is too noisy to distinguish from zero statistically. After accounting for market, size, and value exposures, we cannot conclude the manager generates genuine risk-adjusted outperformance. The apparent positive alpha could reflect sampling variation rather than skill.
R2_capm <- 0.75
R2_ff3 <- 0.92
adj_R2 <- 0.918
n_obs <- 144
# Incremental explanatory power
delta_R2 <- R2_ff3 - R2_capm
cat("CAPM R² :", R2_capm, "\n")
CAPM R² : 0.75
FF3 R² : 0.92
Δ R² : 0.17
FF3 Adj. R² : 0.918
# Verify adjusted R2 formula: Adj R2 = 1 - (1-R2)*(n-1)/(n-k-1)
# FF3 has k=3 predictors
k_ff3 <- 3
adj_check <- 1 - (1 - R2_ff3) * (n_obs - 1) / (n_obs - k_ff3 - 1)
cat("Adj R² (manual):", round(adj_check, 4), "\n")
Adj R² (manual): 0.9183
What the jump from 0.75 → 0.92 means:
The increase of \(\Delta R^2 = 0.17\) shows that adding SMB and HML explains a further 17 percentage points of the fund’s return variation beyond what the single market factor captures. Given the t-statistics, the gain comes mainly from the large, significant size loading (\(s = 0.75\)); the HML loading is not statistically different from zero. The CAPM cannot see this size tilt, so the three-factor model gives a far richer description of the fund’s systematic exposures.
Why Adjusted R² is the correct comparison metric:
\(R^2\) mechanically increases whenever predictors are added, even if they contribute nothing. Adding SMB and HML will always raise \(R^2\) whether or not they are genuinely useful. Adjusted \(R^2\) corrects for this by penalizing the addition of each new predictor:
\[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1}\]
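where \(k\) is the number of predictors. Adjusted \(R^2\) rises only when a new variable improves fit by more than chance alone would be expected to. The relevant comparison is across models: applying the formula to the CAPM (\(R^2 = 0.75\), \(k = 1\), and the same \(n = 144\)) gives \(\bar{R}^2 \approx 0.748\), against \(\bar{R}^2 = 0.918\) for the three-factor model. The improvement survives the penalty for two extra parameters, so the added factors are earning their place in the model.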
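The adjusted \(R^2\) formula can be applied to both models for a like-for-like comparison. This is a minimal sketch; the helper name `adj_r2` is illustrative, and it assumes the CAPM \(R^2\) of 0.75 was estimated on the same 144 observations as the three-factor model.

```r
# Adjusted R^2 from R^2, sample size n, and number of predictors k
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

n_obs    <- 144
adj_capm <- adj_r2(0.75, n_obs, k = 1)  # market factor only
adj_ff3  <- adj_r2(0.92, n_obs, k = 3)  # MKT, SMB, HML

cat("CAPM adjusted R^2:", round(adj_capm, 4), "\n")
cat("FF3  adjusted R^2:", round(adj_ff3, 4), "\n")
cat("Difference       :", round(adj_ff3 - adj_capm, 4), "\n")
```

With 144 observations the penalty is small for both models, so the gap in adjusted \(R^2\) stays close to the raw \(\Delta R^2\) of 0.17.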
Model: \[\text{logit}\, P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta\text{VIX}_{t-1}\]
Coefficients: \(\beta_0 = -0.02\), \(\beta_1 = 5.4\), \(\beta_2 = -0.38\)
Today’s inputs: \(r_{t-1} = 0.010\), \(\Delta\text{VIX} = 1.5\)
Steps:
\[\text{logit} = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta\text{VIX}\] \[P(\text{Up}) = \frac{e^{\text{logit}}}{1 + e^{\text{logit}}} = \frac{1}{1 + e^{-\text{logit}}}\]
beta0 <- -0.02
beta1 <- 5.4
beta2 <- -0.38
r_lag <- 0.010
delta_vix <- 1.5
threshold <- 0.5
# Step 1: Compute logit
logit_val <- beta0 + beta1 * r_lag + beta2 * delta_vix
cat("Logit value:\n")
Logit value:
cat(sprintf(" β₀ + β₁·r + β₂·ΔVIX = %.4f + %.4f·%.3f + (%.4f)·%.1f\n",
beta0, beta1, r_lag, beta2, delta_vix))
β₀ + β₁·r + β₂·ΔVIX = -0.0200 + 5.4000·0.010 + (-0.3800)·1.5
= -0.0200 + 0.0540 + (-0.5700)
= -0.5360
# Step 2: Sigmoid transform
prob_up <- 1 / (1 + exp(-logit_val))
cat("P(Up) = 1 / (1 + exp(-logit)):\n")
P(Up) = 1 / (1 + exp(-logit)):
= 1 / (1 + exp(0.5360))
= 0.3691 (36.91%)
# Step 3: Classification
pred_class <- ifelse(prob_up >= threshold, "Up", "Down")
cat("Threshold :", threshold, "\n")
Threshold : 0.5
Predicted class : Down
Result:
\[\text{logit} = -0.02 + 5.4(0.010) + (-0.38)(1.5) = -0.02 + 0.054 - 0.570 = -0.536\]
\[P(\text{Up}) = \frac{1}{1 + e^{-(-0.536)}} = 0.3691 \approx 36.91\%\]
At a 0.5 threshold: Predicted class = Down
β₁ = 5.4 (positive)
β₂ = -0.38 (negative)
Economic Interpretation:
β₁ = +5.4 (Lagged return, positive): A positive \(\beta_1\) captures short-term return continuation. If yesterday’s market return was positive, the log-odds of an “Up” day today increase. This is consistent with short-horizon momentum, that is, positive serial correlation in returns, possibly due to investor underreaction to information, herding, or trend-following strategies.
β₂ = −0.38 (ΔVIX, negative): A negative \(\beta_2\) treats rising fear and uncertainty as a bearish signal. When the VIX (the “fear gauge”) rises sharply, the probability of an “Up” day falls. This is economically intuitive: rising implied volatility signals heightened investor anxiety and increased risk aversion, and often accompanies or precedes market declines. Conversely, falling VIX (negative ΔVIX) is associated with calmer markets and higher up-move probabilities.
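Because the coefficients act on log-odds, exponentiating them gives multiplicative effects on the odds of an “Up” day. The sketch below assumes, as the inputs suggest, that the lagged return is measured in decimals (so 0.01 is one percentage point) and ΔVIX in index points; the variable names `or_return` and `or_vix` are illustrative.

```r
beta1 <- 5.4
beta2 <- -0.38

# Odds multiplier for a lagged return that is 0.01 (one percentage point) higher
or_return <- exp(beta1 * 0.01)
# Odds multiplier for a one-point larger rise in the VIX
or_vix <- exp(beta2 * 1)

cat("Odds ratio, +0.01 lagged return:", round(or_return, 4), "\n")
cat("Odds ratio, +1 VIX point       :", round(or_vix, 4), "\n")
```

Under these unit assumptions, a one-point VIX increase cuts the odds of an up day by roughly a third, while a one-percentage-point higher lagged return raises them by about 5.5%, so the VIX term dominates today’s prediction.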
# Confusion matrix values
TP <- 67 # Predicted Up, Actual Up
FP <- 44 # Predicted Up, Actual Down
FN <- 33 # Predicted Down, Actual Up
TN <- 56 # Predicted Down, Actual Down
N <- TP + FP + FN + TN
cat("Confusion Matrix:\n")
Confusion Matrix:
True Positives (TP): 67
False Positives (FP): 44
False Negatives (FN): 33
True Negatives (TN): 56
Total (N) : 200
# Metrics
accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN) # recall / TPR
specificity <- TN / (TN + FP) # TNR
precision <- TP / (TP + FP) # PPV
cat("─── Performance Metrics ───────────────────\n")
─── Performance Metrics ───────────────────
Accuracy = (TP+TN)/N = (67+56)/200 = 0.6150
Sensitivity = TP/(TP+FN) = 67/(67+33) = 0.6700
Specificity = TN/(TN+FP) = 56/(56+44) = 0.5600
Precision = TP/(TP+FP) = 67/(67+44) = 0.6036
Computed Metrics:
| Metric | Formula | Calculation | Value |
|---|---|---|---|
| Accuracy | \((TP+TN)/N\) | \((67+56)/200\) | 0.615 |
| Sensitivity (Recall) | \(TP/(TP+FN)\) | \(67/(67+33)\) | 0.67 |
| Specificity | \(TN/(TN+FP)\) | \(56/(56+44)\) | 0.56 |
| Precision (PPV) | \(TP/(TP+FP)\) | \(67/(67+44)\) | 0.6036 |
actual_up <- 100
actual_down <- 100
majority_class <- ifelse(actual_up >= actual_down, "Up", "Down")
# Naive accuracy: always predict majority class
naive_accuracy <- max(actual_up, actual_down) / N
cat("Class distribution : Up =", actual_up, ", Down =", actual_down, "\n")
Class distribution : Up = 100 , Down = 100
Majority class : Up
Naive accuracy : 100/200 = 0.5000
Model accuracy : 0.6150
Model beats naive? : TRUE
# Supplementary classification metric: F1 (harmonic mean of precision and recall)
F1 <- 2 * precision * sensitivity / (precision + sensitivity)
cat(sprintf("F1 Score (model) : %.4f\n", F1))
F1 Score (model) : 0.6351
Naive classifier accuracy: \(100/200 = 0.5\) (always predict “Up”)
Model accuracy: 0.615
The model beats the naive rule (by 11.5 percentage points).
Why accuracy alone is inadequate for a trading system:
The dataset is perfectly balanced (100 Up, 100 Down), so accuracy is a fair comparison here. However, in practice:
Class imbalance: Real market data may have more “Up” days than “Down” days. A naive “always Up” rule achieves high accuracy without any real predictive power.
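Asymmetric costs: In trading, false positives and false negatives have very different economic consequences. With “Up” as the positive class, a false positive (predicted Up, actual Down) means being long on a down day and taking a realized loss, whereas a false negative (predicted Down, actual Up) means sitting out an up move and only forgoing a gain.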
Profit-adjusted metrics are more relevant: A more economically meaningful criterion is the strategy’s Sharpe ratio or cumulative P&L when the model is actually used to trade. A classifier with 55% accuracy but high precision on “Up” calls could be far more profitable than one with 62% accuracy but randomly distributed errors.
Better criteria: Precision (for cost of acting on false signals), F1-score (0.6351 here), or ideally a direct backtest-based metric such as the Sharpe ratio of a long/short strategy driven by the model’s predictions.
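To illustrate why accuracy and profit can diverge, the sketch below converts the confusion matrix into a rough P&L for a long-or-flat rule. The payoff sizes `avg_up_gain` and `avg_down_loss` are hypothetical assumptions (they are not given in the problem); only the four counts come from the confusion matrix above.

```r
# Confusion-matrix counts from the problem
TP <- 67; FP <- 44; FN <- 33; TN <- 56

# Hypothetical payoffs (assumptions, not given in the problem):
# average gain on an Up day and average loss on a Down day, in percent
avg_up_gain   <- 1.0
avg_down_loss <- 1.5

# Long-or-flat rule: go long when the model predicts Up, stay in cash otherwise
pnl_model <- TP * avg_up_gain - FP * avg_down_loss
# Always-long benchmark: holds the market on every one of the 200 days
pnl_always_long <- (TP + FN) * avg_up_gain - (FP + TN) * avg_down_loss

cat("Model long/flat P&L (%):", pnl_model, "\n")
cat("Always-long P&L (%)    :", pnl_always_long, "\n")
```

Under these assumed payoffs, the model’s 61.5% accuracy translates into a P&L that is barely positive, because the 44 false positives each cost more than a correct call earns. Changing the assumed payoff asymmetry changes the ranking, which is the reason to evaluate the classifier on a P&L-based criterion.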
Strategy: Monthly returns over \(n = 48\) months. \(\bar{r} = 0.70\%\), \(s = 5.50\%\)
Formulas: \[SR_{\text{monthly}} = \frac{\bar{r}}{s}, \qquad SR_{\text{annual}} = SR_{\text{monthly}} \times \sqrt{12}\]
mean_ret <- 0.0070 # 0.70% monthly
sd_ret <- 0.0550 # 5.50% monthly
n_months <- 48
scale <- 12 # months per year
# Monthly Sharpe
SR_monthly <- mean_ret / sd_ret
# Annualized Sharpe (IID scaling rule)
SR_annual <- SR_monthly * sqrt(scale)
cat("Mean monthly return :", mean_ret * 100, "%\n")
Mean monthly return : 0.7 %
Std dev monthly : 5.5 %
Monthly Sharpe ratio : 0.1273
Scaling factor : sqrt(12) = 3.4641
Annualized Sharpe : 0.4409
Result:
\[SR_{\text{monthly}} = \frac{0.0070}{0.0550} = 0.1273\]
\[SR_{\text{annual}} = 0.1273 \times \sqrt{12} = 0.4409\]
Scaling factor: \(\sqrt{12}\), which is valid under the assumption that monthly returns are i.i.d. (no autocorrelation). This converts the monthly mean-to-volatility ratio to an annualized basis.
set.seed(2024)
# Simulate representative data (matching given moments)
r_sim <- rnorm(n_months, mean = mean_ret, sd = sd_ret)
# IID Bootstrap SE for Sharpe ratio
B <- 10000
boot_SR <- numeric(B)
for (b in seq_len(B)) {
sample_b <- sample(r_sim, size = n_months, replace = TRUE)
boot_SR[b] <- mean(sample_b) / sd(sample_b)
}
boot_se_iid <- sd(boot_SR)
boot_ci <- quantile(boot_SR, c(0.025, 0.975))
cat("Bootstrap Results (IID, B = 10,000 replicates):\n")
Bootstrap Results (IID, B = 10,000 replicates):
Sample Sharpe (monthly): 0.1273
Bootstrap SE : 0.1473
95% CI (percentile) :[ -0.2696 , 0.3113 ]
Bootstrap Procedure (Step-by-Step):
Step 1 — Sample: Collect the \(n = 48\) monthly returns \(\{r_1, r_2, \ldots, r_{48}\}\).
Step 2 — Resample: Draw \(B = 10{,}000\) bootstrap samples. Each bootstrap sample \(r^{(b)}\) is obtained by drawing \(n = 48\) observations with replacement from the original data.
Step 3 — Compute statistic: For each bootstrap sample \(b\), compute the Sharpe ratio: \[SR^{(b)} = \frac{\bar{r}^{(b)}}{s^{(b)}}\]
Step 4 — Estimate SE: The bootstrap standard error is: \[\widehat{SE}_{\text{boot}}(SR) = \text{sd}\left(\{SR^{(1)}, SR^{(2)}, \ldots, SR^{(B)}\}\right)\]
Step 5 — Confidence interval: Use the percentile method: \([Q_{0.025}, Q_{0.975}]\) of the bootstrap distribution.
Why i.i.d. bootstrap is inappropriate for monthly returns:
Monthly financial returns can exhibit autocorrelation and volatility clustering, so returns in adjacent months are not necessarily independent. The standard i.i.d. bootstrap destroys this time-series dependence by sampling observations without regard to their temporal ordering, which leads to underestimated standard errors if positive serial correlation is present.
The fix — Block Bootstrap (e.g., Moving Block Bootstrap or Stationary Bootstrap): Instead of sampling individual observations, sample contiguous blocks of \(\ell\) months (e.g., \(\ell = 3\)–6). This preserves the local autocorrelation structure within each block. The stationary bootstrap (Politis & Romano, 1994) uses random block lengths to ensure stationarity of the resampled series.
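A moving block bootstrap for the Sharpe ratio can be sketched in base R as follows. The return series is simulated from a hypothetical AR(1) process (the persistence `phi`, the volatility, the seed, and the block length of 4 are all assumptions for illustration), and `mbb_sharpe` is an illustrative helper, not a library function.

```r
set.seed(42)

# Hypothetical AR(1) monthly returns (simulated; not the strategy's actual data)
n   <- 48
phi <- 0.3
e   <- rnorm(n, mean = 0, sd = 0.05)
r   <- numeric(n)
r[1] <- e[1]
for (t in 2:n) r[t] <- phi * r[t - 1] + e[t]
r <- r + 0.007

# Moving block bootstrap: stitch together randomly chosen overlapping blocks
mbb_sharpe <- function(x, block_len, B = 2000) {
  n_x      <- length(x)
  n_blocks <- ceiling(n_x / block_len)
  starts   <- seq_len(n_x - block_len + 1)   # admissible block start positions
  replicate(B, {
    s   <- sample(starts, n_blocks, replace = TRUE)
    idx <- as.vector(sapply(s, function(i) i:(i + block_len - 1)))
    xb  <- x[idx[seq_len(n_x)]]              # trim to the original length
    mean(xb) / sd(xb)
  })
}

boot_sr <- mbb_sharpe(r, block_len = 4)
cat("Block-bootstrap SE of monthly Sharpe:", round(sd(boot_sr), 4), "\n")
```

Each resampled series keeps runs of four consecutive months intact, so short-lag dependence inside a block survives the resampling. In practice the block length is a tuning choice that trades off preserving dependence against having enough distinct blocks.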
lambda_min <- 0.030
factors_min <- 14
lambda_1se <- 0.065
factors_1se <- 7
cat("λ_min :", lambda_min, "→", factors_min, "factors retained\n")
λ_min : 0.03 → 14 factors retained
λ_1SE : 0.065 → 7 factors retained
Factor reduction: 7 fewer factors with λ_1SE
Recommended choice: \(\lambda_{1SE} = 0.065\) (7 factors)
Reasoning:
The one-standard-error rule selects the largest \(\lambda\) whose CV error is within one standard error of the minimum. It sacrifices a statistically negligible amount of cross-validated fit in exchange for a sparser, more interpretable model.
Overfitting risk with 60 candidate factors: With 60 candidates and only a moderate number of observations, the minimum-CV solution retaining 14 factors is prone to data-snooping / multiple testing bias. Each of the 60 factors may have been mined from the same historical data, inflating apparent in-sample performance.
Parsimony principle: In a financial backtest, each additional factor introduces an additional source of estimation error, transaction cost, and model fragility. The 7-factor model under \(\lambda_{1SE}\) retains only the predictors whose signal survives the heavier penalty, making it more likely to generalize out-of-sample.
Practical finance concern: Strategies with fewer factors are easier to implement and less susceptible to overfitting to historical noise — a critical concern given the low signal-to-noise ratio in financial data.
Conclusion: Deploy \(\lambda_{1SE} = 0.065\) with 7 factors.
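The one-standard-error rule itself is simple to apply once the cross-validation path is available. The sketch below uses a hypothetical CV table (the error values and standard errors are illustrative numbers, not results from the problem); only the two \(\lambda\) values 0.030 and 0.065 are taken from the text.

```r
# Hypothetical cross-validation path (illustrative numbers, not from the problem)
cv_path <- data.frame(
  lambda = c(0.010, 0.020, 0.030, 0.045, 0.065, 0.090),
  cvm    = c(1.060, 1.020, 1.000, 1.010, 1.030, 1.090),  # mean CV error
  cvsd   = c(0.050, 0.045, 0.040, 0.040, 0.040, 0.045)   # SE of the mean CV error
)

i_min      <- which.min(cv_path$cvm)
lambda_min <- cv_path$lambda[i_min]

# One-standard-error rule: largest lambda whose CV error is within one SE of the minimum
cutoff     <- cv_path$cvm[i_min] + cv_path$cvsd[i_min]
lambda_1se <- max(cv_path$lambda[cv_path$cvm <= cutoff])

cat("lambda_min:", lambda_min, "\n")
cat("lambda_1se:", lambda_1se, "\n")
```

This mirrors the two values that `cv.glmnet` in the glmnet package reports as `lambda.min` and `lambda.1se`; the manual version is shown only to make the selection logic explicit.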
library(ggplot2)
# Illustrate walk-forward scheme
n_total <- 60
min_train <- 36
step_size <- 6
# Build schedule
folds <- list()
train_end <- min_train
fold_id <- 1
# Expanding window: training always starts at month 1 and grows by step_size each fold
while ((train_end + step_size) <= n_total) {
  test_start <- train_end + 1
  test_end <- min(test_start + step_size - 1, n_total)
  folds[[fold_id]] <- data.frame(
    fold = fold_id,
    train_from = 1,
    train_to = train_end,
    test_from = test_start,
    test_to = test_end
  )
  train_end <- train_end + step_size
  fold_id <- fold_id + 1
}
folds_df <- do.call(rbind, folds)
# Plot
plot_df <- rbind(
data.frame(fold = folds_df$fold,
start = folds_df$train_from,
end = folds_df$train_to,
type = "Train"),
data.frame(fold = folds_df$fold,
start = folds_df$test_from,
end = folds_df$test_to,
type = "Test")
)
ggplot(plot_df, aes(xmin = start, xmax = end + 1,
ymin = fold - 0.4, ymax = fold + 0.4,
fill = type)) +
geom_rect(alpha = 0.85, color = "white", linewidth = 0.4) +
scale_fill_manual(values = c("Train" = "#2980b9", "Test" = "#e74c3c")) +
scale_x_continuous(breaks = seq(0, n_total, 6),
labels = paste0("t=", seq(0, n_total, 6))) +
scale_y_continuous(breaks = folds_df$fold,
labels = paste("Fold", folds_df$fold)) +
labs(title = "Walk-Forward (Expanding Window) Cross-Validation",
subtitle = "Training window grows; test window always lies in the future",
x = "Month", y = NULL, fill = "Window") +
theme_minimal(base_size = 12) +
theme(
panel.grid.minor = element_blank(),
plot.title = element_text(face = "bold", size = 13),
plot.subtitle = element_text(color = "gray40"),
legend.position = "top"
)
Walk-Forward Evaluation Scheme:
Step 1 — Initial Training Window: Use the first \(T_0\) months (e.g., 36 months) to fit the LASSO model, selecting \(\lambda\) by CV on this training window only.
Step 2 — Out-of-Sample Test Window: Apply the fitted model to predict returns for the next \(h\) months (e.g., 6 months). Record the strategy’s P&L, Sharpe ratio, and other metrics on this unseen data.
Step 3 — Roll Forward: Advance the forecast origin by \(h\) months. Refit the model on the expanded training set (or, in a rolling scheme, on a fixed-length window that drops the oldest months). Repeat from Step 2 until the data are exhausted.
Step 4 — Aggregate: Concatenate all out-of-sample predictions to form a continuous backtest track record. Compute overall Sharpe ratio, drawdown, and significance tests on this aggregated series.
Why standard random k-fold CV is unsafe here:
Random k-fold CV shuffles observations randomly across folds. For time-series data, this creates data leakage: future data contaminate the training fold and past data appear in the test fold. Specifically:
Look-ahead bias: If a test fold contains month \(t\) and the training fold contains month \(t+6\), the model “sees” future information during training — an impossibility in real trading.
Autocorrelation in errors: With serially dependent data, months adjacent to a test observation end up in the training fold, so the training and test sets are not independent and cross-validation error estimates become overly optimistic.
Walk-forward CV ensures the test set always lies strictly in the future relative to the training set, mimicking real deployment conditions and producing more honest out-of-sample estimates.
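The leakage argument can be checked mechanically. The sketch below counts, for a random 5-fold split and for an expanding-window walk-forward split of 60 months, how many folds have training data dated after the earliest test month. The fold count, seed, and the 36/6 window sizes are illustrative assumptions.

```r
set.seed(7)
n_total <- 60
k       <- 5

# Random k-fold: months are shuffled into folds regardless of date
fold_id <- sample(rep(seq_len(k), length.out = n_total))
leaky_folds <- sapply(seq_len(k), function(f) {
  test_idx  <- which(fold_id == f)
  train_idx <- which(fold_id != f)
  max(train_idx) > min(test_idx)   # TRUE if some training month postdates a test month
})

# Expanding-window walk-forward: train on months 1..train_end, test on the next 6
train_ends <- seq(36, n_total - 6, by = 6)
wf_leaky <- sapply(train_ends, function(train_end) {
  test_idx <- (train_end + 1):(train_end + 6)
  train_end > min(test_idx)        # never TRUE by construction
})

cat("Random k-fold folds with look-ahead :", sum(leaky_folds), "of", k, "\n")
cat("Walk-forward folds with look-ahead  :", sum(wf_leaky), "of", length(train_ends), "\n")
```

A random fold avoids look-ahead only in the unlikely event that it consists of exactly the final months of the sample, so nearly every random fold trains on data from the future of its own test set.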
| Part | Key Result |
|---|---|
| (a) | t_β = 5.7647 → Reject H₀: β=0 |
| (b) | t = -0.1176 → Fail to reject H₀: β=1 |
| (c) | t_α = 0.85 → Not significant; claim unjustified |
| (d) | R² = 0.50: 50% systematic, 50% idiosyncratic |
| (e) | CAPM E[R_i−R_f] = 0.686%/month |
| (f) | MKT & SMB significant; α & HML not |
| (g) | Small-cap growth fund (s=0.75 sig, h=−0.13 insig) |
| (h) | FF3 α: t=1.6111 → Not significant; no proven value |
| (i) | Δ R² = +0.17; Adj R² penalizes extra parameters |
| (j) | P(Up) = 0.3691 → Predicted: Down |
| (k) | β₁>0: momentum; β₂<0: VIX rise → bearish |
| (l) | Acc=0.615 Sens=0.67 Spec=0.56 Prec=0.6036 |
| (m) | Naive acc=0.5; Model beats naive by 11.5pp; use Sharpe as criterion |
| (n) | SR_monthly=0.1273 SR_annual=0.4409 |
| (o) | Block bootstrap preserves serial correlation |
| (p) | Deploy λ=0.065 (7 factors); sparser, less overfit |
| (q) | Walk-forward CV; k-fold leaks future data |