The single-factor (market) model is:
\[R_i - R_f = \alpha + \beta(R_m - R_f) + \varepsilon\]
# Given values
alpha_est <- 0.0017
alpha_se <- 0.0020
beta_est <- 0.98
beta_se <- 0.17
R2_q1 <- 0.50
E_mkt_prem <- 0.0070 # 0.70% expressed as decimal
n_q1 <- 96 # months
t_crit <- 1.98 # critical |t| at 5% level
Formula:
\[t_{\hat{\beta}} = \frac{\hat{\beta} - \beta_0}{SE(\hat{\beta})}\]
Under \(H_0: \beta = 0\), the null value \(\beta_0 = 0\).
t_beta_zero <- (beta_est - 0) / beta_se
cat("t-statistic for H0: beta = 0:", round(t_beta_zero, 4), "\n")
## t-statistic for H0: beta = 0: 5.765
## Critical |t|: 1.98
## Reject H0? TRUE
Decision: Since \(|t| = 5.7647 > 1.98\), we reject \(H_0: \beta = 0\) at the 5% significance level.
Economic Interpretation of β:
\(\hat{\beta} = 0.98\) means that for
every 1% increase in the market excess return, the fund’s excess return
is expected to increase by approximately 0.98%. The
fund has slightly less systematic (market) risk than the market
portfolio itself. A \(\beta < 1\)
indicates a defensive fund: it scales market moves
by a factor of 0.98, so it tends to lag the market slightly in bull markets
and to lose slightly less in bear markets.
Formula:
\[t_{\hat{\beta}=1} = \frac{\hat{\beta} - 1}{SE(\hat{\beta})}\]
t_beta_one <- (beta_est - 1) / beta_se
cat("t-statistic for H0: beta = 1:", round(t_beta_one, 4), "\n")
## t-statistic for H0: beta = 1: -0.1176
## Critical |t|: 1.98
## Reject H0? FALSE
Decision: Since \(|t| = 0.1176 < 1.98\), we fail to reject \(H_0: \beta = 1\) at the 5% level.
Interpretation: Although the point estimate of \(\hat{\beta} = 0.98\) is slightly below 1, it is not statistically distinguishable from 1. The fund’s systematic risk is statistically consistent with having the same market exposure as the market portfolio itself (i.e., consistent with a passive market index fund in terms of systematic risk).
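Both beta tests follow the same recipe, which can be sketched as a small helper. The function name `coef_t_test` and the degrees of freedom (`96 - 2 = 94`, observations minus the two estimated parameters) are assumptions made for illustration; the text itself works with the rounded critical value 1.98.

```r
# Two-sided t-test for a single regression coefficient.
# `df` is an assumption here: n - 2 for a one-factor model with intercept.
coef_t_test <- function(estimate, se, null_value = 0, df, level = 0.05) {
  t_stat  <- (estimate - null_value) / se
  t_crit  <- qt(1 - level / 2, df)      # exact critical value for this df
  p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  list(t_stat = t_stat, t_crit = t_crit, p_value = p_value,
       reject = abs(t_stat) > t_crit)
}

df_q1 <- 96 - 2
test_beta_zero <- coef_t_test(0.98, 0.17, null_value = 0, df = df_q1)
test_beta_one  <- coef_t_test(0.98, 0.17, null_value = 1, df = df_q1)

cat("H0: beta = 0 -> t =", round(test_beta_zero$t_stat, 4),
    "reject:", test_beta_zero$reject, "\n")
cat("H0: beta = 1 -> t =", round(test_beta_one$t_stat, 4),
    "reject:", test_beta_one$reject, "\n")
```

Reporting the p-value alongside the reject/fail-to-reject decision shows how far each statistic sits from the threshold, which a single TRUE/FALSE flag hides.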
Formula:
\[t_{\hat{\alpha}} = \frac{\hat{\alpha}}{SE(\hat{\alpha})}\]
## t-statistic for alpha: 0.85
## Critical |t|: 1.98
## Reject H0 (alpha = 0)? FALSE
Decision: Since \(|t| = 0.85 < 1.98\), we fail to reject \(H_0: \alpha = 0\) at the 5% level.
Assessment of Marketing Claim: The marketing team’s claim of “positive risk-adjusted performance” is not statistically justified. While \(\hat{\alpha} = 0.0017\) (i.e., +0.17% per month above CAPM expectation) is positive in sign, it is indistinguishable from zero at the 5% significance level. The result could easily be due to sampling variation. Advertising positive alpha on this basis would be misleading.
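The same conclusion can be read off a confidence interval for alpha. The sketch below uses the given estimate, standard error, and critical value; the variable names are illustrative.

```r
# 95% confidence interval for the single-factor alpha,
# using the rounded critical value 1.98 from the problem statement.
alpha_hat <- 0.0017
alpha_se  <- 0.0020
t_crit    <- 1.98

ci_alpha <- alpha_hat + c(-1, 1) * t_crit * alpha_se
cat("95% CI for alpha: [", round(ci_alpha[1], 5), ",",
    round(ci_alpha[2], 5), "]\n")
cat("Interval contains zero?", ci_alpha[1] < 0 && ci_alpha[2] > 0, "\n")
```

Because the interval spans zero, a true alpha of zero (or even a small negative alpha) is compatible with the data, which is why the marketing claim is not supported.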
R2_q1_val <- R2_q1
systematic_frac <- R2_q1_val
diversifiable_frac <- 1 - R2_q1_val
cat("R-squared:", R2_q1_val, "\n")
## R-squared: 0.5
cat("Systematic (market) fraction:", round(systematic_frac, 4), "=",
round(systematic_frac * 100, 2), "%\n")
## Systematic (market) fraction: 0.5 = 50 %
cat("Diversifiable (idiosyncratic) fraction:", round(diversifiable_frac, 4), "=",
round(diversifiable_frac * 100, 2), "%\n")
## Diversifiable (idiosyncratic) fraction: 0.5 = 50 %
Interpretation:
\(R^2 = 0.50\) means that 50%
of the variance in the fund’s excess returns is explained by
movements in the market excess return — this is the systematic
component.
The remaining 50% is diversifiable (idiosyncratic) risk — variance attributable to factors specific to this fund (stock selection, sector bets, manager decisions) that are not explained by broad market movements. For a single-factor model, \(R^2\) is also equal to the squared correlation between the fund return and the market return.
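The identity between \(R^2\) and the squared correlation can be checked on simulated data. The return series below are made up (the volatilities are hypothetical and chosen only so that \(R^2\) lands near one half); only the identity itself is the point.

```r
# Simulated illustration: in a one-regressor model with intercept,
# R-squared equals the squared sample correlation.
set.seed(1)
n <- 96
mkt_excess  <- rnorm(n, mean = 0.007, sd = 0.045)            # hypothetical
fund_excess <- 0.0017 + 0.98 * mkt_excess + rnorm(n, sd = 0.045)

fit  <- lm(fund_excess ~ mkt_excess)
r2   <- summary(fit)$r.squared
rho2 <- cor(fund_excess, mkt_excess)^2

cat("R-squared:", round(r2, 4), " squared correlation:", round(rho2, 4), "\n")
cat("Correlation implied by R2 = 0.50:", round(sqrt(0.50), 4), "\n")
```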
Formula:
\[E[R_i - R_f] = \hat{\beta} \times E[R_m - R_f]\]
capm_expected_excess <- beta_est * E_mkt_prem
cat("CAPM-implied expected monthly excess return:",
round(capm_expected_excess, 4), "\n")
## CAPM-implied expected monthly excess return: 0.0069
## In percentage terms: 0.686 %
Result: The CAPM predicts the fund should earn a monthly excess return of:
\[E[R_i - R_f] = 0.98 \times 0.70\% = 0.686\%\]
This is the return the fund must achieve to be fairly compensated for its level of systematic risk. Any realized return above this benchmark would represent alpha.
The Fama–French three-factor model is:
\[R_i - R_f = \alpha + b \cdot MKT + s \cdot SMB + h \cdot HML + \varepsilon\]
# Coefficient estimates
alpha_ff <- 0.0029; alpha_ff_se <- 0.0018
b_mkt <- 0.97; b_mkt_se <- 0.08
s_smb <- 0.75; s_smb_se <- 0.11
h_hml <- -0.13; h_hml_se <- 0.13
R2_ff <- 0.92
adjR2_ff <- 0.918
n_ff <- 144
t_crit_ff <- 1.98
Formula for each coefficient:
\[t_{\hat{\theta}} = \frac{\hat{\theta}}{SE(\hat{\theta})}\]
t_alpha_ff <- alpha_ff / alpha_ff_se
t_b_mkt <- b_mkt / b_mkt_se
t_s_smb <- s_smb / s_smb_se
t_h_hml <- h_hml / h_hml_se
results_ff <- data.frame(
Coefficient = c("Intercept (α)", "MKT (b)", "SMB (s)", "HML (h)"),
Estimate = c(alpha_ff, b_mkt, s_smb, h_hml),
Std_Error = c(alpha_ff_se, b_mkt_se, s_smb_se, h_hml_se),
t_statistic = round(c(t_alpha_ff, t_b_mkt, t_s_smb, t_h_hml), 4),
Significant = abs(c(t_alpha_ff, t_b_mkt, t_s_smb, t_h_hml)) > t_crit_ff
)
library(knitr) # kable()
library(kableExtra) # kable_styling() and the %>% pipe
kable(results_ff, caption = "Fama-French Three-Factor Model: t-statistics") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE)
| Coefficient | Estimate | Std_Error | t_statistic | Significant |
|---|---|---|---|---|
| Intercept (α) | 0.0029 | 0.0018 | 1.611 | FALSE |
| MKT (b) | 0.9700 | 0.0800 | 12.125 | TRUE |
| SMB (s) | 0.7500 | 0.1100 | 6.818 | TRUE |
| HML (h) | -0.1300 | 0.1300 | -1.000 | FALSE |
Summary of significance at 5% level (|t| > 1.98):
for (i in 1:nrow(results_ff)) {
sig_label <- ifelse(results_ff$Significant[i], "SIGNIFICANT", "NOT significant")
cat(sprintf("%-20s t = %7.4f --> %s\n",
results_ff$Coefficient[i],
results_ff$t_statistic[i],
sig_label))
}
## Intercept (α) t = 1.6111 --> NOT significant
## MKT (b) t = 12.1250 --> SIGNIFICANT
## SMB (s) t = 6.8182 --> SIGNIFICANT
## HML (h) t = -1.0000 --> NOT significant
## SMB loading (s): 0.75 -- Sign: POSITIVE, Magnitude: LARGE
## HML loading (h): -0.13 -- Sign: NEGATIVE, Magnitude: SMALL
Size Tilt — SMB loading:
\(\hat{s} = 0.75 > 0\) and
statistically significant. A positive SMB loading means the fund
co-moves with small-cap stocks relative to large-cap
stocks. The fund has a clear small-cap bias.
Value/Growth Tilt — HML loading:
\(\hat{h} = -0.13 < 0\) (negative).
A negative HML loading means the fund co-moves with growth
stocks (low book-to-market) rather than value stocks (high
book-to-market). However, \(h\) is
not statistically significant (\(|t| = 1 < 1.98\)), so this growth tilt
is weak and uncertain.
Conclusion: The fund is best classified as a small-cap, mild growth fund — strongly tilted toward small-capitalization stocks, with a weak (statistically insignificant) inclination toward growth stocks.
## Alpha estimate: 0.0029 (i.e., 0.29 % per month)
## t-statistic for alpha: 1.611
## Critical |t|: 1.98
## Significant? FALSE
annualized_alpha <- (1 + alpha_ff)^12 - 1
cat("Annualized alpha (approx):", round(annualized_alpha * 100, 4), "%\n")
## Annualized alpha (approx): 3.536 %
Interpretation:
\(\hat{\alpha} = 0.0029\) implies
approximately +0.29% per month (roughly 3.54%
annualized) of return above what the three Fama–French factors
predict.
However, with \(t = 1.6111\), this alpha is not statistically significant at the 5% level, because \(|t|\) falls short of the critical value of 1.98. The evidence for genuine managerial skill beyond the three factor exposures is suggestive but not conclusive: we cannot reject the hypothesis that the true alpha is zero, so the manager does not convincingly add value beyond the factor exposures at conventional significance levels.
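Two-sided p-values make the distance from the 5% threshold explicit for each coefficient. The sketch below assumes `144 - 3 - 1 = 140` residual degrees of freedom (observations minus three slopes and an intercept), which is not stated in the problem.

```r
# Two-sided p-values for the Fama-French coefficients.
# df_ff = n - k - 1 is an assumption (144 months, 3 factors, 1 intercept).
est <- c(alpha = 0.0029, MKT = 0.97, SMB = 0.75, HML = -0.13)
se  <- c(alpha = 0.0018, MKT = 0.08, SMB = 0.11, HML = 0.13)
df_ff <- 144 - 3 - 1

t_stat  <- est / se
p_value <- 2 * pt(-abs(t_stat), df_ff)

print(round(cbind(t_stat, p_value), 4))
```

Under these assumptions the alpha p-value comes out at roughly 0.11, above the 0.05 cut-off.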
R2_single <- 0.75
R2_ff_val <- 0.92
adjR2_ff_val <- 0.918
improvement_R2 <- R2_ff_val - R2_single
cat("Single-factor R²:", R2_single, "\n")
## Single-factor R²: 0.75
## FF Three-factor R²: 0.92
## Improvement in R²: 0.17
## FF Adjusted R²: 0.918
Rise from 0.75 to 0.92:
Adding SMB and HML as additional factors increases explained variance by
17 percentage points. This means the SMB and HML
factors collectively capture an additional 17% of the variation in the
fund’s returns that the market factor alone could not explain —
reflecting the fund’s substantial small-cap exposure. The three-factor
model is a materially better description of this fund’s
return-generating process.
Why Adjusted R² is the Appropriate Metric:
\(R^2\) can only increase (or stay the
same) when additional predictors are added, even if those predictors
have no true explanatory power. This mechanical inflation makes raw
\(R^2\) unsuitable for comparing models
with different numbers of predictors.
The adjusted \(R^2\) penalizes for the number of predictors:
\[\bar{R}^2 = 1 - \frac{(1-R^2)(n-1)}{n-k-1}\]
where \(n\) is the number of observations and \(k\) is the number of predictors. It increases only if the new predictor improves fit by more than chance would predict. For the three-factor model, \(\bar{R}^2 = 0.918\) is barely below \(R^2 = 0.92\), so the penalty for the two additional factors is small. Since adjusted \(R^2\) never exceeds \(R^2\), the single-factor model's adjusted value is at most 0.75, well below 0.918, which confirms that the improvement is genuine rather than mechanical.
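The adjusted \(R^2\) formula can be applied to both models for a like-for-like comparison. This sketch assumes the single-factor regression was fitted on the same 144 observations as the three-factor one; `adj_r2` is an illustrative helper name.

```r
# Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

n_obs      <- 144   # assumed identical sample for both models
adj_single <- adj_r2(0.75, n = n_obs, k = 1)
adj_ff     <- adj_r2(0.92, n = n_obs, k = 3)

cat("Adjusted R2, single-factor:", round(adj_single, 4), "\n")
cat("Adjusted R2, three-factor: ", round(adj_ff, 4), "\n")
cat("Gain in adjusted R2:       ", round(adj_ff - adj_single, 4), "\n")
```

The three-factor value reproduces the reported 0.918, and the gap between the two adjusted figures stays close to the raw 17-point gap.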
The logistic regression model is:
\[\text{logit}\, P(\text{Up}) = \beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1}\]
\[P(\text{Up}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 r_{t-1} + \beta_2 \Delta VIX_{t-1})}}\]
Step 1 — Compute the linear predictor (log-odds):
\[\eta = \beta_0 + \beta_1 \cdot r_{t-1} + \beta_2 \cdot \Delta VIX\]
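# Coefficients and inputs, as used in the calculation below
beta0 <- -0.02; beta1 <- 5.4; beta2 <- -0.38
r_lag <- 0.010 # lagged return (decimal)
dvix <- 1.5 # lagged change in VIX
eta <- beta0 + beta1 * r_lag + beta2 * dvix
cat("Linear predictor (log-odds), eta:", round(eta, 4), "\n")
## Linear predictor (log-odds), eta: -0.536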
# Sigmoid function
p_up <- 1 / (1 + exp(-eta))
cat("Predicted probability P(Up):", round(p_up, 4), "\n")
## Predicted probability P(Up): 0.3691
## Predicted class (threshold 0.5): Down
Calculation:
\[\eta = -0.02 + 5.4 \times 0.010 + (-0.38) \times 1.5 = -0.536\]
\[P(\text{Up}) = \frac{1}{1 + e^{-(-0.536)}} = 0.3691\]
Predicted class: Since \(P(\text{Up}) = 0.3691 < 0.5\), the model predicts “Down” for tomorrow.
## beta1 (lagged return): 5.4 -- POSITIVE
## beta2 (delta VIX): -0.38 -- NEGATIVE
β₁ = 5.4 (Positive — Lagged Return):
A positive \(\beta_1\) means that a
higher lagged return increases the log-odds of an “Up”
day tomorrow. This captures a momentum effect: days
following positive returns are more likely to be positive themselves.
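The coefficient looks large, but returns enter in decimal units (a 1%
return is 0.010), so a 1% higher lagged return adds only
\(5.4 \times 0.01 = 0.054\) to the log-odds. That multiplies the odds of an
“Up” day by about \(e^{0.054} \approx 1.06\), a modest increase in probability.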
β₂ = −0.38 (Negative — ΔVIX):
A negative \(\beta_2\) means that an
increase in VIX decreases the probability of an “Up”
day. This captures the fear gauge effect: rising VIX
signals increasing market fear/uncertainty, which is associated with
negative or volatile market returns. When investors become more fearful
(VIX rises), the market is more likely to decline or be turbulent the
next day.
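The size of both effects can be gauged numerically. The sketch below assumes, as in the worked calculation, that the lagged return is entered as a decimal (0.010 for 1%) and ΔVIX in index points; `p_up_fun` is an illustrative helper.

```r
# Effect sizes implied by the logistic coefficients.
beta0 <- -0.02; beta1 <- 5.4; beta2 <- -0.38

# plogis() is the logistic (sigmoid) function in base R
p_up_fun <- function(r_lag, dvix) plogis(beta0 + beta1 * r_lag + beta2 * dvix)

odds_ratio_ret <- exp(beta1 * 0.01)  # odds multiplier for +1% lagged return
odds_ratio_vix <- exp(beta2 * 1)     # odds multiplier for +1 point of dVIX

p_base   <- p_up_fun(r_lag = 0.010, dvix = 1.5)
p_bumped <- p_up_fun(r_lag = 0.020, dvix = 1.5)  # lagged return 1% higher

cat("Odds ratio, +1% lagged return:", round(odds_ratio_ret, 4), "\n")
cat("Odds ratio, +1 point dVIX:    ", round(odds_ratio_vix, 4), "\n")
cat("Change in P(Up) for +1% return:", round(p_bumped - p_base, 4), "\n")
```

On these inputs a one-point rise in ΔVIX moves the odds far more than a one-percent change in the lagged return, so the VIX term dominates the prediction in the worked example.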
# Confusion matrix values
TP <- 67 # Predicted Up, Actual Up
FP <- 44 # Predicted Up, Actual Down
FN <- 33 # Predicted Down, Actual Up
TN <- 56 # Predicted Down, Actual Down
N <- TP + FP + FN + TN
cat("Confusion Matrix:\n")
## Confusion Matrix:
cm_df <- data.frame(
" " = c("Predicted Up", "Predicted Down", "Total"),
Actual_Up = c(TP, FN, TP + FN),
Actual_Down = c(FP, TN, FP + TN),
Total = c(TP + FP, FN + TN, N),
check.names = FALSE
)
kable(cm_df, caption = "Confusion Matrix (200-day holdout)") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
| | Actual_Up | Actual_Down | Total |
|---|---|---|---|
| Predicted Up | 67 | 44 | 111 |
| Predicted Down | 33 | 56 | 89 |
| Total | 100 | 100 | 200 |
Formulas and Calculations:
\[\text{Accuracy} = \frac{TP + TN}{N}\]
\[\text{Sensitivity (Recall)} = \frac{TP}{TP + FN}\]
\[\text{Specificity} = \frac{TN}{TN + FP}\]
\[\text{Precision} = \frac{TP}{TP + FP}\]
accuracy <- (TP + TN) / N
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision <- TP / (TP + FP)
metrics_df <- data.frame(
Metric = c("Accuracy", "Sensitivity (Recall for Up)",
"Specificity", "Precision (for Up)"),
Formula = c("(TP+TN)/N", "TP/(TP+FN)", "TN/(TN+FP)", "TP/(TP+FP)"),
Value = round(c(accuracy, sensitivity, specificity, precision), 4)
)
kable(metrics_df, caption = "Performance Metrics") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
| Metric | Formula | Value |
|---|---|---|
| Accuracy | (TP+TN)/N | 0.6150 |
| Sensitivity (Recall for Up) | TP/(TP+FN) | 0.6700 |
| Specificity | TN/(TN+FP) | 0.5600 |
| Precision (for Up) | TP/(TP+FP) | 0.6036 |
# Naive majority class: predict "Up" always (100 Up, 100 Down -- balanced)
# With balanced classes, majority = either class at 50%
actual_up <- 100
actual_down <- 100
naive_accuracy <- max(actual_up, actual_down) / N
cat("Naive majority-class accuracy:", round(naive_accuracy, 4),
"(", round(naive_accuracy*100, 2), "%)\n")
## Naive majority-class accuracy: 0.5 ( 50 %)
## Model accuracy: 0.615 ( 61.5 %)
## Model beats naive? TRUE
## Improvement: 11.5 percentage points
Naive Accuracy:
The classes are balanced (100 Up, 100 Down). A naive classifier that
always predicts the majority class (either “Up” or “Down”) achieves:
\[\text{Naive Accuracy} = \frac{\max(100, 100)}{200} = 0.5 = 50\%\]
Does the model beat it? Yes — the model achieves 61.5% accuracy vs. the naive 50.0%, a gain of 11.5 percentage points.
Why Accuracy Alone is Inadequate for Trading:
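Accuracy treats all errors equally (a missed “Up” day = a missed “Down” day), but in a trading system the costs of the two error types are asymmetric. For a strategy that goes long on an “Up” signal, a false positive (predicting “Up” before a “Down” day) puts capital into a losing position, whereas a false negative (predicting “Down” before an “Up” day) forgoes a gain.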
These two errors have very different P&L consequences depending on position sizes, transaction costs, and leverage. Accuracy also ignores the magnitude of returns — correctly predicting a +3% day is far more valuable than correctly predicting a +0.1% day.
More Economically Relevant Criterion:
A better metric for a trading system is the Sharpe ratio of the
strategy derived from the model’s signals — this directly
measures risk-adjusted profitability. Alternatively,
precision (of “Up” predictions) is more relevant when
the cost of false positives (bad trades) is high. For directional
trading, the F1 score (harmonic mean of precision and
recall) or Matthews Correlation Coefficient (MCC) are
more balanced than raw accuracy.
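The F1 score and MCC mentioned above follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the variable names are illustrative.

```r
# F1 score and Matthews Correlation Coefficient from the confusion matrix.
TP <- 67; FP <- 44; FN <- 33; TN <- 56

precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)

# F1: harmonic mean of precision and recall
f1 <- 2 * precision * recall / (precision + recall)

# MCC: correlation between predicted and actual labels, in [-1, 1]
mcc <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

cat("F1 score:", round(f1, 4), "\n")
cat("MCC:     ", round(mcc, 4), "\n")
```

An MCC of roughly 0.23 indicates a real but weak association between predictions and outcomes, a more sober reading than the 61.5% accuracy figure alone.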
mu_monthly <- 0.0070 # 0.70% mean monthly return
sigma_monthly <- 0.0550 # 5.50% std dev
n_months <- 48
Monthly Sharpe Ratio formula (assuming risk-free rate is embedded in excess returns):
\[SR_{monthly} = \frac{\bar{r}}{\hat{\sigma}}\]
Annualization: Monthly Sharpe ratios are annualized by multiplying by \(\sqrt{12}\), since returns scale linearly with time while standard deviations scale with the square root of time.
\[SR_{annual} = SR_{monthly} \times \sqrt{12}\]
SR_monthly <- mu_monthly / sigma_monthly
scaling <- sqrt(12)
SR_annual <- SR_monthly * scaling
cat("Mean monthly return:", mu_monthly, "(", mu_monthly*100, "%)\n")
## Mean monthly return: 0.007 ( 0.7 %)
## Monthly std deviation: 0.055 ( 5.5 %)
## Monthly Sharpe Ratio: 0.1273
## Scaling factor: sqrt(12) = 3.464
## Annualized Sharpe Ratio: 0.4409
Results:
\[SR_{monthly} = \frac{0.0070}{0.0550} = 0.1273\]
\[SR_{annual} = 0.1273 \times \sqrt{12} = 0.1273 \times 3.4641 = 0.4409\]
Scaling factor used: \(\sqrt{12}\), based on the i.i.d. returns assumption under which variance scales as \(12\sigma^2\) per year, so standard deviation scales as \(\sqrt{12}\,\sigma\).
Bootstrap Procedure — Step by Step:
steps_df <- data.frame(
Step = 1:6,
Description = c(
"Collect the original sample of n = 48 monthly returns: {r1, r2, ..., r48}",
"Draw B = 1,000 (or 10,000) bootstrap samples, each of size n = 48, by resampling WITH replacement from the original data",
"For each bootstrap sample b, compute the sample mean (mu*_b) and sample std dev (sigma*_b)",
"Compute the bootstrap Sharpe ratio: SR*_b = mu*_b / sigma*_b",
"Repeat steps 2-4 B times to obtain the distribution {SR*_1, SR*_2, ..., SR*_B}",
"The bootstrap standard error of the Sharpe ratio is: SE_boot = std({SR*_1, ..., SR*_B})"
)
)
kable(steps_df, caption = "Bootstrap Procedure for Sharpe Ratio Standard Error") %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
column_spec(1, bold = TRUE, width = "1cm") %>%
column_spec(2, width = "14cm")
| Step | Description |
|---|---|
| 1 | Collect the original sample of n = 48 monthly returns: {r1, r2, …, r48} |
| 2 | Draw B = 1,000 (or 10,000) bootstrap samples, each of size n = 48, by resampling WITH replacement from the original data |
| 3 | For each bootstrap sample b, compute the sample mean (mu*_b) and sample std dev (sigma*_b) |
| 4 | Compute the bootstrap Sharpe ratio: SR*_b = mu*_b / sigma*_b |
| 5 | Repeat steps 2-4 B times to obtain the distribution {SR*_1, SR*_2, …, SR*_B} |
| 6 | The bootstrap standard error of the Sharpe ratio is: SE_boot = std({SR*_1, …, SR*_B}) |
set.seed(42)
# Simulate 48 monthly returns consistent with the problem parameters
returns_sim <- rnorm(n_months, mean = mu_monthly, sd = sigma_monthly)
B <- 5000 # number of bootstrap replications
SR_boot <- numeric(B)
for (b in 1:B) {
boot_sample <- sample(returns_sim, size = n_months, replace = TRUE)
SR_boot[b] <- mean(boot_sample) / sd(boot_sample)
}
SE_boot_monthly <- sd(SR_boot)
cat("Bootstrap SE of monthly Sharpe ratio:", round(SE_boot_monthly, 4), "\n")
## Bootstrap SE of monthly Sharpe ratio: 0.1495
## Bootstrap SE of annualized Sharpe: 0.5178
# Quick visualization
hist(SR_boot * sqrt(12), breaks = 40, col = "steelblue", border = "white",
main = "Bootstrap Distribution of Annualized Sharpe Ratio",
xlab = "Annualized Sharpe Ratio",
sub = paste0("SE = ", round(SE_boot_monthly * sqrt(12), 4)))
abline(v = SR_annual, col = "red", lwd = 2, lty = 2)
legend("topright", legend = c("Sample Sharpe"), col = "red", lty = 2, lwd = 2)
Why i.i.d. Bootstrap is Inappropriate for Monthly Returns:
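Monthly financial returns exhibit serial dependence, typically some autocorrelation in the returns themselves and more persistent dependence in their volatility (volatility clustering).
When these dependencies are present, i.i.d. resampling destroys the time ordering of the data and can understate the true variability of the Sharpe ratio estimator, producing artificially narrow confidence intervals.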
The fix: Use the block bootstrap (specifically, the stationary block bootstrap by Politis & Romano, or the circular block bootstrap). Instead of resampling individual observations, contiguous blocks of consecutive months are resampled. The block length is chosen to be long enough to preserve the serial dependence structure. This maintains the temporal correlation within blocks while still achieving variance estimation consistency.
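A circular block bootstrap can be sketched in base R as follows. This is a minimal illustration, not a tuned implementation: the return series is simulated (as in the i.i.d. example), the block length of 4 months is a hypothetical choice, and `circular_block_sample` is an illustrative helper name.

```r
# Circular block bootstrap for the Sharpe ratio (illustrative sketch).
set.seed(123)
n_obs   <- 48
returns <- rnorm(n_obs, mean = 0.0070, sd = 0.0550)  # simulated placeholder data

circular_block_sample <- function(x, block_len) {
  n        <- length(x)
  n_blocks <- ceiling(n / block_len)
  starts   <- sample.int(n, n_blocks, replace = TRUE)  # random block starts
  # Each block is block_len consecutive indices, wrapping around the end
  idx <- unlist(lapply(starts, function(s) ((s - 1 + 0:(block_len - 1)) %% n) + 1))
  x[idx[seq_len(n)]]                                   # trim to original length
}

B         <- 2000
block_len <- 4   # hypothetical; in practice chosen from the dependence structure
sr_block  <- replicate(B, {
  xb <- circular_block_sample(returns, block_len)
  mean(xb) / sd(xb)
})

se_block_monthly <- sd(sr_block)
cat("Block-bootstrap SE of monthly Sharpe:", round(se_block_monthly, 4), "\n")
cat("Annualized:", round(se_block_monthly * sqrt(12), 4), "\n")
```

With simulated i.i.d. data the block and i.i.d. bootstraps should give similar standard errors; the two are expected to diverge only when the real series is serially dependent.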
lambda_min_cv <- 0.030; factors_min <- 14
lambda_1se <- 0.065; factors_1se <- 7
cat("Min-CV lambda:", lambda_min_cv, "-> factors retained:", factors_min, "\n")
## Min-CV lambda: 0.03 -> factors retained: 14
## 1-SE rule lambda: 0.065 -> factors retained: 7
## Factor reduction: 7 fewer factors with 1-SE rule
Recommendation: Deploy \(\lambda = 0.065\) (the 1-SE rule solution) with 7 factors.
Reasoning:
The 1-SE rule selects the most parsimonious model (highest regularization) whose cross-validation error is within one standard error of the minimum. This is the preferred choice in a financial backtest context for several reasons:
Overfitting / Data Snooping Risk: With 60 candidate factors over a finite backtest, there is a high probability that some of the 14 factors retained by \(\lambda_{min}\) are spurious — they improve in-sample fit through noise-fitting, not true predictive signal. The 1-SE rule’s 7-factor model is less susceptible to this.
Parsimony and Interpretability: Fewer factors are easier to interpret, monitor, and trade. A 7-factor model is more robust to factor collinearity and parameter instability.
Out-of-Sample Generalization: The CV error difference between the two models is not statistically meaningful (within one standard error), yet the simpler model carries substantially lower variance. By the bias-variance tradeoff, the 7-factor model will typically generalize better to new data.
Transaction Costs: More factors often mean more rebalancing signals and higher turnover, increasing trading costs. A sparser model is more practical to implement.
The additional 7 factors in the min-CV model represent marginal in-sample improvement that is likely not recoverable out-of-sample.
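The 1-SE rule itself is a short selection step over the cross-validation curve. In the sketch below the CV errors and their standard errors are hypothetical numbers invented for illustration (the problem reports only the two selected \(\lambda\) values), and `one_se_lambda` is an illustrative helper name.

```r
# 1-SE rule: largest lambda whose CV error is within one SE of the minimum.
one_se_lambda <- function(lambda, cv_mean, cv_se) {
  i_min     <- which.min(cv_mean)
  threshold <- cv_mean[i_min] + cv_se[i_min]
  eligible  <- lambda[cv_mean <= threshold]
  list(lambda_min = lambda[i_min], lambda_1se = max(eligible))
}

# Hypothetical CV curve (NOT from the problem data)
lambda  <- c(0.010, 0.030, 0.065, 0.100)
cv_mean <- c(1.08,  1.00,  1.03,  1.12)
cv_se   <- c(0.05,  0.05,  0.05,  0.05)

choice <- one_se_lambda(lambda, cv_mean, cv_se)
cat("lambda_min:", choice$lambda_min, " lambda_1se:", choice$lambda_1se, "\n")
```

Taking the largest eligible \(\lambda\) is what makes the rule favor sparsity: among models that are statistically tied on CV error, it picks the most heavily regularized one.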
# Illustrative walk-forward scheme
scheme_df <- data.frame(
Window = paste("Window", 1:5),
Training_Period = c("Months 1–36", "Months 1–42", "Months 1–48",
"Months 1–54", "Months 1–60"),
Test_Period = c("Months 37–42", "Months 43–48", "Months 49–54",
"Months 55–60", "Months 61–66"),
Type = rep("Expanding Window", 5)
)
kable(scheme_df, caption = "Illustrative Walk-Forward (Expanding Window) Scheme") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
| Window | Training_Period | Test_Period | Type |
|---|---|---|---|
| Window 1 | Months 1–36 | Months 37–42 | Expanding Window |
| Window 2 | Months 1–42 | Months 43–48 | Expanding Window |
| Window 3 | Months 1–48 | Months 49–54 | Expanding Window |
| Window 4 | Months 1–54 | Months 55–60 | Expanding Window |
| Window 5 | Months 1–60 | Months 61–66 | Expanding Window |
Walk-Forward Procedure — Step by Step:
Note on inner cross-validation for λ selection: Within each training window, a nested time-series cross-validation (or hold-out) must be used to select \(\lambda\) — never using future data.
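The expanding-window scheme in the table can be generated programmatically. The sketch below only builds the train/test index sets; the model fit and the inner \(\lambda\) selection are left as a commented placeholder, and `walk_forward_splits` is an illustrative helper name.

```r
# Expanding-window walk-forward splits (index generation only).
walk_forward_splits <- function(n_obs, initial_train, test_size) {
  starts <- seq(from = initial_train + 1,
                to   = n_obs - test_size + 1,
                by   = test_size)
  lapply(starts, function(s) {
    list(train = 1:(s - 1),               # everything before the test block
         test  = s:(s + test_size - 1))   # next test_size months
  })
}

splits <- walk_forward_splits(n_obs = 66, initial_train = 36, test_size = 6)

for (i in seq_along(splits)) {
  tr <- splits[[i]]$train
  te <- splits[[i]]$test
  # Placeholder: select lambda using only `tr`, fit on `tr`, evaluate on `te`
  cat(sprintf("Window %d: train %d-%d, test %d-%d\n",
              i, min(tr), max(tr), min(te), max(te)))
}
```

Every training set ends strictly before its test block begins, which is the property that rules out look-ahead.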
Why Random k-Fold CV is Unsafe for This Problem:
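Standard random k-fold CV randomly assigns observations to folds, meaning that training folds can contain observations dated after those in the test fold. The model is then fitted on information that would not have been available at prediction time (look-ahead bias), and serial dependence between neighbouring months leaks information across folds, so the CV error is optimistically biased.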
The walk-forward scheme ensures the model only ever trains on data available at the time of prediction, mirroring actual live trading conditions and producing realistic out-of-sample performance estimates.
summary_df <- data.frame(
Question = c(
"Q1(a): β t-stat (H₀:β=0)",
"Q1(b): β t-stat (H₀:β=1)",
"Q1(c): α t-stat",
"Q1(d): Systematic R²",
"Q1(e): CAPM Expected Excess Return",
"Q2(f): Significant Factors",
"Q2(g): Fund Style",
"Q2(h): Manager Alpha",
"Q3(j): P(Up)",
"Q3(l): Accuracy",
"Q3(l): Sensitivity",
"Q3(l): Specificity",
"Q3(l): Precision",
"Q3(m): Naive Accuracy",
"Q4(n): Monthly SR",
"Q4(n): Annualized SR"
),
Result = c(
paste0("t = ", round((0.98-0)/0.17, 4), " → Reject H₀"),
paste0("t = ", round((0.98-1)/0.17, 4), " → Fail to Reject H₀"),
paste0("t = ", round(0.0017/0.0020, 4), " → NOT significant"),
"50% systematic, 50% diversifiable",
paste0(round(0.98*0.0070*100, 4), "% per month"),
"MKT, SMB significant; α borderline; HML not",
"Small-cap, mild growth tilt",
paste0("α = +0.29%/month, NOT significant (t = ", round(0.0029/0.0018, 4), ")"),
paste0(round(1/(1+exp(-(-0.02+5.4*0.010+(-0.38)*1.5))), 4), " → Predict Down"),
paste0(round((67+56)/200, 4)),
paste0(round(67/100, 4)),
paste0(round(56/100, 4)),
paste0(round(67/111, 4)),
"0.5000 (balanced classes)",
paste0(round(0.0070/0.0550, 4)),
paste0(round((0.0070/0.0550)*sqrt(12), 4))
)
)
kable(summary_df, caption = "Complete Summary of Key Results") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE) %>%
row_spec(0, bold = TRUE)
| Question | Result |
|---|---|
| Q1(a): β t-stat (H₀:β=0) | t = 5.7647 → Reject H₀ |
| Q1(b): β t-stat (H₀:β=1) | t = -0.1176 → Fail to Reject H₀ |
| Q1(c): α t-stat | t = 0.85 → NOT significant |
| Q1(d): Systematic R² | 50% systematic, 50% diversifiable |
| Q1(e): CAPM Expected Excess Return | 0.686% per month |
| Q2(f): Significant Factors | MKT, SMB significant; α borderline; HML not |
| Q2(g): Fund Style | Small-cap, mild growth tilt |
| Q2(h): Manager Alpha | α = +0.29%/month, NOT significant (t = 1.6111) |
| Q3(j): P(Up) | 0.3691 → Predict Down |
| Q3(l): Accuracy | 0.615 |
| Q3(l): Sensitivity | 0.67 |
| Q3(l): Specificity | 0.56 |
| Q3(l): Precision | 0.6036 |
| Q3(m): Naive Accuracy | 0.5000 (balanced classes) |
| Q4(n): Monthly SR | 0.1273 |
| Q4(n): Annualized SR | 0.4409 |
Q1 — Single-Factor Model: The fund has market beta statistically indistinguishable from 1.0, and its Jensen’s alpha (+0.17%/month) is not statistically significant. The single-factor model explains 50% of return variation, leaving substantial idiosyncratic risk.
Q2 — Fama–French: Adding SMB and HML lifts explanatory power from R²=0.75 to R²=0.92. The fund is clearly small-cap tilted (strong, significant SMB loading). Alpha of +0.29%/month is economically meaningful but just misses statistical significance — the manager shows promise but not proven skill.
Q3 — Logistic Regression: The model captures momentum (positive β₁) and the fear premium (negative β₂). It achieves 61.5% accuracy vs. 50% naive — meaningful but not dominant. Precision (60.4%) matters more than accuracy for trading applications.
Q4 — Resampling & LASSO: The annualized Sharpe of 0.4409 is estimated from a short 48-month window — bootstrap SEs would reveal substantial uncertainty. The 1-SE LASSO (7 factors) is preferred over the minimum-CV solution (14 factors) to guard against overfitting. Walk-forward validation is essential to avoid look-ahead bias that would render backtest results meaningless.