library(tidyquant)
library(tidyverse)
library(timetk)
library(lubridate)
library(PerformanceAnalytics)
library(quadprog)
library(ggplot2)
library(knitr)
library(kableExtra)
library(dplyr)
library(purrr)
library(broom)
library(xts)

1 Part 1 — Textbook Questions (60%)

1.1 Chapter 7: Optimal Risky Portfolios

1.1.1 CFA Problem 1

Context: Hennessy & Associates manages a $30 million equity portfolio holding approximately 40 stocks, committing 2–3% of total funds to each issue. Jason Jones proposes limiting the portfolio to no more than 20 stocks, doubling commitments to favored issues.

(a) Will limiting to 20 stocks likely increase or decrease portfolio risk? Explain.

Limiting the portfolio from 40 to 20 stocks will almost certainly increase total portfolio risk (measured by portfolio standard deviation), assuming the reduction is not accompanied by a deliberate reduction in correlation. Modern portfolio theory shows that as the number of holdings increases, the idiosyncratic (firm-specific) risk of each individual security is diversified away. The portfolio variance for an equally weighted portfolio of \(n\) assets with average variance \(\bar{\sigma}^2\) and average covariance \(\bar{\sigma}_{ij}\) is:

\[\sigma_p^2 = \frac{1}{n}\bar{\sigma}^2 + \frac{n-1}{n}\bar{\sigma}_{ij}\]

As \(n \to \infty\), the first term (unique risk) vanishes and only systematic covariance risk remains. Moving from 40 to 20 stocks increases the \(\frac{1}{n}\) term, raising idiosyncratic risk. Unless the 20 retained stocks happen to have very low pairwise correlations, total portfolio volatility will rise.

(b) Could Hennessy reduce from 40 to 20 without significantly affecting risk?

Yes, but only under a specific condition. If Hennessy identifies 20 stocks whose pairwise correlations are sufficiently low, the diversification benefit per stock is high enough that dropping the other 20 does not materially increase portfolio variance. In practice, since Hennessy is a “bottom-up” stock picker focusing on individual security selection rather than industry tilts, the 20 favored stocks likely share similar factor exposures (e.g., value or quality characteristics), which tend to raise inter-stock correlations. Therefore, while theoretically possible, it would in practice be difficult to achieve without careful correlation analysis.


1.1.2 CFA Problem 2

Why might further reduction from 20 to 10 stocks be less advantageous?

The diversification benefit of adding each incremental security is a convex, decreasing function of \(n\). The largest reduction in idiosyncratic risk occurs when moving from very small portfolios (e.g., 1 to 5 stocks). By the time a portfolio holds 20 stocks, a substantial fraction of diversifiable risk has already been eliminated. Cutting from 20 to 10 provides considerably less incremental diversification improvement than cutting from 40 to 20 — but imposes a cost in higher residual variance from the dropped positions.

Formally, the marginal reduction in variance from adding the \(n\)-th stock declines at rate \(1/n^2\):

\[\frac{d}{dn}\left(\frac{1}{n}\bar{\sigma}^2\right) = -\frac{\bar{\sigma}^2}{n^2}\]

At \(n = 20\), this marginal benefit is already small, while at \(n = 10\), removing securities significantly increases the idiosyncratic variance term. Meanwhile, if Wilstead evaluates Hennessy independently, concentrating into 10 stocks dramatically raises tracking error relative to any benchmark, increasing the probability of a significant underperformance year — which matters asymmetrically to a manager being evaluated on standalone results.


1.1.3 CFA Problem 3

How would evaluating Hennessy in the context of the total fund change the committee’s decision?

When Hennessy’s portfolio is evaluated within the context of the $280 million total Wilstead fund (Hennessy’s $30M plus the other five managers’ $250M across 150+ issues), the relevant question is no longer the standalone risk of Hennessy’s portfolio, but rather Hennessy’s marginal contribution to total fund risk — specifically, the covariance of Hennessy’s portfolio with the other managers’ holdings.

If Hennessy’s 10 or 20 concentrated holdings have low correlation with the remaining 150+ stocks managed by other Wilstead managers, then concentrating Hennessy into fewer, higher-conviction names could actually reduce or leave unchanged the total fund’s risk, even if it raises Hennessy’s standalone variance. From a total-fund perspective, what matters is:

\[\text{Marginal Contribution} = \beta_{H,F} = \frac{\text{Cov}(R_H, R_F)}{\sigma_F^2}\]

where \(R_H\) is Hennessy’s return and \(R_F\) is the total fund return. A lower-correlation, concentrated Hennessy sub-portfolio could serve as an effective diversifier for the broader fund. This broader perspective would make the committee more comfortable allowing further concentration to 10 stocks, as long as Hennessy’s selections remain uncorrelated with the rest of the fund.


1.1.4 CFA Problem 4

Which portfolio cannot lie on the Markowitz efficient frontier?

portfolios <- data.frame(
  Portfolio = c("W", "X", "Z", "Y"),
  Expected_Return = c(15, 12, 5, 9),
  Std_Dev = c(36, 15, 7, 21)
)

portfolios %>%
  kable(caption = "Portfolio Risk-Return Characteristics",
        col.names = c("Portfolio", "Expected Return (%)", "Std Dev (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Portfolio Risk-Return Characteristics
Portfolio Expected Return (%) Std Dev (%)
W 15 36
X 12 15
Z 5 7
Y 9 21

Answer: Portfolio Y cannot lie on the efficient frontier.

The Markowitz efficient frontier contains only portfolios that offer the maximum expected return for a given level of risk, or equivalently, the minimum risk for a given expected return. A portfolio is inefficient (dominated) if another portfolio exists with either: (i) the same standard deviation but higher return, or (ii) the same expected return but lower standard deviation.

Comparing Portfolio Y (9% return, 21% SD) with Portfolio X (12% return, 15% SD): Portfolio X dominates Portfolio Y on both dimensions — it offers higher expected return and lower standard deviation. Therefore, Portfolio Y is mean-variance dominated and cannot lie on the efficient frontier. No rational investor would hold Y when X is available.

ggplot(portfolios, aes(x = Std_Dev, y = Expected_Return, label = Portfolio)) +
  geom_point(aes(color = Portfolio), size = 5) +
  geom_text(vjust = -0.9, fontface = "bold", size = 4.5) +
  annotate("text", x = 21, y = 9, label = "DOMINATED", color = "red",
           hjust = -0.1, size = 3.5) +
  geom_curve(aes(x = 15, y = 12, xend = 21, yend = 9),
             arrow = arrow(length = unit(0.03, "npc")), color = "red",
             curvature = -0.3, linetype = "dashed") +
  labs(title = "Portfolio Risk-Return Space: Markowitz Efficient Frontier",
       subtitle = "Portfolio Y is dominated by Portfolio X",
       x = "Standard Deviation (%)", y = "Expected Return (%)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")


1.1.5 CFA Problem 10

Equal-weight portfolio AB vs. equal-weight portfolio BC — which is preferred?

Given data:

Stock Std Dev (%)
A 40
B 20
C 40

Correlations: \(\rho_{AB} = 0.90\), \(\rho_{AC} = 0.50\), \(\rho_{BC} = 0.10\)

For an equal-weight two-asset portfolio, variance is:

\[\sigma_p^2 = (0.5)^2\sigma_A^2 + (0.5)^2\sigma_B^2 + 2(0.5)(0.5)\rho_{AB}\sigma_A\sigma_B\]

sA <- 40; sB <- 20; sC <- 40
rAB <- 0.90; rBC <- 0.10

var_AB <- 0.25*sA^2 + 0.25*sB^2 + 2*0.25*rAB*sA*sB
var_BC <- 0.25*sB^2 + 0.25*sC^2 + 2*0.25*rBC*sB*sC

sd_AB <- sqrt(var_AB)
sd_BC <- sqrt(var_BC)

results <- data.frame(
  Portfolio   = c("AB (50/50)", "BC (50/50)"),
  Variance    = round(c(var_AB, var_BC), 2),
  Std_Dev     = round(c(sd_AB, sd_BC), 2)
)

results %>%
  kable(caption = "Portfolio Variance Comparison",
        col.names = c("Portfolio", "Variance (%²)", "Std Dev (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Portfolio Variance Comparison
Portfolio Variance (%²) Std Dev (%)
AB (50/50) 860 29.33
BC (50/50) 540 23.24

Portfolio BC is recommended because its standard deviation of 23.24% is substantially lower than portfolio AB’s 29.33%. The critical driver is the pairwise correlation: \(\rho_{AB} = 0.90\) is very high, meaning A and B move almost in lockstep and offer minimal diversification benefit when combined. By contrast, \(\rho_{BC} = 0.10\) is near zero, so B and C provide strong diversification — even though C has a higher standard deviation than B, the near-zero correlation dramatically reduces combined portfolio variance. Since we are given only risk information (no expected return data), the decision is based solely on minimizing variance, and portfolio BC unambiguously dominates portfolio AB.


1.2 Chapter 8: Index Models

1.2.1 CFA Problem 1

Analysis of regression results for ABC and XYZ stocks:

The SCL (Security Characteristic Line) regression for each stock is:

\[R_i - R_f = \alpha_i + \beta_i (R_M - R_f) + \epsilon_i\]

Sample period risk-return interpretation:

For ABC stock: The negative alpha of \(-3.20\%\) indicates that ABC underperformed its CAPM-expected return by 3.20% per year over the sample period, after adjusting for market risk. The beta of 0.60 indicates below-market systematic risk. The \(R^2 = 0.35\) means only 35% of ABC’s return variance is explained by market movements; the residual standard deviation of 13.02% reflects substantial firm-specific risk.

For XYZ stock: The positive alpha of \(+7.3\%\) indicates XYZ outperformed its risk-adjusted benchmark by 7.3% annually over the sample. Its beta of 0.97 is close to 1.0, making it a near-market-risk asset. However, \(R^2 = 0.17\) is very low — only 17% of variance is explained by the market — and residual SD of 21.45% is large, indicating that most of XYZ’s return variability is idiosyncratic.

Implications for a diversified portfolio:

The historical alpha estimates should not be taken at face value as forecasts. Alpha is subject to estimation error, and high past alpha often reverts toward zero. More critically, the beta estimates are unstable across brokerages: Brokerage A reports \(\beta_{ABC} = 0.62\) and \(\beta_{XYZ} = 1.45\), while Brokerage B reports \(\beta_{ABC} = 0.71\) and \(\beta_{XYZ} = 1.25\). This divergence (particularly for XYZ, where betas differ by 0.20) reveals that beta is sensitive to the estimation window and return frequency used — the 5-year regression betas (ABC = 0.60, XYZ = 0.97) differ from the recent 2-year weekly betas from both brokerages. For XYZ specifically, if its “true” current beta is closer to 1.25–1.45 rather than 0.97, the stock carries considerably more systematic risk than historical data suggests. In a diversified portfolio, the nonsystematic risk of each holding largely cancels out, so the primary concern is getting the beta estimate right for proper risk attribution and hedging.


1.2.2 CFA Problem 2

Nonsystematic risk of Baker Fund:

Given \(\rho_{Baker, M} = 0.70\), the coefficient of determination is:

\[R^2 = \rho^2 = (0.70)^2 = 0.49\]

\(R^2\) represents the fraction of total variance explained by the market (systematic risk). Therefore, the fraction of total risk that is nonsystematic is:

\[\text{Nonsystematic proportion} = 1 - R^2 = 1 - 0.49 = 0.51 = \mathbf{51\%}\]

rho <- 0.70
R2  <- rho^2
nonsys <- 1 - R2
cat("R-squared:", R2, "\nNonsystematic proportion:", nonsys, "=", nonsys*100, "%")
## R-squared: 0.49 
## Nonsystematic proportion: 0.51 = 51 %

Fifty-one percent of Baker Fund’s total risk is nonsystematic (diversifiable). A well-diversified investor holding Baker Fund as one component of a broader portfolio would be primarily exposed to the remaining 49% systematic risk, with the idiosyncratic 51% being eliminable through diversification.


1.2.3 CFA Problem 3

Implied beta of Charlottesville International Fund:

Given: \(\rho = 1.0\) (perfect correlation with world index), \(E(R_M) = 11\%\), \(E(R_{fund}) = 9\%\), \(R_f = 3\%\).

Using the CAPM return equation:

\[E(R_i) = R_f + \beta_i [E(R_M) - R_f]\]

Solving for \(\beta\):

\[\beta = \frac{E(R_i) - R_f}{E(R_M) - R_f} = \frac{9\% - 3\%}{11\% - 3\%} = \frac{6\%}{8\%} = 0.75\]

Rf <- 0.03; EM <- 0.11; Efund <- 0.09
beta_implied <- (Efund - Rf) / (EM - Rf)
cat("Implied beta:", beta_implied)
## Implied beta: 0.75

The implied beta is 0.75. Note that the perfect correlation of 1.0 with the world index means all of the fund’s variance is systematic — there is zero residual (firm-specific) variance. The fund simply moves proportionally with the market, but with a muted sensitivity of 0.75, which is why it earns less than the full market return of 11%.


1.2.4 CFA Problem 4

Beta is most closely associated with:

Answer: (d) Systematic risk.

Beta (\(\beta_i\)) is defined as the ratio of an asset’s covariance with the market portfolio to the variance of the market portfolio:

\[\beta_i = \frac{\text{Cov}(R_i, R_M)}{\sigma_M^2}\]

Beta measures only the systematic (market-related) component of an asset’s risk — the risk that cannot be eliminated through diversification. While beta is mathematically computed using a regression, which involves correlation coefficients, its conceptual meaning and practical purpose is as a measure of systematic risk sensitivity. Mean-variance analysis (option b) refers to the broader Markowitz framework. Nonsystematic risk (option c) is the opposite of what beta captures — it is the residual, firm-specific risk.


1.2.5 CFA Problem 5

Beta vs. standard deviation as risk measures:

Answer: (b) Beta measures only systematic risk, while standard deviation measures total risk.

Standard deviation captures total portfolio variance — both the systematic component (driven by market movements) and the nonsystematic component (firm-specific). Formally:

\[\sigma_i^2 = \beta_i^2 \sigma_M^2 + \sigma^2(\epsilon_i)\]

where \(\beta_i^2 \sigma_M^2\) is systematic variance and \(\sigma^2(\epsilon_i)\) is unsystematic variance. Standard deviation, as the square root of \(\sigma_i^2\), includes both. Beta, by construction, isolates only the sensitivity to the common market factor. Option (a) is wrong because beta does not measure unsystematic risk. Options (c) and (d) are wrong because standard deviation is not limited to one type of risk.


1.3 Chapter 9: The Capital Asset Pricing Model

1.3.1 CFA Problem 8 — Portfolio R relative to the SML

Data: Portfolio R: return = 11%, SD = 10%, beta = 0.5. S&P 500: return = 14%, SD = 12%, beta = 1.0.

To locate Portfolio R relative to the SML, we compute its CAPM-predicted (required) return. The SML uses the risk-free rate implied by the market data. Using the market as a reference point where \(E(R_M) = 14\%\) and \(\beta_M = 1.0\):

We need \(R_f\). The problem does not explicitly provide \(R_f\); however, the standard approach in these textbook problems is to use the risk-free rate consistent with the SML through the market. Given the S&P return of 14% and treating it as the market portfolio, with no explicit \(R_f\) we cannot uniquely pin the SML. The textbook convention for this problem uses \(R_f\) such that the SML is anchored at the market. Let us assume \(R_f = 6\%\) (a common textbook assumption for this problem set — this needs verification against your specific edition if a risk-free rate is provided elsewhere).

\[E(R_R)_{\text{SML}} = R_f + \beta_R [E(R_M) - R_f] = 6\% + 0.5 \times (14\% - 6\%) = 6\% + 4\% = 10\%\]

Jensen’s alpha:

\[\alpha_R = R_R - E(R_R)_{\text{SML}} = 11\% - 10\% = +1\%\]

Rf <- 0.06; Rm <- 0.14; beta_R <- 0.5; ret_R <- 0.11

SML_pred <- Rf + beta_R * (Rm - Rf)
alpha_J  <- ret_R - SML_pred

cat("SML-predicted return for Portfolio R:", SML_pred * 100, "%\n")
## SML-predicted return for Portfolio R: 10 %
cat("Jensen's Alpha:", alpha_J * 100, "%\n")
## Jensen's Alpha: 1 %
cat("Portfolio R lies: ABOVE the SML")
## Portfolio R lies: ABOVE the SML

Answer: (c) Above the SML.

Portfolio R earns 11% while the SML predicts only 10% for its beta of 0.5. The positive Jensen’s alpha of +1% means Portfolio R plots above the SML — it offers more return per unit of systematic risk than CAPM equilibrium requires.

betas <- seq(0, 1.5, by = 0.01)
sml_returns <- Rf + betas * (Rm - Rf)
sml_df <- data.frame(beta = betas, expected_return = sml_returns * 100)

ggplot(sml_df, aes(x = beta, y = expected_return)) +
  geom_line(color = "steelblue", size = 1.2) +
  geom_point(aes(x = 0.5, y = 11), color = "darkgreen", size = 4) +
  geom_point(aes(x = 1.0, y = 14), color = "orange",   size = 4) +
  geom_point(aes(x = 0,   y = Rf*100), color = "gray30", size = 3) +
  annotate("text", x = 0.5,  y = 11.3, label = "Portfolio R (Above SML)", color = "darkgreen", hjust = -0.05) +
  annotate("text", x = 1.0,  y = 14.3, label = "S&P 500 (Market)",       color = "orange",    hjust = -0.05) +
  annotate("segment", x = 0.5, xend = 0.5, y = SML_pred*100, yend = 11,
           arrow = arrow(length = unit(0.02,"npc")), color = "darkgreen", linetype = "dashed") +
  annotate("text", x = 0.52, y = (SML_pred*100 + 11)/2,
           label = paste0("α = +", round(alpha_J*100,1), "%"), color = "darkgreen", size = 3.5) +
  labs(title = "Security Market Line (SML)",
       subtitle = "Portfolio R lies above the SML (positive Jensen's alpha)",
       x = "Beta", y = "Expected Return (%)") +
  theme_minimal(base_size = 13)


1.3.2 CFA Problem 9 — Portfolio R relative to the CML

The CML plots efficient portfolios in return vs. standard deviation space. Only perfectly diversified portfolios (combinations of the market and risk-free asset) lie on the CML. The CML equation is:

\[E(R_p) = R_f + \frac{E(R_M) - R_f}{\sigma_M} \cdot \sigma_p\]

The CML-predicted return for Portfolio R’s SD of 10%:

\[E(R_R)_{\text{CML}} = 6\% + \frac{14\% - 6\%}{12\%} \times 10\% = 6\% + \frac{8\%}{12\%} \times 10\% = 6\% + 6.67\% = 12.67\%\]

sd_R <- 0.10; sd_M <- 0.12

CML_pred <- Rf + ((Rm - Rf) / sd_M) * sd_R
cat("CML-predicted return for Portfolio R's SD of 10%:", round(CML_pred*100, 2), "%\n")
## CML-predicted return for Portfolio R's SD of 10%: 12.67 %
cat("Actual return of Portfolio R: 11%\n")
## Actual return of Portfolio R: 11%
cat("Portfolio R lies: BELOW the CML")
## Portfolio R lies: BELOW the CML

Answer: (b) Below the CML.

The CML predicts that any efficient portfolio with 10% standard deviation should earn 12.67%. Portfolio R only earns 11%. The gap exists because Portfolio R is not a fully efficient, well-diversified portfolio — it contains some residual nonsystematic risk. It plots below the CML because it wastes some of its total risk budget on firm-specific variance that carries no expected return premium. Note that this is consistent with Portfolio R being above the SML: the SML rewards systematic risk (beta), and R does well on that metric; but the CML measures efficiency of total risk deployment, and R falls short.

sds <- seq(0, 0.20, by = 0.001)
cml_returns <- Rf + ((Rm - Rf) / sd_M) * sds
cml_df <- data.frame(sd = sds * 100, expected_return = cml_returns * 100)

ggplot(cml_df, aes(x = sd, y = expected_return)) +
  geom_line(color = "steelblue", size = 1.2) +
  geom_point(aes(x = 10, y = 11), color = "red",    size = 4) +
  geom_point(aes(x = 12, y = 14), color = "orange", size = 4) +
  geom_point(aes(x = 0,  y = Rf*100), color = "gray30", size = 3) +
  annotate("text", x = 10, y = 11.3, label = "Portfolio R (Below CML)", color = "red",    hjust = -0.05) +
  annotate("text", x = 12, y = 14.3, label = "Market Portfolio",        color = "orange", hjust = -0.05) +
  annotate("segment", x = 10, xend = 10, y = 11, yend = CML_pred*100,
           arrow = arrow(length = unit(0.02,"npc")), color = "red", linetype = "dashed") +
  annotate("text", x = 10.3, y = (11 + CML_pred*100)/2,
           label = paste0("Gap: ", round((CML_pred - ret_R)*100, 2), "%"), color = "red", size = 3.5) +
  labs(title = "Capital Market Line (CML)",
       subtitle = "Portfolio R lies below the CML (not fully efficient)",
       x = "Standard Deviation (%)", y = "Expected Return (%)") +
  theme_minimal(base_size = 13)


1.3.3 CFA Problem 10 — Should investors expect higher return from Portfolio A than Portfolio B?

Portfolio A: beta = 1.0, high specific risk. Portfolio B: beta = 1.0, low specific risk.

No — investors should not expect a higher return from Portfolio A under CAPM.

The core insight of CAPM is that only systematic risk (beta) is priced in equilibrium. Because both portfolios have identical betas of 1.0, CAPM predicts the same expected return for both:

\[E(R_A) = E(R_B) = R_f + 1.0 \times [E(R_M) - R_f]\]

The higher specific (idiosyncratic) risk of Portfolio A is diversifiable. In a large, well-diversified portfolio, firm-specific shocks average out and carry no expected return premium. Rational investors who hold the market portfolio (as CAPM assumes) will not pay less for Portfolio A just because it has higher idiosyncratic variance — because that variance simply disappears when A is combined with many other assets. Only the covariance with the market (captured by beta) contributes to the risk of the investor’s overall portfolio and therefore commands a risk premium. Expecting a higher return from A would imply a mispricing that diversified investors would arbitrage away.


1.4 Chapter 10: Arbitrage Pricing Theory and Multifactor Models

Setup: Two-factor APT model. Factor risk premiums: GDP growth = 8%, Inflation = 2%. Risk-free rate = 4%.

APT expected return formula:

\[E(R_i) = R_f + \beta_{i,GDP} \times \lambda_{GDP} + \beta_{i,Inf} \times \lambda_{Inf}\]

1.4.1 Problem 13 — High Growth Fund Expected Return

Sensitivities: \(\beta_{GDP} = 1.25\), \(\beta_{Inf} = 1.5\)

\[E(R_{HG}) = 4\% + 1.25 \times 8\% + 1.5 \times 2\% = 4\% + 10\% + 3\% = 17\%\]

Rf_apt <- 0.04
lam_GDP <- 0.08; lam_Inf <- 0.02
b_HG_GDP <- 1.25; b_HG_Inf <- 1.5

E_HG <- Rf_apt + b_HG_GDP*lam_GDP + b_HG_Inf*lam_Inf
cat("Expected return of High Growth Fund:", E_HG*100, "%")
## Expected return of High Growth Fund: 17 %

The equilibrium expected return of the High Growth Fund is 17%.


1.4.2 Problem 14 — Arbitrage Opportunity for Large Cap Fund

McCracken’s APT estimate for the Large Cap Fund: \(\beta_{GDP} = 0.75\), \(\beta_{Inf} = 1.25\)

\[E(R_{LC})_{\text{APT}} = 4\% + 0.75 \times 8\% + 1.25 \times 2\% = 4\% + 6\% + 2.5\% = 12.5\%\]

Kwon’s fundamental estimate: \(R_f + 8.5\% = 4\% + 8.5\% = 12.5\%\)

b_LC_GDP <- 0.75; b_LC_Inf <- 1.25

E_LC_APT <- Rf_apt + b_LC_GDP*lam_GDP + b_LC_Inf*lam_Inf
E_LC_fund <- Rf_apt + 0.085

cat("APT estimate for Large Cap Fund:", E_LC_APT*100, "%\n")
## APT estimate for Large Cap Fund: 12.5 %
cat("Kwon fundamental estimate:      ", E_LC_fund*100, "%\n")
## Kwon fundamental estimate:       12.5 %
cat("Difference:", (E_LC_fund - E_LC_APT)*100, "%")
## Difference: 0 %

No arbitrage opportunity exists. Both estimates agree at 12.5%. When the APT-implied expected return equals the fundamentally-derived expected return, the asset is fairly priced relative to the factor structure. An arbitrage opportunity would require a discrepancy — where one could construct a zero-cost, zero-risk portfolio with a positive expected return by going long the underpriced asset and short the replicating APT portfolio.


1.4.3 Problem 15 — Weight in the Utility Fund for GDP Fund Construction

The GDP Fund must have: \(\beta_{GDP} = 1.0\) and \(\beta_{Inf} = 0\). Using weights \(w_1\) (High Growth), \(w_2\) (Large Cap), \(w_3\) (Utility), with \(w_1 + w_2 + w_3 = 1\):

Factor sensitivities of the three funds:

GDP Inflation
High Growth 1.25 1.5
Large Cap 0.75 1.25
Utility 1.0 2.0

System of equations:

\[1.25 w_1 + 0.75 w_2 + 1.0 w_3 = 1 \quad \text{(GDP constraint)}\] \[1.5 w_1 + 1.25 w_2 + 2.0 w_3 = 0 \quad \text{(Inflation constraint)}\] \[w_1 + w_2 + w_3 = 1 \quad \text{(weights sum to 1)}\]

# Matrix form: A %*% w = b
A <- matrix(c(1.25, 0.75, 1.0,
              1.5,  1.25, 2.0,
              1.0,  1.0,  1.0), nrow = 3, byrow = TRUE)
b <- c(1, 0, 1)

w <- solve(A, b)
names(w) <- c("High Growth", "Large Cap", "Utility")
round(w, 4)
## High Growth   Large Cap     Utility 
##         1.6         1.6        -2.2

Answer: (a) The weight in the Utility Fund is \(w_3 \approx -2.2\).

The negative weight means the GDP Fund must short the Utility Fund to zero out its inflation exposure. This is feasible in an arbitrage/factor-mimicking context where short positions are permitted.


1.4.4 Problem 16 — Who is correct about the GDP Fund?

Answer: (b) Both are correct.

McCracken argues the GDP Fund suits investors who live off steady income (retirees). This is correct because the GDP Fund has zero inflation sensitivity (\(\beta_{Inf} = 0\)). Investors whose liabilities or consumption are inflation-linked would benefit from a fund that does not fluctuate with unexpected inflation — their real income stream is better protected.

Stiles argues the fund suits investors who benefit from supply-side policy success. This is also correct because the fund has unit GDP sensitivity (\(\beta_{GDP} = 1.0\)). Successful supply-side policies (tax cuts, deregulation, productivity improvements) tend to boost real GDP growth, directly increasing the fund’s return. Both characterizations correctly identify distinct investor clienteles for whom the GDP Fund’s specific factor exposures are appropriate.


2 Part 2 — R Code Questions (40%)

2.1 Q1: Import ETF Data

tickers <- c("SPY","QQQ","EEM","IWM","EFA","TLT","IYR","GLD")

raw_prices <- tq_get(tickers,
                     from = "2010-01-01",
                     to   = "2026-06-10",
                     get  = "stock.prices")

# Extract adjusted close and pivot to wide format
prices_wide <- raw_prices %>%
  select(symbol, date, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  arrange(date)

# Convert to xts for time-series operations
prices_xts <- xts(prices_wide[,-1], order.by = as.Date(prices_wide$date))

# Display head and tail
head_df <- as.data.frame(head(prices_xts, 6))
head_df <- round(head_df, 4)
head_df <- cbind(Date = rownames(head_df), head_df)

tail_df <- as.data.frame(tail(prices_xts, 6))
tail_df <- round(tail_df, 4)
tail_df <- cbind(Date = rownames(tail_df), tail_df)

bind_rows(head_df, tail_df) %>%
  kable(caption = "ETF Adjusted Closing Prices — Head (top 6) and Tail (bottom 6)",
        row.names = FALSE) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE, font_size = 11) %>%
  row_spec(1:6, background = "#f0f8ff") %>%
  row_spec(7:12, background = "#fff8f0")
ETF Adjusted Closing Prices — Head (top 6) and Tail (bottom 6)
Date SPY QQQ EEM IWM EFA TLT IYR GLD
2010-01-04 84.7964 40.2908 30.3515 51.3665 35.1284 55.7095 26.7681 109.80
2010-01-05 85.0209 40.2908 30.5718 51.1899 35.1594 56.0693 26.8324 109.70
2010-01-06 85.0807 40.0478 30.6358 51.1418 35.3080 55.3187 26.8207 111.51
2010-01-07 85.4399 40.0738 30.4581 51.5191 35.1718 55.4118 27.0603 110.82
2010-01-08 85.7242 40.4036 30.6997 51.8001 35.4504 55.3870 26.8791 111.37
2010-01-11 85.8439 40.2387 30.6358 51.5914 35.7415 55.0830 27.0077 112.85
2026-06-02 759.5700 746.1600 70.8000 291.6600 105.0200 85.6500 99.9900 411.95
2026-06-03 754.2400 744.2100 69.9200 287.6700 104.1200 85.3100 100.0000 407.87
2026-06-04 757.0900 740.6100 69.1000 292.0100 104.9500 85.5000 101.7900 411.27
2026-06-05 737.5500 705.0600 64.5900 281.6500 102.2600 85.0600 102.5400 396.24
2026-06-08 739.2200 716.0700 65.7500 284.1100 102.8800 84.6200 101.0800 397.27
2026-06-09 737.0500 707.8300 65.8200 285.0200 102.9000 85.1200 103.4900 390.78

2.2 Q2: Calculate Weekly and Monthly Returns

# Weekly returns (all 8 ETFs)
weekly_ret_xts <- do.call(merge, lapply(tickers, function(tk) {
  weeklyReturn(prices_xts[, tk], type = "arithmetic")
}))
colnames(weekly_ret_xts) <- tickers
weekly_ret_xts <- weekly_ret_xts[-1, ]

# Compute returns for all tickers — monthly simple returns via xts
monthly_ret_xts <- do.call(merge, lapply(tickers, function(tk) {
  px <- prices_xts[, tk]
  monthlyReturn(px, type = "arithmetic")
}))
colnames(monthly_ret_xts) <- tickers

# Remove first row (NA from return calculation)
monthly_ret_xts <- monthly_ret_xts[-1, ]

# Summary statistics
summary_stats <- do.call(rbind, lapply(tickers, function(tk) {
  x <- as.numeric(monthly_ret_xts[, tk])
  data.frame(
    Ticker = tk,
    Mean   = round(mean(x, na.rm=TRUE)*100, 3),
    SD     = round(sd(x, na.rm=TRUE)*100, 3),
    Min    = round(min(x, na.rm=TRUE)*100, 3),
    Max    = round(max(x, na.rm=TRUE)*100, 3),
    Sharpe = round(mean(x, na.rm=TRUE)/sd(x, na.rm=TRUE), 3)
  )
}))

summary_stats %>%
  kable(caption = "Monthly Return Summary Statistics (%)",
        col.names = c("Ticker","Mean (%)","Std Dev (%)","Min (%)","Max (%)","Sharpe (rf=0)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Monthly Return Summary Statistics (%)
Ticker Mean (%) Std Dev (%) Min (%) Max (%) Sharpe (rf=0)
SPY 1.218 4.170 -12.487 12.698 0.292
QQQ 1.635 5.081 -13.596 15.690 0.322
EEM 0.588 5.266 -17.895 16.268 0.112
IWM 1.065 5.648 -21.477 18.244 0.189
EFA 0.687 4.486 -14.107 14.269 0.153
TLT 0.277 3.910 -9.424 13.206 0.071
IYR 0.833 4.819 -19.632 13.190 0.173
GLD 0.774 4.701 -11.062 12.275 0.165

2.3 Q3: Convert Monthly Returns to Tibble

monthly_ret_tbl <- tk_tbl(monthly_ret_xts, rename_index = "date")

monthly_ret_tbl %>%
  head(6) %>%
  mutate(across(-date, ~round(., 5))) %>%
  kable(caption = "Monthly Returns — Tibble Format (first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Monthly Returns — Tibble Format (first 6 rows)
date SPY QQQ EEM IWM EFA TLT IYR GLD
2010-02-26 0.03119 0.04604 0.01776 0.04475 0.00267 -0.00342 0.05457 0.03275
2010-03-31 0.06088 0.07711 0.08111 0.08231 0.06385 -0.02057 0.09749 -0.00439
2010-04-30 0.01547 0.02242 -0.00166 0.05678 -0.02805 0.03322 0.06388 0.05883
2010-05-28 -0.07945 -0.07392 -0.09394 -0.07537 -0.11193 0.05108 -0.05684 0.03051
2010-06-30 -0.05174 -0.05976 -0.01399 -0.07743 -0.02062 0.05798 -0.04670 0.02355
2010-07-30 0.06830 0.07258 0.10932 0.06731 0.11610 -0.00946 0.09405 -0.05087

2.4 Q4: Download Fama-French 3 Factors

# Install frenchdata if needed
if (!requireNamespace("frenchdata", quietly = TRUE)) {
  install.packages("frenchdata")
}
library(frenchdata)

# Download FF3 monthly factors
ff3_raw <- download_french_data("Fama/French 3 Factors")
ff3_monthly_raw <- ff3_raw$subsets$data[[1]]   # monthly table is first subset

# Clean and convert
ff3_monthly <- ff3_monthly_raw %>%
  rename(date_raw = date) %>%
  mutate(
    date_raw = as.character(date_raw),
    date     = as.Date(paste0(date_raw, "01"), format = "%Y%m%d"),
    date     = ceiling_date(date, "month") - days(1),
    Mkt.RF   = as.numeric(`Mkt-RF`) / 100,
    SMB      = as.numeric(SMB)      / 100,
    HML      = as.numeric(HML)      / 100
  ) %>%
  filter(!is.na(Mkt.RF), date >= as.Date("2010-01-01")) %>%
  select(date, Mkt.RF, SMB, HML)

# Convert to xts
ff3_xts <- xts(ff3_monthly[, c("Mkt.RF","SMB","HML")],
               order.by = ff3_monthly$date)

ff3_xts %>%
  head(6) %>%
  as.data.frame() %>%
  mutate(across(everything(), ~round(., 5))) %>%
  tibble::rownames_to_column("Date") %>%
  kable(caption = "Fama-French 3 Factors — Monthly (decimal, first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Fama-French 3 Factors — Monthly (decimal, first 6 rows)
Date Mkt.RF SMB HML
2010-01-31 -0.0335 0.0043 0.0033
2010-02-28 0.0339 0.0118 0.0318
2010-03-31 0.0630 0.0146 0.0219
2010-04-30 0.0199 0.0484 0.0296
2010-05-31 -0.0790 0.0013 -0.0248
2010-06-30 -0.0556 -0.0179 -0.0473

2.5 Q5: Merge Monthly Returns and FF3 Factors

# Align index format: both to end-of-month dates
# monthly_ret_xts index already end-of-month from monthlyReturn()
# Make sure ff3_xts dates align

# Convert both to tibbles and merge by date
ret_tbl <- tk_tbl(monthly_ret_xts, rename_index = "date") %>%
  mutate(date = as.Date(date))

ff3_tbl <- tk_tbl(ff3_xts, rename_index = "date") %>%
  mutate(date = as.Date(date))

# Inner join on year-month to handle slight end-of-month discrepancies
ret_tbl2 <- ret_tbl %>%
  mutate(ym = format(date, "%Y-%m"))

ff3_tbl2 <- ff3_tbl %>%
  mutate(ym = format(date, "%Y-%m"))

merged_tbl <- inner_join(ret_tbl2, ff3_tbl2, by = "ym") %>%
  select(-ym, -date.y) %>%
  rename(date = date.x)

merged_tbl %>%
  head(6) %>%
  mutate(across(-date, ~round(., 5))) %>%
  kable(caption = "Merged Monthly Returns + FF3 Factors (first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE, font_size = 11)
Merged Monthly Returns + FF3 Factors (first 6 rows)
date SPY QQQ EEM IWM EFA TLT IYR GLD Mkt.RF SMB HML
2010-02-26 0.03119 0.04604 0.01776 0.04475 0.00267 -0.00342 0.05457 0.03275 0.0339 0.0118 0.0318
2010-03-31 0.06088 0.07711 0.08111 0.08231 0.06385 -0.02057 0.09749 -0.00439 0.0630 0.0146 0.0219
2010-04-30 0.01547 0.02242 -0.00166 0.05678 -0.02805 0.03322 0.06388 0.05883 0.0199 0.0484 0.0296
2010-05-28 -0.07945 -0.07392 -0.09394 -0.07537 -0.11193 0.05108 -0.05684 0.03051 -0.0790 0.0013 -0.0248
2010-06-30 -0.05174 -0.05976 -0.01399 -0.07743 -0.02062 0.05798 -0.04670 0.02355 -0.0556 -0.0179 -0.0473
2010-07-30 0.06830 0.07258 0.10932 0.06731 0.11610 -0.00946 0.09405 -0.05087 0.0692 0.0022 -0.0050

2.6 Q6: CAPM-Based GMV Portfolio (Single Month)

# --- Subset training window: 2010/02 – 2015/01 ---
train_tbl <- merged_tbl %>%
  filter(date >= as.Date("2010-02-01") & date <= as.Date("2015-01-31"))

# Excess returns (assets and market)
excess_ret  <- as.matrix(train_tbl[, tickers]) - as.numeric(train_tbl$Mkt.RF)
# Note: Mkt.RF is already excess market return
mkt_excess  <- as.numeric(train_tbl$Mkt.RF)

n_assets    <- length(tickers)

# Regress each asset's excess return on Mkt-RF → extract beta and residual variance
betas_capm  <- numeric(n_assets)
resid_var   <- numeric(n_assets)

for (i in seq_along(tickers)) {
  y   <- as.matrix(train_tbl[, tickers[i]]) - train_tbl$Mkt.RF  # excess return of asset i
  fit <- lm(y ~ mkt_excess)
  betas_capm[i] <- coef(fit)[2]
  resid_var[i]  <- var(residuals(fit))
}

names(betas_capm) <- tickers
names(resid_var)  <- tickers

# Market variance
sigma2_mkt <- var(mkt_excess)

# CAPM covariance matrix: Sigma = beta %*% t(beta) * sigma2_mkt + D
Sigma_CAPM <- betas_capm %o% betas_capm * sigma2_mkt + diag(resid_var)

# --- Solve GMV using quadprog ---
# Minimize w'Sigma w subject to: sum(w)=1, w>=0
Dmat <- 2 * Sigma_CAPM
dvec <- rep(0, n_assets)

# Constraints: equality (sum=1) + non-negativity
Amat <- cbind(rep(1, n_assets), diag(n_assets))
bvec <- c(1, rep(0, n_assets))

gmv_capm <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
w_capm   <- gmv_capm$solution
names(w_capm) <- tickers

# Apply weights to 2015/02 returns
ret_201502 <- merged_tbl %>%
  filter(format(date, "%Y-%m") == "2015-02") %>%
  select(all_of(tickers)) %>%
  as.numeric()

realized_capm <- sum(w_capm * ret_201502)

# --- Display ---
weight_df <- data.frame(
  ETF    = tickers,
  Weight = round(w_capm * 100, 2)
)

weight_df %>%
  kable(caption = "CAPM-Based GMV Weights (%)", col.names = c("ETF","Weight (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
CAPM-Based GMV Weights (%)
ETF Weight (%)
SPY SPY 87.19
QQQ QQQ 3.33
EEM EEM 1.25
IWM IWM 5.95
EFA EFA 1.95
TLT TLT 0.00
IYR IYR 0.33
GLD GLD 0.00
cat("Realized Portfolio Return (2015/02) — CAPM GMV:", round(realized_capm * 100, 4), "%\n")
## Realized Portfolio Return (2015/02) — CAPM GMV: 5.6648 %
# Bar chart
ggplot(weight_df, aes(x = reorder(ETF, Weight), y = Weight, fill = ETF)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  geom_text(aes(label = paste0(Weight, "%")), hjust = -0.1, size = 3.5) +
  labs(title = "CAPM-Based GMV Portfolio Weights",
       subtitle = "Training window: 2010/02 – 2015/01",
       x = NULL, y = "Weight (%)") +
  theme_minimal(base_size = 13) +
  scale_fill_brewer(palette = "Set2") +
  ylim(0, max(weight_df$Weight) * 1.15)


2.7 Q7: FF3-Based GMV Portfolio (Single Month)

# Factor returns matrix
F_mat <- as.matrix(train_tbl[, c("Mkt.RF","SMB","HML")])

# Regress each asset on all 3 FF factors
B_mat      <- matrix(0, n_assets, 3)   # 8 x 3 loading matrix
resid_var2 <- numeric(n_assets)

for (i in seq_along(tickers)) {
  y   <- as.matrix(train_tbl[, tickers[i]]) - train_tbl$Mkt.RF
  fit <- lm(y ~ F_mat - 1)  # no intercept; factor loadings only (or with intercept)
  B_mat[i, ]    <- coef(fit)
  resid_var2[i] <- var(residuals(fit))
}
rownames(B_mat) <- tickers
colnames(B_mat) <- c("Mkt.RF","SMB","HML")

# Factor covariance matrix (3x3)
Sigma_f <- cov(F_mat)

# FF3 covariance: Sigma = B %*% Sigma_f %*% t(B) + D
Sigma_FF3 <- B_mat %*% Sigma_f %*% t(B_mat) + diag(resid_var2)

# Solve GMV
Dmat2 <- 2 * Sigma_FF3
gmv_ff3 <- solve.QP(Dmat2, dvec, Amat, bvec, meq = 1)
w_ff3   <- gmv_ff3$solution
names(w_ff3) <- tickers

realized_ff3 <- sum(w_ff3 * ret_201502)

# Display
weight_ff3_df <- data.frame(ETF = tickers, Weight = round(w_ff3 * 100, 2))

weight_ff3_df %>%
  kable(caption = "FF3-Based GMV Weights (%)", col.names = c("ETF","Weight (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
FF3-Based GMV Weights (%)
ETF Weight (%)
SPY SPY 83.52
QQQ QQQ 1.85
EEM EEM 0.24
IWM IWM 13.60
EFA EFA 0.56
TLT TLT 0.00
IYR IYR 0.19
GLD GLD 0.06
cat("Realized Portfolio Return (2015/02) — FF3 GMV:", round(realized_ff3 * 100, 4), "%\n")
## Realized Portfolio Return (2015/02) — FF3 GMV: 5.6734 %
# Side-by-side comparison bar chart
compare_df <- data.frame(
  ETF    = rep(tickers, 2),
  Weight = c(w_capm, w_ff3) * 100,
  Model  = rep(c("CAPM","FF3"), each = n_assets)
)

ggplot(compare_df, aes(x = ETF, y = Weight, fill = Model)) +
  geom_col(position = "dodge") +
  labs(title = "GMV Weights: CAPM vs. FF3 Model",
       subtitle = "Training window: 2010/02 – 2015/01",
       x = NULL, y = "Weight (%)") +
  scale_fill_manual(values = c("CAPM" = "#2196F3", "FF3" = "#FF5722")) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")


2.8 Q8: Rolling Backtest (2015/02 – 2026/05)

# All monthly dates available
all_dates <- merged_tbl$date

# Identify the index positions for rolling windows
# Training starts at 2010/02 (row 1 of merged_tbl), ends at 2015/01 (row 60)
# Out-of-sample starts 2015/02

# Find row where 2015-01 ends
start_oos_idx <- which(format(all_dates, "%Y-%m") == "2015-02")[1]

# Number of out-of-sample months
n_oos <- nrow(merged_tbl) - start_oos_idx + 1

# Initialize storage
oos_dates   <- all_dates[start_oos_idx:(start_oos_idx + n_oos - 1)]
ret_capm_roll <- numeric(n_oos)
ret_ff3_roll  <- numeric(n_oos)

for (t in seq_len(n_oos)) {
  # Row index of the out-of-sample month
  oos_row <- start_oos_idx + t - 1
  # Training window: 60 rows ending one row before oos_row
  train_end   <- oos_row - 1
  train_start <- train_end - 59

  if (train_start < 1) next

  train_w <- merged_tbl[train_start:train_end, ]

  # --- CAPM covariance ---
  mkt_ex_t <- as.numeric(train_w$Mkt.RF)
  betas_t  <- numeric(n_assets)
  rv_t     <- numeric(n_assets)
  for (i in seq_along(tickers)) {
    y_t <- as.numeric(train_w[[tickers[i]]]) - mkt_ex_t
    fit_t <- lm(y_t ~ mkt_ex_t)
    betas_t[i] <- coef(fit_t)[2]
    rv_t[i]    <- var(residuals(fit_t))
  }
  Sig_C <- betas_t %o% betas_t * var(mkt_ex_t) + diag(rv_t)

  # GMV CAPM
  gC <- tryCatch(
    solve.QP(2*Sig_C, rep(0,n_assets), Amat, bvec, meq=1)$solution,
    error = function(e) rep(1/n_assets, n_assets)
  )

  # --- FF3 covariance ---
  F_t <- as.matrix(train_w[, c("Mkt.RF","SMB","HML")])
  B_t <- matrix(0, n_assets, 3)
  rv2_t <- numeric(n_assets)
  for (i in seq_along(tickers)) {
    y_t  <- as.numeric(train_w[[tickers[i]]]) - mkt_ex_t
    fit_f <- lm(y_t ~ F_t - 1)
    B_t[i,]  <- coef(fit_f)
    rv2_t[i] <- var(residuals(fit_f))
  }
  Sig_F <- B_t %*% cov(F_t) %*% t(B_t) + diag(rv2_t)

  # GMV FF3
  gF <- tryCatch(
    solve.QP(2*Sig_F, rep(0,n_assets), Amat, bvec, meq=1)$solution,
    error = function(e) rep(1/n_assets, n_assets)
  )

  # OOS returns
  oos_r <- as.numeric(merged_tbl[oos_row, tickers])
  ret_capm_roll[t] <- sum(gC * oos_r)
  ret_ff3_roll[t]  <- sum(gF * oos_r)
}

# Build results data frame
backtest_df <- data.frame(
  date         = oos_dates,
  ret_CAPM     = ret_capm_roll,
  ret_FF3      = ret_ff3_roll
) %>% filter(date >= as.Date("2015-02-01"))

# Cumulative returns
backtest_df <- backtest_df %>%
  mutate(
    cum_CAPM = cumprod(1 + ret_CAPM),
    cum_FF3  = cumprod(1 + ret_FF3)
  )

2.8.1 Cumulative Return Chart

backtest_long <- backtest_df %>%
  select(date, cum_CAPM, cum_FF3) %>%
  pivot_longer(-date, names_to = "Model", values_to = "CumReturn") %>%
  mutate(Model = recode(Model,
                        "cum_CAPM" = "CAPM GMV",
                        "cum_FF3"  = "FF3 GMV"))

ggplot(backtest_long, aes(x = date, y = CumReturn, color = Model)) +
  geom_line(size = 1.1) +
  scale_color_manual(values = c("CAPM GMV" = "#2196F3", "FF3 GMV" = "#FF5722")) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  labs(title    = "Cumulative Returns: CAPM GMV vs. FF3 GMV",
       subtitle = "Rolling 60-month window backtest: 2015/02 – 2026/05",
       x = NULL, y = "Cumulative Return",
       color = "Strategy") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")

2.8.2 Performance Summary Table

# Helper functions
ann_ret <- function(r) prod(1 + r)^(12/length(r)) - 1
ann_vol <- function(r) sd(r) * sqrt(12)
sharpe  <- function(r) ann_ret(r) / ann_vol(r)

max_drawdown <- function(r) {
  cum_r  <- cumprod(1 + r)
  peak   <- cummax(cum_r)
  dd     <- (cum_r - peak) / peak
  min(dd)
}

perf_df <- data.frame(
  Metric = c("Annualized Return", "Annualized Volatility",
             "Sharpe Ratio (rf=0)", "Max Drawdown"),
  CAPM_GMV = c(
    paste0(round(ann_ret(backtest_df$ret_CAPM)*100, 2), "%"),
    paste0(round(ann_vol(backtest_df$ret_CAPM)*100, 2), "%"),
    round(sharpe(backtest_df$ret_CAPM), 3),
    paste0(round(max_drawdown(backtest_df$ret_CAPM)*100, 2), "%")
  ),
  FF3_GMV = c(
    paste0(round(ann_ret(backtest_df$ret_FF3)*100, 2), "%"),
    paste0(round(ann_vol(backtest_df$ret_FF3)*100, 2), "%"),
    round(sharpe(backtest_df$ret_FF3), 3),
    paste0(round(max_drawdown(backtest_df$ret_FF3)*100, 2), "%")
  )
)

perf_df %>%
  kable(caption = "Performance Comparison: CAPM GMV vs. FF3 GMV (2015/02–2026/05)",
        col.names = c("Metric","CAPM GMV","FF3 GMV")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)
Performance Comparison: CAPM GMV vs. FF3 GMV (2015/02–2026/05)
Metric CAPM GMV FF3 GMV
Annualized Return 13.75% 13.47%
Annualized Volatility 15.28% 15.67%
Sharpe Ratio (rf=0) 0.899 0.86
Max Drawdown -24.68% -24.88%

2.8.3 Drawdown Chart

# Compute drawdown series
dd_series <- backtest_df %>%
  mutate(
    cum_CAPM_peak = cummax(cum_CAPM),
    cum_FF3_peak  = cummax(cum_FF3),
    dd_CAPM       = (cum_CAPM - cum_CAPM_peak) / cum_CAPM_peak,
    dd_FF3        = (cum_FF3  - cum_FF3_peak)  / cum_FF3_peak
  ) %>%
  select(date, dd_CAPM, dd_FF3) %>%
  pivot_longer(-date, names_to = "Model", values_to = "Drawdown") %>%
  mutate(Model = recode(Model,
                        "dd_CAPM" = "CAPM GMV",
                        "dd_FF3"  = "FF3 GMV"))

ggplot(dd_series, aes(x = date, y = Drawdown, fill = Model)) +
  geom_area(alpha = 0.5, position = "identity") +
  scale_fill_manual(values = c("CAPM GMV" = "#2196F3", "FF3 GMV" = "#FF5722")) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  labs(title    = "Drawdown Chart: CAPM GMV vs. FF3 GMV",
       subtitle = "Rolling 60-month window backtest: 2015/02 – 2026/05",
       x = NULL, y = "Drawdown",
       fill = "Strategy") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")