library(tidyquant)
library(tidyverse)
library(timetk)
library(lubridate)
library(PerformanceAnalytics)
library(quadprog)
library(ggplot2)
library(knitr)
library(kableExtra)
library(dplyr)
library(purrr)
library(broom)
library(xts)

1 Part 1 — Textbook Questions (60%)

1.1 Chapter 7: Optimal Risky Portfolios

1.1.1 CFA Problem 1

Context: Hennessy & Associates manages a $30 million equity portfolio holding approximately 40 stocks, committing 2–3% of total funds to each issue. Jason Jones proposes limiting the portfolio to no more than 20 stocks, doubling commitments to favored issues.

(a) Will limiting to 20 stocks likely increase or decrease portfolio risk? Explain.

Limiting the portfolio from 40 to 20 stocks will almost certainly increase total portfolio risk (measured by portfolio standard deviation), assuming the reduction is not accompanied by a deliberate reduction in correlation. Modern portfolio theory shows that as the number of holdings increases, the idiosyncratic (firm-specific) risk of each individual security is diversified away. The portfolio variance for an equally weighted portfolio of $n$ assets with average variance $\bar{\sigma}^2$ and average covariance $\bar{\sigma}_{ij}$ is:

\[\sigma_p^2 = \frac{1}{n}\bar{\sigma}^2 + \frac{n-1}{n}\bar{\sigma}_{ij}\]

As $n \to \infty$, the first term (unique risk) vanishes and only systematic covariance risk remains. Moving from 40 to 20 stocks increases the $\frac{1}{n}$ term, raising idiosyncratic risk. Unless the 20 retained stocks happen to have very low pairwise correlations, total portfolio volatility will rise.

(b) Could Hennessy reduce from 40 to 20 without significantly affecting risk?

Yes, but only under a specific condition. If Hennessy identifies 20 stocks whose pairwise correlations are sufficiently low, the diversification benefit per stock is high enough that dropping the other 20 does not materially increase portfolio variance. In practice, since Hennessy is a “bottom-up” stock picker focusing on individual security selection rather than industry tilts, the 20 favored stocks likely share similar factor exposures (e.g., value or quality characteristics), which tend to raise inter-stock correlations. Therefore, while theoretically possible, it would in practice be difficult to achieve without careful correlation analysis.

1.1.2 CFA Problem 2

Why might further reduction from 20 to 10 stocks be less advantageous?

The diversification benefit of adding each incremental security is a convex, decreasing function of $n$. The largest reduction in idiosyncratic risk occurs when moving from very small portfolios (e.g., 1 to 5 stocks). By the time a portfolio holds 20 stocks, a substantial fraction of diversifiable risk has already been eliminated. Cutting from 20 to 10 provides considerably less incremental diversification improvement than cutting from 40 to 20 — but imposes a cost in higher residual variance from the dropped positions.

Formally, the marginal reduction in variance from adding the $n$-th stock declines at rate $1/n^2$:

\[\frac{d}{dn}\left(\frac{1}{n}\bar{\sigma}^2\right) = -\frac{\bar{\sigma}^2}{n^2}\]

At $n = 20$, this marginal benefit is already small, while at $n = 10$, removing securities significantly increases the idiosyncratic variance term. Meanwhile, if Wilstead evaluates Hennessy independently, concentrating into 10 stocks dramatically raises tracking error relative to any benchmark, increasing the probability of a significant underperformance year — which matters asymmetrically to a manager being evaluated on standalone results.

1.1.3 CFA Problem 3

How would evaluating Hennessy in the context of the total fund change the committee’s decision?

When Hennessy’s portfolio is evaluated within the context of the $280 million total Wilstead fund (Hennessy’s $30M plus the other five managers’ $250M across 150+ issues), the relevant question is no longer the standalone risk of Hennessy’s portfolio, but rather Hennessy’s marginal contribution to total fund risk — specifically, the covariance of Hennessy’s portfolio with the other managers’ holdings.

If Hennessy’s 10 or 20 concentrated holdings have low correlation with the remaining 150+ stocks managed by other Wilstead managers, then concentrating Hennessy into fewer, higher-conviction names could actually reduce or leave unchanged the total fund’s risk, even if it raises Hennessy’s standalone variance. From a total-fund perspective, what matters is:

\[\text{Marginal Contribution} = \beta_{H,F} = \frac{\text{Cov}(R_H, R_F)}{\sigma_F^2}\]

where $R_H$ is Hennessy’s return and $R_F$ is the total fund return. A lower-correlation, concentrated Hennessy sub-portfolio could serve as an effective diversifier for the broader fund. This broader perspective would make the committee more comfortable allowing further concentration to 10 stocks, as long as Hennessy’s selections remain uncorrelated with the rest of the fund.

1.1.4 CFA Problem 4

Which portfolio cannot lie on the Markowitz efficient frontier?

portfolios <- data.frame(
  Portfolio = c("W", "X", "Z", "Y"),
  Expected_Return = c(15, 12, 5, 9),
  Std_Dev = c(36, 15, 7, 21)
)

portfolios %>%
  kable(caption = "Portfolio Risk-Return Characteristics",
        col.names = c("Portfolio", "Expected Return (%)", "Std Dev (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Portfolio Risk-Return Characteristics
Portfolio	Expected Return (%)	Std Dev (%)
W	15	36
X	12	15
Z	5	7
Y	9	21

Answer: Portfolio Y cannot lie on the efficient frontier.

The Markowitz efficient frontier contains only portfolios that offer the maximum expected return for a given level of risk, or equivalently, the minimum risk for a given expected return. A portfolio is inefficient (dominated) if another portfolio exists with either: (i) the same standard deviation but higher return, or (ii) the same expected return but lower standard deviation.

Comparing Portfolio Y (9% return, 21% SD) with Portfolio X (12% return, 15% SD): Portfolio X dominates Portfolio Y on both dimensions — it offers higher expected return and lower standard deviation. Therefore, Portfolio Y is mean-variance dominated and cannot lie on the efficient frontier. No rational investor would hold Y when X is available.

ggplot(portfolios, aes(x = Std_Dev, y = Expected_Return, label = Portfolio)) +
  geom_point(aes(color = Portfolio), size = 5) +
  geom_text(vjust = -0.9, fontface = "bold", size = 4.5) +
  annotate("text", x = 21, y = 9, label = "DOMINATED", color = "red",
           hjust = -0.1, size = 3.5) +
  geom_curve(aes(x = 15, y = 12, xend = 21, yend = 9),
             arrow = arrow(length = unit(0.03, "npc")), color = "red",
             curvature = -0.3, linetype = "dashed") +
  labs(title = "Portfolio Risk-Return Space: Markowitz Efficient Frontier",
       subtitle = "Portfolio Y is dominated by Portfolio X",
       x = "Standard Deviation (%)", y = "Expected Return (%)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

1.1.5 CFA Problem 10

Equal-weight portfolio AB vs. equal-weight portfolio BC — which is preferred?

Given data:

Stock	Std Dev (%)
A	40
B	20
C	40

Correlations: $\rho_{AB} = 0.90$, $\rho_{AC} = 0.50$, $\rho_{BC} = 0.10$

For an equal-weight two-asset portfolio, variance is:

\[\sigma_p^2 = (0.5)^2\sigma_A^2 + (0.5)^2\sigma_B^2 + 2(0.5)(0.5)\rho_{AB}\sigma_A\sigma_B\]

sA <- 40; sB <- 20; sC <- 40
rAB <- 0.90; rBC <- 0.10

var_AB <- 0.25*sA^2 + 0.25*sB^2 + 2*0.25*rAB*sA*sB
var_BC <- 0.25*sB^2 + 0.25*sC^2 + 2*0.25*rBC*sB*sC

sd_AB <- sqrt(var_AB)
sd_BC <- sqrt(var_BC)

results <- data.frame(
  Portfolio   = c("AB (50/50)", "BC (50/50)"),
  Variance    = round(c(var_AB, var_BC), 2),
  Std_Dev     = round(c(sd_AB, sd_BC), 2)
)

results %>%
  kable(caption = "Portfolio Variance Comparison",
        col.names = c("Portfolio", "Variance (%²)", "Std Dev (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Portfolio Variance Comparison
Portfolio	Variance (%²)	Std Dev (%)
AB (50/50)	860	29.33
BC (50/50)	540	23.24

Portfolio BC is recommended because its standard deviation of 23.24% is substantially lower than portfolio AB’s 29.33%. The critical driver is the pairwise correlation: $\rho_{AB} = 0.90$ is very high, meaning A and B move almost in lockstep and offer minimal diversification benefit when combined. By contrast, $\rho_{BC} = 0.10$ is near zero, so B and C provide strong diversification — even though C has a higher standard deviation than B, the near-zero correlation dramatically reduces combined portfolio variance. Since we are given only risk information (no expected return data), the decision is based solely on minimizing variance, and portfolio BC unambiguously dominates portfolio AB.

1.2 Chapter 8: Index Models

1.2.1 CFA Problem 1

Analysis of regression results for ABC and XYZ stocks:

The SCL (Security Characteristic Line) regression for each stock is:

\[R_i - R_f = \alpha_i + \beta_i (R_M - R_f) + \epsilon_i\]

Sample period risk-return interpretation:

For ABC stock: The negative alpha of $-3.20\%$ indicates that ABC underperformed its CAPM-expected return by 3.20% per year over the sample period, after adjusting for market risk. The beta of 0.60 indicates below-market systematic risk. The $R^2 = 0.35$ means only 35% of ABC’s return variance is explained by market movements; the residual standard deviation of 13.02% reflects substantial firm-specific risk.

For XYZ stock: The positive alpha of $+7.3\%$ indicates XYZ outperformed its risk-adjusted benchmark by 7.3% annually over the sample. Its beta of 0.97 is close to 1.0, making it a near-market-risk asset. However, $R^2 = 0.17$ is very low — only 17% of variance is explained by the market — and residual SD of 21.45% is large, indicating that most of XYZ’s return variability is idiosyncratic.

Implications for a diversified portfolio:

The historical alpha estimates should not be taken at face value as forecasts. Alpha is subject to estimation error, and high past alpha often reverts toward zero. More critically, the beta estimates are unstable across brokerages: Brokerage A reports $\beta_{ABC} = 0.62$ and $\beta_{XYZ} = 1.45$, while Brokerage B reports $\beta_{ABC} = 0.71$ and $\beta_{XYZ} = 1.25$. This divergence (particularly for XYZ, where betas differ by 0.20) reveals that beta is sensitive to the estimation window and return frequency used — the 5-year regression betas (ABC = 0.60, XYZ = 0.97) differ from the recent 2-year weekly betas from both brokerages. For XYZ specifically, if its “true” current beta is closer to 1.25–1.45 rather than 0.97, the stock carries considerably more systematic risk than historical data suggests. In a diversified portfolio, the nonsystematic risk of each holding largely cancels out, so the primary concern is getting the beta estimate right for proper risk attribution and hedging.

1.2.2 CFA Problem 2

Nonsystematic risk of Baker Fund:

Given $\rho_{Baker, M} = 0.70$, the coefficient of determination is:

\[R^2 = \rho^2 = (0.70)^2 = 0.49\]

$R^2$ represents the fraction of total variance explained by the market (systematic risk). Therefore, the fraction of total risk that is nonsystematic is:

\[\text{Nonsystematic proportion} = 1 - R^2 = 1 - 0.49 = 0.51 = \mathbf{51\%}\]

rho <- 0.70
R2  <- rho^2
nonsys <- 1 - R2
cat("R-squared:", R2, "\nNonsystematic proportion:", nonsys, "=", nonsys*100, "%")

## R-squared: 0.49 
## Nonsystematic proportion: 0.51 = 51 %

Fifty-one percent of Baker Fund’s total risk is nonsystematic (diversifiable). A well-diversified investor holding Baker Fund as one component of a broader portfolio would be primarily exposed to the remaining 49% systematic risk, with the idiosyncratic 51% being eliminable through diversification.

1.2.3 CFA Problem 3

Implied beta of Charlottesville International Fund:

Given: $\rho = 1.0$ (perfect correlation with world index), $E(R_M) = 11\%$, $E(R_{fund}) = 9\%$, $R_f = 3\%$.

Using the CAPM return equation:

\[E(R_i) = R_f + \beta_i [E(R_M) - R_f]\]

Solving for $\beta$:

\[\beta = \frac{E(R_i) - R_f}{E(R_M) - R_f} = \frac{9\% - 3\%}{11\% - 3\%} = \frac{6\%}{8\%} = 0.75\]

Rf <- 0.03; EM <- 0.11; Efund <- 0.09
beta_implied <- (Efund - Rf) / (EM - Rf)
cat("Implied beta:", beta_implied)

## Implied beta: 0.75

The implied beta is 0.75. Note that the perfect correlation of 1.0 with the world index means all of the fund’s variance is systematic — there is zero residual (firm-specific) variance. The fund simply moves proportionally with the market, but with a muted sensitivity of 0.75, which is why it earns less than the full market return of 11%.

1.2.4 CFA Problem 4

Beta is most closely associated with:

Answer: (d) Systematic risk.

Beta ($\beta_i$) is defined as the ratio of an asset’s covariance with the market portfolio to the variance of the market portfolio:

\[\beta_i = \frac{\text{Cov}(R_i, R_M)}{\sigma_M^2}\]

Beta measures only the systematic (market-related) component of an asset’s risk — the risk that cannot be eliminated through diversification. While beta is mathematically computed using a regression, which involves correlation coefficients, its conceptual meaning and practical purpose is as a measure of systematic risk sensitivity. Mean-variance analysis (option b) refers to the broader Markowitz framework. Nonsystematic risk (option c) is the opposite of what beta captures — it is the residual, firm-specific risk.

1.2.5 CFA Problem 5

Beta vs. standard deviation as risk measures:

Answer: (b) Beta measures only systematic risk, while standard deviation measures total risk.

Standard deviation captures total portfolio variance — both the systematic component (driven by market movements) and the nonsystematic component (firm-specific). Formally:

\[\sigma_i^2 = \beta_i^2 \sigma_M^2 + \sigma^2(\epsilon_i)\]

where $\beta_i^2 \sigma_M^2$ is systematic variance and $\sigma^2(\epsilon_i)$ is unsystematic variance. Standard deviation, as the square root of $\sigma_i^2$, includes both. Beta, by construction, isolates only the sensitivity to the common market factor. Option (a) is wrong because beta does not measure unsystematic risk. Options (c) and (d) are wrong because standard deviation is not limited to one type of risk.

1.3 Chapter 9: The Capital Asset Pricing Model

1.3.1 CFA Problem 8 — Portfolio R relative to the SML

Data: Portfolio R: return = 11%, SD = 10%, beta = 0.5. S&P 500: return = 14%, SD = 12%, beta = 1.0.

To locate Portfolio R relative to the SML, we compute its CAPM-predicted (required) return. The SML uses the risk-free rate implied by the market data. Using the market as a reference point where $E(R_M) = 14\%$ and $\beta_M = 1.0$:

We need $R_f$. The problem does not explicitly provide $R_f$; however, the standard approach in these textbook problems is to use the risk-free rate consistent with the SML through the market. Given the S&P return of 14% and treating it as the market portfolio, with no explicit $R_f$ we cannot uniquely pin the SML. The textbook convention for this problem uses $R_f$ such that the SML is anchored at the market. Let us assume $R_f = 6\%$ (a common textbook assumption for this problem set — this needs verification against your specific edition if a risk-free rate is provided elsewhere).

\[E(R_R)_{\text{SML}} = R_f + \beta_R [E(R_M) - R_f] = 6\% + 0.5 \times (14\% - 6\%) = 6\% + 4\% = 10\%\]

Jensen’s alpha:

\[\alpha_R = R_R - E(R_R)_{\text{SML}} = 11\% - 10\% = +1\%\]

Rf <- 0.06; Rm <- 0.14; beta_R <- 0.5; ret_R <- 0.11

SML_pred <- Rf + beta_R * (Rm - Rf)
alpha_J  <- ret_R - SML_pred

cat("SML-predicted return for Portfolio R:", SML_pred * 100, "%\n")

## SML-predicted return for Portfolio R: 10 %

cat("Jensen's Alpha:", alpha_J * 100, "%\n")

## Jensen's Alpha: 1 %

cat("Portfolio R lies: ABOVE the SML")

## Portfolio R lies: ABOVE the SML

Answer: (c) Above the SML.

Portfolio R earns 11% while the SML predicts only 10% for its beta of 0.5. The positive Jensen’s alpha of +1% means Portfolio R plots above the SML — it offers more return per unit of systematic risk than CAPM equilibrium requires.

betas <- seq(0, 1.5, by = 0.01)
sml_returns <- Rf + betas * (Rm - Rf)
sml_df <- data.frame(beta = betas, expected_return = sml_returns * 100)

ggplot(sml_df, aes(x = beta, y = expected_return)) +
  geom_line(color = "steelblue", size = 1.2) +
  geom_point(aes(x = 0.5, y = 11), color = "darkgreen", size = 4) +
  geom_point(aes(x = 1.0, y = 14), color = "orange",   size = 4) +
  geom_point(aes(x = 0,   y = Rf*100), color = "gray30", size = 3) +
  annotate("text", x = 0.5,  y = 11.3, label = "Portfolio R (Above SML)", color = "darkgreen", hjust = -0.05) +
  annotate("text", x = 1.0,  y = 14.3, label = "S&P 500 (Market)",       color = "orange",    hjust = -0.05) +
  annotate("segment", x = 0.5, xend = 0.5, y = SML_pred*100, yend = 11,
           arrow = arrow(length = unit(0.02,"npc")), color = "darkgreen", linetype = "dashed") +
  annotate("text", x = 0.52, y = (SML_pred*100 + 11)/2,
           label = paste0("α = +", round(alpha_J*100,1), "%"), color = "darkgreen", size = 3.5) +
  labs(title = "Security Market Line (SML)",
       subtitle = "Portfolio R lies above the SML (positive Jensen's alpha)",
       x = "Beta", y = "Expected Return (%)") +
  theme_minimal(base_size = 13)

1.3.2 CFA Problem 9 — Portfolio R relative to the CML

The CML plots efficient portfolios in return vs. standard deviation space. Only perfectly diversified portfolios (combinations of the market and risk-free asset) lie on the CML. The CML equation is:

\[E(R_p) = R_f + \frac{E(R_M) - R_f}{\sigma_M} \cdot \sigma_p\]

The CML-predicted return for Portfolio R’s SD of 10%:

\[E(R_R)_{\text{CML}} = 6\% + \frac{14\% - 6\%}{12\%} \times 10\% = 6\% + \frac{8\%}{12\%} \times 10\% = 6\% + 6.67\% = 12.67\%\]

sd_R <- 0.10; sd_M <- 0.12

CML_pred <- Rf + ((Rm - Rf) / sd_M) * sd_R
cat("CML-predicted return for Portfolio R's SD of 10%:", round(CML_pred*100, 2), "%\n")

## CML-predicted return for Portfolio R's SD of 10%: 12.67 %

cat("Actual return of Portfolio R: 11%\n")

## Actual return of Portfolio R: 11%

cat("Portfolio R lies: BELOW the CML")

## Portfolio R lies: BELOW the CML

Answer: (b) Below the CML.

The CML predicts that any efficient portfolio with 10% standard deviation should earn 12.67%. Portfolio R only earns 11%. The gap exists because Portfolio R is not a fully efficient, well-diversified portfolio — it contains some residual nonsystematic risk. It plots below the CML because it wastes some of its total risk budget on firm-specific variance that carries no expected return premium. Note that this is consistent with Portfolio R being above the SML: the SML rewards systematic risk (beta), and R does well on that metric; but the CML measures efficiency of total risk deployment, and R falls short.

sds <- seq(0, 0.20, by = 0.001)
cml_returns <- Rf + ((Rm - Rf) / sd_M) * sds
cml_df <- data.frame(sd = sds * 100, expected_return = cml_returns * 100)

ggplot(cml_df, aes(x = sd, y = expected_return)) +
  geom_line(color = "steelblue", size = 1.2) +
  geom_point(aes(x = 10, y = 11), color = "red",    size = 4) +
  geom_point(aes(x = 12, y = 14), color = "orange", size = 4) +
  geom_point(aes(x = 0,  y = Rf*100), color = "gray30", size = 3) +
  annotate("text", x = 10, y = 11.3, label = "Portfolio R (Below CML)", color = "red",    hjust = -0.05) +
  annotate("text", x = 12, y = 14.3, label = "Market Portfolio",        color = "orange", hjust = -0.05) +
  annotate("segment", x = 10, xend = 10, y = 11, yend = CML_pred*100,
           arrow = arrow(length = unit(0.02,"npc")), color = "red", linetype = "dashed") +
  annotate("text", x = 10.3, y = (11 + CML_pred*100)/2,
           label = paste0("Gap: ", round((CML_pred - ret_R)*100, 2), "%"), color = "red", size = 3.5) +
  labs(title = "Capital Market Line (CML)",
       subtitle = "Portfolio R lies below the CML (not fully efficient)",
       x = "Standard Deviation (%)", y = "Expected Return (%)") +
  theme_minimal(base_size = 13)

1.3.3 CFA Problem 10 — Should investors expect higher return from Portfolio A than Portfolio B?

Portfolio A: beta = 1.0, high specific risk. Portfolio B: beta = 1.0, low specific risk.

No — investors should not expect a higher return from Portfolio A under CAPM.

The core insight of CAPM is that only systematic risk (beta) is priced in equilibrium. Because both portfolios have identical betas of 1.0, CAPM predicts the same expected return for both:

\[E(R_A) = E(R_B) = R_f + 1.0 \times [E(R_M) - R_f]\]

The higher specific (idiosyncratic) risk of Portfolio A is diversifiable. In a large, well-diversified portfolio, firm-specific shocks average out and carry no expected return premium. Rational investors who hold the market portfolio (as CAPM assumes) will not pay less for Portfolio A just because it has higher idiosyncratic variance — because that variance simply disappears when A is combined with many other assets. Only the covariance with the market (captured by beta) contributes to the risk of the investor’s overall portfolio and therefore commands a risk premium. Expecting a higher return from A would imply a mispricing that diversified investors would arbitrage away.

1.4 Chapter 10: Arbitrage Pricing Theory and Multifactor Models

Setup: Two-factor APT model. Factor risk premiums: GDP growth = 8%, Inflation = 2%. Risk-free rate = 4%.

APT expected return formula:

\[E(R_i) = R_f + \beta_{i,GDP} \times \lambda_{GDP} + \beta_{i,Inf} \times \lambda_{Inf}\]

1.4.1 Problem 13 — High Growth Fund Expected Return

Sensitivities: $\beta_{GDP} = 1.25$, $\beta_{Inf} = 1.5$

\[E(R_{HG}) = 4\% + 1.25 \times 8\% + 1.5 \times 2\% = 4\% + 10\% + 3\% = 17\%\]

Rf_apt <- 0.04
lam_GDP <- 0.08; lam_Inf <- 0.02
b_HG_GDP <- 1.25; b_HG_Inf <- 1.5

E_HG <- Rf_apt + b_HG_GDP*lam_GDP + b_HG_Inf*lam_Inf
cat("Expected return of High Growth Fund:", E_HG*100, "%")

## Expected return of High Growth Fund: 17 %

The equilibrium expected return of the High Growth Fund is 17%.

1.4.2 Problem 14 — Arbitrage Opportunity for Large Cap Fund

McCracken’s APT estimate for the Large Cap Fund: $\beta_{GDP} = 0.75$, $\beta_{Inf} = 1.25$

\[E(R_{LC})_{\text{APT}} = 4\% + 0.75 \times 8\% + 1.25 \times 2\% = 4\% + 6\% + 2.5\% = 12.5\%\]

Kwon’s fundamental estimate: $R_f + 8.5\% = 4\% + 8.5\% = 12.5\%$

b_LC_GDP <- 0.75; b_LC_Inf <- 1.25

E_LC_APT <- Rf_apt + b_LC_GDP*lam_GDP + b_LC_Inf*lam_Inf
E_LC_fund <- Rf_apt + 0.085

cat("APT estimate for Large Cap Fund:", E_LC_APT*100, "%\n")

## APT estimate for Large Cap Fund: 12.5 %

cat("Kwon fundamental estimate:      ", E_LC_fund*100, "%\n")

## Kwon fundamental estimate:       12.5 %

cat("Difference:", (E_LC_fund - E_LC_APT)*100, "%")

## Difference: 0 %

No arbitrage opportunity exists. Both estimates agree at 12.5%. When the APT-implied expected return equals the fundamentally-derived expected return, the asset is fairly priced relative to the factor structure. An arbitrage opportunity would require a discrepancy — where one could construct a zero-cost, zero-risk portfolio with a positive expected return by going long the underpriced asset and short the replicating APT portfolio.

1.4.3 Problem 15 — Weight in the Utility Fund for GDP Fund Construction

The GDP Fund must have: $\beta_{GDP} = 1.0$ and $\beta_{Inf} = 0$. Using weights $w_1$ (High Growth), $w_2$ (Large Cap), $w_3$ (Utility), with $w_1 + w_2 + w_3 = 1$:

Factor sensitivities of the three funds:

	GDP	Inflation
High Growth	1.25	1.5
Large Cap	0.75	1.25
Utility	1.0	2.0

System of equations:

\[1.25 w_1 + 0.75 w_2 + 1.0 w_3 = 1 \quad \text{(GDP constraint)}\] \[1.5 w_1 + 1.25 w_2 + 2.0 w_3 = 0 \quad \text{(Inflation constraint)}\] \[w_1 + w_2 + w_3 = 1 \quad \text{(weights sum to 1)}\]

# Matrix form: A %*% w = b
A <- matrix(c(1.25, 0.75, 1.0,
              1.5,  1.25, 2.0,
              1.0,  1.0,  1.0), nrow = 3, byrow = TRUE)
b <- c(1, 0, 1)

w <- solve(A, b)
names(w) <- c("High Growth", "Large Cap", "Utility")
round(w, 4)

## High Growth   Large Cap     Utility 
##         1.6         1.6        -2.2

Answer: (a) The weight in the Utility Fund is $w_3 \approx -2.2$.

The negative weight means the GDP Fund must short the Utility Fund to zero out its inflation exposure. This is feasible in an arbitrage/factor-mimicking context where short positions are permitted.

1.4.4 Problem 16 — Who is correct about the GDP Fund?

Answer: (b) Both are correct.

McCracken argues the GDP Fund suits investors who live off steady income (retirees). This is correct because the GDP Fund has zero inflation sensitivity ($\beta_{Inf} = 0$). Investors whose liabilities or consumption are inflation-linked would benefit from a fund that does not fluctuate with unexpected inflation — their real income stream is better protected.

Stiles argues the fund suits investors who benefit from supply-side policy success. This is also correct because the fund has unit GDP sensitivity ($\beta_{GDP} = 1.0$). Successful supply-side policies (tax cuts, deregulation, productivity improvements) tend to boost real GDP growth, directly increasing the fund’s return. Both characterizations correctly identify distinct investor clienteles for whom the GDP Fund’s specific factor exposures are appropriate.

2 Part 2 — R Code Questions (40%)

2.1 Q1: Import ETF Data

tickers <- c("SPY","QQQ","EEM","IWM","EFA","TLT","IYR","GLD")

raw_prices <- tq_get(tickers,
                     from = "2010-01-01",
                     to   = "2026-06-10",
                     get  = "stock.prices")

# Extract adjusted close and pivot to wide format
prices_wide <- raw_prices %>%
  select(symbol, date, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  arrange(date)

# Convert to xts for time-series operations
prices_xts <- xts(prices_wide[,-1], order.by = as.Date(prices_wide$date))

# Display head and tail
head_df <- as.data.frame(head(prices_xts, 6))
head_df <- round(head_df, 4)
head_df <- cbind(Date = rownames(head_df), head_df)

tail_df <- as.data.frame(tail(prices_xts, 6))
tail_df <- round(tail_df, 4)
tail_df <- cbind(Date = rownames(tail_df), tail_df)

bind_rows(head_df, tail_df) %>%
  kable(caption = "ETF Adjusted Closing Prices — Head (top 6) and Tail (bottom 6)",
        row.names = FALSE) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE, font_size = 11) %>%
  row_spec(1:6, background = "#f0f8ff") %>%
  row_spec(7:12, background = "#fff8f0")

ETF Adjusted Closing Prices — Head (top 6) and Tail (bottom 6)
Date	SPY	QQQ	EEM	IWM	EFA	TLT	IYR	GLD
2010-01-04	84.7964	40.2908	30.3515	51.3665	35.1284	55.7095	26.7681	109.80
2010-01-05	85.0209	40.2908	30.5718	51.1899	35.1594	56.0693	26.8324	109.70
2010-01-06	85.0807	40.0478	30.6358	51.1418	35.3080	55.3187	26.8207	111.51
2010-01-07	85.4399	40.0738	30.4581	51.5191	35.1718	55.4118	27.0603	110.82
2010-01-08	85.7242	40.4036	30.6997	51.8001	35.4504	55.3870	26.8791	111.37
2010-01-11	85.8439	40.2387	30.6358	51.5914	35.7415	55.0830	27.0077	112.85
2026-06-02	759.5700	746.1600	70.8000	291.6600	105.0200	85.6500	99.9900	411.95
2026-06-03	754.2400	744.2100	69.9200	287.6700	104.1200	85.3100	100.0000	407.87
2026-06-04	757.0900	740.6100	69.1000	292.0100	104.9500	85.5000	101.7900	411.27
2026-06-05	737.5500	705.0600	64.5900	281.6500	102.2600	85.0600	102.5400	396.24
2026-06-08	739.2200	716.0700	65.7500	284.1100	102.8800	84.6200	101.0800	397.27
2026-06-09	737.0500	707.8300	65.8200	285.0200	102.9000	85.1200	103.4900	390.78

2.2 Q2: Calculate Weekly and Monthly Returns

# Weekly returns (all 8 ETFs)
weekly_ret_xts <- do.call(merge, lapply(tickers, function(tk) {
  weeklyReturn(prices_xts[, tk], type = "arithmetic")
}))
colnames(weekly_ret_xts) <- tickers
weekly_ret_xts <- weekly_ret_xts[-1, ]

# Compute returns for all tickers — monthly simple returns via xts
monthly_ret_xts <- do.call(merge, lapply(tickers, function(tk) {
  px <- prices_xts[, tk]
  monthlyReturn(px, type = "arithmetic")
}))
colnames(monthly_ret_xts) <- tickers

# Remove first row (NA from return calculation)
monthly_ret_xts <- monthly_ret_xts[-1, ]

# Summary statistics
summary_stats <- do.call(rbind, lapply(tickers, function(tk) {
  x <- as.numeric(monthly_ret_xts[, tk])
  data.frame(
    Ticker = tk,
    Mean   = round(mean(x, na.rm=TRUE)*100, 3),
    SD     = round(sd(x, na.rm=TRUE)*100, 3),
    Min    = round(min(x, na.rm=TRUE)*100, 3),
    Max    = round(max(x, na.rm=TRUE)*100, 3),
    Sharpe = round(mean(x, na.rm=TRUE)/sd(x, na.rm=TRUE), 3)
  )
}))

summary_stats %>%
  kable(caption = "Monthly Return Summary Statistics (%)",
        col.names = c("Ticker","Mean (%)","Std Dev (%)","Min (%)","Max (%)","Sharpe (rf=0)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Monthly Return Summary Statistics (%)
Ticker	Mean (%)	Std Dev (%)	Min (%)	Max (%)	Sharpe (rf=0)
SPY	1.218	4.170	-12.487	12.698	0.292
QQQ	1.635	5.081	-13.596	15.690	0.322
EEM	0.588	5.266	-17.895	16.268	0.112
IWM	1.065	5.648	-21.477	18.244	0.189
EFA	0.687	4.486	-14.107	14.269	0.153
TLT	0.277	3.910	-9.424	13.206	0.071
IYR	0.833	4.819	-19.632	13.190	0.173
GLD	0.774	4.701	-11.062	12.275	0.165

2.3 Q3: Convert Monthly Returns to Tibble

monthly_ret_tbl <- tk_tbl(monthly_ret_xts, rename_index = "date")

monthly_ret_tbl %>%
  head(6) %>%
  mutate(across(-date, ~round(., 5))) %>%
  kable(caption = "Monthly Returns — Tibble Format (first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Monthly Returns — Tibble Format (first 6 rows)
date	SPY	QQQ	EEM	IWM	EFA	TLT	IYR	GLD
2010-02-26	0.03119	0.04604	0.01776	0.04475	0.00267	-0.00342	0.05457	0.03275
2010-03-31	0.06088	0.07711	0.08111	0.08231	0.06385	-0.02057	0.09749	-0.00439
2010-04-30	0.01547	0.02242	-0.00166	0.05678	-0.02805	0.03322	0.06388	0.05883
2010-05-28	-0.07945	-0.07392	-0.09394	-0.07537	-0.11193	0.05108	-0.05684	0.03051
2010-06-30	-0.05174	-0.05976	-0.01399	-0.07743	-0.02062	0.05798	-0.04670	0.02355
2010-07-30	0.06830	0.07258	0.10932	0.06731	0.11610	-0.00946	0.09405	-0.05087

2.4 Q4: Download Fama-French 3 Factors

# Install frenchdata if needed
if (!requireNamespace("frenchdata", quietly = TRUE)) {
  install.packages("frenchdata")
}
library(frenchdata)

# Download FF3 monthly factors
ff3_raw <- download_french_data("Fama/French 3 Factors")
ff3_monthly_raw <- ff3_raw$subsets$data[[1]]   # monthly table is first subset

# Clean and convert
ff3_monthly <- ff3_monthly_raw %>%
  rename(date_raw = date) %>%
  mutate(
    date_raw = as.character(date_raw),
    date     = as.Date(paste0(date_raw, "01"), format = "%Y%m%d"),
    date     = ceiling_date(date, "month") - days(1),
    Mkt.RF   = as.numeric(`Mkt-RF`) / 100,
    SMB      = as.numeric(SMB)      / 100,
    HML      = as.numeric(HML)      / 100
  ) %>%
  filter(!is.na(Mkt.RF), date >= as.Date("2010-01-01")) %>%
  select(date, Mkt.RF, SMB, HML)

# Convert to xts
ff3_xts <- xts(ff3_monthly[, c("Mkt.RF","SMB","HML")],
               order.by = ff3_monthly$date)

ff3_xts %>%
  head(6) %>%
  as.data.frame() %>%
  mutate(across(everything(), ~round(., 5))) %>%
  tibble::rownames_to_column("Date") %>%
  kable(caption = "Fama-French 3 Factors — Monthly (decimal, first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Fama-French 3 Factors — Monthly (decimal, first 6 rows)
Date	Mkt.RF	SMB	HML
2010-01-31	-0.0335	0.0043	0.0033
2010-02-28	0.0339	0.0118	0.0318
2010-03-31	0.0630	0.0146	0.0219
2010-04-30	0.0199	0.0484	0.0296
2010-05-31	-0.0790	0.0013	-0.0248
2010-06-30	-0.0556	-0.0179	-0.0473

2.5 Q5: Merge Monthly Returns and FF3 Factors

# Align index format: both to end-of-month dates
# monthly_ret_xts index already end-of-month from monthlyReturn()
# Make sure ff3_xts dates align

# Convert both to tibbles and merge by date
ret_tbl <- tk_tbl(monthly_ret_xts, rename_index = "date") %>%
  mutate(date = as.Date(date))

ff3_tbl <- tk_tbl(ff3_xts, rename_index = "date") %>%
  mutate(date = as.Date(date))

# Inner join on year-month to handle slight end-of-month discrepancies
ret_tbl2 <- ret_tbl %>%
  mutate(ym = format(date, "%Y-%m"))

ff3_tbl2 <- ff3_tbl %>%
  mutate(ym = format(date, "%Y-%m"))

merged_tbl <- inner_join(ret_tbl2, ff3_tbl2, by = "ym") %>%
  select(-ym, -date.y) %>%
  rename(date = date.x)

merged_tbl %>%
  head(6) %>%
  mutate(across(-date, ~round(., 5))) %>%
  kable(caption = "Merged Monthly Returns + FF3 Factors (first 6 rows)") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE, font_size = 11)

Merged Monthly Returns + FF3 Factors (first 6 rows)
date	SPY	QQQ	EEM	IWM	EFA	TLT	IYR	GLD	Mkt.RF	SMB	HML
2010-02-26	0.03119	0.04604	0.01776	0.04475	0.00267	-0.00342	0.05457	0.03275	0.0339	0.0118	0.0318
2010-03-31	0.06088	0.07711	0.08111	0.08231	0.06385	-0.02057	0.09749	-0.00439	0.0630	0.0146	0.0219
2010-04-30	0.01547	0.02242	-0.00166	0.05678	-0.02805	0.03322	0.06388	0.05883	0.0199	0.0484	0.0296
2010-05-28	-0.07945	-0.07392	-0.09394	-0.07537	-0.11193	0.05108	-0.05684	0.03051	-0.0790	0.0013	-0.0248
2010-06-30	-0.05174	-0.05976	-0.01399	-0.07743	-0.02062	0.05798	-0.04670	0.02355	-0.0556	-0.0179	-0.0473
2010-07-30	0.06830	0.07258	0.10932	0.06731	0.11610	-0.00946	0.09405	-0.05087	0.0692	0.0022	-0.0050

2.6 Q6: CAPM-Based GMV Portfolio (Single Month)

# --- Subset training window: 2010/02 – 2015/01 ---
train_tbl <- merged_tbl %>%
  filter(date >= as.Date("2010-02-01") & date <= as.Date("2015-01-31"))

# Excess returns (assets and market)
excess_ret  <- as.matrix(train_tbl[, tickers]) - as.numeric(train_tbl$Mkt.RF)
# Note: Mkt.RF is already excess market return
mkt_excess  <- as.numeric(train_tbl$Mkt.RF)

n_assets    <- length(tickers)

# Regress each asset's excess return on Mkt-RF → extract beta and residual variance
betas_capm  <- numeric(n_assets)
resid_var   <- numeric(n_assets)

for (i in seq_along(tickers)) {
  y   <- as.matrix(train_tbl[, tickers[i]]) - train_tbl$Mkt.RF  # excess return of asset i
  fit <- lm(y ~ mkt_excess)
  betas_capm[i] <- coef(fit)[2]
  resid_var[i]  <- var(residuals(fit))
}

names(betas_capm) <- tickers
names(resid_var)  <- tickers

# Market variance
sigma2_mkt <- var(mkt_excess)

# CAPM covariance matrix: Sigma = beta %*% t(beta) * sigma2_mkt + D
Sigma_CAPM <- betas_capm %o% betas_capm * sigma2_mkt + diag(resid_var)

# --- Solve GMV using quadprog ---
# Minimize w'Sigma w subject to: sum(w)=1, w>=0
Dmat <- 2 * Sigma_CAPM
dvec <- rep(0, n_assets)

# Constraints: equality (sum=1) + non-negativity
Amat <- cbind(rep(1, n_assets), diag(n_assets))
bvec <- c(1, rep(0, n_assets))

gmv_capm <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
w_capm   <- gmv_capm$solution
names(w_capm) <- tickers

# Apply weights to 2015/02 returns
ret_201502 <- merged_tbl %>%
  filter(format(date, "%Y-%m") == "2015-02") %>%
  select(all_of(tickers)) %>%
  as.numeric()

realized_capm <- sum(w_capm * ret_201502)

# --- Display ---
weight_df <- data.frame(
  ETF    = tickers,
  Weight = round(w_capm * 100, 2)
)

weight_df %>%
  kable(caption = "CAPM-Based GMV Weights (%)", col.names = c("ETF","Weight (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

CAPM-Based GMV Weights (%)
	ETF	Weight (%)
SPY	SPY	87.19
QQQ	QQQ	3.33
EEM	EEM	1.25
IWM	IWM	5.95
EFA	EFA	1.95
TLT	TLT	0.00
IYR	IYR	0.33
GLD	GLD	0.00

cat("Realized Portfolio Return (2015/02) — CAPM GMV:", round(realized_capm * 100, 4), "%\n")

## Realized Portfolio Return (2015/02) — CAPM GMV: 5.6648 %

# Bar chart
ggplot(weight_df, aes(x = reorder(ETF, Weight), y = Weight, fill = ETF)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  geom_text(aes(label = paste0(Weight, "%")), hjust = -0.1, size = 3.5) +
  labs(title = "CAPM-Based GMV Portfolio Weights",
       subtitle = "Training window: 2010/02 – 2015/01",
       x = NULL, y = "Weight (%)") +
  theme_minimal(base_size = 13) +
  scale_fill_brewer(palette = "Set2") +
  ylim(0, max(weight_df$Weight) * 1.15)

2.7 Q7: FF3-Based GMV Portfolio (Single Month)

# Factor returns matrix
F_mat <- as.matrix(train_tbl[, c("Mkt.RF","SMB","HML")])

# Regress each asset on all 3 FF factors
B_mat      <- matrix(0, n_assets, 3)   # 8 x 3 loading matrix
resid_var2 <- numeric(n_assets)

for (i in seq_along(tickers)) {
  y   <- as.matrix(train_tbl[, tickers[i]]) - train_tbl$Mkt.RF
  fit <- lm(y ~ F_mat - 1)  # no intercept; factor loadings only (or with intercept)
  B_mat[i, ]    <- coef(fit)
  resid_var2[i] <- var(residuals(fit))
}
rownames(B_mat) <- tickers
colnames(B_mat) <- c("Mkt.RF","SMB","HML")

# Factor covariance matrix (3x3)
Sigma_f <- cov(F_mat)

# FF3 covariance: Sigma = B %*% Sigma_f %*% t(B) + D
Sigma_FF3 <- B_mat %*% Sigma_f %*% t(B_mat) + diag(resid_var2)

# Solve GMV
Dmat2 <- 2 * Sigma_FF3
gmv_ff3 <- solve.QP(Dmat2, dvec, Amat, bvec, meq = 1)
w_ff3   <- gmv_ff3$solution
names(w_ff3) <- tickers

realized_ff3 <- sum(w_ff3 * ret_201502)

# Display
weight_ff3_df <- data.frame(ETF = tickers, Weight = round(w_ff3 * 100, 2))

weight_ff3_df %>%
  kable(caption = "FF3-Based GMV Weights (%)", col.names = c("ETF","Weight (%)")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

FF3-Based GMV Weights (%)
	ETF	Weight (%)
SPY	SPY	83.52
QQQ	QQQ	1.85
EEM	EEM	0.24
IWM	IWM	13.60
EFA	EFA	0.56
TLT	TLT	0.00
IYR	IYR	0.19
GLD	GLD	0.06

cat("Realized Portfolio Return (2015/02) — FF3 GMV:", round(realized_ff3 * 100, 4), "%\n")

## Realized Portfolio Return (2015/02) — FF3 GMV: 5.6734 %

# Side-by-side comparison bar chart
compare_df <- data.frame(
  ETF    = rep(tickers, 2),
  Weight = c(w_capm, w_ff3) * 100,
  Model  = rep(c("CAPM","FF3"), each = n_assets)
)

ggplot(compare_df, aes(x = ETF, y = Weight, fill = Model)) +
  geom_col(position = "dodge") +
  labs(title = "GMV Weights: CAPM vs. FF3 Model",
       subtitle = "Training window: 2010/02 – 2015/01",
       x = NULL, y = "Weight (%)") +
  scale_fill_manual(values = c("CAPM" = "#2196F3", "FF3" = "#FF5722")) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")

2.8 Q8: Rolling Backtest (2015/02 – 2026/05)

# All monthly dates available
all_dates <- merged_tbl$date

# Identify the index positions for rolling windows
# Training starts at 2010/02 (row 1 of merged_tbl), ends at 2015/01 (row 60)
# Out-of-sample starts 2015/02

# Find row where 2015-01 ends
start_oos_idx <- which(format(all_dates, "%Y-%m") == "2015-02")[1]

# Number of out-of-sample months
n_oos <- nrow(merged_tbl) - start_oos_idx + 1

# Initialize storage
oos_dates   <- all_dates[start_oos_idx:(start_oos_idx + n_oos - 1)]
ret_capm_roll <- numeric(n_oos)
ret_ff3_roll  <- numeric(n_oos)

for (t in seq_len(n_oos)) {
  # Row index of the out-of-sample month
  oos_row <- start_oos_idx + t - 1
  # Training window: 60 rows ending one row before oos_row
  train_end   <- oos_row - 1
  train_start <- train_end - 59

  if (train_start < 1) next

  train_w <- merged_tbl[train_start:train_end, ]

  # --- CAPM covariance ---
  mkt_ex_t <- as.numeric(train_w$Mkt.RF)
  betas_t  <- numeric(n_assets)
  rv_t     <- numeric(n_assets)
  for (i in seq_along(tickers)) {
    y_t <- as.numeric(train_w[[tickers[i]]]) - mkt_ex_t
    fit_t <- lm(y_t ~ mkt_ex_t)
    betas_t[i] <- coef(fit_t)[2]
    rv_t[i]    <- var(residuals(fit_t))
  }
  Sig_C <- betas_t %o% betas_t * var(mkt_ex_t) + diag(rv_t)

  # GMV CAPM
  gC <- tryCatch(
    solve.QP(2*Sig_C, rep(0,n_assets), Amat, bvec, meq=1)$solution,
    error = function(e) rep(1/n_assets, n_assets)
  )

  # --- FF3 covariance ---
  F_t <- as.matrix(train_w[, c("Mkt.RF","SMB","HML")])
  B_t <- matrix(0, n_assets, 3)
  rv2_t <- numeric(n_assets)
  for (i in seq_along(tickers)) {
    y_t  <- as.numeric(train_w[[tickers[i]]]) - mkt_ex_t
    fit_f <- lm(y_t ~ F_t - 1)
    B_t[i,]  <- coef(fit_f)
    rv2_t[i] <- var(residuals(fit_f))
  }
  Sig_F <- B_t %*% cov(F_t) %*% t(B_t) + diag(rv2_t)

  # GMV FF3
  gF <- tryCatch(
    solve.QP(2*Sig_F, rep(0,n_assets), Amat, bvec, meq=1)$solution,
    error = function(e) rep(1/n_assets, n_assets)
  )

  # OOS returns
  oos_r <- as.numeric(merged_tbl[oos_row, tickers])
  ret_capm_roll[t] <- sum(gC * oos_r)
  ret_ff3_roll[t]  <- sum(gF * oos_r)
}

# Build results data frame
backtest_df <- data.frame(
  date         = oos_dates,
  ret_CAPM     = ret_capm_roll,
  ret_FF3      = ret_ff3_roll
) %>% filter(date >= as.Date("2015-02-01"))

# Cumulative returns
backtest_df <- backtest_df %>%
  mutate(
    cum_CAPM = cumprod(1 + ret_CAPM),
    cum_FF3  = cumprod(1 + ret_FF3)
  )

2.8.1 Cumulative Return Chart

backtest_long <- backtest_df %>%
  select(date, cum_CAPM, cum_FF3) %>%
  pivot_longer(-date, names_to = "Model", values_to = "CumReturn") %>%
  mutate(Model = recode(Model,
                        "cum_CAPM" = "CAPM GMV",
                        "cum_FF3"  = "FF3 GMV"))

ggplot(backtest_long, aes(x = date, y = CumReturn, color = Model)) +
  geom_line(size = 1.1) +
  scale_color_manual(values = c("CAPM GMV" = "#2196F3", "FF3 GMV" = "#FF5722")) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  labs(title    = "Cumulative Returns: CAPM GMV vs. FF3 GMV",
       subtitle = "Rolling 60-month window backtest: 2015/02 – 2026/05",
       x = NULL, y = "Cumulative Return",
       color = "Strategy") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")

2.8.2 Performance Summary Table

# Helper functions
ann_ret <- function(r) prod(1 + r)^(12/length(r)) - 1
ann_vol <- function(r) sd(r) * sqrt(12)
sharpe  <- function(r) ann_ret(r) / ann_vol(r)

max_drawdown <- function(r) {
  cum_r  <- cumprod(1 + r)
  peak   <- cummax(cum_r)
  dd     <- (cum_r - peak) / peak
  min(dd)
}

perf_df <- data.frame(
  Metric = c("Annualized Return", "Annualized Volatility",
             "Sharpe Ratio (rf=0)", "Max Drawdown"),
  CAPM_GMV = c(
    paste0(round(ann_ret(backtest_df$ret_CAPM)*100, 2), "%"),
    paste0(round(ann_vol(backtest_df$ret_CAPM)*100, 2), "%"),
    round(sharpe(backtest_df$ret_CAPM), 3),
    paste0(round(max_drawdown(backtest_df$ret_CAPM)*100, 2), "%")
  ),
  FF3_GMV = c(
    paste0(round(ann_ret(backtest_df$ret_FF3)*100, 2), "%"),
    paste0(round(ann_vol(backtest_df$ret_FF3)*100, 2), "%"),
    round(sharpe(backtest_df$ret_FF3), 3),
    paste0(round(max_drawdown(backtest_df$ret_FF3)*100, 2), "%")
  )
)

perf_df %>%
  kable(caption = "Performance Comparison: CAPM GMV vs. FF3 GMV (2015/02–2026/05)",
        col.names = c("Metric","CAPM GMV","FF3 GMV")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width = FALSE)

Performance Comparison: CAPM GMV vs. FF3 GMV (2015/02–2026/05)
Metric	CAPM GMV	FF3 GMV
Annualized Return	13.75%	13.47%
Annualized Volatility	15.28%	15.67%
Sharpe Ratio (rf=0)	0.899	0.86
Max Drawdown	-24.68%	-24.88%

2.8.3 Drawdown Chart

# Compute drawdown series
dd_series <- backtest_df %>%
  mutate(
    cum_CAPM_peak = cummax(cum_CAPM),
    cum_FF3_peak  = cummax(cum_FF3),
    dd_CAPM       = (cum_CAPM - cum_CAPM_peak) / cum_CAPM_peak,
    dd_FF3        = (cum_FF3  - cum_FF3_peak)  / cum_FF3_peak
  ) %>%
  select(date, dd_CAPM, dd_FF3) %>%
  pivot_longer(-date, names_to = "Model", values_to = "Drawdown") %>%
  mutate(Model = recode(Model,
                        "dd_CAPM" = "CAPM GMV",
                        "dd_FF3"  = "FF3 GMV"))

ggplot(dd_series, aes(x = date, y = Drawdown, fill = Model)) +
  geom_area(alpha = 0.5, position = "identity") +
  scale_fill_manual(values = c("CAPM GMV" = "#2196F3", "FF3 GMV" = "#FF5722")) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  labs(title    = "Drawdown Chart: CAPM GMV vs. FF3 GMV",
       subtitle = "Rolling 60-month window backtest: 2015/02 – 2026/05",
       x = NULL, y = "Drawdown",
       fill = "Strategy") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "top")

114035109_Jayden Khalifa Armand Investment Portfolio Analysis — Final Exam

114035109 - Jayden Khalifa Armand

2026-06-10