Questions from Textbook

Chapter 7

CFA 1

a. Will limiting to 20 stocks increase or decrease portfolio risk?

Answer: Increase risk.

The variance of an equally weighted portfolio of \(n\) stocks with average variance \(\bar{\sigma}^2\) and average covariance \(\overline{Cov}\) is:

\[\sigma_P^2 = \frac{1}{n}\bar{\sigma}^2 + \frac{n-1}{n}\overline{Cov}\]

As \(n \to \infty\), the first term (average variance scaled by \(1/n\)) vanishes, leaving only systematic covariance risk. For \(n = 40\) vs \(n = 20\):

\[\sigma_{P,40}^2 = \frac{1}{40}\bar{\sigma}^2 + \frac{39}{40}\overline{Cov} \qquad \text{vs} \qquad \sigma_{P,20}^2 = \frac{1}{20}\bar{\sigma}^2 + \frac{19}{20}\overline{Cov}\]

The difference is:

\[\sigma_{P,20}^2 - \sigma_{P,40}^2 = \left(\frac{1}{20} - \frac{1}{40}\right)\bar{\sigma}^2 + \left(\frac{19}{20} - \frac{39}{40}\right)\overline{Cov} = \frac{1}{40}\bar{\sigma}^2 - \frac{1}{40}\overline{Cov} = \frac{\bar{\sigma}^2 - \overline{Cov}}{40} > 0\]

Since \(\bar{\sigma}^2 > \overline{Cov}\) for any realistic portfolio, reducing \(n\) from 40 to 20 increases portfolio variance. Each stock now carries weight \(\frac{1}{20} = 5\%\) instead of \(\frac{1}{40} = 2.5\%\), so the diversifiable term \(\frac{1}{n}\bar{\sigma}^2\) doubles.
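
The comparison above can be checked numerically. The sketch below is only an illustration: `avg_var` and `avg_cov` are hypothetical inputs (the problem gives no numbers), and `ew_portfolio_var` is a helper name introduced here.

```r
# Variance of an equally weighted portfolio of n stocks.
# avg_var and avg_cov are hypothetical illustrative inputs (not from the text).
ew_portfolio_var <- function(n, avg_var, avg_cov) {
  avg_var / n + (n - 1) / n * avg_cov
}

avg_var <- 0.09   # assumed average variance (30% standard deviation)
avg_cov <- 0.03   # assumed average covariance

var_20 <- ew_portfolio_var(20, avg_var, avg_cov)
var_40 <- ew_portfolio_var(40, avg_var, avg_cov)

# The gap should equal (avg_var - avg_cov) / 40
gap <- var_20 - var_40
print(c(var_20 = var_20, var_40 = var_40, gap = gap))
print(all.equal(gap, (avg_var - avg_cov) / 40))
```

Any inputs with `avg_var > avg_cov` give a positive gap, which is the sense in which the 20-stock portfolio is riskier.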

b. Can Hennessy reduce from 40 to 20 without significantly increasing risk?

Answer: Yes, if he retains stocks with low pairwise correlations.

The key insight is that covariance, not just variance, drives portfolio risk. If Hennessy drops stocks that are highly correlated with retained holdings, the effective \(\overline{Cov}\) of the remaining 20 stocks stays low:

\[\sigma_P^2 = \sum_i w_i^2 \sigma_i^2 + \sum_i \sum_{j \neq i} w_i w_j \rho_{ij} \sigma_i \sigma_j\]

By eliminating pairs with \(\rho_{ij} \approx 1\) (which provide no additional diversification), the cross-terms barely increase even as \(n\) falls from 40 to 20. Since Hennessy can identify the 10 best ideas each year and these are likely in different sectors/styles, he can construct a 20-stock portfolio whose pairwise correlations are sufficiently low to preserve most of the diversification benefit.


CFA 2

Answer: Reduction to 10 stocks would be less advantageous.

The marginal diversification benefit of adding the \(n\)-th stock is:

\[\Delta\sigma_P^2 \approx -\frac{\bar{\sigma}^2 - \overline{Cov}}{n^2}\]

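The size of this benefit falls off as \(1/n^2\), so each stock matters far more when \(n\) is small. Going from 40 → 20 raises variance by \(\frac{\bar{\sigma}^2 - \overline{Cov}}{40}\), while going from 20 → 10 raises it by \(\frac{\bar{\sigma}^2 - \overline{Cov}}{20}\), twice as much. The first cut removes partially redundant stocks; the second removes stocks that were contributing genuine independent risk reduction.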

Furthermore, with only 10 stocks each carrying weight \(w = \frac{1}{10} = 10\%\), unsystematic risk becomes material again:

\[\text{Unsystematic risk contribution} = \frac{1}{n}\bar{\sigma}_{e}^2 = \frac{1}{10}\bar{\sigma}_{e}^2\]

versus \(\frac{1}{20}\bar{\sigma}_{e}^2\) for the 20-stock case, which is double the idiosyncratic variance exposure. A total loss on a single position now costs the portfolio about 10%, which is substantial. The stock-picking alpha per position must be very high to compensate for this risk increase.
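
To see how quickly the marginal benefit shrinks, the following sketch compares the exact variance change from adding the \(n\)-th stock with the \(1/n^2\) approximation. The inputs `avg_var` and `avg_cov` are hypothetical and chosen only for illustration.

```r
# Exact change in equally weighted portfolio variance from adding the n-th stock,
# compared with the -(avg_var - avg_cov) / n^2 approximation.
# avg_var and avg_cov are hypothetical inputs chosen only for illustration.
ew_var <- function(n, avg_var, avg_cov) avg_var / n + (n - 1) / n * avg_cov

avg_var <- 0.09
avg_cov <- 0.03
n <- c(10, 20, 40)

exact  <- ew_var(n, avg_var, avg_cov) - ew_var(n - 1, avg_var, avg_cov)
approx <- -(avg_var - avg_cov) / n^2

marginal <- data.frame(n = n, exact = exact, approx = approx)
print(marginal)
```

The exact change works out to \(-(\bar{\sigma}^2 - \overline{Cov})/(n(n-1))\), so the approximation is closest at large \(n\).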


CFA 3

Answer: The broader fund-level view makes concentration less costly.

Total portfolio variance of the full Wilstead Fund is:

\[\sigma_{Fund}^2 = \left(\frac{W_{Hennessy}}{W_{Total}}\right)^2 \sigma_{Hennessy}^2 + \left(\frac{W_{Others}}{W_{Total}}\right)^2 \sigma_{Others}^2 + 2\left(\frac{W_{Hennessy}}{W_{Total}}\right)\left(\frac{W_{Others}}{W_{Total}}\right)\sigma_{Hennessy,Others}\]

Hennessy manages $30M of $280M total, a weight of only \(\frac{30}{280} \approx 10.7\%\). Squaring this, Hennessy’s standalone variance enters the fund-level variance with a coefficient of only \((0.107)^2 \approx 0.011\). Concentrating to 10 stocks adds mainly firm-specific risk, which raises \(\sigma_{Hennessy}^2\) but leaves the covariance term with the other managers largely unchanged. So even if Hennessy’s portfolio variance doubles, the effect on fund-level risk is small: the other 5 managers collectively hold 150+ stocks across $250M and dominate the fund’s risk profile.

Therefore the committee should focus on maximising the alpha from Hennessy’s best 10–20 ideas rather than imposing diversification constraints that are already satisfied at the fund level.
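
A rough numerical sketch of the fund-level argument follows. Only the 30/280 weight comes from the problem; the variances and the covariance are hypothetical, and the sketch assumes that concentration adds firm-specific risk only, so the covariance with the other managers is held fixed.

```r
# Fund-level variance when Hennessy's sleeve is combined with the other managers.
# Only the 30/280 weight comes from the text; all variances and the covariance
# below are hypothetical.
fund_var <- function(w_h, var_h, var_o, cov_ho) {
  w_o <- 1 - w_h
  w_h^2 * var_h + w_o^2 * var_o + 2 * w_h * w_o * cov_ho
}

w_h    <- 30 / 280
var_o  <- 0.15^2   # assumed variance of the other managers combined
cov_ho <- 0.024    # assumed covariance, held fixed (extra risk is firm-specific)

base         <- fund_var(w_h, var_h = 0.04, var_o, cov_ho)  # assumed current sleeve
concentrated <- fund_var(w_h, var_h = 0.08, var_o, cov_ho)  # sleeve variance doubled

print(c(fund_sd_base = sqrt(base), fund_sd_concentrated = sqrt(concentrated)))
```

Under these assumed inputs the fund's standard deviation moves from about 15.2% to about 15.3%, even though the sleeve's own variance has doubled.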


CFA 4

Which portfolio cannot lie on the Markowitz efficient frontier?

Portfolio   \(E(R)\)   \(\sigma\)   \(E(R)/\sigma\) (approx)
W           15%        36%          0.42
X           12%        15%          0.80
Z           5%         7%           0.71
Y           9%         21%          0.43

On the efficient frontier, no feasible portfolio can offer a higher expected return at the same or lower risk than a frontier portfolio. The efficient part of the Markowitz frontier is upward-sloping from the minimum-variance portfolio: higher \(\sigma\) must come with higher \(E(R)\).

Compare portfolios X and Y:

  • Portfolio X: \(E(R) = 12\%\), \(\sigma = 15\%\)
  • Portfolio Y: \(E(R) = 9\%\), \(\sigma = 21\%\)

X offers a higher expected return (12% > 9%) with a lower standard deviation (15% < 21%), so Y is mean-variance dominated by X and cannot be efficient.

Portfolio W is not ruled out. It has the highest \(\sigma\) (36%) but also the highest \(E(R)\) (15%), so no other portfolio in the table dominates it. Its incremental reward relative to X is low:

\[\frac{\Delta E(R)}{\Delta\sigma} = \frac{15\% - 12\%}{36\% - 15\%} = \frac{3\%}{21\%} \approx 0.14\]

This is well below the slope between Z and X (\(\frac{7\%}{8\%} = 0.875\)), but the Markowitz frontier is concave, so a flattening slope at higher risk is exactly what a frontier looks like. Ordered by risk, Z, X and W have strictly increasing expected returns, so all three could lie on the frontier.

Answer: (d) Portfolio Y cannot lie on the efficient frontier.
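
Mean-variance dominance can also be checked mechanically from the table. The sketch below applies the rule "another portfolio has at least the return and at most the risk, with one of the two strictly better"; `is_dominated` is a hypothetical helper name. On these four rows, the only one flagged is Y, which X beats on both return and risk.

```r
# Flag portfolios that are mean-variance dominated by another row of the table.
portfolios <- data.frame(
  name = c("W", "X", "Z", "Y"),
  er   = c(15, 12, 5, 9),    # expected return, %
  sd   = c(36, 15, 7, 21),   # standard deviation, %
  stringsAsFactors = FALSE
)

# Row i is dominated if some other row has return >= and risk <=,
# with at least one of the two strict.
is_dominated <- function(i, p) {
  others <- p[-i, ]
  any(others$er >= p$er[i] & others$sd <= p$sd[i] &
        (others$er > p$er[i] | others$sd < p$sd[i]))
}

portfolios$dominated <- vapply(seq_len(nrow(portfolios)), is_dominated,
                               logical(1), p = portfolios)
print(portfolios)
```

A dominated portfolio cannot be efficient; a non-dominated one may or may not be, since the table shows only four of the feasible portfolios.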


CFA 10

Given:

Stock   \(\sigma\)
A       40%
B       20%
C       40%

Correlation     Value
\(\rho_{AB}\)   0.90
\(\rho_{BC}\)   0.10
\(\rho_{AC}\)   0.50

Two-asset portfolio variance formula:

\[\sigma_P^2 = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2w_1 w_2 \rho_{12}\sigma_1\sigma_2\]

Portfolio A+B (\(w_A = w_B = 0.5\)):

\[\sigma_{AB}^2 = (0.5)^2(40)^2 + (0.5)^2(20)^2 + 2(0.5)(0.5)(0.90)(40)(20)\] \[= 0.25 \times 1600 + 0.25 \times 400 + 0.5 \times 0.90 \times 800\] \[= 400 + 100 + 360 = 860\]

The cross-term is \(2 w_A w_B \rho_{AB} \sigma_A \sigma_B = 0.5 \times 0.90 \times 800 = 360\). With standard deviations entered in percent, the variance is in \(\%^2\).

\[\boxed{\sigma_{AB}^2 = 400 + 100 + 360 = 860 \implies \sigma_{AB} = \sqrt{860} \approx 29.3\%}\]

Portfolio B+C (\(w_B = w_C = 0.5\)):

\[\sigma_{BC}^2 = (0.5)^2(20)^2 + (0.5)^2(40)^2 + 2(0.5)(0.5)(0.10)(20)(40)\] \[= 0.25 \times 400 + 0.25 \times 1600 + 2 \times 0.25 \times 0.10 \times 800\] \[= 100 + 400 + 40 = 540\]

\[\boxed{\sigma_{BC}^2 = 540 \implies \sigma_{BC} = \sqrt{540} \approx 23.2\%}\]

Comparison:

Portfolio   Variance (\(\%^2\))   Std Dev
A + B       860                   29.3%
B + C       540                   23.2%

Since no expected return data are provided (we cannot compare \(E(R)\)), the decision rests on risk alone. Portfolio B+C has 37% lower variance. The low correlation \(\rho_{BC} = 0.10\) provides far superior diversification benefits compared to \(\rho_{AB} = 0.90\) (near-perfect co-movement provides almost no risk reduction).

Recommendation: Portfolio B+C — unambiguously lower risk.
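
The two variances can be reproduced with a short helper. This is a sketch; `two_asset_var` is a name introduced here, and standard deviations are entered in percent so the variances come out in \(\%^2\).

```r
# Two-asset portfolio variance; standard deviations in percent, so variance is in %^2.
two_asset_var <- function(w1, sd1, sd2, rho) {
  w2 <- 1 - w1
  w1^2 * sd1^2 + w2^2 * sd2^2 + 2 * w1 * w2 * rho * sd1 * sd2
}

var_ab <- two_asset_var(0.5, sd1 = 40, sd2 = 20, rho = 0.90)
var_bc <- two_asset_var(0.5, sd1 = 20, sd2 = 40, rho = 0.10)

print(c(var_ab = var_ab, sd_ab = sqrt(var_ab)))
print(c(var_bc = var_bc, sd_bc = sqrt(var_bc)))
```

The same function can be reused for A+C by passing `rho = 0.50` with the two 40% standard deviations.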


Chapter 8

CFA 1

Single-index model regression:

\[R_i - R_f = \alpha_i + \beta_i(R_m - R_f) + e_i\]

Results summary:

Statistic       ABC       XYZ
\(\alpha\)      −3.20%    +7.30%
\(\beta\)       0.60      0.97
\(R^2\)         0.35      0.17
\(\sigma(e)\)   13.02%    21.45%

Decomposing total risk using the single-index model:

\[\sigma_i^2 = \beta_i^2 \sigma_m^2 + \sigma^2(e_i)\]

where \(R^2 = \frac{\beta_i^2 \sigma_m^2}{\sigma_i^2}\) measures the fraction of total variance explained by the market.

  • ABC: \(R^2 = 0.35\) → systematic risk = 35% of total; unsystematic = 65%. \(\alpha = -3.20\%\) means ABC underperformed its risk-adjusted required return by 3.20% p.a. over 5 years. Low beta (0.60) indicates defensive, below-market sensitivity.
  • XYZ: \(R^2 = 0.17\) → only 17% systematic; 83% firm-specific. \(\alpha = +7.30\%\) suggests strong outperformance, but the large \(\sigma(e) = 21.45\%\) and very low \(R^2\) signal that most of XYZ’s movement is idiosyncratic.
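
Because \(1 - R^2 = \sigma^2(e_i)/\sigma_i^2\), the table is enough to back out each stock's total and systematic standard deviation. The sketch below does this; `decompose_risk` is a hypothetical helper name and the results are in the same units as \(\sigma(e)\).

```r
# Back out total and systematic risk from R^2 and the residual standard deviation,
# using sigma_i^2 = sigma^2(e_i) / (1 - R^2).
decompose_risk <- function(r2, resid_sd) {
  total_var <- resid_sd^2 / (1 - r2)
  c(total_sd      = sqrt(total_var),
    systematic_sd = sqrt(r2 * total_var),
    resid_sd      = resid_sd)
}

risk <- rbind(
  ABC = decompose_risk(r2 = 0.35, resid_sd = 13.02),
  XYZ = decompose_risk(r2 = 0.17, resid_sd = 21.45)
)
print(round(risk, 2))
```

The squared systematic and residual columns add up to the squared total column, which is the variance decomposition stated above.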

Implications for a diversified portfolio:

In a well-diversified portfolio, the firm-specific terms average out, so the portfolio’s residual variance \(\sigma^2(e_P) \to 0\) and only \(\beta\) matters for portfolio risk:

\[\sigma_P^2 \approx \beta_P^2 \sigma_m^2, \quad \beta_P = \sum_i w_i \beta_i\]

The high residual standard deviations of both stocks therefore matter little at the portfolio level. However, beta instability is a concern: brokerage estimates for XYZ range 1.25–1.45, far above the 5-year estimate of 0.97. This suggests XYZ’s current systematic risk may be higher than the historical regression implies, so the more recent 2-year betas are likely better forward-looking estimates.

Alpha persistence is not guaranteed — past alpha does not reliably predict future alpha without a structural explanation for the edge.


CFA 2

Given: \(\rho(\text{Baker Fund}, \text{Market}) = 0.70\)

From the single-index model, \(R^2\) equals the square of the correlation with the market:

\[R^2 = \rho^2 = (0.70)^2 = 0.49\]

Total risk decomposition:

\[\underbrace{\sigma_P^2}_{\text{Total}} = \underbrace{\beta^2\sigma_m^2}_{\text{Systematic}} + \underbrace{\sigma^2(e)}_{\text{Nonsystematic}}\]

Therefore:

\[\frac{\sigma^2(e)}{\sigma_P^2} = 1 - R^2 = 1 - 0.49 = \boxed{0.51 = \mathbf{51\%}}\]

51% of Baker Fund’s total risk is nonsystematic (firm-specific).
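
The identity \(R^2 = \rho^2\) for a single-regressor model can be confirmed on simulated data. Everything in this sketch (sample size, beta, volatilities) is a hypothetical choice and has nothing to do with the Baker Fund's actual data.

```r
# Simulate a single-index model and confirm that the regression R^2 equals
# the squared correlation with the market. All parameters are hypothetical.
set.seed(1)
n_obs  <- 600
market <- rnorm(n_obs, mean = 0.008, sd = 0.045)   # simulated market excess returns
fund   <- 0.001 + 0.9 * market + rnorm(n_obs, sd = 0.04)

fit <- lm(fund ~ market)
r2  <- summary(fit)$r.squared
rho <- cor(fund, market)

print(c(rho = rho, rho_squared = rho^2, r_squared = r2, nonsystematic = 1 - r2))
```

The `nonsystematic` entry is the simulated analogue of the 51% figure: the share of variance left in the residual.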


CFA 3

Given: \(\rho = 1.0\), \(E(R_m) = 11\%\), \(E(R_C) = 9\%\), \(R_f = 3\%\)

Since \(\rho(\text{Charlottesville}, \text{Market}) = 1.0\), the fund moves in perfect lockstep with the market — all its variance is systematic (\(R^2 = 1\)).

Apply the CAPM / Security Market Line:

\[E(R_i) = R_f + \beta_i \cdot [E(R_m) - R_f]\]

Solving for \(\beta\):

\[\beta = \frac{E(R_i) - R_f}{E(R_m) - R_f} = \frac{9\% - 3\%}{11\% - 3\%} = \frac{6\%}{8\%} = \boxed{\mathbf{0.75}}\]


CFA 4

Answer: (d) Systematic risk.

Beta is defined as:

\[\beta_i = \frac{\text{Cov}(R_i, R_m)}{\sigma_m^2} = \rho_{im} \cdot \frac{\sigma_i}{\sigma_m}\]

It measures only the covariance with the market factor — the non-diversifiable, systematic component. Unsystematic risk (firm-specific) is captured by \(\sigma^2(e_i)\) in the single-index model, not by beta.


CFA 5

Answer: (b) Beta measures only systematic risk, while standard deviation measures total risk.

From the single-index model variance decomposition:

\[\underbrace{\sigma_i^2}_{\text{Total risk (SD}^2\text{)}} = \underbrace{\beta_i^2\sigma_m^2}_{\text{Systematic}} + \underbrace{\sigma^2(e_i)}_{\text{Unsystematic}}\]

  • Standard deviation \(\sigma_i\) captures the full left-hand side: both systematic and unsystematic.
  • Beta \(\beta_i\) captures only the systematic component via \(\text{Cov}(R_i, R_m)/\sigma_m^2\).

In a diversified portfolio, unsystematic risk is eliminated, so beta is the appropriate risk measure. For a standalone asset, standard deviation is relevant.


Chapter 9

Data: Portfolio R: \(E(R) = 11\%\), \(\sigma = 10\%\), \(\beta = 0.5\); S&P 500: \(E(R_m) = 14\%\), \(\sigma_m = 12\%\), \(\beta = 1.0\)

CFA 8 — Portfolio R vs. the SML

The Security Market Line (SML) gives the CAPM required return as a function of beta:

\[E(R_i)^{SML} = R_f + \beta_i \cdot [E(R_m) - R_f]\]

For Portfolio R with \(\beta = 0.5\), the SML-implied required return is:

\[E(R_R)^{SML} = R_f + 0.5 \times (14\% - R_f) = 0.5 R_f + 7\%\]

Portfolio R’s actual return is 11%. For R to lie above the SML:

\[11\% > 0.5 R_f + 7\% \implies 4\% > 0.5 R_f \implies R_f < 8\%\]

The problem data do not state \(R_f\), so this conclusion is conditional on \(R_f < 8\%\), which covers any realistic risk-free rate. Under that condition the alpha of Portfolio R is:

\[\alpha_R = E(R_R)^{actual} - E(R_R)^{SML} = 11\% - (0.5 R_f + 7\%) = 4\% - 0.5 R_f > 0\]

Portfolio R plots above the SML: its expected return exceeds the CAPM required return (positive alpha).

Answer: (c) Above the SML


CFA 9 — Portfolio R vs. the CML

The Capital Market Line (CML) applies only to efficient portfolios and uses \(\sigma\) on the x-axis:

\[E(R_P)^{CML} = R_f + \frac{E(R_m) - R_f}{\sigma_m} \cdot \sigma_P\]

The CML slope (Sharpe ratio of market portfolio):

\[S_m = \frac{E(R_m) - R_f}{\sigma_m} = \frac{14\% - R_f}{12\%}\]

CML-implied return for Portfolio R at \(\sigma_R = 10\%\):

\[E(R_R)^{CML} = R_f + \frac{14\% - R_f}{12\%} \times 10\%\]

For Portfolio R to lie on or above the CML, we need \(11\% \geq E(R_R)^{CML}\):

\[11\% \geq R_f + \frac{10}{12}(14\% - R_f) = R_f + \frac{140\% - 10R_f}{12}\]

\[132\% \geq 12R_f + 140\% - 10R_f \implies -8\% \geq 2R_f \implies R_f \leq -4\%\]

Since a risk-free rate of \(-4\%\) or lower is implausible, Portfolio R’s expected return (11%) falls below the CML return at \(\sigma = 10\%\) for any realistic \(R_f\). This is expected: Portfolio R is not a fully efficient portfolio. Its systematic standard deviation is only \(\beta\sigma_m = 0.5 \times 12\% = 6\%\), so part of its 10% \(\sigma\) is diversifiable risk that earns no additional \(E(R)\).

Answer: (b) Below the CML
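
Both answers depend on \(R_f\), which the table does not give. The sketch below evaluates Portfolio R's distance from the SML and from the CML over an illustrative grid of risk-free rates (in percent); the grid and the function names `sml_alpha` and `cml_gap` are assumptions made here.

```r
# Position of Portfolio R relative to the SML and CML for a given risk-free rate.
# Inputs are in percent; the grid of risk-free rates is illustrative only.
sml_alpha <- function(rf, er = 11, beta = 0.5, er_m = 14) {
  er - (rf + beta * (er_m - rf))
}
cml_gap <- function(rf, er = 11, sd = 10, er_m = 14, sd_m = 12) {
  er - (rf + (er_m - rf) / sd_m * sd)
}

rf_grid <- c(-4, 0, 2, 5, 8)
positions <- data.frame(rf        = rf_grid,
                        sml_alpha = sml_alpha(rf_grid),
                        cml_gap   = cml_gap(rf_grid))
print(positions)
```

A positive `sml_alpha` means R plots above the SML and a negative `cml_gap` means it plots below the CML. The two break-even rates derived above, 8% and \(-4\%\), are the endpoints of the grid.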


CFA 10

According to CAPM, investors should NOT expect a higher return on Portfolio A than Portfolio B.

Risk measure                      Portfolio A   Portfolio B
Systematic risk (\(\beta\))       1.0           1.0
Specific (idiosyncratic) risk     High          Low

CAPM expected return depends only on beta:

\[E(R_i) = R_f + \beta_i \cdot [E(R_m) - R_f]\]

Since \(\beta_A = \beta_B = 1.0\):

\[E(R_A) = E(R_B) = R_f + 1.0 \times [E(R_m) - R_f] = E(R_m)\]

In equilibrium, investors hold diversified portfolios. Idiosyncratic (specific) risk is diversifiable and therefore not priced — the market demands no additional return for bearing avoidable risk. Portfolio A’s higher specific risk earns zero premium. Both portfolios should earn the market return.


Chapter 10

Two-factor APT model:

\[E(R_i) = R_f + \beta_{i,GDP} \cdot RP_{GDP} + \beta_{i,\pi} \cdot RP_{\pi}\]

Given parameters:

Fund                   \(\beta_{GDP}\)   \(\beta_{\pi}\)
High Growth Fund (H)   1.25              1.50
Large Cap Fund (L)     0.75              1.25
Utility Fund (U)       1.00              2.00

\(R_f = 4\%\), \(RP_{GDP} = 8\%\), \(RP_{\pi} = 2\%\)


Problem 13

APT expected return for Orb’s High Growth Fund:

\[E(R_H) = R_f + \beta_{H,GDP} \cdot RP_{GDP} + \beta_{H,\pi} \cdot RP_{\pi}\] \[= 4\% + 1.25 \times 8\% + 1.50 \times 2\%\] \[= 4\% + 10\% + 3\%\] \[= \boxed{\mathbf{17\%}}\]
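
The same two-factor calculation can be run for all three funds at once as a matrix product. This is a sketch using the parameters given above; the label `INF` for the inflation factor is a naming choice made here.

```r
# Two-factor APT expected returns (in percent) for the three funds in the table.
betas <- rbind(
  H = c(GDP = 1.25, INF = 1.50),
  L = c(GDP = 0.75, INF = 1.25),
  U = c(GDP = 1.00, INF = 2.00)
)
risk_premia <- c(GDP = 8, INF = 2)
rf <- 4

# Risk-free rate plus beta-weighted factor risk premia
apt_er <- rf + drop(betas %*% risk_premia)
print(apt_er)
```

This gives 17% for the High Growth Fund, 12.5% for the Large Cap Fund and 16% for the Utility Fund.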


Problem 14

APT expected return for Orb’s Large Cap Fund:

\[E(R_L)^{APT} = 4\% + 0.75 \times 8\% + 1.25 \times 2\% = 4\% + 6\% + 2.5\% = 12.5\%\]

Kwon’s fundamental analysis estimate:

\[E(R_L)^{Fund} = R_f + 8.5\% = 4\% + 8.5\% = 12.5\%\]

Comparison:

\[E(R_L)^{APT} = E(R_L)^{Fund} = 12.5\%\]

The two estimates are equal — the fund is fairly priced according to the APT. There is no mispricing and therefore no arbitrage opportunity available.


Problem 15

Construct GDP Fund with \(\beta_{GDP} = 1\), \(\beta_{\pi} = 0\) using weights \(w_H\), \(w_L\), \(w_U\):

System of three equations:

\[\text{(i) Weights sum to 1:} \quad w_H + w_L + w_U = 1\]

\[\text{(ii) GDP beta = 1:} \quad 1.25w_H + 0.75w_L + 1.00w_U = 1\]

\[\text{(iii) Inflation beta = 0:} \quad 1.50w_H + 1.25w_L + 2.00w_U = 0\]

Step 1: Subtract (i) from (ii):

\[(1.25 - 1)w_H + (0.75 - 1)w_L + (1.00 - 1)w_U = 0\] \[0.25w_H - 0.25w_L = 0 \implies \boxed{w_H = w_L}\]

Step 2: Substitute \(w_U = 1 - w_H - w_L = 1 - 2w_H\) into equation (iii):

\[1.50w_H + 1.25w_H + 2.00(1 - 2w_H) = 0\] \[2.75w_H + 2 - 4w_H = 0\] \[-1.25w_H = -2\] \[\boxed{w_H = 1.6, \quad w_L = 1.6}\]

Step 3: Solve for \(w_U\):

\[w_U = 1 - 1.6 - 1.6 = \boxed{-2.2}\]

Verification:

  • GDP: \(1.25(1.6) + 0.75(1.6) + 1.0(-2.2) = 2.0 + 1.2 - 2.2 = 1.0\) ✓
  • Inflation: \(1.5(1.6) + 1.25(1.6) + 2.0(-2.2) = 2.4 + 2.0 - 4.4 = 0\) ✓
  • Weights: \(1.6 + 1.6 - 2.2 = 1.0\) ✓

Answer: (c) \(w_U = -2.2\) — short the Utility Fund by 2.2 times portfolio value.
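
The three constraints form a linear system that can be solved directly. A minimal sketch using base R's `solve()`:

```r
# Solve the three linear constraints for (w_H, w_L, w_U).
A <- rbind(
  weights = c(1.00, 1.00, 1.00),   # weights sum to 1
  gdp     = c(1.25, 0.75, 1.00),   # GDP beta = 1
  infl    = c(1.50, 1.25, 2.00)    # inflation beta = 0
)
b <- c(1, 1, 0)

w <- solve(A, b)
names(w) <- c("w_H", "w_L", "w_U")
print(w)

# Check that the constraints are met
print(drop(A %*% w))
```

The solution matches the substitution above: 1.6 in each of H and L, and \(-2.2\) in U.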


Problem 16

Stiles claims the GDP Fund suits retirees who live off steady investment income. Since the fund has zero inflation exposure (\(\beta_\pi = 0\)), unexpected inflation does not erode its returns — providing stable real income. ✓ Correct.

McCracken claims the GDP Fund suits investors betting on successful supply-side policies that raise real GDP growth. Since the fund has unit GDP sensitivity (\(\beta_{GDP} = 1\)), it benefits directly from positive GDP surprises. ✓ Also correct for a different investor type.

Answer: (b) Both are correct — each statement is valid for its intended investor scenario.


Questions Using R Code

Q1: Import ETF Data

library(tidyquant)
library(lubridate)
library(timetk)
library(tidyverse)
library(PerformanceAnalytics)
library(quadprog)

# Define tickers
tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")

# Download daily adjusted prices from Yahoo Finance 2010 to today
prices_raw <- tq_get(tickers,
                     from = "2010-01-01",
                     to   = Sys.Date(),
                     get  = "stock.prices")

# Extract adjusted closing prices in wide format
prices_wide <- prices_raw %>%
  select(symbol, date, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  arrange(date)

# Preview
head(prices_wide)
## # A tibble: 6 × 9
##   date         SPY   QQQ   EEM   IWM   EFA   TLT   IYR   GLD
##   <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010-01-04  84.8  40.3  30.4  51.4  35.1  55.7  26.8  110.
## 2 2010-01-05  85.0  40.3  30.6  51.2  35.2  56.1  26.8  110.
## 3 2010-01-06  85.1  40.0  30.6  51.1  35.3  55.3  26.8  112.
## 4 2010-01-07  85.4  40.1  30.5  51.5  35.2  55.4  27.1  111.
## 5 2010-01-08  85.7  40.4  30.7  51.8  35.5  55.4  26.9  111.
## 6 2010-01-11  85.8  40.2  30.6  51.6  35.7  55.1  27.0  113.
tail(prices_wide)
## # A tibble: 6 × 9
##   date         SPY   QQQ   EEM   IWM   EFA   TLT   IYR   GLD
##   <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2026-06-02  760.  746.  70.8  292.  105.  85.7 100.0  412.
## 2 2026-06-03  754.  744.  69.9  288.  104.  85.3 100    408.
## 3 2026-06-04  757.  741.  69.1  292.  105.  85.5 102.   411.
## 4 2026-06-05  738.  705.  64.6  282.  102.  85.1 103.   396.
## 5 2026-06-08  739.  716.  65.8  284.  103.  84.6 101.   397.
## 6 2026-06-09   NA    NA   NA     NA    NA   NA    NA     NA

Q2: Weekly and Monthly Returns (Simple Returns)

# Convert to xts for timetk/PerformanceAnalytics compatibility
prices_xts <- prices_wide %>%
  column_to_rownames("date") %>%
  as.xts()

# Weekly simple returns
weekly_returns <- Return.calculate(to.weekly(prices_xts, OHLC = FALSE),
                                   method = "discrete")
weekly_returns <- weekly_returns[-1, ]  # remove first NA row

# Monthly simple returns
monthly_returns <- Return.calculate(to.monthly(prices_xts, OHLC = FALSE),
                                    method = "discrete")
monthly_returns <- monthly_returns[-1, ]

cat("Weekly returns: ", nrow(weekly_returns), "observations\n")
## Weekly returns:  857 observations
cat("Monthly returns:", nrow(monthly_returns), "observations\n")
## Monthly returns: 197 observations
head(monthly_returns)
##                  SPY         QQQ         EEM         IWM          EFA
## Feb 2010  0.03119469  0.04603867  0.01776377  0.04475097  0.002667856
## Mar 2010  0.06087965  0.07710897  0.08110899  0.08230679  0.063853728
## Apr 2010  0.01546998  0.02242536 -0.00166194  0.05678513 -0.028045564
## May 2010 -0.07945456 -0.07392373 -0.09393581 -0.07536655 -0.111927733
## Jun 2010 -0.05174110 -0.05975687 -0.01398636 -0.07743415 -0.020619475
## Jul 2010  0.06830047  0.07258261  0.10932476  0.06730935  0.116104239
##                   TLT         IYR          GLD
## Feb 2010 -0.003423950  0.05457032  0.032748219
## Mar 2010 -0.020573303  0.09748457 -0.004386396
## Apr 2010  0.033217826  0.06388110  0.058834363
## May 2010  0.051083520 -0.05683517  0.030513147
## Jun 2010  0.057977943 -0.04670129  0.023553189
## Jul 2010 -0.009463413  0.09404807 -0.050871157

Q3: Convert Monthly Returns to Tibble

# Convert xts monthly returns to tibble with date column
monthly_tbl <- tk_tbl(monthly_returns, rename_index = "date")

# Ensure date is Date type
monthly_tbl <- monthly_tbl %>%
  mutate(date = as.Date(as.yearmon(date)))

head(monthly_tbl)
## # A tibble: 6 × 9
##   date           SPY     QQQ      EEM     IWM      EFA      TLT     IYR      GLD
##   <date>       <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>   <dbl>    <dbl>
## 1 2010-02-01  0.0312  0.0460  0.0178   0.0448  0.00267 -0.00342  0.0546  0.0327 
## 2 2010-03-01  0.0609  0.0771  0.0811   0.0823  0.0639  -0.0206   0.0975 -0.00439
## 3 2010-04-01  0.0155  0.0224 -0.00166  0.0568 -0.0280   0.0332   0.0639  0.0588 
## 4 2010-05-01 -0.0795 -0.0739 -0.0939  -0.0754 -0.112    0.0511  -0.0568  0.0305 
## 5 2010-06-01 -0.0517 -0.0598 -0.0140  -0.0774 -0.0206   0.0580  -0.0467  0.0236 
## 6 2010-07-01  0.0683  0.0726  0.109    0.0673  0.116   -0.00946  0.0940 -0.0509

Q4: Download Fama-French 3 Factors

# Install frenchdata if not already available
if (!requireNamespace("frenchdata", quietly = TRUE))
  install.packages("frenchdata", quiet = TRUE)
library(frenchdata)

# Download FF3 monthly factors
ff3_raw <- download_french_data("Fama/French 3 Factors")

# Inspect structure to find correct column names
ff3_monthly_raw <- ff3_raw$subsets$data[[1]]
cat("Column names from frenchdata:\n")
## Column names from frenchdata:
print(names(ff3_monthly_raw))
## [1] "date"   "Mkt-RF" "SMB"    "HML"    "RF"
cat("\nFirst few rows (raw):\n")
## 
## First few rows (raw):
print(head(ff3_monthly_raw, 3))
## # A tibble: 3 × 5
##     date `Mkt-RF`   SMB   HML    RF
##    <dbl>    <dbl> <dbl> <dbl> <dbl>
## 1 192607     2.89 -2.55 -2.39  0.22
## 2 192608     2.64 -1.14  3.81  0.25
## 3 192609     0.38 -1.36  0.05  0.23
# Standardise column names: rename whatever frenchdata returns to our convention
# frenchdata typically returns: date, `Mkt-RF`, SMB, HML, RF  (note the dash)
ff3_monthly_clean <- ff3_monthly_raw %>%
  rename_with(~ gsub("-", ".", .x, fixed = TRUE)) %>%  # "Mkt-RF" -> "Mkt.RF"
  rename_with(trimws) %>%                               # strip any whitespace
  mutate(
    date   = as.Date(paste0(date, "01"), "%Y%m%d"),
    Mkt.RF = as.numeric(Mkt.RF) / 100,
    SMB    = as.numeric(SMB)    / 100,
    HML    = as.numeric(HML)    / 100,
    RF     = as.numeric(RF)     / 100
  ) %>%
  filter(date >= "2010-01-01") %>%
  select(date, Mkt.RF, SMB, HML, RF)

cat("\nFF3 data:", nrow(ff3_monthly_clean), "monthly observations\n")
## 
## FF3 data: 196 monthly observations
head(ff3_monthly_clean)
## # A tibble: 6 × 5
##   date        Mkt.RF     SMB     HML     RF
##   <date>       <dbl>   <dbl>   <dbl>  <dbl>
## 1 2010-01-01 -0.0335  0.0043  0.0033 0     
## 2 2010-02-01  0.0339  0.0118  0.0318 0     
## 3 2010-03-01  0.063   0.0146  0.0219 0.0001
## 4 2010-04-01  0.0199  0.0484  0.0296 0.0001
## 5 2010-05-01 -0.079   0.0013 -0.0248 0.0001
## 6 2010-06-01 -0.0556 -0.0179 -0.0473 0.0001
tail(ff3_monthly_clean)
## # A tibble: 6 × 5
##   date        Mkt.RF     SMB     HML     RF
##   <date>       <dbl>   <dbl>   <dbl>  <dbl>
## 1 2025-11-01 -0.0013  0.0054  0.0357 0.003 
## 2 2025-12-01 -0.0036 -0.0103  0.0236 0.0034
## 3 2026-01-01  0.0103  0.0212  0.0386 0.003 
## 4 2026-02-01 -0.0117  0.0024  0.0265 0.0028
## 5 2026-03-01 -0.0518  0.0044  0.0335 0.0029
## 6 2026-04-01  0.0994  0.0013 -0.0127 0.0029

Q5: Merge Monthly Returns and FF3 Factors

# Merge on date
merged_tbl <- monthly_tbl %>%
  inner_join(ff3_monthly_clean, by = "date") %>%
  arrange(date)

cat("Merged tibble dimensions:", nrow(merged_tbl), "x", ncol(merged_tbl), "\n")
## Merged tibble dimensions: 195 x 13
head(merged_tbl)
## # A tibble: 6 × 13
##   date           SPY     QQQ      EEM     IWM      EFA      TLT     IYR      GLD
##   <date>       <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>   <dbl>    <dbl>
## 1 2010-02-01  0.0312  0.0460  0.0178   0.0448  0.00267 -0.00342  0.0546  0.0327 
## 2 2010-03-01  0.0609  0.0771  0.0811   0.0823  0.0639  -0.0206   0.0975 -0.00439
## 3 2010-04-01  0.0155  0.0224 -0.00166  0.0568 -0.0280   0.0332   0.0639  0.0588 
## 4 2010-05-01 -0.0795 -0.0739 -0.0939  -0.0754 -0.112    0.0511  -0.0568  0.0305 
## 5 2010-06-01 -0.0517 -0.0598 -0.0140  -0.0774 -0.0206   0.0580  -0.0467  0.0236 
## 6 2010-07-01  0.0683  0.0726  0.109    0.0673  0.116   -0.00946  0.0940 -0.0509 
## # ℹ 4 more variables: Mkt.RF <dbl>, SMB <dbl>, HML <dbl>, RF <dbl>

Q6: CAPM Covariance Matrix & GMV Portfolio (2015/02 Realized Return)

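# Helper: compute CAPM-based covariance matrix
# Sigma_CAPM = beta %*% t(beta) * var(Rm) + diag(residual variances)
# Note: asset returns enter the regression raw (RF is not subtracted), so the
# intercept absorbs the average risk-free rate; betas are essentially unaffected
# when RF is close to constant over the window.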
compute_capm_cov <- function(returns_mat, mkt_excess) {
  n_assets <- ncol(returns_mat)
  betas    <- numeric(n_assets)
  resid_var <- numeric(n_assets)

  for (i in seq_len(n_assets)) {
    fit <- lm(returns_mat[, i] ~ mkt_excess)
    betas[i]    <- coef(fit)[2]
    resid_var[i] <- var(residuals(fit))
  }

  var_mkt <- var(mkt_excess)
  Sigma   <- outer(betas, betas) * var_mkt + diag(resid_var)
  return(Sigma)
}

# Helper: compute GMV weights using quadprog
gmv_weights <- function(Sigma) {
  n  <- nrow(Sigma)
  Dmat <- 2 * Sigma
  dvec <- rep(0, n)
  # Constraints: sum(w) = 1, w >= 0 (long-only)
  Amat <- cbind(rep(1, n), diag(n))
  bvec <- c(1, rep(0, n))
  sol  <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
  return(sol$solution)
}

# Filter training window: 2010/02 – 2015/01 (60 months)
train_data <- merged_tbl %>%
  filter(date >= "2010-02-01" & date <= "2015-01-01")

asset_cols <- tickers  # SPY QQQ EEM IWM EFA TLT IYR GLD
returns_train <- as.matrix(train_data[, asset_cols])
mkt_excess_train <- train_data$Mkt.RF  # market excess return

# CAPM covariance matrix
Sigma_capm <- compute_capm_cov(returns_train, mkt_excess_train)
cat("CAPM Covariance Matrix (2010/02–2015/01):\n")
## CAPM Covariance Matrix (2010/02–2015/01):
round(Sigma_capm, 6)
##           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
## [1,]  0.001396  0.001459  0.001726  0.001828  0.001580 -0.001012  0.001179
## [2,]  0.001459  0.001815  0.001814  0.001922  0.001662 -0.001064  0.001239
## [3,]  0.001726  0.001814  0.003350  0.002274  0.001966 -0.001258  0.001466
## [4,]  0.001828  0.001922  0.002274  0.002699  0.002083 -0.001333  0.001553
## [5,]  0.001580  0.001662  0.001966  0.002083  0.002413 -0.001152  0.001343
## [6,] -0.001012 -0.001064 -0.001258 -0.001333 -0.001152  0.001621 -0.000859
## [7,]  0.001179  0.001239  0.001466  0.001553  0.001343 -0.000859  0.002024
## [8,]  0.000235  0.000247  0.000292  0.000310  0.000268 -0.000171  0.000200
##           [,8]
## [1,]  0.000235
## [2,]  0.000247
## [3,]  0.000292
## [4,]  0.000310
## [5,]  0.000268
## [6,] -0.000171
## [7,]  0.000200
## [8,]  0.002900
# GMV weights
w_capm <- gmv_weights(Sigma_capm)
names(w_capm) <- asset_cols
cat("\nGMV Weights (CAPM):\n")
## 
## GMV Weights (CAPM):
round(w_capm, 4)
##    SPY    QQQ    EEM    IWM    EFA    TLT    IYR    GLD 
## 0.4472 0.0000 0.0000 0.0000 0.0000 0.4483 0.0373 0.0673
# Realized return in 2015/02
ret_201502 <- merged_tbl %>%
  filter(format(date, "%Y-%m") == "2015-02") %>%
  select(all_of(asset_cols)) %>%
  as.numeric()

realized_capm <- sum(w_capm * ret_201502)
cat(sprintf("\nRealized GMV Portfolio Return (CAPM) in 2015/02: %.4f%%\n",
            realized_capm * 100))
## 
## Realized GMV Portfolio Return (CAPM) in 2015/02: -0.7320%
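
As a cross-check on the quadratic-programming solution, the GMV problem without the long-only constraint has a closed form, \(w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})\). The sketch below uses a small hypothetical covariance matrix (`Sigma_demo`), not the estimated `Sigma_capm`. The helper `gmv_unconstrained` is a name introduced here, and it can return negative weights, unlike `gmv_weights()`.

```r
# Closed-form GMV weights without the long-only constraint:
#   w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
# Sigma_demo is a small hypothetical covariance matrix, not the one estimated above.
gmv_unconstrained <- function(Sigma) {
  ones <- rep(1, nrow(Sigma))
  raw  <- solve(Sigma, ones)
  raw / sum(raw)
}

Sigma_demo <- matrix(c( 0.0014, -0.0010,  0.0002,
                       -0.0010,  0.0016, -0.0002,
                        0.0002, -0.0002,  0.0029),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

w_demo <- gmv_unconstrained(Sigma_demo)
print(round(w_demo, 4))

# Portfolio variance of the GMV weights versus equal weights
port_var <- function(w, Sigma) drop(t(w) %*% Sigma %*% w)
print(c(gmv   = port_var(w_demo, Sigma_demo),
        equal = port_var(rep(1 / 3, 3), Sigma_demo)))
```

If the long-only constraints are not binding, the two approaches should agree; where the closed form produces negative weights, the constrained solution will differ.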

Q7: FF3-Factor Covariance Matrix & GMV Portfolio (2015/02 Realized Return)

# Helper: FF3-based covariance matrix
compute_ff3_cov <- function(returns_mat, ff3_mat) {
  n_assets  <- ncol(returns_mat)
  betas_mat <- matrix(0, nrow = n_assets, ncol = 3)  # 3 factors
  resid_var <- numeric(n_assets)

  for (i in seq_len(n_assets)) {
    fit <- lm(returns_mat[, i] ~ ff3_mat)
    betas_mat[i, ] <- coef(fit)[-1]  # exclude intercept
    resid_var[i]   <- var(residuals(fit))
  }

  Sigma_f <- cov(ff3_mat)
  Sigma   <- betas_mat %*% Sigma_f %*% t(betas_mat) + diag(resid_var)
  return(Sigma)
}

ff3_factors_train <- as.matrix(train_data[, c("Mkt.RF", "SMB", "HML")])

# FF3 covariance matrix
Sigma_ff3 <- compute_ff3_cov(returns_train, ff3_factors_train)
cat("FF3 Covariance Matrix (2010/02–2015/01):\n")
## FF3 Covariance Matrix (2010/02–2015/01):
round(Sigma_ff3, 6)
##           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
## [1,]  0.001396  0.001464  0.001724  0.001787  0.001600 -0.001008  0.001178
## [2,]  0.001464  0.001815  0.001844  0.001869  0.001713 -0.000995  0.001248
## [3,]  0.001724  0.001844  0.003350  0.002270  0.001980 -0.001228  0.001470
## [4,]  0.001787  0.001869  0.002270  0.002699  0.001943 -0.001379  0.001552
## [5,]  0.001600  0.001713  0.001980  0.001943  0.002413 -0.001104  0.001347
## [6,] -0.001008 -0.000995 -0.001228 -0.001379 -0.001104  0.001621 -0.000851
## [7,]  0.001178  0.001248  0.001470  0.001552  0.001347 -0.000851  0.002024
## [8,]  0.000204  0.000337  0.000350  0.000469  0.000237 -0.000072  0.000216
##           [,8]
## [1,]  0.000204
## [2,]  0.000337
## [3,]  0.000350
## [4,]  0.000469
## [5,]  0.000237
## [6,] -0.000072
## [7,]  0.000216
## [8,]  0.002900
# GMV weights
w_ff3 <- gmv_weights(Sigma_ff3)
names(w_ff3) <- asset_cols
cat("\nGMV Weights (FF3):\n")
## 
## GMV Weights (FF3):
round(w_ff3, 4)
##    SPY    QQQ    EEM    IWM    EFA    TLT    IYR    GLD 
## 0.4580 0.0000 0.0000 0.0000 0.0000 0.4507 0.0333 0.0580
# Realized return in 2015/02
realized_ff3 <- sum(w_ff3 * ret_201502)
cat(sprintf("\nRealized GMV Portfolio Return (FF3) in 2015/02: %.4f%%\n",
            realized_ff3 * 100))
## 
## Realized GMV Portfolio Return (FF3) in 2015/02: -0.6213%

Q8: Rolling Backtest — GMV Portfolios (2015/02 to 2026/05)

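# Rolling window backtest: train on t-60 to t-1, invest at t
# Window: rolling 60-month estimation, out-of-sample from 2015/02 to 2026/05
# Note: merged_tbl only contains months present in both the ETF and FF3 data,
# so the effective last out-of-sample month is the last date in merged_tbl
# (2026/04 in the output shown, where the FF3 file ends).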

all_months <- merged_tbl$date
start_oos  <- as.Date("2015-02-01")
end_oos    <- as.Date("2026-05-01")
oos_dates  <- all_months[all_months >= start_oos & all_months <= end_oos]

backtest_results <- tibble(
  date        = as.Date(character()),
  ret_capm    = numeric(),
  ret_ff3     = numeric()
)

for (t in seq_along(oos_dates)) {
  current_date <- oos_dates[t]

  # Training window: 60 months ending one month before current
  train_end   <- all_months[which(all_months == current_date) - 1]
  train_start <- all_months[which(all_months == train_end) - 59]

  if (is.na(train_start) || train_start < as.Date("2010-02-01")) next

  train_window <- merged_tbl %>%
    filter(date >= train_start & date <= train_end)

  if (nrow(train_window) < 60) next

  ret_mat  <- as.matrix(train_window[, asset_cols])
  mkt_vec  <- train_window$Mkt.RF
  ff3_mat  <- as.matrix(train_window[, c("Mkt.RF", "SMB", "HML")])

  # Compute covariance matrices and GMV weights
  tryCatch({
    S_capm <- compute_capm_cov(ret_mat, mkt_vec)
    S_ff3  <- compute_ff3_cov(ret_mat, ff3_mat)
    w_c    <- gmv_weights(S_capm)
    w_f    <- gmv_weights(S_ff3)

    # Realized return at current_date
    r_t <- merged_tbl %>%
      filter(date == current_date) %>%
      select(all_of(asset_cols)) %>%
      as.numeric()

    backtest_results <- backtest_results %>%
      add_row(
        date     = current_date,
        ret_capm = sum(w_c * r_t),
        ret_ff3  = sum(w_f * r_t)
      )
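  }, error = function(e) {
    # Report skipped months instead of dropping them silently
    message("Skipping ", format(current_date), ": ", conditionMessage(e))
  })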
}

cat("Backtest complete:", nrow(backtest_results), "monthly observations\n")
## Backtest complete: 135 monthly observations
# Compute cumulative returns
backtest_cum <- backtest_results %>%
  arrange(date) %>%
  mutate(
    cum_capm = cumprod(1 + ret_capm) - 1,
    cum_ff3  = cumprod(1 + ret_ff3) - 1
  )

# Summary statistics (the Sharpe figure uses raw returns; RF is not subtracted)
cat("\n--- Performance Summary (2015/02 – 2026/05) ---\n")
## 
## --- Performance Summary (2015/02 – 2026/05) ---
cat(sprintf("CAPM GMV  | Ann. Return: %.2f%% | Ann. Vol: %.2f%% | Sharpe: %.2f\n",
    mean(backtest_results$ret_capm) * 12 * 100,
    sd(backtest_results$ret_capm) * sqrt(12) * 100,
    (mean(backtest_results$ret_capm) * 12) / (sd(backtest_results$ret_capm) * sqrt(12))))
## CAPM GMV  | Ann. Return: 8.06% | Ann. Vol: 10.48% | Sharpe: 0.77
cat(sprintf("FF3  GMV  | Ann. Return: %.2f%% | Ann. Vol: %.2f%% | Sharpe: %.2f\n",
    mean(backtest_results$ret_ff3) * 12 * 100,
    sd(backtest_results$ret_ff3) * sqrt(12) * 100,
    (mean(backtest_results$ret_ff3) * 12) / (sd(backtest_results$ret_ff3) * sqrt(12))))
## FF3  GMV  | Ann. Return: 7.97% | Ann. Vol: 10.57% | Sharpe: 0.75
# Plot cumulative returns
backtest_cum %>%
  select(date, cum_capm, cum_ff3) %>%
  pivot_longer(-date, names_to = "model", values_to = "cum_return") %>%
  mutate(model = recode(model,
    cum_capm = "GMV (CAPM)",
    cum_ff3  = "GMV (Fama-French 3-Factor)")) %>%
  ggplot(aes(x = date, y = cum_return * 100, color = model)) +
  geom_line(linewidth = 1.2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  scale_color_manual(values = c("GMV (CAPM)" = "#2196F3",
                                "GMV (Fama-French 3-Factor)" = "#FF5722")) +
  scale_y_continuous(labels = scales::percent_format(scale = 1)) +
  labs(
    title    = "Cumulative Returns: GMV Portfolios (2015/02 – 2026/05)",
    subtitle = "Rolling 60-month estimation window | 8-Asset ETF Universe",
    x        = NULL,
    y        = "Cumulative Return (%)",
    color    = "Model",
    caption  = "Assets: SPY, QQQ, EEM, IWM, EFA, TLT, IYR, GLD"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    legend.position = "bottom",
    plot.title    = element_text(face = "bold"),
    panel.grid.minor = element_blank()
  )

# Monthly returns comparison
backtest_results %>%
  pivot_longer(-date, names_to = "model", values_to = "return") %>%
  mutate(model = recode(model,
    ret_capm = "GMV (CAPM)",
    ret_ff3  = "GMV (Fama-French 3-Factor)")) %>%
  ggplot(aes(x = date, y = return * 100, fill = model)) +
  geom_col(position = "dodge", alpha = 0.75) +
  facet_wrap(~model, ncol = 1) +
  scale_fill_manual(values = c("GMV (CAPM)" = "#2196F3",
                               "GMV (Fama-French 3-Factor)" = "#FF5722")) +
  labs(
    title = "Monthly Returns: GMV Portfolios (2015/02 – 2026/05)",
    x = NULL, y = "Monthly Return (%)"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none", panel.grid.minor = element_blank())
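
The performance summary above divides the annualised mean return by annualised volatility without subtracting the risk-free rate. A Sharpe ratio net of RF could be computed along the following lines. This is a sketch: `ann_sharpe` is a hypothetical helper, and the data are simulated rather than taken from the backtest.

```r
# Annualised Sharpe ratio from monthly returns, net of a monthly risk-free rate.
# `ret` and `rf` are assumed to be decimal monthly series of equal length.
ann_sharpe <- function(ret, rf = 0) {
  excess <- ret - rf
  mean(excess) * 12 / (sd(excess) * sqrt(12))
}

# Illustration on simulated data (hypothetical numbers, not the backtest output)
set.seed(42)
ret_demo <- rnorm(120, mean = 0.007, sd = 0.03)
rf_demo  <- rep(0.002, 120)

print(c(raw = ann_sharpe(ret_demo), net_of_rf = ann_sharpe(ret_demo, rf_demo)))
```

For the backtest itself, `ret` would be `backtest_results$ret_capm` or `backtest_results$ret_ff3`, and `rf` would be the `RF` column of `merged_tbl` matched on the same dates.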