1 Questions from Textbook (60%)

1.1 Chapter 7 — Portfolio Theory & Efficient Frontier

1.1.1 CFA 1a — Limiting to 20 Stocks: Risk Impact

Limiting the portfolio to 20 stocks will increase unsystematic (idiosyncratic) risk.

The portfolio variance for an equally-weighted portfolio of \(n\) assets is:

\[\sigma_P^2 = \frac{1}{n}\bar{\sigma}_i^2 + \frac{n-1}{n}\overline{\text{Cov}}\]

where \(\bar{\sigma}_i^2\) is the average individual asset variance and \(\overline{\text{Cov}}\) is the average pairwise covariance.

As \(n \to \infty\), the first term vanishes:

\[\lim_{n \to \infty} \sigma_P^2 = \overline{\text{Cov}}\]

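Equivalently, \(\sigma_P^2 = \overline{\text{Cov}} + \frac{1}{n}\left(\bar{\sigma}_i^2 - \overline{\text{Cov}}\right)\). Moving from \(n = 40\) to \(n = 20\) therefore doubles the diversifiable component \(\frac{1}{n}\left(\bar{\sigma}_i^2 - \overline{\text{Cov}}\right)\) and raises total risk, since the average variance normally exceeds the average covariance.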
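As a rough illustration of the equal-weight variance formula, the R sketch below evaluates it for several values of \(n\). The average variance (0.16) and average covariance (0.04) are hypothetical inputs chosen for illustration; they are not data from the problem.

```r
# Equal-weight portfolio variance: (1/n) * avg_var + ((n - 1)/n) * avg_cov
# avg_var and avg_cov are illustrative assumptions, not data from the problem.
eq_weight_var <- function(n, avg_var = 0.16, avg_cov = 0.04) {
  avg_var / n + (n - 1) / n * avg_cov
}

n_stocks <- c(10, 20, 40, 1000)
port_var <- eq_weight_var(n_stocks)
print(data.frame(n = n_stocks, variance = port_var, sd = sqrt(port_var)))
```

Under these assumed inputs the variance falls towards the average covariance as \(n\) grows, and each halving of \(n\) adds more variance than the previous one.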

1.1.2 CFA 1b — Reducing to 20 Without Significant Risk Increase

Yes — Hennessy could retain the 20 stocks with the lowest pairwise correlations and highest Sharpe ratios. Since portfolio variance is driven by covariance terms:

\[\sigma_P^2 = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + 2\sum_{i=1}^{n}\sum_{j>i} w_i w_j \rho_{ij}\sigma_i\sigma_j\]

Selecting assets with low \(\rho_{ij}\) minimises the cross-product terms even at smaller \(n\).

1.1.3 CFA 2 — Why 10 Stocks Is More Dangerous Than 20

The marginal benefit of diversification diminishes with \(n\). For an equally weighted portfolio, the share of diversifiable risk that remains, relative to holding a single stock, is:

\[\text{Undiversified risk fraction} = \frac{1}{n}\]

\(n\) | Unsystematic risk fraction
40 | 2.5%
20 | 5.0%
10 | 10.0%
5 | 20.0%

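Going from \(n = 40\) to \(n = 20\) adds 2.5 percentage points to the remaining idiosyncratic risk fraction, whereas going from \(n = 20\) to \(n = 10\) adds 5 percentage points. Each further halving of the portfolio is therefore twice as costly as the previous one.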

1.1.4 CFA 3 — Covariance in the Context of the Full $280M Fund

When evaluating Hennessy as part of the broader fund, the relevant risk measure is not portfolio variance \(\sigma_P^2\) but the contribution to total fund variance:

\[\text{Contribution to fund variance} = w_H \cdot \text{Cov}(R_H, R_{\text{Fund}})\]

Since the other five managers already hold 150+ stocks, the fund is well-diversified. The additional variance from concentrating Hennessy into 10 or 20 stocks is small relative to the total fund. The covariance with the rest of the fund matters far more than Hennessy’s own variance.
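The decomposition behind this argument can be sketched in R. The example treats the fund as two sleeves, Hennessy and the other managers combined. The weights, volatilities and correlation are hypothetical values chosen for illustration and do not come from the problem.

```r
# Hypothetical two-sleeve fund: Hennessy (H) and the other managers combined (O).
# Weights, volatilities and correlation are illustrative assumptions only.
w   <- c(H = 1/6, O = 5/6)
vol <- c(H = 0.25, O = 0.15)
rho <- 0.6

cov_ho <- rho * vol[["H"]] * vol[["O"]]
Sigma  <- matrix(c(vol[["H"]]^2, cov_ho,
                   cov_ho,       vol[["O"]]^2),
                 nrow = 2, dimnames = list(names(w), names(w)))

cov_with_fund <- as.numeric(Sigma %*% w)           # Cov(R_i, R_Fund) for each sleeve
contribution  <- w * cov_with_fund                 # w_i * Cov(R_i, R_Fund)
fund_var      <- as.numeric(t(w) %*% Sigma %*% w)  # total fund variance

print(round(contribution / fund_var, 3))           # share of fund variance by sleeve
```

The contributions sum to the total fund variance, which is why the covariance with the fund, rather than Hennessy's stand-alone variance, is the relevant quantity.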

1.1.5 CFA 4 — Efficient Frontier Exclusion

According to Markowitz (1952), no portfolio on the efficient frontier can be dominated — i.e., no other portfolio offers higher return at the same or lower risk.

Portfolio | Expected Return (%) | Std Dev (%)
W | 15 | 36
X | 12 | 15
Z | 5 | 7
Y | 9 | 21

Portfolio Y is dominated by Portfolio X: \(E(R_X) = 12\% > 9\% = E(R_Y)\) and \(\sigma_X = 15\% < 21\% = \sigma_Y\). Portfolio Y cannot lie on the efficient frontier.
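The dominance check can be run mechanically over the table. The R sketch below flags any portfolio for which another portfolio offers at least as much return with no more risk. The pairwise rule is a necessary screen only: passing it does not prove that a portfolio lies on the frontier.

```r
# Portfolios from the table above (expected return and standard deviation, in %).
pf <- data.frame(name = c("W", "X", "Z", "Y"),
                 er   = c(15, 12, 5, 9),
                 sd   = c(36, 15, 7, 21),
                 stringsAsFactors = FALSE)

# Portfolio i is dominated if some portfolio has return >= and risk <=,
# with at least one of the two inequalities strict.
is_dominated <- sapply(seq_len(nrow(pf)), function(i) {
  any(pf$er >= pf$er[i] & pf$sd <= pf$sd[i] &
      (pf$er > pf$er[i] | pf$sd < pf$sd[i]))
})

print(pf$name[is_dominated])
```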

1.1.6 CFA 10 — Choosing Between A+B and B+C Portfolios

Given standard deviations \(\sigma_A = 40\%\), \(\sigma_B = 20\%\), \(\sigma_C = 40\%\), and correlations \(\rho_{AB} = 0.90\), \(\rho_{BC} = 0.10\).

For an equal-weight (50/50) portfolio, variance is:

\[\sigma_P^2 = 0.25\sigma_1^2 + 0.25\sigma_2^2 + 2(0.25)\rho_{12}\sigma_1\sigma_2\]

Portfolio A+B:

\[\sigma_{AB}^2 = 0.25(40)^2 + 0.25(20)^2 + 0.5(0.90)(40)(20) = 400 + 100 + 360 = 860\] \[\sigma_{AB} = \sqrt{860} \approx 29.3\%\]

Portfolio B+C:

\[\sigma_{BC}^2 = 0.25(20)^2 + 0.25(40)^2 + 0.5(0.10)(20)(40) = 100 + 400 + 40 = 540\] \[\sigma_{BC} = \sqrt{540} \approx 23.2\%\]

Expected returns are not given, so assume they are equal. The B+C portfolio is then preferred: it has materially lower risk (23.2% vs 29.3%) because of the low correlation \(\rho_{BC} = 0.10\).
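The two calculations can be reproduced with a small helper. This is a sketch: the function name two_asset_sd is introduced here for illustration, and the inputs are the standard deviations and correlations given in the problem.

```r
# Standard deviation of a two-asset portfolio with weight w in asset 1
# and (1 - w) in asset 2. Inputs and output are in percent.
two_asset_sd <- function(sd1, sd2, rho, w = 0.5) {
  sqrt(w^2 * sd1^2 + (1 - w)^2 * sd2^2 + 2 * w * (1 - w) * rho * sd1 * sd2)
}

sd_ab <- two_asset_sd(40, 20, rho = 0.90)
sd_bc <- two_asset_sd(20, 40, rho = 0.10)
print(round(c(AB = sd_ab, BC = sd_bc), 1))
```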


1.2 Chapter 8 — Index Models & Beta

1.2.1 CFA 1 — Interpreting Regression Results

The Single-Index Model (Sharpe, 1963):

\[R_i - R_f = \alpha_i + \beta_i(R_M - R_f) + \epsilon_i\]

where \(\alpha_i\) is abnormal return, \(\beta_i\) is systematic risk loading, and \(\epsilon_i \sim N(0, \sigma_{\epsilon_i}^2)\) is the residual.

Statistic | ABC | XYZ
\(\alpha\) | −3.20% | +7.30%
\(\beta\) | 0.60 | 0.97
\(R^2\) | 0.35 | 0.17
Residual SD (\(\sigma_\epsilon\)) | 13.02% | 21.45%

Risk decomposition using \(R^2\):

\[R^2 = \frac{\beta_i^2 \sigma_M^2}{\sigma_i^2} = \frac{\text{Systematic variance}}{\text{Total variance}}\]

  • ABC: 35% systematic, 65% idiosyncratic
  • XYZ: 17% systematic, 83% idiosyncratic

Implication for a diversified portfolio: Idiosyncratic risk (\(\sigma_\epsilon^2\)) is fully diversifiable and commands no risk premium. Therefore, only beta determines expected return going forward, regardless of past alpha.
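Because \(1 - R^2\) is the ratio of residual variance to total variance, the table also implies each stock's total and systematic standard deviation. The R sketch below backs these out from the reported \(R^2\) and residual SD. It assumes both statistics come from the same regression sample.

```r
# Regression statistics from the table above (residual SD in %).
stats <- data.frame(stock    = c("ABC", "XYZ"),
                    r2       = c(0.35, 0.17),
                    resid_sd = c(13.02, 21.45))

# 1 - R^2 = residual variance / total variance, so:
stats$total_sd <- stats$resid_sd / sqrt(1 - stats$r2)
stats$syst_sd  <- sqrt(stats$r2) * stats$total_sd
print(stats)
```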

1.2.2 CFA 2 — Nonsystematic Risk of Baker Fund

Given \(\rho_{P,M} = 0.70\):

\[R^2 = \rho_{P,M}^2 = (0.70)^2 = 0.49\]

\[\text{Systematic risk} = R^2 = 49\%\]

\[\boxed{\text{Nonsystematic (specific) risk} = 1 - R^2 = 51\%}\]

1.2.3 CFA 3 — Implied Beta of Charlottesville International

Using the Capital Asset Pricing Model:

\[E(R_i) = R_f + \beta_i \left[ E(R_M) - R_f \right]\]

Given \(E(R_i) = 9\%\), \(R_f = 3\%\), \(E(R_M) = 11\%\), and \(\rho = 1.0\) (perfect correlation ↔︎ \(R^2 = 1\)):

\[9\% = 3\% + \beta_i(11\% - 3\%)\]

\[\beta_i = \frac{9\% - 3\%}{8\%} = \frac{6}{8} = \boxed{0.75}\]

1.2.4 CFA 4 — Beta Is Most Closely Associated With

Answer: d. Systematic risk.

Beta measures only the co-movement with the market:

\[\beta_i = \frac{\text{Cov}(R_i, R_M)}{\text{Var}(R_M)} = \frac{\rho_{iM}\sigma_i\sigma_M}{\sigma_M^2} = \rho_{iM}\frac{\sigma_i}{\sigma_M}\]

It captures market-wide (systematic) risk only; firm-specific risk is captured by \(\sigma_\epsilon\).

1.2.5 CFA 5 — Beta vs Standard Deviation as Risk Measures

Answer: b. Beta measures only systematic risk, while standard deviation measures total risk (systematic + unsystematic):

\[\underbrace{\sigma_i^2}_{\text{Total}} = \underbrace{\beta_i^2 \sigma_M^2}_{\text{Systematic}} + \underbrace{\sigma_{\epsilon_i}^2}_{\text{Unsystematic}}\]


1.3 Chapter 9 — CAPM & SML

Data: Portfolio R — \(E(R) = 11\%\), \(\sigma = 10\%\), \(\beta = 0.5\). S&P 500 — \(E(R_M) = 14\%\), \(\sigma_M = 12\%\), \(\beta_M = 1.0\).

1.3.1 CFA 8 — Position of Portfolio R on the SML

The Security Market Line (SML) gives the CAPM-required return:

\[E(R_i)^* = R_f + \beta_i \left[ E(R_M) - R_f \right]\]

The problem does not state a risk-free rate, so take \(R_f = 0\%\) as a working assumption:

\[E(R_R)^* = 0 + 0.5 \times 14\% = 7\%\]

Since actual \(E(R_R) = 11\% > 7\%\), Portfolio R has a positive alpha:

\[\alpha_R = 11\% - 7\% = +4\% > 0\]

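Under this assumption, Portfolio R plots above the SML. Answer: c. More generally, \(\alpha_R = 11\% - [R_f + 0.5(14\% - R_f)] = 4\% - 0.5R_f\), so the conclusion holds only for \(R_f < 8\%\).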
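Since the risk-free rate is not part of the given data, it is worth checking how the sign of alpha depends on it. The R sketch below evaluates alpha over a grid of hypothetical risk-free rates. The grid values are assumptions made for illustration.

```r
# Data for Portfolio R and the S&P 500 from the problem (in %).
er_r <- 11; beta_r <- 0.5; er_m <- 14

# Alpha = actual expected return minus the CAPM-required return, for a given rf (in %).
alpha_r <- function(rf) er_r - (rf + beta_r * (er_m - rf))

rf_grid <- c(0, 2, 4, 6, 8, 10)   # hypothetical risk-free rates
print(data.frame(rf = rf_grid, alpha = alpha_r(rf_grid)))
```

Alpha is positive for risk-free rates below 8%, zero at 8%, and negative above it.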

1.3.2 CFA 9 — Position of Portfolio R on the CML

The Capital Market Line (CML) applies only to efficient (fully diversified) portfolios:

\[E(R_P) = R_f + \frac{E(R_M) - R_f}{\sigma_M} \cdot \sigma_P\]

The slope of the CML is the market portfolio's Sharpe ratio. Using the same \(R_f = 0\%\), compare:

\[SR_M = \frac{14\% - 0\%}{12\%} = 1.167\]

\[SR_R = \frac{11\% - 0\%}{10\%} = 1.100\]

Since \(SR_R < SR_M\), Portfolio R plots below the CML. Answer: b.

Note: The SML and CML give opposite conclusions because Portfolio R is not fully diversified — it has unsystematic risk. The SML rewards systematic risk only; the CML punishes total risk.
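The CML comparison is less sensitive to the risk-free rate than the SML comparison. The R sketch below computes the gap between the two Sharpe ratios over a grid of hypothetical risk-free rates. The grid is an assumption made for illustration.

```r
# Data for Portfolio R and the S&P 500 from the problem (in %).
er_r <- 11; sd_r <- 10; er_m <- 14; sd_m <- 12

# Sharpe ratio of R minus Sharpe ratio of the market, for a given rf (in %).
sharpe_gap <- function(rf) (er_r - rf) / sd_r - (er_m - rf) / sd_m

rf_grid <- c(0, 2, 4, 6, 8)       # hypothetical risk-free rates
print(data.frame(rf = rf_grid, sr_r_minus_sr_m = sharpe_gap(rf_grid)))
```

The gap simplifies to \(-\tfrac{1}{15} - \tfrac{R_f}{60}\) with \(R_f\) in percent, so Portfolio R sits below the CML for every non-negative risk-free rate.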

1.3.3 CFA 10 — Should Portfolio A Earn Higher Return Than B?

Feature | Portfolio A | Portfolio B
Beta (\(\beta\)) | 1.0 | 1.0
Specific (unsystematic) risk | High | Low

Under CAPM, expected return depends only on beta:

\[E(R_i) = R_f + \beta_i \cdot \left[E(R_M) - R_f\right]\]

Since \(\beta_A = \beta_B = 1.0\):

\[E(R_A) = E(R_B)\]

Unsystematic (specific) risk is diversifiable and earns no risk premium. Therefore, investors should not expect a higher return on Portfolio A simply because it has higher idiosyncratic risk.


1.4 Chapter 10 — Arbitrage Pricing Theory (APT)

Setup: 2-factor APT. GDP risk premium \(\lambda_1 = 8\%\), Inflation risk premium \(\lambda_2 = 2\%\), \(R_f = 4\%\).

Fund | \(\beta_{\text{GDP}}\) | \(\beta_{\text{Infl}}\)
High Growth | 1.25 | 1.50
Large Cap | 0.75 | 1.25
Utility | 1.00 | 2.00

1.4.1 Q13 — APT Expected Return: High Growth Fund

The APT pricing equation with \(K\) factors:

\[E(R_i) = R_f + \sum_{k=1}^{K} \beta_{ik} \lambda_k\]

\[E(R_{\text{HG}}) = 4\% + 1.25 \times 8\% + 1.50 \times 2\%\] \[= 4\% + 10\% + 3\% = \boxed{17\%}\]
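The same pricing equation can be applied to all three funds at once with a matrix product. The R sketch below uses the betas, risk premia and risk-free rate from the setup. The object names are chosen here for illustration.

```r
rf     <- 4                                   # risk-free rate, %
lambda <- c(GDP = 8, Infl = 2)                # factor risk premia, %
betas  <- rbind(HighGrowth = c(1.25, 1.50),
                LargeCap   = c(0.75, 1.25),
                Utility    = c(1.00, 2.00))
colnames(betas) <- names(lambda)

# E(R_i) = rf + sum_k beta_ik * lambda_k, computed for every fund
expected <- rf + as.numeric(betas %*% lambda)
names(expected) <- rownames(betas)
print(expected)
```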

1.4.2 Q14 — Arbitrage Opportunity: Large Cap Fund

APT-implied excess return:

\[E(R_{\text{LC}}) - R_f = \beta_1 \lambda_1 + \beta_2 \lambda_2 = 0.75(8\%) + 1.25(2\%) = 6\% + 2.5\% = 8.5\%\]

Kwon’s fundamental estimate: also 8.5% above \(R_f\).

Because the APT-implied excess return equals Kwon’s fundamental estimate, the fund is fairly priced and no arbitrage opportunity exists.

1.4.3 Q15 — Constructing the GDP Fund (Pure Factor Portfolio)

We want a portfolio with \(\beta_{\text{GDP}} = 1\) and \(\beta_{\text{Infl}} = 0\). Let weights \(w_1, w_2, w_3\) for High Growth, Large Cap, and Utility:

System of equations:

\[\begin{cases} 1.25w_1 + 0.75w_2 + 1.00w_3 = 1 \quad (\beta_{\text{GDP}} = 1) \\ 1.50w_1 + 1.25w_2 + 2.00w_3 = 0 \quad (\beta_{\text{Infl}} = 0) \\ w_1 + w_2 + w_3 = 1 \quad (\text{weights sum to 1}) \end{cases}\]

Solving by elimination:

Subtracting equation 3 from equation 1: \(0.25w_1 - 0.25w_2 + 0 \cdot w_3 = 0 \Rightarrow w_1 = w_2\)
Substituting \(w_2 = w_1\) into equations 2 and 3:

\[1.50w_1 + 1.25w_1 + 2.00w_3 = 0 \Rightarrow 2.75w_1 = -2w_3\] \[2w_1 + w_3 = 1\]

Solving: \(w_3 = 1 - 2w_1\), so \(2.75w_1 = -2(1-2w_1) \Rightarrow 2.75w_1 = -2 + 4w_1 \Rightarrow 1.25w_1 = 2 \Rightarrow w_1 = w_2 = 1.6\)

Then \(w_3 = 1 - 2(1.6) = -2.2\), so the weight in the Utility Fund is \(\boxed{-2.2}\). Answer: (a).
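The three constraints form a linear system that can also be solved numerically as a check on the Utility Fund weight. A minimal R sketch using base solve():

```r
# Rows: GDP-beta constraint, inflation-beta constraint, budget constraint.
# Columns: High Growth, Large Cap, Utility.
A <- rbind(gdp_beta  = c(1.25, 0.75, 1.00),
           infl_beta = c(1.50, 1.25, 2.00),
           budget    = c(1.00, 1.00, 1.00))
b <- c(1, 0, 1)

w <- solve(A, b)
names(w) <- c("HighGrowth", "LargeCap", "Utility")
print(round(w, 4))
```

The solution is long 1.6 in each of High Growth and Large Cap and short 2.2 in Utility.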

1.4.4 Q16 — Who Is Correct About the GDP Fund?

  • Stiles argues the GDP Fund is ideal for retirees needing steady income from stable GDP growth. ✓
  • McCracken argues it is a good bet if supply-side macroeconomic policies succeed. ✓

Answer: b. Both are correct. Each perspective is valid from a different investor objective.


2 Questions Using R Codes (40%)

2.1 Q1: Import Data

library(tidyquant)
library(lubridate)
library(timetk)
library(tidyr)
library(dplyr)
library(quadprog)
library(ggplot2)
library(zoo)

tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")

prices_raw <- tq_get(tickers,
                     from = "2010-01-01",
                     to   = Sys.Date(),
                     get  = "stock.prices")

prices_wide <- prices_raw %>%
  select(date, symbol, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  arrange(date)

prices_xts <- xts(prices_wide[, tickers], order.by = prices_wide$date)

head(prices_xts)
##                 SPY      QQQ      EEM      IWM      EFA      TLT      IYR
## 2010-01-04 84.79638 40.29078 30.35151 51.36656 35.12844 55.70953 26.76812
## 2010-01-05 85.02084 40.29078 30.57181 51.18993 35.15939 56.06932 26.83238
## 2010-01-06 85.08071 40.04776 30.63576 51.14177 35.30800 55.31875 26.82070
## 2010-01-07 85.43983 40.07380 30.45810 51.51909 35.17178 55.41176 27.06027
## 2010-01-08 85.72415 40.40362 30.69972 51.80009 35.45043 55.38698 26.87912
## 2010-01-11 85.84389 40.23870 30.63576 51.59138 35.74147 55.08301 27.00767
##               GLD
## 2010-01-04 109.80
## 2010-01-05 109.70
## 2010-01-06 111.51
## 2010-01-07 110.82
## 2010-01-08 111.37
## 2010-01-11 112.85
tail(prices_xts)
##               SPY    QQQ   EEM    IWM    EFA   TLT    IYR    GLD
## 2026-06-02 759.57 746.16 70.80 291.66 105.02 85.65  99.99 411.95
## 2026-06-03 754.24 744.21 69.92 287.67 104.12 85.31 100.00 407.87
## 2026-06-04 757.09 740.61 69.10 292.01 104.95 85.50 101.79 411.27
## 2026-06-05 737.55 705.06 64.59 281.65 102.26 85.06 102.54 396.24
## 2026-06-08 739.22 716.07 65.75 284.11 102.88 84.62 101.08 397.27
## 2026-06-09     NA     NA    NA     NA     NA    NA     NA     NA

2.2 Q2: Weekly and Monthly Simple Returns

Simple (arithmetic) return for period \([t-1, t]\):

\[R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1\]

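When the period return is measured from the first to the last price within the same week or month, it becomes the expression below. This within-period measure excludes the price change between the last trading day of the previous period and the first trading day of the current one, so it can differ from a period-end to period-end return.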

\[R_{\text{period}} = \frac{P_{\text{last}}}{P_{\text{first}}} - 1\]
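The within-period formula above omits the move from the previous period's last price to the current period's first price. The toy R sketch below contrasts it with a month-end to month-end return. The dates and prices are made up for illustration and are not market data.

```r
# Hypothetical daily closing prices spanning two months (not real data).
dates  <- as.Date(c("2020-01-30", "2020-01-31", "2020-02-03",
                    "2020-02-14", "2020-02-28"))
prices <- c(100, 102, 101, 105, 104)
by_month <- split(prices, format(dates, "%Y-%m"))

# (a) Within-period return: last price of the month over first price of the month.
within_ret <- sapply(by_month, function(p) p[length(p)] / p[1] - 1)

# (b) Month-end to month-end return: last price over the previous month's last price.
month_end  <- sapply(by_month, function(p) p[length(p)])
end_to_end <- month_end[-1] / month_end[-length(month_end)] - 1

print(within_ret)
print(end_to_end)
```

For February the two measures differ because (a) starts from the 3 February price of 101 while (b) starts from the 31 January price of 102.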

weekly_ret <- do.call(merge, lapply(tickers, function(tk) {
  apply.weekly(prices_xts[, tk], function(x) (as.numeric(last(x)) / as.numeric(first(x))) - 1)
}))
colnames(weekly_ret) <- tickers

monthly_ret <- do.call(merge, lapply(tickers, function(tk) {
  apply.monthly(prices_xts[, tk], function(x) (as.numeric(last(x)) / as.numeric(first(x))) - 1)
}))
colnames(monthly_ret) <- tickers

head(monthly_ret)
##                     SPY         QQQ          EEM         IWM         EFA
## 2010-01-29 -0.052413030 -0.07819894 -0.103722809 -0.06048754 -0.07491646
## 2010-02-26  0.015404657  0.03467402 -0.008903700  0.03255515 -0.01534424
## 2010-03-31  0.049975714  0.06169090  0.063099145  0.05771610  0.05562897
## 2010-04-30  0.008574286  0.02242518 -0.027070455  0.04705522 -0.04493577
## 2010-05-28 -0.091233806 -0.08672108 -0.098864634 -0.09568650 -0.11824833
## 2010-06-30 -0.035514821 -0.05101633  0.004468181 -0.04856797 -0.01079338
##                    TLT         IYR          GLD
## 2010-01-29  0.02783664 -0.05195415 -0.034972713
## 2010-02-26  0.00570586  0.03573056  0.009967714
## 2010-03-31 -0.02014407  0.08633669 -0.004386396
## 2010-04-30  0.03574993  0.05898811  0.046254293
## 2010-05-28  0.05245827 -0.08516512  0.027218472
## 2010-06-30  0.05059361 -0.02782242  0.014761042

2.3 Q3: Monthly Returns as Tibble

monthly_tbl <- data.frame(date = as.yearmon(index(monthly_ret)),
                           coredata(monthly_ret),
                           check.names = FALSE)
head(monthly_tbl)
##       date          SPY         QQQ          EEM         IWM         EFA
## 1 Jan 2010 -0.052413030 -0.07819894 -0.103722809 -0.06048754 -0.07491646
## 2 Feb 2010  0.015404657  0.03467402 -0.008903700  0.03255515 -0.01534424
## 3 Mar 2010  0.049975714  0.06169090  0.063099145  0.05771610  0.05562897
## 4 Apr 2010  0.008574286  0.02242518 -0.027070455  0.04705522 -0.04493577
## 5 May 2010 -0.091233806 -0.08672108 -0.098864634 -0.09568650 -0.11824833
## 6 Jun 2010 -0.035514821 -0.05101633  0.004468181 -0.04856797 -0.01079338
##           TLT         IYR          GLD
## 1  0.02783664 -0.05195415 -0.034972713
## 2  0.00570586  0.03573056  0.009967714
## 3 -0.02014407  0.08633669 -0.004386396
## 4  0.03574993  0.05898811  0.046254293
## 5  0.05245827 -0.08516512  0.027218472
## 6  0.05059361 -0.02782242  0.014761042

2.4 Q4: Fama-French 3 Factors

The Fama-French 3-Factor Model (Fama & French, 1993) extends CAPM:

\[R_i - R_f = \alpha_i + \beta_i^{\text{MKT}}(R_M - R_f) + \beta_i^{\text{SMB}} \cdot SMB + \beta_i^{\text{HML}} \cdot HML + \epsilon_i\]

Factor | Description
\(R_M - R_f\) | Market excess return
\(SMB\) | Small Minus Big (size premium)
\(HML\) | High Minus Low (value premium)

ff_url  <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip"
tmp_zip <- tempfile(fileext = ".zip")
download.file(ff_url, tmp_zip, mode = "wb")

zip_contents <- unzip(tmp_zip, list = TRUE)
csv_name     <- zip_contents$Name[grepl("\\.CSV$|\\.csv$", zip_contents$Name)][1]
extract_dir  <- tempdir()
unzip(tmp_zip, files = csv_name, exdir = extract_dir)
csv_path <- file.path(extract_dir, csv_name)

ff_raw <- read.csv(csv_path, skip = 3, header = TRUE, stringsAsFactors = FALSE)
colnames(ff_raw)[1] <- "date"

ff_monthly <- ff_raw %>%
  filter(grepl("^\\s*[0-9]{6}\\s*$", as.character(date))) %>%
  mutate(
    date   = as.yearmon(trimws(as.character(date)), "%Y%m"),
    MktRF  = as.numeric(trimws(as.character(Mkt.RF))) / 100,
    SMB    = as.numeric(trimws(as.character(SMB)))    / 100,
    HML    = as.numeric(trimws(as.character(HML)))    / 100,
    RF     = as.numeric(trimws(as.character(RF)))     / 100
  ) %>%
  select(date, MktRF, SMB, HML, RF)

head(ff_monthly)
##       date   MktRF     SMB     HML     RF
## 1 Jul 1926  0.0289 -0.0255 -0.0239 0.0022
## 2 Aug 1926  0.0264 -0.0114  0.0381 0.0025
## 3 Sep 1926  0.0038 -0.0136  0.0005 0.0023
## 4 Oct 1926 -0.0327 -0.0014  0.0082 0.0032
## 5 Nov 1926  0.0254 -0.0011 -0.0061 0.0031
## 6 Dec 1926  0.0262 -0.0007  0.0006 0.0028

2.5 Q5: Merge Monthly Returns and FF3 Factors

merged_tbl <- inner_join(monthly_tbl, ff_monthly, by = "date")

stopifnot(all(tickers %in% colnames(merged_tbl)))

cat("Merged data:", nrow(merged_tbl), "rows,", ncol(merged_tbl), "columns\n")
## Merged data: 196 rows, 13 columns
cat("Columns:", paste(colnames(merged_tbl), collapse = ", "), "\n")
## Columns: date, SPY, QQQ, EEM, IWM, EFA, TLT, IYR, GLD, MktRF, SMB, HML, RF
head(merged_tbl)
##       date          SPY         QQQ          EEM         IWM         EFA
## 1 Jan 2010 -0.052413030 -0.07819894 -0.103722809 -0.06048754 -0.07491646
## 2 Feb 2010  0.015404657  0.03467402 -0.008903700  0.03255515 -0.01534424
## 3 Mar 2010  0.049975714  0.06169090  0.063099145  0.05771610  0.05562897
## 4 Apr 2010  0.008574286  0.02242518 -0.027070455  0.04705522 -0.04493577
## 5 May 2010 -0.091233806 -0.08672108 -0.098864634 -0.09568650 -0.11824833
## 6 Jun 2010 -0.035514821 -0.05101633  0.004468181 -0.04856797 -0.01079338
##           TLT         IYR          GLD   MktRF     SMB     HML    RF
## 1  0.02783664 -0.05195415 -0.034972713 -0.0335  0.0043  0.0033 0e+00
## 2  0.00570586  0.03573056  0.009967714  0.0339  0.0118  0.0318 0e+00
## 3 -0.02014407  0.08633669 -0.004386396  0.0630  0.0146  0.0219 1e-04
## 4  0.03574993  0.05898811  0.046254293  0.0199  0.0484  0.0296 1e-04
## 5  0.05245827 -0.08516512  0.027218472 -0.0790  0.0013 -0.0248 1e-04
## 6  0.05059361 -0.02782242  0.014761042 -0.0556 -0.0179 -0.0473 1e-04

2.6 Q6: GMV Portfolio via CAPM (Single Period)

Global Minimum Variance (GMV) portfolio minimises \(\sigma_P^2 = \mathbf{w}^\top \Sigma \mathbf{w}\) subject to \(\mathbf{1}^\top \mathbf{w} = 1\), \(\mathbf{w} \geq 0\).

The CAPM-implied covariance matrix is:

\[\Sigma^{\text{CAPM}} = \boldsymbol{\beta}\boldsymbol{\beta}^\top \sigma_M^2 + \mathbf{D}\]

where \(\boldsymbol{\beta}\) is the vector of betas and \(\mathbf{D} = \text{diag}(\sigma_{\epsilon_1}^2, \ldots, \sigma_{\epsilon_n}^2)\).

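With the long-only constraint \(\mathbf{w} \geq 0\) there is generally no closed form, so the gmv_weights() function solves the quadratic program numerically with solve.QP(). If short sales are allowed (dropping \(\mathbf{w} \geq 0\)), the GMV weights have the closed form: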

\[\mathbf{w}^* = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}\]
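For comparison with the long-only solver, the closed-form expression can be evaluated directly. The R sketch below uses a small hypothetical covariance matrix; the values are illustrative and are not estimated from the ETF data.

```r
# Illustrative 3-asset covariance matrix (hypothetical values).
Sigma <- matrix(c(0.040, 0.006, 0.002,
                  0.006, 0.010, 0.001,
                  0.002, 0.001, 0.020), nrow = 3, byrow = TRUE)

ones  <- rep(1, nrow(Sigma))
x     <- solve(Sigma, ones)   # Sigma^{-1} 1, without forming the inverse explicitly
w_gmv <- x / sum(x)           # divide by 1' Sigma^{-1} 1 so the weights sum to 1

print(round(w_gmv, 4))
print(as.numeric(t(w_gmv) %*% Sigma %*% w_gmv))   # portfolio variance at the GMV weights
```

At the unconstrained optimum every asset has the same covariance with the portfolio, which gives a quick check on the result.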

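gmv_weights <- function(cov_mat) {
  # Long-only GMV: minimise w' Sigma w  subject to  sum(w) = 1 and w >= 0.
  # solve.QP minimises (1/2) w' Dmat w - dvec' w  subject to  t(Amat) %*% w >= bvec,
  # with the first meq constraints treated as equalities.
  n    <- ncol(cov_mat)
  Dmat <- 2 * cov_mat
  dvec <- rep(0, n)
  Amat <- cbind(rep(1, n), diag(n))   # column 1: budget; remaining columns: w_i >= 0
  bvec <- c(1, rep(0, n))
  sol  <- tryCatch(
    solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution,
    error = function(e) rep(1/n, n)   # fall back to equal weights if the solver fails
  )
  sol / sum(sol)                      # renormalise to remove numerical drift
}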

capm_cov <- function(ret_mat, mkt) {
  # Single-index covariance: beta beta' * var(mkt) + diag(residual variances)
  n         <- ncol(ret_mat)
  betas     <- sapply(1:n, function(i) cov(ret_mat[, i], mkt) / var(mkt))
  resid_var <- sapply(1:n, function(i) {
    e <- ret_mat[, i] - betas[i] * mkt   # intercept omitted; it does not affect var(e)
    var(e)
  })
  as.matrix(betas %*% t(betas) * var(mkt) + diag(resid_var))
}

ff3_cov <- function(ret_mat, mkt, smb, hml) {
  # Three-factor covariance: B Sigma_F B' + diag(residual variances)
  n     <- ncol(ret_mat)
  f_mat <- cbind(mkt, smb, hml)
  # Fit each asset's regression once and reuse it for loadings and residuals
  fits  <- lapply(1:n, function(i) lm(ret_mat[, i] ~ mkt + smb + hml))
  betas <- t(sapply(fits, function(f) coef(f)[-1]))        # n x 3 loadings, intercept dropped
  resid_var <- sapply(fits, function(f) var(residuals(f)))
  cov_f <- cov(f_mat)
  as.matrix(betas %*% cov_f %*% t(betas) + diag(resid_var))
}

train <- merged_tbl %>%
  filter(date >= as.yearmon("2010-02") & date <= as.yearmon("2015-01"))

ret_mat <- as.matrix(train[, tickers])
mkt_tot <- train$MktRF + train$RF

cov_capm <- capm_cov(ret_mat, mkt_tot)
w_capm   <- gmv_weights(cov_capm)
names(w_capm) <- tickers

cat("CAPM GMV Weights (estimated on 2015/01):\n")
## CAPM GMV Weights (estimated on 2015/01):
print(round(w_capm, 4))
##    SPY    QQQ    EEM    IWM    EFA    TLT    IYR    GLD 
## 0.2605 0.1036 0.0052 0.0000 0.0348 0.4598 0.0578 0.0783
ret_201502    <- as.numeric(merged_tbl[merged_tbl$date == as.yearmon("2015-02"), tickers])
realized_capm <- sum(w_capm * ret_201502)
cat("\nRealized Return (CAPM GMV) 2015/02:", round(realized_capm * 100, 4), "%\n")
## 
## Realized Return (CAPM GMV) 2015/02: -1.222 %

2.7 Q7: GMV Portfolio via FF3 Factor Model (Single Period)

The FF3-implied covariance matrix:

\[\Sigma^{\text{FF3}} = \mathbf{B} \Sigma_F \mathbf{B}^\top + \mathbf{D}\]

where \(\mathbf{B}\) is the \(n \times 3\) matrix of factor loadings \((\beta^{\text{MKT}}, \beta^{\text{SMB}}, \beta^{\text{HML}})\), \(\Sigma_F\) is the \(3 \times 3\) factor covariance matrix, and \(\mathbf{D}\) is the diagonal matrix of residual variances.

cov_ff3 <- ff3_cov(ret_mat, train$MktRF, train$SMB, train$HML)
w_ff3   <- gmv_weights(cov_ff3)
names(w_ff3) <- tickers

cat("FF3 GMV Weights (estimated on 2015/01):\n")
## FF3 GMV Weights (estimated on 2015/01):
print(round(w_ff3, 4))
##    SPY    QQQ    EEM    IWM    EFA    TLT    IYR    GLD 
## 0.3279 0.0282 0.0000 0.0289 0.0214 0.4688 0.0640 0.0608
realized_ff3 <- sum(w_ff3 * ret_201502)
cat("\nRealized Return (FF3 GMV) 2015/02:", round(realized_ff3 * 100, 4), "%\n")
## 
## Realized Return (FF3 GMV) 2015/02: -1.3181 %

2.8 Q8: Rolling Backtest — GMV Portfolios (2015/02 to 2026/05)

Rolling window estimation: At each investment date \(t\), weights are estimated on the 60-month window \([t-60, t-1]\) and applied in period \(t\).

Cumulative return at time \(T\), expressed as the value of one unit invested at the start:

\[\text{CR}_T = \prod_{t=1}^{T}(1 + R_t)\]

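Annualised Sharpe Ratio, computed from monthly data, where \(\hat{\sigma}_p\) is the sample standard deviation of the monthly excess returns \(R_p - R_f\):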

\[SR_{\text{ann}} = \frac{\bar{R}_p - \bar{R}_f}{\hat{\sigma}_p} \times \sqrt{12}\]

invest_dates <- merged_tbl$date[
  merged_tbl$date >= as.yearmon("2015-02") &
  merged_tbl$date <= as.yearmon("2026-05")
]

roll_results <- lapply(invest_dates, function(t) {
  train_end   <- t - 1/12
  train_start <- train_end - 59/12

  train_w <- merged_tbl %>%
    filter(date >= train_start & date <= train_end)

  if (nrow(train_w) < 55) return(NULL)

  r_mat <- as.matrix(train_w[, tickers])
  mkt_e <- train_w$MktRF
  smb_  <- train_w$SMB
  hml_  <- train_w$HML
  rf_   <- train_w$RF

  cov_c <- tryCatch(capm_cov(r_mat, mkt_e + rf_), error = function(e) NULL)
  cov_f <- tryCatch(ff3_cov(r_mat, mkt_e, smb_, hml_), error = function(e) NULL)

  ret_t  <- as.numeric(merged_tbl[merged_tbl$date == t, tickers])
  rf_t   <- merged_tbl$RF[merged_tbl$date == t]

  r_capm <- if (!is.null(cov_c)) sum(gmv_weights(cov_c) * ret_t) else NA
  r_ff3  <- if (!is.null(cov_f)) sum(gmv_weights(cov_f) * ret_t) else NA

  data.frame(date = t, ret_capm = r_capm, ret_ff3 = r_ff3, rf = rf_t)
})

backtest <- do.call(rbind, roll_results)

backtest$cum_capm <- cumprod(1 + ifelse(is.na(backtest$ret_capm), 0, backtest$ret_capm))
backtest$cum_ff3  <- cumprod(1 + ifelse(is.na(backtest$ret_ff3),  0, backtest$ret_ff3))

# Performance metrics
sharpe <- function(ret, rf) {
  ex <- ret - rf
  (mean(ex, na.rm = TRUE) / sd(ex, na.rm = TRUE)) * sqrt(12)
}

cat("=== Performance Summary ===\n")
## === Performance Summary ===
cat("CAPM GMV  |  Annualised Sharpe:", round(sharpe(backtest$ret_capm, backtest$rf), 3), "\n")
## CAPM GMV  |  Annualised Sharpe: 0.391
cat("FF3  GMV  |  Annualised Sharpe:", round(sharpe(backtest$ret_ff3,  backtest$rf), 3), "\n")
## FF3  GMV  |  Annualised Sharpe: 0.37
cat("Final Cumulative Return (CAPM):", round(tail(backtest$cum_capm, 1), 4), "\n")
## Final Cumulative Return (CAPM): 1.8267
cat("Final Cumulative Return (FF3) :", round(tail(backtest$cum_ff3,  1), 4), "\n")
## Final Cumulative Return (FF3) : 1.7887
# Plot
plot_data <- data.frame(
  date = as.Date(as.yearmon(backtest$date)),
  CAPM = backtest$cum_capm,
  FF3  = backtest$cum_ff3
)

plot_long <- pivot_longer(plot_data, cols = c(CAPM, FF3),
                           names_to = "Model", values_to = "CumReturn")

ggplot(plot_long, aes(x = date, y = CumReturn, color = Model)) +
  geom_line(linewidth = 1.1) +
  scale_color_manual(values = c("CAPM" = "steelblue", "FF3" = "tomato")) +
  labs(
    title    = "Cumulative Returns: GMV Portfolios (2015/02 – 2026/05)",
    subtitle = "Rolling 60-month estimation window, rebalanced monthly",
    x        = "Date",
    y        = "Cumulative Return (1 = initial investment)",
    color    = "Model"
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom")


References: Markowitz (1952); Sharpe (1963); Fama & French (1993); Ross (1976).