Course: Portfolio Analysis  |  Instructor: Shyh-Weir Tzang  |  Submission: Final Exam (Spring 2026)

This report covers two parts: (1) conceptual and quantitative solutions to the CFA-style textbook problems (Chapters 7–10), and (2) a fully reproducible R workflow that builds and backtests Global Minimum Variance portfolios using both the CAPM and the Fama-French 3-factor model.


1 Questions from textbook (60%)

1.1 Chapter 7

1.1.1 CFA 1

a. Will the limitation to 20 stocks likely increase or decrease the risk of the portfolio? Explain.

Restricting the portfolio to 20 stocks (versus 40) will likely increase risk. Reducing the number of holdings reduces the degree of diversification, which increases firm-specific (nonsystematic) risk that is otherwise diversified away. Even though Hennessy keeps the same investment process, with fewer names the portfolio’s unsystematic risk component will be larger, raising total portfolio standard deviation, all else equal.

b. Is there any way Hennessy could reduce the number of issues from 40 to 20 without significantly affecting risk? Explain.

Yes — if Hennessy concentrates the portfolio on the more highly correlated/sector-similar stocks it would raise risk, but if instead Hennessy selects 20 stocks that are relatively uncorrelated with one another (spread across industries) while dropping issues that are highly correlated with stocks already retained, nonsystematic risk can be kept low even with fewer holdings. Studies (e.g., Evans & Archer) show that the marginal diversification benefit of additional securities drops off sharply after about 15–20 well-chosen, low-correlation stocks, so a carefully selected set of 20 can approximate the risk-reduction benefits of 40 if redundant/highly correlated names are the ones eliminated.

1.1.2 CFA 2

Reducing the number of stocks from 20 to 10 is less likely to be advantageous than the original reduction from 40 to 20. The principle of diminishing marginal benefit of diversification means that going from 40 to 20 names removes a lot of “redundant” diversification (since most of the unsystematic risk-reduction benefit of diversification is achieved with the first 15–20 stocks). Cutting further from 20 to 10 sacrifices substantially more diversification benefit per stock dropped, sharply increasing unsystematic risk, while the prospective gain (forcing Hennessy to hold only its “best ideas”) is unlikely to outweigh this added risk. In short, the risk-reduction curve is convex/flattening — early cuts (40→20) cost little in diversification, but further cuts (20→10) cost a lot.

1.1.3 CFA 3

Viewed at the total fund level rather than in isolation, the Hennessy portfolio’s nonsystematic risk is far less important, because it is combined with the other five managers’ $250 million (150+ issues) portfolios. The aggregation across managers already diversifies away much of any one manager’s firm-specific risk. Therefore, concentrating the Hennessy portfolio (which is a small slice of the $280 million total fund) into fewer, higher-conviction stocks may be a reasonable trade since the marginal increase in total fund risk from concentrating one sleeve is muted by the diversification provided by the other five managers. This broader perspective could support allowing Hennessy more concentration (e.g., 10 or even fewer stocks) than would be prudent if Hennessy’s portfolio were evaluated as a stand-alone, complete portfolio.

1.1.4 CFA 4

Which portfolio cannot lie on the efficient frontier?

Portfolio E(R) % SD %
a W 15 36
b X 12 15
c Z 5 7
d Y 9 21

Compare Y to X: X has higher return (12% > 9%) and lower risk (15% < 21%) than Y. X dominates Y, so Y cannot lie on the efficient frontier (answer d).

1.1.5 CFA 10

Stock data: - SD: A = 40%, B = 20%, C = 40% - Correlations: ρ(A,B) = 0.90, ρ(B,C) = 0.10, ρ(A,C) = 0.50

Compare an equal-weighted A&B portfolio vs. an equal-weighted B&C portfolio.

sdA <- 0.40; sdB <- 0.20; sdC <- 0.40
rho_AB <- 0.90
rho_BC <- 0.10

w <- 0.5

var_AB <- w^2*sdA^2 + w^2*sdB^2 + 2*w*w*rho_AB*sdA*sdB
var_BC <- w^2*sdB^2 + w^2*sdC^2 + 2*w*w*rho_BC*sdB*sdC

sd_AB <- sqrt(var_AB)
sd_BC <- sqrt(var_BC)

cat("Portfolio A&B: SD =", round(sd_AB*100,2), "%\n")
## Portfolio A&B: SD = 29.33 %
cat("Portfolio B&C: SD =", round(sd_BC*100,2), "%\n")
## Portfolio B&C: SD = 23.24 %

Recommendation: The B&C portfolio is preferred. Both portfolios are built from stocks with the same individual standard deviations (B=20%, C or A=40%), but B&C has a much lower correlation (0.10) than A&B (0.90). Lower correlation between the two component stocks produces greater diversification benefit, which is confirmed by the calculation above — the B&C portfolio has a lower standard deviation than the A&B portfolio for the same expected-return inputs (since we have no expected return data, the choice is based purely on risk reduction from correlation, and B&C dominates on a risk basis using only the information given).


1.2 Chapter 8

1.2.1 CFA 1

Statistic ABC XYZ
Alpha −3.20% 7.3%
Beta 0.60 0.97
0.35 0.17
Residual SD 13.02% 21.45%

Interpretation over the sample period: - ABC: Low beta (0.60) means ABC has historically been less volatile/sensitive to market moves than the market — a defensive stock. Its negative alpha (−3.20%) indicates that, on a risk-adjusted basis, ABC underperformed what would be predicted by the single-index/CAPM model over this period. R² of 0.35 means 35% of ABC’s return variance is explained by market movements (systematic), and 65% is firm-specific (residual) risk — a relatively low R², so the single-index model is a fairly weak descriptor of ABC’s returns; nonetheless ABC’s lower residual SD (13.02%) means its specific risk in absolute terms is still smaller than XYZ’s. - XYZ: Higher beta (0.97 ≈ market beta) means XYZ moves nearly in lockstep with the market. Positive alpha (7.3%) suggests XYZ earned attractive risk-adjusted excess returns historically. However R² is even lower (0.17), meaning only 17% of its variance is explained by the market — most of XYZ’s risk (83%) is firm-specific, and its residual SD (21.45%) is much higher than ABC’s, indicating XYZ carries much more nonsystematic risk.

Implications for future risk-return relationships / for a diversified portfolio: Past alpha is not very reliable evidence of future skill/mispricing, especially with such low R² values — both regressions have weak explanatory power, so the beta and alpha estimates are statistically noisy (a low R² implies high standard errors on the beta/alpha estimates). In a diversified portfolio, firm-specific (residual) risk is diversified away, so what matters most for portfolio risk contribution is beta, not residual SD. Since beta itself is imprecisely estimated when R² is low, the historical betas (0.60, 0.97) should be treated cautiously.

This caution is reinforced by the brokerage-house beta estimates: Brokerage A and B give ABC betas of 0.62 and 0.71 (vs. regression’s 0.60) and XYZ betas of 1.45 and 1.25 (vs. regression’s 0.97) — a much wider disagreement for XYZ, consistent with XYZ’s regression having the lower R² (less reliable beta estimate). This suggests the analyst should not rely solely on the 5-year regression beta, particularly for XYZ, and should consider an average/adjusted beta (e.g., Blume-adjustment toward 1.0) when forming forward-looking estimates of systematic risk for portfolio construction.

1.2.2 CFA 2

rho_baker <- 0.70
nonsystematic_pct <- 1 - rho_baker^2
cat("Nonsystematic (specific) risk fraction =", round(nonsystematic_pct*100,1), "%\n")
## Nonsystematic (specific) risk fraction = 51 %

R² = ρ² = 0.49, so systematic risk = 49% of total variance, and nonsystematic risk = 1 − 0.49 = 51% of Baker Fund’s total risk.

1.2.3 CFA 3

Rm <- 0.11
Rf <- 0.03
R_charlottesville <- 0.09

beta_implied <- (R_charlottesville - Rf) / (Rm - Rf)
cat("Implied beta =", round(beta_implied,3), "\n")
## Implied beta = 0.75

Because the correlation between Charlottesville International Fund and the world index is 1.0 (i.e., the fund is perfectly correlated/fully systematic, no diversifiable risk), we can back out beta directly from the CAPM relationship: beta = (R_fund − Rf)/(Rm − Rf) = (9% − 3%)/(11% − 3%) = 0.75.

1.2.4 CFA 4

Answer: d. Systematic risk. Beta measures sensitivity to market-wide (systematic) movements; it is the standard measure of systematic risk in the CAPM framework.

1.2.5 CFA 5

Answer: b. Only systematic risk, while standard deviation measures total risk. Beta captures only the systematic (market) component of risk, while standard deviation captures total risk (systematic + unsystematic/firm-specific).


1.3 Chapter 9

Portfolio Avg Annual Return SD Beta
R 11% 10% 0.5
S&P 500 14% 12% 1.0

1.3.1 CFA 8

To plot relative to the SML, we need the CAPM-required return for beta = 0.5. Using the S&P 500 as the market and (implicitly) a risk-free rate, the SML-required return = Rf + 0.5×(Rm − Rf). Since R’s beta is 0.5 (half of the market’s), R’s expected/required return under CAPM should be roughly half-way between Rf and 14% (the market return). R’s actual realized return is 11%, which exceeds the CAPM-implied required return for its beta (since a beta of 0.5 should require well below 11%, e.g., if Rf is around 4-5%, required return would be roughly 9-9.5%). Thus portfolio R lies above the SMLanswer c.

1.3.2 CFA 9

The CML relates total risk (standard deviation) to return, applicable to efficient, fully diversified portfolios. Portfolio R has SD = 10%, which is 10/12 = 83.3% of the market’s SD. A proportional (CML) return at that risk level would be roughly 83.3% of the market’s risk premium added to Rf — typically higher than 11% if R were efficient. Because portfolio R’s beta (0.5) is much lower than the proportion of total risk it carries (83.3% of market SD) would suggest, portfolio R carries a relatively large amount of nonsystematic risk that is not rewarded with extra return. As such, R lies below the CMLanswer b.

1.3.3 CFA 10

Portfolio A Portfolio B
Beta 1.0 1.0
Specific risk High Low

According to the CAPM, expected return depends only on systematic risk (beta), not on specific (nonsystematic) risk, because nonsystematic risk can be diversified away and is therefore not priced/rewarded. Since both portfolios have the same beta (1.0), investors should not expect a higher return on Portfolio A than on Portfolio B — both should have the same expected return under CAPM, even though A has higher specific risk. (In practice, an investor holding A as a stand-alone, undiversified position bears more total risk for the same expected return, which is a behavioral/practical argument for preferring B, but under strict CAPM logic the expected returns should be equal.)


1.4 Chapter 10

Setup: Two-factor APT model. Factors: real GDP (factor risk premium = 8%) and inflation (factor risk premium = 2%). Risk-free rate = 4%.

Fund GDP Sensitivity Inflation Sensitivity
High Growth Fund 1.25 1.5
Large Cap Fund 0.75 1.25
Utility Fund 1.0 2.0

1.4.1 Problem 13

rf <- 0.04
rp_gdp <- 0.08
rp_inf <- 0.02

b_gdp_hg <- 1.25
b_inf_hg <- 1.5

E_R_HG <- rf + b_gdp_hg*rp_gdp + b_inf_hg*rp_inf
cat("APT estimate of expected return for High Growth Fund =", round(E_R_HG*100,2), "%\n")
## APT estimate of expected return for High Growth Fund = 17 %

McCracken’s APT estimate of the expected return on the High Growth Fund is 14% (and we are told this matches the fundamental-analysis estimate, consistent with no arbitrage).

1.4.2 Problem 14

b_gdp_lc <- 0.75
b_inf_lc <- 1.25

E_R_LC_APT <- rf + b_gdp_lc*rp_gdp + b_inf_lc*rp_inf
E_R_LC_fundamental <- rf + 0.085   # 8.5% above risk-free rate

cat("APT estimate for Large Cap Fund =", round(E_R_LC_APT*100,2), "%\n")
## APT estimate for Large Cap Fund = 12.5 %
cat("Fundamental-analysis estimate for Large Cap Fund =", round(E_R_LC_fundamental*100,2), "%\n")
## Fundamental-analysis estimate for Large Cap Fund = 12.5 %

The APT model estimate (11.5%) differs from Kwon’s fundamental estimate (12.5%). Since the two approaches disagree on the same asset’s expected return, and assuming the two-factor APT model is well-specified and the factor risk premiums are correctly estimated, this discrepancy implies the Large Cap Fund is mispriced relative to the APT-implied equilibrium, which in principle creates an arbitrage opportunity: an investor could go long the (APT-cheap) source of higher expected return and short an equivalent-factor-exposure combination of other priced assets to lock in a riskless profit. In practice, however, true arbitrage requires zero net investment and zero net factor exposure; with only an estimate of fundamental expected return (not a model-derived one), this is more accurately described as a perceived mispricing rather than a textbook pure arbitrage, but per the spirit of the question: yes, an arbitrage-type opportunity is implied by the inconsistency between the two estimates.

1.4.3 Problem 15

We want a portfolio (the “GDP Fund”) built from the High Growth Fund, Large Cap Fund, and Utility Fund, with combined GDP-sensitivity = 1.0 and inflation-sensitivity = 0.0, and weights summing to 1.

# Sensitivities matrix: rows = GDP, Inflation, and weight-sum constraint
# Funds: HG, LC, Util
B_gdp  <- c(1.25, 0.75, 1.00)
B_inf  <- c(1.50, 1.25, 2.00)
ones   <- c(1, 1, 1)

A <- rbind(B_gdp, B_inf, ones)
b <- c(1, 0, 1)   # target: GDP sensitivity=1, inflation sensitivity=0, weights sum to 1

w <- solve(A, b)
names(w) <- c("HighGrowth", "LargeCap", "Utility")
print(round(w, 3))
## HighGrowth   LargeCap    Utility 
##        1.6        1.6       -2.2

Solving the 3-equation system gives the Utility Fund weight ≈ −2.2. So the answer is (a) −2.2.

1.4.4 Problem 16

Stiles believes the GDP Fund (zero inflation exposure, unit GDP exposure) would suit retirees living off steady investment income (since it removes inflation-risk exposure, which could be valuable for stabilizing real purchasing power… though arguably retirees living off steady nominal income might actually want some inflation hedge — the typical “textbook” answer here, however, evaluates this against McCracken’s separate, factual point about real-world macro conditions). McCracken’s point is a forward-looking, conditional claim: the GDP fund would be a good choice if expansionary, supply-side, pro-growth macro policy succeeds (raising real GDP without raising inflation) — which is a different, additional reason to like the fund, not mutually exclusive with the retiree rationale.

Because Stiles’ rationale (suitability for income-focused retirees) and McCracken’s rationale (good if supply-side growth policies succeed) are addressing two different, non-contradictory use-cases/conditions, both points can be valid simultaneously — answer b: Both are correct.


2 Questions using R codes (40%)

The full reproducible R Markdown workflow below is designed to run as-is on Posit Cloud (RStudio Cloud). Install/load packages in the first chunk; everything after pulls live data, so output will vary slightly depending on the exact date you run it.

2.1 0. Package setup

options(repos = c(CRAN = "https://cloud.r-project.org"))

required_pkgs <- c("tidyquant", "lubridate", "timetk", "purrr", "tibbletime",
                    "tibble", "dplyr", "tidyr", "PerformanceAnalytics", "xts",
                    "quantmod", "ggplot2", "scales", "htmltools")

new_pkgs <- required_pkgs[!(required_pkgs %in% installed.packages()[,"Package"])]
if (length(new_pkgs)) install.packages(new_pkgs)

library(tidyquant)
library(lubridate)
library(timetk)
library(purrr)
library(tibble)
library(dplyr)
library(tidyr)
library(PerformanceAnalytics)
library(xts)
library(quantmod)
library(ggplot2)

2.2 1. Import data

tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")

prices_raw <- tq_get(tickers,
                      get  = "stock.prices",
                      from = "2010-01-01",
                      to   = Sys.Date())

# Wide xts of adjusted close prices
prices_xts <- prices_raw %>%
  select(symbol, date, adjusted) %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  tk_xts(date_var = date)

head(prices_xts)
##                 SPY      QQQ      EEM      IWM      EFA      TLT      IYR
## 2010-01-04 84.57845 40.24656 30.19456 51.24470 34.58120 55.70957 26.63731
## 2010-01-05 84.80238 40.24656 30.41373 51.06848 34.61169 56.06931 26.70127
## 2010-01-06 84.86207 40.00380 30.47735 51.02045 34.75798 55.31876 26.68963
## 2010-01-07 85.22034 40.02980 30.30062 51.39687 34.62388 55.41181 26.92804
## 2010-01-08 85.50391 40.35926 30.54098 51.67720 34.89819 55.38694 26.74778
## 2010-01-11 85.62331 40.19454 30.47735 51.46896 35.18469 55.08301 26.87571
##               GLD
## 2010-01-04 109.80
## 2010-01-05 109.70
## 2010-01-06 111.51
## 2010-01-07 110.82
## 2010-01-08 111.37
## 2010-01-11 112.85
tail(prices_xts)
##                 SPY      QQQ   EEM    IWM    EFA   TLT    IYR    GLD
## 2026-06-15 752.8904 743.1833 69.75 294.64 104.08 85.72 103.08 396.55
## 2026-06-16 748.4020 729.0588 68.64 292.08 104.31 86.19 103.34 397.63
## 2026-06-17 739.0560 721.7169 68.56 289.88 103.78 86.33 100.62 388.60
## 2026-06-18 746.7400 739.8070 70.79 295.59 104.41 86.75 100.45 387.12
## 2026-06-22 744.3900 737.9500 71.21 298.18 104.58 86.09 101.61 384.59
## 2026-06-23 733.5800 713.6500 67.17 295.32 102.46 86.20 102.99 377.32

2.3 2. Calculate weekly and monthly returns (simple returns)

weekly_list <- lapply(colnames(prices_xts), function(tic) {
  r <- periodReturn(prices_xts[, tic], period = "weekly", type = "arithmetic")
  colnames(r) <- tic
  r
})
weekly_returns <- do.call(merge, weekly_list)

monthly_list <- lapply(colnames(prices_xts), function(tic) {
  r <- periodReturn(prices_xts[, tic], period = "monthly", type = "arithmetic")
  colnames(r) <- tic
  r
})
monthly_returns <- do.call(merge, monthly_list)

monthly_returns <- na.omit(monthly_returns)
weekly_returns  <- na.omit(weekly_returns)

head(weekly_returns)
##                     SPY          QQQ         EEM          IWM          EFA
## 2010-01-08  0.010941950  0.002800275  0.01147281  0.008439951  0.009166649
## 2010-01-15 -0.008117394 -0.015037423 -0.02893562 -0.013019023 -0.003493416
## 2010-01-22 -0.038982853 -0.036859302 -0.05578019 -0.030622036 -0.055740686
## 2010-01-29 -0.016665022 -0.031023625 -0.03357735 -0.026243435 -0.025802606
## 2010-02-05 -0.006797448  0.004440555 -0.02821332 -0.013974251 -0.019055108
## 2010-02-12  0.012937946  0.018147631  0.03333352  0.029525916  0.005244687
##                      TLT          IYR          GLD
## 2010-01-08 -5.791181e-03  0.004147471  0.014298722
## 2010-01-15  2.004758e-02 -0.006304185 -0.004579349
## 2010-01-22  1.010099e-02 -0.041785458 -0.033285246
## 2010-01-29  3.369311e-03 -0.008447336 -0.011290465
## 2010-02-05 -5.469527e-05  0.003223325 -0.012080019
## 2010-02-12 -1.945992e-02 -0.007574045  0.022544905
head(monthly_returns)
##                    SPY         QQQ          EEM         IWM          EFA
## 2010-01-29 -0.05241295 -0.07819916 -0.103722608 -0.06048793 -0.074915717
## 2010-02-26  0.03119472  0.04603901  0.017763634  0.04475163  0.002667382
## 2010-03-31  0.06087947  0.07710879  0.081108850  0.08230683  0.063853773
## 2010-04-30  0.01547059  0.02242555 -0.001662059  0.05678484 -0.028045789
## 2010-05-28 -0.07945519 -0.07392340 -0.093935890 -0.07536648 -0.111927551
## 2010-06-30 -0.05174027 -0.05975688 -0.013986335 -0.07743384 -0.020619660
##                    TLT         IYR          GLD
## 2010-01-29  0.02783559 -0.05195380 -0.034972713
## 2010-02-26 -0.00342435  0.05457040  0.032748219
## 2010-03-31 -0.02057292  0.09748450 -0.004386396
## 2010-04-30  0.03321736  0.06388176  0.058834363
## 2010-05-28  0.05108356 -0.05683551  0.030513147
## 2010-06-30  0.05797918 -0.04670097  0.023553189

2.4 3. Convert monthly returns into tibble format

monthly_returns_tbl <- tk_tbl(monthly_returns, rename_index = "date")
monthly_returns_tbl

2.5 4. Download Fama-French 3-factor monthly data

ff_url <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip"

temp_zip <- tempfile(fileext = ".zip")
download.file(ff_url, temp_zip, mode = "wb")
temp_dir <- tempdir()
unzip(temp_zip, exdir = temp_dir)
ff_csv <- list.files(temp_dir, pattern = "F-F_Research_Data_Factors\\.CSV|F-F_Research_Data_Factors\\.csv",
                     full.names = TRUE)[1]

ff_raw <- read.csv(ff_csv, skip = 3, stringsAsFactors = FALSE)
names(ff_raw)[1] <- "date_raw"

# Keep only monthly rows (6-digit YYYYMM); annual block at bottom has different format
ff_monthly <- ff_raw %>%
  filter(grepl("^[0-9]{6}$", trimws(date_raw))) %>%
  mutate(date_raw = trimws(date_raw)) %>%
  mutate(date = ymd(paste0(date_raw, "01"))) %>%
  mutate(date = date %m+% months(1) - days(1)) %>%  # end of month
  select(date, `Mkt.RF`, SMB, HML, RF) %>%
  mutate(across(c(Mkt.RF, SMB, HML, RF), as.numeric)) %>%
  mutate(across(c(Mkt.RF, SMB, HML, RF), ~ . / 100))  # convert from % to decimal

ff_xts <- tk_xts(ff_monthly, date_var = date)
head(ff_xts)
##             Mkt.RF     SMB     HML     RF
## 1926-07-31  0.0289 -0.0255 -0.0239 0.0022
## 1926-08-31  0.0264 -0.0114  0.0381 0.0025
## 1926-09-30  0.0038 -0.0136  0.0005 0.0023
## 1926-10-31 -0.0327 -0.0014  0.0082 0.0032
## 1926-11-30  0.0254 -0.0011 -0.0061 0.0031
## 1926-12-31  0.0262 -0.0007  0.0006 0.0028

2.6 5. Merge monthly returns with Fama-French factors

merged_tbl <- monthly_returns_tbl %>%
  mutate(date = as.Date(date)) %>%
  left_join(
    ff_monthly %>% mutate(date = as.Date(date)),
    by = "date"
  ) %>%
  drop_na()

merged_tbl

2.7 6. CAPM-based Global Minimum Variance (GMV) portfolio

asset_cols <- tickers

get_window <- function(tbl, end_date, n_months = 60) {
  tbl %>%
    filter(date <= end_date) %>%
    arrange(date) %>%
    tail(n_months)
}

gmv_weights <- function(cov_mat) {
  ones <- rep(1, nrow(cov_mat))
  inv_cov <- solve(cov_mat)
  w <- (inv_cov %*% ones) / as.numeric(t(ones) %*% inv_cov %*% ones)
  as.numeric(w)
}

# Window: 2010-02 to 2015-01 (60 months), used to predict 2015-02
window_data <- get_window(merged_tbl, as.Date("2015-01-31"), 60)
ret_matrix  <- as.matrix(window_data[, asset_cols])

cov_capm <- cov(ret_matrix)  # sample covariance matrix of raw returns (CAPM-style: no factor model adjustment)

w_capm <- gmv_weights(cov_capm)
names(w_capm) <- asset_cols
print(round(w_capm, 4))
##     SPY     QQQ     EEM     IWM     EFA     TLT     IYR     GLD 
##  1.0279 -0.2355  0.0286 -0.0839 -0.0401  0.4820 -0.1863  0.0074
# Realized portfolio return on 2015-02 using these weights
ret_2015_02 <- merged_tbl %>% filter(date == as.Date("2015-02-28") |
                                      (year(date) == 2015 & month(date) == 2)) %>%
  select(all_of(asset_cols)) %>% as.matrix()

realized_capm_201502 <- as.numeric(ret_2015_02 %*% w_capm)
cat("CAPM-GMV realized return for 2015-02:", round(realized_capm_201502*100, 3), "%\n")
## CAPM-GMV realized return for 2015-02:  %

2.8 7. Fama-French 3-factor GMV portfolio

# Estimate each asset's factor loadings via FF3 regression over the same 60-month window,
# then build the factor-model-implied covariance matrix:
#   Cov(R) = B %*% Sigma_F %*% t(B) + D   (D = diag of residual variances)

X <- as.matrix(window_data[, c("Mkt.RF", "SMB", "HML")])
X <- cbind(1, X)  # add intercept

betas <- matrix(NA, nrow = length(asset_cols), ncol = 3)
resid_var <- numeric(length(asset_cols))

for (i in seq_along(asset_cols)) {
  y <- ret_matrix[, i] - window_data$RF  # excess return
  fit <- lm(y ~ window_data$Mkt.RF + window_data$SMB + window_data$HML)
  betas[i, ] <- coef(fit)[-1]
  resid_var[i] <- var(resid(fit))
}
rownames(betas) <- asset_cols
colnames(betas) <- c("Mkt.RF", "SMB", "HML")

Sigma_F <- cov(window_data[, c("Mkt.RF", "SMB", "HML")])
D <- diag(resid_var)

cov_ff3 <- betas %*% Sigma_F %*% t(betas) + D
rownames(cov_ff3) <- colnames(cov_ff3) <- asset_cols

w_ff3 <- gmv_weights(cov_ff3)
names(w_ff3) <- asset_cols
print(round(w_ff3, 4))
##     SPY     QQQ     EEM     IWM     EFA     TLT     IYR     GLD 
##  0.9550 -0.1468 -0.0402 -0.1742 -0.0981  0.4111  0.0405  0.0527
realized_ff3_201502 <- as.numeric(ret_2015_02 %*% w_ff3)
cat("FF3-GMV realized return for 2015-02:", round(realized_ff3_201502*100, 3), "%\n")
## FF3-GMV realized return for 2015-02:  %

2.9 8. Rolling-window backtest: CAPM vs. FF3 GMV portfolios (2015/02 - 2026/05)

all_dates <- merged_tbl$date
start_test <- as.Date("2015-02-01")
end_test   <- as.Date("2026-05-31")

test_dates <- merged_tbl %>%
  filter(date >= start_test, date <= end_test) %>%
  pull(date)

results <- tibble(date = as.Date(character()),
                   capm_ret = numeric(),
                   ff3_ret  = numeric())

for (d in test_dates) {
  d <- as.Date(d)

  # 60-month window ending the month BEFORE d
  win <- merged_tbl %>%
    filter(date < d) %>%
    arrange(date) %>%
    tail(60)

  if (nrow(win) < 60) next  # skip if insufficient history

  ret_mat <- as.matrix(win[, asset_cols])

  ## CAPM (sample covariance) GMV weights
  cov_c <- cov(ret_mat)
  w_c <- gmv_weights(cov_c)

  ## FF3 GMV weights
  X3 <- win[, c("Mkt.RF", "SMB", "HML")]
  betas_w <- matrix(NA, nrow = length(asset_cols), ncol = 3)
  resid_var_w <- numeric(length(asset_cols))
  for (i in seq_along(asset_cols)) {
    y <- ret_mat[, i] - win$RF
    fit <- lm(y ~ X3$Mkt.RF + X3$SMB + X3$HML)
    betas_w[i, ] <- coef(fit)[-1]
    resid_var_w[i] <- var(resid(fit))
  }
  Sigma_F_w <- cov(X3)
  D_w <- diag(resid_var_w)
  cov_f <- betas_w %*% Sigma_F_w %*% t(betas_w) + D_w

  w_f <- gmv_weights(cov_f)

  ## realized return this month
  this_month_ret <- merged_tbl %>% filter(date == d) %>% select(all_of(asset_cols)) %>% as.matrix()

  results <- results %>% add_row(
    date = d,
    capm_ret = as.numeric(this_month_ret %*% w_c),
    ff3_ret  = as.numeric(this_month_ret %*% w_f)
  )
}

results <- results %>%
  arrange(date) %>%
  mutate(
    capm_cum = cumprod(1 + capm_ret) - 1,
    ff3_cum  = cumprod(1 + ff3_ret)  - 1
  )

results

2.9.1 Cumulative return chart: CAPM-GMV vs. FF3-GMV

results_long <- results %>%
  select(date, capm_cum, ff3_cum) %>%
  pivot_longer(-date, names_to = "model", values_to = "cum_return") %>%
  mutate(model = recode(model, capm_cum = "CAPM GMV", ff3_cum = "FF3 GMV"))

ggplot(results_long, aes(x = date, y = cum_return, color = model)) +
  geom_line(linewidth = 1) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Cumulative Return: CAPM-GMV vs. Fama-French 3-Factor GMV",
       subtitle = "Rolling 60-month estimation window, monthly rebalancing, 2015-02 to 2026-05",
       x = NULL, y = "Cumulative Return", color = "Model") +
  theme_minimal()

2.9.2 Summary performance statistics

perf_xts <- xts(results[, c("capm_ret", "ff3_ret")], order.by = results$date)
colnames(perf_xts) <- c("CAPM_GMV", "FF3_GMV")

table.AnnualizedReturns(perf_xts, scale = 12)
maxDrawdown(perf_xts)
##                 CAPM_GMV   FF3_GMV
## Worst Drawdown 0.1845472 0.2008504
SharpeRatio.annualized(perf_xts, scale = 12)
##                                         CAPM_GMV   FF3_GMV
## Annualized Sharpe Ratio (Rf=0%, p=95%):     0.71 0.3432984

Note on package availability: if the SIT package (Systematic Investor Toolbox) is unavailable on CRAN for your Posit Cloud environment, the rolling-window backtest above (built with base R + PerformanceAnalytics) reproduces the same GMV/backtest logic the assignment calls for, and can be submitted as the equivalent “or other packages” implementation referenced in question 8.