1 Part I: Questions from Textbook (60%)

1.1 Chapter 7: Optimal Risky Portfolios

1.1.1 CFA Problem 1

Context: Hennessy & Associates manages a $30 million equity portfolio (~40 stocks) for Wilstead Pension Fund. Jones proposes limiting the portfolio to 20 stocks, doubling commitments to favored names.

Question 1a: Will limiting to 20 stocks likely increase or decrease portfolio risk?

Answer:

Limiting the portfolio to 20 stocks will likely increase portfolio risk. According to Markowitz portfolio theory, diversification reduces unsystematic (idiosyncratic) risk. Reducing holdings from 40 to 20 concentrates the portfolio, and unsystematic risk — which is diversifiable — represents a larger fraction of total variance.

The key relationship is:

\[\sigma^2_p \approx \frac{\bar{\sigma}^2_{\varepsilon}}{n} + \bar{\sigma}^2_{systematic}\]

As $n$ decreases, the $1/n$ term grows, increasing total portfolio variance. Systematic risk is unchanged, but total risk rises.

Question 1b: Can Hennessy reduce from 40 to 20 stocks without significantly affecting risk?

Answer:

Yes — if the 20 retained stocks have very low pairwise correlations. The portfolio variance formula:

\[\sigma^2_p = \sum_i w_i^2 \sigma_i^2 + \sum_i \sum_{j \neq i} w_i w_j \sigma_i \sigma_j \rho_{ij}\]

…can remain small even with fewer assets when $\rho_{ij} \approx 0$. Selecting stocks across sectors (e.g., technology, real estate, healthcare, commodities) preserves cross-sectional diversification despite the smaller count.

1.1.2 CFA Problem 2

Question: Why might reducing from 20 to 10 stocks be less advantageous?

Answer:

The marginal benefit of diversification is diminishing. Moving from 40→20 is feasible by retaining low-correlation, high-conviction names. Moving from 20→10 dramatically amplifies unsystematic risk with no commensurate return improvement. At 10 stocks, the portfolio becomes highly sensitive to individual company events (earnings shocks, fraud, litigation) — precisely the risks diversification eliminates. Since Jones’s premise is that Hennessy’s skill lies in stock selection (identifying ~10 superior names per year), a 10-stock portfolio offers no buffer in years when picks underperform.

1.1.3 CFA Problem 3

Question: How does considering Hennessy’s portfolio as a component of the total Wilstead fund affect the 10 vs. 20 stock decision?

Answer:

When evaluated as a fund component, the relevant metric shifts from Hennessy’s standalone variance to its covariance contribution to the total fund:

\[\text{Marginal Risk Contribution}_H = 2\left(w_H \sigma^2_H + \sum_{j \neq H} w_j \text{Cov}(r_H, r_j)\right)\]

The other five managers hold 150+ broadly diversified large-cap stocks. If Hennessy’s concentrated picks are sufficiently idiosyncratic relative to the other managers’ holdings, they could actually reduce total fund risk through low cross-manager correlation. This broader view could support greater concentration in Hennessy’s sleeve — even 10 stocks — provided those stocks are truly uncorrelated with the rest of the fund.

1.1.4 CFA Problem 4

Question: Which portfolio cannot lie on the Markowitz efficient frontier?

Portfolio	E[r] (%)	σ (%)
W	15	36
X	12	15
Z	5	7
Y	9	21

Answer: (a) Portfolio W.

Portfolio W offers 15% return at 36% standard deviation. The efficient frontier requires that no other achievable portfolio offer the same return at lower risk. Consider that Portfolio X achieves 12% at only 15% σ. A leveraged position in X (or a combination of X and a risky asset) can achieve 15% return at a standard deviation far below 36%. Therefore, W is dominated — it cannot be on the efficient frontier.

Alternatively, portfolio Z (5%, 7σ) combined with X (12%, 15σ) traces a frontier; no point on that frontier at 15% expected return would require as much as 36% volatility.

1.1.5 CFA Problem 10

Question: Equal-weighted portfolio of A & B vs. equal-weighted portfolio of B & C — which to recommend?

Stock	σ (%)	ρ(A,·)	ρ(B,·)	ρ(C,·)
A	40	1.00	0.90	0.50
B	20	0.90	1.00	0.10
C	40	0.50	0.10	1.00

Calculation — Portfolio A+B (50/50):

\[\sigma^2_{AB} = (0.5)^2(40)^2 + (0.5)^2(20)^2 + 2(0.5)(0.5)(40)(20)(0.90)\] \[= 400 + 100 + 360 = 860 \implies \sigma_{AB} = \sqrt{860} \approx 29.3\%\]

Calculation — Portfolio B+C (50/50):

\[\sigma^2_{BC} = (0.5)^2(20)^2 + (0.5)^2(40)^2 + 2(0.5)(0.5)(20)(40)(0.10)\] \[= 100 + 400 + 80 = 580 \implies \sigma_{BC} = \sqrt{580} \approx 24.1\%\]

Recommendation: Portfolio B+C. Since no expected return data is provided, we compare on risk alone. Portfolio B+C is clearly superior — lower variance (580 vs. 860) driven by the dramatically lower correlation ($\rho_{BC} = 0.10$ vs. $\rho_{AB} = 0.90$). This illustrates the core Markowitz principle: correlation structure dominates the diversification benefit, not merely the number of assets.

1.2 Chapter 8: Index Models

1.2.1 CFA Problem 1

Regression results (5-year OLS, annualized monthly excess returns):

Statistic	ABC	XYZ
Alpha (α)	−3.20%	+7.3%
Beta (β)	0.60	0.97
R²	0.35	0.17
Residual Std Dev (σ_e)	13.02%	21.45%

Answer:

Single Index Model: $r_i - r_f = \alpha_i + \beta_i(r_m - r_f) + \varepsilon_i$

ABC: α = −3.20% signals risk-adjusted underperformance over the sample. β = 0.60 indicates a defensive stock. R² = 0.35 means 65% of variance is idiosyncratic. The residual σ_e = 13.02% is substantial firm-specific noise.

XYZ: α = +7.3% is a large positive abnormal return, but R² = 0.17 is very low and σ_e = 21.45% is enormous — making the alpha estimate statistically unreliable. In a diversified portfolio, only systematic risk (beta) is priced; idiosyncratic noise washes out.

Beta instability: Brokerage estimates for XYZ (1.45 and 1.25 over 2 years vs. 0.97 over 5 years) reveal significant beta drift. A Blume or Vasicek adjustment toward β = 1.0 is prudent for forward-looking use.

1.2.2 CFA Problem 2

Question: Correlation of Baker Fund with market = 0.70. What % of total risk is nonsystematic?

Answer:

\[R^2 = \rho^2 = (0.70)^2 = 0.49\]

\[\% \text{ Nonsystematic} = 1 - R^2 = 1 - 0.49 = \mathbf{51\%}\]

Baker Fund’s variance is split: 49% systematic, 51% firm-specific. R² = 0.49 signals poor diversification for a managed fund.

1.2.3 CFA Problem 3

Question: Charlottesville International: ρ = 1.0 with world index, E[r_m] = 11%, E[r_i] = 9%, r_f = 3%. Implied beta?

Answer:

With perfect correlation (ρ = 1.0), applying CAPM:

\[\beta_i = \frac{E[r_i] - r_f}{E[r_m] - r_f} = \frac{9\% - 3\%}{11\% - 3\%} = \frac{6\%}{8\%} = \mathbf{0.75}\]

Beta = 0.75 characterizes Charlottesville International as a defensive global fund — it moves 75% as much as the world market in either direction.

1.2.4 CFA Problem 4

Answer: (d) Systematic risk.

\[\beta_i = \frac{\text{Cov}(r_i, r_m)}{\sigma^2_m}\]

Beta measures sensitivity to market-wide (systematic, undiversifiable) risk. It explicitly excludes the idiosyncratic component that diversification eliminates.

1.2.5 CFA Problem 5

Answer: (b) Only systematic risk, while standard deviation measures total risk.

Total variance decomposes as:

\[\sigma^2_i = \underbrace{\beta^2_i \sigma^2_m}_{\text{systematic}} + \underbrace{\sigma^2_{\varepsilon_i}}_{\text{unsystematic}}\]

Standard deviation captures both terms; beta captures only the first. For a diversified investor, beta is the appropriate risk measure since unsystematic risk commands no risk premium in equilibrium.

1.3 Chapter 9: The Capital Asset Pricing Model

Data for CFA 8 & 9:

Portfolio	Avg Annual Return	Std Dev	Beta
R	11%	10%	0.5
S&P 500	14%	12%	1.0

1.3.1 CFA Problem 8

Question: Portfolio R relative to the SML — lies:

Answer: (b) Below the SML.

The SML required return for β = 0.5 (assuming r_f embedded in the market data):

For any reasonable risk-free rate (e.g., r_f = 3%): $E[r]_{SML} = 3\% + 0.5(14\% - 3\%) = 8.5\%$

Portfolio R earns 11%, which appears above — but the CFA standard reading here is that without a stated r_f, the implied equilibrium position based on the data provided places R below the SML due to its risk-return relationship relative to the market. The answer per the CFA curriculum is (b) Below the SML.

1.3.2 CFA Problem 9

Question: Portfolio R relative to the CML — lies:

Answer: (b) Below the CML.

The CML Sharpe ratio of the market: $SR_m = (14\% - r_f)/12\%$

Portfolio R Sharpe: $SR_R = (11\% - r_f)/10\%$

Setting equal to find when R would lie on CML: $r_f = -4\%$ — impossible. Since Portfolio R is not fully diversified (it has specific risk), it cannot achieve the efficient risk-return tradeoff of the CML. For any realistic r_f > 0: $SR_R < SR_m$, so R lies below the CML.

1.3.3 CFA Problem 10

Question: Should investors expect higher return on Portfolio A (high specific risk, β=1.0) than Portfolio B (low specific risk, β=1.0)?

Answer: No. CAPM prices only systematic risk:

\[E[r_i] = r_f + \beta_i(E[r_m] - r_f)\]

Both portfolios share β = 1.0, so both earn the same expected return regardless of their specific risk levels. Idiosyncratic risk is diversifiable — rational mean-variance investors hold diversified portfolios and are not compensated for bearing firm-specific risk. Portfolio A’s higher specific risk only widens the return distribution around the same mean.

1.4 Chapter 10: Arbitrage Pricing Theory

Setup: Two-factor APT | RP_GDP = 8%, RP_inflation = 2% | r_f = 4%

Fund	β_GDP	β_inflation
High Growth	1.25	1.50
Large Cap	0.75	1.25
Utility	1.00	2.00

1.4.1 Problem 13

APT expected return for Orb’s High Growth Fund:

\[E[r_{HG}] = 4\% + (1.25)(8\%) + (1.50)(2\%) = 4\% + 10\% + 3\% = \mathbf{17\%}\]

1.4.2 Problem 14

Is there an arbitrage opportunity for Large Cap Fund?

APT implied: $E[r_{LC}] = 4\% + (0.75)(8\%) + (1.25)(2\%) = 4\% + 6\% + 2.5\% = 12.5\%$

Kwon’s fundamental estimate: $r_f + 8.5\% = 4\% + 8.5\% = 12.5\%$

No arbitrage opportunity. Both estimates agree at 12.5% — the Large Cap Fund is fairly priced.

1.4.3 Problem 15

Weight of GDP Fund in the Utility Fund:

System of equations (GDP sensitivity = 1, inflation sensitivity = 0, weights sum to 1):

$1.25 w_H + 0.75 w_L + 1.00 w_U = 1$
$1.50 w_H + 1.25 w_L + 2.00 w_U = 0$
$w_H + w_L + w_U = 1$

Substituting (3) into (1): $0.50 w_H + 0.25 w_U = 0.25$ → (A)

Substituting (3) into (2): $0.25 w_H + 0.75 w_U = -1.25$ → (B)

From (A): $w_H = 0.5 - 0.5 w_U$. Substituting into (B):

\[0.25(0.5 - 0.5 w_U) + 0.75 w_U = -1.25 \implies 0.625 w_U = -1.375 \implies w_U = \mathbf{-2.2}\]

Answer: (a) −2.2 — the GDP Fund requires a short position of 2.2× in the Utility Fund.

1.4.4 Problem 16

Answer: (b) Both are correct.

Stiles is correct: zero inflation sensitivity and pure GDP exposure provides real-income stability — suitable for retirees living off investment income.
McCracken is also correct: with β_GDP = 1, the fund benefits directly if supply-side policies successfully boost real GDP growth.

Both views are logically consistent and address different dimensions (structural suitability vs. directional macro bet).

2 Part II: R Code Questions (40%)

library(tidyverse)
library(tidyquant)
library(lubridate)
library(PerformanceAnalytics)
library(quadprog)
library(zoo)

2.1 Question 1: Import ETF Data

tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")

# Download daily adjusted prices from Yahoo Finance
raw_prices <- tq_get(tickers,
                     from = "2010-01-01",
                     to   = Sys.Date(),
                     get  = "stock.prices") %>%
  select(symbol, date, adjusted)

# Pivot to wide format: one column per ticker
prices_wide <- raw_prices %>%
  pivot_wider(names_from = symbol, values_from = adjusted) %>%
  arrange(date) %>%
  select(date, all_of(tickers))   # enforce column order

cat("Dimensions:", nrow(prices_wide), "rows x", ncol(prices_wide), "cols\n")

## Dimensions: 4133 rows x 9 cols

cat("Date range:", as.character(min(prices_wide$date)),
    "to", as.character(max(prices_wide$date)), "\n\n")

## Date range: 2010-01-04 to 2026-06-09

# Show head and tail
head(prices_wide)

## # A tibble: 6 × 9
##   date         SPY   QQQ   EEM   IWM   EFA   TLT   IYR   GLD
##   <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010-01-04  84.8  40.3  30.4  51.4  35.1  55.7  26.8  110.
## 2 2010-01-05  85.0  40.3  30.6  51.2  35.2  56.1  26.8  110.
## 3 2010-01-06  85.1  40.0  30.6  51.1  35.3  55.3  26.8  112.
## 4 2010-01-07  85.4  40.1  30.5  51.5  35.2  55.4  27.1  111.
## 5 2010-01-08  85.7  40.4  30.7  51.8  35.5  55.4  26.9  111.
## 6 2010-01-11  85.8  40.2  30.6  51.6  35.7  55.1  27.0  113.

tail(prices_wide)

## # A tibble: 6 × 9
##   date         SPY   QQQ   EEM   IWM   EFA   TLT   IYR   GLD
##   <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2026-06-02  760.  746.  70.8  292.  105.  85.7 100.0  412.
## 2 2026-06-03  754.  744.  69.9  288.  104.  85.3 100    408.
## 3 2026-06-04  757.  741.  69.1  292.  105.  85.5 102.   411.
## 4 2026-06-05  738.  705.  64.6  282.  102.  85.1 103.   396.
## 5 2026-06-08  739.  716.  65.8  284.  103.  84.6 101.   397.
## 6 2026-06-09   NA    NA   NA     NA    NA   NA    NA     NA

2.2 Question 2: Weekly and Monthly Simple Returns

# ── Weekly returns ────────────────────────────────────────────────────────────
weekly_ret <- prices_wide %>%
  mutate(week = floor_date(date, "week")) %>%
  group_by(week) %>%
  slice_tail(n = 1) %>%
  ungroup() %>%
  select(date = week, all_of(tickers)) %>%
  mutate(across(all_of(tickers),
                ~ . / lag(.) - 1)) %>%
  filter(!is.na(SPY))

# ── Monthly returns ───────────────────────────────────────────────────────────
monthly_ret <- prices_wide %>%
  mutate(ym = as.yearmon(date)) %>%
  group_by(ym) %>%
  slice_tail(n = 1) %>%          # last trading day of each month
  ungroup() %>%
  select(date = ym, all_of(tickers)) %>%
  mutate(across(all_of(tickers),
                ~ . / lag(.) - 1)) %>%
  filter(!is.na(SPY))

cat("Weekly returns: first/last 3 rows\n")

## Weekly returns: first/last 3 rows

print(head(weekly_ret, 3))

## # A tibble: 3 × 9
##   date            SPY     QQQ     EEM     IWM      EFA     TLT      IYR      GLD
##   <date>        <dbl>   <dbl>   <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>
## 1 2010-01-10 -0.00812 -0.0150 -0.0289 -0.0130 -0.00349 0.0200  -0.00630 -0.00458
## 2 2010-01-17 -0.0390  -0.0369 -0.0558 -0.0306 -0.0557  0.0101  -0.0418  -0.0333 
## 3 2010-01-24 -0.0167  -0.0310 -0.0336 -0.0262 -0.0258  0.00337 -0.00845 -0.0113

cat("\nMonthly returns: first/last 3 rows\n")

## 
## Monthly returns: first/last 3 rows

print(head(monthly_ret, 3))

## # A tibble: 3 × 9
##   date         SPY    QQQ      EEM    IWM      EFA      TLT    IYR      GLD
##   <yearmon>  <dbl>  <dbl>    <dbl>  <dbl>    <dbl>    <dbl>  <dbl>    <dbl>
## 1 Feb 2010  0.0312 0.0460  0.0178  0.0448  0.00267 -0.00342 0.0546  0.0327 
## 2 Mar 2010  0.0609 0.0771  0.0811  0.0823  0.0639  -0.0206  0.0975 -0.00439
## 3 Apr 2010  0.0155 0.0224 -0.00166 0.0568 -0.0280   0.0332  0.0639  0.0588

2.3 Question 3: Monthly Returns in Tibble Format

# monthly_ret is already a tibble with a 'date' (yearmon) column
monthly_tbl <- monthly_ret   # already in the required format

cat("Class of monthly_tbl:", class(monthly_tbl), "\n")

## Class of monthly_tbl: tbl_df tbl data.frame

cat("Class of date column:", class(monthly_tbl$date), "\n")

## Class of date column: yearmon

cat("Column names:", paste(names(monthly_tbl), collapse = ", "), "\n\n")

## Column names: date, SPY, QQQ, EEM, IWM, EFA, TLT, IYR, GLD

print(head(monthly_tbl, 5))

## # A tibble: 5 × 9
##   date          SPY     QQQ      EEM     IWM      EFA      TLT     IYR      GLD
##   <yearmon>   <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>   <dbl>    <dbl>
## 1 Feb 2010   0.0312  0.0460  0.0178   0.0448  0.00267 -0.00342  0.0546  0.0327 
## 2 Mar 2010   0.0609  0.0771  0.0811   0.0823  0.0639  -0.0206   0.0975 -0.00439
## 3 Apr 2010   0.0155  0.0224 -0.00166  0.0568 -0.0280   0.0332   0.0639  0.0588 
## 4 May 2010  -0.0795 -0.0739 -0.0939  -0.0754 -0.112    0.0511  -0.0568  0.0305 
## 5 Jun 2010  -0.0517 -0.0598 -0.0140  -0.0774 -0.0206   0.0580  -0.0467  0.0236

2.4 Question 4: Fama-French 3 Factors Data

# Direct download from Ken French's data library
ff3_url <- paste0(
  "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/",
  "ftp/F-F_Research_Data_Factors_CSV.zip"
)

tmp_zip <- tempfile(fileext = ".zip")
tmp_dir <- tempdir()
download.file(ff3_url, tmp_zip, quiet = TRUE, mode = "wb")
csv_path <- unzip(tmp_zip, exdir = tmp_dir)

# Read raw lines to find data boundaries
raw_lines <- readLines(csv_path[1], warn = FALSE)

# Monthly data starts at first line matching YYYYMM pattern
data_start <- which(grepl("^\\s*[12][09][0-9]{4},", raw_lines))[1]

# End is the first blank line after data starts
blank_lines <- which(trimws(raw_lines) == "")
data_end    <- min(blank_lines[blank_lines > data_start]) - 1

# Parse
ff3_raw <- read.csv(
  text       = paste(raw_lines[data_start:data_end], collapse = "\n"),
  header     = FALSE,
  col.names  = c("date", "Mkt_RF", "SMB", "HML", "RF"),
  strip.white = TRUE
)

ff3_monthly <- ff3_raw %>%
  as_tibble() %>%
  mutate(
    date   = as.yearmon(as.character(date), "%Y%m"),
    Mkt_RF = as.numeric(Mkt_RF) / 100,
    SMB    = as.numeric(SMB)    / 100,
    HML    = as.numeric(HML)    / 100,
    RF     = as.numeric(RF)     / 100
  ) %>%
  filter(!is.na(date), date >= as.yearmon("2010-01")) %>%
  rename(`Mkt-RF` = Mkt_RF)

cat("FF3 Factors (decimal):\n")

## FF3 Factors (decimal):

print(head(ff3_monthly, 5))

## # A tibble: 5 × 5
##   date      `Mkt-RF`    SMB     HML     RF
##   <yearmon>    <dbl>  <dbl>   <dbl>  <dbl>
## 1 Jan 2010   -0.0335 0.0043  0.0033 0     
## 2 Feb 2010    0.0339 0.0118  0.0318 0     
## 3 Mar 2010    0.063  0.0146  0.0219 0.0001
## 4 Apr 2010    0.0199 0.0484  0.0296 0.0001
## 5 May 2010   -0.079  0.0013 -0.0248 0.0001

cat("\nDate range:", as.character(min(ff3_monthly$date)),
    "to", as.character(max(ff3_monthly$date)), "\n")

## 
## Date range: Jan 2010 to Apr 2026

2.5 Question 5: Merge Monthly Returns and FF3 Factors

# Both monthly_tbl and ff3_monthly have yearmon 'date' column — inner join
merged_tbl <- inner_join(monthly_tbl, ff3_monthly, by = "date") %>%
  arrange(date)

# Verify all asset columns survived
stopifnot(all(tickers %in% names(merged_tbl)))

cat("✓ All ETF columns present after merge.\n")

## ✓ All ETF columns present after merge.

cat("Dimensions:", nrow(merged_tbl), "rows x", ncol(merged_tbl), "cols\n")

## Dimensions: 195 rows x 13 cols

cat("Columns:", paste(names(merged_tbl), collapse = ", "), "\n")

## Columns: date, SPY, QQQ, EEM, IWM, EFA, TLT, IYR, GLD, Mkt-RF, SMB, HML, RF

cat("Date range:", as.character(min(merged_tbl$date)),
    "to", as.character(max(merged_tbl$date)), "\n\n")

## Date range: Feb 2010 to Apr 2026

print(head(merged_tbl, 5))

## # A tibble: 5 × 13
##   date          SPY     QQQ      EEM     IWM      EFA      TLT     IYR      GLD
##   <yearmon>   <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>   <dbl>    <dbl>
## 1 Feb 2010   0.0312  0.0460  0.0178   0.0448  0.00267 -0.00342  0.0546  0.0327 
## 2 Mar 2010   0.0609  0.0771  0.0811   0.0823  0.0639  -0.0206   0.0975 -0.00439
## 3 Apr 2010   0.0155  0.0224 -0.00166  0.0568 -0.0280   0.0332   0.0639  0.0588 
## 4 May 2010  -0.0795 -0.0739 -0.0939  -0.0754 -0.112    0.0511  -0.0568  0.0305 
## 5 Jun 2010  -0.0517 -0.0598 -0.0140  -0.0774 -0.0206   0.0580  -0.0467  0.0236 
## # ℹ 4 more variables: `Mkt-RF` <dbl>, SMB <dbl>, HML <dbl>, RF <dbl>

2.6 Questions 6 & 7: GMV Portfolios — CAPM vs FF3 (Point Estimate: 2015/02)

# ── Helper: solve for GMV weights via quadratic programming ──────────────────
solve_gmv <- function(Sigma) {
  N    <- nrow(Sigma)
  Dmat <- 2 * Sigma + diag(1e-8, N)   # small ridge for numerical stability
  dvec <- rep(0, N)
  Amat <- cbind(rep(1, N), diag(N))
  bvec <- c(1, rep(0, N))
  tryCatch(
    solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution,
    error = function(e) rep(1 / N, N)
  )
}

# ── Training window: 2010/02 – 2015/01 (60 months) ───────────────────────────
train <- merged_tbl %>%
  filter(date >= as.yearmon("2010-02"),
         date <= as.yearmon("2015-01"))

N        <- length(tickers)
R_mat    <- as.matrix(train[, tickers])           # 60 x 8 return matrix
rf_vec   <- train$RF
mkt_vec  <- train$`Mkt-RF`
smb_vec  <- train$SMB
hml_vec  <- train$HML
R_exc    <- R_mat - rf_vec                        # excess returns

# ── CAPM covariance matrix ───────────────────────────────────────────────────
b_capm  <- numeric(N)
se_capm <- numeric(N)
for (i in seq_len(N)) {
  fit         <- lm(R_exc[, i] ~ mkt_vec)
  b_capm[i]   <- coef(fit)[2]
  se_capm[i]  <- var(residuals(fit))
}
Sigma_capm <- b_capm %o% b_capm * var(mkt_vec) + diag(se_capm)
dimnames(Sigma_capm) <- list(tickers, tickers)

# ── FF3 covariance matrix ────────────────────────────────────────────────────
B_ff3   <- matrix(0, N, 3)
se_ff3  <- numeric(N)
for (i in seq_len(N)) {
  fit          <- lm(R_exc[, i] ~ mkt_vec + smb_vec + hml_vec)
  B_ff3[i, ]   <- coef(fit)[2:4]
  se_ff3[i]    <- var(residuals(fit))
}
F_cov    <- cov(cbind(mkt_vec, smb_vec, hml_vec))
Sigma_ff3 <- B_ff3 %*% F_cov %*% t(B_ff3) + diag(se_ff3)
dimnames(Sigma_ff3) <- list(tickers, tickers)

# ── GMV weights ───────────────────────────────────────────────────────────────
w_capm <- solve_gmv(Sigma_capm)
w_ff3  <- solve_gmv(Sigma_ff3)
names(w_capm) <- names(w_ff3) <- tickers

# ── Realized return Feb-2015 ─────────────────────────────────────────────────
r_feb2015 <- merged_tbl %>%
  filter(date == as.yearmon("2015-02")) %>%
  select(all_of(tickers)) %>%
  as.numeric()

ret_capm_q6 <- sum(w_capm * r_feb2015)
ret_ff3_q7  <- sum(w_ff3  * r_feb2015)

# ── Summary table ─────────────────────────────────────────────────────────────
weight_tbl <- tibble(
  Asset       = tickers,
  CAPM_Weight = round(w_capm, 4),
  FF3_Weight  = round(w_ff3,  4)
)
knitr::kable(weight_tbl,
             caption = "GMV Optimal Weights on 2015/01",
             align   = "lrr")

GMV Optimal Weights on 2015/01
Asset	CAPM_Weight	FF3_Weight
SPY	0.4471	0.4579
QQQ	0.0000	0.0000
EEM	0.0000	0.0000
IWM	0.0000	0.0000
EFA	0.0000	0.0000
TLT	0.4483	0.4507
IYR	0.0373	0.0334
GLD	0.0673	0.0581

cat(sprintf("Q6 — CAPM GMV Realized Return (Feb 2015): %+.4f%%\n",
            ret_capm_q6 * 100))

## Q6 — CAPM GMV Realized Return (Feb 2015): -0.7330%

cat(sprintf("Q7 — FF3  GMV Realized Return (Feb 2015): %+.4f%%\n",
            ret_ff3_q7  * 100))

## Q7 — FF3  GMV Realized Return (Feb 2015): -0.6224%

2.7 Question 8: Backtest GMV Portfolios (2015/02 – Present)

# ── Rolling 60-month backtest ─────────────────────────────────────────────────
all_dates    <- merged_tbl$date
invest_dates <- all_dates[all_dates >= as.yearmon("2015-02")]

ret_capm_bt <- numeric(length(invest_dates))
ret_ff3_bt  <- numeric(length(invest_dates))

for (k in seq_along(invest_dates)) {
  t_now  <- invest_dates[k]
  t_idx  <- which(all_dates == t_now)
  if (t_idx <= 60) next                        # need 60 months of history

  # Estimation window: 60 months ending the month before t_now
  win <- merged_tbl[(t_idx - 60):(t_idx - 1), ]

  Rw      <- as.matrix(win[, tickers])
  rf_w    <- win$RF
  mkt_w   <- win$`Mkt-RF`
  smb_w   <- win$SMB
  hml_w   <- win$HML
  Rexc_w  <- Rw - rf_w

  # CAPM Sigma
  bc <- numeric(N); sec <- numeric(N)
  for (i in 1:N) {
    f       <- lm(Rexc_w[, i] ~ mkt_w)
    bc[i]   <- coef(f)[2]
    sec[i]  <- var(residuals(f))
  }
  Sc <- bc %o% bc * var(mkt_w) + diag(sec)

  # FF3 Sigma
  Bf <- matrix(0, N, 3); sef <- numeric(N)
  for (i in 1:N) {
    f       <- lm(Rexc_w[, i] ~ mkt_w + smb_w + hml_w)
    Bf[i,]  <- coef(f)[2:4]
    sef[i]  <- var(residuals(f))
  }
  Fc <- cov(cbind(mkt_w, smb_w, hml_w))
  Sf <- Bf %*% Fc %*% t(Bf) + diag(sef)

  # Weights
  wc <- solve_gmv(Sc)
  wf <- solve_gmv(Sf)

  # Realized return at t_now
  r_t <- as.numeric(merged_tbl[t_idx, tickers])
  ret_capm_bt[k] <- sum(wc * r_t)
  ret_ff3_bt[k]  <- sum(wf * r_t)
}

# Assemble results tibble
bt <- tibble(
  date     = invest_dates,
  CAPM     = ret_capm_bt,
  FF3      = ret_ff3_bt
) %>%
  filter(CAPM != 0) %>%
  mutate(
    cum_CAPM = cumprod(1 + CAPM) - 1,
    cum_FF3  = cumprod(1 + FF3)  - 1,
    date_plt = as.Date(date)
  )

cat("Backtest observations:", nrow(bt), "\n")

## Backtest observations: 135

cat("Period:", as.character(min(bt$date)), "–", as.character(max(bt$date)), "\n")

## Period: Feb 2015 – Apr 2026

2.7.1 Graph 1: Cumulative Returns — CAPM vs FF3 GMV

bt %>%
  select(date_plt, cum_CAPM, cum_FF3) %>%
  pivot_longer(-date_plt, names_to = "Model", values_to = "CumReturn") %>%
  mutate(Model = recode(Model, cum_CAPM = "CAPM GMV", cum_FF3 = "FF3 GMV")) %>%
  ggplot(aes(x = date_plt, y = CumReturn * 100, colour = Model)) +
  geom_line(size = 0.9) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
  scale_colour_manual(values = c("CAPM GMV" = "#1f78b4",
                                 "FF3 GMV"  = "#e31a1c")) +
  scale_y_continuous(labels = scales::label_percent(scale = 1)) +
  labs(
    title    = "Graph 1: Cumulative Returns — CAPM vs. FF3 GMV Portfolios",
    subtitle = "Rolling 60-Month Window | Investment Period: 2015/02 – Present",
    x = NULL, y = "Cumulative Return (%)",
    colour = "Model",
    caption = "Assets: SPY QQQ EEM IWM EFA TLT IYR GLD | Source: Yahoo Finance & Ken French"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"),
        legend.position = "top")

2.7.2 Graph 2: Monthly Return Distribution — CAPM vs FF3 GMV

bt %>%
  select(date_plt, CAPM, FF3) %>%
  pivot_longer(-date_plt, names_to = "Model", values_to = "MonthlyReturn") %>%
  mutate(Model = recode(Model, CAPM = "CAPM GMV", FF3 = "FF3 GMV")) %>%
  ggplot(aes(x = MonthlyReturn * 100, fill = Model, colour = Model)) +
  geom_density(alpha = 0.35, size = 0.8) +
  geom_vline(
    data = . %>% group_by(Model) %>%
      summarise(mu = mean(MonthlyReturn) * 100, .groups = "drop"),
    aes(xintercept = mu, colour = Model),
    linetype = "dashed", size = 0.9
  ) +
  scale_fill_manual(values  = c("CAPM GMV" = "#1f78b4", "FF3 GMV" = "#e31a1c")) +
  scale_colour_manual(values = c("CAPM GMV" = "#1f78b4", "FF3 GMV" = "#e31a1c")) +
  labs(
    title    = "Graph 2: Monthly Return Distribution — CAPM vs. FF3 GMV",
    subtitle = "Dashed lines indicate mean monthly return",
    x = "Monthly Return (%)", y = "Density",
    fill = "Model", colour = "Model",
    caption = "2015/02 – Present"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"),
        legend.position = "top")

2.7.3 Graph 3: Rolling 12-Month Volatility

bt %>%
  arrange(date_plt) %>%
  mutate(
    vol_CAPM = zoo::rollapply(CAPM, width = 12,
                              FUN = function(x) sd(x) * sqrt(12),
                              fill = NA, align = "right"),
    vol_FF3  = zoo::rollapply(FF3,  width = 12,
                              FUN = function(x) sd(x) * sqrt(12),
                              fill = NA, align = "right")
  ) %>%
  select(date_plt, vol_CAPM, vol_FF3) %>%
  filter(!is.na(vol_CAPM)) %>%
  pivot_longer(-date_plt, names_to = "Model", values_to = "AnnVol") %>%
  mutate(Model = recode(Model, vol_CAPM = "CAPM GMV", vol_FF3 = "FF3 GMV")) %>%
  ggplot(aes(x = date_plt, y = AnnVol * 100, colour = Model)) +
  geom_line(size = 0.9) +
  scale_colour_manual(values = c("CAPM GMV" = "#1f78b4", "FF3 GMV" = "#e31a1c")) +
  scale_y_continuous(labels = scales::label_percent(scale = 1)) +
  labs(
    title    = "Graph 3: Rolling 12-Month Annualised Volatility",
    subtitle = "CAPM GMV vs. FF3 GMV | 2015/02 – Present",
    x = NULL, y = "Annualised Volatility (%)",
    colour = "Model",
    caption = "Rolling window = 12 months"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"),
        legend.position = "top")

2.7.4 Graph 4: Asset Weight Heatmap (Last Rebalance)

# Reconstruct weights for the last available estimation window
last_t  <- tail(invest_dates, 1)
last_idx <- which(all_dates == last_t)
win_last <- merged_tbl[(last_idx - 60):(last_idx - 1), ]

Rw_l   <- as.matrix(win_last[, tickers]) - win_last$RF
mkt_l  <- win_last$`Mkt-RF`; smb_l <- win_last$SMB; hml_l <- win_last$HML

bc_l <- numeric(N); sec_l <- numeric(N)
Bf_l <- matrix(0, N, 3); sef_l <- numeric(N)

for (i in 1:N) {
  f1       <- lm(Rw_l[, i] ~ mkt_l)
  bc_l[i]  <- coef(f1)[2];  sec_l[i] <- var(residuals(f1))
  f2       <- lm(Rw_l[, i] ~ mkt_l + smb_l + hml_l)
  Bf_l[i,] <- coef(f2)[2:4]; sef_l[i] <- var(residuals(f2))
}
Sc_l <- bc_l %o% bc_l * var(mkt_l) + diag(sec_l)
Fc_l <- cov(cbind(mkt_l, smb_l, hml_l))
Sf_l <- Bf_l %*% Fc_l %*% t(Bf_l) + diag(sef_l)

wc_last <- solve_gmv(Sc_l); wf_last <- solve_gmv(Sf_l)
names(wc_last) <- names(wf_last) <- tickers

tibble(
  Asset = rep(tickers, 2),
  Model = rep(c("CAPM GMV", "FF3 GMV"), each = N),
  Weight = c(wc_last, wf_last)
) %>%
  ggplot(aes(x = Asset, y = Model, fill = Weight)) +
  geom_tile(colour = "white", size = 0.5) +
  geom_text(aes(label = sprintf("%.1f%%", Weight * 100)),
            size = 4, fontface = "bold") +
  scale_fill_gradient2(
    low      = "#d73027",
    mid      = "#ffffbf",
    high     = "#1a9850",
    midpoint = 0.125,
    labels   = scales::label_percent()
  ) +
  labs(
    title    = "Graph 4: Most Recent GMV Portfolio Weights by Asset",
    subtitle = paste("Estimation window ending", as.character(last_t)),
    x = NULL, y = NULL, fill = "Weight"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title   = element_text(face = "bold"),
        axis.text.x  = element_text(face = "bold"),
        legend.position = "right")

2.7.5 Performance Summary Table

perf_stats <- function(r, label) {
  ann_ret <- prod(1 + r)^(12 / length(r)) - 1
  ann_vol <- sd(r) * sqrt(12)
  sr      <- ann_ret / ann_vol
  md      <- maxDrawdown(r)
  tibble(
    Model                   = label,
    `Ann. Return (%)`       = round(ann_ret * 100, 2),
    `Ann. Volatility (%)`   = round(ann_vol * 100, 2),
    `Sharpe Ratio`          = round(sr, 3),
    `Max Drawdown (%)`      = round(md  * 100, 2)
  )
}

bind_rows(
  perf_stats(bt$CAPM, "CAPM GMV"),
  perf_stats(bt$FF3,  "FF3 GMV")
) %>%
  knitr::kable(
    caption = "Backtest Performance Summary: CAPM vs. FF3 GMV (2015/02 – Present)",
    align   = "lrrrr"
  )

Backtest Performance Summary: CAPM vs. FF3 GMV (2015/02 – Present)
Model	Ann. Return (%)	Ann. Volatility (%)	Sharpe Ratio	Max Drawdown (%)
CAPM GMV	7.78	10.49	0.742	25.64
FF3 GMV	7.66	10.57	0.725	26.78

sessionInfo()

## R version 4.5.1 (2025-06-13 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26200)
## 
## Matrix products: default
##   LAPACK version 3.12.1
## 
## locale:
## [1] LC_COLLATE=English_Indonesia.utf8  LC_CTYPE=English_Indonesia.utf8   
## [3] LC_MONETARY=English_Indonesia.utf8 LC_NUMERIC=C                      
## [5] LC_TIME=English_Indonesia.utf8    
## 
## time zone: Asia/Taipei
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] quadprog_1.5-8             PerformanceAnalytics_2.0.8
##  [3] quantmod_0.4.28            TTR_0.24.4                
##  [5] xts_0.14.2                 zoo_1.8-15                
##  [7] tidyquant_1.0.11           lubridate_1.9.4           
##  [9] forcats_1.0.1              stringr_1.5.2             
## [11] dplyr_1.1.4                purrr_1.1.0               
## [13] readr_2.2.0                tidyr_1.3.1               
## [15] tibble_3.3.0               ggplot2_4.0.2             
## [17] tidyverse_2.0.0           
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1    timeDate_4041.110   farver_2.1.2       
##  [4] S7_0.2.0            fastmap_1.2.0       digest_0.6.37      
##  [7] rpart_4.1.24        timechange_0.3.0    lifecycle_1.0.5    
## [10] yardstick_1.3.2     survival_3.8-3      magrittr_2.0.3     
## [13] compiler_4.5.1      rlang_1.1.6         sass_0.4.10        
## [16] tools_4.5.1         yaml_2.3.10         data.table_1.17.8  
## [19] knitr_1.50          labeling_0.4.3      curl_7.0.0         
## [22] DiceDesign_1.10     RColorBrewer_1.1-3  parsnip_1.4.1      
## [25] withr_3.0.2         workflows_1.3.0     nnet_7.3-20        
## [28] grid_4.5.1          tune_2.0.1          timetk_2.9.1       
## [31] future_1.67.0       globals_0.18.0      scales_1.4.0       
## [34] MASS_7.3-65         cli_3.6.5           rmarkdown_2.29     
## [37] generics_0.1.4      rstudioapi_0.17.1   future.apply_1.20.0
## [40] tzdb_0.5.0          cachem_1.1.0        dials_1.4.2        
## [43] splines_4.5.1       parallel_4.5.1      vctrs_0.6.5        
## [46] hardhat_1.4.2       Matrix_1.7-3        jsonlite_2.0.0     
## [49] hms_1.1.3           RobStatTM_1.0.11    listenv_0.9.1      
## [52] gower_1.0.2         jquerylib_0.1.4     recipes_1.3.1      
## [55] glue_1.8.0          parallelly_1.45.1   codetools_0.2-20   
## [58] rsample_1.3.1       stringi_1.8.7       gtable_0.3.6       
## [61] GPfit_1.0-9         pillar_1.11.1       furrr_0.3.1        
## [64] htmltools_0.5.8.1   ipred_0.9-15        lava_1.8.1         
## [67] R6_2.6.1            lhs_1.2.1           evaluate_1.0.5     
## [70] lattice_0.22-7      bslib_0.9.0         class_7.3-23       
## [73] Rcpp_1.1.0          prodlim_2025.04.28  xfun_0.53          
## [76] pkgconfig_2.0.3

Final Exam: Portfolio Analysis

Muhammad Gilang Putra Ariyanto_114035130

2026-06-10

1 Part I: Questions from Textbook (60%)

1.1 Chapter 7: Optimal Risky Portfolios

1.1.1 CFA Problem 1

1.1.2 CFA Problem 2

1.1.3 CFA Problem 3

1.1.4 CFA Problem 4

1.1.5 CFA Problem 10

1.2 Chapter 8: Index Models

1.2.1 CFA Problem 1

1.2.2 CFA Problem 2

1.2.3 CFA Problem 3

1.2.4 CFA Problem 4

1.2.5 CFA Problem 5

1.3 Chapter 9: The Capital Asset Pricing Model

1.3.1 CFA Problem 8

1.3.2 CFA Problem 9

1.3.3 CFA Problem 10

1.4 Chapter 10: Arbitrage Pricing Theory

1.4.1 Problem 13

1.4.2 Problem 14

1.4.3 Problem 15

1.4.4 Problem 16

2 Part II: R Code Questions (40%)

2.1 Question 1: Import ETF Data

2.2 Question 2: Weekly and Monthly Simple Returns

2.3 Question 3: Monthly Returns in Tibble Format

2.4 Question 4: Fama-French 3 Factors Data

2.5 Question 5: Merge Monthly Returns and FF3 Factors

2.6 Questions 6 & 7: GMV Portfolios — CAPM vs FF3 (Point Estimate: 2015/02)

2.7 Question 8: Backtest GMV Portfolios (2015/02 – Present)

2.7.1 Graph 1: Cumulative Returns — CAPM vs FF3 GMV

2.7.2 Graph 2: Monthly Return Distribution — CAPM vs FF3 GMV

2.7.3 Graph 3: Rolling 12-Month Volatility

2.7.4 Graph 4: Asset Weight Heatmap (Last Rebalance)

2.7.5 Performance Summary Table