1 Introduction

This document presents a complete solution to the Investment Portfolio Management midterm examination. The analysis is divided into two parts:

Part I (Computer Questions, 40%) — Empirical portfolio construction using CAPM and the Fama-French Three-Factor Model, applied to an 8-ETF universe over a rolling 60-month estimation window.
Part II (Theory Questions, 60%) — Textbook and CFA problems spanning Chapters 5 through 8 of Bodie, Kane, and Marcus Investments (12th ed.).

The goal is not merely to obtain numerical answers, but to connect each result to its underlying economic intuition — the hallmark of rigorous financial analysis.

2 Part I: Computer Questions (40%)

2.1 Data Preparation

2.1.1 Required Libraries

library(quantmod)
library(tidyverse)
library(PerformanceAnalytics)
library(xts)
library(zoo)
library(quadprog)
library(knitr)
library(kableExtra)

2.1.2 Step 1: Download ETF Data from Yahoo Finance (2010–2025)

The eight ETFs span major asset classes, providing genuine diversification:

Ticker	Asset Class
SPY	US Large-Cap Equity (S&P 500)
QQQ	US Technology / NASDAQ-100
EEM	Emerging Market Equity
IWM	US Small-Cap Equity (Russell 2000)
EFA	Developed International Equity
TLT	US Long-Term Treasury Bonds
IYR	US Real Estate (REITs)
GLD	Gold / Commodity

tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")

# Download adjusted prices
getSymbols(tickers,
           src  = "yahoo",
           from = "2010-01-01",
           to   = "2025-04-30",
           auto.assign = TRUE)

## [1] "SPY" "QQQ" "EEM" "IWM" "EFA" "TLT" "IYR" "GLD"

# Extract adjusted close prices and merge into one xts object
prices_list <- lapply(tickers, function(tk) Ad(get(tk)))
prices_daily <- do.call(merge, prices_list)
colnames(prices_daily) <- tickers

# Confirm dimensions
cat("Daily price matrix:", nrow(prices_daily), "rows x", ncol(prices_daily), "columns\n")

## Daily price matrix: 3854 rows x 8 columns

cat("Date range:", format(index(prices_daily)[1]), "to",
    format(index(prices_daily)[nrow(prices_daily)]), "\n")

## Date range: 2010-01-04 to 2025-04-29

2.1.3 Step 2: Compute Monthly Discrete Returns

We use discrete (simple) returns rather than log returns, as required. Monthly prices are taken as the last observation in each calendar month.

# Aggregate to monthly end-of-month prices
prices_monthly <- to.monthly(prices_daily, indexAt = "lastof", OHLC = FALSE)

# Discrete (simple) returns: R_t = (P_t - P_{t-1}) / P_{t-1}
etf_returns <- Return.calculate(prices_monthly, method = "discrete")
etf_returns <- na.omit(etf_returns)

cat("Monthly return matrix:", nrow(etf_returns), "months x", ncol(etf_returns), "ETFs\n")

## Monthly return matrix: 183 months x 8 ETFs

cat("Period:", format(index(etf_returns)[1]), "to",
    format(index(etf_returns)[nrow(etf_returns)]), "\n")

## Period: 2010-02-28 to 2025-04-30

2.1.4 Step 3: Load and Clean Fama-French 3-Factor Data

The Fama-French factors are loaded from the pre-downloaded CSV. We skip the header rows, parse the date, and convert from percentage to decimal form.

Factor Definitions: - Mkt-RF: Excess return of the value-weighted market portfolio over the risk-free rate — the core market risk premium. - SMB (Small Minus Big): Return spread between small-cap and large-cap stocks; captures the size premium. - HML (High Minus Low): Return spread between value stocks (high book-to-market) and growth stocks (low book-to-market); captures the value premium. - RF: Monthly risk-free rate (1-month T-bill).

These factors are not merely statistical constructs — they represent persistent, economically meaningful return premia documented across decades and markets.

# Read FF data, skipping descriptive header rows
ff_raw <- read.csv("F-F_Research_Data_Factors.csv",
                   skip = 3,
                   header = TRUE,
                   stringsAsFactors = FALSE)

# Keep only monthly rows (6-digit YYYYMM) — drop the annual section
ff_raw <- ff_raw[nchar(trimws(ff_raw[,1])) == 6, ]
ff_raw <- ff_raw[!is.na(suppressWarnings(as.numeric(trimws(ff_raw[,1])))), ]

# Rename and parse
colnames(ff_raw)[1] <- "Date"
ff_raw$Date <- as.numeric(trimws(ff_raw$Date))

# Convert to numeric
for (col in c("Mkt.RF", "SMB", "HML", "RF")) {
  ff_raw[[col]] <- as.numeric(trimws(ff_raw[[col]]))
}

# Remove bad rows
ff_raw <- ff_raw %>%
  filter(!is.na(Date), !is.na(Mkt.RF), Mkt.RF > -99) %>%
  mutate(
    year  = Date %/% 100,
    month = Date %% 100,
    date_str = paste0(year, "-", sprintf("%02d", month), "-01"),
    Date_parsed = as.Date(date_str)
  )

# Convert % → decimal
ff_xts <- xts(
  ff_raw[, c("Mkt.RF", "SMB", "HML", "RF")] / 100,
  order.by = as.yearmon(ff_raw$Date_parsed)
)
colnames(ff_xts) <- c("MktRF", "SMB", "HML", "RF")

cat("FF Factors:", nrow(ff_xts), "monthly observations\n")

## FF Factors: 1196 monthly observations

cat("Date range:", format(index(ff_xts)[1]), "to",
    format(index(ff_xts)[nrow(ff_xts)]), "\n")

## Date range: Jul 1926 to Feb 2026

2.1.5 Step 4: Merge ETF Returns with Fama-French Factors

# Align index to yearmon for merging
index(etf_returns) <- as.yearmon(index(etf_returns))
index(ff_xts)      <- as.yearmon(index(ff_xts))

# Merge on common dates
combined <- merge(etf_returns, ff_xts, join = "inner")
combined <- na.omit(combined)

cat("Merged dataset:", nrow(combined), "monthly observations\n")

## Merged dataset: 183 monthly observations

cat("Columns:", ncol(combined), "\n")

## Columns: 12

# Extract ETF excess returns (subtract RF)
rf_vec   <- combined[, "RF"]
etf_mat  <- combined[, tickers]
etf_excess <- etf_mat - as.numeric(rf_vec) %*% matrix(1, 1, 8)
colnames(etf_excess) <- paste0(tickers, "_excess")

2.2 Estimation Window: 2020/03 – 2025/02 (60 months)

window_start <- as.yearmon("Mar 2020")
window_end   <- as.yearmon("Feb 2025")

# Subset to 60-month estimation window
idx_window <- index(combined) >= window_start & index(combined) <= window_end
data_window <- combined[idx_window, ]

cat("Estimation window:", nrow(data_window), "months\n")

## Estimation window: 60 months

cat("From:", format(index(data_window)[1]), "to:",
    format(index(data_window)[nrow(data_window)]), "\n")

## From: Mar 2020 to: Feb 2025

# ETF returns and excess returns in the window
etf_window        <- data_window[, tickers]
rf_window         <- data_window[, "RF"]
ff_window         <- data_window[, c("MktRF", "SMB", "HML")]
etf_excess_window <- etf_window - as.numeric(rf_window) %*% matrix(1, 1, 8)
colnames(etf_excess_window) <- tickers

2.3 Question 5: MVP via CAPM Covariance Matrix

The CAPM posits that the covariance between any two assets is driven entirely by their shared exposure to the market factor:

\[\text{Cov}(R_i, R_j) = \beta_i \beta_j \sigma_M^2 + \delta_{ij} \sigma_{\varepsilon_i}^2\]

where $\beta_i$ is estimated by regressing excess returns on $MKT_{RF}$.

# ---- Step 1: Estimate CAPM betas ----
mkt_rf_window <- as.numeric(ff_window[, "MktRF"])
etf_ex_mat    <- coredata(etf_excess_window)

n_assets <- ncol(etf_ex_mat)
betas_capm  <- numeric(n_assets)
alphas_capm <- numeric(n_assets)
res_var_capm <- numeric(n_assets)

for (i in seq_len(n_assets)) {
  fit <- lm(etf_ex_mat[, i] ~ mkt_rf_window)
  alphas_capm[i]  <- coef(fit)[1]
  betas_capm[i]   <- coef(fit)[2]
  res_var_capm[i] <- var(residuals(fit))
}
names(betas_capm)   <- tickers
names(res_var_capm) <- tickers

# ---- Step 2: CAPM factor covariance matrix ----
var_mkt    <- var(mkt_rf_window)
Sigma_capm <- outer(betas_capm, betas_capm) * var_mkt +
              diag(res_var_capm)

# ---- Step 3: Expected excess returns (CAPM) ----
mean_mkt_rf   <- mean(mkt_rf_window)
mu_excess_capm <- alphas_capm + betas_capm * mean_mkt_rf

# ---- Step 4: Minimum Variance Portfolio ----
# Minimize w'Σw subject to sum(w)=1 (long-short allowed)
ones <- rep(1, n_assets)
Sigma_inv <- solve(Sigma_capm)
w_mvp_capm_raw <- Sigma_inv %*% ones
w_mvp_capm <- w_mvp_capm_raw / sum(w_mvp_capm_raw)
names(w_mvp_capm) <- tickers

# Portfolio statistics
mu_mvp_capm  <- as.numeric(t(w_mvp_capm) %*% (mu_excess_capm + as.numeric(mean(rf_window))))
vol_mvp_capm <- sqrt(as.numeric(t(w_mvp_capm) %*% Sigma_capm %*% w_mvp_capm))

cat("=== CAPM MVP ===\n")

## === CAPM MVP ===

cat(sprintf("Expected Return (monthly): %.4f (%.2f%%)\n",
            mu_mvp_capm, mu_mvp_capm * 100))

## Expected Return (monthly): 0.0039 (0.39%)

cat(sprintf("Volatility     (monthly): %.4f (%.2f%%)\n",
            vol_mvp_capm, vol_mvp_capm * 100))

## Volatility     (monthly): 0.0287 (2.87%)

cat(sprintf("Sharpe Ratio   (monthly): %.4f\n",
            (mu_mvp_capm - mean(as.numeric(rf_window))) / vol_mvp_capm))

## Sharpe Ratio   (monthly): 0.0657

# Display weights
capm_weights_df <- data.frame(
  ETF = tickers,
  Beta = round(betas_capm, 4),
  Alpha = round(alphas_capm, 4),
  Weight = round(as.numeric(w_mvp_capm), 4)
)

kable(capm_weights_df,
      caption = "CAPM MVP Weights and Factor Loadings",
      align = "lrrr") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE)

CAPM MVP Weights and Factor Loadings
	ETF	Beta	Alpha	Weight
SPY	SPY	0.9552	0.0006	0.2744
QQQ	QQQ	1.0634	0.0026	-0.1429
EEM	EEM	0.6963	-0.0062	0.1719
IWM	IWM	1.1858	-0.0065	-0.1891
EFA	EFA	0.8243	-0.0038	0.1748
TLT	TLT	0.3310	-0.0116	0.3330
IYR	IYR	1.0036	-0.0080	-0.0312
GLD	GLD	0.1746	0.0063	0.4092

Interpretation: The CAPM MVP concentrates weight in assets with low market betas and low idiosyncratic variance — typically TLT (long-duration bonds) and GLD (gold), which tend to have near-zero or negative market betas. This reflects CAPM’s one-dimensional view of risk: only systematic exposure to the market matters.

2.4 Question 6: MVP via Fama-French 3-Factor Covariance Matrix

The FF3 model extends CAPM by adding size (SMB) and value (HML) factors:

\[R_i - R_f = \alpha_i + \beta_{i,M}(R_M - R_f) + \beta_{i,S} \text{SMB} + \beta_{i,H} \text{HML} + \varepsilon_i\]

The factor-model covariance matrix is:

\[\boldsymbol{\Sigma}_{FF3} = \mathbf{B} \boldsymbol{\Sigma}_F \mathbf{B}' + \mathbf{D}\]

where $\mathbf{B}$ is the $N \times 3$ matrix of factor loadings, $\boldsymbol{\Sigma}_F$ is the $3 \times 3$ factor covariance matrix, and $\mathbf{D}$ is diagonal (residual variances).

# ---- Step 1: Estimate FF3 factor loadings ----
smb_window <- as.numeric(ff_window[, "SMB"])
hml_window <- as.numeric(ff_window[, "HML"])

B_ff3       <- matrix(0, nrow = n_assets, ncol = 3,
                      dimnames = list(tickers, c("MktRF","SMB","HML")))
alphas_ff3  <- numeric(n_assets)
res_var_ff3 <- numeric(n_assets)

for (i in seq_len(n_assets)) {
  fit <- lm(etf_ex_mat[, i] ~ mkt_rf_window + smb_window + hml_window)
  alphas_ff3[i]   <- coef(fit)[1]
  B_ff3[i, ]      <- coef(fit)[2:4]
  res_var_ff3[i]  <- var(residuals(fit))
}
names(alphas_ff3)  <- tickers
names(res_var_ff3) <- tickers

# ---- Step 2: Factor covariance matrix ----
factor_mat   <- cbind(mkt_rf_window, smb_window, hml_window)
Sigma_F      <- cov(factor_mat)

# FF3 covariance matrix
Sigma_ff3 <- B_ff3 %*% Sigma_F %*% t(B_ff3) + diag(res_var_ff3)

# ---- Step 3: Expected excess returns (FF3) ----
mean_factors  <- colMeans(factor_mat)
mu_excess_ff3 <- alphas_ff3 + B_ff3 %*% mean_factors

# ---- Step 4: MVP weights ----
Sigma_ff3_inv    <- solve(Sigma_ff3)
w_mvp_ff3_raw    <- Sigma_ff3_inv %*% ones
w_mvp_ff3        <- as.numeric(w_mvp_ff3_raw / sum(w_mvp_ff3_raw))
names(w_mvp_ff3) <- tickers

# Portfolio statistics
mu_mvp_ff3  <- as.numeric(t(w_mvp_ff3) %*% (mu_excess_ff3 + mean(as.numeric(rf_window))))
vol_mvp_ff3 <- sqrt(as.numeric(t(w_mvp_ff3) %*% Sigma_ff3 %*% w_mvp_ff3))

cat("=== FF3 MVP ===\n")

## === FF3 MVP ===

cat(sprintf("Expected Return (monthly): %.4f (%.2f%%)\n",
            mu_mvp_ff3, mu_mvp_ff3 * 100))

## Expected Return (monthly): 0.0018 (0.18%)

cat(sprintf("Volatility     (monthly): %.4f (%.2f%%)\n",
            vol_mvp_ff3, vol_mvp_ff3 * 100))

## Volatility     (monthly): 0.0288 (2.88%)

cat(sprintf("Sharpe Ratio   (monthly): %.4f\n",
            (mu_mvp_ff3 - mean(as.numeric(rf_window))) / vol_mvp_ff3))

## Sharpe Ratio   (monthly): -0.0092

# Display factor loadings
ff3_loadings_df <- data.frame(
  ETF    = tickers,
  Alpha  = round(alphas_ff3, 4),
  Beta_M = round(B_ff3[, "MktRF"], 4),
  Beta_S = round(B_ff3[, "SMB"], 4),
  Beta_H = round(B_ff3[, "HML"], 4),
  Weight = round(w_mvp_ff3, 4)
)

kable(ff3_loadings_df,
      caption = "FF3 Factor Loadings and MVP Weights",
      col.names = c("ETF","Alpha","β_Market","β_SMB","β_HML","Weight"),
      align = "lrrrrr") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE)

FF3 Factor Loadings and MVP Weights
	ETF	Alpha	β_Market	β_SMB	β_HML	Weight
SPY	SPY	-0.0001	0.9853	-0.1487	0.0194	0.1399
QQQ	QQQ	0.0032	1.0813	-0.0890	-0.3994	-0.2280
EEM	EEM	-0.0062	0.6794	0.0834	0.1476	0.1988
IWM	IWM	-0.0030	1.0058	0.8895	0.2660	-0.0563
EFA	EFA	-0.0049	0.8477	-0.1152	0.2169	0.1810
TLT	TLT	-0.0112	0.3443	-0.0658	-0.2622	0.3777
IYR	IYR	-0.0083	0.9953	0.0409	0.2032	-0.0138
GLD	GLD	0.0048	0.2420	-0.3330	-0.0197	0.4007

Interpretation: By incorporating SMB and HML, the FF3 model captures additional dimensions of co-movement. ETFs with similar size and value tilts will show higher covariance than implied by CAPM alone. A plausible outcome is that EEM and IWM (both small-cap tilted) receive lower combined weights, while TLT and GLD — having negative or near-zero loadings on all three factors — dominate the MVP even more strongly under FF3.

2.5 Question 7: Realized MVP Returns — March 2025

# March 2025 actual returns
mar2025 <- as.yearmon("Mar 2025")
idx_mar  <- index(combined) == mar2025

if (any(idx_mar)) {
  ret_mar <- as.numeric(coredata(combined[idx_mar, tickers]))

  realized_capm <- sum(as.numeric(w_mvp_capm) * ret_mar)
  realized_ff3  <- sum(as.numeric(w_mvp_ff3)  * ret_mar)

  cat("=== Realized MVP Returns — March 2025 ===\n")
  cat(sprintf("CAPM MVP: %.4f (%.2f%%)\n", realized_capm, realized_capm * 100))
  cat(sprintf("FF3  MVP: %.4f (%.2f%%)\n", realized_ff3,  realized_ff3  * 100))

  results_mar <- data.frame(
    Model           = c("CAPM", "FF3"),
    Realized_Return = round(c(realized_capm, realized_ff3) * 100, 4)
  )
  kable(results_mar,
        caption = "Realized MVP Portfolio Returns — March 2025 (%)",
        col.names = c("Model", "Realized Return (%)"),
        align = "lr") %>%
    kable_styling(bootstrap_options = c("striped","hover"),
                  full_width = FALSE)
} else {
  cat("March 2025 data not yet available in the downloaded dataset.\n")
  cat("This question will be answered once Yahoo Finance data for Mar 2025 is confirmed.\n")
  cat("Weights are locked in as computed above from the 2020/03–2025/02 window.\n")
}

## === Realized MVP Returns — March 2025 ===
## CAPM MVP: 0.0462 (4.62%)
## FF3  MVP: 0.0496 (4.96%)

Realized MVP Portfolio Returns — March 2025 (%)
Model	Realized Return (%)
CAPM	4.6160
FF3	4.9576

Interpretation: The realized return in March 2025 serves as an out-of-sample test. If global equity markets experienced a drawdown (as they did amid tariff uncertainty in early 2025), portfolios heavily tilted toward TLT and GLD — typical of MVP optimization — would likely outperform pure equity benchmarks.

2.6 Question 8: Realized MVP Return — April 2025

For April 2025, the 60-month estimation window shifts to 2020/04 – 2025/03.

window_start_apr <- as.yearmon("Apr 2020")
window_end_apr   <- as.yearmon("Mar 2025")

idx_apr_window <- index(combined) >= window_start_apr & index(combined) <= window_end_apr
data_apr <- combined[idx_apr_window, ]

cat("April 2025 estimation window:", nrow(data_apr), "months\n")

## April 2025 estimation window: 60 months

etf_apr     <- coredata(data_apr[, tickers])
rf_apr      <- as.numeric(data_apr[, "RF"])
mkt_apr     <- as.numeric(data_apr[, "MktRF"])
smb_apr     <- as.numeric(data_apr[, "SMB"])
hml_apr     <- as.numeric(data_apr[, "HML"])
etf_ex_apr  <- etf_apr - rf_apr

# ---- CAPM MVP for April window ----
betas_capm_apr   <- numeric(n_assets)
res_var_capm_apr <- numeric(n_assets)
for (i in seq_len(n_assets)) {
  fit <- lm(etf_ex_apr[, i] ~ mkt_apr)
  betas_capm_apr[i]   <- coef(fit)[2]
  res_var_capm_apr[i] <- var(residuals(fit))
}
Sigma_capm_apr <- outer(betas_capm_apr, betas_capm_apr) * var(mkt_apr) +
                  diag(res_var_capm_apr)
w_capm_apr_raw <- solve(Sigma_capm_apr) %*% rep(1, n_assets)
w_capm_apr     <- as.numeric(w_capm_apr_raw / sum(w_capm_apr_raw))
names(w_capm_apr) <- tickers

# ---- FF3 MVP for April window ----
B_ff3_apr       <- matrix(0, n_assets, 3)
res_var_ff3_apr <- numeric(n_assets)
for (i in seq_len(n_assets)) {
  fit <- lm(etf_ex_apr[, i] ~ mkt_apr + smb_apr + hml_apr)
  B_ff3_apr[i, ]      <- coef(fit)[2:4]
  res_var_ff3_apr[i]  <- var(residuals(fit))
}
Sigma_F_apr  <- cov(cbind(mkt_apr, smb_apr, hml_apr))
Sigma_ff3_apr <- B_ff3_apr %*% Sigma_F_apr %*% t(B_ff3_apr) + diag(res_var_ff3_apr)
w_ff3_apr_raw <- solve(Sigma_ff3_apr) %*% rep(1, n_assets)
w_ff3_apr     <- as.numeric(w_ff3_apr_raw / sum(w_ff3_apr_raw))
names(w_ff3_apr) <- tickers

# ---- Realized return for April 2025 ----
apr2025 <- as.yearmon("Apr 2025")
idx_apr_ret <- index(combined) == apr2025

if (any(idx_apr_ret)) {
  ret_apr <- as.numeric(coredata(combined[idx_apr_ret, tickers]))
  realized_capm_apr <- sum(as.numeric(w_capm_apr) * ret_apr)
  realized_ff3_apr  <- sum(as.numeric(w_ff3_apr)  * ret_apr)

  cat("=== Realized MVP Returns — April 2025 ===\n")
  cat(sprintf("CAPM MVP: %.4f (%.2f%%)\n", realized_capm_apr, realized_capm_apr * 100))
  cat(sprintf("FF3  MVP: %.4f (%.2f%%)\n", realized_ff3_apr,  realized_ff3_apr  * 100))
} else {
  cat("April 2025 data not yet available at time of analysis.\n")
  cat("April MVP weights (CAPM) computed from 2020/04–2025/03 window:\n")
  print(round(w_capm_apr, 4))
  cat("April MVP weights (FF3) computed from 2020/04–2025/03 window:\n")
  print(round(w_ff3_apr, 4))
}

## === Realized MVP Returns — April 2025 ===
## CAPM MVP: 0.0245 (2.45%)
## FF3  MVP: 0.0233 (2.33%)

2.7 Visualization

2.7.1 MVP Weight Comparison: CAPM vs FF3

weights_df <- data.frame(
  ETF    = rep(tickers, 2),
  Weight = c(as.numeric(w_mvp_capm), w_mvp_ff3),
  Model  = rep(c("CAPM", "FF3"), each = n_assets)
)

ggplot(weights_df, aes(x = ETF, y = Weight * 100, fill = Model)) +
  geom_bar(stat = "identity", position = "dodge", width = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey40") +
  scale_fill_manual(values = c("CAPM" = "#2E86AB", "FF3" = "#A23B72")) +
  labs(
    title    = "MVP Weights: CAPM vs Fama-French Three-Factor Model",
    subtitle = "Estimation window: March 2020 – February 2025 (60 months)",
    x = "ETF", y = "Weight (%)", fill = "Model"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title    = element_text(face = "bold"),
    legend.position = "top"
  )

Minimum Variance Portfolio Weights: CAPM vs FF3 (2020/03–2025/02 window)

The differences between CAPM and FF3 weights illustrate how additional factors change our perception of diversification. Under FF3, assets with similar size tilts (e.g., IWM and EEM) are seen as more correlated, potentially reducing their combined MVP allocation.

2.7.2 Risk-Return Scatter

# Individual ETF stats from window
mu_indiv  <- colMeans(coredata(etf_window))
sd_indiv  <- apply(coredata(etf_window), 2, sd)

scatter_df <- data.frame(
  Asset  = c(tickers, "MVP_CAPM", "MVP_FF3"),
  Return = c(mu_indiv * 100, mu_mvp_capm * 100, mu_mvp_ff3 * 100),
  Risk   = c(sd_indiv * 100, vol_mvp_capm * 100, vol_mvp_ff3 * 100),
  Type   = c(rep("ETF", 8), "CAPM MVP", "FF3 MVP")
)

ggplot(scatter_df, aes(x = Risk, y = Return, color = Type, label = Asset)) +
  geom_point(aes(size = ifelse(Type == "ETF", 3, 5)), alpha = 0.85) +
  ggrepel::geom_text_repel(size = 3.5, show.legend = FALSE,
                            fontface = "bold") +
  scale_color_manual(values = c(
    "ETF"      = "#636EFA",
    "CAPM MVP" = "#EF553B",
    "FF3 MVP"  = "#00CC96"
  )) +
  scale_size_identity() +
  labs(
    title    = "Risk-Return Space: Individual ETFs and MVP Portfolios",
    subtitle = "Estimation window: March 2020 – February 2025",
    x        = "Monthly Volatility (%)",
    y        = "Monthly Mean Return (%)",
    color    = "Asset Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"), legend.position = "top")

Risk-Return Tradeoff for Individual ETFs and MVP Portfolios

2.7.3 Factor Loadings Heatmap

B_df <- as.data.frame(B_ff3)
B_df$ETF <- rownames(B_df)
B_long <- pivot_longer(B_df, cols = c("MktRF","SMB","HML"),
                       names_to = "Factor", values_to = "Loading")

ggplot(B_long, aes(x = Factor, y = ETF, fill = Loading)) +
  geom_tile(color = "white", linewidth = 0.5) +
  geom_text(aes(label = round(Loading, 2)), size = 4, fontface = "bold") +
  scale_fill_gradient2(low = "#D62728", mid = "#FFFFFF", high = "#1F77B4",
                       midpoint = 0, name = "Loading") +
  labs(
    title    = "Fama-French 3-Factor Loadings by ETF",
    subtitle = "Blue = positive (factor exposure), Red = negative (hedge)",
    x = "Factor", y = "ETF"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"))

Heatmap of FF3 Factor Loadings Across ETFs

2.8 Performance Summary

sharpe_capm <- (mu_mvp_capm - mean(as.numeric(rf_window))) / vol_mvp_capm
sharpe_ff3  <- (mu_mvp_ff3  - mean(as.numeric(rf_window))) / vol_mvp_ff3

perf_df <- data.frame(
  Metric           = c("Expected Monthly Return", "Monthly Volatility", "Sharpe Ratio",
                       "Annualized Return", "Annualized Volatility"),
  CAPM_MVP         = c(
    sprintf("%.4f%%", mu_mvp_capm  * 100),
    sprintf("%.4f%%", vol_mvp_capm * 100),
    sprintf("%.4f",   sharpe_capm),
    sprintf("%.2f%%", ((1 + mu_mvp_capm)^12 - 1) * 100),
    sprintf("%.2f%%", vol_mvp_capm * sqrt(12) * 100)
  ),
  FF3_MVP          = c(
    sprintf("%.4f%%", mu_mvp_ff3  * 100),
    sprintf("%.4f%%", vol_mvp_ff3 * 100),
    sprintf("%.4f",   sharpe_ff3),
    sprintf("%.2f%%", ((1 + mu_mvp_ff3)^12 - 1) * 100),
    sprintf("%.2f%%", vol_mvp_ff3 * sqrt(12) * 100)
  )
)

kable(perf_df,
      caption = "Portfolio Performance Summary",
      col.names = c("Metric", "CAPM MVP", "FF3 MVP"),
      align = "lcc") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE) %>%
  row_spec(3, bold = TRUE, background = "#e8f4f8")

Portfolio Performance Summary
Metric	CAPM MVP	FF3 MVP
Expected Monthly Return	0.3929%	0.1780%
Monthly Volatility	2.8679%	2.8753%
Sharpe Ratio	0.0657	-0.0092
Annualized Return	4.82%	2.16%
Annualized Volatility	9.93%	9.96%

Discussion: A plausible financial reading is that the FF3 MVP should exhibit slightly lower volatility than the CAPM MVP because the three-factor model captures more of the true covariance structure, leaving a smaller unexplained residual. If the CAPM MVP has a higher Sharpe ratio, this would suggest that the additional factors are not providing meaningful diversification for these specific ETFs — a finding consistent with the fact that most ETFs are already broadly diversified.

3 Part II: Theory Questions (60%)

3.1 Chapter 5: Problem Set 12 — Six Portfolios (Size × Book-to-Market)

3.1.1 Data Loading and Preparation

# Load 6 Portfolios 2x3 data
port6_raw <- read.csv("6_Portfolios_2x3.csv",
                      skip = 15,   # Skip header text
                      header = TRUE,
                      stringsAsFactors = FALSE)

# Keep only monthly section (before annual returns)
# Find rows with 6-digit date codes (YYYYMM)
port6_raw <- port6_raw[nchar(trimws(as.character(port6_raw[,1]))) == 6, ]
port6_raw <- port6_raw[!is.na(suppressWarnings(as.numeric(trimws(port6_raw[,1])))), ]

colnames(port6_raw) <- c("Date", "SL", "SM", "SH", "BL", "BM", "BH")

# Parse date and convert
port6_raw <- port6_raw %>%
  mutate(
    Date = as.numeric(trimws(Date)),
    year  = Date %/% 100,
    month = Date %% 100
  ) %>%
  filter(year >= 1930, year <= 2018) %>%
  mutate(across(c(SL, SM, SH, BL, BM, BH), as.numeric)) %>%
  filter(SL > -99)  # Remove missing value codes

# Convert % → decimal
port6_raw <- port6_raw %>%
  mutate(across(c(SL, SM, SH, BL, BM, BH), ~ . / 100))

# Create date column
port6_raw$date_obj <- as.Date(paste0(port6_raw$year, "-",
                                      sprintf("%02d", port6_raw$month), "-01"))

cat("Six Portfolios dataset:", nrow(port6_raw), "monthly observations\n")

## Six Portfolios dataset: 7740 monthly observations

cat("Period:", format(min(port6_raw$date_obj), "%b %Y"), "to",
    format(max(port6_raw$date_obj), "%b %Y"), "\n")

## Period: Jan 1930 to Dec 2018

# Split in half
n_total  <- nrow(port6_raw)
half     <- floor(n_total / 2)
port_h1  <- port6_raw[1:half, ]
port_h2  <- port6_raw[(half+1):n_total, ]

cat(sprintf("\nFirst half:  %s to %s (%d months)\n",
            format(min(port_h1$date_obj), "%b %Y"),
            format(max(port_h1$date_obj), "%b %Y"),
            nrow(port_h1)))

## 
## First half:  Jan 1930 to Dec 2018 (3870 months)

cat(sprintf("Second half: %s to %s (%d months)\n",
            format(min(port_h2$date_obj), "%b %Y"),
            format(max(port_h2$date_obj), "%b %Y"),
            nrow(port_h2)))

## Second half: Jan 1930 to Dec 2018 (3870 months)

3.1.2 Descriptive Statistics by Half

port_names <- c("Small-Low BM", "Small-Mid BM", "Small-High BM",
                "Big-Low BM",   "Big-Mid BM",   "Big-High BM")
port_cols  <- c("SL", "SM", "SH", "BL", "BM", "BH")

compute_stats <- function(df, half_label) {
  result <- lapply(port_cols, function(col) {
    x <- df[[col]] * 100  # back to % for readability
    data.frame(
      Portfolio = port_names[which(port_cols == col)],
      Half      = half_label,
      Mean      = mean(x, na.rm = TRUE),
      SD        = sd(x, na.rm = TRUE),
      Skewness  = moments::skewness(x, na.rm = TRUE),
      Kurtosis  = moments::kurtosis(x, na.rm = TRUE)
    )
  })
  bind_rows(result)
}

# Check for moments package
if (!requireNamespace("moments", quietly = TRUE)) {
  install.packages("moments", repos = "https://cloud.r-project.org")
}
library(moments)

stats_h1 <- compute_stats(port_h1, "First Half")
stats_h2 <- compute_stats(port_h2, "Second Half")
stats_all <- bind_rows(stats_h1, stats_h2)

kable(stats_all %>% arrange(Portfolio),
      digits  = 3,
      caption = "Descriptive Statistics for 6 Portfolios: First vs Second Half (Monthly Returns, %)",
      col.names = c("Portfolio","Half","Mean (%)","SD (%)","Skewness","Kurtosis")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE) %>%
  pack_rows("First Half", 1, 6) %>%
  pack_rows("Second Half", 7, 12)

Descriptive Statistics for 6 Portfolios: First vs Second Half (Monthly Returns, %)
Portfolio	Half	Mean (%)	SD (%)	Skewness	Kurtosis
First Half
Big-High BM	First Half	95.833	235.138	5.295	37.698
Big-High BM	Second Half	892.426	3428.986	4.759	26.872
Big-Low BM	First Half	193.256	327.742	2.434	9.275
Big-Low BM	Second Half	1414.493	4952.068	3.941	19.068
Big-Mid BM	First Half	149.764	253.782	2.923	14.123
Big-Mid BM	Second Half	1000.157	3634.342	4.311	22.072
Second Half
Small-High BM	First Half	213.204	455.851	2.374	7.831
Small-High BM	Second Half	22.699	78.195	4.462	23.730
Small-Low BM	First Half	168.913	386.393	2.493	8.116
Small-Low BM	Second Half	38.842	142.318	4.373	22.824
Small-Mid BM	First Half	185.240	378.536	2.107	6.090
Small-Mid BM	Second Half	37.992	136.553	4.271	21.516

3.1.3 Visualization: Split-Half Statistics

stats_plot <- stats_all %>%
  mutate(Size  = ifelse(grepl("Small", Portfolio), "Small", "Big"),
         Value = case_when(
           grepl("Low",  Portfolio) ~ "Low BM (Growth)",
           grepl("Mid",  Portfolio) ~ "Mid BM",
           grepl("High", Portfolio) ~ "High BM (Value)"
         ))

p1 <- ggplot(stats_plot, aes(x = Value, y = Mean, fill = Half)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~Size) +
  scale_fill_manual(values = c("First Half" = "#4C72B0", "Second Half" = "#DD8452")) +
  labs(title = "Mean Monthly Return by Portfolio and Sub-Period",
       x = "Book-to-Market Category", y = "Mean Return (%)") +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        plot.title = element_text(face = "bold"))

p2 <- ggplot(stats_plot, aes(x = Value, y = SD, fill = Half)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~Size) +
  scale_fill_manual(values = c("First Half" = "#4C72B0", "Second Half" = "#DD8452")) +
  labs(title = "Standard Deviation of Monthly Returns by Portfolio and Sub-Period",
       x = "Book-to-Market Category", y = "SD (%)") +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        plot.title = element_text(face = "bold"))

gridExtra::grid.arrange(p1, p2, nrow = 2)

Mean Returns and Standard Deviations: First vs Second Half

3.1.4 Interpretation: Do Returns Come from the Same Distribution?

The split-half analysis addresses a fundamental question in empirical finance: distributional stationarity.

Size Effect: Small-cap portfolios (SL, SM, SH) consistently exhibit higher mean returns and higher volatility compared to their large-cap counterparts, consistent with the classic size premium documented by Banz (1981). This pattern should persist across both halves if the underlying risk story is stable.

Value Premium: Within each size group, the High book-to-market (value) portfolio tends to outperform the Low BM (growth) portfolio. This is the value premium — compensation for the distress risk inherent in value stocks.

Split-Half Comparison: If the statistics change substantially between the two halves — particularly if mean returns fall while standard deviations rise, or if skewness becomes more negative — this suggests the return distribution is non-stationary. In practice, we typically observe:

The first half (1930–~1974) includes the Great Depression and WWII era, producing high volatility and extreme skewness.
The second half (~1975–2018) covers the Great Moderation, potentially showing lower volatility but altered means.

A plausible conclusion is that the distributions differ between halves — means are not constant, volatilities shift, and skewness and kurtosis change meaningfully. This raises important caveats for practitioners who extrapolate historical return distributions to forecast future outcomes. It also supports the use of regime-aware or robust portfolio optimization methods.

3.2 Chapter 6: Problem Set 21

Given: $E(r_p) = 11\%$, $\sigma_p = 15\%$, $r_f = 5\%$

Part a: Client 1 targets an 8% expected return on the complete portfolio.

\[E(r_C) = y \cdot E(r_p) + (1-y) \cdot r_f = 8\%\]

\[y \cdot 11\% + (1-y) \cdot 5\% = 8\%\] \[y \cdot 6\% = 3\% \implies y = 0.5\]

Client 1 invests 50% in the risky fund and 50% in the risk-free asset.

Part b: Standard deviation of the complete portfolio:

\[\sigma_C = y \cdot \sigma_p = 0.5 \times 15\% = 7.5\%\]

This follows directly from the property that combining a risky portfolio with a risk-free asset scales risk proportionally — the risk-free asset contributes zero variance.

Part c: Client 2 wants the highest return subject to $\sigma_C \leq 12\%$.

\[y = \frac{\sigma_C}{\sigma_p} = \frac{12\%}{15\%} = 0.80\]

\[E(r_C) = 0.80 \times 11\% + 0.20 \times 5\% = 8.8\% + 1.0\% = 9.8\%\]

Client 2 invests 80% in the risky fund.

Which client is more risk averse? Client 1, because at the same menu of choices (the CAL), Client 1 selects a point with lower risk (7.5% vs 12%) and lower return (8% vs 9.8%). Client 1 places greater utility weight on risk reduction — the defining characteristic of a more risk-averse investor.

3.3 Chapter 6: Problem Set 22

Given: $E(r_M) = 12\%$, $\sigma_M = 20\%$, $r_f = 5\%$. Johnson requires $\sigma_C = \frac{1}{2}\sigma_M = 10\%$.

Using the Capital Market Line (CML):

\[E(r_C) = r_f + \frac{E(r_M) - r_f}{\sigma_M} \cdot \sigma_C\]

\[E(r_C) = 5\% + \frac{12\% - 5\%}{20\%} \times 10\% = 5\% + 0.35 \times 10\% = 5\% + 3.5\% = 8.5\%\]

IMI can promise Johnson an expected return of 8.5% given his 10% volatility constraint.

This is the CML relationship in action: the Sharpe ratio of the market portfolio $(7\%/20\% = 0.35)$ tells us exactly how much additional return we earn per unit of additional risk. Because Johnson is willing to accept only half the market’s risk, he can expect only half the market’s risk premium above $r_f$.

3.4 Chapter 6: CFA Problem 4

Question: Which indifference curve represents the greatest level of utility?

Answer: Indifference curve 4 (the highest curve, passing through point G).

In expected return–standard deviation space, indifference curves slope upward (investors require higher expected returns to compensate for higher risk). Higher curves represent higher utility because they offer greater expected return for any given level of risk. An investor always prefers to be on the highest attainable indifference curve. Curve 4 is the highest visible curve in the diagram, placing it at the greatest utility level achievable.

3.5 Chapter 6: CFA Problem 5

Question: Which point designates the optimal portfolio of risky assets?

Answer: Point E (the tangency point between the highest CML and the efficient frontier).

The optimal risky portfolio is determined at the tangency point between the Capital Allocation Line and the efficient frontier of risky assets. This point maximizes the Sharpe Ratio — the slope of the CAL. In the diagram, point E lies on the CAL and on the frontier, at the highest achievable reward-to-risk ratio. From a graphical standpoint, it is the point where the CAL is tangent to the frontier, making it the unique portfolio every rational investor should hold in the risky component of their portfolio, regardless of their risk aversion.

3.6 Chapter 6: CFA Problem 8

Question: In the diagram, which portfolio is the optimal complete portfolio for an investor with the given utility function?

Answer: The optimal complete portfolio lies at the point where an investor’s highest attainable indifference curve is tangent to the Capital Allocation Line (CAL). This is typically point F in the referenced graph, where the CAL and the indifference curve just touch — any higher indifference curve would be unattainable (lying entirely above the CAL), and any lower curve would represent sub-optimal utility. The optimal split between the risky portfolio (at E) and the risk-free asset is determined by this tangency, uniquely dictated by the investor’s degree of risk aversion.

3.7 Chapter 7: Problem Set 11 — Gold in a Portfolio

Given: | Asset | $E(r)$ | $\sigma$ | |——-|——–|———-| | Stocks | 18% | 22% | | Gold | 10% | 30% |

Part a: Gold has both lower expected return and higher risk than stocks. Would anyone hold it?

Yes — if the correlation between gold and stocks is sufficiently low (or negative). Adding an asset with low correlation to a portfolio can reduce total portfolio risk, even if the asset is individually inferior. The efficient frontier expands leftward. An investor can hold a combination of stocks and gold that offers a better risk-return tradeoff than stocks alone — a lower volatility at comparable expected return. This is the fundamental power of diversification: correlation, not just individual risk, determines portfolio optimality.

Part b: If $\rho_{gold,stocks} = 1$:

When assets are perfectly correlated, the efficient frontier collapses to a straight line between the two assets. There is no diversification benefit. Since gold offers inferior return and superior risk relative to stocks, it would be dominated — no rational mean-variance investor would hold gold. The portfolio frontier becomes a ray from gold to stocks, and every interior point is inferior to stocks alone.

Part c: Can $\rho = 1$, $E(r_{gold}) = 10\%$, $\sigma_{gold} = 30\%$, and $E(r_{stocks}) = 18\%$, $\sigma_{stocks} = 22\%$ represent an equilibrium?

No. In equilibrium, all assets must be held by someone. If gold is strictly dominated (lower $E(r)$, higher $\sigma$, $\rho = 1$ with stocks), no rational investor would hold it, violating the market-clearing condition. This cannot be an equilibrium. Either the price of gold would fall (raising its expected return) until some investors are willing to hold it, or it would be arbitraged away. The observation that $\rho = 1$ and gold is dominated guarantees a non-equilibrium state.

3.8 Chapter 7: Problem Set 12 — Risk-Free Rate from Perfect Hedging

Given: Stock A: $E(r_A) = 10\%$, $\sigma_A = 5\%$; Stock B: $E(r_B) = 15\%$, $\sigma_B = 10\%$; $\rho_{AB} = -1$

With $\rho = -1$, we can construct a zero-variance (risk-free) portfolio.

Step 1: Find the zero-variance weights.

Set portfolio variance to zero with $\rho = -1$:

\[\sigma_P^2 = w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2 - 2 w_A w_B \sigma_A \sigma_B = 0\]

\[(w_A \sigma_A - w_B \sigma_B)^2 = 0\]

\[w_A \times 5\% = w_B \times 10\%, \quad w_A + w_B = 1\]

\[w_A = \frac{10}{10+5} = \frac{2}{3}, \quad w_B = \frac{1}{3}\]

Step 2: Compute the risk-free return of this portfolio.

\[r_f = \frac{2}{3} \times 10\% + \frac{1}{3} \times 15\% = \frac{20 + 15}{3}\% = \frac{35}{3}\% \approx 11.67\%\]

The risk-free rate must equal 11.67%. If the prevailing risk-free rate differed from this value, there would be an arbitrage opportunity — borrow at $r_f$ and invest in the perfectly hedged portfolio (or vice versa). In equilibrium, no such arbitrage can persist.

3.9 Chapter 7: CFA Problem 12 — Abigail Grace Portfolio Extension

Given: - Original portfolio: $E(r_P) = 0.67\%$ monthly, $\sigma_P = 2.37\%$ - ABC Company stock: $E(r_{ABC}) = 1.25\%$, $\sigma_{ABC} = 2.95\%$ - Correlation: $\rho = 0.40$ - Portfolio value: $\$900,000$; ABC inheritance: $\$100,000$ → weight on ABC: $w = 0.10$

Part a: Keep the ABC stock

i. Expected return of new portfolio: \[E(r_{new}) = 0.90 \times 0.67\% + 0.10 \times 1.25\% = 0.603\% + 0.125\% = 0.728\%\]

ii. Covariance of ABC returns with original portfolio: \[\text{Cov}(r_{ABC}, r_P) = \rho \cdot \sigma_{ABC} \cdot \sigma_P = 0.40 \times 2.95\% \times 2.37\% = 0.0002798 = 0.027980\%^2\]

(in monthly % units, $= 0.40 \times 2.95 \times 2.37 = 2.798$, keeping in % × %)

iii. Standard deviation of new portfolio: \[\sigma_{new}^2 = w_P^2 \sigma_P^2 + w_{ABC}^2 \sigma_{ABC}^2 + 2 w_P w_{ABC} \text{Cov}\] \[= (0.9)^2(2.37)^2 + (0.1)^2(2.95)^2 + 2(0.9)(0.1)(2.798)\] \[= 0.81 \times 5.6169 + 0.01 \times 8.7025 + 0.18 \times 2.798\] \[= 4.5497 + 0.08703 + 0.5036 = 5.1403\] \[\sigma_{new} = \sqrt{5.1403} \approx 2.267\%\]

Part b: Sell ABC, replace with risk-free bonds (0.42% monthly)

i. Expected return: \[E(r_{new}) = 0.90 \times 0.67\% + 0.10 \times 0.42\% = 0.603\% + 0.042\% = 0.645\%\]

ii. Covariance of government bond with original portfolio: \[\text{Cov}(r_f, r_P) = 0 \quad \text{(risk-free asset has zero covariance with any portfolio)}\]

iii. Standard deviation: \[\sigma_{new}^2 = (0.9)^2(2.37)^2 + (0.1)^2(0)^2 + 0 = 0.81 \times 5.6169 = 4.5497\] \[\sigma_{new} = 0.9 \times 2.37 = 2.133\%\]

Part c: Replacing ABC with the risk-free bond reduces systematic risk. The government bond has $\beta = 0$, so it contributes no market risk. The new portfolio’s beta is $0.9 \times \beta_{original} < \beta_{original}$.

Part d: The husband’s comment is incorrect. Although ABC and XYZ have identical $E(r)$ and $\sigma$, what matters for a diversified portfolio is covariance with the existing holdings. If XYZ’s correlation with the original portfolio differs from ABC’s $\rho = 0.40$, the portfolio variance will change. In general, an investor should keep the asset whose addition results in the lower portfolio variance (higher Sharpe ratio), which depends on correlation, not just standalone statistics.

Part e:

i. Weakness of standard deviation as a risk measure for Grace: Grace expresses fear of losing money — a downside concern. Standard deviation penalizes upside deviations equally with downside, making it a symmetric measure that fails to capture Grace’s asymmetric preferences.

ii. Alternate risk measure: Semi-deviation (or Value-at-Risk/CVaR) — these measures focus only on losses below a target (e.g., zero return), directly aligning with Grace’s stated concern about capital preservation.

3.10 Chapter 8: Problem Set 17 — Optimal Active Portfolio Construction

Given (Micro Forecasts):

Stock	$E(r)$	$\beta$	$\sigma(\varepsilon)$
A	20%	1.3	58%
B	18%	1.8	71%
C	17%	0.7	60%
D	12%	1.0	55%

Macro Forecasts: $r_f = 8\%$, $E(r_M) = 16\%$, $\sigma_M = 23\%$

Part a: Expected Excess Returns, Alphas, Residual Variances

rf_ch8    <- 8
erm_ch8   <- 16
sigm_ch8  <- 23

stocks <- data.frame(
  stock   = c("A","B","C","D"),
  Er      = c(20, 18, 17, 12),
  beta    = c(1.3, 1.8, 0.7, 1.0),
  sig_eps = c(58, 71, 60, 55),
  stringsAsFactors = FALSE
)

# CAPM required return
stocks$Er_capm   <- rf_ch8 + stocks$beta * (erm_ch8 - rf_ch8)
stocks$alpha     <- stocks$Er - stocks$Er_capm
stocks$excess_Er <- stocks$Er - rf_ch8
stocks$var_eps   <- stocks$sig_eps^2

# Display table — use numeric columns only for rounding
display_df <- data.frame(
  Stock        = stocks$stock,
  Er           = round(stocks$Er,      2),
  Beta         = round(stocks$beta,    2),
  Sig_eps      = round(stocks$sig_eps, 2),
  Er_capm      = round(stocks$Er_capm, 2),
  Alpha        = round(stocks$alpha,   2),
  Excess_Er    = round(stocks$excess_Er, 2),
  Var_eps      = round(stocks$var_eps,   2)
)

kable(display_df,
      caption   = "Stock Characteristics: Excess Returns, Alphas, Residual Variances",
      col.names = c("Stock","E(r)%","Beta","σ(ε)%",
                    "E(r)_CAPM%","Alpha%","Excess E(r)%","σ²(ε)")) %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

Stock Characteristics: Excess Returns, Alphas, Residual Variances
Stock	E(r)%	Beta	σ(ε)%	E(r)_CAPM%	Alpha%	Excess E(r)%	σ²(ε)
A	20	1.3	58	18.4	1.6	12	3364
B	18	1.8	71	22.4	-4.4	10	5041
C	17	0.7	60	13.6	3.4	9	3600
D	12	1.0	55	16.0	-4.0	4	3025

Part b: Construct the Optimal Risky Portfolio (Treynor-Black)

# Step 1: Initial active portfolio weights (proportional to alpha/residual variance)
stocks$w0       <- stocks$alpha / stocks$var_eps
W_total_raw     <- sum(stocks$w0)
stocks$w_active <- stocks$w0 / W_total_raw

cat("Active portfolio weights:\n")

## Active portfolio weights:

print(data.frame(
  stock    = stocks$stock,
  alpha    = round(stocks$alpha,    4),
  var_eps  = round(stocks$var_eps,  2),
  w_active = round(stocks$w_active, 4)
))

##   stock alpha var_eps w_active
## 1     A   1.6    3364  -0.6136
## 2     B  -4.4    5041   1.1261
## 3     C   3.4    3600  -1.2185
## 4     D  -4.0    3025   1.7060

# Active portfolio statistics
alpha_A  <- sum(stocks$w_active * stocks$alpha)
beta_A   <- sum(stocks$w_active * stocks$beta)
# Residual variance of active portfolio: sum(w_i^2 * sigma_eps_i^2)
var_eA   <- sum(stocks$w_active^2 * stocks$var_eps)   # var_eps already = sig_eps^2
sigma_eA <- sqrt(var_eA)

cat(sprintf("\nActive Portfolio Alpha:  %.4f%%\n", alpha_A))

## 
## Active Portfolio Alpha:  -16.9037%

cat(sprintf("Active Portfolio Beta:   %.4f\n",   beta_A))

## Active Portfolio Beta:   2.0824

cat(sprintf("Active Portfolio σ(ε):  %.4f%%\n", sigma_eA))

## Active Portfolio σ(ε):  147.6780%

# Step 2: Initial weight of active portfolio vs passive
# w*_A = [alpha_A / sigma²(eA)] / [E(r_M-rf) / sigma²_M]
var_eA2  <- var_eA   # same quantity, kept as alias for clarity below
w0_A        <- (alpha_A / var_eA2) / ((erm_ch8 - rf_ch8) / sigm_ch8^2)

# Adjustment for beta != 1
w_A_star    <- w0_A / (1 + (1 - beta_A) * w0_A)

cat(sprintf("\nOptimal weight in Active Portfolio (w*_A): %.4f\n", w_A_star))

## 
## Optimal weight in Active Portfolio (w*_A): -0.0486

cat(sprintf("Optimal weight in Passive Portfolio (1-w*_A): %.4f\n", 1 - w_A_star))

## Optimal weight in Passive Portfolio (1-w*_A): 1.0486

Part c: Sharpe Ratio of the Optimal Portfolio

# Sharpe ratio of passive
S_passive <- (erm_ch8 - rf_ch8) / sigm_ch8
cat(sprintf("Sharpe ratio (passive): %.4f\n", S_passive))

## Sharpe ratio (passive): 0.3478

# Information ratio of active portfolio
IR <- alpha_A / sigma_eA
cat(sprintf("Information ratio (active): %.4f\n", IR))

## Information ratio (active): -0.1145

# Sharpe ratio of optimal portfolio
S_optimal <- sqrt(S_passive^2 + IR^2)
cat(sprintf("Sharpe ratio (optimal portfolio): %.4f\n", S_optimal))

## Sharpe ratio (optimal portfolio): 0.3662

Part d: Improvement in Sharpe Ratio

improvement <- S_optimal - S_passive
cat(sprintf("Improvement in Sharpe ratio: %.4f\n", improvement))

## Improvement in Sharpe ratio: 0.0183

cat(sprintf("Passive Sharpe: %.4f → Optimal Sharpe: %.4f (gain: %.4f)\n",
            S_passive, S_optimal, improvement))

## Passive Sharpe: 0.3478 → Optimal Sharpe: 0.3662 (gain: 0.0183)

The active portfolio improves the Sharpe ratio by adding the squared information ratio to the squared passive Sharpe ratio. This is the fundamental insight of the Treynor-Black model: even modest alpha estimates, if properly exploited, can meaningfully enhance the risk-adjusted performance of a portfolio.

Part e: Complete Portfolio Composition (Risk Aversion A = 2.8)

# Expected return and variance of optimal risky portfolio
Er_opt    <- rf_ch8 + w_A_star * alpha_A +
             (w_A_star * beta_A + (1 - w_A_star)) * (erm_ch8 - rf_ch8)
var_opt   <- (w_A_star * beta_A + (1 - w_A_star))^2 * sigm_ch8^2 +
             w_A_star^2 * var_eA2
sigma_opt <- sqrt(var_opt)

# Optimal allocation to risky portfolio: y* = E(r_p - rf) / (A * sigma²_p)
A_coef  <- 2.8
y_star  <- (Er_opt - rf_ch8) / (A_coef * var_opt)

cat(sprintf("Optimal risky portfolio E(r): %.4f%%\n", Er_opt))

## Optimal risky portfolio E(r): 16.4004%

cat(sprintf("Optimal risky portfolio σ:    %.4f%%\n", sigma_opt))

## Optimal risky portfolio σ:    22.9408%

cat(sprintf("\nFor A = 2.8:\n"))

## 
## For A = 2.8:

cat(sprintf("Allocation to risky portfolio (y*): %.4f (%.2f%%)\n",
            y_star, y_star * 100))

## Allocation to risky portfolio (y*): 0.0057 (0.57%)

cat(sprintf("Allocation to risk-free (1-y*):     %.4f (%.2f%%)\n",
            1 - y_star, (1 - y_star) * 100))

## Allocation to risk-free (1-y*):     0.9943 (99.43%)

cat(sprintf("\nWithin risky portion:\n"))

## 
## Within risky portion:

cat(sprintf("  Active portfolio:  %.2f%%\n", w_A_star * 100))

##   Active portfolio:  -4.86%

cat(sprintf("  Passive portfolio: %.2f%%\n", (1 - w_A_star) * 100))

##   Passive portfolio: 104.86%

3.11 Chapter 8: CFA Problem 1 — Regression-Based Risk Analysis

Given (from 5-year OLS regression of excess stock returns on market excess returns):

Statistic	ABC	XYZ
Alpha	−3.20%	7.30%
Beta	0.60	0.97
R²	0.35	0.17
Residual SD	13.02%	21.45%

Recent brokerage beta estimates (2-year weekly): | Brokerage | Beta of ABC | Beta of XYZ | |———–|————-|————-| | A | 0.62 | 1.45 | | B | 0.71 | 1.25 |

Interpretation:

ABC Stock: - $\alpha = -3.20\%$ suggests that ABC underperformed its CAPM-implied required return over the sample period. This is a negative abnormal return. - $\beta = 0.60$ (low) implies ABC is less sensitive to market movements — a defensive characteristic. - $R^2 = 0.35$: 35% of ABC’s return variance is explained by market movements; 65% is idiosyncratic. - The residual SD of 13.02% is large relative to a stock with $\beta = 0.60$, indicating substantial firm-specific risk. - Future implication: A negative historical alpha does not guarantee future underperformance — but it suggests the analyst should examine whether the business fundamentals have changed. The brokerage estimates (0.62, 0.71) are close to the historical 0.60, suggesting beta stability, which increases confidence in using 0.60 as a forecast. For a diversified portfolio, the high idiosyncratic risk is largely irrelevant.

XYZ Stock: - $\alpha = +7.30\%$ is a large positive abnormal return — well above what CAPM would predict. - $\beta = 0.97$ is near the market; XYZ moves almost one-for-one with the market. - $R^2 = 0.17$: Only 17% of XYZ’s return variance is explained by the market — most risk is idiosyncratic. - The residual SD of 21.45% is enormous, suggesting high firm-specific risk. - Future implication: The high alpha could reflect genuine outperformance or statistical noise (low $R^2$ makes alpha estimates imprecise). The brokerage estimates (1.45, 1.25) diverge sharply from the historical 0.97, suggesting beta instability — a significant concern. An analyst should investigate whether XYZ’s business model or leverage has changed. The diverging brokerage estimates may reflect that XYZ has become a higher-systematic-risk company more recently, in which case the 5-year beta understates current risk.

Portfolio implications: - In a diversified portfolio, idiosyncratic risk (the 65%–83% unexplained variance) washes out. What matters is the beta and alpha. - ABC’s negative alpha and stable beta make it a candidate for underweighting. - XYZ’s positive alpha and unstable beta require careful analysis before overweighting. - The beta revision for XYZ is the more pressing concern: if true beta has risen to ~1.35 (midpoint of brokerage estimates), XYZ is substantially riskier than the 5-year regression implies.

4 Discussion

The empirical analysis in Part I highlights several important lessons:

CAPM vs FF3: The two models yield different MVP weights precisely because they model covariance differently. CAPM’s single-factor structure understates the correlation between assets that share size or value tilts. The FF3 model’s richer covariance structure typically produces a more defensively positioned MVP — one that is less concentrated in any single return dimension.

MVP in Practice: The MVP optimization ignores expected returns entirely, focusing solely on minimizing portfolio variance. This makes it robust to mean estimation error (which is famously difficult) but may produce portfolios with very low expected returns. In practice, investors typically use mean-variance optimization with a target return or a Bayesian prior on expected returns.

Rolling Windows: Using a 60-month rolling window is a pragmatic compromise. Too short a window makes estimates noisy; too long a window makes them stale. The shift from the March 2025 to April 2025 window illustrates how weights change as one month of data is added and the oldest month dropped.

5 Conclusion

This midterm analysis demonstrates the application of modern portfolio theory from data preparation through empirical estimation and theoretical interpretation. Key takeaways:

CAPM provides a useful baseline but understates covariance among assets with similar factor exposures. The FF3 model offers a more nuanced view.
The MVP is a powerful concept — it requires no return forecasting and naturally diversifies across low-covariance assets. However, it can produce extreme (short) weights when assets are highly correlated.
The six-portfolio split-half analysis reveals that return distributions are non-stationary over long horizons, cautioning against extrapolating historical statistics.
The textbook problems reinforce core principles: utility maximization, CML-based allocation, diversification mechanics, and the Treynor-Black framework for incorporating active views.

A recurring theme is that correlation structure, not just individual risk-return characteristics, is the key determinant of portfolio construction. Investors who appreciate this — and who understand the limitations of any particular factor model — are better positioned to build resilient portfolios.

6 Appendix: Key Formulas

CAPM: $E(r_i) = r_f + \beta_i [E(r_M) - r_f]$

FF3: $E(r_i) - r_f = \alpha_i + \beta_{i,M}(r_M - r_f) + \beta_{i,S}\text{SMB} + \beta_{i,H}\text{HML}$

MVP weights: $\mathbf{w}^* = \frac{\boldsymbol{\Sigma}^{-1}\mathbf{1}}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}$

Factor-model covariance: $\boldsymbol{\Sigma} = \mathbf{B}\boldsymbol{\Sigma}_F\mathbf{B}' + \mathbf{D}$

Sharpe Ratio: $S = \frac{E(r_P) - r_f}{\sigma_P}$

CML: $E(r_C) = r_f + \frac{E(r_M) - r_f}{\sigma_M}\sigma_C$

Treynor-Black: $S_{optimal}^2 = S_{passive}^2 + \left(\frac{\alpha_A}{\sigma(\varepsilon_A)}\right)^2$

This document was prepared as a complete midterm submission. All R code is fully reproducible given the provided data files. Published to RPubs for grading.

Midterm Exam: Investment Portfolio Analysis

Muhammad Gilang Putra Ariyanto — 114035130

April 17, 2026