Limiting the portfolio to just 20 stocks (instead of the current 40–50) would generally raise the overall risk, but that increase might not be very large.
For example, if all 50 stocks carried the same risk (standard deviation, σ) and the same pairwise correlation (ρ), then portfolio risk would depend largely on the number of holdings n: the variance of an equally weighted portfolio is σ²/n + ((n − 1)/n)ρσ². With σ = 45% and ρ = 0.2, dropping from 50 to 20 stocks increases portfolio volatility only modestly, from 20.91% to 22.05%.
This jump in risk could be acceptable if Hennessy’s concentrated picks deliver higher expected returns.
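A quick closed-form check of these figures in R (n = 10 is included for the 20-versus-10 comparison made later in this section):
# Closed-form SD of an equally weighted portfolio of n stocks with common sigma and rho
sigma_i <- 0.45 # per-stock standard deviation (45%)
rho_i <- 0.20 # common pairwise correlation
port_sd_n <- function(n) sqrt(sigma_i^2 / n + ((n - 1) / n) * rho_i * sigma_i^2)
round(100 * sapply(c(50, 20, 10), port_sd_n), 2) # 20.91 22.05 23.81 (percent)
The fuller matrix-based version below reproduces the same logic with σ = 20%.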
# Load packages
library(MASS)
# Analytic risk of an equally weighted 40-stock portfolio (no simulation needed)
n_assets <- 40
mu <- rep(0.08, n_assets) # Mean return
sigma <- 0.20 # Std dev per asset
cor_matrix <- matrix(0.2, n_assets, n_assets)
diag(cor_matrix) <- 1
cov_matrix <- sigma^2 * cor_matrix
# Equal weights
weights_40 <- rep(1/n_assets, n_assets)
port_var_40 <- t(weights_40) %*% cov_matrix %*% weights_40
port_sd_40 <- sqrt(port_var_40)
# Repeat for a 20-stock portfolio (first 20 assets)
n_assets_20 <- 20
weights_20 <- rep(1/n_assets_20, n_assets_20)
cov_matrix_20 <- cov_matrix[1:20, 1:20]
port_var_20 <- t(weights_20) %*% cov_matrix_20 %*% weights_20
port_sd_20 <- sqrt(port_var_20)
cat("Std Dev with 40 stocks:", port_sd_40, "\n")
## Std Dev with 40 stocks: 0.09380832
cat("Std Dev with 20 stocks:", port_sd_20, "\n")
## Std Dev with 20 stocks: 0.09797959
To manage the risk after reducing the number of holdings, Hennessy would need to keep diversification intact across the remaining 20 stocks.
That means including stocks that don’t move together — i.e., maintaining low correlations among them. Practically, this would involve choosing companies from a broad range of industries. If the stocks are too similar or come from the same sector, correlations would rise and so would portfolio risk.
library(quadprog)
# Example of minimizing risk for 20 assets
Dmat <- 2 * cov_matrix_20 # solve.QP minimizes (1/2) w'Dw - d'w, so D = 2*Sigma gives w'Sigma w
dvec <- rep(0, n_assets_20) # no linear term: pure variance minimization
Amat <- cbind(rep(1, n_assets_20), diag(1, n_assets_20)) # budget constraint, then w_i >= 0
bvec <- c(1, rep(0, n_assets_20)) # weights sum to 1, no short-selling
sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1) # meq = 1: the budget constraint is an equality
min_risk_weights <- sol$solution
min_risk_sd <- sqrt(t(min_risk_weights) %*% cov_matrix_20 %*% min_risk_weights)
cat("Minimized risk SD with 20 stocks:", min_risk_sd, "\n")
## Minimized risk SD with 20 stocks: 0.09797959
(Note that the optimizer recovers the equal-weight portfolio here: with identical variances and identical pairwise correlations, symmetry makes equal weights the minimum-variance solution, so the SD matches the equal-weight figure above.)
Diversification has diminishing returns: the biggest benefits come from adding stocks to a small portfolio.
Consequently, cutting the holdings from 20 to 10 would increase risk far more than going from 50 to 30, or from 30 to 20. In the closed-form example above, cutting from 20 to 10 raises the standard deviation by more (+1.76 percentage points, to 23.81%) than cutting all the way from 50 to 20 (+1.14 points).
This shows that dropping to just 10 stocks could significantly increase risk, even more than the previous reduction.
This is an important consideration: because Hennessy's portfolio is just one part of the larger pension fund, increasing its concentration may not dramatically affect the total fund's risk level.
Given that the fund already includes multiple diversified portfolios, giving Hennessy more freedom to focus on his best stock picks might improve overall performance without materially raising total fund volatility.
In R, we can combine the six managers' portfolio variances and covariances into a single total-fund risk calculation:
# Assumed covariance matrix for the 6 managers' portfolio returns (illustrative values)
cov_all <- matrix(c(0.04, 0.01, 0.01, 0.01, 0.01, 0.01,
0.01, 0.05, 0.01, 0.01, 0.01, 0.01,
0.01, 0.01, 0.03, 0.01, 0.01, 0.01,
0.01, 0.01, 0.01, 0.06, 0.01, 0.01,
0.01, 0.01, 0.01, 0.01, 0.04, 0.01,
0.01, 0.01, 0.01, 0.01, 0.01, 0.07), nrow=6)
# Manager weights (Hennessy = $30M; the five others $50M each; $280M total)
weights_fund <- c(30, rep(50, 5)) / 280
# Portfolio variance for total fund
total_fund_var <- t(weights_fund) %*% cov_all %*% weights_fund
total_fund_sd <- sqrt(total_fund_var)
cat("Total Fund Std Dev:", total_fund_sd, "\n")
## Total Fund Std Dev: 0.1293133
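As a rough sketch of the concentration question: assume Hennessy's variance rises by the 20- versus 40-stock ratio from the earlier example (0.0096/0.0088 ≈ 1.09) while his covariances with the other managers stay fixed. The total fund's risk barely moves:
# Scale Hennessy's variance by the assumed concentration factor, covariances unchanged
cov_concentrated <- cov_all
cov_concentrated[1, 1] <- cov_all[1, 1] * (0.0096 / 0.0088)
sd_concentrated <- sqrt(t(weights_fund) %*% cov_concentrated %*% weights_fund)
cat("Total Fund Std Dev (more concentrated Hennessy):", sd_concentrated, "\n")
# roughly 0.1295, versus 0.1293 before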
This yields the fund's overall risk, not just Hennessy's, and that is the more appropriate measure, since his portfolio is only one component of the total fund.
Portfolio Y cannot lie on the efficient frontier because another portfolio (like Portfolio X) offers a higher return with lower risk.
That means Portfolio Y is dominated—it’s inefficient and would never be chosen by a rational investor aiming to maximize return for a given level of risk.
df <- data.frame(
Portfolio = c("W", "X", "Y", "Z"),
Return = c(15, 12, 9, 5),
Risk = c(36, 15, 21, 7)
)
library(ggplot2)
ggplot(df, aes(x=Risk, y=Return, label=Portfolio)) +
geom_point(size=3, color='purple4') +
geom_text(nudge_y = 0.5) +
labs(title = "Efficient Frontier Candidates", x = "Standard Deviation (%)", y = "Expected Return (%)") +
theme_minimal()
Since there’s no information about expected returns, I compared the risk of the two portfolio options.
Stock A and Stock C both have the same individual risk (standard deviation), but the correlation between B and C is much lower (0.10) than that between A and B (0.90).
Because of this lower correlation, a portfolio combining Stocks B and C offers better diversification and therefore less total risk than a portfolio made up of A and B.
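A minimal numerical sketch, assuming for illustration a common 20% standard deviation for each stock and a 50/50 split (the problem itself supplies only the correlations):
# Two-stock portfolio SD: sqrt(w^2 s^2 + (1-w)^2 s^2 + 2 w (1-w) rho s^2)
two_stock_sd <- function(rho, s = 0.20, w = 0.5) {
  sqrt(w^2 * s^2 + (1 - w)^2 * s^2 + 2 * w * (1 - w) * rho * s^2)
}
two_stock_sd(0.90) # A + B: about 0.195
two_stock_sd(0.10) # B + C: about 0.148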
The regression analysis gives a quantitative summary of the return and risk profiles for ABC and XYZ, based on monthly data from the past five years.
For ABC, the beta was 0.60, meaning its price movements were much less sensitive to the market than the average stock, which typically has a beta of 1.0. In practical terms, when the market moved by 1%, ABC’s return tended to move by only 0.60%. This shows that ABC had relatively low market (systematic) risk.
ABC’s alpha was –3.2%, which suggests that even when the market had no movement (a 0% return), ABC’s average return was still negative. This could indicate underperformance relative to market expectations. The residual (unsystematic) risk was 13.02%, reflecting the level of variability in ABC’s returns that couldn’t be explained by market movements. The R² value of 0.35 means that 35% of the return variation in ABC can be explained by changes in the market — a moderately strong relationship.
For XYZ, the beta was 0.97, suggesting its performance closely mirrored that of the overall market. It had average market risk compared to the broader index. Unlike ABC, XYZ had a positive alpha of 7.3%, which implies it delivered strong returns beyond what market movements alone would predict. However, its residual risk was higher at 21.45%, showing more return variability not explained by the market. The lower R² of 0.17 confirms a weaker relationship between XYZ’s returns and the market, indicating less predictability using this model.
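The problem supplies these regression statistics directly, but as a sketch, they would fall out of a single-index regression in R along the following lines (the return vectors named here are hypothetical placeholders, not data from the problem):
# Extract alpha, beta, residual SD, and R-squared from a single-index regression
index_model_stats <- function(stock_returns, market_returns) {
  s <- summary(lm(stock_returns ~ market_returns))
  c(alpha = s$coefficients[1, 1],        # intercept
    beta = s$coefficients[2, 1],         # slope on the market
    residual_sd = s$sigma,               # unsystematic risk
    r_squared = s$r.squared)
}
# e.g., index_model_stats(abc_monthly_returns, market_monthly_returns)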
From a portfolio perspective, these two stocks behave quite differently. Assuming their betas stay consistent, ABC contributes lower market risk, while XYZ aligns more with the broader market and brings higher total volatility due to its greater residual risk.
Looking at more recent beta estimates from two brokerage firms adds further insight. For ABC, all beta estimates are in a narrow range (0.60 to 0.71), confirming its historically low market sensitivity. On the other hand, XYZ shows much more variation — with one recent estimate as high as 1.45, based on weekly data. This suggests that XYZ’s future market risk might be higher than what the 5-year regression indicated.
In summary, ABC and XYZ differ significantly in their risk profiles. If added to a diversified portfolio, XYZ would likely contribute more to total volatility due to its higher residual risk and possibly rising beta.
# Define the known values
expected_return_fund <- 0.09 # 9%
expected_return_market <- 0.11 # 11%
risk_free_rate <- 0.03 # 3%
# Calculate implied beta
implied_beta <- (expected_return_fund - risk_free_rate) / (expected_return_market - risk_free_rate)
# Print result
implied_beta
## [1] 0.75
The concept of beta is most closely associated with: d. Systematic risk
Beta and standard deviation differ as risk measures in that beta measures: b. Only systematic risk, while standard deviation measures total risk.
When plotting portfolio R in the preceding table relative to the SML, portfolio R lies: d. Insufficient data given. (Need to know the risk-free rate)
When plotting portfolio R relative to the capital market line, portfolio R lies: d. Insufficient data given. (Need to know the risk-free rate)
According to the Capital Asset Pricing Model (CAPM), investors are only rewarded for taking on systematic risk, which is the portion of risk that cannot be eliminated through diversification. Since both Portfolio A and Portfolio B have the same beta of 1.0, they carry the same level of market risk. Therefore, investors should expect identical returns from both portfolios. Additionally, because both portfolios are well-diversified, the level of specific (or unsystematic) risk within individual securities is irrelevant — that risk has been effectively reduced through diversification.
The revised estimate of the stock's expected rate of return equals the original estimate plus the sum, across factors, of each sensitivity coefficient times the unexpected change in that factor:
# Inputs
expected_return <- 0.12 # 12%
beta_IP <- 1.0
beta_IR <- 0.5
IP_expected <- 0.03
IP_actual <- 0.05
IR_expected <- 0.05
IR_actual <- 0.08
# Calculate revised return
revised_return <- expected_return +
beta_IP * (IP_actual - IP_expected) +
beta_IR * (IR_actual - IR_expected)
# Output result as percentage
revised_return_percent <- revised_return * 100
cat("Revised expected return:", revised_return_percent, "%\n")
## Revised expected return: 15.5 %
# Define the coefficients in matrix form
A <- matrix(c(1.5, 2.0, 2.2, -0.2), nrow = 2, byrow = TRUE)
b <- c(31 - 6, 27 - 6)
# Solve the system of equations: A %*% x = b
risk_premiums <- solve(A, b)
# Display the results
names(risk_premiums) <- c("RP1", "RP2")
risk_premiums
## RP1 RP2
## 10 5
Thus, the expected return–beta relationship in this economy is: E(rp) = 6% + (β1 × 10%) + (β2 × 5%)
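A quick consistency check in R: plugging the recovered premiums back into each portfolio's factor betas should reproduce the original expected returns of 31% and 27%:
# Recover the two portfolios' expected returns from the fitted relationship
betas_p <- matrix(c(1.5, 2.0, 2.2, -0.2), nrow = 2, byrow = TRUE)
as.vector(6 + betas_p %*% risk_premiums) # 31 27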
The semistrong form of the efficient market hypothesis asserts that stock prices: b. Fully reflect all publicly available information.
Assume that a company announces an unexpectedly large cash dividend to its shareholders. In an efficient market without information leakage, one might expect: a. An abnormal price change at the announcement.
A “random walk” occurs when: c. Future price changes are uncorrelated with past price changes.
Two basic assumptions of technical analysis are that security prices adjust: d. Gradually to new information, and prices are determined by the interaction between supply and demand.
1. Mutual Fund Performance Studies: Most actively managed mutual funds do not outperform passive benchmarks after fees and expenses. This suggests that public information is already priced in, and managers cannot consistently beat the market.
2. Event Studies: Stock prices react quickly and efficiently to new public information (e.g., earnings announcements, mergers). This rapid adjustment is consistent with the semistrong form of EMH.
1. Market Anomalies (e.g., January Effect): Stocks, especially small caps, tend to earn abnormally high returns in January, suggesting predictable patterns exist.
2. Momentum Effect: Past winners tend to continue performing well in the short term, which contradicts EMH. This means past price movements may carry predictive power, violating the “random walk” notion.
1. Personal Preferences: Investors might prefer customized strategies or specific exposures (e.g., ESG, religious filters) not captured by index funds.
2. Tax Considerations: Active managers may employ tax-loss harvesting or time gains/losses to optimize after-tax returns.
3. Non-financial Goals: Some investors may derive personal satisfaction or entertainment from investing or researching stocks, regardless of performance.
Statement #3 shows mental accounting. Sampson insists that his income needs should be met only through interest and dividends, which is a sign of separating different types of returns into mental “buckets.” Investors using mental accounting treat dividends and capital gains separately, and may view losses in one area independently of gains or losses in another. This bias can cause a preference for dividend income over total return and prevent proper evaluation of the full investment picture.
Statement #6 reflects overconfidence, especially the illusion that Sampson can control outcomes better than he really can. His urge to pick stocks that don’t fit his overall strategy shows risk-taking behavior often seen in overconfident investors. This bias includes overestimating one’s ability to predict outcomes, underestimating uncertainty, and being too sure about decisions without enough evidence.
Statement #5 illustrates reference dependence. Sampson wants to hold onto losing investments and quickly sell winners based on their original purchase prices. This suggests he evaluates investments relative to what he paid, not based on current or future value. Instead of focusing on total returns, he reacts emotionally to gains or losses compared to a fixed reference point, which can distort decision-making.
Frost plans to sell his international investments only when their prices return to his purchase price, showing reference dependence. His decision relies not just on the future value of the investment but on recovering a past loss. This contrasts with standard financial thinking, which evaluates choices based solely on expected outcomes, not on previous prices.
Frost is also showing overconfidence by assuming that five years of good returns in Country XYZ guarantee future success. This confidence is not backed by strong statistical evidence. He also demonstrates asset segregation—evaluating the performance of Country XYZ in isolation rather than considering how it fits into his overall portfolio. In contrast, rational investors analyze how new investments affect total portfolio risk and return, and avoid drawing conclusions from short data samples (a bias known as the “law of small numbers”).
Frost’s distinction between his retirement fund and his speculative investments reflects mental accounting. He treats these as separate layers in his portfolio: one for safety and the other for high risk. He applies different attitudes toward risk to each “account.” Standard financial theory advises evaluating all investments together, looking at the risk and return of the entire portfolio and how each asset interacts with others. This avoids treating money differently based on arbitrary categories.
Even if the single-factor consumption-based CAPM (CCAPM)—which uses a portfolio that tracks consumption growth—performs better than the traditional CAPM, it might still miss important characteristics such as company size and value versus growth traits. These are captured by the SMB (Small Minus Big) and HML (High Minus Low book-to-market) factors in the Fama-French three-factor model. Therefore, a model that combines consumption patterns with the Fama-French factors is expected to offer a more complete and accurate explanation of stock returns than relying on consumption alone.
Wealth and consumption generally move in the same direction—when wealth increases, consumption tends to rise, and vice versa. As a result, periods of high market volatility often occur alongside periods of high consumption volatility.
The traditional CAPM explains returns based on how much an asset’s returns move with the overall market portfolio, which reflects total wealth. In contrast, the consumption-based CAPM looks at how an asset’s returns move with changes in consumption. Since wealth and consumption are correlated, both models can potentially explain return patterns reasonably well.
To illustrate this more formally: in the standard CAPM, the market price of risk is the expected excess market return divided by the variance of the market return. In the consumption-based CAPM, the price of risk is the expected excess market return divided by the covariance between the market return and consumption growth (g).
That covariance equals the correlation between the market return and consumption growth multiplied by their standard deviations: Cov(r_M, g) = ρ(r_M, g) × σ_M × σ_g.
Therefore, if the correlation between market returns and consumption growth stays roughly constant, periods of higher market volatility will coincide with higher consumption volatility (consistent with the wealth and consumption link noted above), and the market-based and consumption-based models should price risk similarly through economic fluctuations.
returns <- data.frame(
Year = 1:12,
Market = c(29.65, -11.91, 14.73, 27.68, 5.18, 25.97, 10.64, 1.02, 18.82, 23.92, -41.61, -6.64),
A = c(33.88, -49.87, 65.14, 14.46, 15.67, -32.17, -31.55, -23.79, -4.59, -8.03, 78.22, 4.75),
B = c(-25.20, 24.70, -25.04, -38.64, 61.93, 44.94, -74.65, 47.02, 28.69, 48.61, -85.02, 42.95),
C = c(36.48, -25.11, 18.91, -23.31, 63.95, -19.56, 50.18, -42.28, -0.54, 23.65, -0.79, -48.60),
D = c(42.89, -54.39, -39.86, -0.72, -32.82, 69.42, 74.52, 28.61, 2.32, 26.26, -68.70, 26.27),
E = c(-39.89, 44.92, -3.91, -3.21, 44.26, 90.43, 15.38, -17.64, 42.36, -3.65, -85.71, 13.24),
F = c(39.67, -54.33, -5.69, 92.39, -42.96, 76.72, 21.95, 28.83, 18.93, 23.31, -45.64, -34.34),
G = c(74.57, -79.76, 26.73, -3.82, 101.67, 1.72, -43.95, 98.01, -2.45, 15.36, 2.27, -54.47),
H = c(40.22, -71.58, 14.49, 13.74, 24.24, 77.22, -13.40, 28.12, 37.65, 80.59, -72.47, -1.50),
I = c(90.19, -26.64, 18.14, 0.09, 8.98, 72.38, 28.95, 39.41, 94.67, 52.51, -80.26, -24.46)
)
# Load libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::select() masks MASS::select()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(broom)
# Set up
stock_names <- colnames(returns)[-c(1, 2)] # All stock names
market <- returns$Market
# --- First-Pass Regressions ---
first_pass <- lapply(stock_names, function(stock) {
model <- lm(returns[[stock]] ~ market)
summary_model <- summary(model)
data.frame(
Stock = stock,
R_Square = summary_model$r.squared,
Alpha = coef(model)[1],
Beta = coef(model)[2],
t_Alpha = summary_model$coefficients[1, "t value"],
t_Beta = summary_model$coefficients[2, "t value"]
)
})
# Combine all into one data frame
first_pass_results <- bind_rows(first_pass)
# --- Add average excess returns ---
first_pass_results$Average_Excess_Return <- sapply(stock_names, function(s) mean(returns[[s]]))
# Show first-pass results
print("First-Pass Regression Results:")
## [1] "First-Pass Regression Results:"
print(first_pass_results)
## Stock R_Square Alpha Beta t_Alpha t_Beta
## (Intercept)...1 A 0.06218356 8.9969458 -0.4704295 0.72600895 -0.8142896
## (Intercept)...2 B 0.05740334 -0.6311024 0.5937735 -0.03866759 0.7803791
## (Intercept)...3 C 0.05677479 -0.6362193 0.4167741 -0.05521267 0.7758362
## (Intercept)...4 D 0.37020013 -5.0505884 1.3792413 -0.41388647 2.4244710
## (Intercept)...5 E 0.16801011 0.7289817 0.9013055 0.05358162 1.4210478
## (Intercept)...6 F 0.59449351 -4.5275764 1.7770233 -0.45478885 3.8289055
## (Intercept)...7 G 0.05748884 5.9364702 0.6633387 0.32583966 0.7809955
## (Intercept)...8 H 0.67056405 -2.4102498 1.9111647 -0.26525346 4.5116420
## (Intercept)...9 I 0.69801510 5.9182355 2.0825159 0.63695127 4.8077270
## Average_Excess_Return
## (Intercept)...1 5.176667
## (Intercept)...2 4.190833
## (Intercept)...3 2.748333
## (Intercept)...4 6.150000
## (Intercept)...5 8.048333
## (Intercept)...6 9.903333
## (Intercept)...7 11.323333
## (Intercept)...8 13.110000
## (Intercept)...9 22.830000
# --- Second-Pass Regression (SML Test) ---
second_pass_model <- lm(Average_Excess_Return ~ Beta, data = first_pass_results)
summary(second_pass_model)
##
## Call:
## lm(formula = Average_Excess_Return ~ Beta, data = first_pass_results)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9524 -3.2697 -0.7612 3.7024 8.0668
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.923 2.542 1.543 0.167
## Beta 5.205 1.966 2.648 0.033 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.623 on 7 degrees of freedom
## Multiple R-squared: 0.5005, Adjusted R-squared: 0.4291
## F-statistic: 7.013 on 1 and 7 DF, p-value: 0.03303
The hypotheses for the second-pass regression of the SML are: (1) the intercept is zero; and (2) the slope equals the average excess return on the market index portfolio.
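Since the table's returns are already excess returns, the hypothesized slope is just the sample mean of the Market column:
mean(returns$Market) # about 8.12% per year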
# Load necessary libraries
library(dplyr)
# === Step 1: Input data ===
returns <- data.frame(
Year = 1:12,
Market = c(29.65, -11.91, 14.73, 27.68, 5.18, 25.97, 10.64, 1.02, 18.82, 23.92, -41.61, -6.64),
A = c(33.88, -49.87, 65.14, 14.46, 15.67, -32.17, -31.55, -23.79, -4.59, -8.03, 78.22, 4.75),
B = c(-25.20, 24.70, -25.04, -38.64, 61.93, 44.94, -74.65, 47.02, 28.69, 48.61, -85.02, 42.95),
C = c(36.48, -25.11, 18.91, -23.31, 63.95, -19.56, 50.18, -42.28, -0.54, 23.65, -0.79, -48.60),
D = c(42.89, -54.39, -39.86, -0.72, -32.82, 69.42, 74.52, 28.61, 2.32, 26.26, -68.70, 26.27),
E = c(-39.89, 44.92, -3.91, -3.21, 44.26, 90.43, 15.38, -17.64, 42.36, -3.65, -85.71, 13.24),
F = c(39.67, -54.33, -5.69, 92.39, -42.96, 76.72, 21.95, 28.83, 18.93, 23.31, -45.64, -34.34),
G = c(74.57, -79.76, 26.73, -3.82, 101.67, 1.72, -43.95, 98.01, -2.45, 15.36, 2.27, -54.47),
H = c(40.22, -71.58, 14.49, 13.74, 24.24, 77.22, -13.40, 28.12, 37.65, 80.59, -72.47, -1.50),
I = c(90.19, -26.64, 18.14, 0.09, 8.98, 72.38, 28.95, 39.41, 94.67, 52.51, -80.26, -24.46)
)
# === Step 2: Calculate excess returns ===
excess_returns <- returns %>%
mutate(across(A:I, ~ .x - Market, .names = "ER_{col}")) %>%
select(Year, starts_with("ER_"))
# === Step 3: First-pass regression to estimate betas ===
betas <- sapply(letters[1:9], function(asset) {
fit <- lm(returns[[toupper(asset)]] ~ returns$Market)
coef(fit)[2]
})
# Calculate average excess returns
avg_asset_returns <- sapply(returns[, 3:11], mean)
avg_market_return <- mean(returns$Market)
avg_excess_returns <- avg_asset_returns - avg_market_return
second_pass_data <- data.frame(
Asset = LETTERS[1:9],
Beta = betas,
AvgExcessReturn = avg_excess_returns
)
# === Step 4: Second-pass regression ===
second_pass_fit <- lm(AvgExcessReturn ~ Beta, data = second_pass_data)
summary_fit <- summary(second_pass_fit)
cat("Second-Pass Regression Table:\n")
## Second-Pass Regression Table:
print(second_pass_data)
## Asset Beta AvgExcessReturn
## a.returns$Market A -0.4704295 -2.944167
## b.returns$Market B 0.5937735 -3.930000
## c.returns$Market C 0.4167741 -5.372500
## d.returns$Market D 1.3792413 -1.970833
## e.returns$Market E 0.9013055 -0.072500
## f.returns$Market F 1.7770233 1.782500
## g.returns$Market G 0.6633387 3.202500
## h.returns$Market H 1.9111647 4.989167
## i.returns$Market I 2.0825159 14.709167
cat("\nRegression Summary:\n")
##
## Regression Summary:
print(summary_fit)
##
## Call:
## lm(formula = AvgExcessReturn ~ Beta, data = second_pass_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9524 -3.2697 -0.7612 3.7024 8.0668
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.198 2.542 -1.652 0.143
## Beta 5.205 1.966 2.648 0.033 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.623 on 7 degrees of freedom
## Multiple R-squared: 0.5005, Adjusted R-squared: 0.4291
## F-statistic: 7.013 on 1 and 7 DF, p-value: 0.03303
As we saw in the chapter, the intercept is too high (3.92% per year instead of 0) and the slope is too flat (5.21% instead of the predicted value, which equals the sample-average market risk premium: rM − rf = 8.12%). However, the intercept is not significantly greater than zero (its t-statistic of 1.54 is less than 2), and the slope is not significantly different from its theoretical value (the t-statistic for that hypothesis is −1.48, computed at the end of this section). This lack of statistical significance is probably due to the small sample size.
Arranging the securities into three portfolios based on the betas from the SCL estimates, the first-pass input data are:
# === Step 5: Portfolio groupings ===
portfolio_data <- returns %>%
mutate(
ABC = rowMeans(select(., A, B, C)),
DEG = rowMeans(select(., D, E, G)),
FHI = rowMeans(select(., F, H, I))
) %>%
select(Year, ABC, DEG, FHI)
# Portfolio statistics
portfolio_avg <- colMeans(portfolio_data[,-1])
portfolio_sd <- apply(portfolio_data[,-1], 2, sd)
# === Step 6: Display results ===
cat("\nPortfolio Returns by Year:\n")
##
## Portfolio Returns by Year:
print(portfolio_data)
## Year ABC DEG FHI
## 1 1 15.053333 25.856667 56.693333
## 2 2 -16.760000 -29.743333 -50.850000
## 3 3 19.670000 -5.680000 8.980000
## 4 4 -15.830000 -2.583333 35.406667
## 5 5 47.183333 37.703333 -3.246667
## 6 6 -2.263333 53.856667 75.440000
## 7 7 -18.673333 15.316667 12.500000
## 8 8 -6.350000 36.326667 32.120000
## 9 9 7.853333 14.076667 50.416667
## 10 10 21.410000 12.656667 52.136667
## 11 11 -2.530000 -50.713333 -66.123333
## 12 12 -0.300000 -4.986667 -20.100000
cat("\nPortfolio Averages:\n")
##
## Portfolio Averages:
print(portfolio_avg)
## ABC DEG FHI
## 4.038611 8.507222 15.281111
cat("\nPortfolio Standard Deviations:\n")
##
## Portfolio Standard Deviations:
print(portfolio_sd)
## ABC DEG FHI
## 19.29727 29.47273 43.96076
# Optional: t-statistic for H0: slope = 8.12 (the sample-average market excess return)
slope_h0 <- 8.12
slope <- coef(summary_fit)["Beta", "Estimate"]
se <- coef(summary_fit)["Beta", "Std. Error"]
t_for_812 <- (slope - slope_h0) / se
cat(sprintf("\nt statistic for β = 8.12: %.2f\n", t_for_812))
##
## t statistic for β = 8.12: -1.48
When evaluating a portfolio manager’s performance, I compare the return of the managed portfolio to the return that would be expected from an unmanaged portfolio with the same level of risk. This expected return comes from the Security Market Line (SML):
Expected Return = Risk-Free Rate + Beta × (Market Return − Risk-Free Rate)
Here, the risk-free rate is a baseline return (e.g., from government bonds), the market return is the average return of the entire market (usually represented by an index), and beta measures the portfolio’s sensitivity to market movements. The unmanaged portfolio, often a broad market index like the S&P 500, serves as the benchmark for the comparison.
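As a toy illustration in R (reusing the 3% risk-free rate and 11% market return assumed earlier in this document):
# SML benchmark return for a given beta, under the assumed rates
sml_return <- function(beta, rf = 0.03, rm = 0.11) rf + beta * (rm - rf)
sml_return(c(0.75, 1.00, 1.20)) # 0.090 0.110 0.126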
However, this method can run into issues known as benchmark errors. This happens when the chosen benchmark (like the S&P 500) isn’t actually a mean-variance efficient portfolio. In other words, if the benchmark itself isn’t optimized for the best trade-off between risk and return, then using it as a performance standard may not fairly evaluate how well the managed portfolio is doing.
In my graph, I would include two efficient frontiers: one based on actual historical returns, and another that represents expected returns before the fact (ex-ante), which we can’t observe directly. This illustrates how the CML and SML from real data often diverge from what the CAPM predicts. The theoretical lines based on expectations, however, align more closely with the CAPM. This contrast highlights the gap between real-world data and theoretical models.
This ultimately depends on what I already believe. If a manager has a strong and consistent track record, I might be convinced that they are genuinely skilled. However, I also recognize that many people try to outperform the market, so just by chance, a few will appear successful. This means I need to be cautious before attributing outperformance to true skill rather than luck.
This raises the question of whether the CAPM is even testable. If the benchmark portfolio I use isn’t perfectly efficient, then any test of the return-beta relationship might be flawed. Based on Roll’s critique, even a small inefficiency in the benchmark can completely undermine the results. This makes me think that the real test of the CAPM’s usefulness might not be theoretical at all—it’s whether anyone can consistently outperform a passive market strategy in practice.
# Load additional required libraries
library(tidyquant)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## ── Attaching core tidyquant packages ─────────────────────── tidyquant 1.0.11 ──
## ✔ PerformanceAnalytics 2.0.8 ✔ TTR 0.24.4
## ✔ quantmod 0.4.27 ✔ xts 0.14.1
## ── Conflicts ────────────────────────────────────────── tidyquant_conflicts() ──
## ✖ zoo::as.Date() masks base::as.Date()
## ✖ zoo::as.Date.numeric() masks base::as.Date.numeric()
## ✖ dplyr::filter() masks stats::filter()
## ✖ xts::first() masks dplyr::first()
## ✖ dplyr::lag() masks stats::lag()
## ✖ xts::last() masks dplyr::last()
## ✖ PerformanceAnalytics::legend() masks graphics::legend()
## ✖ dplyr::select() masks MASS::select()
## ✖ quantmod::summary() masks base::summary()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(xts)
library(PerformanceAnalytics)
library(quadprog)
library(lubridate)
# Set ticker symbols
etf_tickers <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR", "GLD")
# Download data from Yahoo Finance
etf_data <- tq_get(etf_tickers, from = "2010-01-01", to = Sys.Date())
# Calculate weekly simple returns
etf_weekly_returns <- etf_data %>%
group_by(symbol) %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "weekly",
col_rename = "weekly_return",
type = "arithmetic")
# Calculate monthly simple returns
etf_monthly_returns <- etf_data %>%
group_by(symbol) %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "monthly",
col_rename = "monthly_return",
type = "arithmetic")
# Wide format tibble for monthly returns
monthly_tibble <- etf_monthly_returns %>%
  pivot_wider(names_from = symbol, values_from = monthly_return)
# Load the Fama-French monthly factor data (local CSV, header rows skipped)
ff_data <- read.csv("F-F_Research_Data_Factors.csv", skip = 3)
# Keep only valid monthly rows and rename columns
ff_data <- ff_data %>%
rename(date_raw = X) %>%
filter(!is.na(RF), !date_raw %in% c(""), nchar(as.character(date_raw)) == 6) %>%
mutate(
date = as.Date(paste0(substr(date_raw, 1, 4), "-", substr(date_raw, 5, 6), "-01")),
Mkt_RF = as.numeric(Mkt.RF) / 100,
SMB = as.numeric(SMB) / 100,
HML = as.numeric(HML) / 100,
RF = as.numeric(RF) / 100
) %>%
select(date, Mkt_RF, SMB, HML, RF) %>%
filter(!is.na(date))
print("FF data structure:")
## [1] "FF data structure:"
head(ff_data)
## date Mkt_RF SMB HML RF
## 1 1926-07-01 0.0289 -0.0255 -0.0239 0.0022
## 2 1926-08-01 0.0264 -0.0114 0.0381 0.0025
## 3 1926-09-01 0.0038 -0.0136 0.0005 0.0023
## 4 1926-10-01 -0.0327 -0.0014 0.0082 0.0032
## 5 1926-11-01 0.0254 -0.0011 -0.0061 0.0031
## 6 1926-12-01 0.0262 -0.0007 0.0006 0.0028
# ============================================================================
# Merge the monthly ETF returns with the Fama-French factors
# ============================================================================
combined_data <- monthly_tibble %>%
left_join(ff_data, by = "date") %>%
filter(!is.na(Mkt_RF)) %>% # Remove rows where FF data is missing
arrange(date)
print("Combined data structure:")
## [1] "Combined data structure:"
head(combined_data)
## # A tibble: 0 × 13
## # ℹ 13 variables: date <date>, SPY <dbl>, QQQ <dbl>, EEM <dbl>, IWM <dbl>,
## # EFA <dbl>, TLT <dbl>, IYR <dbl>, GLD <dbl>, Mkt_RF <dbl>, SMB <dbl>,
## # HML <dbl>, RF <dbl>
print(paste("Combined data has", nrow(combined_data), "observations"))
## [1] "Combined data has 0 observations"