The Fama-MacBeth (1973) procedure estimates factor risk premia with standard errors that are robust to cross-sectional correlation in returns. It is usually preferred when the panel has many cross-sectional units but a comparatively short time series.
| Step | Description |
|---|---|
| Step 0 | Time Series Regression — Run N time-series regressions (one per asset). Regress each asset’s return on risk factors to obtain factor loadings (betas). |
| Step 1 | Cross-Sectional Regression — Run T cross-sectional regressions (one per time period). Regress asset returns on the betas from Step 0. |
| Step 2 | Average the Coefficients — Take the time-series average of the cross-sectional slope estimates and test if they are significantly different from zero. |
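
The three steps in the table can be written out for the three-factor case used here. The notation is an assumption made for illustration and does not come from the source: $r_{i,t}$ is the return `ri` of asset $i$ on date $t$, $\hat\beta$ are the Step 0 loadings, and $\hat\lambda_{k,t}$ is the Step 1 slope on factor $k$ at date $t$.

$$
\begin{aligned}
\text{Step 0 (each asset } i\text{):}\quad & r_{i,t} = \alpha_i + \beta_{i,MKT}\,MKT_t + \beta_{i,SMB}\,SMB_t + \beta_{i,HML}\,HML_t + \varepsilon_{i,t}, \qquad t = 1,\dots,T \\
\text{Step 1 (each date } t\text{):}\quad & r_{i,t} = \lambda_{0,t} + \lambda_{MKT,t}\,\hat\beta_{i,MKT} + \lambda_{SMB,t}\,\hat\beta_{i,SMB} + \lambda_{HML,t}\,\hat\beta_{i,HML} + u_{i,t}, \qquad i = 1,\dots,N \\
\text{Step 2 (each factor } k\text{):}\quad & \hat\lambda_k = \frac{1}{T}\sum_{t=1}^{T}\hat\lambda_{k,t}, \qquad s_k^2 = \frac{1}{T-1}\sum_{t=1}^{T}\left(\hat\lambda_{k,t}-\hat\lambda_k\right)^2, \qquad \text{t-stat}_k = \frac{\hat\lambda_k}{s_k/\sqrt{T}}
\end{aligned}
$$

The Step 2 statistic is an ordinary one-sample t-test on the time series of Step 1 slopes.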
## symbol date ri MKT SMB HML
## 1 AAPL 4-Jan-11 0.0052062641 -0.001313890 -0.0065 0.0008
## 2 AAPL 5-Jan-11 0.0081462879 0.004994670 0.0018 0.0013
## 3 AAPL 6-Jan-11 -0.0008082435 -0.002125228 0.0001 -0.0025
## 4 AAPL 7-Jan-11 0.0071360567 -0.001846505 0.0022 -0.0006
## 5 AAPL 10-Jan-11 0.0186572890 -0.001377275 0.0041 0.0039
## 6 AAPL 11-Jan-11 -0.0023681840 0.003718222 0.0016 0.0036
The dataset contains daily returns for 6 stocks (AAPL, FORD, GE, GM, IBM, MSFT) from 4-Jan-11 to 31-Dec-15, along with the three Fama-French factors: MKT, SMB, and HML.
## Rows: 7,542
## Columns: 6
## $ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
## $ date <chr> "4-Jan-11", "5-Jan-11", "6-Jan-11", "7-Jan-11", "10-Jan-11", "1…
## $ ri <dbl> 0.0052062641, 0.0081462879, -0.0008082435, 0.0071360567, 0.0186…
## $ MKT <dbl> -0.0013138901, 0.0049946699, -0.0021252276, -0.0018465050, -0.0…
## $ SMB <dbl> -0.0065, 0.0018, 0.0001, 0.0022, 0.0041, 0.0016, 0.0031, -0.002…
## $ HML <dbl> 0.0008, 0.0013, -0.0025, -0.0006, 0.0039, 0.0036, 0.0000, -0.00…
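
The `date` column is stored as character (`<chr>`), so it has to be converted before sorting or plotting by date. A minimal sketch, assuming the day-month-year layout seen in the sample rows and an English locale (the `%b` month abbreviation is locale-dependent); `raw_dates` and `parsed` are illustrative names:

```r
# Example strings copied from the `date` column shown above
raw_dates <- c("4-Jan-11", "5-Jan-11", "10-Jan-11")

# %d = day of month, %b = abbreviated month name (locale-dependent), %y = two-digit year
parsed <- as.Date(raw_dates, format = "%d-%b-%y")

# As text, "10-Jan-11" sorts ahead of "4-Jan-11"; as Date the order is chronological
sort(raw_dates)
sort(parsed)
```

The per-date regressions later do not depend on row order, but any plot or lagged calculation does.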
For each stock, regress its return (ri) on the three Fama-French factors to obtain the factor loadings (betas).
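library(tidyverse)  # dplyr, tidyr, purrr and ggplot2: pipe, nest/unnest, map, pivot_wider
library(broom)      # tidy() turns an lm fit into a coefficient table

step0 <- data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%        # one nested data frame per symbol
  mutate(estimates = map(
    data,
    ~tidy(lm(ri ~ MKT + SMB + HML, data = .x))       # time-series regression per stock
  )) %>%
  unnest(estimates) %>%
  select(symbol, estimate, term) %>%
  pivot_wider(names_from = term,
              values_from = estimate) %>%
  select(symbol,
         b_MKT = MKT,
         b_HML = HML,
         b_SMB = SMB)
step0
## # A tibble: 6 × 4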
## symbol b_MKT b_HML b_SMB
## <chr> <dbl> <dbl> <dbl>
## 1 AAPL 0.900 -0.0578 0.0685
## 2 FORD 0.513 0.138 -0.264
## 3 GE 1.08 0.0902 0.0994
## 4 GM 1.29 -0.0222 0.00390
## 5 IBM 0.817 -0.0121 0.0336
## 6 MSFT 0.966 -0.0641 0.0582
These are the estimated factor betas for each stock over the full sample period. Merging them back onto the daily returns by symbol repeats each stock's betas on every date:
## symbol date ri MKT SMB HML b_MKT
## 1 AAPL 4-Jan-11 0.0052062641 -0.001313890 -0.0065 0.0008 0.9000063
## 2 AAPL 5-Jan-11 0.0081462879 0.004994670 0.0018 0.0013 0.9000063
## 3 AAPL 6-Jan-11 -0.0008082435 -0.002125228 0.0001 -0.0025 0.9000063
## 4 AAPL 7-Jan-11 0.0071360567 -0.001846505 0.0022 -0.0006 0.9000063
## 5 AAPL 10-Jan-11 0.0186572890 -0.001377275 0.0041 0.0039 0.9000063
## 6 AAPL 11-Jan-11 -0.0023681840 0.003718222 0.0016 0.0036 0.9000063
## b_HML b_SMB
## 1 -0.05782126 0.06853513
## 2 -0.05782126 0.06853513
## 3 -0.05782126 0.06853513
## 4 -0.05782126 0.06853513
## 5 -0.05782126 0.06853513
## 6 -0.05782126 0.06853513
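
The source does not show the code that produced the merged table above. One way to obtain that shape is a many-to-one join on `symbol`. The sketch below uses base R `merge()` on a few values copied from the printed output (betas are the rounded figures); the object names `returns`, `betas`, and `merged` are illustrative assumptions.

```r
# First rows of the returns panel, copied from the output above (AAPL only)
returns <- data.frame(
  symbol = c("AAPL", "AAPL", "AAPL"),
  date   = c("4-Jan-11", "5-Jan-11", "6-Jan-11"),
  ri     = c(0.0052062641, 0.0081462879, -0.0008082435)
)

# Step 0 betas as printed (rounded) for two of the six stocks
betas <- data.frame(
  symbol = c("AAPL", "GE"),
  b_MKT  = c(0.900, 1.08),
  b_HML  = c(-0.0578, 0.0902),
  b_SMB  = c(0.0685, 0.0994)
)

# Many-to-one join: every AAPL date receives the same AAPL betas
merged <- merge(returns, betas, by = "symbol", all.x = TRUE)
merged
```

In a tidyverse pipeline, `dplyr::left_join(returns, betas, by = "symbol")` gives the same columns.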
For each date, regress asset returns on the betas estimated in Step 0 to obtain cross-sectional risk premia estimates.
step1 <- data %>%
  left_join(step0, by = "symbol") %>%   # attach betas; step0 alone has no date or ri column
  nest(data = c(symbol, ri, b_MKT, b_SMB, b_HML)) %>%   # one nested data frame per date
  mutate(estimates = map(
    data,
    ~tidy(lm(ri ~ b_MKT + b_SMB + b_HML, data = .x))    # cross-sectional regression per date
  )) %>%
  unnest(estimates) %>%
  select(date, estimate, term) %>%
  pivot_wider(names_from = term,
              values_from = estimate) %>%
  select(date, b_MKT, b_HML, b_SMB)
head(step1)
## # A tibble: 6 × 4
## date b_MKT b_HML b_SMB
## <chr> <dbl> <dbl> <dbl>
## 1 4-Jan-11 0.0416 0.0574 -0.0255
## 2 5-Jan-11 -0.0113 0.0628 -0.158
## 3 6-Jan-11 0.0373 -0.173 0.00703
## 4 7-Jan-11 0.0127 -0.0642 0.0323
## 5 10-Jan-11 -0.0366 0.0586 0.0171
## 6 11-Jan-11 0.00409 0.0899 -0.0954
Average the cross-sectional coefficients and test whether each risk premium is significantly different from zero.
##
## One Sample t-test
##
## data: step1$b_MKT
## t = -0.37879, df = 1256, p-value = 0.7049
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.002546371 0.001722208
## sample estimates:
## mean of x
## -0.0004120813
##
## One Sample t-test
##
## data: step1$b_SMB
## t = 0.97712, df = 1256, p-value = 0.3287
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.003711466 0.011076953
## sample estimates:
## mean of x
## 0.003682744
##
## One Sample t-test
##
## data: step1$b_HML
## t = -0.18044, df = 1256, p-value = 0.8568
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.005541205 0.004607776
## sample estimates:
## mean of x
## -0.0004667146
make_row <- function(x, name) {
  tt <- t.test(x, mu = 0)
  data.frame(
    Factor = name,
    Mean = mean(x, na.rm = TRUE),
    StdDev = sd(x, na.rm = TRUE),
    t_statistic = as.numeric(tt$statistic),
    p_value = tt$p.value,
    Significant = ifelse(tt$p.value < 0.05, "Yes *", "No"),
    stringsAsFactors = FALSE
  )
}
summary_table <- bind_rows(
  make_row(step1$b_MKT, "b_MKT"),
  make_row(step1$b_SMB, "b_SMB"),
  make_row(step1$b_HML, "b_HML")
)
knitr::kable(summary_table, digits = 5,
             caption = "Fama-MacBeth Risk Premia Estimates",
             col.names = c("Factor", "Mean", "Std Dev", "t-Statistic", "p-Value", "Significant (5%)"))

| Factor | Mean | Std Dev | t-Statistic | p-Value | Significant (5%) |
|---|---|---|---|---|---|
| b_MKT | -0.00041 | 0.03857 | -0.37879 | 0.70491 | No |
| b_SMB | 0.00368 | 0.13363 | 0.97712 | 0.32870 | No |
| b_HML | -0.00047 | 0.09171 | -0.18044 | 0.85684 | No |
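
The t-statistics in the table are the time-series mean of the daily slopes divided by its standard error, sd / sqrt(T). As a check on that arithmetic, the sketch below recomputes the market premium's statistic and confidence interval from the numbers reported above. The standard deviation is the rounded table value, so agreement is only approximate, and the variable names are illustrative.

```r
# Values reported above for the market premium (b_MKT)
mean_lambda <- -0.0004120813   # "mean of x" from the t-test output
sd_lambda   <- 0.03857         # Std Dev from the summary table (rounded)
n_periods   <- 1256 + 1        # df + 1 = number of cross-sectional regressions

se_lambda <- sd_lambda / sqrt(n_periods)   # standard error of the time-series mean
t_stat    <- mean_lambda / se_lambda       # Fama-MacBeth t-statistic

# 95% confidence interval, using the t distribution with T - 1 degrees of freedom
ci <- mean_lambda + c(-1, 1) * qt(0.975, df = n_periods - 1) * se_lambda

round(t_stat, 4)
round(ci, 6)
```

Both results should agree with the `t.test()` output for `step1$b_MKT` up to the rounding of the standard deviation.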
step1 %>%
  pivot_longer(cols = c(b_MKT, b_SMB, b_HML),
               names_to = "Factor",
               values_to = "Estimate") %>%
  ggplot(aes(x = as.Date(date, format = "%d-%b-%y"),
             y = Estimate, colour = Factor)) +
  geom_line(alpha = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "black") +
  facet_wrap(~Factor, scales = "free_y", ncol = 1) +
  labs(
    title = "Fama-MacBeth Cross-Sectional Risk Premia Over Time",
    subtitle = "Daily estimates from cross-sectional regressions",
    x = "Date",
    y = "Risk Premium Estimate",
    colour = "Factor"
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none",
        strip.text = element_text(face = "bold"))

The t-tests assess whether each factor’s average cross-sectional price of risk is statistically different from zero, which is the central question of the Fama-MacBeth methodology. In this sample, none of the three premia is significant at the 5% level.
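
To see the three steps working end to end on data where the answer is known, here is a base R sketch on simulated single-factor returns. It is not the tutorial's dataset or code. Every name and parameter value (`n_assets`, `n_periods`, `true_lambda`, the noise levels) is a hypothetical choice for illustration.

```r
set.seed(1)
n_assets    <- 25
n_periods   <- 500
true_lambda <- 0.01                       # assumed mean factor return (the premium)
true_beta   <- runif(n_assets, 0.5, 1.5)  # assumed true loadings

# Simulated factor and a T x N matrix of returns: r_it = beta_i * mkt_t + noise
mkt     <- rnorm(n_periods, mean = true_lambda, sd = 0.05)
noise   <- matrix(rnorm(n_periods * n_assets, sd = 0.02), n_periods, n_assets)
returns <- outer(mkt, true_beta) + noise

# Step 0: one time-series regression per asset (columns) to estimate betas
beta_hat <- apply(returns, 2, function(r) coef(lm(r ~ mkt))[2])

# Step 1: one cross-sectional regression per period (rows) on the estimated betas
lambda_t <- apply(returns, 1, function(r) coef(lm(r ~ beta_hat))[2])

# Step 2: average the slopes and form the t-statistic
lambda_hat <- mean(lambda_t)
se_lambda  <- sd(lambda_t) / sqrt(n_periods)
t_stat     <- lambda_hat / se_lambda

c(estimate = lambda_hat, std_error = se_lambda, t = t_stat)
```

With many periods the estimate should land close to the sample mean of the simulated factor, and `t_stat` matches `t.test(lambda_t)$statistic` by construction.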
Replication based on the tutorial video: Fama-MacBeth Regression in R