The Fama-MacBeth (1973) procedure is a two-pass regression method widely used in empirical asset pricing to estimate factor risk premia. Because inference rests on the time series of period-by-period cross-sectional estimates, its standard errors are robust to correlation across assets within a period, which makes it especially useful for panels with more cross-sectional units than time-series observations.
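The procedure has three steps: time-series regressions that estimate each asset's betas (Step 0), period-by-period cross-sectional regressions of returns on those betas (Step 1), and a time-series average of the cross-sectional coefficients with t-tests (Step 2). The example data are previewed first: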
## symbol date ri MKT SMB HML
## 1 AAPL 4-Jan-11 0.0052062641 -0.001313890 -0.0065 0.0008
## 2 AAPL 5-Jan-11 0.0081462879 0.004994670 0.0018 0.0013
## 3 AAPL 6-Jan-11 -0.0008082435 -0.002125228 0.0001 -0.0025
## 4 AAPL 7-Jan-11 0.0071360567 -0.001846505 0.0022 -0.0006
## 5 AAPL 10-Jan-11 0.0186572890 -0.001377275 0.0041 0.0039
## 6 AAPL 11-Jan-11 -0.0023681840 0.003718222 0.0016 0.0036
## 7 AAPL 12-Jan-11 0.0081042033 0.008967269 0.0031 0.0000
## 8 AAPL 13-Jan-11 0.0036515722 -0.001712249 -0.0026 -0.0044
## 9 AAPL 14-Jan-11 0.0080672745 0.007357425 -0.0010 -0.0073
## 10 AAPL 18-Jan-11 -0.0227252300 0.001375442 0.0056 0.0015
## Rows: 7,542
## Columns: 6
## $ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
## $ date <chr> "4-Jan-11", "5-Jan-11", "6-Jan-11", "7-Jan-11", "10-Jan-11", "1…
## $ ri <dbl> 0.0052062641, 0.0081462879, -0.0008082435, 0.0071360567, 0.0186…
## $ MKT <dbl> -0.0013138901, 0.0049946699, -0.0021252276, -0.0018465050, -0.0…
## $ SMB <dbl> -0.0065, 0.0018, 0.0001, 0.0022, 0.0041, 0.0016, 0.0031, -0.002…
## $ HML <dbl> 0.0008, 0.0013, -0.0025, -0.0006, 0.0039, 0.0036, 0.0000, -0.00…
## symbol date ri
## Length:7542 Length:7542 Min. :-0.3908663
## Class :character Class :character 1st Qu.:-0.0087263
## Mode :character Mode :character Median : 0.0000000
## Mean : 0.0002109
## 3rd Qu.: 0.0093507
## Max. : 0.9614112
## MKT SMB HML
## Min. :-0.0689583 Min. :-1.660e-02 Min. :-0.01490
## 1st Qu.:-0.0040125 1st Qu.:-3.100e-03 1st Qu.:-0.00260
## Median : 0.0005438 Median : 1.000e-04 Median : 0.00000
## Mean : 0.0003774 Mean : 2.227e-06 Mean : 0.00013
## 3rd Qu.: 0.0052641 3rd Qu.: 3.100e-03 3rd Qu.: 0.00260
## Max. : 0.0463174 Max. : 2.490e-02 Max. : 0.02250
Step 0: for each stock/asset, regress its returns (ri) on the three Fama-French factors, namely the market excess return (MKT), Small-Minus-Big (SMB), and High-Minus-Low (HML), over all time periods.
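In equation form, the first-pass regression described above can be written as below. The notation ($\alpha_i$, $\beta_{i,k}$, $\varepsilon_{i,t}$) is an assumption introduced here for illustration. The source does not say whether ri is a raw or an excess return, so $r_{i,t}$ simply stands for the ri column.

$$
r_{i,t} = \alpha_i + \beta_{i,\mathrm{MKT}}\,\mathrm{MKT}_t + \beta_{i,\mathrm{SMB}}\,\mathrm{SMB}_t + \beta_{i,\mathrm{HML}}\,\mathrm{HML}_t + \varepsilon_{i,t}, \qquad t = 1, \dots, T
$$

One regression is run per asset $i$, and the fitted slopes $\hat\beta_{i,\mathrm{MKT}}$, $\hat\beta_{i,\mathrm{SMB}}$, $\hat\beta_{i,\mathrm{HML}}$ correspond to the columns b_MKT, b_SMB, and b_HML.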
library(tidyverse)  # dplyr, tidyr, purrr, ggplot2
library(broom)      # tidy() for lm objects

# First pass: one time-series regression per asset
step0 <- data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%
  mutate(estimates = map(
    data,
    ~ tidy(lm(ri ~ MKT + SMB + HML, data = .x))
  )) %>%
  unnest(estimates) %>%
  select(symbol, estimate, term) %>%
  pivot_wider(names_from = term,
              values_from = estimate) %>%
  select(symbol,
         b_MKT = MKT,
         b_HML = HML,
         b_SMB = SMB)
# View estimated betas
step0
## # A tibble: 6 × 4
## symbol b_MKT b_HML b_SMB
## <chr> <dbl> <dbl> <dbl>
## 1 AAPL 0.900 -0.0578 0.0685
## 2 FORD 0.513 0.138 -0.264
## 3 GE 1.08 0.0902 0.0994
## 4 GM 1.29 -0.0222 0.00390
## 5 IBM 0.817 -0.0121 0.0336
## 6 MSFT 0.966 -0.0641 0.0582
## symbol date ri MKT SMB HML b_MKT
## 1 AAPL 4-Jan-11 0.0052062641 -0.001313890 -0.0065 0.0008 0.9000063
## 2 AAPL 5-Jan-11 0.0081462879 0.004994670 0.0018 0.0013 0.9000063
## 3 AAPL 6-Jan-11 -0.0008082435 -0.002125228 0.0001 -0.0025 0.9000063
## 4 AAPL 7-Jan-11 0.0071360567 -0.001846505 0.0022 -0.0006 0.9000063
## 5 AAPL 10-Jan-11 0.0186572890 -0.001377275 0.0041 0.0039 0.9000063
## 6 AAPL 11-Jan-11 -0.0023681840 0.003718222 0.0016 0.0036 0.9000063
## 7 AAPL 12-Jan-11 0.0081042033 0.008967269 0.0031 0.0000 0.9000063
## 8 AAPL 13-Jan-11 0.0036515722 -0.001712249 -0.0026 -0.0044 0.9000063
## 9 AAPL 14-Jan-11 0.0080672745 0.007357425 -0.0010 -0.0073 0.9000063
## 10 AAPL 18-Jan-11 -0.0227252300 0.001375442 0.0056 0.0015 0.9000063
## b_HML b_SMB
## 1 -0.05782126 0.06853513
## 2 -0.05782126 0.06853513
## 3 -0.05782126 0.06853513
## 4 -0.05782126 0.06853513
## 5 -0.05782126 0.06853513
## 6 -0.05782126 0.06853513
## 7 -0.05782126 0.06853513
## 8 -0.05782126 0.06853513
## 9 -0.05782126 0.06853513
## 10 -0.05782126 0.06853513
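The table above has the shape you would get by attaching each asset's estimated betas to every row of the return panel. The code that produced it is not shown, so the following is only a minimal sketch of one way to do such a merge, assuming `dplyr::left_join()` keyed on `symbol`. The tickers `AAA` and `BBB` and all numbers are made up for illustration.

```r
library(dplyr)
library(tibble)

# Toy return panel: two hypothetical assets observed on two dates
panel <- tibble(
  symbol = c("AAA", "AAA", "BBB", "BBB"),
  date   = c("4-Jan-11", "5-Jan-11", "4-Jan-11", "5-Jan-11"),
  ri     = c(0.010, -0.004, 0.002, 0.007)
)

# Toy first-pass output: one row of betas per asset
betas <- tibble(
  symbol = c("AAA", "BBB"),
  b_MKT  = c(0.9, 1.1),
  b_HML  = c(-0.05, 0.10),
  b_SMB  = c(0.07, -0.20)
)

# Each panel row picks up the (time-invariant) betas of its asset
merged <- left_join(panel, betas, by = "symbol")
merged
```

Because the betas are estimated over the full sample, they repeat on every date for a given asset, exactly as in the printed rows for AAPL.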
Step 1: for each time period (date), regress the cross-section of asset returns on the betas estimated in Step 0. This yields one set of lambda estimates per date (daily in this sample, since the data are daily returns).
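Written out, the cross-sectional regression for a single date $t$ takes the following form. The symbols $\lambda_{0,t}$, $\lambda_{k,t}$, and $\eta_{i,t}$ are notation assumed here, and $\hat\beta_{i,k}$ denotes the first-pass estimates.

$$
r_{i,t} = \lambda_{0,t} + \lambda_{\mathrm{MKT},t}\,\hat\beta_{i,\mathrm{MKT}} + \lambda_{\mathrm{SMB},t}\,\hat\beta_{i,\mathrm{SMB}} + \lambda_{\mathrm{HML},t}\,\hat\beta_{i,\mathrm{HML}} + \eta_{i,t}, \qquad i = 1, \dots, N
$$

Here the betas play the role of regressors and the unknowns are the $\lambda$'s, so each date contributes one estimate $\hat\lambda_{k,t}$ per factor.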
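# Second pass: one cross-sectional regression per date.
# step0 holds only the betas, so join them back onto the panel first.
step1 <- data %>%
  left_join(step0, by = "symbol") %>%
  select(date, symbol, ri, b_MKT, b_SMB, b_HML) %>%
  nest(data = c(symbol, ri, b_MKT, b_SMB, b_HML)) %>%
  mutate(estimates = map(
    data,
    ~ tidy(lm(ri ~ b_MKT + b_SMB + b_HML, data = .x))
  )) %>%
  unnest(estimates) %>%
  select(date, estimate, term) %>%
  pivot_wider(names_from = term,
              values_from = estimate) %>%
  select(date, b_MKT, b_HML, b_SMB)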
# View per-date cross-sectional lambda estimates
head(step1, 10)
## # A tibble: 10 × 4
## date b_MKT b_HML b_SMB
## <chr> <dbl> <dbl> <dbl>
## 1 4-Jan-11 0.0416 0.0574 -0.0255
## 2 5-Jan-11 -0.0113 0.0628 -0.158
## 3 6-Jan-11 0.0373 -0.173 0.00703
## 4 7-Jan-11 0.0127 -0.0642 0.0323
## 5 10-Jan-11 -0.0366 0.0586 0.0171
## 6 11-Jan-11 0.00409 0.0899 -0.0954
## 7 12-Jan-11 -0.0554 0.0430 -0.164
## 8 13-Jan-11 -0.0194 0.0256 0.00181
## 9 14-Jan-11 -0.0165 0.0392 0.0633
## 10 18-Jan-11 0.0101 -0.0900 0.0525
Step 2: take the time-series average of the cross-sectional coefficients and use one-sample t-tests to check whether each average is significantly different from zero.
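In symbols, with $\hat\lambda_{k,t}$ denoting the date-$t$ cross-sectional coefficient on factor $k$ and $T$ the number of dates (both notations are assumptions made here), the averaging and testing step amounts to:

$$
\hat\lambda_k = \frac{1}{T}\sum_{t=1}^{T}\hat\lambda_{k,t}, \qquad
s_k^2 = \frac{1}{T-1}\sum_{t=1}^{T}\bigl(\hat\lambda_{k,t} - \hat\lambda_k\bigr)^2, \qquad
t_k = \frac{\hat\lambda_k}{s_k/\sqrt{T}}
$$

Under the null hypothesis $\lambda_k = 0$, $t_k$ is compared with a Student-t distribution with $T - 1$ degrees of freedom. The reported df = 1256 implies $T = 1257$ dates in this sample.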
##
## One Sample t-test
##
## data: step1$b_MKT
## t = -0.37879, df = 1256, p-value = 0.7049
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.002546371 0.001722208
## sample estimates:
## mean of x
## -0.0004120813
##
## One Sample t-test
##
## data: step1$b_SMB
## t = 0.97712, df = 1256, p-value = 0.3287
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.003711466 0.011076953
## sample estimates:
## mean of x
## 0.003682744
##
## One Sample t-test
##
## data: step1$b_HML
## t = -0.18044, df = 1256, p-value = 0.8568
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.005541205 0.004607776
## sample estimates:
## mean of x
## -0.0004667146
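The printouts above have the form produced by base R's `t.test()` applied to each column of `step1`. As a minimal sketch on simulated numbers (the vector `lambda` below is hypothetical and unrelated to the estimates above), the built-in test agrees with the hand computation of the mean divided by its standard error:

```r
set.seed(1)

# Simulated per-date lambda estimates (hypothetical data)
lambda <- rnorm(250, mean = 0.001, sd = 0.04)

# Built-in one-sample t-test of H0: mean(lambda) = 0
tt <- t.test(lambda, mu = 0)

# The same statistic by hand: mean divided by its standard error
se     <- sd(lambda) / sqrt(length(lambda))
t_stat <- mean(lambda) / se
p_val  <- 2 * pt(-abs(t_stat), df = length(lambda) - 1)

# Both comparisons should return TRUE
all.equal(unname(tt$statistic), t_stat)
all.equal(tt$p.value, p_val)
```

This is the usual Fama-MacBeth standard error, which treats the per-date estimates as serially uncorrelated.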
# Compute summary statistics for all three lambdas
summary_table <- step1 %>%
  select(b_MKT, b_SMB, b_HML) %>%
  pivot_longer(everything(), names_to = "Factor", values_to = "Lambda") %>%
  group_by(Factor) %>%
  summarise(
    Mean    = mean(Lambda, na.rm = TRUE),
    StdDev  = sd(Lambda, na.rm = TRUE),
    SE      = StdDev / sqrt(n()),
    t_stat  = Mean / SE,
    p_value = 2 * pt(-abs(t_stat), df = n() - 1),
    .groups = "drop"
  ) %>%
  mutate(
    Factor = recode(Factor,
                    b_MKT = "Market (MKT)",
                    b_SMB = "SMB",
                    b_HML = "HML"),
    Significance = case_when(
      p_value < 0.01 ~ "***",
      p_value < 0.05 ~ "**",
      p_value < 0.10 ~ "*",
      TRUE ~ ""
    )
  )

summary_table
## # A tibble: 3 × 7
## Factor Mean StdDev SE t_stat p_value Significance
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 HML -0.000467 0.0917 0.00259 -0.180 0.857 ""
## 2 Market (MKT) -0.000412 0.0386 0.00109 -0.379 0.705 ""
## 3 SMB 0.00368 0.134 0.00377 0.977 0.329 ""
# Time series of the per-date lambda estimates, one panel per factor
step1 %>%
  # date is stored as text such as "4-Jan-11"; parse it so the x-axis is
  # chronological (%b assumes English month abbreviations in the locale)
  mutate(date = as.Date(date, format = "%d-%b-%y")) %>%
  pivot_longer(cols = c(b_MKT, b_SMB, b_HML),
               names_to = "Factor", values_to = "Lambda") %>%
  mutate(Factor = recode(Factor,
                         b_MKT = "Market (MKT)",
                         b_SMB = "SMB",
                         b_HML = "HML")) %>%
  ggplot(aes(x = date, y = Lambda, colour = Factor, group = Factor)) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
  facet_wrap(~Factor, scales = "free_y", ncol = 1) +
  labs(title = "Daily Cross-Sectional Lambda Estimates Over Time",
       x = "Date", y = "Lambda Estimate") +
  theme_minimal(base_size = 11) +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1))

# Mean lambda per factor with 95% confidence intervals
summary_table %>%
  ggplot(aes(x = Factor, y = Mean, fill = Factor)) +
  geom_col(width = 0.5) +
  geom_errorbar(aes(ymin = Mean - 1.96 * SE,
                    ymax = Mean + 1.96 * SE),
                width = 0.2, linewidth = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Fama-MacBeth Risk Premia Estimates",
       subtitle = "Error bars = 95% Confidence Interval",
       x = "Factor", y = "Mean Lambda (Risk Premium)") +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none")

| Factor | Mean Lambda | t-statistic | Interpretation |
|---|---|---|---|
| MKT | -0.0004 | -0.379 | Market risk premium |
| SMB | 0.0037 | 0.977 | Size premium |
| HML | -0.0005 | -0.180 | Value premium |