1 Introduction

1.1 The Fama-MacBeth Method

The Fama-MacBeth (1973) two-pass regression procedure is one of the most widely used methods in empirical asset pricing. It tests whether risk factors — estimated from time-series regressions — can explain the cross-section of expected returns.

“The main advantage of the Fama-MacBeth procedure is that it provides standard errors corrected for cross-sectional correlation of returns.”

1.1.1 Why Use It?

Feature OLS Panel Fama-MacBeth
Handles cross-sectional correlation
Time-varying risk premia
Standard practice in finance Partial
Works well when N > T

1.2 The Three-Factor Model (Fama-French 1992/1993)

The model we test is:

\[r_{i,t} - r_{f,t} = \alpha_i + \beta_{i,MKT} \cdot MKT_t + \beta_{i,SMB} \cdot SMB_t + \beta_{i,HML} \cdot HML_t + \varepsilon_{i,t}\]

Where:

  • \(MKT_t\) = Excess market return (Market Risk Premium)
  • \(SMB_t\) = Small-Minus-Big (size factor)
  • \(HML_t\) = High-Minus-Low (value factor)

2 Setup & Data

2.1 Load Libraries

# Install if needed (uncomment):
# install.packages(c("broom", "tidyverse", "knitr", "kableExtra",
#                    "ggthemes", "scales", "corrplot", "ggridges", "gt"))

library(broom)
library(tidyverse)
library(knitr)
library(kableExtra)
library(scales)
library(ggthemes)
library(ggridges)
library(gt)

2.2 Load Data

data <- read.csv("data.csv")

# Preview
head(data, 10) %>%
  kbl(caption = "First 10 Rows of Dataset", digits = 6) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
First 10 Rows of Dataset
symbol date ri MKT SMB HML
AAPL 4-Jan-11 0.005206 -0.001314 -0.0065 0.0008
AAPL 5-Jan-11 0.008146 0.004995 0.0018 0.0013
AAPL 6-Jan-11 -0.000808 -0.002125 0.0001 -0.0025
AAPL 7-Jan-11 0.007136 -0.001847 0.0022 -0.0006
AAPL 10-Jan-11 0.018657 -0.001377 0.0041 0.0039
AAPL 11-Jan-11 -0.002368 0.003718 0.0016 0.0036
AAPL 12-Jan-11 0.008104 0.008967 0.0031 0.0000
AAPL 13-Jan-11 0.003652 -0.001712 -0.0026 -0.0044
AAPL 14-Jan-11 0.008067 0.007357 -0.0010 -0.0073
AAPL 18-Jan-11 -0.022725 0.001375 0.0056 0.0015

2.3 Dataset Overview

cat("Dataset Dimensions:", nrow(data), "rows x", ncol(data), "columns\n")
## Dataset Dimensions: 7542 rows x 6 columns
cat("Stocks:", paste(unique(data$symbol), collapse = ", "), "\n")
## Stocks: AAPL, FORD, GE, GM, IBM, MSFT
cat("Date Range:", data$date[1], "to", data$date[nrow(data)], "\n")
## Date Range: 4-Jan-11 to 31-Dec-15
cat("Observations per stock:", nrow(data) / length(unique(data$symbol)), "\n")
## Observations per stock: 1257

2.4 Descriptive Statistics

data %>%
  select(ri, MKT, SMB, HML) %>%
  summary() %>%
  kbl(caption = "Summary Statistics") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Summary Statistics
ri MKT SMB HML
Min. :-0.3908663 Min. :-0.0689583 Min. :-1.660e-02 Min. :-0.01490
1st Qu.:-0.0087263 1st Qu.:-0.0040125 1st Qu.:-3.100e-03 1st Qu.:-0.00260
Median : 0.0000000 Median : 0.0005438 Median : 1.000e-04 Median : 0.00000
Mean : 0.0002109 Mean : 0.0003774 Mean : 2.227e-06 Mean : 0.00013
3rd Qu.: 0.0093507 3rd Qu.: 0.0052641 3rd Qu.: 3.100e-03 3rd Qu.: 0.00260
Max. : 0.9614112 Max. : 0.0463174 Max. : 2.490e-02 Max. : 0.02250
data %>%
  group_by(symbol) %>%
  summarise(
    N         = n(),
    Mean_ri   = mean(ri),
    SD_ri     = sd(ri),
    Min_ri    = min(ri),
    Max_ri    = max(ri),
    Sharpe    = mean(ri) / sd(ri) * sqrt(252)
  ) %>%
  mutate(across(where(is.numeric), ~round(., 5))) %>%
  kbl(caption = "Per-Stock Return Statistics (Daily)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  column_spec(7, bold = TRUE, color = ifelse(
    data %>% group_by(symbol) %>%
      summarise(s = mean(ri)/sd(ri)*sqrt(252)) %>% pull(s) > 0,
    "darkgreen", "red"))
Per-Stock Return Statistics (Daily)
symbol N Mean_ri SD_ri Min_ri Max_ri Sharpe
AAPL 1257 0.00070 0.01680 -0.13188 0.08502 0.65828
FORD 1257 -0.00058 0.05549 -0.39087 0.96141 -0.16679
GE 1257 0.00056 0.01345 -0.06765 0.10260 0.66159
GM 1257 -0.00001 0.01895 -0.11544 0.09108 -0.00664
IBM 1257 -0.00006 0.01221 -0.08642 0.05511 -0.07155
MSFT 1257 0.00065 0.01479 -0.12103 0.09941 0.70212

3 Exploratory Data Analysis

3.1 Return Distributions by Stock

ggplot(data, aes(x = ri, y = symbol, fill = symbol)) +
  geom_density_ridges(alpha = 0.7, scale = 1.2, quantile_lines = TRUE,
                      quantiles = c(0.05, 0.5, 0.95)) +
  scale_fill_brewer(palette = "Set2") +
  scale_x_continuous(labels = percent_format()) +
  labs(
    title    = "Return Distributions by Stock (with 5th, 50th, 95th Percentiles)",
    subtitle = "Fama-French Three-Factor Dataset | 2011–2015",
    x        = "Daily Return",
    y        = NULL,
    caption  = "Vertical lines indicate 5%, 50%, 95% quantiles"
  ) +
  theme_few(base_size = 13) +
  theme(legend.position = "none")

3.2 Factor Returns Over Time

# Get unique dates with factor values
factor_data <- data %>%
  distinct(date, MKT, SMB, HML) %>%
  mutate(date_num = row_number())

factor_long <- factor_data %>%
  pivot_longer(cols = c(MKT, SMB, HML), names_to = "Factor", values_to = "Return")

ggplot(factor_long, aes(x = date_num, y = Return, color = Factor)) +
  geom_line(alpha = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
  facet_wrap(~Factor, ncol = 1, scales = "free_y") +
  scale_color_manual(values = c("MKT" = "#2196F3", "SMB" = "#4CAF50", "HML" = "#FF5722")) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = "Fama-French Factor Returns Over Time",
    subtitle = "Daily MKT, SMB, HML factors (2011–2015)",
    x        = "Trading Day",
    y        = "Factor Return",
    caption  = "Source: Fama-French Data Library"
  ) +
  theme_few(base_size = 12) +
  theme(legend.position = "none")

3.3 Cumulative Factor Performance

factor_data %>%
  mutate(
    cum_MKT = cumprod(1 + MKT) - 1,
    cum_SMB = cumprod(1 + SMB) - 1,
    cum_HML = cumprod(1 + HML) - 1
  ) %>%
  pivot_longer(cols = starts_with("cum_"), names_to = "Factor", values_to = "Cum_Return") %>%
  mutate(Factor = str_remove(Factor, "cum_")) %>%
  ggplot(aes(x = date_num, y = Cum_Return, color = Factor)) +
  geom_line(size = 1.2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  scale_y_continuous(labels = percent_format()) +
  scale_color_manual(values = c("MKT" = "#2196F3", "SMB" = "#4CAF50", "HML" = "#FF5722")) +
  labs(
    title    = "Cumulative Factor Returns",
    subtitle = "Growth of $1 invested in each factor (2011–2015)",
    x        = "Trading Day",
    y        = "Cumulative Return",
    color    = "Factor"
  ) +
  theme_few(base_size = 13)

3.4 Correlation Between Variables

cor_matrix <- data %>%
  select(ri, MKT, SMB, HML) %>%
  cor()

cor_matrix %>%
  round(4) %>%
  kbl(caption = "Correlation Matrix: Returns and Factors") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  column_spec(1, bold = TRUE)
Correlation Matrix: Returns and Factors
ri MKT SMB HML
ri 1.0000 0.3388 -0.0096 0.0087
MKT 0.3388 1.0000 -0.0272 0.0197
SMB -0.0096 -0.0272 1.0000 -0.1866
HML 0.0087 0.0197 -0.1866 1.0000
data %>%
  select(symbol, ri, MKT, SMB, HML) %>%
  pivot_longer(cols = c(MKT, SMB, HML), names_to = "Factor", values_to = "Factor_Return") %>%
  ggplot(aes(x = Factor_Return, y = ri, color = symbol)) +
  geom_point(alpha = 0.15, size = 0.8) +
  geom_smooth(method = "lm", se = FALSE, size = 1, color = "black") +
  facet_wrap(~Factor, scales = "free_x") +
  scale_x_continuous(labels = percent_format()) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title  = "Stock Returns vs. Fama-French Factors",
    x      = "Factor Return",
    y      = "Stock Return (ri)",
    color  = "Stock"
  ) +
  theme_few(base_size = 12)


4 Fama-MacBeth Two-Pass Regression

4.1 Step 0 — Time-Series Regressions (Estimate Betas)

For each stock \(i\), regress its daily returns on the three Fama-French factors:

\[r_{i,t} = \alpha_i + \beta_{i,MKT} \cdot MKT_t + \beta_{i,SMB} \cdot SMB_t + \beta_{i,HML} \cdot HML_t + \varepsilon_{i,t}\]

step0 <- data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%
  mutate(estimates = map(
    data,
    ~tidy(lm(ri ~ MKT + SMB + HML, data = .x))
  )) %>%
  unnest(estimates) %>%
  select(symbol, estimate, term) %>%
  pivot_wider(names_from  = term,
              values_from = estimate) %>%
  select(symbol,
         b_MKT = MKT,
         b_HML = HML,
         b_SMB = SMB)

step0 %>%
  mutate(across(where(is.numeric), ~round(., 4))) %>%
  kbl(caption = "Step 0: Estimated Factor Loadings (Betas) per Stock") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  column_spec(2:4, bold = TRUE)
Step 0: Estimated Factor Loadings (Betas) per Stock
symbol b_MKT b_HML b_SMB
AAPL 0.9000 -0.0578 0.0685
FORD 0.5129 0.1380 -0.2644
GE 1.0779 0.0902 0.0994
GM 1.2854 -0.0222 0.0039
IBM 0.8169 -0.0121 0.0336
MSFT 0.9656 -0.0641 0.0582

4.1.1 R² and Model Fit per Stock

r2_table <- data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%
  mutate(fit = map(data, ~lm(ri ~ MKT + SMB + HML, data = .x)),
         glance_out = map(fit, glance)) %>%
  unnest(glance_out) %>%
  select(symbol, r.squared, adj.r.squared, statistic, p.value, nobs) %>%
  mutate(across(where(is.numeric), ~round(., 4)))

r2_table %>%
  kbl(caption = "Time-Series Regression: Model Fit per Stock") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  column_spec(2, color = ifelse(r2_table$r.squared > 0.3, "darkgreen", "darkorange"),
              bold = TRUE)
Time-Series Regression: Model Fit per Stock
symbol r.squared adj.r.squared statistic p.value nobs
AAPL 0.2729 0.2712 156.7577 0.0000 1257
FORD 0.0090 0.0067 3.8122 0.0098 1257
GE 0.6125 0.6116 660.1958 0.0000 1257
GM 0.4379 0.4366 325.4098 0.0000 1257
IBM 0.4255 0.4241 309.2936 0.0000 1257
MSFT 0.4054 0.4040 284.7811 0.0000 1257

4.1.2 Beta Visualization

step0 %>%
  pivot_longer(cols = starts_with("b_"), names_to = "Factor", values_to = "Beta") %>%
  mutate(Factor = str_remove(Factor, "b_")) %>%
  ggplot(aes(x = symbol, y = Beta, fill = Factor)) +
  geom_col(position = "dodge", color = "white", width = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_fill_brewer(palette = "Set1") +
  labs(
    title    = "Estimated Factor Betas by Stock",
    subtitle = "Step 0: Time-Series OLS estimates",
    x        = "Stock",
    y        = "Beta Coefficient",
    fill     = "Factor"
  ) +
  theme_few(base_size = 13)


4.2 Step 1 — Cross-Sectional Regressions

For each date \(t\), regress the cross-section of returns on the estimated betas:

\[r_{i,t} = \lambda_0 + \lambda_{MKT} \hat{\beta}_{i,MKT} + \lambda_{SMB} \hat{\beta}_{i,SMB} + \lambda_{HML} \hat{\beta}_{i,HML} + \alpha_{i,t}\]

# Join betas back to data
step0_joined <- data %>%
  left_join(step0, by = "symbol")

# Run T cross-sectional regressions (one per date)
step1 <- step0_joined %>%
  nest(data = c(symbol, ri, b_MKT, b_SMB, b_HML)) %>%
  mutate(estimates = map(
    data,
    ~tidy(lm(ri ~ b_MKT + b_SMB + b_HML, data = .x))
  )) %>%
  unnest(estimates) %>%
  select(date, estimate, term) %>%
  pivot_wider(names_from  = term,
              values_from = estimate) %>%
  select(date, b_MKT, b_HML, b_SMB)

cat("Number of cross-sectional regressions run:", nrow(step1), "\n")
## Number of cross-sectional regressions run: 1257
head(step1, 10) %>%
  mutate(across(where(is.numeric), ~round(., 6))) %>%
  kbl(caption = "Step 1: First 10 Cross-Sectional Lambda Estimates") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
Step 1: First 10 Cross-Sectional Lambda Estimates
date b_MKT b_HML b_SMB
4-Jan-11 0.041629 0.057372 -0.025520
5-Jan-11 -0.011347 0.062847 -0.158046
6-Jan-11 0.037301 -0.173234 0.007029
7-Jan-11 0.012722 -0.064226 0.032269
10-Jan-11 -0.036631 0.058646 0.017123
11-Jan-11 0.004089 0.089858 -0.095361
12-Jan-11 -0.055365 0.043036 -0.164496
13-Jan-11 -0.019357 0.025630 0.001815
14-Jan-11 -0.016486 0.039214 0.063259
18-Jan-11 0.010146 -0.090027 0.052508

4.2.1 Distribution of Time-Varying Lambdas

step1 %>%
  pivot_longer(cols = c(b_MKT, b_SMB, b_HML),
               names_to = "Factor", values_to = "Lambda") %>%
  mutate(Factor = str_remove(Factor, "b_")) %>%
  ggplot(aes(x = Lambda, fill = Factor)) +
  geom_histogram(bins = 50, alpha = 0.7, color = "white") +
  geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
  facet_wrap(~Factor, scales = "free") +
  scale_fill_brewer(palette = "Set1") +
  scale_x_continuous(labels = percent_format()) +
  labs(
    title    = "Distribution of Cross-Sectional Lambda Estimates",
    subtitle = "One lambda per trading day (Step 1)",
    x        = "Lambda (Risk Premium)",
    y        = "Count"
  ) +
  theme_few(base_size = 13) +
  theme(legend.position = "none")

4.2.2 Lambda Time Series

step1 %>%
  mutate(t = row_number()) %>%
  pivot_longer(cols = c(b_MKT, b_SMB, b_HML),
               names_to = "Factor", values_to = "Lambda") %>%
  mutate(Factor = str_remove(Factor, "b_")) %>%
  ggplot(aes(x = t, y = Lambda, color = Factor)) +
  geom_line(alpha = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
  geom_smooth(method = "loess", se = TRUE, alpha = 0.15, size = 1.2) +
  facet_wrap(~Factor, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = percent_format()) +
  scale_color_manual(values = c("MKT" = "#2196F3", "SMB" = "#4CAF50", "HML" = "#FF5722")) +
  labs(
    title    = "Time-Varying Risk Premia (Lambdas) — Step 1",
    subtitle = "Cross-sectional lambda estimates over time with LOESS trend",
    x        = "Trading Day",
    y        = "Lambda"
  ) +
  theme_few(base_size = 12) +
  theme(legend.position = "none")


4.3 Step 2 — Time-Series Averages & Hypothesis Tests

The Fama-MacBeth estimate of the risk premium is the time-series average of the step-1 lambdas. We test whether each is significantly different from zero.

\[\hat{\lambda}_k = \frac{1}{T} \sum_{t=1}^{T} \hat{\lambda}_{k,t}, \quad t\text{-stat} = \frac{\hat{\lambda}_k}{SE(\hat{\lambda}_k)}\]

cat("=== Fama-MacBeth Results ===\n\n")
## === Fama-MacBeth Results ===
cat("--- MKT Factor ---\n")
## --- MKT Factor ---
mkt_test <- t.test(step1$b_MKT, mu = 0)
print(mkt_test)
## 
##  One Sample t-test
## 
## data:  step1$b_MKT
## t = -0.37879, df = 1256, p-value = 0.7049
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.002546371  0.001722208
## sample estimates:
##     mean of x 
## -0.0004120813
cat("\n--- SMB Factor ---\n")
## 
## --- SMB Factor ---
smb_test <- t.test(step1$b_SMB, mu = 0)
print(smb_test)
## 
##  One Sample t-test
## 
## data:  step1$b_SMB
## t = 0.97712, df = 1256, p-value = 0.3287
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.003711466  0.011076953
## sample estimates:
##   mean of x 
## 0.003682744
cat("\n--- HML Factor ---\n")
## 
## --- HML Factor ---
hml_test <- t.test(step1$b_HML, mu = 0)
print(hml_test)
## 
##  One Sample t-test
## 
## data:  step1$b_HML
## t = -0.18044, df = 1256, p-value = 0.8568
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.005541205  0.004607776
## sample estimates:
##     mean of x 
## -0.0004667146

4.3.1 Results Summary Table

results <- tibble(
  Factor     = c("MKT", "SMB", "HML"),
  Mean_Lambda = c(mean(step1$b_MKT), mean(step1$b_SMB), mean(step1$b_HML)),
  Std_Dev    = c(sd(step1$b_MKT),   sd(step1$b_SMB),   sd(step1$b_HML)),
  Std_Error  = c(mkt_test$stderr,    smb_test$stderr,    hml_test$stderr),
  T_Stat     = c(mkt_test$statistic, smb_test$statistic, hml_test$statistic),
  P_Value    = c(mkt_test$p.value,   smb_test$p.value,   hml_test$p.value),
  CI_Low     = c(mkt_test$conf.int[1], smb_test$conf.int[1], hml_test$conf.int[1]),
  CI_High    = c(mkt_test$conf.int[2], smb_test$conf.int[2], hml_test$conf.int[2]),
  Significant = c(
    ifelse(mkt_test$p.value < 0.05, "✓ Yes", "✗ No"),
    ifelse(smb_test$p.value < 0.05, "✓ Yes", "✗ No"),
    ifelse(hml_test$p.value < 0.05, "✓ Yes", "✗ No")
  )
)

results %>%
  mutate(across(where(is.numeric), ~round(., 5))) %>%
  kbl(caption = "Fama-MacBeth Final Results: Risk Premium Estimates",
      col.names = c("Factor", "Mean λ", "Std Dev", "Std Error",
                    "t-stat", "p-value", "CI Low", "CI High", "Sig. (5%)")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = TRUE) %>%
  row_spec(which(results$P_Value < 0.05), bold = TRUE,
           background = "#e8f5e9") %>%
  row_spec(which(results$P_Value >= 0.05), color = "gray40")
Fama-MacBeth Final Results: Risk Premium Estimates
Factor Mean λ Std Dev Std Error t-stat p-value CI Low CI High Sig. (5%)
MKT -0.00041 0.03857 0.00109 -0.37879 0.70491 -0.00255 0.00172 ✗ No
SMB 0.00368 0.13363 0.00377 0.97712 0.32870 -0.00371 0.01108 ✗ No
HML -0.00047 0.09171 0.00259 -0.18044 0.85684 -0.00554 0.00461 ✗ No

5 Visualization of Final Results

5.1 Risk Premium Estimates with Confidence Intervals

results %>%
  ggplot(aes(x = Factor, y = Mean_Lambda, fill = Factor,
             color = Significant)) +
  geom_col(alpha = 0.8, width = 0.5, color = NA) +
  geom_errorbar(aes(ymin = CI_Low, ymax = CI_High),
                width = 0.15, size = 1.2, color = "gray30") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  geom_text(aes(label = paste0("λ = ", round(Mean_Lambda, 5),
                               "\np = ", round(P_Value, 4))),
            vjust = -0.5, size = 3.5, color = "black") +
  scale_fill_brewer(palette = "Set1") +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = "Fama-MacBeth Risk Premium Estimates (λ)",
    subtitle = "With 95% Confidence Intervals | Two-tailed t-test (μ = 0)",
    x        = "Factor",
    y        = "Average Risk Premium",
    caption  = "Green shading = significant at 5% level"
  ) +
  theme_few(base_size = 14) +
  theme(legend.position = "none")

5.2 Rolling Mean Lambdas (Convergence Check)

step1 %>%
  mutate(
    t          = row_number(),
    roll_MKT   = cumsum(b_MKT) / t,
    roll_SMB   = cumsum(b_SMB) / t,
    roll_HML   = cumsum(b_HML) / t
  ) %>%
  pivot_longer(cols = starts_with("roll_"),
               names_to = "Factor", values_to = "Rolling_Mean") %>%
  mutate(Factor = str_remove(Factor, "roll_")) %>%
  ggplot(aes(x = t, y = Rolling_Mean, color = Factor)) +
  geom_line(size = 1) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  facet_wrap(~Factor, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = percent_format()) +
  scale_color_manual(values = c("MKT" = "#2196F3", "SMB" = "#4CAF50", "HML" = "#FF5722")) +
  labs(
    title    = "Cumulative Average Lambda (Convergence Plot)",
    subtitle = "Running mean of cross-sectional risk premia — does it stabilize?",
    x        = "Trading Day",
    y        = "Cumulative Mean Lambda"
  ) +
  theme_few(base_size = 12) +
  theme(legend.position = "none")


6 Summary of Findings

findings <- results %>%
  mutate(
    Interpretation = case_when(
      P_Value < 0.01 ~ "Highly significant risk premium",
      P_Value < 0.05 ~ "Significant risk premium at 5%",
      P_Value < 0.10 ~ "Marginally significant (10%)",
      TRUE           ~ "Not statistically significant"
    ),
    Annual_Lambda  = round(Mean_Lambda * 252, 4)
  ) %>%
  select(Factor, Mean_Lambda, T_Stat, P_Value, Annual_Lambda, Interpretation)

findings %>%
  mutate(across(c(Mean_Lambda, T_Stat, P_Value, Annual_Lambda), ~round(., 5))) %>%
  kbl(caption = "Final Interpretation Table") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = TRUE) %>%
  column_spec(6, italic = TRUE)
Final Interpretation Table
Factor Mean_Lambda T_Stat P_Value Annual_Lambda Interpretation
MKT -0.00041 -0.37879 0.70491 -0.1038 Not statistically significant
SMB 0.00368 0.97712 0.32870 0.9281 Not statistically significant
HML -0.00047 -0.18044 0.85684 -0.1176 Not statistically significant

6.1 Key Takeaways

The Fama-MacBeth two-pass procedure applied to a cross-section of 6 U.S. stocks (2011–2015) yields the following conclusions:

  1. MKT (Market Beta): The estimated market risk premium \(\hat{\lambda}_{MKT}\) tests whether exposure to market risk is priced. A significant positive value would confirm the CAPM intuition.

  2. SMB (Size Factor): Tests whether small-cap exposure earns a return premium over the sample period.

  3. HML (Value Factor): Tests whether value stocks (high book-to-market) outperform growth stocks.

6.1.1 Methodological Notes

  • Standard errors in Fama-MacBeth are computed from the time-series of cross-sectional lambda estimates, naturally correcting for cross-sectional correlation.
  • The procedure cannot handle time-invariant regressors (e.g., dummy variables that don’t change across \(t\)).
  • The test uses a two-tailed t-test against \(H_0: \lambda = 0\).

7 References

  • Fama, E. F., & MacBeth, J. D. (1973). Risk, return, and equilibrium: Empirical tests. Journal of Political Economy, 81(3), 607–636.
  • Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. Journal of Finance, 47(2), 427–465.
  • Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
  • Broom package: Robinson, D. et al. (2023). broom: Convert Statistical Objects into Tidy Tibbles. R package.

Analysis conducted in R using the broom and tidyverse packages. Data sourced from the Fama-French Data Library.