1 Introduction

1.1 Motivation

Asset pricing lies at the core of modern finance. A fundamental question is: which risk factors are priced in the cross-section of stock returns? The Fama-MacBeth (1973) procedure provides a rigorous econometric answer by decomposing the problem into two sequential regression passes — first estimating factor loadings (betas) in the time-series dimension, then testing whether those betas are compensated with a positive risk premium in the cross-sectional dimension.

This document replicates the Fama-MacBeth two-pass regression procedure using daily returns for six well-known U.S. equities (AAPL, FORD, GE, GM, IBM, MSFT) over the period January 2011 – December 2015, with the Fama-French Three-Factor (FF3) model as the theoretical backbone.

1.2 The Fama-French Three-Factor Model

Fama and French (1992, 1993) extended the CAPM by adding two empirically motivated risk factors:

\[r_{i,t} - r_{f,t} = \alpha_i + \beta_i^{MKT} \cdot MKT_t + \beta_i^{SMB} \cdot SMB_t + \beta_i^{HML} \cdot HML_t + \varepsilon_{i,t}\]

Symbol Factor Description
\(MKT_t\) Market Excess return of the market portfolio
\(SMB_t\) Small Minus Big Return spread between small- and large-cap stocks
\(HML_t\) High Minus Low Return spread between high- and low-book-to-market stocks

1.3 The Fama-MacBeth Procedure

The two-pass regression unfolds as follows:

Pass 1 — Time-Series (N regressions, one per asset): Regress each asset’s excess return on the three factors across all \(T\) time periods, yielding asset-specific factor loadings \((\hat{\beta}_i^{MKT},\, \hat{\beta}_i^{SMB},\, \hat{\beta}_i^{HML})\).

Pass 2 — Cross-Sectional (T regressions, one per date): For each date \(t\), regress the cross-section of returns on the estimated betas:

\[r_{i,t} = \lambda_t^0 + \lambda_t^{MKT}\hat{\beta}_i^{MKT} + \lambda_t^{SMB}\hat{\beta}_i^{SMB} + \lambda_t^{HML}\hat{\beta}_i^{HML} + \eta_{i,t}\]

Pass 3 — Averaging and Inference: The Fama-MacBeth risk premium estimate is the time-series average:

\[\hat{\lambda}^k = \frac{1}{T}\sum_{t=1}^{T} \hat{\lambda}_t^k, \qquad t\text{-stat} = \frac{\hat{\lambda}^k}{\text{SE}(\hat{\lambda}^k)}\]


2 Setup and Package Loading

# Install any missing packages
required_pkgs <- c("broom", "tidyverse", "knitr", "kableExtra",
                   "ggthemes", "scales", "sandwich", "lmtest")
new_pkgs <- required_pkgs[!(required_pkgs %in% installed.packages()[, "Package"])]
if (length(new_pkgs)) install.packages(new_pkgs, repos = "https://cloud.r-project.org")

library(broom)
library(tidyverse)
library(knitr)
library(kableExtra)
library(ggthemes)
library(scales)
library(sandwich)
library(lmtest)

3 Load and Inspect Data

data <- read.csv("data.csv")
data$date <- as.Date(data$date, format = "%d-%b-%y")

head(data, 10) %>%
  kable(digits = 6,
        caption = "Table 1: First 10 Rows of the Dataset") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
Table 1: First 10 Rows of the Dataset
symbol date ri MKT SMB HML
AAPL 2011-01-04 0.005206 -0.001314 -0.0065 0.0008
AAPL 2011-01-05 0.008146 0.004995 0.0018 0.0013
AAPL 2011-01-06 -0.000808 -0.002125 0.0001 -0.0025
AAPL 2011-01-07 0.007136 -0.001847 0.0022 -0.0006
AAPL 2011-01-10 0.018657 -0.001377 0.0041 0.0039
AAPL 2011-01-11 -0.002368 0.003718 0.0016 0.0036
AAPL 2011-01-12 0.008104 0.008967 0.0031 0.0000
AAPL 2011-01-13 0.003652 -0.001712 -0.0026 -0.0044
AAPL 2011-01-14 0.008067 0.007357 -0.0010 -0.0073
AAPL 2011-01-18 -0.022725 0.001375 0.0056 0.0015

3.1 Dataset Dimensions

cat("Rows       :", nrow(data), "\n")
#> Rows       : 7542
cat("Columns    :", ncol(data), "\n")
#> Columns    : 6
cat("Stocks     :", paste(unique(data$symbol), collapse = ", "), "\n")
#> Stocks     : AAPL, FORD, GE, GM, IBM, MSFT
cat("Date range :", format(min(data$date)), "to", format(max(data$date)), "\n")
#> Date range : 2011-01-04 to 2015-12-31

3.2 Descriptive Statistics

data %>%
  group_by(symbol) %>%
  summarise(
    N        = n(),
    Mean_ri  = round(mean(ri),  6),
    SD_ri    = round(sd(ri),    6),
    Min_ri   = round(min(ri),   6),
    Max_ri   = round(max(ri),   6),
    Mean_MKT = round(mean(MKT), 6),
    Mean_SMB = round(mean(SMB), 6),
    Mean_HML = round(mean(HML), 6)
  ) %>%
  kable(
    caption   = "Table 2: Descriptive Statistics by Stock",
    col.names = c("Stock", "N", "Mean ri", "SD ri",
                  "Min ri", "Max ri",
                  "Mean MKT", "Mean SMB", "Mean HML")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 2: Descriptive Statistics by Stock
Stock N Mean ri SD ri Min ri Max ri Mean MKT Mean SMB Mean HML
AAPL 1257 0.000697 0.016804 -0.131884 0.085022 0.000377 2e-06 0.00013
FORD 1257 -0.000583 0.055493 -0.390866 0.961411 0.000377 2e-06 0.00013
GE 1257 0.000561 0.013450 -0.067653 0.102597 0.000377 2e-06 0.00013
GM 1257 -0.000008 0.018947 -0.115440 0.091084 0.000377 2e-06 0.00013
IBM 1257 -0.000055 0.012213 -0.086419 0.055106 0.000377 2e-06 0.00013
MSFT 1257 0.000654 0.014791 -0.121033 0.099413 0.000377 2e-06 0.00013

4 Exploratory: Daily Return Series

4.1 Return Distributions

ggplot(data, aes(x = ri, fill = symbol)) +
  geom_histogram(bins = 60, colour = "white", alpha = 0.85) +
  facet_wrap(~symbol, scales = "free_y", ncol = 3) +
  scale_x_continuous(labels = percent_format(accuracy = 0.1)) +
  scale_fill_colorblind() +
  labs(
    title    = "Distribution of Daily Excess Returns",
    subtitle = "Fama-French Three-Factor Universe  |  Jan 2011 - Dec 2015",
    x        = "Daily Excess Return",
    y        = "Count"
  ) +
  theme_clean() +
  theme(legend.position = "none",
        strip.text = element_text(face = "bold"))
Figure 1: Daily Excess Return Distributions (2011-2015)

Figure 1: Daily Excess Return Distributions (2011-2015)

4.2 Cumulative Returns

data %>%
  group_by(symbol) %>%
  arrange(date) %>%
  mutate(cum_ri = cumprod(1 + ri) - 1) %>%
  ungroup() %>%
  ggplot(aes(x = date, y = cum_ri, colour = symbol)) +
  geom_line(size = 0.7) +
  scale_y_continuous(labels = percent_format()) +
  scale_colour_colorblind() +
  labs(
    title  = "Cumulative Excess Returns (2011-2015)",
    x      = NULL, y = "Cumulative Return", colour = "Stock"
  ) +
  theme_clean()
Figure 2: Cumulative Return Paths (2011-2015)

Figure 2: Cumulative Return Paths (2011-2015)

4.3 Factor Time-Series

data %>%
  select(date, MKT, SMB, HML) %>%
  distinct() %>%
  pivot_longer(c(MKT, SMB, HML), names_to = "factor", values_to = "ret") %>%
  ggplot(aes(x = date, y = ret, colour = factor)) +
  geom_line(alpha = 0.7, size = 0.45) +
  facet_wrap(~factor, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = percent_format(accuracy = 0.01)) +
  scale_colour_colorblind() +
  labs(
    title = "Daily Fama-French Factor Returns",
    x = NULL, y = "Factor Return"
  ) +
  theme_clean() +
  theme(legend.position = "none")
Figure 3: Fama-French Factor Returns Over Time

Figure 3: Fama-French Factor Returns Over Time


5 Step 1: First Pass — Time-Series Regressions

In the first pass we run \(N = 6\) OLS regressions — one per stock — across all \(T\) daily observations, obtaining factor loadings (betas) for each stock.

5.1 Estimation of Factor Loadings

step0_betas <- data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%
  mutate(
    model     = map(data, ~ lm(ri ~ MKT + SMB + HML, data = .x)),
    estimates = map(model, tidy),
    fit       = map(model, glance)
  ) %>%
  unnest(estimates) %>%
  select(symbol, term, estimate) %>%
  pivot_wider(names_from = term, values_from = estimate) %>%
  select(
    symbol,
    alpha = `(Intercept)`,
    b_MKT = MKT,
    b_SMB = SMB,
    b_HML = HML
  )

5.2 Model Fit Statistics

data %>%
  nest(data = c(date, ri, MKT, SMB, HML)) %>%
  mutate(
    model = map(data, ~ lm(ri ~ MKT + SMB + HML, data = .x)),
    fit   = map(model, glance)
  ) %>%
  unnest(fit) %>%
  select(symbol, r.squared, adj.r.squared, statistic, p.value, nobs) %>%
  mutate(across(where(is.numeric), ~ round(.x, 4))) %>%
  kable(
    col.names = c("Stock", "R-sq", "Adj R-sq", "F-stat", "p-value", "Obs"),
    caption   = "Table 3: Pass 1 - Time-Series Regression Fit Statistics"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 3: Pass 1 - Time-Series Regression Fit Statistics
Stock R-sq Adj R-sq F-stat p-value Obs
AAPL 0.2729 0.2712 156.7577 0.0000 1257
FORD 0.0090 0.0067 3.8122 0.0098 1257
GE 0.6125 0.6116 660.1958 0.0000 1257
GM 0.4379 0.4366 325.4098 0.0000 1257
IBM 0.4255 0.4241 309.2936 0.0000 1257
MSFT 0.4054 0.4040 284.7811 0.0000 1257

5.3 Estimated Beta Coefficients

step0_betas %>%
  mutate(across(where(is.numeric), ~ round(.x, 4))) %>%
  kable(
    col.names = c("Stock", "Alpha", "Beta MKT", "Beta SMB", "Beta HML"),
    caption   = "Table 4: Pass 1 - Estimated Factor Loadings"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  column_spec(3:5, bold = TRUE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 4: Pass 1 - Estimated Factor Loadings
Stock Alpha Beta MKT Beta SMB Beta HML
AAPL 4e-04 0.9000 0.0685 -0.0578
FORD -8e-04 0.5129 -0.2644 0.1380
GE 1e-04 1.0779 0.0994 0.0902
GM -5e-04 1.2854 0.0039 -0.0222
IBM -4e-04 0.8169 0.0336 -0.0121
MSFT 3e-04 0.9656 0.0582 -0.0641

Interpretation: Beta-MKT near 1 = near-perfect market co-movement. Beta-SMB > 0 = small-cap-like behaviour. Beta-HML > 0 = value-stock orientation.

5.4 Beta Visualisation

step0_betas %>%
  pivot_longer(c(b_MKT, b_SMB, b_HML),
               names_to = "factor", values_to = "beta") %>%
  mutate(factor = recode(factor,
                         b_MKT = "Market (MKT)",
                         b_SMB = "Size (SMB)",
                         b_HML = "Value (HML)")) %>%
  ggplot(aes(x = symbol, y = beta, fill = factor)) +
  geom_col(position = "dodge", colour = "white", alpha = 0.9) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  scale_fill_colorblind() +
  labs(
    title    = "Fama-French Factor Loadings by Stock",
    subtitle = "Pass 1: Time-Series OLS Estimates",
    x = "Stock", y = "Beta Coefficient", fill = "Factor"
  ) +
  theme_clean()
Figure 4: Factor Loadings for Each Stock

Figure 4: Factor Loadings for Each Stock


6 Step 2: Second Pass — Cross-Sectional Regressions

In the second pass we merge the estimated betas back into the panel and run \(T\) cross-sectional regressions — one per trading day — of realised returns on the estimated betas. Each regression yields time-\(t\) estimates of the factor risk premia \(\lambda_t\).

6.1 Estimation of Risk Premia

step0_panel <- data %>%
  left_join(
    step0_betas %>% select(symbol, b_MKT, b_SMB, b_HML),
    by = "symbol"
  )

step1 <- step0_panel %>%
  nest(data = c(symbol, ri, b_MKT, b_SMB, b_HML)) %>%
  mutate(
    estimates = map(data,
      ~ tidy(lm(ri ~ b_MKT + b_SMB + b_HML, data = .x)))
  ) %>%
  unnest(estimates) %>%
  select(date, term, estimate) %>%
  pivot_wider(names_from = term, values_from = estimate) %>%
  select(
    date,
    lambda0 = `(Intercept)`,
    lam_MKT = b_MKT,
    lam_SMB = b_SMB,
    lam_HML = b_HML
  )

head(step1, 8) %>%
  mutate(across(where(is.numeric), ~ round(.x, 6))) %>%
  kable(
    col.names = c("Date", "Lambda 0", "Lambda MKT", "Lambda SMB", "Lambda HML"),
    caption   = "Table 5: Pass 2 - Cross-Sectional Risk Premium Estimates (First 8 Dates)"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 5: Pass 2 - Cross-Sectional Risk Premium Estimates (First 8 Dates)
Date Lambda 0 Lambda MKT Lambda SMB Lambda HML
2011-01-04 -0.029761 0.041629 -0.025520 0.057372
2011-01-05 0.022334 -0.011347 -0.158046 0.062847
2011-01-06 -0.029243 0.037301 0.007029 -0.173234
2011-01-07 -0.017503 0.012722 0.032269 -0.064226
2011-01-10 0.036414 -0.036631 0.017123 0.058646
2011-01-11 0.002744 0.004089 -0.095361 0.089858
2011-01-12 0.072331 -0.055365 -0.164496 0.043036
2011-01-13 0.015027 -0.019357 0.001815 0.025630

6.2 Time-Series of Risk Premia

step1 %>%
  pivot_longer(c(lam_MKT, lam_SMB, lam_HML),
               names_to = "factor", values_to = "lambda") %>%
  mutate(factor = recode(factor,
                         lam_MKT = "Market (MKT)",
                         lam_SMB = "Size (SMB)",
                         lam_HML = "Value (HML)")) %>%
  ggplot(aes(x = date, y = lambda, colour = factor)) +
  geom_line(alpha = 0.45, size = 0.35) +
  geom_smooth(method = "loess", span = 0.3, se = FALSE, size = 1.1) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey30") +
  facet_wrap(~factor, ncol = 1, scales = "free_y") +
  scale_colour_colorblind() +
  scale_y_continuous(labels = percent_format(accuracy = 0.01)) +
  labs(
    title    = "Daily Risk Premium Estimates - Pass 2",
    subtitle = "Raw daily (thin line)  |  LOESS trend (thick line)",
    x = NULL, y = "Risk Premium Estimate"
  ) +
  theme_clean() +
  theme(legend.position = "none")
Figure 5: Daily Lambda Estimates with LOESS Trend

Figure 5: Daily Lambda Estimates with LOESS Trend


7 Fama-MacBeth Risk Premia Estimates

7.1 t-Tests

test_MKT <- t.test(step1$lam_MKT, mu = 0)
test_SMB <- t.test(step1$lam_SMB, mu = 0)
test_HML <- t.test(step1$lam_HML, mu = 0)

cat("=== Market Risk Premium (MKT) ===\n"); print(test_MKT)
#> === Market Risk Premium (MKT) ===
#> 
#>  One Sample t-test
#> 
#> data:  step1$lam_MKT
#> t = -0.37879, df = 1256, p-value = 0.7049
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.002546371  0.001722208
#> sample estimates:
#>     mean of x 
#> -0.0004120813
cat("\n=== Size Risk Premium (SMB) ===\n");   print(test_SMB)
#> 
#> === Size Risk Premium (SMB) ===
#> 
#>  One Sample t-test
#> 
#> data:  step1$lam_SMB
#> t = 0.97712, df = 1256, p-value = 0.3287
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.003711466  0.011076953
#> sample estimates:
#>   mean of x 
#> 0.003682744
cat("\n=== Value Risk Premium (HML) ===\n");  print(test_HML)
#> 
#> === Value Risk Premium (HML) ===
#> 
#>  One Sample t-test
#> 
#> data:  step1$lam_HML
#> t = -0.18044, df = 1256, p-value = 0.8568
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.005541205  0.004607776
#> sample estimates:
#>     mean of x 
#> -0.0004667146

7.2 Summary Table

fm_summary <- tibble(
  Factor      = c("Market (MKT)", "Size (SMB)", "Value (HML)"),
  Mean_Lambda = c(mean(step1$lam_MKT),
                  mean(step1$lam_SMB),
                  mean(step1$lam_HML)),
  SE_Lambda   = c(sd(step1$lam_MKT) / sqrt(nrow(step1)),
                  sd(step1$lam_SMB) / sqrt(nrow(step1)),
                  sd(step1$lam_HML) / sqrt(nrow(step1))),
  t_stat      = c(test_MKT$statistic,
                  test_SMB$statistic,
                  test_HML$statistic),
  p_value     = c(test_MKT$p.value,
                  test_SMB$p.value,
                  test_HML$p.value),
  CI_lower    = c(test_MKT$conf.int[1],
                  test_SMB$conf.int[1],
                  test_HML$conf.int[1]),
  CI_upper    = c(test_MKT$conf.int[2],
                  test_SMB$conf.int[2],
                  test_HML$conf.int[2]),
  Significant = c(
    ifelse(test_MKT$p.value < 0.05, "Yes ***", "No"),
    ifelse(test_SMB$p.value < 0.05, "Yes ***", "No"),
    ifelse(test_HML$p.value < 0.05, "Yes ***", "No")
  )
)

fm_summary %>%
  mutate(across(where(is.numeric), ~ round(.x, 6))) %>%
  kable(
    col.names = c("Factor",
                  "Mean Lambda", "SE (Lambda)",
                  "t-stat", "p-value",
                  "CI Lower (95%)", "CI Upper (95%)",
                  "Sig. at 5%?"),
    caption   = "Table 6: Fama-MacBeth Risk Premia Estimates - Final Results"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  row_spec(which(fm_summary$p_value < 0.05),  background = "#d4edda") %>%
  row_spec(which(fm_summary$p_value >= 0.05), background = "#fff3cd") %>%
  column_spec(8, bold = TRUE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 6: Fama-MacBeth Risk Premia Estimates - Final Results
Factor Mean Lambda SE (Lambda) t-stat p-value CI Lower (95%) CI Upper (95%) Sig. at 5%?
Market (MKT) -0.000412 0.001088 -0.378788 0.704909 -0.002546 0.001722 No
Size (SMB) 0.003683 0.003769 0.977117 0.328699 -0.003711 0.011077 No
Value (HML) -0.000467 0.002587 -0.180437 0.856839 -0.005541 0.004608 No

8 Visualisation of Results

8.1 Risk Premia with Confidence Intervals

fm_summary %>%
  ggplot(aes(x = Factor, y = Mean_Lambda, colour = Factor)) +
  geom_point(size = 5) +
  geom_errorbar(aes(ymin = CI_lower, ymax = CI_upper),
                width = 0.2, size = 1.2) +
  geom_hline(yintercept = 0, linetype = "dashed",
             colour = "grey40", size = 0.8) +
  scale_colour_colorblind() +
  scale_y_continuous(labels = percent_format(accuracy = 0.0001)) +
  labs(
    title    = "Fama-MacBeth Risk Premium Estimates",
    subtitle = "Point estimates with 95% CI  |  H0: Lambda = 0",
    x = NULL, y = "Average Risk Premium (Mean Lambda)"
  ) +
  theme_clean() +
  theme(legend.position = "none")
Figure 6: Risk Premia Point Estimates with 95% CIs

Figure 6: Risk Premia Point Estimates with 95% CIs

8.2 Distribution of Daily Lambda Estimates

step1 %>%
  pivot_longer(c(lam_MKT, lam_SMB, lam_HML),
               names_to = "factor", values_to = "lambda") %>%
  mutate(factor = recode(factor,
                         lam_MKT = "Market (MKT)",
                         lam_SMB = "Size (SMB)",
                         lam_HML = "Value (HML)")) %>%
  ggplot(aes(x = lambda, fill = factor)) +
  geom_histogram(bins = 60, colour = "white", alpha = 0.85) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey20") +
  facet_wrap(~factor, scales = "free", ncol = 1) +
  scale_x_continuous(labels = percent_format(accuracy = 0.01)) +
  scale_fill_colorblind() +
  labs(
    title    = "Distribution of Daily Cross-Sectional Lambda Estimates",
    subtitle = "Dashed line = zero",
    x = "Lambda Estimate", y = "Count"
  ) +
  theme_clean() +
  theme(legend.position = "none")
Figure 7: Distribution of Daily Lambda Estimates

Figure 7: Distribution of Daily Lambda Estimates


9 Newey-West Adjusted Standard Errors

Standard Fama-MacBeth SEs assume no serial correlation in the \(\lambda_t\) series. Newey-West (1987) HAC standard errors relax this assumption, providing more robust inference when lambda series exhibit autocorrelation.

9.1 HAC Standard Errors

nw_se <- function(lambda_vec, lags = 5) {
  df_tmp  <- data.frame(y = lambda_vec, t = seq_along(lambda_vec))
  fit     <- lm(y ~ 1, data = df_tmp)
  nw_test <- coeftest(fit, vcov = NeweyWest(fit, lag = lags, prewhite = FALSE))
  tibble(
    estimate = nw_test[1, "Estimate"],
    nw_se    = nw_test[1, "Std. Error"],
    nw_tstat = nw_test[1, "t value"],
    nw_pval  = nw_test[1, "Pr(>|t|)"]
  )
}

nw_results <- tibble(
  Factor = c("Market (MKT)", "Size (SMB)", "Value (HML)"),
  bind_rows(
    nw_se(step1$lam_MKT),
    nw_se(step1$lam_SMB),
    nw_se(step1$lam_HML)
  )
)

nw_results %>%
  mutate(across(where(is.numeric), ~ round(.x, 6))) %>%
  kable(
    col.names = c("Factor", "Mean Lambda", "NW Std. Error",
                  "NW t-stat", "NW p-value"),
    caption   = "Table 7: Newey-West HAC Standard Errors (5 lags)"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 7: Newey-West HAC Standard Errors (5 lags)
Factor Mean Lambda NW Std. Error NW t-stat NW p-value
Market (MKT) -0.000412 0.001018 -0.404790 0.685701
Size (SMB) 0.003683 0.002970 1.240176 0.215142
Value (HML) -0.000467 0.002226 -0.209703 0.833934

9.2 Robustness: ACF of Lambda Series

par(mfrow = c(3, 1), mar = c(4, 4, 3, 1))
acf(step1$lam_MKT, main = "ACF: Lambda MKT", lag.max = 30, col = "#0072B2")
acf(step1$lam_SMB, main = "ACF: Lambda SMB", lag.max = 30, col = "#E69F00")
acf(step1$lam_HML, main = "ACF: Lambda HML", lag.max = 30, col = "#009E73")
Figure 8: ACF of Daily Lambda Estimates

Figure 8: ACF of Daily Lambda Estimates

par(mfrow = c(1, 1))

Significant autocorrelation at short lags indicates Newey-West SEs are preferred over standard Fama-MacBeth SEs.


10 Comparison: Standard FM vs Newey-West SE

comparison <- tibble(
  Factor   = c("Market (MKT)", "Size (SMB)", "Value (HML)"),
  FM_SE    = c(sd(step1$lam_MKT) / sqrt(nrow(step1)),
               sd(step1$lam_SMB) / sqrt(nrow(step1)),
               sd(step1$lam_HML) / sqrt(nrow(step1))),
  FM_tstat = c(test_MKT$statistic, test_SMB$statistic, test_HML$statistic),
  FM_pval  = c(test_MKT$p.value,   test_SMB$p.value,   test_HML$p.value),
  NW_SE    = nw_results$nw_se,
  NW_tstat = nw_results$nw_tstat,
  NW_pval  = nw_results$nw_pval
)

comparison %>%
  mutate(across(where(is.numeric), ~ round(.x, 5))) %>%
  kable(
    col.names = c("Factor",
                  "FM Std. Error", "FM t-stat", "FM p-value",
                  "NW Std. Error", "NW t-stat", "NW p-value"),
    caption   = "Table 8: Standard FM vs Newey-West Standard Errors"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  add_header_above(c(" " = 1,
                     "Standard Fama-MacBeth" = 3,
                     "Newey-West HAC" = 3)) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 8: Standard FM vs Newey-West Standard Errors
Standard Fama-MacBeth
Newey-West HAC
Factor FM Std. Error FM t-stat FM p-value NW Std. Error NW t-stat NW p-value
Market (MKT) 0.00109 -0.37879 0.70491 0.00102 -0.40479 0.68570
Size (SMB) 0.00377 0.97712 0.32870 0.00297 1.24018 0.21514
Value (HML) 0.00259 -0.18044 0.85684 0.00223 -0.20970 0.83393
comparison %>%
  select(Factor, FM_SE, NW_SE) %>%
  pivot_longer(c(FM_SE, NW_SE),
               names_to = "method", values_to = "se") %>%
  mutate(method = recode(method,
                         FM_SE = "Standard FM",
                         NW_SE = "Newey-West HAC")) %>%
  ggplot(aes(x = Factor, y = se, fill = method)) +
  geom_col(position = "dodge", colour = "white", alpha = 0.9) +
  scale_fill_colorblind() +
  scale_y_continuous(labels = scientific_format()) +
  labs(
    title    = "Standard Error Comparison by Factor",
    subtitle = "Standard FM vs Newey-West HAC (5 lags)",
    x = NULL, y = "Standard Error", fill = "Method"
  ) +
  theme_clean()
Figure 9: SE Comparison - Standard FM vs Newey-West

Figure 9: SE Comparison - Standard FM vs Newey-West


11 Summary & Interpretation

11.1 Results Overview

tibble(
  Factor     = c("Market (MKT)", "Size (SMB)", "Value (HML)"),
  Mean_Lam   = round(c(mean(step1$lam_MKT),
                       mean(step1$lam_SMB),
                       mean(step1$lam_HML)), 6),
  Sign       = c(
    ifelse(mean(step1$lam_MKT) > 0, "Positive", "Negative"),
    ifelse(mean(step1$lam_SMB) > 0, "Positive", "Negative"),
    ifelse(mean(step1$lam_HML) > 0, "Positive", "Negative")
  ),
  FF3_Pred   = c("Positive (CAPM/FF3)", "Positive (FF3)", "Positive (FF3)"),
  Consistent = c(
    ifelse(mean(step1$lam_MKT) > 0, "Yes", "No"),
    ifelse(mean(step1$lam_SMB) > 0, "Yes", "No"),
    ifelse(mean(step1$lam_HML) > 0, "Yes", "No")
  ),
  FM_Sig  = c(
    ifelse(test_MKT$p.value < 0.05, "Yes (5%)", "No"),
    ifelse(test_SMB$p.value < 0.05, "Yes (5%)", "No"),
    ifelse(test_HML$p.value < 0.05, "Yes (5%)", "No")
  ),
  NW_Sig  = c(
    ifelse(nw_results$nw_pval[1] < 0.05, "Yes (5%)", "No"),
    ifelse(nw_results$nw_pval[2] < 0.05, "Yes (5%)", "No"),
    ifelse(nw_results$nw_pval[3] < 0.05, "Yes (5%)", "No")
  )
) %>%
  kable(
    col.names = c("Factor", "Mean Lambda", "Sign",
                  "FF3 Prediction", "Consistent?",
                  "FM Significant?", "NW Significant?"),
    caption   = "Table 9: Economic Interpretation of Fama-MacBeth Risk Premia"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  row_spec(0, background = "#2c3e50", color = "white")
Table 9: Economic Interpretation of Fama-MacBeth Risk Premia
Factor Mean Lambda Sign FF3 Prediction Consistent? FM Significant? NW Significant?
Market (MKT) -0.000412 Negative Positive (CAPM/FF3) No No No
Size (SMB) 0.003683 Positive Positive (FF3) Yes No No
Value (HML) -0.000467 Negative Positive (FF3) No No No

11.2 Theoretical Predictions for Fama-MacBeth Risk Premia

Under the Fama-French three-factor framework:

  • Market (MKT): \(\lambda_{MKT}\) should be positive — investors require compensation for non-diversifiable market risk.
  • Size (SMB): \(\lambda_{SMB}\) should be positive — small-cap stocks historically earn higher returns, consistent with a priced size risk factor.
  • Value (HML): \(\lambda_{HML}\) should be positive — value stocks earn a premium over growth stocks, consistent with distress-risk pricing.

11.3 Discussion

  1. Market Factor (MKT): Statistical significance at daily frequency is typically weak because the daily market premium is small. Annualised estimates should be benchmarked against the historical equity premium (~5–7%).

  2. Size Factor (SMB): This sample is dominated by large/mid-cap names (AAPL, IBM, MSFT, GE). The SMB premium may be attenuated relative to a full Fama-French universe spanning thousands of stocks.

  3. Value Factor (HML): The 2011–2015 period coincided with a growth-stock bull run — particularly in technology — which can suppress or invert the value premium in sub-period analyses.

  4. Sample size limitation: With \(N = 6\) stocks, the Pass 2 cross-sectional regressions have only 2 degrees of freedom. A larger cross-section is required for reliable risk premia estimates.

  5. Standard FM vs Newey-West: When \(\lambda_t\) series exhibit positive autocorrelation, standard FM standard errors are downward-biased. Newey-West SEs provide conservative, more reliable inference in finite samples.