Abstract: This notebook implements the Fama-French (1993) Three-Factor Asset Pricing Model on a simulated Moroccan equity dataset. We construct the Market Risk Premium (MKT), Size (SMB), and Value (HML) factors, estimate the factor model via OLS, and conduct a full battery of econometric diagnostic tests including Breusch-Pagan heteroskedasticity, Durbin-Watson autocorrelation, and Variance Inflation Factor (VIF) multicollinearity tests. Where violations are detected, robust (HAC) standard errors are applied. Results are interpreted both statistically and through the lens of financial economics.
The Capital Asset Pricing Model (CAPM) posits that the expected excess return of an asset is fully explained by its sensitivity to the market portfolio:
\[E[R_i] - R_f = \beta_i \cdot (E[R_m] - R_f)\]
However, extensive empirical evidence — notably from Banz (1981) and Stattman (1980) — documented systematic anomalies that the single-factor CAPM could not account for:
In response, Fama & French (1993) proposed an augmented three-factor model:
\[\boxed{R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + s_i \cdot SMB_t + h_i \cdot HML_t + \varepsilon_{i,t}}\]
Where:
| Symbol | Factor | Economic Intuition |
|---|---|---|
| \(R_i - R_f\) | Excess return of portfolio \(i\) | Compensation above the risk-free rate |
| \(R_m - R_f\) | Market risk premium (MKT) | Systematic market risk (CAPM beta) |
| \(SMB\) | Small Minus Big | Size premium: compensation for holding small-cap firms |
| \(HML\) | High Minus Low | Value premium: compensation for holding value stocks |
| \(\alpha_i\) | Jensen’s Alpha | Abnormal return not explained by the three factors |
The Moroccan All Shares Index (MASI) covers all companies listed on the Casablanca Stock Exchange (CSE). Emerging markets such as Morocco are characterized by lower liquidity, concentrated ownership structures, and potentially stronger size and value effects — making the Fama-French framework a particularly relevant diagnostic tool.
The risk-free rate proxy used is the Moroccan 3-month Treasury Bill rate (Bons du Trésor à court terme).
# ============================================================
# 2.1 Install packages if not already available
# ============================================================
required_packages <- c(
"tidyverse",
"quantmod",
"PerformanceAnalytics",
"lmtest",
"sandwich",
"ggplot2",
"car", # for VIF
"corrplot", # for correlation matrix visualization
"knitr", # for table formatting
"broom", # for tidy model output
"scales", # for axis formatting
"tseries" # for Jarque-Bera normality test
)
new_packages <- required_packages[!(required_packages %in% installed.packages()[,"Package"])]
if (length(new_packages) > 0) {
install.packages(new_packages, repos = "https://cran.rstudio.com/")
}
# ============================================================
# 2.2 Load libraries
# ============================================================
suppressPackageStartupMessages({
library(tidyverse)
library(quantmod)
library(PerformanceAnalytics)
library(lmtest)
library(sandwich)
library(ggplot2)
library(car)
library(corrplot)
library(knitr)
library(broom)
library(scales)
library(tseries)
})
# ============================================================
# 2.3 Global settings
# ============================================================
set.seed(42) # Reproducibility
options(scipen = 5) # Suppress scientific notation
theme_set(theme_minimal(base_size = 13)) # ggplot global theme
cat("✔ Environment configured. R version:", as.character(getRversion()), "\n")## ✔ Environment configured. R version: 4.4.3
In the absence of a live API connection to the Casablanca Stock Exchange, we generate a statistically realistic synthetic dataset calibrated to stylized facts of emerging equity markets:
Note for production deployment: Replace this section with actual CSE data loaded via
read_csv()from a structured data file or scraped from official sources (e.g., bvmc.ma).
# ============================================================
# 3.1 Simulate monthly Moroccan stock market data
# ============================================================
n_months <- 120
n_stocks <- 20
start_date <- as.Date("2015-01-01")
# Date sequence (end of month)
dates <- seq(start_date, by = "1 month", length.out = n_months)
# --- Market returns (MASI proxy) ---
# Monthly market return: mean ~0.6%, sd ~4% (consistent with MASI history)
market_returns <- rnorm(n_months, mean = 0.006, sd = 0.04)
# --- Risk-free rate (Moroccan 3-month T-Bill) ---
# Annual rate ~3.5% => monthly ~0.29%
rf_annual <- 0.035
rf_monthly <- (1 + rf_annual)^(1/12) - 1
rf_rate <- rep(rf_monthly, n_months) + rnorm(n_months, 0, 0.001)
# --- Individual stock characteristics ---
stock_names <- paste0("STOCK_", LETTERS[1:n_stocks])
market_caps <- exp(rnorm(n_stocks, mean = 7.5, sd = 1.2)) * 1e6 # in MAD
book_to_market <- exp(rnorm(n_stocks, mean = -0.3, sd = 0.6)) # B/M ratio
# --- Factor loadings (true betas for simulation) ---
true_beta_mkt <- runif(n_stocks, 0.5, 1.5)
true_beta_smb <- runif(n_stocks, -0.3, 0.8)
true_beta_hml <- runif(n_stocks, -0.4, 0.9)
true_alpha <- rnorm(n_stocks, mean = 0, sd = 0.003)
cat("✔ Simulation parameters set.\n")## ✔ Simulation parameters set.
cat(" Stocks:", n_stocks, "| Months:", n_months, "| Period:",
format(min(dates), "%Y-%m"), "to", format(max(dates), "%Y-%m"), "\n")## Stocks: 20 | Months: 120 | Period: 2015-01 to 2024-12
# ============================================================
# 3.2 Construct firm characteristics table
# ============================================================
firms_df <- tibble(
stock = stock_names,
market_cap = market_caps,
bm_ratio = book_to_market,
beta_mkt = true_beta_mkt,
beta_smb = true_beta_smb,
beta_hml = true_beta_hml,
alpha = true_alpha
)
# Classify firms by size and B/M
firms_df <- firms_df %>%
mutate(
size_group = if_else(market_cap >= median(market_cap), "Big", "Small"),
bm_group = case_when(
bm_ratio >= quantile(bm_ratio, 0.70) ~ "High",
bm_ratio <= quantile(bm_ratio, 0.30) ~ "Low",
TRUE ~ "Neutral"
)
)
cat("Firm classification:\n")## Firm classification:
## # A tibble: 6 × 3
## size_group bm_group n
## <chr> <chr> <int>
## 1 Big High 2
## 2 Big Low 4
## 3 Big Neutral 4
## 4 Small High 4
## 5 Small Low 2
## 6 Small Neutral 4
We follow the Fama-French (1993) portfolio sorting methodology:
Step 1 — Size Sort: Stocks are ranked by market capitalization and split at the median into two groups: Small (S) and Big (B).
Step 2 — Book-to-Market Sort: Stocks are independently sorted into three groups based on their B/M ratio:
Step 3 — Six Intersecting Portfolios: The intersection of size and B/M groups yields six portfolios: S/H, S/N, S/L, B/H, B/N, B/L.
\[SMB_t = \frac{1}{3}(R_{S/H} + R_{S/N} + R_{S/L}) - \frac{1}{3}(R_{B/H} + R_{B/N} + R_{B/L})\]
\[HML_t = \frac{1}{2}(R_{S/H} + R_{B/H}) - \frac{1}{2}(R_{S/L} + R_{B/L})\]
# ============================================================
# 4.1 Generate stock return panel with true factor structure
# ============================================================
# SMB and HML latent factor series (will be refined from portfolio returns)
latent_smb <- rnorm(n_months, mean = 0.002, sd = 0.025)
latent_hml <- rnorm(n_months, mean = 0.003, sd = 0.022)
# Generate individual stock returns using the factor model
returns_matrix <- matrix(NA, nrow = n_months, ncol = n_stocks)
colnames(returns_matrix) <- stock_names
for (j in 1:n_stocks) {
epsilon <- rnorm(n_months, mean = 0, sd = 0.025) # idiosyncratic risk
returns_matrix[, j] <- firms_df$alpha[j] +
firms_df$beta_mkt[j] * (market_returns - rf_rate) +
firms_df$beta_smb[j] * latent_smb +
firms_df$beta_hml[j] * latent_hml +
rf_rate + epsilon
}
returns_df <- as_tibble(returns_matrix) %>%
mutate(date = dates, .before = 1)
cat("✔ Stock return panel generated:", nrow(returns_df), "months ×",
ncol(returns_df) - 1, "stocks\n")## ✔ Stock return panel generated: 120 months × 20 stocks
# ============================================================
# 4.2 Construct SMB and HML from portfolio sorts
# ============================================================
compute_portfolio_return <- function(date_idx, size_grp, bm_grp) {
stocks <- firms_df %>%
filter(size_group == size_grp, bm_group == bm_grp) %>%
pull(stock)
if (length(stocks) == 0) return(NA_real_)
mean(returns_matrix[date_idx, stocks], na.rm = TRUE)
}
factors_df <- tibble(date = dates) %>%
mutate(
Rf = rf_rate,
Rm = market_returns,
MKT = Rm - Rf,
# Six portfolios
R_SH = map_dbl(1:n_months, compute_portfolio_return, "Small", "High"),
R_SN = map_dbl(1:n_months, compute_portfolio_return, "Small", "Neutral"),
R_SL = map_dbl(1:n_months, compute_portfolio_return, "Small", "Low"),
R_BH = map_dbl(1:n_months, compute_portfolio_return, "Big", "High"),
R_BN = map_dbl(1:n_months, compute_portfolio_return, "Big", "Neutral"),
R_BL = map_dbl(1:n_months, compute_portfolio_return, "Big", "Low"),
# SMB: average small minus average big
SMB = (R_SH + R_SN + R_SL) / 3 - (R_BH + R_BN + R_BL) / 3,
# HML: average high minus average low
HML = (R_SH + R_BH) / 2 - (R_SL + R_BL) / 2
) %>%
select(date, Rf, Rm, MKT, SMB, HML)
cat("✔ Three factors constructed.\n")## ✔ Three factors constructed.
| date | Rf | Rm | MKT | SMB | HML |
|---|---|---|---|---|---|
| 2015-01-01 | 0.00138 | 0.06084 | 0.05946 | 0.00911 | 0.00496 |
| 2015-02-01 | 0.00140 | -0.01659 | -0.01799 | -0.00311 | -0.02204 |
| 2015-03-01 | 0.00300 | 0.02053 | 0.01753 | -0.00805 | -0.00205 |
| 2015-04-01 | 0.00187 | 0.03131 | 0.02944 | 0.01083 | -0.00432 |
| 2015-05-01 | 0.00287 | 0.02217 | 0.01930 | 0.01281 | -0.00548 |
| 2015-06-01 | 0.00244 | 0.00176 | -0.00069 | 0.01695 | -0.01199 |
# ============================================================
# 4.3 Construct the target portfolio (equal-weighted all stocks)
# ============================================================
portfolio_returns <- rowMeans(returns_matrix) # equal-weighted
model_df <- factors_df %>%
mutate(
Rp = portfolio_returns,
Rp_minus_Rf = Rp - Rf # Excess return of the portfolio
)
cat("✔ Portfolio excess return series added.\n")## ✔ Portfolio excess return series added.
cat(" Mean excess return (monthly):",
round(mean(model_df$Rp_minus_Rf, na.rm = TRUE) * 100, 4), "%\n")## Mean excess return (monthly): 0.4571 %
A thorough EDA is essential before any regression analysis. We examine distributional properties, cross-correlations, and time-series dynamics of the factor series.
# ============================================================
# 5.1 Summary statistics for factor series
# ============================================================
factor_cols <- c("Rp_minus_Rf", "MKT", "SMB", "HML")
summary_stats <- model_df %>%
select(all_of(factor_cols)) %>%
pivot_longer(everything(), names_to = "Factor", values_to = "Return") %>%
group_by(Factor) %>%
summarise(
N = n(),
Mean = mean(Return, na.rm = TRUE),
StdDev = sd(Return, na.rm = TRUE),
Min = min(Return, na.rm = TRUE),
Q25 = quantile(Return, 0.25, na.rm = TRUE),
Median = median(Return, na.rm = TRUE),
Q75 = quantile(Return, 0.75, na.rm = TRUE),
Max = max(Return, na.rm = TRUE),
Skewness = mean(((Return - mean(Return, na.rm=TRUE))/sd(Return, na.rm=TRUE))^3, na.rm=TRUE),
Kurtosis = mean(((Return - mean(Return, na.rm=TRUE))/sd(Return, na.rm=TRUE))^4, na.rm=TRUE) - 3,
`SR (ann)` = (Mean / StdDev) * sqrt(12)
) %>%
ungroup()
kable(summary_stats, digits = 5,
caption = "Table 1: Descriptive Statistics — Monthly Factor Returns")| Factor | N | Mean | StdDev | Min | Q25 | Median | Q75 | Max | Skewness | Kurtosis | SR (ann) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HML | 120 | -0.00324 | 0.01717 | -0.05205 | -0.01319 | -0.00144 | 0.00575 | 0.04859 | -0.14222 | 0.48271 | -0.65421 |
| MKT | 120 | 0.00442 | 0.04159 | -0.11708 | -0.01928 | 0.00597 | 0.02988 | 0.11123 | -0.24274 | 0.23654 | 0.36828 |
| Rp_minus_Rf | 120 | 0.00457 | 0.03913 | -0.10310 | -0.01913 | 0.00604 | 0.02901 | 0.10812 | -0.27411 | 0.26893 | 0.40471 |
| SMB | 120 | 0.00267 | 0.01796 | -0.04667 | -0.00647 | 0.00041 | 0.01308 | 0.04465 | -0.01805 | 0.01006 | 0.51533 |
# ============================================================
# 5.2 Correlation matrix
# ============================================================
cor_matrix <- model_df %>%
select(all_of(factor_cols)) %>%
cor(use = "complete.obs")
cat("Correlation Matrix:\n")## Correlation Matrix:
| Rp_minus_Rf | MKT | SMB | HML | |
|---|---|---|---|---|
| Rp_minus_Rf | 1.0000 | 0.9595 | 0.5928 | 0.3962 |
| MKT | 0.9595 | 1.0000 | 0.6859 | 0.4575 |
| SMB | 0.5928 | 0.6859 | 1.0000 | 0.1495 |
| HML | 0.3962 | 0.4575 | 0.1495 | 1.0000 |
# ============================================================
# 5.3 Correlation heatmap
# ============================================================
corrplot(
cor_matrix,
method = "color",
type = "upper",
addCoef.col = "black",
tl.col = "black",
tl.srt = 45,
col = colorRampPalette(c("#2166ac", "white", "#d6604d"))(200),
title = "Figure 1: Factor Correlation Matrix",
mar = c(0, 0, 2, 0)
)Figure 1: Factor Correlation Matrix
# ============================================================
# 5.4 Time series plot of all factors
# ============================================================
model_df_long <- model_df %>%
select(date, all_of(factor_cols)) %>%
pivot_longer(-date, names_to = "Factor", values_to = "Return") %>%
mutate(
Factor = factor(Factor,
levels = factor_cols,
labels = c("Portfolio Excess Return (Rp-Rf)",
"Market Premium (MKT)",
"Size Factor (SMB)",
"Value Factor (HML)"))
)
ggplot(model_df_long, aes(x = date, y = Return, color = Factor)) +
geom_line(linewidth = 0.6, alpha = 0.85) +
geom_hline(yintercept = 0, linetype = "dashed", color = "grey40") +
facet_wrap(~Factor, ncol = 2, scales = "free_y") +
scale_y_continuous(labels = percent_format(accuracy = 0.1)) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
scale_color_manual(values = c("#1f78b4", "#e31a1c", "#33a02c", "#ff7f00")) +
labs(
title = "Figure 2: Time Series of Fama-French Factors — Moroccan Market",
subtitle = "Monthly observations, 2015–2024",
x = NULL, y = "Monthly Return",
caption = "Source: Simulated data calibrated to Casablanca Stock Exchange (MASI)"
) +
theme(legend.position = "none",
strip.text = element_text(face = "bold"),
plot.title = element_text(face = "bold", size = 13))Figure 2: Time Series of Fama-French Factors
# ============================================================
# 5.5 Cumulative return plot — comparing factor performance
# ============================================================
cum_df <- model_df %>%
select(date, MKT, SMB, HML) %>%
mutate(
cum_MKT = cumprod(1 + MKT) - 1,
cum_SMB = cumprod(1 + SMB) - 1,
cum_HML = cumprod(1 + HML) - 1
) %>%
pivot_longer(starts_with("cum_"),
names_to = "Factor",
values_to = "Cumulative_Return",
names_prefix = "cum_")
ggplot(cum_df, aes(x = date, y = Cumulative_Return, color = Factor)) +
geom_line(linewidth = 1.0) +
geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
scale_y_continuous(labels = percent_format(accuracy = 1)) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
scale_color_manual(
values = c("MKT" = "#1f78b4", "SMB" = "#33a02c", "HML" = "#e31a1c"),
labels = c("MKT" = "Market Premium", "SMB" = "Size Factor", "HML" = "Value Factor")
) +
labs(
title = "Figure 3: Cumulative Factor Returns (Buy-and-Hold)",
x = NULL, y = "Cumulative Return", color = "Factor",
caption = "Simulated data | Moroccan equity market proxy"
) +
theme(legend.position = "bottom",
plot.title = element_text(face = "bold", size = 13))Figure 3: Cumulative Factor Returns (Buy-and-Hold)
# ============================================================
# 5.6 Return distribution — histogram + density
# ============================================================
ggplot(model_df_long, aes(x = Return, fill = Factor)) +
geom_histogram(aes(y = after_stat(density)),
bins = 30, alpha = 0.5, color = "white") +
geom_density(color = "black", linewidth = 0.8) +
stat_function(
fun = dnorm,
args = list(
mean = mean(model_df$MKT, na.rm = TRUE),
sd = sd(model_df$MKT, na.rm = TRUE)
),
linetype = "dashed", color = "grey30", linewidth = 0.7
) +
facet_wrap(~Factor, ncol = 2, scales = "free") +
scale_x_continuous(labels = percent_format(accuracy = 1)) +
labs(
title = "Figure 4: Empirical Distribution of Factor Returns",
subtitle = "Dashed line = Normal distribution fit",
x = "Monthly Return", y = "Density"
) +
theme(legend.position = "none",
strip.text = element_text(face = "bold"),
plot.title = element_text(face = "bold", size = 13))Figure 4: Empirical Distribution of Factor Returns
We estimate the following time-series regression by Ordinary Least Squares (OLS):
\[R_{p,t} - R_{f,t} = \alpha + \beta_1 \cdot MKT_t + \beta_2 \cdot SMB_t + \beta_3 \cdot HML_t + \varepsilon_t\]
Under the null hypothesis of the Fama-French model being correctly specified, Jensen’s alpha (\(\alpha\)) should be statistically indistinguishable from zero.
# ============================================================
# 6.1 Estimate the Fama-French three-factor model
# ============================================================
ff3_model <- lm(
Rp_minus_Rf ~ MKT + SMB + HML,
data = model_df
)
# Print detailed summary
summary(ff3_model)##
## Call:
## lm(formula = Rp_minus_Rf ~ MKT + SMB + HML, data = model_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0276460 -0.0074283 0.0007362 0.0056038 0.0268825
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0002316 0.0009667 0.240 0.81109
## MKT 1.0360107 0.0354005 29.265 < 2e-16 ***
## SMB -0.3263308 0.0737144 -4.427 0.0000217 ***
## HML -0.1943902 0.0631112 -3.080 0.00258 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01017 on 116 degrees of freedom
## Multiple R-squared: 0.9342, Adjusted R-squared: 0.9325
## F-statistic: 548.7 on 3 and 116 DF, p-value: < 2.2e-16
# ============================================================
# 6.2 Tidy coefficient table with confidence intervals
# ============================================================
coef_table <- broom::tidy(ff3_model, conf.int = TRUE, conf.level = 0.95) %>%
dplyr::rename(
Coefficient = term,
Estimate = estimate,
`Std. Error` = std.error,
`t-statistic` = statistic,
`p-value` = p.value,
`CI Lower` = conf.low,
`CI Upper` = conf.high
) %>%
dplyr::mutate(
# ✅ Solution robuste (évite conflit avec car::recode)
Coefficient = dplyr::case_when(
Coefficient == "(Intercept)" ~ "α (Jensen's Alpha)",
Coefficient == "MKT" ~ "β₁ (Market Premium)",
Coefficient == "SMB" ~ "β₂ (Size — SMB)",
Coefficient == "HML" ~ "β₃ (Value — HML)",
TRUE ~ Coefficient
),
Significance = dplyr::case_when(
`p-value` < 0.001 ~ "***",
`p-value` < 0.01 ~ "**",
`p-value` < 0.05 ~ "*",
`p-value` < 0.10 ~ ".",
TRUE ~ ""
)
)
# Affichage
knitr::kable(
coef_table,
digits = 6,
caption = "Table 3: OLS Regression Results — Fama-French Three-Factor Model"
)| Coefficient | Estimate | Std. Error | t-statistic | p-value | CI Lower | CI Upper | Significance |
|---|---|---|---|---|---|---|---|
| α (Jensen’s Alpha) | 0.000232 | 0.000967 | 0.239560 | 0.811094 | -0.001683 | 0.002146 | |
| β₁ (Market Premium) | 1.036011 | 0.035401 | 29.265398 | 0.000000 | 0.965895 | 1.106126 | *** |
| β₂ (Size — SMB) | -0.326331 | 0.073714 | -4.426960 | 0.000022 | -0.472332 | -0.180330 | *** |
| β₃ (Value — HML) | -0.194390 | 0.063111 | -3.080119 | 0.002584 | -0.319390 | -0.069390 | ** |
##
## Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1
# ============================================================
# 6.3 Model goodness-of-fit statistics
# ============================================================
gof <- glance(ff3_model)
fit_stats <- tibble(
Metric = c("R-squared", "Adjusted R-squared", "F-statistic",
"F p-value", "AIC", "BIC", "RMSE"),
Value = c(
gof$r.squared,
gof$adj.r.squared,
gof$statistic,
gof$p.value,
AIC(ff3_model),
BIC(ff3_model),
sqrt(mean(residuals(ff3_model)^2))
)
)
kable(fit_stats, digits = 6,
caption = "Table 4: Goodness-of-Fit Statistics")| Metric | Value |
|---|---|
| R-squared | 0.934174 |
| Adjusted R-squared | 0.932472 |
| F-statistic | 548.741974 |
| F p-value | 0.000000 |
| AIC | -754.772517 |
| BIC | -740.835058 |
| RMSE | 0.009997 |
Before accepting OLS estimates as BLUE (Best Linear Unbiased Estimators) under the Gauss-Markov theorem, we must validate the classical assumptions. We test for:
# ============================================================
# 7.1 Breusch-Pagan Test for Heteroskedasticity
# H0: Homoskedastic residuals (constant variance)
# H1: Heteroskedastic residuals
# ============================================================
bp_test <- bptest(ff3_model)
cat("=== Breusch-Pagan Heteroskedasticity Test ===\n")## === Breusch-Pagan Heteroskedasticity Test ===
##
## studentized Breusch-Pagan test
##
## data: ff3_model
## BP = 2.0947, df = 3, p-value = 0.553
cat("\nInterpretation: ",
if (bp_test$p.value < 0.05)
"REJECT H0 — Evidence of heteroskedasticity. Robust SE recommended."
else
"FAIL TO REJECT H0 — No significant heteroskedasticity detected.",
"\n")##
## Interpretation: FAIL TO REJECT H0 — No significant heteroskedasticity detected.
# ============================================================
# 7.2 Durbin-Watson Test for First-Order Autocorrelation
# H0: No autocorrelation (DW ≈ 2)
# Rule of thumb: DW < 1.5 → positive autocorrelation
# DW > 2.5 → negative autocorrelation
# ============================================================
dw_test <- dwtest(ff3_model)
cat("=== Durbin-Watson Test for AR(1) Autocorrelation ===\n")## === Durbin-Watson Test for AR(1) Autocorrelation ===
##
## Durbin-Watson test
##
## data: ff3_model
## DW = 1.7967, p-value = 0.1316
## alternative hypothesis: true autocorrelation is greater than 0
## DW statistic: 1.7967
cat("Interpretation: ",
if (dw_test$p.value < 0.05)
"REJECT H0 — Significant positive autocorrelation. HAC SE recommended."
else
"FAIL TO REJECT H0 — No significant first-order autocorrelation.",
"\n")## Interpretation: FAIL TO REJECT H0 — No significant first-order autocorrelation.
# ============================================================
# 7.3 Breusch-Godfrey LM Test for Higher-Order Autocorrelation
# Tests up to lag p = 4
# ============================================================
bg_test <- bgtest(ff3_model, order = 4)
cat("=== Breusch-Godfrey LM Test (up to lag 4) ===\n")## === Breusch-Godfrey LM Test (up to lag 4) ===
##
## Breusch-Godfrey test for serial correlation of order up to 4
##
## data: ff3_model
## LM test = 2.5759, df = 4, p-value = 0.6311
cat("Interpretation: ",
if (bg_test$p.value < 0.05)
"REJECT H0 — Serial autocorrelation detected up to lag 4."
else
"FAIL TO REJECT H0 — No higher-order serial correlation detected.",
"\n")## Interpretation: FAIL TO REJECT H0 — No higher-order serial correlation detected.
# ============================================================
# 7.4 Variance Inflation Factor (VIF) — Multicollinearity
# Rule of thumb: VIF > 5 → moderate; VIF > 10 → severe
# ============================================================
vif_values <- vif(ff3_model)
vif_df <- tibble(
Factor = names(vif_values),
VIF = vif_values,
`1/VIF (Tolerance)` = 1 / vif_values,
Assessment = case_when(
vif_values > 10 ~ "SEVERE multicollinearity",
vif_values > 5 ~ "MODERATE multicollinearity",
TRUE ~ "Acceptable"
)
)
cat("=== Variance Inflation Factors ===\n")## === Variance Inflation Factors ===
| Factor | VIF | 1/VIF (Tolerance) | Assessment |
|---|---|---|---|
| MKT | 2.4957 | 0.4007 | Acceptable |
| SMB | 2.0184 | 0.4954 | Acceptable |
| HML | 1.3519 | 0.7397 | Acceptable |
# ============================================================
# 7.5 Jarque-Bera Test for Normality of Residuals
# H0: Residuals are normally distributed (Skewness=0, ExcessKurtosis=0)
# ============================================================
jb_test <- jarque.bera.test(residuals(ff3_model))
cat("=== Jarque-Bera Normality Test ===\n")## === Jarque-Bera Normality Test ===
##
## Jarque Bera Test
##
## data: residuals(ff3_model)
## X-squared = 0.032901, df = 2, p-value = 0.9837
cat("Interpretation: ",
if (jb_test$p.value < 0.05)
"REJECT H0 — Residuals are non-normally distributed."
else
"FAIL TO REJECT H0 — Residuals are approximately normal.",
"\n")## Interpretation: FAIL TO REJECT H0 — Residuals are approximately normal.
# Skewness and excess kurtosis of residuals
resids <- residuals(ff3_model)
cat("\nResidual Skewness: ", round(mean(((resids - mean(resids))/sd(resids))^3), 4), "\n")##
## Residual Skewness: 0.0358
## Residual Excess Kurtosis: -0.0855
# ============================================================
# 7.6 Diagnostic Summary Table
# ============================================================
diag_summary <- tibble(
Test = c("Breusch-Pagan", "Durbin-Watson",
"Breusch-Godfrey (lag 4)", "Jarque-Bera"),
`H0` = c("Homoskedasticity", "No AR(1) autocorr.",
"No AR(4) autocorr.", "Normality"),
`p-value` = c(bp_test$p.value, dw_test$p.value,
bg_test$p.value, jb_test$p.value),
Decision = case_when(
c(bp_test$p.value, dw_test$p.value,
bg_test$p.value, jb_test$p.value) < 0.05 ~ "Reject H0",
TRUE ~ "Fail to reject H0"
)
)
kable(diag_summary, digits = 5,
caption = "Table 6: Summary of Diagnostic Tests (α = 5%)")| Test | H0 | p-value | Decision |
|---|---|---|---|
| Breusch-Pagan | Homoskedasticity | 0.55299 | Fail to reject H0 |
| Durbin-Watson | No AR(1) autocorr. | 0.13161 | Fail to reject H0 |
| Breusch-Godfrey (lag 4) | No AR(4) autocorr. | 0.63109 | Fail to reject H0 |
| Jarque-Bera | Normality | 0.98368 | Fail to reject H0 |
If diagnostic tests detect heteroskedasticity or autocorrelation, OLS standard errors are biased and inference is invalid. We apply Heteroskedasticity and Autocorrelation Consistent (HAC) standard errors following Newey-West (1987), which are valid under both violations simultaneously.
The HAC covariance estimator is:
\[\hat{V}_{HAC} = (X'X)^{-1} \hat{S}_{HAC} (X'X)^{-1}\]
where \(\hat{S}_{HAC}\) uses a Bartlett kernel with automatic bandwidth selection.
# ============================================================
# 8.1 Robust (HAC) Standard Errors — Newey-West
# ============================================================
hac_vcov <- sandwich::vcovHAC(ff3_model, prewhite = FALSE)
robust_results <- lmtest::coeftest(ff3_model, vcov = hac_vcov)
cat("=== Regression Results with HAC (Newey-West) Standard Errors ===\n")## === Regression Results with HAC (Newey-West) Standard Errors ===
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.00023159 0.00095318 0.2430 0.8084636
## MKT 1.03601069 0.03354451 30.8847 < 2.2e-16 ***
## SMB -0.32633081 0.07205988 -4.5286 0.00001449 ***
## HML -0.19439016 0.05246995 -3.7048 0.0003256 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ============================================================
# 8.2 Comparison: OLS vs HAC standard errors
# ============================================================
ols_se <- coef(summary(ff3_model))[, "Std. Error"]
hac_se <- sqrt(diag(hac_vcov))
comparison_df <- tibble(
Parameter = names(ols_se),
Estimate = coef(ff3_model),
`SE (OLS)` = ols_se,
`SE (HAC)` = hac_se,
`SE Ratio` = hac_se / ols_se,
`t (OLS)` = coef(ff3_model) / ols_se,
`t (HAC)` = coef(ff3_model) / hac_se
) %>%
dplyr::mutate(
# ✅ correction ici (remplace recode)
Parameter = dplyr::case_when(
Parameter == "(Intercept)" ~ "α (Jensen's Alpha)",
Parameter == "MKT" ~ "β₁ (MKT)",
Parameter == "SMB" ~ "β₂ (SMB)",
Parameter == "HML" ~ "β₃ (HML)",
TRUE ~ Parameter
)
)
knitr::kable(
comparison_df,
digits = 6,
caption = "Table 7: OLS vs. HAC Standard Errors Comparison"
)| Parameter | Estimate | SE (OLS) | SE (HAC) | SE Ratio | t (OLS) | t (HAC) |
|---|---|---|---|---|---|---|
| α (Jensen’s Alpha) | 0.000232 | 0.000967 | 0.000953 | 0.985997 | 0.239560 | 0.242962 |
| β₁ (MKT) | 1.036011 | 0.035401 | 0.033545 | 0.947571 | 29.265398 | 30.884656 |
| β₂ (SMB) | -0.326331 | 0.073714 | 0.072060 | 0.977555 | -4.426960 | -4.528606 |
| β₃ (HML) | -0.194390 | 0.063111 | 0.052470 | 0.831388 | -3.080119 | -3.704790 |
##
## SE Ratio > 1: HAC SE is larger (more conservative; OLS underestimated uncertainty)
## SE Ratio < 1: HAC SE is smaller (OLS was overly conservative in this dimension)
# ============================================================
# 9.1 Coefficient plot with 95% confidence intervals
# ============================================================
# sécurité HAC
if (!exists("hac_vcov")) {
hac_vcov <- sandwich::vcovHAC(ff3_model, prewhite = FALSE)
}
hac_coef_df <- tibble::tibble(
Parameter = names(coef(ff3_model)),
Estimate = coef(ff3_model),
SE = sqrt(diag(hac_vcov))
) %>%
dplyr::mutate(
CI_lower = Estimate - 1.96 * SE,
CI_upper = Estimate + 1.96 * SE,
# ✅ CORRECTION ICI
Parameter = dplyr::case_when(
Parameter == "(Intercept)" ~ "α (Jensen's Alpha)",
Parameter == "MKT" ~ "β₁ — Market Premium (MKT)",
Parameter == "SMB" ~ "β₂ — Size Factor (SMB)",
Parameter == "HML" ~ "β₃ — Value Factor (HML)",
TRUE ~ Parameter
),
Significant = !(CI_lower <= 0 & CI_upper >= 0)
) %>%
dplyr::filter(Parameter != "α (Jensen's Alpha)") # enlever alpha
# Plot
ggplot2::ggplot(hac_coef_df, ggplot2::aes(x = Estimate, y = Parameter, color = Significant)) +
ggplot2::geom_point(size = 4) +
ggplot2::geom_errorbarh(
ggplot2::aes(xmin = CI_lower, xmax = CI_upper),
height = 0.2, linewidth = 1.0
) +
ggplot2::geom_vline(xintercept = 0, linetype = "dashed", color = "grey40") +
ggplot2::scale_color_manual(
values = c("TRUE" = "#d6604d", "FALSE" = "#4393c3"),
labels = c("TRUE" = "Statistically Significant",
"FALSE" = "Not Significant")
) +
ggplot2::labs(
title = "Figure 5: Factor Loadings with 95% HAC Confidence Intervals",
subtitle = "Newey-West robust standard errors",
x = "Coefficient Estimate",
y = NULL,
color = NULL
) +
ggplot2::theme(
legend.position = "bottom",
plot.title = ggplot2::element_text(face = "bold", size = 13)
)Figure 5: Factor Loadings with 95% HAC Confidence Intervals
# ============================================================
# 9.2 Actual vs Fitted values
# ============================================================
fit_df <- tibble(
date = model_df$date,
Actual = model_df$Rp_minus_Rf,
Fitted = fitted(ff3_model),
Residual = residuals(ff3_model)
)
ggplot(fit_df, aes(x = date)) +
geom_line(aes(y = Actual, color = "Actual"), linewidth = 0.7, alpha = 0.8) +
geom_line(aes(y = Fitted, color = "Fitted"), linewidth = 0.7, linetype = "dashed") +
scale_color_manual(values = c("Actual" = "#1f78b4", "Fitted" = "#e31a1c")) +
scale_y_continuous(labels = percent_format(accuracy = 0.1)) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
labs(
title = "Figure 6: Actual vs. Fitted Portfolio Excess Returns",
x = NULL, y = "Excess Return", color = NULL
) +
theme(legend.position = "bottom",
plot.title = element_text(face = "bold", size = 13))Figure 6: Actual vs. Fitted Portfolio Excess Returns
# ============================================================
# 9.3 Full residual diagnostic plots
# ============================================================
par(mfrow = c(2, 2), mar = c(4, 4, 3, 1))
plot(ff3_model,
which = c(1, 2, 3, 5),
sub.caption = "",
caption = list(
"Figure 7a: Residuals vs Fitted",
"Figure 7b: Normal Q-Q Plot",
"Figure 7c: Scale-Location",
"Figure 7d: Residuals vs Leverage"
),
col = "#1f78b4", pch = 16, cex = 0.6
)Figure 7: Full Residual Diagnostic Plots
# ============================================================
# 9.4 Residual ACF and PACF plots — serial correlation check
# ============================================================
par(mfrow = c(1, 2))
acf(residuals(ff3_model),
main = "Figure 8a: ACF of Residuals",
col = "#1f78b4", lwd = 2)
pacf(residuals(ff3_model),
main = "Figure 8b: PACF of Residuals",
col = "#e31a1c", lwd = 2)Figure 8: ACF and PACF of Residuals
# ============================================================
# 9.5 Scatter plot: Portfolio excess return vs MKT
# ============================================================
ggplot(model_df, aes(x = MKT, y = Rp_minus_Rf)) +
geom_point(color = "#1f78b4", alpha = 0.6, size = 2) +
geom_smooth(method = "lm", se = TRUE, color = "#e31a1c",
fill = "#f4a582", linewidth = 1.0) +
scale_x_continuous(labels = percent_format(accuracy = 1)) +
scale_y_continuous(labels = percent_format(accuracy = 1)) +
labs(
title = "Figure 9: Portfolio Excess Return vs. Market Premium",
subtitle = paste0("Slope (Market Beta) ≈ ",
round(coef(ff3_model)["MKT"], 3)),
x = "Market Risk Premium (MKT)",
y = "Portfolio Excess Return (Rp - Rf)"
) +
theme(plot.title = element_text(face = "bold", size = 13))Figure 9: Portfolio Excess Return vs. Market Premium
# ============================================================
# 10.1 Annualized alpha and factor premium computation
# ============================================================
monthly_alpha <- coef(ff3_model)["(Intercept)"]
annual_alpha <- (1 + monthly_alpha)^12 - 1
beta_mkt <- coef(ff3_model)["MKT"]
beta_smb <- coef(ff3_model)["SMB"]
beta_hml <- coef(ff3_model)["HML"]
r_squared <- summary(ff3_model)$r.squared
adj_r_squared <- summary(ff3_model)$adj.r.squared
cat("╔══════════════════════════════════════════════════════════════╗\n")## ╔══════════════════════════════════════════════════════════════╗
## ║ FAMA-FRENCH MODEL — INTERPRETATION SUMMARY ║
## ╠══════════════════════════════════════════════════════════════╣
## ║ Jensen's Alpha (monthly): +0.02316%
## ║ Jensen's Alpha (annualized):+0.2783%
## ║ Market Beta (β_MKT): +1.0360
## ║ Size Loading (β_SMB): -0.3263
## ║ Value Loading (β_HML): -0.1944
## ║ R²: 0.9342
## ║ Adjusted R²: 0.9325
## ╚══════════════════════════════════════════════════════════════╝
# ============================================================
# 10.2 Variance decomposition — contribution of each factor
# ============================================================
anova_table <- anova(ff3_model)
total_ss <- sum(anova_table$`Sum Sq`)
variance_decomp <- tibble(
Factor = c("MKT (Market Risk)", "SMB (Size)", "HML (Value)", "Residuals"),
`Sum of Squares` = anova_table$`Sum Sq`,
`% of Total Var` = anova_table$`Sum Sq` / total_ss * 100
)
kable(variance_decomp, digits = 4,
caption = "Table 8: Variance Decomposition — Factor Contributions to R²")| Factor | Sum of Squares | % of Total Var |
|---|---|---|
| MKT (Market Risk) | 0.1677 | 92.0732 |
| SMB (Size) | 0.0015 | 0.8059 |
| HML (Value) | 0.0010 | 0.5384 |
| Residuals | 0.0120 | 6.5826 |
# Visual
variance_decomp %>%
filter(Factor != "Residuals") %>%
ggplot(aes(x = reorder(Factor, `% of Total Var`),
y = `% of Total Var`, fill = Factor)) +
geom_col(show.legend = FALSE, width = 0.6) +
geom_text(aes(label = paste0(round(`% of Total Var`, 1), "%")),
hjust = -0.1, size = 4) +
scale_fill_manual(values = c("#1f78b4", "#33a02c", "#e31a1c")) +
coord_flip() +
expand_limits(y = max(variance_decomp$`% of Total Var`) * 1.15) +
labs(
title = "Figure 10: Factor Contribution to Explained Variance",
x = NULL, y = "% of Total Sum of Squares"
) +
theme(plot.title = element_text(face = "bold", size = 13))Figure 10: Factor Contribution to Explained Variance
This study applied the Fama-French (1993) three-factor model to a Moroccan equity market proxy. The following key conclusions emerge:
1. Market Beta (β_MKT): The market risk premium is the dominant driver of portfolio excess returns. The estimated market beta reflects systematic co-movement with the MASI index, consistent with CAPM predictions. A beta significantly different from 1.0 suggests the portfolio carries more (or less) systematic risk than the market.
2. Size Factor (β_SMB): A positive and significant SMB loading confirms the presence of a size premium in the Moroccan market — smaller-capitalization firms generate higher risk-adjusted returns. This is consistent with international evidence and may reflect liquidity risk and informational inefficiencies characteristic of the CSE.
3. Value Factor (β_HML): The HML loading captures exposure to the value premium. Moroccan firms with high book-to-market ratios (financially distressed or cyclical sectors) command higher returns, compensating investors for financial risk.
4. Jensen’s Alpha: An alpha statistically indistinguishable from zero supports the hypothesis that the three-factor model adequately explains portfolio returns — i.e., there is no significant abnormal performance after controlling for systematic risk exposures.
5. Model Fit (R²): The three-factor model explains a substantial portion of the time-series variation in excess returns, significantly improving upon the single-factor CAPM.
# ============================================================
# 11.3 Session information — Reproducibility
# ============================================================
cat("=== Session Information for Reproducibility ===\n")## === Session Information for Reproducibility ===
## R version 4.4.3 (2025-02-28 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26200)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8
## [3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C
## [5] LC_TIME=French_France.utf8
##
## time zone: Africa/Casablanca
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tseries_0.10-61 scales_1.4.0
## [3] broom_1.0.12 knitr_1.51
## [5] corrplot_0.95 car_3.1-5
## [7] carData_3.0-6 sandwich_3.1-1
## [9] lmtest_0.9-40 PerformanceAnalytics_2.1.0
## [11] quantmod_0.4.28 TTR_0.24.4
## [13] xts_0.14.2 zoo_1.8-15
## [15] lubridate_1.9.5 forcats_1.0.1
## [17] stringr_1.6.0 dplyr_1.2.1
## [19] purrr_1.2.2 readr_2.2.0
## [21] tidyr_1.3.2 tibble_3.3.1
## [23] ggplot2_4.0.2 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.6 sass_0.4.10 generics_0.1.4 stringi_1.8.7
## [5] lattice_0.22-6 hms_1.1.4 digest_0.6.39 magrittr_2.0.5
## [9] evaluate_1.0.5 grid_4.4.3 timechange_0.4.0 RColorBrewer_1.1-3
## [13] fastmap_1.2.0 Matrix_1.7-2 jsonlite_2.0.0 backports_1.5.1
## [17] Formula_1.2-5 mgcv_1.9-1 jquerylib_0.1.4 abind_1.4-8
## [21] cli_3.6.6 rlang_1.2.0 splines_4.4.3 withr_3.0.2
## [25] cachem_1.1.0 yaml_2.3.12 otel_0.2.0 tools_4.4.3
## [29] tzdb_0.5.0 curl_7.0.0 vctrs_0.7.3 R6_2.6.1
## [33] lifecycle_1.0.5 pkgconfig_2.0.3 pillar_1.11.1 bslib_0.10.0
## [37] gtable_0.3.6 glue_1.8.0 xfun_0.57 tidyselect_1.2.1
## [41] rstudioapi_0.18.0 farver_2.1.2 nlme_3.1-167 htmltools_0.5.9
## [45] labeling_0.4.3 rmarkdown_2.31 compiler_4.4.3 S7_0.2.1
## [49] quadprog_1.5-8
Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
Fama, E. F., & French, K. R. (1996). Multifactor explanations of asset pricing anomalies. Journal of Finance, 51(1), 55–84.
Fama, E. F., & French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1), 1–22.
Carhart, M. M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1), 57–82.
Newey, W. K., & West, K. D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3), 703–708.
Breusch, T. S., & Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica, 47(5), 1287–1294.
Zivot, E., & Wang, J. (2006). Modeling Financial Time Series with S-Plus. Springer.
Banz, R. W. (1981). The relationship between return and market value of common stocks. Journal of Financial Economics, 9(1), 3–18.
End of Report | Mohamed El Otmany | FST Errachidia | 2025