Abstract

When a car manufacturer recalls millions of vehicles, does Wall Street flinch? And does it matter why the recall happened (a failed sensor versus a cracked axle) or how many vehicles are affected? As automobiles grow increasingly software-defined, recalls have quietly shifted from grease-and-metal problems to lines of corrupted code, yet it remains unclear whether financial markets have noticed, or care.

This poster examines the relationship between National Highway Traffic Safety Administration (NHTSA) vehicle recalls and stock market performance across six major manufacturers in the U.S. automotive market (Ford, GM, Stellantis, Toyota, Honda, and Nissan) from 2000 to 2025. Drawing on a dataset of 3,532 recalls linked to daily stock return data and covering an estimated 517 million vehicle-units, we ask two questions: Have technology-driven recalls grown at a significantly higher rate than mechanical recalls over this period? And does recall severity systematically translate into abnormal stock returns?

We employ negative binomial regression to characterize recall trends and seasonal patterns, and a two-way fixed effects (TWFE) panel model to estimate the within-brand financial impact of recall severity.

Keywords: NHTSA recalls · automotive industry · stock returns · two-way fixed effects · negative binomial regression · technology-driven recalls · recall severity


1 Introduction

Vehicle safety recalls represent one of the most visible and recurrent operational risk events in the U.S. automotive industry. Mandated by the National Traffic and Motor Vehicle Safety Act, the recall system requires manufacturers to notify registered owners of defects and provide a remedy at no cost, creating both direct remediation expenditures and indirect reputational consequences. Between 2000 and 2025, the six largest U.S. light-vehicle manufacturers collectively issued 3,532 vehicle recalls covering an estimated 517 million vehicle-units, roughly 1.6 vehicles for every person in the United States.

The financial implications of recalls are theoretically ambiguous. A recall announcement transmits negative information: it reveals a previously unpriced product defect, signals potential liability costs, and may damage brand equity accumulated over decades. Yet under semi-strong market efficiency, stock prices should already reflect observable recall risk continuously, leaving individual announcements with limited incremental information content. Whether investors update valuations upon specific recall events, and whether the magnitude of their response scales with severity, is ultimately an empirical question with direct relevance to corporate risk management, insurance pricing, and investment strategy.

Research questions:

  1. How have vehicle recalls changed across the automotive industry from 2000 to 2025 in terms of trend, technological composition, and seasonal patterns?

  2. Do recalls affect company stock returns, and does recall severity strengthen that relationship?

To answer these questions, we combine NHTSA microdata with daily stock price histories and apply two complementary econometric frameworks.

For the recall trend analysis, we use a negative binomial regression model with monthly seasonality terms and a Bai-Perron structural break test to detect the technology inflection.

For the financial impact analysis, we estimate a TWFE panel model with brand-by-severity interaction terms, where date fixed effects absorb all common daily market movements, rendering the model robust to market-wide shocks (including S&P 500 variation) without requiring an explicit market-return regressor.
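
The trend model described above can be sketched on simulated data. The block below is a minimal illustration, not the study's estimation code: the monthly counts are generated with `rnbinom()`, and the variable names (`sim`, `t`, `n_recalls`) and the chosen trend and seasonal effects are assumptions made only for the example. It fits a negative binomial regression with a linear year trend and month fixed effects using `MASS::glm.nb()` and converts the trend coefficient into an annual incidence rate ratio. The Bai-Perron break test is not included.

```r
library(MASS)  # glm.nb()

set.seed(42)

# Simulated monthly recall counts for 2000-2025 (illustrative, not the study data)
sim <- expand.grid(month = 1:12, year = 2000:2025)
sim$t <- sim$year - 2000  # years since the start of the sample

# Assumed data-generating process: 3% annual growth plus spring/fall peaks
mu <- exp(1.5 + 0.03 * sim$t + 0.25 * (sim$month %in% c(3, 4, 10, 11)))
sim$n_recalls <- rnbinom(nrow(sim), size = 5, mu = mu)

# Negative binomial regression: linear year trend plus month fixed effects
nb_fit <- glm.nb(n_recalls ~ t + factor(month), data = sim)

# Annual incidence rate ratio: exp(beta_t); values above 1 indicate growth
irr_annual <- exp(coef(nb_fit)[["t"]])
print(round(irr_annual, 3))
```

An IRR above 1 corresponds to the upward annual trend reported in the key findings; brand-specific trends would require fitting the model per brand or adding brand interactions.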


2 Data Construction

2.1 NHTSA Recall Data

The primary data source is NHTSA’s publicly available recall database, filtered to passenger vehicle recalls issued between January 2000 and December 2025. After restricting to the six study manufacturers and removing non-vehicle recall types (equipment, tire, child seat), the working sample comprises 3,532 recall events associated with an estimated total of 517 million potentially affected vehicle-units.

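# Packages used by the chunks in this document (see the reproducibility note).
library(tidyverse)   # dplyr, stringr, tidyr, ggplot2
library(lubridate)   # mdy(), year(), month()
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), row_spec(), footnote()
library(fixest)      # feols(), i()
# map_brand(), NAVY, sev_levels and brand_order are not defined in this section;
# they are assumed to come from a setup chunk that is not shown.

raw <- read.csv("BANL.csv", stringsAsFactors = FALSE)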

recalls <- raw %>%
  dplyr::rename(
    recall_date  = Column1,
    manufacturer = Manufacturer,
    recall_type  = Recall.Type,
    component    = Component,
    consequence  = Consequence.Summary,
    do_not_drive = Do.Not.Drive.Advisory,
    affected_raw = Potentially.Affected
  ) %>%
  dplyr::mutate(
    recall_date = mdy(recall_date),
    year        = year(recall_date),
    month       = month(recall_date),
    brand       = map_brand(manufacturer),
    component   = str_trim(str_to_upper(component)),
    affected    = as.numeric(str_replace_all(affected_raw, ",", ""))
  ) %>%
  dplyr::filter(
    recall_type == "Vehicle",
    year >= 2000, year <= 2025,
    !is.na(recall_date), !is.na(brand)
  )

# Table 1: recall counts and affected vehicles by brand
recalls %>%
  group_by(brand) %>%
  dplyr::summarise(
    N          = n(),
    Pct        = round(n()/nrow(recalls)*100, 1),
    Mean_aff   = round(mean(affected, na.rm=TRUE), 0),
    Total_aff  = sum(affected, na.rm=TRUE),
    First_year = min(year),
    Last_year  = max(year),
    .groups    = "drop"
  ) %>%
  dplyr::arrange(desc(N)) %>%
  dplyr::mutate(
    Mean_aff  = scales::comma(Mean_aff),
    Total_aff = scales::comma(Total_aff),
    Period    = paste0(First_year, "-", Last_year)
  ) %>%
  dplyr::select(
    Brand                     = brand,
    `N Recalls`               = N,
    `% of Total`              = Pct,
    `Mean Vehicles Affected`  = Mean_aff,
    `Total Vehicles Affected` = Total_aff,
    Period
  ) %>%
  kable(
    caption = "Table 1: Recall dataset summary by manufacturer, 2000-2025",
    align   = c("l","r","r","r","r","c")
  ) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE) %>%
  row_spec(0, bold=TRUE, background=NAVY, color="white")

Table 1: Recall dataset summary by manufacturer, 2000-2025

| Brand | N Recalls | % of Total | Mean Vehicles Affected | Total Vehicles Affected | Period |
|:---|---:|---:|---:|---:|:---:|
| Ford | 915 | 25.9 | 142,105 | 130,026,285 | 2000-2025 |
| GM | 766 | 21.7 | 146,928 | 112,546,541 | 2000-2025 |
| Stellantis | 765 | 21.7 | 131,124 | 100,309,566 | 2000-2025 |
| Toyota | 375 | 10.6 | 193,529 | 72,573,527 | 2000-2025 |
| Honda | 365 | 10.3 | 184,498 | 67,341,844 | 2000-2025 |
| Nissan | 346 | 9.8 | 100,154 | 34,653,155 | 2000-2025 |
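
The headline totals quoted in the text (3,532 recalls and roughly 517 million vehicle-units) can be checked directly from the per-brand rows of Table 1. The short block below is only an arithmetic check on the published table values; the vector names are chosen for the example.

```r
# Per-brand values copied from Table 1
n_recalls <- c(Ford = 915, GM = 766, Stellantis = 765,
               Toyota = 375, Honda = 365, Nissan = 346)
total_aff <- c(Ford = 130026285, GM = 112546541, Stellantis = 100309566,
               Toyota = 72573527, Honda = 67341844, Nissan = 34653155)

total_recalls  <- sum(n_recalls)        # 3532 recall events
total_millions <- sum(total_aff) / 1e6  # affected vehicle-units, in millions

# Shares of total recalls and mean vehicles affected per recall, by brand
share_pct <- round(n_recalls / total_recalls * 100, 1)
mean_aff  <- round(total_aff / n_recalls)

print(total_recalls)
print(round(total_millions, 1))
print(share_pct)
print(mean_aff)
```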

2.2 Severity Classification via NLP

A key methodological contribution is a four-level severity taxonomy, applied to each recall through keyword-based text classification of the NHTSA consequence summary field and following the NHTSA framework.

The critical design challenge is that approximately 80% of NHTSA consequence summaries include the legal boilerplate phrase “increasing the risk of a crash” regardless of actual defect severity; a naive keyword match would classify 84% of recalls as Severe. Our classifier instead identifies the primary stated harm that the defect causes and assigns the highest level whose keywords match:

| Level | Definition | Examples |
|:---|:---|:---|
| Critical | Confirmed fatality language or Do Not Drive advisory | “death,” “fatal,” “explosion,” Do Not Drive = YES |
| Severe | Fire/burn as stated consequence; loss of control/steering; brake failure; airbag non-deployment | “risk of fire,” “loss of steering,” “wheel can detach” |
| Moderate | Injury as primary harm; stalling; overheating; smoke; loss of propulsion | “risk of injury,” “vehicle may stall,” “smoke” |
| Minor | Generic crash-risk boilerplate; labeling; unspecified consequences | “increasing the risk of a crash” only |

critical_kw <- paste(c(
  "\\bdeath\\b","\\bfatal(?:ly|ities|ity)?\\b",
  "\\bexplosion\\b","\\belectrocution\\b"
), collapse="|")

severe_kw <- paste(c(
  "(?:risk of|may cause|can cause|could cause|lead(?:ing)? to|result(?:ing)? in)\\s+(?:a\\s+)?fire\\b",
  "\\bburn(?:ing)?\\b",
  "\\bloss of (?:vehicle\\s+)?control\\b","\\bloss of steering\\b",
  "\\bbrake failure\\b","\\brollover\\b","\\bwheel (?:can\\s+)?detach",
  "\\bair\\s*bag(?:s)? (?:may|might|could|can|will) not deploy\\b"
), collapse="|")

moderate_kw <- paste(c(
  "(?:risk of|may cause|can cause|could cause|lead(?:ing)? to|result(?:ing)? in)\\s+(?:serious\\s+)?injur",
  "\\bstall(?:ing)?\\b","\\boverheat(?:ing)?\\b","\\bshort circuit\\b",
  "\\bsmoke\\b","\\bloss of (?:drive |propulsion|power)\\b",
  "\\bwarning light\\b","\\bleak(?:age|ing)?\\b"
), collapse="|")

recalls <- recalls %>%
  dplyr::mutate(
    text_sev  = str_to_lower(coalesce(consequence, "")),
    dnd_flag  = str_to_upper(coalesce(do_not_drive, "No")) == "YES",
    inj_mod   = str_detect(text_sev,
      "(?:risk of|may cause|can cause|could cause|lead(?:ing)? to|result(?:ing)? in)\\s+(?:serious\\s+)?injur") |
      (str_detect(text_sev, "\\binjur(?:y|ies|ed)\\b") &
         !str_detect(text_sev, fixed("increasing the risk of a crash"))),
    crit_flag = dnd_flag | str_detect(text_sev, critical_kw),
    sev_flag  = str_detect(text_sev, severe_kw),
    mod_flag  = inj_mod  | str_detect(text_sev, moderate_kw),
    Severity  = factor(case_when(
      crit_flag                         ~ "Critical",
      !crit_flag & sev_flag             ~ "Severe",
      !crit_flag & !sev_flag & mod_flag ~ "Moderate",
      TRUE                              ~ "Minor"
    ), levels = sev_levels)
  ) %>%
  dplyr::select(-text_sev, -dnd_flag, -inj_mod, -crit_flag, -sev_flag, -mod_flag)
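
To illustrate why the boilerplate phrase defeats a naive keyword match, the sketch below compares a naive rule with a reduced priority-ordered rule on four made-up consequence strings. The helper `classify_toy()` and the example sentences are hypothetical and use only a subset of the patterns above; this is not the classifier applied to the study data.

```r
library(stringr)

boilerplate <- "increasing the risk of a crash"

# Hypothetical helper: a reduced version of the rules above, for illustration only.
# Levels are checked from most to least severe, so the highest match wins.
classify_toy <- function(text) {
  x <- str_to_lower(text)
  critical <- str_detect(x, "\\bdeath\\b|\\bfatal")
  severe   <- str_detect(x, "risk of (?:a )?fire\\b|\\bloss of steering\\b|\\bbrake failure\\b")
  moderate <- str_detect(x, "\\bstall(?:ing)?\\b|\\boverheat(?:ing)?\\b|risk of (?:serious )?injur")
  ifelse(critical, "Critical",
    ifelse(severe, "Severe",
      ifelse(moderate, "Moderate", "Minor")))
}

# Made-up consequence summaries (not taken from the NHTSA data)
examples <- c(
  "A loss of steering control can occur, increasing the risk of a crash.",
  "The engine may stall, increasing the risk of a crash.",
  "The label may be incorrect, increasing the risk of a crash.",
  "An inflator rupture may result in serious injury or death."
)

# Naive rule: any mention of the boilerplate phrase counts as Severe
naive  <- ifelse(str_detect(str_to_lower(examples), fixed(boilerplate)), "Severe", "Other")
ranked <- classify_toy(examples)

print(data.frame(naive, ranked))
```

The naive rule labels the first three strings identically, while the priority-ordered rule separates them into Severe, Moderate, and Minor and flags the fourth as Critical even though it contains no boilerplate.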

# Table 2: severity distribution (rows ordered Minor -> Critical before cumulating)
recalls %>%
  dplyr::count(Severity, name="N") %>%
  dplyr::arrange(match(Severity, sev_levels)) %>%
  dplyr::mutate(
    `%`            = round(N/sum(N)*100, 1),
    `Cumulative %` = round(cumsum(N)/sum(N)*100, 1)
  ) %>%
  kable(
    caption   = "Table 2: Severity classification distribution, 2000-2025",
    align     = "lrrr",
    col.names = c("Severity Level","N","%","Cumulative %")
  ) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE) %>%
  row_spec(0, bold=TRUE, background=NAVY, color="white") %>%
  row_spec(1, background="#EEF4FB") %>%
  row_spec(2, background="#FEF3C7") %>%
  row_spec(3, background="#FDECEA") %>%
  row_spec(4, bold=TRUE, color="white", background="#8B1A1A")

Table 2: Severity classification distribution, 2000-2025

| Severity Level | N | % | Cumulative % |
|:---|---:|---:|---:|
| Minor | 1483 | 42.0 | 42.0 |
| Moderate | 1319 | 37.3 | 79.3 |
| Severe | 600 | 17.0 | 96.3 |
| Critical | 130 | 3.7 | 100.0 |

2.3 Electrical vs. Mechanical Classification

Each recall is additionally classified as Electrical/Technology-driven, Mechanical, Both, or Other via a keyword search across the component and consequence summary fields.

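# Keywords are matched as plain substrings of the lower-cased text, so short
# tokens can also hit inside longer words (e.g. "ecu" inside "security").
elec_kw <- paste(c(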
  "electrical","software","electronic","sensor","module","camera","computer",
  "control unit","ecu","tcm","battery","wiring","infotainment",
  "adas","forward collision","back over","backup"
), collapse="|")

mech_kw <- paste(c(
  "engine","transmission","brake","suspension","steering",
  "fuel system","exhaust","power train","axle","driveshaft",
  "clutch","gearbox","differential","tire","wheel","coolant"
), collapse="|")

recalls <- recalls %>%
  dplyr::mutate(
    tc         = str_to_lower(paste(coalesce(component,""), coalesce(consequence,""), sep=" ")),
    is_elec    = str_detect(tc, elec_kw),
    is_mech    = str_detect(tc, mech_kw),
    type_class = case_when(
      is_elec & !is_mech ~ "Electrical/Tech",
      is_mech & !is_elec ~ "Mechanical",
      is_elec & is_mech  ~ "Both",
      TRUE               ~ "Other"
    )
  ) %>%
  dplyr::select(-tc)
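
Because the keyword lists are matched as substrings, short acronyms can produce false positives. The sketch below contrasts a plain substring match with a word-boundary match on three made-up component strings; the strings and variable names are hypothetical, and the word-boundary variant is shown as one possible refinement rather than the rule used for the reported results.

```r
library(stringr)

# Made-up component descriptions (not taken from the NHTSA data)
components <- c("SECURITY LATCH", "ENGINE CONTROL UNIT (ECU)", "SEAT BELTS")
x <- str_to_lower(components)

substring_hit <- str_detect(x, "ecu")        # plain substring match
boundary_hit  <- str_detect(x, "\\becu\\b")  # whole-word match only

print(data.frame(components, substring_hit, boundary_hit))
```

The first string is flagged by the substring match only because "security" contains the letters "ecu"; the word-boundary pattern keeps the genuine ECU hit and drops the spurious one.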

2.4 Stock Return Data

Daily stock return data for each manufacturer were sourced from FactSet and matched to the recall dataset by brand and trading date. Panel coverage varies by listing history: GM re-listed in November 2010 following its Chapter 11 exit, and Stellantis began trading as STLA only in January 2021 following the PSA-FCA merger. After dropping observations with missing returns, the merged panel contains 33,131 usable brand-day observations across six brands and 6,539 unique trading dates spanning 2000-2025.


4 RQ2: Financial Impact of Recall Severity

4.1 Research Design

Estimating the causal effect of recalls on stock returns requires addressing three identification challenges.

  1. Market-wide movements on any trading day affect all stocks simultaneously and must be controlled to isolate firm-specific effects.

  2. Manufacturers differ persistently in baseline risk profiles (Ford’s beta, earnings volatility, and market position differ structurally from Toyota’s), and these differences must be accounted for.

  3. Common temporal shocks (financial crises, pandemic disruptions, interest rate regimes) must be absorbed.

Our TWFE model addresses all three challenges in a single specification by including both brand fixed effects \(\gamma_i\) and date fixed effects \(\delta_t\). Crucially, with 6,539 unique date fixed effects, the model absorbs every common daily shock, including S&P 500 movements, without requiring an explicit market-return regressor or CAPM adjustment.

The identifying variation is within-brand and net of date effects: each \(\theta_{b,s}\) measures how brand \(b\)’s stock performed on a recall-of-severity-\(s\) day relative to that brand’s typical non-recall day, after removing the return component common to all brands on that date.
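
The claim that date fixed effects make an explicit market-return regressor redundant can be illustrated with a small simulation in base R. The panel below is simulated, and every name in it (`panel`, `mkt`, `recall`, the three brands, the assumed recall effect of -0.01) is hypothetical; the study itself uses `fixest::feols()` on the real panel.

```r
set.seed(1)
n_dates <- 200
brands  <- c("A", "B", "C")

# Simulated panel: brand return = common market return + recall effect + noise
market <- rnorm(n_dates, sd = 0.01)
panel  <- expand.grid(date = seq_len(n_dates), brand = brands)
panel$mkt    <- market[panel$date]
panel$recall <- rbinom(nrow(panel), 1, 0.05)
panel$ret    <- panel$mkt - 0.01 * panel$recall + rnorm(nrow(panel), sd = 0.005)

# (a) Brand FE plus an explicit market-return control
fit_mkt <- lm(ret ~ recall + mkt + factor(brand), data = panel)

# (b) Two-way FE: brand and date dummies, no market regressor
fit_twfe <- lm(ret ~ recall + factor(brand) + factor(date), data = panel)

# (c) Adding mkt to the two-way FE model: it is a linear combination of the
#     date dummies, so lm() cannot estimate it separately and reports NA
fit_both <- lm(ret ~ recall + factor(brand) + factor(date) + mkt, data = panel)

print(coef(fit_mkt)[["recall"]])
print(coef(fit_twfe)[["recall"]])
print(is.na(coef(fit_both)[["mkt"]]))
```

Specifications (a) and (b) both recover a negative recall coefficient, while in (c) the market return is absorbed entirely by the date dummies, which is the sense in which the TWFE model needs no CAPM-style adjustment.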

4.2 Model Specification

\[R_{i,t} = \sum_{b \in B} \sum_{s \in S} \theta_{b,s} \cdot \mathbf{1}\{Brand_i = b\} \cdot \mathbf{1}\{Severity_{i,t} = s\} + \gamma_i + \delta_t + \varepsilon_{i,t}\]

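where \(R_{i,t}\) is the daily stock return of brand \(i\) on trading date \(t\), \(B\) is the set of six brands, \(S = \{\text{Minor, Moderate, Severe, Critical}\}\), and No Recall is the omitted baseline for each brand, so that, conditional on the fixed effects: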

\[\theta_{b,s} = \mathbb{E}[R_{b,t} \mid \text{Severity}=s] - \mathbb{E}[R_{b,t} \mid \text{No Recall}]\]

Standard errors are clustered at the brand level. The fixest package implementation uses the i(brand, severity_day, ref2 = "No Recall") interaction syntax, which directly yields \(\theta_{b,s}\) for each of the 24 brand-severity cells.

4.3 Estimation and Results

# Load BANL_panel_final.csv
panel_raw <- read.csv("BANL_panel_final.csv", stringsAsFactors = FALSE)

# Safety: rename 'return' -> 'ret' if needed
if ("return" %in% names(panel_raw)) {
  names(panel_raw)[names(panel_raw) == "return"] <- "ret"
}

# Confirm required columns exist (recall_day is used below to flag non-recall days)
stopifnot("ret"        %in% names(panel_raw))
stopifnot("date"       %in% names(panel_raw))
stopifnot("brand"      %in% names(panel_raw))
stopifnot("severity"   %in% names(panel_raw))
stopifnot("recall_day" %in% names(panel_raw))

panel_df <- panel_raw %>%
  dplyr::mutate(
    date         = as.Date(as.character(date), format = "%d-%b-%Y"),
    brand        = as.character(brand),
    ret          = as.numeric(ret),
    recall_day   = as.integer(recall_day),
    severity_day = str_trim(as.character(severity)),
    severity_day = case_when(
      is.na(severity_day) | severity_day == "" | recall_day == 0 ~ "No Recall",
      str_to_lower(severity_day) == "minor"                      ~ "Minor",
      str_to_lower(severity_day) == "moderate"                   ~ "Moderate",
      str_to_lower(severity_day) == "severe"                     ~ "Severe",
      str_to_lower(severity_day) == "critical"                   ~ "Critical",
      TRUE ~ severity_day
    )
  ) %>%
  dplyr::filter(!is.na(date), !is.na(brand), !is.na(ret), is.finite(ret)) %>%
  dplyr::mutate(
    brand        = factor(brand),
    severity_day = factor(severity_day,
                          levels = c("No Recall","Minor","Moderate","Severe","Critical"))
  )

# Sanity checks
if (nrow(panel_df) == 0) {
  stop(paste0(
    "Panel has 0 rows. Check that BANL_panel_final.csv is in your working directory.\n",
    "Run getwd() to see where R is looking.\n",
    "Run list.files() to see what files are present."
  ))
}

if (nrow(panel_df) < 10000) {
  warning(sprintf(
    "Panel has only %d rows; expected ~33,131. Date parsing may have failed.",
    nrow(panel_df)
  ))
}

cat(sprintf(
  "Panel loaded: %s rows | %d brands | %s to %s\n",
  format(nrow(panel_df), big.mark=","),
  n_distinct(panel_df$brand),
  format(min(panel_df$date), "%Y-%m-%d"),
  format(max(panel_df$date), "%Y-%m-%d")
))

# TWFE model: brand x severity interactions, brand and date fixed effects
model <- feols(
  ret ~ i(brand, severity_day, ref2 = "No Recall") | brand + date,
  data    = panel_df,
  cluster = ~brand
)

# Tidy the coefficients and convert estimates to percentage points
results <- broom::tidy(model, conf.int=TRUE) %>%
  dplyr::mutate(
    effect_pct = round(estimate  * 100, 3),
    se_pct     = round(std.error * 100, 3),
    ci_lo      = round(conf.low  * 100, 3),
    ci_hi      = round(conf.high * 100, 3),
    brand      = str_extract(term, "(?<=brand::).*?(?=:severity_day)"),
    severity   = str_extract(term, "(?<=severity_day::).*"),
    sig = case_when(
      p.value<0.001~"***", p.value<0.01~"**",
      p.value<0.05~"*",    p.value<0.10~".", TRUE~""
    )
  )

recall_counts <- panel_df %>%
  dplyr::filter(severity_day != "No Recall") %>%
  dplyr::count(brand, severity_day, name="n_days") %>%
  dplyr::mutate(
    brand    = as.character(brand),
    severity = as.character(severity_day)
  ) %>%
  dplyr::select(brand, severity, n_days)

results <- results %>%
  left_join(recall_counts, by=c("brand","severity")) %>%
  dplyr::mutate(
    n_days   = replace_na(n_days, 0L),
    severity = factor(severity, levels=sev_levels),
    brand    = factor(brand,    levels=brand_order)
  ) %>%
  dplyr::arrange(brand, severity)

# Table 6: coefficient table
results %>%
  dplyr::mutate(`95% CI` = paste0("[",ci_lo,", ",ci_hi,"]")) %>%
  dplyr::select(
    Brand      = brand,
    Severity   = severity,
    `N (days)` = n_days,
    `β̂ (%)`    = effect_pct,
    `SE (%)`   = se_pct,
    `95% CI`,
    Sig        = sig
  ) %>%
  kable(
    caption = "Table 6: Within-brand recall severity shock on daily stock returns",
    align   = "llrrrrr"
  ) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"), full_width=FALSE) %>%
  row_spec(0, bold=TRUE, background=NAVY, color="white") %>%
  footnote(
    general = paste0(
      "theta_{b,s} = E[R_{b,t}|Sev=s] - E[R_{b,t}|No Recall]. ",
      "Model: ret ~ brand x severity + brand FE + date FE. ",
      "SE clustered at brand level. Obs: ",
      format(nobs(model), big.mark=","),
      " | Brand FE: 6 | Date FE: ",
      format(length(unique(panel_df$date)), big.mark=",")
    )
  )

Table 6: Within-brand recall severity shock on daily stock returns

| Brand | Severity | N (days) | β̂ (%) | SE (%) | 95% CI | Sig |
|:---|:---|---:|---:|---:|---:|---:|
| Ford | Minor | 147 | -0.023 | 0.044 | [-0.137, 0.092] | |
| Ford | Moderate | 173 | 0.280 | 0.086 | [0.059, 0.500] | * |
| Ford | Severe | 148 | 0.119 | 0.054 | [-0.018, 0.257] | . |
| Ford | Critical | 24 | -0.291 | 0.129 | [-0.623, 0.040] | . |
| GM | Minor | 109 | 0.047 | 0.057 | [-0.100, 0.193] | |
| GM | Moderate | 147 | -0.103 | 0.090 | [-0.334, 0.129] | |
| GM | Severe | 76 | 0.319 | 0.170 | [-0.119, 0.757] | |
| GM | Critical | 17 | 0.017 | 0.199 | [-0.493, 0.528] | |
| Stellantis | Minor | 102 | -0.216 | 0.101 | [-0.477, 0.044] | . |
| Stellantis | Moderate | 143 | -0.116 | 0.058 | [-0.267, 0.034] | |
| Stellantis | Severe | 62 | 0.045 | 0.102 | [-0.218, 0.308] | |
| Stellantis | Critical | 10 | 1.508 | 0.118 | [1.205, 1.812] | *** |
| Toyota | Minor | 129 | 0.030 | 0.041 | [-0.074, 0.135] | |
| Toyota | Moderate | 114 | -0.188 | 0.120 | [-0.496, 0.121] | |
| Toyota | Severe | 60 | 0.206 | 0.102 | [-0.055, 0.467] | . |
| Toyota | Critical | 15 | -0.649 | 0.137 | [-1.000, -0.298] | ** |
| Honda | Minor | 101 | 0.050 | 0.114 | [-0.243, 0.342] | |
| Honda | Moderate | 137 | -0.012 | 0.038 | [-0.110, 0.086] | |
| Honda | Severe | 57 | 0.271 | 0.144 | [-0.101, 0.642] | |
| Honda | Critical | 17 | -0.484 | 0.186 | [-0.964, -0.005] | * |
| Nissan | Minor | 131 | -0.188 | 0.030 | [-0.266, -0.110] | ** |
| Nissan | Moderate | 114 | -0.149 | 0.039 | [-0.249, -0.048] | * |
| Nissan | Severe | 61 | 0.051 | 0.112 | [-0.236, 0.338] | |
| Nissan | Critical | 12 | 0.205 | 0.096 | [-0.042, 0.453] | . |

Note: \(\theta_{b,s} = \mathbb{E}[R_{b,t} \mid \text{Sev}=s] - \mathbb{E}[R_{b,t} \mid \text{No Recall}]\). Model: ret ~ brand x severity + brand FE + date FE. SE clustered at brand level. Obs: 33,131 | Brand FE: 6 | Date FE: 6,539. Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.10.
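
The confidence intervals in Table 6 are wider than a normal approximation (±1.96 SE) would give. A short check suggests they are consistent with a t critical value on 6 - 1 = 5 degrees of freedom, as one would expect with six brand clusters. This is an inference from the reported numbers, not a documented setting, and the three cells below are simply copied from the table.

```r
# Estimates and standard errors copied from Table 6 (percentage points)
cells <- data.frame(
  cell  = c("Ford x Minor", "Stellantis x Critical", "Toyota x Critical"),
  est   = c(-0.023, 1.508, -0.649),
  se    = c( 0.044, 0.118,  0.137),
  lo_t6 = c(-0.137, 1.205, -1.000),  # lower CI bound as reported in Table 6
  hi_t6 = c( 0.092, 1.812, -0.298)   # upper CI bound as reported in Table 6
)

# Assumption: t distribution with (number of clusters - 1) = 5 degrees of freedom
t_crit <- qt(0.975, df = 5)

cells$ci_lo <- round(cells$est - t_crit * cells$se, 3)
cells$ci_hi <- round(cells$est + t_crit * cells$se, 3)

print(round(t_crit, 3))
print(cells)
```

The reconstructed bounds agree with the reported ones up to rounding of the published estimates and standard errors, which is why a cell needs a ratio of estimate to SE of roughly 2.57, rather than 1.96, to reach the 5% level.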

4.4 Heatmap - Poster Visual

heat_df <- results %>%
  dplyr::mutate(
    brand    = factor(brand, levels=rev(brand_order)),
    severity = factor(severity, levels=sev_levels),
    label    = paste0(sprintf("%.3f",effect_pct), sig,
                      "\n(", sprintf("%.3f",se_pct), ")")
  )

ggplot(heat_df, aes(x=severity, y=brand, fill=effect_pct)) +
  geom_tile(color="black", linewidth=0.7, width=0.94, height=0.94) +
  geom_text(aes(label=label), size=3.8, lineheight=1.0, family="Times New Roman") +
  scale_fill_gradient2(
    low="#C00000", mid="#BFBFBF", high="#00A651", midpoint=0,
    name="Beta\n(pp)"
  ) +
  labs(
    title    = "Recall Severity Effects on Daily Stock Returns",
    subtitle = "Each tile: β̂ and SE | Omitted category = No Recall | Brand + Date FE",
    x        = "Recall Severity",
    y        = NULL,
    caption  = "Model: ret ~ brand x severity + brand FE + date FE"
  ) +
  theme_minimal(base_size=13, base_family="Times New Roman") +
  theme(
    panel.grid      = element_blank(),
    axis.text       = element_text(face="bold"),
    plot.title      = element_text(face="bold", color=NAVY),
    plot.background = element_rect(fill="white",color=NA)
  )

Figure 3: Recall Severity Effects on Daily Stock Returns. Each tile shows the beta coefficient with standard error in parentheses. Omitted category = No Recall.

In Figure 3, green tiles represent positive within-brand coefficients on recall days (a neutral-to-positive market reaction); red tiles represent negative coefficients (the market penalises the recall).

4.5 Interpretation

The results establish a consistent finding: recall severity does not systematically generate negative abnormal same-day stock returns for the six manufacturers. Of the 24 \(\hat{\theta}_{b,s}\) estimates in Table 6, 23 fall within ±0.7 percentage points; the exception is Stellantis x Critical (+1.508 pp), which rests on only 10 recall days. Signs are mixed across brands for Minor, Moderate, and Critical recalls, and all six Severe coefficients are small and positive (+0.045 to +0.319 pp), so returns do not decline monotonically with severity. A few cells are individually significant under brand-clustered standard errors (Stellantis x Critical, p < 0.001; Toyota x Critical and Nissan x Minor, p < 0.01), but with only six clusters these standard errors are likely too small (see Section 7), so we treat them as suggestive rather than confirmatory.

Several economically informative patterns nonetheless emerge.

  1. Stellantis and Nissan exhibit the sharpest negative responses to Minor recalls (-0.216 pp, p < 0.10, and -0.188 pp, p < 0.01, respectively), whereas Ford’s Minor coefficient is essentially zero (-0.023 pp). For these two brands, the market appears to respond more to incremental recall activity than to Severe events (+0.045 pp and +0.051 pp).

  2. Honda and Toyota both carry negative Critical coefficients (-0.484 pp, 95% CI [-0.964, -0.005], and -0.649 pp, p < 0.01), consistent with a quality-brand penalty hypothesis: a critical safety defect represents a larger information shock for a manufacturer with a strong reliability reputation than for brands with weaker safety profiles. Ford’s Critical coefficient is also negative (-0.291 pp, p < 0.10), so the pattern is not exclusive to these two brands.

  3. Toyota shows a small positive coefficient on Minor recalls (+0.030 pp) that is statistically indistinguishable from zero. This is compatible with, but provides little evidence for, a quality-signaling effect in which investors interpret Toyota’s proactive issuance of minor recalls as evidence of rigorous internal quality management.

  4. Stellantis exhibits a large positive Critical coefficient (+1.508 pp, p < 0.001) based on only 10 critical recall days; given the small cell and the few-cluster caveat, it should be treated as exploratory.

The absence of a systematic negative effect is consistent with the efficient markets hypothesis: if recall risk is persistent and observable at the firm level, investors continuously incorporate it into valuations rather than reacting discretely to individual announcements.


5 Research Poster

The competition poster presenting these findings is reproduced below for reference. All numerical results displayed on the poster are derived from the code and models presented in this document.

[Poster image: BANL COMP_FINAL_1.png]


6 Key Findings Summary

Table 7: Key Findings & Statistical Evidence

| # | Finding | Summary | Statistical evidence |
|:---|:---|:---|:---|
| 01 | Technology Recalls Outpacing Mechanical | Technology-related recalls increased significantly faster than mechanical recalls after the onset of the software era (post-2013). | NB Year x Electrical: beta = 0.0558 (p < 0.001); Electrical volume 2.3x higher post-2013 vs 1.4x mechanical; Mann-Whitney p < 0.001 |
| 02 | Recall Frequency Trending Upward | Every major automaker except GM is experiencing a statistically significant upward trend in annual recall frequency. | 5/6 brands: IRR_annual > 1.0, p < 0.05. GM: beta = 0.00015, p = 0.776 (not significant). |
| 03 | Seasonal Patterns Vary by Brand | Seasonal recall patterns are statistically significant for Ford, Stellantis, and Nissan, peaking in spring and fall; Honda and Toyota show no significant seasonality. | NB month-FE coefficients; March, April, October, November significant at p < 0.05 for Ford and Stellantis. |
| 04 | Severity Does Not Drive Abnormal Returns | Recall severity does not systematically translate into company-specific abnormal stock returns at the day-of-announcement level. | 23 of 24 theta_{b,s} within +/-0.7 pp; Stellantis x Critical = +1.508 pp (10 recall days); Honda x Critical = -0.484 pp (95% CI [-0.964, -0.005]). TWFE model, brand + date FE, 33,131 obs. |


7 Limitations

Sample and coverage constraints

The analysis is restricted to the six largest U.S. light-vehicle manufacturers by domestic sales volume. Findings may not generalize to smaller OEMs, luxury-only brands, or EV-first manufacturers entering the market post-2020. Hyundai-Kia, despite its growing recall volume, was excluded from the financial analysis because its primary listing on the Korea Exchange renders U.S. daily return data non-comparable in a domestic panel context. General Motors’ pre-2010 recall history is similarly excluded from the financial analysis because of the structural discontinuity introduced by its Chapter 11 reorganization and subsequent re-IPO.

Cluster-robust inference with few clusters

The TWFE model clusters standard errors at the brand level. With only six clusters, well below the 30-50 threshold typically cited for reliable cluster-robust inference, standard errors on individual \(\theta_{b,s}\) are likely downward-biased, inflating apparent precision. This concern is most acute for Critical severity cells, where per-brand observation counts are as low as ten. Researchers seeking confirmatory inference should consider wild cluster bootstrap procedures or date-level clustering (6,539 dates), which rests on a far larger number of clusters.

Same-day measurement horizon

The model captures same-day stock return responses only. If markets process recall information gradually, or if institutional investors require additional time to assess recall scope and liability, the true financial impact may be distributed over a multi-day event window not captured here. Additionally, recalls announced outside trading hours (after market close or on weekends) are attributed to the subsequent trading day, introducing measurement noise.
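
The attribution rule for non-trading days can be sketched as a lookup of the first trading day on or after each recall date. The calendar, the recall dates, and the helper `next_trading_day()` below are hypothetical and serve only to show the mechanics. Because the example works with dates alone, it does not handle announcements made after the close on a trading day, which would require timestamps.

```r
# Hypothetical trading calendar and recall dates (illustration only)
trading_days <- as.Date(c("2024-03-01", "2024-03-04", "2024-03-05", "2024-03-06"))
recall_dates <- as.Date(c("2024-03-01", "2024-03-02", "2024-03-03", "2024-03-05"))

# Hypothetical helper: first trading day on or after each date in d
next_trading_day <- function(d, calendar) {
  calendar <- sort(calendar)
  # Count trading days strictly before d, then step to the next calendar entry
  idx <- findInterval(as.numeric(d) - 1, as.numeric(calendar)) + 1
  calendar[idx]  # NA when d falls after the last trading day in the calendar
}

event_days <- next_trading_day(recall_dates, trading_days)
print(data.frame(recall_dates, event_days))
```

In this toy calendar, the Saturday and Sunday recalls are both moved to Monday, 4 March 2024, while recalls dated on a trading day keep their own date.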


8 Conclusion

This study examined 25 years of NHTSA vehicle recall data linked to daily stock returns across six major U.S. automotive manufacturers. The analysis delivers two clear, policy-relevant conclusions.

The first is structural: automotive recalls have become increasingly technology-driven. Electrical and software-related recalls grew markedly faster than mechanical recalls from 2013 onward (electrical volume was 2.3x higher post-2013, versus 1.4x for mechanical recalls), reflecting the industry’s accelerating dependence on embedded electronics, ADAS, and software-defined vehicle architectures. This structural shift has implications for regulatory frameworks, actuarial pricing of recall insurance, and product liability law, all of which were historically calibrated to mechanical failure modes that are more predictable and detectable ahead of failure events.

The second conclusion concerns financial pricing: recall announcements do not reliably generate abnormal stock returns, even when classified by severity. Brand-specific effects are mostly small and heterogeneous in sign, and the few nominally significant cells rest on small samples and six-cluster standard errors. This pattern is consistent with financial markets pricing recall risk continuously and prospectively, incorporating it into stock valuations as part of routine operational risk assessment rather than reacting discretely to individual announcements. From an investment perspective, this implies that recall announcements are largely uninformative events at the daily frequency, and that systematic strategies based on recall-day return patterns are unlikely to generate consistent alpha.

Future research might productively examine multi-day event windows, class-action lawsuit filings as a mediating mechanism, or differentiated reactions by recall scope (number of vehicles affected) and manufacturer financial condition. The rapid growth of software-defined vehicles also raises the question of whether OTA (over-the-air) software recalls, which require no physical service visit, are priced differently from physical component recalls, as their remediation cost and consumer burden profiles are fundamentally different.


9 References

  1. FactoData. (2025). Car market share in the USA: An overview. https://factodata.com/car-market-share-in-usa-an-overview/

  2. National Highway Traffic Safety Administration. (n.d.). NHTSA datasets and APIs. U.S. Department of Transportation. https://www.nhtsa.gov/nhtsa-datasets-and-apis

  3. National Highway Traffic Safety Administration. (n.d.). Resources related to investigations and recalls. U.S. Department of Transportation. https://www.nhtsa.gov/resources-investigations-recalls

  4. National Highway Traffic Safety Administration. (2020, November). Risk-based processes for safety defect analysis and management of recalls (Report No. DOT HS 812 984). U.S. Department of Transportation. https://www.nhtsa.gov/sites/nhtsa.gov/files/documents/14895_odi_defectsrecallspubdoc_110520-v6a-tag.pdf

  5. National Highway Traffic Safety Administration. (2026). NHTSA recalls by manufacturer [Data set]. U.S. Department of Transportation. https://data.transportation.gov/Automobiles/NHTSA-Recalls-by-Manufacturer/mu99-t4jn

  6. United States Department of Transportation, National Highway Traffic Safety Administration, Office of Defects Investigations. (2025). Vehicle safety recall completion rates (Report No. DOT HS 813 687). https://rosap.ntl.bts.gov/view/dot/79374


Reproducibility note: This document was prepared using R 4.6.0 and knitted with rmarkdown. All results are reproducible from BANL.csv and BANL_panel_final.csv. Required packages: conflicted, tidyverse, lubridate, fixest, MASS, strucchange, broom, knitr, kableExtra, scales, ggplot2, base64enc, htmltools.