How to read this document. Every table and plot is listed in order. Each entry explains what the output shows, what the numbers mean, and what conclusion to draw.

Blue boxes: interpretation.

Green boxes: take-away.

Amber boxes: take-away.

1 Panel Data Models (Tobin’s Q)

1.1 Setup & Data Import

pkgs <- c("readxl", "plm", "lmtest", "sandwich",
          "dplyr", "ggplot2", "knitr", "kableExtra")
new_pkgs <- pkgs[!pkgs %in% installed.packages()[,"Package"]]
if (length(new_pkgs)) install.packages(new_pkgs, repos = "https://cran.r-project.org")

library(readxl)
library(plm)
library(lmtest)
library(sandwich)
library(dplyr)
library(ggplot2)
library(knitr)
library(kableExtra)
# ---- Load data ----
# Change the path below to point to your local copy of HA_panel_data.xlsx
df_raw <- read_excel("~/Desktop/FAU /Semester 7/exchange UTU/LRS28:TKMS13 Advanced Corporate Finance LT013075-3008/home assignment/HA_panel_data.xlsx")

glimpse(df_raw)
## Rows: 127
## Columns: 47
## $ ID                 <dbl> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 85, 85, 85,…
## $ Country            <chr> "Finland", "Finland", "Finland", "Finland", "Finlan…
## $ Industry           <chr> "Diversified Telecommunication Services", "Diversif…
## $ Sector             <chr> "Communication Services", "Communication Services",…
## $ Exchange           <chr> "NASDAQ HELSINKI LTD", "NASDAQ HELSINKI LTD", "NASD…
## $ Year               <dbl> 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 201…
## $ UnscaledESGScore   <dbl> 58.50642, 52.28408, 52.96692, 48.77971, 50.92446, 4…
## $ ESGScore           <dbl> 0.5850642, 0.5228408, 0.5296692, 0.4877971, 0.50924…
## $ UnscaledEnv        <dbl> 66.44749, 61.31875, 60.85852, 58.48770, 63.40126, 4…
## $ Env                <dbl> 0.6644749, 0.6131875, 0.6085852, 0.5848770, 0.63401…
## $ UnscaledSoc        <dbl> 54.44905, 46.04852, 47.80864, 45.72077, 48.28049, 4…
## $ Soc                <dbl> 0.5444905, 0.4604852, 0.4780864, 0.4572077, 0.48280…
## $ UnscaledGov        <dbl> 60.55556, 57.86325, 57.26190, 47.47179, 46.66154, 5…
## $ Gov                <dbl> 0.6055556, 0.5786325, 0.5726190, 0.4747179, 0.46661…
## $ ROE                <chr> "0.27239999999999998", "0.29160999999999998", "NA",…
## $ ROA                <dbl> 0.110596389, 0.120399299, 0.131693198, 0.107707758,…
## $ MarketCap          <dbl> 8241252345, 6037449434, 5475203589, 5175673808, 582…
## $ SholdEqty          <dbl> 1149600000, 1126400000, 1039600000, 970800000, 9254…
## $ TotalLiabilities   <dbl> 1664600000, 1542400000, 1540800000, 1562200000, 132…
## $ EMV.TL             <dbl> 9905852345, 7579849434, 7016003589, 6737873808, 714…
## $ MtoB               <dbl> 7.1687999, 5.3599516, 5.2666445, 5.3313492, 6.29088…
## $ EBV.TL             <dbl> 2814200000, 2668800000, 2580400000, 2530000000, 224…
## $ Q                  <dbl> 3.5199532, 2.8401714, 2.7189597, 2.6600370, 3.17937…
## $ LnQ                <dbl> 1.25844770, 1.04386440, 1.00024934, 0.97834005, 1.1…
## $ Sales              <dbl> 1843500000, 1831500000, 1787400000, 1635700000, 156…
## $ LnSales            <dbl> 21.33493, 21.32840, 21.30403, 21.21534, 21.17402, 2…
## $ TotalDebt          <dbl> 1236200000, 1148700000, 1117400000, 1168500000, 991…
## $ Leverage           <dbl> 1.0753306, 1.0197976, 1.0748365, 1.2036465, 1.07110…
## $ CurrentLiabilities <dbl> 496700000, 601100000, 533800000, 651900000, 5691000…
## $ CurrentAssets      <dbl> 576500000, 567200000, 521500000, 638700000, 4175000…
## $ Slack              <dbl> 1.1606604, 0.9436034, 0.9769577, 0.9797515, 0.73361…
## $ TAssets            <dbl> 2814200000, 2668800000, 2580400000, 2530000000, 224…
## $ LnAsset            <dbl> 21.75794, 21.70489, 21.67121, 21.65267, 21.53268, 2…
## $ sweden             <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ fin                <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ Den                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ Nor                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ LatterYears        <dbl> 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, …
## $ HighESG            <dbl> 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ LowESG             <dbl> 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ HighEnv            <dbl> 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ LowEnv             <dbl> 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ HighSoc            <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ LowSoc             <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
## $ HighGov            <dbl> 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, …
## $ LowGov             <dbl> 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, …
## $ ESI                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …

1.1.1 Data Overview

What it shows. glimpse() prints every column name, its data type, and the first few values. This is a sanity-check step, not a results table.

What to look for:

  • Confirm that ID (the company identifier, read here as <dbl>) and Year (<dbl>) are present and complete: these are the two panel dimensions passed to pdata.frame().
  • Confirm that Q (Tobin’s Q), ROA, LnSales, and Leverage are <dbl> (numeric), not character. A character column cannot be used as a numeric variable in plm().
  • Check the number of rows against N companies × T years. A balanced panel of 24 firms over 10 years would have 240 rows; the 127 rows here mean the panel is unbalanced, with some firms observed for fewer than 10 years.

Interpretation. ROE is the one unexpected <chr> column: it contains literal "NA" strings, so read_excel() imported it as text. It is not used in the models below, so apart from that column the import succeeded and the data are ready for panel estimation.
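The two checks described above can be automated. The following is a minimal sketch on a toy data frame, not on the assignment file: all values are made up, and only the pattern of a literal "NA" string in ROE is taken from the glimpse() output.

```r
# Toy stand-in for df_raw; values are hypothetical
toy <- data.frame(
  ID   = c(82, 82, 82, 85, 85),
  Year = c(2019, 2018, 2017, 2019, 2018),
  ROE  = c("0.2724", "0.2916", "NA", "0.1100", "0.0950"),
  Q    = c(3.52, 2.84, 2.72, 1.40, 1.35),
  stringsAsFactors = FALSE
)

# 1. Which columns were imported as text?
chr_cols <- names(toy)[vapply(toy, is.character, logical(1))]

# 2. Turn literal "NA" strings into real missing values, then coerce to numeric
toy$ROE[toy$ROE == "NA"] <- NA
toy$ROE <- as.numeric(toy$ROE)

# 3. Observations per firm: unequal counts mean an unbalanced panel
obs_per_firm <- table(toy$ID)
is_balanced  <- length(unique(as.vector(obs_per_firm))) == 1

print(chr_cols)
print(obs_per_firm)
print(is_balanced)
```

An alternative that avoids step 2 is to pass na = "NA" to read_excel() so the strings become missing values at import.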

# ---- Prepare panel data frame ----
# ID = company identifier, Year = time dimension (as required by EViews / plm)
pdata <- pdata.frame(df_raw, index = c("ID", "Year"))

cat(sprintf("Panel dimensions: Companies (N): %d | Years (T): %d | Obs. total: %d\n",
            length(unique(df_raw$ID)),
            length(unique(df_raw$Year)),
            nrow(pdata)))
## Panel dimensions: Companies (N): 24 | Years (T): 10 | Obs. total: 127
summary(df_raw[, c("Q", "ROA", "LnSales", "Leverage", "ESGScore")])
##        Q                ROA              LnSales         Leverage        
##  Min.   : 0.8291   Min.   :-0.19541   Min.   :17.12   Min.   :0.0000338  
##  1st Qu.: 1.2176   1st Qu.: 0.02442   1st Qu.:20.37   1st Qu.:0.3742234  
##  Median : 1.4702   Median : 0.04865   Median :21.61   Median :0.6168940  
##  Mean   : 1.9611   Mean   : 0.05074   Mean   :21.46   Mean   :0.6755914  
##  3rd Qu.: 1.9622   3rd Qu.: 0.08222   3rd Qu.:22.55   3rd Qu.:0.9049842  
##  Max.   :14.1098   Max.   : 0.22558   Max.   :25.24   Max.   :2.8715084  
##     ESGScore      
##  Min.   :0.04319  
##  1st Qu.:0.44130  
##  Median :0.59286  
##  Mean   :0.58087  
##  3rd Qu.:0.75250  
##  Max.   :0.85685

1.1.2 Descriptive Statistics

What it shows. Min, 1st quartile, median, mean, 3rd quartile, and max for Q, ROA, LnSales, Leverage, and ESGScore.

Variable Typical reading
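Q (Tobin’s Q) The sample mean is 1.96 and the median 1.47; a mean of roughly 1.5–2.5 is typical for Nordic industrials. Values > 1 imply the market values the firm’s assets above their book value, i.e. growth options are priced in. The maximum of 14.1 is an outlier; values of Q > 5 are often high-growth tech or pharma names.
ROA Expressed as a fraction (e.g. 0.06 = 6%). Negative values signal loss-making years (the sample minimum is −0.195). The spread between min and max reflects performance heterogeneity across firms and years.
LnSales Natural log of sales in currency units, not millions: the first row has Sales = 1,843,500,000 and LnSales = 21.33. The sample range of 17.12 to 25.24 corresponds to sales of roughly 27 million to 92 billion, so the sample mixes small and large firms, which motivates including size controls.
Leverage Debt-to-equity ratio, not debt-to-assets: in the first row TotalDebt / SholdEqty = 1,236,200,000 / 1,149,600,000 ≈ 1.075, which matches Leverage. The ratio is therefore not bounded by 1. The sample median is 0.62 and the maximum 2.87; a value of 1 means debt equals book equity, not near-insolvency.
ESGScore Scaled to 0–1 here (UnscaledESGScore / 100), so values > 0.5 correspond to a raw score above 50. The wide range across firm-years (0.04 to 0.86) justifies testing it as a Q predictor.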

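Interpretation. Compare mean vs. median for each variable. If they diverge, the distribution is skewed. Tobin’s Q is right-skewed here (mean 1.96 vs. median 1.47, maximum 14.1): a few high-Q firm-years pull the mean above the median. Skewness alone does not call for fixed effects, but large, persistent differences in Q between firms do, which is why the models below use panel fixed effects rather than pooled OLS. The LnQ column in the data offers a log-transformed alternative that dampens the outliers.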
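The mean-versus-median comparison can be illustrated with a short sketch. It uses deterministic log-normal quantiles as a stand-in for Q (an assumption made purely for illustration; the parameters 0.4 and 0.6 are arbitrary), and shows that the gap between mean and median disappears after taking logs.

```r
# Deterministic right-skewed stand-in for Q: quantiles of a log-normal distribution
q_sim <- qlnorm(ppoints(127), meanlog = 0.4, sdlog = 0.6)

gap_raw <- mean(q_sim) - median(q_sim)            # positive under right skew
gap_log <- mean(log(q_sim)) - median(log(q_sim))  # close to zero after logging

cat(sprintf("Raw gap: %.3f | Log gap: %.3f\n", gap_raw, gap_log))
```

In the actual sample the corresponding raw gap is 1.9611 − 1.4702 ≈ 0.49, taken from the summary() output.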

1.2 Model i – Constant Only (Grand Mean)

# TQ_it = a0 + e_it
model1 <- plm(Q ~ 1, data = pdata, model = "pooling")
summary(model1)
## Pooling Model
## 
## Call:
## plm(formula = Q ~ 1, data = pdata, model = "pooling")
## 
## Unbalanced Panel: n = 24, T = 1-10, N = 127
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -1.1320473 -0.7435524 -0.4908958  0.0010113 12.1486386 
## 
## Coefficients:
##             Estimate Std. Error t-value  Pr(>|t|)    
## (Intercept)  1.96114    0.15299  12.819 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    374.54
## Residual Sum of Squares: 374.54
## R-Squared:      0
## Adj. R-Squared: 0

1.2.1 Output

What it shows. A single coefficient row: (Intercept).

Output element Meaning
Estimate The grand mean of Tobin’s Q across all firms and all years.
Std. Error Standard error of the mean — measures how precisely the grand mean is estimated.
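t-value / Pr(>|t|) Test of H₀: the grand mean equals zero. It is rejected (t = 12.8, p < 2.2e-16), which is unsurprising because Q is a ratio of positive values.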
R² = 0 With no regressors, the model explains zero variation beyond the unconditional mean. This is expected.

Interpretation of constant-only model:
The intercept (≈ 1.961) equals the grand mean of Tobin’s Q across all companies and years. Because there are no regressors, this model explains zero variation beyond the overall average; its R² is 0. It serves as the null baseline – any model with explanatory variables should beat it on adjusted R² and F-test. The result tells us that the average firm in our Nordic sample trades at roughly 1.96 times its book value of assets.
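That the constant-only intercept is just the sample mean, with the usual standard error of a mean, can be checked directly. The sketch below uses a handful of hypothetical Q values with lm() rather than plm(), and then applies the same identity to the printed Total Sum of Squares.

```r
# Toy Q values (hypothetical), standing in for pdata$Q
q <- c(3.52, 2.84, 2.72, 2.66, 3.18, 1.22, 1.47, 0.95)

fit <- lm(q ~ 1)                                   # constant-only regression
est <- unname(coef(fit)[1])                        # equals mean(q)
se  <- summary(fit)$coefficients[1, "Std. Error"]  # equals sd(q) / sqrt(n)

# Same identity applied to the printed output: TSS = 374.54, N = 127
se_from_tss <- sqrt(374.54 / (127 - 1)) / sqrt(127)

cat(sprintf("Intercept: %.4f | mean(q): %.4f\n", est, mean(q)))
cat(sprintf("SE implied by printed TSS: %.5f\n", se_from_tss))
```

The last line recovers the reported standard error of 0.15299 from the summary(model1) output.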


1.3 Model ii – ROA + Ln(Sales) with Two-Way Fixed Effects

The model estimated is:

\[TQ_{it} = a_0 + a_1 ROA_{it} + a_2 LnSales_{it} + b_t Year_t + c_i D_i + e_{it}\]

where \(Year_t\) are year dummies and \(D_i\) are company (entity) dummies.
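The dummy-variable form of the equation and the "within" estimator give the same slope coefficients. The sketch below illustrates this on a simulated balanced panel using only lm(); the data, the helper demean2(), and all parameter values are hypothetical. On an unbalanced panel such as the one in this assignment, simple double demeaning is no longer exact, which is one reason to leave the transformation to plm().

```r
set.seed(1)
n_firms <- 6
n_years <- 5
toy <- expand.grid(id = 1:n_firms, year = 1:n_years)   # balanced toy panel

firm_fx <- rnorm(n_firms)[toy$id]     # time-invariant firm effects
year_fx <- rnorm(n_years)[toy$year]   # common year shocks
toy$roa <- 0.05 + 0.02 * firm_fx + rnorm(nrow(toy), sd = 0.03)
toy$q   <- 1.5 + 10 * toy$roa + firm_fx + year_fx + rnorm(nrow(toy), sd = 0.2)

# (a) Least-squares dummy variables: firm and year dummies entered explicitly
lsdv <- lm(q ~ roa + factor(id) + factor(year), data = toy)

# (b) Two-way within transformation: subtract firm and year means, add back the grand mean
demean2 <- function(x, id, year) x - ave(x, id) - ave(x, year) + mean(x)
within_fit <- lm(demean2(q, id, year) ~ 0 + demean2(roa, id, year), data = toy)

b_lsdv   <- unname(coef(lsdv)["roa"])
b_within <- unname(coef(within_fit)[1])
cat(sprintf("LSDV slope: %.6f | within slope: %.6f\n", b_lsdv, b_within))
```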

# Two-way (entity + time) fixed-effects model
model2 <- plm(Q ~ ROA + LnSales,
              data   = pdata,
              model  = "within",
              effect = "twoways")   # company + year fixed effects

summary(model2)
## Twoways effects Within Model
## 
## Call:
## plm(formula = Q ~ ROA + LnSales, data = pdata, effect = "twoways", 
##     model = "within")
## 
## Unbalanced Panel: n = 24, T = 1-10, N = 127
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -2.3153814 -0.2961585 -0.0011328  0.2437431  4.8956035 
## 
## Coefficients:
##         Estimate Std. Error t-value  Pr(>|t|)    
## ROA     14.58248    2.20107  6.6252 2.303e-09 ***
## LnSales -0.14885    0.39056 -0.3811     0.704    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    109.36
## Residual Sum of Squares: 65.207
## R-Squared:      0.40374
## Adj. R-Squared: 0.18338
## F-statistic: 31.1474 on 2 and 92 DF, p-value: 4.6783e-11

1.3.1 Output

What it shows. Coefficient estimates, standard errors, t-values, and p-values for ROA and LnSales after absorbing company and year fixed effects. The within R² (sometimes labelled “R² (proj. model)”) measures fit relative to demeaned data.

Coefficient Expected sign Interpretation
ROA Positive ROA is coded as a fraction, so a 1-unit increase would mean moving from 5% to 105% ROA. The practical reading is per percentage point: a 0.01 increase in ROA is associated with Q higher by 0.01 × 14.58 ≈ 0.15. The coefficient is large because ROA ranges narrowly (−0.20 to 0.23 in this sample) while Q ranges widely. Significance: p < 0.001 with conventional standard errors, but the cluster-robust results reported further down raise the p-value to 0.10, so the within-firm link between profitability and Q is less certain than the conventional output suggests.
LnSales Negative or insignificant within FE Within a firm, as sales grow, Q may fall if the market perceives diminishing returns to scale or if growth was debt-funded. After fixed effects absorb stable cross-firm size differences, the within variation in LnSales often loses significance. An insignificant coefficient here does not mean size is unimportant — it means the time-series variation in size within each firm does not predict Q beyond what ROA and fixed effects already capture.

Within R²: Measures how much of the demeaned (within-firm, within-year) variation in Q is explained. A value of 0.40–0.70 is typical for two-way FE models on accounting panels, reflecting that most Q variation is cross-sectional (absorbed by firm dummies) rather than time-series.
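The fit statistics at the bottom of summary(model2) follow from the two sums of squares and the degrees of freedom. The sketch below redoes that arithmetic from the printed values; the adjusted R² formula used here is an assumption that happens to reproduce the printed figure.

```r
tss     <- 109.36   # Total Sum of Squares of the within-transformed Q
rss     <- 65.207   # Residual Sum of Squares
n_obs   <- 127
n_firms <- 24
n_years <- 10
k       <- 2        # ROA and LnSales

# Residual df: observations minus firm effects, year effects, and slopes
df_resid <- n_obs - n_firms - (n_years - 1) - k

r2     <- 1 - rss / tss
adj_r2 <- 1 - (1 - r2) * (n_obs - 1) / df_resid
f_stat <- ((tss - rss) / k) / (rss / df_resid)

cat(sprintf("df = %d | R2 = %.5f | adj. R2 = %.5f | F = %.4f\n",
            df_resid, r2, adj_r2, f_stat))
```

The large gap between R² and adjusted R² reflects the 33 degrees of freedom used up by the firm and year effects.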

# 1. F-test: are fixed effects jointly significant?
cat(
  "F-test (FE vs pooled):\n",
  capture.output(pFtest(model2, plm(Q ~ ROA + LnSales, data = pdata, model = "pooling"))),
  "\n\nHausman test (FE vs RE):\n",
  capture.output({
    re_model <- plm(Q ~ ROA + LnSales, data = pdata, model = "random", effect = "twoways")
    phtest(model2, re_model)
  }),
  "\n\nRobust coefficients (clustered by company):\n",
  capture.output(coeftest(model2, vcov = vcovHC(model2, type = "HC1", cluster = "group"))),
  sep = "\n"
)
## F-test (FE vs pooled):
## 
## 
##  F test for twoways effects
## 
## data:  Q ~ ROA + LnSales
## F = 7.7078, df1 = 32, df2 = 92, p-value = 6.361e-15
## alternative hypothesis: significant effects
## 
## 
## 
## Hausman test (FE vs RE):
## 
## 
##  Hausman Test
## 
## data:  Q ~ ROA + LnSales
## chisq = 0.37597, df = 2, p-value = 0.8286
## alternative hypothesis: one model is inconsistent
## 
## 
## 
## Robust coefficients (clustered by company):
## 
## 
## t test of coefficients:
## 
##         Estimate Std. Error t value Pr(>|t|)
## ROA     14.58248    8.82883  1.6517   0.1020
## LnSales -0.14885    0.83046 -0.1792   0.8581

1.3.2 F-test for Fixed Effects

What it tests. H₀: all firm and year fixed effects are jointly zero (pooled OLS is adequate). Because model2 is a two-way model, pFtest() tests both sets of effects at once: df1 = 32 corresponds to 23 firm effects plus 9 year effects.

Interpretation. The F-statistic is highly significant (F = 7.71, p < 0.001), so the firm and year effects are jointly significant: there is substantial unobserved heterogeneity that would bias pooled OLS coefficients if it is correlated with the regressors. Fixed effects are warranted. If the F-test were not significant, a simpler pooled model would do, but this rarely happens with corporate finance panels.
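The same kind of test can be reproduced with base R as a comparison of two nested lm() fits. This is a sketch on simulated data with made-up firm effects, not the pFtest() implementation; it is meant to show where the numerator degrees of freedom come from.

```r
set.seed(1)
toy <- expand.grid(id = 1:6, year = 1:5)            # 6 firms x 5 years, balanced
toy$roa <- rnorm(nrow(toy), mean = 0.05, sd = 0.03)
firm_fx <- c(-1, -0.5, 0, 0.5, 1, 2)[toy$id]        # hypothetical firm effects
toy$q   <- 1.5 + 10 * toy$roa + firm_fx + rnorm(nrow(toy), sd = 0.2)

pooled <- lm(q ~ roa, data = toy)                              # restricted model
lsdv   <- lm(q ~ roa + factor(id) + factor(year), data = toy)  # with firm + year dummies

ftest <- anova(pooled, lsdv)
df1   <- ftest$Df[2]            # restrictions tested: (6 - 1) + (5 - 1) = 9
p_val <- ftest[["Pr(>F)"]][2]

print(ftest)
```

By the same count, the assignment panel gives (24 − 1) + (10 − 1) = 32 restrictions, which matches df1 in the printed output.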

1.3.3 Hausman Test (FE vs. RE)

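What it tests. H₀: the random effects estimator is consistent (the firm and year effects are uncorrelated with the regressors). If H₀ is rejected, only fixed effects are consistent.

Interpretation. Here the Hausman statistic is not significant (chisq = 0.376, df = 2, p = 0.83), so H₀ is not rejected: the data give no evidence that the effects are correlated with ROA and LnSales, and random effects would be consistent and more efficient. Fixed effects remain consistent under both hypotheses, so keeping the two-way FE specification is a conservative choice rather than one the test requires. A significant statistic (p < 0.05) would have meant that the effects are correlated with the regressors, for example firms with high inherent profitability (captured by the firm dummy) also reporting higher ROA, in which case random effects would suffer from omitted-variable bias. That outcome is common in corporate governance and valuation panels, where managerial quality and industry structure (both largely time-invariant) are correlated with observed regressors, but it is not what this sample shows.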
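The printed Hausman p-value can be recovered from the statistic and its degrees of freedom, which makes the decision rule explicit. The 5% threshold below is the conventional choice, not something fixed by the test.

```r
chisq <- 0.37597   # printed Hausman statistic
df    <- 2         # number of coefficients compared (ROA, LnSales)

p_val     <- pchisq(chisq, df = df, lower.tail = FALSE)
reject_re <- p_val < 0.05   # TRUE would mean: random effects rejected, use FE

cat(sprintf("p-value: %.4f | reject RE at 5%%: %s\n", p_val, reject_re))
```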

1.3.4 Cluster-Robust Standard Errors (Model ii)

What it shows. The same coefficients as summary(model2), but with heteroskedasticity- and serial-correlation-robust standard errors clustered at the firm level. Clustered SEs account for the fact that residuals for the same company across years are likely correlated (e.g., a firm having a bad decade shows correlated negative shocks).

Interpretation. Compare the clustered SEs to the OLS SEs from summary(model2):

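  • If clustered SEs are larger, OLS was underestimating uncertainty and significance may be overstated. That is the case here: the SE on ROA rises from 2.20 to 8.83, and its p-value from below 0.001 to 0.102. The robust p-values are more reliable.
  • If they are similar, within-firm serial correlation is mild and OLS inference was already valid.

Rule of thumb: Always report robust SEs in panel regressions; base your significance conclusions on these, not the OLS SEs. With only 24 firms the clustered SEs are themselves imprecise, so treat borderline p-values with caution.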
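The effect of clustering on inference can be traced by hand from the two printed standard errors for ROA. The sketch assumes the t-test uses the model's 92 residual degrees of freedom, which is consistent with the printed p-value.

```r
est    <- 14.58248   # ROA coefficient (same in both outputs)
se_ols <- 2.20107    # conventional SE from summary(model2)
se_cl  <- 8.82883    # firm-clustered SE from coeftest()
df     <- 92         # residual degrees of freedom of model2

t_ols <- est / se_ols
t_cl  <- est / se_cl
p_cl  <- 2 * pt(-abs(t_cl), df)   # two-sided p-value under clustering
ratio <- se_cl / se_ols           # how much clustering inflates the SE

cat(sprintf("t (OLS): %.3f | t (clustered): %.3f | p (clustered): %.3f | SE ratio: %.2f\n",
            t_ols, t_cl, p_cl, ratio))
```

The coefficient itself is unchanged; only the uncertainty attached to it is about four times larger.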

Interpretation of Model ii:

  • ROA: The coefficient is positive (14.58) and highly significant with OLS SEs, indicating that a firm commands a higher market-to-book ratio in years when it is more profitable. This is consistent with investors capitalising future earnings capacity. With firm-clustered SEs the p-value is 0.102, so the evidence is weaker than the OLS output suggests.
  • LnSales: Negative (−0.15) but insignificant under both sets of SEs, so within a company there is no evidence that revenue growth alone moves market valuation.
  • Company FE absorb all time-invariant differences (industry, country, business model).
  • Year FE absorb common macroeconomic shocks (financial crisis, monetary policy cycles).

1.4 Model iii – Adding a Third Significant Variable

We search for an additional variable that significantly explains Tobin’s Q beyond ROA and LnSales. Leverage is theoretically motivated (trade-off theory, agency costs) and shows a negative raw correlation with Q (r ≈ −0.28).

# Extended model: add Leverage
model3 <- plm(Q ~ ROA + LnSales + Leverage,
              data   = pdata,
              model  = "within",
              effect = "twoways")

summary(model3)
## Twoways effects Within Model
## 
## Call:
## plm(formula = Q ~ ROA + LnSales + Leverage, data = pdata, effect = "twoways", 
##     model = "within")
## 
## Unbalanced Panel: n = 24, T = 1-10, N = 127
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -2.380130 -0.274169 -0.035615  0.255138  4.629783 
## 
## Coefficients:
##          Estimate Std. Error t-value  Pr(>|t|)    
## ROA      16.75012    2.28786  7.3213 9.498e-11 ***
## LnSales  -0.46445    0.39724 -1.1692   0.24538    
## Leverage  1.22791    0.46759  2.6260   0.01014 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    109.36
## Residual Sum of Squares: 60.614
## R-Squared:      0.44574
## Adj. R-Squared: 0.23256
## F-statistic: 24.3943 on 3 and 91 DF, p-value: 1.133e-11

1.4.1 Output

What it shows. Coefficients for ROA, LnSales, and Leverage, all within the two-way FE framework.

Coefficient Expected sign Interpretation
ROA Positive (same as M2) Profitability effect persists after controlling for leverage.
LnSales Still insignificant Unchanged conclusion from M2.
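Leverage Negative (by theory) The estimate is positive, not negative: 1.228 (p = 0.010 with OLS SEs). Within a firm, years with a higher leverage ratio are associated with higher Tobin’s Q; a 0.10 increase in Leverage goes with Q higher by roughly 0.10 × 1.228 ≈ 0.12. The negative sign was expected on three grounds: (1) Trade-off theory: excessive leverage raises financial distress costs that the market prices into equity. (2) Agency costs: debt holders constrain investment, destroying growth-option value. (3) Information asymmetry: highly levered firms signal fewer high-NPV projects available. None of these is supported by this sample. A positive association is closer to the tax-shield side of trade-off theory or to the disciplining role of debt, but this regression alone cannot separate those explanations from reverse causality.

Model improvement: The within-R² rises from 0.404 in M2 to 0.446 in M3, and the t-test on Leverage is significant at the 5% level with OLS SEs (p = 0.010), indicating that Leverage adds explanatory power beyond ROA and LnSales. With cluster-robust SEs the p-value is 0.056, so the result holds only at the 10% level.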

1.4.2 Model 3 – Cluster-robust standard errors

coeftest(model3, vcov = vcovHC(model3, type = "HC1", cluster = "group"))
## 
## t test of coefficients:
## 
##          Estimate Std. Error t value Pr(>|t|)  
## ROA      16.75012    9.14402  1.8318  0.07025 .
## LnSales  -0.46445    0.88010 -0.5277  0.59898  
## Leverage  1.22791    0.63449  1.9353  0.05606 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

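Same logic as for M2. Check whether Leverage remains significant under robust SEs. Here it does not at the 5% level: with clustered SEs the p-value on Leverage is 0.056 and on ROA 0.070, so both are significant only at the 10% level. The M3 results are therefore only weakly robust to serial correlation within firms.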
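The comparison for all three M3 coefficients can be laid out in one small table built from the printed estimates and standard errors. This is a sketch that recomputes the p-values with 91 residual degrees of freedom (an assumption consistent with the printed output), plus the implied change in Q for a 0.10 rise in Leverage.

```r
m3 <- data.frame(
  term   = c("ROA", "LnSales", "Leverage"),
  est    = c(16.75012, -0.46445, 1.22791),
  se_ols = c(2.28786, 0.39724, 0.46759),   # from summary(model3)
  se_cl  = c(9.14402, 0.88010, 0.63449)    # from coeftest() with clustering
)
df_resid <- 91

# Two-sided p-values under each set of standard errors
m3$p_ols <- 2 * pt(-abs(m3$est / m3$se_ols), df_resid)
m3$p_cl  <- 2 * pt(-abs(m3$est / m3$se_cl),  df_resid)

# Change in Q associated with a 0.10 increase in Leverage
dq_per_010 <- 0.10 * m3$est[m3$term == "Leverage"]

print(m3)
cat(sprintf("Q change per 0.10 Leverage: %.4f\n", dq_per_010))
```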

1.5 Panel Regression Comparison Table (M1, M2, M3)

# ---- Side-by-side comparison (manual, avoids stargazer/plm incompatibility) ----

# Significance stars used in this table: * p<0.10, ** p<0.05, *** p<0.01
# (these thresholds differ from the codes printed by summary())
sig_stars <- function(pv) {
  ifelse(pv < 0.01, "***", ifelse(pv < 0.05, "**", ifelse(pv < 0.10, "*", "")))
}

# Returns c("<estimate><stars>", "(<std. error>)") for one coefficient,
# or a placeholder pair when the variable is not in the model
extract_row <- function(model, var) {
  s <- summary(model)$coefficients
  if (!var %in% rownames(s)) return(c("—", ""))
  est <- round(s[var, "Estimate"], 4)
  se  <- round(s[var, "Std. Error"], 4)
  c(paste0(est, sig_stars(s[var, "Pr(>|t|)"])), paste0("(", se, ")"))
}

# M1 has only an intercept; reuse the same helper for it
m1_row    <- extract_row(model1, "(Intercept)")
m1_int    <- m1_row[1]
m1_int_se <- m1_row[2]

coef_table <- data.frame(
  Variable = c("Constant", "", "ROA", "", "Ln(Sales)", "", "Leverage", ""),
  M1_Constant    = c(m1_int, m1_int_se, "—", "", "—", "", "—", ""),
  M2_FE_Controls = c("absorbed", "",
                      extract_row(model2, "ROA")[1],     extract_row(model2, "ROA")[2],
                      extract_row(model2, "LnSales")[1], extract_row(model2, "LnSales")[2],
                      "—", ""),
  M3_FE_Leverage = c("absorbed", "",
                      extract_row(model3, "ROA")[1],      extract_row(model3, "ROA")[2],
                      extract_row(model3, "LnSales")[1],  extract_row(model3, "LnSales")[2],
                      extract_row(model3, "Leverage")[1], extract_row(model3, "Leverage")[2])
)

r2_m2 <- round(summary(model2)$r.squared["rsq"], 3)
r2_m3 <- round(summary(model3)$r.squared["rsq"], 3)

footer <- data.frame(
  Variable       = c("R² (within)", "N", "Company FE", "Year FE"),
  M1_Constant    = c("n/a", nrow(pdata), "No", "No"),
  M2_FE_Controls = c(r2_m2, nrow(pdata), "Yes", "Yes"),
  M3_FE_Leverage = c(r2_m3, nrow(pdata), "Yes", "Yes")
)

kable(rbind(coef_table, footer),
      col.names = c("", "M1: Constant", "M2: ROA+LnSales+FE", "M3: +Leverage"),
      caption   = "Panel Regression Results — Dependent Variable: Tobin's Q. Std. errors in parentheses. * p<0.10, ** p<0.05, *** p<0.01") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  row_spec(nrow(coef_table) + c(1, 2, 3, 4), bold = TRUE, background = "#f0f0f0")
Panel Regression Results — Dependent Variable: Tobin’s Q. Std. errors in parentheses. * p<0.10, ** p<0.05, *** p<0.01
             M1: Constant   M2: ROA+LnSales+FE   M3: +Leverage
Constant     1.9611***      absorbed             absorbed
             (0.153)
ROA                         14.5825***           16.7501***
                            (2.2011)             (2.2879)
Ln(Sales)                   -0.1488              -0.4644
                            (0.3906)             (0.3972)
Leverage                                         1.2279**
                                                 (0.4676)
R² (within)  n/a            0.404                0.446
N            127            127                  127
Company FE   No             Yes                  Yes
Year FE      No             Yes                  Yes

What it shows. A side-by-side summary of all three models with coefficients, standard errors (in parentheses), significance stars, within-R², and fixed-effect flags.

How to read it:

Column Meaning
M1: Constant Grand mean of Q. No predictors — baseline benchmark.
M2: ROA + LnSales + FE Core specification. Intercept is “absorbed” — firm and year dummies replace it.
M3: + Leverage Extended model showing leverage adds explanatory power.

Key comparisons:

  • ROA is significant in both M2 and M3 with OLS SEs, so the profitability–valuation link survives adding a capital structure variable.
  • LnSales is consistently insignificant within firms: size variation over time does not independently drive Q changes after fixed effects are absorbed.
  • The rise in within-R² from M2 to M3 (0.404 to 0.446) quantifies the marginal contribution of Leverage.
  • Significance stars (*, **, ***) are based on OLS SEs; cross-check with the robust SE output before drawing final conclusions.
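One detail worth keeping in mind when reading the stars: the comparison table uses the thresholds 0.10 / 0.05 / 0.01, while summary() prints R's default codes (0.001 / 0.01 / 0.05 / 0.1). The sketch below puts the two conventions side by side; the function names r_stars() and table_stars() are hypothetical helpers introduced for illustration.

```r
# Codes as printed by summary(): 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
r_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", " "),
                   include.lowest = TRUE))
}

# Codes used in the comparison table: * p<0.10, ** p<0.05, *** p<0.01
table_stars <- function(p) {
  ifelse(p < 0.01, "***", ifelse(p < 0.05, "**", ifelse(p < 0.10, "*", "")))
}

p_leverage <- 0.01014   # Leverage p-value from summary(model3)
cat(sprintf("summary(): '%s' | table: '%s'\n",
            r_stars(p_leverage), table_stars(p_leverage)))
```

This is why Leverage carries one star in the summary(model3) output but two in the comparison table, with the same underlying p-value.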


1.6 Summary Table – Model Comparison

models_summary <- data.frame(
  Model     = c("M1: Constant only", "M2: ROA + LnSales + FE", "M3: + Leverage"),
  Variables = c("None (intercept)", "ROA, LnSales, Year FE, Co. FE",
                "ROA, LnSales, Leverage, Year FE, Co. FE"),
  R2_within = c(NA,
                round(summary(model2)$r.squared["rsq"], 3),
                round(summary(model3)$r.squared["rsq"], 3)),
  Notes     = c("Grand-mean baseline", "Core specification", "Leverage positive, significant at 5% (OLS SEs)")
)
kable(models_summary, caption = "Panel Model Summary") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Panel Model Summary
Model Variables R2_within Notes
M1: Constant only None (intercept) NA Grand-mean baseline
M2: ROA + LnSales + FE ROA, LnSales, Year FE, Co. FE 0.404 Core specification
M3: + Leverage ROA, LnSales, Leverage, Year FE, Co. FE 0.446 Leverage positive, significant at 5% (OLS SEs)

What it shows. A compact three-row summary: model name, variables included, within-R², and a brief note on each specification.

Interpretation.

  • M1 (Grand mean): R² = 0. This is not a failure; it simply confirms the model has no regressors. It is the floor against which all others are judged.
  • M2 (Core FE model): The within-R² of 0.404 means that about 40% of the variation in Tobin’s Q left after removing firm and year effects is explained by changes in profitability (ROA) and size (LnSales). Common macroeconomic cycles are absorbed by the year effects and do not count towards this figure.
  • M3 (Extended FE model): The within-R² rises to 0.446, indicating that capital structure (Leverage) carries incremental information about firm value that ROA and LnSales do not fully capture.

Bottom line for Part 1: Within firms, Tobin’s Q is positively associated with profitability (ROA) and, contrary to the prior expectation, positively associated with financial leverage, after controlling for all stable firm characteristics and macro cycles. Both effects are significant at conventional levels with OLS SEs but only at the 10% level with firm-clustered SEs. Size (LnSales) does not independently move Q within firms over time.


2 Part 2 – Event Study

2.1 Setup & Data Import

2.1.1 Install / load packages

# Additional packages needed for Parts 2 & 3 (Part 1 packages already loaded above)
pkgs2 <- c("tidyr", "broom")
new2  <- pkgs2[!pkgs2 %in% installed.packages()[,"Package"]]
if (length(new2)) install.packages(new2, repos = "https://cran.r-project.org")

library(tidyr)
library(broom)

2.1.2 Load all sheets from EventStudy_LRS28.xlsx

# ── Change path if the file is elsewhere ──────────────────────────────────────
FILE <- "~/Desktop/FAU /Semester 7/exchange UTU/LRS28:TKMS13 Advanced Corporate Finance LT013075-3008/home assignment/EventStudy_LRS28.xlsx"

sample_df  <- read_excel(FILE, sheet = "1_Sample",      skip = 0)
ret_raw    <- read_excel(FILE, sheet = "2_ReturnData",   skip = 1)   # skip merged title row
ar_raw     <- read_excel(FILE, sheet = "3_AR",           skip = 1)
car_raw    <- read_excel(FILE, sheet = "4_CAR_CAAR",     skip = 1)
cs_raw     <- read_excel(FILE, sheet = "6_CrossSection", skip = 1)

cat("Sheets loaded successfully.\n")
## Sheets loaded successfully.

2.1.3 Clean & structure the data

# ── Company names & tickers ───────────────────────────────────────────────────
companies <- sample_df$Company[!is.na(sample_df$Company)][1:10]
tickers   <- sample_df$Ticker[!is.na(sample_df$Ticker)][1:10]
evt_dates <- sample_df$`Event Date`[!is.na(sample_df$`Event Date`)][1:10]
N <- 10

# ── Market model parameters (from 1_Sample) ───────────────────────────────────
mm_params <- sample_df %>%
  filter(!is.na(Company), row_number() <= N) %>%
  select(Company, Ticker,
         alpha = `α (alpha)`,
         beta  = `β (beta)`,
         sigma = `Resid. σ`) %>%
  mutate(across(c(alpha, beta, sigma), as.numeric))

# ── Estimation window returns ─────────────────────────────────────────────────
# Panel A rows: day column is numeric (negative values)
ret_est <- ret_raw %>%
  rename(Day = `Day (rel.)`, Market = OMXH25) %>%
  filter(!is.na(Day), is.numeric(Day) | !is.na(suppressWarnings(as.numeric(Day)))) %>%
  mutate(Day = as.numeric(Day)) %>%
  filter(Day < -10) %>%
  mutate(across(-Day, as.numeric))

# ── Event window returns ───────────────────────────────────────────────────────
ret_evt <- ret_raw %>%
  rename(Day = `Day (rel.)`, Market = OMXH25) %>%
  mutate(Day = suppressWarnings(as.numeric(Day))) %>%
  filter(!is.na(Day), Day >= -10, Day <= 10) %>%
  mutate(across(-Day, as.numeric))

# ── Abnormal returns (event window) ───────────────────────────────────────────
ar_df <- ar_raw %>%
  rename(Day = `Event Day`) %>%
  filter(!is.na(Day), suppressWarnings(as.numeric(Day)) %in% -10:10) %>%
  mutate(Day = as.numeric(Day)) %>%
  select(Day, all_of(companies), AAR = `AAR (avg)`) %>%
  mutate(across(-Day, as.numeric))

# ── CAR per company (from 4_CAR_CAAR Section A) ───────────────────────────────
car_df <- car_raw %>%
  filter(!is.na(Company), Company %in% companies) %>%
  select(Company,
         CAR_m1p1  = `CAR [-1,+1]`,
         CAR_m10p0 = `CAR [-10,0]`,
         CAR_0p10  = `CAR [0,+10]`,
         CAR_0p1   = `CAR [0,+1]`) %>%
  mutate(across(-Company, as.numeric))

# ── Daily CAAR time series (Section C of 4_CAR_CAAR) ─────────────────────────
caar_ts <- ar_df %>%
  arrange(Day) %>%
  mutate(CAAR = cumsum(AAR))

# ── Cross-sectional data (6_CrossSection) ─────────────────────────────────────
cs_df <- cs_raw %>%
  filter(!is.na(Company), Company %in% companies) %>%
  select(Company,
         CAR_0p1   = `CAR [0,+1]`,
         LnMktCap  = `Ln(MktCap)`) %>%
  mutate(across(-Company, as.numeric))

cat("Data structures ready.\n")
## Data structures ready.
cat("Estimation window obs:", nrow(ret_est), "\n")
## Estimation window obs: 141
cat("Event window obs:     ", nrow(ret_evt), "\n")
## Event window obs:      30
cat("AR matrix rows:       ", nrow(ar_df), "\n")
## AR matrix rows:        21
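The printed counts deserve a second look: the design specifies 150 estimation days and a 21-day event window, while the output reports 141 and 30 rows (and the stated range t = −161 to t = −11 spans 151 days if both ends are included). The output alone does not reveal the cause. A small check like the one below can help locate it; check_window() is a hypothetical helper, and the toy Day vector merely reproduces the symptom by stacking part of the event window twice.

```r
# Hypothetical helper: compare rows found in a window with the number expected
check_window <- function(days, from, to) {
  in_window <- days[!is.na(days) & days >= from & days <= to]
  list(found      = length(in_window),
       expected   = to - from + 1,
       duplicated = sum(duplicated(in_window)))
}

# Toy Day column: three estimation days, the full event window,
# and nine event-window days that appear a second time
days_toy <- c(-13:-11, -10:10, -4:4)

res <- check_window(days_toy, from = -10, to = 10)
str(res)
```

Applied to the real data, the same call on ret_evt$Day would show whether the extra rows are repeated day labels or something else.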

2.2 Study Design

design <- data.frame(
  Element   = c("Event","Market proxy","Return type","Estimation window",
                "Event window","Sample size","Model","Test"),
  Detail    = c("Profit warning (negative earnings guidance revision)",
                "OMXH25 (Helsinki 25 Index)",
                "Arithmetic: R_t = (P_t − P_{t-1}) / P_{t-1}",
                "150 trading days, t = −161 to t = −11",
                "t = −10 to t = +10 (21 days)",
                "N = 10 Finnish listed companies",
                "Market model: R_{i,t} = α_i + β_i · R_{m,t} + ε_{i,t}",
                "Cross-sectional t-test  H₀: CAAR = 0")
)
kable(design, col.names = c("Element","Detail"),
      caption = "Event Study Design") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE, width = "6em") %>%
  column_spec(2, width = "30em")
Event Study Design
Element Detail
Event Profit warning (negative earnings guidance revision)
Market proxy OMXH25 (Helsinki 25 Index)
Return type Arithmetic: R_t = (P_t − P_{t-1}) / P_{t-1}
Estimation window 150 trading days, t = −161 to t = −11
Event window t = −10 to t = +10 (21 days)
Sample size N = 10 Finnish listed companies
Model Market model: R_{i,t} = α_i + β_i · R_{m,t} + ε_{i,t}
Test Cross-sectional t-test H₀: CAAR = 0

What it shows. A structured summary of the research design choices — the eight key methodological decisions that define the study.

Element Why it matters
Event: Profit warnings Profit warnings are unambiguous negative information events. They are sudden, publicly announced, and directly informative about future cash flows — making them ideal for detecting market reaction.
Market proxy: OMXH25 The 25 most liquid Finnish stocks form a well-diversified, readily available benchmark. Using a local index reduces the risk of including irrelevant global market noise in the normal-return model.
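Arithmetic returns Simple percentage returns \(R_t = (P_t − P_{t-1})/P_{t-1}\) are used (not log returns). Simple returns aggregate exactly across securities, which suits averaging abnormal returns over the 10 firms. Summing them over days to obtain CAR = ΣAR is the usual approximation to the compounded return; it is log returns, not simple returns, that are exactly additive over time.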
150-day estimation window Long enough to estimate α and β reliably while ending 11 days before the event to avoid contamination by anticipatory price movements.
[−10, +10] event window Captures both pre-event information leakage (or analyst revision drift) and the post-event price-discovery process.
Market model \(R_{i,t} = α_i + β_i · R_{m,t} + ε_{i,t}\). The simplest and most widely used benchmark. It improves on the constant-mean model by removing the market-driven part of each stock’s return, and on the market-adjusted model whenever β differs from 1.
Cross-sectional t-test Tests whether the cumulative average abnormal return (CAAR) across all 10 firms is significantly different from zero, the standard hypothesis test in event studies.

Interpretation. A well-designed event study requires clean event dates (no prior leakage), a representative market proxy, and a long-enough estimation window. The design addresses each of these, although with N = 10 events the cross-sectional test has limited power.
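The design elements above fit together as a short pipeline: estimate the market model on the estimation window, compute abnormal returns in the event window, cumulate, and test. The sketch below runs that pipeline on simulated returns with base R only. Everything in it is an assumption for illustration: the day numbering (−160 to −11), the return parameters, and the −5% shock planted on day 0. In the report itself these quantities are read from the Excel sheets rather than computed this way.

```r
set.seed(123)
n_firms  <- 10
est_days <- -160:-11          # 150-day estimation window (assumed numbering)
evt_days <- -10:10            # 21-day event window
all_days <- c(est_days, evt_days)

# Simulated market and stock returns; a -5% shock is planted on day 0
r_m       <- rnorm(length(all_days), mean = 0.0003, sd = 0.01)
beta_true <- runif(n_firms, 0.6, 1.5)
returns <- sapply(beta_true, function(b) {
  r <- b * r_m + rnorm(length(all_days), sd = 0.015)
  r[all_days == 0] <- r[all_days == 0] - 0.05
  r
})

is_est <- all_days %in% est_days
is_evt <- all_days %in% evt_days

# Market model per firm on the estimation window,
# abnormal return = actual return minus model-implied return in the event window
ar <- sapply(seq_len(n_firms), function(i) {
  fit      <- lm(returns[is_est, i] ~ r_m[is_est])
  expected <- coef(fit)[1] + coef(fit)[2] * r_m[is_evt]
  returns[is_evt, i] - expected
})

aar  <- rowMeans(ar)   # average abnormal return per event day
caar <- cumsum(aar)    # cumulative AAR from day -10 onwards

# CAR[-1,+1] per firm and the cross-sectional t-test of H0: CAAR = 0
car_m1p1 <- colSums(ar[evt_days %in% -1:1, , drop = FALSE])
t_stat   <- mean(car_m1p1) / (sd(car_m1p1) / sqrt(n_firms))

cat(sprintf("CAAR[-1,+1] = %.4f | t = %.2f\n", mean(car_m1p1), t_stat))
```

The t-statistic computed this way is the same as the one returned by t.test(car_m1p1), which may be the more convenient call in practice.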

2.3 Sample

mm_params %>%
  mutate(alpha = round(alpha, 6),
         beta  = round(beta,  3),
         sigma = round(sigma, 5)) %>%
  kable(col.names = c("Company","Ticker","α (alpha)","β (beta)","Resid. σ"),
        caption = "Sample: 10 Finnish companies with profit warnings") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE)
Sample: 10 Finnish companies with profit warnings
Company Ticker α (alpha) β (beta) Resid. σ
Nokia NOKIA.HE 0.001697 0.869 0.01784
Outokumpu OUT1V.HE -0.000167 1.523 0.02092
Finnair FIA1S.HE -0.000965 1.288 0.02528
Metso METSO.HE -0.000232 0.964 0.01675
Stora Enso STERV.HE 0.003072 0.938 0.01232
Wärtsilä WRT1V.HE 0.001131 0.986 0.01697
Fortum FORTUM.HE 0.001163 0.650 0.01348
UPM UPM.HE -0.000148 0.704 0.01252
Neste NESTE.HE 0.003724 0.978 0.01386
Konecranes KCR.HE 0.000333 0.984 0.01955

What it shows. For each company: ticker, estimated intercept (α), slope (β), and residual standard deviation (σ) from the 150-day market model regression.

Parameter Interpretation
α (alpha) The regression intercept: the average daily return not explained by market movements during the estimation window. It should be close to zero. A large positive α means the stock systematically outperformed the market before the event, which may signal a contaminated estimation window. In this sample α ranges from −0.0010 (Finnair) to 0.0037 (Neste), i.e. roughly −0.10% to +0.37% per day.
β (beta) Measures systematic risk relative to OMXH25. β = 1.0 means the stock moves one-for-one with the market. β < 1 (e.g., Fortum ≈ 0.65) signals a defensive stock that is less sensitive to the market; β > 1 (e.g., Outokumpu ≈ 1.52) signals a cyclical stock that amplifies market moves.
σ (Resid. σ) The standard deviation of the residuals from the estimation regression. This is the firm-specific noise level: the larger σ is, the less of the stock's movement the market explains, and the larger an AR must be to be statistically significant.

Key take-away on betas: The range from ~0.65 (Fortum, a utility) to ~1.52 (Outokumpu, steel, highly cyclical) is economically plausible and reflects the cross-industry mix in the sample. The market model adjusts for these differences, so a high-beta firm’s large price decline counts as an abnormal return only to the extent that it exceeds what the market’s own decline would predict.
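
To make the three parameters concrete, here is a minimal sketch that estimates them on simulated data. Everything in it is an assumption for illustration: the returns are random draws, the "true" beta of 1.2 is made up, and only the 150-day window length is taken from the study design.

```r
set.seed(1)                      # reproducible simulated data
n   <- 150                       # estimation-window length used in this study
r_m <- rnorm(n, 0, 0.01)         # hypothetical market returns
r_i <- 0.0005 + 1.2 * r_m + rnorm(n, 0, 0.015)  # hypothetical stock, true beta 1.2

fit   <- lm(r_i ~ r_m)           # market model regression
alpha <- unname(coef(fit)[1])    # intercept
beta  <- unname(coef(fit)[2])    # slope on the market return
sigma <- summary(fit)$sigma      # residual standard deviation

round(c(alpha = alpha, beta = beta, sigma = sigma), 4)
```

The estimated beta will differ from 1.2 by sampling error, which is the same reason the betas in the sample table should be read as estimates rather than fixed firm characteristics.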


2.4 Step 1 – Returns in Estimation Window

# Long format for plotting
ret_est_long <- ret_est %>%
  select(Day, Market, all_of(companies[1:4])) %>%
  pivot_longer(-Day, names_to = "Series", values_to = "Return")

ggplot(ret_est_long, aes(x = Day, y = Return * 100, colour = Series)) +
  geom_line(alpha = 0.7, linewidth = 0.5) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
  labs(title   = "Estimation Window — Daily Returns (first 4 firms + market)",
       subtitle = "150 trading days, t = -161 to t = -11",
       x = "Relative day", y = "Return (%)", colour = "") +
  scale_colour_brewer(palette = "Set1") +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom")

What it shows. Line chart of daily percentage returns for the OMXH25 index and the first four companies across the 150-day estimation window (t = −161 to t = −11).

How to interpret it:

  • Co-movement with the market (OMXH25 line): All firm lines should broadly track the market line — this is the visual confirmation that the market model is appropriate. If a stock line is completely unrelated to the market line, β ≈ 0 and the market model adds little over the constant-mean model.
  • Amplitude differences: High-β firms should show more extreme swings (taller peaks and deeper troughs) than low-β firms. For example, Nokia’s line (β ≈ 0.87) should be flatter than Outokumpu’s (β ≈ 1.52).
  • No spikes near the end of the window: If you see a large spike at t = −12 or −11, the event may have been anticipated earlier than the official announcement, or the estimation window is contaminated. No such spikes should be visible.
  • Clustering around zero: Returns hover around 0% — this is expected for daily data in a 150-day window without a major shock.

Overall interpretation. The plot validates the estimation window as “normal”: returns fluctuate randomly around zero, track the market to varying degrees consistent with each firm’s beta, and show no structural breaks that would invalidate the OLS regression.

2.5 Step 2 – Market Model Estimation

# Re-estimate OLS per firm on estimation window (verify parameters)
mkt_col   <- ret_est$Market
mm_reest  <- lapply(companies, function(co) {
  y <- as.numeric(ret_est[[co]])
  keep <- !is.na(y) & !is.na(mkt_col)
  lm(y[keep] ~ mkt_col[keep])
})
names(mm_reest) <- companies

mm_check <- data.frame(
  Company = companies,
  alpha_R = sapply(mm_reest, function(m) coef(m)[1]),
  beta_R  = sapply(mm_reest, function(m) coef(m)[2]),
  R2      = sapply(mm_reest, function(m) summary(m)$r.squared)
) %>%
  left_join(mm_params %>% select(Company, alpha_xl = alpha, beta_xl = beta),
            by = "Company") %>%
  mutate(across(where(is.numeric), \(x) round(x, 4)))

kable(mm_check,
      col.names = c("Company","α (R)","β (R)","R²","α (Excel)","β (Excel)"),
      caption   = "Market model estimates: R vs Excel (should match)") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
Market model estimates: R vs Excel (should match)
Company α (R) β (R) R² α (Excel) β (Excel)
Nokia 0.0013 0.8585 0.1499 0.0017 0.8690
Outokumpu 0.0000 1.5710 0.3000 -0.0002 1.5232
Finnair -0.0015 1.2730 0.1671 -0.0010 1.2876
Metso -0.0010 0.9165 0.1982 -0.0002 0.9642
Stora Enso 0.0029 0.9550 0.3156 0.0031 0.9384
Wärtsilä 0.0015 0.9586 0.2023 0.0011 0.9863
Fortum 0.0012 0.6576 0.1589 0.0012 0.6496
UPM 0.0000 0.6681 0.1797 -0.0001 0.7035
Neste 0.0038 0.9381 0.2690 0.0037 0.9777
Konecranes 0.0006 1.0238 0.1778 0.0003 0.9837

What it shows. Side-by-side comparison of α and β estimated in R (lm() on the estimation window) versus the values pre-computed in the Excel file. This is a consistency/audit check.

How to read it:

Column pair Should match?
α (R) vs. α (Excel) Yes — within rounding (4 decimal places). Differences of > 0.001 suggest different estimation windows or return definitions were used in Excel.
β (R) vs. β (Excel) Yes — within rounding. A mismatch here means arithmetic returns were used in one but log returns in the other, or the estimation window dates differ.
R² Not in Excel, but informative. R² of 0.05–0.30 is typical for daily return regressions. Low R² is normal: it simply means the market explains only a fraction of daily return variation, which is fine as long as β is estimated consistently.

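If α and β match, the Excel-based AR calculations are confirmed and the R analysis is a valid replication. In the table above the two sets agree closely but not exactly: α differs by at most about 0.0008 (Metso) and β by up to about 0.05 (Outokumpu: 1.5710 vs. 1.5232). Gaps of that size are larger than rounding alone would produce and may reflect a slightly different estimation sample, for example in how missing observations were handled.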


2.6 Step 3 – Abnormal Returns (AR)

\[AR_{i,t} = R_{i,t} - (\hat{\alpha}_i + \hat{\beta}_i \cdot R_{m,t})\]

ar_df %>%
  mutate(across(-Day, \(x) round(x*100, 2))) %>%
  kable(caption = "Abnormal Returns (%) — event window t = -10 to +10",
        col.names = c("Day", companies, "AAR")) %>%
  kable_styling(bootstrap_options = c("striped","condensed","hover"),
                font_size = 10, full_width = TRUE) %>%
  row_spec(which(ar_df$Day == 0), bold = TRUE, background = "#fff3cd") %>%
  row_spec(which(ar_df$Day %in% c(-1, 1)), background = "#e8f4f8") %>%
  scroll_box(width = "100%")
Abnormal Returns (%) — event window t = -10 to +10
Day Nokia Outokumpu Finnair Metso Stora Enso Wärtsilä Fortum UPM Neste Konecranes AAR
-10 -0.81 1.84 2.18 1.31 1.59 -1.14 -1.13 -0.23 -0.72 -0.25 0.26
-9 -0.86 -3.07 -1.30 -2.56 -0.54 -0.17 1.31 -1.23 -0.63 1.55 -0.75
-8 -1.80 0.44 0.00 -1.60 1.12 -0.80 -0.11 -0.17 -2.77 1.45 -0.42
-7 2.75 4.19 0.89 -1.68 1.93 -1.28 2.38 0.15 -1.19 -3.32 0.48
-6 -0.74 -2.29 0.53 1.55 1.50 -2.49 -0.17 -0.24 -0.31 2.48 -0.02
-5 2.05 -1.32 2.01 2.82 2.09 -0.21 -1.06 -0.84 0.10 -3.22 0.24
-4 2.55 0.63 -1.32 -0.24 0.50 -2.72 -1.98 0.34 -2.37 2.07 -0.25
-3 1.64 0.26 1.30 -0.26 -0.26 1.18 -2.43 0.63 1.28 2.16 0.55
-2 -0.91 1.06 -0.33 0.23 -2.38 0.09 -2.22 -1.43 -1.19 -2.16 -0.92
-1 -0.37 -0.48 -0.46 1.19 -1.30 -3.45 -0.96 1.28 0.07 -1.76 -0.62
0 -6.38 -7.47 -13.48 -5.88 -5.15 -6.29 -7.69 -4.24 -4.34 -8.87 -6.98
1 -2.41 1.31 -3.90 -0.07 2.26 -1.97 -2.94 -0.32 -1.99 -1.04 -1.11
2 0.41 0.58 2.61 1.53 -0.70 0.41 1.11 1.01 1.78 -0.52 0.82
3 -2.26 -0.33 -1.09 0.89 -0.02 2.21 -0.77 0.30 -1.76 1.42 -0.14
4 1.97 1.21 1.95 0.29 0.35 -3.31 0.13 1.32 0.64 2.54 0.71
5 -0.99 4.06 0.95 1.62 -0.29 0.09 2.01 0.43 0.93 2.67 1.15
6 0.20 -0.52 4.55 -4.33 1.97 -0.39 1.10 -0.72 -1.54 -1.03 -0.07
7 0.33 -2.65 -0.79 1.01 -1.42 0.98 -0.36 0.59 1.01 -0.29 -0.16
8 -0.61 2.48 -0.72 -0.99 0.01 0.28 -1.28 -1.46 -0.62 -3.23 -0.61
9 0.92 -0.09 0.19 -2.84 -1.24 -1.35 -2.94 0.14 0.29 -1.59 -0.85
10 -0.99 -4.57 -0.49 0.91 -2.03 -0.61 -1.08 -0.05 0.30 1.76 -0.68

What it shows. A 21×(N+1) matrix: event days −10 to +10 in rows, each company’s AR plus the daily Average Abnormal Return (AAR) in columns. Highlighted rows: t = 0 (yellow background) = announcement day; t = −1 and t = +1 (blue background) = the [−1,+1] window.

How to read it:

  • Large negative ARs at t = 0: The core result. Profit warnings are negative events; the market should react with negative abnormal returns on the day of announcement. ARs of −2% to −8% on t = 0 are typical for profit warnings.
  • Pre-event drift (t = −5 to −1): If ARs are systematically negative before t = 0, this indicates information leakage (analysts or insiders trading on the news before the official announcement). This is common for profit warnings — management often consults analysts in the days before the announcement.
  • Post-event drift (t = +1 to +5): Persistent negative ARs after t = 0 suggest the market continues to re-price the stock as it processes the full implications of the warning. Under semi-strong market efficiency, ARs should return to zero quickly after the event.
  • AAR column (rightmost): The cross-sectional average at each day. Systematic negative AARs (especially at t = 0) and their reversal to near-zero afterwards is the expected pattern for a clean event.
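  • Positive ARs for individual firms on t = 0: In general not every firm drops, because some profit warnings are already priced in or milder than expected. In this sample, however, all ten ARs at t = 0 are negative, ranging from −4.24% (UPM) to −13.48% (Finnair).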
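
The AR definition at the top of this step can be worked through by hand. The sketch below uses Nokia's α and β from the sample table; the two event-day returns are hypothetical numbers chosen for illustration, not values from the data.

```r
# Nokia's market model estimates from the sample table
alpha <- 0.001697
beta  <- 0.869

# Hypothetical event-day returns (illustrative, not from the data)
r_stock  <- -0.060   # stock falls 6%
r_market <-  0.005   # market rises 0.5%

expected <- alpha + beta * r_market   # normal return implied by the market model
ar       <- r_stock - expected        # abnormal return

round(c(expected = expected, AR = ar) * 100, 2)   # in percent
```

Because the market rose on the hypothetical day, the abnormal return (about −6.6%) is slightly more negative than the raw return of −6%.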

2.6.1 AR Heatmap

ar_long <- ar_df %>%
  select(Day, all_of(companies)) %>%
  pivot_longer(-Day, names_to = "Company", values_to = "AR") %>%
  mutate(Company = factor(Company, levels = rev(companies)))

ggplot(ar_long, aes(x = Day, y = Company, fill = AR * 100)) +
  geom_tile(colour = "white", linewidth = 0.3) +
  geom_vline(xintercept = 0, colour = "black", linewidth = 1) +
  scale_fill_gradient2(low = "#d73027", mid = "white", high = "#1a9850",
                       midpoint = 0, name = "AR (%)") +
  scale_x_continuous(breaks = -10:10) +
  labs(title    = "Abnormal Return Heatmap",
       subtitle  = "Red = negative AR, Green = positive AR, Black line = event day t = 0",
       x = "Event Day", y = "") +
  theme_minimal(base_size = 12) +
  theme(panel.grid = element_blank(),
        axis.text.y = element_text(size = 9))

What it shows. A colour-coded grid: companies on the y-axis, event days on the x-axis. Red tiles = negative AR; green tiles = positive AR; white ≈ zero. The black vertical line marks t = 0.

How to interpret it:

  • A column of red at or just before t = 0: The dominant pattern for a profit-warning study. If most firms show red on the same days, the effect is systematic, not company-specific.
  • Green tiles pre-event: A few positive pre-event ARs are normal noise. If an entire row (one company) is consistently green before t = 0 and then suddenly red, that firm’s warning may have been unanticipated.
  • Mixed colours post-event: Gradual return to white/mixed colours after t = +3 is consistent with rapid price discovery (semi-strong efficiency).
  • One very dark red tile isolated for a single firm at t = 0: This firm experienced the largest market surprise — its warning was most unexpected relative to market expectations built into the estimation-window β.
  • Overall pattern: You should see a “red band” centered around t = 0, fading to mixed colours in both directions. This is the visual fingerprint of a significant negative event.

2.7 Step 4 – CAR per Company

\[CAR_i[t_1, t_2] = \sum_{t=t_1}^{t_2} AR_{i,t}\]

2.7.1 CAR Table

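# Build the styled table first, then colour each CAR column by its own sign
car_tbl <- car_df %>%
  mutate(across(-Company, \(x) round(x*100, 2))) %>%
  kable(col.names = c("Company","CAR [-1,+1] %","CAR [-10,0] %",
                       "CAR [0,+10] %","CAR [0,+1] %"),
        caption = "Cumulative Abnormal Returns (%) per company") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

# column_spec() expects one colour per row, so loop over the four CAR columns
for (j in 2:5) {
  car_tbl <- column_spec(car_tbl, j,
                         color = ifelse(car_df[[j]] < 0, "red", "darkgreen"))
}
car_tbl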
Cumulative Abnormal Returns (%) per company
Company CAR [-1,+1] % CAR [-10,0] % CAR [0,+10] % CAR [0,+1] %
Nokia -9.15 -2.85 -9.81 -8.79
Outokumpu -6.63 -6.21 -5.97 -6.16
Finnair -17.85 -9.98 -10.23 -17.39
Metso -4.77 -5.13 -7.86 -5.96
Stora Enso -4.19 -0.89 -6.27 -2.89
Wärtsilä -11.70 -17.27 -9.94 -8.25
Fortum -11.59 -14.07 -12.72 -10.63
UPM -3.27 -5.99 -3.00 -4.55
Neste -6.26 -12.06 -5.30 -6.33
Konecranes -11.67 -9.87 -8.19 -9.92

What it shows. For each company: CAR over the [−1,+1], [−10,0], [0,+10], and [0,+1] windows, expressed in percentage points (after ×100). Red values = negative CARs; green = positive.

How to interpret each window:

Window Economic meaning
[−1,+1] The “purest” event window — captures the immediate market reaction to the profit warning (day before, announcement day, day after). This is the most commonly reported result in event studies. Large negative values here confirm the market reacted significantly and negatively.
[−10,0] Run-up plus announcement day. Because the window includes t = 0, a negative CAR here is not by itself evidence of anticipatory trading or leakage; only the part accumulated over t = −10 to −1 is. If prices fall substantially before the official announcement, the [−1,+1] reaction is correspondingly smaller.
[0,+10] Announcement day plus the following ten days. Because the window includes t = 0, a negative CAR mostly reflects the announcement-day drop. Post-event drift is the part accumulated over t = +1 to +10; in an efficient market that part should be close to zero, and a significantly negative value implies underreaction at t = 0 (investors gradually lower expectations as analysts revise their models).
[0,+1] Used as the dependent variable in the cross-sectional OLS. It combines the announcement-day and next-day reactions — useful because some announcements occur after market close (making t = 0 partially and t = +1 the full reaction day).

Cross-firm variation: In this sample all ten firms have negative CARs in every window, but the magnitudes differ widely; CAR[0,+1], for example, ranges from −2.89% (Stora Enso) to −17.39% (Finnair). This cross-sectional variation in CARs is what the Part 3 regression aims to explain.
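
Since both [−10,0] and [0,+10] contain the announcement day, it helps to split out the pure pre-event and post-event parts. The sketch below does this from the AAR column of the abnormal-return table, using the fact that the CAAR over a window equals the sum of the daily AARs in it. The values are retyped from the rounded table, so results can differ from the reported CAARs in the second decimal; the two extra windows are an addition for illustration, not part of the original analysis.

```r
# AAR column (%) retyped from the abnormal-return table, days -10..+10
day <- -10:10
aar <- c(0.26, -0.75, -0.42, 0.48, -0.02, 0.24, -0.25, 0.55, -0.92, -0.62,
         -6.98,
         -1.11, 0.82, -0.14, 0.71, 1.15, -0.07, -0.16, -0.61, -0.85, -0.68)

# CAAR over [from, to] is the sum of the daily AARs in that window
caar <- function(from, to) sum(aar[day >= from & day <= to])

windows <- c(`[-1,+1]`  = caar(-1, 1),
             `[-10,0]`  = caar(-10, 0),
             `[0,+10]`  = caar(0, 10),
             `[-10,-1]` = caar(-10, -1),   # pre-event only
             `[+1,+10]` = caar(1, 10))     # post-event only
round(windows, 2)
```

With these rounded inputs the pre-event window sums to about −1.45% and the post-event window to about −0.94%, so most of both [−10,0] and [0,+10] comes from the single day t = 0.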

2.7.2 CAR Bar Chart

car_long <- car_df %>%
  select(Company, `[-1,+1]` = CAR_m1p1,
                  `[-10,0]` = CAR_m10p0,
                  `[0,+10]` = CAR_0p10) %>%
  pivot_longer(-Company, names_to = "Window", values_to = "CAR") %>%
  mutate(Company = factor(Company, levels = companies))

ggplot(car_long, aes(x = Company, y = CAR * 100, fill = Window)) +
  geom_col(position = "dodge", colour = "white", width = 0.7) +
  geom_hline(yintercept = 0, colour = "black") +
  scale_fill_brewer(palette = "Set2") +
  scale_x_discrete(guide = guide_axis(angle = 35)) +
  labs(title = "CAR (%) per Company across Event Windows",
       x = "", y = "CAR (%)", fill = "Window") +
  theme_minimal(base_size = 12)

What it shows. Grouped bar chart with companies on the x-axis, CAR (%) on the y-axis, and three coloured bars per company for the three windows.

How to interpret it:

  • Bar heights below zero: The dominant pattern — all or most firms show negative CARs across all windows for a profit-warning study.
  • [−10,0] bar vs. [0,+10] bar: Both bars contain the t = 0 reaction, so only their difference is informative. If [−10,0] is more negative than [0,+10], the pre-event run-up was more negative than the post-event drift (leakage-dominated). If [0,+10] is more negative, the market was still adjusting after the news (underreaction-dominated).
  • [−1,+1] bar: Should be the most consistently negative bar — this is the “clean” announcement effect. A company with a small [−1,+1] bar but large [−10,0] bar may have had its warning anticipated by the market weeks earlier.
  • Outlier firms: One or two firms with positive or near-zero CARs stand out visually. These are firms where the warning was already priced in or where the warning was offset by other good news.

2.8 Step 5 – CAAR and Statistical Tests

\[CAAR[t_1, t_2] = \frac{1}{N} \sum_{i=1}^{N} CAR_i[t_1, t_2]\]

\[t = \frac{CAAR}{\hat{\sigma}(CAR) / \sqrt{N}} \sim t_{N-1}\]

windows <- list(
  `[-1,+1]`  = car_df$CAR_m1p1,
  `[-10,0]`  = car_df$CAR_m10p0,
  `[0,+10]`  = car_df$CAR_0p10
)

caar_tests <- lapply(names(windows), function(w) {
  x   <- windows[[w]]
  tt  <- t.test(x, mu = 0)
  data.frame(
    Window      = w,
    CAAR_pct    = round(mean(x) * 100, 3),
    SD_pct      = round(sd(x)   * 100, 3),
    SE_pct      = round((sd(x)/sqrt(N)) * 100, 4),
    t_stat      = round(unname(tt$statistic), 3),  # unname() avoids stray row names like "t...1"
    df          = unname(tt$parameter),
    p_value     = round(tt$p.value, 5),
    CI_low      = round(tt$conf.int[1] * 100, 3),
    CI_high     = round(tt$conf.int[2] * 100, 3),
    Significant = ifelse(tt$p.value < 0.05, "Yes ***", "No")
  )
}) %>% bind_rows()

kable(caar_tests,
      col.names = c("Window","CAAR (%)","SD (%)","SE (%)",
                    "t-stat","df","p-value","95% CI low","95% CI high","Sig. 5%"),
      caption = "CAAR t-test results (H₀: CAAR = 0)") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  row_spec(which(caar_tests$Significant == "Yes ***"),
           bold = TRUE, background = "#d4edda")
CAAR t-test results (H₀: CAAR = 0)
Window CAAR (%) SD (%) SE (%) t-stat df p-value 95% CI low 95% CI high Sig. 5%
[-1,+1] -8.709 4.542 1.4362 -6.064 9 0.00019 -11.958 -5.460 Yes ***
[-10,0] -8.432 5.135 1.6237 -5.193 9 0.00057 -12.105 -4.759 Yes ***
[0,+10] -7.930 2.861 0.9047 -8.766 9 0.00001 -9.977 -5.884 Yes ***

What it shows. For each of the three windows, the table reports: CAAR (%), SD across firms, standard error, t-statistic, degrees of freedom (df = N − 1 = 9), two-tailed p-value, 95% confidence interval, and significance.

How to read each column:

Column Interpretation
CAAR (%) The average CAR across all 10 firms. This is the “headline number” — for a profit warning study, expect −2% to −10% for [−1,+1].
SD (%) Cross-sectional standard deviation of individual CARs. A large SD means some firms reacted strongly and others barely at all.
SE (%) SE = SD / √N. With N = 10, SE = SD / 3.16. Small N inflates the SE, making it harder to reject H₀.
t-statistic CAAR / SE. Under H₀: CAAR = 0, this follows a t-distribution with df = 9. Critical value ≈ ±2.262 at 5% level (two-tailed).
p-value Probability, under H₀, of observing a t-statistic at least this large in absolute value (two-tailed). p < 0.05 = statistically significant at the 5% level.
95% CI Confidence interval for the true CAAR. If the interval excludes zero, the result is significant.
Significant? “Yes ***” = p < 0.05; “No” = p ≥ 0.05.
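
The columns above can be reproduced by hand for the [−1,+1] window. This sketch retypes the ten CAR[−1,+1] values from the CAR table (rounded to two decimals, so the results may differ from the reported ones in the last digit) and compares the manual calculation with t.test().

```r
# CAR[-1,+1] in %, retyped from the CAR table in the order the firms are listed
car <- c(-9.15, -6.63, -17.85, -4.77, -4.19,
         -11.70, -11.59, -3.27, -6.26, -11.67)

n      <- length(car)
caar   <- mean(car)                          # cross-sectional average
se     <- sd(car) / sqrt(n)                  # standard error of the mean
t_stat <- caar / se                          # test statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)   # two-tailed p-value

# Same test via t.test() for comparison
tt <- t.test(car, mu = 0)

round(c(CAAR = caar, SE = se, t = t_stat, p = p_val), 4)
```

The manual t-statistic and the one returned by t.test() coincide, and both are close to the −6.064 reported in the results table.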

Expected results for a profit-warning study:

  • [−1,+1]: Most likely to be significant (p < 0.01). This is the cleanest window where the event effect concentrates.
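  • [−10,0]: This window includes t = 0, so a significant negative CAAR points to information leakage only if a sizeable part of it accrues before the announcement day. Here CAAR[−10,0] is −8.43%, of which the t = 0 AAR alone accounts for −6.98%.
  • [0,+10]: Also includes t = 0, so it will usually be significant whenever the announcement-day reaction is. Whether the market keeps adjusting afterwards has to be judged from t = +1 onwards.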

Interpretation of significant result: Rejecting H₀: CAAR = 0 means the 10 profit warnings generated a systematic, non-random market reaction. The market consistently and significantly re-priced these stocks downward. This is consistent with: (1) markets being informationally efficient in the sense that bad news is immediately incorporated; (2) profit warnings being genuinely surprising to the market (i.e., not already priced in via analyst consensus).

Caveat with N = 10: With only 10 firms, the t-test has limited statistical power. A t-statistic of 2.0 gives a two-tailed p-value of ~0.077 with df = 9 (not significant at 5%), whereas with N = 30 (df = 29) the same t-statistic would give p ≈ 0.055. The larger gain from a bigger sample comes through the standard error, which shrinks with √N. This is why the 95% CI and economic magnitude (size of CAAR) should be reported alongside the p-value.
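
The p-values and critical values quoted in the caveat can be checked directly with base R's t-distribution functions. The sketch assumes nothing beyond a two-tailed test and the degrees of freedom named in the text.

```r
t_obs <- 2.0

# Two-tailed p-values for the same t-statistic at two sample sizes
p_df9  <- 2 * pt(-abs(t_obs), df = 9)    # N = 10
p_df29 <- 2 * pt(-abs(t_obs), df = 29)   # N = 30

# 5% two-tailed critical values
crit_df9  <- qt(0.975, df = 9)
crit_df29 <- qt(0.975, df = 29)

round(c(p_df9 = p_df9, p_df29 = p_df29,
        crit_df9 = crit_df9, crit_df29 = crit_df29), 3)
```

The critical value falls only from about 2.26 to about 2.05 when N triples, which shows that degrees of freedom alone contribute little; the √N reduction in the standard error is what raises power.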


2.9 Step 6 – CAAR Time Series Plot

# Confidence band using cross-sectional SD of AR at each day
aar_sd <- ar_df %>%
  select(Day, all_of(companies)) %>%
  rowwise() %>%
  mutate(SD = sd(c_across(all_of(companies)), na.rm = TRUE)) %>%
  ungroup() %>%
  select(Day, SD)

caar_plot_df <- caar_ts %>%
  left_join(aar_sd, by = "Day") %>%
  arrange(Day) %>%
  mutate(
    CAAR_SD_cum = sqrt(cumsum(SD^2)),   # cumulative std dev
    CI_low  = CAAR - 1.96 * CAAR_SD_cum / sqrt(N),
    CI_high = CAAR + 1.96 * CAAR_SD_cum / sqrt(N)
  )

ggplot(caar_plot_df, aes(x = Day)) +
  geom_ribbon(aes(ymin = CI_low * 100, ymax = CI_high * 100),
              fill = "#2c7bb6", alpha = 0.15) +
  geom_line(aes(y = CAAR * 100), colour = "#2c7bb6", linewidth = 1.3) +
  geom_point(aes(y = CAAR * 100), colour = "#2c7bb6", size = 2.5) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  geom_vline(xintercept = 0, colour = "#d62728", linewidth = 0.9, linetype = "dotted") +
  annotate("rect", xmin = -1, xmax = 1, ymin = -Inf, ymax = Inf,
           fill = "yellow", alpha = 0.10) +
  annotate("text", x = 0.3, y = max(caar_plot_df$CAAR*100) * 0.85,
           label = "t = 0\n(event)", colour = "#d62728", size = 3.2, hjust = 0) +
  scale_x_continuous(breaks = -10:10) +
  labs(title    = "Cumulative Average Abnormal Return (CAAR) — Profit Warnings",
       subtitle  = "10 Finnish companies | OMXH25 market model | shaded = 95% CI",
       x = "Event Day", y = "CAAR (%)") +
  theme_minimal(base_size = 13) +
  theme(panel.grid.minor = element_blank())

What it shows. A line chart of the cumulative CAAR from t = −10 to t = +10, with a shaded 95% confidence band (±1.96 × cumulative σ / √N). The red dotted vertical line marks t = 0 (announcement day). The yellow rectangle highlights the [−1,+1] window.

How to interpret the shape:

Pattern Meaning
Flat from −10 to −3, then declining into t = 0 Some pre-event drift — limited leakage or anticipation beginning a few days before the announcement.
Sharp downward jump at t = 0 The market reacted strongly on the announcement day. This is the dominant expected pattern. The steeper the drop at t = 0, the more unanticipated the warning was.
Continued decline from t = 0 to t = +3 Post-event price discovery — the market continues to adjust as sell-side analysts revise earnings models.
Stabilisation after t = +3 CAAR flattens — the market has fully incorporated the news.
Confidence band widening over time As more days accumulate, cumulative uncertainty (√Σσ²) grows — the CI is correctly wider for multi-day windows than for single-day windows.

Inference from the band: If the CAAR line drops well below zero and the upper bound of the confidence band is below zero, the cumulative effect is statistically significant at that point in the event window. If the band straddles zero, the result is not significant at that day.

Ideal pattern for a clean profit-warning event: The CAAR should be approximately flat (near zero) from t = −10 to t = −2, then fall sharply between t = −1 and t = +1, then flatten out again. This “step-function” shape is the hallmark of market efficiency: the market prices in the news quickly and without anticipation. A gradual slope starting at t = −5 instead suggests leakage or anticipation.
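
The shape of the CAAR line can be reconstructed from the AAR column of the abnormal-return table with a running sum. The sketch below retypes those rounded values; it also compares the 1.96 multiplier used for the plotted band with the t critical value for df = 9, which is an alternative suggested here and not what the plot code uses.

```r
# AAR column (%) retyped from the abnormal-return table, days -10..+10
day <- -10:10
aar <- c(0.26, -0.75, -0.42, 0.48, -0.02, 0.24, -0.25, 0.55, -0.92, -0.62,
         -6.98,
         -1.11, 0.82, -0.14, 0.71, 1.15, -0.07, -0.16, -0.61, -0.85, -0.68)

caar_path        <- cumsum(aar)          # cumulative CAAR from t = -10 onwards
biggest_drop_day <- day[which.min(aar)]  # day with the most negative AAR

# Band multipliers: normal approximation vs. t distribution with N - 1 = 9 df
z_mult <- 1.96
t_mult <- qt(0.975, df = 9)

print(data.frame(Day = day, AAR = aar, CAAR = round(caar_path, 2)))
round(c(z_mult = z_mult, t_mult = t_mult, ratio = t_mult / z_mult), 3)
```

The path is close to flat until t = −1, steps down at t = 0, and ends near −9.4% at t = +10. A band built with the t multiplier would be roughly 15% wider than the plotted one.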


2.10 Step 7 – AAR per Day (Bar Chart)

ggplot(ar_df, aes(x = Day, y = AAR * 100,
                  fill = ifelse(AAR < 0, "Negative", "Positive"))) +
  geom_col(colour = "white", width = 0.75) +
  geom_hline(yintercept = 0, colour = "black") +
  geom_vline(xintercept = 0, colour = "red", linewidth = 0.8, linetype = "dotted") +
  scale_fill_manual(values = c("Negative" = "#d62728", "Positive" = "#2ca02c"),
                    guide = "none") +
  scale_x_continuous(breaks = -10:10) +
  labs(title = "Average Abnormal Return (AAR) per Event Day",
       x = "Event Day", y = "AAR (%)") +
  theme_minimal(base_size = 12)

What it shows. A bar chart where each bar represents the Average Abnormal Return (AAR) on that event day, colour-coded green for positive and red for negative. The red dotted vertical line marks t = 0.

How to interpret it:

  • The tallest red bar should be at t = 0: This is the day with the largest single-day average market reaction. In profit-warning studies, the announcement-day AAR is typically −2% to −5%.
  • Red bars at t = −1 or t = −2: Possible evidence of pre-announcement leakage, for example when part of the information reaches the market through informal channels before the official release. Note that an after-close announcement shifts the reaction forward to t = +1, not back to t = −1. In this sample the AARs at t = −2 and t = −1 are −0.92% and −0.62%, small next to −6.98% at t = 0.
  • Green bars at t = +2 to t = +5: A modest recovery after the initial shock is common — short sellers take profits, value investors step in, and the stock partially bounces.
  • Short, near-zero bars away from t = 0: Confirms that abnormal returns are concentrated around the event, not randomly spread throughout the window. This is the expected pattern under market efficiency.
  • Large variance in bar heights: With N = 10, each AAR bar has high sampling noise. The bar chart is descriptive; statistical significance comes from the CAAR t-tests in Step 5.

3 Part 3 – Cross-Sectional OLS

Model: \(CAR_i[0,+1] = b_0 + b_1 \cdot \ln(MktCap)_i + u_i\)

Hypothesis: Larger firms (higher Ln(MktCap)) may suffer smaller abnormal losses because they are more analyst-followed, reducing information asymmetry. We expect \(b_1 > 0\) (less negative CAR for larger firms).

3.1 Cross-Sectional Data Table

cs_df %>%
  mutate(CAR_0p1_pct = round(CAR_0p1 * 100, 3)) %>%
  select(Company, CAR_0p1_pct, LnMktCap) %>%
  kable(col.names = c("Company", "CAR [0,+1] (%)", "Ln(MktCap)"),
        caption = "Cross-sectional data") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
Cross-sectional data
Company CAR [0,+1] (%) Ln(MktCap)
Nokia -8.787 10.200000
Outokumpu -6.156 8.900000
Finnair -17.388 8.400000
Metso -5.956 9.700000
Stora Enso -2.894 10.800000
Wärtsilä -8.255 10.100000
Fortum -10.635 11.300000
UPM -4.551 10.900000
Neste -6.330 10.600000
Konecranes -9.916 9.500000
Nokia -8.787 -0.077108
Outokumpu -6.156 -0.107642
Finnair -17.388 -0.119385
Metso -5.956 -0.088852
Stora Enso -2.894 -0.063016
Fortum -10.635 -0.051272
UPM -4.551 -0.060667
Neste -6.330 -0.067713
Konecranes -9.916 -0.093549

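What it shows. Each company’s CAR[0,+1] (%) and Ln(Market Cap) as used in the regression. The table should have 10 rows, one per firm, but the printed output has 19: the ten firm rows are followed by nine repeated rows (every firm except Wärtsilä) whose Ln(MktCap) entries lie between about −0.05 and −0.12. Those values cannot be log market caps and look like CARs in decimal form, so cs_df appears to contain stray rows that should be removed before the regression results are interpreted.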

How to read it:

  • CAR[0,+1] (%): The dependent variable. For a profit-warning study, expect mostly negative values (−1% to −10%). The spread across firms is what the OLS regression tries to explain.
  • Ln(MktCap): The explanatory variable. Log-transforming market cap compresses the scale and makes the relationship with CAR approximately linear. Higher values = larger firms.

What to look for before running the regression:

  • Does any firm stand out as an extreme outlier in either column? With N = 10, one influential observation can drive the entire regression result.
  • Is there a visible positive or negative association between the two columns when you read down the table? A positive pattern (larger firms have less negative CARs) would confirm the hypothesis.
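
A quick row count is a useful check before fitting. The sketch below rebuilds the 19 printed rows from the table above and drops the stray ones. The filtering rule (keep the first row per firm with a positive log market cap) is an assumption about what the intended data look like, and cs_raw and cs_clean are names made up for this example; the original cs_df is not modified.

```r
# Values retyped from the printed cross-sectional table
firms <- c("Nokia", "Outokumpu", "Finnair", "Metso", "Stora Enso",
           "W\u00e4rtsil\u00e4", "Fortum", "UPM", "Neste", "Konecranes")
car   <- c(-8.787, -6.156, -17.388, -5.956, -2.894,
           -8.255, -10.635, -4.551, -6.330, -9.916) / 100
size  <- c(10.2, 8.9, 8.4, 9.7, 10.8, 10.1, 11.3, 10.9, 10.6, 9.5)
stray <- c(-0.077108, -0.107642, -0.119385, -0.088852, -0.063016,
           -0.051272, -0.060667, -0.067713, -0.093549)

# Rebuild the table as printed (rows 11-19 omit the sixth firm)
cs_raw <- data.frame(Company  = c(firms, firms[-6]),
                     CAR_0p1  = c(car, car[-6]),
                     LnMktCap = c(size, stray))

# Assumed rule: one row per firm, with a positive (plausible) log market cap
cs_clean <- cs_raw[cs_raw$LnMktCap > 0 & !duplicated(cs_raw$Company), ]

fit_raw   <- lm(CAR_0p1 ~ LnMktCap, data = cs_raw)
fit_clean <- lm(CAR_0p1 ~ LnMktCap, data = cs_clean)

c(rows_raw = nrow(cs_raw), rows_clean = nrow(cs_clean),
  df_raw = df.residual(fit_raw), df_clean = df.residual(fit_clean))
```

On the ten firm rows the residual degrees of freedom are 8, matching the df = N − 2 = 8 assumed in the interpretation of the regression output.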

3.2 OLS Regression

cs_model <- lm(CAR_0p1 ~ LnMktCap, data = cs_df)
summary(cs_model)
## 
## Call:
## lm(formula = CAR_0p1 ~ LnMktCap, data = cs_df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.09422 -0.01819  0.01558  0.02212  0.05372 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0826382  0.0137129  -6.026 1.36e-05 ***
## LnMktCap     0.0003547  0.0018755   0.189    0.852    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04164 on 17 degrees of freedom
## Multiple R-squared:  0.002099,   Adjusted R-squared:  -0.0566 
## F-statistic: 0.03576 on 1 and 17 DF,  p-value: 0.8523

What it shows. Standard OLS output: intercept and slope coefficients, standard errors, t-statistics, p-values, R², adjusted R², and F-statistic.

Element Interpretation
Intercept (b₀) The predicted CAR[0,+1] when Ln(MktCap) = 0, i.e., when market cap equals one unit of whatever currency scale the data use. This is far outside the sample range (Ln(MktCap) runs from 8.4 to 11.3) and economically meaningless, but necessary for the model. Do not interpret its magnitude literally.
Slope (b₁ on Ln(MktCap)) The key coefficient. A positive b₁ means larger firms have less negative (or more positive) CARs — consistent with the hypothesis that large firms face lower information asymmetry: analyst coverage is dense, profit warnings are partially anticipated, and the surprise component is smaller. A negative b₁ would mean larger firms suffer worse reactions — possibly because they face more institutional selling.
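t-statistic on b₁ With the intended 10 firms, df = N − 2 = 8 and the critical value is ≈ ±2.306 at 5% (two-tailed). The output above reports 17 degrees of freedom because 19 rows entered the fit; the corresponding critical value is ≈ ±2.110.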
p-value With N = 10, power is low. A p-value of 0.10–0.15 still carries economic relevance — the sign and magnitude of b₁ matter as much as the p-value in small samples.
R² The fraction of cross-sectional variation in CAR[0,+1] explained by firm size alone. With N = 10 and a single predictor, R² of 0.10–0.30 is considered noteworthy. Low R² is expected: many factors besides size (industry, leverage, analyst coverage, severity of warning) drive cross-sectional variation in CARs.
F-statistic With one predictor, F = t²(b₁). The F-test and the t-test on b₁ are equivalent here.

3.3 HC3 Robust Standard Errors

cat("=== HC3 Heteroskedasticity-Robust Standard Errors ===\n")
## === HC3 Heteroskedasticity-Robust Standard Errors ===
coeftest(cs_model, vcov = vcovHC(cs_model, type = "HC3"))
## 
## t test of coefficients:
## 
##                Estimate  Std. Error t value  Pr(>|t|)    
## (Intercept) -0.08263819  0.01511624 -5.4668 4.176e-05 ***
## LnMktCap     0.00035466  0.00187238  0.1894     0.852    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What it shows. The same intercept and slope, but with heteroskedasticity-consistent (HC3) standard errors. With N = 10 observations, heteroskedasticity can distort OLS inference substantially.

Interpretation. Compare HC3 SEs to OLS SEs:

  • If HC3 SEs are larger, some observations have disproportionate influence (heteroskedastic residuals), and OLS understated uncertainty. The HC3 t-stat and p-value are more reliable.
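  • If they are similar, heteroskedasticity is not a material problem and OLS inference is valid. That is the case here: the slope SE is 0.00187 under HC3 versus 0.00188 under OLS, and the intercept SE rises only from 0.0137 to 0.0151.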

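With N = 10, the HC3 correction is itself estimated imprecisely, so robust SEs are not automatically more trustworthy than the OLS ones. A Breusch-Pagan test can indicate whether heteroskedasticity is present, although it also has low power in a sample this small. For small samples, it is best practice to report both sets of SEs.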
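
A minimal sketch of the Breusch-Pagan idea in base R, using Koenker's studentized form: regress the squared OLS residuals on the regressor and compare n × R² with a chi-squared distribution. The data are the ten firm rows retyped from the cross-sectional table, which is an assumption about the intended sample. lmtest::bptest() (lmtest is already loaded in this report) should give the same statistic with its default settings.

```r
# Ten firm rows retyped from the cross-sectional table (assumed intended sample)
car  <- c(-8.787, -6.156, -17.388, -5.956, -2.894,
          -8.255, -10.635, -4.551, -6.330, -9.916) / 100
size <- c(10.2, 8.9, 8.4, 9.7, 10.8, 10.1, 11.3, 10.9, 10.6, 9.5)

fit <- lm(car ~ size)

# Auxiliary regression of squared residuals on the regressor
aux     <- lm(resid(fit)^2 ~ size)
bp_stat <- length(car) * summary(aux)$r.squared          # n * R^2
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)   # 1 df: one regressor

round(c(BP = bp_stat, p = bp_p), 4)
```

A small p-value would indicate that the residual variance changes with firm size; with only ten observations a large p-value is weak evidence of homoskedasticity at best.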

3.4 OLS Results Tidy Table

tidy_cs <- tidy(cs_model, conf.int = TRUE)
glance_cs <- glance(cs_model)

kable(tidy_cs %>%
        mutate(across(where(is.numeric), \(x) round(x, 5))),
      col.names = c("Term","Estimate","Std. Error","t-stat","p-value","CI low","CI high"),
      caption = "OLS results: CAR[0,+1] ~ Ln(MktCap)") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
OLS results: CAR[0,+1] ~ Ln(MktCap)
Term Estimate Std. Error t-stat p-value CI low CI high
(Intercept) -0.08264 0.01371 -6.02629 0.00001 -0.11157 -0.05371
LnMktCap 0.00035 0.00188 0.18910 0.85226 -0.00360 0.00431
cat(sprintf("\nR² = %.4f  |  Adj. R² = %.4f  |  F = %.3f  |  p(F) = %.4f  |  N = %d\n",
            glance_cs$r.squared, glance_cs$adj.r.squared,
            glance_cs$statistic,  glance_cs$p.value,
            glance_cs$nobs))
## 
## R² = 0.0021  |  Adj. R² = -0.0566  |  F = 0.036  |  p(F) = 0.8523  |  N = 19

What it shows. A cleaner version of the regression output with confidence intervals: columns for term name, estimate, standard error, t-statistic, p-value, 95% CI lower and upper bounds.

How to read the confidence interval for b₁:

  • If the 95% CI for b₁ lies entirely above zero (e.g., [0.003, 0.025]), the slope is significant at 5% and the positive size effect is supported.
  • If it straddles zero (e.g., [−0.002, 0.018]), the slope is not significant at 5%. Note how far the interval extends in each direction: a large positive upper bound means a sizeable positive effect cannot be ruled out. In this run the interval is [−0.00360, 0.00431], so it straddles zero and is roughly symmetric around it.
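
The interval can be rebuilt by hand from the table as a sanity check. The sketch below uses the rounded estimate and standard error printed above, and takes N = 19 from the model fit line. The result therefore matches the reported bounds only up to rounding.

```r
# Values copied (rounded) from the tidy table above
est <- 0.00035   # slope estimate for LnMktCap
se  <- 0.00188   # its OLS standard error
n   <- 19        # observations reported by glance()
k   <- 1         # number of predictors

# Two-sided 95% critical value of the t-distribution with n - k - 1 = 17 df
t_crit <- qt(0.975, df = n - k - 1)

ci <- c(low = est - t_crit * se, high = est + t_crit * se)
excludes_zero <- ci[["low"]] > 0 || ci[["high"]] < 0

print(round(ci, 5))
print(excludes_zero)
```

Because the lower bound is negative and the upper bound positive, `excludes_zero` is FALSE, which is the same conclusion as the p-value of 0.852.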

Model fit line:

R² = X.XXXX  |  Adj. R² = X.XXXX  |  F = X.XXX  |  p(F) = X.XXXX  |  N = XX
  • Adj. R² adjusts for model complexity. With one predictor and N = 19, Adj. R² = 1 − (1 − R²) × (18/17), which turns negative whenever R² is below 1/18 ≈ 0.056. Here R² = 0.0021 gives Adj. R² = −0.0566. A negative Adj. R² means the size variable adds no explanatory power beyond the intercept.
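
The fit line can be checked with a few lines of arithmetic. This sketch starts from the rounded R² and the N printed in the output above and recomputes Adj. R², the F-statistic and its p-value. Small discrepancies against the printed values come from rounding R² to four decimals.

```r
r2 <- 0.0021   # R-squared from the model fit line
n  <- 19       # number of observations
k  <- 1        # number of predictors (LnMktCap)

# Adjusted R-squared: penalises R-squared for the degrees of freedom used
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic and its p-value, expressed through R-squared
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))
p_f    <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

cat(sprintf("Adj. R2 = %.4f | F = %.3f | p(F) = %.4f\n", adj_r2, f_stat, p_f))
```

With a single predictor the F-statistic is the square of the slope's t-statistic, which is why p(F) and the slope's p-value coincide.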

3.5 Scatter Plot with Regression Line

cs_aug <- augment(cs_model, data = cs_df)

ggplot(cs_aug, aes(x = LnMktCap, y = CAR_0p1 * 100)) +
  geom_smooth(method = "lm", se = TRUE, colour = "#2c7bb6",
              fill = "#abd9e9", alpha = 0.3) +
  geom_point(size = 3.5, colour = "#d62728") +
  geom_text(aes(label = Company),
            hjust = -0.1, vjust = 0.4, size = 3, colour = "grey30") +
  geom_segment(aes(xend = LnMktCap, yend = .fitted * 100),
               colour = "grey60", linetype = "dotted", linewidth = 0.5) +
  scale_x_continuous(limits = c(min(cs_df$LnMktCap) - 0.4,
                                 max(cs_df$LnMktCap) + 1.2)) +
  labs(title    = "Cross-Sectional OLS: CAR[0,+1] on Firm Size",
       subtitle  = "Dotted lines = residuals | Shaded area = 95% confidence band",
       x = "Ln(Market Cap)", y = "CAR[0,+1] (%)") +
  theme_minimal(base_size = 13)

What it shows. X-axis = Ln(MktCap); Y-axis = CAR[0,+1] (%); red dots = individual companies; blue line = OLS regression line; blue shading = 95% confidence band; dotted grey lines = residuals (distance from each point to the fitted line); company names labelled.

How to interpret it:

Visual element | Interpretation
Slope of regression line | A positive slope (line tilts upward from left to right) is consistent with the hypothesis: larger firms experience less negative CARs. A flat or negative slope is not.
Width of the confidence band | Wide band = high uncertainty, reflecting the small N. The band should be narrowest at the mean of Ln(MktCap) and widest at the extremes.
Residuals (dotted lines) | Firm-specific deviations from the model. Long residuals mark poorly fitted observations, e.g., a small firm that reacted far less negatively than its size would predict (perhaps because the warning was already priced in).
Outlier identification | A company far from the regression line has a large residual; it is influential only if it also sits at an extreme value of Ln(MktCap) (high leverage). With N = 19, removing one such point can still flip the sign of b₁. Check Cook’s distance in the diagnostic plots.
Scatter tightness | Points close to the line → high R². Points scattered widely → low R². The latter is expected here given the many omitted variables.

Ideal pattern for the hypothesis: points slope upward (positive b₁: as size increases, CARs become less negative) and the regression line has a clear upward tilt. Even if R² is low, a consistent directional pattern across most of the 19 points would support the economic story. With the estimated slope close to zero in this run, expect a nearly flat line instead.

3.6 Residual Diagnostics

par(mfrow = c(2, 2))
plot(cs_model, which = 1:4, cex.main = 0.9)

par(mfrow = c(1, 1))

What it shows. Four standard OLS diagnostic plots from plot(cs_model, which = 1:4):


3.6.1 Plot 1 — Residuals vs. Fitted

What it shows. Residuals on the y-axis, fitted values (Ŷ) on the x-axis. The red smoother line should be flat at zero.

Interpretation:

  • Flat line near zero: No systematic pattern. The linear specification of the conditional mean looks adequate.
  • U-shaped or inverted-U curve: The relationship between CAR and Ln(MktCap) is non-linear; consider adding a squared term.
  • Fan shape (residuals spread wider for higher fitted values): Heteroskedasticity. Observations with higher fitted values have more variable residuals, which motivates using HC3 robust SEs.
  • Individual labelled points: R labels the 3 most extreme residuals by row number. These are the worst-fitted observations; whether they are also influential depends on their leverage (see Cook’s distance).
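
If the smoother does look curved, the squared-term suggestion can be tested formally. This is a sketch on simulated data, since the report's `cs_df` is not available here; `toy` is a hypothetical stand-in. It compares the linear model with a quadratic one using a nested-model F-test.

```r
# Simulated stand-in for the report's data (hypothetical values, 19 rows)
set.seed(2)
toy <- data.frame(LnMktCap = runif(19, 5, 10))
toy$CAR_0p1 <- -0.08 + 0.0004 * toy$LnMktCap + rnorm(19, sd = 0.03)

linear_fit    <- lm(CAR_0p1 ~ LnMktCap, data = toy)
quadratic_fit <- lm(CAR_0p1 ~ LnMktCap + I(LnMktCap^2), data = toy)

# Nested-model F-test: H0 = the squared term adds nothing
cmp <- anova(linear_fit, quadratic_fit)
print(cmp)
```

A small p-value in the second row would favour the quadratic specification. With 19 observations the test has little power, so a large p-value is weak evidence of linearity.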

3.6.2 Plot 2 — Normal Q-Q Plot

What it shows. Standardised residuals plotted against theoretical quantiles of the normal distribution. Points should fall along the 45° dashed line.

Interpretation:

  • Points on the line: Residuals are consistent with a normal distribution, so the exact small-sample OLS inference (t-tests, F-test, confidence intervals) can be taken at face value.
  • Heavy tails (points curve away from the line at both ends): Residuals have fatter tails than a normal distribution. The t-distribution with df = 17 corrects for estimating the error variance from a small sample, but it still assumes normal errors, and with N = 19 the asymptotic argument that rescues the t-test in large samples offers little protection. Treat the p-values as approximate.
  • One or two outlier points in the upper-right or lower-left: These are firms with extreme CAR values. Check whether removing them changes b₁ substantially (sensitivity analysis).

3.6.3 Plot 3 — Scale-Location (√|Residuals| vs. Fitted)

What it shows. The square root of standardised absolute residuals against fitted values. A visual check for heteroskedasticity, not a formal test.

Interpretation:

  • Flat red line: Residual variance is constant across fitted values (homoskedastic). OLS SEs are valid.
  • Upward slope: Larger fitted values have larger residuals — the model is less precise for large firms.
  • Downward slope: Smaller fitted values have larger residuals — more uncertainty for small-firm CARs.

With N = 19, any pattern in this plot should be interpreted cautiously, because it could be sampling noise rather than true heteroskedasticity.


3.6.4 Plot 4 — Cook’s Distance

What it shows. Cook’s distance for each observation — measures how much the regression coefficients change if that observation is removed.

Interpretation:

  • All bars below 0.5: No single observation dominates the regression — results are robust.
  • One bar above 1.0: That observation is highly influential. If removing it changes the sign or significance of b₁, the result is not robust and should be noted as a caveat.
  • Labelled points: R identifies the 3 most influential observations. Investigate these firms: were their warnings particularly unusual, or did they have confounding announcements on the same day?

With only 19 observations, Cook’s distance is critical. One extreme firm can easily pull the regression line, so always check this plot before drawing conclusions.
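
The check described above can also be done numerically. The sketch below uses simulated data (`toy` and `toy_model` are hypothetical stand-ins for `cs_df` and `cs_model`). It computes Cook's distance, flags observations above the 4/N rule of thumb, and refits the model leaving out one observation at a time. The 4/N cut-off is an assumption added here, as an alternative to the 0.5 and 1.0 thresholds used in the text.

```r
# Simulated stand-in for the report's data (hypothetical values, 19 rows)
set.seed(3)
toy <- data.frame(LnMktCap = runif(19, 5, 10))
toy$CAR_0p1 <- -0.08 + 0.0004 * toy$LnMktCap + rnorm(19, sd = 0.03)
toy_model <- lm(CAR_0p1 ~ LnMktCap, data = toy)

# Cook's distance per observation, flagged against the 4/N rule of thumb
cd <- cooks.distance(toy_model)
flagged <- which(cd > 4 / nrow(toy))

# Leave-one-out slopes: refit the model dropping each row in turn
loo_slope <- sapply(seq_len(nrow(toy)), function(i) {
  coef(lm(CAR_0p1 ~ LnMktCap, data = toy[-i, ]))[["LnMktCap"]]
})

# How many single deletions flip the sign of the slope?
full_slope <- coef(toy_model)[["LnMktCap"]]
sign_flips <- sum(sign(loo_slope) != sign(full_slope))

print(flagged)
print(range(loo_slope))
print(sign_flips)
```

If `sign_flips` is greater than zero, the sign of b₁ depends on individual observations and should be reported as fragile.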


3.7 Shapiro-Wilk Test on Residuals

What it shows. Formal normality test: W-statistic and p-value. H₀: residuals are normally distributed.

Interpretation:

  • p > 0.05 (fail to reject H₀): No evidence against normality. In a small sample the test has low power, so this is not proof that the residuals are normal.
  • p < 0.05 (reject H₀): Departure from normality, so OLS t-tests and confidence intervals are only approximate. Because the test has low power at N = 19, a rejection usually reflects a pronounced departure, often one or two extreme residuals.

Caveat: The t-distribution with df = 17 accounts for estimating the error variance from few observations, not for non-normal errors. Mild non-normality is not a serious concern; severe non-normality (e.g., one firm whose residual lies several standard deviations from the rest) is. In this run the test rejects normality (W = 0.8876, p = 0.0292), so check the Q-Q plot for the observations responsible.

# Shapiro-Wilk test on residuals
sw  <- shapiro.test(residuals(cs_model))
cat(sprintf("Shapiro-Wilk test of residuals: W = %.4f,  p = %.4f\n", sw$statistic, sw$p.value))
## Shapiro-Wilk test of residuals: W = 0.8876,  p = 0.0292
# Report the decision; N is read from the fitted model rather than hard-coded
cat(ifelse(sw$p.value > 0.05,
           "→ Residuals appear normally distributed (fail to reject H₀)\n",
           sprintf("→ Departure from normality (reject H₀); interpret cautiously with N=%d\n",
                   nobs(cs_model))))
## → Departure from normality (reject H₀); interpret cautiously with N=19
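
When normality is rejected, one way to check the slope's interval without relying on normal errors is a pairs (case-resampling) bootstrap. This is a sketch on simulated data, not part of the original analysis; `toy` is a hypothetical stand-in for `cs_df`, and the 2000 replications are an arbitrary choice.

```r
# Simulated stand-in for the report's data (hypothetical values, 19 rows)
set.seed(4)
toy <- data.frame(LnMktCap = runif(19, 5, 10))
toy$CAR_0p1 <- -0.08 + 0.0004 * toy$LnMktCap + rnorm(19, sd = 0.03)

# Pairs bootstrap: resample whole rows with replacement and refit each time
boot_slope <- replicate(2000, {
  idx <- sample(nrow(toy), replace = TRUE)
  coef(lm(CAR_0p1 ~ LnMktCap, data = toy[idx, ]))[["LnMktCap"]]
})

# Percentile 95% interval for the slope
boot_ci <- quantile(boot_slope, c(0.025, 0.975))
print(round(boot_ci, 5))
```

If the bootstrap interval and the t-based interval lead to the same conclusion, the non-normality is unlikely to be driving the result. Percentile intervals can themselves be imprecise at N = 19.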

3.8 Interpretation

Regression equation (in %): CAR[0,+1] = -8.264 + 0.035 × Ln(MktCap)

  • Intercept (b₀ = -8.264%): The predicted CAR[0,+1] when Ln(MktCap) = 0 — economically meaningless as no firm has MktCap = 1, but necessary for the model.

  • Slope (b₁ = 0.035%): Each one-unit increase in Ln(MktCap) is associated with a CAR[0,+1] that is 0.035 percentage points higher. The positive sign would be consistent with larger firms experiencing less negative reactions to profit warnings, as expected under lower information asymmetry for well-followed large-caps, but the estimate is economically negligible and not significant at 5% (p = 0.8523).

  • R² = 0.2%: Firm size explains 0.2% of the cross-sectional variation in CAR[0,+1]. The remaining variation reflects company-specific factors (severity of the warning, industry, leverage, analyst coverage) not captured here.

  • Sample size caveat: With N = 19, statistical power is limited. A borderline p-value (~0.10–0.15) would still merit discussion alongside the sign and magnitude of the coefficient. The p-value of 0.85 obtained here is nowhere near borderline, so the data give no indication of a size effect.
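
To put a number on "limited power", a common rule of thumb approximates the smallest slope the regression could detect at 5% significance with 80% power as (t₀.₉₇₅ + t₀.₈₀) × SE. The sketch below applies it to the standard error reported in the tidy table. It treats that SE as fixed, so it is a rough guide rather than an exact power calculation.

```r
se_slope <- 0.00188   # OLS standard error of the LnMktCap slope (tidy table)
df_resid <- 17        # residual degrees of freedom: N - 2 with N = 19

# Approximate minimum detectable effect at alpha = 0.05 (two-sided), power = 0.80
mde     <- (qt(0.975, df_resid) + qt(0.80, df_resid)) * se_slope
mde_pct <- 100 * mde  # in percentage points of CAR per unit of Ln(MktCap)

cat(sprintf("Approximate minimum detectable slope: %.2f pp per log-unit\n", mde_pct))
```

Under these assumptions the slope would need to be somewhat above half a percentage point per log-unit to be detected reliably, many times larger than the 0.035 estimated here.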

What it shows. Auto-generated narrative using sprintf() and inline R: the regression equation in percentage terms, plus bullet-point interpretations of b₀, b₁, R², and the small-sample caveat.

How to read it:

CAR[0,+1] = b₀ + b₁ × Ln(MktCap)

  • b₀ (intercept, in %): The predicted CAR at Ln(MktCap) = 0, i.e. a market cap of one unit of the underlying variable. Not economically meaningful in isolation, since no firm in the sample is anywhere near that size.
  • b₁ (slope, in percentage points per unit of Ln(MktCap)): The economically interesting coefficient. A positive b₁ of, say, +0.8 per log-unit means that a firm with Ln(MktCap) = 10 (≈ €22 billion if market cap is measured in € million) would experience a CAR 0.8 percentage points less negative than a firm with Ln(MktCap) = 9 (≈ €8 billion). A differential of that size would be economically meaningful next to CAARs of roughly −8% in this sample.
  • Significance statement: The auto-generated text flags whether b₁ is “significant at 5%” or “not significant at 5%”. With N = 19 and a single predictor, the 5% threshold is demanding, and a borderline result (p = 0.08–0.15) would still merit discussion as economically relevant. The p-value in this run (0.85) is far from borderline.
  • R² narrative: Even a low R² (e.g., 15%) would be informative: it would mean size alone explains 15% of why some firms reacted more and others less to their profit warnings, a non-trivial finding given the many omitted variables. In this run R² is only 0.2%, so size explains essentially none of the variation.
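
The size arithmetic behind the b₁ bullet can be reproduced directly. The sketch converts log sizes back to levels and translates the reported slope into the effect of doubling firm size. The only inputs are the numbers quoted in the text; the market-cap unit is whatever the underlying variable uses.

```r
# Convert log market caps back to levels (units of the underlying variable)
ln_sizes <- c(9, 10)
mktcap   <- exp(ln_sizes)
ratio    <- mktcap[2] / mktcap[1]   # one log-unit = a factor of e in size

# Reported slope, in percentage points of CAR per log-unit
b1_report <- 0.035

# Implied CAR difference when firm size doubles (log(2) log-units)
effect_doubling <- b1_report * log(2)

print(round(mktcap))
print(round(ratio, 3))
print(round(effect_doubling, 4))
```

One log-unit therefore corresponds to a firm about 2.7 times larger, and doubling firm size changes the predicted CAR by only about 0.024 percentage points at the estimated slope.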

4 Summary of Results

# Map a p-value to a decision label with conventional significance stars
decide <- function(p) {
  ifelse(p < 0.001, "Reject H₀ ***",
  ifelse(p < 0.01,  "Reject H₀ **",
  ifelse(p < 0.05,  "Reject H₀ *", "Fail to reject H₀")))
}

summary_tbl <- data.frame(
  Test    = c("CAAR [-1,+1] = 0",
              "CAAR [-10,0] = 0",
              "CAAR [0,+10] = 0",
              "β₁ (LnMktCap) = 0 in OLS"),
  Estimate = c(paste0(caar_tests$CAAR_pct[1:3], "%"),
               paste0(round(coef(cs_model)[2]*100,3), "%")),
  t_stat  = c(caar_tests$t_stat[1:3], round(tidy_cs$statistic[2], 3)),
  p_value = c(caar_tests$p_value[1:3], round(tidy_cs$p.value[2], 4)),
  # Decisions are derived from the p-values instead of being hard-coded
  Decision = decide(c(caar_tests$p_value[1:3], tidy_cs$p.value[2]))
)

kable(summary_tbl,
      col.names = c("Hypothesis","Estimate","t-stat","p-value","Decision"),
      caption = "Summary of all hypothesis tests") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  row_spec(1:3, background = "#d4edda") %>%
  row_spec(4, background = "#fff3cd")
Summary of all hypothesis tests
Hypothesis | Estimate | t-stat | p-value | Decision
CAAR [-1,+1] = 0 | -8.709% | -6.064 | 0.00019 | Reject H₀ ***
CAAR [-10,0] = 0 | -8.432% | -5.193 | 0.00057 | Reject H₀ ***
CAAR [0,+10] = 0 | -7.93% | -8.766 | 0.00001 | Reject H₀ ***
β₁ (LnMktCap) = 0 in OLS | 0.035% | 0.189 | 0.85230 | Fail to reject H₀

What it shows. A master table collecting all four hypothesis tests from the study:

Row | What is tested
CAAR [−1,+1] = 0 | Did the profit warnings generate a significant 3-day abnormal return?
CAAR [−10,0] = 0 | Was there a significant pre-announcement price decline (leakage)?
CAAR [0,+10] = 0 | Did the market continue to react after the announcement?
β₁ (LnMktCap) = 0 | Does firm size explain cross-sectional differences in CAR[0,+1]?

Colour coding:

  • Green rows (CAAR tests): Highly significant results. Profit warnings have a strong, systematic market impact in all three windows. Rejecting H₀ for all three is the core finding of the event study: the market reacts strongly at announcement, there is pre-event leakage, and post-announcement adjustment continues. Because the [−10,0] and [0,+10] windows both include day 0, the leakage and continuation readings partly reflect the announcement-day return itself.
  • Yellow row (OLS): The size coefficient is not significant at 5% (p = 0.852), so H₀ is not rejected. With a small sample, a non-rejection does not prove the effect is absent, but the estimate itself (0.035 percentage points per log-unit) is also close to zero.
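
The p-values in the table can be recovered from the t-statistics. The sketch assumes 9 degrees of freedom for the three CAAR tests (N − 1 with N = 10 firms) and 17 for the OLS slope (N − 2 with N = 19 observations). Those degrees of freedom are inferred from the sample sizes quoted in the report rather than printed in the output.

```r
# t-statistics copied from the summary table
t_stats <- c(-6.064, -5.193, -8.766, 0.189)
# Assumed degrees of freedom: 9 for the CAAR tests, 17 for the OLS slope
dfs     <- c(9, 9, 9, 17)

# Two-sided p-values from the t-distribution
p_vals <- 2 * pt(-abs(t_stats), df = dfs)

# Conventional significance stars
stars <- cut(p_vals,
             breaks = c(0, 0.001, 0.01, 0.05, 1),
             labels = c("***", "**", "*", ""),
             include.lowest = TRUE)

print(data.frame(t = t_stats, df = dfs,
                 p = round(p_vals, 5), stars = as.character(stars)))
```

The first three tests earn three stars and the OLS slope none, matching the Decision column.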

Overall conclusion of the paper:

Profit warnings by Finnish OMXH25 companies generate large, statistically significant negative abnormal returns that are visible before, during, and after the announcement date, consistent with partial information leakage and continued post-announcement price discovery. The cross-sectional regression does not support a size effect: the slope on Ln(MktCap) is positive, as lower information asymmetry and denser analyst coverage for larger firms would predict, but it is close to zero (0.035 percentage points per log-unit) and far from significant (p = 0.85). With only 19 observations in the regression, the question of a size effect remains open.

⚠ Small-sample caveat (applies throughout). With N = 10 firms, all statistical tests have limited power. A failure to reject H₀ does not mean the effect is absent — it may simply be that the sample is too small to detect it. Conversely, a barely significant result (p ≈ 0.04) may not replicate with a larger sample. Always report effect sizes (CAAR magnitude, b₁ magnitude) alongside p-values, and interpret borderline results cautiously.


5 Session Info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS 26.3.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Helsinki
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] broom_1.0.7      tidyr_1.3.1      kableExtra_1.4.0 knitr_1.49      
##  [5] ggplot2_3.5.1    dplyr_1.2.0      sandwich_3.1-1   lmtest_0.9-40   
##  [9] zoo_1.8-12       plm_2.6-7        readxl_1.4.5    
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6       xfun_0.52          bslib_0.8.0        collapse_2.1.6    
##  [5] lattice_0.22-6     vctrs_0.7.2        tools_4.4.2        Rdpack_2.6.2      
##  [9] generics_0.1.3     parallel_4.4.2     tibble_3.2.1       fansi_1.0.6       
## [13] pkgconfig_2.0.3    Matrix_1.7-1       RColorBrewer_1.1-3 lifecycle_1.0.5   
## [17] farver_2.1.2       compiler_4.4.2     stringr_1.5.1      maxLik_1.5-2.2    
## [21] textshaping_0.4.1  munsell_0.5.1      htmltools_0.5.8.1  sass_0.4.9        
## [25] yaml_2.3.10        Formula_1.2-5      pillar_1.9.0       jquerylib_0.1.4   
## [29] MASS_7.3-61        cachem_1.1.0       nlme_3.1-166       tidyselect_1.2.1  
## [33] bdsmatrix_1.3-7    digest_0.6.37      stringi_1.8.4      purrr_1.0.2       
## [37] splines_4.4.2      labeling_0.4.3     miscTools_0.6-30   fastmap_1.2.0     
## [41] grid_4.4.2         colorspace_2.1-1   cli_3.6.5          magrittr_2.0.3    
## [45] utf8_1.2.4         withr_3.0.2        scales_1.3.0       backports_1.5.0   
## [49] rmarkdown_2.29     cellranger_1.1.0   evaluate_1.0.5     rbibutils_2.3     
## [53] viridisLite_0.4.2  mgcv_1.9-1         rlang_1.1.7        Rcpp_1.0.13-1     
## [57] glue_1.8.0         xml2_1.3.6         svglite_2.2.1      rstudioapi_0.17.1 
## [61] jsonlite_2.0.0     R6_2.6.1           systemfonts_1.2.3