Replicating Rhodes-Kropf et al. (2005)

Data step

First, collect data on Fama-French industries for each firm-year (Rhodes-Kropf et al. [2005] use the twelve-industry classification).

library(RPostgreSQL)
## Loading required package: DBI
drv <- dbDriver("PostgreSQL")
pg<- dbConnect(drv, dbname = "crsp")

# Create a CRSP-Compustat link table that includes SIC data from CRSP and this linked in turn 
# to Fama-French (12) industries. Note that Compustat only has a "header" SIC code, whereas 
# the CRSP data can change for a given firm over time.
rs <- dbGetQuery(pg, "DROP TABLE IF EXISTS sics")
rs <- dbGetQuery(pg, "
  CREATE TEMP TABLE sics AS
  SELECT gvkey, permno, namedt, nameenddt, siccd, 
    (CASE WHEN ind_num IS NULL THEN 12 ELSE ind_num::integer END) AS ff_ind
  FROM
    (SELECT DISTINCT gvkey, lpermno AS permno, siccd, 
      min(namedt) AS namedt, max(nameenddt) AS nameenddt
    FROM crsp.ccmxpf_linktable AS a
    INNER JOIN crsp.stocknames AS b
    ON a.lpermno=b.permno
    WHERE USEDFLAG='1' AND namedt >= linkdt AND nameenddt <= linkenddt OR linkenddt IS NULL
    GROUP BY gvkey, lpermno, siccd
    ORDER BY permno, namedt, nameenddt) AS a
  LEFT JOIN ff.ind_12 AS b
  ON siccd BETWEEN b.sic_low AND b.sic_high")

Rhodes-Kropf et al. (2005) describe the following variable calculations:

Using Compustat,we calculate a variety of size, performance,and leverage ratios. Market value is CRSP market equity plus Compustat book assets (item6=AT) minus deferred taxes (item74=TXDB) minus book equity (item60=CEQ). In addition, we obtain the following size-related measures: Total Plant, Property, Equipment (item 8=PPENT), Total Cash (item 1=CHE), Long-term Debt (item 9=DLTT), capital expenditures (CAPEX) (item 128=CAPX), and Net Income (item172=NI). Return on assets and equity are calculated by dividing net income in year t by assets (item6=AT) or book equity (item60=CEQ) in year t-1.

For leverage measures, we obtain the Current Ratio (items 4/5=ACT/LCT), Quick Ratio [items (4 - 3)/5=(ACT-INVT)/LCT], market leverage (1 - market equity/market value), and book leverage (1 - book equity/total book assets).

Finally, the announcement and closing dates of mergers, the method of payment (when available), and a dummy for whether the merger was withdrawn were taken from SDC and merged to the data from Compustat and CRSP.

# Based on this, pull together data from Compustat and merge with CRSP SIC data
reg.data <- dbGetQuery(pg,
  "SELECT gvkey, datadate, fyear,  b.ff_ind, b.siccd, 
    prcc_f * csho + at - txdb - ceq AS mv, prcc_f * csho AS mve,
    at - txdb AS bv,
    ppent, che, dltt, capx, ni, ceq,
    CASE WHEN lag(at) OVER w > 0 THEN ni/lag(at) OVER w END AS roa, 
    CASE WHEN lag(ceq) OVER w > 0 THEN ni/lag(ceq) OVER w END AS roe,
    CASE WHEN lct > 0 THEN act/lct END AS cr, 
    CASE WHEN lct > 0 THEN (act-invt)/lct END AS qr, 
    CASE WHEN (prcc_f * csho + at - txdb - ceq) > 0 THEN
    1 - (prcc_f * csho)/(prcc_f * csho + at - txdb - ceq) END AS lev_mv,
    CASE WHEN at > 0 THEN 1 - ceq/at END AS lev_book
  FROM comp.funda AS a
  LEFT JOIN sics AS b
  USING (gvkey) 
  WHERE indfmt='INDL' AND consol='C' AND datafmt='STD' AND prcc_f IS NOT NULL 
    AND a.datadate BETWEEN b.namedt AND b.nameenddt
  WINDOW w AS (PARTITION BY gvkey ORDER BY datadate)")

Replication and plot

The following is a rough replication of Model 3 of Table 4. The coefficients are pretty similar, except for leverage (perhaps due to a different definition; Rhodes-Kropf et al. [2005] aren't too clear on how they calculated this). Also the \( R^2 \) values I'm getting are a bit higher, which can't be a bad thing. Note that I am including the main effect of the loss indicator (\( I_{NI<0} \)), though this doesn't actually affect the results much at all.

Industry Constant \( b \) \( \textbf{LEV} \) \( \log(\left|\mathbf{NI}\right|) \) \( I_{\mathrm{NI}< 0} \) \( I_{\mathrm{NI}< 0} \cdot \log(\left|\mathbf{NI}\right|) \) \( R^2 \)
1 1.74 0.68 -1.68 0.35 -0.01 -0.10 0.93
2 1.67 0.68 -1.44 0.33 -0.06 -0.08 0.94
3 1.78 0.68 -1.62 0.30 0.04 -0.06 0.92
4 1.66 0.74 -1.25 0.23 -0.15 0.01 0.92
5 1.93 0.65 -1.74 0.35 0.04 -0.06 0.95
6 2.22 0.61 -2.10 0.36 -0.04 -0.07 0.88
7 2.47 0.62 -1.49 0.30 -0.13 0.11 0.91
8 1.59 0.81 -1.56 0.18 -0.06 -0.04 0.97
9 2.01 0.67 -1.86 0.32 -0.07 -0.08 0.91
10 2.38 0.59 -1.96 0.37 -0.08 -0.11 0.90
11 1.91 0.58 -1.19 0.41 -0.11 -0.16 0.89
12 2.12 0.64 -1.81 0.31 -0.07 -0.06 0.87

Now plot the predicted values against actual values

library(ggplot2)
qplot(log(mve), predicted, data = predicted)

plot of chunk unnamed-chunk-1