Introduction

This document shows you how to calculate cluster robust standard errors in R for the the Fixed Effect Poisson Model. but this method will work with any maximum likelihood based estimation procedure. This particular presentation is useful for those individuals transitioning from STATA to R.

Reivew of Calculating Robust Standard Errors

First, we assume our dependent variable follows a known distribution such that \[y \sim f(y|\theta)\] Then our likelihood function becomes \[\Lambda=\prod_{i=1}^n f(y_{i}|\theta)\] The log-likelihood function is \[L=\sum_{i=1}^n log f(y_{i}|\theta)\] We can find the gradient by taking the first derrivative of the log likelihood function \[L'(\theta)=\sum_{i=1}^n g(y_{i}|\theta)\] where \[g(y_{i}|\theta)=\frac{d}{d\theta}log f(y_{i}|\theta)\]. The hessian matrix is the second derrivative of the log likelihood function. \[L''(\theta)=\sum_{i=1}^n h(y_{i}|\theta)\] where \[h(y_{i}|\theta)=\frac{d^2}{d\theta\theta'}log f(y_{i}|\theta)\]. We can use this information along with the taylor rule to find the score \[z=\hat{\theta}-\theta_{0}=[-L''(\theta)]^{-1}L'(\theta)^T\] Robust Standard Errors are then found by taking the outerproduct of the score \(zz'\). \[zz'=[-\sum_{i=1}^n h(y_{i}|\theta)]^{-1}[\sum_{i=1}^n g(y_{i}|\theta)g(y_{i}|\theta)^T][-\sum_{i=1}^n h(y_{i}|\theta)]^{-1}\]

You can see that the covariance matrix resembles a sandwitch where \[bread=[-\sum_{i=1}^n h(y_{i}|\theta)]^{-1}\] and \[meat=[\sum_{i=1}^n g(y_{i}|\theta)g(y_{i}|\theta)^T]\].

To calculate clustered standard errors, we use a similar process, but we must change the meat. Instead of suming the outerproduce over each individual, we first collapse the individuals by sum over each group before completing the outerproduct. Next, we complete the outerproduct for each group. \[\hat{zz'}=[-\sum_{i=1}^n h(y_{i}|\theta)]^{-1}[\sum_{j}\sum_{i\in c_{j}} g(y_{i}|\theta)g(y_{i}|\theta)^T][-\sum_{i=1}^n h(y_{i}|\theta)]^{-1}\] where c is the group identifier and j indexes the number of groups.

Below is an example from the STATA xtpoisson help file.

use http://www.stata-press.com/data/r13/ships
gen lnservice=ln( service)
xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79 lnservice
Conditional fixed-effects Poisson regression    Number of obs      =        34
Group variable: ship                            Number of groups   =         5

                                                Obs per group: min =         6
                                                               avg =       6.8
                                                               max =         7

                                                Wald chi2(5)       =     98.07
Log likelihood  = -54.196462                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    accident |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    op_75_79 |   .3702566   .1181453     3.13   0.002     .1386961     .601817
    co_65_69 |   .6625407   .1536357     4.31   0.000     .3614202    .9636612
    co_70_74 |    .759738   .1776628     4.28   0.000     .4115253    1.107951
    co_75_79 |   .3697412    .245843     1.50   0.133    -.1121022    .8515845
   lnservice |   .9027077   .1018028     8.87   0.000      .703178    1.102237
------------------------------------------------------------------------------

Alternatively you could include the profile.do in your Markdown document by including a few lines in the preliminary code chunk that write out the file:

xtpoisson accident op_75_79 co_65_69 co_70_74 co_75_79 lnservice, fe r

Conditional fixed-effects Poisson regression    Number of obs      =        34
Group variable: ship                            Number of groups   =         5
                                                Obs per group: min =         6
                                                               avg =       6.8
                                                               max =         7
                                                Wald chi2(5)       =    368.23
Log pseudolikelihood  = -54.196462              Prob > chi2        =    0.0000
                                   (Std. Err. adjusted for clustering on ship)
------------------------------------------------------------------------------
          |               Robust
 accident |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------+-------------------------------------------------------------------
 op_75_79 |   .3702566   .0711164     5.21   0.000      .230871    .5096421
 co_65_69 |   .6625407   .0381431    17.37   0.000     .5877815    .7372999
 co_70_74 |    .759738   .1499585     5.07   0.000     .4658248    1.053651
 co_75_79 |   .3697412   .2103598     1.76   0.079    -.0425564    .7820387
lnservice |   .9027077   .1070874     8.43   0.000     .6928203    1.112595
------------------------------------------------------------------------------

To replicate these results in R we will use the Panel GLM package pglm.

Description: Estimation of panel models for glm-like models: this includes binomial models (logit and probit) count models (poisson and negbin) and ordered models (logit and probit) 

This package was created by Yves Croissant. The advantage of this package is that it provides the gradient and hessian matrix as outputs. We simply apply some matrix algrebra after runing the initial estimation.

library(pglm)
library(readstata13)
library(lmtest)
library(MASS)
ships<-readstata13::read.dta13("http://www.stata-press.com/data/r13/ships.dta")
ships$lnservice=log(ships$service)
res1 <- pglm(accident ~ op_75_79+co_65_69+co_70_74+co_75_79+lnservice,family = poisson, data = ships, effect = "individual", model="within", index = "ship")
summary(res1)
## --------------------------------------------
## Maximum Likelihood estimation
## Newton-Raphson maximisation, 3 iterations
## Return code 1: gradient close to zero
## Log-Likelihood: -54.19646 
## 5  free parameters
## Estimates:
##           Estimate Std. error t value  Pr(> t)    
## op_75_79    0.3703     0.1181   3.134  0.00172 ** 
## co_65_69    0.6625     0.1536   4.312 1.61e-05 ***
## co_70_74    0.7597     0.1777   4.276 1.90e-05 ***
## co_75_79    0.3697     0.2458   1.504  0.13259    
## lnservice   0.9027     0.1018   8.867  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

This code above is equivalent to running xtpoisson accident op_75_79- lnservice, fe i(ship). Notice there are no robust standard errors. These results match the top panel of the STATA output. These standard errors are equivalent to using the inverse of the hessian matrix.

standard_se<-ginv(-res1$hessian)
coeftest(res1,standard_se)
## 
## z test of coefficients:
## 
##           Estimate Std. Error z value  Pr(>|z|)    
## op_75_79   0.37026    0.11815  3.1339  0.001725 ** 
## co_65_69   0.66254    0.15364  4.3124 1.615e-05 ***
## co_70_74   0.75974    0.17766  4.2763 1.900e-05 ***
## co_75_79   0.36974    0.24584  1.5040  0.132589    
## lnservice  0.90271    0.10180  8.8672 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

If you wanted to calculate robust clustered standard errors, then we would use the sandwitch estimator after accounting for the groups. The code I use assumes you are clustering on the individual, but this is not a restriction of the model. For example, if you had panel data of several cities over time, then you could use city fixed effects and cluster at the state level using the same code. This option is something that cannot be completed using the canned STATA routine of xtpoisson.

# Similar to e(sample) in STATA
esample<-as.numeric(rownames(as.matrix(res1$gradientObs)))
fc <- ships[esample,]$ship #isolates the groups used in estimation

# Calculates the new Meat portion of our covariance matrix
m <- length(unique(fc))
k <- 5
u <- res1$gradientObs
u.clust <- matrix(NA, nrow=m, ncol=k)
for(j in 1:k){
  u.clust[,j] <- tapply(u[,j], fc, sum)
}
cl.vcov <-ginv(-res1$hessian)%*%( t(u.clust) %*% (u.clust))%*%ginv(-res1$hessian)
coeftest(res1,cl.vcov)
## 
## z test of coefficients:
## 
##           Estimate Std. Error z value  Pr(>|z|)    
## op_75_79  0.370257   0.071116  5.2063 1.926e-07 ***
## co_65_69  0.662541   0.038143 17.3699 < 2.2e-16 ***
## co_70_74  0.759738   0.149958  5.0663 4.056e-07 ***
## co_75_79  0.369741   0.210360  1.7577   0.07881 .  
## lnservice 0.902708   0.107087  8.4296 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

You will notice that these results are very similar to those found in the second panel of the STATA output. For other models you may need to change the covariance matrix to deal with differences in degrees of freedom.

I hope you find this code useful. If you have questions you can contact me at jose[period]fernandez[at]louisville[period]edu