One of the most useful packages I have come across this semester is stargazer. It creates publication-quality tables and spares researchers from rebuilding their tables by hand each time they tweak their datasets. The package saves users time and has been welcomed by the R community. It outputs tables in multiple formats: plain text (.txt), LaTeX code, and HTML (.html). This tutorial goes through the .txt and .html formats and provides the basic understanding needed to create summary statistics tables and regression tables.

This post demonstrates some of the applications of this wonderful package.

library(stargazer)

We will be using the iris data to show the following features of the package.

Summary Statistics

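This is similar to base R's summary() function, but it presents the statistics in a single table. Note that only the numeric columns appear in the output; the Species factor is left out.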

stargazer(iris, type='text', title=' Iris : Summary Statistics', out='table1.txt')
## 
## Iris : Summary Statistics
## =============================================================
## Statistic     N  Mean  St. Dev.  Min  Pctl(25) Pctl(75)  Max 
## -------------------------------------------------------------
## Sepal.Length 150 5.843  0.828   4.300  5.100    6.400   7.900
## Sepal.Width  150 3.057  0.436   2.000  2.800    3.300   4.400
## Petal.Length 150 3.758  1.765   1.000  1.600    5.100   6.900
## Petal.Width  150 1.199  0.762   0.100  0.300    1.800   2.500
## -------------------------------------------------------------

You can also flip the table with flip=TRUE, so that the statistics become rows and the variables become columns:

stargazer(iris, type='text', title=' Iris : Summary Statistics', out='table1.txt', flip=TRUE)
## 
## Iris : Summary Statistics
## ===========================================================
## Statistic Sepal.Length Sepal.Width Petal.Length Petal.Width
## -----------------------------------------------------------
## N             150          150         150          150    
## Mean         5.843        3.057       3.758        1.199   
## St. Dev.     0.828        0.436       1.765        0.762   
## Min          4.300        2.000       1.000        0.100   
## Pctl(25)     5.100        2.800       1.600        0.300   
## Pctl(75)     6.400        3.300       5.100        1.800   
## Max          7.900        4.400       6.900        2.500   
## -----------------------------------------------------------
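
As a rough sketch of how the two approaches differ, the snippet below prints base R's summary() and then a stargazer table restricted to a few statistics. The particular choice of statistics and the two-decimal rounding are illustrative assumptions, not something the tables above depend on.

```r
library(stargazer)

# Base R: one block per column, including counts for the Species factor
summary(iris)

# stargazer: pick which statistics to show (summary.stat) and how many
# decimal places to print (digits). The values chosen here are illustrative.
tab <- stargazer(iris,
                 type = "text",
                 title = "Iris: Selected Summary Statistics",
                 summary.stat = c("n", "mean", "median", "sd"),
                 digits = 2)
```

Leaving out the out argument simply prints the table to the console without writing a file.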

Regression

I think the most important application of this package is displaying several regression models side by side so that their results can be compared. To demonstrate this, I will first fit three regression models.

data <- iris
m1 <- lm(Sepal.Length ~ ., data)  # all other columns as predictors
m2 <- lm(Sepal.Length ~ Sepal.Width, data)
m3 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data)

Now let us compare the outputs:

stargazer(m1, m2, m3, type='text', title='Regression results', out='table2.txt')
## 
## Regression results
## =========================================================================================
##                                              Dependent variable:                         
##                     ---------------------------------------------------------------------
##                                                 Sepal.Length                             
##                               (1)                    (2)                   (3)           
## -----------------------------------------------------------------------------------------
## Sepal.Width                 0.496***               -0.223                0.596***        
##                             (0.086)                (0.155)               (0.069)         
##                                                                                          
## Petal.Length                0.829***                                     0.472***        
##                             (0.069)                                      (0.017)         
##                                                                                          
## Petal.Width                 -0.315**                                                     
##                             (0.151)                                                      
##                                                                                          
## Speciesversicolor          -0.724***                                                     
##                             (0.240)                                                      
##                                                                                          
## Speciesvirginica           -1.023***                                                     
##                             (0.334)                                                      
##                                                                                          
## Constant                    2.171***              6.526***               2.249***        
##                             (0.280)                (0.479)               (0.248)         
##                                                                                          
## -----------------------------------------------------------------------------------------
## Observations                  150                    150                   150           
## R2                           0.867                  0.014                 0.840          
## Adjusted R2                  0.863                  0.007                 0.838          
## Residual Std. Error     0.307 (df = 144)      0.825 (df = 148)       0.333 (df = 147)    
## F Statistic         188.251*** (df = 5; 144) 2.074 (df = 1; 148) 386.386*** (df = 2; 147)
## =========================================================================================
## Note:                                                         *p<0.1; **p<0.05; ***p<0.01
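
If you want to check the fit statistics in the table programmatically, a minimal sketch along these lines pulls the adjusted R-squared out of each model with base R. The list name models and the helper variable adj_r2 are just illustrative; the models are re-fitted so the snippet runs on its own.

```r
# Re-fit the three models from above so this snippet is self-contained
models <- list(
  m1 = lm(Sepal.Length ~ ., data = iris),
  m2 = lm(Sepal.Length ~ Sepal.Width, data = iris),
  m3 = lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
)

# Extract adjusted R-squared from each model's summary
adj_r2 <- sapply(models, function(m) summary(m)$adj.r.squared)
print(round(adj_r2, 3))

# Name of the model with the highest adjusted R-squared
best <- names(which.max(adj_r2))
print(best)
```

The rounded values should agree with the "Adjusted R2" row of the table.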

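We can easily compare R², standard errors, significance levels, and other statistics side by side to judge which model fits our data best. Here, model (1) has the highest adjusted R² (0.863), while model (2), which uses Sepal.Width alone, explains almost none of the variation (R² = 0.014).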
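
The same call can produce HTML instead of plain text by switching the type argument. The sketch below re-fits the three models so it runs on its own; the file name table2.html is an illustrative choice.

```r
library(stargazer)

# Re-fit the three models so this snippet is self-contained
m1 <- lm(Sepal.Length ~ ., data = iris)
m2 <- lm(Sepal.Length ~ Sepal.Width, data = iris)
m3 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)

# type = "html" prints the HTML markup to the console;
# out writes the same table to a file ("table2.html" is an assumed name)
html_lines <- stargazer(m1, m2, m3,
                        type = "html",
                        title = "Regression results",
                        out = "table2.html")
```

Opening the resulting file in a browser shows the formatted table, and the markup can also be pasted into an HTML page or an R Markdown document that renders to HTML.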