Introduction

This is a short tutorial on the stargazer package. Its goal is to provide a basic understanding of how to create regression tables.

What is the stargazer package?

  • The stargazer package creates neatly formatted tables of your regression output.
  • The package can output tables in multiple formats: plain text (.txt), LaTeX code, and HTML (.html).
  • Printing the table as text gives a quick view of the results.
  • Saving the table as .html produces a file that can be opened in a Word document.

How to use the stargazer package?

  • First, install the package and load the library
#install.packages("stargazer")
library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
stargazer(mtcars, type = "text", digits = 1)
## 
## ======================================
## Statistic N  Mean  St. Dev. Min   Max 
## --------------------------------------
## mpg       32 20.1    6.0    10.4 33.9 
## cyl       32  6.2    1.8     4     8  
## disp      32 230.7  123.9   71.1 472.0
## hp        32 146.7   68.6    52   335 
## drat      32  3.6    0.5    2.8   4.9 
## wt        32  3.2    1.0    1.5   5.4 
## qsec      32 17.8    1.8    14.5 22.9 
## vs        32  0.4    0.5     0     1  
## am        32  0.4    0.5     0     1  
## gear      32  3.7    0.7     3     5  
## carb      32  2.8    1.6     1     8  
## --------------------------------------
dat1 <- mtcars

The stargazer call below uses four arguments.

  • The first one is the data frame (dat1).
  • The second one sets the output type (“text”).
  • The third one sets the title (“Summary Statistics”).
  • The fourth one, out, names the file the table is exported to (“dat1.txt”).

For a data frame, the resulting table reports the number of observations, mean, standard deviation, minimum and maximum of each variable.
stargazer(dat1, type= "text", title= "Summary Statistics", out= "dat1.txt")
## 
## Summary Statistics
## ============================================
## Statistic N   Mean   St. Dev.  Min     Max  
## --------------------------------------------
## mpg       32 20.091   6.027   10.400 33.900 
## cyl       32  6.188   1.786     4       8   
## disp      32 230.722 123.939  71.100 472.000
## hp        32 146.688  68.563    52     335  
## drat      32  3.597   0.535   2.760   4.930 
## wt        32  3.217   0.978   1.513   5.424 
## qsec      32 17.849   1.787   14.500 22.900 
## vs        32  0.438   0.504     0       1   
## am        32  0.406   0.499     0       1   
## gear      32  3.688   0.738     3       5   
## carb      32  2.812   1.615     1       8   
## --------------------------------------------
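As a minimal sketch of how the type and out arguments interact, the snippet below renders the same summary table in three formats and writes one copy to a temporary file. The object names (txt_lines, html_lines, latex_lines, tmp_file) and the use of tempfile() are illustrative assumptions, not part of the walkthrough above.

```r
library(stargazer)

dat1 <- mtcars

# capture.output() collects the printed table as a character vector,
# one element per line, so the three formats can be inspected side by side.
txt_lines   <- capture.output(stargazer(dat1, type = "text"))
html_lines  <- capture.output(stargazer(dat1, type = "html"))
latex_lines <- capture.output(stargazer(dat1, type = "latex", header = FALSE))

# Write the text version to a temporary file rather than the working directory.
tmp_file <- tempfile(fileext = ".txt")
invisible(capture.output(
  stargazer(dat1, type = "text", title = "Summary Statistics", out = tmp_file)
))

file.exists(tmp_file)
head(readLines(tmp_file))
```

Using a temporary file keeps the example from leaving files behind; in a real project you would pass a path such as "dat1.txt" instead.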

Regression Output with stargazer

So far, we have seen that passing a data frame to stargazer creates a summary statistics table. The package is just as practical for regression tables: simply pass it one or more fitted model objects.

m1 <- lm(mpg ~ hp, data = mtcars)
m2 <- lm(mpg ~ drat, data = mtcars)
m3 <- lm(mpg ~ hp + drat, data = mtcars)

stargazer(m1, m2, m3,
          type = "html",
          digits = 1,
          header = FALSE,
          title= "Regression Results",
          covariate.labels = c("Horsepower", "Rear axle ratio"))
Regression Results
                      Dependent variable:
                      mpg
                      (1)                    (2)                    (3)
Horsepower            -0.1***                                       -0.1***
                      (0.01)                                        (0.01)
Rear axle ratio                              7.7***                 4.7***
                                             (1.5)                  (1.2)
Constant              30.1***                -7.5                   10.8**
                      (1.6)                  (5.5)                  (5.1)
Observations          32                     32                     32
R2                    0.6                    0.5                    0.7
Adjusted R2           0.6                    0.4                    0.7
Residual Std. Error   3.9 (df = 30)          4.5 (df = 30)          3.2 (df = 29)
F Statistic           45.5*** (df = 1; 30)   26.0*** (df = 1; 30)   41.5*** (df = 2; 29)
Note:                 *p<0.1; **p<0.05; ***p<0.01

Descriptive Statistics and Dataframe

When you produce HTML or LaTeX output inside an R Markdown document, the chunk should include results='asis'. This option tells knitr to write the output into the document as raw markup instead of wrapping it in a verbatim code block. Otherwise, instead of your table, you will see the raw HTML or LaTeX code.
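One way to avoid hard-coding the output type is to let knitr report the target format. The sketch below assumes the knitr package is installed; table_type is a hypothetical helper variable, and the chunk containing the stargazer call would still need results='asis'.

```r
library(stargazer)

# knitr::is_latex_output() is TRUE only while knitting to a LaTeX/PDF target;
# everywhere else (including the interactive console) this falls back to HTML.
table_type <- if (knitr::is_latex_output()) "latex" else "html"

# In an R Markdown document, the chunk holding this call would carry
# results='asis' so the markup is rendered rather than shown as raw code.
stargazer(attitude[1:5, ], type = table_type, summary = FALSE,
          header = FALSE, title = "Data Frame", digits = 1)
```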

stargazer(attitude[1:5,], header=FALSE, type='html', summary=FALSE, title="Data Frame",digits=1)
Data Frame
   rating  complaints  privileges  learning  raises  critical  advance
1  43      51          30          39        61      92        45
2  63      64          51          54        63      73        47
3  71      70          68          69        76      86        48
4  61      63          45          47        54      84        35
5  81      78          56          66        71      83        47

We can also display Descriptive Statistics

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics",digits=1)
Descriptive Statistics
Statistic   N   Mean  St. Dev.  Min  Max
rating      30  64.6  12.2      40   85
complaints  30  66.6  13.3      37   90
privileges  30  53.1  12.2      30   83
learning    30  56.4  11.7      34   75
raises      30  64.6  10.4      43   88
critical    30  74.8  9.9       49   92
advance     30  42.9  10.3      25   72

We can change the covariate labels

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges")
          )
Descriptive Statistics
Statistic   N   Mean  St. Dev.  Min  Max
Rating      30  64.6  12.2      40   85
Complaints  30  66.6  13.3      37   90
Privileges  30  53.1  12.2      30   83

We can select which statistics to display

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges"),
          summary.stat=c("n","mean","p75","sd","min")
          )
Descriptive Statistics
Statistic   N   Mean  Pctl(75)  St. Dev.  Min
Rating      30  64.6  71.8      12.2      40
Complaints  30  66.6  77        13.3      37
Privileges  30  53.1  62.5      12.2      30
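The same summary.stat argument accepts other statistic codes as well. As a small sketch, the call below requests a quartile-based summary using the codes "p25", "median", "p75" and "max", and then cross-checks one cell against base R. The names vars and tab are illustrative.

```r
library(stargazer)

vars <- attitude[c("rating", "complaints", "privileges")]

# Quartile-based summary: N, 25th percentile, median, 75th percentile, maximum.
tab <- capture.output(
  stargazer(vars, type = "text", digits = 1,
            title = "Descriptive Statistics",
            summary.stat = c("n", "p25", "median", "p75", "max"))
)
cat(tab, sep = "\n")

# Cross-check the 75th percentile of rating with base R;
# rounded to one digit it should agree with the Pctl(75) column.
quantile(vars$rating, 0.75)
```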

We can transpose rows and columns

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics", digits=1, flip=TRUE)
Descriptive Statistics
Statistic  rating  complaints  privileges  learning  raises  critical  advance
N          30      30          30          30        30      30        30
Mean       64.6    66.6        53.1        56.4      64.6    74.8      42.9
St. Dev.   12.2    13.3        12.2        11.7      10.4    9.9       10.3
Min        40      37          30          34        43      49        25
Max        85      90          83          75        88      92        72

Display correlation matrix

correlation.matrix <- cor(attitude[,c("rating","complaints","privileges")])
correlation.matrix
##               rating complaints privileges
## rating     1.0000000  0.8254176  0.4261169
## complaints 0.8254176  1.0000000  0.5582882
## privileges 0.4261169  0.5582882  1.0000000

stargazer(correlation.matrix, header=FALSE, type="html", title="Correlation Matrix")
Correlation Matrix
            rating  complaints  privileges
rating      1       0.825       0.426
complaints  0.825   1           0.558
privileges  0.426   0.558       1

Regression tables

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data=attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude) 

## create an indicator dependent variable, and run a logit model
attitude$highrating <- (attitude$rating > 70)

logit.model <- glm(highrating ~ learning + critical + advance, data=attitude,
                   family = binomial(link = "logit"))

stargazer(linear.1, linear.2, logit.model,header=FALSE,title="My Nice Regression Table", 
          type='html',digits=2)
My Nice Regression Table
                      Dependent variable:
                      rating                                        highrating
                      OLS                                           logistic
                      (1)                    (2)                    (3)
complaints            0.69***                0.68***
                      (0.15)                 (0.13)
privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
raises                -0.03
                      (0.20)
critical              0.02                                          0.001
                      (0.15)                                        (0.08)
advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01

Customizing the table

We can customize the table further: dep.var.caption replaces the “Dependent variable:” caption, dep.var.labels.include = FALSE removes the dependent variable names, model.names = FALSE removes the model names, model.numbers = FALSE removes the model numbers, and column.labels = c() sets the column labels.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1, 1)
          )
My Nice Regression Table
                      A better caption
                      Good                   Better                 Best
complaints            0.69***                0.68***
                      (0.15)                 (0.13)
privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
raises                -0.03
                      (0.20)
critical              0.02                                          0.001
                      (0.15)                                        (0.08)
advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01

Combining column labels

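We can make one label span several columns with column.separate = c(), which gives the number of columns each label in column.labels covers. Below, “Good” spans the first two models and “Awesome” covers the third.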

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Awesome"),
          column.separate = c(2, 1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance")
          )
My Nice Regression Table
                      A better caption
                      Good                                          Awesome
Complaints            0.69***                0.68***
                      (0.15)                 (0.13)
Privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
Learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
Raises                -0.03
                      (0.20)
Critical              0.02                                          0.001
                      (0.15)                                        (0.08)
Advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01

Different journal styles:

We can use different journal formats. Let’s see the American Economic Review style for our table, using style="aer".

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="aer"
          )
My Nice Regression Table
                      Good                   Better                 Best
Complaints            0.69***                0.68***
                      (0.15)                 (0.13)
Privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
Learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
Raises                -0.03
                      (0.20)
Critical              0.02                                          0.00
                      (0.15)                                        (0.08)
Advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Notes:                ***Significant at the 1 percent level.
                      **Significant at the 5 percent level.
                      *Significant at the 10 percent level.

American Journal of Political Science style

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="ajps"
          )
My Nice Regression Table
                      Good                   Better                 Best
Complaints            0.69***                0.68***
                      (0.15)                 (0.13)
Privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
Learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
Raises                -0.03
                      (0.20)
Critical              0.02                                          0.001
                      (0.15)                                        (0.08)
Advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
N                     30                     30                     30
R-squared             0.72                   0.72
Adj. R-squared        0.66                   0.68
Log Likelihood                                                      -9.21
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
AIC                                                                 26.43
***p < .01; **p < .05; *p < .1
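To compare several journal layouts quickly, it can help to print the same model under each style as text in the console. The loop below is a small sketch; "default", "aer", "ajps" and "qje" are built-in style names, and the loop variable s is illustrative.

```r
library(stargazer)

linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

# Print the same model under several built-in styles to compare layouts.
for (s in c("default", "aer", "ajps", "qje")) {
  cat("\n--- style =", s, "---\n")
  stargazer(linear.2, type = "text", style = s, digits = 2)
}
```

Text output is convenient here because the differences between styles (statistic labels, ordering, notes) are visible without knitting the document.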

Adding a line below our reported coefficients

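Let’s further customize the statistics reported below the coefficients. For this, we use add.lines, which takes a list of character vectors. Each vector is one row of the table: the row label comes first, followed by one entry per model column. For example, add.lines = list(c("Hello", "This", "is", "how"),c("add", "a line", "below", "coefficents"))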

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "is", "how"),c("add", "a line", "below", "coefficents"))
          )
My Nice Regression Table
                      A better caption
                      Good                   Better                 Best
Complaints            0.69***                0.68***
                      (0.15)                 (0.13)
Privileges            -0.10                  -0.10
                      (0.13)                 (0.13)
Learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
Raises                -0.03
                      (0.20)
Critical              0.02                                          0.001
                      (0.15)                                        (0.08)
Advance                                                             -0.10
                                                                    (0.08)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Hello                 This                   is                     how
add                   a line                 below                  coefficents
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01
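In practice, the extra rows usually carry computed values or yes/no flags rather than placeholder words. The sketch below builds one row from AIC() and one hand-written row; the names models, aic_row and ctl_row, and the "Extra controls" label, are illustrative assumptions.

```r
library(stargazer)

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

models <- list(linear.1, linear.2)

# Each add.lines entry is a character vector: the row label first,
# then one cell per model column.
aic_row <- c("AIC", sprintf("%.1f", sapply(models, AIC)))
ctl_row <- c("Extra controls", "Yes", "No")

stargazer(linear.1, linear.2, type = "text", digits = 2,
          add.lines = list(aic_row, ctl_row))
```

Formatting the numbers with sprintf() before passing them in keeps control over rounding, since add.lines entries are printed as given.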

Showing the confidence intervals rather than standard errors

Alternatively, you can report confidence intervals instead of standard errors by adding ci = TRUE, ci.level = 0.95

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          ci = TRUE,ci.level = 0.95
          )
My Nice Regression Table
                      A better caption
                      Good                   Better                 Best
Complaints            0.69***                0.68***
                      (0.40, 0.98)           (0.43, 0.93)
Privileges            -0.10                  -0.10
                      (-0.37, 0.16)          (-0.36, 0.15)
Learning              0.25                   0.24*                  0.28***
                      (-0.06, 0.56)          (-0.04, 0.51)          (0.08, 0.48)
Raises                -0.03
                      (-0.43, 0.36)
Critical              0.02                                          0.001
                      (-0.27, 0.30)                                 (-0.16, 0.16)
Advance                                                             -0.10
                                                                    (-0.24, 0.05)
Constant              11.01                  11.26                  -13.23**
                      (-11.93, 33.95)        (-3.09, 25.60)         (-26.15, -0.30)
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01
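The intervals in the table above are consistent with a normal approximation, that is, estimate ± 1.96 × standard error at the 95% level. As a rough check under that assumption, this base R sketch recomputes the bounds for the second model; the names est, se, z and ci are illustrative.

```r
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

est <- coef(linear.2)
se  <- sqrt(diag(vcov(linear.2)))

# Normal-approximation interval: estimate +/- z * standard error,
# with z the standard normal quantile for a two-sided 95% interval.
z  <- qnorm(1 - (1 - 0.95) / 2)
ci <- cbind(lower = est - z * se, upper = est + z * se)

round(ci, 2)
```

If the assumption holds, the rounded bounds should be close to the second column of the table. Note that confint() on an lm object uses t quantiles instead, so its intervals are slightly wider.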

Omitting some variables from the table

You can omit variables from the table with omit = c("varname1","varname2"). Each entry is treated as a regular expression and matched against the variable names.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          add.lines = list(c("Control Variables", "No", "Yes", "Yes"),c("Algorithms optimized", "Yes", "No", "No")),
          omit=c("complaints","privileges","raises","critical","advance")
          )
My Nice Regression Table
                      A better caption
                      Good                   Better                 Best
learning              0.25                   0.24*                  0.28***
                      (0.16)                 (0.14)                 (0.10)
Constant              11.01                  11.26                  -13.23**
                      (11.70)                (7.32)                 (6.60)
Control Variables     No                     Yes                    Yes
Algorithms optimized  Yes                    No                     No
Observations          30                     30                     30
R2                    0.72                   0.72
Adjusted R2           0.66                   0.68
Log Likelihood                                                      -9.21
Akaike Inf. Crit.                                                   26.43
Residual Std. Error   7.14 (df = 24)         6.86 (df = 26)
F Statistic           12.06*** (df = 5; 24)  21.74*** (df = 3; 26)
Note:                 *p<0.1; **p<0.05; ***p<0.01
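Because omit entries are regular expressions, it can be worth previewing which coefficient names a pattern matches before dropping them. The sketch below does this with grepl() for a single anchored pattern; the names pattern and matched are illustrative.

```r
library(stargazer)

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)

# Anchored pattern: match these variable names exactly, nothing longer.
pattern <- "^(complaints|privileges|raises|critical)$"

# Preview which coefficient names the pattern would remove.
matched <- names(coef(linear.1))[grepl(pattern, names(coef(linear.1)))]
matched

stargazer(linear.1, type = "text", digits = 2, omit = pattern)
```

Anchoring with ^ and $ guards against partial matches, for example a short pattern accidentally matching a longer variable name that contains it.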