A short tutorial on the stargazer package

Introduction

This short tutorial on the stargazer package aims to give a basic understanding of how to create regression tables.

What is the `stargazer` package?

The stargazer package creates neatly formatted tables of regression output and summary statistics.
It can output tables in several formats: plain text, LaTeX code, and HTML.
Text output gives a quick view of the results in the console.
HTML output can be opened in or copied into a Word document.

How to use the `stargazer` package?

First, install the package and load it with `library()`.

#install.packages("stargazer")
library(stargazer)

## 
## Please cite as:
## 
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

stargazer(mtcars, type = "text", digits = 1)

## 
## ======================================
## Statistic N  Mean  St. Dev. Min   Max 
## --------------------------------------
## mpg       32 20.1    6.0    10.4 33.9 
## cyl       32  6.2    1.8     4     8  
## disp      32 230.7  123.9   71.1 472.0
## hp        32 146.7   68.6    52   335 
## drat      32  3.6    0.5    2.8   4.9 
## wt        32  3.2    1.0    1.5   5.4 
## qsec      32 17.8    1.8    14.5 22.9 
## vs        32  0.4    0.5     0     1  
## am        32  0.4    0.5     0     1  
## gear      32  3.7    0.7     3     5  
## carb      32  2.8    1.6     1     8  
## --------------------------------------

dat1 <- mtcars

The stargazer call below uses four arguments.

The first gives the data frame (dat1).
The second sets the output type ("text").
The third sets the title ("Summary Statistics").
The fourth gives the file name for the exported table ("dat1.txt").
The resulting table reports the number of observations, mean, standard deviation, minimum, and maximum of each variable.

stargazer(dat1, type = "text", title = "Summary Statistics", out = "dat1.txt")

## 
## Summary Statistics
## ============================================
## Statistic N   Mean   St. Dev.  Min     Max  
## --------------------------------------------
## mpg       32 20.091   6.027   10.400 33.900 
## cyl       32  6.188   1.786     4       8   
## disp      32 230.722 123.939  71.100 472.000
## hp        32 146.688  68.563    52     335  
## drat      32  3.597   0.535   2.760   4.930 
## wt        32  3.217   0.978   1.513   5.424 
## qsec      32 17.849   1.787   14.500 22.900 
## vs        32  0.438   0.504     0       1   
## am        32  0.406   0.499     0       1   
## gear      32  3.688   0.738     3       5   
## carb      32  2.812   1.615     1       8   
## --------------------------------------------
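The `type` and `out` arguments can be tried side by side. The sketch below is a minimal illustration, assuming stargazer is installed. The temporary file path and the object names (`out_file`, `txt`, `html`, `tex`) are placeholders introduced here. Each call is wrapped in `capture.output()` so the printed table is also kept as a character vector.

```r
library(stargazer)

# Write the text table to a temporary file (placeholder path) and keep the
# printed lines in a character vector at the same time.
out_file <- file.path(tempdir(), "dat1.txt")
txt <- capture.output(
  stargazer(mtcars, type = "text", title = "Summary Statistics", out = out_file)
)

# The same summary table as HTML and as LaTeX code.
html <- capture.output(stargazer(mtcars, type = "html"))
tex  <- capture.output(stargazer(mtcars, type = "latex", header = FALSE))

file.exists(out_file)   # the exported table is on disk
head(html, 3)           # raw HTML markup
head(tex, 3)            # raw LaTeX markup
```

Only the `type` value changes between the three calls; the data frame and the resulting statistics are the same.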

Regression Output with stargazer

So far, we have seen that passing a data frame to stargazer creates a summary statistics table. The package is also very practical for creating regression tables: simply pass one or more fitted model objects.

m1 <- lm(mpg ~ hp, mtcars)
m2 <- lm(mpg ~ drat, mtcars)
m3 <- lm(mpg ~ hp + drat, mtcars)

stargazer(m1, m2, m3,
          type = "html",
          digits = 1,
          header = FALSE,
          title= "Regression Results",
          covariate.labels = c("Horsepower", "Rear axle ratio"))

**Regression Results**

	Dependent variable:

	mpg
	(1)	(2)	(3)

Horsepower	-0.1^***		-0.1^***
	(0.01)		(0.01)

Rear axle ratio		7.7^***	4.7^***
		(1.5)	(1.2)

Constant	30.1^***	-7.5	10.8^**
	(1.6)	(5.5)	(5.1)


Observations	32	32	32
R²	0.6	0.5	0.7
Adjusted R²	0.6	0.4	0.7
Residual Std. Error	3.9 (df = 30)	4.5 (df = 30)	3.2 (df = 29)
F Statistic	45.5^*** (df = 1; 30)	26.0^*** (df = 1; 30)	41.5^*** (df = 2; 29)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01

Descriptive Statistics and Dataframe

When a chunk produces HTML or LaTeX output, it should include the option results='asis'. This option tells knitr to pass the chunk's output through "as is" instead of wrapping it in a verbatim code block. Otherwise, instead of your table, you will see the raw HTML or LaTeX code.

stargazer(attitude[1:5,], header=FALSE, type='html', summary=FALSE, title="Data Frame",digits=1)

**Data Frame**

	rating	complaints	privileges	learning	raises	critical	advance

1	43	51	30	39	61	92	45
2	63	64	51	54	63	73	47
3	71	70	68	69	76	86	48
4	61	63	45	47	54	84	35
5	81	78	56	66	71	83	47

We can also display Descriptive Statistics

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics",digits=1)

**Descriptive Statistics**

Statistic	N	Mean	St. Dev.	Min	Max

rating	30	64.6	12.2	40	85
complaints	30	66.6	13.3	37	90
privileges	30	53.1	12.2	30	83
learning	30	56.4	11.7	34	75
raises	30	64.6	10.4	43	88
critical	30	74.8	9.9	49	92
advance	30	42.9	10.3	25	72

We can change the covariate labels

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges")
          )

**Descriptive Statistics**

Statistic	N	Mean	St. Dev.	Min	Max

Rating	30	64.6	12.2	40	85
Complaints	30	66.6	13.3	37	90
Privileges	30	53.1	12.2	30	83

We can select which statistics to display

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges"),
          summary.stat=c("n","mean","p75","sd","min")
          )

**Descriptive Statistics**

Statistic	N	Mean	Pctl(75)	St. Dev.	Min

Rating	30	64.6	71.8	12.2	40
Complaints	30	66.6	77	13.3	37
Privileges	30	53.1	62.5	12.2	30

We can transpose rows and columns

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics", digits=1, flip=TRUE)

**Descriptive Statistics**

Statistic	rating	complaints	privileges	learning	raises	critical	advance

N	30	30	30	30	30	30	30
Mean	64.6	66.6	53.1	56.4	64.6	74.8	42.9
St. Dev.	12.2	13.3	12.2	11.7	10.4	9.9	10.3
Min	40	37	30	34	43	49	25
Max	85	90	83	75	88	92	72

Display correlation matrix

correlation.matrix <- cor(attitude[,c("rating","complaints","privileges")])
correlation.matrix

##               rating complaints privileges
## rating     1.0000000  0.8254176  0.4261169
## complaints 0.8254176  1.0000000  0.5582882
## privileges 0.4261169  0.5582882  1.0000000

stargazer(correlation.matrix, header=FALSE, type="html", title="Correlation Matrix")

**Correlation Matrix**

	rating	complaints	privileges

rating	1	0.825	0.426
complaints	0.825	1	0.558
privileges	0.426	0.558	1

Regression tables

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data=attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude) 

## create an indicator dependent variable, and run a logit model
attitude$highrating <- (attitude$rating > 70) 

logit.model <- glm(highrating ~ learning + critical + advance, data=attitude,
                   family = binomial(link = "logit"))

stargazer(linear.1, linear.2, logit.model,header=FALSE,title="My Nice Regression Table", 
          type='html',digits=2)

**My Nice Regression Table**

	Dependent variable:

	rating		highrating
	OLS		logistic
	(1)	(2)	(3)

complaints	0.69^***	0.68^***
	(0.15)	(0.13)

privileges	-0.10	-0.10
	(0.13)	(0.13)

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

raises	-0.03
	(0.20)

critical	0.02		0.001
	(0.15)		(0.08)

advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
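The third model is a logit because of `link = "logit"`. If a probit is wanted instead, only the link changes. The sketch below illustrates this under stated assumptions: `att` and `probit.model` are names introduced here, and a copy of the data is used so the built-in `attitude` data frame stays untouched.

```r
# Work on a copy so the built-in data set is not modified.
att <- attitude
att$highrating <- att$rating > 70   # logical indicator: TRUE for ratings above 70

# Same formula, two link functions.
logit.model <- glm(highrating ~ learning + critical + advance,
                   data = att, family = binomial(link = "logit"))
probit.model <- glm(highrating ~ learning + critical + advance,
                    data = att, family = binomial(link = "probit"))

# Compare the fitted coefficients side by side.
cbind(logit = coef(logit.model), probit = coef(probit.model))
```

Either object can be passed to `stargazer()` as the third model.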

Customizing the table

We can customize the table further. `dep.var.caption` sets the caption above the columns, `dep.var.labels.include = FALSE` removes the dependent variable names, `model.names = FALSE` removes the model names, `model.numbers = FALSE` removes the model numbers, and `column.labels = c()` renames the columns.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1, 1)
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

complaints	0.69^***	0.68^***
	(0.15)	(0.13)

privileges	-0.10	-0.10
	(0.13)	(0.13)

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

raises	-0.03
	(0.20)

critical	0.02		0.001
	(0.15)		(0.08)

advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01

Combining column labels

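We can make one column label span several columns with `column.separate = c()`. Each number gives how many adjacent columns the corresponding label in `column.labels` covers.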

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Awesome"),
          column.separate = c(2, 1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance")
          )

**My Nice Regression Table**

	A better caption

	Good		Awesome

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)

Privileges	-0.10	-0.10
	(0.13)	(0.13)

Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Raises	-0.03
	(0.20)

Critical	0.02		0.001
	(0.15)		(0.08)

Advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
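One way to read `column.labels` together with `column.separate` is as a label repeated over a number of model columns. The base R sketch below only illustrates that mapping; it is not stargazer's internal code, and `per_model` is a name introduced here.

```r
column.labels   <- c("Good", "Awesome")
column.separate <- c(2, 1)

# Expand each label over the number of columns it spans.
per_model <- rep(column.labels, times = column.separate)
per_model

# The spans should add up to the number of models in the table (three here).
sum(column.separate)
```

With `c(2, 1)`, the first two models share the label "Good" and the third gets "Awesome", which matches the table above.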

Different journal styles:

We can use different journal formats. Let's see the American Economic Review style for our table using `style = "aer"`.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="aer"
          )

**My Nice Regression Table**

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)

Privileges	-0.10	-0.10
	(0.13)	(0.13)

Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Raises	-0.03
	(0.20)

Critical	0.02		0.00
	(0.15)		(0.08)

Advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)

Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Notes:	^***Significant at the 1 percent level.
	^**Significant at the 5 percent level.
	^*Significant at the 10 percent level.

American Journal of Political Science style

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="ajps"
          )

**My Nice Regression Table**

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)
Privileges	-0.10	-0.10
	(0.13)	(0.13)
Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)
Raises	-0.03
	(0.20)
Critical	0.02		0.001
	(0.15)		(0.08)
Advance			-0.10
			(0.08)
Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)
N	30	30	30
R-squared	0.72	0.72
Adj. R-squared	0.66	0.68
Log Likelihood			-9.21
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)
AIC			26.43

^***p < .01; ^**p < .05; ^*p < .1

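Adding a line below our reported coefficients

We can add custom rows between the coefficients and the summary statistics with `add.lines`. Each vector becomes one row: its first element is the row label and the remaining elements fill the model columns. Here we use add.lines = list(c("Hello", "This", "is", "how"),c("add", "a line", "below", "coefficents"))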

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "is", "how"),c("add", "a line", "below", "coefficents"))
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)

Privileges	-0.10	-0.10
	(0.13)	(0.13)

Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Raises	-0.03
	(0.20)

Critical	0.02		0.001
	(0.15)		(0.08)

Advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Hello	This	is	how
add	a line	below	coefficents
Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
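The rows passed to `add.lines` can also be built from the models themselves. The following is a minimal sketch under stated assumptions: the names `att`, `models`, `bic_row`, `est_row`, and `extra_lines` are introduced here, and BIC is chosen only as an example statistic. The stargazer call runs only if the package is installed.

```r
# Work on a copy so the built-in data set is not modified.
att <- attitude
att$highrating <- att$rating > 70

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = att)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = att)
logit.model <- glm(highrating ~ learning + critical + advance, data = att,
                   family = binomial(link = "logit"))

models <- list(linear.1, linear.2, logit.model)

# Each row: a label first, then one entry per model column.
bic_row <- c("BIC", sprintf("%.2f", sapply(models, BIC)))
est_row <- c("Estimator", "OLS", "OLS", "Logit")
extra_lines <- list(bic_row, est_row)

# Only call stargazer when it is installed.
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(linear.1, linear.2, logit.model,
                       type = "text", add.lines = extra_lines)
}
```

Each row has four elements here: one label plus one entry for each of the three models.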

Showing the confidence intervals rather than standard errors

Alternatively, you can report confidence intervals in place of standard errors by adding `ci = TRUE` and `ci.level = 0.95`.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          ci = TRUE,ci.level = 0.95
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.40, 0.98)	(0.43, 0.93)

Privileges	-0.10	-0.10
	(-0.37, 0.16)	(-0.36, 0.15)

Learning	0.25	0.24^*	0.28^***
	(-0.06, 0.56)	(-0.04, 0.51)	(0.08, 0.48)

Raises	-0.03
	(-0.43, 0.36)

Critical	0.02		0.001
	(-0.27, 0.30)		(-0.16, 0.16)

Advance			-0.10
			(-0.24, 0.05)

Constant	11.01	11.26	-13.23^**
	(-11.93, 33.95)	(-3.09, 25.60)	(-26.15, -0.30)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
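The intervals in this table appear consistent with a normal approximation (estimate plus or minus a normal quantile times the standard error) rather than with the t-based intervals that `confint()` returns for a linear model. The base R sketch below recomputes both for the first model so they can be compared. It is an illustration under that assumption, not stargazer's own code, and the names `est`, `se`, `z`, `ci_normal`, and `ci_t` are introduced here.

```r
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)

est <- coef(linear.1)
se  <- sqrt(diag(vcov(linear.1)))

# Normal-approximation interval: estimate +/- z * standard error.
z <- qnorm(1 - (1 - 0.95) / 2)
ci_normal <- cbind(lower = est - z * se, upper = est + z * se)

# t-based interval from base R, for comparison.
ci_t <- confint(linear.1, level = 0.95)

round(ci_normal, 2)
round(ci_t, 2)
```

With only 30 observations the t-based intervals are somewhat wider, so it is worth stating which kind a table reports.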

Omitting some variables from the table

You can omit variables from the table with `omit = c("varname1", "varname2")`.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          add.lines = list(c("Control Variables", "No", "Yes", "Yes"),c("Algorithms optimized", "Yes", "No", "No")),
          omit=c("complaints","privileges","raises","critical","advance")
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Control Variables	No	Yes	Yes
Algorithms optimized	Yes	No	No
Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
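The stargazer documentation describes the entries of `omit` as regular expressions, so a short pattern can match more variables than intended. The base R sketch below previews which coefficient names a set of patterns would match. It uses `grepl()` as a stand-in and only approximates how stargazer applies the patterns; `coef_names`, `patterns`, and `dropped` are names introduced here.

```r
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
coef_names <- names(coef(linear.1))

# Patterns as passed to omit in the call above.
patterns <- c("complaints", "privileges", "raises", "critical", "advance")

# TRUE where any pattern matches a coefficient name.
dropped <- Reduce(`|`, lapply(patterns, grepl, x = coef_names))
coef_names[!dropped]

# Unanchored patterns match substrings; anchors make the match exact.
coef_names[grepl("r", coef_names)]
coef_names[grepl("^raises$", coef_names)]
```

For the first model, only the intercept and `learning` remain, in line with the table above.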

A Student

02 March 2022
