stargazer is an R package that creates HTML code for beautiful tables. You can use R to run your analysis and use the stargazer output directly in your R Markdown document. No copy, no paste!

We will also use the equatiomatic package to extract the equations behind our models.

Creating Tables

Installation

Let’s first install stargazer.

install.packages("stargazer")
library(stargazer)

data(attitude)

Descriptive Statistics and Dataframe

When a chunk produces HTML or LaTeX code, its header should include results='asis'. This option tells knitr to write the chunk's output into the document "as is" rather than wrapping it in a verbatim code block. Otherwise, instead of your table, you will see the raw HTML or LaTeX code.

#```{r, results='asis', echo = TRUE, eval=TRUE, warning=FALSE, message=FALSE}

We can show the dataframe:

stargazer(attitude[1:5,], header=FALSE, type='html', summary=FALSE, title="Data Frame",digits=1)

**Data Frame**

	rating	complaints	privileges	learning	raises	critical	advance

1	43	51	30	39	61	92	45
2	63	64	51	54	63	73	47
3	71	70	68	69	76	86	48
4	61	63	45	47	54	84	35
5	81	78	56	66	71	83	47

Or, more commonly, we can show descriptive statistics:

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics",digits=1)

**Descriptive Statistics**

Statistic	N	Mean	St. Dev.	Min	Pctl(25)	Pctl(75)	Max

rating	30	64.6	12.2	40	58.8	71.8	85
complaints	30	66.6	13.3	37	58.5	77	90
privileges	30	53.1	12.2	30	45	62.5	83
learning	30	56.4	11.7	34	47	66.8	75
raises	30	64.6	10.4	43	58.2	71	88
critical	30	74.8	9.9	49	69.2	80	92
advance	30	42.9	10.3	25	35	47.8	72
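
While drafting, it can be handy to check a table in the console before knitting. The sketch below assumes stargazer's type = "text" output, which prints a plain-text version of the same table; the object name preview is only illustrative.

```r
library(stargazer)
data(attitude)

# type = "text" prints a plain-text version of the table, which is
# convenient for checking its contents before knitting to HTML.
preview <- capture.output(
  stargazer(attitude, type = "text", title = "Descriptive Statistics", digits = 1)
)

# Show the captured table in the console
cat(preview, sep = "\n")
```

Once the table looks right, switch back to type = 'html' inside a chunk with results='asis'.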

Replace Labels

We can replace the variable names with our own labels using covariate.labels.

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges")
          )

**Descriptive Statistics**

Statistic	N	Mean	St. Dev.	Min	Pctl(25)	Pctl(75)	Max

Rating	30	64.6	12.2	40	58.8	71.8	85
Complaints	30	66.6	13.3	37	58.5	77	90
Privileges	30	53.1	12.2	30	45	62.5	83

Choosing the statistics to present

We can choose which statistics to present, and in what order, with summary.stat:

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges"),
          summary.stat=c("n","mean","p75","sd","min")
          )

**Descriptive Statistics**

Statistic	N	Mean	Pctl(75)	St. Dev.	Min

Rating	30	64.6	71.8	12.2	40
Complaints	30	66.6	77	13.3	37
Privileges	30	53.1	62.5	12.2	30

Transposing the table

Or we can transpose the table using flip=TRUE:

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics", digits=1, flip=TRUE)

**Descriptive Statistics**

Statistic	rating	complaints	privileges	learning	raises	critical	advance

N	30	30	30	30	30	30	30
Mean	64.6	66.6	53.1	56.4	64.6	74.8	42.9
St. Dev.	12.2	13.3	12.2	11.7	10.4	9.9	10.3
Min	40	37	30	34	43	49	25
Pctl(25)	58.8	58.5	45	47	58.2	69.2	35
Pctl(75)	71.8	77	62.5	66.8	71	80	47.8
Max	85	90	83	75	88	92	72

Correlation Matrix

Let’s see the correlation matrix:

correlation.matrix <- cor(attitude[,c("rating","complaints","privileges")])
correlation.matrix

##               rating complaints privileges
## rating     1.0000000  0.8254176  0.4261169
## complaints 0.8254176  1.0000000  0.5582882
## privileges 0.4261169  0.5582882  1.0000000

We can present it more cleanly with stargazer:

stargazer(correlation.matrix, header=FALSE, type="html", title="Correlation Matrix")

**Correlation Matrix**

	rating	complaints	privileges

rating	1	0.825	0.426
complaints	0.825	1	0.558
privileges	0.426	0.558	1

Regression Tables

Running our regressions

Let’s run two linear regressions and a logistic regression.

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data=attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude)
## create an indicator dependent variable, and run a logit model
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data=attitude,
                   family = binomial(link = "logit"))
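
The binary-outcome model above uses a logit link. If a probit model is wanted instead, only the link changes. This is a minimal sketch; the name probit.model is hypothetical and is not used elsewhere in this tutorial.

```r
data(attitude)
attitude$highrating <- (attitude$rating > 70)

# Same specification as logit.model, but with a probit link
probit.model <- glm(highrating ~ learning + critical + advance, data = attitude,
                    family = binomial(link = "probit"))

# Coefficient table for the probit fit
summary(probit.model)$coefficients
```

Logit and probit coefficients are on different scales, so their magnitudes should not be compared directly.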

Extracting our regression equations

install.packages("remotes")
remotes::install_github("datalorax/equatiomatic")

We then load the packages:

library(tidyverse)
library(equatiomatic)

Our first linear model has the following form:

extract_eq(linear.1)

\[ \operatorname{rating} = \alpha + \beta_{1}(\operatorname{complaints}) + \beta_{2}(\operatorname{privileges}) + \beta_{3}(\operatorname{learning}) + \beta_{4}(\operatorname{raises}) + \beta_{5}(\operatorname{critical}) + \epsilon \]

The model estimates the coefficients as:

extract_eq(linear.1, use_coefs = TRUE, coef_digits = 3)

\[ \operatorname{rating} = 11.011 + 0.692(\operatorname{complaints}) - 0.104(\operatorname{privileges}) + 0.249(\operatorname{learning}) - 0.033(\operatorname{raises}) + 0.015(\operatorname{critical}) + \epsilon \]

Our logistic regression model is as follows:

extract_eq(logit.model)

\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = \alpha + \beta_{1}(\operatorname{learning}) + \beta_{2}(\operatorname{critical}) + \beta_{3}(\operatorname{advance}) + \epsilon \]

The model estimates the coefficients as:

extract_eq(logit.model, use_coefs = TRUE, coef_digits = 3)

\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = -13.226 + 0.279(\operatorname{learning}) + 0.001(\operatorname{critical}) - 0.097(\operatorname{advance}) + \epsilon \]
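
The left-hand side of this equation is a log-odds, so turning a fitted value into a probability takes one more step. The sketch below does that for a single case; the covariate values in new.case are hypothetical and chosen only for illustration.

```r
data(attitude)
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data = attitude,
                   family = binomial(link = "logit"))

# Hypothetical covariate values, chosen only for illustration
new.case <- data.frame(learning = 60, critical = 75, advance = 40)

# Linear predictor (log-odds), then the inverse-logit transform by hand
log.odds <- predict(logit.model, newdata = new.case, type = "link")
prob.by.hand <- plogis(log.odds)

# predict() can also return the probability directly
prob.direct <- predict(logit.model, newdata = new.case, type = "response")

c(log.odds = unname(log.odds), probability = unname(prob.direct))
```

Both routes give the same probability; plogis() is simply the inverse of the logit link.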

But we want to see all coefficients in one table. Let’s build our nice table:

Presenting our regression coefficients in a table

stargazer(linear.1, linear.2, logit.model,header=FALSE,title="My Nice Regression Table", 
          type='html',digits=2)

**My Nice Regression Table**

	Dependent variable:

	rating		highrating
	OLS		logistic
	(1)	(2)	(3)

complaints	0.69^***	0.68^***
	(0.15)	(0.13)

privileges	-0.10	-0.10
	(0.13)	(0.13)

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

raises	-0.03
	(0.20)

critical	0.02		0.001
	(0.15)		(0.08)

advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01

Customizing the table

We can customize the table further. dep.var.caption replaces the "Dependent variable:" caption, dep.var.labels.include = FALSE removes the dependent variable names, model.names = FALSE removes the model names (OLS, logistic), model.numbers = FALSE removes the model numbers, and column.labels = c() renames the columns.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1, 1)
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

complaints	0.69^***	0.68^***
	(0.15)	(0.13)

privileges	-0.10	-0.10
	(0.13)	(0.13)

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

raises	-0.03
	(0.20)

critical	0.02		0.001
	(0.15)		(0.08)

advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01

Combining column labels

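We can let one column label span several models with column.separate = c(). Each number says how many consecutive columns the corresponding label covers: here "Good" spans the first two models and "Awesome" the third.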

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Awesome"),
          column.separate = c(2, 1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance")
          )

**My Nice Regression Table**

	A better caption

	Good		Awesome

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)

Privileges	-0.10	-0.10
	(0.13)	(0.13)

Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Raises	-0.03
	(0.20)

Critical	0.02		0.001
	(0.15)		(0.08)

Advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01

Different journal styles:

We can use different journal formats. Let’s see the American Political Science Review style for our table using style="apsr":

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="apsr"
          )

**My Nice Regression Table**

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)
Privileges	-0.10	-0.10
	(0.13)	(0.13)
Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)
Raises	-0.03
	(0.20)
Critical	0.02		0.001
	(0.15)		(0.08)
Advance			-0.10
			(0.08)
Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)
N	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)
AIC			26.43

*p < .1; **p < .05; ***p < .01

What about the American Journal of Political Science? Use style="ajps":

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="ajps"
          )

**My Nice Regression Table**

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)
Privileges	-0.10	-0.10
	(0.13)	(0.13)
Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)
Raises	-0.03
	(0.20)
Critical	0.02		0.001
	(0.15)		(0.08)
Advance			-0.10
			(0.08)
Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)
N	30	30	30
R-squared	0.72	0.72
Adj. R-squared	0.66	0.68
Log Likelihood			-9.21
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)
AIC			26.43

***p < .01; **p < .05; *p < .1

Adding a line below our reported coefficients

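Let’s add our own rows below the coefficients with add.lines. It takes a list of character vectors, one per row: the first element of each vector is the row label and the remaining elements fill the model columns. For example, add.lines = list(c("Hello", "This", "is", "NYUAD"), c("Yes", "I", "confirm", "0.006")):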

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "is", "NYUAD"),c("Yes", "I", "confirm", "0.006"))
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.15)	(0.13)

Privileges	-0.10	-0.10
	(0.13)	(0.13)

Learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Raises	-0.03
	(0.20)

Critical	0.02		0.001
	(0.15)		(0.08)

Advance			-0.10
			(0.08)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Hello	This	is	NYUAD
Yes	I	confirm	0.006
Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01
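
The extra rows do not have to be typed by hand. As a minimal sketch, the row below is built from values computed in R; BIC is just an illustrative choice, and bic.row and table.text are hypothetical names.

```r
library(stargazer)
data(attitude)
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

# One extra row: a label followed by one formatted value per model
bic.row <- c("BIC", sprintf("%.2f", c(BIC(linear.1), BIC(linear.2))))

# type = "text" is used here only so the result can be read in the console
table.text <- capture.output(
  stargazer(linear.1, linear.2, type = "text", digits = 2,
            add.lines = list(bic.row))
)
cat(table.text, sep = "\n")
```

Formatting the numbers with sprintf() first keeps the row as a character vector, which is what add.lines expects.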

Showing the confidence intervals rather than standard errors

Alternatively, you can present confidence intervals instead of standard errors by adding ci = TRUE and ci.level = 0.95:

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "0.007", "James Bond")),
          ci = TRUE,ci.level = 0.95
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

Complaints	0.69^***	0.68^***
	(0.40, 0.98)	(0.43, 0.93)

Privileges	-0.10	-0.10
	(-0.37, 0.16)	(-0.36, 0.15)

Learning	0.25	0.24^*	0.28^***
	(-0.06, 0.56)	(-0.04, 0.51)	(0.08, 0.48)

Raises	-0.03
	(-0.43, 0.36)

Critical	0.02		0.001
	(-0.27, 0.30)		(-0.16, 0.16)

Advance			-0.10
			(-0.24, 0.05)

Constant	11.01	11.26	-13.23^**
	(-11.93, 33.95)	(-3.09, 25.60)	(-26.15, -0.30)


Hello	This	0.007	James Bond
Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01
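
The intervals reported above look consistent with a normal approximation, that is, the estimate plus or minus about 1.96 standard errors. That is an inference from the numbers in the table, not a statement about stargazer's internals. The sketch below recomputes them for the first model under that assumption; normal.ci is a hypothetical name.

```r
data(attitude)
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data = attitude)

# Estimates and standard errors from the model summary
coefs <- coef(summary(linear.1))
est <- coefs[, "Estimate"]
se  <- coefs[, "Std. Error"]

# Normal-approximation interval: estimate +/- z * standard error
z <- qnorm(1 - (1 - 0.95) / 2)   # about 1.96 for a 95% interval
normal.ci <- cbind(lower = est - z * se, upper = est + z * se)
round(normal.ci, 2)

# For comparison, confint() on an lm object uses t quantiles,
# so its intervals are slightly wider in a sample of 30
round(confint(linear.1, level = 0.95), 2)
```

If you prefer the t-based intervals, it is worth checking which version your target journal expects before reporting them.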

Omitting some variables from the table

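You can omit variables from the table with omit = c("varname1","varname2"). The entries are treated as regular expressions, so a name also matches any variable that contains it.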

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          add.lines = list(c("Control Variables", "No", "Yes", "Yes"),c("World Wars Included", "Yes", "No", "No")),
          omit=c("complaints","privileges","raises","critical","advance")
          )

**My Nice Regression Table**

	A better caption

	Good	Better	Best

learning	0.25	0.24^*	0.28^***
	(0.16)	(0.14)	(0.10)

Constant	11.01	11.26	-13.23^**
	(11.70)	(7.32)	(6.60)


Control Variables	No	Yes	Yes
World Wars Included	Yes	No	No
Observations	30	30	30
R²	0.72	0.72
Adjusted R²	0.66	0.68
Log Likelihood			-9.21
Akaike Inf. Crit.			26.43
Residual Std. Error	7.14 (df = 24)	6.86 (df = 26)
F Statistic	12.06^*** (df = 5; 24)	21.74^*** (df = 3; 26)

Note:	*p<0.1; **p<0.05; ***p<0.01
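
Because omit accepts regular expressions, several variables can be dropped with a single pattern. This is a minimal sketch using alternation; table.text is a hypothetical name, and type = "text" is used only so the result can be read in the console.

```r
library(stargazer)
data(attitude)
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

# One pattern with alternation instead of a vector of separate names
table.text <- capture.output(
  stargazer(linear.1, linear.2, type = "text", digits = 2,
            omit = "complaints|privileges|raises|critical")
)
cat(table.text, sep = "\n")
```

Only learning and the constant remain in the coefficient block; the model statistics at the bottom still describe the full models.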

`gt` Package

If the canned options in stargazer do not meet your needs, you may consider gt. Even though it currently supports only HTML output, it might be the right tool in the long run, as extensions to PDF and RTF are on the way.

install.packages("gt")

library(gt)
library(tidyverse)
library(glue)

# Define the start and end dates for the data range
start_date <- "2010-06-07"
end_date <- "2010-06-14"

# Create a gt table based on preprocessed
# `sp500` table data
sp500 %>%
  dplyr::filter(date >= start_date & date <= end_date) %>%
  dplyr::select(-adj_close) %>%
  gt() %>%
  tab_header(
    title = "S&P 500",
    subtitle = glue::glue("{start_date} to {end_date}")
  ) %>%
  # Columns are given as bare names; vars() is deprecated in recent gt releases
  fmt_date(
    columns = date,
    date_style = 3
  ) %>%
  fmt_currency(
    columns = c(open, high, low, close),
    currency = "USD"
  ) %>%
  fmt_number(
    columns = volume,
    suffixing = TRUE
  )

S&P 500
2010-06-07 to 2010-06-14
date	open	high	low	close	volume
Mon, Jun 14, 2010	$1,095.00	$1,105.91	$1,089.03	$1,089.63	4.43B
Fri, Jun 11, 2010	$1,082.65	$1,092.25	$1,077.12	$1,091.60	4.06B
Thu, Jun 10, 2010	$1,058.77	$1,087.85	$1,058.77	$1,086.84	5.14B
Wed, Jun 9, 2010	$1,062.75	$1,077.74	$1,052.25	$1,055.69	5.98B
Tue, Jun 8, 2010	$1,050.81	$1,063.15	$1,042.17	$1,062.00	6.19B
Mon, Jun 7, 2010	$1,065.84	$1,071.36	$1,049.86	$1,050.47	5.47B

Publication Quality Outputs with `stargazer`

Ömer Faruk Örsün | omerorsun@nyu.edu
