stargazer is an R package that creates HTML (or LaTeX) code for beautiful tables. You can run your analysis in R and use the stargazer output directly in your R Markdown document. No copy, no paste!

We will also use the equatiomatic package to extract the equations of our models.

Creating Tables

Installation

Let’s first install stargazer.

install.packages("stargazer")
library(stargazer)
data(attitude)

Descriptive Statistics and Dataframe

When a chunk produces HTML or LaTeX code, its header should include results='asis'. This option tells knitr to write the output into the document "as is" instead of wrapping it in a verbatim code block. Otherwise, instead of your table, you will see the raw HTML or LaTeX code.

#```{r, results='asis', echo = TRUE, eval=TRUE, warning=FALSE, message=FALSE}

We can show the dataframe:

stargazer(attitude[1:5,], header=FALSE, type='html', summary=FALSE, title="Data Frame",digits=1)
Data Frame
    rating       complaints   privileges   learning     raises       critical     advance
1   43           51           30           39           61           92           45
2   63           64           51           54           63           73           47
3   71           70           68           69           76           86           48
4   61           63           45           47           54           84           35
5   81           78           56           66           71           83           47

Or, more commonly, we can show descriptive statistics:

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics",digits=1)
Descriptive Statistics
Statistic    N          Mean       St. Dev.   Min        Pctl(25)   Pctl(75)   Max
rating       30         64.6       12.2       40         58.8       71.8       85
complaints   30         66.6       13.3       37         58.5       77         90
privileges   30         53.1       12.2       30         45         62.5       83
learning     30         56.4       11.7       34         47         66.8       75
raises       30         64.6       10.4       43         58.2       71         88
critical     30         74.8       9.9        49         69.2       80         92
advance      30         42.9       10.3       25         35         47.8       72
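
The calls in this section hard-code type='html', which only suits HTML output. As a minimal sketch (not part of the original workflow), the table type could instead be picked from the format being knitted with knitr::is_latex_output(). The variable name table.type is our own, and the chunk still needs results='asis'.

```r
library(stargazer)
library(knitr)

data(attitude)

# Hypothetical helper variable: "latex" when knitting to PDF, "html" otherwise
# (outside of knitting, is_latex_output() returns FALSE, so this gives "html")
table.type <- if (knitr::is_latex_output()) "latex" else "html"

stargazer(attitude, header = FALSE, type = table.type,
          title = "Descriptive Statistics", digits = 1)
```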

Replace Labels

We can replace the variable labels with covariate.labels.

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges")
          )
Descriptive Statistics
Statistic    N          Mean       St. Dev.   Min        Pctl(25)   Pctl(75)   Max
Rating       30         64.6       12.2       40         58.8       71.8       85
Complaints   30         66.6       13.3       37         58.5       77         90
Privileges   30         53.1       12.2       30         45         62.5       83

Choosing the statistics to present

We can decide on what statistics to present:

stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html', 
          title="Descriptive Statistics", digits=1,
          covariate.labels=c("Rating","Complaints","Privileges"),
          summary.stat=c("n","mean","p75","sd","min")
          )
Descriptive Statistics
Statistic    N          Mean       Pctl(75)   St. Dev.   Min
Rating       30         64.6       71.8       12.2       40
Complaints   30         66.6       77         13.3       37
Privileges   30         53.1       62.5       12.2       30
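
To see what the summary.stat codes stand for, the same numbers can be reproduced in base R. This is only a sketch for checking the table: the object names are ours, and we assume quantile()'s default method matches the percentiles stargazer reports (it appears to for these data).

```r
data(attitude)
vars <- c("rating", "complaints", "privileges")

# One column per variable, one row per statistic,
# in the same order as summary.stat = c("n", "mean", "p75", "sd", "min")
stats <- sapply(attitude[vars], function(x) {
  c(n    = length(x),
    mean = mean(x),
    p75  = unname(quantile(x, 0.75)),
    sd   = sd(x),
    min  = min(x))
})

# Transpose so that variables are rows, as in the stargazer table
round(t(stats), 1)
```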

Transposing the table

Or we can transpose the table using flip=TRUE:

stargazer(attitude, header=FALSE, type='html', title="Descriptive Statistics", digits=1, flip=TRUE)
Descriptive Statistics
Statistic    rating       complaints   privileges   learning     raises       critical     advance
N            30           30           30           30           30           30           30
Mean         64.6         66.6         53.1         56.4         64.6         74.8         42.9
St. Dev.     12.2         13.3         12.2         11.7         10.4         9.9          10.3
Min          40           37           30           34           43           49           25
Pctl(25)     58.8         58.5         45           47           58.2         69.2         35
Pctl(75)     71.8         77           62.5         66.8         71           80           47.8
Max          85           90           83           75           88           92           72

Correlation Matrix

Let’s see the correlation matrix:

correlation.matrix <- cor(attitude[,c("rating","complaints","privileges")])
correlation.matrix
##               rating complaints privileges
## rating     1.0000000  0.8254176  0.4261169
## complaints 0.8254176  1.0000000  0.5582882
## privileges 0.4261169  0.5582882  1.0000000

stargazer can present it as a cleaner table:

stargazer(correlation.matrix, header=FALSE, type="html", title="Correlation Matrix")
Correlation Matrix
             rating       complaints   privileges
rating       1            0.825        0.426
complaints   0.825        1            0.558
privileges   0.426        0.558        1

Regression Tables

Running our regressions

Let’s run two linear regressions and a logistic regression.

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data=attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude)
## create an indicator dependent variable, and run a logit model
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data=attitude,
family = binomial(link = "logit"))

Extracting our regression equations

First, install equatiomatic from GitHub (this only needs to be done once):

install.packages("remotes")
remotes::install_github("datalorax/equatiomatic")

Then load the packages:

library(tidyverse)
library(equatiomatic)

The first linear model has the following form:

extract_eq(linear.1)

\[ \operatorname{rating} = \alpha + \beta_{1}(\operatorname{complaints}) + \beta_{2}(\operatorname{privileges}) + \beta_{3}(\operatorname{learning}) + \beta_{4}(\operatorname{raises}) + \beta_{5}(\operatorname{critical}) + \epsilon \]

The model estimates the coefficients as

extract_eq(linear.1, use_coefs = TRUE, coef_digits = 3)

\[ \operatorname{rating} = 11.011 + 0.692(\operatorname{complaints}) - 0.104(\operatorname{privileges}) + 0.249(\operatorname{learning}) - 0.033(\operatorname{raises}) + 0.015(\operatorname{critical}) + \epsilon \]

Our logistic regression model is as follows:

extract_eq(logit.model)

\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = \alpha + \beta_{1}(\operatorname{learning}) + \beta_{2}(\operatorname{critical}) + \beta_{3}(\operatorname{advance}) + \epsilon \]

The model estimates the coefficients as

extract_eq(logit.model, use_coefs = TRUE, coef_digits = 3)

\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = -13.226 + 0.279(\operatorname{learning}) + 0.001(\operatorname{critical}) - 0.097(\operatorname{advance}) + \epsilon \]

But we want to see all coefficients in one table. Let’s build our nice table:

Presenting our regression coefficients in a table

stargazer(linear.1, linear.2, logit.model,header=FALSE,title="My Nice Regression Table", 
          type='html',digits=2)
My Nice Regression Table
                        Dependent variable:
                        rating                                          highrating
                        OLS                                             logistic
                        (1)                     (2)                     (3)
complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
raises                  -0.03
                        (0.20)
critical                0.02                                            0.001
                        (0.15)                                          (0.08)
advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01
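
While drafting, it can be handy to preview a table in the console before knitting. A minimal sketch using type = "text", which stargazer supports alongside "html" and "latex". The two linear models are refit here only so that the snippet runs on its own.

```r
library(stargazer)
data(attitude)

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

# type = "text" prints a plain-text table to the console instead of HTML code
stargazer(linear.1, linear.2, type = "text", digits = 2,
          title = "My Nice Regression Table")
```

Once the layout looks right, switch back to type='html' (with results='asis') for the knitted document.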

Customizing the table

We can customize the table further. dep.var.caption replaces the "Dependent variable:" caption, dep.var.labels.include = FALSE removes the dependent variable names, model.names = FALSE removes the model names, model.numbers = FALSE removes the model numbers, and column.labels = c() renames the columns.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1, 1)
          )
My Nice Regression Table
                        A better caption
                        Good                    Better                  Best
complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
raises                  -0.03
                        (0.20)
critical                0.02                                            0.001
                        (0.15)                                          (0.08)
advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01

Combining column labels

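We can make one label span several columns with column.separate = c(). Here c(2, 1) places "Good" over the first two models and "Awesome" over the third. We also rename the variables with covariate.labels.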

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Awesome"),
          column.separate = c(2, 1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance")
          )
My Nice Regression Table
                        A better caption
                        Good                                            Awesome
Complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
Privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
Learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
Raises                  -0.03
                        (0.20)
Critical                0.02                                            0.001
                        (0.15)                                          (0.08)
Advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01
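
One way to read column.separate is that each label is repeated over the given number of adjacent models. The base R sketch below only illustrates that mapping with rep(); it is not a description of how stargazer works internally, and label.per.model is a name we made up.

```r
column.labels   <- c("Good", "Awesome")
column.separate <- c(2, 1)

# The label that ends up above each of the three models:
# "Good" for models 1 and 2, "Awesome" for model 3
label.per.model <- rep(column.labels, times = column.separate)
label.per.model
```

The counts in column.separate should add up to the number of models in the table, here 2 + 1 = 3.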

Different journal styles:

We can use different journal formats. Let’s see the American Political Science Review style for our table using style="apsr".

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="apsr"
          )
My Nice Regression Table
                        Good                    Better                  Best
Complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
Privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
Learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
Raises                  -0.03
                        (0.20)
Critical                0.02                                            0.001
                        (0.15)                                          (0.08)
Advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
N                       30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
AIC                                                                     26.43
*p < .1; **p < .05; ***p < .01

What about the American Journal of Political Science? Use style="ajps".

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          style="ajps"
          )
My Nice Regression Table
                        Good                    Better                  Best
Complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
Privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
Learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
Raises                  -0.03
                        (0.20)
Critical                0.02                                            0.001
                        (0.15)                                          (0.08)
Advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
N                       30                      30                      30
R-squared               0.72                    0.72
Adj. R-squared          0.66                    0.68
Log Likelihood                                                          -9.21
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
AIC                                                                     26.43
***p < .01; **p < .05; *p < .1

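Adding a line below our reported coefficients

Let’s further customize the statistics reported below the coefficients. For this, we use add.lines = list(c("Hello", "This", "is", "NYUAD"), c("Yes", "I", "confirm", "0.006")). Each vector is one row: the first element is the row label and the remaining elements fill the model columns.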

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "is", "NYUAD"),c("Yes", "I", "confirm", "0.006"))
          )
My Nice Regression Table
                        A better caption
                        Good                    Better                  Best
Complaints              0.69***                 0.68***
                        (0.15)                  (0.13)
Privileges              -0.10                   -0.10
                        (0.13)                  (0.13)
Learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
Raises                  -0.03
                        (0.20)
Critical                0.02                                            0.001
                        (0.15)                                          (0.08)
Advance                                                                 -0.10
                                                                        (0.08)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
Hello                   This                    is                      NYUAD
Yes                     I                       confirm                 0.006
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01
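
The rows passed to add.lines do not have to be typed by hand. As a sketch, a row can be built from the fitted models themselves. The row label "Predictors" and the names models and predictors.row are ours, and the models are refit so that the snippet is self-contained.

```r
library(stargazer)
data(attitude)

linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data = attitude,
                   family = binomial(link = "logit"))

models <- list(linear.1, linear.2, logit.model)

# First element is the row label, then one value per model:
# the number of coefficients minus the intercept
predictors.row <- c("Predictors",
                    sapply(models, function(m) length(coef(m)) - 1))

stargazer(linear.1, linear.2, logit.model, header = FALSE, type = "html",
          digits = 2, add.lines = list(predictors.row))
```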

Showing the confidence intervals rather than standard errors

Alternatively, you can present confidence intervals instead of standard errors by adding ci = TRUE, ci.level = 0.95.

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
          add.lines = list(c("Hello", "This", "0.007", "James Bond")),
          ci = TRUE,ci.level = 0.95
          )
My Nice Regression Table
                        A better caption
                        Good                    Better                  Best
Complaints              0.69***                 0.68***
                        (0.40, 0.98)            (0.43, 0.93)
Privileges              -0.10                   -0.10
                        (-0.37, 0.16)           (-0.36, 0.15)
Learning                0.25                    0.24*                   0.28***
                        (-0.06, 0.56)           (-0.04, 0.51)           (0.08, 0.48)
Raises                  -0.03
                        (-0.43, 0.36)
Critical                0.02                                            0.001
                        (-0.27, 0.30)                                   (-0.16, 0.16)
Advance                                                                 -0.10
                                                                        (-0.24, 0.05)
Constant                11.01                   11.26                   -13.23**
                        (-11.93, 33.95)         (-3.09, 25.60)          (-26.15, -0.30)
Hello                   This                    0.007                   James Bond
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01
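
The intervals in this table look consistent with a normal approximation (estimate ± 1.96 × standard error) rather than the t-based intervals that confint() returns for lm objects. That reading is our inference from the numbers shown, not a statement from the original text. The sketch below computes both for the first model so the difference is visible.

```r
data(attitude)
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)

est <- coef(linear.1)
se  <- sqrt(diag(vcov(linear.1)))

# Normal approximation: estimate +/- z * standard error, z = qnorm(0.975)
z <- qnorm(0.975)
normal.ci <- cbind(lower = est - z * se, upper = est + z * se)

# t-based intervals (24 residual degrees of freedom), slightly wider
t.ci <- confint(linear.1, level = 0.95)

round(normal.ci, 2)
round(t.ci, 2)
```

With only 30 observations the two sets of intervals differ noticeably, so it is worth stating in the table note which one is reported.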

Omitting some variables from the table

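You can omit some variables from the table with omit = c("varname1","varname2"). The models themselves are unchanged; only the rows are hidden, so it is common to flag what was left out with add.lines.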

stargazer(linear.1, linear.2, logit.model,header=FALSE,
          title="My Nice Regression Table", type='html',digits=2,
          dep.var.caption  = "A better caption",
          dep.var.labels.include = FALSE,
          model.names = FALSE,
          model.numbers = FALSE,
          column.labels   = c("Good", "Better","Best"),
          column.separate = c(1,1,1),
          add.lines = list(c("Control Variables", "No", "Yes", "Yes"),c("World Wars Included", "Yes", "No", "No")),
          omit=c("complaints","privileges","raises","critical","advance")
          )
My Nice Regression Table
                        A better caption
                        Good                    Better                  Best
learning                0.25                    0.24*                   0.28***
                        (0.16)                  (0.14)                  (0.10)
Constant                11.01                   11.26                   -13.23**
                        (11.70)                 (7.32)                  (6.60)
Control Variables       No                      Yes                     Yes
World Wars Included     Yes                     No                      No
Observations            30                      30                      30
R²                      0.72                    0.72
Adjusted R²             0.66                    0.68
Log Likelihood                                                          -9.21
Akaike Inf. Crit.                                                       26.43
Residual Std. Error     7.14 (df = 24)          6.86 (df = 26)
F Statistic             12.06*** (df = 5; 24)   21.74*** (df = 3; 26)
Note:                   *p<0.1; **p<0.05; ***p<0.01
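
According to the stargazer documentation, the entries of omit are regular expressions, so a short pattern can hide more variables than intended. The base R sketch below uses grepl() to preview which coefficient names a pattern would hit. We assume stargazer's matching behaves like grepl() here, and coef.names is a name of our own.

```r
coef.names <- c("complaints", "privileges", "learning",
                "raises", "critical", "advance")

# A loose pattern matches every name that contains it:
# both "complaints" and "learning" contain "in"
loose <- coef.names[grepl("in", coef.names)]
loose

# Anchoring with ^ and $ restricts the match to the exact name
exact <- coef.names[grepl("^learning$", coef.names)]
exact
```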

gt Package

If the canned options in stargazer do not meet your needs, you may consider gt as an alternative. When this tutorial was written it supported only HTML output; more recent releases of gt can also write LaTeX and RTF, which makes it a good candidate for the long run.

install.packages("gt")
library(gt)
library(tidyverse)
library(glue)

# Define the start and end dates for the data range
start_date <- "2010-06-07"
end_date <- "2010-06-14"

# Create a gt table based on preprocessed
# `sp500` table data
sp500 %>%
  dplyr::filter(date >= start_date & date <= end_date) %>%
  dplyr::select(-adj_close) %>%
  gt() %>%
  tab_header(
    title = "S&P 500",
    subtitle = glue::glue("{start_date} to {end_date}")
  ) %>%
  # Recent gt versions select columns with tidyselect; vars() is deprecated
  fmt_date(
    columns = date,
    date_style = 3
  ) %>%
  fmt_currency(
    columns = c(open, high, low, close),
    currency = "USD"
  ) %>%
  fmt_number(
    columns = volume,
    suffixing = TRUE
  )
S&P 500
2010-06-07 to 2010-06-14
date                open        high        low         close       volume
Mon, Jun 14, 2010   $1,095.00   $1,105.91   $1,089.03   $1,089.63   4.43B
Fri, Jun 11, 2010   $1,082.65   $1,092.25   $1,077.12   $1,091.60   4.06B
Thu, Jun 10, 2010   $1,058.77   $1,087.85   $1,058.77   $1,086.84   5.14B
Wed, Jun 9, 2010    $1,062.75   $1,077.74   $1,052.25   $1,055.69   5.98B
Tue, Jun 8, 2010    $1,050.81   $1,063.15   $1,042.17   $1,062.00   6.19B
Mon, Jun 7, 2010    $1,065.84   $1,071.36   $1,049.86   $1,050.47   5.47B
Mon, Jun 7, 2010 $1,065.84 $1,071.36 $1,049.86 $1,050.47 5.47B