stargazer
stargazer is an R package that creates HTML (or LaTeX) code for beautiful tables. You can use R to run your analysis and use stargazer output in your R Markdown code directly. No copy, no paste!
We will also use the equatiomatic package to extract the equations we use in our models.
Let’s first install stargazer.
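A minimal sketch of the installation step, assuming the package is fetched from CRAN with the standard install.packages() call:

```r
# Install stargazer only if it is not already available, then load it
if (!requireNamespace("stargazer", quietly = TRUE)) {
  install.packages("stargazer")
}
library(stargazer)
```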
When a chunk produces html or latex code, it should include results='asis'. This option tells knitr to pass the chunk's text output through “as is” instead of wrapping it in a verbatim code block. Otherwise, instead of your table, you will see the raw html or latex code.
#```{r, results='asis', echo = TRUE, eval=TRUE, warning=FALSE, message=FALSE}
stargazer(attitude[1:5,], header=FALSE, type='html', summary=FALSE, title="Data Frame",digits=1)
| | rating | complaints | privileges | learning | raises | critical | advance |
| 1 | 43 | 51 | 30 | 39 | 61 | 92 | 45 |
| 2 | 63 | 64 | 51 | 54 | 63 | 73 | 47 |
| 3 | 71 | 70 | 68 | 69 | 76 | 86 | 48 |
| 4 | 61 | 63 | 45 | 47 | 54 | 84 | 35 |
| 5 | 81 | 78 | 56 | 66 | 71 | 83 | 47 |
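While drafting outside a knitted document, the same call can be previewed in the console by switching the output type. This is a small sketch, assuming only that stargazer is loaded; type = "text" asks for a plain ASCII table, so results='asis' is not needed for it.

```r
library(stargazer)

# Same data frame table as above, printed as plain text for a quick look
stargazer(attitude[1:5, ], type = "text", summary = FALSE,
          title = "Data Frame", digits = 1)
```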
We can show the dataframe:
| | rating | complaints | privileges | learning | raises | critical | advance |
| 1 | 43 | 51 | 30 | 39 | 61 | 92 | 45 |
| 2 | 63 | 64 | 51 | 54 | 63 | 73 | 47 |
| 3 | 71 | 70 | 68 | 69 | 76 | 86 | 48 |
| 4 | 61 | 63 | 45 | 47 | 54 | 84 | 35 |
| 5 | 81 | 78 | 56 | 66 | 71 | 83 | 47 |
or most commonly descriptive statistics:
| Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Pctl(75) | Max |
| rating | 30 | 64.6 | 12.2 | 40 | 58.8 | 71.8 | 85 |
| complaints | 30 | 66.6 | 13.3 | 37 | 58.5 | 77 | 90 |
| privileges | 30 | 53.1 | 12.2 | 30 | 45 | 62.5 | 83 |
| learning | 30 | 56.4 | 11.7 | 34 | 47 | 66.8 | 75 |
| raises | 30 | 64.6 | 10.4 | 43 | 58.2 | 71 | 88 |
| critical | 30 | 74.8 | 9.9 | 49 | 69.2 | 80 | 92 |
| advance | 30 | 42.9 | 10.3 | 25 | 35 | 47.8 | 72 |
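The call behind the descriptive statistics table is not shown above. A sketch that should produce a table of this shape, assuming the same header, type and digits settings used elsewhere on this page:

```r
library(stargazer)

# Passing a whole data frame (with the default summary = TRUE)
# yields one row of summary statistics per variable
stargazer(attitude, header = FALSE, type = "html",
          title = "Descriptive Statistics", digits = 1)
```

In an R Markdown chunk this needs results='asis', as explained earlier.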
We can replace the labels.
stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html',
title="Descriptive Statistics", digits=1,
covariate.labels=c("Rating","Complaints","Privileges")
)
| Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Pctl(75) | Max |
| Rating | 30 | 64.6 | 12.2 | 40 | 58.8 | 71.8 | 85 |
| Complaints | 30 | 66.6 | 13.3 | 37 | 58.5 | 77 | 90 |
| Privileges | 30 | 53.1 | 12.2 | 30 | 45 | 62.5 | 83 |
We can decide on what statistics to present:
stargazer(attitude[c("rating","complaints","privileges")], header=FALSE, type='html',
title="Descriptive Statistics", digits=1,
covariate.labels=c("Rating","Complaints","Privileges"),
summary.stat=c("n","mean","p75","sd","min")
)
| Statistic | N | Mean | Pctl(75) | St. Dev. | Min |
| Rating | 30 | 64.6 | 71.8 | 12.2 | 40 |
| Complaints | 30 | 66.6 | 77 | 13.3 | 37 |
| Privileges | 30 | 53.1 | 62.5 | 12.2 | 30 |
or we can transpose the table using flip=TRUE:
| Statistic | rating | complaints | privileges | learning | raises | critical | advance |
| N | 30 | 30 | 30 | 30 | 30 | 30 | 30 |
| Mean | 64.6 | 66.6 | 53.1 | 56.4 | 64.6 | 74.8 | 42.9 |
| St. Dev. | 12.2 | 13.3 | 12.2 | 11.7 | 10.4 | 9.9 | 10.3 |
| Min | 40 | 37 | 30 | 34 | 43 | 49 | 25 |
| Pctl(25) | 58.8 | 58.5 | 45 | 47 | 58.2 | 69.2 | 35 |
| Pctl(75) | 71.8 | 77 | 62.5 | 66.8 | 71 | 80 | 47.8 |
| Max | 85 | 90 | 83 | 75 | 88 | 92 | 72 |
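The transposed table above comes without its code. A sketch of a call that flips the summary table, assuming the full attitude data frame and the same formatting options as before:

```r
library(stargazer)

# flip = TRUE puts the variables in columns and the statistics in rows
stargazer(attitude, header = FALSE, type = "html",
          title = "Descriptive Statistics", digits = 1,
          flip = TRUE)
```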
Let’s see the correlation matrix:
##               rating complaints privileges
## rating     1.0000000  0.8254176  0.4261169
## complaints 0.8254176  1.0000000  0.5582882
## privileges 0.4261169  0.5582882  1.0000000
We can visualize it better:
| | rating | complaints | privileges |
| rating | 1 | 0.825 | 0.426 |
| complaints | 0.825 | 1 | 0.558 |
| privileges | 0.426 | 0.558 | 1 |
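Both correlation outputs above can be reproduced with a short sketch. The object name correlation.matrix is an assumption for illustration; cor() builds the matrix and stargazer prints any matrix it is given as a table.

```r
library(stargazer)

# Pairwise Pearson correlations for three of the attitude variables
correlation.matrix <- cor(attitude[, c("rating", "complaints", "privileges")])
correlation.matrix

# The same matrix as a formatted table (use results='asis' in the chunk)
stargazer(correlation.matrix, header = FALSE, type = "html",
          title = "Correlation Matrix")
```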
Let’s run two linear regressions and a logistic regression.
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical, data=attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude)
## create an indicator dependent variable, and run a logit model
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data=attitude,
family = binomial(link = "logit"))
We basically ran this model:
\[ \operatorname{rating} = \alpha + \beta_{1}(\operatorname{complaints}) + \beta_{2}(\operatorname{privileges}) + \beta_{3}(\operatorname{learning}) + \beta_{4}(\operatorname{raises}) + \beta_{5}(\operatorname{critical}) + \epsilon \]
The model estimates the coefficients as
\[ \operatorname{rating} = 11.011 + 0.692(\operatorname{complaints}) - 0.104(\operatorname{privileges}) + 0.249(\operatorname{learning}) - 0.033(\operatorname{raises}) + 0.015(\operatorname{critical}) + \epsilon \]
Our logistic regression model is as follows:
\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = \alpha + \beta_{1}(\operatorname{learning}) + \beta_{2}(\operatorname{critical}) + \beta_{3}(\operatorname{advance}) \]
The model estimates the coefficients as
\[ \log\left[ \frac { P( \operatorname{highrating} = \operatorname{TRUE} ) }{ 1 - P( \operatorname{highrating} = \operatorname{TRUE} ) } \right] = -13.226 + 0.279(\operatorname{learning}) + 0.001(\operatorname{critical}) - 0.097(\operatorname{advance}) \]
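The equations above are the kind of output equatiomatic produces. A sketch of the calls, assuming the extract_eq() function with its use_coefs and coef_digits arguments; the exact LaTeX (for example, whether an error term is printed) can differ between package versions.

```r
library(equatiomatic)

# Refit two of the models from the text so the block runs on its own
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
attitude$highrating <- (attitude$rating > 70)
logit.model <- glm(highrating ~ learning + critical + advance, data = attitude,
                   family = binomial(link = "logit"))

# Theoretical form with Greek-letter coefficients
extract_eq(linear.1)
extract_eq(logit.model)

# Fitted form with the estimates plugged in, rounded to three decimals
extract_eq(linear.1, use_coefs = TRUE, coef_digits = 3)
extract_eq(logit.model, use_coefs = TRUE, coef_digits = 3)
```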
But we want to see all coefficients in one table. Let’s build our nice table:
stargazer(linear.1, linear.2, logit.model,header=FALSE,title="My Nice Regression Table",
type='html',digits=2)
| | Dependent variable: | | |
| | rating | | highrating |
| | OLS | | logistic |
| | (1) | (2) | (3) |
| complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| raises | -0.03 | | |
| | (0.20) | | |
| critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
We can further customize it: dep.var.labels.include = FALSE removes the dependent variable names, model.names = FALSE removes the model names, model.numbers = FALSE removes the model numbers, and column.labels = c() renames the columns.
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1, 1)
)
| | A better caption | | |
| | Good | Better | Best |
| complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| raises | -0.03 | | |
| | (0.20) | | |
| critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
We can make one label span several columns with column.separate = c(). Here c(2, 1) puts "Good" over the first two models and "Awesome" over the third.
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Awesome"),
column.separate = c(2, 1),
covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance")
)
| | A better caption | | |
| | Good | | Awesome |
| Complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| Privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| Learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| Raises | -0.03 | | |
| | (0.20) | | |
| Critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| Advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
We can use different journal formats. Let’s see the American Political Science Review style for our table using style="apsr".
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1,1),
covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
style="apsr"
)
| | Good | Better | Best |
| Complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| Privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| Learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| Raises | -0.03 | | |
| | (0.20) | | |
| Critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| Advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| N | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| AIC | | | 26.43 |
| *p < .1; **p < .05; ***p < .01 | | | |
What about the American Journal of Political Science? Use style="ajps".
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1,1),
covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
style="ajps"
)
| | Good | Better | Best |
| Complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| Privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| Learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| Raises | -0.03 | | |
| | (0.20) | | |
| Critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| Advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| N | 30 | 30 | 30 |
| R-squared | 0.72 | 0.72 | |
| Adj. R-squared | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| AIC | | | 26.43 |
| ***p < .01; **p < .05; *p < .1 | | | |
Let’s further customize the statistics reported below the coefficients. For this, we use add.lines = list(c("Hello", "This", "is", "NYUAD"),c("Yes", "I", "confirm", "0.006")). Each vector is one row: a label followed by one entry per model.
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1,1),
covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
add.lines = list(c("Hello", "This", "is", "NYUAD"),c("Yes", "I", "confirm", "0.006"))
)
| | A better caption | | |
| | Good | Better | Best |
| Complaints | 0.69*** | 0.68*** | |
| | (0.15) | (0.13) | |
| Privileges | -0.10 | -0.10 | |
| | (0.13) | (0.13) | |
| Learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| Raises | -0.03 | | |
| | (0.20) | | |
| Critical | 0.02 | | 0.001 |
| | (0.15) | | (0.08) |
| Advance | | | -0.10 |
| | | | (0.08) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| Hello | This | is | NYUAD |
| Yes | I | confirm | 0.006 |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
You can alternatively present confidence intervals instead of standard errors by adding ci = TRUE, ci.level = 0.95.
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1,1),
covariate.labels=c("Complaints","Privileges","Learning", "Raises","Critical", "Advance"),
add.lines = list(c("Hello", "This", "0.007", "James Bond")),
ci = TRUE,ci.level = 0.95
)
| | A better caption | | |
| | Good | Better | Best |
| Complaints | 0.69*** | 0.68*** | |
| | (0.40, 0.98) | (0.43, 0.93) | |
| Privileges | -0.10 | -0.10 | |
| | (-0.37, 0.16) | (-0.36, 0.15) | |
| Learning | 0.25 | 0.24* | 0.28*** |
| | (-0.06, 0.56) | (-0.04, 0.51) | (0.08, 0.48) |
| Raises | -0.03 | | |
| | (-0.43, 0.36) | | |
| Critical | 0.02 | | 0.001 |
| | (-0.27, 0.30) | | (-0.16, 0.16) |
| Advance | | | -0.10 |
| | | | (-0.24, 0.05) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (-11.93, 33.95) | (-3.09, 25.60) | (-26.15, -0.30) |
| Hello | This | 0.007 | James Bond |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
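The intervals in this table appear to match a normal approximation, that is, estimate ± z × standard error with z ≈ 1.96 for a 95% level. The sketch below checks this by hand for the first linear model; the name ci.normal is hypothetical and chosen only for illustration.

```r
# Refit the first linear model from the text so the block runs on its own
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)

est <- coef(summary(linear.1))[, "Estimate"]
se  <- coef(summary(linear.1))[, "Std. Error"]

# Normal-approximation 95% interval: estimate +/- z * standard error
z <- qnorm(1 - (1 - 0.95) / 2)
ci.normal <- cbind(lower = est - z * se, upper = est + z * se)
round(ci.normal, 2)

# For comparison, confint() uses the t distribution and is slightly wider
round(confint(linear.1, level = 0.95), 2)
```

With only 30 observations the t-based intervals from confint() are noticeably wider, so it is worth stating in a table note which kind is reported.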
You can omit some variables from the table with omit = c("varname1","varname2").
stargazer(linear.1, linear.2, logit.model,header=FALSE,
title="My Nice Regression Table", type='html',digits=2,
dep.var.caption = "A better caption",
dep.var.labels.include = FALSE,
model.names = FALSE,
model.numbers = FALSE,
column.labels = c("Good", "Better","Best"),
column.separate = c(1,1,1),
add.lines = list(c("Control Variables", "No", "Yes", "Yes"),c("World Wars Included", "Yes", "No", "No")),
omit=c("complaints","privileges","raises","critical","advance")
)
| | A better caption | | |
| | Good | Better | Best |
| learning | 0.25 | 0.24* | 0.28*** |
| | (0.16) | (0.14) | (0.10) |
| Constant | 11.01 | 11.26 | -13.23** |
| | (11.70) | (7.32) | (6.60) |
| Control Variables | No | Yes | Yes |
| World Wars Included | Yes | No | No |
| Observations | 30 | 30 | 30 |
| R² | 0.72 | 0.72 | |
| Adjusted R² | 0.66 | 0.68 | |
| Log Likelihood | | | -9.21 |
| Akaike Inf. Crit. | | | 26.43 |
| Residual Std. Error | 7.14 (df = 24) | 6.86 (df = 26) | |
| F Statistic | 12.06*** (df = 5; 24) | 21.74*** (df = 3; 26) | |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
gt Package
If the canned options in stargazer do not meet your needs, you may consider gt as an option. Even though it currently supports only HTML format, it might be the right tool in the long run, as the extensions to PDF and RTF are on the way.
library(gt)
library(tidyverse)
library(glue)
# Define the start and end dates for the data range
start_date <- "2010-06-07"
end_date <- "2010-06-14"
# Create a gt table based on preprocessed
# `sp500` table data
sp500 %>%
  dplyr::filter(date >= start_date & date <= end_date) %>%
  dplyr::select(-adj_close) %>%
  gt() %>%
  tab_header(
    title = "S&P 500",
    subtitle = glue::glue("{start_date} to {end_date}")
  ) %>%
  # Columns are given as bare names; vars() is deprecated in recent gt releases
  fmt_date(
    columns = date,
    date_style = 3
  ) %>%
  fmt_currency(
    columns = c(open, high, low, close),
    currency = "USD"
  ) %>%
  fmt_number(
    columns = volume,
    suffixing = TRUE
  )
| S&P 500 | | | | | |
|---|---|---|---|---|---|
| 2010-06-07 to 2010-06-14 | | | | | |
| date | open | high | low | close | volume |
| Mon, Jun 14, 2010 | $1,095.00 | $1,105.91 | $1,089.03 | $1,089.63 | 4.43B |
| Fri, Jun 11, 2010 | $1,082.65 | $1,092.25 | $1,077.12 | $1,091.60 | 4.06B |
| Thu, Jun 10, 2010 | $1,058.77 | $1,087.85 | $1,058.77 | $1,086.84 | 5.14B |
| Wed, Jun 9, 2010 | $1,062.75 | $1,077.74 | $1,052.25 | $1,055.69 | 5.98B |
| Tue, Jun 8, 2010 | $1,050.81 | $1,063.15 | $1,042.17 | $1,062.00 | 6.19B |
| Mon, Jun 7, 2010 | $1,065.84 | $1,071.36 | $1,049.86 | $1,050.47 | 5.47B |