0.1 Package Comparison Overview

Package	Descriptive Stats	Regression Tables	Robust / HC SE	Custom GOF Control	Formatting Flexibility	LaTeX/HTML/Word	Best Use Case
modelsummary	Excellent	Excellent	Excellent	Very Good	High	Excellent	Default modern workflow
texreg	Limited	Excellent	Excellent	Excellent	Moderate	Excellent	Manual control of regression output
huxtable	Very Good	Very Good	Very Good	High	Excellent	Excellent	Highly polished tables
stargazer	Good	Good	Good	Limited	Low–Moderate	Good (LaTeX)	Quick textbook-style tables
xtable	Basic	Basic	Manual	Limited	Low	LaTeX only	Simple LaTeX conversion

Colour Guide:
- 🟢 Green = Strongest capability
- 🔵 Light Blue = Very strong
- 🟡 Yellow = Recommended default workflow

This table summarises the relative strengths of each package. In practice, the choice depends on the balance between ease of use, formatting control, and econometric transparency required.

0.2 Summary (what to use when)

This note compares stargazer, texreg, huxtable, and modelsummary for two common tasks:

Descriptive statistics tables
- Fastest “one-liner”: stargazer(..., summary = TRUE)
- Most flexible formatting (HTML/Word/PDF): huxtable (and it plays nicely with Quarto)
- Most modern + consistent workflow: modelsummary::datasummary() (excellent defaults, flexible)
- Not really intended for this: texreg (possible via a workaround, but not its strength)
Regression tables (OLS + HC robust SE, and joint tests)
- Econometrics-friendly, minimal code: texreg (great for multi-model tables + custom GOF rows)
- Classic / familiar: stargazer (robust SEs are possible, but overall/robust F-tests require extra work)
- Best control over styling/output formats: huxtable (especially via huxreg())
- Best “one interface for everything” (including robust/clustered SE): modelsummary::modelsummary()

In the examples below we use the Tutorial 5 dataset (tut5.csv) and estimate:

\[ \text{healthpc}_i = \beta_0 + \beta_1\,\text{gdppc}_i + u_i \]

0.3 Setup

# Core data + modelling
library(tidyverse)
library(lmtest)
library(sandwich)

# Table packages
library(stargazer)
library(texreg)
library(huxtable)
library(modelsummary)  # <-- new package added

# Data
tut5 <- read_csv("tut5.csv")

# Keep the variables used in the tutorial regression
dat <- tut5 %>%
  select(countryname, gdppc, healthpc) %>%
  filter(!is.na(gdppc), !is.na(healthpc))

# Model
m_ols <- lm(healthpc ~ gdppc, data = dat)

# Robust covariance + robust tests
V_hc1 <- vcovHC(m_ols, type = "HC1")
ct_hc1 <- coeftest(m_ols, vcov = V_hc1)
se_hc1 <- sqrt(diag(V_hc1))
p_hc1  <- ct_hc1[, 4]

1 Descriptive statistics tables

We illustrate the same basic descriptive-statistics table using each package.

We will summarise healthpc and gdppc.

vars_for_desc <- dat %>% select(healthpc, gdppc)

1.1 Descriptive stats with `stargazer`

stargazer() can produce a quick “journal-style” summary table with almost no code.

# ensure it's a plain data.frame (not tibble)
tut5_df <- as.data.frame(tut5)

# (optional but recommended) keep numeric columns only for descriptives
tut5_num <- tut5_df[sapply(tut5_df, is.numeric)]


  stargazer(
    tut5_num[c("healthpc","gdppc")],
    type  = "html",
    digits = 3,
    title  = "Descriptive statistics (stargazer)"
  )

**Descriptive statistics (stargazer)**

Statistic	N	Mean	St. Dev.	Min	Max

healthpc	187	1,139.383	1,850.334	18.521	10,515.300
gdppc	208	18,731.120	27,840.020	238.784	185,978.600

Pros: extremely quick; familiar layout.
Cons: limited styling control; not actively developed; can feel dated in HTML.

1.2 Descriptive stats with `huxtable`

With huxtable, you typically compute the stats (with dplyr or base R) and then convert to a huxtable for styling/output.

desc_tbl <- vars_for_desc %>%
  summarise(across(everything(),
                   list(
                     N = ~sum(!is.na(.)),
                     Mean = ~mean(., na.rm = TRUE),
                     SD = ~sd(., na.rm = TRUE),
                     Min = ~min(., na.rm = TRUE),
                     Max = ~max(., na.rm = TRUE)
                   ))) %>%
  pivot_longer(everything(),
               names_to = c("Variable", ".value"),
               names_sep = "_")

ht <- as_hux(desc_tbl)
ht <- set_caption(ht, "Descriptive statistics (huxtable)")
ht

Descriptive statistics (huxtable)
Variable	N	Mean	SD	Min	Max
healthpc	184	1.16e+03	1.86e+03	18.5	1.05e+04
gdppc	184	1.56e+04	2.36e+04	239	1.86e+05

Pros: best styling control; great for Quarto (HTML/PDF/Word).
Cons: a bit more code than stargazer() for a simple table.

1.3 Descriptive stats with `modelsummary`

modelsummary has a purpose-built descriptive-statistics interface: datasummary() / datasummary_skim().

datasummary_skim(vars_for_desc,
                 fmt = 3,
                 title = "Descriptive statistics (modelsummary)")

	Unique	Missing Pct.	Mean	SD	Min	Median	Max	Histogram
healthpc	184	0	1155.162	1861.063	18.521	389.957	10515.300
gdppc	184	0	15608.773	23637.569	238.784	6220.876	185978.600

Pros: modern defaults; flexible; consistent workflow for both descriptives and regressions.
Cons: learning the datasummary() formula interface can take a moment (worth it).

1.4 Descriptive stats with `texreg` (workaround)

texreg is primarily for regression models. For descriptives, the most honest approach is: - use huxtable, modelsummary, or stargazer, and - reserve texreg for regressions.

That said, if you must stay inside texreg, you can create a custom texreg object. Below is a minimal “mean (sd)” table using createTexreg().

means <- sapply(vars_for_desc, mean, na.rm = TRUE)
sds   <- sapply(vars_for_desc, sd, na.rm = TRUE)

tex_desc <- createTexreg(
  coef.names = names(means),
  coef       = as.numeric(means),
  se         = as.numeric(sds),
  pvalues    = rep(NA_real_, length(means)),
  gof.names  = c("N"),
  gof        = c(nrow(vars_for_desc)),
  gof.decimal = c(FALSE)
)

screenreg(list(tex_desc),
          custom.model.names = "Descriptives",
          digits = 3,
          caption = "Descriptive statistics (texreg workaround): coef = mean, se = sd")


======================
          Descriptives
----------------------
healthpc    1155.162  
           (1861.063) 
gdppc      15608.773  
          (23637.569) 
----------------------
N            184      
======================
*** p < 0.001; ** p < 0.01; * p < 0.05

Pros: possible if you need everything to run through texreg.
Cons: not natural; you lose the richer descriptive layouts other packages provide.

2 Regression tables (OLS vs HC1)

We now report the same regression in two columns:

Column 1: OLS SE (homoskedasticity-based)
Column 2: HC1 robust SE (heteroskedasticity-consistent)

We also show the practical issue you identified: the “usual” model F-statistic vs a robust joint test (Wald F).

2.1 Regression output with `stargazer`

stargazer can accept a list of standard errors, but it will still print the usual model F-stat (from the lm object). A robust overall test needs extra work if you want it reported.

# estimate model **1** using OLS (ignoring any heteroskedasticity)


# OLS  Model (Ignore heteroskedasticity)
# dependent variable: healthpc
# explanatory variable: gdppc
reg3 <- lm(healthpc ~ gdppc, data=tut5)
# Sample F statistic
wald_ols <- waldtest(reg3)                            # sample F statistics of overall significance
RSS_reg3 <- deviance(reg3)                            # residual sum of squares for reg3
resid_reg3 <- reg3$residuals                          # generate series for residuals for reg3                    
yhat_reg3 <- reg3$fitted.values                       # generate series for fitted values for reg3
demdf1 <- df.residual(reg3)                           # denominator df
fstat1 <- round(wald_ols$"F"[2], digits=4)            # Sample value of F stat
pvalf1 <- round(wald_ols$'Pr(>F)'[2], digits=4)       # p value of F test
numdf1 <- abs(wald_ols$"Df"[2])                       # numerator df

Now estimate model 2 using Huber-White standard errors (this is using the code given in tute5.R )

# Use sandwich package
cov_reg3 <- vcovHC(reg3, type = "HC1")
# print(cov_reg3)                               # print robust variance=covariance matrix
# coeftest(reg3, vcov=cov_reg3)                 # print estimates with robust std.errors 
robust_se_reg3    <- sqrt(diag(cov_reg3))     # robust std errors as sq. root of main diagonals      
# Adjusted F statistic (Robust standard errors)
wald_r <- waldtest(reg3, vcov = cov_reg3)     # sample F statistic using the robust variance-cov matrix
# print(wald_r)
fstat2 <- round(wald_r$"F"[2], digits=4)           # Sample value of F stat based on robust var-cov matrix
pvalf2 <- round(wald_r$'Pr(>F)'[2], digits=4)      # p value of F test
numdf2 <- abs(wald_r$"Df"[2])                      # numerator df

Then, again using the code provided in tute5.R , report the results using stargazer

Still need to use omit.stat="f" in this and then include add.lines

Little modification to make the table more readable ?

f_C <- qf(p=.01, df1=1, df2=181, lower.tail=FALSE)
F_crit <- sprintf("F(%.0f,%.0f)= %.3f", 
                            numdf1, demdf1,f_C)

stargazer(reg3, reg3,  type = "html", 
          dep.var.labels=c("Health Expenditure, per capita (current (&#36;US)"),
          covariate.labels=c("Intercept", "GDP, per capita (current (&#36;US)"),
          column.labels = c("OLS", "Robust (White)"),
          se        = list(NULL, robust_se_reg3, NULL),
          omit.stat = "f",
          title = "Figure 5: Model (4) Estimation Results",
          add.lines = list(c("F Statistic", fstat1, fstat2),
                           c("F critical", F_crit, F_crit),
                           c("p-value", sprintf("%.5f",pvalf1), 
                             sprintf("%.5f",pvalf2))),
          digits=4, align=TRUE,
          intercept.bottom=FALSE,
          star.cutoffs = c(0.05, 0.01, 0.001))

**Figure 5: Model (4) Estimation Results**

	Dependent variable:

	Health Expenditure, per capita (current ($US)
	OLS	Robust (White)
	(1)	(2)

Intercept	160.4437	160.4437
	(96.8937)	(191.1788)

GDP, per capita (current ($US)	0.0637^***	0.0637^***
	(0.0034)	(0.0162)


F Statistic	345.7792	15.5561
F critical	F(1,182)= 6.777	F(1,182)= 6.777
p-value	0.00000	0.00010
Observations	184	184
R²	0.6552	0.6552
Adjusted R²	0.6533	0.6533
Residual Std. Error (df = 182)	1,095.8730	1,095.8730

Note:	p<0.05; p<0.01; p<0.001

Pros: quick; familiar.
Cons: robust overall tests (Wald F) are not “native”; custom lines can get fiddly.

2.2 Regression output with `texreg` (recommended for this use-case)

This is the workflow we used earlier: keep both columns as lm objects and override the SE/p-values in the robust column, then manually insert the robust Wald F (and p-value).

# Classical (usual) F-stat and p-value
fs   <- summary(m_ols)$fstatistic
F_lm <- unname(fs["value"])
df1  <- unname(fs["numdf"])
df2  <- unname(fs["dendf"])
p_lm <- pf(F_lm, df1, df2, lower.tail = FALSE)

# Robust (HC1) Wald F and p-value for overall significance
wald   <- waldtest(m_ols, vcov = V_hc1, test = "F")
F_hc   <- unname(wald[2, "F"])
p_hc_F <- unname(wald[2, "Pr(>F)"])

add_stars <- function(p) {
  if (is.na(p)) ""
  else if (p < 0.001) "***"
  else if (p < 0.01) "**"
  else if (p < 0.05) "*"
  else if (p < 0.1) "."
  else ""
}

F_label   <- sprintf("F(%d, %d)", df1, df2)
F_lm_str  <- sprintf("%.3f%s", F_lm, add_stars(p_lm))
F_hc_str  <- sprintf("%.3f%s", F_hc, add_stars(p_hc_F))
p_lm_str  <- sprintf("(%.3f)", p_lm)
p_hc_str  <- sprintf("(%.3f)", p_hc_F)

screenreg(
  list(m_ols, m_ols),
  custom.model.names = c("OLS", "HC1"),
  override.se = list(NULL, se_hc1),
  override.pvalues = list(NULL, p_hc1),
  include.fstat = FALSE,
  custom.gof.rows = list(
    F_label     = c(F_lm_str, F_hc_str),
    "(p-value)" = c(p_lm_str, p_hc_str)
  ),
  digits = 3,
  caption = "Regression results (texreg): OLS vs HC1, with usual F and robust Wald F"
)

Warning in override(models = models, override.coef = override.coef, override.se
= override.se, : Number of SEs provided does not match number of coefficients
in model 1. Using default SEs.

Warning in override(models = models, override.coef = override.coef, override.se
= override.se, : Number of p-values provided does not match number of
coefficients in model 1. Using default p-values.


======================================
             OLS          HC1         
--------------------------------------
(Intercept)  160.444       160.444    
             (96.894)     (191.179)   
gdppc          0.064 ***     0.064 ***
              (0.003)       (0.016)   
--------------------------------------
F_label      345.779***     15.556*** 
(p-value)     (0.000)       (0.000)   
R^2            0.655         0.655    
Adj. R^2       0.653         0.653    
Num. obs.    184           184        
======================================
*** p < 0.001; ** p < 0.01; * p < 0.05

Pros: very econometrics-friendly; easy robust column; easy custom GOF rows.
Cons: descriptives are awkward; very bespoke styling is less convenient than huxtable.

2.3 Regression output with `huxtable`

huxreg() is a nice, readable interface. You can supply robust SE/p-values by constructing a second “robust” column (similar conceptually to the texreg override approach).

library(huxtable)
library(sandwich)
library(lmtest)

# --- Fit model (adjust to your variables/data) ---
# Example:
# dat <- read.csv("tut5.csv")
# m_ols <- lm(y ~ gdppc, data = dat)

# Robust covariance + robust SEs
cov1  <- vcovHC(m_ols, type = "HC1")
se_hc1 <- sqrt(diag(cov1))
se_hc1 <- se_hc1[names(coef(m_ols))]  # align with coef order

# Build base huxtable (two columns = same model)
hux_tbl <- huxreg(
  "OLS" = m_ols,
  "HC1" = m_ols,
  coefs = c("Intercept" = "(Intercept)", "GDP per capita" = "gdppc")
)

# --- Replace SE strings in HC1 column (col 3) ---
mat <- huxtable::contents(hux_tbl)

to_chr <- function(x) vapply(x, function(z) paste(z, collapse = ""), character(1))
mat_chr <- apply(mat, 2, to_chr)

# SE rows are rows where OLS column begins with "("
se_rows   <- which(grepl("^\\(", mat_chr[, 2]))
coef_rows <- se_rows - 1

# Map displayed term labels to underlying coef names
lookup <- c("Intercept" = "(Intercept)", "GDP per capita" = "gdppc")
display_terms <- mat_chr[coef_rows, 1]
coef_names_in_table <- unname(lookup[display_terms])

# Robust SEs in the same order as terms displayed
se_hc1_in_table_order <- se_hc1[coef_names_in_table]

# Overwrite HC1 SE cells
mat[se_rows, 3] <- sprintf("(%.3f)", se_hc1_in_table_order)
huxtable::contents(hux_tbl) <- mat

# --- Compute overall F stats (OLS and robust Wald F) ---
fs   <- summary(m_ols)$fstatistic
F_lm <- unname(fs["value"])
df1  <- unname(fs["numdf"])
df2  <- unname(fs["dendf"])
p_lm <- pf(F_lm, df1, df2, lower.tail = FALSE)

wald   <- waldtest(m_ols, vcov = cov1, test = "F")
F_hc   <- unname(wald[2, "F"])
p_hc_F <- unname(wald[2, "Pr(>F)"])

add_stars <- function(p) {
  if (p < 0.001) "***"
  else if (p < 0.01) "**"
  else if (p < 0.05) "*"
  else if (p < 0.1) "."
  else ""
}

F_label  <- sprintf("F(%d, %d)", df1, df2)
F_lm_str <- sprintf("%.3f%s", F_lm, add_stars(p_lm))
F_hc_str <- sprintf("%.3f%s", F_hc, add_stars(p_hc_F))

p_lm_str <- sprintf("(%.3f)", p_lm)
p_hc_str <- sprintf("(%.3f)", p_hc_F)

# --- Append F rows to table (safest via character matrix -> as_hux) ---
mat2 <- huxtable::contents(hux_tbl)
mat2_chr <- apply(mat2, 2, to_chr)

mat2_chr <- rbind(
  mat2_chr,
  c(F_label,    F_lm_str, F_hc_str),
  c("(p-value)", p_lm_str, p_hc_str)
)

hux_final <- as_hux(mat2_chr)

# Simple styling
bold(hux_final)[1, ] <- TRUE
align(hux_final) <- "center"
align(hux_final)[, 1] <- "left"
set_caption(hux_final, "Regression results (huxtable): OLS vs HC1 (robust SE + overall F-tests)")

Regression results (huxtable): OLS vs HC1 (robust SE + overall F-tests)
	OLS	HC1
Intercept	160.443652225566	160.443652225566
	(96.8937191911367)	(191.179)
GDP per capita	0.0637281798232077 ***	0.0637281798232077 ***
	(0.00342714168873277)	(0.016)
N	184	184
R2	0.655158846931442	0.655158846931442
logLik	-1547.95170959669	-1547.95170959669
AIC	3101.90341919338	3101.90341919338
* p < 0.001; p < 0.01; * p < 0.05.	* p < 0.001; p < 0.01; * p < 0.05.	* p < 0.001; p < 0.01; * p < 0.05.
F(1, 182)	345.779***	15.556***
(p-value)	(0.000)	(0.000)

hux_final

	OLS	HC1
Intercept	160.443652225566	160.443652225566
	(96.8937191911367)	(191.179)
GDP per capita	0.0637281798232077 ***	0.0637281798232077 ***
	(0.00342714168873277)	(0.016)
N	184	184
R2	0.655158846931442	0.655158846931442
logLik	-1547.95170959669	-1547.95170959669
AIC	3101.90341919338	3101.90341919338
* p < 0.001; p < 0.01; * p < 0.05.	* p < 0.001; p < 0.01; * p < 0.05.	* p < 0.001; p < 0.01; * p < 0.05.
F(1, 182)	345.779***	15.556***
(p-value)	(0.000)	(0.000)

Pros: excellent control over styling and output formats.
Cons: more coding for desired output; robust inference is often easiest via modelsummary rather than manual cell edits.

2.4 Regression output with `modelsummary` (cleanest “modern” approach)

modelsummary() makes robust SEs straightforward and consistent. You can provide different vcov specs per model column.

models <- list("OLS" = m_ols, "HC1" = m_ols)

modelsummary(
  models,
  vcov = list(NULL, "HC1"),
  statistic = "({std.error})",
  stars = TRUE,
  fmt = 3,
  title = "Regression results (modelsummary): OLS vs HC1"
)

Regression results (modelsummary): OLS vs HC1
	OLS	HC1
+ p < 0.1, * p < 0.05, p < 0.01, * p < 0.001
(Intercept)	160.444+	160.444
	(96.894)	(191.179)
gdppc	0.064***	0.064***
	(0.003)	(0.016)
Num.Obs.	184	184
R2	0.655	0.655
R2 Adj.	0.653	0.653
AIC	3101.9	3101.9
BIC	3111.5	3111.5
Log.Lik.	-1547.952	-1547.952
F	345.779	15.556
RMSE	1089.90	1089.90
Std.Errors		HC1

2.4.1 Adding an overall (robust) Wald F line

modelsummary is happiest when you add custom rows via add_rows. We’ll compute the same “usual F” and “robust Wald F” and append them.

extra_rows <- data.frame(
  term = c(F_label, "(p-value)"),
  OLS = c(F_lm_str, p_lm_str),
  HC1 = c(F_hc_str, p_hc_str),
  check.names = FALSE
)

modelsummary(
  models,
  vcov = list(NULL, "HC1"),
  statistic = "({std.error})",
  stars = TRUE,
  fmt = 3,
  title = "Regression results (modelsummary): with usual F and robust Wald F",
  add_rows = extra_rows
)

Regression results (modelsummary): with usual F and robust Wald F
	OLS	HC1
+ p < 0.1, * p < 0.05, p < 0.01, * p < 0.001
(Intercept)	160.444+	160.444
	(96.894)	(191.179)
gdppc	0.064***	0.064***
	(0.003)	(0.016)
Num.Obs.	184	184
R2	0.655	0.655
R2 Adj.	0.653	0.653
AIC	3101.9	3101.9
BIC	3111.5	3111.5
Log.Lik.	-1547.952	-1547.952
F	345.779	15.556
RMSE	1089.90	1089.90
Std.Errors		HC1
F(1, 182)	345.779***	15.556***
(p-value)	(0.000)	(0.000)

Pros: arguably the cleanest interface today; robust/clustered SE easy; works great in Quarto.
Cons: learning the full set of options (especially for custom layouts) can take a bit of time.

2.5 Practical notes

The usual model F-stat in lm output relies on homoskedasticity-based variance assumptions.
The robust “overall significance” test is a Wald/F test computed using a robust covariance matrix (HC1 here).
If you show both in the same table, it’s good practice to label the robust one as Wald F (HC1) or similar in notes/caption.

2.6 References

stargazer cheatsheet: https://www.jakeruss.com/cheatsheets/stargazer/#TOC
texreg: https://jrnold.github.io/intro-methods-notes/formatting-tables.html
huxtable: https://hughjonesd.github.io/huxtable/
modelsummary: https://modelsummary.com/vignettes/modelsummary.html

2.7 Recommended Workflow

2.7.1 Practical Recommendation

modelsummary: Best all-round choice. Clean syntax, handles OLS, HC, clustered SEs, descriptive statistics, and multiple output formats (LaTeX, HTML, Word). Recommended default for most teaching and applied work.

texreg: Excellent when you want full manual control over goodness-of-fit rows (e.g., inserting classical and robust F-statistics with stars). Very transparent for econometrics teaching.

huxtable: Best when you want maximum formatting flexibility and polished output. Especially useful in Quarto workflows and when styling tables carefully.

stargazer: Still useful for quick textbook-style tables, but development has slowed and customization is limited compared to newer packages.

2.7.2 Suggested Default Strategy

Use modelsummary for descriptive statistics and regression tables.

Switch to texreg when you need explicit control over classical vs. robust inference presentation.

0.1 Package Comparison Overview

0.2 Summary (what to use when)

0.3 Setup

1 Descriptive statistics tables

1.1 Descriptive stats with stargazer

1.2 Descriptive stats with huxtable

1.3 Descriptive stats with modelsummary

1.4 Descriptive stats with texreg (workaround)

2 Regression tables (OLS vs HC1)

2.1 Regression output with stargazer

2.2 Regression output with texreg (recommended for this use-case)

2.3 Regression output with huxtable

2.4 Regression output with modelsummary (cleanest “modern” approach)