Summary Tables with gtsummary

Revealjs Presentation

Matthew Prado

IBM 6530, Cal Poly Pomona

2026-03-03

In Step 1, you saw how to create a summary table and modify it for various statistical analyses. Summarize the package’s capabilities as explained by the speaker. How may {gtsummary} benefit you in your school work or career?

The gtsummary package is designed to create clean, publication-ready statistical summary tables directly from data and statistical models in R. Its core capability is simplifying the process of generating descriptive statistics, group comparisons, and regression results while automatically applying appropriate statistical methods.

The package can:

  • Produce descriptive summary tables for continuous and categorical variables, including counts, percentages, medians, and interquartile ranges.

  • Compare groups using statistically appropriate tests (e.g., Wilcoxon rank-sum, chi-square, Fisher’s exact test) with minimal user input.

  • Add p-values, confidence intervals, and effect size estimates to tables to support inferential analysis.

  • Generate cross-tabulations for categorical variables with associated statistical tests.

  • Create regression summary tables for linear, logistic, and other models, including exponentiated results such as odds ratios.

  • Support univariate and multivariable regression summaries for exploratory and modeling workflows.

  • Output tables in formats suitable for reports, manuscripts, and presentations, emphasizing clarity and reproducibility.
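As a rough illustration of the capabilities listed above, the sketch below uses the `trial` example dataset that ships with gtsummary rather than the course data. The variables chosen (`trt`, `age`, `grade`, `response`) are assumptions made only for demonstration.

```r
library(gtsummary)

# Descriptive table split by treatment arm; add_p() picks a default test per variable
summary_tbl <- trial |>
  tbl_summary(by = trt, include = c(age, grade, response)) |>
  add_p()

# Logistic regression reported as odds ratios via exponentiate = TRUE
logit_model <- glm(response ~ age + grade, data = trial, family = binomial)
regression_tbl <- tbl_regression(logit_model, exponentiate = TRUE)

summary_tbl
regression_tbl
```

Printing either object renders the formatted table in the viewer or in a rendered report.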

In Step 2, you watched the extended presentation of the {gtsummary} package by Dr. Daniel D. Sjoberg, the creator of the package.

2.1. How does it differ from gt and gtExtras?

I feel it differs from gt and gtExtras in that those packages provide the basic commands for building and styling tables, while gtsummary provides a bigger picture: it summarizes the data and models for you and helps explain results that can be hard to explain.

2.2. Give three things you learned that were not explained in the lecture in Step 1.

  • One thing I learned is that the table is highly customizable through different arguments:

    • by: specifies a column variable for cross-tabulation

    • type: specifies the summary type

    • statistic: customize the reported statistics

    • label: change or customize variable labels

  • The modify_*() family of functions can change the header and the footnotes of the summary table.

  • Seeing regression output with a fresh look was interesting, as I had not thought it was possible to change its appearance from the standard console output. A clean table makes it clear what the reader is looking at.
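The arguments and the header modification described in the list above can be sketched as follows. This is a minimal example on gtsummary's built-in `trial` data, not the course data; the labels and the statistic formats are illustrative choices.

```r
library(gtsummary)

custom_tbl <- trial |>
  tbl_summary(
    by = trt,                                  # column variable for the comparison
    include = c(age, grade),
    type = list(age ~ "continuous"),           # summary type per variable
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",      # report mean (SD) instead of the default
      all_categorical() ~ "{n} ({p}%)"
    ),
    label = list(age ~ "Patient age", grade ~ "Tumor grade")
  ) |>
  modify_header(label = "**Characteristic**")  # rename the first column header

custom_tbl
```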

Apply what you learned to your MSDM CEP data. Choose two appropriate variables for cross-tabulation and show if the two variables are associated or not. Use appropriate statistics to test if the two variables are associated. Code, produce the table, and interpret the result.

library(gtsummary)

CE_Event_Tracker_Com_Events_2025_ |>
  tbl_cross(row = Type_of_Event, col = Purpose) |>
  add_p()

Every time I tried to run the code, it gave me an error.
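Since the error itself is not shown, one way to narrow it down is to run the same pattern on data known to work. The sketch below uses gtsummary's built-in `trial` dataset (an assumption for illustration, not the MSDM CEP data) and first checks that both columns exist, which is a common source of failures.

```r
library(gtsummary)

# Confirm both columns exist in the data before cross-tabulating
stopifnot(all(c("stage", "trt") %in% names(trial)))

# Cross-tabulate two categorical variables and add a test of association
cross_tbl <- trial |>
  tbl_cross(row = stage, col = trt) |>
  add_p()

cross_tbl
```

If this runs but the course code does not, the problem is more likely in the data object (its name, whether it is loaded, or its column names) than in the `tbl_cross()` call. When the table does render, a p-value below the chosen significance level (commonly 0.05) would suggest that the two variables are associated.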

Using the MSDM CEP data, run multiple regression. The dependent variable can be continuous or dichotomous. Regress a dependent variable on a set of independent variables. Code, produce the table, and interpret the result.

library(dplyr)

# Convert categorical columns to factors
CE_Event_Tracker_Com_Events_2025_ <- CE_Event_Tracker_Com_Events_2025_ |>
  mutate(Type_of_Event = as.factor(Type_of_Event),
         Purpose = as.factor(Purpose),
         City = as.factor(City))

# data must match the object name above, including the trailing underscore
model <- lm(Engagements ~ Type_of_Event + Purpose + Attendance,
            data = CE_Event_Tracker_Com_Events_2025_)

tbl <- model |> gtsummary::tbl_regression()
tbl

(As with the cross-tabulation code, this code was giving me issues.)
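For comparison, the same multiple-regression workflow can be sketched on gtsummary's built-in `trial` data. The outcome and predictors here (`marker`, `age`, `grade`, `trt`) are stand-ins chosen for illustration, not the course variables.

```r
library(gtsummary)

# Linear model with a continuous outcome and a mix of predictor types
lm_model <- lm(marker ~ age + grade + trt, data = trial)

# Format the model as a regression table
lm_tbl <- tbl_regression(lm_model)

lm_tbl
```

By default, each row of the resulting table reports the estimated coefficient, its 95% confidence interval, and a p-value, with categorical predictors shown relative to a reference level.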