Professional quality tables and figures

Schedule was updated

  • See updated syllabus

Creating reproducible tables in R

  • How many of you are still COPYING & PASTING numbers into Word to create tables?

Creating reproducible tables in R

  • Why automate table creation?

    • minimize (or eliminate) human error
    • use time efficiently
    • easily update table with new results

Creating reproducible tables in R

  • There are many R packages that can produce nice tables

  • Some of my favorites are

    • flextable (LaTeX, HTML, or Microsoft Word)
    • xtable (LaTeX or HTML)
    • table1 (LaTeX, HTML, or Microsoft Word; for creating Table 1)
    • texreg (LaTeX, HTML, or Microsoft Word; for creating tables from model output)
  • Some other options are

    • gtsummary
    • modelsummary
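
As a rough illustration of what these packages do, here is a minimal sketch using xtable. It uses the built-in mtcars data set; the summary columns, caption, and output file name are placeholders chosen for this example, not part of the course materials.

```r
# Minimal sketch: build a small summary table and export it with xtable.
# mtcars ships with R; the column labels and file name are illustrative only.
library(xtable)

# Mean miles per gallon and horsepower by number of cylinders
summary_df <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)
names(summary_df) <- c("Cylinders", "Mean MPG", "Mean horsepower")

tab <- xtable(summary_df,
              caption = "Fuel economy and power by cylinder count",
              digits = c(0, 0, 1, 1))  # first entry applies to the row names

# Write the table as HTML; use type = "latex" for LaTeX output instead
out_file <- file.path(tempdir(), "table_mtcars.html")
print(tab, type = "html", include.rownames = FALSE, file = out_file)
```

Because the numbers come straight from the data, rerunning the script after the data change regenerates the table with no copying and pasting.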

Creating reproducible tables in R

  • Best practice is to write a short script that will produce one table at a time

    • table_patientcharacteristics.R
    • table_regressionmodel.R
    • table_logisticmodel.R
  • After your research is published, it can be useful to rename each R script so that it matches the table number in the paper

    • table1_patientcharacteristics.R
    • table2_regressionmodel.R
    • table3_logisticmodel.R
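
A one-table script might be laid out as in the sketch below, here using flextable. The patient data are simulated so the script runs on its own, and the variable names and output path are assumptions made for illustration; a real script would read the project's analysis data set instead.

```r
# table_patientcharacteristics.R
# Sketch of a one-table script. The data are simulated (hypothetical values)
# so that the script is self-contained.
library(flextable)

set.seed(1)
patients <- data.frame(
  arm    = rep(c("Control", "Treatment"), each = 50),
  age    = round(rnorm(100, mean = 60, sd = 10)),
  female = rbinom(100, size = 1, prob = 0.5)
)

# 1. Compute the summary statistics, one row per study arm
tab <- data.frame(
  Arm        = names(table(patients$arm)),
  N          = as.vector(table(patients$arm)),
  mean_age   = as.vector(tapply(patients$age, patients$arm, mean)),
  pct_female = as.vector(100 * tapply(patients$female, patients$arm, mean))
)

# 2. Format the table
ft <- flextable(tab)
ft <- set_header_labels(ft, mean_age = "Mean age (years)",
                        pct_female = "Female (%)")
ft <- colformat_double(ft, j = c("mean_age", "pct_female"), digits = 1)
ft <- autofit(ft)

# 3. Save the table as a Word document (temporary folder used here)
out_file <- file.path(tempdir(), "table_patientcharacteristics.docx")
save_as_docx(ft, path = out_file)
```

Keeping the three steps (compute, format, save) in one short script means each table can be rebuilt independently of the others.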

Creating reproducible figures in R

  • Same principles apply to creating figures

  • ggplot2 is the most commonly used R package for creating figures (for good reason!)
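
The same one-script-per-output idea can be sketched for a figure. This example uses the built-in Orange data set; the script name, axis labels, and output size are assumptions chosen for illustration.

```r
# figure_treegrowth.R (hypothetical file name)
# Sketch of a one-figure script: build the plot, then save it to a file.
library(ggplot2)

# Orange ships with R: circumference of five orange trees measured over time
fig <- ggplot(Orange, aes(x = age, y = circumference, color = Tree)) +
  geom_line() +
  labs(x = "Age (days)", y = "Circumference (mm)", color = "Tree") +
  theme_bw()

# Save with explicit dimensions so the figure is identical on every run
out_file <- file.path(tempdir(), "figure_treegrowth.png")
ggsave(out_file, plot = fig, width = 6, height = 4, dpi = 300)
```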

Assignment 1 - Table 1 for armed conflict paper

  • How would you design a Table 1 for the armed conflict paper (primary analyses with binary predictor)?

    • How many observations?
    • How many columns?
    • Which variables to include as rows?
    • Any other considerations?
  • Write an R script that creates a Table 1 using your favorite package (output as either PDF or Word)

    • Remember, we created a table of patient characteristics in the first lecture!

Assignment 2 - Descriptive figure

  • Write an R script that creates a figure that shows the trend in maternal mortality for countries that had an increase from 2000 to 2017
  1. Select only the countries that had an increase in maternal mortality from 2000 to 2017 (why 2017?)

    1. Hint: This code creates a new variable, diffmatmor, that holds the difference between maternal mortality in each year and maternal mortality in 2000

      library(here)
      library(dplyr)

      finaldata <- read.csv(here("data", "finaldata.csv"), header = TRUE)
      finaldata |>
        dplyr::select(country_name, ISO, year, MatMor) |>
        dplyr::filter(year < 2018) |>
        # sort so that each country's first row is its earliest year
        arrange(ISO, year) |>
        group_by(ISO) |>
        # MatMor[1L] is the first value within each country (assumed to be year 2000)
        mutate(diffmatmor = MatMor - MatMor[1L])
  2. Create a line graph with maternal mortality on the y-axis, year on the x-axis, and a unique color for each country

    1. See my code from last week (EDA) for how to create a line graph using ggplot2
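
To see what the hint in step 1 computes, it may help to run the same grouped mutate on a toy data set. The sketch below uses made-up numbers (not real maternal mortality estimates) and hypothetical ISO codes; only the column names follow the hint.

```r
library(dplyr)

# Toy data with made-up values; rows are deliberately out of order
toy <- data.frame(
  ISO    = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  year   = c(2001, 2000, 2002, 2000, 2001, 2002),
  MatMor = c(110, 100, 120, 50, 45, 40)
)

toy_diff <- toy |>
  arrange(ISO, year) |>                        # earliest year first within each country
  group_by(ISO) |>
  mutate(diffmatmor = MatMor - MatMor[1L]) |>  # subtract each country's first value
  ungroup()

toy_diff$diffmatmor
# AAA: 0, 10, 20 (increase since 2000); BBB: 0, -5, -10 (decrease)
```

Sorting before grouping matters here: without arrange(), MatMor[1L] for AAA would be the 2001 value rather than the 2000 value.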

Assignment 3 - Peer review

  • Have your Table 1 and Figure reviewed by your partner
  • Make any changes based on their suggestions
  • Commit + Push to your GitHub repo