QUESTION: “What makes for a good data dictionary, and …make one in R?”

Data

We’ll use the “palmerpenguins” package (https://allisonhorst.github.io/palmerpenguins/) to address this question.

What makes a good data dictionary? A good data dictionary uses column names and labels that are self-explanatory, so a reader can tell what each variable is for without any confusion.

library(pander)
# data dictionary columns: one entry per variable in the penguins data
column.names <- c("species", "island", "bill_length_mm", "bill_depth_mm",
                  "flipper_length_mm", "body_mass_g", "sex", "year")

# author listed for each variable (stray leading spaces removed from the strings)
author.names <- c("Allison Marie Horst", "Alison Presmanes Hill", "Kristen B Gorman",
                  "Allison Marie Horst", "Alison Presmanes Hill", "Kristen B Gorman",
                  "Kristen B Gorman", "Kristen B Gorman")

# reference listed for each variable
reference <- c("Gorman KB", "Williams TD", "Fraser WR (2014)",
               "Palmer Station Antarctica LTER", "K. Gorman, 2020",
               "K. Gorman, 2020", "K. Gorman, 2020", "K. Gorman, 2020")

Make a data frame to hold the data dictionary information

# make dataframe
data.dictionary <- data.frame(ColumnNames = column.names, AuthorNames = author.names, Reference = reference)
# Display the data dictionary.
# pander() prints the data frame as a formatted table; it can also render
# vectors, lists, arrays, test results, models, and prcomp objects.
pander(data.dictionary)
ColumnNames         AuthorNames             Reference
------------------  ----------------------  ------------------------------
species             Allison Marie Horst     Gorman KB
island              Alison Presmanes Hill   Williams TD
bill_length_mm      Kristen B Gorman        Fraser WR (2014)
bill_depth_mm       Allison Marie Horst     Palmer Station Antarctica LTER
flipper_length_mm   Alison Presmanes Hill   K. Gorman, 2020
body_mass_g         Kristen B Gorman        K. Gorman, 2020
sex                 Kristen B Gorman        K. Gorman, 2020
year                Kristen B Gorman        K. Gorman, 2020
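
The dictionary above records who supplied each variable. To make it more self-explanatory, one possible extension is to also record each variable's type, its number of missing values, and a plain-language description. The sketch below uses base R only and is one way to do it, not the only way. The helper name make_dictionary, the descriptions vector, and the three toy rows are assumptions made up for illustration; they are not part of palmerpenguins or pander, and the toy values are not real measurements.

```r
# Sketch: build a data dictionary directly from a data frame (base R only).
# `make_dictionary` is a hypothetical helper written for this example.
make_dictionary <- function(df, descriptions) {
  # Every column needs a description, so fail early if one is missing
  missing_desc <- setdiff(names(df), names(descriptions))
  if (length(missing_desc) > 0) {
    stop("No description for: ", paste(missing_desc, collapse = ", "))
  }
  data.frame(
    Variable    = names(df),
    # first class of each column, e.g. "factor" or "integer"
    Type        = unname(vapply(df, function(x) class(x)[1], character(1))),
    # number of NA values in each column
    Missing     = unname(vapply(df, function(x) sum(is.na(x)), integer(1))),
    # descriptions looked up by column name, in column order
    Description = unname(descriptions[names(df)]),
    stringsAsFactors = FALSE
  )
}

# Toy rows for illustration only (made-up values, not real penguin data)
toy <- data.frame(
  species     = factor(c("Adelie", "Gentoo", "Chinstrap")),
  body_mass_g = c(3750L, NA, 3800L),
  year        = c(2007L, 2008L, 2009L)
)

# Descriptions are this example's own wording (an assumption)
descriptions <- c(
  species     = "Penguin species",
  body_mass_g = "Body mass in grams",
  year        = "Study year"
)

dict <- make_dictionary(toy, descriptions)
print(dict)
```

The resulting data frame can be passed to pander() in the same way as data.dictionary. Deriving the type and missing-value columns from the data itself keeps the dictionary in step with the data set, while the descriptions still have to be written by hand.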

Keywords: 1. pander() 2. data dictionary