QUESTION: “What makes for a good data dictionary, and …make one in R?”
We’ll use the “palmerpenguins” package (https://allisonhorst.github.io/palmerpenguins/) to address this question.
What makes a good data dictionary? A good data dictionary has column names and labels that are self-explanatory: a reader should be able to tell what each variable represents, and what it is for, without any confusion.
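As a minimal sketch of what "self-explanatory" could look like in practice, the dictionary below pairs a few variables with a type, a unit, and a plain-language description. The object name `good.dictionary`, the column names `Variable`, `Type`, `Units`, and `Description`, and the wording of the descriptions are all assumptions made for illustration; they are not taken from the palmerpenguins package.

```r
# Hypothetical layout: one row per variable, one column per kind of metadata.
# Types, units, and descriptions here are illustrative, hand-written values.
good.dictionary <- data.frame(
  Variable    = c("species", "bill_length_mm", "body_mass_g"),
  Type        = c("factor", "numeric", "integer"),
  Units       = c(NA, "mm", "g"),
  Description = c("Penguin species",
                  "Bill length",
                  "Body mass"),
  stringsAsFactors = FALSE
)

# A simple sanity check: every variable should have a non-empty description.
stopifnot(all(nzchar(good.dictionary$Description)))

print(good.dictionary)
```

Keeping units in their own column, rather than only in the variable name, lets a reader see them at a glance even when a name is abbreviated.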
library(pander)
# data dictionary columns
# one entry per column of the penguins data
column.names <- c("species", "island", "bill_length_mm", "bill_depth_mm",
                  "flipper_length_mm", "body_mass_g", "sex", "year")
# stray leading spaces inside the strings removed
author.names <- c("Allison Marie Horst", "Alison Presmanes Hill", "Kristen B Gorman",
                  "Allison Marie Horst", "Alison Presmanes Hill", "Kristen B Gorman",
                  "Kristen B Gorman", "Kristen B Gorman")
reference <- c("Gorman KB", "Williams TD", "Fraser WR (2014)",
               "Palmer Station Antarctica LTER", "K. Gorman, 2020",
               "K. Gorman, 2020", "K. Gorman, 2020", "K. Gorman, 2020")
Make a data frame to hold the data dictionary information.
# make dataframe
data.dictionary <- data.frame(ColumnNames = column.names, AuthorNames = author.names, Reference = reference)
# Display data dictionary
pander(data.dictionary) # pander() prints the dictionary as a markdown table; it also handles vectors, lists, arrays, tests, models, and prcomp results
| ColumnNames | AuthorNames | Reference |
|---|---|---|
| species | Allison Marie Horst | Gorman KB |
| island | Alison Presmanes Hill | Williams TD |
| bill_length_mm | Kristen B Gorman | Fraser WR (2014) |
| bill_depth_mm | Allison Marie Horst | Palmer Station Antarctica LTER |
| flipper_length_mm | Alison Presmanes Hill | K. Gorman, 2020 |
| body_mass_g | Kristen B Gorman | K. Gorman, 2020 |
| sex | Kristen B Gorman | K. Gorman, 2020 |
| year | Kristen B Gorman | K. Gorman, 2020 |
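Typing column names by hand can drift out of sync with the data. One possible alternative, sketched here with base R only, derives the names, types, and missing-value counts from the data frame itself. The helper `make_dictionary()` and the small `toy` data frame are hypothetical stand-ins with made-up values; the same call should also work on `palmerpenguins::penguins` if that package is installed.

```r
# Hypothetical helper: build a skeleton dictionary from any data frame.
make_dictionary <- function(df) {
  data.frame(
    ColumnNames = names(df),
    # first class of each column, e.g. "factor" or "integer"
    Type        = unname(vapply(df, function(x) class(x)[1], character(1))),
    # number of NA values in each column
    Missing     = unname(vapply(df, function(x) sum(is.na(x)), integer(1))),
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for the penguins data (values are made up for illustration).
toy <- data.frame(
  species     = factor(c("Adelie", "Gentoo", "Chinstrap")),
  body_mass_g = c(3750L, NA, 3800L),
  year        = c(2007L, 2008L, 2009L)
)

toy.dictionary <- make_dictionary(toy)
print(toy.dictionary)
```

Descriptions and references still have to be written by a person; the generated skeleton only guarantees that every column in the data has a row in the dictionary.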
Keywords: 1. pander() 2. data dictionary