Creating a Data Dictionary for BDHS Data Using R
Working with BDHS (Bangladesh Demographic and Health Survey) data?
Organizing variables with a data dictionary saves time and effort!
Here’s how to quickly summarize variable names, labels, unique values,
and more.
Step-by-Step Guide
1️⃣ Load the Required Library
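Use the expss package to read variable and value labels in R, and the haven package to import the SPSS (.SAV) file.
if (!requireNamespace("expss", quietly = TRUE)) install.packages("expss")
if (!requireNamespace("haven", quietly = TRUE)) install.packages("haven")
library(expss)
library(haven)  # provides read_sav()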
2️⃣ Load Your Dataset
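Import the BDHS dataset you’re working with, such as the Household Member Recode (PR) file.
# read_sav() comes from the haven package, not from expss
PR <- haven::read_sav("C:/R/Data/BD_2022_DHS_11112024_545_222526/BDPR81SV/BDPR81FL.SAV")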
3️⃣ Create the Data Dictionary Table
This code creates a summary table of variables, labels, values, and
missing data.
# Create a summary table for the PR dataset
dd.PR <- data.frame(
  Variable = names(PR),                                                # 1️⃣ Column: Variable names
  Label = sapply(PR, function(x) {                                     # 2️⃣ Column: Variable labels
    lab <- var_lab(x)                                                  # NULL when a variable has no label
    if (is.null(lab)) NA_character_ else lab                           # Keep the column a plain character vector
  }),
  Values = sapply(PR, function(x) paste(unique(x), collapse = ", ")),  # 3️⃣ Column: Unique values
  Value_Labels = sapply(PR, function(x) {                              # 4️⃣ Column: Value labels
    val_labels <- val_lab(x)                                           # Get value labels for each variable
    if (!is.null(val_labels)) {                                        # Check if labels exist
      paste(names(val_labels), "=", val_labels, collapse = ", ")       # Format as "label = code"
    } else {
      NA_character_                                                    # If no labels, assign NA
    }
  }),
  Missing_Values = sapply(PR, function(x) sum(is.na(x))),              # 5️⃣ Column: Count of missing values
  Total_Rows = nrow(PR),                                               # 6️⃣ Column: Total rows in the dataset
  row.names = NULL                                                     # Use plain integer row names
)
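For identifier or continuous variables (household IDs, ages, sample weights), the Values column can hold thousands of entries in a single cell. Below is a minimal sketch of one way to cap it. The helper name summarise_values and the default cut-off of 10 are assumptions made for illustration, not part of expss or haven, and the vectors are toy data.

```r
# Hypothetical helper: list at most `max_n` distinct values,
# then note how many further distinct values were left out.
summarise_values <- function(x, max_n = 10) {
  vals <- sort(unique(as.vector(x[!is.na(x)])))  # distinct non-missing values, sorted
  shown <- paste(head(vals, max_n), collapse = ", ")
  if (length(vals) > max_n) {
    shown <- paste0(shown, ", ... (", length(vals) - max_n, " more)")
  }
  shown
}

# Toy vectors standing in for a coded variable and an age variable
sex <- c(1, 2, 2, 1, NA)
age <- c(0:40, NA)

summarise_values(sex)             # "1, 2"
summarise_values(age, max_n = 5)  # "0, 1, 2, 3, 4, ... (36 more)"
```

In the table above, writing Values = sapply(PR, summarise_values) in place of the paste(unique(x), ...) call should keep each cell short enough to read in a spreadsheet.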
4️⃣ Export as CSV
Save your data dictionary to a CSV file for easy sharing.
write.csv(dd.PR, "PR_data_dictionary.csv", row.names = FALSE)
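Before sharing the file, it can help to confirm that it reads back cleanly. The sketch below uses a toy dictionary and a temporary file (both assumptions for illustration) and writes missing entries as empty cells rather than the text "NA", which tends to be easier to read in a spreadsheet.

```r
# Toy dictionary standing in for dd.PR; one label contains a comma, one is missing
dd <- data.frame(
  Variable = c("var_a", "var_b"),
  Label    = c("First label, with a comma", NA),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".csv")

# na = "" writes missing entries as empty cells
write.csv(dd, out_file, row.names = FALSE, na = "")

# Read the file back, treating empty cells as missing again
check <- read.csv(out_file, stringsAsFactors = FALSE, na.strings = "")

identical(check$Variable, dd$Variable)  # variable names survive the round trip
identical(check$Label, dd$Label)        # so do labels, including the comma and the NA
```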
Why Use a Data Dictionary?
- Quick Analysis: Understand variables and values at a glance.
- Team Collaboration: Share dataset structure with others easily.
- Data Quality Check: View missing values and unique entries to assess data quality.
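The data quality check becomes more concrete when the missing-value counts are turned into percentages and sorted. Here is a minimal sketch on a toy dictionary. The Pct_Missing column name is an assumption, and the variable names and counts are invented for illustration.

```r
# Toy dictionary with the same Missing_Values and Total_Rows columns as dd.PR
dd <- data.frame(
  Variable       = c("var_a", "var_b", "var_c"),
  Missing_Values = c(0, 250, 40),
  Total_Rows     = 1000,
  stringsAsFactors = FALSE
)

# Share of rows that are missing, as a percentage rounded to one decimal place
dd$Pct_Missing <- round(100 * dd$Missing_Values / dd$Total_Rows, 1)

# Most incomplete variables first
dd <- dd[order(dd$Pct_Missing, decreasing = TRUE), ]
print(dd, row.names = FALSE)
```

A high percentage is not necessarily an error, since a survey question may not apply to every respondent; the ranking only shows where to look first.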
Customize for Any BDHS Dataset
To create a data dictionary for a different dataset, replace PR with the name of
the dataset you’re using (e.g., KR, IR, VA) and point read_sav() to the matching file.
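Instead of editing the dataset name by hand each time, the same steps can be wrapped in a function. The sketch below is one possible version, not the only way to do it. The name make_dictionary is hypothetical, and the function reads the "label" and "labels" attributes directly (where haven stores variable and value labels), so it needs only base R. The Values column is omitted for brevity, and the data frame is a toy stand-in for a recode file.

```r
# Hypothetical wrapper: build a data dictionary for any labelled data frame
make_dictionary <- function(df) {
  data.frame(
    Variable = names(df),
    Label = vapply(df, function(x) {
      lab <- attr(x, "label", exact = TRUE)      # variable label, or NULL
      if (is.null(lab)) NA_character_ else as.character(lab)
    }, character(1)),
    Value_Labels = vapply(df, function(x) {
      labs <- attr(x, "labels", exact = TRUE)    # named vector of codes, or NULL
      if (is.null(labs)) {
        NA_character_
      } else {
        paste(names(labs), "=", labs, collapse = ", ")
      }
    }, character(1)),
    Missing_Values = vapply(df, function(x) sum(is.na(x)), integer(1)),
    Total_Rows = nrow(df),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data frame standing in for a BDHS recode file
toy <- data.frame(sex = c(1, 2, 2, NA), age = c(34, 29, 5, 61))
attr(toy$sex, "label")  <- "Sex of household member"
attr(toy$sex, "labels") <- c(male = 1, female = 2)

dd.toy <- make_dictionary(toy)
dd.toy
```

With a real file already imported as, say, KR, the call would be dd.KR <- make_dictionary(KR), followed by the same write.csv() step as before.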