Data Cleaning, Household Survey Data
Introduction
In the following code, I clean sections 00, 01, 06, and 12 of Burkina Faso’s 2018-2019 Enquête Harmonisée sur les Conditions de Vie des Ménages (EHCVM). The project had two goals: (1) to build profiles of individuals, based on clusters of characteristics, who are likely to join violent extremist groups in Burkina Faso, and (2) to identify provinces with a higher proportion of these profiles so that UNDP aid can be better directed. Later, we evaluate responses to these survey questions.
Data Source: https://microdata.worldbank.org/index.php/catalog/4290
Data Cleaning
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Thank you for using fastDummies!
## To acknowledge our work, please cite the package:
## Kaplan, J. & Schlegel, B. (2023). fastDummies: Fast Creation of Dummy (Binary) Columns and Rows from Categorical Variables. Version 1.7.1. URL: https://github.com/jacobkap/fastDummies, https://jacobkap.github.io/fastDummies/.
## here() starts at /Users/shanametcalf/Desktop/Work, Resumes, Profesional Photos/Misc.Application Docs/code samples
Section 00: Demographics Part One
We want to pull out the household ID, region, province, village, and area type (urban or rural).
## Rows: 3227 Columns: 51
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (15): s00q00, s00q01, s00q02, s00q03, s00q04, s00q05, s00q07a, s00q07b,...
## dbl (9): hhid, grappe, menage, vague, s00q06, s00q07, hhid_EHCVM1, grappe_...
## lgl (22): s00q07d2, s00q10, s00q11, s00q12a, s00q12b, s00q13, s00q14, s00q1...
## dttm (5): s00q23a, s00q24a, s00q25a, s00q23b, s00q24b
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 6 × 51
## hhid grappe menage vague s00q00 s00q01 s00q02 s00q03 s00q04 s00q05 s00q06
## <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 2001 2 1 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## 2 2002 2 2 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## 3 2004 2 4 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## 4 2005 2 5 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## 5 2007 2 7 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## 6 2040 2 40 1 Burkina F… Sahel Oudal… Gorom… Urbain Secte… 2
## # ℹ 40 more variables: s00q07 <dbl>, s00q07a <chr>, s00q07b <chr>,
## # s00q07c <chr>, s00q07d <chr>, s00q07d2 <lgl>, s00q07e <chr>,
## # hhid_EHCVM1 <dbl>, grappe_EHCVM1 <dbl>, menage_EHCVM1 <dbl>, s00q10 <lgl>,
## # s00q11 <lgl>, s00q12a <lgl>, s00q12b <lgl>, s00q13 <lgl>, s00q14 <lgl>,
## # s00q13a <lgl>, s00q14a <lgl>, s00q15 <lgl>, s00q16 <lgl>, s00q17 <lgl>,
## # s00q18 <lgl>, s00q19 <lgl>, s00q20 <lgl>, s00q22 <lgl>, s00q23a <dttm>,
## # s00q24a <dttm>, s00q25a <dttm>, s00q23b <dttm>, s00q24b <dttm>, …
# process dataset
s00_final <- s00 %>%
  mutate(
    HHID_ID = paste(grappe, menage, vague, sep = "_"), # create unique household ID from cluster (grappe), household (menage), and wave (vague)
    s00q04_type = ifelse(s00q04 == 'Urbain', 'Urban', 'Rural') # translate French to English
  ) %>%
  select(HHID_ID, hhid, region = s00q01, province = s00q02, village = s00q03, s00q04_type)
# confirm the dataset is in the expected format
head(s00_final)
## # A tibble: 6 × 6
## HHID_ID hhid region province village s00q04_type
## <chr> <dbl> <chr> <chr> <chr> <chr>
## 1 2_1_1 2001 Sahel Oudalan Gorom-Gorom Urban
## 2 2_2_1 2002 Sahel Oudalan Gorom-Gorom Urban
## 3 2_4_1 2004 Sahel Oudalan Gorom-Gorom Urban
## 4 2_5_1 2005 Sahel Oudalan Gorom-Gorom Urban
## 5 2_7_1 2007 Sahel Oudalan Gorom-Gorom Urban
## 6 2_40_1 2040 Sahel Oudalan Gorom-Gorom Urban
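Because the later joins rely on these identifiers, it can be worth confirming that the composite key is unique per household. The following is a minimal base R sketch on made-up rows; the data frame `toy_s00` and its values are illustrative assumptions, not survey records.

```r
# Toy stand-in for section 00; values are invented for illustration.
toy_s00 <- data.frame(
  grappe = c(2, 2, 3),
  menage = c(1, 2, 1),
  vague  = c(1, 1, 1)
)

# Same key construction as above: cluster, household, and wave joined by "_".
toy_s00$HHID_ID <- paste(toy_s00$grappe, toy_s00$menage, toy_s00$vague, sep = "_")

# anyDuplicated() returns 0 when every key appears exactly once.
n_dup <- anyDuplicated(toy_s00$HHID_ID)
stopifnot(n_dup == 0)
```

Running the same check on `s00_final$HHID_ID` would show whether the real key behaves the same way.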
Section 01: Demographics Part Two
This section is at the individual level. We want to pull out an individual identifier, age, gender, ethnicity, nationality, relationship status, religion, and a few questions on internet access.
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 26523 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (38): s01q00aa, s01q00b, s01q00b_autre, s01q00d, s01q00f, s01q01, s01q02...
## dbl (26): hhid, grappe, menage, vague, pid, s01q00c, s01q00e, s01q03a, s01q0...
## lgl (3): s01q09__3, s01q09__4, s01q19_autre
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 6 × 67
## hhid grappe menage vague pid s01q00aa s01q00b s01q00b_autre s01q00c s01q00d
## <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
## 1 2001 2 1 1 5 OUI <NA> <NA> NA <NA>
## 2 2001 2 1 1 2 OUI <NA> <NA> NA <NA>
## 3 2001 2 1 1 1 OUI <NA> <NA> NA <NA>
## 4 2001 2 1 1 4 OUI <NA> <NA> NA <NA>
## 5 2001 2 1 1 6 OUI <NA> <NA> NA <NA>
## 6 2001 2 1 1 3 OUI <NA> <NA> NA <NA>
## # ℹ 57 more variables: s01q00e <dbl>, s01q00f <chr>, s01q01 <chr>,
## # s01q02 <chr>, s01q03a <dbl>, s01q03b <chr>, s01q03c <dbl>, s01q04a <dbl>,
## # s01q04b <dbl>, s01q05 <chr>, s01q06 <chr>, s01q00_0 <chr>, s01q00 <dbl>,
## # s01q07 <chr>, s01q08 <chr>, s01q09__0 <dbl>, s01q09__1 <dbl>,
## # s01q09__2 <dbl>, s01q09__3 <lgl>, s01q09__4 <lgl>, s01q10 <dbl>,
## # s01q11 <chr>, s01q12 <chr>, s01q13 <chr>, s01q14 <chr>, s01q15 <chr>,
## # s01q16 <chr>, s01q17 <chr>, s01q18 <chr>, s01q19 <chr>, …
# process dataset
s01_final <- s01 %>%
  mutate(
    HHID_ID = paste(grappe, menage, vague, pid, sep = "_"), # create unique individual ID (household key plus person ID)
    # translate variables from French to English and create new columns for gender, ethnicity, religion, and relationship status
    s01q01_gender = ifelse(s01q01 == 'Masculin', 'Male', 'Female') %>% as_factor(),
    s01q16_ethnicity = case_when(
      s01q16 == 'Autres ethnies' ~ 'Other',
      TRUE ~ s01q16
    ) %>% as_factor(),
    s01q3c_age = 2022 - s01q03c, # create age variable for later analysis
    s01q14_religion = case_when(
      s01q14 == 'Musulman' ~ 'Muslim',
      s01q14 == 'Animiste' ~ 'Animist',
      s01q14 == 'Chrétien' ~ 'Christian',
      s01q14 == 'Sans Réligion' ~ 'No religion',
      s01q14 == 'Autre Réligion' ~ 'Other',
      TRUE ~ NA_character_
    ) %>% as_factor(),
    s01q07_relationship_status = case_when(
      s01q07 == 'Célibataire' ~ 'Single',
      s01q07 == 'Marié(e) polygame' ~ 'Polygamous marriage',
      s01q07 == 'Marié(e) monogame' ~ 'Monogamous marriage',
      s01q07 == 'Veuf(ve)' ~ 'Widowed',
      s01q07 == 'Union libre' ~ 'Cohabiting',
      s01q07 == 'Divorcé(e)' ~ 'Divorced',
      s01q07 == 'Séparé(e)' ~ 'Separated',
      TRUE ~ NA_character_
    ) %>% as_factor()
  ) %>%
  # rename columns
  rename(
    s01q15_nationality = s01q15,
    s01q39a_internet_on_phone = s01q39__1,
    s01q39b_internet_at_work = s01q39__2,
    s01q39c_internet_at_cyber_cafe = s01q39__3,
    s01q39d_internet_at_home = s01q39__4,
    s01q39e_internet_at_school = s01q39__5
  ) %>%
  # select relevant variables
  select(
    HHID_ID, hhid, s01q01_gender, s01q3c_age, s01q07_relationship_status,
    s01q14_religion, s01q15_nationality, s01q16_ethnicity, s01q39a_internet_on_phone,
    s01q39b_internet_at_work, s01q39c_internet_at_cyber_cafe, s01q39d_internet_at_home,
    s01q39e_internet_at_school
  )
head(s01_final)
## # A tibble: 6 × 13
## HHID_ID hhid s01q01_gender s01q3c_age s01q07_relationship_s…¹ s01q14_religion
## <chr> <dbl> <fct> <dbl> <fct> <fct>
## 1 2_1_1_5 2001 Male 15 Single Muslim
## 2 2_1_1_2 2001 Female 43 Monogamous marriage Muslim
## 3 2_1_1_1 2001 Male 52 Monogamous marriage Muslim
## 4 2_1_1_4 2001 Male 15 Single Muslim
## 5 2_1_1_6 2001 Male 21 Single Muslim
## 6 2_1_1_3 2001 Female 9 <NA> Muslim
## # ℹ abbreviated name: ¹s01q07_relationship_status
## # ℹ 7 more variables: s01q15_nationality <chr>, s01q16_ethnicity <fct>,
## # s01q39a_internet_on_phone <dbl>, s01q39b_internet_at_work <dbl>,
## # s01q39c_internet_at_cyber_cafe <dbl>, s01q39d_internet_at_home <dbl>,
## # s01q39e_internet_at_school <dbl>
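A recode with a `TRUE ~ NA_character_` fallback silently turns any label that is not listed (for example, one with a different spelling or accent) into `NA`. The sketch below shows one way to surface such labels, using a named lookup vector in base R. The lookup `religion_map`, the toy vector `raw`, and the label "Autre" are hypothetical and only loosely mirror the recode above.

```r
# Hypothetical lookup table; labels loosely mirror the religion recode above.
religion_map <- c(
  "Musulman" = "Muslim",
  "Animiste" = "Animist"
)

# Toy responses: one label the lookup does not cover, and one true NA.
raw <- c("Musulman", "Animiste", "Autre", NA)

# Indexing by name returns NA for labels that are missing from the lookup.
recoded <- unname(religion_map[raw])

# Labels present in the raw data that became NA after recoding.
unmapped <- unique(raw[!is.na(raw) & is.na(recoded)])
print(unmapped)
```

An empty `unmapped` vector would suggest that every non-missing response was given a translation.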
Section 06: Savings and Credit Data
We want to pull out household number, access to bank accounts, access to credit, and follow-up questions on these.
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 26523 Columns: 35
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (15): s06q00_0, s06q02, s06q03, s06q04, s06q04_autre, s06q05, s06q06, s0...
## dbl (17): hhid, grappe, menage, vague, pid, s06q00, s06q01__1, s06q01__2, s0...
## lgl (3): s06q06_autre, s06q011_autre, s06q12_autre
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# process dataset
s06_final <- s06 %>%
  mutate(
    HHID_ID = paste(grappe, menage, vague, pid, sep = "_"), # create unique individual ID (household key plus person ID)
    # create new columns for type of bank account access
    s06q01a_has_standard_bank_account = s06q01__1,
    s06q01b_has_bank_account_through_job = s06q01__2,
    s06q01c_has_rural_savings_bank_account = s06q01__3,
    s06q01d_has_mobile_bank_account = s06q01__4,
    s06q01e_has_prepaid_card = s06q01__5,
    s06q03_applied_for_credit_last_twelve = ifelse(s06q03 == 'Oui', 1, 0),
    s06q05_obtained_credit_last_twelve = ifelse(s06q05 == 'Oui', 1, 0),
    s06q02_has_savings_account = ifelse(s06q02 == 'Oui', 1, 0),
    # translate from French to English and create new column for follow-up credit question
    s06q04_if_no_credit_why = case_when(
      s06q04 == 'Pas nécessaire' ~ 'Didnt_need',
      s06q04 == 'Ne remplit pas les conditions' ~ 'Did not qualify',
      s06q04 == 'Ne sait pas comment demander' ~ 'Do not know how to ask for one',
      s06q04 == 'Pas capable de rembourser' ~ 'Unable to repay',
      s06q04 == "Absence d'institutions de crédit" ~ 'No credit institutions available',
      s06q04 == "N'est pas sûr d'en obtenir un" ~ 'Not sure you would get one',
      s06q04 == "Taux d'intérêts élevés" ~ 'High interest rates',
      s06q04 == 'Autre crédit en cours' ~ 'Other source of credit',
      s06q04 == 'Autre (à préciser)' ~ 'Other reason',
      TRUE ~ NA_character_
    ) %>% as_factor()
  ) %>%
  select(
    HHID_ID, hhid, s06q01a_has_standard_bank_account, s06q01b_has_bank_account_through_job,
    s06q01c_has_rural_savings_bank_account, s06q01d_has_mobile_bank_account, s06q01e_has_prepaid_card,
    s06q02_has_savings_account, s06q03_applied_for_credit_last_twelve, s06q05_obtained_credit_last_twelve,
    s06q04_if_no_credit_why
  ) %>%
  # create dummy variables for categorical variables
  fastDummies::dummy_cols(select_columns = 's06q04_if_no_credit_why') %>%
  select(-s06q04_if_no_credit_why)
head(s06_final)
## # A tibble: 6 × 20
## HHID_ID hhid s06q01a_has_standard_bank_account s06q01b_has_bank_account_thr…¹
## <chr> <dbl> <dbl> <dbl>
## 1 2_1_1_3 2001 NA NA
## 2 2_1_1_2 2001 0 0
## 3 2_1_1_1 2001 0 0
## 4 2_1_1_5 2001 NA NA
## 5 2_1_1_7 2001 NA NA
## 6 2_1_1_6 2001 0 0
## # ℹ abbreviated name: ¹s06q01b_has_bank_account_through_job
## # ℹ 16 more variables: s06q01c_has_rural_savings_bank_account <dbl>,
## # s06q01d_has_mobile_bank_account <dbl>, s06q01e_has_prepaid_card <dbl>,
## # s06q02_has_savings_account <dbl>,
## # s06q03_applied_for_credit_last_twelve <dbl>,
## # s06q05_obtained_credit_last_twelve <dbl>,
## # `s06q04_if_no_credit_why_Did not qualify` <int>, …
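Dummy (one-hot) encoding turns a single categorical column into one 0/1 column per category. The base R sketch below mimics that shape on invented values, including a separate indicator for missing responses. It illustrates the idea only and is not fastDummies' own implementation; `reason` is a hypothetical stand-in for `s06q04_if_no_credit_why`.

```r
# Toy categorical variable with one missing response (invented values).
reason <- factor(c("Unable to repay", "High interest rates", NA, "Unable to repay"))

# One indicator column per observed level; comparisons with NA stay NA.
dummies <- sapply(levels(reason), function(lvl) as.integer(reason == lvl))

# Extra indicator column flagging rows where the response is missing.
dummies <- cbind(dummies, missing = as.integer(is.na(reason)))
print(dummies)
```

In this sketch, a row with a missing response is `NA` in every category column and 1 in the `missing` column, so missingness stays distinguishable from a "no".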
Section 12: Household Asset Data
We want to create a dataset where each row is a unique household and each column is a household item. The cells should be populated with either 0 (does not own) or 1 (does own) depending on the household’s survey response.
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 145215 Columns: 36
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): s12q01, s12q02, s12q04, s12q06
## dbl (16): hhid, grappe, menage, vague, s12q00, s12q03, s12q05__0, s12q05__1,...
## lgl (16): s12q05__7, s12q05__8, s12q05__9, s12q05__10, s12q05__11, s12q05__1...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# process dataset
s12_final <- s12 %>%
  mutate(
    HHID_ID = paste(grappe, menage, vague, sep = "_"), # create unique household ID
    # translate household assets from French to English
    s12q01_has = case_when(
      s12q01 == 'Salon (Fauteuils et table basse)' ~ 'Living room',
      s12q01 == 'Table à manger (table + chaises)' ~ 'Dining table',
      s12q01 == 'Lit' ~ 'Bed',
      s12q01 == 'Matelas simple' ~ 'Mattress',
      s12q01 == 'Armoires et autres meubles' ~ 'Wardrobes other furniture',
      s12q01 == 'Tapis' ~ 'Carpet',
      s12q01 == 'Fer à repasser électrique' ~ 'Electric iron',
      s12q01 == 'Fer à repasser à charbon' ~ 'Coal iron',
      s12q01 == 'Cuisinière à gaz ou électrique' ~ 'Stove',
      s12q01 == 'Bonbonne de gaz' ~ 'Propane tank',
      s12q01 == 'Réchaud (plaque) à gaz ou électrique' ~ 'Portable stove',
      s12q01 == 'Four à micro-onde ou électrique' ~ 'Microwave',
      s12q01 == 'Foyers améliorés' ~ 'Cookstoves',
      s12q01 == 'Robot de cuisine électrique (Moulinex)' ~ 'Kitchen mixer',
      s12q01 == 'Mixeur/Presse-fruits non électrique' ~ 'Blender',
      s12q01 == 'Réfrigérateur' ~ 'Refrigerator',
      s12q01 == 'Congélateur' ~ 'Freezer',
      s12q01 == 'Ventilateur sur pied' ~ 'Fan',
      s12q01 == 'Radio simple/Radiocassette' ~ 'Radio',
      s12q01 == 'Appareil TV' ~ 'TV',
      s12q01 == 'Magnétoscope/CD/DVD' ~ 'VCR CD DVD player',
      s12q01 == 'Antenne parabolique / décodeur' ~ 'Satellite antenna',
      s12q01 == 'Lave-linge, sèche linge' ~ 'Washer dryer',
      s12q01 == 'Aspirateur' ~ 'Vacuum cleaner',
      s12q01 == 'Climatiseurs/Splits amovibles' ~ 'Air conditioner',
      s12q01 == 'Tondeuse à gazon et autre article de jardinage' ~ 'Lawn mower',
      s12q01 == 'Groupe électrogène' ~ 'Generator',
      s12q01 == 'Voiture personnelle' ~ 'Personal car',
      s12q01 == 'Cyclomoteur/Vélomoteur, motocyclette' ~ 'Motorbike',
      s12q01 == 'Bicyclette' ~ 'Bicycle',
      s12q01 == 'Appareil photo' ~ 'Camera',
      s12q01 == 'Camescope' ~ 'Camcorder',
      s12q01 == 'Chaîne Hi Fi' ~ 'Hi Fi system',
      s12q01 == 'Téléphone fixe' ~ 'Landline',
      s12q01 == 'Téléphone portable' ~ 'Cellphone',
      s12q01 == 'Tablette' ~ 'Tablet',
      s12q01 == 'Ordinateur' ~ 'Computer',
      s12q01 == 'Imprimante/Fax' ~ 'Printer',
      s12q01 == 'Caméra Vidéo' ~ 'Video camera',
      s12q01 == 'Pirogue et hors-bord (bateaux de plaisance)' ~ 'Small boat',
      s12q01 == 'Fusils de chasse' ~ 'Hunting rifles',
      s12q01 == 'Guitare' ~ 'Guitar',
      s12q01 == 'Piano et autre appareil de musique' ~ 'Musical instruments',
      s12q01 == 'Immeuble/Maison' ~ 'House',
      s12q01 == 'Terrain non bâti' ~ 'Land',
      TRUE ~ NA_character_
    ) %>% as_factor(),
    s12q02 = ifelse(s12q02 == 'Oui', 1, 0) # if household has the item, marked 1
  ) %>%
  # pivot data to household level
  pivot_wider(
    id_cols = c(HHID_ID, hhid),
    names_from = s12q01_has,
    values_from = s12q02
  ) %>%
  # select final relevant columns
  select(-HHID_ID)
head(s12_final)
## # A tibble: 6 × 46
## hhid Bicycle `Air conditioner` `Lawn mower` Bed Fan `VCR CD DVD player`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2001 1 0 0 0 1 0
## 2 2002 1 0 0 1 0 0
## 3 2004 0 0 0 0 0 0
## 4 2005 0 0 0 1 0 0
## 5 2007 0 0 0 0 0 0
## 6 2040 1 0 0 1 1 0
## # ℹ 39 more variables: `Video camera` <dbl>, Mattress <dbl>,
## # `Small boat` <dbl>, TV <dbl>, Camcorder <dbl>, `Dining table` <dbl>,
## # Motorbike <dbl>, Camera <dbl>, Carpet <dbl>, `Vacuum cleaner` <dbl>,
## # Refrigerator <dbl>, Radio <dbl>, `Kitchen mixer` <dbl>,
## # `Hunting rifles` <dbl>, Computer <dbl>, `Musical instruments` <dbl>,
## # Generator <dbl>, `Coal iron` <dbl>, `Hi Fi system` <dbl>,
## # `Personal car` <dbl>, Cookstoves <dbl>, Cellphone <dbl>, Freezer <dbl>, …
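A long-to-wide reshape like the one above assumes each household and item pair appears only once; when pairs repeat, `pivot_wider()` warns and returns list-columns instead of plain 0/1 values. The sketch below checks for repeated pairs and then reshapes, using base R only so that it runs without extra packages. The data frame `long` and its values are invented, and `reshape()` is a substitute for the tidyr call used in this document.

```r
# Toy long-format asset data: one row per household and item (invented values).
long <- data.frame(
  hhid = c(2001, 2001, 2002, 2002),
  item = c("Bed", "Fan", "Bed", "Fan"),
  owns = c(1, 0, 0, 1)
)

# Rows that repeat an earlier (household, item) pair; ideally there are none.
dup_pairs <- long[duplicated(long[c("hhid", "item")]), ]
stopifnot(nrow(dup_pairs) == 0)

# Base R long-to-wide reshape: one row per household, one column per item.
wide <- reshape(long, idvar = "hhid", timevar = "item", direction = "wide")
print(wide)
```

The resulting columns are named `owns.Bed` and `owns.Fan` here, whereas `pivot_wider()` uses the item labels directly.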
Data Merging, Sections 1, 6, and 12
# first, merge the individual-level datasets (sections 1 and 6); all.x and all.y keep unmatched rows from both sides (full outer join)
s01_s06 <-
  merge(s01_final, s06_final, by = c("HHID_ID", "hhid"), all.x = TRUE, all.y = TRUE)
head(s01_s06)
## HHID_ID hhid s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026 Male 39 Monogamous marriage
## 2 100_26_2_2 100026 Female 37 Monogamous marriage
## 3 100_26_2_3 100026 Female 7 <NA>
## 4 100_26_2_4 100026 Male 15 Single
## 5 100_26_2_5 100026 Male 3 <NA>
## 6 100_46_2_1 100046 Male 58 Monogamous marriage
## s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1 Muslim Burkina Faso Mossi 1
## 2 Muslim Burkina Faso Mossi 0
## 3 Muslim Burkina Faso Mossi NA
## 4 Muslim Burkina Faso Mossi 0
## 5 Muslim Burkina Faso Mossi NA
## 6 Christian Burkina Faso Mossi 0
## s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s01q39d_internet_at_home s01q39e_internet_at_school
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1 1 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1 0 1
## 2 0 1
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 1 1
## s06q01e_has_prepaid_card s06q02_has_savings_account
## 1 0 1
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 1
## s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1 0 NA
## 2 0 NA
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 NA
## s06q04_if_no_credit_why_Did not qualify
## 1 0
## 2 1
## 3 NA
## 4 NA
## 5 NA
## 6 1
## s06q04_if_no_credit_why_High interest rates
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Not sure you would get one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Didnt_need
## 1 1
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Other source of credit
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Do not know how to ask for one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q04_if_no_credit_why_No credit institutions available
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_NA
## 1 0
## 2 0
## 3 1
## 4 1
## 5 1
## 6 0
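A full outer merge can return more rows than either input when some keys exist on only one side. The base R sketch below lists those keys on invented data; the data frames `people` and `finance` and their values are hypothetical stand-ins for `s01_final` and `s06_final`.

```r
# Toy individual-level tables sharing a key (invented values).
people  <- data.frame(HHID_ID = c("2_1_1_1", "2_1_1_2", "2_1_1_3"), age = c(52, 43, 9))
finance <- data.frame(HHID_ID = c("2_1_1_1", "2_1_1_2", "2_1_1_7"), has_account = c(0, 0, NA))

# all = TRUE keeps unmatched rows from both sides (a full outer join).
both <- merge(people, finance, by = "HHID_ID", all = TRUE)

# Keys present on only one side explain any growth in the row count.
only_people  <- setdiff(people$HHID_ID, finance$HHID_ID)
only_finance <- setdiff(finance$HHID_ID, people$HHID_ID)
print(nrow(both))
```

Applying the same `setdiff()` calls to the real `HHID_ID` columns would show which individuals appear in only one of the two sections.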
# second, merge section 12 (household level) to individual level merged dataset
s01_s06_s12 <- left_join(s01_s06, s12_final, by = 'hhid')
head(s01_s06_s12)
## HHID_ID hhid s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026 Male 39 Monogamous marriage
## 2 100_26_2_2 100026 Female 37 Monogamous marriage
## 3 100_26_2_3 100026 Female 7 <NA>
## 4 100_26_2_4 100026 Male 15 Single
## 5 100_26_2_5 100026 Male 3 <NA>
## 6 100_46_2_1 100046 Male 58 Monogamous marriage
## s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1 Muslim Burkina Faso Mossi 1
## 2 Muslim Burkina Faso Mossi 0
## 3 Muslim Burkina Faso Mossi NA
## 4 Muslim Burkina Faso Mossi 0
## 5 Muslim Burkina Faso Mossi NA
## 6 Christian Burkina Faso Mossi 0
## s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s01q39d_internet_at_home s01q39e_internet_at_school
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1 1 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1 0 1
## 2 0 1
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 1 1
## s06q01e_has_prepaid_card s06q02_has_savings_account
## 1 0 1
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 1
## s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1 0 NA
## 2 0 NA
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 NA
## s06q04_if_no_credit_why_Did not qualify
## 1 0
## 2 1
## 3 NA
## 4 NA
## 5 NA
## 6 1
## s06q04_if_no_credit_why_High interest rates
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Not sure you would get one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Didnt_need
## 1 1
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Other source of credit
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Do not know how to ask for one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q04_if_no_credit_why_No credit institutions available
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_NA Bicycle Air conditioner Lawn mower Bed Fan
## 1 0 1 0 0 1 1
## 2 0 1 0 0 1 1
## 3 1 1 0 0 1 1
## 4 1 1 0 0 1 1
## 5 1 1 0 0 1 1
## 6 0 1 0 0 0 0
## VCR CD DVD player Video camera Mattress Small boat TV Camcorder Dining table
## 1 0 0 1 0 1 0 0
## 2 0 0 1 0 1 0 0
## 3 0 0 1 0 1 0 0
## 4 0 0 1 0 1 0 0
## 5 0 0 1 0 1 0 0
## 6 0 0 1 0 0 0 0
## Motorbike Camera Carpet Vacuum cleaner Refrigerator Radio Kitchen mixer
## 1 1 0 0 0 1 0 0
## 2 1 0 0 0 1 0 0
## 3 1 0 0 0 1 0 0
## 4 1 0 0 0 1 0 0
## 5 1 0 0 0 1 0 0
## 6 1 0 0 0 0 0 0
## Hunting rifles Computer Musical instruments Generator Coal iron Hi Fi system
## 1 0 0 0 0 0 0
## 2 0 0 0 0 0 0
## 3 0 0 0 0 0 0
## 4 0 0 0 0 0 0
## 5 0 0 0 0 0 0
## 6 0 0 0 0 0 0
## Personal car Cookstoves Cellphone Freezer Guitar Electric iron Printer
## 1 0 1 1 0 0 0 0
## 2 0 1 1 0 0 0 0
## 3 0 1 1 0 0 0 0
## 4 0 1 1 0 0 0 0
## 5 0 1 1 0 0 0 0
## 6 0 1 1 0 0 0 0
## Propane tank Living room Portable stove Microwave Landline Satellite antenna
## 1 1 0 0 0 0 0
## 2 1 0 0 0 0 0
## 3 1 0 0 0 0 0
## 4 1 0 0 0 0 0
## 5 1 0 0 0 0 0
## 6 0 0 0 0 0 0
## Washer dryer House Stove Tablet Land Blender Wardrobes other furniture
## 1 0 1 0 0 1 0 1
## 2 0 1 0 0 1 0 1
## 3 0 1 0 0 1 0 1
## 4 0 1 0 0 1 0 1
## 5 0 1 0 0 1 0 1
## 6 0 0 0 0 0 0 0
## HHID_ID HHID s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026 Male 39 Monogamous marriage
## 2 100_26_2_2 100026 Female 37 Monogamous marriage
## 3 100_26_2_3 100026 Female 7 <NA>
## 4 100_26_2_4 100026 Male 15 Single
## 5 100_26_2_5 100026 Male 3 <NA>
## 6 100_46_2_1 100046 Male 58 Monogamous marriage
## s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1 Muslim Burkina Faso Mossi 1
## 2 Muslim Burkina Faso Mossi 0
## 3 Muslim Burkina Faso Mossi NA
## 4 Muslim Burkina Faso Mossi 0
## 5 Muslim Burkina Faso Mossi NA
## 6 Christian Burkina Faso Mossi 0
## s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s01q39d_internet_at_home s01q39e_internet_at_school
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 0 0
## 5 NA NA
## 6 0 0
## s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1 1 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1 0 1
## 2 0 1
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 1 1
## s06q01e_has_prepaid_card s06q02_has_savings_account
## 1 0 1
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 1
## s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1 0 NA
## 2 0 NA
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 NA
## s06q04_if_no_credit_why_Did not qualify
## 1 0
## 2 1
## 3 NA
## 4 NA
## 5 NA
## 6 1
## s06q04_if_no_credit_why_High interest rates
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Not sure you would get one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Didnt_need
## 1 1
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Other source of credit
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Do not know how to ask for one
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1 0 0
## 2 0 0
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 0 0
## s06q04_if_no_credit_why_No credit institutions available
## 1 0
## 2 0
## 3 NA
## 4 NA
## 5 NA
## 6 0
## s06q04_if_no_credit_why_NA Bicycle Air conditioner Lawn mower Bed Fan
## 1 0 1 0 0 1 1
## 2 0 1 0 0 1 1
## 3 1 1 0 0 1 1
## 4 1 1 0 0 1 1
## 5 1 1 0 0 1 1
## 6 0 1 0 0 0 0
## VCR CD DVD player Video camera Mattress Small boat TV Camcorder Dining table
## 1 0 0 1 0 1 0 0
## 2 0 0 1 0 1 0 0
## 3 0 0 1 0 1 0 0
## 4 0 0 1 0 1 0 0
## 5 0 0 1 0 1 0 0
## 6 0 0 1 0 0 0 0
## Motorbike Camera Carpet Vacuum cleaner Refrigerator Radio Kitchen mixer
## 1 1 0 0 0 1 0 0
## 2 1 0 0 0 1 0 0
## 3 1 0 0 0 1 0 0
## 4 1 0 0 0 1 0 0
## 5 1 0 0 0 1 0 0
## 6 1 0 0 0 0 0 0
## Hunting rifles Computer Musical instruments Generator Coal iron Hi Fi system
## 1 0 0 0 0 0 0
## 2 0 0 0 0 0 0
## 3 0 0 0 0 0 0
## 4 0 0 0 0 0 0
## 5 0 0 0 0 0 0
## 6 0 0 0 0 0 0
## Personal car Cookstoves Cellphone Freezer Guitar Electric iron Printer
## 1 0 1 1 0 0 0 0
## 2 0 1 1 0 0 0 0
## 3 0 1 1 0 0 0 0
## 4 0 1 1 0 0 0 0
## 5 0 1 1 0 0 0 0
## 6 0 1 1 0 0 0 0
## Propane tank Living room Portable stove Microwave Landline Satellite antenna
## 1 1 0 0 0 0 0
## 2 1 0 0 0 0 0
## 3 1 0 0 0 0 0
## 4 1 0 0 0 0 0
## 5 1 0 0 0 0 0
## 6 0 0 0 0 0 0
## Washer dryer House Stove Tablet Land Blender Wardrobes other furniture
## 1 0 1 0 0 1 0 1
## 2 0 1 0 0 1 0 1
## 3 0 1 0 0 1 0 1
## 4 0 1 0 0 1 0 1
## 5 0 1 0 0 1 0 1
## 6 0 0 0 0 0 0 0
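When household-level columns are attached to individual-level rows, the individual row count should stay the same as long as the household table has one row per `hhid`. The following base R sketch checks this on invented data; `people` and `assets` are hypothetical stand-ins for `s01_s06` and `s12_final`, and `merge(..., all.x = TRUE)` is used here in place of `left_join()`.

```r
# Toy individual-level and household-level tables (invented values).
people <- data.frame(hhid = c(2001, 2001, 2002), pid = c(1, 2, 1))
assets <- data.frame(hhid = c(2001, 2002), Bicycle = c(1, 1))

# The household table should have one row per hhid before joining.
stopifnot(anyDuplicated(assets$hhid) == 0)

# all.x = TRUE keeps every individual, like a left join.
joined <- merge(people, assets, by = "hhid", all.x = TRUE)

# With a unique right-hand key, the individual row count is unchanged.
stopifnot(nrow(joined) == nrow(people))
```

If the second check failed on the real data, duplicated `hhid` values in the household table would be the first thing to look for.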