Data Cleaning, Household Survey Data

Introduction

In the following code, I clean sections 00, 01, 06, and 12 of Burkina Faso’s 2018-2019 Enquête Harmonisée sur les Conditions de Vie des Ménages (EHCVM). The project had two goals: (1) to build profiles of individuals (based on clusters of characteristics) likely to join violent extremist groups in Burkina Faso, and (2) to identify provinces with a higher proportion of these profiles so that UNDP aid can be better directed. Responses to these survey questions are evaluated in a later stage of the analysis.

Data Source: https://microdata.worldbank.org/index.php/catalog/4290

Data Cleaning

# load libraries
library(utils)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(fastDummies)
## Thank you for using fastDummies!
## To acknowledge our work, please cite the package:
## Kaplan, J. & Schlegel, B. (2023). fastDummies: Fast Creation of Dummy (Binary) Columns and Rows from Categorical Variables. Version 1.7.1. URL: https://github.com/jacobkap/fastDummies, https://jacobkap.github.io/fastDummies/.
library(ggplot2)
library(here)
## here() starts at /Users/shanametcalf/Desktop/Work, Resumes, Profesional Photos/Misc.Application Docs/code samples
library(dplyr)
library(purrr)
library(skimr)

Section 00: Demographics Part One

We want to pull out the household ID, region, province, village, and area type (urban or rural).

# loading dataset
s00 <- read_csv(here("s00_me_bfa2021.csv"))
## Rows: 3227 Columns: 51
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (15): s00q00, s00q01, s00q02, s00q03, s00q04, s00q05, s00q07a, s00q07b,...
## dbl   (9): hhid, grappe, menage, vague, s00q06, s00q07, hhid_EHCVM1, grappe_...
## lgl  (22): s00q07d2, s00q10, s00q11, s00q12a, s00q12b, s00q13, s00q14, s00q1...
## dttm  (5): s00q23a, s00q24a, s00q25a, s00q23b, s00q24b
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(s00)
## # A tibble: 6 × 51
##    hhid grappe menage vague s00q00     s00q01 s00q02 s00q03 s00q04 s00q05 s00q06
##   <dbl>  <dbl>  <dbl> <dbl> <chr>      <chr>  <chr>  <chr>  <chr>  <chr>   <dbl>
## 1  2001      2      1     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## 2  2002      2      2     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## 3  2004      2      4     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## 4  2005      2      5     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## 5  2007      2      7     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## 6  2040      2     40     1 Burkina F… Sahel  Oudal… Gorom… Urbain Secte…      2
## # ℹ 40 more variables: s00q07 <dbl>, s00q07a <chr>, s00q07b <chr>,
## #   s00q07c <chr>, s00q07d <chr>, s00q07d2 <lgl>, s00q07e <chr>,
## #   hhid_EHCVM1 <dbl>, grappe_EHCVM1 <dbl>, menage_EHCVM1 <dbl>, s00q10 <lgl>,
## #   s00q11 <lgl>, s00q12a <lgl>, s00q12b <lgl>, s00q13 <lgl>, s00q14 <lgl>,
## #   s00q13a <lgl>, s00q14a <lgl>, s00q15 <lgl>, s00q16 <lgl>, s00q17 <lgl>,
## #   s00q18 <lgl>, s00q19 <lgl>, s00q20 <lgl>, s00q22 <lgl>, s00q23a <dttm>,
## #   s00q24a <dttm>, s00q25a <dttm>, s00q23b <dttm>, s00q24b <dttm>, …
# process dataset
s00_final <- s00 %>%
  mutate(
    # household-level ID built from cluster (grappe), household (menage), and wave (vague)
    HHID_ID = paste(grappe, menage, vague, sep = "_"),
    s00q04_type = ifelse(s00q04 == 'Urbain', 'Urban', 'Rural') # translate French to English
  ) %>%
  select(HHID_ID, hhid, region = s00q01, province = s00q02, village = s00q03, s00q04_type)

# ensure we have dataset in the format we want it 
head(s00_final)
## # A tibble: 6 × 6
##   HHID_ID  hhid region province village     s00q04_type
##   <chr>   <dbl> <chr>  <chr>    <chr>       <chr>      
## 1 2_1_1    2001 Sahel  Oudalan  Gorom-Gorom Urban      
## 2 2_2_1    2002 Sahel  Oudalan  Gorom-Gorom Urban      
## 3 2_4_1    2004 Sahel  Oudalan  Gorom-Gorom Urban      
## 4 2_5_1    2005 Sahel  Oudalan  Gorom-Gorom Urban      
## 5 2_7_1    2007 Sahel  Oudalan  Gorom-Gorom Urban      
## 6 2_40_1   2040 Sahel  Oudalan  Gorom-Gorom Urban
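
Because later sections are joined on these keys, it helps to confirm that the constructed ID identifies exactly one row per household and that missing area types stay missing. The following is a minimal sketch on made-up data; `toy_s00`, `toy_final`, and `duplicated_ids` are hypothetical names, not objects from the survey files.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the household roster; all values are invented for illustration
toy_s00 <- tibble(
  grappe = c(2, 2, 3),
  menage = c(1, 2, 1),
  vague  = c(1, 1, 1),
  s00q04 = c("Urbain", "Rural", NA)
)

toy_final <- toy_s00 %>%
  mutate(
    HHID_ID = paste(grappe, menage, vague, sep = "_"),
    # ifelse() returns NA when its test is NA, so a missing area type
    # is not silently labelled "Rural"
    s00q04_type = ifelse(s00q04 == "Urbain", "Urban", "Rural")
  )

# IDs that occur more than once; zero rows means one row per household
duplicated_ids <- toy_final %>%
  count(HHID_ID) %>%
  filter(n > 1)

nrow(duplicated_ids)
```

Running the same `count()` and `filter()` steps on `s00_final` would show whether any household key is repeated in the real data.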

Section 01: Demographics Part Two

We want to pull out the household ID, age, gender, ethnicity, nationality, relationship status, religion, and five indicators of where the respondent accesses the internet.

# loading dataset
s01 <- read_csv(here("s01_me_bfa2021.csv"))
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 26523 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (38): s01q00aa, s01q00b, s01q00b_autre, s01q00d, s01q00f, s01q01, s01q02...
## dbl (26): hhid, grappe, menage, vague, pid, s01q00c, s01q00e, s01q03a, s01q0...
## lgl  (3): s01q09__3, s01q09__4, s01q19_autre
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(s01)
## # A tibble: 6 × 67
##    hhid grappe menage vague   pid s01q00aa s01q00b s01q00b_autre s01q00c s01q00d
##   <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>    <chr>   <chr>           <dbl> <chr>  
## 1  2001      2      1     1     5 OUI      <NA>    <NA>               NA <NA>   
## 2  2001      2      1     1     2 OUI      <NA>    <NA>               NA <NA>   
## 3  2001      2      1     1     1 OUI      <NA>    <NA>               NA <NA>   
## 4  2001      2      1     1     4 OUI      <NA>    <NA>               NA <NA>   
## 5  2001      2      1     1     6 OUI      <NA>    <NA>               NA <NA>   
## 6  2001      2      1     1     3 OUI      <NA>    <NA>               NA <NA>   
## # ℹ 57 more variables: s01q00e <dbl>, s01q00f <chr>, s01q01 <chr>,
## #   s01q02 <chr>, s01q03a <dbl>, s01q03b <chr>, s01q03c <dbl>, s01q04a <dbl>,
## #   s01q04b <dbl>, s01q05 <chr>, s01q06 <chr>, s01q00_0 <chr>, s01q00 <dbl>,
## #   s01q07 <chr>, s01q08 <chr>, s01q09__0 <dbl>, s01q09__1 <dbl>,
## #   s01q09__2 <dbl>, s01q09__3 <lgl>, s01q09__4 <lgl>, s01q10 <dbl>,
## #   s01q11 <chr>, s01q12 <chr>, s01q13 <chr>, s01q14 <chr>, s01q15 <chr>,
## #   s01q16 <chr>, s01q17 <chr>, s01q18 <chr>, s01q19 <chr>, …
# process dataset
s01_final <- s01 %>%
  mutate(HHID_ID = paste(grappe, menage, vague, pid, sep = "_"),   # individual-level ID: household key plus person ID (pid)
         # translate variables from French to English and create new columns for gender, ethnicity, religion, and relationship status
         s01q01_gender = ifelse(s01q01 == 'Masculin', 'Male', 'Female') %>% as_factor(),
         s01q16_ethnicity = case_when(
           s01q16 == 'Autres ethnies' ~ 'Other',
           TRUE ~ s01q16  
         ) %>% as_factor(),
         s01q3c_age = 2022 - s01q03c, # approximate age in 2022, treating s01q03c as the year of birth
         s01q14_religion = case_when(
           s01q14 == 'Musulman' ~ 'Muslim',
           s01q14 == 'Animiste' ~ 'Animist',
           s01q14 == 'Chrétien' ~ 'Christian',
           s01q14 == 'Sans Réligion' ~ 'No religion',
           s01q14 == 'Autre Réligion' ~ 'Other',
           TRUE ~ NA_character_  
         ) %>% as_factor(),
         s01q07_relationship_status = case_when(
           s01q07 == 'Célibataire' ~ 'Single',
           s01q07 == 'Marié(e) polygame' ~ 'Polygamous marriage',
           s01q07 == 'Marié(e) monogame' ~ 'Monogamous marriage',
           s01q07 == 'Veuf(ve)' ~ 'Widowed',
           s01q07 == 'Union libre' ~ 'Cohabiting', # union libre: unmarried couple living together
           s01q07 == 'Divorcé(e)' ~ 'Divorced',
           s01q07 == 'Séparé(e)' ~ 'Separated',
           TRUE ~ NA_character_  
         ) %>% as_factor()
  ) %>%
  # rename columns
  rename(
    s01q15_nationality = s01q15,
    s01q39a_internet_on_phone = s01q39__1,
    s01q39b_internet_at_work = s01q39__2,
    s01q39c_internet_at_cyber_cafe = s01q39__3,
    s01q39d_internet_at_home = s01q39__4,
    s01q39e_internet_at_school = s01q39__5
  ) %>%
  # select relevant variables
  select(
    HHID_ID, hhid, s01q01_gender, s01q3c_age, s01q07_relationship_status, 
    s01q14_religion, s01q15_nationality, s01q16_ethnicity, s01q39a_internet_on_phone, 
    s01q39b_internet_at_work, s01q39c_internet_at_cyber_cafe, s01q39d_internet_at_home, 
    s01q39e_internet_at_school
  )

head(s01_final)
## # A tibble: 6 × 13
##   HHID_ID  hhid s01q01_gender s01q3c_age s01q07_relationship_s…¹ s01q14_religion
##   <chr>   <dbl> <fct>              <dbl> <fct>                   <fct>          
## 1 2_1_1_5  2001 Male                  15 Single                  Muslim         
## 2 2_1_1_2  2001 Female                43 Monogamous marriage     Muslim         
## 3 2_1_1_1  2001 Male                  52 Monogamous marriage     Muslim         
## 4 2_1_1_4  2001 Male                  15 Single                  Muslim         
## 5 2_1_1_6  2001 Male                  21 Single                  Muslim         
## 6 2_1_1_3  2001 Female                 9 <NA>                    Muslim         
## # ℹ abbreviated name: ¹​s01q07_relationship_status
## # ℹ 7 more variables: s01q15_nationality <chr>, s01q16_ethnicity <fct>,
## #   s01q39a_internet_on_phone <dbl>, s01q39b_internet_at_work <dbl>,
## #   s01q39c_internet_at_cyber_cafe <dbl>, s01q39d_internet_at_home <dbl>,
## #   s01q39e_internet_at_school <dbl>
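
One caveat with the `case_when()` translations above is that the final `TRUE ~ NA_character_` branch turns any label that is not listed (for example a spelling variant) into `NA` without warning. The sketch below, on invented data, shows one way to list such unmapped labels by using a named lookup vector instead. `religion_lookup`, `toy_s01`, and the label `"Inconnu"` are hypothetical and only serve the illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical lookup: names are the French labels, values the English ones
religion_lookup <- c(
  "Musulman" = "Muslim",
  "Animiste" = "Animist"
)

# Toy responses; "Inconnu" stands in for a label missing from the lookup
toy_s01 <- tibble(s01q14 = c("Musulman", "Animiste", "Inconnu", NA))

toy_clean <- toy_s01 %>%
  # indexing by name returns NA for labels absent from the lookup
  mutate(religion = unname(religion_lookup[s01q14]))

# Labels present in the data but absent from the lookup
unmapped <- toy_s01 %>%
  filter(!is.na(s01q14), !s01q14 %in% names(religion_lookup)) %>%
  distinct(s01q14)

unmapped
```

An empty `unmapped` table would indicate that every non-missing label was translated, so the remaining `NA` values come from missing responses only.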

Section 06: Savings and Credit Data

We want to pull out household number, access to bank accounts, access to credit, and follow-up questions on these.

# loading dataset
s06 <- read_csv(here("s06_me_bfa2021.csv"))
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 26523 Columns: 35
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (15): s06q00_0, s06q02, s06q03, s06q04, s06q04_autre, s06q05, s06q06, s0...
## dbl (17): hhid, grappe, menage, vague, pid, s06q00, s06q01__1, s06q01__2, s0...
## lgl  (3): s06q06_autre, s06q011_autre, s06q12_autre
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# process dataset
s06_final <- s06 %>% 
  mutate(
    HHID_ID = paste(grappe, menage, vague, pid, sep = "_"), # individual-level ID: household key plus person ID (pid)
    # create new columns for type of bank account access
    s06q01a_has_standard_bank_account = s06q01__1,
    s06q01b_has_bank_account_through_job = s06q01__2,
    s06q01c_has_rural_savings_bank_account = s06q01__3,
    s06q01d_has_mobile_bank_account = s06q01__4,
    s06q01e_has_prepaid_card = s06q01__5,
    s06q03_applied_for_credit_last_twelve = ifelse(s06q03 == 'Oui', 1, 0),
    s06q05_obtained_credit_last_twelve = ifelse(s06q05 == 'Oui', 1, 0),
    s06q02_has_savings_account = ifelse(s06q02 == 'Oui', 1, 0),
    # translate from french to english and create new column for follow up credit question
    s06q04_if_no_credit_why = case_when(
      s06q04 == 'Pas nécessaire' ~ 'Didnt_need',
      s06q04 == 'Ne remplit pas les conditions' ~ 'Did not qualify',
      s06q04 == 'Ne sait pas comment demander' ~ 'Do not know how to ask for one',
      s06q04 == 'Pas capable de rembourser' ~ 'Unable to repay',
      s06q04 == "Absence d'institutions de crédit" ~ 'No credit institutions available',
      s06q04 == "N'est pas sûr d'en obtenir un" ~ 'Not sure you would get one',
      s06q04 == "Taux d'intérêts élevés" ~ 'High interest rates',
      s06q04 == 'Autre crédit en cours' ~ 'Other source of credit',
      s06q04 == 'Autre (à préciser)' ~ 'Other reason',
      TRUE ~ NA_character_
    ) %>% as_factor()
  ) %>%
  select(
    HHID_ID, hhid,
    s06q01a_has_standard_bank_account, s06q01b_has_bank_account_through_job,
    s06q01c_has_rural_savings_bank_account, s06q01d_has_mobile_bank_account,
    s06q01e_has_prepaid_card, s06q02_has_savings_account,
    s06q03_applied_for_credit_last_twelve, s06q05_obtained_credit_last_twelve,
    s06q04_if_no_credit_why
  ) %>%
  # create dummy variables for categorical variables
  fastDummies::dummy_cols(select_columns = 's06q04_if_no_credit_why') %>%
  select(-s06q04_if_no_credit_why)

head(s06_final)
## # A tibble: 6 × 20
##   HHID_ID  hhid s06q01a_has_standard_bank_account s06q01b_has_bank_account_thr…¹
##   <chr>   <dbl>                             <dbl>                          <dbl>
## 1 2_1_1_3  2001                                NA                             NA
## 2 2_1_1_2  2001                                 0                              0
## 3 2_1_1_1  2001                                 0                              0
## 4 2_1_1_5  2001                                NA                             NA
## 5 2_1_1_7  2001                                NA                             NA
## 6 2_1_1_6  2001                                 0                              0
## # ℹ abbreviated name: ¹​s06q01b_has_bank_account_through_job
## # ℹ 16 more variables: s06q01c_has_rural_savings_bank_account <dbl>,
## #   s06q01d_has_mobile_bank_account <dbl>, s06q01e_has_prepaid_card <dbl>,
## #   s06q02_has_savings_account <dbl>,
## #   s06q03_applied_for_credit_last_twelve <dbl>,
## #   s06q05_obtained_credit_last_twelve <dbl>,
## #   `s06q04_if_no_credit_why_Did not qualify` <int>, …
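
The `NA` values in the output above follow from how `ifelse()` treats missing input: a respondent who was not asked a question keeps `NA` rather than being coded 0. A minimal sketch on an invented answer vector (`answers` and `binary` are hypothetical names):

```r
# Toy responses: two "Oui", one "Non", and one skipped question
answers <- c("Oui", "Non", NA, "Oui")

# NA == "Oui" evaluates to NA, and ifelse() passes that NA through
binary <- ifelse(answers == "Oui", 1, 0)

# Share of "Oui" among respondents who actually answered
share_yes <- mean(binary, na.rm = TRUE)

# Tabulate with missing values shown explicitly
table(binary, useNA = "ifany")
```

Keeping skipped questions as `NA` means that shares computed with `na.rm = TRUE` refer only to respondents who answered.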

Section 12: Household Asset Data

We want to create a dataset where each row is a unique household and each column is a household item. The cells should be populated with either 0 (does not own) or 1 (does own) depending on the household’s survey response.

# loading dataset
s12 <- read_csv(here("s12_me_bfa2021.csv"))
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 145215 Columns: 36
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): s12q01, s12q02, s12q04, s12q06
## dbl (16): hhid, grappe, menage, vague, s12q00, s12q03, s12q05__0, s12q05__1,...
## lgl (16): s12q05__7, s12q05__8, s12q05__9, s12q05__10, s12q05__11, s12q05__1...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# process dataset
s12_final <- s12 %>% 
  mutate(
    HHID_ID = paste(grappe, menage, vague, sep = "_"), # create unique household ID 
    # translate household assets from french to english
    s12q01_has = case_when(
      s12q01 == 'Salon  (Fauteuils et table basse)' ~ 'Living room',
      s12q01 == 'Table à manger  (table + chaises)' ~ 'Dining table',
      s12q01 == 'Lit' ~ 'Bed',
      s12q01 == 'Matelas simple' ~ 'Mattress',
      s12q01 == 'Armoires et autres meubles' ~ 'Wardrobes other furniture',
      s12q01 == 'Tapis' ~ 'Carpet',
      s12q01 == 'Fer à repasser électrique' ~ 'Electric iron',
      s12q01 == 'Fer à repasser à charbon' ~ 'Coal iron',
      s12q01 == 'Cuisinière à gaz ou électrique' ~ 'Stove',
      s12q01 == 'Bonbonne de gaz' ~ 'Propane tank',
      s12q01 == 'Réchaud (plaque) à gaz ou électrique' ~ 'Portable stove',
      s12q01 == 'Four à micro-onde ou électrique' ~ 'Microwave',
      s12q01 == 'Foyers améliorés' ~ 'Cookstoves',
      s12q01 == 'Robot de cuisine électrique (Moulinex)' ~ 'Kitchen mixer',
      s12q01 == 'Mixeur/Presse-fruits non électrique' ~ 'Blender',
      s12q01 == 'Réfrigérateur' ~ 'Refrigerator',
      s12q01 == 'Congélateur' ~ 'Freezer',
      s12q01 == 'Ventilateur sur pied' ~ 'Fan',
      s12q01 == 'Radio simple/Radiocassette' ~ 'Radio',
      s12q01 == 'Appareil TV' ~ 'TV',
      s12q01 == 'Magnétoscope/CD/DVD' ~ 'VCR CD DVD player',
      s12q01 == 'Antenne parabolique / décodeur' ~ 'Satellite antenna',
      s12q01 == 'Lave-linge, sèche linge' ~ 'Washer dryer',
      s12q01 == 'Aspirateur' ~ 'Vacuum cleaner',
      s12q01 == 'Climatiseurs/Splits amovibles' ~ 'Air conditioner',
      s12q01 == 'Tondeuse à gazon et autre article de jardinage' ~ 'Lawn mower',
      s12q01 == 'Groupe électrogène' ~ 'Generator',
      s12q01 == 'Voiture personnelle' ~ 'Personal car',
      s12q01 == 'Cyclomoteur/Vélomoteur, motocyclette' ~ 'Motorbike',
      s12q01 == 'Bicyclette' ~ 'Bicycle',
      s12q01 == 'Appareil photo' ~ 'Camera',
      s12q01 == 'Camescope' ~ 'Camcorder',
      s12q01 == 'Chaîne Hi Fi' ~ 'Hi Fi system',
      s12q01 == 'Téléphone fixe' ~ 'Landline',
      s12q01 == 'Téléphone portable' ~ 'Cellphone',
      s12q01 == 'Tablette' ~ 'Tablet',
      s12q01 == 'Ordinateur' ~ 'Computer',
      s12q01 == 'Imprimante/Fax' ~ 'Printer',
      s12q01 == 'Caméra Vidéo' ~ 'Video camera',
      s12q01 == 'Pirogue et hors-bord (bateaux de plaisance)' ~ 'Small boat',
      s12q01 == 'Fusils de chasse' ~ 'Hunting rifles',
      s12q01 == 'Guitare' ~ 'Guitar',
      s12q01 == 'Piano et autre appareil de musique' ~ 'Musical instruments',
      s12q01 == 'Immeuble/Maison' ~ 'House',
      s12q01 == 'Terrain non bâti' ~ 'Land',
      TRUE ~ NA_character_
    ) %>% as_factor(),
    s12q02 = ifelse(s12q02 == 'Oui', 1, 0)  # if household has the item, marked 1
  ) %>%
  # pivot data to household level
  pivot_wider(
    id_cols = c(HHID_ID, hhid),
    names_from = s12q01_has, 
    values_from = s12q02
  ) %>%
  # drop the helper ID; hhid remains as the household-level merge key
  select(-HHID_ID)

head(s12_final)
## # A tibble: 6 × 46
##    hhid Bicycle `Air conditioner` `Lawn mower`   Bed   Fan `VCR CD DVD player`
##   <dbl>   <dbl>             <dbl>        <dbl> <dbl> <dbl>               <dbl>
## 1  2001       1                 0            0     0     1                   0
## 2  2002       1                 0            0     1     0                   0
## 3  2004       0                 0            0     0     0                   0
## 4  2005       0                 0            0     1     0                   0
## 5  2007       0                 0            0     0     0                   0
## 6  2040       1                 0            0     1     1                   0
## # ℹ 39 more variables: `Video camera` <dbl>, Mattress <dbl>,
## #   `Small boat` <dbl>, TV <dbl>, Camcorder <dbl>, `Dining table` <dbl>,
## #   Motorbike <dbl>, Camera <dbl>, Carpet <dbl>, `Vacuum cleaner` <dbl>,
## #   Refrigerator <dbl>, Radio <dbl>, `Kitchen mixer` <dbl>,
## #   `Hunting rifles` <dbl>, Computer <dbl>, `Musical instruments` <dbl>,
## #   Generator <dbl>, `Coal iron` <dbl>, `Hi Fi system` <dbl>,
## #   `Personal car` <dbl>, Cookstoves <dbl>, Cellphone <dbl>, Freezer <dbl>, …
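
The long-to-wide reshape used above can be seen more easily on a small example. This is a sketch on invented data with hypothetical names (`toy_s12`, `item`, `owns`, `toy_wide`); it assumes each household and item pair appears exactly once.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy long-format asset data: one row per household and item
toy_s12 <- tibble(
  hhid = c(2001, 2001, 2002, 2002),
  item = c("Bed", "Radio", "Bed", "Radio"),
  owns = c("Oui", "Non", "Non", "Oui")
)

toy_wide <- toy_s12 %>%
  mutate(owns = ifelse(owns == "Oui", 1, 0)) %>%  # 1 = owns, 0 = does not own
  pivot_wider(
    id_cols = hhid,        # one output row per household
    names_from = item,     # one output column per item
    values_from = owns
  )

toy_wide
```

If a household and item pair occurred more than once, `pivot_wider()` would warn and return list-columns, so checking for duplicates before reshaping is a reasonable precaution.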

Data Merging, Sections 01, 06, and 12

# first, merge the individual-level datasets (sections 01 and 06) with a full outer join
s01_s06 <- 
  merge(s01_final, s06_final, by = c("HHID_ID", "hhid"), all = TRUE)
head(s01_s06)
##      HHID_ID   hhid s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026          Male         39        Monogamous marriage
## 2 100_26_2_2 100026        Female         37        Monogamous marriage
## 3 100_26_2_3 100026        Female          7                       <NA>
## 4 100_26_2_4 100026          Male         15                     Single
## 5 100_26_2_5 100026          Male          3                       <NA>
## 6 100_46_2_1 100046          Male         58        Monogamous marriage
##   s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1          Muslim       Burkina Faso            Mossi                         1
## 2          Muslim       Burkina Faso            Mossi                         0
## 3          Muslim       Burkina Faso            Mossi                        NA
## 4          Muslim       Burkina Faso            Mossi                         0
## 5          Muslim       Burkina Faso            Mossi                        NA
## 6       Christian       Burkina Faso            Mossi                         0
##   s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1                        0                              0
## 2                        0                              0
## 3                       NA                             NA
## 4                        0                              0
## 5                       NA                             NA
## 6                        0                              0
##   s01q39d_internet_at_home s01q39e_internet_at_school
## 1                        0                          0
## 2                        0                          0
## 3                       NA                         NA
## 4                        0                          0
## 5                       NA                         NA
## 6                        0                          0
##   s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1                                 1                                    0
## 2                                 0                                    0
## 3                                NA                                   NA
## 4                                NA                                   NA
## 5                                NA                                   NA
## 6                                 0                                    0
##   s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1                                      0                               1
## 2                                      0                               1
## 3                                     NA                              NA
## 4                                     NA                              NA
## 5                                     NA                              NA
## 6                                      1                               1
##   s06q01e_has_prepaid_card s06q02_has_savings_account
## 1                        0                          1
## 2                        0                          0
## 3                       NA                         NA
## 4                       NA                         NA
## 5                       NA                         NA
## 6                        0                          1
##   s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1                                     0                                 NA
## 2                                     0                                 NA
## 3                                    NA                                 NA
## 4                                    NA                                 NA
## 5                                    NA                                 NA
## 6                                     0                                 NA
##   s06q04_if_no_credit_why_Did not qualify
## 1                                       0
## 2                                       1
## 3                                      NA
## 4                                      NA
## 5                                      NA
## 6                                       1
##   s06q04_if_no_credit_why_High interest rates
## 1                                           0
## 2                                           0
## 3                                          NA
## 4                                          NA
## 5                                          NA
## 6                                           0
##   s06q04_if_no_credit_why_Not sure you would get one
## 1                                                  0
## 2                                                  0
## 3                                                 NA
## 4                                                 NA
## 5                                                 NA
## 6                                                  0
##   s06q04_if_no_credit_why_Didnt_need
## 1                                  1
## 2                                  0
## 3                                 NA
## 4                                 NA
## 5                                 NA
## 6                                  0
##   s06q04_if_no_credit_why_Other source of credit
## 1                                              0
## 2                                              0
## 3                                             NA
## 4                                             NA
## 5                                             NA
## 6                                              0
##   s06q04_if_no_credit_why_Do not know how to ask for one
## 1                                                      0
## 2                                                      0
## 3                                                     NA
## 4                                                     NA
## 5                                                     NA
## 6                                                      0
##   s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1                                       0                                    0
## 2                                       0                                    0
## 3                                      NA                                   NA
## 4                                      NA                                   NA
## 5                                      NA                                   NA
## 6                                       0                                    0
##   s06q04_if_no_credit_why_No credit institutions available
## 1                                                        0
## 2                                                        0
## 3                                                       NA
## 4                                                       NA
## 5                                                       NA
## 6                                                        0
##   s06q04_if_no_credit_why_NA
## 1                          0
## 2                          0
## 3                          1
## 4                          1
## 5                          1
## 6                          0
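
A full outer merge keeps individuals that appear in only one of the two sections, filling the other section's columns with `NA`. The sketch below, on invented data, uses `dplyr::anti_join()` to count such unmatched rows; `toy_s01`, `toy_s06`, and the column names are hypothetical.

```r
library(dplyr)
library(tibble)

# Toy individual-level tables sharing an ID column; values are invented
toy_s01 <- tibble(
  HHID_ID = c("2_1_1_1", "2_1_1_2", "2_1_1_3"),
  age     = c(52, 43, 9)
)
toy_s06 <- tibble(
  HHID_ID     = c("2_1_1_1", "2_1_1_2", "2_1_1_7"),
  has_account = c(0, 0, NA)
)

# Rows with no partner in the other table
only_in_s01 <- anti_join(toy_s01, toy_s06, by = "HHID_ID")
only_in_s06 <- anti_join(toy_s06, toy_s01, by = "HHID_ID")

# Full outer merge: matched rows plus the unmatched rows from both sides
toy_merged <- merge(toy_s01, toy_s06, by = "HHID_ID", all = TRUE)

nrow(toy_merged)
```

Applied to `s01_final` and `s06_final`, non-empty `anti_join()` results would flag individuals recorded in one section but not the other.
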
# second, left-join section 12 (household level) onto the merged individual-level dataset; each household's assets repeat for every member
s01_s06_s12 <- left_join(s01_s06, s12_final, by = 'hhid')
head(s01_s06_s12)
##      HHID_ID   hhid s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026          Male         39        Monogamous marriage
## 2 100_26_2_2 100026        Female         37        Monogamous marriage
## 3 100_26_2_3 100026        Female          7                       <NA>
## 4 100_26_2_4 100026          Male         15                     Single
## 5 100_26_2_5 100026          Male          3                       <NA>
## 6 100_46_2_1 100046          Male         58        Monogamous marriage
##   s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1          Muslim       Burkina Faso            Mossi                         1
## 2          Muslim       Burkina Faso            Mossi                         0
## 3          Muslim       Burkina Faso            Mossi                        NA
## 4          Muslim       Burkina Faso            Mossi                         0
## 5          Muslim       Burkina Faso            Mossi                        NA
## 6       Christian       Burkina Faso            Mossi                         0
##   s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1                        0                              0
## 2                        0                              0
## 3                       NA                             NA
## 4                        0                              0
## 5                       NA                             NA
## 6                        0                              0
##   s01q39d_internet_at_home s01q39e_internet_at_school
## 1                        0                          0
## 2                        0                          0
## 3                       NA                         NA
## 4                        0                          0
## 5                       NA                         NA
## 6                        0                          0
##   s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1                                 1                                    0
## 2                                 0                                    0
## 3                                NA                                   NA
## 4                                NA                                   NA
## 5                                NA                                   NA
## 6                                 0                                    0
##   s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1                                      0                               1
## 2                                      0                               1
## 3                                     NA                              NA
## 4                                     NA                              NA
## 5                                     NA                              NA
## 6                                      1                               1
##   s06q01e_has_prepaid_card s06q02_has_savings_account
## 1                        0                          1
## 2                        0                          0
## 3                       NA                         NA
## 4                       NA                         NA
## 5                       NA                         NA
## 6                        0                          1
##   s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1                                     0                                 NA
## 2                                     0                                 NA
## 3                                    NA                                 NA
## 4                                    NA                                 NA
## 5                                    NA                                 NA
## 6                                     0                                 NA
##   s06q04_if_no_credit_why_Did not qualify
## 1                                       0
## 2                                       1
## 3                                      NA
## 4                                      NA
## 5                                      NA
## 6                                       1
##   s06q04_if_no_credit_why_High interest rates
## 1                                           0
## 2                                           0
## 3                                          NA
## 4                                          NA
## 5                                          NA
## 6                                           0
##   s06q04_if_no_credit_why_Not sure you would get one
## 1                                                  0
## 2                                                  0
## 3                                                 NA
## 4                                                 NA
## 5                                                 NA
## 6                                                  0
##   s06q04_if_no_credit_why_Didnt_need
## 1                                  1
## 2                                  0
## 3                                 NA
## 4                                 NA
## 5                                 NA
## 6                                  0
##   s06q04_if_no_credit_why_Other source of credit
## 1                                              0
## 2                                              0
## 3                                             NA
## 4                                             NA
## 5                                             NA
## 6                                              0
##   s06q04_if_no_credit_why_Do not know how to ask for one
## 1                                                      0
## 2                                                      0
## 3                                                     NA
## 4                                                     NA
## 5                                                     NA
## 6                                                      0
##   s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1                                       0                                    0
## 2                                       0                                    0
## 3                                      NA                                   NA
## 4                                      NA                                   NA
## 5                                      NA                                   NA
## 6                                       0                                    0
##   s06q04_if_no_credit_why_No credit institutions available
## 1                                                        0
## 2                                                        0
## 3                                                       NA
## 4                                                       NA
## 5                                                       NA
## 6                                                        0
##   s06q04_if_no_credit_why_NA Bicycle Air conditioner Lawn mower Bed Fan
## 1                          0       1               0          0   1   1
## 2                          0       1               0          0   1   1
## 3                          1       1               0          0   1   1
## 4                          1       1               0          0   1   1
## 5                          1       1               0          0   1   1
## 6                          0       1               0          0   0   0
##   VCR CD DVD player Video camera Mattress Small boat TV Camcorder Dining table
## 1                 0            0        1          0  1         0            0
## 2                 0            0        1          0  1         0            0
## 3                 0            0        1          0  1         0            0
## 4                 0            0        1          0  1         0            0
## 5                 0            0        1          0  1         0            0
## 6                 0            0        1          0  0         0            0
##   Motorbike Camera Carpet Vacuum cleaner Refrigerator Radio Kitchen mixer
## 1         1      0      0              0            1     0             0
## 2         1      0      0              0            1     0             0
## 3         1      0      0              0            1     0             0
## 4         1      0      0              0            1     0             0
## 5         1      0      0              0            1     0             0
## 6         1      0      0              0            0     0             0
##   Hunting rifles Computer Musical instruments Generator Coal iron Hi Fi system
## 1              0        0                   0         0         0            0
## 2              0        0                   0         0         0            0
## 3              0        0                   0         0         0            0
## 4              0        0                   0         0         0            0
## 5              0        0                   0         0         0            0
## 6              0        0                   0         0         0            0
##   Personal car Cookstoves Cellphone Freezer Guitar Electric iron Printer
## 1            0          1         1       0      0             0       0
## 2            0          1         1       0      0             0       0
## 3            0          1         1       0      0             0       0
## 4            0          1         1       0      0             0       0
## 5            0          1         1       0      0             0       0
## 6            0          1         1       0      0             0       0
##   Propane tank Living room Portable stove Microwave Landline Satellite antenna
## 1            1           0              0         0        0                 0
## 2            1           0              0         0        0                 0
## 3            1           0              0         0        0                 0
## 4            1           0              0         0        0                 0
## 5            1           0              0         0        0                 0
## 6            0           0              0         0        0                 0
##   Washer dryer House Stove Tablet Land Blender Wardrobes other furniture
## 1            0     1     0      0    1       0                         1
## 2            0     1     0      0    1       0                         1
## 3            0     1     0      0    1       0                         1
## 4            0     1     0      0    1       0                         1
## 5            0     1     0      0    1       0                         1
## 6            0     0     0      0    0       0                         0
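# rename the household ID column from hhid to HHID; all other columns stay as they are
s01_s06_s12_final <- s01_s06_s12 %>%
  rename(HHID = hhid)
# preview the first rows to confirm the rename
head(s01_s06_s12_final)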
##      HHID_ID   HHID s01q01_gender s01q3c_age s01q07_relationship_status
## 1 100_26_2_1 100026          Male         39        Monogamous marriage
## 2 100_26_2_2 100026        Female         37        Monogamous marriage
## 3 100_26_2_3 100026        Female          7                       <NA>
## 4 100_26_2_4 100026          Male         15                     Single
## 5 100_26_2_5 100026          Male          3                       <NA>
## 6 100_46_2_1 100046          Male         58        Monogamous marriage
##   s01q14_religion s01q15_nationality s01q16_ethnicity s01q39a_internet_on_phone
## 1          Muslim       Burkina Faso            Mossi                         1
## 2          Muslim       Burkina Faso            Mossi                         0
## 3          Muslim       Burkina Faso            Mossi                        NA
## 4          Muslim       Burkina Faso            Mossi                         0
## 5          Muslim       Burkina Faso            Mossi                        NA
## 6       Christian       Burkina Faso            Mossi                         0
##   s01q39b_internet_at_work s01q39c_internet_at_cyber_cafe
## 1                        0                              0
## 2                        0                              0
## 3                       NA                             NA
## 4                        0                              0
## 5                       NA                             NA
## 6                        0                              0
##   s01q39d_internet_at_home s01q39e_internet_at_school
## 1                        0                          0
## 2                        0                          0
## 3                       NA                         NA
## 4                        0                          0
## 5                       NA                         NA
## 6                        0                          0
##   s06q01a_has_standard_bank_account s06q01b_has_bank_account_through_job
## 1                                 1                                    0
## 2                                 0                                    0
## 3                                NA                                   NA
## 4                                NA                                   NA
## 5                                NA                                   NA
## 6                                 0                                    0
##   s06q01c_has_rural_savings_bank_account s06q01d_has_mobile_bank_account
## 1                                      0                               1
## 2                                      0                               1
## 3                                     NA                              NA
## 4                                     NA                              NA
## 5                                     NA                              NA
## 6                                      1                               1
##   s06q01e_has_prepaid_card s06q02_has_savings_account
## 1                        0                          1
## 2                        0                          0
## 3                       NA                         NA
## 4                       NA                         NA
## 5                       NA                         NA
## 6                        0                          1
##   s06q03_applied_for_credit_last_twelve s06q05_obtained_credit_last_twelve
## 1                                     0                                 NA
## 2                                     0                                 NA
## 3                                    NA                                 NA
## 4                                    NA                                 NA
## 5                                    NA                                 NA
## 6                                     0                                 NA
##   s06q04_if_no_credit_why_Did not qualify
## 1                                       0
## 2                                       1
## 3                                      NA
## 4                                      NA
## 5                                      NA
## 6                                       1
##   s06q04_if_no_credit_why_High interest rates
## 1                                           0
## 2                                           0
## 3                                          NA
## 4                                          NA
## 5                                          NA
## 6                                           0
##   s06q04_if_no_credit_why_Not sure you would get one
## 1                                                  0
## 2                                                  0
## 3                                                 NA
## 4                                                 NA
## 5                                                 NA
## 6                                                  0
##   s06q04_if_no_credit_why_Didnt_need
## 1                                  1
## 2                                  0
## 3                                 NA
## 4                                 NA
## 5                                 NA
## 6                                  0
##   s06q04_if_no_credit_why_Other source of credit
## 1                                              0
## 2                                              0
## 3                                             NA
## 4                                             NA
## 5                                             NA
## 6                                              0
##   s06q04_if_no_credit_why_Do not know how to ask for one
## 1                                                      0
## 2                                                      0
## 3                                                     NA
## 4                                                     NA
## 5                                                     NA
## 6                                                      0
##   s06q04_if_no_credit_why_Unable to repay s06q04_if_no_credit_why_Other reason
## 1                                       0                                    0
## 2                                       0                                    0
## 3                                      NA                                   NA
## 4                                      NA                                   NA
## 5                                      NA                                   NA
## 6                                       0                                    0
##   s06q04_if_no_credit_why_No credit institutions available
## 1                                                        0
## 2                                                        0
## 3                                                       NA
## 4                                                       NA
## 5                                                       NA
## 6                                                        0
##   s06q04_if_no_credit_why_NA Bicycle Air conditioner Lawn mower Bed Fan
## 1                          0       1               0          0   1   1
## 2                          0       1               0          0   1   1
## 3                          1       1               0          0   1   1
## 4                          1       1               0          0   1   1
## 5                          1       1               0          0   1   1
## 6                          0       1               0          0   0   0
##   VCR CD DVD player Video camera Mattress Small boat TV Camcorder Dining table
## 1                 0            0        1          0  1         0            0
## 2                 0            0        1          0  1         0            0
## 3                 0            0        1          0  1         0            0
## 4                 0            0        1          0  1         0            0
## 5                 0            0        1          0  1         0            0
## 6                 0            0        1          0  0         0            0
##   Motorbike Camera Carpet Vacuum cleaner Refrigerator Radio Kitchen mixer
## 1         1      0      0              0            1     0             0
## 2         1      0      0              0            1     0             0
## 3         1      0      0              0            1     0             0
## 4         1      0      0              0            1     0             0
## 5         1      0      0              0            1     0             0
## 6         1      0      0              0            0     0             0
##   Hunting rifles Computer Musical instruments Generator Coal iron Hi Fi system
## 1              0        0                   0         0         0            0
## 2              0        0                   0         0         0            0
## 3              0        0                   0         0         0            0
## 4              0        0                   0         0         0            0
## 5              0        0                   0         0         0            0
## 6              0        0                   0         0         0            0
##   Personal car Cookstoves Cellphone Freezer Guitar Electric iron Printer
## 1            0          1         1       0      0             0       0
## 2            0          1         1       0      0             0       0
## 3            0          1         1       0      0             0       0
## 4            0          1         1       0      0             0       0
## 5            0          1         1       0      0             0       0
## 6            0          1         1       0      0             0       0
##   Propane tank Living room Portable stove Microwave Landline Satellite antenna
## 1            1           0              0         0        0                 0
## 2            1           0              0         0        0                 0
## 3            1           0              0         0        0                 0
## 4            1           0              0         0        0                 0
## 5            1           0              0         0        0                 0
## 6            0           0              0         0        0                 0
##   Washer dryer House Stove Tablet Land Blender Wardrobes other furniture
## 1            0     1     0      0    1       0                         1
## 2            0     1     0      0    1       0                         1
## 3            0     1     0      0    1       0                         1
## 4            0     1     0      0    1       0                         1
## 5            0     1     0      0    1       0                         1
## 6            0     0     0      0    0       0                         0
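With this many columns, head() wraps the table across many blocks, which makes it hard to see that the rename changed only one column name. Below is a minimal sketch of a more direct check. It assumes a small toy data frame: `toy`, `toy_final`, and the three columns kept here are illustrative stand-ins with a few values copied from the output above, not the full survey data.

```r
library(dplyr)

# Toy stand-in for s01_s06_s12 (assumption: three columns are enough to illustrate)
toy <- data.frame(
  HHID_ID = c("100_26_2_1", "100_26_2_2", "100_46_2_1"),
  hhid    = c(100026, 100026, 100046),
  Bicycle = c(1, 1, 1)
)

# Same rename as in the cleaning step
toy_final <- toy %>%
  rename(HHID = hhid)

# The rename should change one column name and nothing else
stopifnot(
  identical(setdiff(names(toy_final), names(toy)), "HHID"),
  !("hhid" %in% names(toy_final)),
  identical(toy_final$HHID, toy$hhid),
  nrow(toy_final) == nrow(toy)
)

# glimpse() prints one line per column, which is easier to scan than head() on a wide table
glimpse(toy_final)
```

The same stopifnot() checks could be run on s01_s06_s12 and s01_s06_s12_final directly, if both objects are still in the session.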