The objective of this script is to generate unique IDs for the Volunteers surveyed at Baseline.
The IDs are generated from the country, name and surname columns of the data set. Each country is assigned a number (see table below), and each Volunteer then receives their row number among the records for their country.
| Country | Country_ID |
|---|---|
| Senegal | 1 |
| Sierra Leone | 2 |
| The Gambia | 3 |
| Guinea | 4 |
| Côte d’Ivoire | 5 |
| Liberia | 6 |
| Nigeria | 7 |
For instance, the first Volunteer in The Gambia has an ID that starts with 3 (the country ID), followed by 1, their row number among the records for The Gambia, giving the ID 31.
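As a small illustration of this numbering scheme, the sketch below builds IDs for a handful of toy records in base R. The rows are hypothetical; only the country codes (1 for Senegal, 3 for The Gambia) come from the table above.

```r
# Toy records (hypothetical rows; only the country codes come from the table above)
toy <- data.frame(
  Country    = c("The Gambia", "Senegal", "The Gambia", "Senegal", "The Gambia"),
  Country_ID = c(3, 1, 3, 1, 3)
)

# Row number of each record within its own country, in order of appearance
toy$Row_in_country <- ave(seq_len(nrow(toy)), toy$Country, FUN = seq_along)

# ID = country ID followed by the within-country row number
toy$ID <- paste0(toy$Country_ID, toy$Row_in_country)

print(toy$ID)
```

The first record for The Gambia receives the ID 31, the second 32, and so on, independently of how the Senegal records are interleaved.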
The code below was used to generate the unique IDs. We checked for duplicates: only one name and surname combination appears twice, in Guinea, and the phone numbers show that the two records are not the same person (see Annex).

    # Load the packages used below
    library(readxl)
    library(dplyr)
    # Load the dataset (root is assumed to hold the project directory path, defined beforehand)
    data <- read_excel(paste0(root, "Raw_data/MaM2_MHPSS_Baseline_data_en_-_2021-02-17-15-58-33.xlsx"))
    # Generate the country IDs: countries are numbered in order of first appearance in the data
    countries <- data %>%
      select(Country = "8 Residence : where do you currently live ?") %>%
      distinct() %>%
      mutate(Country_ID = row_number())
    # Keep only the variables of interest (country, name and surname) and rename them
    data_ID <- data %>%
      select(Country = "8 Residence : where do you currently live ?",
             Name = "3. First Name",
             Surname = "4. Last name")
    # Add the country IDs
    data_ID <- left_join(data_ID, countries, by = "Country")
    # Generate the individual IDs: country ID followed by the row number within the country
    MHPSS_ID <- data_ID %>%
      group_by(Country) %>%
      mutate(ID = paste0(Country_ID, row_number())) %>%
      ungroup() %>%
      arrange(Country_ID)  # rows ordered by country, as in the original loop

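The code above numbers countries in the order in which they first appear in the data, so the codes match the table only as long as the data keep that order. A possible safeguard, sketched below under assumptions, is to pin the codes in an explicit lookup table. The names `country_lookup`, `toy` and `toy_ID` are hypothetical, the toy rows are invented, and the country spellings would need to match the data exactly.

```r
library(dplyr)

# Country codes copied from the table in this document
country_lookup <- tibble(
  Country    = c("Senegal", "Sierra Leone", "The Gambia", "Guinea",
                 "Côte d’Ivoire", "Liberia", "Nigeria"),
  Country_ID = 1:7
)

# Hypothetical records, deliberately not in table order
toy <- tibble(Country = c("Nigeria", "Senegal", "Nigeria", "Guinea"))

toy_ID <- toy %>%
  left_join(country_lookup, by = "Country") %>%
  group_by(Country) %>%
  mutate(ID = paste0(Country_ID, row_number())) %>%
  ungroup()

# A country missing from the lookup would get an NA code, so stop if that happens
stopifnot(!anyNA(toy_ID$Country_ID))

print(toy_ID$ID)
```

With a fixed lookup, re-sorting or extending the data set does not change which code a country receives.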
The IDs have been saved in xls and csv format here:
Annex: the two records in Guinea that share the same name and surname, with their phone numbers.
| Country | Name | Surname | Phone_1 | Phone_2 |
|---|---|---|---|---|
| Guinea | Ousmane | Barry | 624937164 | 656468817 |
| Guinea | Ousmane | Barry | 626355937 | NA |
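The duplicate check described earlier could be reproduced along the following lines. This is a sketch rather than the code actually used: the two Guinea rows are taken from the Annex table, the third row is a hypothetical placeholder, and the object names (`records`, `duplicates`, `phone_check`) are illustrative.

```r
library(dplyr)

# The two Guinea rows come from the Annex; the Senegal row is a hypothetical placeholder
records <- tibble(
  Country = c("Guinea", "Guinea", "Senegal"),
  Name    = c("Ousmane", "Ousmane", "Example"),
  Surname = c("Barry", "Barry", "Person"),
  Phone_1 = c("624937164", "626355937", "000000000"),
  Phone_2 = c("656468817", NA, NA)
)

# Keep every record whose country, name and surname occur more than once
duplicates <- records %>%
  group_by(Country, Name, Surname) %>%
  filter(n() > 1) %>%
  ungroup()

# For each duplicated name, count the records and the distinct first phone numbers;
# more than one distinct number suggests different people
phone_check <- duplicates %>%
  group_by(Country, Name, Surname) %>%
  summarise(n_records = n(), n_phones = n_distinct(Phone_1), .groups = "drop")

print(phone_check)
```

Differing phone numbers do not strictly prove that two records belong to different people, so a manual review of each flagged pair remains advisable.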