The wealth index is a measure of the cumulative living standard of a household. It is calculated using data on assets possessed by households, such as vehicles, clean water and electricity. This document aims to assess the living standard of Kenyans across various regions (provinces) by measuring their wealth indices. The wealth index is calculated in the following steps:
1. Select the variables
2. Carry out exploratory data analysis
3. Recode the data into binary variables
4. Conduct principal component analysis
5. Create wealth index quintiles
6. Graph the index and report
The asset data were collected and stored in SPSS (.sav) format. They are precoded as follows:
## Provinces code
## 1 central 2
## 2 coast 3
## 3 eastern 4
## 4 north eastern 5
## 5 nyanza 6
## 6 rift valley 7
## 7 western 8
library(haven)
library(tidyverse)
library(dplyr)
assets <- read_sav("C:/Users/Christine/Desktop/Rpubs/ASSETS.sav")
# Names of the columns that carry SPSS value labels
vars_labelled <- names(assets)[map_lgl(assets, is.labelled)]
# Convert the labelled columns to factors using their value labels
assets_factor <- assets %>%
  mutate(across(all_of(vars_labelled), as_factor))
head(assets_factor,3 )
## # A tibble: 3 x 24
## aprovinc bicycle Motobike Radio Telephone Refrigerator Fan Buckets Bed
## <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct>
## 1 eastern no no no yes no no yes yes
## 2 nyanza no no no yes no no yes no
## 3 central no no no yes no yes yes yes
## # ... with 15 more variables: Bedsheets <fct>, Blankets <fct>, Nets <fct>,
## # tables <fct>, Ox <fct>, Plough <fct>, Hoes <fct>, Axes <fct>, Muo <fct>,
## # Kosiowo <fct>, Ngomo <fct>, Bellows <fct>, Jembe <fct>, Panga <fct>,
## # Other <fct>
##
## yes no
## central 74 503
## coast 0 0
## eastern 227 944
## north eastern 0 0
## nyanza 3 505
## rift valley 91 537
## western 115 424
##
## yes no
## central 11 418
## coast 0 0
## eastern 97 848
## north eastern 0 0
## nyanza 15 341
## rift valley 4 367
## western 11 342
##
## yes no
## central 1 576
## coast 0 0
## eastern 6 1165
## north eastern 0 0
## nyanza 3 505
## rift valley 0 628
## western 2 537
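Cross-tabulations of this kind can be produced with base R's `table()`. The sketch below is only an illustration on a hypothetical toy data frame (`toy`, with made-up `province` and `bicycle` columns); it is not the code or the data used for the tables in this document.

```r
# Toy data: hypothetical households, NOT the survey data
toy <- data.frame(
  province = factor(c("central", "central", "eastern", "eastern", "nyanza"),
                    levels = c("central", "eastern", "nyanza")),
  bicycle  = factor(c("yes", "no", "no", "no", "yes"),
                    levels = c("yes", "no"))
)

# Rows are provinces, columns are the yes/no counts
bicycle_by_province <- table(toy$province, toy$bicycle)
print(bicycle_by_province)

# Row-wise shares make provinces with different sample sizes comparable
bicycle_share <- prop.table(bicycle_by_province, margin = 1)
print(round(bicycle_share, 2))
```

Row shares are often easier to compare than raw counts when provinces contribute very different numbers of households.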
The tables above show the distribution of wealth in terms of possession of selected household goods. We now use this information to calculate the living standard among Kenyans. The data are coded in binary: 1 means the household possesses the item being measured and 0 means it does not, so the original code 2 ("no") is recoded to 0. Missing values are assumed to be zero; they are replaced just before the PCA step.
assets[,2:24][assets[,2:24]=="2"]<-0
# Names of the columns that carry SPSS value labels
vars_labelled <- names(assets)[map_lgl(assets, is.labelled)]
print(vars_labelled)
## [1] "aprovinc" "bicycle" "Motobike" "Radio" "Telephone"
## [6] "Refrigerator" "Fan" "Buckets" "Bed" "Bedsheets"
## [11] "Blankets" "Nets" "tables" "Ox" "Plough"
## [16] "Hoes" "Axes" "Muo" "Kosiowo" "Ngomo"
## [21] "Bellows" "Jembe" "Panga" "Other"
# Drop the labels and keep plain integer codes
assets.num <- assets %>%
  mutate(across(all_of(vars_labelled), as.integer))
head(assets.num, 3)
## # A tibble: 3 x 24
## aprovinc bicycle Motobike Radio Telephone Refrigerator Fan Buckets Bed
## <int> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 4 0 0 0 1 0 0 1 1
## 2 6 0 0 0 1 0 0 1 0
## 3 2 0 0 0 1 0 1 1 1
## # ... with 15 more variables: Bedsheets <int>, Blankets <int>, Nets <int>,
## # tables <int>, Ox <int>, Plough <int>, Hoes <int>, Axes <int>, Muo <int>,
## # Kosiowo <int>, Ngomo <int>, Bellows <int>, Jembe <int>, Panga <int>,
## # Other <int>
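The recoding rule (1 stays 1, 2 becomes 0, missing becomes 0) can be checked on a small example. This is a minimal base R sketch on a hypothetical data frame; the helper `recode_binary` and the toy columns are illustrative names, not part of the original analysis.

```r
# Toy data with the same coding scheme: 1 = yes, 2 = no, NA = missing
toy <- data.frame(
  province = c(2, 4, 6),
  bicycle  = c(1, 2, NA),
  radio    = c(2, 2, 1)
)
asset_cols <- c("bicycle", "radio")

# Hypothetical helper: map "no" and missing to 0, leave "yes" as 1
recode_binary <- function(x) {
  x[is.na(x)] <- 0  # missing is treated as "does not possess"
  x[x == 2] <- 0    # "no" becomes 0
  x
}

# Apply to the asset columns only; the province code is left untouched
toy[asset_cols] <- lapply(toy[asset_cols], recode_binary)
print(toy)
```

Treating missing values as 0 assumes that a household with no recorded answer does not own the item, which tends to push such households towards the lower end of the index.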
The wealth index is calculated using principal component analysis (PCA). This technique extracts the components that capture the most variability in the data, and the resulting components are uncorrelated with one another. The first component is used to construct wealth quintiles ranked from 1 to 5, where 1 represents the poorest households and 5 the wealthiest.
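As a small-scale illustration of the idea, the sketch below builds an index from simulated binary assets. It uses base R's `prcomp()` rather than `psych::principal()`, applies no rotation, and runs on made-up data, so its scores are not comparable with the ones computed in this document. Leaving the variables unscaled is assumed here to play the same role as `covar = TRUE`, in that the analysis is based on the covariance matrix.

```r
set.seed(1)

# Simulated data: an unobserved "wealth" drives ownership of three assets
n <- 200
latent <- rnorm(n)
toy_assets <- data.frame(
  radio   = as.integer(latent + rnorm(n) > 0),
  bicycle = as.integer(latent + rnorm(n) > 0.5),
  fridge  = as.integer(latent + rnorm(n) > 1.5)
)

# PCA on the covariance matrix (centred, not scaled)
pca <- prcomp(toy_assets, center = TRUE, scale. = FALSE)

# Share of total variance captured by each component
var_share <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_share, 3))

# First component scores serve as the toy wealth index
toy_index <- pca$x[, 1]

# The sign of a component is arbitrary; orient it so that
# owning more assets means a higher index
if (cor(toy_index, rowSums(toy_assets)) < 0) toy_index <- -toy_index
```

The sign check matters in practice: without it, a higher score could just as well mean a poorer household.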
library(psych)
assets.num[is.na(assets.num)]<-0
assets.num$aprovinc<-as.factor(assets.num$aprovinc)
assets.pca <- psych::principal(assets.num[, 2:24], rotate = "varimax", nfactors = 3, covar = TRUE, scores = TRUE)
wealth_index=assets.pca$scores[,1]
Assets.indexed<-mutate(assets_factor,quintile=as.factor(cut(wealth_index,breaks=5,labels= c(1,2,3,4,5))))
ggplot(Assets.indexed, aes(aprovinc)) +
  geom_bar(aes(fill = quintile), position = "fill") +
  xlab("Province") +
  ylab("Proportion of households") +
  ggtitle("Wealth by Province")
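One point worth checking when reading the chart: `cut(x, breaks = 5)` splits the range of the index into five equal-width intervals, which do not necessarily contain equal shares of households. Quintiles in the strict sense use cut points taken from the 20th, 40th, 60th and 80th percentiles. The sketch below contrasts the two on a simulated, right-skewed index; `toy_index` is made-up data, not the wealth index computed above.

```r
set.seed(42)

# Simulated right-skewed index for 1000 hypothetical households
toy_index <- rexp(1000)

# Equal-width bins: same interval length, unequal group sizes
equal_width <- cut(toy_index, breaks = 5, labels = 1:5)

# Quantile-based bins: cut points at the 0/20/40/60/80/100th percentiles
cutpoints <- quantile(toy_index, probs = seq(0, 1, by = 0.2))
quantile_based <- cut(toy_index, breaks = cutpoints, labels = 1:5,
                      include.lowest = TRUE)

print(table(equal_width))
print(table(quantile_based))
```

With an index built from binary assets, many households can share the same score, so the quantile cut points may coincide and `cut()` will then fail on duplicated breaks; a rank-based split such as `dplyr::ntile(x, 5)` is one possible workaround.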
From the wealth index graph, Eastern appears to be the poorest province, followed by Rift Valley. Central and Rift Valley are the richer provinces.