Summary

This report documents the key demographics and drivers of financial inclusion in a community.

Financial Inclusion: Individuals aged 18 and above who use financial products/services, whether formal or informal.

Financial Exclusion: Individuals aged 18 and above who do not have or use any financial products/services, formal or informal.

Informal: Individuals aged 18 and above who use financial products/services that are not regulated.

Formal other: Individuals aged 18 and above who have or use financial products/services provided by other regulated financial institutions that are not deposit money banks, e.g. microfinance institutions.

Banked: Individuals aged 18 and above who use commercial banks to obtain financial products or services.
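
The definitions above imply a simple classification rule. A minimal sketch in R, assuming three hypothetical yes/no indicators per respondent (`uses_bank`, `uses_other_formal`, `uses_informal`, none of which appear in the report's data) and assuming that the most formal product a person uses determines their category:

```r
# Hypothetical helper: assign one access category per respondent.
# Assumption: categories are mutually exclusive and the most formal
# product in use wins (bank > other formal > informal > none).
classify_access <- function(uses_bank, uses_other_formal, uses_informal) {
  category <- ifelse(uses_bank, "Banked",
              ifelse(uses_other_formal, "Other_F",
              ifelse(uses_informal, "Informal", "Excluded")))
  factor(category, levels = c("Banked", "Other_F", "Informal", "Excluded"))
}

# Four illustrative respondents (made-up values)
access <- classify_access(
  uses_bank         = c(TRUE,  FALSE, FALSE, FALSE),
  uses_other_formal = c(TRUE,  TRUE,  FALSE, FALSE),
  uses_informal     = c(FALSE, TRUE,  TRUE,  FALSE)
)

# Financially included = any category other than "Excluded"
included <- access != "Excluded"
```

Under these assumptions a respondent who uses both a bank and a microfinance product is counted once, as Banked.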

## [1] "Marital.Status"   "Gender"           "Source"           "Financial.Access"
## [5] "Age.Group"        "Sector"           "Education.Level"
##                      Marital.Status     Gender     
##  Married (Monogamy)         :14519   Male  :12456  
##  Never married              : 5805   Female:12075  
##  Married (Polygamy)         : 2324                 
##  Widowed                    :  878                 
##  Separated                  :  510                 
##  Co-Habiting/living together:  211                 
##  (Other)                    :  284                 
##                                                                    Source    
##  Own business - provide a service (e.g. hairdresser, tailor, mechanic):3745  
##  Own business/trader - non-farming                                    :3739  
##  Subsistence/small scale farming                                      :3599  
##  Household member pays my expenses                                    :2516  
##  Own business/trader - farming products                               :2311  
##  Commercial/large scale farming                                       :1727  
##  (Other)                                                              :6894  
##  Financial.Access Age.Group      Sector     
##  Banked  :10390   15-17:   0   URBAN: 6666  
##  Other_F : 1397   18-25:6560   RURAL:17865  
##  Informal: 3670   26-35:8709                
##  Excluded: 9074   36-45:5936                
##                   46-55:3326                
##                   56+  :   0                
##                                             
##                      Education.Level 
##  Lower levels of education   :10863  
##  Achieved Secondary and above:13668  
##                                      
##                                      
##                                      
##                                      
## 

Financial Inclusion by Gender

attach(summary_data)
gender_data<-summary_data %>% count(Gender, Financial.Access, sort=TRUE)
ftable(table(Gender,Financial.Access))
##        Financial.Access Banked Other_F Informal Excluded
## Gender                                                  
## Male                      6059     708     1642     4047
## Female                    4331     689     2028     5027
# Percentage of each Financial.Access category accounted for by each gender
gd <- gender_data %>%
    group_by(Financial.Access) %>%
    mutate(perc = paste(round(n * 100 / sum(n), 2), "%")) %>%
    as.data.frame()
g1 <- ggplot(gd, aes(x = Gender, y = n, fill = Financial.Access)) +
    geom_bar(stat = "identity") +
    scale_fill_brewer()
# size is a constant, so it is set outside aes(); inside aes() it is mapped as data and adds a legend
g1 + geom_text(aes(label = perc), size = 6, position = position_stack(vjust = 0.5)) +
    ggtitle("Financial Access by Gender")
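
The chart labels are shares within each access category. To see instead what share of each gender falls into each category, row percentages can be computed from the printed table. A minimal sketch, with the counts typed in by hand from the output above:

```r
# Counts copied from the Gender x Financial.Access table printed above
gender_access <- matrix(
  c(6059, 708, 1642, 4047,
    4331, 689, 2028, 5027),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Male", "Female"),
                  Financial.Access = c("Banked", "Other_F", "Informal", "Excluded"))
)

# Row percentages: share of each gender in each access category
row_pct <- round(100 * prop.table(gender_access, margin = 1), 1)
row_pct

# Share of each gender that is financially excluded
excluded_pct <- row_pct[, "Excluded"]
```

On these counts about 32.5% of men and 41.6% of women are excluded.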

Financial Inclusion by Sector

sector_data <- summary_data %>% count(Sector, Gender, Financial.Access, sort = TRUE)
#ftable(table(Gender, Sector, Financial.Access))
# Percentage of each Financial.Access category accounted for by each sector-gender group
sd <- sector_data %>%
    group_by(Financial.Access) %>%
    mutate(perc = paste(round(n * 100 / sum(n), 2), "%")) %>%
    as.data.frame()
s1 <- ggplot(sd, aes(x = Sector, y = n, fill = Financial.Access)) +
    geom_bar(stat = "identity") +
    facet_wrap(facets = vars(Gender)) +
    scale_fill_brewer()
# size is a constant, so it is set outside aes()
s1 + geom_text(aes(label = perc), size = 6, position = position_stack(vjust = 0.5)) +
    ggtitle("Financial Access by Sector")

Financial Inclusion by Marital Status

maritalStatusData<-summary_data %>% count(Marital.Status, Financial.Access, sort=TRUE)
msd<-maritalStatusData %>%                                    # Calculate percentage by group
    group_by(Financial.Access) %>%
    mutate(perc = paste(as.character(round(n*100 / sum(n),2)),"%")) %>% 
    as.data.frame()

m1<-ggplot(msd, aes(x=Marital.Status,y=n, fill=Financial.Access))+ geom_bar(stat="identity")+theme( axis.text.x=element_text(size=8)) + scale_fill_brewer()
m1 + ggtitle ("Financial Access by Marital Status")

Financial Inclusion by Education Level

attach(summary_data)
EducationData<-summary_data %>% count(Education.Level, Financial.Access, sort=TRUE)
ftable(table(Education.Level,Financial.Access)) 
##                              Financial.Access Banked Other_F Informal Excluded
## Education.Level                                                               
## Lower levels of education                       1631     736     2043     6453
## Achieved Secondary and above                    8759     661     1627     2621
# Percentage of each Financial.Access category accounted for by each education level
ed <- EducationData %>%
    group_by(Financial.Access) %>%
    mutate(perc = paste(round(n * 100 / sum(n), 2), "%")) %>%
    as.data.frame()
e1 <- ggplot(ed, aes(x = Education.Level, y = n, fill = Financial.Access)) +
    geom_bar(stat = "identity") +
    theme(axis.text.x = element_text(size = 8)) +
    scale_fill_brewer()
# size is a constant, so it is set outside aes()
e1 + geom_text(aes(label = perc), size = 6, position = position_stack(vjust = 0.5)) +
    ggtitle("Financial Access by Education Level")

Financial Inclusion of Smallholder Farmers by Gender

smallholderfarmers<-subset(summary_data, Source %in% "Subsistence/small scale farming")
attach(smallholderfarmers)
farmerData<-smallholderfarmers %>% count(Gender, Financial.Access, sort=TRUE)
ftable(table(Gender,Financial.Access))
##        Financial.Access Banked Other_F Informal Excluded
## Gender                                                  
## Male                       628     157      434     1150
## Female                     216      89      344      581
# Percentage of each Financial.Access category accounted for by each gender (smallholder farmers only)
fd <- farmerData %>%
    group_by(Financial.Access) %>%
    mutate(perc = paste(round(n * 100 / sum(n), 2), "%")) %>%
    as.data.frame()
f1 <- ggplot(fd, aes(x = Gender, y = n, fill = Financial.Access)) +
    geom_bar(stat = "identity") +
    scale_fill_brewer()
# size is a constant, so it is set outside aes()
f1 + geom_text(aes(label = perc), size = 8, position = position_stack(vjust = 0.5)) +
    ggtitle("Financial Access of Small Holder Farmers by Gender")

Drivers of Financial Inclusion among gender groups

Fitting a Logistic Regression

attach(summary_data)
# Collapse the four-level Financial.Access factor ("Banked", "Other_F", "Informal", "Excluded")
# into a binary factor: "Included" (any formal or informal product) vs "Excluded".
# Vectorised replacement for the element-by-element loop; the level order is unchanged.
FI <- factor(ifelse(summary_data$Financial.Access == "Excluded", "Excluded", "Included"),
             levels = c("Included", "Excluded"))
# Add the binary FI variable as a new column
summary_data$FI <- FI
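
Before fitting, it helps to know which outcome a binomial `glm()` will model when the response is a factor: the first level is treated as failure and the remaining levels as success. A small illustration on made-up data (the vector `y` below is hypothetical, not taken from the survey):

```r
# Toy binary outcome: 2 "Included" and 4 "Excluded" (made-up values)
y <- factor(c("Included", "Excluded", "Excluded", "Included", "Excluded", "Excluded"),
            levels = c("Included", "Excluded"))

# Intercept-only logistic regression
fit <- glm(y ~ 1, family = binomial)

# Back-transform the intercept from log-odds to a probability
p_hat <- unname(plogis(coef(fit)))
p_hat  # equals the share of "Excluded" (4/6), not the share of "Included"
```

With "Included" as the first level, coefficients therefore describe the odds of being excluded.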

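#fit a logistic regression with binary y; glm() treats the first factor level as failure and the rest as success,
#so the coefficients below describe the odds of being excluded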
glm0<-glm(FI~Gender+Sector+Marital.Status+Education.Level, family="binomial")
summary(glm0)
## 
## Call:
## glm(formula = FI ~ Gender + Sector + Marital.Status + Education.Level, 
##     family = "binomial")
## 
## Coefficients:
##                                             Estimate Std. Error z value
## (Intercept)                                 -0.46910    0.04272 -10.982
## GenderFemale                                 0.35937    0.03016  11.917
## SectorRURAL                                  0.74826    0.03671  20.381
## Marital.StatusMarried (Polygamy)             0.22794    0.04964   4.592
## Marital.StatusCo-Habiting/living together   -0.33156    0.16897  -1.962
## Marital.StatusDivorced                       0.02070    0.16385   0.126
## Marital.StatusSeparated                     -0.55810    0.11208  -4.979
## Marital.StatusWidowed                       -0.32253    0.07828  -4.120
## Marital.StatusNever married                  0.20426    0.03809   5.363
## Marital.StatusRefused to answer              0.28928    0.24275   1.192
## Education.LevelAchieved Secondary and above -1.71372    0.03136 -54.638
##                                             Pr(>|z|)    
## (Intercept)                                  < 2e-16 ***
## GenderFemale                                 < 2e-16 ***
## SectorRURAL                                  < 2e-16 ***
## Marital.StatusMarried (Polygamy)            4.39e-06 ***
## Marital.StatusCo-Habiting/living together     0.0497 *  
## Marital.StatusDivorced                        0.8995    
## Marital.StatusSeparated                     6.38e-07 ***
## Marital.StatusWidowed                       3.79e-05 ***
## Marital.StatusNever married                 8.20e-08 ***
## Marital.StatusRefused to answer               0.2334    
## Education.LevelAchieved Secondary and above  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 32327  on 24530  degrees of freedom
## Residual deviance: 27410  on 24520  degrees of freedom
## AIC: 27432
## 
## Number of Fisher Scoring iterations: 4
exp(glm0$coefficients)
##                                 (Intercept) 
##                                   0.6255677 
##                                GenderFemale 
##                                   1.4324302 
##                                 SectorRURAL 
##                                   2.1133146 
##            Marital.StatusMarried (Polygamy) 
##                                   1.2560060 
##   Marital.StatusCo-Habiting/living together 
##                                   0.7178046 
##                      Marital.StatusDivorced 
##                                   1.0209130 
##                     Marital.StatusSeparated 
##                                   0.5722936 
##                       Marital.StatusWidowed 
##                                   0.7243152 
##                 Marital.StatusNever married 
##                                   1.2266177 
##             Marital.StatusRefused to answer 
##                                   1.3354692 
## Education.LevelAchieved Secondary and above 
##                                   0.1801936
#plot residuals to check for goodness of fit
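
The odds ratios printed above are point estimates. A rough sketch of 95% Wald confidence intervals, using estimates and standard errors copied by hand from the `summary(glm0)` output (the short labels in the vectors are chosen here for readability and are not the model's own coefficient names):

```r
# Estimates and standard errors copied from summary(glm0) above
est <- c(GenderFemale = 0.35937, SectorRURAL = 0.74826,
         EducationSecondaryPlus = -1.71372)
se  <- c(GenderFemale = 0.03016, SectorRURAL = 0.03671,
         EducationSecondaryPlus = 0.03136)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

# Interval is built on the log-odds scale, then exponentiated
or_table <- cbind(odds_ratio = exp(est),
                  lower = exp(est - z * se),
                  upper = exp(est + z * se))
round(or_table, 2)
```

None of the three intervals contains 1, which agrees with the small p-values in the summary.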

Key Drivers of Financial Inclusion and Odds Ratios: Logistic Regression Analysis

The model estimates the odds of being financially excluded, so an odds ratio above 1 means higher odds of exclusion (lower inclusion) than in the reference group, and an odds ratio below 1 means lower odds of exclusion. At significance level p<0.01, the key drivers of financial inclusion are:

Gender: females have 1.43 times the odds of being excluded compared with males (43% higher).

Sector: rural residents have 2.11 times the odds of being excluded compared with urban residents (111% higher).

Education Level: those who achieved secondary education and above have 0.18 times the odds of being excluded compared with those with lower levels of education (82% lower).

Marital Status, except “Divorced”, “Co-Habiting/living together” and “Refused to answer”, which are not significant at this level: relative to Married (Monogamy), the odds of being excluded are 1.26 times for Married (Polygamy) and 1.23 times for Never married, but 0.57 times for Separated and 0.72 times for Widowed.
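
As a sanity check on the direction of the education effect, a crude (unadjusted) odds ratio of exclusion can be computed directly from the Education.Level table printed earlier. This is only a sketch with hand-copied counts; it ignores gender, sector and marital status, so it need not equal the adjusted value of 0.18 from the model.

```r
# Counts copied from the Education.Level x Financial.Access table
# Included = Banked + Other_F + Informal
included <- c(lower = 1631 + 736 + 2043, secondary_plus = 8759 + 661 + 1627)
excluded <- c(lower = 6453, secondary_plus = 2621)

# Odds of being excluded within each education group
odds_excluded <- excluded / included

# Crude odds ratio: secondary and above relative to lower levels
crude_or <- unname(odds_excluded["secondary_plus"] / odds_excluded["lower"])
round(crude_or, 2)
```

The crude ratio is about 0.16, below 1 and close to the adjusted estimate, which supports reading the coefficient as lower odds of exclusion for the better educated.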

Fitting a Negative Binomial Regression model

#compute count variable y to fit possible poisson regression
data<-summary_data %>% count(Marital.Status, Gender, Education.Level, Sector, FI, sort=TRUE)
attach(data)
sample<- subset(data,FI=="Included")
colnames(sample)[6]<-"count.included"
head(sample)
##       Marital.Status Gender              Education.Level Sector       FI
## 1 Married (Monogamy)   Male Achieved Secondary and above  RURAL Included
## 4 Married (Monogamy) Female Achieved Secondary and above  RURAL Included
## 5      Never married   Male Achieved Secondary and above  RURAL Included
## 6 Married (Monogamy) Female Achieved Secondary and above  URBAN Included
## 7 Married (Monogamy)   Male    Lower levels of education  RURAL Included
## 8 Married (Monogamy)   Male Achieved Secondary and above  URBAN Included
##   count.included
## 1           2322
## 4           1516
## 5           1366
## 6           1297
## 7           1145
## 8           1123
attach(sample)


#EXPLORATORY DATA ANALYSIS 1: boxplot of our financial inclusion count variable
boxplot(count.included)

#EXPLORATORY DATA ANALYSIS 2:check the mean and variance
mean(count.included)
## [1] 241.5156
var(count.included)
## [1] 213725.9
#The variance is far larger than the mean, so the counts are overdispersed relative to a Poisson distribution,
#i.e. a Poisson regression could still estimate the coefficients, but its standard errors would be too small
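
The size of the gap between variance and mean can be made explicit with a variance-to-mean ratio. A minimal sketch using the two values printed above:

```r
# Values copied from the mean() and var() output above
mean_count <- 241.5156
var_count  <- 213725.9

# For a Poisson variable the variance equals the mean, so this ratio would be near 1
dispersion_ratio <- var_count / mean_count
dispersion_ratio
```

The ratio is close to 885, far from the value of 1 that a Poisson model assumes.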


#EXPLORATORY DATA ANALYSIS 3:plot histogram and probability mass function of our count data
hist(count.included)

pmf(count.included)

#FITTING a negative binomial model to remedy overdispersion
library(MASS)
glm2<-glm.nb(count.included ~ Gender + Education.Level + Sector + Marital.Status)
summary(glm2)
## 
## Call:
## glm.nb(formula = count.included ~ Gender + Education.Level + 
##     Sector + Marital.Status, init.theta = 3.525431838, link = log)
## 
## Coefficients:
##                                             Estimate Std. Error z value
## (Intercept)                                   5.9142     0.2261  26.153
## GenderFemale                                  0.2828     0.1425   1.985
## Education.LevelAchieved Secondary and above   0.7948     0.1428   5.568
## SectorRURAL                                   0.8402     0.1428   5.885
## Marital.StatusMarried (Polygamy)             -2.0349     0.2688  -7.572
## Marital.StatusCo-Habiting/living together    -4.0611     0.2819 -14.405
## Marital.StatusDivorced                       -4.2604     0.2850 -14.948
## Marital.StatusSeparated                      -3.1183     0.2729 -11.428
## Marital.StatusWidowed                        -2.8208     0.2713 -10.398
## Marital.StatusNever married                  -0.8646     0.2673  -3.235
## Marital.StatusRefused to answer              -5.1844     0.3086 -16.802
##                                             Pr(>|z|)    
## (Intercept)                                  < 2e-16 ***
## GenderFemale                                 0.04720 *  
## Education.LevelAchieved Secondary and above 2.58e-08 ***
## SectorRURAL                                 3.99e-09 ***
## Marital.StatusMarried (Polygamy)            3.69e-14 ***
## Marital.StatusCo-Habiting/living together    < 2e-16 ***
## Marital.StatusDivorced                       < 2e-16 ***
## Marital.StatusSeparated                      < 2e-16 ***
## Marital.StatusWidowed                        < 2e-16 ***
## Marital.StatusNever married                  0.00122 ** 
## Marital.StatusRefused to answer              < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(3.5254) family taken to be 1)
## 
##     Null deviance: 665.361  on 63  degrees of freedom
## Residual deviance:  63.944  on 53  degrees of freedom
## AIC: 641.55
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  3.525 
##           Std. Err.:  0.659 
## 
##  2 x log-likelihood:  -617.555
exp(glm2$coefficients)
##                                 (Intercept) 
##                                3.702632e+02 
##                                GenderFemale 
##                                1.326880e+00 
## Education.LevelAchieved Secondary and above 
##                                2.213955e+00 
##                                 SectorRURAL 
##                                2.316871e+00 
##            Marital.StatusMarried (Polygamy) 
##                                1.306891e-01 
##   Marital.StatusCo-Habiting/living together 
##                                1.722981e-02 
##                      Marital.StatusDivorced 
##                                1.411606e-02 
##                     Marital.StatusSeparated 
##                                4.423157e-02 
##                       Marital.StatusWidowed 
##                                5.955828e-02 
##                 Marital.StatusNever married 
##                                4.212399e-01 
##             Marital.StatusRefused to answer 
##                                5.603072e-03
#check for remaining overdispersion: Pearson chi-squared statistic divided by the residual degrees of freedom
dp <- sum(residuals(glm2, type = "pearson")^2) / glm2$df.residual
dp
## [1] 1.130467
#dp is close to 1, so the negative binomial model accounts for the overdispersion
#Plot residuals to check for goodness of fit
par(mfrow=c(2,2))
plot(glm2)
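
The dispersion check used above can be illustrated on simulated data. The sketch below is not the survey data: it generates made-up overdispersed counts for two hypothetical groups, fits both a Poisson and a negative binomial model, and compares the Pearson dispersion statistic of each. The helper name `pearson_dispersion` is introduced here for illustration.

```r
library(MASS)  # for glm.nb()

set.seed(1)
# Simulated overdispersed counts: two groups with negative binomial noise
n <- 500
group <- factor(rep(c("a", "b"), each = n / 2))
mu <- ifelse(group == "a", 100, 250)
counts <- rnbinom(n, mu = mu, size = 3.5)
sim <- data.frame(counts = counts, group = group)

# Pearson chi-squared statistic divided by residual degrees of freedom
pearson_dispersion <- function(fit) {
  sum(residuals(fit, type = "pearson")^2) / fit$df.residual
}

fit_pois <- glm(counts ~ group, family = poisson, data = sim)
fit_nb   <- glm.nb(counts ~ group, data = sim)

dp_pois <- pearson_dispersion(fit_pois)  # far above 1: Poisson variance is too small
dp_nb   <- pearson_dispersion(fit_nb)    # near 1: extra variance is absorbed by theta
c(poisson = dp_pois, negative_binomial = dp_nb)
```

A statistic far above 1 for the Poisson fit and near 1 for the negative binomial fit is the same pattern the report relies on when it reads dp = 1.13 as acceptable.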

Negative Binomial Regression: The Key Drivers of Financial Inclusion

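The AIC of the negative binomial regression (641.55) is far lower than that of the logistic regression (27432). However, the two models are fitted to different responses (64 aggregated counts versus 24,531 individual binary outcomes), so their AIC values are not directly comparable, and the lower AIC alone does not show that this model explains financial inclusion better.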
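
Both AIC values can be reproduced from the printed output using AIC = -2 x log-likelihood + 2k, where k is the number of estimated parameters. A short sketch with hand-copied values; it assumes that k counts theta as one extra parameter for the negative binomial model, and it uses the fact that for ungrouped binary data the residual deviance equals -2 x log-likelihood:

```r
# Negative binomial: "2 x log-likelihood" is printed as -617.555;
# k = 11 coefficients + 1 dispersion parameter (theta)
aic_nb <- -(-617.555) + 2 * 12

# Logistic: residual deviance (27410, as rounded in the output) stands in
# for -2 x log-likelihood; k = 11 coefficients
aic_logit <- 27410 + 2 * 11

c(negative_binomial = aic_nb, logistic = aic_logit)
```

The results, 641.555 and 27432, match the printed AIC values. Each log-likelihood is summed over that model's own observations, which differ between the two fits.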

The residual plots show approximately normally distributed residuals, with three possible outliers.

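At a significance level of p<0.01, all predictors are significant except Gender (p = 0.047). Because the model is fitted to raw counts with no offset for group size, the ratios below also reflect how many respondents fall in each group (for example, 17,865 rural versus 6,666 urban respondents):

Sector: the expected count of financially included individuals in the rural sector is 2.32 times that in the urban sector.

Education Level: the expected count of financially included individuals among those who achieved secondary education and above is 2.21 times that among those with lower levels of education.

Marital Status: relative to Married (Monogamy), the expected count of financially included individuals is 0.13 times for Married (Polygamy), a decrease of about 87%; all other marital status groups also have ratios below 1.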
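
The ratios quoted above come from exponentiating the log-scale estimates. A minimal sketch of that conversion, with three estimates copied by hand from `summary(glm2)` (the short labels are chosen here for readability):

```r
# Log-scale estimates copied from summary(glm2) above
log_est <- c(SectorRURAL = 0.8402,
             EducationSecondaryPlus = 0.7948,
             MarriedPolygamy = -2.0349)

ratio <- exp(log_est)            # multiplicative change in the expected count
pct_change <- 100 * (ratio - 1)  # the same effect expressed as a percentage

round(cbind(ratio, pct_change), 2)
```

A negative estimate gives a ratio below 1, which is why the Married (Polygamy) value of 1.306891e-01 in the output reads as 0.13 rather than 1.3.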