DATA 606 01[15958] : Final Project [05/15]

1 Part 1 - Introduction
2 Part 2 - Data Preparation
3 Part 3 - Exploratory Data Analysis
4 Part 4 - Inference
5 Part 5 - Conclusion

1 Part 1 - Introduction

Kiva Microfunds (commonly known by its domain name, Kiva.org) is a non-profit organization that allows people to lend money via the Internet to low-income entrepreneurs and students in over 80 countries. Kiva’s mission is “to connect people through lending to alleviate poverty.”

We will use the Kaggle dataset Multidimensional Poverty Measures to investigate poverty around the world. We will connect these measures with the Kiva loans for a more complete understanding and explore the different metrics of the Multidimensional Poverty Measures.

Multidimensional poverty measures can be used to create a more comprehensive picture. They reveal who is poor and how they are poor: the range of different disadvantages they experience. The higher the MPI, the poorer the country.
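
For reference, the global MPI is usually defined as the product of the headcount ratio H (the share of people who are multidimensionally poor) and the intensity A (the average share of deprivations that poor people experience). The sketch below shows that arithmetic on made-up numbers; the country labels and values are hypothetical and are not taken from the dataset.

```r
# Hypothetical values for two illustrative countries (not real data)
poverty <- data.frame(
  country   = c("Country A", "Country B"),
  headcount = c(0.60, 0.10),  # H: share of the population that is MPI poor
  intensity = c(0.50, 0.40)   # A: average share of deprivations among the poor
)

# MPI = H * A; a higher value indicates deeper multidimensional poverty
poverty$mpi <- poverty$headcount * poverty$intensity
print(poverty)
```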

From Wikipedia:

The Human Development Index (HDI) is a composite statistic (composite index) of life expectancy, education, and per capita income indicators, which are used to rank countries into four tiers of human development. A country scores higher HDI when the lifespan is higher, the education level is higher, and the GDP per capita is higher.

Kiva should direct loans to the countries and regions that need them most, which would improve the Human Development Index.

2 Part 2 - Data Preparation

2.1 Load Libraries

readr
tidyverse
stringr
lubridate
DT
leaflet
knitr
treemap
caret
forecast
prophet
data.table
statsr
broom

libraries_used <- c("readr", "tidyverse", "stringr", "lubridate", "DT", "leaflet", "knitr", "treemap", "caret", "forecast", "prophet", "data.table", "broom", "statsr")

# check missing libraries
libraries_missing <- libraries_used[!(libraries_used %in% installed.packages()[,"Package"])]
# install missing libraries
if(length(libraries_missing)) install.packages(libraries_missing, repos = "http://cran.us.r-project.org")

library(readr)      # read data
library(tidyverse)  # data manipulation and graphs
library(stringr)    # string manipulation
library(lubridate)  # date manipulation
library(DT)         # table format display of data
library(leaflet)    # maps
library(knitr)
library(treemap)
library(caret)
library(forecast)
library(prophet)
library(data.table)
library(broom)
library(statsr)
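
An alternative way to find missing packages avoids building the full installed.packages() table. The sketch below is a hedged illustration, not the check used in this report; the vector pkgs is hypothetical and its second entry is a deliberately fake package name.

```r
# Hypothetical package vector; the second name is deliberately fake
pkgs <- c("stats", "notARealPackage606")

# requireNamespace() returns TRUE or FALSE without attaching the package
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names that would need install.packages()
missing_pkgs <- pkgs[!available]
print(missing_pkgs)
```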

load("nc.Rdata")

2.2 Data collection

Download and load data from Kaggle

rm(list=ls())

fillColor = "#FFA07A"
fillColor2 = "#F1C40F"
fillColorLightCoral = "#F08080"

loans <- readr::read_csv('kiva_loans.csv')
regions <- readr::read_csv("kiva_mpi_region_locations.csv")
themes <- readr::read_csv("loan_theme_ids.csv")
themes_region <- readr::read_csv("loan_themes_by_region.csv")

mpi_national = readr::read_csv('MPI_national.csv')
mpi_subnational = readr::read_csv('MPI_subnational.csv')

country_stats <- readr::read_csv("country_stats.csv")
GEconV4 = readr::read_delim(file = "GEconV4.csv",delim=";")

countries_continents = readr::read_csv('countries and continents.csv')

ConflictsData =  readr::read_csv("african_conflicts.csv", 
                                col_types = readr::cols(.default = readr::col_character(),
                                                        FATALITIES = readr::col_integer(),
                                                        GEO_PRECISION = readr::col_integer(),
                                                        GWNO = readr::col_integer(),
                                                        INTER1 = readr::col_integer(),
                                                        INTER2 = readr::col_integer(),
                                                        INTERACTION = readr::col_integer(),
                                                        LATITUDE = readr::col_character(),
                                                        LONGITUDE = readr::col_character(),
                                                        TIME_PRECISION = readr::col_integer(),
                                                        YEAR = readr::col_integer()
                                                      ))

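# Keep signed decimal values (southern latitudes and western longitudes are negative); anything else becomes NA
ConflictsData$LATITUDE[!grepl("^-?[0-9.]+$", ConflictsData$LATITUDE)] <- NA
ConflictsData$LONGITUDE[!grepl("^-?[0-9.]+$", ConflictsData$LONGITUDE)] <- NA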
 
ConflictsData$LATITUDE = as.numeric(as.character(ConflictsData$LATITUDE))
ConflictsData$LONGITUDE = as.numeric(as.character(ConflictsData$LONGITUDE))
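
Coordinates south of the equator or west of the prime meridian carry a minus sign, so any pattern used to validate them needs to allow one. The sketch below checks this on made-up strings; the vector raw_lat is hypothetical and the pattern is one possible choice.

```r
# Hypothetical raw coordinate strings, including a southern latitude and bad values
raw_lat <- c("9.15", "-1.2921", "N/A", "")

# Optional leading minus sign, digits, then an optional decimal part
is_coord <- grepl("^-?[0-9]+(\\.[0-9]+)?$", raw_lat)

# Convert only the strings that passed validation; the rest stay NA
lat <- rep(NA_real_, length(raw_lat))
lat[is_coord] <- as.numeric(raw_lat[is_coord])
print(lat)
```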

loans$use = trimws(loans$use)

2.3 Cases

This dataset comes from an observational study. Borrowers from different countries have requested a loan or charitable donation through Kiva. Each row contains the borrower information and current residence, along with the repayment status.

For this project, I have used the complete dataset in some cases, and for some analyses I have taken a simple random sample of 10000 rows.
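
A simple random sample of this kind can be drawn in base R as sketched below. The data frame loans_demo is a synthetic stand-in for the loans table and the seed value is arbitrary; both are assumptions for illustration.

```r
set.seed(606)  # arbitrary seed so the sample is reproducible

# Synthetic stand-in for the loans table (50,000 rows)
loans_demo <- data.frame(
  id            = seq_len(50000),
  funded_amount = round(rexp(50000, rate = 1 / 600))
)

# Simple random sample without replacement: every row is equally likely to be chosen
sample_rows  <- sample(nrow(loans_demo), size = 10000)
loans_sample <- loans_demo[sample_rows, ]
nrow(loans_sample)
```

With dplyr loaded, slice_sample(loans, n = 10000) gives the same kind of sample.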

2.4 Variables

The Kiva loans data has about 20 variables, including the funded amount, loan amount and activity, and around 670K observations.

The current analysis uses the variables below.

Kiva Loans	Description
loan_amount	The listed amount of the loan applied for by the borrower. If at some point in time, the credit department reduces the loan amount, then it will be reflected in this value.
funded_amount	The total amount committed to that loan at that point in time.
country	Country where the Kiva loan was granted
region	Specific region/city in the country where the Kiva loan was granted
term_in_months	The number of payments on the loan. Values are in months and range from 1 to 160.

2.5 Type of study

This is an observational study. We will arrive at a conclusion by performing the tests below on the variables mentioned above.

Hypothesis Test - Assess whether an observed result could be due to chance alone.
F-Test - Compare multiple variables.
Create linear regression - Fit the regression line with the available parameters. Compare the predicted and observed outcomes.
Create logarithmic regression - Create a model for non-linear data variables.
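
The regression steps listed above can be sketched with base R's lm(). The data below is synthetic, and the choice of lender_count as the explanatory variable is an assumption for illustration, not the model fitted in this report.

```r
set.seed(606)

# Synthetic stand-in data: funded amount grows multiplicatively with lender count
n <- 500
lender_count  <- sample(1:100, n, replace = TRUE)
funded_amount <- 30 * lender_count^0.9 * exp(rnorm(n, sd = 0.2))
demo <- data.frame(lender_count, funded_amount)

# Linear model on the raw scale
fit_linear <- lm(funded_amount ~ lender_count, data = demo)

# Log-log model for the skewed, non-linear relationship
fit_log <- lm(log(funded_amount) ~ log(lender_count), data = demo)

# The F-statistic compares each model against an intercept-only model
summary(fit_linear)$fstatistic
coef(fit_log)

# Residuals are the observed values minus the predicted values
head(residuals(fit_linear))
```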

2.6 Scope of inference

Generalizability: The population of Kiva loans for this study is global, but we have specifically studied Africa, Asia and the Americas, along with one country from each of these continents, namely Kenya, India and El Salvador. Borrowing from a bank requires credit score information and personal information such as home ownership, purpose of the loan, employment length and annual income.

These requirements may not apply to Kiva loans.

Bias: The bias that limits generalizability here lies in the borrower information. Only people who know about Kiva request loans through it. A bank might use another confounding variable to set the interest rate. Often, countries with a poor credit score get a higher-interest loan, because low-graded loans carry a risk of being written off.

2.7 Causality

As this is an observational study we cannot derive any causal connections between the variables.

2.7.1 Glimpse of Data

2.7.1.1 Loans data

Loans data contains data about some of Kiva’s loans. It is a subset of Kiva’s data snapshots.

tibble::glimpse(loans)

## Observations: 671,205
## Variables: 20
## $ id                 <dbl> 653051, 653053, 653068, 653063, 653084, 108...
## $ funded_amount      <dbl> 300, 575, 150, 200, 400, 250, 200, 400, 475...
## $ loan_amount        <dbl> 300, 575, 150, 200, 400, 250, 200, 400, 475...
## $ activity           <chr> "Fruits & Vegetables", "Rickshaw", "Transpo...
## $ sector             <chr> "Food", "Transportation", "Transportation",...
## $ use                <chr> "To buy seasonal, fresh fruits to sell.", "...
## $ country_code       <chr> "PK", "PK", "IN", "PK", "PK", "KE", "IN", "...
## $ country            <chr> "Pakistan", "Pakistan", "India", "Pakistan"...
## $ region             <chr> "Lahore", "Lahore", "Maynaguri", "Lahore", ...
## $ currency           <chr> "PKR", "PKR", "INR", "PKR", "PKR", "KES", "...
## $ partner_id         <dbl> 247, 247, 334, 247, 245, NA, 334, 245, 245,...
## $ posted_time        <dttm> 2014-01-01 06:12:39, 2014-01-01 06:51:08, ...
## $ disbursed_time     <dttm> 2013-12-17 08:00:00, 2013-12-17 08:00:00, ...
## $ funded_time        <dttm> 2014-01-02 10:06:32, 2014-01-02 09:17:23, ...
## $ term_in_months     <dbl> 12, 11, 43, 11, 14, 4, 43, 14, 14, 11, 11, ...
## $ lender_count       <dbl> 12, 14, 6, 8, 16, 6, 8, 8, 19, 24, 3, 16, 1...
## $ tags               <chr> NA, NA, "user_favorite, user_favorite", NA,...
## $ borrower_genders   <chr> "female", "female, female", "female", "fema...
## $ repayment_interval <chr> "irregular", "irregular", "bullet", "irregu...
## $ date               <date> 2014-01-01, 2014-01-01, 2014-01-01, 2014-0...

2.7.1.2 Regions data

This data contains Kiva’s estimates as to the geolocation of subnational MPI regions

DT::datatable(head(regions), style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

2.7.1.3 Themes data

This data contains Kiva’s loan themes

DT::datatable(head(themes), style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

2.7.1.4 Themes and Regions data

This data maps Kiva’s loan themes to the Regions data.

tibble::glimpse(themes_region)

## Observations: 15,736
## Variables: 21
## $ `Partner ID`         <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,...
## $ `Field Partner Name` <chr> "KREDIT Microfinance Institution", "KREDI...
## $ sector               <chr> "General Financial Inclusion", "General F...
## $ `Loan Theme ID`      <chr> "a1050000000slfi", "a10500000068jPe", "a1...
## $ `Loan Theme Type`    <chr> "Higher Education", "Vulnerable Populatio...
## $ country              <chr> "Cambodia", "Cambodia", "Cambodia", "Camb...
## $ forkiva              <chr> "No", "No", "No", "No", "No", "No", "No",...
## $ region               <chr> "Banteay Meanchey", "Battambang Province"...
## $ geocode_old          <chr> "(13.75, 103.0)", NA, NA, "(12.0, 105.5)"...
## $ ISO                  <chr> "KHM", "KHM", "KHM", "KHM", "KHM", "KHM",...
## $ number               <dbl> 1, 58, 7, 1383, 3, 36, 2, 249, 7, 18, 890...
## $ amount               <dbl> 450, 20275, 9150, 604950, 275, 62225, 130...
## $ LocationName         <chr> "Banteay Meanchey, Cambodia", "Battambang...
## $ geocode              <chr> "[(13.6672596, 102.8975098)]", "[(13.0286...
## $ names                <chr> "Banteay Meanchey Province; Cambodia", "B...
## $ geo                  <chr> "(13.6672596, 102.8975098)", "(13.0286971...
## $ lat                  <dbl> 13.66726, 13.02870, 13.02870, 12.09829, 1...
## $ lon                  <dbl> 102.8975, 102.9896, 102.9896, 105.3131, 1...
## $ mpi_region           <chr> "Banteay Mean Chey, Cambodia", "Banteay M...
## $ mpi_geo              <chr> "(13.6672596, 102.8975098)", "(13.6672596...
## $ rural_pct            <dbl> 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 9...

3 Part 3 - Exploratory Data Analysis

Below are some exploratory data analysis charts to understand more about the data.

3.1 Explanatory

The table below summarizes each question with its response and explanatory variables. It also shows whether each variable is numerical or categorical.

Question	Response Variable	Explanatory Variable
1. What are Kiva loans used for?	use (Categorical)	sector (Categorical), activity (Categorical)
2. Popular Sector for Kiva loans for each continent and country?	loan_amount (Numerical)	sector (Categorical), purpose (Categorical)
3. Who funds Kiva loans?	Field Partner Name (Categorical)	amount (Numerical)
4. Countries that require the most loans to improve `Human Development Index`?	MPI National (Numerical)	-
5. Commonly requested loan terms and gender?	term_in_months (Numerical)	borrower_genders (Categorical)

3.2 Popular Sector

The bar chart below shows the most popular sectors for loans.

loans %>%
  group_by(sector) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(sector = reorder(sector,Count)) %>%
  head(10) %>%
  ggplot(aes(x = sector,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Sector', 
       y = 'Count', 
       title = 'Sector and Count') +
  coord_flip() +
  theme_bw()

3.3 Popular Activity for taking loans

The bar chart below shows the most popular activities for loans.

loans %>%
  group_by(activity) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(activity = reorder(activity,Count)) %>%
  head(10) %>%
  ggplot(aes(x = activity,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = activity, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Activity', 
       y = 'Count', 
       title = 'Activity and Count') +
  coord_flip() +
  theme_bw()

3.4 Popular Use of loans

loans %>%
  mutate(use = trimws(use)) %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColorLightCoral) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count') +
  coord_flip() +
  theme_bw()

3.5 Distribution of the Funded Loan amount

The funded loan amount is shown as a histogram. Both the X axis and the Y axis have been log transformed for better visualization.

fundedLoanAmountDistribution <- function(loans)
{
  loans %>%
    ggplot(aes(x = funded_amount) )+
    scale_x_log10(
                  breaks = scales::trans_breaks("log10", function(x) 10^x),
                  labels = scales::trans_format("log10", scales::math_format(10^.x))
    ) +
    scale_y_log10(
                  breaks = scales::trans_breaks("log10", function(x) 10^x),
                  labels = scales::trans_format("log10", scales::math_format(10^.x))
    ) + 
    geom_histogram(fill = fillColor2,bins=50) +
    labs(x = 'Funded Loan Amount' ,y = 'Count', title = paste("Distribution of", "Funded Loan Amount")) +
    theme_bw()
}

fundedLoanAmountDistribution(loans)
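
One consequence of the log10 axes is worth noting: the regional summaries later in this report show a minimum funded amount of 0, and a zero has no finite position on a log scale. ggplot2 typically drops such values with a warning. The sketch below shows the arithmetic on made-up amounts; the vector funded is hypothetical.

```r
# Hypothetical funded amounts, including one unfunded loan
funded <- c(0, 25, 300, 5000)

# The zero maps to -Inf on a log10 scale
log_funded <- log10(funded)
print(log_funded)

# Count the values that a log axis cannot place
finite <- is.finite(log_funded)
sum(!finite)
```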

3.5.1 Distribution by Country

loans_funded_amount = loans %>%
  group_by(country) %>%
  summarise(FundedAmount = sum(funded_amount)) %>%
  arrange(desc(FundedAmount)) %>%
  ungroup() %>%
  mutate(country = reorder(country,FundedAmount)) %>%
  head(20) 

treemap(loans_funded_amount, 
        index="country", 
        vSize = "FundedAmount",  
        title="Funded Amount", 
        palette = "RdBu",
        fontsize.title = 14)

The Philippines, Kenya, Peru, Paraguay and El Salvador receive the most funding through Kiva loans.

### Summary of Funded Amount
loans %>%
   select(funded_amount) %>%
   summary()

3.5.2 Distribution by Sector

The funded loan amount is shown sector wise below. The amount has been scaled by log10.

loans %>%
  mutate( fill = as.factor(sector)) %>%
      ggplot(aes(x = sector, y= funded_amount, fill = sector)) +
      scale_y_log10(
        breaks = scales::trans_breaks("log10", function(x) 10^x),
        labels = scales::trans_format("log10", scales::math_format(10^.x))
      ) +
  geom_boxplot() +
  labs(x= 'Sector Type',y = 'Funded Amount', 
       title = paste("Distribution of", ' Funded Amount ')) +
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

3.5.3 Distribution by Gender

loans %>%
  filter(!is.na(borrower_genders)) %>%
  mutate(gender = ifelse(str_detect(borrower_genders,"female"), "female","male")) %>%
  group_by(gender) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(gender = reorder(gender,Count)) %>%
  head(10) %>%
  ggplot(aes(x = gender,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = gender, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Gender', 
       y = 'Count', 
       title = 'Gender and Count') +
  coord_flip() +
  theme_bw()

As shown above, women take out more loans than men.
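
The chart above labels a loan as female whenever borrower_genders contains the word "female", so mixed groups are counted as female. A finer split is possible; the sketch below is one hedged way to do it on made-up values. The function name classify_group and the three labels are hypothetical.

```r
# Hypothetical values in the same comma-separated format as borrower_genders
genders <- c("female", "male", "female, female", "male, female", "male, male")

# Label each loan by the composition of its borrower group
classify_group <- function(x) {
  members <- trimws(strsplit(x, ",")[[1]])
  if (all(members == "female")) {
    "female only"
  } else if (all(members == "male")) {
    "male only"
  } else {
    "mixed"
  }
}

group_type <- vapply(genders, classify_group, character(1), USE.NAMES = FALSE)
table(group_type)
```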

3.6 Common Loan Term In Months

loans %>%
  filter(!is.na(term_in_months)) %>%
  group_by(term_in_months) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(term_in_months = reorder(term_in_months,Count)) %>%
  head(10) %>%
  ggplot(aes(x = term_in_months,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = term_in_months, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Term in Months', 
       y = 'Count', 
       title = 'Term in Months and Count') +
  coord_flip() +
   theme_bw()

As shown above, 14 months is the most common loan term, followed by 8, 11, 7 and 13 months.

3.7 Popular Countries for loans

The bar chart below shows the countries with the most loans.

loans %>%
  group_by(country) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(country = reorder(country,Count)) %>%
  head(10) %>%
  ggplot(aes(x = country,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = country, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'Count', 
       title = 'Country and Count') +
  coord_flip() +
  theme_bw()

3.8 Maps of Loans

We plot the loans on the world map, with the size of each dot proportional to the loan amount.

leaflet(themes_region) %>% addTiles() %>%
  addCircles(lng = ~lon, lat = ~lat,radius = ~(amount/10) ,
             color = fillColor2)  %>%
  # controls
  setView(lng=0, lat=0,zoom = 2)
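
In leaflet, the radius passed to addCircles() is measured in meters, so a radius of amount/10 makes the circle area grow with the square of the amount. One possible alternative, shown here as a sketch rather than the method used in this report, is square-root scaling so that the area is proportional to the amount. The helper name scale_radius and the max_radius_m default are assumptions for illustration.

```r
# Hypothetical helper: radius grows with sqrt(amount), so circle area is proportional to amount
scale_radius <- function(amount, max_radius_m = 100000) {
  max_radius_m * sqrt(amount / max(amount, na.rm = TRUE))
}

# Sample values from the amount column shown in the themes_region glimpse
amounts <- c(450, 20275, 604950)
radii <- scale_radius(amounts)
round(radii)
```

Such a helper could then be used as radius = ~scale_radius(amount) in the addCircles() call.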

3.9 Popular Theme

The following bar chart shows the most popular themes. We have removed rows where the theme was not mentioned.

* General is the most popular theme, which does not give us a lot of information.
* Underserved is the next most popular theme, followed by Agriculture, Rural Inclusion, Water and Higher Education.

themes_region_combined = inner_join(themes_region, regions, 
                                    by=c('country')) %>%
                          select(world_region,lat.x,lon.x,country,amount) %>%
                          rename (lat = lat.x) %>%
                          rename(lon = lon.x)
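
The regions table appears to hold several rows per country (one per subnational region), so a join on country alone pairs every loan-theme row with every region row of that country and inflates the counts plotted later. The sketch below shows the effect on toy data, using base merge() as a stand-in for inner_join(); the table names themes_demo and regions_demo are hypothetical.

```r
# Toy tables: Kenya has two rows in each table
themes_demo  <- data.frame(country = c("Kenya", "Kenya", "Ghana"),
                           amount  = c(100, 200, 300))
regions_demo <- data.frame(country      = c("Kenya", "Kenya", "Ghana"),
                           world_region = rep("Sub-Saharan Africa", 3))

# Joining on country alone repeats each theme row once per matching region row
inflated <- merge(themes_demo, regions_demo, by = "country")
nrow(inflated)   # 2 x 2 rows for Kenya plus 1 for Ghana

# Reducing the lookup to one row per country keeps the original row count
lookup <- unique(regions_demo[, c("country", "world_region")])
joined <- merge(themes_demo, lookup, by = "country")
nrow(joined)
```

This mirrors the select() and unique() steps that the continent-level loan tables in this report apply before joining.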

themes %>%
  rename (themeType = `Loan Theme Type`) %>%
  filter(!is.na(themeType)) %>%
  group_by(themeType) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(themeType = reorder(themeType,Count)) %>%
  head(10) %>%
  ggplot(aes(x = themeType,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = themeType, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Type of Theme', 
       y = 'Count', 
       title = 'Type of Theme and Count') +
  coord_flip() +
  theme_bw()

3.10 Kiva Loans in respective Continents

3.10.1 AFRICA

3.10.1.1 Loan Distribution

The funded loan amount is shown as a histogram. Both the X axis and the Y axis have been log transformed for better visualization.

AfricanLoans <- regions %>%
                  select(country, world_region) %>%
                  unique() %>%
                  inner_join(loans) %>%
                  filter(str_detect(world_region,"Africa"))

fundedLoanAmountDistribution(AfricanLoans)

AfricanLoans %>%
   select(funded_amount) %>%
   summary()

##  funded_amount    
##  Min.   :    0.0  
##  1st Qu.:  200.0  
##  Median :  375.0  
##  Mean   :  660.6  
##  3rd Qu.:  725.0  
##  Max.   :50000.0

The bar plot shows the African countries that have been given the most loans. The map of Africa shows the loans as dots.

plotBarPlotLoansInGeography <- function(country_loans)
{
  country_loans %>%
    group_by(country) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(country = reorder(country,Count)) %>%
    head(10) %>%
    ggplot(aes(x = country,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColorLightCoral) +
    geom_text(aes(x = country, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Countries', 
         y = 'Count', 
         title = 'Countries and Count') +
    coord_flip() +
    theme_bw()
}

plotMapsLoansInGeography <- function(country_loans)
{
  center_lon = median(country_loans$lon,na.rm = TRUE)
  center_lat = median(country_loans$lat,na.rm = TRUE)

  leaflet(country_loans) %>% addTiles() %>%
    addCircles(lng = ~lon, lat = ~lat,radius = ~(amount/10) ,
               color =fillColor2)  %>%
    # controls
    setView(lng=center_lon, lat=center_lat,zoom = 3) 
}

country_loans = themes_region_combined %>%
  filter(str_detect(world_region,"Africa"))

unique(country_loans$world_region)

## [1] "Sub-Saharan Africa"

plotBarPlotLoansInGeography(country_loans)

plotMapsLoansInGeography(country_loans)

Kenya, Lesotho, Uganda, Malawi and Ghana are the countries that have received the most loans.

Observations

We observe from the sections Multidimensional Poverty Measures and Distribution of loans in Africa that we do not see loans in the poorest areas. This might be an opportunity for Kiva to help these very underprivileged countries.

3.10.1.2 African Conflicts Data

We extend our analysis to the African Conflicts data found on Kaggle and map the most battle-prone areas in 2016 and 2017. The intention is to highlight that these battle-prone areas may be in need of funds for very basic necessities such as water and food.

# Identify areas where there is Battle
keywordBattle = "Battle"

BattleData = ConflictsData %>% 
  filter(!is.na(LATITUDE)) %>%
  filter(!is.na(LONGITUDE)) %>%
  filter(str_detect(EVENT_TYPE,keywordBattle) )

BattleData20162017 = BattleData %>% filter(YEAR >= 2016)

center_lon = median(BattleData$LONGITUDE)
center_lat = median(BattleData$LATITUDE)

leaflet(BattleData20162017) %>% addTiles() %>%
  addCircles(lng = ~LONGITUDE, lat = ~LATITUDE,radius = ~(FATALITIES), 
             color = fillColor)  %>%
  # controls
  setView(lng=center_lon, lat=center_lat, zoom=3)

3.10.1.3 Battle affected Countries

The following plot shows the high-conflict countries. Count is the number of battles in 2016 and 2017.

HighConflictCountries <- BattleData20162017 %>%
  group_by(COUNTRY) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(COUNTRY = reorder(COUNTRY,Count)) %>%
  head(10)

HighConflictCountries %>%
  ggplot(aes(x = COUNTRY,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = c("red")) +
  geom_text(aes(x = COUNTRY, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'Count', 
       title = 'High Conflict Countries and Count') +
  coord_flip() +
  theme_bw()

We observe that Somalia, South Sudan, Libya, Nigeria and Sudan are the countries most affected by battles. We will investigate how many loans were given in the major battle-affected countries.

3.10.1.4 Loans in Battle affected Countries

We explore the number of loans provided in the Top Twenty High Conflict countries.

HighConflictCountries <- BattleData20162017 %>%
  group_by(COUNTRY) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  head(20)

HighConflictCountryLoans <- inner_join(AfricanLoans, HighConflictCountries,
                                       by=c("country" = "COUNTRY")) %>%
                            group_by(country) %>%
                            summarise(Count = n()) %>%
                            arrange(desc(Count))

HighConflictCountryLoans %>%
  ungroup() %>%
  mutate(country = reorder(country,Count)) %>%
  ggplot(aes(x = country,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = country, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Battle affected Countries', 
       y = 'Loan Count', 
       title = 'Battle affected Countries and Loan Count') +
  coord_flip() +
   theme_bw()

Somalia, South Sudan, Libya and Sudan do not feature much in the Kiva loans. There is a lot of opportunity for people in these countries to leverage Kiva.

Use of loans

AfricanLoans <- regions %>%
                select(country, world_region) %>%
                unique() %>%
                inner_join(loans) %>%
                filter(str_detect(world_region,"Africa"))

AfricanLoans %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count') +
     coord_flip() +
     theme_bw()

3.10.1.5 Popular Sector

The bar chart below shows the most popular sectors for loans in Africa.

AfricanLoans %>%
  filter(!is.na(sector)) %>%
  group_by(sector) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(sector = reorder(sector,Count)) %>%
  head(10) %>%
  ggplot(aes(x = sector,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Sector', 
       y = 'Count', 
       title = 'Sector and Count') +
  coord_flip() +
   theme_bw()

Observations

We observe that the African median funded amount ($375) is less than the world median funded amount ($450). Kiva can channel more funds to the African continent, since smaller amounts can help there.
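
A comparison of medians like this one can be computed per region in a single step. The sketch below uses base aggregate() on a small synthetic table; the amounts are illustrative values only and the name demo is hypothetical.

```r
# Synthetic funded amounts by region (illustrative values only)
demo <- data.frame(
  world_region  = c("Sub-Saharan Africa", "Sub-Saharan Africa", "Sub-Saharan Africa",
                    "South Asia", "South Asia", "South Asia"),
  funded_amount = c(200, 375, 725, 225, 325, 575)
)

# Median funded amount per region
by_region <- aggregate(funded_amount ~ world_region, data = demo, FUN = median)
print(by_region)

# Overall median for comparison
overall_median <- median(demo$funded_amount)
print(overall_median)
```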

3.10.2 ASIA

3.10.2.1 Loan Distribution

country_loans = themes_region_combined %>%
  filter(str_detect(world_region,"Asia"))

unique(country_loans$world_region)

## [1] "East Asia and the Pacific" "Europe and Central Asia"  
## [3] "South Asia"

plotBarPlotLoansInGeography(country_loans)

The Philippines, Cambodia, Indonesia, Tajikistan and Pakistan have the most loans.

The funded loan amount is shown as a histogram. Both the X axis and the Y axis have been log transformed for better visualization.

AsianLoans <- regions %>%
                select(country,world_region) %>%
                unique() %>%
                inner_join(loans) %>%
                filter(str_detect(world_region,"Asia"))

fundedLoanAmountDistribution(AsianLoans)

#### Summary of Funded Amount
AsianLoans %>%
   select(funded_amount) %>%
   summary()

##  funded_amount  
##  Min.   :    0  
##  1st Qu.:  225  
##  Median :  325  
##  Mean   :  497  
##  3rd Qu.:  575  
##  Max.   :50000

3.10.2.2 Poorest Asian Countries

Afghanistan, Yemen, Pakistan, India and Bangladesh are the poorest Asian Countries from the MPI Rural measure.
Afghanistan, Bangladesh, Pakistan, Yemen and India are the poorest Asian Countries from the MPI Urban measure.

3.10.2.2.1 Rural

countries_continents = countries_continents %>%
  select(name,Continent)

mpi_national_continent = inner_join(mpi_national, countries_continents,
                                    by=c('Country'= 'name'))

poor_countries_rural <- mpi_national_continent %>%
                        filter(Continent == 'AS') %>%
                        rename(MPIRural = `MPI Rural`) %>%
                        arrange(desc(MPIRural)) %>%
                        head(15) %>%
                        select(Country,MPIRural)

treemap(poor_countries_rural, 
        index="Country", 
        vSize = "MPIRural",  
        title="Poorest Countries Rural Perspective", 
        palette = "RdBu",
        fontsize.title = 14)

3.10.2.2.2 Urban

poor_countries_urban <- mpi_national_continent %>%
                          filter(Continent == 'AS') %>%
                          rename(MPIUrban = `MPI Urban`) %>%
                          arrange(desc(MPIUrban)) %>%
                          head(15) %>%
                          select(Country,MPIUrban)

treemap(poor_countries_urban, 
        index="Country", 
        vSize = "MPIUrban",  
        title="Poorest Countries Urban Perspective", 
        palette = "RdBu",
        fontsize.title = 14)

3.10.2.3 Loans in Poorest Asian countries

poor_countries_loans <- inner_join(AsianLoans, poor_countries_rural,
                                   by=c("country"="Country")) %>%
                        group_by(country) %>%
                        summarise(Count = n()) %>%
                        arrange(desc(Count))

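# Countries on the poorest list with no matching loans (as.tibble() is deprecated in favour of as_tibble())
tibble::as_tibble(setdiff(poor_countries_rural$Country,poor_countries_loans$country))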

Although the above countries are among the poorest Asian countries by the MPI Rural measure, they do not appear in the Kiva loans. There is a good opportunity to include them in the Kiva family.
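
The comparison relies on setdiff(), which returns the elements of its first argument that are absent from the second. The sketch below uses placeholder country labels, which are hypothetical and not results from the data.

```r
# Placeholder country labels for illustration (not real results)
poorest   <- c("Country A", "Country B", "Country C", "Country D")
with_loan <- c("Country C", "Country A")

# Countries on the poorest list that have no matching loan rows
not_covered <- setdiff(poorest, with_loan)
print(not_covered)
```

Because the match is on the exact country string, a spelling difference between the two datasets would also make a country appear in this list.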

3.10.2.4 Use of loans

AsianLoans %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count') +
  coord_flip() +
  theme_bw()


3.10.2.5 Popular Sector

The bar chart below shows the most popular sectors for loans in Asia.

AsianLoans %>%
  filter(!is.na(sector)) %>%
  group_by(sector) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(sector = reorder(sector,Count)) %>%
  head(10) %>%
  ggplot(aes(x = sector,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Sector', 
       y = 'Count', 
       title = 'Sector and Count') +
  coord_flip() +
  theme_bw()

3.10.3 AMERICAS

3.10.3.1 Loan Distribution

The funded loan amount is shown as a histogram. Both the X axis and the Y axis have been log transformed for better visualization.

AmericasLoans <- regions %>%
                  select(country,world_region) %>%
                  unique() %>%
                  inner_join(loans) %>%
                  filter(str_detect(world_region,"America"))

fundedLoanAmountDistribution(AmericasLoans)

AmericasLoans %>%
   select(funded_amount) %>%
   summary()

##  funded_amount     
##  Min.   :     0.0  
##  1st Qu.:   400.0  
##  Median :   600.0  
##  Mean   :   917.5  
##  3rd Qu.:  1000.0  
##  Max.   :100000.0

country_loans = themes_region_combined %>%
  filter(str_detect(world_region,"America"))

unique(country_loans$world_region)

## [1] "Latin America and Caribbean"

plotBarPlotLoansInGeography(country_loans)

3.10.3.2 Poorest South American Countries

Peru, Suriname, Colombia, Brazil, Ecuador and Guyana are the poorest South American countries from the MPI Rural and MPI Urban perspective.

3.10.3.2.1 Rural

poor_countries_rural <- mpi_national_continent %>%
  filter(Continent == 'SA') %>%
  rename(MPIRural = `MPI Rural`) %>%
  arrange(desc(MPIRural)) %>%
  head(15) %>%
  select(Country,MPIRural)

treemap(poor_countries_rural, 
        index="Country", 
        vSize = "MPIRural",  
        title="Poorest Countries Rural Perspective", 
        palette = "RdBu",
        fontsize.title = 14)

3.10.3.2.2 Urban

poor_countries_urban <- mpi_national_continent %>%
  filter(Continent == 'SA') %>%
  rename(MPIUrban = `MPI Urban`) %>%
  arrange(desc(MPIUrban)) %>%
  head(15) %>%
  select(Country,MPIUrban)

treemap(poor_countries_urban, 
        index="Country", 
        vSize = "MPIUrban",  
        title="Poorest Countries Urban Perspective", 
        palette = "RdBu",
        fontsize.title = 14)

3.10.3.3 Loans in Poorest South American countries

poor_countries_loans <- inner_join(AmericasLoans, poor_countries_rural,
                                   by =c("country" = "Country")) %>%
                        group_by(country) %>%
                        summarise(Count = n()) %>%
                        arrange(desc(Count))

# as.tibble() is deprecated; build the tibble directly with a named column
tibble(Country = setdiff(poor_countries_rural$Country,poor_countries_loans$country))

Although the countries listed above are among the poorest South American countries by the MPI Rural measure, they do not appear in the Kiva loans data. They represent a good opportunity for Kiva to expand.
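The comparison above rests on `setdiff()`, which returns the elements of its first argument that are absent from the second. Below is a minimal sketch with made-up placeholder names. The vectors `poorest` and `has_loans` are illustrative assumptions, not values from the MPI or Kiva data.

```r
# Made-up placeholder lists, for illustration only.
poorest   <- c("Country A", "Country B", "Country C", "Country D")
has_loans <- c("Country A", "Country C", "Country E")

# Entries of `poorest` with no matching entry in `has_loans`.
missing_from_loans <- setdiff(poorest, has_loans)
print(missing_from_loans)
```

Note that `setdiff()` matches names exactly, so spelling differences between the two sources (for example, alternative country names) would also show up as "missing".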

3.10.3.4 Use of loans

AmericasLoans <- regions %>%
                  select(country,world_region) %>%
                  unique() %>%
                  inner_join(loans) %>%
                  filter(str_detect(world_region,"America"))

AmericasLoans %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count') +
     coord_flip() +
     theme_bw()

3.10.3.5 Popular Sector

The bar chart below shows the most popular sectors for loans in the Americas.

AmericasLoans %>%
  filter(!is.na(sector)) %>%
  group_by(sector) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(sector = reorder(sector,Count)) %>%
  head(10) %>%
  ggplot(aes(x = sector,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Sector', 
       y = 'Count', 
       title = 'Sector and Count') +
  coord_flip() +
   theme_bw()

3.11 Kiva Loans in respective Countries

3.11.1 Kenya

3.11.1.1 Loans Distribution

We show the distribution of funded loan amounts in Kenya and then plot the loans on a map.

country_loans = loans %>%
    filter(country == "Kenya")

fundedLoanAmountDistribution(country_loans)

summary(country_loans$funded_amount)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   200.0   300.0   425.3   500.0 50000.0

country_loans = themes_region %>% 
  filter(country == "Kenya") %>%
  rename (themeType = `Loan Theme Type`) 

center_lon = median(country_loans$lon,na.rm = TRUE)
center_lat = median(country_loans$lat,na.rm = TRUE)

leaflet(country_loans) %>% addTiles() %>%
  addCircles(lng = ~lon, lat = ~lat,radius = ~(amount/100) ,
             color = ~c("blue"))  %>%
  # controls
  setView(lng=center_lon, lat=center_lat,zoom = 5)

3.11.1.2 Funding Partners

We plot the most dominant funding partners of Kiva in Kenya.

country_loans %>%
  rename(FieldPartnerName =`Field Partner Name`) %>%
  group_by(FieldPartnerName) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(FieldPartnerName = reorder(FieldPartnerName,Count)) %>%
  head(10) %>%
  ggplot(aes(x = FieldPartnerName,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = FieldPartnerName, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Field Partner Name', 
       y = 'Count', 
       title = 'Field Partner Name and Count') +
  coord_flip() +
   theme_bw()

3.11.1.3 Popular Sector

The bar chart below shows the most popular sectors for loans.

plotLoansAndSectorByCountry <- function(loans, countryName, fillColor) {
  loans %>%
    filter(country == countryName) %>%
    group_by(sector) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(sector = reorder(sector,Count)) %>%
    head(10) %>%
    ggplot(aes(x = sector,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColor) +
    geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Sector', 
         y = 'Count', 
         title = 'Sector and Count') +
    coord_flip() +
     theme_bw()
}

plotLoansAndSectorByCountry(loans, "Kenya", fillColor)

3.11.1.4 Popular Activity

The bar chart below shows the most popular activities for loans.

plotLoansAndActivityByCountry <- function(loans, countryName,fillColor) {
  loans %>%
    filter(country == countryName) %>%
    group_by(activity) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(activity = reorder(activity,Count)) %>%
    head(10) %>%
    ggplot(aes(x = activity,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColor) +
    geom_text(aes(x = activity, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Activity', 
         y = 'Count', 
         title = 'Activity and Count') +
    coord_flip() +
     theme_bw()
}

plotLoansAndActivityByCountry(loans, "Kenya", fillColor2)

3.11.1.5 Popular Use of Loans

plotLoansAndUseByCountry <- function(loans, countryName,fillColor2) {
    loans %>%
    filter(country == countryName) %>%
    filter(!is.na(use)) %>%
    group_by(use) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(use = reorder(use,Count)) %>%
    head(10) %>%
    ggplot(aes(x = use,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColor2) +
    geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Use of Loans', 
         y = 'Count', 
         title = 'Use of Loans and Count') +
       coord_flip() +
       theme_bw() 
  }

plotLoansAndUseByCountry(loans,"Kenya",fillColorLightCoral)

3.11.1.6 Loan Trends

We show the trend of loans from 2014 onwards. The number of loans keeps increasing over time.

loansData = loans %>%
  filter(country == "Kenya") %>%
  filter(!is.na(funded_time)) %>%
  mutate(year = year(ymd_hms(funded_time))) %>%
  mutate(month = month(ymd_hms(funded_time))) %>%
  filter(!is.na(year)) %>%
  filter(!is.na(month)) %>%
  group_by(year,month) %>%
  summarise(Count = n()) %>%
  mutate(YearMonth = make_date(year=year,month=month)) 
  
loansData %>%
  ggplot(aes(x=YearMonth,y=Count,group = 1)) +
  geom_line(size=1, color="red")+
  geom_point(size=3, color="red") +
  labs(x = 'Time', y = 'Count',title = 'Trend of loans') +
  theme_bw()
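The monthly aggregation behind this trend can be sketched in base R so that it runs without extra packages. The helper name `monthly_counts` and the sample timestamps are illustrative assumptions. The sketch also assumes timestamps of the form `YYYY-MM-DD HH:MM:SS`, which may differ from the actual `funded_time` format.

```r
# Hypothetical helper: count records per calendar month from timestamp strings.
monthly_counts <- function(timestamps) {
  parsed <- as.POSIXct(timestamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  parsed <- parsed[!is.na(parsed)]                  # drop missing or unparseable values
  month_start <- as.Date(format(parsed, "%Y-%m-01")) # first day of each month
  counts <- table(month_start)
  data.frame(YearMonth = as.Date(names(counts)),
             Count = as.vector(counts))
}

# Made-up timestamps, for illustration only.
funded_time <- c("2014-01-05 10:00:00", "2014-01-20 12:30:00",
                 "2014-02-01 00:00:00", NA, "not a date")

trend <- monthly_counts(funded_time)
print(trend)
```

Wrapping the step in a function like this would also avoid repeating the same pipeline for each country.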

3.11.1.7 Loans Data

datatable(loansData, style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

3.11.2 India

3.11.2.1 Loan Distribution

country_loans = loans %>%
    filter(country == "India")

fundedLoanAmountDistribution(country_loans)

summary(country_loans$funded_amount)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   250.0   325.0   575.5   550.0 12925.0

We show the loans in India on the map. We observe that a major part of India is not utilizing Kiva. An awareness campaign about Kiva in India might be a good idea.

country_loans = themes_region %>% 
  filter(country == "India") %>%
  rename (themeType = `Loan Theme Type`) 

center_lon = median(country_loans$lon,na.rm = TRUE)
center_lat = median(country_loans$lat,na.rm = TRUE)

leaflet(country_loans) %>% addTiles() %>%
  addCircles(lng = ~lon, lat = ~lat,radius = ~(amount/10) ,
             color = ~c("blue"))  %>%
  # controls
  setView(lng=center_lon, lat=center_lat,zoom = 5)

3.11.2.2 Dominant Funding Partner

We plot the most dominant funding partners of Kiva in India.

country_loans %>%
  rename(FieldPartnerName =`Field Partner Name`) %>%
  group_by(FieldPartnerName) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(FieldPartnerName = reorder(FieldPartnerName,Count)) %>%
  head(10) %>%
  ggplot(aes(x = FieldPartnerName,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = FieldPartnerName, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Field Partner Name', 
       y = 'Count', 
       title = 'Field Partner Name and Count') +
  coord_flip() +
   theme_bw()

3.11.2.3 Popular Sector

The bar chart below shows the most popular sectors for loans.

plotLoansAndSectorByCountry(loans,"India",fillColor)

3.11.2.4 Popular Activity

The bar chart below shows the most popular activities for loans.

plotLoansAndActivityByCountry(loans,"India",fillColor2)

3.11.2.5 Popular Use of loans

plotLoansAndUseByCountry(loans,"India",fillColorLightCoral)

3.11.2.6 Loans Trend

We show the trend of loans from 2014 onwards. The number of loans keeps increasing over time.

loansData = loans %>%
              filter(country == "India") %>%
              filter(!is.na(funded_time)) %>%
              mutate(year = year(ymd_hms(funded_time))) %>%
              mutate(month = month(ymd_hms(funded_time))) %>%
              filter(!is.na(year)) %>%
              filter(!is.na(month)) %>%
              group_by(year,month) %>%
              summarise(Count = n()) %>%
              mutate(YearMonth = make_date(year=year,month=month) ) 
  
loansData %>%
  ggplot(aes(x=YearMonth,y=Count,group = 1)) +
  geom_line(size=1, color="red")+
  geom_point(size=3, color="red") +
  labs(x = 'Time', y = 'Count',title = 'Trend of loans') +
  theme_bw()

3.11.2.7 Loans data

datatable(loansData, style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

3.11.3 El Salvador

3.11.3.1 Loan Distribution

We show the distribution of funded loan amounts in El Salvador and then plot the loans on a map.

country_loans = loans %>%
    filter(country == "El Salvador")

fundedLoanAmountDistribution(country_loans)

summary(country_loans$funded_amount)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   325.0   500.0   585.8   800.0  2900.0

country_loans = themes_region %>% 
  filter(country == "El Salvador") %>%
  rename (themeType = `Loan Theme Type`) 

center_lon = median(country_loans$lon,na.rm = TRUE)
center_lat = median(country_loans$lat,na.rm = TRUE)

leaflet(country_loans) %>% addTiles() %>%
  addCircles(lng = ~lon, lat = ~lat,radius = ~(amount/100) ,
             color = ~c("blue"))  %>%
  # controls
  setView(lng=center_lon, lat=center_lat,zoom = 7)

3.11.3.2 Dominant Field Partner

We plot the most dominant field partners of Kiva in El Salvador.

country_loans %>%
  rename(FieldPartnerName =`Field Partner Name`) %>%
  group_by(FieldPartnerName) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(FieldPartnerName = reorder(FieldPartnerName,Count)) %>%
  head(10) %>%
  ggplot(aes(x = FieldPartnerName,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = FieldPartnerName, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Field Partner Name', 
       y = 'Count', 
       title = 'Field Partner Name and Count') +
  coord_flip() +
  theme_bw()

3.11.3.3 Popular Sector

The bar chart below shows the most popular sectors for loans.

plotLoansAndSectorByCountry(loans,"El Salvador",fillColor)

3.11.3.4 Popular Activity

The bar chart below shows the most popular activities for loans.

plotLoansAndActivityByCountry(loans,"El Salvador",fillColor2)

3.11.3.5 Popular Use of loans

plotLoansAndUseByCountry(loans,"El Salvador",fillColorLightCoral)

3.11.3.6 Loan Trends

We show the trend of loans from 2014 onwards. The number of loans keeps increasing over time.

loansData = loans %>%
  filter(country == "El Salvador") %>%
  filter(!is.na(funded_time)) %>%
  mutate(year = year(ymd_hms(funded_time))) %>%
  mutate(month = month(ymd_hms(funded_time))) %>%
  filter(!is.na(year)) %>%
  filter(!is.na(month)) %>%
  group_by(year,month) %>%
  summarise(Count = n()) %>%
  mutate(YearMonth = make_date(year=year,month=month) ) 
  
loansData %>%
  ggplot(aes(x=YearMonth,y=Count,group = 1)) +
  geom_line(size=1, color="red")+
  geom_point(size=3, color="red") +
  labs(x = 'Time', y = 'Count',title = 'Trend of loans') +
  theme_bw()

3.11.3.7 Loans Data

datatable(loansData, style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

3.12 Multidimensional Poverty Measures

We use the Kaggle dataset Multidimensional Poverty Measures. The following explanation of the term is taken from the dataset documentation.

Most countries of the world define poverty as a lack of money. Yet poor people themselves consider their experience of poverty much more broadly. A person who is poor can suffer from multiple disadvantages at the same time - for example they may have poor health or malnutrition, a lack of clean water or electricity, poor quality of work or little schooling. Focusing on one factor alone, such as income, is not enough to capture the true reality of poverty.
Multidimensional poverty measures can be used to create a more comprehensive picture. They reveal who is poor and how they are poor - the range of different disadvantages they experience. As well as providing a headline measure of poverty, multidimensional measures can be broken down to reveal the poverty level in different areas of a country, and among different sub-groups of people.

3.12.1 MPI Map

The higher the MPI, the poorer the country. The map shows that the poorer countries are concentrated in Africa. Red dots indicate poorer countries.

pal <- colorNumeric(
  palette = colorRampPalette(c('green', 'red'))(length(regions$MPI)), 
  domain = regions$MPI)

regions_no_NA = regions %>%
  filter(!is.na(lon)) %>%
  filter(!is.na(lat))

center_lon = median(regions$lon,na.rm = TRUE)
center_lat = median(regions$lat,na.rm = TRUE)

leaflet(data = regions_no_NA) %>%
  addTiles() %>%
  addCircleMarkers(
    lng =  ~ lon,
    lat =  ~ lat,
    radius = ~ MPI*10,
    popup =  ~ country,
    color =  ~ pal(MPI)
  ) %>%
  # controls
  setView(lng=center_lon, lat=center_lat,zoom = 3) %>%
  addLegend("topleft", pal = pal, values = ~MPI,
            title = "MPI Map",
            opacity = 1)

3.12.2 MPI Rural

We use the metric MPI Rural from the Kaggle dataset Multidimensional Poverty Measures. This metric provides the average distance below the poverty line of those listed as poor in rural areas. It gives Kiva an understanding of the countries where loans are most needed.

mpi_national %>%
  rename(mpi_rural = `MPI Rural`) %>%
  arrange(desc(mpi_rural)) %>%
  mutate(Country = reorder(Country,mpi_rural)) %>%
  head(10) %>%
  
  ggplot(aes(x = Country,y = mpi_rural)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = Country, y = 1, label = paste0("(",mpi_rural,")",sep="")),
            hjust=1, vjust=1, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'mpi_rural', 
       title = 'Country and MPI Rural') +
  coord_flip() +
  theme_bw()

Niger, Somalia, Ethiopia, Burkina Faso and Chad are the poorest countries by the MPI Rural measure.

3.12.3 MPI Urban

We use the metric MPI Urban from the Kaggle dataset Multidimensional Poverty Measures. This metric provides the average distance below the poverty line of those listed as poor in urban areas. It gives Kiva an understanding of the countries where loans are most needed.

mpi_national %>%
  rename(mpi_urban = `MPI Urban`) %>%
  arrange(desc(mpi_urban)) %>%
  mutate(Country = reorder(Country,mpi_urban)) %>%
  head(10) %>%
  
  ggplot(aes(x = Country,y = mpi_urban)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = Country, y = 1, label = paste0("(",mpi_urban,")",sep="")),
            hjust=1, vjust=1, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'mpi_urban', 
       title = 'Country and MPI Urban') +
  coord_flip() +
  theme_bw()

South Sudan, Chad, Somalia, Liberia and the Central African Republic are the poorest countries by the MPI Urban measure.

3.13 MPI Countries and Kiva Loans

We show the High MPI Rural and Low MPI Rural countries with the most loans. Mali, Sierra Leone, Liberia, Mozambique and Burkina Faso are the High MPI Rural countries with the most loans. Armenia, Kyrgyzstan, Albania, Ukraine and Thailand are the Low MPI Rural countries with the most loans.

3.13.1 High MPI Countries with Kiva Loans

mpi_national_rural_top_10 = mpi_national %>%
  rename(mpi_rural = `MPI Rural`) %>%
  arrange(desc(mpi_rural)) %>%
  mutate(Country = reorder(Country,mpi_rural)) %>%
  head(15)  # keeps 15 countries, despite the top_10 in the variable name

mpi_national_rural_top_10_loans <- loans %>%
                                    inner_join(mpi_national_rural_top_10,
                                               by=c("country" = "Country"))

getTopLoansByCountry <- function(dataset,fillColorName) {
  dataset %>%
    group_by(country) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(country = reorder(country,Count)) %>%
    head(10) %>%
    ggplot(aes(x = country,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColorName) +
    geom_text(aes(x = country, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Country', 
         y = 'Count', 
         title = 'Country and Count') +
    coord_flip() +
     theme_bw()
}

getTopLoansByCountry(mpi_national_rural_top_10_loans,fillColor2)

As shown above, Mali, Sierra Leone, Liberia, Mozambique and Burkina Faso are the High MPI Rural countries with the most loans.

3.13.2 Low MPI Countries with Kiva Loans

mpi_national_rural_bottom_10 = mpi_national %>%
  rename(mpi_rural = `MPI Rural`) %>%
  arrange(mpi_rural) %>%
  mutate(Country = reorder(Country,mpi_rural)) %>%
  head(15)  # keeps 15 countries, despite the bottom_10 in the variable name

mpi_national_rural_bottom_10 = loans %>%
                                inner_join(mpi_national_rural_bottom_10,
                                           by =c("country" = "Country"))

getTopLoansByCountry(mpi_national_rural_bottom_10,fillColor)

3.14 Use of loans and MPI

In High MPI countries, loans are used to buy condiments, sheep for resale, construction materials, building materials, and fertilizer for groundnuts, okra and peanuts. In Low MPI countries, loans are used to buy cows, to pay for higher education, to buy livestock to enlarge a herd, and to buy sheep.

3.14.1 Use of loans in High MPI Rural Countries

We show the use of loans in the High MPI Rural countries.

mpi_national_rural_top_10_loans %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count in High MPI countries') +
     coord_flip() +
     theme_bw()

3.14.2 Use of loans in Low MPI Rural Countries

We show the use of loans in the Low MPI Rural countries.

mpi_national_rural_bottom_10 %>%
  filter(!is.na(use)) %>%
  group_by(use) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(use = reorder(use,Count)) %>%
  head(10) %>%
  ggplot(aes(x = use,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = use, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Use of Loans', 
       y = 'Count', 
       title = 'Use of Loans and Count in Low MPI countries') +
     coord_flip() +
     theme_bw()

3.15 Popular Sectors of loans and MPI

Food, Retail, Agriculture, Clothing and Services are the most popular sectors for loans in High MPI Rural countries. Agriculture, Education, Health, Housing and Clothing are the most popular sectors for loans in Low MPI Rural countries.

3.15.1 Popular Sector in loans in High MPI Rural Countries

The bar chart below shows the most popular sectors for loans in High MPI Rural countries.

 getTopLoansBySector <- function(dataset,fillColorName,titleName) {
   dataset %>%
    filter(!is.na(sector)) %>%
    group_by(sector) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count)) %>%
    ungroup() %>%
    mutate(sector = reorder(sector,Count)) %>%
    head(10) %>%
    ggplot(aes(x = sector,y = Count)) +
    geom_bar(stat='identity',colour="white", fill = fillColorName) +
    geom_text(aes(x = sector, y = 1, label = paste0("(",Count,")",sep="")),
              hjust=0, vjust=.5, size = 4, colour = 'black',
              fontface = 'bold') +
    labs(x = 'Sector', 
         y = 'Count', 
         title = titleName) +
    coord_flip() +
     theme_bw()
 }

getTopLoansBySector(mpi_national_rural_top_10_loans, fillColor, 'Most Popular Sectors in High MPI Rural Countries')

3.15.2 Popular Sector in loans in Low MPI Rural Countries

getTopLoansBySector(mpi_national_rural_bottom_10, fillColor, 'Most Popular Sectors in Low MPI Rural Countries')

3.16 Distribution of the Funded Loan amount

The median funded loan amount is $600 in High MPI countries.
The median funded loan amount is $1100 in Low MPI countries.

3.16.1 Distribution of Funded Loan amount in High MPI countries

The funded loan amount is shown as a histogram. Both the X and Y axes have been log-transformed for better visualization.

fundedLoanAmountDistribution(mpi_national_rural_top_10_loans)

Summary of Funded Amount in High MPI countries

mpi_national_rural_top_10_loans %>%
   select(funded_amount) %>%
   summary()

##  funded_amount    
##  Min.   :    0.0  
##  1st Qu.:  275.0  
##  Median :  600.0  
##  Mean   :  930.3  
##  3rd Qu.: 1175.0  
##  Max.   :50000.0

3.16.2 Distribution of Funded Loan amount in Low MPI countries

The funded loan amount is shown as a histogram. Both the X and Y axes have been log-transformed for better visualization.

fundedLoanAmountDistribution(mpi_national_rural_bottom_10)

Summary of Funded Amount in Low MPI countries

mpi_national_rural_bottom_10 %>%
   select(funded_amount) %>%
   summary()

##  funded_amount  
##  Min.   :    0  
##  1st Qu.:  725  
##  Median : 1100  
##  Mean   : 1281  
##  3rd Qu.: 1675  
##  Max.   :50000
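In both summaries the mean sits above the median, which is typical of right-skewed amounts, so the median is the more robust figure for comparing the two groups. Below is a minimal sketch of a per-group comparison in base R. The data frame `loans_demo` and its values are made up for illustration and are not taken from the Kiva data.

```r
# Made-up funded amounts for two groups, for illustration only.
loans_demo <- data.frame(
  group = c("High MPI", "High MPI", "High MPI", "Low MPI", "Low MPI", "Low MPI"),
  funded_amount = c(200, 600, 5000, 700, 1100, 1500)
)

# Median and mean per group; a single large loan pulls the mean up, not the median.
group_medians <- tapply(loans_demo$funded_amount, loans_demo$group, median)
group_means   <- tapply(loans_demo$funded_amount, loans_demo$group, mean)

print(group_medians)
print(group_means)
```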

3.17 Poorest Regions

We explore the poorest regions in the world, using the metric Intensity of Deprivation Regional.

Lac in Chad is the poorest region, followed by Affar in Ethiopia, Est in Burkina Faso, and Ouaddaï and Wadi Fira in Chad.

poorest_regions = mpi_subnational %>%
  rename(intensity_deprivation_regional = `Intensity of deprivation Regional`) %>%
  rename(sub_national_region = `Sub-national region`) %>%
  arrange(desc(intensity_deprivation_regional)) 
  
datatable(head(poorest_regions,10), style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))

3.18 Human Development Index

Sierra Leone, Eritrea, South Sudan, Mozambique, Guinea, Burundi, Burkina Faso, Chad, Niger and the Central African Republic are the countries with the lowest Human Development Index.
Kiva should direct loans to these countries.

GEconV4_lat_lon = GEconV4 %>%
  group_by(COUNTRY) %>%
  mutate(lat = median(LAT,na.rm = TRUE)) %>%
  mutate(lon =  median(LONGITUDE,na.rm = TRUE)) %>%
  select(COUNTRY,lat,lon) %>%
  unique()

country_stats_lat_lon = inner_join(country_stats, GEconV4_lat_lon, 
                                   by=c('country_name'='COUNTRY'))

country_stats %>%
  arrange(hdi) %>%
  mutate(Country = reorder(country_name,hdi)) %>%
  head(10) %>%
  ggplot(aes(x = Country,y = hdi)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = Country, y = 1, label = paste0("(",round(hdi,3),")",sep="")),
            hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'hdi', 
       title = 'Countries with Lowest Human Development Index') +
  coord_flip() +
  theme_bw()

3.18.1 Human Development Index Map

pal <- colorNumeric(
  palette = colorRampPalette(c('red', 'green'))(length(country_stats$hdi)), 
  domain = country_stats$hdi)

country_stats_lat_lon_no_NA = country_stats_lat_lon %>%
  filter(!is.na(hdi)) %>%
  filter(!is.na(lon)) %>%
  filter(!is.na(lat))

center_lon = median(country_stats_lat_lon_no_NA$lon,na.rm = TRUE)
center_lat = median(country_stats_lat_lon_no_NA$lat,na.rm = TRUE)

leaflet(data = country_stats_lat_lon_no_NA) %>%
  addTiles() %>%
  addCircleMarkers(
                    lng =  ~ lon,
                    lat =  ~ lat,
                    radius = ~ hdi*10,
                    popup =  ~ country_name,
                    color =  ~ pal(hdi)
                  ) %>%
  setView(lng=center_lon, lat=center_lat,zoom = 2) %>% # controls
  addLegend("topleft", pal = pal, values = ~hdi,
            title = "Human Development Index Map",
            opacity = 1)

The redder the circle, the lower the Human Development Index. The African continent has very low HDI values, and Kiva should focus its loans on this belt.

3.18.2 Population below Poverty Line

country_stats %>%
  arrange(desc(population_below_poverty_line)) %>%
  mutate(Country = reorder(country_name,population_below_poverty_line)) %>%
  head(10) %>%
  ggplot(aes(x = Country,y = population_below_poverty_line)) +
  geom_bar(stat='identity',colour="white", fill = fillColorLightCoral) +
  geom_text(aes(x = Country, y = 1, label = paste0("(",population_below_poverty_line,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'Population Below Poverty Line', 
       title = 'Countries with Highest Population Below Poverty Line') +
  coord_flip() +
  theme_bw()

Syria, Zimbabwe, Madagascar, Sierra Leone, Suriname, Nigeria, Guinea-Bissau, Burundi, Swaziland and the Democratic Republic of Congo are the countries with the highest population below the poverty line.

3.18.3 Population under Poverty Line Map

pal <- colorNumeric(
  palette = colorRampPalette(c('green', 'red'))(length(country_stats$population_below_poverty_line)), 
  domain = country_stats$population_below_poverty_line)

country_stats_lat_lon_no_NA = country_stats_lat_lon %>%
  filter(!is.na(population_below_poverty_line)) %>%
  filter(!is.na(lon)) %>%
  filter(!is.na(lat))

center_lon = median(country_stats_lat_lon_no_NA$lon,na.rm = TRUE)
center_lat = median(country_stats_lat_lon_no_NA$lat,na.rm = TRUE)

leaflet(data = country_stats_lat_lon_no_NA) %>%
  addTiles() %>%
  addCircleMarkers(
    lng =  ~ lon,
    lat =  ~ lat,
    radius = ~ population_below_poverty_line/10,
    popup =  ~ country_name,
    color =  ~ pal(population_below_poverty_line)
  ) %>%
  # controls
  setView(lng=center_lon, lat=center_lat,zoom = 2) %>%
  
  addLegend("topleft", pal = pal, values = ~population_below_poverty_line,
            title = "Population under Poverty Line Map",
            opacity = 1)

3.19 Top Funding Partners

The ten funding partners that account for the most loans are shown below.

themes_region %>%
  rename(FieldPartnerName =`Field Partner Name`) %>%
  group_by(FieldPartnerName) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  ungroup() %>%
  mutate(FieldPartnerName = reorder(FieldPartnerName,Count)) %>%
  head(10) %>%
  ggplot(aes(x = FieldPartnerName,y = Count)) +
  geom_bar(stat='identity',colour="white", fill = fillColor) +
  geom_text(aes(x = FieldPartnerName, y = 1, label = paste0("(",Count,")",sep="")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Field Partner Name', 
       y = 'Count', 
       title = 'Field Partner Name and Count') +
  coord_flip() +
   theme_bw()

3.20 Naive Poverty Metric

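The Naive Poverty Metric combines the Population below Poverty Line, the Human Development Index, MPI Urban and MPI Rural. It is the sum of the population below the poverty line divided by 100, one minus the HDI, MPI Urban and MPI Rural, so a higher value indicates a poorer country.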

country_stats = left_join(country_stats,mpi_national,
                          by =c('country_name'= 'Country'))

country_stats_AG = country_stats %>% 
                    select(population_below_poverty_line,hdi,country_name,`MPI Urban`,`MPI Rural`) %>%
                    # refer to the columns directly inside mutate() rather than via country_stats$
                    mutate(AGMetric = population_below_poverty_line/100 + 
                                      (1-hdi) +
                                      `MPI Urban` + `MPI Rural`)
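As a worked example of this formula, the sketch below computes the metric on a made-up data frame. The helper name `naive_poverty_metric` and all values are illustrative assumptions, not figures from the report's datasets.

```r
# Hypothetical helper implementing the sum used for AGMetric above.
naive_poverty_metric <- function(pct_below_poverty_line, hdi, mpi_urban, mpi_rural) {
  pct_below_poverty_line / 100 + (1 - hdi) + mpi_urban + mpi_rural
}

# Made-up values, for illustration only.
demo <- data.frame(
  country_name = c("A", "B", "C"),
  population_below_poverty_line = c(50, 20, 70),
  hdi = c(0.4, 0.8, 0.5),
  mpi_urban = c(0.1, 0.0, NA),   # country C has no MPI Urban value
  mpi_rural = c(0.3, 0.1, 0.4)
)

demo$AGMetric <- with(demo, naive_poverty_metric(population_below_poverty_line,
                                                 hdi, mpi_urban, mpi_rural))
print(demo[, c("country_name", "AGMetric")])
```

A missing value in any component makes the whole metric `NA`, so countries without an MPI match after the `left_join` drop out of the ranking and the histogram.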

3.20.1 Distribution of the Naive Poverty Metric

country_stats_AG %>%
    ggplot(aes(x = AGMetric)) +
    geom_histogram(fill = fillColorLightCoral,bins=100) +
    labs(x = 'Naive Poverty Metric' ,y = 'Count', title = paste("Distribution of", "Naive Poverty Metric")) +   
    theme_bw()

3.20.2 Poorest Countries based on Naive Poverty Metric

country_stats_AG %>%
  arrange(desc(AGMetric)) %>%
  mutate(Country = reorder(country_name,AGMetric)) %>%
  head(10) %>%
  ggplot(aes(x = Country,y = AGMetric)) +
  geom_bar(stat='identity',colour="white", fill = fillColor2) +
  geom_text(aes(x = Country, y = 1, label = paste0("(",round(AGMetric,2),")",sep="")),
            hjust=1, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Country', 
       y = 'AGMetric', 
       title = 'Countries with Highest Naive Poverty Metric') +
  coord_flip() +
  theme_bw()

Nigeria, Guinea, Burkina Faso, Liberia, Burundi, Guinea-Bissau, Chad, Niger, Sierra Leone and South Sudan are the poorest countries according to the Naive Poverty Metric.

4 Part 4 - Inference

4.1 Statistics

Exploratory data analysis yields the statistics below.

Statistic	Variable	Value
Population	Mean Loan Amount	842.3971067
Population	SD Loan Amount	1198.6600729
Sample Statistics	Mean Loan Amount	842.3971067
Sample Statistics	SD Loan Amount	1198.6600729

4.2 Confidence interval of Loan Amount

The point estimate from the sample, with its 95% confidence interval, is shown below.

inference(y=loans$loan_amount, est="mean", null=0, type="ci", conflevel=0.95, method="theoretical")
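For reference, a theoretical confidence interval for a mean is commonly computed as the sample mean plus or minus a t critical value times the standard error. The sketch below reproduces that calculation by hand in base R and checks it against `t.test()`. It uses synthetic data rather than the loans data, and it is not a description of what `inference()` does internally.

```r
set.seed(606)
# Synthetic right-skewed amounts, for illustration only.
x <- rlnorm(500, meanlog = 6, sdlog = 1)

n      <- length(x)
se     <- sd(x) / sqrt(n)              # standard error of the mean
t_crit <- qt(0.975, df = n - 1)        # critical value for a 95% interval
ci_manual <- mean(x) + c(-1, 1) * t_crit * se

# The same interval from base R's t.test()
ci_ttest <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

print(ci_manual)
print(ci_ttest)
```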

For this test, let's compute the total sample size required.

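s = sd(loans$loan_amount)
# z* for 95% confidence is qnorm(0.975), about 1.96; 0.03 is the target margin of error
n = ((qnorm(0.975)*s)/0.03)^2

For a margin of error of 0.03 (in the units of loan amount) at 95% confidence, we would need roughly 6.1 × 10^9 samples, given the standard deviation of about 1198.66 reported above.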
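
The sample-size calculation above comes from solving ME = z* × s/√n for n, which gives n = (z* × s / ME)^2. A minimal sketch of that formula as a reusable helper follows; the function name `required_sample_size` and the example margin of error of 50 are assumptions made for illustration, while the standard deviation is the one reported in the statistics table.

```r
# Hypothetical helper: sample size needed for a given margin of error.
# s               : standard deviation of the variable
# margin_of_error : target half-width of the interval, in the units of the variable
# conf_level      : confidence level (0.95 gives z* of about 1.96)
required_sample_size <- function(s, margin_of_error, conf_level = 0.95) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  ceiling((z_star * s / margin_of_error)^2)
}

# Illustrative call: SD from the table above, margin of error of 50 (assumed)
required_sample_size(s = 1198.66, margin_of_error = 50)
```

Because n grows with the inverse square of the margin of error, halving the margin roughly quadruples the required sample size.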

Some countries have received higher loan amounts than others.
Countries with the highest (poorest) MPI are not yet fully funded.

4.3 Conditions for the least squares line

We are going to validate the conditions for the least squares line.

4.3.1 1. Linearity

The chart below shows a very slight upward relationship between Term in Months and Count. The large variability around the fitted line indicates that the linear relationship is weak.

loans_TermCount <- as.data.frame(data.table::rbindlist(list(AfricanLoans,AsianLoans,AmericasLoans))) %>%
                    filter(!is.na(term_in_months)) %>%
                    mutate(term_in_months = (as.numeric(term_in_months))) %>%
                    group_by(term_in_months) %>%
                    summarise(Count = n()) %>%
                    # keep term_in_months numeric (no reorder() into a factor) so the lm fit below is meaningful
                    arrange(desc(Count))
loans_TermCount %>%
  ggplot( aes(x=term_in_months, y=Count)) +
  geom_point(size=1,alpha=0.8) +
  geom_smooth(method = "lm") + 
  ggtitle("Term in Months vs Count")

#cor(loans_TermCount$term_in_months, loans_TermCount$Count)
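
The strength of a linear relationship like this one can also be summarised numerically with the correlation coefficient, which needs both variables to be numeric. The sketch below assumes a small made-up table, `term_count_demo`, as a hypothetical stand-in for `loans_TermCount`; the counts are illustrative only.

```r
# Hypothetical stand-in for loans_TermCount (illustrative values only)
term_count_demo <- data.frame(
  term_in_months = c(6, 8, 11, 13, 14, 20),
  Count          = c(120, 340, 310, 290, 520, 150)
)

# Pearson correlation between loan term and number of loans
r <- cor(term_count_demo$term_in_months, term_count_demo$Count)

# For a simple linear regression, R-squared equals the squared correlation
fit       <- lm(Count ~ term_in_months, data = term_count_demo)
r_squared <- summary(fit)$r.squared

c(correlation = r, r_squared = r_squared)
```

A correlation close to zero would support reading the relationship as weak, whatever the slope of the fitted line looks like on the chart.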

4.3.2 2. Nearly normal residuals

loans_lm <- lm(loan_amount ~ term_in_months + lender_count, loans)
df_residuals <- broom::augment(loans_lm)

Let’s check the normality of the residuals with a histogram and a Q-Q plot.

#Histogram plot of residuals
ggplot(df_residuals,aes(x = .resid)) + geom_histogram() + ggtitle("Residual Histogram")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#QQ norm plot of the residuals
qqnorm(loans_lm$residuals)
qqline(loans_lm$residuals)

The plots show that the residuals are slightly left-skewed but otherwise approximately normal, so we treat the nearly normal residuals condition as reasonably met.

4.3.3 3. Constant Variability

df_residuals <- filter(df_residuals,.fitted >-17)
ggplot(df_residuals,aes(x=.fitted,y=.resid)) + geom_point(size=1,alpha=0.8) + geom_smooth(method = "lm")  + ggtitle("Fitted Values vs Residuals")

ggplot(df_residuals,aes(x=.fitted,y=abs(.resid))) + geom_point(size=1,alpha=0.8) + ggtitle("Fitted Values vs Absolute Residuals")

The plots above show roughly constant variability of the residuals across the fitted values.

4.4 Different purpose of loan request

Is the funded loan amount equal across the different purposes of loan requests?

Let’s test whether the mean loan amount varies with the purpose of the loan.

#Hypothesis test between purpose
statsr::inference(y=loans$loan_amount, x=loans$use, est="mean", null=0, alternative="greater", type="ht", method="theoretical")

The output above shows that the mean loan amount differs across loan purposes.
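
When the grouping variable has more than two levels, comparing group means in this way amounts to a one-way ANOVA. A minimal base-R sketch of that test follows; `demo_loans` and its values are made up for illustration and merely mimic the `loan_amount` and `use` columns of `loans`.

```r
# Hypothetical data: loan amounts for three made-up loan purposes
demo_loans <- data.frame(
  loan_amount = c(300, 350, 325, 800, 850, 900, 150, 175, 200),
  use         = rep(c("buy seeds", "expand shop", "water filter"), each = 3)
)

# One-way ANOVA: H0 says the mean loan amount is the same for every purpose
fit         <- aov(loan_amount ~ use, data = demo_loans)
anova_table <- summary(fit)[[1]]

# p-value of the F test for the 'use' factor
p_value <- anova_table[["Pr(>F)"]][1]

# TRUE means H0 is rejected at the 5% level
p_value < 0.05
```

A small p-value only says that at least one purpose has a different mean loan amount; it does not identify which purposes differ.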

Based on the above analysis of Kiva loans by region, we can infer the following.

Section : African Loans

Kenya, Lesotho, Uganda, Malawi and Ghana are the countries which have received the most loans.
We observe from the sections Multidimensional Poverty Measures and Distribution of loans in Africa that we do not see loans in the poorest areas. This might be an opportunity for Kiva to help these very underprivileged countries.
We extend our analysis to the African Conflicts data found on Kaggle and map the most battle-prone areas in 2016 and 2017. The intention is to highlight that these battle-prone areas may be in need of funds for very basic necessities such as water and food. We find that the most battle-prone countries, Somalia, South Sudan, Libya and Sudan, do not feature much in the Kiva loans. There is a lot of opportunity for people in these countries to leverage Kiva.
We observe that the African median funded amount ($375) is less than the world median funded amount ($450). Kiva can channel more funds to the African continent, since smaller amounts already help there.

Section : American Loans

Peru, Suriname, Colombia, Brazil, Ecuador and Guyana are the poorest countries from the MPI Rural and MPI Urban perspective.

Section : Asian Loans

Philippines, Cambodia, Indonesia, Tajikistan and Pakistan have the most loans.
Afghanistan, Yemen, Pakistan, India and Bangladesh are the poorest Asian countries by the MPI Rural measure. Afghanistan, Bangladesh, Pakistan, Yemen and India are the poorest Asian countries by the MPI Urban measure.
We see that Yemen, Bangladesh, Myanmar and Iraq, though among the poorest countries, do not feature in Kiva loans. There is a good opportunity to include them in the Kiva family.

5 Part 5 - Conclusion

Overall, we can conclude the following about the effect of Kiva loans from a global perspective.

Section : Global view of Kiva loans

The most popular themes are General and Underserved, followed by Agriculture, Rural Inclusion, Water and Higher Education.
The most popular sectors are Agriculture, Food, Retail, Services and Personal Use.
The most popular activities for usage of loans are Farming, General Store, Personal Housing Expenses, Food Production/Sales and Agriculture.
The most popular uses of loans are to buy a water filter, to construct a sanitary toilet, to buy ingredients for a food production business, to buy groceries to sell, and to buy food for pigs.
The median funded amount is $450 and the mean funded amount is $786.
14 months is the most common term for the loans, followed by 8, 11, 7 and 13 months.
Women get more loans than men.
The countries which have received the most loans are Philippines, Kenya, El Salvador, Cambodia and Pakistan.

Section : Multidimensional Poverty Measures

Niger, Somalia, Ethiopia, Burkina Faso and Chad are the poorest by the MPI Rural measure.
South Sudan, Chad, Somalia, Liberia and Central African Republic are the poorest by the MPI Urban measure.
Lac in Chad is the poorest region, followed by Affar in Ethiopia, Est in Burkina Faso, and Ouaddaï and Wadi Fira in Chad.
Mali, Sierra Leone, Liberia, Mozambique and Burkina Faso are the High MPI Rural countries with the most loans. Armenia, Kyrgyzstan, Albania, Ukraine and Thailand are the Low MPI Rural countries with the most loans.
The loans for High MPI countries are used to buy condiments, to buy sheep for resale, to buy construction materials, to buy building materials, and to buy fertilizer for groundnuts, okra and peanuts. The loans for Low MPI countries are used to buy cows, to pay for higher education, to buy livestock to increase a herd, and to buy sheep.
Food, Retail, Agriculture, Clothing and Services are the most popular sectors for loans in High MPI Rural countries. Agriculture, Education, Health, Housing and Clothing are the most popular sectors for loans in Low MPI Rural countries.
High MPI countries have a median funded loan amount of $600 and Low MPI countries have a median funded loan amount of $1100.

Section : Human Development Index

Sierra Leone, Eritrea, South Sudan, Mozambique, Guinea, Burundi, Burkina Faso, Chad, Niger and Central African Republic are the countries with the lowest Human Development Index.

Section : Population below Poverty Line

Syria, Zimbabwe, Madagascar, Sierra Leone, Suriname, Nigeria, Guinea-Bissau, Burundi, Swaziland and Democratic Republic of Congo are the countries with the highest share of population below the poverty line.

Section : Naive Poverty Metric

The Naive Poverty Metric is calculated from the Population below Poverty Line, the Human Development Index, MPI Urban and MPI Rural.
Nigeria, Guinea, Burkina Faso, Liberia, Burundi, Guinea-Bissau, Chad, Niger, Sierra Leone and South Sudan are the poorest countries according to the Naive Poverty Metric.
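
To make the calculation concrete: as computed in section 3.20, the metric adds the share of the population below the poverty line (as a fraction), one minus the HDI, and the two MPI values, so a higher value means a poorer country. The sketch below applies that sum to a made-up two-row table; `demo_stats`, its column names `mpi_urban` and `mpi_rural`, and all values are hypothetical.

```r
# Hypothetical country statistics (illustrative values only)
demo_stats <- data.frame(
  country_name                  = c("Country A", "Country B"),
  population_below_poverty_line = c(60, 20),      # percent of population
  hdi                           = c(0.40, 0.75),  # Human Development Index, 0 to 1
  mpi_urban                     = c(0.25, 0.02),
  mpi_rural                     = c(0.45, 0.05)
)

# Naive Poverty Metric: each term rises as poverty rises
demo_stats$AGMetric <- demo_stats$population_below_poverty_line / 100 +
  (1 - demo_stats$hdi) +
  demo_stats$mpi_urban +
  demo_stats$mpi_rural

# Poorest first
demo_stats[order(-demo_stats$AGMetric), c("country_name", "AGMetric")]
```

The four components are simply added with equal weight, which is why the metric is called naive.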

This project gave me exposure to the treemap and leaflet packages and a chance to apply them to the wonderful world of Kiva.

DATA 606 - Final Project

1 Part 1 - Introduction

2 Part 2 - Data Preparation

2.1 Load Libraries

2.2 Data collection

2.3 Cases

2.4 Variables

2.5 Type of study

2.6 Scope of inference

2.7 Causality

2.7.1 Glimpse of Data

2.7.1.1 Loans data

2.7.1.2 Regions data

2.7.1.3 Themes data

2.7.1.4 Themes and Regions data

3 Part 3 - Exploratory Data Analysis

3.1 Explanatory

3.2 Popular Sector

3.3 Popular Activity for taking loans

3.4 Popular Use of loans

3.5 Distribution of the Funded Loan amount

3.5.1 Distribution by Country

3.5.2 Distribution by Sector

3.5.3 Distribution by Gender

3.6 Common Loan Term In Months

3.7 Popular Countries for loans

3.8 Maps of Loans

3.9 Popular Theme

3.10 Kiva Loans in respective Continents

3.10.1 AFRICA

3.10.1.1 Loan Distribution

3.10.1.2 African Conflicts Data

3.10.1.3 Battle affected Countries

3.10.1.4 Loans in Battle affected Countries

3.10.1.5 Popular Sector

3.10.2 ASIA

3.10.2.1 Loan Distribution

3.10.2.2 Poorest Asian Countries

3.10.2.2.1 Rural

3.10.2.2.2 Urban

3.10.2.3 Loans in Poorest Asian countries

3.10.2.4 Use of loans

3.10.2.5 Popular Sector

3.10.3 AMERICAS

3.10.3.1 Loan Distribution

3.10.3.2 Poorest South American Countries

3.10.3.2.1 Rural

3.10.3.2.2 Urban

3.10.3.3 Loans in Poorest South American countries

3.10.3.4 Use of loans

3.10.3.5 Popular Sector

3.11 Kiva Loans in respective Countries

3.11.1 Kenya

3.11.1.1 Loans Distribution

3.11.1.2 Funding Partners

3.11.1.3 Popular Sector

3.11.1.4 Popular Activity

3.11.1.5 Popular Use of Loans

3.11.1.6 Loan Trends

3.11.1.7 Loans Data

3.11.2 India

3.11.2.1 Loan Distribution

3.11.2.2 Dominant Funding Partner

3.11.2.3 Popular Sector

3.11.2.4 Popular Activity

3.11.2.5 Popular Use of loans

3.11.2.6 Loans Trend

3.11.2.7 Loans data

3.11.3 El Salvador

3.11.3.1 Loan Distribution

3.11.3.2 Dominant Field Partner

3.11.3.3 Popular Sector

3.11.3.4 Popular Activity

3.11.3.5 Popular Use of loans

3.11.3.6 Loan Trends

3.11.3.7 Loans Data

3.12 Multidimensional Poverty Measures

3.12.1 MPI Map