1 Overview

This notebook explores the relationship between the share of a country’s population receiving cash transfers and variables that proxy for the severity of the COVID crisis in that country and for its capacity to deliver cash transfers.

2 Analyze missing data

The table below reports how many rows have each number of missing values. For example, the first row indicates that 57 of the 74 rows have no missing values.

## # A tibble: 6 x 3
##   n_miss_in_case n_cases pct_cases
##            <int>   <int>     <dbl>
## 1              0      57     77.0 
## 2              1       7      9.46
## 3              2       4      5.41
## 4              3       4      5.41
## 5              5       1      1.35
## 6              6       1      1.35
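A table of this shape can be built by counting the missing values in each row and then tabulating those counts. The sketch below uses base R on a made-up data frame (`toy` and its columns are illustrative, not the notebook's data). The column names mirror the output above, which resembles what naniar's `miss_case_table()` returns, but the notebook's actual call is not shown here.

```r
# Toy data frame with made-up values (not the notebook's data).
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, 1, 2),
  c = c(1, 2, 3, 4)
)

# Number of missing values in each row.
n_miss_in_case <- rowSums(is.na(toy))

# Tabulate the rows by their count of missing values.
counts <- table(n_miss_in_case)
miss_table <- data.frame(
  n_miss_in_case = as.integer(names(counts)),
  n_cases = as.integer(counts),
  pct_cases = 100 * as.integer(counts) / nrow(toy)
)
print(miss_table)
```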

The figure below shows the prevalence of missing values for each of the variables in the kitchen sink regression.

The figure below provides more info on the various combinations of missingness. For example, the first column indicates that there are 3 rows for which soc_ins is missing but no other variables are missing.
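The counts behind such a combinations figure can be sketched in base R: label each row with the set of variables it is missing, then count the rows per label. The data frame below is made up for illustration; only the variable names are borrowed from the notebook.

```r
# Toy data with made-up values; only the column names come from the notebook.
toy <- data.frame(
  soc_ins = c(NA, NA, 0.2, NA, 0.5),
  has_account = c(0.4, 0.3, NA, NA, 0.9),
  tax_to_gdp = c(12, 14, 15, 18, 20)
)

# For each row, list the names of the variables that are missing.
pattern <- apply(is.na(toy), 1, function(miss) {
  if (any(miss)) paste(names(toy)[miss], collapse = " + ") else "(none)"
})

# Count how many rows share each missingness pattern.
pattern_counts <- sort(table(pattern), decreasing = TRUE)
print(pattern_counts)
```

Here two rows are missing only `soc_ins`, which is the kind of statement the first column of the figure makes.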

3 Regressions

3.1 Basic regression of total beneficiaries

Dependent variable: total_bens

Variable              Coefficient    Std. error
tax_to_gdp            -0.002         (0.007)
delta_2020            -0.011         (0.016)
has_id                 0.829***      (0.235)
Constant              -0.337         (0.214)

Observations           53
R2                     0.254
Adjusted R2            0.208
Residual Std. Error    0.281 (df = 49)
F Statistic            5.553*** (df = 3; 49)

Note: *p<0.1; **p<0.05; ***p<0.01
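The regression uses 53 observations even though the data has 74 rows. That is consistent with `lm()` dropping rows that are missing any variable in the formula, which is its default behavior (`na.action = na.omit`). The sketch below shows this on simulated data. The values and the data-generating process are invented for illustration; only the variable names match the table.

```r
set.seed(1)
n <- 20

# Simulated data (not the notebook's data); names match the regression above.
toy <- data.frame(
  tax_to_gdp = runif(n, 5, 30),
  delta_2020 = rnorm(n, 5, 2),
  has_id = runif(n, 0.5, 1)
)
toy$total_bens <- 0.5 * toy$has_id + rnorm(n, sd = 0.1)

# Introduce missing values: rows 1 to 4 each lack at least one regressor.
toy$has_id[1:3] <- NA
toy$tax_to_gdp[3:4] <- NA

fit <- lm(total_bens ~ tax_to_gdp + delta_2020 + has_id, data = toy)

n_complete <- sum(complete.cases(toy))  # rows with no missing values
n_used <- nobs(fit)                     # rows lm() actually used
cat("complete rows:", n_complete, " rows used by lm():", n_used, "\n")
```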

3.2 Kitchen sink regression

For this regression, I include the following variables on the right-hand side:

  • delta_2020: percentage-point drop in GDP growth from 2019 to 2020 (forecasted), used as a proxy for the severity of the crisis
  • GDP per capita
  • tax to GDP
  • ease of payment: calculated from the share of people with an account and the number of bank branches and ATMs per 100k people (see the readme for details)
  • ease of ID: calculated from the proportion of the population with an ID and social insurance coverage

Dependent variable: total_bens

Variable              Coefficient    Std. error
delta_2020            -0.021         (0.013)
gdp_per_capita         0.00002**     (0.00001)
tax_to_gdp            -0.005         (0.006)
ease_pay               0.431**       (0.206)
ease_id                0.004         (0.005)
Constant              -0.238         (0.173)

Observations           57
R2                     0.554
Adjusted R2            0.510
Residual Std. Error    0.227 (df = 51)
F Statistic            12.651*** (df = 5; 51)

Note: *p<0.1; **p<0.05; ***p<0.01
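The significance markers can be checked by hand from a coefficient and its standard error. The sketch below does this for ease_pay, assuming the table reports ordinary two-sided t-tests with the residual degrees of freedom shown (51) and the conventional mapping of one, two, and three stars to the 0.1, 0.05, and 0.01 thresholds in the note. Because the inputs are the rounded values from the table, the p-value is approximate.

```r
# Rounded values copied from the table above.
coef_est <- 0.431
std_err <- 0.206
df_resid <- 51

# t statistic and two-sided p-value.
t_stat <- coef_est / std_err
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# Map the p-value to stars using the thresholds in the table note.
stars <- if (p_value < 0.01) {
  "***"
} else if (p_value < 0.05) {
  "**"
} else if (p_value < 0.1) {
  "*"
} else {
  ""
}
cat("t =", round(t_stat, 2), " stars:", stars, "\n")
```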

3.3 Kitchen sink regression without ease_id or ease_pay

This is the same regression as above, except that I replace ease_id and ease_pay with the variables that go into them: share of population with an ID, social insurance coverage, and social registry coverage for ease_id; number of bank branches per 100k, number of ATMs per 100k, and share of population with a bank account for ease_pay.

Dependent variable: total_bens

Variable              Coefficient    Std. error
delta_2020            -0.012         (0.013)
gdp_per_capita         0.00002***    (0.00001)
tax_to_gdp            -0.004         (0.005)
has_id_id4d            0.001         (0.002)
soc_ins               -0.896**       (0.340)
soc_reg_coverage       0.273**       (0.104)
i_branches_A1_pop      0.010**       (0.004)
i_ATMs_pop             0.002         (0.002)
has_account            0.172         (0.195)
Constant              -0.105         (0.169)

Observations           57
R2                     0.671
Adjusted R2            0.608
Residual Std. Error    0.203 (df = 47)
F Statistic            10.662*** (df = 9; 47)

Note: *p<0.1; **p<0.05; ***p<0.01
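Adding regressors always raises R2, so the adjusted R2 is the fairer number to compare across the three specifications above. The sketch below recomputes it from each table's reported R2, number of observations n, and number of regressors k, using the standard formula 1 - (1 - R2)(n - 1)/(n - k - 1). The helper name `adj_r2` is made up for this example, and the inputs are the rounded values from the tables.

```r
# Adjusted R2 from R2, sample size n, and number of regressors k
# (k excludes the intercept). Hypothetical helper for illustration.
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Reported values from sections 3.1, 3.2, and 3.3.
models <- data.frame(
  section = c("3.1", "3.2", "3.3"),
  r2 = c(0.254, 0.554, 0.671),
  n = c(53, 57, 57),
  k = c(3, 5, 9)
)
models$adj_r2 <- round(adj_r2(models$r2, models$n, models$k), 3)
print(models)
```

The recomputed values can be compared with the Adjusted R2 rows in the tables (0.208, 0.510, and 0.608).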

3.4 Regressions using new transfers as the dependent variable

Dependent variable: new_transfers

Variable              Coefficient    Std. error
top_ups               -0.154         (0.155)
has_account            0.296***      (0.107)
soc_reg_coverage       0.057         (0.071)
soc_ins                0.066         (0.166)
Constant              -0.092*        (0.050)

Observations           59
R2                     0.189
Adjusted R2            0.129
Residual Std. Error    0.144 (df = 54)
F Statistic            3.148** (df = 4; 54)

Note: *p<0.1; **p<0.05; ***p<0.01
## mutate: new variable 'high_soc' (logical) with 3 unique values and 5% NA

Dependent variable: new_transfers

Variable              Coefficient    Std. error
top_ups               -0.166         (0.158)
has_account            0.296***      (0.106)
high_soc               0.049         (0.054)
soc_ins                0.052         (0.169)
Constant              -0.087*        (0.048)

Observations           59
R2                     0.192
Adjusted R2            0.132
Residual Std. Error    0.143 (df = 54)
F Statistic            3.198** (df = 4; 54)

Note: *p<0.1; **p<0.05; ***p<0.01
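The mutate message before this table reports that high_soc is a logical with three unique values, which for a logical means TRUE, FALSE, and NA. The sketch below shows how a comparison on a variable with missing values produces exactly that. It assumes high_soc flags high social registry coverage, since it takes the place of soc_reg_coverage in the second regression. The cutoff of 0.5 and the data values are hypothetical; the notebook's actual definition is not shown.

```r
# Made-up coverage values, including one missing value.
soc_reg_coverage <- c(0.10, 0.65, NA, 0.40)

# Hypothetical cutoff; the notebook's actual rule is not shown.
cutoff <- 0.5

# Comparing NA with a number yields NA, so missingness carries over.
high_soc <- soc_reg_coverage > cutoff

n_unique <- length(unique(high_soc))   # counts FALSE, TRUE, and NA
pct_na <- 100 * mean(is.na(high_soc))  # share of missing values, in percent
cat("unique values:", n_unique, " percent NA:", pct_na, "\n")
```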

4 Bivariate scatterplots

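library(ggplot2)  # ggplot(), geom_point(), geom_smooth()
library(ggrepel)  # geom_text_repel()

# Proxies for crisis severity and for delivery infrastructure
severity_vars <- c("delta_2020", "april_ld_severity")
infra_vars <- c("ease_pay", "has_id_id4d", "ease_id", "has_account", "soc_ins", "soc_reg_coverage", "tax_to_gdp")

# One scatterplot of total_bens against each variable, labeled by `code`, with a linear fit
for (var in c(severity_vars, infra_vars, "gdp_per_capita")) {
  p <- ggplot(bens, aes(.data[[var]], total_bens, label = code)) +
    geom_point() +
    geom_text_repel() +
    geom_smooth(method = "lm")
  print(p)
}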
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 9 rows containing non-finite values (stat_smooth).
## Warning: Removed 9 rows containing missing values (geom_point).
## Warning: Removed 9 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 7 rows containing non-finite values (stat_smooth).
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 7 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 15 rows containing non-finite values (stat_smooth).
## Warning: Removed 15 rows containing missing values (geom_point).
## Warning: Removed 15 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 13 rows containing non-finite values (stat_smooth).
## Warning: Removed 13 rows containing missing values (geom_point).
## Warning: Removed 13 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (stat_smooth).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_text_repel).

## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_text_repel).
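The warnings above report, for each plot, how many rows were dropped because the plotted variable or total_bens was missing. The same counts can be obtained without plotting, as in the sketch below. The data frame is made up for illustration; only the column names come from the notebook.

```r
# Toy data with made-up values; only the column names come from the notebook.
toy <- data.frame(
  total_bens = c(0.1, NA, 0.3, 0.4, 0.2),
  ease_pay = c(0.5, 0.6, NA, 0.7, NA),
  has_account = c(NA, 0.2, 0.3, 0.4, 0.5)
)

# For each x variable, count the rows missing either x or total_bens.
n_removed <- sapply(c("ease_pay", "has_account"), function(v) {
  sum(!complete.cases(toy[, c(v, "total_bens")]))
})
print(n_removed)
```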

5 Partial adjustment plots

Since many of these variables are highly correlated with GDP per capita, I also plot the partial adjustment scatterplot (an added-variable plot) of total beneficiaries on each variable, with the effect of GDP per capita netted out.

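library(car)      # avPlot()
library(stringr)  # str_c()
for (var in c(severity_vars, infra_vars)) {
  # Regress total_bens on GDP per capita plus the variable of interest
  rx <- as.formula(str_c("total_bens ~ gdp_per_capita + ", var))
  fit <- lm(rx, bens)
  # Added-variable plot for var, with gdp_per_capita partialled out
  avPlot(fit, variable = var)
}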
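"Netting out" GDP per capita means regressing both total beneficiaries and the variable of interest on GDP per capita and then plotting the two sets of residuals against each other. By the Frisch-Waugh-Lovell theorem, the slope through those residuals equals the variable's coefficient in the two-regressor model. The sketch below checks this with base R on simulated data. The values, the coefficients, and the generic regressor `x` are invented for illustration.

```r
set.seed(42)
n <- 100

# Simulated data (not the notebook's data): x is correlated with GDP per capita.
gdp_per_capita <- rnorm(n)
x <- 0.8 * gdp_per_capita + rnorm(n)
total_bens <- 0.5 * gdp_per_capita + 0.3 * x + rnorm(n)

# Full model: coefficient on x, controlling for GDP per capita.
full_fit <- lm(total_bens ~ gdp_per_capita + x)

# Partial out GDP per capita from both the outcome and x.
y_resid <- resid(lm(total_bens ~ gdp_per_capita))
x_resid <- resid(lm(x ~ gdp_per_capita))
partial_fit <- lm(y_resid ~ x_resid)

slope_full <- unname(coef(full_fit)["x"])
slope_partial <- unname(coef(partial_fit)["x_resid"])
cat("full model:", slope_full, " residual regression:", slope_partial, "\n")
```

Plotting `y_resid` against `x_resid` gives the same picture as the added-variable plot for `x`.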