1 Overview

This notebook explores the relationship between the share of a country’s population receiving cash transfers and variables that proxy for (a) the severity of the COVID crisis in the country and (b) the country’s capacity to deliver cash transfers.

2 Analyze missing data

The table below reports how many rows have each number of missing values. For example, the first row indicates that 57 of the 74 rows have no missing values.

## # A tibble: 6 x 3
##   n_miss_in_case n_cases pct_cases
##            <int>   <int>     <dbl>
## 1              0      57     77.0 
## 2              1       7      9.46
## 3              2       4      5.41
## 4              3       4      5.41
## 5              5       1      1.35
## 6              6       1      1.35
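A table of this shape can be built by counting the missing values in each row and then tabulating those counts. The sketch below does this in base R on a small made-up data frame (`toy` and its values are hypothetical, not the notebook's data); the column names simply mirror the output above.

```r
# Hypothetical toy data; the notebook's own data frame is not reproduced here.
toy <- data.frame(
  total_bens = c(0.2, 0.5, NA, 0.1, 0.4),
  soc_ins    = c(0.3, NA, NA, 0.6, 0.2),
  has_id     = c(1, 1, NA, 0, 1)
)

# Number of missing values in each row (case).
n_miss_in_case <- rowSums(is.na(toy))

# Tabulate how many rows have 0, 1, 2, ... missing values.
counts <- table(n_miss_in_case)
miss_table <- data.frame(
  n_miss_in_case = as.integer(names(counts)),
  n_cases        = as.integer(counts),
  pct_cases      = 100 * as.integer(counts) / nrow(toy)
)
print(miss_table)
```

In the table above, the same arithmetic gives 57 / 74 ≈ 77.0% for rows with no missing values.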

The figure below shows the prevalence of missing values for each of the variables in the kitchen sink regression.

The figure below provides more info on the various combinations of missingness. For example, the first column indicates that there are 3 rows for which soc_ins is missing but no other variables are missing.
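The counts behind a plot of missingness combinations can also be tabulated directly. The following is a minimal base R sketch on made-up data (the `toy` data frame is hypothetical): each row is labelled with the set of variables it is missing, and the labels are then counted.

```r
# Hypothetical toy data using a few of the notebook's variable names.
toy <- data.frame(
  soc_ins    = c(NA, 0.3, NA, 0.6, NA),
  ease_pay   = c(0.5, 0.2, NA, 0.1, 0.7),
  tax_to_gdp = c(12, 15, 20, NA, 18)
)

# Logical matrix: TRUE where a value is missing.
miss <- is.na(toy)

# For each row, build a label listing the variables that are missing.
pattern <- apply(miss, 1, function(r) {
  if (any(r)) paste(colnames(miss)[r], collapse = " + ") else "(none)"
})

# Count rows per combination, most frequent first.
pattern_counts <- sort(table(pattern), decreasing = TRUE)
print(pattern_counts)
```

Here the label "soc_ins" on its own corresponds to rows where soc_ins is missing but nothing else is, which is the kind of count the first column of the figure reports.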

3 Regressions

3.1 Basic regression of total beneficiaries


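==================================================
                       Dependent variable:
                       ---------------------------
                       total_bens
--------------------------------------------------
tax_to_gdp             -0.002
                       (0.007)

delta_2020             -0.011
                       (0.016)

has_id                 0.829***
                       (0.235)

Constant               -0.337
                       (0.214)

--------------------------------------------------
Observations           53
R²                     0.254
Adjusted R²            0.208
Residual Std. Error    0.281 (df = 49)
F Statistic            5.553*** (df = 3; 49)
==================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01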
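A regression with this specification can be fitted with `lm()`. The sketch below uses simulated data (`bens_toy` and every number in it are hypothetical) purely to show the formula and to illustrate that `lm()` silently drops rows with missing values, which is why the number of observations in a table can be smaller than the number of rows in the data.

```r
set.seed(1)
n <- 60

# Simulated stand-in for the notebook's data; values are made up.
bens_toy <- data.frame(
  tax_to_gdp = runif(n, 5, 35),
  delta_2020 = runif(n, 0, 12),
  has_id     = runif(n, 0.4, 1)
)
bens_toy$total_bens <- 0.5 * bens_toy$has_id + rnorm(n, sd = 0.2)

# Introduce a few missing values in one regressor.
bens_toy$has_id[1:3] <- NA

# Same right-hand side as the basic regression above.
fit <- lm(total_bens ~ tax_to_gdp + delta_2020 + has_id, data = bens_toy)

print(round(summary(fit)$coefficients, 3))
print(nobs(fit))  # rows actually used after dropping incomplete cases
```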

3.2 Kitchen sink regression

For this regression, I include the following variables on the right-hand side:

delta_2020 – percentage-point drop in GDP growth from 2019 to 2020 (forecasted); a proxy for the severity of the crisis
gdp_per_capita – GDP per capita
tax_to_gdp – tax to GDP
ease_pay – ease of payment; see the readme for details. Calculated from the share of people with an account and the numbers of bank branches and ATMs per 100k people
ease_id – ease of ID; calculated from the proportion of the population with an ID and social insurance coverage
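The readme defines how the composite indices are actually built, and that construction is not reproduced here. Purely as an illustration of how several components on different scales can be combined into one index, the sketch below rescales each component to [0, 1] and averages them. The data, the column names `branches_pop` and `atms_pop`, the helper `rescale01`, and the equal weighting are all assumptions, not the notebook's method.

```r
# Hypothetical component data for three countries.
toy <- data.frame(
  has_account  = c(0.30, 0.55, 0.90),  # share of people with an account
  branches_pop = c(4, 12, 30),         # bank branches per 100k people
  atms_pop     = c(10, 45, 120)        # ATMs per 100k people
)

# Rescale a vector to [0, 1] so components with different units are comparable.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Equal-weight average of the rescaled components (an assumed weighting).
toy$ease_pay_sketch <- rowMeans(sapply(toy, rescale01))
print(toy)
```

Other choices, such as z-scores or a principal component, would give a different index; the regression coefficient on the index depends on that choice.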


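==================================================
                       Dependent variable:
                       ---------------------------
                       total_bens
--------------------------------------------------
delta_2020             -0.021
                       (0.013)

gdp_per_capita         0.00002**
                       (0.00001)

tax_to_gdp             -0.005
                       (0.006)

ease_pay               0.431**
                       (0.206)

ease_id                0.004
                       (0.005)

Constant               -0.238
                       (0.173)

--------------------------------------------------
Observations           57
R²                     0.554
Adjusted R²            0.510
Residual Std. Error    0.227 (df = 51)
F Statistic            12.651*** (df = 5; 51)
==================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01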

3.3 Kitchen sink regression without ease_id or ease_pay

This is the same regression as above, except that the composite indices are replaced by their component variables: those that go into ease_id (share of the population with an ID, social insurance coverage, social registry coverage) and those that go into ease_pay (number of bank branches per 100k, number of ATMs per 100k, share of the population with a bank account).


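==================================================
                       Dependent variable:
                       ---------------------------
                       total_bens
--------------------------------------------------
delta_2020             -0.012
                       (0.013)

gdp_per_capita         0.00002***
                       (0.00001)

tax_to_gdp             -0.004
                       (0.005)

has_id_id4d            0.001
                       (0.002)

soc_ins                -0.896**
                       (0.340)

soc_reg_coverage       0.273**
                       (0.104)

i_branches_A1_pop      0.010**
                       (0.004)

i_ATMs_pop             0.002
                       (0.002)

has_account            0.172
                       (0.195)

Constant               -0.105
                       (0.169)

--------------------------------------------------
Observations           57
R²                     0.671
Adjusted R²            0.608
Residual Std. Error    0.203 (df = 47)
F Statistic            10.662*** (df = 9; 47)
==================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01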
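Because this regression has nine regressors where the previous one has five, the raw R² values are not directly comparable; adjusted R² penalizes the extra parameters. As a quick check, the adjusted values reported in sections 3.2 and 3.3 can be recomputed from R², the number of observations n, and the number of regressors k using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The helper name `adj_r2` is introduced here for illustration.

```r
# Adjusted R-squared from R-squared, sample size n, and number of regressors k
# (k excludes the intercept).
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Inputs taken from the regression tables: both models use n = 57.
adj_32 <- adj_r2(r2 = 0.554, n = 57, k = 5)  # section 3.2, composite indices
adj_33 <- adj_r2(r2 = 0.671, n = 57, k = 9)  # section 3.3, component variables

print(round(c(section_3.2 = adj_32, section_3.3 = adj_33), 3))
```

Both results agree with the reported values to three decimals, so the gain in fit from using the components survives the penalty for four additional parameters.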

3.4 Regressions using new transfers as the dependent variable


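==================================================
                       Dependent variable:
                       ---------------------------
                       new_transfers
--------------------------------------------------
top_ups                -0.154
                       (0.155)

has_account            0.296***
                       (0.107)

soc_reg_coverage       0.057
                       (0.071)

soc_ins                0.066
                       (0.166)

Constant               -0.092*
                       (0.050)

--------------------------------------------------
Observations           59
R²                     0.189
Adjusted R²            0.129
Residual Std. Error    0.144 (df = 54)
F Statistic            3.148** (df = 4; 54)
==================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01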

## mutate: new variable 'high_soc' (logical) with 3 unique values and 5% NA


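==================================================
                       Dependent variable:
                       ---------------------------
                       new_transfers
--------------------------------------------------
top_ups                -0.166
                       (0.158)

has_account            0.296***
                       (0.106)

high_soc               0.049
                       (0.054)

soc_ins                0.052
                       (0.169)

Constant               -0.087*
                       (0.048)

--------------------------------------------------
Observations           59
R²                     0.192
Adjusted R²            0.132
Residual Std. Error    0.143 (df = 54)
F Statistic            3.198** (df = 4; 54)
==================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01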
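The notebook does not show how high_soc is defined; the mutate message only says it is a logical variable with three unique values (TRUE, FALSE, and NA). One plausible reading is an indicator for social registry coverage above some cutoff. The sketch below illustrates that reading on made-up data; the `toy` data frame and the median cutoff are assumptions, not the author's definition.

```r
# Hypothetical coverage values, including one missing observation.
toy <- data.frame(soc_reg_coverage = c(0.05, 0.40, NA, 0.75, 0.20))

# Assumed cutoff: the sample median. The notebook's actual threshold is unknown.
cutoff <- median(toy$soc_reg_coverage, na.rm = TRUE)

# Comparisons with NA yield NA, so the indicator has three distinct values.
toy$high_soc <- toy$soc_reg_coverage > cutoff

print(toy)
print(unique(toy$high_soc))
```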

4 Bivariate scatterplots

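library(ggplot2)  # ggplot(), geom_point(), geom_smooth()
library(ggrepel)  # geom_text_repel()

# Proxies for crisis severity and for delivery infrastructure
severity_vars <- c("delta_2020", "april_ld_severity")
infra_vars <- c("ease_pay", "has_id_id4d", "ease_id", "has_account", "soc_ins", "soc_reg_coverage", "tax_to_gdp")

# One scatterplot of total_bens against each variable, labelled by `code`,
# with a fitted linear trend
for (var in c(severity_vars, infra_vars, "gdp_per_capita")) {
  p <- ggplot(bens, aes(.data[[var]], total_bens, label = code)) +
    geom_point() +
    geom_text_repel() +
    geom_smooth(method = "lm")
  print(p)  # ggplot objects must be printed explicitly inside a loop
}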

Each of the ten plots emits the message `geom_smooth()` using formula 'y ~ x'. For every plot except one, ggplot2 also warns that rows with missing or non-finite values were removed from stat_smooth, geom_point, and geom_text_repel. The number of rows removed per plot, matched to variables by loop order, is:

delta_2020           4
april_ld_severity    9
ease_pay             7
has_id_id4d          0 (no warning)
ease_id              15
has_account          6
soc_ins              13
soc_reg_coverage     4
tax_to_gdp           3
gdp_per_capita       6
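The number of rows reported in the warnings above can be anticipated before plotting, since a point is dropped whenever either the x variable or total_bens is missing. A minimal base R sketch on made-up data (`bens_toy` is hypothetical):

```r
# Hypothetical toy data with a few missing values.
bens_toy <- data.frame(
  total_bens  = c(0.1, 0.4, NA, 0.3, 0.8),
  delta_2020  = c(5, NA, 7, 3, 9),
  has_account = c(0.2, 0.5, 0.6, 0.9, 0.7)
)

vars <- c("delta_2020", "has_account")

# For each x variable, count rows where x or total_bens is missing.
n_dropped <- sapply(vars, function(v) {
  sum(!complete.cases(bens_toy[, c(v, "total_bens")]))
})
print(n_dropped)
```

Because the set of dropped rows differs from plot to plot, each fitted line is estimated on a different subsample, which is worth keeping in mind when comparing slopes across plots.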

5 Partial adjustment (added-variable) plots

Since many of these variables are highly correlated with GDP per capita, I also draw, for each variable, an added-variable (partial regression) plot of total beneficiaries against that variable with the effect of GDP per capita netted out of both.

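library(car)      # avPlot()
library(stringr)  # str_c()

for (var in c(severity_vars, infra_vars)) {
  # Regress total_bens on GDP per capita plus the variable of interest
  rx <- as.formula(str_c("total_bens ~ gdp_per_capita + ", var))
  fit <- lm(rx, data = bens)
  avPlot(fit, variable = var)
}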
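What avPlot() draws can be reproduced by hand: regress both total_bens and the variable of interest on GDP per capita, then plot one set of residuals against the other. By the Frisch-Waugh-Lovell theorem, the slope of that residual-on-residual fit equals the variable's coefficient in the full regression. The sketch below checks this on simulated data (all values and the `toy` data frame are made up).

```r
set.seed(42)
n <- 80

# Simulated data in which has_account is correlated with GDP per capita.
gdp_per_capita <- rexp(n, rate = 1 / 10000)
has_account    <- pmin(1, 0.2 + gdp_per_capita / 40000 + rnorm(n, sd = 0.1))
total_bens     <- 0.3 * has_account + 0.00001 * gdp_per_capita + rnorm(n, sd = 0.1)
toy <- data.frame(total_bens, gdp_per_capita, has_account)

# Full regression, as in the loop above.
full <- lm(total_bens ~ gdp_per_capita + has_account, data = toy)

# Net GDP per capita out of both the outcome and the variable of interest.
res_y <- resid(lm(total_bens ~ gdp_per_capita, data = toy))
res_x <- resid(lm(has_account ~ gdp_per_capita, data = toy))

# Residual-on-residual regression: its slope is the partial effect.
partial <- lm(res_y ~ res_x)

plot(res_x, res_y,
     xlab = "has_account | gdp_per_capita",
     ylab = "total_bens | gdp_per_capita")
abline(partial)

print(c(full = unname(coef(full)["has_account"]),
        partial = unname(coef(partial)["res_x"])))
```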

Analyze New Beneficiary Data