This notebook explores the relationship between the share of a country’s population receiving cash transfers and variables that proxy for the severity of the COVID-19 crisis in the country and for its capacity to deliver cash transfers.
The table below reports how many rows have each number of missing values. For example, the first row indicates that 57 (out of 74) rows have no missing values.
## # A tibble: 6 x 3
## n_miss_in_case n_cases pct_cases
## <int> <int> <dbl>
## 1 0 57 77.0
## 2 1 7 9.46
## 3 2 4 5.41
## 4 3 4 5.41
## 5 5 1 1.35
## 6 6 1 1.35
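A table of this kind can be built by counting missing values per row and then tabulating the counts. The sketch below is base R on a made-up data frame (`toy` and its values are assumptions for illustration, not the notebook's data). The column names in the output above resemble those of `naniar::miss_case_table()`, but the notebook does not show which function produced it.

```r
# Toy data standing in for the notebook's data frame (values are made up).
toy <- data.frame(
  total_bens = c(0.2, NA, 0.5, 0.1, 0.4),
  soc_ins    = c(0.3, 0.6, NA, 0.2, 0.1),
  tax_to_gdp = c(15, NA, 22, 18, 30)
)

# Number of missing values in each row.
n_miss_in_case <- rowSums(is.na(toy))

# Tabulate rows by their missing-value count, with percentages of all rows.
counts <- table(n_miss_in_case)
miss_table <- data.frame(
  n_miss_in_case = as.integer(names(counts)),
  n_cases        = as.integer(counts),
  pct_cases      = 100 * as.integer(counts) / nrow(toy)
)
print(miss_table)
```

Each row of `miss_table` reads the same way as the output above: the number of missing values, how many rows have that many, and their share of all rows.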
The figure below shows the prevalence of missing values for each of the variables in the kitchen sink regression.
The figure below provides more info on the various combinations of missingness. For example, the first column indicates that there are 3 rows for which soc_ins is missing but no other variables are missing.
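As a rough text-only counterpart to the two figures described above, per-variable missingness and the combinations of missing variables can be tabulated directly. This is a base R sketch on made-up data; `toy`, its values, and the `" + "` pattern labels are assumptions for illustration.

```r
# Toy data standing in for the notebook's data frame (values are made up).
toy <- data.frame(
  total_bens = c(0.2, 0.3, 0.5, 0.1, 0.4),
  soc_ins    = c(NA, 0.6, NA, 0.2, NA),
  ease_id    = c(1.0, 2.0, NA, 3.0, 1.5)
)

# Prevalence of missing values for each variable.
n_miss_by_var <- colSums(is.na(toy))

# Label each row with the set of variables that are missing in it.
pattern <- apply(is.na(toy), 1, function(row) {
  if (any(row)) paste(names(toy)[row], collapse = " + ") else "(none)"
})

# Count how many rows share each missingness pattern.
pattern_counts <- sort(table(pattern), decreasing = TRUE)

print(n_miss_by_var)
print(pattern_counts)
```

A pattern such as `soc_ins` on its own corresponds to the case described above, where `soc_ins` is missing and nothing else is.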
| | Dependent variable: total_bens |
| --- | --- |
| tax_to_gdp | -0.002 (0.007) |
| delta_2020 | -0.011 (0.016) |
| has_id | 0.829*** (0.235) |
| Constant | -0.337 (0.214) |
| Observations | 53 |
| R² | 0.254 |
| Adjusted R² | 0.208 |
| Residual Std. Error | 0.281 (df = 49) |
| F Statistic | 5.553*** (df = 3; 49) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
For this regression, I include the following variables on the right-hand side: delta_2020, gdp_per_capita, tax_to_gdp, ease_pay, and ease_id.
| | Dependent variable: total_bens |
| --- | --- |
| delta_2020 | -0.021 (0.013) |
| gdp_per_capita | 0.00002** (0.00001) |
| tax_to_gdp | -0.005 (0.006) |
| ease_pay | 0.431** (0.206) |
| ease_id | 0.004 (0.005) |
| Constant | -0.238 (0.173) |
| Observations | 57 |
| R² | 0.554 |
| Adjusted R² | 0.510 |
| Residual Std. Error | 0.227 (df = 51) |
| F Statistic | 12.651*** (df = 5; 51) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
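The notebook does not show the code behind this table, so the following is only a sketch of how a regression with these right-hand-side variables could be fitted with `lm()`. The data are simulated: `toy`, the variable scales, and the coefficients used to generate `total_bens` are assumptions, so the estimates will not match the table.

```r
set.seed(1)
n <- 57  # same number of observations as the table, for illustration only

# Simulated stand-in for the notebook's data (all values are made up).
toy <- data.frame(
  delta_2020     = rnorm(n, mean = -5, sd = 4),
  gdp_per_capita = runif(n, 1000, 40000),
  tax_to_gdp     = runif(n, 5, 35),
  ease_pay       = runif(n),
  ease_id        = runif(n, 0, 100)
)
toy$total_bens <- 0.00002 * toy$gdp_per_capita +
  0.4 * toy$ease_pay + rnorm(n, sd = 0.2)

# OLS of total_bens on the severity, income, and capacity proxies.
fit <- lm(
  total_bens ~ delta_2020 + gdp_per_capita + tax_to_gdp + ease_pay + ease_id,
  data = toy
)

# Estimates, standard errors, t values, and p values.
coefs <- summary(fit)$coefficients
print(round(coefs, 5))

# Residual degrees of freedom: n minus (5 slopes + intercept).
df_resid <- df.residual(fit)
print(df_resid)
```

With 57 observations and five regressors plus a constant, the residual degrees of freedom come out to 51, which is the `df` reported next to the residual standard error.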
This is the same regression as above, except that I replace the composite indices with the variables that go into them: for ease_id, the share of the population with an ID, social insurance coverage, and social registry coverage; for ease_pay, the number of bank branches per 100k, the number of ATMs per 100k, and the share of the population with a bank account.
| | Dependent variable: total_bens |
| --- | --- |
| delta_2020 | -0.012 (0.013) |
| gdp_per_capita | 0.00002*** (0.00001) |
| tax_to_gdp | -0.004 (0.005) |
| has_id_id4d | 0.001 (0.002) |
| soc_ins | -0.896** (0.340) |
| soc_reg_coverage | 0.273** (0.104) |
| i_branches_A1_pop | 0.010** (0.004) |
| i_ATMs_pop | 0.002 (0.002) |
| has_account | 0.172 (0.195) |
| Constant | -0.105 (0.169) |
| Observations | 57 |
| R² | 0.671 |
| Adjusted R² | 0.608 |
| Residual Std. Error | 0.203 (df = 47) |
| F Statistic | 10.662*** (df = 9; 47) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
| | Dependent variable: new_transfers |
| --- | --- |
| top_ups | -0.154 (0.155) |
| has_account | 0.296*** (0.107) |
| soc_reg_coverage | 0.057 (0.071) |
| soc_ins | 0.066 (0.166) |
| Constant | -0.092* (0.050) |
| Observations | 59 |
| R² | 0.189 |
| Adjusted R² | 0.129 |
| Residual Std. Error | 0.144 (df = 54) |
| F Statistic | 3.148** (df = 4; 54) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
## mutate: new variable 'high_soc' (logical) with 3 unique values and 5% NA
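The message above says only that `high_soc` is a logical variable with some missing values; the notebook does not show how it is defined. Because `high_soc` takes the place of `soc_reg_coverage` in the next regression, one plausible construction is an indicator for above-median social registry coverage. Both the source variable and the median cutoff are assumptions in this sketch, and the data are made up.

```r
# Toy coverage values (made up); NA marks a country with no coverage data.
toy <- data.frame(soc_reg_coverage = c(0.05, 0.40, NA, 0.75, 0.10, 0.60))

# Hypothetical cutoff: the median of the observed coverage values.
cutoff <- median(toy$soc_reg_coverage, na.rm = TRUE)

# Logical indicator; rows with missing coverage stay NA.
toy$high_soc <- toy$soc_reg_coverage > cutoff

print(toy)
```

A logical column built this way has three distinct values (TRUE, FALSE, and NA), which matches the "3 unique values" wording in the message.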
| | Dependent variable: new_transfers |
| --- | --- |
| top_ups | -0.166 (0.158) |
| has_account | 0.296*** (0.106) |
| high_soc | 0.049 (0.054) |
| soc_ins | 0.052 (0.169) |
| Constant | -0.087* (0.048) |
| Observations | 59 |
| R² | 0.192 |
| Adjusted R² | 0.132 |
| Residual Std. Error | 0.143 (df = 54) |
| F Statistic | 3.198** (df = 4; 54) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
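library(ggplot2)  # ggplot(), geom_point(), geom_smooth()
library(ggrepel)  # geom_text_repel()

# Proxies for crisis severity and for delivery infrastructure.
severity_vars <- c("delta_2020", "april_ld_severity")
infra_vars <- c("ease_pay", "has_id_id4d", "ease_id", "has_account", "soc_ins", "soc_reg_coverage", "tax_to_gdp")

# Scatterplot of total_bens against each variable, labelled by `code`, with a linear fit.
for (var in c(severity_vars, infra_vars, "gdp_per_capita")) {
  p <- ggplot(bens, aes(.data[[var]], total_bens, label = code)) +
    geom_point() +
    geom_text_repel() +
    geom_smooth(method = "lm")
  print(p)
}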
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 9 rows containing non-finite values (stat_smooth).
## Warning: Removed 9 rows containing missing values (geom_point).
## Warning: Removed 9 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 7 rows containing non-finite values (stat_smooth).
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 7 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 15 rows containing non-finite values (stat_smooth).
## Warning: Removed 15 rows containing missing values (geom_point).
## Warning: Removed 15 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 13 rows containing non-finite values (stat_smooth).
## Warning: Removed 13 rows containing missing values (geom_point).
## Warning: Removed 13 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (stat_smooth).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_text_repel).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_text_repel).
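Each group of warnings above reports the number of rows ggplot2 dropped from one plot because the x or y value was missing. The same counts can be computed ahead of plotting, which makes it easier to see which variables lose the most observations. This is a base R sketch on made-up data; `toy` and its values are assumptions.

```r
# Toy data standing in for `bens` (values are made up).
toy <- data.frame(
  total_bens = c(0.2, NA, 0.5, 0.1, 0.4, 0.3),
  delta_2020 = c(-3, -5, NA, -8, NA, -1),
  has_id     = c(0.9, 0.8, 0.7, 0.95, 0.6, 0.85)
)
vars <- c("delta_2020", "has_id")

# For each x variable, count rows where either x or total_bens is missing.
n_dropped <- sapply(vars, function(v) {
  sum(!complete.cases(toy[, c(v, "total_bens")]))
})
print(n_dropped)
```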
Since many of these variables are highly correlated with GDP per capita, I also plot the added-variable (partial regression) plot of total beneficiaries on each variable, with the effect of GDP per capita netted out.
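library(car)      # avPlot()
library(stringr)  # str_c()

# Added-variable plot for each variable, controlling for GDP per capita.
for (var in c(severity_vars, infra_vars)) {
  rx <- as.formula(str_c("total_bens ~ gdp_per_capita + ", var))
  fit <- lm(rx, data = bens)
  avPlot(fit, variable = var)
}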
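To make "netted out" concrete: the plots drawn by `avPlot()` above show the residuals of `total_bens` regressed on `gdp_per_capita` against the residuals of the variable of interest regressed on `gdp_per_capita`. By the Frisch-Waugh-Lovell theorem, the slope through those residuals equals the variable's coefficient in the full regression. The sketch below checks this in base R on simulated data; `toy`, the choice of `has_account` as the example variable, and the generating coefficients are assumptions for illustration.

```r
set.seed(42)
n <- 60

# Simulated data in which has_account is correlated with GDP per capita.
gdp_per_capita <- runif(n, 1000, 40000)
has_account <- 0.2 + 0.00002 * gdp_per_capita + rnorm(n, sd = 0.1)
total_bens <- 0.1 + 0.00001 * gdp_per_capita + 0.3 * has_account +
  rnorm(n, sd = 0.2)
toy <- data.frame(total_bens, gdp_per_capita, has_account)

# Full regression: has_account coefficient, controlling for GDP per capita.
full <- lm(total_bens ~ gdp_per_capita + has_account, data = toy)

# Net GDP per capita out of both the outcome and the variable of interest.
res_y <- resid(lm(total_bens ~ gdp_per_capita, data = toy))
res_x <- resid(lm(has_account ~ gdp_per_capita, data = toy))

# Regressing residuals on residuals gives the added-variable plot's slope.
partial <- lm(res_y ~ res_x)

slope_full <- unname(coef(full)["has_account"])
slope_partial <- unname(coef(partial)["res_x"])
print(c(full = slope_full, partial = slope_partial))
```

The two printed slopes agree up to floating-point error, so each added-variable plot can be read as a picture of one coefficient from the regression of `total_bens` on `gdp_per_capita` and that variable.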