This notebook explores the relationship between the share of a country’s population receiving covid cash transfers and a set of variables that proxy for the severity of the covid crisis in the country and for its capacity to deliver cash transfers.
The figures below show which variables have missing values. Before running this analysis, I removed all countries for which m21_bens_actual is missing, i.e. countries for which the latest WB SP spreadsheet has no data on cash transfer coverage. (Otherwise, many countries are missing data for a large number of variables and the output becomes difficult to read.)
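A minimal sketch of the filtering step described above, written in base R on a made-up toy data frame. The name `infra_raw` and all values are hypothetical stand-ins, not the notebook's data. With dplyr, the same step would be `filter(!is.na(m21_bens_actual))`.

```r
# Toy stand-in for the country-level data; all values are made up.
infra_raw <- data.frame(
  code = c("AAA", "BBB", "CCC", "DDD"),
  m21_bens_actual = c(0.10, NA, 0.35, 0.20),
  gov_effect = c(0.5, -0.2, NA, 1.1),
  stringsAsFactors = FALSE
)

# Keep only countries with observed cash-transfer coverage.
infra_toy <- infra_raw[!is.na(infra_raw$m21_bens_actual), ]

# Share of missing values in each remaining variable.
miss_share <- colMeans(is.na(infra_toy))
print(miss_share)
```

After this step the outcome variable has no missing values by construction, so any remaining gaps in the missingness plots come from the covariates.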
# Visualize the missingness
infra %>% vis_miss(cluster = TRUE)
infra %>% gg_miss_upset()
Regressions of m21_bens_actual: (1) on govt_transfer and deaths; (2) as (1), adding gdpg_2020e; (3) as (1), adding april_ld_severity instead; (4), (5) and (6) repeat (1), (2) and (3) with gov_effect added.
| Dependent variable: m21_bens_actual | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| govt_transfer | 1.227*** | 1.237*** | 1.303*** | 0.826** | 0.850** | 0.869** |
| | (0.337) | (0.342) | (0.347) | (0.339) | (0.349) | (0.356) |
| deaths | 0.0001* | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| | (0.0001) | (0.0001) | (0.0001) | (0.0001) | (0.0001) | (0.0001) |
| gdpg_2020e | | -0.005 | | | -0.002 | |
| | | (0.007) | | | (0.006) | |
| april_ld_severity | | | 0.011 | | | 0.005 |
| | | | (0.008) | | | (0.008) |
| gov_effect | | | | 0.150*** | 0.154*** | 0.150*** |
| | | | | (0.043) | (0.050) | (0.047) |
| Constant | 0.071 | 0.072 | -0.120 | 0.206*** | 0.208*** | 0.109 |
| | (0.045) | (0.049) | (0.160) | (0.058) | (0.064) | (0.168) |
| Observations | 77 | 73 | 74 | 76 | 72 | 73 |
| R2 | 0.219 | 0.226 | 0.234 | 0.330 | 0.324 | 0.335 |
| Adjusted R2 | 0.197 | 0.193 | 0.201 | 0.302 | 0.284 | 0.296 |
| Residual Std. Error | 0.253 (df = 74) | 0.256 (df = 69) | 0.256 (df = 70) | 0.238 (df = 72) | 0.243 (df = 67) | 0.242 (df = 68) |
| F Statistic | 10.349*** (df = 2; 74) | 6.723*** (df = 3; 69) | 7.138*** (df = 3; 70) | 11.800*** (df = 3; 72) | 8.038*** (df = 4; 67) | 8.578*** (df = 4; 68) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | | | | |
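The six specifications in this table can be built up from a base model with `update()`. The sketch below is an assumption about how such nested models might be fit, not the notebook's actual code. It uses simulated toy data (the data frame `toy` and all values are hypothetical); only the variable names are taken from the table.

```r
set.seed(1)
n <- 60

# Simulated stand-in data; only the column names come from the table above.
toy <- data.frame(
  govt_transfer = runif(n, 0, 0.3),
  deaths = rpois(n, 200),
  gdpg_2020e = rnorm(n, -4, 3),
  april_ld_severity = runif(n, 10, 25),
  gov_effect = rnorm(n)
)
toy$m21_bens_actual <- 0.1 + 1.0 * toy$govt_transfer +
  0.1 * toy$gov_effect + rnorm(n, sd = 0.2)

# (1) base model; (2) and (3) each add one crisis-severity proxy to (1).
m1 <- lm(m21_bens_actual ~ govt_transfer + deaths, data = toy)
m2 <- update(m1, . ~ . + gdpg_2020e)
m3 <- update(m1, . ~ . + april_ld_severity)

# (4) to (6) repeat (1) to (3) with gov_effect added.
m4 <- update(m1, . ~ . + gov_effect)
m5 <- update(m2, . ~ . + gov_effect)
m6 <- update(m3, . ~ . + gov_effect)

models <- list(m1, m2, m3, m4, m5, m6)

# Number of slope coefficients per model (intercept excluded).
n_predictors <- sapply(models, function(m) length(coef(m)) - 1)
print(n_predictors)
```

The predictor counts per model should line up with the numerator degrees of freedom in the F Statistic row, which is a quick way to check that each column contains the intended covariates.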
# create dataset without Kuwait
df1 <- infra %>% filter(code != "KWT")
# create dataset without Kuwait and without the recently added countries
df2 <- df1 %>% filter(m21_status != "No plan")
# Excluding Kuwait
ggplot(df1, aes(log_gdp, m21_bens_actual, colour = region, label = code)) + geom_point()+ geom_text_repel() +labs(caption = "Kuwait excluded") + geom_smooth(mapping = aes(log_gdp, m21_bens_actual), inherit.aes = FALSE, method = "lm")
r1 <- lm(m21_bens_actual ~ log_gdp, df1)
ggplot(df1, aes(gdp_per_capita, m21_bens_actual, colour = region, label = code)) + geom_point()+ geom_text_repel()+labs(caption = "Kuwait excluded")+ geom_smooth(mapping = aes(gdp_per_capita, m21_bens_actual), inherit.aes = FALSE, method = "lm")
r2 <- lm(m21_bens_actual ~ gdp_per_capita, df1)
# Excluding Kuwait and recently added countries
ggplot(df2, aes(log_gdp, m21_bens_actual, colour = region, label = code)) + geom_point()+ geom_text_repel() +labs(caption = "Just countries from Ana's list")+ geom_smooth(mapping = aes(log_gdp, m21_bens_actual), inherit.aes = FALSE, method = "lm")
r3 <- lm(m21_bens_actual ~ log_gdp, df2)
# cleaned up version of the graph for RP
ggplot(df2, aes(gdp_per_capita, m21_bens_actual, colour = region, label = code)) + geom_point()+ geom_text_repel()+labs(y = "% population receiving cash transfer", x = "GDP per capita")+ geom_smooth(mapping = aes(gdp_per_capita, m21_bens_actual), inherit.aes = FALSE, method = "lm") + guides(colour = guide_legend(override.aes = list(size = 5)))
r4 <- lm(m21_bens_actual ~ gdp_per_capita, df2)
# generate weighted averages by region of m21_bens_actual
df <- infra %>%
  mutate(r2 = as.character(region)) %>%
  mutate(r2 = if_else(code == "IND", "India", r2)) %>%
  mutate(r2 = if_else(code == "CHN", "China", r2)) %>%
  group_by(r2) %>%
  summarise(bens = weighted.mean(m21_bens_actual, pop_2018, na.rm = TRUE)) %>%
  arrange(bens)
ggplot(df, aes(reorder(r2, bens), bens)) + geom_bar(stat = "identity")
In the regressions below, I regress the share of the total population receiving covid cash transfers, spending on covid cash transfers per capita, the share of the population receiving cash transfers pre-covid, and total spending on the covid response as a share of GDP on a broad set of covariates.
| Dependent variable: | m21_bens_actual (1) | govt_transfer (2) | total_fiscal (3) |
|---|---|---|---|
| gdpg_2020e | -0.019** | -0.001 | 0.109 |
| | (0.008) | (0.004) | (0.096) |
| gdp_per_capita | 0.00002 | -0.00000 | 0.0003 |
| | (0.00002) | (0.00001) | (0.0002) |
| tax_to_gdp | 0.007 | -0.002 | 0.132** |
| | (0.005) | (0.002) | (0.056) |
| has_id_id4d | -0.003* | 0.001 | -0.002 |
| | (0.002) | (0.001) | (0.020) |
| received_wages | 0.632 | -0.279 | -4.545 |
| | (0.611) | (0.268) | (7.184) |
| soc_reg_coverage | 0.015 | 0.050 | -0.212 |
| | (0.119) | (0.052) | (1.397) |
| i_branches_A1_pop | 0.003 | 0.0004 | -0.0001 |
| | (0.004) | (0.002) | (0.049) |
| i_ATMs_pop | 0.004** | -0.00000 | 0.036* |
| | (0.002) | (0.001) | (0.020) |
| has_account | -0.310 | 0.393*** | 2.369 |
| | (0.298) | (0.131) | (3.508) |
| deaths | -0.0002 | 0.00005 | 0.002 |
| | (0.0001) | (0.0001) | (0.002) |
| gov_effect | 0.126 | -0.041 | -0.846 |
| | (0.096) | (0.042) | (1.127) |
| tsa | -0.033 | 0.016 | -0.325 |
| | (0.036) | (0.016) | (0.420) |
| unpan_egov | -0.0001 | -0.001 | -0.014 |
| | (0.003) | (0.001) | (0.030) |
| regionEAP | -0.045 | 0.102 | -1.661 |
| | (0.157) | (0.069) | (1.843) |
| regionLAC | 0.165 | -0.074 | -0.048 |
| | (0.125) | (0.055) | (1.470) |
| regionMNA | -0.026 | -0.005 | -1.278 |
| | (0.137) | (0.060) | (1.607) |
| regionSAR | 0.173 | -0.016 | -2.897 |
| | (0.155) | (0.068) | (1.828) |
| m21_statusIn progress | -0.286** | 0.027 | -1.307 |
| | (0.130) | (0.057) | (1.530) |
| m21_statusPlanned | -0.313** | -0.032 | -2.718 |
| | (0.136) | (0.060) | (1.599) |
| inc_levelLMIC | 0.074 | -0.006 | 2.601 |
| | (0.135) | (0.059) | (1.583) |
| inc_levelUMIC | -0.161 | 0.059 | 0.621 |
| | (0.188) | (0.082) | (2.207) |
| pop_2018 | -0.000 | -0.000** | 0.000 |
| | (0.000) | (0.000) | (0.000) |
| m21_precovid | 0.123 | -0.053 | 6.354** |
| | (0.255) | (0.112) | (2.997) |
| april_ld_severity | -0.018 | -0.004 | 0.028 |
| | (0.011) | (0.005) | (0.127) |
| log_gdp | -0.108 | -0.034 | -4.785** |
| | (0.168) | (0.074) | (1.980) |
| progress | | | |
| Constant | 1.606 | 0.484 | 39.834** |
| | (1.391) | (0.612) | (16.369) |
| Observations | 49 | 49 | 49 |
| R2 | 0.872 | 0.799 | 0.692 |
| Adjusted R2 | 0.734 | 0.581 | 0.356 |
| Residual Std. Error (df = 23) | 0.155 | 0.068 | 1.827 |
| F Statistic (df = 25; 23) | 6.287*** | 3.663*** | 2.063** |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
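Fitting the same right-hand side against several outcomes, as in this table, is convenient to do in a loop. The sketch below is one possible way to set that up with `reformulate()` and `lapply()`. It is an assumption, not the notebook's code, and it uses simulated data and a shortened covariate list (`toy`, `covariates` and all values are hypothetical).

```r
set.seed(2)
n <- 80

# Simulated stand-in data with a handful of the covariates from the table.
toy <- data.frame(
  gdpg_2020e = rnorm(n, -4, 3),
  tax_to_gdp = runif(n, 5, 30),
  gov_effect = rnorm(n),
  deaths = rpois(n, 100)
)
toy$m21_bens_actual <- 0.2 + 0.1 * toy$gov_effect + rnorm(n, sd = 0.2)
toy$govt_transfer <- 0.05 + 0.001 * toy$tax_to_gdp + rnorm(n, sd = 0.05)
toy$total_fiscal <- 2 + 0.1 * toy$tax_to_gdp + rnorm(n, sd = 2)

outcomes <- c("m21_bens_actual", "govt_transfer", "total_fiscal")
covariates <- c("gdpg_2020e", "tax_to_gdp", "gov_effect", "deaths")

# One lm() per outcome, all sharing the same covariates.
fits <- lapply(outcomes, function(y) {
  lm(reformulate(covariates, response = y), data = toy)
})
names(fits) <- outcomes

# Adjusted R2 for each outcome.
print(sapply(fits, function(m) summary(m)$adj.r.squared))
```

Keeping the covariate list in one vector makes it harder for the columns of the table to drift apart when a variable is added or dropped.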
Same as above, but excluding all countries whose status is “Planned.”
| Dependent variable: | m21_bens_actual (1) | govt_transfer (2) | total_fiscal (3) |
|---|---|---|---|
| gdpg_2020e | -0.024* | -0.005 | 0.116 |
| | (0.012) | (0.004) | (0.123) |
| gdp_per_capita | 0.0001 | 0.00002 | 0.0002 |
| | (0.00004) | (0.00001) | (0.0004) |
| tax_to_gdp | 0.014 | -0.003 | 0.003 |
| | (0.009) | (0.003) | (0.099) |
| has_id_id4d | -0.001 | 0.002 | -0.002 |
| | (0.004) | (0.001) | (0.039) |
| received_wages | 1.855* | 0.049 | -12.226 |
| | (0.912) | (0.327) | (10.196) |
| soc_reg_coverage | -0.104 | 0.048 | 0.764 |
| | (0.156) | (0.054) | (1.686) |
| i_branches_A1_pop | 0.006 | -0.0003 | 0.015 |
| | (0.004) | (0.002) | (0.050) |
| i_ATMs_pop | 0.004* | -0.0003 | 0.042* |
| | (0.002) | (0.001) | (0.023) |
| has_account | -1.104* | 0.341* | 6.585 |
| | (0.537) | (0.165) | (5.157) |
| deaths | -0.0002 | 0.00000 | 0.003 |
| | (0.0002) | (0.0001) | (0.002) |
| gov_effect | 0.108 | -0.071 | 0.225 |
| | (0.132) | (0.043) | (1.330) |
| tsa | -0.030 | 0.032 | -0.031 |
| | (0.068) | (0.023) | (0.705) |
| unpan_egov | 0.001 | -0.002 | 0.037 |
| | (0.004) | (0.001) | (0.041) |
| regionEAP | -0.108 | -0.038 | 0.913 |
| | (0.195) | (0.069) | (2.153) |
| regionLAC | 0.123 | -0.153** | -0.685 |
| | (0.211) | (0.061) | (1.910) |
| regionMNA | -0.028 | -0.084 | -1.897 |
| | (0.195) | (0.066) | (2.048) |
| regionSAR | 0.325 | -0.218* | -2.415 |
| | (0.356) | (0.111) | (3.464) |
| m21_statusIn progress | -0.316 | -0.045 | -1.846 |
| | (0.235) | (0.083) | (2.600) |
| inc_levelLMIC | 0.602 | 0.439** | -3.153 |
| | (0.592) | (0.170) | (5.311) |
| inc_levelUMIC | 0.434 | 0.497** | -4.981 |
| | (0.642) | (0.180) | (5.617) |
| pop_2018 | -0.000 | -0.000 | -0.000 |
| | (0.000) | (0.000) | (0.000) |
| govt_transfer | 0.693 | | |
| | (0.806) | | |
| april_ld_severity | -0.020 | -0.006 | 0.195 |
| | (0.016) | (0.006) | (0.172) |
| log_gdp | -0.530 | -0.397** | -0.461 |
| | (0.563) | (0.166) | (5.187) |
| progress | | | |
| Constant | 4.158 | 3.219** | 0.038 |
| | (4.288) | (1.224) | (38.200) |
| Observations | 36 | 36 | 36 |
| R2 | 0.882 | 0.907 | 0.732 |
| Adjusted R2 | 0.625 | 0.729 | 0.218 |
| Residual Std. Error | 0.172 (df = 11) | 0.062 (df = 12) | 1.927 (df = 12) |
| F Statistic | 3.428** (df = 24; 11) | 5.101*** (df = 23; 12) | 1.425 (df = 23; 12) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
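The large gap between R2 and adjusted R2 in the two kitchen sink tables comes from the number of covariates relative to the number of observations. The standard adjustment is adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of slope coefficients. The sketch below plugs in the values reported for the first column of each table; the helper name `adj_r2` is introduced here for illustration only.

```r
# Standard adjusted R2 formula; p counts slope coefficients (no intercept).
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Column (1) of the full-sample table: R2 = 0.872, 49 observations, 25 covariates.
full_sample <- adj_r2(0.872, n = 49, p = 25)

# Column (1) of the table excluding "Planned": R2 = 0.882, 36 observations, 24 covariates.
excl_planned <- adj_r2(0.882, n = 36, p = 24)

print(round(c(full_sample = full_sample, excl_planned = excl_planned), 3))
```

Both results come out close to the reported 0.734 and 0.625, with small differences attributable to the R2 values being rounded to three decimals in the tables.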
The regressions below are slightly more parsimonious than the kitchen sink regressions above.
| Dependent variable: | ||||||||||
| m21_bens_actual | progress | |||||||||
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | |
| has_id_id4d | -0.001 | -0.0002 | 0.002 | 0.003 | 0.005* | 0.006* | 0.003 | |||
| (0.001) | (0.002) | (0.003) | (0.003) | (0.003) | (0.003) | (0.003) | ||||
| i_branches_A1_pop | 0.006** | 0.007** | 0.008 | 0.009 | ||||||
| (0.003) | (0.003) | (0.005) | (0.005) | |||||||
| i_ATMs_pop | 0.003** | 0.003** | 0.0004 | -0.0004 | 0.005** | 0.005** | ||||
| (0.001) | (0.002) | (0.002) | (0.003) | (0.002) | (0.002) | |||||
| received_wages | 0.043 | 0.104 | -0.279 | -0.303 | ||||||
| (0.355) | (0.400) | (0.799) | (0.811) | |||||||
| gov_effect | 0.112** | 0.095 | 0.368*** | 0.381*** | 0.404*** | 0.361*** | ||||
| (0.050) | (0.058) | (0.112) | (0.113) | (0.077) | (0.086) | |||||
| gdpg_2020e | -0.004 | 0.003 | -0.008 | 0.016 | 0.003 | |||||
| (0.006) | (0.006) | (0.011) | (0.012) | (0.010) | ||||||
| deaths | 0.0002** | -0.00003 | 0.0003* | 0.0002 | 0.00005 | 0.0002 | 0.0001 | |||
| (0.0001) | (0.0001) | (0.0002) | (0.0002) | (0.0002) | (0.0001) | (0.0001) | ||||
| Constant | 0.188 | 0.154*** | 0.128 | 0.658** | 0.577*** | 0.649** | 0.146 | 0.129 | 0.857*** | 0.623*** |
| (0.147) | (0.040) | (0.162) | (0.314) | (0.077) | (0.320) | (0.208) | (0.211) | (0.074) | (0.228) | |
| Observations | 76 | 85 | 72 | 58 | 62 | 57 | 62 | 64 | 61 | 61 |
| R2 | 0.420 | 0.080 | 0.428 | 0.414 | 0.092 | 0.446 | 0.251 | 0.242 | 0.385 | 0.398 |
| Adjusted R2 | 0.378 | 0.057 | 0.365 | 0.358 | 0.061 | 0.367 | 0.212 | 0.204 | 0.364 | 0.366 |
| Residual Std. Error | 0.224 (df = 70) | 0.268 (df = 82) | 0.229 (df = 64) | 0.374 (df = 52) | 0.450 (df = 59) | 0.373 (df = 49) | 0.413 (df = 58) | 0.411 (df = 60) | 0.367 (df = 58) | 0.366 (df = 57) |
| F Statistic | 10.125*** (df = 5; 70) | 3.558** (df = 2; 82) | 6.840*** (df = 7; 64) | 7.357*** (df = 5; 52) | 2.975* (df = 2; 59) | 5.636*** (df = 7; 49) | 6.473*** (df = 3; 58) | 6.395*** (df = 3; 60) | 18.187*** (df = 2; 58) | 12.557*** (df = 3; 57) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | |||||||||
Additional regressions specified in the email.
| Dependent variable: | |||||||
| total_fiscal | m21_bens_actual | ||||||
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | |
| gdpg_2020e | -0.035 | 0.001 | -0.0004 | ||||
| (0.051) | (0.005) | (0.005) | |||||
| tax_to_gdp | 0.072* | 0.083** | |||||
| (0.040) | (0.039) | ||||||
| i_ATMs_pop | 0.004*** | ||||||
| (0.001) | |||||||
| gov_effect | 0.625 | 0.596 | 0.194*** | 0.180*** | 0.079* | 0.190*** | 0.173*** |
| (0.422) | (0.418) | (0.044) | (0.039) | (0.044) | (0.049) | (0.045) | |
| deaths | 0.0001 | 0.0003 | 0.0001* | 0.0001* | 0.0001* | ||
| (0.001) | (0.001) | (0.0001) | (0.0001) | (0.0001) | |||
| has_id_id4d | 0.001 | 0.0004 | |||||
| (0.002) | (0.001) | ||||||
| Constant | 1.950** | 1.844** | 0.287*** | 0.274*** | 0.116** | 0.221* | 0.240* |
| (0.769) | (0.759) | (0.047) | (0.040) | (0.053) | (0.132) | (0.127) | |
| Observations | 78 | 79 | 84 | 88 | 93 | 87 | 88 |
| R2 | 0.138 | 0.132 | 0.260 | 0.267 | 0.320 | 0.235 | 0.267 |
| Adjusted R2 | 0.091 | 0.097 | 0.232 | 0.249 | 0.305 | 0.207 | 0.241 |
| Residual Std. Error | 2.104 (df = 73) | 2.100 (df = 75) | 0.243 (df = 80) | 0.238 (df = 85) | 0.226 (df = 90) | 0.245 (df = 83) | 0.239 (df = 84) |
| F Statistic | 2.916** (df = 4; 73) | 3.788** (df = 3; 75) | 9.373*** (df = 3; 80) | 15.444*** (df = 2; 85) | 21.197*** (df = 2; 90) | 8.506*** (df = 3; 83) | 10.212*** (df = 3; 84) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | ||||||
Additional regressions specified in the email.
| Dependent variable: | |||||||
| total_fiscal | m21_bens_actual | ||||||
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | |
| gdpg_2020e | -0.082 | -0.006 | -0.007 | ||||
| (0.071) | (0.009) | (0.009) | |||||
| tax_to_gdp | -0.046 | -0.011 | -0.004 | -0.002 | -0.004 | -0.001 | |
| (0.069) | (0.063) | (0.009) | (0.008) | (0.009) | (0.008) | ||
| i_ATMs_pop | 0.004** | ||||||
| (0.001) | |||||||
| gov_effect | 1.115 | 1.218* | 0.244*** | 0.251*** | 0.135 | 0.265*** | 0.270*** |
| (0.673) | (0.670) | (0.084) | (0.083) | (0.084) | (0.096) | (0.095) | |
| has_id_id4d | -0.001 | -0.001 | |||||
| (0.003) | (0.003) | ||||||
| deaths | 0.001 | 0.001* | 0.0001 | 0.0001 | -0.00003 | 0.0001 | 0.0001 |
| (0.001) | (0.001) | (0.0001) | (0.0001) | (0.0001) | (0.0001) | (0.0001) | |
| Constant | 3.460*** | 3.253*** | 0.482*** | 0.467*** | 0.306*** | 0.588** | 0.555** |
| (1.173) | (1.165) | (0.147) | (0.145) | (0.075) | (0.260) | (0.254) | |
| Observations | 42 | 42 | 43 | 43 | 43 | 43 | 43 |
| R2 | 0.205 | 0.176 | 0.244 | 0.234 | 0.332 | 0.249 | 0.237 |
| Adjusted R2 | 0.119 | 0.111 | 0.164 | 0.175 | 0.280 | 0.147 | 0.157 |
| Residual Std. Error | 2.010 (df = 37) | 2.019 (df = 38) | 0.253 (df = 38) | 0.251 (df = 39) | 0.234 (df = 39) | 0.255 (df = 37) | 0.254 (df = 38) |
| F Statistic | 2.382* (df = 4; 37) | 2.701* (df = 3; 38) | 3.058** (df = 4; 38) | 3.965** (df = 3; 39) | 6.455*** (df = 3; 39) | 2.447* (df = 5; 37) | 2.955** (df = 4; 38) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | ||||||
It’s hard to grasp the relationship between two variables from correlation coefficients alone, which is why I have drawn bivariate scatterplots for a few pairs of variables.
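As an illustration of this point, the sketch below uses R's built-in `anscombe` dataset (a standard teaching example, not the notebook's data). It contains four x-y pairs with nearly identical correlation coefficients even though the scatterplots look very different.

```r
# Anscombe's quartet ships with base R in the datasets package.
quartet <- datasets::anscombe

# Pearson correlation for each of the four x-y pairs.
cors <- sapply(1:4, function(i) {
  cor(quartet[[paste0("x", i)]], quartet[[paste0("y", i)]])
})
print(round(cors, 3))

# Plotting the four pairs side by side shows how different they are:
# op <- par(mfrow = c(2, 2))
# for (i in 1:4) plot(quartet[[paste0("x", i)]], quartet[[paste0("y", i)]])
# par(op)
```

All four correlations are about 0.816, yet only one of the pairs looks like a simple linear relationship with noise, which is the reason for plotting the data directly.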
library(rlang)
create_scatter <- function(x, y) {
  ggplot(infra, aes({{ x }}, {{ y }}, colour = .data[["m21_status"]], label = .data[["code"]])) +
    geom_point() +
    geom_text_repel()
}
print(create_scatter(gdpg_2020e, total_fiscal))
print(create_scatter(gdpg_2020e, m21_bens_actual))
print(create_scatter(gdpg_2020e, deaths))
print(create_scatter(log_gdp, govt_transfer))
print(create_scatter(govt_transfer, m21_bens_actual))
print(create_scatter(govt_transfer, m21_precovid))
print(create_scatter(i_ATMs_pop, m21_bens_actual))
print(create_scatter(gov_effect, m21_bens_actual))
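# Added-variable plots for gov_effect and i_ATMs_pop, each controlling for log_gdp
# (avPlot() comes from the car package)
for (var in c("gov_effect", "i_ATMs_pop")) {
  rx <- as.formula(str_c("m21_bens_actual ~ log_gdp + ", var))
  fit <- lm(rx, infra)
  avPlot(fit, variable = var)
}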
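An added-variable plot shows the outcome and the variable of interest after both have been purged of the other regressors. The slope of the line through that plot equals the variable's coefficient in the full regression (the Frisch-Waugh-Lovell result). The sketch below checks this by hand in base R on simulated data. The data frame `toy` and all values are hypothetical; only the variable names follow the loop above.

```r
set.seed(3)
n <- 100

# Simulated stand-in data in which gov_effect is correlated with log_gdp.
log_gdp <- rnorm(n, mean = 8, sd = 1)
gov_effect <- 0.6 * (log_gdp - 8) + rnorm(n, sd = 0.5)
m21_bens_actual <- 0.2 + 0.02 * log_gdp + 0.15 * gov_effect + rnorm(n, sd = 0.1)
toy <- data.frame(log_gdp, gov_effect, m21_bens_actual)

# Full regression with both covariates.
full <- lm(m21_bens_actual ~ log_gdp + gov_effect, data = toy)

# Residualise the outcome and the variable of interest on log_gdp.
res_y <- resid(lm(m21_bens_actual ~ log_gdp, data = toy))
res_x <- resid(lm(gov_effect ~ log_gdp, data = toy))

# Regressing residuals on residuals gives the added-variable slope.
av_fit <- lm(res_y ~ res_x)

print(c(
  full_model = coef(full)[["gov_effect"]],
  added_variable = coef(av_fit)[["res_x"]]
))
```

The two printed numbers agree up to floating-point error, so the added-variable plot can be read as a picture of the partial relationship that the multiple regression coefficient summarises.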