The tables below regress coverage, coverage minus transfer (cov_minus_transfer), and program status as of the end of 2020 (status_binary) on a range of explanatory variables.
Analyze missing data
The tables below show which of the variables used in the main analyses still have missing data. I also list the countries for which we are missing social registry data, because that variable is causing us to drop a few countries.


## [1] "Countries for which we are missing social registry data"
## character(0)
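
The missing-data check described above can be sketched as follows. This is only an illustration: the data frame `infra_demo`, its values, and the choice of three variables are made up here, and the report's actual code may differ.

```r
# Hypothetical stand-in for the analysis data; all values are invented.
infra_demo <- data.frame(
  country         = c("A", "B", "C", "D"),
  m21_bens_actual = c(0.40, NA, 0.15, 0.62),
  high_vis        = c(0.30, 0.55, NA, 0.70),
  soc_reg_dummy   = c(1, 0, 1, NA),
  stringsAsFactors = FALSE
)

# A few of the variable names that appear in the regression tables.
main_vars <- c("m21_bens_actual", "high_vis", "soc_reg_dummy")

# Count missing values per variable and keep only variables with gaps.
n_missing <- colSums(is.na(infra_demo[main_vars]))
n_missing <- n_missing[n_missing > 0]
print(n_missing)

# List the countries with no social registry data.
missing_soc_reg <- infra_demo$country[is.na(infra_demo$soc_reg_dummy)]
print("Countries for which we are missing social registry data")
print(missing_soc_reg)
```

When no country is missing the variable, the last line prints `character(0)`, as in the output above.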
Graphs for May 3rd


New regressions from April 29th
Coverage / invisible as the dependent variable

| | cov_over_invis |
| --- | --- |
| high_vis | 2.006*** |
| | (0.394) |
| gdpg_2020e | -0.009 |
| | (0.014) |
| soc_reg_new | -0.154 |
| | (0.284) |
| Constant | 0.023 |
| | (0.126) |
| Observations | 85 |
| R2 | 0.298 |
| Adjusted R2 | 0.272 |
| Residual Std. Error | 0.731 (df = 81) |
| F Statistic | 11.475*** (df = 3; 81) |

Note: * p<0.1; ** p<0.05; *** p<0.01

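
As a quick consistency check on the summary rows of the table above, the adjusted R2 and F statistic can be recomputed from the reported R2, the number of observations, and the number of regressors. This is a sketch using the rounded values from the table, so the results should land close to, but not exactly on, the reported 0.272 and 11.475.

```r
# Summary rows copied from the table above.
n  <- 85      # Observations
k  <- 3       # regressors: high_vis, gdpg_2020e, soc_reg_new
r2 <- 0.298   # R2 as reported (rounded to three decimals)

# Residual degrees of freedom: observations minus regressors minus intercept.
df_resid <- n - k - 1

# Standard OLS formulas for adjusted R2 and the overall F statistic.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
f_stat <- (r2 / k) / ((1 - r2) / df_resid)

print(round(c(df_resid = df_resid, adj_r2 = adj_r2, f_stat = f_stat), 3))
```

Any small gap in the F statistic is expected, because the input R2 is itself rounded.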
Coverage as dependent variable

| | m21_bens_actual |
| --- | --- |
| govt_transfer | 1.474*** |
| | (0.374) |
| pw_no_t | 0.065 |
| | (0.453) |
| has_id_id4d | 0.028 |
| | (0.172) |
| soc_reg_new | 0.119 |
| | (0.109) |
| log_gdp | 0.083 |
| | (0.052) |
| dig_id | -0.010 |
| | (0.087) |
| Constant | -0.674* |
| | (0.346) |
| Observations | 66 |
| R2 | 0.458 |
| Adjusted R2 | 0.403 |
| Residual Std. Error | 0.232 (df = 59) |
| F Statistic | 8.317*** (df = 6; 59) |

Note: * p<0.1; ** p<0.05; *** p<0.01

Coverage as dependent variable and instrumenting for non-health fiscal

Dependent variable: m21_bens_actual

| | (1) | (2) |
| --- | --- | --- |
| govt_transfer | 0.848* | 0.777 |
| | (0.468) | (0.492) |
| pw_no_t | -0.383 | -0.490 |
| | (0.555) | (0.599) |
| i_ATMs_pop | 0.004** | 0.004* |
| | (0.002) | (0.002) |
| has_id_id4d | -0.109 | -0.108 |
| | (0.177) | (0.177) |
| soc_reg_new | 0.279** | 0.253** |
| | (0.120) | (0.121) |
| log_gdp | | 0.050 |
| | | (0.069) |
| non_health_fiscal | 0.051 | 0.050 |
| | (0.055) | (0.052) |
| gov_effect | -0.024 | -0.045 |
| | (0.072) | (0.082) |
| Constant | 0.049 | -0.335 |
| | (0.198) | (0.581) |
| Observations | 63 | 63 |
| R2 | 0.436 | 0.447 |
| Adjusted R2 | 0.364 | 0.365 |
| Residual Std. Error | 0.239 (df = 55) | 0.239 (df = 54) |

Note: * p<0.1; ** p<0.05; *** p<0.01

Coverage regressions

Dependent variable: m21_bens_actual

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| medium_vis | 0.053 | -0.100 | | | 0.066 | 0.022 | | | 0.295 | 0.292 | | |
| | (0.212) | (0.211) | | | (0.208) | (0.202) | | | (0.182) | (0.179) | | |
| high_vis | | | 0.526*** | 0.602*** | | | 0.495*** | 0.504*** | | | 0.112 | 0.113 |
| | | | (0.139) | (0.138) | | | (0.138) | (0.140) | | | (0.164) | (0.167) |
| deaths | 0.0003*** | | 0.0001 | | 0.0002* | | 0.0001 | | 0.00001 | | -0.00002 | |
| | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | |
| gdpg_2020e | | -0.009* | | -0.001 | | -0.009* | | -0.002 | | -0.001 | | 0.001 |
| | | (0.005) | | (0.005) | | (0.005) | | (0.005) | | (0.005) | | (0.005) |
| i_ATMs_pop | | | | | | | | | 0.006*** | 0.006*** | 0.005*** | 0.005*** |
| | | | | | | | | | (0.001) | (0.001) | (0.001) | (0.001) |
| soc_reg_dummy | | | | | 0.202** | 0.273*** | 0.165* | 0.187** | 0.118 | 0.122 | 0.120 | 0.111 |
| | | | | | (0.092) | (0.083) | (0.086) | (0.079) | (0.080) | (0.075) | (0.081) | (0.076) |
| Constant | 0.173*** | 0.212*** | 0.079* | 0.084* | 0.160*** | 0.155*** | 0.076* | 0.074* | 0.013 | 0.012 | 0.054 | 0.055 |
| | (0.048) | (0.050) | (0.043) | (0.045) | (0.048) | (0.050) | (0.042) | (0.044) | (0.048) | (0.049) | (0.040) | (0.041) |
| Observations | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 |
| R2 | 0.108 | 0.041 | 0.240 | 0.220 | 0.158 | 0.155 | 0.273 | 0.271 | 0.396 | 0.396 | 0.379 | 0.379 |
| Adjusted R2 | 0.086 | 0.018 | 0.221 | 0.201 | 0.126 | 0.124 | 0.246 | 0.244 | 0.365 | 0.365 | 0.348 | 0.348 |
| Residual Std. Error | 0.284 (df = 82) | 0.294 (df = 82) | 0.262 (df = 82) | 0.265 (df = 82) | 0.277 (df = 81) | 0.278 (df = 81) | 0.258 (df = 81) | 0.258 (df = 81) | 0.236 (df = 80) | 0.236 (df = 80) | 0.240 (df = 80) | 0.240 (df = 80) |
| F Statistic | 4.948*** (df = 2; 82) | 1.765 (df = 2; 82) | 12.939*** (df = 2; 82) | 11.563*** (df = 2; 82) | 5.053*** (df = 3; 81) | 4.948*** (df = 3; 81) | 10.123*** (df = 3; 81) | 10.014*** (df = 3; 81) | 13.087*** (df = 4; 80) | 13.086*** (df = 4; 80) | 12.219*** (df = 4; 80) | 12.202*** (df = 4; 80) |

Note: * p<0.1; ** p<0.05; *** p<0.01

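
The twelve columns of the coverage regressions follow a regular pattern: one visibility variable (medium_vis or high_vis), one need variable (deaths or gdpg_2020e), and then either nothing extra, soc_reg_dummy, or soc_reg_dummy plus i_ATMs_pop. A minimal sketch of how such a grid of specifications could be generated is below. The data frame `demo` is synthetic, and the loop is an assumption about how one might build the models, not the report's actual code.

```r
# Building blocks of the 12 specifications, read off the table layout.
vis_terms   <- c("medium_vis", "high_vis")
need_terms  <- c("deaths", "gdpg_2020e")
extra_terms <- list(character(0),
                    "soc_reg_dummy",
                    c("i_ATMs_pop", "soc_reg_dummy"))

# expand.grid varies its first argument fastest, which reproduces the
# column order: need within visibility within the block of extra controls.
specs <- expand.grid(need  = need_terms,
                     vis   = vis_terms,
                     extra = seq_along(extra_terms),
                     stringsAsFactors = FALSE)

# One formula per column of the table.
formulas <- lapply(seq_len(nrow(specs)), function(i) {
  rhs <- c(specs$vis[i], specs$need[i], extra_terms[[specs$extra[i]]])
  reformulate(rhs, response = "m21_bens_actual")
})

# Synthetic data with the same variable names (values are invented).
set.seed(1)
n <- 85
demo <- data.frame(
  medium_vis    = runif(n),
  high_vis      = runif(n),
  deaths        = rpois(n, lambda = 200),
  gdpg_2020e    = rnorm(n, mean = -4, sd = 5),
  i_ATMs_pop    = runif(n, min = 0, max = 100),
  soc_reg_dummy = rbinom(n, size = 1, prob = 0.5)
)
demo$m21_bens_actual <- 0.1 + 0.4 * demo$high_vis + rnorm(n, sd = 0.2)

# Fit all specifications and report the residual degrees of freedom.
fits <- lapply(formulas, lm, data = demo)
print(sapply(fits, df.residual))
```

Swapping the `response` argument for cov_minus_transfer or status_binary would give the same grid for the other two outcomes.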
Coverage - transfer regressions

Dependent variable: cov_minus_transfer

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| medium_vis | -0.154 | -0.278 | | | -0.141 | -0.169 | | | -0.015 | -0.018 | | |
| | (0.179) | (0.177) | | | (0.173) | (0.168) | | | (0.167) | (0.164) | | |
| high_vis | | | 0.311** | 0.399*** | | | 0.276** | 0.290** | | | 0.038 | 0.041 |
| | | | (0.123) | (0.123) | | | (0.120) | (0.122) | | | (0.149) | (0.152) |
| deaths | 0.0002** | | 0.0001* | | 0.0001 | | 0.0001 | | 0.00001 | | 0.00001 | |
| | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | | (0.0001) | |
| gdpg_2020e | | -0.005 | | -0.001 | | -0.005 | | -0.001 | | 0.00001 | | 0.0002 |
| | | (0.005) | | (0.005) | | (0.004) | | (0.004) | | (0.004) | | (0.004) |
| i_ATMs_pop | | | | | | | | | 0.003*** | 0.003*** | 0.003** | 0.003** |
| | | | | | | | | | (0.001) | (0.001) | (0.001) | (0.001) |
| soc_reg_dummy | | | | | 0.203*** | 0.243*** | 0.185** | 0.208*** | 0.157** | 0.160** | 0.157** | 0.159** |
| | | | | | (0.077) | (0.069) | (0.075) | (0.069) | (0.073) | (0.069) | (0.073) | (0.069) |
| Constant | 0.102** | 0.139*** | 0.018 | 0.024 | 0.089** | 0.088** | 0.015 | 0.014 | 0.007 | 0.009 | 0.001 | 0.002 |
| | (0.041) | (0.042) | (0.038) | (0.040) | (0.040) | (0.042) | (0.037) | (0.038) | (0.044) | (0.045) | (0.036) | (0.037) |
| Observations | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 |
| R2 | 0.110 | 0.051 | 0.167 | 0.134 | 0.182 | 0.178 | 0.226 | 0.222 | 0.283 | 0.283 | 0.284 | 0.283 |
| Adjusted R2 | 0.089 | 0.028 | 0.147 | 0.113 | 0.151 | 0.148 | 0.197 | 0.193 | 0.247 | 0.247 | 0.248 | 0.248 |
| Residual Std. Error | 0.239 (df = 82) | 0.247 (df = 82) | 0.231 (df = 82) | 0.236 (df = 82) | 0.231 (df = 81) | 0.231 (df = 81) | 0.224 (df = 81) | 0.225 (df = 81) | 0.217 (df = 80) | 0.217 (df = 80) | 0.217 (df = 80) | 0.217 (df = 80) |
| F Statistic | 5.087*** (df = 2; 82) | 2.221 (df = 2; 82) | 8.248*** (df = 2; 82) | 6.371*** (df = 2; 82) | 5.988*** (df = 3; 81) | 5.854*** (df = 3; 81) | 7.862*** (df = 3; 81) | 7.690*** (df = 3; 81) | 7.895*** (df = 4; 80) | 7.891*** (df = 4; 80) | 7.914*** (df = 4; 80) | 7.911*** (df = 4; 80) |

Note: * p<0.1; ** p<0.05; *** p<0.01

Status end 2020 regressions

Dependent variable: status_binary

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| medium_vis | -0.266 | -0.399 | | | -0.238 | -0.181 | | | 0.096 | 0.165 | | |
| | (0.360) | (0.344) | | | (0.346) | (0.326) | | | (0.316) | (0.308) | | |
| high_vis | | | 1.007*** | 1.012*** | | | 0.940*** | 0.820*** | | | 0.465* | 0.388 |
| | | | (0.230) | (0.226) | | | (0.224) | (0.226) | | | (0.278) | (0.282) |
| deaths | 0.0004** | | 0.0001 | | 0.0002 | | -0.00002 | | -0.0001 | | -0.0001 | |
| | (0.0001) | | (0.0001) | | (0.0002) | | (0.0001) | | (0.0002) | | (0.0001) | |
| gdpg_2020e | | -0.022** | | -0.009 | | -0.021** | | -0.011 | | -0.010 | | -0.008 |
| | | (0.009) | | (0.009) | | (0.008) | | (0.008) | | (0.008) | | (0.008) |
| i_ATMs_pop | | | | | | | | | 0.008*** | 0.007*** | 0.006*** | 0.005** |
| | | | | | | | | | (0.002) | (0.002) | (0.002) | (0.002) |
| soc_reg_dummy | | | | | 0.427*** | 0.488*** | 0.361** | 0.365*** | 0.304** | 0.295** | 0.306** | 0.281** |
| | | | | | (0.153) | (0.133) | (0.140) | (0.128) | (0.139) | (0.130) | (0.137) | (0.128) |
| Constant | 0.513*** | 0.515*** | 0.276*** | 0.266*** | 0.485*** | 0.411*** | 0.270*** | 0.248*** | 0.270*** | 0.229*** | 0.243*** | 0.227*** |
| | (0.082) | (0.081) | (0.072) | (0.073) | (0.079) | (0.081) | (0.069) | (0.070) | (0.084) | (0.085) | (0.067) | (0.069) |
| Observations | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 | 85 |
| R2 | 0.093 | 0.095 | 0.260 | 0.262 | 0.173 | 0.224 | 0.316 | 0.329 | 0.353 | 0.364 | 0.374 | 0.376 |
| Adjusted R2 | 0.071 | 0.073 | 0.242 | 0.244 | 0.142 | 0.195 | 0.291 | 0.305 | 0.320 | 0.332 | 0.343 | 0.345 |
| Residual Std. Error | 0.481 (df = 82) | 0.480 (df = 82) | 0.434 (df = 82) | 0.434 (df = 82) | 0.462 (df = 81) | 0.447 (df = 81) | 0.420 (df = 81) | 0.416 (df = 81) | 0.411 (df = 80) | 0.408 (df = 80) | 0.404 (df = 80) | 0.404 (df = 80) |
| F Statistic | 4.228** (df = 2; 82) | 4.312** (df = 2; 82) | 14.421*** (df = 2; 82) | 14.522*** (df = 2; 82) | 5.638*** (df = 3; 81) | 7.787*** (df = 3; 81) | 12.480*** (df = 3; 81) | 13.267*** (df = 3; 81) | 10.900*** (df = 4; 80) | 11.440*** (df = 4; 80) | 11.949*** (df = 4; 80) | 12.070*** (df = 4; 80) |

Note: * p<0.1; ** p<0.05; *** p<0.01

Regression tree approach
Rather than using a linear model with no interactions, the code below fits a simple regression tree using all of the variables above except log GDP per capita. The final output reads as follows: the model predicts coverage of 63% for countries with low_vis < 53% and GDP growth higher than -9.5%. The model is interesting in its own right, since it implies that the most important predictor is the share of adults who are low vis and that the next most important predictors are the variables related to need (deaths and GDP growth).

The model also gives a rough sense of how accurate a more complex model, one that allows interactions between variables and higher-order terms, could be. In the CP table below, the final tree (three splits) has a relative error of 0.44, which is an in-sample R squared of about 0.56, and a cross-validated relative error (xerror) of 0.74, which is a cross-validated R squared of about 0.26.
##
## Regression tree:
## rpart(formula = m21_bens_actual ~ govt_transfer + low_vis + medium_vis +
## high_vis + i_ATMs_pop + gdpg_2020e + deaths + soc_reg_dummy,
## data = infra, method = "anova", control = list(maxdepth = 2))
##
## Variables actually used in tree construction:
## [1] deaths gdpg_2020e low_vis
##
## Root node error: 7.5647/89 = 0.084996
##
## n= 89
##
## CP nsplit rel error xerror xstd
## 1 0.357357 0 1.00000 1.02102 0.14777
## 2 0.122924 1 0.64264 0.83608 0.13961
## 3 0.079975 2 0.51972 0.80225 0.13021
## 4 0.010000 3 0.43974 0.73860 0.12998
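
The R squared figures can be read directly off the CP table: rpart reports errors relative to the root node, so 1 minus `rel error` is the in-sample R squared and 1 minus `xerror` is the cross-validated one. The sketch below first applies this to the last row of the table above, then shows the same calculation on a fitted tree. The data frame `demo` is synthetic and only mimics the variable names, so its numbers will not match the report.

```r
library(rpart)

# Last row of the CP table printed above (3 splits).
report_cp <- c(rel_error = 0.43974, xerror = 0.73860)
print(round(1 - report_cp, 3))   # in-sample and cross-validated R squared

# Synthetic data with a strong low_vis effect (values are invented).
set.seed(42)
n <- 89
demo <- data.frame(
  low_vis    = runif(n),
  gdpg_2020e = rnorm(n, mean = -4, sd = 5),
  deaths     = rpois(n, lambda = 300)
)
demo$m21_bens_actual <- ifelse(demo$low_vis < 0.5, 0.6, 0.2) + rnorm(n, sd = 0.1)

# Same kind of shallow regression tree as in the call above.
tree <- rpart(m21_bens_actual ~ low_vis + gdpg_2020e + deaths,
              data = demo, method = "anova",
              control = rpart.control(maxdepth = 2))

# The fitted object stores the CP table; use its last row (the final tree).
cp   <- tree$cptable
last <- nrow(cp)
r2_in_sample <- 1 - cp[last, "rel error"]
r2_cv        <- 1 - cp[last, "xerror"]
print(c(in_sample = r2_in_sample, cross_validated = r2_cv))
```

Because `xerror` comes from random cross-validation folds, the cross-validated figure changes somewhat from run to run unless the seed is fixed.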
