Censored Response Variable: The outcome variable is censored at 0. The summary statistics show a large number of zeros in the outcome variable. The data-generating process behind these zeros, whether random or the result of a decision process, needs to be analysed. Could two-step or hurdle models be explored further?
In a time-series framework, GDP lagged by two to three periods would be appropriate, reflecting the delay involved in the decision to undertake “legal migration”.
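To illustrate the logic behind a two-step or hurdle approach, the sketch below splits a zero-censored outcome into a participation part (is the flow positive?) and an intensity part (how large is it, given that it is positive?). The data and the function name `two_part_summary` are hypothetical and not taken from the paper; a full hurdle model would replace the two sample averages with regressions on covariates.

```python
def two_part_summary(y):
    """Decompose a zero-censored outcome into participation and intensity."""
    n = len(y)
    positives = [v for v in y if v > 0]
    p_positive = len(positives) / n            # step 1: P(y > 0)
    mean_if_positive = (                       # step 2: E[y | y > 0]
        sum(positives) / len(positives) if positives else 0.0
    )
    return {
        "share_zero": 1.0 - p_positive,
        "p_positive": p_positive,
        "mean_if_positive": mean_if_positive,
        # E[y] = P(y > 0) * E[y | y > 0] when y is never negative
        "implied_mean": p_positive * mean_if_positive,
    }


# Hypothetical migration flows with many zeros (illustration only).
flows = [0, 0, 0, 12, 0, 30, 0, 0, 6, 0]
summary = two_part_summary(flows)
print(summary)
```

If the zeros come from a decision process, the two parts would be expected to depend on different covariates, which is what a hurdle specification allows.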
Clustering of the Standard Errors: Since the time dimension spans a decade, the authors should discuss autocorrelation across time and space in the estimation of the standard errors, which they fail to do. Doesn't clustering standard errors at the individual level assume, by default, that there is no correlation across time and space? The paper by Abadie et al. could be a good reference on how to cluster standard errors when time and space matter. Alternatively, cluster-bootstrapped standard errors could be tried (caveat: the memory constraints of the computer).
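A minimal sketch of the cluster (pairs) bootstrap mentioned above, assuming a single regressor and simulated data: whole clusters are resampled with replacement, the slope is re-estimated on each resample, and the standard deviation of the draws is taken as the standard error. The cluster definition, the data, and the function names are hypothetical; the authors' model has more regressors and fixed effects.

```python
import random
import statistics


def ols_slope(pairs):
    """OLS slope of y on x (with intercept) for a list of (x, y) pairs."""
    mx = statistics.mean(x for x, _ in pairs)
    my = statistics.mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx


def cluster_bootstrap_se(data, n_boot=500, seed=0):
    """data maps cluster id -> list of (x, y); resample whole clusters."""
    rng = random.Random(seed)
    ids = list(data)
    draws = []
    for _ in range(n_boot):
        sampled = [rng.choice(ids) for _ in ids]   # clusters, with replacement
        pooled = [obs for cid in sampled for obs in data[cid]]
        draws.append(ols_slope(pooled))
    return statistics.stdev(draws)


# Simulated panel: 20 clusters, 10 observations each, with a common
# within-cluster shock that induces correlation inside each cluster.
gen = random.Random(1)
data = {}
for cid in range(20):
    shock = gen.gauss(0, 1)
    data[cid] = []
    for _ in range(10):
        x = gen.gauss(0, 1)
        data[cid].append((x, 2.0 * x + shock + gen.gauss(0, 0.5)))

se = cluster_bootstrap_se(data)
print(se)
```

Memory use grows with the number of replications times the sample size, which is the caveat raised above; each resample can be discarded once its estimate has been stored, as done here.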
Bilateral Migration Regression Variables: While the reasoning behind the authors' use of the interaction terms along with the “GTI_bilateral” index is understandable, wouldn’t it be more feasible to stick to the interactions, adding the intercept and slope effects of “unilateral” and “destination” countries, rather than including the “bilateral GTI” index?
- Isn’t \(\gamma_{ot} = \delta_{dt}\)? How do the authors distinguish between the two in order to include them separately in regression equation (2)?
Performance Measurement Issue: Why was the within R-squared used as the measure of performance? From the nonparametric work of Pagan & Ullah (1999) through to recent studies in machine learning, the accuracy of fitted values is measured using “mean absolute bias” and/or “root mean squared error”. R-squared is not an accurate measure for this purpose. Furthermore, the performance of fixed effects models is also evaluated on these measures (see Buddelmeyer et al. 2008).
- Pagan, A., & Ullah, A. (1999). Nonparametric econometrics. Cambridge University Press.
- Buddelmeyer, H., Jensen, P. H., Oguzoglu, U., & Webster, E. (2008). Fixed effects bias in panel data estimators. Available at SSRN 1136288. [http://ftp.iza.org/dp3487.pdf]
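For concreteness, the two accuracy measures named above can be computed from observed and fitted values as in the sketch below. It assumes that “mean absolute bias” refers to the average absolute deviation between fitted and observed values; the numbers are made up for illustration.

```python
import math


def mean_absolute_error(actual, fitted):
    """Average absolute deviation between observed and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def root_mean_squared_error(actual, fitted):
    """Square root of the average squared deviation."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)
    )


# Hypothetical observed and fitted outcomes.
actual = [3.0, 0.0, 5.0, 2.0]
fitted = [2.0, 1.0, 5.0, 4.0]

mae = mean_absolute_error(actual, fitted)
rmse = root_mean_squared_error(actual, fitted)
print(mae, rmse)
```

Both measures are in the units of the outcome, so they can be compared directly across specifications estimated on the same sample.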
Why was “regxfe” used? The authors have a manageable two-way fixed effects model, not a high-dimensional fixed effects model. The authors of “regxfe” state where and when it should be used: it is suited to three or more (up to 7) dimensions of fixed effects. A simple within estimator with time dummies would have sufficed to estimate the model.
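A sketch of the simpler alternative, assuming a balanced panel and a single regressor: subtracting unit means and period means and adding back the grand mean removes both sets of fixed effects, which for a balanced panel gives the same slope as the within estimator with time dummies. The panel, the coefficient value, and the function name are hypothetical.

```python
def two_way_within_slope(panel):
    """panel maps (unit, period) -> (x, y) and must be balanced."""
    units = sorted({u for u, _ in panel})
    periods = sorted({t for _, t in panel})

    def demean(idx):
        vals = {k: v[idx] for k, v in panel.items()}
        grand = sum(vals.values()) / len(vals)
        unit_mean = {
            u: sum(vals[(u, t)] for t in periods) / len(periods) for u in units
        }
        time_mean = {
            t: sum(vals[(u, t)] for u in units) / len(units) for t in periods
        }
        # Two-way within transformation: removes unit and period effects.
        return {
            k: vals[k] - unit_mean[k[0]] - time_mean[k[1]] + grand for k in vals
        }

    x = demean(0)
    y = demean(1)
    return sum(x[k] * y[k] for k in panel) / sum(x[k] ** 2 for k in panel)


# Hypothetical balanced panel: 4 units, 5 periods, true slope 1.5,
# plus unit effects (10 * u) and period effects (-3 * t), no noise.
panel = {}
for u in range(1, 5):
    for t in range(1, 6):
        x = float(u * t)
        panel[(u, t)] = (x, 1.5 * x + 10.0 * u - 3.0 * t)

beta = two_way_within_slope(panel)
print(beta)
```

With an unbalanced panel the double-demeaning shortcut no longer applies exactly, and explicit time dummies combined with unit demeaning would be needed instead.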