Discussion Paper:

Böhme, Gröger & Stöhr (JDE 2020)

Reference:

Böhme, M. H., Gröger, A., & Stöhr, T. (2020). Searching for a better life: Predicting international migration with online search keywords. Journal of Development Economics, 142, 102347.

Overview:

What to take away from this paper:

  • Migration is a heated topic.
  • Accurate measurement of “migration intention” is needed.
  • Real migration data is lagged.
  • “If our approach works, the signals extracted from the GTI should track actual migration flows relatively well.”
  • What Data Has Been Used: Google Trends data, used to build the main predictor, the “GTI” index. Overall, the following data are used: OECD IMD 2004–2015, World Development Indicators, Google Trends, Polity IV, State Fragility Index, and EM-DAT International Disasters Database.

Conclusions of The Paper

  • The authors conclude, assertively, that their GTI (Google Trends Index) measure outperforms the alternatives and is the best predictor of migration flows. [I doubt this, as such assertive statements are not tested against other tools. Here comes, knocking loudly, Machine Learning.]
  • Important to keep in mind: this paper is a discussion of prediction and predictors, not of “causality”.
  • This paper could improve vastly through the application of Machine Learning tools, since its goal is prediction. The GTI variable used in the model is, in my opinion, trading a higher variance for a low bias. ML tools can help in addressing this over-fitting issue.

Discussion Thoughts:

Main Issue:

  • I would have liked to see the “overall” and “within” correlations between GTI and the Migration Flow variable. GTI may be a near-perfect predictor simply because of the sample and the pool of “migrants”.
    • When predictors are highly correlated, the answers you get depend on the predictors in the model. See: https://online.stat.psu.edu/stat462/node/179/
    • When predictor variables are correlated, the estimated regression coefficient of any one variable depends on which other predictor variables are included in the model.
    • When predictor variables are correlated, the precision of the estimated regression coefficients decreases as more predictor variables are added to the model.
  • A test of the assumptions of the “linear fixed-effects model”, namely homoscedasticity and normally distributed errors, would be a good check.
  • Have the authors checked for outliers or high-leverage observations? Suggestion: try a “jackknife” estimation or some outlier test after the regression.
  • The authors should check for “spatial autocorrelation”, as its presence might hamper inference.
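
To make the first point concrete, here is a minimal sketch of how the “overall” and “within” correlations can differ. The numbers are made up purely for illustration, and the variable names (`origin`, `gti`, `flow`) are my own, not the paper's. The “within” correlation is computed after subtracting each origin country's own mean from both series.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def demean_by_group(groups, values):
    """Subtract each group's own mean (the 'within' transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]


# Toy panel (made-up numbers): three origins, four years each.
origin = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
gti = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0, 21.0, 22.0, 23.0, 24.0]
flow = [4.0, 3.0, 2.0, 1.0, 14.0, 13.0, 12.0, 11.0, 24.0, 23.0, 22.0, 21.0]

overall_corr = pearson(gti, flow)
within_corr = pearson(demean_by_group(origin, gti),
                      demean_by_group(origin, flow))
print(f"overall: {overall_corr:.3f}, within: {within_corr:.3f}")
```

In this toy panel the overall correlation is strongly positive because countries differ in levels, while within each country the two series move in opposite directions. Reporting both would show how much of GTI's apparent fit comes from cross-country level differences.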

Introduction Critique: Replicability Issue.

  • How is Migration defined: What’s the scope of “migration”? This raises the question of heterogeneity in migration type.
  • Measurement error in the aggregation of daily indices to annual ones.
  • Semantic keywords are extracted from Wikipedia articles on migration? We could do a bit better with journal articles, or use the “Thomson Reuters API” to extract keywords from news.
  • Semantic Link is not a valid source, as I was not able to find related words for “migration” or “women”. More appropriate would have been word2vec, using word embeddings and “parallelogram” word analogies.
  • Moreover, this is risky, as the “semantics” corpus is deeply ingrained with gender, racial, and other biases.
  • 1-grams are a possible source of measurement error for “Migration”. The authors should also combine possible 2-grams and 3-grams for better accuracy and robustness, and deal with the resulting sparse-matrix adjustment.
  • Excluding languages biases the “generalization” and other assertive statements in the introduction. The research should have delved deeper into other languages.
  • The language choice comes through the use of “proprietary data”, the very kind of source (“Twitter”, etc.) that the authors railed against in the introduction.
  • The authors used singulars and plurals. But won’t that “double count” the effect of a word, e.g. “job” and “jobs”? The singular subsumes the plural.
  • Keywords Issue:
    • The list of keywords includes words that could have many different applications: benefits, arrival?
    • Surprisingly, education or schooling doesn’t show up as one of the keywords. Wouldn’t that be reflective of migration?
  • The authors aggregate the keyword intensities by saying that they “capture their joint intensity”. How? By summing them up? Is that a source of measurement error?
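
The n-gram and singular/plural points above can be sketched as follows. This is only an illustration under my own assumptions: the queries are hypothetical (not taken from the paper's keyword list), and `naive_singular` is a deliberately crude English-only rule, not a proper lemmatizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def naive_singular(word):
    """Crude plural stripping (assumption: English, regular plurals only)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


# Hypothetical search queries, for illustration only.
queries = ["work visa requirements", "jobs abroad", "job abroad", "visa"]

counts = Counter()
for q in queries:
    # Merge singular and plural forms before counting, so that
    # "job" and "jobs" are not counted as two separate keywords.
    tokens = [naive_singular(t) for t in q.lower().split()]
    for n in (1, 2, 3):
        counts.update(ngrams(tokens, n))

for term, c in counts.most_common(5):
    print(term, c)
```

Counting 2-grams and 3-grams keeps context that a bare 1-gram such as “visa” loses, and merging word forms first avoids the double counting raised above.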

Critique of the “Pitfalls” mentioned in the paper:

  • Pitfall 1: What counts as “medium”? What is medium for the authors today could be small in the future. The argument is not convincing enough. The paper could explore a larger feature set and use more ML methods for “dimension reduction” that capture the large effects in a small number of dimensions.
  • Pitfall 2: Changes in the Google search algorithm could make the methodology “not consistent” over time, something that is treated strictly in AERs.
  • Since we are looking at migration through a “legal channel”,

On The Data Descriptive Statistics

  • FSI score: the lower the better. So the origin countries you are looking at are “well to do”. Isn’t that a bit of a problem for the analysis? Worse-off countries have a higher index. Does your interpretation change?

  • Polity score: The authors should have used the POLITY2 variable, which standardizes the raw polity score to the conventional range of -10 to +10. What the authors have used is, as can be seen from the “min”, affected by extreme values on the negative side. There are 3 special categories (-88, -77 and -66), which are special standardized codes used in the raw polity scores to represent “complex transitions” in regimes. Therefore, the authors should revise their estimates using the “conventional level”. See page 8 and page 17 of the Polity IV Manual Document.

  • Which 101 origin countries are analysed? Based on the authors’ Fig 1, if Spain and Australia are OECD countries, and hence destinations, how are they also origin countries in Fig 1? Are origin and destination countries mutually exclusive?

  • Critique of Fig 1: would the analysis stand up to fitted values that control for time, given that only origin-country dummies are controlled for? What does the figure look like if your fitted values are generated after you control for time variation and then add GTI to that specification? Would you still see GTI tracking the migration that is not explained by time variation? Fig 1 shows that GTI tracks migration really well in a one-way fixed effects model that excludes time effects. Is GTI picking up the time noise?
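
On the Polity point above, a small sketch of why the special codes matter for the descriptive statistics. The values are made up, and the recode is a simplification based on my reading of the Polity IV manual: POLITY2 sets -77 to a neutral 0, treats -66 as missing, and prorates -88 over the transition span. The proration is not implemented here; -88 is simply treated as missing.

```python
from statistics import mean


def clean_polity(score):
    """Map a raw POLITY value to a usable score or None (simplified recode)."""
    if score == -77:
        return 0      # interregnum: neutral score (assumed POLITY2 rule)
    if score in (-66, -88):
        return None   # simplification: treated as missing in this sketch
    return score


raw = [8, 10, -88, -77, 6, -66, -3]  # made-up values for illustration
cleaned = [clean_polity(s) for s in raw]
valid = [s for s in cleaned if s is not None]

print("raw min/mean:    ", min(raw), round(mean(raw), 2))
print("cleaned min/mean:", min(valid), round(mean(valid), 2))
```

Even a handful of special codes drags the minimum and the mean far outside the -10 to +10 range, which is the pattern visible in the paper's “min”.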

Econometrics Issue:

  • Censored Response Variable: The outcome variable is censored at 0. Looking at the summary statistics, we can see that there is a large number of zeros in the outcome variable. The data generating process behind these zeros, whether random or a decision process, needs to be analysed. Could two-step or hurdle models be explored further?

  • In a time series framework, GDP lagged by two to three periods would be appropriate, since decisions to undertake “legal migration” are made with a lag.

  • Clustering of the Standard Errors: Since the time dimension spans a decade, the authors fail to discuss autocorrelation across time and space in the estimation of the standard errors. Clustering standard errors at the individual level allows for correlation within a unit over time but assumes no correlation across units, i.e. across space. The paper by Abadie et al. could be a good reference on what to do with clustered standard errors when time and space matter. Alternatively, cluster-bootstrapped standard errors could be tried (caveat: the computer’s memory constraints).

  • Bilateral Migration Regression Variables: I understand the reasoning behind the authors’ use of the interaction terms along with the “GTI_bilateral” index, but wouldn’t it be more feasible to just stick to the interactions, adding the intercept and slope effects of the “unilateral” and “destination” countries, rather than including the “bilateral GTI” index?

    • Aren’t \(\gamma_{ot} = \delta_{dt}\)? How do the authors distinguish between the two in order to include them separately in regression equation (2)?
  • Performance Measurement Issue: Why was the within R-squared used as the performance measure? From the work of Pagan & Ullah (1999), which began in nonparametrics, to recent studies in Machine Learning, the accuracy of fitted values is measured using “mean absolute bias” and/or “root mean squared error”. R-squared is not the accurate measure. Furthermore, the performance of fixed effects models is also evaluated on these measures (see Buddelmeyer et al. 2008).

    • Pagan, A., & Ullah, A. (1999). Nonparametric econometrics. Cambridge University Press.
    • Buddelmeyer, H., Jensen, P. H., Oguzoglu, U., & Webster, E. (2008). Fixed effects bias in panel data estimators. Available at SSRN 1136288. [http://ftp.iza.org/dp3487.pdf]
  • Why was “regxfe” used? The authors have a manageable two-way fixed effects model, not a high-dimensional one. The authors of “regxfe” highlight where and when to use it: it is suitable for 3 or more (up to 7) dimensions of fixed effects. A simple within estimator with time dummies would have sufficed to estimate the model.
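
As a rough illustration of the last two points, here is a sketch of a two-way within estimator on a balanced panel with a single regressor, with fit summarised by RMSE and MAE instead of R-squared. All data are simulated under my own assumptions (a true slope of 2, additive origin and year effects); nothing here reproduces the paper's specification. On a balanced panel, the two-way demeaning is equivalent to including origin and year dummies.

```python
from math import sqrt


def two_way_demean(panel):
    """Two-way within transformation for a balanced panel.

    panel[i][t] is the value for unit i in period t; the result is
    x_it - unit mean_i - time mean_t + grand mean.
    """
    n, T = len(panel), len(panel[0])
    unit_means = [sum(row) / T for row in panel]
    time_means = [sum(panel[i][t] for i in range(n)) / n for t in range(T)]
    grand = sum(unit_means) / n
    return [[panel[i][t] - unit_means[i] - time_means[t] + grand
             for t in range(T)] for i in range(n)]


# Made-up balanced panel: 3 origins x 4 years.
# flow = 2 * gti + origin effect + year effect + small noise
gti = [[1.0, 2.0, 4.0, 3.0],
       [2.0, 1.0, 3.0, 5.0],
       [4.0, 3.0, 1.0, 2.0]]
origin_fx = [10.0, 20.0, 30.0]
year_fx = [0.0, 1.0, 2.0, 3.0]
noise = [[0.1, -0.1, 0.0, 0.0],
         [-0.1, 0.1, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
flow = [[2.0 * gti[i][t] + origin_fx[i] + year_fx[t] + noise[i][t]
         for t in range(4)] for i in range(3)]

# Flatten the demeaned panels and run OLS without an intercept.
x = [v for row in two_way_demean(gti) for v in row]
y = [v for row in two_way_demean(flow) for v in row]

beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
resid = [b - beta * a for a, b in zip(x, y)]
rmse = sqrt(sum(e * e for e in resid) / len(resid))
mae = sum(abs(e) for e in resid) / len(resid)
print(f"beta={beta:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

These are in-sample errors. For a paper whose goal is prediction, the same two measures computed on held-out years would be the more convincing comparison.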

Mechanisms:

  • Is there a near-perfect correlation between “GTI” and “Migration Flow”? That is, does GTI have high predictive power because of high correlation? Analogously, if I ask you “do you work?” and “do you earn a wage?”, there would be a high correlation: you can’t have a wage if you don’t work. So, in this paper, migration is implicitly happening through legal channels.

Heterogeneity:

  • Land-locked countries and bordered countries?