I replicate part of the analysis in J. Vernon Henderson, Adam Storeygard, and David N. Weil, “Measuring Economic Growth from Outer Space,” American Economic Review 102, no. 2 (2012): 994-1028. Countries report their GDP to the World Bank, and the figures enter the World Bank database after some verification steps. However, some governments have incentives to exaggerate their GDP, and they tend to report inflated numbers to the World Bank. To address this problem, the authors predict countries' GDP growth from the nighttime light emissions observed from space. A large discrepancy between reported and predicted GDP growth suggests that a government may be misreporting its numbers.

library(tidyverse) # also attaches dplyr and ggplot2, so separate library() calls are not needed
aer_gdp_space <- read.csv("C:/Users/Seoyun1009/Downloads/aer_gdp_space (1).csv")
# Create gdp_growth and light_growth: percent change relative to the prior value
aer_gdp_space <- aer_gdp_space %>%
  mutate(
    gdp_growth = 100 * (gdp - prior_gdp) / prior_gdp,
    light_growth = 100 * (light - prior_light) / prior_light
  )
# Run a linear regression of GDP growth on light growth
model <- lm(gdp_growth ~ light_growth, data = aer_gdp_space)
summary(model)
## 
## Call:
## lm(formula = gdp_growth ~ light_growth, data = aer_gdp_space)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -67.12 -22.32  -1.08  14.36 190.05 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  49.82024    3.48742  14.286  < 2e-16 ***
## light_growth  0.25460    0.03815   6.674 3.47e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 37.16 on 168 degrees of freedom
## Multiple R-squared:  0.2096, Adjusted R-squared:  0.2048 
## F-statistic: 44.54 on 1 and 168 DF,  p-value: 3.469e-10

Coefficient of light_growth (0.2546): On average, GDP growth is 0.25 percentage points higher for each additional percentage point of nighttime light growth.
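
As a quick worked illustration of the fitted line, the sketch below plugs a few light-growth values into the estimated equation, gdp_growth = 49.82 + 0.2546 × light_growth. The two coefficients are copied from the regression output above; the light-growth values and the helper name `predict_gdp_growth` are hypothetical and chosen only for illustration.

```r
# Coefficients copied from the summary(model) output above
intercept <- 49.82024
slope     <- 0.25460

# Hypothetical helper: fitted GDP growth (%) for a given light growth (%)
predict_gdp_growth <- function(light_growth) {
  intercept + slope * light_growth
}

# Illustrative (made-up) light-growth values, in percent
example_light_growth <- c(0, 50, 100)
predictions <- predict_gdp_growth(example_light_growth)

print(data.frame(light_growth = example_light_growth,
                 predicted_gdp_growth = predictions))
```

A country with no change in nighttime light is predicted to have GDP growth of about 49.8%, which is the intercept, and each additional 50 points of light growth adds about 12.7 points to the prediction.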

p-value < 0.001: The relationship between nighttime light growth and GDP growth is statistically significant, so we reject the null hypothesis that the coefficient on light growth is zero. The p-value measures the strength of the evidence against the null hypothesis, not the size of the effect.
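
To show where the t value and p-value come from, the following sketch recomputes them from the reported estimate, standard error, and residual degrees of freedom, and adds an approximate 95% confidence interval for the slope. The inputs are the rounded figures printed in the output above, so the results may differ slightly in the last digits; with the fitted object available, `confint(model)` should give the interval directly.

```r
# Rounded values copied from the coefficient table above
estimate  <- 0.25460
std_error <- 0.03815
df_resid  <- 168

# t statistic for H0: slope = 0, and its two-sided p-value
t_stat  <- estimate / std_error
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df = df_resid)
ci_95  <- estimate + c(-1, 1) * t_crit * std_error

print(c(t_stat = t_stat, p_value = p_value))
print(ci_95)
```

The interval runs from roughly 0.18 to 0.33, so it stays well away from zero, which is consistent with rejecting the null hypothesis.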

R² = 0.21: About 21% of the variation in GDP growth across countries is explained by growth in nighttime light. The remaining 79% is due to factors that this model does not take into account.
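
In a regression with a single predictor, R², the F-statistic, and the slope's t value are tied together, which gives a simple consistency check on the output. The sketch below uses only the rounded numbers reported above, so small rounding differences are expected.

```r
# Rounded values copied from the regression output above
r_squared <- 0.2096
df_resid  <- 168
t_stat    <- 6.674

# With one predictor: F = R^2 / (1 - R^2) * residual df, and F = t^2
f_from_r2 <- (r_squared / (1 - r_squared)) * df_resid
f_from_t  <- t_stat^2

# The correlation between gdp_growth and light_growth is the square root
# of R^2, with the sign of the slope (positive here)
correlation <- sqrt(r_squared)

print(c(f_from_r2 = f_from_r2, f_from_t = f_from_t, correlation = correlation))
```

Both routes land close to the reported F-statistic of 44.54, and the implied correlation between light growth and GDP growth is about 0.46.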

Conclusion: Growth in nighttime light emissions observed from satellite imagery is a statistically significant predictor of GDP growth. However, given the model’s limited explanatory power, it is best used as a screening tool: a large gap between reported and predicted GDP growth flags a country for closer inspection and may indicate GDP misreporting by its government, although it can also reflect factors the model omits.

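# Passing newdata keeps one prediction per row even if lm() dropped rows with missing values
aer_gdp_space$predicted_gdp <- predict(model, newdata = aer_gdp_space)
ggplot(data = aer_gdp_space, aes(x = predicted_gdp, y = gdp_growth)) +
  geom_point(alpha = 0.5, colour = "black") +
  # 45-degree line: points on it have observed growth equal to predicted growth
  geom_abline(intercept = 0, slope = 1, colour = "red") +
  labs(
    title = "Predicted vs. Observed GDP Growth",
    x = "Predicted GDP Growth (based on Light)",
    y = "Observed (Reported) GDP Growth"
  ) +
  theme_bw()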

# Rank countries by the absolute gap between reported and predicted GDP growth
top_20 <- aer_gdp_space %>%
  mutate(
    diff = gdp_growth - predicted_gdp,
    abs_diff = abs(diff)
  ) %>%
  arrange(desc(abs_diff)) %>%
  head(20)
print(top_20 %>% select(country, gdp_growth, predicted_gdp, diff))
##                             country gdp_growth predicted_gdp      diff
## 1                           Myanmar 246.092915      56.04168 190.05123
## 2                             China 237.873910      74.62252 163.25139
## 3                           Armenia 161.621613      71.32700  90.29461
## 4                           Ireland 148.152803      62.06136  86.09144
## 5                       Isle of Man 139.511590      54.19846  85.31313
## 6                            Uganda 140.954153      60.13759  80.81656
## 7                        Mozambique 157.862441      79.49547  78.36698
## 8                            Angola 140.717092      66.52531  74.19178
## 9                             India 128.749085      57.89956  70.84953
## 10                         Cambodia 171.832331     104.06993  67.76240
## 11                          Liberia 133.671629     200.79151 -67.11988
## 12                    Cote d'Ivoire  26.345797      92.14104 -65.79524
## 13                         Zimbabwe -13.559821      49.57826 -63.13808
## 14 Democratic Republic of the Congo  -6.896552      53.11628 -60.01283
## 15                          Burundi  -8.866609      51.02629 -59.89290
## 16                            Congo  40.212889      95.88387 -55.67098
## 17              Trinidad and Tobago 112.387555      57.06518  55.32238
## 18                 Papua New Guinea  20.704494      75.23040 -54.52591
## 19                          Ukraine -14.988857      38.75385 -53.74271
## 20                            Haiti  -3.528899      49.19757 -52.72647
library(knitr)
top_20 %>%
  select(country, diff, gdp_growth, predicted_gdp) %>%
  kable()
| country                          |      diff | gdp_growth | predicted_gdp |
|:---------------------------------|----------:|-----------:|--------------:|
| Myanmar                          | 190.05123 | 246.092915 |      56.04168 |
| China                            | 163.25139 | 237.873910 |      74.62252 |
| Armenia                          |  90.29461 | 161.621613 |      71.32700 |
| Ireland                          |  86.09144 | 148.152803 |      62.06136 |
| Isle of Man                      |  85.31313 | 139.511590 |      54.19846 |
| Uganda                           |  80.81656 | 140.954153 |      60.13759 |
| Mozambique                       |  78.36698 | 157.862441 |      79.49547 |
| Angola                           |  74.19178 | 140.717092 |      66.52531 |
| India                            |  70.84953 | 128.749086 |      57.89956 |
| Cambodia                         |  67.76240 | 171.832331 |     104.06993 |
| Liberia                          | -67.11988 | 133.671629 |     200.79151 |
| Cote d'Ivoire                    | -65.79524 |  26.345797 |      92.14104 |
| Zimbabwe                         | -63.13808 | -13.559821 |      49.57826 |
| Democratic Republic of the Congo | -60.01283 |  -6.896552 |      53.11628 |
| Burundi                          | -59.89290 |  -8.866609 |      51.02629 |
| Congo                            | -55.67098 |  40.212889 |      95.88387 |
| Trinidad and Tobago              |  55.32238 | 112.387554 |      57.06518 |
| Papua New Guinea                 | -54.52591 |  20.704494 |      75.23040 |
| Ukraine                          | -53.74271 | -14.988857 |      38.75385 |
| Haiti                            | -52.72647 |  -3.528899 |      49.19757 |
# fill = diff > 0 is logical: FALSE (observed below predicted) takes the first colour and label, TRUE the second
ggplot(top_20, aes(x = reorder(country, diff), y = diff, fill = diff > 0)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(
    values = c("steelblue", "red"),
    labels = c("Under-reported", "Over-reported")
  ) +
  labs(
    title = "Top 20 Countries with Greatest GDP Discrepancy",
    x = "Country",
    y = "Discrepancy (Observed - Predicted)",
    fill = "Type"
  ) +
  theme_bw()
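
To put the discrepancies on a common scale, one rough option is to divide each by the model's residual standard error (37.16). The sketch below does this for four countries from the top-20 table, using values copied from the output above. This scaling is an assumption added here for illustration, not part of the original analysis, and it ignores leverage; with the fitted object available, `rstandard(model)` would give properly standardized residuals.

```r
# Residual standard error copied from the summary(model) output above
residual_se <- 37.16

# Discrepancies (observed - predicted) copied from the top-20 table
discrepancies <- c(Myanmar = 190.05123, China = 163.25139,
                   Liberia = -67.11988, Haiti = -52.72647)

# Crude scaling: discrepancy expressed in residual standard errors
scaled <- discrepancies / residual_se
print(round(scaled, 2))
```

On this scale Myanmar and China sit more than four residual standard errors above the fitted line, while Haiti, ranked 20th, is about 1.4 below it, so the first two are far more unusual than the bottom of the list.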