Load required packages
library(readr)
library(dplyr)
library(tidyverse)
Load the World Bank Dataset
# Treat '..' (the marker used for missing values in this file) as NA so numeric columns parse as numbers.
world_bank <- read_csv("C:/Users/SP KHALID/Downloads/WDI- World Bank Dataset.csv" , na = c('..'))
world_bank
dim(world_bank)
## [1] 1675 19
# Check column data types
glimpse(world_bank)
## Rows: 1,675
## Columns: 19
## $ Time <dbl> 2000, 20…
## $ `Time Code` <chr> "YR2000"…
## $ `Country Name` <chr> "Brazil"…
## $ `Country Code` <chr> "BRA", "…
## $ Region <chr> "Latin A…
## $ `Income Group` <chr> "Upper m…
## $ `GDP (constant 2015 US$)` <dbl> 1.18642e…
## $ `GDP growth (annual %)` <dbl> 4.387949…
## $ `GDP (current US$)` <dbl> 6.554482…
## $ `Unemployment, total (% of total labor force)` <dbl> NA, 3.70…
## $ `Inflation, consumer prices (annual %)` <dbl> 7.044141…
## $ `Labor force, total` <dbl> 80295093…
## $ `Population, total` <dbl> 17401828…
## $ `Exports of goods and services (% of GDP)` <dbl> 10.18805…
## $ `Imports of goods and services (% of GDP)` <dbl> 12.45171…
## $ `General government final consumption expenditure (% of GDP)` <dbl> 18.76784…
## $ `Foreign direct investment, net inflows (% of GDP)` <dbl> 5.033917…
## $ `Gross savings (% of GDP)` <dbl> 13.99170…
## $ `Current account balance (% of GDP)` <dbl> -4.04774…
# Convert Time column to integer
world_bank$Time <- as.integer(world_bank$Time)
# Clean column names
library(janitor)
df <- world_bank |> clean_names()
glimpse(df)
## Rows: 1,675
## Columns: 19
## $ time <int> 2000, …
## $ time_code <chr> "YR200…
## $ country_name <chr> "Brazi…
## $ country_code <chr> "BRA",…
## $ region <chr> "Latin…
## $ income_group <chr> "Upper…
## $ gdp_constant_2015_us <dbl> 1.1864…
## $ gdp_growth_annual_percent <dbl> 4.3879…
## $ gdp_current_us <dbl> 6.5544…
## $ unemployment_total_percent_of_total_labor_force <dbl> NA, 3.…
## $ inflation_consumer_prices_annual_percent <dbl> 7.0441…
## $ labor_force_total <dbl> 802950…
## $ population_total <dbl> 174018…
## $ exports_of_goods_and_services_percent_of_gdp <dbl> 10.188…
## $ imports_of_goods_and_services_percent_of_gdp <dbl> 12.451…
## $ general_government_final_consumption_expenditure_percent_of_gdp <dbl> 18.767…
## $ foreign_direct_investment_net_inflows_percent_of_gdp <dbl> 5.0339…
## $ gross_savings_percent_of_gdp <dbl> 13.991…
## $ current_account_balance_percent_of_gdp <dbl> -4.047…
Define a binary high-growth indicator from GDP growth (annual %): 1 if growth exceeds 3%, 0 otherwise
# Store the indicator as a column of df so glm() finds it through data = df
df$highgrowth <- ifelse(df$gdp_growth_annual_percent > 3, 1, 0)
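It is worth checking how this rule treats edge cases. The sketch below applies the same `ifelse()` rule to a small made-up vector (the values are illustrative and not taken from the dataset): growth of exactly 3 is coded 0, and a missing growth value stays `NA`, so such rows would be dropped when the model is fitted.

```r
# Hypothetical growth values (annual %), including a boundary case and a missing value
growth <- c(4.4, 1.2, NA, 3.0, -0.5)

# Same rule as above: 1 if growth is strictly greater than 3, otherwise 0
highgrowth_demo <- ifelse(growth > 3, 1, 0)
highgrowth_demo

# Class balance, with missing values counted as their own category
table(highgrowth_demo, useNA = "ifany")
```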
Candidate predictors: inflation_consumer_prices_annual_percent, unemployment_total_percent_of_total_labor_force, exports_of_goods_and_services_percent_of_gdp, foreign_direct_investment_net_inflows_percent_of_gdp (the model below uses the first three)
model <- glm(highgrowth ~ inflation_consumer_prices_annual_percent +
unemployment_total_percent_of_total_labor_force +
exports_of_goods_and_services_percent_of_gdp,
data = df,
family = binomial)
summary(model)
##
## Call:
## glm(formula = highgrowth ~ inflation_consumer_prices_annual_percent +
## unemployment_total_percent_of_total_labor_force + exports_of_goods_and_services_percent_of_gdp,
## family = binomial, data = df)
##
## Coefficients:
##                                                  Estimate Std. Error z value Pr(>|z|)
## (Intercept)                                      0.406756   0.145297   2.799 0.005119 **
## inflation_consumer_prices_annual_percent         0.024074   0.008425   2.857 0.004271 **
## unemployment_total_percent_of_total_labor_force -0.048189   0.012642  -3.812 0.000138 ***
## exports_of_goods_and_services_percent_of_gdp     0.001827   0.002118   0.863 0.388377
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1656.8 on 1210 degrees of freedom
## Residual deviance: 1631.4 on 1207 degrees of freedom
## (464 observations deleted due to missingness)
## AIC: 1639.4
##
## Number of Fisher Scoring iterations: 4
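The model estimates how inflation, unemployment, and exports affect the likelihood of a country experiencing high economic growth. The residual deviance (1631.4 on 1207 degrees of freedom) is lower than the null deviance (1656.8 on 1210 degrees of freedom), a drop of 25.4 for 3 added predictors, which shows that the model improves over the intercept-only baseline. Because 464 observations were dropped for missing values, the fit uses 1,211 of the 1,675 rows.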
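Whether that drop in deviance is more than noise can be checked with a likelihood-ratio (chi-squared) test. The following is a minimal sketch that uses only the deviances and degrees of freedom printed in the summary above; the variable names are illustrative.

```r
# Values copied from the summary() output
null_dev  <- 1656.8; null_df  <- 1210
resid_dev <- 1631.4; resid_df <- 1207

# Likelihood-ratio statistic and its degrees of freedom
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df

# Upper-tail chi-squared probability: small values favour the fitted model
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value)
```

With the fitted object in memory, the same quantity should be obtainable from its components, for example `with(model, pchisq(null.deviance - deviance, df.null - df.residual, lower.tail = FALSE))`.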
Logistic Regression Form
\[ \log\left(\frac{p}{1-p}\right) = 0.4068 + 0.0241 \cdot \text{Inflation} - 0.0482 \cdot \text{Unemployment} + 0.0018 \cdot \text{Exports} \]
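To turn the equation into a predicted probability, the logit can be inverted with `plogis()`. The sketch below hard-codes the coefficients from the summary output and evaluates them for a made-up country-year (5% inflation, 6% unemployment, exports at 30% of GDP). The helper name `predict_high_growth` and the input values are assumptions for illustration only.

```r
# Coefficients copied from the summary() output (log-odds scale)
coefs <- c(intercept = 0.406756, inflation = 0.024074,
           unemployment = -0.048189, exports = 0.001827)

# Hypothetical helper: linear predictor, then inverse logit
predict_high_growth <- function(inflation, unemployment, exports) {
  log_odds <- coefs[["intercept"]] +
    coefs[["inflation"]] * inflation +
    coefs[["unemployment"]] * unemployment +
    coefs[["exports"]] * exports
  plogis(log_odds)  # 1 / (1 + exp(-log_odds))
}

# Illustrative inputs, not taken from the dataset
p_example <- predict_high_growth(inflation = 5, unemployment = 6, exports = 30)
p_example
```

With the fitted object available, `predict(model, newdata = ..., type = "response")` would give the same kind of probability without copying coefficients by hand.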
Inflation (+0.024)
p-value = 0.004 (< 0.01): statistically significant
Positive coefficient
A 1 percentage point increase in inflation raises the log-odds of high growth by 0.024, holding the other predictors fixed
e^0.024 ≈ 1.024
So a 1 percentage point increase in inflation corresponds to a ~2.4% increase in the odds of high growth
Unemployment (-0.048) - Highly Significant
p-value < 0.001: very strong evidence
Negative coefficient
e^(-0.048) ≈ 0.953
So a 1 percentage point increase in unemployment corresponds to a ~4.7% decrease in the odds of high growth
Exports (+0.0018)
p-value = 0.388 (> 0.05): not significant
No strong evidence that exports (% of GDP) are associated with high growth in this model
Export share alone may not capture growth dynamics, because its effect could depend on the type of exports and the country's income level
Intercept (0.407)
When all predictors equal 0, the log-odds of high growth are 0.407, which corresponds to a probability of about 0.60
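The odds-scale figures in this list can be reproduced by exponentiating the coefficients. A short sketch, assuming the estimates are copied by hand from the summary output (the object names are illustrative):

```r
# Coefficients copied from the summary() output (log-odds scale)
coefs <- c(intercept = 0.406756, inflation = 0.024074,
           unemployment = -0.048189, exports = 0.001827)

# Odds ratios for a one-unit increase in each predictor (intercept excluded)
odds_ratios <- exp(coefs[-1])

# Percentage change in the odds implied by each odds ratio
pct_change_in_odds <- (odds_ratios - 1) * 100
round(cbind(odds_ratio = odds_ratios, pct_change = pct_change_in_odds), 3)

# Probability of high growth when every predictor equals 0
baseline_prob <- plogis(coefs[["intercept"]])
baseline_prob
```

With the fitted object available, `exp(coef(model))` gives the same odds ratios directly.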
confint(model)
## 2.5 % 97.5 %
## (Intercept) 0.121941549 0.692052900
## inflation_consumer_prices_annual_percent 0.008772898 0.041728118
## unemployment_total_percent_of_total_labor_force -0.073453667 -0.023774674
## exports_of_goods_and_services_percent_of_gdp -0.002264411 0.006080381
A 95% confidence interval was constructed for the model coefficients to assess their reliability. For inflation, the interval (0.0088, 0.0417) does not include zero, indicating a statistically significant positive effect on the probability of high growth. Similarly, unemployment has a confidence interval (−0.073, −0.024), which is entirely negative, confirming a significant negative impact on growth. In contrast, the confidence interval for exports (−0.0023, 0.0061) includes zero, suggesting that its effect is not statistically significant. Overall, these intervals reinforce that inflation and unemployment are meaningful predictors in the model, while exports are not.
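The same intervals can be read on the odds-ratio scale, where the reference value is 1 rather than 0. The sketch below exponentiates the bounds printed by `confint(model)` above; the matrix is typed in by hand and the object names are illustrative.

```r
# 95% bounds copied from the confint() output (log-odds scale)
ci_log_odds <- rbind(
  inflation    = c(0.008772898, 0.041728118),
  unemployment = c(-0.073453667, -0.023774674),
  exports      = c(-0.002264411, 0.006080381)
)
colnames(ci_log_odds) <- c("lower", "upper")

# Odds-ratio scale: an interval that excludes 1 matches one that excludes 0 above
ci_odds_ratio <- exp(ci_log_odds)
excludes_one  <- ci_odds_ratio[, "lower"] > 1 | ci_odds_ratio[, "upper"] < 1

round(ci_odds_ratio, 3)
excludes_one
```

With the fitted object in memory, `exp(confint(model))` should produce the same table in one step.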
Unemployment is the strongest predictor (largest absolute z value): higher unemployment reduces the probability of high growth.
Inflation has a small but positive effect, possibly indicating expanding economies.
Exports are not significant, suggesting that export share alone is not enough to explain growth.
The model helps in understanding the macroeconomic drivers of growth: it suggests that labor market conditions are critical and that inflation may reflect economic activity. It is also useful for policymakers and for economic forecasting.
Should we include the level of GDP (e.g. gdp_constant_2015_us) or population (population_total) as predictors?
Are there non-linear effects (e.g., inflation too high becomes harmful)?
Does the relationship differ by income_group?
Do developed vs developing countries behave differently?
Should we include interaction terms (e.g., inflation × unemployment)?
Would adding FDI or savings improve the model?
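Several of these questions can be explored by fitting a richer model and comparing it with the simpler one. The sketch below does this on simulated data only, so that it runs on its own: the data frame `toy`, its columns, and the coefficients used to generate it are all invented for illustration and say nothing about the World Bank data. It adds a squared inflation term (a non-linear effect) and lets the unemployment effect differ by income group (an interaction).

```r
set.seed(1)
n <- 500

# Simulated stand-in data; not the World Bank dataset
toy <- data.frame(
  inflation    = rexp(n, rate = 1 / 5),
  unemployment = runif(n, min = 2, max = 15),
  income_group = factor(sample(c("High income", "Low income"), n, replace = TRUE))
)
log_odds <- 0.4 + 0.05 * toy$inflation - 0.002 * toy$inflation^2 -
  0.05 * toy$unemployment
toy$highgrowth <- rbinom(n, size = 1, prob = plogis(log_odds))

# Baseline model: linear effects only
base_fit <- glm(highgrowth ~ inflation + unemployment,
                data = toy, family = binomial)

# Extended model: quadratic inflation, unemployment effect varying by income group
ext_fit <- glm(highgrowth ~ inflation + I(inflation^2) +
                 unemployment * income_group,
               data = toy, family = binomial)

# Nested-model comparison by likelihood-ratio test and by AIC
anova(base_fit, ext_fit, test = "Chisq")
AIC(base_fit, ext_fit)
```

On the real data the same pattern would use `df` and its cleaned column names; further predictors such as FDI or savings would be added to the formula in the same way, keeping in mind that each one with missing values reduces the number of rows used.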