Reproducible Workflow & Importing Economic Data
In this course, you will work inside one professional research project.
You are not creating a new project each week.
By the end of the semester, your folder should look like a real research repository.
If you already created a Week 1 project:
Rename it to:
ECON465_DataScience
If not:
Create a new project:
File → New Project → New Directory → New Project
Name it:
ECON465_DataScience
Inside the project, create:
data/
scripts/
reports/
figures/
final_project/
This structure will remain all semester.
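The same folders can also be created from the Console rather than by hand in the Files pane. The lines below are a minimal sketch, assuming the working directory is the project root (which is the case when the RStudio project is open). `dir.create()` is base R, and `showWarnings = FALSE` keeps the call quiet if a folder already exists.

```r
# Folders listed above, created relative to the project root
folders <- c("data", "scripts", "reports", "figures", "final_project")

for (f in folders) {
  # showWarnings = FALSE: stay silent if the folder already exists
  dir.create(f, showWarnings = FALSE)
}

# Confirm that every folder now exists
all(dir.exists(folders))
```

The Excel example later reads from data/raw/, which can be created the same way with `dir.create("data/raw", recursive = TRUE)`.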
Inside the project:
File → New File → R Script
Save it as:
scripts/week02_workflow.R
All Week 2 code must go inside this script.
Run inside your script:
gdp_2022 <- 905
gdp_2023 <- 1023
gdp_growth <- (gdp_2023 - gdp_2022) / gdp_2022 * 100
gdp_growth
[1] 13.03867
inf_2022 <- 64.27
inf_2023 <- 54.62
inf_change <- inf_2023 - inf_2022
inf_change
[1] -9.65
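The two calculations above use different formulas: a percentage change, (new - old) / old × 100, for GDP, and a simple difference in percentage points for inflation. A small helper function, sketched here under the hypothetical name `pct_change()`, keeps the growth formula in one place so it does not have to be retyped for every variable.

```r
# Hypothetical helper (not part of base R): percentage change from old to new
pct_change <- function(new, old) {
  (new - old) / old * 100
}

gdp_growth <- pct_change(new = 1023, old = 905)
round(gdp_growth, 2)   # 13.04

# Inflation is already a rate, so its change is measured in percentage points
inf_change <- 54.62 - 64.27
round(inf_change, 2)   # -9.65
```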
unemp <- c("2019" = 13.7,
           "2020" = 13.1,
           "2021" = 12.0,
           "2022" = 10.4)
mean(unemp)
[1] 12.3
sd(unemp)
[1] 1.449138
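Naming the elements of a vector makes it possible to look values up by year instead of by position. The short sketch below reuses the `unemp` vector from above and shows name-based indexing plus `diff()` for year-on-year changes; the object name `unemp_change` is introduced only for this illustration.

```r
unemp <- c("2019" = 13.7, "2020" = 13.1, "2021" = 12.0, "2022" = 10.4)

# Look up one year by its name (a character string, not a position)
unemp["2021"]

# Year-on-year change in percentage points; diff() returns length(unemp) - 1 values
unemp_change <- diff(unemp)
round(unemp_change, 1)

# Year with the lowest unemployment rate
names(unemp)[which.min(unemp)]   # "2022"
```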
turkey_econ <- data.frame(
  year = 2019:2022,
  inflation = c(15.2, 12.3, 19.6, 64.3),
  unemployment = c(13.7, 13.1, 12.0, 10.4)
)
turkey_econ
  year inflation unemployment
1 2019      15.2         13.7
2 2020      12.3         13.1
3 2021      19.6         12.0
4 2022      64.3         10.4
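A data frame stores one variable per column and one observation per row. The sketch below, using only base R, shows a few common ways to inspect and subset the `turkey_econ` data frame; the threshold of 15 and the name `high_inflation` are chosen purely for illustration.

```r
turkey_econ <- data.frame(
  year = 2019:2022,
  inflation = c(15.2, 12.3, 19.6, 64.3),
  unemployment = c(13.7, 13.1, 12.0, 10.4)
)

# Dimensions and column types
dim(turkey_econ)   # 4 rows, 3 columns
str(turkey_econ)

# Extract one column as a vector with $
mean(turkey_econ$inflation)

# Keep only the rows that satisfy a condition
high_inflation <- turkey_econ[turkey_econ$inflation > 15, ]
high_inflation$year   # 2019 2021 2022
```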
Install once (Console only):
install.packages("tidyverse")
Load:
library(tidyverse)
Save example file in:
data/
write_csv(turkey_econ, "data/turkey_econ.csv")
Import:
df <- read_csv("data/turkey_econ.csv")
glimpse(df)
Install once:
install.packages("readxl")
Load:
library(readxl)
Import file into:
data/raw/
df_excel <- read_excel("data/raw/turkey_ind.xlsx")
glimpse(df_excel)
We now connect directly to the World Bank database.
Install once:
install.packages("WDI")
Load:
library(WDI)
The World Development Indicators (WDI) is the World Bank’s compilation of internationally comparable statistics about global development and the fight against poverty. The database contains 1,400 time series indicators for 217 economies and more than 40 country groups, with data for many indicators going back more than 50 years.
WDI data falls under six themes: Poverty and Inequality, People, Environment, Economy, States and Markets, and Global Links.
API = Application Programming Interface.
It allows R to communicate directly with the World Bank database.
No manual downloads. No copy–paste. Fully reproducible.
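To make the idea concrete: a request to a web API is essentially a structured URL, and the WDI package assembles such requests on your behalf. The sketch below builds one by hand for illustration only. It assumes the World Bank's public v2 endpoint and its country/indicator path layout; the helper name `wb_url()` is hypothetical, and no request is actually sent.

```r
# Hypothetical helper: build a World Bank API (v2) request URL.
# Assumption: base address and path layout of the public v2 API.
wb_url <- function(countries, indicator, start, end) {
  paste0(
    "https://api.worldbank.org/v2/country/",
    paste(countries, collapse = ";"),   # several countries are separated by ";"
    "/indicator/", indicator,
    "?date=", start, ":", end,          # year range
    "&format=json"                      # ask for JSON instead of the default XML
  )
}

url <- wb_url(c("TR", "GR"), "NY.GDP.MKTP.KD.ZG", 2000, 2023)
url
```

Because the request is written down as code, anyone who runs the script asks the database the same question, which is what makes the workflow reproducible.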
WDIsearch("GDP growth")
      indicator             name
690   6.0.GDP_growth        GDP growth (annual %)
13483 NV.AGR.TOTL.ZG        Real agricultural GDP growth rates (%)
13776 NY.GDP.MKTP.KD.ZG     GDP growth (annual %)
13779 NY.GDP.MKTP.KN.87.ZG  GDP growth (annual %)
13893 NYGDPMKTPKDZ          GDP growth, constant (average 2010-19 prices and exchange rates)
WDIsearch("inflation")
      indicator             name
8924  FP.CPI.TOTL.ZG        Inflation, consumer prices (annual %)
8926  FP.FPI.TOTL.ZG        Inflation, food prices (annual %)
8928  FP.WPI.TOTL.ZG        Inflation, wholesale prices (annual %)
13751 NY.GDP.DEFL.87.ZG     Inflation, GDP deflator (annual %)
13752 NY.GDP.DEFL.KD.ZG     Inflation, GDP deflator (annual %)
13753 NY.GDP.DEFL.KD.ZG.AD  Inflation, GDP deflator: linked series (annual %)
WDIsearch("unemployment")
      indicator
11786 JI.UEM.1524.FE.ZS
11787 JI.UEM.1524.HE.ZS
11788 JI.UEM.1524.LE.ZS
11789 JI.UEM.1524.MA.ZS
11790 JI.UEM.1524.RU.ZS
11791 JI.UEM.1524.UR.ZS
11792 JI.UEM.1524.ZS
11793 JI.UEM.1564.FE.ZS
11794 JI.UEM.1564.HE.ZS
11795 JI.UEM.1564.LE.ZS
11796 JI.UEM.1564.MA.ZS
11797 JI.UEM.1564.OL.ZS
11798 JI.UEM.1564.RU.ZS
11799 JI.UEM.1564.UR.ZS
11800 JI.UEM.1564.YG.ZS
11801 JI.UEM.1564.ZS
11930 lm_ub.bi_q1
11931 lm_ub.cov_pop
11932 lm_ub.gen_pop
15182 per_lm_alllm.adq_pop_tot
15234 per_lm_alllm.ben_q1_tot
15282 per_lm_alllm.cov_pop_tot
15286 per_lm_alllm.cov_q1_tot
15290 per_lm_alllm.cov_q2_tot
15294 per_lm_alllm.cov_q3_tot
15298 per_lm_alllm.cov_q4_tot
15302 per_lm_alllm.cov_q5_tot
21947 SL.UEM.1524.FE.NE.ZS
21948 SL.UEM.1524.FE.ZS
21949 SL.UEM.1524.FM.NE.ZS
21950 SL.UEM.1524.FM.ZS
21951 SL.UEM.1524.MA.NE.ZS
21952 SL.UEM.1524.MA.ZS
21953 SL.UEM.1524.NE.ZS
21954 SL.UEM.1524.ZS
21955 SL.UEM.ADVN.FE.ZS
21956 SL.UEM.ADVN.MA.ZS
21957 SL.UEM.ADVN.ZS
21958 SL.UEM.BASC.FE.ZS
21959 SL.UEM.BASC.MA.ZS
21960 SL.UEM.BASC.ZS
21961 SL.UEM.INTM.FE.ZS
21962 SL.UEM.INTM.MA.ZS
21963 SL.UEM.INTM.ZS
21964 SL.UEM.LTRM.FE.ZS
21965 SL.UEM.LTRM.MA.ZS
21966 SL.UEM.LTRM.ZS
21973 SL.UEM.PRIM.FE.ZS
21974 SL.UEM.PRIM.MA.ZS
21975 SL.UEM.PRIM.ZS
21976 SL.UEM.SECO.FE.ZS
21977 SL.UEM.SECO.MA.ZS
21978 SL.UEM.SECO.ZS
21979 SL.UEM.TERT.FE.ZS
21980 SL.UEM.TERT.MA.ZS
21981 SL.UEM.TERT.ZS
21983 SL.UEM.TOTL.FE.NE.ZS
21984 SL.UEM.TOTL.FE.ZS
21985 SL.UEM.TOTL.MA.NE.ZS
21986 SL.UEM.TOTL.MA.ZS
21987 SL.UEM.TOTL.NE.ZS
21988 SL.UEM.TOTL.ZS
22714 UNEMPSA_
name
11786 Youth unemployment rate, aged 15-24, female (% of female youth labor force)
11787 Youth unemployment rate, aged 15-24, above primary education (% of youth labor force with high education)
11788 Youth unemployment rate, aged 15-24, primary education and below (% of youth labor force with low education)
11789 Youth unemployment rate, aged 15-24, male (% of male youth labor force)
11790 Youth unemployment rate, aged 15-24, rural (% of rural youth labor force)
11791 Youth unemployment rate, aged 15-24, urban (% of urban youth labor force)
11792 Youth unemployment rate, aged 15-24, total (% of total youth labor force)
11793 Unemployment rate, aged 15-64, female (% of female labor force in working age)
11794 Unemployment rate, aged 15-64, above primary education (% of labor force with high education in working age)
11795 Unemployment rate, aged 15-64, primary education and below (% of labor force with low education in working age)
11796 Unemployment rate, aged 15-64, male (% of male labor force in working age)
11797 Unemployment rate, aged 25-64 (% of labor force aged 25-64)
11798 Unemployment rate, aged 15-64, rural (% of rural labor force in working age)
11799 Unemployment rate, aged 15-64, urban (% of urban labor force in working age)
11800 Unemployment rate, aged 15-24 (% of labor force aged 15-24)
11801 Unemployment rate, aged 15-64, total (% of total labor force in working age)
11930 Benefit incidence of unemployment benefits and ALMP to poorest quintile (% of total U/ALMP benefits)
11931 Coverage of unemployment benefits and ALMP (% of population)
11932 Generosity of unemployment benefits and ALMP (% of total welfare of beneficiary households)
15182 Adequacy of unemployment benefits and ALMP (% of total welfare of beneficiary households)
15234 Benefit incidence of unemployment benefits and ALMP to poorest quintile (% of total U/ALMP benefits)
15282 Coverage of unemployment benefits and ALMP (% of population)
15286 Coverage of unemployment benefits and ALMP in poorest quintile (% of population)
15290 Coverage of unemployment benefits and ALMP in 2nd quintile (% of population)
15294 Coverage of unemployment benefits and ALMP in 3rd quintile (% of population)
15298 Coverage of unemployment benefits and ALMP in 4th quintile (% of population)
15302 Coverage of unemployment benefits and ALMP in richest quintile (% of population)
21947 Unemployment, youth female (% of female labor force ages 15-24) (national estimate)
21948 Unemployment, youth female (% of female labor force ages 15-24) (modeled ILO estimate)
21949 Ratio of female to male youth unemployment rate (%) (national estimate)
21950 Ratio of female to male youth unemployment rate (% ages 15-24) (modeled ILO estimate)
21951 Unemployment, youth male (% of male labor force ages 15-24) (national estimate)
21952 Unemployment, youth male (% of male labor force ages 15-24) (modeled ILO estimate)
21953 Unemployment, youth total (% of total labor force ages 15-24) (national estimate)
21954 Unemployment, youth total (% of total labor force ages 15-24) (modeled ILO estimate)
21955 Unemployment with advanced education, female (% of female labor force with advanced education)
21956 Unemployment with advanced education, male (% of male labor force with advanced education)
21957 Unemployment with advanced education (% of total labor force with advanced education)
21958 Unemployment with basic education, female (% of female labor force with basic education)
21959 Unemployment with basic education, male (% of male labor force with basic education)
21960 Unemployment with basic education (% of total labor force with basic education)
21961 Unemployment with intermediate education, female (% of female labor force with intermediate education)
21962 Unemployment with intermediate education, male (% of male labor force with intermediate education)
21963 Unemployment with intermediate education (% of total labor force with intermediate education)
21964 Long-term unemployment, female (% of female unemployment)
21965 Long-term unemployment, male (% of male unemployment)
21966 Long-term unemployment (% of total unemployment)
21973 Unemployment with primary education, female (% of female unemployment)
21974 Unemployment with primary education, male (% of male unemployment)
21975 Unemployment with primary education (% of total unemployment)
21976 Unemployment with secondary education, female (% of female unemployment)
21977 Unemployment with secondary education, male (% of male unemployment)
21978 Unemployment with secondary education (% of total unemployment)
21979 Unemployment with tertiary education, female (% of female unemployment)
21980 Unemployment with tertiary education, male (% of male unemployment)
21981 Unemployment with tertiary education (% of total unemployment)
21983 Unemployment, female (% of female labor force) (national estimate)
21984 Unemployment, female (% of female labor force) (modeled ILO estimate)
21985 Unemployment, male (% of male labor force) (national estimate)
21986 Unemployment, male (% of male labor force) (modeled ILO estimate)
21987 Unemployment, total (% of total labor force) (national estimate)
21988 Unemployment, total (% of total labor force) (modeled ILO estimate)
22714 Unemployment rate,Percent,,,
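Look carefully at the indicator code and the indicator name. Several series share a name but differ in definition, for example national estimates versus modeled ILO estimates.
Example codes:
NY.GDP.MKTP.KD.ZG: GDP growth (annual %)
FP.CPI.TOTL.ZG: Inflation, consumer prices (annual %)
SL.UEM.TOTL.ZS: Unemployment, total (% of total labor force) (modeled ILO estimate)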
macro_panel <- WDI(
  country = c("TR", "GR", "BG", "RO", "PL"),
  indicator = c(
    gdp_growth = "NY.GDP.MKTP.KD.ZG",
    inflation = "FP.CPI.TOTL.ZG",
    unemployment = "SL.UEM.TOTL.ZS"
  ),
  start = 2000,
  end = 2023
)
glimpse(macro_panel)
Rows: 120
Columns: 7
$ country <chr> "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria…
$ iso2c <chr> "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG…
$ iso3c <chr> "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "…
$ year <int> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 200…
$ gdp_growth <dbl> 4.5872254, 3.8237036, 5.8719311, 5.2371536, 6.5104258, 7.…
$ inflation <dbl> 10.3162621, 7.3609393, 5.8101437, 2.3486417, 6.1471307, 5…
$ unemployment <dbl> 16.218, 19.921, 18.110, 13.733, 12.037, 10.083, 8.951, 6.…
You now have a cross-country panel dataset.
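Before analysing a panel it is worth confirming its shape: 5 countries × 24 years should give the 120 rows reported by `glimpse()`. The sketch below runs this kind of check on a small made-up data frame so that it works without an internet connection. `toy_panel` and its values are invented for illustration, and the same calls can be applied to `macro_panel`.

```r
# Toy stand-in for macro_panel (values are invented, not World Bank data)
toy_panel <- data.frame(
  country = rep(c("A", "B"), each = 3),
  year = rep(2000:2002, times = 2),
  gdp_growth = c(4.1, 3.8, NA, 2.0, 2.5, 3.1)
)

# Observations per country: equal counts indicate a balanced panel
obs_per_country <- table(toy_panel$country)
obs_per_country

# Each country-year pair should appear exactly once
any(duplicated(toy_panel[, c("country", "year")]))   # FALSE

# Missing values per column
colSums(is.na(toy_panel))
```

With the real data, replacing `toy_panel` by `macro_panel` should show 24 observations per country, consistent with the 120 rows above.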
library(dplyr)
macro_panel_clean <- macro_panel |>
  select(country, year, gdp_growth, inflation, unemployment)
write_csv(macro_panel_clean, "data/macro_panel_clean.csv")
Choose ONE new indicator using WDIsearch().
Download it for:
Years: 2000–2023
Save cleaned data to data/
Write 4–6 sentences:
Upload:
scripts/week02_workflow.R
Why is an API-based workflow superior to downloading Excel files manually?
Write 3 sentences in your report.
You now:
This is how real economic research is done.