ECON 465 – Lab 2 (Week 2)

Reproducible Workflow & Importing Economic Data

Author

Gül Ertan Özgüzer

Published

February 2, 2028

1 Important: One Master Project for the Entire Semester

In this course, you will work inside one professional research project.

You are not creating a new project each week.

By the end of the semester, your folder should look like a real research repository.


2 Part 1 — Set Up Your Semester Project (15 minutes)

If you already created a Week 1 project:

Rename it to:

ECON465_DataScience

If not:

Create a new project:

File → New Project → New Directory → New Project

Name it:

ECON465_DataScience


2.1 Required Folder Structure

Inside the project, create:

data/
scripts/
reports/
figures/
final_project/

This structure will remain all semester.
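The folders can also be created from the R console instead of by hand. The following is a minimal sketch using base R only. The helper name `create_project_folders()` is hypothetical, and `data/raw` is included as an assumption because Part 5 reads an Excel file from that subfolder.

```r
# Hypothetical helper: create the semester folder structure under `root`.
create_project_folders <- function(root = ".") {
  folders <- c("data", "data/raw", "scripts", "reports", "figures", "final_project")
  paths <- file.path(root, folders)
  for (p in paths) {
    # recursive = TRUE also creates missing parent folders;
    # showWarnings = FALSE keeps the call quiet when a folder already exists.
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}
```

Calling `create_project_folders()` with no arguments from the Console, while the ECON465_DataScience project is open, should create the folders in the project root. Running it a second time leaves existing folders untouched.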


3 Part 2 — Create This Week’s Script

Inside the project:

File → New File → R Script

Save it as:

scripts/week02_workflow.R

All Week 2 code must go inside this script.


4 Part 3 — R Refresher (Inside the Master Project)

Run inside your script:

gdp_2022 <- 905
gdp_2023 <- 1023

gdp_growth <- (gdp_2023 - gdp_2022) / gdp_2022 * 100
gdp_growth
[1] 13.03867

4.0.1 Exercise

inf_2022 <- 64.27
inf_2023 <- 54.62

inf_change <- inf_2023 - inf_2022
inf_change
[1] -9.65
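
Note that the two calculations above measure different things: `gdp_growth` is a percent change relative to the starting value, while `inf_change` is the difference between two rates, measured in percentage points. The sketch below makes the distinction explicit with two small helpers. Both function names are hypothetical and chosen only for this example.

```r
# Hypothetical helpers, not part of any package.

# Relative change from `old` to `new`, in percent.
pct_change <- function(old, new) (new - old) / old * 100

# Difference between two rates, in percentage points.
pp_change <- function(old, new) new - old

pct_change(905, 1023)    # same calculation as gdp_growth
pp_change(64.27, 54.62)  # same calculation as inf_change
```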

4.1 Vectors (Use Safe Named Version)

unemp <- c("2019" = 13.7,
           "2020" = 13.1,
           "2021" = 12.0,
           "2022" = 10.4)

mean(unemp)
[1] 12.3
sd(unemp)
[1] 1.449138
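
Naming each element ties a value to its year, so look-ups do not depend on remembering positions. A short sketch of name-based indexing with the same vector:

```r
unemp <- c("2019" = 13.7,
           "2020" = 13.1,
           "2021" = 12.0,
           "2022" = 10.4)

# Look up by name: the year goes in quotes because names are character strings.
unemp["2021"]

# Several years at once.
unemp[c("2019", "2022")]

# Year with the lowest unemployment rate.
names(unemp)[which.min(unemp)]

# Without quotes, R looks for position 2021, which does not exist, and returns NA.
unemp[2021]
```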

4.2 Data Frame

turkey_econ <- data.frame(
  year = 2019:2022,
  inflation = c(15.2, 12.3, 19.6, 64.3),
  unemployment = c(13.7, 13.1, 12.0, 10.4)
)

turkey_econ
  year inflation unemployment
1 2019      15.2         13.7
2 2020      12.3         13.1
3 2021      19.6         12.0
4 2022      64.3         10.4
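
Each column of a data frame is a vector, so the vector tools from the previous section apply to it directly. The sketch below adds a year-on-year change column and then filters rows. The column name `inflation_change` is chosen only for this example.

```r
turkey_econ <- data.frame(
  year = 2019:2022,
  inflation = c(15.2, 12.3, 19.6, 64.3),
  unemployment = c(13.7, 13.1, 12.0, 10.4)
)

# diff() returns one fewer value than its input,
# so the first year is padded with NA (there is no earlier year to compare with).
turkey_econ$inflation_change <- c(NA, diff(turkey_econ$inflation))

# Keep only the rows from 2021 onward.
recent <- turkey_econ[turkey_econ$year >= 2021, ]
recent
```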

5 Part 4 — Importing CSV (Saved Properly)

Install once (Console only):

install.packages("tidyverse")

Load:

library(tidyverse)

Save example file in:

data/

write_csv(turkey_econ, "data/turkey_econ.csv")

Import:

df <- read_csv("data/turkey_econ.csv")
glimpse(df)
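
By default `read_csv()` guesses the type of each column. For reproducibility it can help to state the types explicitly, so that a change in the file produces a visible problem rather than a silent type change. The sketch below is one way to do this with readr's `col_types` argument. It writes to a temporary file so that it does not touch your project; in the lab, the path would be "data/turkey_econ.csv".

```r
library(readr)

turkey_econ <- data.frame(
  year = 2019:2022,
  inflation = c(15.2, 12.3, 19.6, 64.3),
  unemployment = c(13.7, 13.1, 12.0, 10.4)
)

# Temporary location, used here only so the example is self-contained.
path <- file.path(tempdir(), "turkey_econ.csv")
write_csv(turkey_econ, path)

# Declare the expected type of every column instead of letting readr guess.
df <- read_csv(
  path,
  col_types = cols(
    year = col_integer(),
    inflation = col_double(),
    unemployment = col_double()
  )
)

# Confirm the column types after import.
sapply(df, class)
```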

6 Part 5 — Import Excel

Install once:

install.packages("readxl")

Load:

library(readxl)

Save the Excel file in (create the raw/ subfolder inside data/ if it does not exist yet):

data/raw/

df_excel <- read_excel("data/raw/turkey_ind.xlsx")
glimpse(df_excel)
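
A common failure at this step is a "path does not exist" error, usually because the file was saved elsewhere or the project is not open. The sketch below wraps the import in a hypothetical helper, `read_excel_checked()`, that stops with a clearer message and lists the sheet names before reading. It relies on `readxl::excel_sheets()` and `readxl::read_excel()`; the helper itself is not part of readxl.

```r
# Hypothetical helper: fail early with a readable message if the file is missing.
read_excel_checked <- function(path, sheet = 1) {
  if (!file.exists(path)) {
    stop("File not found: ", path,
         " (working directory: ", getwd(), ")", call. = FALSE)
  }
  # Show the sheet names so you can confirm which sheet is being read.
  message("Sheets: ", paste(readxl::excel_sheets(path), collapse = ", "))
  readxl::read_excel(path, sheet = sheet)
}

# Usage in the lab, once the file is in place:
# df_excel <- read_excel_checked("data/raw/turkey_ind.xlsx")
```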

7 Part 6 — World Bank API

We now connect directly to the World Bank database.

Install once:

install.packages("WDI")

Load:

library(WDI)

7.1 The World Development Indicators (WDI)

WDI is the World Bank’s compilation of internationally comparable statistics about global development and the fight against poverty. The database contains 1400 time series indicators for 217 economies and more than 40 country groups, with data for many indicators going back more than 50 years.

WDI data falls under six themes:

  • Poverty and Inequality
  • People
  • Environment
  • Economy
  • States and Markets
  • Global Links

7.2 What Is an API?

API = Application Programming Interface.

It allows R to communicate directly with the World Bank database.

No manual downloads. No copy–paste. Fully reproducible.
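
To make this concrete: an API request is essentially a URL with parameters, and the WDI package builds and sends such requests for you. The sketch below only constructs an example URL and does not send it. The URL pattern is an assumption based on the World Bank's public v2 API, and the helper name `wb_url()` is hypothetical.

```r
# Hypothetical helper: build (but do not send) a World Bank API request URL.
# Assumed pattern: country codes separated by ";", years given as start:end.
wb_url <- function(countries, indicator, start, end) {
  sprintf(
    "https://api.worldbank.org/v2/country/%s/indicator/%s?format=json&date=%d:%d",
    paste(countries, collapse = ";"),
    indicator,
    as.integer(start),
    as.integer(end)
  )
}

wb_url(c("TR", "GR"), "NY.GDP.MKTP.KD.ZG", 2000, 2023)
```

Because the request is fully described by its parameters (countries, indicator, years), anyone who runs the same code asks the database the same question.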


7.3 Step 1 — Search for Indicators (Never Guess Codes)

WDIsearch("GDP growth")
                 indicator
690         6.0.GDP_growth
13483       NV.AGR.TOTL.ZG
13776    NY.GDP.MKTP.KD.ZG
13779 NY.GDP.MKTP.KN.87.ZG
13893         NYGDPMKTPKDZ
                                                                  name
690                                              GDP growth (annual %)
13483                           Real agricultural GDP growth rates (%)
13776                                            GDP growth (annual %)
13779                                            GDP growth (annual %)
13893 GDP growth, constant (average 2010-19 prices and exchange rates)
WDIsearch("inflation")
                 indicator                                              name
8924        FP.CPI.TOTL.ZG             Inflation, consumer prices (annual %)
8926        FP.FPI.TOTL.ZG                 Inflation, food prices (annual %)
8928        FP.WPI.TOTL.ZG            Inflation, wholesale prices (annual %)
13751    NY.GDP.DEFL.87.ZG                Inflation, GDP deflator (annual %)
13752    NY.GDP.DEFL.KD.ZG                Inflation, GDP deflator (annual %)
13753 NY.GDP.DEFL.KD.ZG.AD Inflation, GDP deflator: linked series (annual %)
WDIsearch("unemployment")
                     indicator
11786        JI.UEM.1524.FE.ZS
11787        JI.UEM.1524.HE.ZS
11788        JI.UEM.1524.LE.ZS
11789        JI.UEM.1524.MA.ZS
11790        JI.UEM.1524.RU.ZS
11791        JI.UEM.1524.UR.ZS
11792           JI.UEM.1524.ZS
11793        JI.UEM.1564.FE.ZS
11794        JI.UEM.1564.HE.ZS
11795        JI.UEM.1564.LE.ZS
11796        JI.UEM.1564.MA.ZS
11797        JI.UEM.1564.OL.ZS
11798        JI.UEM.1564.RU.ZS
11799        JI.UEM.1564.UR.ZS
11800        JI.UEM.1564.YG.ZS
11801           JI.UEM.1564.ZS
11930              lm_ub.bi_q1
11931            lm_ub.cov_pop
11932            lm_ub.gen_pop
15182 per_lm_alllm.adq_pop_tot
15234  per_lm_alllm.ben_q1_tot
15282 per_lm_alllm.cov_pop_tot
15286  per_lm_alllm.cov_q1_tot
15290  per_lm_alllm.cov_q2_tot
15294  per_lm_alllm.cov_q3_tot
15298  per_lm_alllm.cov_q4_tot
15302  per_lm_alllm.cov_q5_tot
21947     SL.UEM.1524.FE.NE.ZS
21948        SL.UEM.1524.FE.ZS
21949     SL.UEM.1524.FM.NE.ZS
21950        SL.UEM.1524.FM.ZS
21951     SL.UEM.1524.MA.NE.ZS
21952        SL.UEM.1524.MA.ZS
21953        SL.UEM.1524.NE.ZS
21954           SL.UEM.1524.ZS
21955        SL.UEM.ADVN.FE.ZS
21956        SL.UEM.ADVN.MA.ZS
21957           SL.UEM.ADVN.ZS
21958        SL.UEM.BASC.FE.ZS
21959        SL.UEM.BASC.MA.ZS
21960           SL.UEM.BASC.ZS
21961        SL.UEM.INTM.FE.ZS
21962        SL.UEM.INTM.MA.ZS
21963           SL.UEM.INTM.ZS
21964        SL.UEM.LTRM.FE.ZS
21965        SL.UEM.LTRM.MA.ZS
21966           SL.UEM.LTRM.ZS
21973        SL.UEM.PRIM.FE.ZS
21974        SL.UEM.PRIM.MA.ZS
21975           SL.UEM.PRIM.ZS
21976        SL.UEM.SECO.FE.ZS
21977        SL.UEM.SECO.MA.ZS
21978           SL.UEM.SECO.ZS
21979        SL.UEM.TERT.FE.ZS
21980        SL.UEM.TERT.MA.ZS
21981           SL.UEM.TERT.ZS
21983     SL.UEM.TOTL.FE.NE.ZS
21984        SL.UEM.TOTL.FE.ZS
21985     SL.UEM.TOTL.MA.NE.ZS
21986        SL.UEM.TOTL.MA.ZS
21987        SL.UEM.TOTL.NE.ZS
21988           SL.UEM.TOTL.ZS
22714                 UNEMPSA_
                                                                                                                 name
11786                                     Youth unemployment rate, aged 15-24, female (% of female youth labor force)
11787       Youth unemployment rate, aged 15-24, above primary education (% of youth labor force with high education)
11788    Youth unemployment rate, aged 15-24, primary education and below (% of youth labor force with low education)
11789                                         Youth unemployment rate, aged 15-24, male (% of male youth labor force)
11790                                       Youth unemployment rate, aged 15-24, rural (% of rural youth labor force)
11791                                       Youth unemployment rate, aged 15-24, urban (% of urban youth labor force)
11792                                       Youth unemployment rate, aged 15-24, total (% of total youth labor force)
11793                                  Unemployment rate, aged 15-64, female (% of female labor force in working age)
11794    Unemployment rate, aged 15-64, above primary education (% of labor force with high education in working age)
11795 Unemployment rate, aged 15-64, primary education and below (% of labor force with low education in working age)
11796                                      Unemployment rate, aged 15-64, male (% of male labor force in working age)
11797                                                     Unemployment rate, aged 25-64 (% of labor force aged 25-64)
11798                                    Unemployment rate, aged 15-64, rural (% of rural labor force in working age)
11799                                    Unemployment rate, aged 15-64, urban (% of urban labor force in working age)
11800                                                     Unemployment rate, aged 15-24 (% of labor force aged 15-24)
11801                                    Unemployment rate, aged 15-64, total (% of total labor force in working age)
11930            Benefit incidence of unemployment benefits and ALMP to poorest quintile (% of total U/ALMP benefits)
11931                                                    Coverage of unemployment benefits and ALMP (% of population)
11932                     Generosity of unemployment benefits and ALMP (% of total welfare of beneficiary households)
15182                       Adequacy of unemployment benefits and ALMP (% of total welfare of beneficiary households)
15234            Benefit incidence of unemployment benefits and ALMP to poorest quintile (% of total U/ALMP benefits)
15282                                                    Coverage of unemployment benefits and ALMP (% of population)
15286                                Coverage of unemployment benefits and ALMP in poorest quintile (% of population)
15290                                    Coverage of unemployment benefits and ALMP in 2nd quintile (% of population)
15294                                    Coverage of unemployment benefits and ALMP in 3rd quintile (% of population)
15298                                    Coverage of unemployment benefits and ALMP in 4th quintile (% of population)
15302                                Coverage of unemployment benefits and ALMP in richest quintile (% of population)
21947                             Unemployment, youth female (% of female labor force ages 15-24) (national estimate)
21948                          Unemployment, youth female (% of female labor force ages 15-24) (modeled ILO estimate)
21949                                         Ratio of female to male youth unemployment rate (%) (national estimate)
21950                           Ratio of female to male youth unemployment rate (% ages 15-24) (modeled ILO estimate)
21951                                 Unemployment, youth male (% of male labor force ages 15-24) (national estimate)
21952                              Unemployment, youth male (% of male labor force ages 15-24) (modeled ILO estimate)
21953                               Unemployment, youth total (% of total labor force ages 15-24) (national estimate)
21954                            Unemployment, youth total (% of total labor force ages 15-24) (modeled ILO estimate)
21955                  Unemployment with advanced education, female (% of female labor force with advanced education)
21956                      Unemployment with advanced education, male (% of male labor force with advanced education)
21957                           Unemployment with advanced education (% of total labor force with advanced education)
21958                        Unemployment with basic education, female (% of female labor force with basic education)
21959                            Unemployment with basic education, male (% of male labor force with basic education)
21960                                 Unemployment with basic education (% of total labor force with basic education)
21961          Unemployment with intermediate education, female (% of female labor force with intermediate education)
21962              Unemployment with intermediate education, male (% of male labor force with intermediate education)
21963                   Unemployment with intermediate education (% of total labor force with intermediate education)
21964                                                       Long-term unemployment, female (% of female unemployment)
21965                                                           Long-term unemployment, male (% of male unemployment)
21966                                                                Long-term unemployment (% of total unemployment)
21973                                          Unemployment with primary education, female (% of female unemployment)
21974                                              Unemployment with primary education, male (% of male unemployment)
21975                                                   Unemployment with primary education (% of total unemployment)
21976                                        Unemployment with secondary education, female (% of female unemployment)
21977                                            Unemployment with secondary education, male (% of male unemployment)
21978                                                 Unemployment with secondary education (% of total unemployment)
21979                                         Unemployment with tertiary education, female (% of female unemployment)
21980                                             Unemployment with tertiary education, male (% of male unemployment)
21981                                                  Unemployment with tertiary education (% of total unemployment)
21983                                              Unemployment, female (% of female labor force) (national estimate)
21984                                           Unemployment, female (% of female labor force) (modeled ILO estimate)
21985                                                  Unemployment, male (% of male labor force) (national estimate)
21986                                               Unemployment, male (% of male labor force) (modeled ILO estimate)
21987                                                Unemployment, total (% of total labor force) (national estimate)
21988                                             Unemployment, total (% of total labor force) (modeled ILO estimate)
22714                                                                                    Unemployment rate,Percent,,,

Look carefully at:

  • Indicator code
  • Description
  • Units

Example codes:

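  • NY.GDP.MKTP.KD.ZG: GDP growth (annual %)
  • FP.CPI.TOTL.ZG: Inflation, consumer prices (annual %)
  • SL.UEM.TOTL.ZS: Unemployment, total (% of total labor force) (modeled ILO estimate)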
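
A long result list such as the unemployment search is easier to read once it is filtered. The search output shown above has the columns `indicator` and `name`, so ordinary subsetting should apply to it. The sketch below uses four rows copied by hand from that output, so it runs without the WDI package, and it keeps only total unemployment as a modeled ILO estimate.

```r
# A few rows copied from the WDIsearch("unemployment") output above.
hits <- data.frame(
  indicator = c("SL.UEM.TOTL.FE.ZS", "SL.UEM.TOTL.MA.ZS",
                "SL.UEM.TOTL.NE.ZS", "SL.UEM.TOTL.ZS"),
  name = c(
    "Unemployment, female (% of female labor force) (modeled ILO estimate)",
    "Unemployment, male (% of male labor force) (modeled ILO estimate)",
    "Unemployment, total (% of total labor force) (national estimate)",
    "Unemployment, total (% of total labor force) (modeled ILO estimate)"
  ),
  stringsAsFactors = FALSE
)

# Keep rows whose description starts with "Unemployment, total"
# and that are modeled ILO estimates rather than national estimates.
keep <- grepl("^Unemployment, total", hits$name) &
  grepl("modeled ILO estimate", hits$name, fixed = TRUE)

hits[keep, ]
```

The codes SL.UEM.TOTL.ZS and SL.UEM.TOTL.NE.ZS differ only in the "NE" segment, yet the descriptions show they come from different estimation sources. This is why the description should be read before a code is used.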

7.4 Step 2 — Download 3 Variables for 5 Countries (2000–2023)

macro_panel <- WDI(
  country = c("TR", "GR", "BG", "RO", "PL"),
  indicator = c(
    gdp_growth = "NY.GDP.MKTP.KD.ZG",
    inflation = "FP.CPI.TOTL.ZG",
    unemployment = "SL.UEM.TOTL.ZS"
  ),
  start = 2000,
  end = 2023
)

glimpse(macro_panel)
Rows: 120
Columns: 7
$ country      <chr> "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria…
$ iso2c        <chr> "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG", "BG…
$ iso3c        <chr> "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "BGR", "…
$ year         <int> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 200…
$ gdp_growth   <dbl> 4.5872254, 3.8237036, 5.8719311, 5.2371536, 6.5104258, 7.…
$ inflation    <dbl> 10.3162621, 7.3609393, 5.8101437, 2.3486417, 6.1471307, 5…
$ unemployment <dbl> 16.218, 19.921, 18.110, 13.733, 12.037, 10.083, 8.951, 6.…

You now have a cross-country panel dataset.
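
Before using a panel, it is worth checking its shape: each country-year pair should appear once, each country should have the same number of years, and missing values should be known. The sketch below shows these checks in base R on a small toy panel with placeholder values (not real data). The same lines could be applied to `macro_panel` by swapping the data frame name.

```r
# Toy panel with placeholder values (NOT real data): 2 countries x 3 years.
toy_panel <- expand.grid(
  year = 2000:2002,
  country = c("A", "B"),
  stringsAsFactors = FALSE
)
toy_panel$gdp_growth <- c(1.0, 2.0, NA, 4.0, 5.0, 6.0)

# 1. Each country-year pair should appear exactly once.
no_duplicates <- anyDuplicated(toy_panel[, c("country", "year")]) == 0

# 2. Balanced panel: every country has the same number of rows.
rows_per_country <- table(toy_panel$country)
is_balanced <- length(unique(as.vector(rows_per_country))) == 1

# 3. Number of missing values in each column.
missing_per_column <- colSums(is.na(toy_panel))

no_duplicates
is_balanced
missing_per_column
```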


7.5 Step 3 — Clean and Save Properly

library(dplyr)

macro_panel_clean <- macro_panel |>
  select(country, year, gdp_growth, inflation, unemployment)

write_csv(macro_panel_clean, "data/macro_panel_clean.csv")

8 In-Class Assignment

  1. Choose ONE new indicator using WDIsearch()

  2. Download it for:

    • Turkey
    • Two neighboring countries

  3. Years: 2000–2023

  4. Save the cleaned data to data/

  5. Write 4–6 sentences:

    • What trend do you observe?
    • Any structural breaks?
    • Does Turkey differ?
    • Possible economic explanation?

9 Submission

Upload:

  • scripts/week02_workflow.R
  • Clean CSV file
  • Rendered Quarto report
  • Short interpretation
  • AI-use log (if used)

10 Reflection

Why is an API-based workflow superior to downloading Excel files manually?

Write 3 sentences in your report.


11 By the End of Week 2

You now:

  • Work inside a semester-long research project
  • Use professional folder structure
  • Import data reproducibly
  • Build panel datasets
  • Save clean versions systematically

This is how real economic research is done.