For the initial draft, I will use 2 datasets: the World Bank Sovereign ESG dataset and the IMF WB Country Groups dataset. First, I will load each dataset and review the data. Next, I will convert the country_name column in the IMF WB Country Groups dataset into iso3c codes Then, I will join the two datasets using the iso3c codes Finally, I will have a combined dataset that includes ESG indicator data by country, with country group tags.
folder_path <- partial(here, "00_data_raw")
folder_path() %>% list.files()
## [1] "imf_wb_country_groups.csv" "World Bank Sovereign ESG dataset"
WB_sovereign_ESG <- folder_path("World Bank Sovereign ESG dataset/ESGData2.csv") %>%
read_csv()
## Rows: 16013 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Country Name, iso3c, Indicator Name, Indicator Code
## dbl (63): 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, ...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
imf_wb_country_groups <- folder_path("imf_wb_country_groups.csv") %>%
read_csv()
## Rows: 2587 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): country_name, country_group, group_type
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(WB_sovereign_ESG)
## Rows: 16,013
## Columns: 67
## $ `Country Name` <chr> "Arab World", "Arab World", "Arab World", "Arab World…
## $ iso3c <chr> "ARB", "ARB", "ARB", "ARB", "ARB", "ARB", "ARB", "ARB…
## $ `Indicator Name` <chr> "Access to clean fuels and technologies for cooking (…
## $ `Indicator Code` <chr> "EG.CFT.ACCS.ZS", "EG.ELC.ACCS.ZS", "NY.ADJ.DRES.GN.Z…
## $ `1960` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ `1961` <dbl> NA, NA, NA, NA, 30.981194, NA, NA, NA, NA, NA, NA, NA…
## $ `1962` <dbl> NA, NA, NA, NA, 30.982443, NA, NA, NA, NA, NA, NA, NA…
## $ `1963` <dbl> NA, NA, NA, NA, 31.006834, NA, NA, NA, NA, NA, NA, NA…
## $ `1964` <dbl> NA, NA, NA, NA, 31.01778, NA, NA, NA, NA, NA, NA, NA,…
## $ `1965` <dbl> NA, NA, NA, NA, 31.042245, NA, NA, NA, NA, NA, NA, NA…
## $ `1966` <dbl> NA, NA, NA, NA, 31.05018, NA, NA, NA, NA, NA, NA, NA,…
## $ `1967` <dbl> NA, NA, NA, NA, 31.103003, NA, NA, NA, NA, NA, NA, NA…
## $ `1968` <dbl> NA, NA, NA, NA, 31.133345, 14.660731, NA, NA, NA, NA,…
## $ `1969` <dbl> NA, NA, NA, NA, 31.190209, 14.295950, NA, NA, NA, NA,…
## $ `1970` <dbl> NA, NA, 6.8192031, 0.1745661, 31.2542726, 14.6376317,…
## $ `1971` <dbl> NA, NA, 6.8625877, 0.1392865, 31.3863673, 13.9126712,…
## $ `1972` <dbl> NA, NA, 8.9932942, 0.1320823, 31.4997277, 13.7760100,…
## $ `1973` <dbl> NA, NA, 12.2730219, 0.1524257, 31.4965877, 12.3616551…
## $ `1974` <dbl> NA, NA, 25.24123751, 0.08739206, 31.55058649, 7.77632…
## $ `1975` <dbl> NA, NA, 17.17663885, 0.09875801, 31.52950124, 7.98857…
## $ `1976` <dbl> NA, NA, 18.92687785, 0.06634076, 31.59973643, 7.44355…
## $ `1977` <dbl> NA, NA, 19.2409638, 0.1024133, 31.6219972, 7.1412275,…
## $ `1978` <dbl> NA, NA, 16.3069297, 0.1071046, 31.6660778, 7.1654123,…
## $ `1979` <dbl> NA, NA, 32.75748927, 0.07422815, 31.67849388, 5.67074…
## $ `1980` <dbl> NA, NA, 27.88409161, 0.05775981, 31.75886761, 4.75050…
## $ `1981` <dbl> NA, NA, 20.08867813, 0.05511109, 31.45424086, 5.07641…
## $ `1982` <dbl> NA, NA, 10.9940045, 0.1141440, 31.4801559, 5.9800629,…
## $ `1983` <dbl> NA, NA, 9.65310503, 0.08148845, 31.52860052, 6.738849…
## $ `1984` <dbl> NA, NA, 9.7577998, 0.0795408, 31.9421227, 7.3157059, …
## $ `1985` <dbl> NA, NA, 8.32055276, 0.03830571, 32.44202971, 8.068573…
## $ `1986` <dbl> NA, NA, 5.59763556, 0.08965615, 33.02639234, 9.621148…
## $ `1987` <dbl> NA, NA, 7.36173263, 0.08506867, 33.58285192, 10.27551…
## $ `1988` <dbl> NA, NA, 6.5508191, 0.0897014, 34.1868305, 10.5341524,…
## $ `1989` <dbl> NA, NA, 9.05466681, 0.08732953, 34.69763715, 11.08920…
## $ `1990` <dbl> NA, NA, 7.98566651, 0.06761466, 35.10930645, 10.38169…
## $ `1991` <dbl> NA, NA, 8.43729462, 0.07275225, 35.15972003, 12.27688…
## $ `1992` <dbl> NA, NA, 8.51357510, 0.05785795, 35.32097250, 9.225802…
## $ `1993` <dbl> NA, NA, 8.14964549, 0.04460087, 36.09585061, 9.263672…
## $ `1994` <dbl> NA, NA, 7.44516045, 0.04575975, 36.75422959, 10.22704…
## $ `1995` <dbl> NA, NA, 7.67706491, 0.06196782, 37.38644379, 10.38116…
## $ `1996` <dbl> NA, 76.61107378, 8.91043066, 0.05868878, 37.98259419,…
## $ `1997` <dbl> NA, 77.25362114, 7.75296001, 0.05524078, 38.49337516,…
## $ `1998` <dbl> NA, 78.11157597, 5.40270002, 0.07836069, 39.11826905,…
## $ `1999` <dbl> NA, 78.69106128, 7.00074834, 0.03706546, 39.64700699,…
## $ `2000` <dbl> 75.31875484, 80.73614121, 9.72850267, 0.02122214, 39.…
## $ `2001` <dbl> 76.65447960, 81.58623137, 7.51233649, 0.02455567, 39.…
## $ `2002` <dbl> 77.85548458, 81.54022203, 7.39429040, 0.02680793, 39.…
## $ `2003` <dbl> 79.04554536, 82.50815948, 8.71632561, 0.03261723, 39.…
## $ `2004` <dbl> 80.03416635, 82.50368475, 10.19516060, 0.02505279, 40…
## $ `2005` <dbl> 81.02576797, 83.21298163, 12.94744287, 0.02041118, 40…
## $ `2006` <dbl> 81.96581490, 85.45900327, 12.83170086, 0.02157661, 40…
## $ `2007` <dbl> 82.78718917, 83.76443999, 11.44096571, 0.01803353, 40…
## $ `2008` <dbl> 83.47573318, 83.38698965, 13.30778857, 0.02509408, 40…
## $ `2009` <dbl> 84.15742348, 84.31348380, 8.26486020, 0.02885193, 40.…
## $ `2010` <dbl> 84.68334427, 87.11486281, 9.45136425, 0.03003109, 40.…
## $ `2011` <dbl> 85.1927423, 87.3326612, 13.4275913, 0.0309041, 40.172…
## $ `2012` <dbl> 85.64421787, 87.03958839, 12.85922547, 0.03226624, 36…
## $ `2013` <dbl> 85.93256689, 88.99261981, 11.63839653, 0.06262475, 36…
## $ `2014` <dbl> 86.23238390, 88.01535618, 10.29789118, 0.08494782, 36…
## $ `2015` <dbl> 86.47859724, 88.68188567, 6.23770324, 0.09978365, 36.…
## $ `2016` <dbl> 86.72268517, 89.19506235, 5.20389537, 0.09505614, 36.…
## $ `2017` <dbl> 86.93793290, 90.32465949, 6.48008284, 0.09549819, 36.…
## $ `2018` <dbl> 87.04077373, 88.91074940, 8.47907008, 0.05105818, 36.…
## $ `2019` <dbl> 87.23553889, 89.99994555, 7.43714482, 0.06219453, 36.…
## $ `2020` <dbl> 87.30706752, 90.27773508, 4.37637424, 0.07908057, 36.…
## $ `2021` <dbl> NA, NA, NA, NA, NA, 5.237110, NA, NA, NA, NA, NA, NA,…
## $ `2050` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
glimpse(imf_wb_country_groups)
## Rows: 2,587
## Columns: 3
## $ country_name <chr> "Australia", "Austria", "Belgium", "Canada", "Switzerlan…
## $ country_group <chr> "Advanced Economies", "Advanced Economies", "Advanced Ec…
## $ group_type <chr> "IMF", "IMF", "IMF", "IMF", "IMF", "IMF", "IMF", "IMF", …
country_name_regex_to_iso3c <- function(country_name) {
country_name %>%
countrycode(origin = "country.name",
destination = "iso3c",
origin_regex = TRUE)
}
imf_wb_country_groups_iso3c <- imf_wb_country_groups %>%
mutate(iso3c = country_name_regex_to_iso3c(country_name))
## Warning in countrycode_convert(sourcevar = sourcevar, origin = origin, destination = dest, : Some values were not matched unambiguously: Channel Islands, Kosovo
imf_wb_country_groups_iso3c
## # A tibble: 2,587 × 4
## country_name country_group group_type iso3c
## <chr> <chr> <chr> <chr>
## 1 Australia Advanced Economies IMF AUS
## 2 Austria Advanced Economies IMF AUT
## 3 Belgium Advanced Economies IMF BEL
## 4 Canada Advanced Economies IMF CAN
## 5 Switzerland Advanced Economies IMF CHE
## 6 Cyprus Advanced Economies IMF CYP
## 7 Czechia Advanced Economies IMF CZE
## 8 Germany Advanced Economies IMF DEU
## 9 Denmark Advanced Economies IMF DNK
## 10 Spain Advanced Economies IMF ESP
## # … with 2,577 more rows
WB_sovereign_ESG_country_groups <- imf_wb_country_groups_iso3c %>% left_join(WB_sovereign_ESG, by = "iso3c")
WB_sovereign_ESG_country_groups
## # A tibble: 166,201 × 70
## country_…¹ count…² group…³ iso3c Count…⁴ Indic…⁵ Indic…⁶ `1960` `1961` `1962`
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 Australia Advanc… IMF AUS Austra… Access… EG.CFT… NA NA NA
## 2 Australia Advanc… IMF AUS Austra… Access… EG.ELC… NA NA NA
## 3 Australia Advanc… IMF AUS Austra… Adjust… NY.ADJ… NA NA NA
## 4 Australia Advanc… IMF AUS Austra… Adjust… NY.ADJ… NA NA NA
## 5 Australia Advanc… IMF AUS Austra… Agricu… AG.LND… NA 61.8 62.0
## 6 Australia Advanc… IMF AUS Austra… Agricu… NV.AGR… NA NA NA
## 7 Australia Advanc… IMF AUS Austra… Annual… ER.H2O… NA NA NA
## 8 Australia Advanc… IMF AUS Austra… Annual… SI.SPR… NA NA NA
## 9 Australia Advanc… IMF AUS Austra… Cause … SH.DTH… NA NA NA
## 10 Australia Advanc… IMF AUS Austra… Childr… SL.TLF… NA NA NA
## # … with 166,191 more rows, 60 more variables: `1963` <dbl>, `1964` <dbl>,
## # `1965` <dbl>, `1966` <dbl>, `1967` <dbl>, `1968` <dbl>, `1969` <dbl>,
## # `1970` <dbl>, `1971` <dbl>, `1972` <dbl>, `1973` <dbl>, `1974` <dbl>,
## # `1975` <dbl>, `1976` <dbl>, `1977` <dbl>, `1978` <dbl>, `1979` <dbl>,
## # `1980` <dbl>, `1981` <dbl>, `1982` <dbl>, `1983` <dbl>, `1984` <dbl>,
## # `1985` <dbl>, `1986` <dbl>, `1987` <dbl>, `1988` <dbl>, `1989` <dbl>,
## # `1990` <dbl>, `1991` <dbl>, `1992` <dbl>, `1993` <dbl>, `1994` <dbl>, …
write.csv(WB_sovereign_ESG_country_groups,"C:/Users/Gen Shiraishi/Desktop/Sustainable Finance - Application and Methods/Final project/03_data_processed\\WB_sovereign_ESG_country_groups.csv", row.names = FALSE)
Note to self: review the joined dataset and confirm that the country-level data of interest from the ESG dataset was kept through the left join function. Also check to see if the joined dataset should be pivoted to facilitate analysis.