Data collection is a pivotal part of data science. It is therefore important to know how to identify relevant data stored in various formats and locations, acquire it properly, and store it for further processing.
Statistical offices, central banks, and other international organizations widely use SDMX to disseminate their data, and interest in implementing SDMX among other institutions keeps growing [1].
In this vignette, we collect SDMX-based data from those institutions using the RJSDMX package.
Statistical Data and Metadata eXchange (SDMX) was initiated by seven institutions, the so-called SDMX Sponsors: BIS, ECB, Eurostat, IMF, OECD, UNSD, and the World Bank. When it first emerged in 2002, SDMX aimed to replace the fragmented data exchange mechanisms among these institutions. Today SDMX is more than a data exchange format: it has become an information model for any data or statistical domain within an organization [2].
In SDMX, each dataset is uniquely identified by an SDMX key (also called a series key, or just “key”), which is built from a set of dimensions defined in a Data Structure Definition (DSD). A DSD is expressed in terms of dimensions, measures (observation values), and attributes.
For example, the IMF publishes an annual Financial Access Survey (FAS) that provides data on access to and use of basic financial services. The SDMX key for the value of mobile money transactions (% of GDP) in Indonesia is DS-FAS.A.ID.FCMTVG_GDP_PT. The DSD for this data consists of 3 dimensions:
1. Frequency, with “A” for Annual
2. Area/Country, with “ID” for Indonesia
3. Indicator, with “FCMTVG_GDP_PT” for Use of Financial Services, Value of mobile money transactions during the reference year
(The complete codelist values for the DSD above are listed further below.)
Note that the first element of an SDMX key (here: DS-FAS) is often called a dataflow.
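To make the structure of the key concrete, here is a minimal base R sketch that splits the example key into its dataflow and dimension codes. It assumes that neither the dataflow id nor any code contains a dot, and it takes the dimension names (FREQ, REF_AREA, INDICATOR) from the FAS DSD shown later in this vignette.

```r
# Split an SDMX key into its dataflow and dimension codes (base R only).
# Assumption: neither the dataflow id nor any code contains a "." itself.
key <- "DS-FAS.A.ID.FCMTVG_GDP_PT"
parts <- strsplit(key, ".", fixed = TRUE)[[1]]

# The first element is the dataflow
dataflow <- parts[1]

# The remaining elements are the codes, in the order the DSD defines its dimensions
codes <- setNames(parts[-1], c("FREQ", "REF_AREA", "INDICATOR"))

print(dataflow)
print(codes)
```

Because the codes are positional, swapping two of them yields a different (and usually invalid) key.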
Make sure Java is installed on the computer before using RJSDMX.
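One quick way to check this from R is to look for a java executable on the PATH. This is only a rough sketch: a Java installation may also be found through other means (for example the JAVA_HOME environment variable), so a negative result here does not necessarily mean that loading RJSDMX will fail.

```r
# Look for a 'java' executable on the PATH (base R only).
java_path <- unname(Sys.which("java"))
java_found <- nzchar(java_path)

if (java_found) {
  message("Java found at: ", java_path)
} else {
  message("No 'java' executable on the PATH; install a Java runtime before loading RJSDMX.")
}
```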
Execute the following to load RJSDMX (some tidyverse functionality will be used to manipulate RJSDMX outputs):
library(RJSDMX)
library(tidyverse)
By default, RJSDMX provides resources for 21 SDMX data providers, which can be listed by calling the getProviders function:
getProviders()
## [1] "ABS" "ECB" "EUROSTAT" "ILO"
## [5] "ILO_Legacy" "IMF2" "IMF_SDMX_CENTRAL" "INEGI"
## [9] "INSEE" "ISTAT" "ISTAT_CENSUS_AGR" "ISTAT_CENSUS_IND"
## [13] "ISTAT_CENSUS_POP" "NBB" "OECD" "OECD_RESTR"
## [17] "StatsEE" "UIS" "UNDATA" "WB"
## [21] "WITS"
providers | desc |
---|---|
ABS | Australian Bureau of Statistics |
ECB | European Central Bank |
EUROSTAT | Eurostat/Statistical Office of the European Union |
ILO | International Labour Organization |
IMF2 | International Monetary Fund (new endpoint) |
IMF_SDMX_CENTRAL | International Monetary Fund |
INEGI | The National Institute of Statistics and Geography, Mexico |
INSEE | The National Institute of Statistics and Economic Studies, France |
ISTAT | The Italian National Institute of Statistics, Italy |
NBB | National Bank of Belgium |
OECD | Organisation for Economic Co-operation and Development |
StatsEE | Statistics Estonia |
UIS | UNESCO Institute for Statistics |
UNDATA | United Nations |
WB | World Bank |
WITS | World Integrated Trade Solution |
To add another data provider, first locate its SDMX API request path/URL. For example, FAO’s SDMX API address is http://data.fao.org/sdmx, and UNICEF’s is https://api.data.unicef.org/SDMX/rest. Then call the addProvider function to add the new data provider:
addProvider("FAO","http://data.fao.org/sdmx")
addProvider("UNICEF", "https://api.data.unicef.org/SDMX/rest")
getProviders()
## [1] "ABS" "ECB" "EUROSTAT" "FAO"
## [5] "ILO" "ILO_Legacy" "IMF2" "IMF_SDMX_CENTRAL"
## [9] "INEGI" "INSEE" "ISTAT" "ISTAT_CENSUS_AGR"
## [13] "ISTAT_CENSUS_IND" "ISTAT_CENSUS_POP" "NBB" "OECD"
## [17] "OECD_RESTR" "StatsEE" "UIS" "UNDATA"
## [21] "UNICEF" "WB" "WITS"
If you are looking for a particular dataset and already know its SDMX key, skip this step and jump to step 3.3.
If the SDMX key is unknown, we first need to retrieve the dataflows published by the data provider. Use the wildcard character * to match a string of any length. For instance, to look for dataflows whose names contain ‘Financial’ and ‘Access’:
org <- "IMF2"
#to retrieve all data flows in IMF: data_flows <- getFlows(org)
#to retrieve particular dataflows based on a certain pattern
data_flows <- getFlows(org, pattern = "*Financial*Access*")
data_flows
## $`DS-FAS_2015`
## [1] "Financial Access Survey (FAS), 2015"
##
## $`DS-FAS_2018`
## [1] "Financial Access Survey (FAS), 2018"
##
## $`DS-FAS`
## [1] "Financial Access Survey (FAS)"
##
## $`DS-FAS_2016`
## [1] "Financial Access Survey (FAS), 2016"
##
## $`DS-FAS_2017`
## [1] "Financial Access Survey (FAS), 2017"
We found that there are 5 dataflows related to Financial Access Survey (FAS) from IMF. We are going to use the DS-FAS dataflow.
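To illustrate roughly how such a wildcard pattern selects dataflows, here is an offline sketch in base R. It is not how RJSDMX matches patterns internally; it only mimics the idea with glob2rx() from the utils package, applied to a few dataflow ids and names copied from the outputs in this vignette.

```r
# Offline illustration only; RJSDMX is not called here.
# Dataflow ids (names) and descriptions (values) copied from outputs in this vignette.
flows <- c(
  "DS-FAS"         = "Financial Access Survey (FAS)",
  "DS-FAS_2018"    = "Financial Access Survey (FAS), 2018",
  "DS-BOP_2017M09" = "Balance of Payments (BOP), 2017 M09"
)

# glob2rx() translates a wildcard pattern into a regular expression
rx <- utils::glob2rx("*Financial*Access*")

# Keep only the dataflows whose description matches the pattern
matched <- flows[grepl(rx, flows)]
print(names(matched))
```

Only the two FAS dataflows survive the filter; the Balance of Payments dataflow is dropped because its description contains neither word.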
After that, we need to retrieve the DSD, which consists of dimensions and their corresponding codelists. Unfortunately, organizations name their dataflow ids inconsistently:
1. Case 1: id_dataflow, e.g. OECD, IMF
getFlows("IMF2")[1]
## $`DS-BOP_2017M09`
## [1] "Balance of Payments (BOP), 2017 M09"
2. Case 2: provider, id_dataflow, version, e.g. World Bank, ECB
getFlows("WB")[2]
## $`WB,WDI,1.0`
## [1] "World Development Indicators"
getFlows("ECB")[1]
## $`ECB,SHS,1.0`
## [1] "Securities Holding Statistics"
So we need to parse the dataflow name and then call the getDimensions function to get the list of dimensions:
#parse the "Financial Access Survey (FAS)" dataflow name (third element in the output above)
flow_parts <- strsplit(names(data_flows[3]), ",")[[1]]
#Case 2 names look like "provider,id_dataflow,version", so take the second element;
#Case 1 names contain only the id, so take the first
idx <- if (length(flow_parts) > 1) 2 else 1
fas_dataflow <- flow_parts[idx]
#get the list of dimensions of the "Financial Access Survey (FAS)" DSD
fas_dimensions <- getDimensions(org, fas_dataflow)
fas_dimensions
## [[1]]
## [[1]]$FREQ
## [1] "IMF/CL_FREQ"
##
##
## [[2]]
## [[2]]$REF_AREA
## [1] "IMF/CL_AREA_FAS"
##
##
## [[3]]
## [[3]]$INDICATOR
## [1] "IMF/CL_INDICATOR_FAS"
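The parsing step above can also be wrapped in a small helper that handles both naming cases. This is a sketch only: extract_flow_id is a hypothetical name, not an RJSDMX function, and it assumes that a Case 2 name always has the form provider,id_dataflow,version.

```r
# Hypothetical helper (not part of RJSDMX): return the bare dataflow id.
# Assumption: Case 2 names always look like "provider,id_dataflow,version".
extract_flow_id <- function(flow_name) {
  parts <- strsplit(flow_name, ",", fixed = TRUE)[[1]]
  if (length(parts) > 1) parts[2] else parts[1]
}

# Dataflow names taken from the outputs shown earlier in this vignette
flow_names <- c("DS-FAS", "WB,WDI,1.0", "ECB,SHS,1.0")
flow_ids <- vapply(flow_names, extract_flow_id, character(1), USE.NAMES = FALSE)
print(flow_ids)
```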
We found that there are 3 dimensions in the Financial Access Survey (FAS) dataset. Next, fetch the codelist for each dimension:
#use flatten to remove first level of indices
fas_codelist <- map(.x = names(flatten(fas_dimensions)),
flow = fas_dataflow,
provider = org,
.f = getCodes) %>% set_names(names(flatten(fas_dimensions)))
Here are the codelist values for IMF’s Financial Access Survey (FAS), one table per dimension (frequency, area, and indicator):
id <chr> | name <chr> |
---|---|
A | Annual |
Q | Quarterly |
B | Bi-annual |
D | Daily |
W | Weekly |
M | Monthly |
id <chr> | name <chr> |
---|---|
1C_ALLC | All Countries |
AE | United Arab Emirates |
AF | Afghanistan |
AG | Antigua and Barbuda |
AI | Anguilla |
AL | Albania |
AM | Armenia |
AO | Angola |
AR | Argentina |
AT | Austria |
id <chr> | name <chr> |
---|---|
FCNAMFA_NUM | Key Indicators, Use of Financial Services, Number of loan accounts with all microfinance institutions per 1,000 adults |
FCMIBTV_XDC | Use of Financial Services, Mobile and internet banking (for commercial banks only), Value of mobile and internet banking transactions (during the reference year), domestic currency |
FCLODCS_XDC | Use of Financial Services, Liabilities: Outstanding Deposits, Commercial banks, of which: SME deposits, Domestic Currency |
FCSODCHM_XDC | Use of Financial Services, Assets: Outstanding Loans, Commercial banks, of which: household sector loans, of which: loans to men, Domestic Currency |
FCNODMFHM_NUM | Use of Financial Services, Number of Loan Accounts, Deposit-taking microfinance institutions, of which: household sector loan accounts, of which: men-owned loan accounts |
FCRODU_PE_NUM | Use of Financial Services, Number of Borrowers, Credit unions and credit cooperatives |
FCAODMFH_NUM | Use of Financial Services, Number of Deposit Accounts, Deposit-taking microfinance institutions, of which: household sector deposit accounts |
FCAODC_NUM | Use of Financial Services, Number of Deposit Accounts, Commercial banks |
FCNODCS_NUM | Use of Financial Services, Number of Loan Accounts, Commercial banks, of which: SME loan accounts |
FCDODC_PE_NUM | Use of Financial Services, Number of Depositors, Commercial banks |
Use the following to retrieve a particular dataset by its SDMX key. For instance, to collect the value of mobile money transactions (% of GDP) in Indonesia (DS-FAS.A.ID.FCMTVG_GDP_PT):
#Get entire "DS-FAS.A.ID.FCMTVG_GDP_PT" data series without metadata (dimensions with their respective
#codelist and attributes)
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT")))
emoney_trx_id
## # A tibble: 12 x 3
## ID TIME_PERIOD OBS_VALUE
## <chr> <chr> <dbl>
## 1 DS-FAS.A.ID.FCMTVG_GDP_PT 2007 0.000133
## 2 DS-FAS.A.ID.FCMTVG_GDP_PT 2008 0.00155
## 3 DS-FAS.A.ID.FCMTVG_GDP_PT 2009 0.00926
## 4 DS-FAS.A.ID.FCMTVG_GDP_PT 2010 0.0101
## 5 DS-FAS.A.ID.FCMTVG_GDP_PT 2011 0.0125
## 6 DS-FAS.A.ID.FCMTVG_GDP_PT 2012 0.0229
## 7 DS-FAS.A.ID.FCMTVG_GDP_PT 2013 0.0305
## 8 DS-FAS.A.ID.FCMTVG_GDP_PT 2014 0.0314
## 9 DS-FAS.A.ID.FCMTVG_GDP_PT 2015 0.0458
## 10 DS-FAS.A.ID.FCMTVG_GDP_PT 2016 0.0570
## 11 DS-FAS.A.ID.FCMTVG_GDP_PT 2017 0.0911
## 12 DS-FAS.A.ID.FCMTVG_GDP_PT 2018 0.318
#Get "DS-FAS.A.ID.FCMTVG_GDP_PT" data series without metadata for a certain period
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT", start = "2015", end = "2019")))
emoney_trx_id
## # A tibble: 4 x 3
## ID TIME_PERIOD OBS_VALUE
## <chr> <chr> <dbl>
## 1 DS-FAS.A.ID.FCMTVG_GDP_PT 2015 0.0458
## 2 DS-FAS.A.ID.FCMTVG_GDP_PT 2016 0.0570
## 3 DS-FAS.A.ID.FCMTVG_GDP_PT 2017 0.0911
## 4 DS-FAS.A.ID.FCMTVG_GDP_PT 2018 0.318
#Get entire "DS-FAS.A.ID.FCMTVG_GDP_PT" data series with metadata (including attributes)
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT"), meta = TRUE))
emoney_trx_id
## # A tibble: 12 x 11
## ID FREQ REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 2 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 3 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 4 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 5 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 6 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 7 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 8 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 9 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 10 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 11 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 12 DS-F~ A ID FCMTVG_G~ 0 IMF,DS-FAS,1.0 ~ P1Y
## # ... with 4 more variables: IS_NUMERIC <lgl>, IS_ERROR <lgl>,
## # TIME_PERIOD <chr>, OBS_VALUE <dbl>
Note that we use the results from steps 3.1 and 3.2 above to construct the SDMX key of a dataset. For example, since the Financial Access Survey (FAS) has 3 dimensions, its SDMX key has the form id_dataflow.codelist_dimension_1.codelist_dimension_2.codelist_dimension_3. The order of the dimensions matters.
To retrieve multiple datasets, use + to select more than one codelist value within a dimension, or use * to select all codelist values within the dimension.
#Get FCMTVG_GDP_PT and FCBODCA_NUM (number of commercial bank branches) data for 4 ASEAN countries: Indonesia,
#Malaysia, Singapore, and Thailand
emoney_trx_asean <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID+MY+SG+TH.FCMTVG_GDP_PT+FCBODCA_NUM"),
meta = TRUE))
emoney_trx_asean
## # A tibble: 102 x 12
## ID FREQ REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 2 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 3 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 4 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 5 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 6 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 7 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 8 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 9 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## 10 DS-F~ A ID FCBODCA_~ 0 IMF,DS-FAS,1.0 ~ P1Y
## # ... with 92 more rows, and 5 more variables: IS_NUMERIC <lgl>,
## # IS_ERROR <lgl>, TIME_PERIOD <chr>, OBS_VALUE <dbl>, OBS_STATUS <chr>
#Get all FAS indicators for 4 ASEAN countries: Indonesia, Malaysia, Singapore, and Thailand
all_indicators <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID+MY+SG+TH.*"), meta = TRUE))
all_indicators
## # A tibble: 6,170 x 12
## ID FREQ REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 2 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 3 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 4 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 5 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 6 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 7 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 8 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 9 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## 10 DS-F~ A ID FCAA_NUM 0 IMF,DS-FAS,1.0 ~ P1Y
## # ... with 6,160 more rows, and 5 more variables: IS_NUMERIC <lgl>,
## # IS_ERROR <lgl>, TIME_PERIOD <chr>, OBS_VALUE <dbl>, OBS_STATUS <chr>
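The key syntax used above (codes within a dimension joined by +, dimensions joined by .) can also be assembled programmatically instead of typed by hand. The following is a minimal sketch; build_sdmx_key is a hypothetical helper, not an RJSDMX function, and it relies on the caller passing the dimensions in DSD order.

```r
# Hypothetical helper (not part of RJSDMX): assemble an SDMX key.
# Each argument after `dataflow` is a character vector of codes for one dimension,
# passed in the order defined by the DSD.
build_sdmx_key <- function(dataflow, ...) {
  dims <- list(...)
  # Join the codes within one dimension with "+"
  dim_parts <- vapply(dims, function(codes) paste(codes, collapse = "+"), character(1))
  # Join the dataflow and the dimensions with "."
  paste(c(dataflow, dim_parts), collapse = ".")
}

asean <- c("ID", "MY", "SG", "TH")
key_two <- build_sdmx_key("DS-FAS", "A", asean, c("FCMTVG_GDP_PT", "FCBODCA_NUM"))
key_all <- build_sdmx_key("DS-FAS", "A", asean, "*")
print(key_two)
print(key_all)
```

The resulting strings can be passed to getTimeSeries in place of the hand-written keys.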
For this dataset, we perform the following steps:
1. Indicators are variables, so spread them into columns
2. For consistency, change the column names to lower case
3. Convert frequency and country/area into factors
4. Convert the time period to an integer, since the data is annual
#spread the indicators into multiple columns
all_indicators <- pivot_wider(all_indicators, id_cols = c(FREQ, REF_AREA, TIME_PERIOD),
names_from = INDICATOR, values_from = OBS_VALUE)
#change column name case to lower case
names(all_indicators) <- tolower(names(all_indicators))
#convert the dimensions (freq and area) into factor
all_indicators$freq <- as_factor(all_indicators$freq)
all_indicators$ref_area <- as_factor(all_indicators$ref_area)
#since this is annual data, convert time_period to integer rather than to a date-time
all_indicators$time_period <- as.integer(all_indicators$time_period)
glimpse(all_indicators %>% filter(time_period>2008))
## Observations: 40
## Variables: 168
## $ freq <fct> A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,...
## $ ref_area <fct> ID, ID, ID, ID, ID, ID, ID, ID, ID, ID, MY, MY, MY...
## $ time_period <int> 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 20...
## $ fcaa_num <dbl> 14.11555, 13.03989, 16.44817, 35.72946, 42.01644, ...
## $ fcac_num <dbl> 23944, 22444, 28817, 63671, 76136, 90678, 99286, 1...
## $ fcak_num <dbl> 13.21726, 12.38925, 15.90720, 35.14686, 42.02763, ...
## $ fcaodc_num <dbl> 83693845, 98694633, 109165197, 123638075, 15498352...
## $ fcaodca_num <dbl> 493.3948, 573.4126, 623.0933, 693.8043, 855.2926, ...
## $ fcaodch_num <dbl> 80873565, 95661508, 105599930, 117499661, 15003721...
## $ fcaodcha_num <dbl> 476.7686, 555.7903, 602.7435, 659.3581, 827.9959, ...
## $ fcaodmf_num <dbl> NA, NA, NA, NA, NA, NA, NA, 51589, 44462, 45852, N...
## $ fcaofilp_num <dbl> 1378643, 1654935, 1996023965, 2480798622, 28042487...
## $ fcaofilpa_num <dbl> 8.127424, 9.615117, 11392.909732, 13921.186838, 15...
## $ fcbamfa_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcbamfk_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcbodc_num <dbl> 12837, 13837, 25646, 29945, 31847, 32739, 32949, 3...
## $ fcbodca_num <dbl> 7.639044, 8.110134, 14.706723, 16.871199, 17.64132...
## $ fcbodck_num <dbl> 7.152912, 7.705471, 14.223022, 16.596102, 17.64602...
## $ fcbodd_num <dbl> 3867, 4196, 4536, 4826, 5080, 5334, 6428, 6528, 66...
## $ fcbodmf_num <dbl> NA, NA, NA, NA, NA, NA, 20, 51, 51, 252, NaN, NaN,...
## $ fcbofnmf_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcccc_num <dbl> NA, NA, NA, 14817168, 15091684, 16043347, 16863842...
## $ fcccca_num <dbl> NA, NA, NA, 83.14765, 83.28502, 87.05718, 89.96230...
## $ fccdc_num <dbl> NA, NA, NA, 73219365, 83170125, 98638287, 11294881...
## $ fccdca_num <dbl> NA, NA, NA, 410.8759, 458.9829, 535.2481, 602.5398...
## $ fcdodmf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, 51589, 44462, 45852, N...
## $ fcdodu_pe_num <dbl> 500863, 692659, 604548, 2944916, 3172714, 3314757,...
## $ fcdodua_num <dbl> 2.952705, 4.024326, 3.450640, 16.525616, 17.508951...
## $ fcdofilp_pe_num <dbl> 38597121, 34564028, 33859313, 46490489, 49812658, ...
## $ fciodc_num <dbl> 121, 122, 120, 120, 120, 119, 118, 116, 115, 115, ...
## $ fciodd_num <dbl> 1872, 1856, 1824, 1811, 1798, 1806, 1799, 1799, 17...
## $ fciodmf_num <dbl> NA, NA, NA, NA, NA, NA, 20, 129, 180, 183, NaN, Na...
## $ fciodu_num <dbl> 3624, 3624, 3163, 8761, 11850, 12008, NA, NA, NA, ...
## $ fciofi_num <dbl> 142, 142, 139, 137, 141, 141, 146, 146, 152, 151, ...
## $ fciofia_num <dbl> 0.08371232, 0.08250154, 0.07933845, 0.07687857, 0....
## $ fciofmfn_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fclodc_xdc <dbl> 1973041660.0, 2338823971.0, 2784911764.0, 32251977...
## $ fclodcg_gdp_pt <dbl> 35.19390, 34.07312, 35.55936, 37.43394, 38.38169, ...
## $ fclodd_xdc <dbl> 2.680249e+07, 3.291554e+07, 4.030476e+07, 4.780742...
## $ fclodmf_xdc <dbl> NA, NA, NA, NA, NA, NA, 31780.0, 108192.1, 210833....
## $ fclodu_xdc <dbl> 5738120, 6159620, 6825620, 6820383, 12490000, 1291...
## $ fclodug_gdp_pt <dbl> 0.10235305, 0.08973631, 0.08715345, 0.07916222, 0....
## $ fclofilp_xdc <dbl> 114955829.3, 140399517.8, 171102183.1, 188842350.0...
## $ fclofinp_xdc <dbl> 9171557.81, 10860395.46, 12838553.17, 22520055.42,...
## $ fcmaab_xdc <dbl> NA, NA, NA, 2.546101e+05, 3.454827e+05, 4.943537e+...
## $ fcmaabg_gdp_pt <dbl> NA, NA, NA, 2.955186e-03, 3.619085e-03, 4.677081e-...
## $ fcmar_num <dbl> 3016272, 7914018, 14299726, 21869946, 36225373, 35...
## $ fcmara_num <dbl> 17.7816290, 45.9801875, 81.6200057, 122.7248361, 1...
## $ fcmibt_num <dbl> NA, NA, NA, NA, 1926975397, 2780117378, 3431900610...
## $ fcmibta_num <dbl> NA, NA, NA, NA, 10634.21, 15085.95, 18307.91, 2293...
## $ fcmibtv_xdc <dbl> NA, NA, NA, NA, 12584415901, 11743408711, 12348281...
## $ fcmibtvg_gdp_pt <dbl> NA, NA, NA, NA, 131.8274, 111.1044, 107.1311, 116....
## $ fcmt_num <dbl> 17436631, 26541982, 41060149, 100623916, 137900779...
## $ fcmta_num <dbl> 102.793019, 154.208053, 234.363204, 564.658625, 76...
## $ fcmtv_xdc <dbl> 5.192126e+05, 6.934670e+05, 9.812970e+05, 1.971550...
## $ fcmtvg_gdp_pt <dbl> 9.261395e-03, 1.010276e-02, 1.252977e-02, 2.288321...
## $ fcnamfa_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnodc_num <dbl> 32641631, 34382012, 37933259, 39440522, 38974819, ...
## $ fcnodca_num <dbl> 192.4301, 199.7584, 216.5155, 221.3234, 215.0866, ...
## $ fcnodch_num <dbl> 32178542, 33829981, 37419521, 38922826, 38374247, ...
## $ fcnodcha_num <dbl> 189.7000, 196.5511, 213.5832, 218.4183, 211.7723, ...
## $ fcnodcs_num <dbl> NA, 6732564, 7241480, 7926250, 9462228, 10975824, ...
## $ fcnodmf_num <dbl> NA, NA, NA, NA, NA, NA, NA, 41303, 41653, 28139, N...
## $ fcnofnmf_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfh_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfhf_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfhm_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcramfa_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrodc_pe_num <dbl> 38238384, 46122634, 51238897, 59637795, 67730731, ...
## $ fcrodca_num <dbl> 225.4242, 267.9710, 292.4615, 334.6619, 373.7790, ...
## $ fcrodch_pe_num <dbl> NA, NA, 29478683, 30916019, 33678568, 35494286, 36...
## $ fcrodcha_num <dbl> NA, NA, 168.25849, 173.48755, 185.85866, 192.60523...
## $ fcrodchf_pe_num <dbl> NA, NA, 9890274, 10976298, 12645268, 13662033, 143...
## $ fcrodchffa_num <dbl> NA, NA, 112.87002, 123.17004, 139.55769, 148.27236...
## $ fcrodchm_pe_num <dbl> NA, NA, 19588409, 19939721, 21033300, 21832253, 22...
## $ fcrodchmma_num <dbl> NA, NA, 223.68056, 223.82114, 232.16736, 236.93726...
## $ fcrodcs_pe_num <dbl> NA, NA, 10707581, 11279588, 12740796, 14193993, 14...
## $ fcrodcsp_pt <dbl> NA, NA, 49.20715, 39.27190, 37.41553, 38.27464, 34...
## $ fcrodmf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, 41303, 41653, 28139, N...
## $ fcrodu_pe_num <dbl> 5706913, 6769181, 6577194, 2944916, 3172714, 33147...
## $ fcrodua_num <dbl> 33.64359, 39.32872, 37.54132, 16.52562, 17.50895, ...
## $ fcrofnmf_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfh_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfhf_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfhm_num <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcroodc_num <dbl> NA, NA, NA, NA, NA, 19708, 69548, 133811, 204960, ...
## $ fcroodca_num <dbl> NA, NA, NA, NA, NA, 10.694295, 37.101262, 70.30621...
## $ fcroodck_num <dbl> NA, NA, NA, NA, NA, 10.878961, 38.391009, 73.86465...
## $ fcsamfg_pt <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsodc_xdc <dbl> 1437929582.0, 1765844906.0, 2200094225.0, 27078619...
## $ fcsodcg_gdp_pt <dbl> 25.64890, 25.72568, 28.09207, 31.42937, 34.49432, ...
## $ fcsodch_xdc <dbl> NA, NA, NA, 619264802.0, 688050812.8, 832751802.9,...
## $ fcsodchg_gdp_pt <dbl> NA, NA, NA, 7.187628, 7.207638, 7.878666, 7.949208...
## $ fcsodcs_xdc <dbl> 737385000.0, 926782000.0, 458163635.0, 526396576.0...
## $ fcsodcsg_gdp_pt <dbl> 13.153019, 13.501807, 5.850098, 6.109733, 6.377696...
## $ fcsodd_xdc <dbl> 2.958759e+07, 3.590470e+07, 4.377545e+07, 5.337192...
## $ fcsodmf_xdc <dbl> NA, NA, NA, NA, NA, NA, 23300.0, 155533.6, 330181....
## $ fcsodu_xdc <dbl> 8457488, 9564468, 10643468, 9389257, 20790000, 202...
## $ fcsodug_gdp_pt <dbl> 0.1508595, 0.1393398, 0.1359019, 0.1089784, 0.2177...
## $ fcsofnmf_xdc <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfh_xdc <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfhf_xdc <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfhm_xdc <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcaodmfh_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodmfhf_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodmfhm_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodu_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodua_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaoduh_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodus_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaofi_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcbodu_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcbodua_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcboduk_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodc_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodca_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodch_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodcha_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchffa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchm_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchmma_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodmfh_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodmfhf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodmfhm_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdoduh_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodus_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdofi_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodch_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 375818.3, ...
## $ fclodchg_gdp_pt <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 52.72002, ...
## $ fcloddh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1288.8960,...
## $ fclodmfh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodmfhf_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodmfhm_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcloduh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodus_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclofi_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 130578.5, ...
## $ fcnodchf_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchffa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchm_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchmma_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodmfh_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodmfhf_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodmfhm_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodu_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodua_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnoduh_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodus_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfh_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfhf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfhm_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcroduh_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodus_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsoddh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1394.086, ...
## $ fcsodmfh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodmfhf_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodmfhm_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsoduh_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodus_xdc <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcmaa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmaaa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoaa_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoak_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmor_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmora_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmork_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcaofiln_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
Use ggplot2 to generate graphics:
#value of mobile money transaction (% of GDP) in Indonesia, Malaysia, Thailand, and Singapore
all_indicators %>%
#filter ASEAN countries only
filter(ref_area %in% c("ID","SG","MY","TH")) %>%
#select the needed column
select(ref_area, time_period, fcmtvg_gdp_pt) %>%
ggplot(mapping = aes(x = time_period, y = fcmtvg_gdp_pt, group = ref_area, color = ref_area)) +
geom_line(size = 1.5)+
geom_point() +
#add value label for Indonesia and Thailand dataset
geom_text(mapping = aes(label = ifelse(time_period > 2015 & fcmtvg_gdp_pt>0.05,round(fcmtvg_gdp_pt,2),"")), hjust=0, vjust=0) +
ggtitle(label="Value of Mobile Money Transaction (% of GDP)", subtitle = "(Region: South East Asia)") +
xlab("year") +
ylab("% of GDP")
The graph above shows that mobile money transactions in Thailand and Indonesia have grown significantly in the last 5 years.
#Number of Commercial Bank branches in Indonesia, Malaysia, Thailand, and Singapore
all_indicators %>%
#filter ASEAN countries only
filter(ref_area %in% c("ID","SG","MY","TH")) %>%
#select particular column
select(ref_area, time_period, fcbodca_num) %>%
ggplot(mapping = aes(x = time_period, y=fcbodca_num, group=1)) +
geom_line(size=1.5, color="steelblue")+
facet_wrap(~ ref_area) +
ggtitle(label="Number of Commercial Bank Branches per 100,000 Adults", subtitle = "(Region: South East Asia)") +
xlab("year") +
ylab("number of branches")
On the other hand, the number of commercial bank branches per 100,000 adults has been steadily decreasing within the same period.
RJSDMX enables us to explore the data published by data providers from within the R console, and to fetch data simply by providing an SDMX key.
[1] 2019 SDMX Survey Result. https://sdmx.org/wp-content/uploads/SDMX-implementation-status.pptx
[2] Reinhold Stahl, Patricia Staab. 2018. Measuring the Data Universe: Data Integration Using Statistical Data and Metadata Exchange. Cham, Switzerland: Springer.