• 1. Introduction
    • 1.1. SDMX Short Introduction
  • 2. Setting Up
  • 3. Data Collection
    • 3.1. Fetch The DataFlows
    • 3.2. Fetch The DSD
    • 3.3. Fetch The Data
  • 4. Exploratory Data Analysis
    • 4.1. Data Munging
    • 4.2. Data Visualisation
  • 5. Conclusion
  • 6. Reference

1. Introduction

Data collection is a pivotal part of data science. It is therefore important to know how to identify relevant data stored in various formats and locations, acquire it properly, and store it for further processing.

Statistical offices, central banks, and other international organizations widely use SDMX to disseminate their data, and interest in implementing SDMX among other institutions is continuously growing [1].

In this vignette, we collect SDMX-based data from those institutions using the RJSDMX package.

1.1. SDMX Short Introduction

Statistical Data and Metadata eXchange (SDMX) was initiated by seven institutions, the so-called SDMX Sponsors: BIS, ECB, Eurostat, IMF, OECD, UNSD, and the World Bank. When it first emerged in 2002, SDMX aimed to replace the fragmented data exchange mechanisms among these institutions. Today, SDMX is no longer merely a data exchange format; it has become an information model for any data or statistical domain within an organization [2].

In SDMX, each dataset is uniquely identified by an SDMX key (also called a series key, or just “key”), which is built from the set of dimensions in a Data Structure Definition (DSD). A DSD is expressed in terms of dimensions, measures (observation values), and attributes.

For example, the IMF publishes an annual Financial Access Survey (FAS) that provides data on access to and use of basic financial services. The SDMX key for the value of mobile money transactions (% of GDP) in Indonesia is DS-FAS.A.ID.FCMTVG_GDP_PT. The DSD for this data consists of 3 dimensions:
1. Frequency, with “A” for Annual
2. Area/Country, with “ID” for Indonesia
3. Indicator, with “FCMTVG_GDP_PT” for Use of Financial Services, Value of mobile money transactions during the reference year
(The codelist values for this DSD are listed in section 3.2.)
Note that the first element of an SDMX key (in this case DS-FAS) is often called a dataflow.
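
As a minimal base R sketch of how such a key decomposes, the snippet below splits the example key into its dataflow and dimension values. The helper name split_sdmx_key is hypothetical and not part of RJSDMX; the dimension names FREQ, REF_AREA and INDICATOR are the ones reported for the FAS DSD in section 3.2.

```r
# Hypothetical helper (not part of RJSDMX): split an SDMX key into its
# dataflow and dimension values. Dimension names are supplied by the caller.
split_sdmx_key <- function(key, dimensions) {
  parts <- strsplit(key, ".", fixed = TRUE)[[1]]
  # a key has one element for the dataflow plus one per dimension
  stopifnot(length(parts) == length(dimensions) + 1)
  c(list(dataflow = parts[1]), as.list(setNames(parts[-1], dimensions)))
}

parsed <- split_sdmx_key("DS-FAS.A.ID.FCMTVG_GDP_PT",
                         c("FREQ", "REF_AREA", "INDICATOR"))
str(parsed)
```

The position of each element matters: the second element is always read as the frequency, the third as the area, and so on.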

2. Setting Up

Make sure Java is installed on the computer before using RJSDMX.
Execute the following to load RJSDMX (some tidyverse functionality will be used to manipulate RJSDMX outputs):

library(RJSDMX)
library(tidyverse)

By default, RJSDMX provides access to 21 SDMX data providers, which can be listed by calling the getProviders function:

getProviders()
##  [1] "ABS"              "ECB"              "EUROSTAT"         "ILO"             
##  [5] "ILO_Legacy"       "IMF2"             "IMF_SDMX_CENTRAL" "INEGI"           
##  [9] "INSEE"            "ISTAT"            "ISTAT_CENSUS_AGR" "ISTAT_CENSUS_IND"
## [13] "ISTAT_CENSUS_POP" "NBB"              "OECD"             "OECD_RESTR"      
## [17] "StatsEE"          "UIS"              "UNDATA"           "WB"              
## [21] "WITS"

Data Providers Description

Provider          Description
ABS               Australian Bureau of Statistics
ECB               European Central Bank
EUROSTAT          Eurostat/Statistical Office of the European Union
ILO               International Labour Organization
IMF2              International Monetary Fund (new endpoint)
IMF_SDMX_CENTRAL  International Monetary Fund
INEGI             The National Institute of Statistics and Geography, Mexico
INSEE             The National Institute of Statistics and Economic Studies, France
ISTAT             The Italian National Institute of Statistics, Italy
NBB               National Bank of Belgium
OECD              Organisation for Economic Co-operation and Development
StatsEE           Statistics Estonia
UIS               UNESCO Institute for Statistics
UNDATA            United Nations
WB                World Bank
WITS              World Integrated Trade Solution

To add another data provider, first locate its SDMX API request path/URL. For example, FAO’s SDMX API address is http://data.fao.org/sdmx and UNICEF’s is https://api.data.unicef.org/SDMX/rest. Then call the addProvider function to add the new data provider:

addProvider("FAO","http://data.fao.org/sdmx")
addProvider("UNICEF", "https://api.data.unicef.org/SDMX/rest")
getProviders()
##  [1] "ABS"              "ECB"              "EUROSTAT"         "FAO"             
##  [5] "ILO"              "ILO_Legacy"       "IMF2"             "IMF_SDMX_CENTRAL"
##  [9] "INEGI"            "INSEE"            "ISTAT"            "ISTAT_CENSUS_AGR"
## [13] "ISTAT_CENSUS_IND" "ISTAT_CENSUS_POP" "NBB"              "OECD"            
## [17] "OECD_RESTR"       "StatsEE"          "UIS"              "UNDATA"          
## [21] "UNICEF"           "WB"               "WITS"
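
When a script is run repeatedly, it can help to check which providers are still missing before calling addProvider. The following is a minimal sketch under that assumption: the helper providers_to_add is hypothetical, and a short hard-coded vector stands in for the result of getProviders() so that the snippet runs without RJSDMX or Java.

```r
# Providers we would like to have available (name -> endpoint), as given above
wanted <- c(FAO = "http://data.fao.org/sdmx",
            UNICEF = "https://api.data.unicef.org/SDMX/rest")

# Hypothetical helper: keep only the providers that are not registered yet.
# `registered` would normally be the character vector returned by getProviders().
providers_to_add <- function(wanted, registered) {
  wanted[!names(wanted) %in% registered]
}

# Stand-in for getProviders(), shortened for illustration
registered <- c("ABS", "ECB", "EUROSTAT", "FAO", "IMF2")
todo <- providers_to_add(wanted, registered)
todo

# With RJSDMX loaded, each remaining provider could then be registered:
# for (p in names(todo)) addProvider(p, todo[[p]])
```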

3. Data Collection

3.1. Fetch The DataFlows

If you are looking for a particular dataset and already know its SDMX key, skip this step and jump to step 3.3.
If the SDMX key is unknown, we first need to retrieve the dataflows published by the data provider. Use the wildcard character * to match any string of any length. For instance, to look for dataflows that contain ‘Financial’ and ‘Access’:

org <- "IMF2"
#to retrieve all data flows in IMF: data_flows <- getFlows(org)
#to retrieve particular dataflows based on a certain pattern
data_flows <- getFlows(org, pattern = "*Financial*Access*")
data_flows
## $`DS-FAS_2015`
## [1] "Financial Access Survey (FAS), 2015"
## 
## $`DS-FAS_2018`
## [1] "Financial Access Survey (FAS), 2018"
## 
## $`DS-FAS`
## [1] "Financial Access Survey (FAS)"
## 
## $`DS-FAS_2016`
## [1] "Financial Access Survey (FAS), 2016"
## 
## $`DS-FAS_2017`
## [1] "Financial Access Survey (FAS), 2017"

We found 5 dataflows related to the Financial Access Survey (FAS) from the IMF. We are going to use the DS-FAS dataflow.
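
To illustrate what the * wildcard does, the sketch below applies the same pattern locally to a small named list shaped like the getFlows output. It only mimics the wildcard semantics with glob2rx from the utils package; it is not how RJSDMX implements the pattern argument, and matching against both the id and the description is an assumption made for this example.

```r
# Dataflow ids and descriptions copied from the getFlows() outputs in this vignette
flows <- list(
  "DS-FAS_2015"    = "Financial Access Survey (FAS), 2015",
  "DS-FAS_2018"    = "Financial Access Survey (FAS), 2018",
  "DS-FAS"         = "Financial Access Survey (FAS)",
  "DS-BOP_2017M09" = "Balance of Payments (BOP), 2017 M09"
)

# glob2rx() converts a wildcard pattern into a regular expression
rx <- utils::glob2rx("*Financial*Access*")

# Keep the flows whose id or description matches the pattern
matches <- flows[grepl(rx, names(flows)) | grepl(rx, unlist(flows))]
names(matches)
```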

3.2. Fetch The DSD

After that, we need to retrieve the DSD, which consists of dimensions and their corresponding codelists. Unfortunately, the naming of dataflow ids is inconsistent between organizations:
1. Case 1: id_dataflow, e.g. OECD, IMF

getFlows("IMF2")[1]
## $`DS-BOP_2017M09`
## [1] "Balance of Payments (BOP), 2017 M09"


2. Case 2: provider, id_dataflow, version, e.g. World Bank, ECB

getFlows("WB")[2]
## $`WB,WDI,1.0`
## [1] "World Development Indicators"
getFlows("ECB")[1]
## $`ECB,SHS,1.0`
## [1] "Securities Holding Statistics"


So, we need to parse the dataflow id and use the getDimensions function to get the list of dimensions:

#parse the "Financial Access Survey (FAS)" dataflow id (third entry in data_flows)
fas_dataflow <- strsplit(names(data_flows)[3], ",")[[1]]

#Case 2 ids have three parts (provider, id_dataflow, version): take the second part
#Case 1 ids have a single part: take it as is
idx <- if (length(fas_dataflow) > 1) 2 else 1
fas_dataflow <- fas_dataflow[idx]

#get the list of dimensions of the "Financial Access Survey (FAS)" DSD
fas_dimensions <- getDimensions(org, fas_dataflow)
fas_dimensions
## [[1]]
## [[1]]$FREQ
## [1] "IMF/CL_FREQ"
## 
## 
## [[2]]
## [[2]]$REF_AREA
## [1] "IMF/CL_AREA_FAS"
## 
## 
## [[3]]
## [[3]]$INDICATOR
## [1] "IMF/CL_INDICATOR_FAS"
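
The id parsing above can also be wrapped in a small reusable function. This is a sketch only: extract_flow_id is a hypothetical name, not an RJSDMX function, and it assumes that flow names follow one of the two cases described in this section.

```r
# Hypothetical helper (not part of RJSDMX): extract the dataflow id from a
# flow name that is either "id" (Case 1) or "provider,id,version" (Case 2).
extract_flow_id <- function(flow_name) {
  parts <- strsplit(flow_name, ",", fixed = TRUE)[[1]]
  if (length(parts) > 1) parts[2] else parts[1]
}

# Flow names copied from the getFlows() outputs above
ids <- vapply(c("DS-FAS", "WB,WDI,1.0", "ECB,SHS,1.0"),
              extract_flow_id, character(1), USE.NAMES = FALSE)
ids
```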

We found that there are 3 dimensions in the Financial Access Survey (FAS) dataset. Next, fetch the codelist for each dimension:

#use flatten to remove first level of indices
fas_codelist <- map(.x = names(flatten(fas_dimensions)),
                      flow = fas_dataflow,
                      provider = org,
                      .f = getCodes) %>% set_names(names(flatten(fas_dimensions)))
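
For display or joins, each codelist can be turned into a two-column table. The sketch below assumes that a single getCodes result is a named list mapping code ids to descriptions; the helper codelist_to_df is hypothetical, and the sample values are the frequency codes of the FAS dataflow.

```r
# Assumed shape of one getCodes() result: a named list, code id -> description.
# Values are the frequency codes listed for the FAS dataflow.
freq_codes <- list(A = "Annual", Q = "Quarterly", B = "Bi-annual",
                   D = "Daily", W = "Weekly", M = "Monthly")

# Hypothetical helper: turn one codelist into a two-column data frame
codelist_to_df <- function(codes) {
  data.frame(id = names(codes),
             name = unlist(codes, use.names = FALSE),
             stringsAsFactors = FALSE)
}

freq_df <- codelist_to_df(freq_codes)
freq_df

# Applied to every dimension at once (fas_codelist as built above):
# codelist_tables <- lapply(fas_codelist, codelist_to_df)
```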

Codelist

Here are the codelist values for IMF’s Financial Access Survey (FAS), one table per dimension; for the area and indicator codelists, only the first ten entries are shown:

FREQ (IMF/CL_FREQ)
id  name
A   Annual
Q   Quarterly
B   Bi-annual
D   Daily
W   Weekly
M   Monthly

REF_AREA (IMF/CL_AREA_FAS)
id       name
1C_ALLC  All Countries
AE       United Arab Emirates
AF       Afghanistan
AG       Antigua and Barbuda
AI       Anguilla
AL       Albania
AM       Armenia
AO       Angola
AR       Argentina
AT       Austria

INDICATOR (IMF/CL_INDICATOR_FAS)
id             name
FCNAMFA_NUM    Key Indicators, Use of Financial Services, Number of loan accounts with all microfinance institutions per 1,000 adults
FCMIBTV_XDC    Use of Financial Services, Mobile and internet banking (for commercial banks only), Value of mobile and internet banking transactions (during the reference year), domestic currency
FCLODCS_XDC    Use of Financial Services, Liabilities: Outstanding Deposits, Commercial banks, of which: SME deposits, Domestic Currency
FCSODCHM_XDC   Use of Financial Services, Assets: Outstanding Loans, Commercial banks, of which: household sector loans, of which: loans to men, Domestic Currency
FCNODMFHM_NUM  Use of Financial Services, Number of Loan Accounts, Deposit-taking microfinance institutions, of which: household sector loan accounts, of which: men-owned loan accounts
FCRODU_PE_NUM  Use of Financial Services, Number of Borrowers, Credit unions and credit cooperatives
FCAODMFH_NUM   Use of Financial Services, Number of Deposit Accounts, Deposit-taking microfinance institutions, of which: household sector deposit accounts
FCAODC_NUM     Use of Financial Services, Number of Deposit Accounts, Commercial banks
FCNODCS_NUM    Use of Financial Services, Number of Loan Accounts, Commercial banks, of which: SME loan accounts
FCDODC_PE_NUM  Use of Financial Services, Number of Depositors, Commercial banks

3.3. Fetch The Data

Use getTimeSeries to retrieve a particular dataset by its SDMX key. For instance, to collect the value of mobile money transactions (% of GDP) in Indonesia (DS-FAS.A.ID.FCMTVG_GDP_PT):

#Get entire "DS-FAS.A.ID.FCMTVG_GDP_PT" data series without metadata (dimensions with their respective 
#codelist and attributes)   
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT")))
emoney_trx_id
## # A tibble: 12 x 3
##    ID                        TIME_PERIOD OBS_VALUE
##    <chr>                     <chr>           <dbl>
##  1 DS-FAS.A.ID.FCMTVG_GDP_PT 2007         0.000133
##  2 DS-FAS.A.ID.FCMTVG_GDP_PT 2008         0.00155 
##  3 DS-FAS.A.ID.FCMTVG_GDP_PT 2009         0.00926 
##  4 DS-FAS.A.ID.FCMTVG_GDP_PT 2010         0.0101  
##  5 DS-FAS.A.ID.FCMTVG_GDP_PT 2011         0.0125  
##  6 DS-FAS.A.ID.FCMTVG_GDP_PT 2012         0.0229  
##  7 DS-FAS.A.ID.FCMTVG_GDP_PT 2013         0.0305  
##  8 DS-FAS.A.ID.FCMTVG_GDP_PT 2014         0.0314  
##  9 DS-FAS.A.ID.FCMTVG_GDP_PT 2015         0.0458  
## 10 DS-FAS.A.ID.FCMTVG_GDP_PT 2016         0.0570  
## 11 DS-FAS.A.ID.FCMTVG_GDP_PT 2017         0.0911  
## 12 DS-FAS.A.ID.FCMTVG_GDP_PT 2018         0.318
#Get "DS-FAS.A.ID.FCMTVG_GDP_PT" data series without metadata for a certain period
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT", start = "2015", end = "2019")))
emoney_trx_id
## # A tibble: 4 x 3
##   ID                        TIME_PERIOD OBS_VALUE
##   <chr>                     <chr>           <dbl>
## 1 DS-FAS.A.ID.FCMTVG_GDP_PT 2015           0.0458
## 2 DS-FAS.A.ID.FCMTVG_GDP_PT 2016           0.0570
## 3 DS-FAS.A.ID.FCMTVG_GDP_PT 2017           0.0911
## 4 DS-FAS.A.ID.FCMTVG_GDP_PT 2018           0.318
#Get entire "DS-FAS.A.ID.FCMTVG_GDP_PT" data series with metadata (including attributes)
emoney_trx_id <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID.FCMTVG_GDP_PT"), meta = TRUE))
emoney_trx_id
## # A tibble: 12 x 11
##    ID    FREQ  REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
##    <chr> <chr> <chr>    <chr>     <chr>     <chr>            <chr>      
##  1 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  2 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  3 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  4 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  5 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  6 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  7 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  8 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  9 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## 10 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## 11 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## 12 DS-F~ A     ID       FCMTVG_G~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## # ... with 4 more variables: IS_NUMERIC <lgl>, IS_ERROR <lgl>,
## #   TIME_PERIOD <chr>, OBS_VALUE <dbl>

Note that the results from steps 3.1 and 3.2 are used to construct the SDMX key of a dataset. For example, since the Financial Access Survey (FAS) has 3 dimensions, its SDMX key has the form id_dataflow.codelist_dimension_1.codelist_dimension_2.codelist_dimension_3. The order of the elements matters.
To retrieve multiple datasets, use + to select more than one codelist value within a dimension, or use * to select all codelist values within the dimension.

#Get FCMTVG_GDP_PT and FCBODCA_NUM (number of commercial bank branches per 100,000 adults) data for 
#4 ASEAN countries: Indonesia, Malaysia, Singapore, and Thailand
emoney_trx_asean <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID+MY+SG+TH.FCMTVG_GDP_PT+FCBODCA_NUM"), 
                                        meta = TRUE))
emoney_trx_asean
## # A tibble: 102 x 12
##    ID    FREQ  REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
##    <chr> <chr> <chr>    <chr>     <chr>     <chr>            <chr>      
##  1 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  2 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  3 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  4 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  5 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  6 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  7 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  8 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
##  9 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## 10 DS-F~ A     ID       FCBODCA_~ 0         IMF,DS-FAS,1.0 ~ P1Y        
## # ... with 92 more rows, and 5 more variables: IS_NUMERIC <lgl>,
## #   IS_ERROR <lgl>, TIME_PERIOD <chr>, OBS_VALUE <dbl>, OBS_STATUS <chr>
#Get all FAS indicators for 4 ASEAN countries: Indonesia, Malaysia, Singapore, and Thailand
all_indicators <- as_tibble(sdmxdf(getTimeSeries(org,"DS-FAS.A.ID+MY+SG+TH.*"), meta = TRUE))
all_indicators
## # A tibble: 6,170 x 12
##    ID    FREQ  REF_AREA INDICATOR UNIT_MULT CONNECTORS_AUTO~ TIME_FORMAT
##    <chr> <chr> <chr>    <chr>     <chr>     <chr>            <chr>      
##  1 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  2 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  3 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  4 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  5 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  6 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  7 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  8 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
##  9 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
## 10 DS-F~ A     ID       FCAA_NUM  0         IMF,DS-FAS,1.0 ~ P1Y        
## # ... with 6,160 more rows, and 5 more variables: IS_NUMERIC <lgl>,
## #   IS_ERROR <lgl>, TIME_PERIOD <chr>, OBS_VALUE <dbl>, OBS_STATUS <chr>
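
The key strings used above can also be assembled programmatically, which avoids typos when many countries or indicators are involved. This is a minimal sketch: build_sdmx_key is a hypothetical helper, not part of RJSDMX, and using NULL to mean "all values" is a convention chosen only for this example.

```r
# Hypothetical helper (not part of RJSDMX): assemble an SDMX key.
# Each argument after `flow` holds the codes for one dimension, in DSD order;
# several codes are joined with "+", and NULL stands for "all values" ("*").
build_sdmx_key <- function(flow, ...) {
  dims <- list(...)
  parts <- vapply(dims, function(codes) {
    if (is.null(codes)) "*" else paste(codes, collapse = "+")
  }, character(1))
  paste(c(flow, parts), collapse = ".")
}

asean <- c("ID", "MY", "SG", "TH")
key_two <- build_sdmx_key("DS-FAS", "A", asean, c("FCMTVG_GDP_PT", "FCBODCA_NUM"))
key_all <- build_sdmx_key("DS-FAS", "A", asean, NULL)
key_two
key_all
```

The resulting strings can be passed to getTimeSeries in place of the hand-written keys.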

4. Exploratory Data Analysis

4.1. Data Munging

For this dataset, we are going to perform these steps:
1. Indicators are variables, so we need to spread them into columns
2. For consistency, change the column names to lower case
3. Convert frequency and country/area into factors
4. Since the data is annual, convert the time period into an integer

#spread the indicators into multiple columns
all_indicators <- pivot_wider(all_indicators, id_cols = c(FREQ, REF_AREA, TIME_PERIOD), 
                                 names_from = INDICATOR, values_from = OBS_VALUE)

#change column name case to lower case
names(all_indicators) <- tolower(names(all_indicators))

#convert the dimensions (freq and area) into factor
all_indicators$freq <- as_factor(all_indicators$freq)
all_indicators$ref_area <- as_factor(all_indicators$ref_area)
#since it's annual data, convert time_period to integer
all_indicators$time_period <- as.integer(all_indicators$time_period)

glimpse(all_indicators %>% filter(time_period>2008))
## Observations: 40
## Variables: 168
## $ freq             <fct> A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,...
## $ ref_area         <fct> ID, ID, ID, ID, ID, ID, ID, ID, ID, ID, MY, MY, MY...
## $ time_period      <int> 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 20...
## $ fcaa_num         <dbl> 14.11555, 13.03989, 16.44817, 35.72946, 42.01644, ...
## $ fcac_num         <dbl> 23944, 22444, 28817, 63671, 76136, 90678, 99286, 1...
## $ fcak_num         <dbl> 13.21726, 12.38925, 15.90720, 35.14686, 42.02763, ...
## $ fcaodc_num       <dbl> 83693845, 98694633, 109165197, 123638075, 15498352...
## $ fcaodca_num      <dbl> 493.3948, 573.4126, 623.0933, 693.8043, 855.2926, ...
## $ fcaodch_num      <dbl> 80873565, 95661508, 105599930, 117499661, 15003721...
## $ fcaodcha_num     <dbl> 476.7686, 555.7903, 602.7435, 659.3581, 827.9959, ...
## $ fcaodmf_num      <dbl> NA, NA, NA, NA, NA, NA, NA, 51589, 44462, 45852, N...
## $ fcaofilp_num     <dbl> 1378643, 1654935, 1996023965, 2480798622, 28042487...
## $ fcaofilpa_num    <dbl> 8.127424, 9.615117, 11392.909732, 13921.186838, 15...
## $ fcbamfa_num      <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcbamfk_num      <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcbodc_num       <dbl> 12837, 13837, 25646, 29945, 31847, 32739, 32949, 3...
## $ fcbodca_num      <dbl> 7.639044, 8.110134, 14.706723, 16.871199, 17.64132...
## $ fcbodck_num      <dbl> 7.152912, 7.705471, 14.223022, 16.596102, 17.64602...
## $ fcbodd_num       <dbl> 3867, 4196, 4536, 4826, 5080, 5334, 6428, 6528, 66...
## $ fcbodmf_num      <dbl> NA, NA, NA, NA, NA, NA, 20, 51, 51, 252, NaN, NaN,...
## $ fcbofnmf_num     <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcccc_num        <dbl> NA, NA, NA, 14817168, 15091684, 16043347, 16863842...
## $ fcccca_num       <dbl> NA, NA, NA, 83.14765, 83.28502, 87.05718, 89.96230...
## $ fccdc_num        <dbl> NA, NA, NA, 73219365, 83170125, 98638287, 11294881...
## $ fccdca_num       <dbl> NA, NA, NA, 410.8759, 458.9829, 535.2481, 602.5398...
## $ fcdodmf_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, 51589, 44462, 45852, N...
## $ fcdodu_pe_num    <dbl> 500863, 692659, 604548, 2944916, 3172714, 3314757,...
## $ fcdodua_num      <dbl> 2.952705, 4.024326, 3.450640, 16.525616, 17.508951...
## $ fcdofilp_pe_num  <dbl> 38597121, 34564028, 33859313, 46490489, 49812658, ...
## $ fciodc_num       <dbl> 121, 122, 120, 120, 120, 119, 118, 116, 115, 115, ...
## $ fciodd_num       <dbl> 1872, 1856, 1824, 1811, 1798, 1806, 1799, 1799, 17...
## $ fciodmf_num      <dbl> NA, NA, NA, NA, NA, NA, 20, 129, 180, 183, NaN, Na...
## $ fciodu_num       <dbl> 3624, 3624, 3163, 8761, 11850, 12008, NA, NA, NA, ...
## $ fciofi_num       <dbl> 142, 142, 139, 137, 141, 141, 146, 146, 152, 151, ...
## $ fciofia_num      <dbl> 0.08371232, 0.08250154, 0.07933845, 0.07687857, 0....
## $ fciofmfn_num     <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fclodc_xdc       <dbl> 1973041660.0, 2338823971.0, 2784911764.0, 32251977...
## $ fclodcg_gdp_pt   <dbl> 35.19390, 34.07312, 35.55936, 37.43394, 38.38169, ...
## $ fclodd_xdc       <dbl> 2.680249e+07, 3.291554e+07, 4.030476e+07, 4.780742...
## $ fclodmf_xdc      <dbl> NA, NA, NA, NA, NA, NA, 31780.0, 108192.1, 210833....
## $ fclodu_xdc       <dbl> 5738120, 6159620, 6825620, 6820383, 12490000, 1291...
## $ fclodug_gdp_pt   <dbl> 0.10235305, 0.08973631, 0.08715345, 0.07916222, 0....
## $ fclofilp_xdc     <dbl> 114955829.3, 140399517.8, 171102183.1, 188842350.0...
## $ fclofinp_xdc     <dbl> 9171557.81, 10860395.46, 12838553.17, 22520055.42,...
## $ fcmaab_xdc       <dbl> NA, NA, NA, 2.546101e+05, 3.454827e+05, 4.943537e+...
## $ fcmaabg_gdp_pt   <dbl> NA, NA, NA, 2.955186e-03, 3.619085e-03, 4.677081e-...
## $ fcmar_num        <dbl> 3016272, 7914018, 14299726, 21869946, 36225373, 35...
## $ fcmara_num       <dbl> 17.7816290, 45.9801875, 81.6200057, 122.7248361, 1...
## $ fcmibt_num       <dbl> NA, NA, NA, NA, 1926975397, 2780117378, 3431900610...
## $ fcmibta_num      <dbl> NA, NA, NA, NA, 10634.21, 15085.95, 18307.91, 2293...
## $ fcmibtv_xdc      <dbl> NA, NA, NA, NA, 12584415901, 11743408711, 12348281...
## $ fcmibtvg_gdp_pt  <dbl> NA, NA, NA, NA, 131.8274, 111.1044, 107.1311, 116....
## $ fcmt_num         <dbl> 17436631, 26541982, 41060149, 100623916, 137900779...
## $ fcmta_num        <dbl> 102.793019, 154.208053, 234.363204, 564.658625, 76...
## $ fcmtv_xdc        <dbl> 5.192126e+05, 6.934670e+05, 9.812970e+05, 1.971550...
## $ fcmtvg_gdp_pt    <dbl> 9.261395e-03, 1.010276e-02, 1.252977e-02, 2.288321...
## $ fcnamfa_num      <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnodc_num       <dbl> 32641631, 34382012, 37933259, 39440522, 38974819, ...
## $ fcnodca_num      <dbl> 192.4301, 199.7584, 216.5155, 221.3234, 215.0866, ...
## $ fcnodch_num      <dbl> 32178542, 33829981, 37419521, 38922826, 38374247, ...
## $ fcnodcha_num     <dbl> 189.7000, 196.5511, 213.5832, 218.4183, 211.7723, ...
## $ fcnodcs_num      <dbl> NA, 6732564, 7241480, 7926250, 9462228, 10975824, ...
## $ fcnodmf_num      <dbl> NA, NA, NA, NA, NA, NA, NA, 41303, 41653, 28139, N...
## $ fcnofnmf_num     <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfh_num    <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfhf_num   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcnofnmfhm_num   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcramfa_num      <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrodc_pe_num    <dbl> 38238384, 46122634, 51238897, 59637795, 67730731, ...
## $ fcrodca_num      <dbl> 225.4242, 267.9710, 292.4615, 334.6619, 373.7790, ...
## $ fcrodch_pe_num   <dbl> NA, NA, 29478683, 30916019, 33678568, 35494286, 36...
## $ fcrodcha_num     <dbl> NA, NA, 168.25849, 173.48755, 185.85866, 192.60523...
## $ fcrodchf_pe_num  <dbl> NA, NA, 9890274, 10976298, 12645268, 13662033, 143...
## $ fcrodchffa_num   <dbl> NA, NA, 112.87002, 123.17004, 139.55769, 148.27236...
## $ fcrodchm_pe_num  <dbl> NA, NA, 19588409, 19939721, 21033300, 21832253, 22...
## $ fcrodchmma_num   <dbl> NA, NA, 223.68056, 223.82114, 232.16736, 236.93726...
## $ fcrodcs_pe_num   <dbl> NA, NA, 10707581, 11279588, 12740796, 14193993, 14...
## $ fcrodcsp_pt      <dbl> NA, NA, 49.20715, 39.27190, 37.41553, 38.27464, 34...
## $ fcrodmf_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, 41303, 41653, 28139, N...
## $ fcrodu_pe_num    <dbl> 5706913, 6769181, 6577194, 2944916, 3172714, 33147...
## $ fcrodua_num      <dbl> 33.64359, 39.32872, 37.54132, 16.52562, 17.50895, ...
## $ fcrofnmf_num     <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfh_num    <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfhf_num   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcrofnmfhm_num   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcroodc_num      <dbl> NA, NA, NA, NA, NA, 19708, 69548, 133811, 204960, ...
## $ fcroodca_num     <dbl> NA, NA, NA, NA, NA, 10.694295, 37.101262, 70.30621...
## $ fcroodck_num     <dbl> NA, NA, NA, NA, NA, 10.878961, 38.391009, 73.86465...
## $ fcsamfg_pt       <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsodc_xdc       <dbl> 1437929582.0, 1765844906.0, 2200094225.0, 27078619...
## $ fcsodcg_gdp_pt   <dbl> 25.64890, 25.72568, 28.09207, 31.42937, 34.49432, ...
## $ fcsodch_xdc      <dbl> NA, NA, NA, 619264802.0, 688050812.8, 832751802.9,...
## $ fcsodchg_gdp_pt  <dbl> NA, NA, NA, 7.187628, 7.207638, 7.878666, 7.949208...
## $ fcsodcs_xdc      <dbl> 737385000.0, 926782000.0, 458163635.0, 526396576.0...
## $ fcsodcsg_gdp_pt  <dbl> 13.153019, 13.501807, 5.850098, 6.109733, 6.377696...
## $ fcsodd_xdc       <dbl> 2.958759e+07, 3.590470e+07, 4.377545e+07, 5.337192...
## $ fcsodmf_xdc      <dbl> NA, NA, NA, NA, NA, NA, 23300.0, 155533.6, 330181....
## $ fcsodu_xdc       <dbl> 8457488, 9564468, 10643468, 9389257, 20790000, 202...
## $ fcsodug_gdp_pt   <dbl> 0.1508595, 0.1393398, 0.1359019, 0.1089784, 0.2177...
## $ fcsofnmf_xdc     <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfh_xdc    <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfhf_xdc   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcsofnmfhm_xdc   <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, ...
## $ fcaodmfh_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodmfhf_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodmfhm_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodu_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodua_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaoduh_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaodus_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcaofi_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcbodu_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcbodua_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcboduk_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodc_pe_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodca_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodch_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodcha_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchf_pe_num  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchffa_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchm_pe_num  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodchmma_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcdodmfh_pe_num  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodmfhf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodmfhm_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdoduh_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdodus_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcdofi_pe_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodch_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 375818.3, ...
## $ fclodchg_gdp_pt  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 52.72002, ...
## $ fcloddh_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1288.8960,...
## $ fclodmfh_xdc     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodmfhf_xdc    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodmfhm_xdc    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcloduh_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclodus_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fclofi_xdc       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 130578.5, ...
## $ fcnodchf_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchffa_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchm_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodchmma_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcnodmfh_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodmfhf_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodmfhm_num    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodu_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodua_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnoduh_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcnodus_num      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfh_pe_num  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfhf_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodmfhm_pe_num <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcroduh_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcrodus_pe_num   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsoddh_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1394.086, ...
## $ fcsodmfh_xdc     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodmfhf_xdc    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodmfhm_xdc    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsoduh_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcsodus_xdc      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN, ...
## $ fcmaa_num        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmaaa_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoa_num        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoaa_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmoak_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmor_num        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmora_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcmork_num       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ fcaofiln_num     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
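
The output above shows that many indicator columns contain only NA or NaN for these countries. One way to see which indicators are usable is to count the finite observations per column. The sketch below does this on a tiny stand-in data frame whose values are the first three Indonesian rows of three columns printed above; the same lines could be applied to all_indicators.

```r
# Small stand-in for all_indicators: three Indonesian rows of three columns,
# copied from the glimpse() output above
sample_rows <- data.frame(
  ref_area      = c("ID", "ID", "ID"),
  time_period   = c(2009L, 2010L, 2011L),
  fcmtvg_gdp_pt = c(9.261395e-03, 1.010276e-02, 1.252977e-02),
  fcbamfa_num   = c(NaN, NaN, NaN),
  fcmaa_num     = rep(NA_real_, 3)
)

# Count usable (finite) observations per indicator column;
# is.finite() is FALSE for both NA and NaN
indicator_cols <- setdiff(names(sample_rows), c("ref_area", "time_period"))
n_obs <- vapply(sample_rows[indicator_cols],
                function(x) sum(is.finite(x)), integer(1))
n_obs

# Indicators with at least one observation
usable <- names(n_obs)[n_obs > 0]
usable
```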

4.2. Data Visualisation

Use ggplot2 to generate graphics:

#value of mobile money transaction (% of GDP) in Indonesia, Malaysia, Thailand, and Singapore
all_indicators %>%
  #filter ASEAN countries only
  filter(ref_area %in% c("ID","SG","MY","TH")) %>%
  #select the needed column
  select(ref_area, time_period, fcmtvg_gdp_pt) %>%
  ggplot(mapping = aes(x = time_period, y = fcmtvg_gdp_pt, group = ref_area, color = ref_area)) +
  geom_line(size = 1.5)+
  geom_point() + 
  #add value label for Indonesia and Thailand dataset
  geom_text(mapping = aes(label = ifelse(time_period > 2015 & fcmtvg_gdp_pt>0.05,round(fcmtvg_gdp_pt,2),"")), hjust=0, vjust=0) +
  ggtitle(label="Value of Mobile Money Transaction (% of GDP)", subtitle = "(Region: South East Asia)") +
  xlab("year") +
  ylab("% of GDP")

From the graph above, mobile money transactions in Thailand and Indonesia have grown significantly in the last 5 years.
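
For Indonesia, the size of this growth can be checked with simple arithmetic on the values printed in section 3.3. The figures below are copied as displayed there, so they are rounded and the result is approximate; the corresponding values for Thailand are not printed in this vignette and are therefore left out.

```r
# fcmtvg_gdp_pt for Indonesia, copied (rounded as printed) from section 3.3
indonesia <- c("2013" = 0.0305, "2014" = 0.0314, "2015" = 0.0458,
               "2016" = 0.0570, "2017" = 0.0911, "2018" = 0.318)

# Growth factor over the five-year span 2013 -> 2018
growth_factor <- indonesia[["2018"]] / indonesia[["2013"]]
round(growth_factor, 1)

# Year-on-year growth factors (each year divided by the previous one)
yoy <- indonesia[-1] / indonesia[-length(indonesia)]
round(yoy, 2)
```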

#Number of Commercial Bank branches in Indonesia, Malaysia, Thailand, and Singapore
all_indicators %>%
  #filter ASEAN countries only
  filter(ref_area %in% c("ID","SG","MY","TH")) %>%
  #select particular column
  select(ref_area, time_period, fcbodca_num) %>%
  ggplot(mapping = aes(x = time_period, y=fcbodca_num, group=1)) +
  geom_line(size=1.5, color="steelblue")+
  facet_wrap(~ ref_area) +  
  ggtitle(label="Number of Commercial Bank Branches per 100,000 Adults", subtitle = "(Region: South East Asia)") +
  xlab("year") +
  ylab("number of branches")

On the other hand, the number of commercial bank branches per 100,000 adults has been steadily decreasing within the same period.

5. Conclusion

RJSDMX enables us to explore data published by data providers from within the R console, and to fetch data simply by providing an SDMX key.

6. Reference


  1. 2019 SDMX Survey Result. https://sdmx.org/wp-content/uploads/SDMX-implementation-status.pptx

  2. Reinhold Stahl, Patricia Staab. 2018. Measuring the Data Universe: Data Integration Using Statistical Data and Metadata Exchange. Cham, Switzerland: Springer.