Hello! This document serves as a record of the work I did as a Trainee and Junior Researcher at EMEA to be used in the future by the ever-growing research team. I hope this resource helps with the continuation of some of the projects and dashboards I worked on and provides clarity about the processes I used. If something is unclear, my door is always open to the EMEA team. Feel free to reach out on LinkedIn or by email (elena.stottslee@bse.eu).
When I started at EMEA, there was no system for code management. Everyone was working on their personal computers and using different methods for storing files and code. There also wasn't an agreed-upon coding language. Most of the senior researchers were using Stata because of access through their universities. I used Stata for one project at the beginning and then switched to R when the license from my master's expired.
You will find most of my code in two places: the EMEA Microsoft OneDrive and GitHub.
For data cleaning, data visualization, and other experiments, my code will be in the relevant folders in the Microsoft OneDrive. When I first started, we were using Dropbox instead of OneDrive, but I migrated the most important items over to OneDrive. I mention this because if you are doing a deep dive on some old project, in particular the TRIS, it may be helpful to look in the EMEA Dropbox.
For OneDrive there is an option to work in your personal OneDrive or the EMEA-wide section. For the most part I worked out of My Files, the personal section. However, in case my email account is no longer available, I will copy my work into the main OneDrive. The main thing to note is that, as a result of migrating the files, the file paths will be different, so to run a script you will have to change them. I have linked locations in the EMEA OneDrive throughout this document, but I believe you will only be able to access these links when logged into an EMEA email account.
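One tip (my suggestion, not something the current scripts do): define a single base path at the top of each script so that a future user only has to change one line. A minimal sketch, with a made-up path:

base_path <- "C:/Users/YOURNAME/OneDrive - EMEA"  #adjust to your machine
df_long <- read.csv(file.path(base_path, "df_long.csv"))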
One of the main benefits of using GitHub is the ability to track changes and share code within a team. I started to use GitHub at EMEA when I was building the dashboards, mostly because R Shiny apps are easy to break, so it is very useful to be able to return to a previous stage where the app is working. I used my personal GitHub when building the dashboards (@elena-stotts-lee). However, I created an EMEA GitHub account for code management (@emea-codemanagement) which I have added as a contributor to each of my repositories. This will grant whoever uses it full access and the ability to make changes to the repositories. If someone wishes to add their personal account as a contributor, contact me. The login for the code management GitHub account is info@euromed-economists.org and the password is the same as the desktop computers in the office.
Of the work I have contributed to EMEA, the things I imagine others will need to interact with the most are the dashboards. The dashboards I built were meant to be constantly evolving and I would be very happy to see them transform and be improved over time. I do ask that I be kept in the credits of the About sections if you are still using my original Shiny code.
When asked to build the TRIS Dashboard, it was not immediately apparent that I would be using Shiny. The original Brain Capital dashboard, built by Sara Ronco, was constructed on a platform called Infogram. I explored this option but quickly realised that it would not be possible to use it with the amount of data we hoped to incorporate into the TRIS. Other options I considered were Tableau (easy to use but not free) and Streamlit (free with Python but a steeper learning curve and fewer resources for mapping). I landed on R Shiny. Despite having only coded in Stata and Python previously, this was the best option for myself and EMEA. Shiny is free, with lots of documentation and resources available, and can be easily incorporated into websites.
I mention that I had never coded in R before working at EMEA because I think it really shows in some of my early work. Each dashboard is slightly different from the last because I learned new skills and found more efficient ways of doing things as I grew as a programmer in R. For this reason, I would say that even if you don't have experience in Shiny now, it is entirely possible to learn and start building functional apps quickly. In this section I will go through each dashboard and discuss a little bit about how they are structured. Then at the end of the section, I will add any tips, tricks, or resources I think of as I go.
The TRIS Dashboard is based on a development model created by Rym and Carlo in this paper. I was originally asked to build the TRIS Dashboard to be presented at COP27 in Sharm El Sheikh. For this reason, the starting point for looking at indicators was the monitoring data of the UN SDGs.
The way you download SDG data is a bit cumbersome. This link is where I exported the data. Select your indicators and then all countries and years. If I had been a more advanced programmer back when I was doing this, I probably would have looked into using the SDG API.
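If you want to try the API route, here is an untested sketch of pulling one series in R. The endpoint and parameter names are from the public SDG API documentation at unstats.un.org/sdgapi, and the series code is just an example, so verify both before relying on this:

library(httr)
library(jsonlite)

#request one example series (SI_POV_DAY1) for all countries and years
resp <- GET("https://unstats.un.org/SDGAPI/v1/sdg/Series/Data",
            query = list(seriesCode = "SI_POV_DAY1", pageSize = 10000))
parsed <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
sdg_data <- parsed$data  #the observations arrive under the "data" element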
Following Sandra's lead, I did the first data cleaning of the SDGs in Stata. The do files of this work are in this folder, but I imagine you will not use them. I later went back and did the data cleaning of ALL SDG data (that was available in 2023) in R in a much simpler way. That R script is available to you in the GitHub repository TRIS-datacleaning and is called sdg_datacleaning.R. The file path for the goals data as I was using it led to a folder in the old EMEA Dropbox. I have added the data to a new folder in the shared EMEA drive called SDG Goals Download. If you wish to update the TRIS, it should work to download and rename the Excel files as I have and run this R script.
After exhausting the indicators available from the UN SDGs, I began to look at other open-source data from places like the World Bank, IMF, IRENA, etc. To me, this was data beyond the SDGs, so I collected it in a folder called additional_data, which I have added to the shared OneDrive here.
The script I used to clean the additional_data is available in the TRIS-datacleaning
GitHub repository under the name additional_data.R.
In a separate R script, I merge the final output of the additional data ("additional_data.csv") and the SDG data ("trisvar_long.csv"). I do so in a file entitled merge_for_shiny.R, in the TRIS-datacleaning GitHub repository.
Now, the TRIS Dashboard was the first Shiny app I ever built. As a result, there are many things about the way I did it that I understand to be inefficient. The most notable example of this is the way in which I import the data into the app. You'll notice that the exported file at the end of merge_for_shiny.R is a wide data set going to a second version of the TRIS Dashboard (a repository called V2-TRIS-dashboard). It was my intention to rebuild the TRIS Dashboard using the same method you will see soon in the Brain Capital Dashboard, which I will explain in the next section. However, I never had enough time between tasks to execute it. This could be a nice warm-up exercise for your start at EMEA because I know that Carlo and Rym are interested in having another look at the TRIS and the way it is organized.
So, instead of this nicer, more manageable df_wide.csv that I created in the cleaning files, the current TRIS Dashboard calls data from df_long.csv, a massive file with 5 columns and over 400,000 rows. I added df_long.csv to the OneDrive here. The file paths in the Shiny app are commented out because in order to publish an app, you select the files to upload and it requires you to remove the file paths. As a result, I am not sure how it will go running the app from your computer. But if there are issues, you will always be able to run the app from the computer at my old desk in Room 4 of the office in Sant Pau.
The columns of df_long.csv are as follows: GeoAreaName (the country name), iso3 (generated with the countrycode package; see the note at the bottom of this section), TimePeriod (the year), value, and indicatordescription. The 5 columns above are the integral part of
df_long.csv, however, I needed the data to be connected
with a lot of other information in order to make an interactive app that
is more than just a drop down menu of indicators. I organized that
information in an excel file called titles_pillars_dimensions.xlsx
which I have moved to the OneDrive for you to access. In subsequent
versions of dashboards, I learned how to connect the information in this
type of excel with the data in the app rather than in advance (which is
much less computationally expensive). However, for the TRIS, I didn't know any better than to tack it on to every row of df_long.csv. Hence, the rest of the columns of df_long.csv are the descriptive fields you will see used throughout the app code: pillar, dimension, unit, label, bins, sdgcode, access_source, access_link, data_source, data_link, min, max, and y_axis_title.
The work of creating the df_long.csv with all of this
information is in an R script called shiny_dataprep.R
available here
in the OneDrive.
So! Moving to the code of the actual app. I made another decision here that I have not repeated in any app since: I made my Shiny app out of multiple files rather than a single file. As you may know, to make a Shiny app you open RStudio, go to File, New File, and then select Shiny Web App. This is where you are allowed to select whether you want a single file or multiple files. I chose multiple here, which created the files global.R, ui.R, and server.R. This proved to be nothing but annoying and I suggest you go for a single file.
In global.R I read in the packages needed for the app. I will explain what some of these are as I go. Then I read in my data df_long.csv and I define another dataframe called world. When building a Shiny app with maps, there are many different ways to geolocate your data. It is especially easy if you already have a shape file, but in this case I have country names and isocodes that I wanted to fit onto a map. The solution I found was to use the two packages rnaturalearth and rnaturalearthdata. Using ne_countries() from these packages, I define a shape file (using returnclass = "sf") of the countries of the world. I merge it with df_long using the iso3 codes. Note that I select only the columns for the iso3 code and geometry of the country using select(world, c(iso_a3, geometry)) because there is a lot of information in the world files that we don't need. Thanks to the geometry column, mapping the indicators is not too difficult from here. I then use st_as_sf() from the package sf to convert all of df_long to a shape file with its new geometry element. See the relevant code below so you get the picture:
# relevant packages
library(shiny)
library(tidyverse)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(countrycode)
library(s2)
library(leaflet)
library(tigris)
library(htmlwidgets)
library(shinythemes)
library(leafpop)
library(shinyWidgets)
library(ggiraph)
#read in data
df_long <- read.csv("df_long.csv")
world <- ne_countries(scale = "medium", returnclass = "sf")
#manually fill in Kosovo's iso3 code, which Natural Earth leaves blank (the indices point at Kosovo's iso_a3 cell)
world[120, 45] <- "KOS"
#merge
df_long <- left_join(df_long, select(world, c(iso_a3, geometry)), by = c("iso3" = "iso_a3"))
#drop any unmatched rows
df_long <- df_long[!(is.na(df_long$pillar)), ]
#convert to sf
df_long <- st_as_sf(df_long)
Note: if any of the code samples look terrible, try widening the window you are using. I had to add the chunks of code in this way so that R wouldn’t try to run them since they are nonsense outside of a Shiny app. I apologize if it is difficult to follow.
The two key components of a Shiny app are the UI (User Interface) and the Server. In a Shiny app, the UI defines the layout and elements visible to the user, while the server handles the logic, calculations, and interactions based on user inputs to generate dynamic content. Again, because I selected multiples files for the TRIS app, each of these are a separate file. I explain the UI first.
At the head of the UI is the following code:
tags$head(
  tags$script(src = "js/index.js"),
  tags$link(rel = "icon", type = "image/png", href = "favicon.png")
)
The line I wish to draw your attention to is tags$script(src="js/index.js"). This links to a JavaScript file which is in a folder called js in the www folder. Almost every Shiny app will require you to make a www folder, especially for images and for adding elements in different coding languages. The JS code in the folder is what enabled me to create links between the different tabs on the navbar. If you run the app without it, then the links that say "See the TRIS visualised with our interactive map!" etc. won't work.
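As an aside, if you ever want to drop the JavaScript file, I believe the same tab-switching can be done in pure Shiny with an actionLink() in the UI and updateNavbarPage() in the server. A sketch, with a hypothetical link id and assuming the tab's title is its value:

#ui: actionLink("go_map", "See the TRIS visualised with our interactive map!")
#server: jump to the Interactive Map tab of the navbarPage with id "tris"
observeEvent(input$go_map, {
  updateNavbarPage(session, "tris", selected = "Interactive Map")
})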
In Shiny, one of the main things that determines the layout is the type of page you specify. In the TRIS app and Brain Capital, I used a navbarPage(). See different options in this resource.
navbarPage("TRIS Dashboard", id = "tris",
theme = shinythemes::shinytheme("flatly"),
tags$style(type="text/css",
".shiny-output-error { visibility: hidden; }",
".shiny-output-error:before { visibility: hidden; }"
))
What I wish to draw your attention to here is the tags$style(). What is happening here is that I am suppressing errors that come up in the app. I do this because, when I run my Leaflet map without it, it flashes an error before displaying the map. The error has to do with there being too many values to fit in 9 bins, but it doesn't stop the map from building. However, I draw your attention to it because if you are building a new app, I would not include this code until it is time to publish, so that you can understand what is going on in your app.
For each tab of the app, you define a tabPanel(). In the first one, "About TRIS", I have the pillars and dimensions displayed just as an image. This was a decision taken to make sure the app was ready for COP27, but if you build another version, I suggest making this element interactive. As I mentioned before, to add an image to the app, put it as a jpg or png in the www folder and then add it with a line of code like so: img(src = 'dimensions.jpg', align = "center", width = "100%"). You can also make images clickable links if you stick them in a tags$a()! I use tags$div() to enclose all of the content in very text-heavy tabs, because then you can just write text without using HTML tags to define every line and every paragraph break.
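For example, a clickable image looks something like this (the URL is hypothetical):

tags$a(href = "https://euromed-economists.org", target = "_blank",
       img(src = 'dimensions.jpg', width = "100%"))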
Now, on the Interactive Map tab, a few things to note. I use a sidebar layout, which is pretty common in Shiny apps. The code for selecting the pillars, dimensions, indicators, year, etc. is as follows:
selectInput("pillar", "Select a Pillar",
choices = unique(df_long$pillar)),
selectInput("dimension", "Select a Dimension",
choices = "", selected = ""),
selectInput("indicatordescription", "Select an Indicator",
choices = "", selected = ""),
selectInput("year", "Select a Year",
choices = "", selected = "")
For the pillars, the choices may be filled in because they will always be the same four. However, the dimensions depend on which pillar is selected, the indicators on which dimension is selected, and the years on which data are available for the indicator. Hence I leave the choices and selected options blank and use this corresponding code in the server:
shinyServer(function(session, input, output){
observeEvent(
input$pillar,
updateSelectInput(session, "dimension", "Select a Dimension",
choices = unique(df_long$dimension[df_long$pillar==input$pillar])))
observeEvent(
input$dimension,
updateSelectInput(session, "indicatordescription", "Select an Indicator",
choices = unique(df_long$indicatordescription[df_long$dimension==input$dimension & df_long$pillar==input$pillar])))
observeEvent(
input$indicatordescription,
updateSelectInput(session, "year", "Select a Year",
choices = unique(df_long$TimePeriod[df_long$indicatordescription==input$indicatordescription])))
})
In the function that opens the Shiny server, “input,
output” will be automatically listed, but it is important to add
“session” as well for the code above to work.
observeEvent() in Shiny triggers a reactive expression when
a specific event, such as a user action or input change, occurs. So for
example, in the first block of code I observe what the user has selected
for the pillar using input$pillar referring to the id in
the UI (the first expression - selectInput("pillar", ...)).
Then I updateSelectInput() filtering for the dimensions
under that pillar. I do the same observing the dimension and indicator
selected. This process is super simple when the data is in a long format with all of the information attached; you will see with Brain Capital and Regional Integration that it is a little less intuitive. It is nice that Shiny does all of this updating in real time, so the user doesn't have to press any buttons.
Below all of the selections, the sidebar displays information about the selected indicator. In the UI, the code is as follows:
h5(tags$b("Indicator")),
textOutput("selected_var"),
tags$br(),
h5(tags$b("Units")),
textOutput("unit"),
tags$br(),
h5(tags$b("Access Source")),
uiOutput("accesssource"),
tags$br(),
h5(tags$b("Data Source")),
uiOutput("datasource")
The key information to note here is that the selected indicator and units are textOutput() and the sources are uiOutput(). The latter are uiOutput() because their content contains links. The corresponding server code is:
output$selected_var <- renderText({
paste(unique(d1()$indicatordescription))
})
output$unit <- renderText({
paste(unique(d1()$unit))
})
output$accesssource <- renderUI({
if(unique(d1()$access_source) == "UN SDGs"){
tagList(unique(d1()$access_source), "( Goal ", a(href = unique(d1()$access_link), unique(d1()$sdgcode), target = "_blank"), ")")
} else {
tagList(a(href = unique(d1()$access_link), unique(d1()$access_source), target = "_blank"))
}
})
output$datasource <- renderUI({
if (!(is.na(unique(d1()$data_link)))){
tagList(a(href = unique(d1()$data_link), unique(d1()$data_source), target = "_blank"))
} else {
paste("Not Applicable")
}
})
For the unit and the selected indicator, displaying the text is a simple task of using renderText() and pasting the unique value in the selected data (I will explain in a moment what d1() is). Then for the sources, I use an if/else in the first to have a specific method of linking for the SDGs. The second is an if/else for whether or not there is another original source.
So, d1() is a reactive dataframe that is the result of
the selections made by the user. Reactives in Shiny are objects that
automatically update when their dependencies change, allowing for
dynamic and responsive behavior in the application based on user inputs
or other reactive values. To call a reactive, you always put the name
and curved brackets (“()”). Again, using a long dataframe makes the
process of defining the reactive very simple. I do so as follows:
d1 <- reactive({
d <- df_long %>% filter(TimePeriod == input$year & indicatordescription == input$indicatordescription & GeoAreaName != "World")
return(d)
})
The reactive serves as the data input for the interactive map. To build the map, I used Leaflet. This website has most of the basic documentation for Leaflet you could possibly want. Since I built this app before ChatGPT came out, I used resources like this all the time. The code for my map, which I reused in all of the later dashboards, is:
output$map <- renderLeaflet({
labels <- sprintf(
"<strong>%s</strong><br/>%g %s",
d1()$GeoAreaName, d1()$value, unique(d1()$label)) %>%
lapply(htmltools::HTML)
if(input$pillar == "P1: Transparent governance"){
colpal <- colorBin(palette = "YlGnBu", bins = unique(d1()$bins), domain = d1()$value)
}
else if(input$pillar == "P2: Responsible living"){
colpal <- colorBin(palette = "BuPu", bins = unique(d1()$bins), domain = d1()$value)
}
else if(input$pillar == "P3: Inclusive economies"){
colpal <- colorBin(palette = "YlOrRd", bins = unique(d1()$bins), domain = d1()$value)
}
else if(input$pillar == "P4: Sustainable Energy and Environment"){
colpal <- colorBin(palette = "YlGn", bins = unique(d1()$bins), domain = d1()$value)
}
d1() %>%
st_transform(crs = "+init=epsg:4326") %>%
leaflet() %>%
addProviderTiles(provider = "Esri.NatGeoWorldMap") %>%
addPolygons(label = labels,
layerId = ~GeoAreaName,
stroke = TRUE,
weight = 0.5,
color = "black",
smoothFactor = 0.5,
opacity = 1,
fillOpacity = 1,
fillColor = ~ colpal(d1()$value),
#popup = popupGraph(popupPlot, type = "svg"),
highlightOptions = highlightOptions(weight = 5,
fillOpacity = 1,
color = "blue",
opacity = 0,
bringToFront = TRUE)) %>%
addLegend("topright",
pal = colpal,
values = ~ d1()$value,
title = "Legend",
opacity = 1) %>%
setView(lng = 18.1, lat = 34.5, zoom = 2.5) %>%
addEasyButton(easyButton(
icon="fa-globe", title="Worldwide View",
onClick=JS("function(btn, map){ map.setZoom(1.5); }")))
})
Note that all of the code I have described is encompassed by renderLeaflet({}). Inside, I first define the text for the labels using sprintf(). Figuring out the right way to write "<strong>%s</strong><br/>%g %s" to display what I want took ages, so tread carefully if you want to change it to meet your needs. Next I define a color palette using ColorBrewer. This website is helpful for selecting palettes; the options with ColorBrewer are limited but effective. I condition based on the pillar selected and have a different color palette for each.
Then, to make the map, I name my data source, in this case the reactive d1(). st_transform(crs = "+init=epsg:4326") in Leaflet reprojects spatial data to the WGS 84 coordinate reference system (EPSG:4326), which is commonly used for web mapping, so you won't typically change this between different maps. addProviderTiles() refers to the background of the map. Here is a very nice GitHub app which displays all of the different options. Not all of them work, but some wild ones do, like Thunderforest_SpinalMap. Next we add information to the
map. In this case, we addPolygons() to create a choropleth
map! See the Leaflet documentation on choropleths here.
Other ways to add information to the map can be
addCircles(), addTiles(), and more. In this
case, within addPolygons(), I customize. I add the labels
defined previously, the names of countries using
layerId = ~GeoAreaName, the color palette based on the
value with fillColor = ~ colpal(d1()$value), and so on.
Then I proceed to addLegend(), which will reflect the number of bins I defined in the color palettes. After that, I setView() to set a default amount of zoom centered on specific coordinates (I picked the Mediterranean since we are EMEA), and lastly I add a button with a bit of JavaScript to let the user return to the worldwide view from any level of zoom.
One feature that appears only in the TRIS Dashboard is the Country Plots. Rym requested that I plot the indicators over time to show how they have changed. I wanted to make the plots interactive, specifically so that you can hover over the points and see the values. There are several potential ways of building plots like this (including using plotly, as I did in the Regional Integration Matrix Dashboard). However, in this case (probably because I built this app before ChatGPT was released), I went for the ggiraph package.
To do the selection for the country plots (of pillar, dimension, and indicator), I repeat the same steps as the interactive map with different ID names. Knowing what I know now, this would be a great case for modules so you aren’t repeating the same code.
The code for the plot itself is as follows:
output$countryplots <- renderGirafe({
ylim_max <- c1()$max[1]
ylim_min <- c1()$min[1]
caption_text <- if(unique(c1()$access_source) == "UN SDGs"){
paste0("Source: UN SDGS (Goal ", unique(c1()$sdgcode), ")")
} else {
paste0("Source: ", unique(c1()$access_source))
}
myplot = ggplot(c1(), aes(TimePeriod, value, color = Country,
tooltip = round(value, digits = 2), data_id = value)) +
geom_line(size = 0.8) +
geom_point_interactive(size = 1.5) +
theme_light() +
theme(legend.position = "bottom",
legend.background = element_rect(fill="#ebebeb",
size=0.5, linetype="solid"),
plot.title = element_text(face = "bold"),
axis.title = element_text(face = "bold"),
#text = element_text(family = "sans"),
plot.caption = element_text(hjust = 0.5, face = "italic")) +
xlab("Year") +
ylab(str_wrap(unique(c1()$y_axis_title), width = 48)) +
ylim(ylim_min, ylim_max) +
labs(title = str_wrap(unique(c1()$indicatordescription),
width = 59), caption = caption_text) +
guides(fill=guide_legend(title="Country"))
girafe(ggobj = myplot,
options = list(opts_selection(type = "none")))
})
Within renderGirafe I set the min and max values of the Y axis. Then I create the plot I want in ggplot using the reactive data c1(). Finally, you close with girafe(ggobj = myplot, options = list(opts_selection(type = "none"))) and you have your interactive plot! Such a simple way to make a ggplot more dynamic in the app. I think it is great.
It is likely that during your work, Rym will ask you to update the TRIS Dashboard. Carlo has been working on connecting the concepts to the regenerative economy and wishes to update the indicators in the TRIS accordingly. I suppose you have the choice of continuing with this app, which is a bit inefficient with the long-format data but functional, or trying to build a new app (or debug the one I started here) with wide-format data. Either way, I kindly ask that if you are using an app that has my code, you keep my name somewhere in the credits!
Alright! The next dashboard I built after the TRIS was the Brain Capital Dashboard. The first version I constructed was modeled very closely after the TRIS, with the data in long format. The code for that app is available here on GitHub, but I doubt you will need to refer to it at any point.
The data cleaning for Brain Capital proceeded in a very similar vein to the TRIS. The R scripts for data cleaning are in a GitHub repository called BrainCapital-datacleaning. Sara had done some previous data collection, so for the first dashboard I used her data with additions. However, the data was in an Excel file with each indicator on a different sheet in a different format. So for the second version, I recollected all of the data.
The code for cleaning the data is in this
file entitled bc_datawrangling.R. The data is called
from a folder called bc_data which I have moved here
in the OneDrive for you. You will have to change the file path if you
want to run the code. It is a bit better organized than the TRIS one
with the pillars and dimensions being in order. I merge the data in long
format. I save the long format data for V1 of the BC Dashboard and then
reshape the data wide for V2 of the dashboard.
The V2 app reads in two files: df_wide.csv from the data cleaning file and BC_titles.xlsx, an Excel sheet like the TRIS one that contains information about the indicators. In the UI, the map and the selection are essentially the same. In the server, the first difference appears in the updateSelectInput().
observeEvent(
input$pillar,
updateSelectInput(session, "dimension", "Select a Dimension",
choices = unique(titles$dimension[titles$pillar==input$pillar])))
observeEvent(
input$dimension,
updateSelectInput(session, "indicatordescription", "Select an Indicator",
choices = unique(titles$indicatordescription[titles$dimension==input$dimension & titles$pillar==input$pillar])))
The code is the same, but now the selection is happening from the data frame titles, which has all of the information about the indicators and dimensions. Where it gets a little bit complicated is in displaying only the years available for each indicator. You could just have all of the years and show a warning that says "No Data," but I didn't like the idea of that. How I deal with it is to make a reactive for the selected indicator like so:
indicator_sel <- reactive({
#store indicator selected
ind <- unique(titles$variable[titles$indicatordescription == input$indicatordescription])
#make a list of what to keep from df_wide
keep <- append(ind, df_wide_colnames)
#filter df_wide based on column names
df_selected <- df_wide %>%
select(any_of(keep)) %>%
na.omit()
#using na.omit() filters for years available
#make a column for the indicator name (to later merge in titles)
df_selected$variable <- names(df_selected)[1]
return(df_selected)
})

#update the year choices based on the selected indicator (outside the reactive)
observeEvent(
input$indicatordescription,
updateSelectInput(session, "year", "Select a Year",
choices = unique(indicator_sel()$TimePeriod)))
Note the comments for the steps; I hope they make it clear what I do!
At the top of the app, I define a vector called
df_wide_colnames with the key variables needed for mapping
besides the value (“GeoAreaName”, “iso3”, “TimePeriod”, “label_year”,
“geometry”). I filter the wide data frame down to only these 5 columns
and the indicator selected. I omit rows with NA values so that I am only
left with those who have data observed for the indicator selected. I
make an extra column with the code for the indicator based on the column
name so that I can later merge with the titles file I kept separate.
Then I am able to updateSelectInput() for the year with indicator_sel() as the data frame.
Since some filtering already took place in
indicator_sel(), the main reactive for the interactive map
looks slightly different.
d1 <- reactive({
#filter for selected year
df_formap <- indicator_sel() %>%
filter(TimePeriod == input$year) %>%
rename(value = 1)
#merge in titles
df_formap <- left_join(df_formap, titles, by = "variable")
#make full label for hovering over countries
df_formap$label[is.na(df_formap$label)] <- ""
df_formap$full_label <- paste(df_formap$label, df_formap$label_year)
return(df_formap)
})
I filter for the year selected and rename the column with the data of the selected indicator to "value." This makes the input into the Leaflet more standard, rather than a new column name for every indicator. I merge in the information from the titles file using the column I created in indicator_sel(). Then I generate the label for hovering over countries! In this dashboard it includes the year in parentheses.
Contrary to the TRIS Dashboard, the first page of the BCD is more interactive. I use bsCollapse() to add drop-down descriptions of the dimensions. I also use fluidRow() to create columns of text. The column widths you specify must add up to 12, because Shiny's layout is based on Bootstrap's 12-column grid. So you end up with something like:
fluidRow(
column(4,
hr(),
h2(tags$b("Pillar 2: Brain Health")),
hr()
),
column(8,
tags$div(
style="text-align: justify;",
h4("Brain health means...."),
tags$br())))
The rest of the app is basically the same! But it is overall much more efficient because it is not trying to process the super heavy long data frame when you open the app.
The final dashboard I built to completion is the Regional Integration Matrix (RIM) Dashboard. Similar to the others, there were 2 versions of this work. The RIM comes from a 2020 study conducted by Rym and several colleagues where they define several dimensions and indicators to measure regional integration in the Mediterranean.
For the 2020 study, the data was collected by a team at E3 modelling. Different from the other dashboards at EMEA, the team used their specialization in macro to write equations measuring different concepts. The data from this effort is available in an Excel file I have added to the OneDrive here. All variables starting with "BV_" are the base variables. These are collected mostly from open-source locations like the World Bank and the IMF. However, there is some data, specifically from EORA and GTAP, that was purchased. In the Excel file there is a sheet called "Base_Variables" that has the code of each variable, its description, its unit, the link to the source, and whether the data is publicly available.
The base variables were then used to construct the indicators. The indicators are described in a sheet just after "Base_Variables", in one called "Indicators." This sheet is key because column E gives the equation for how to construct each indicator.
The E3 team collected all of the data for the Base Variables (notably, not for all countries in the world, just the ones relevant to the study) and made a sheet for each. Then, in Excel, they did all of the construction of the indicators. This was a fine thing to do, but it made it a serious undertaking when I needed to get the data into a workable format for R, with every indicator on a different sheet surrounded by noise (i.e. words that aren't data). I used a bunch of strategies I found on the internet and wrangled the data into a usable format for the first version of the dashboard in this file. I then built the first app, only with the data from the 2020 study. The code for that app is available here in GitHub.
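If you ever need to repeat that kind of wrangling, the general pattern looks something like this sketch (the file name, the sheet layout, and the number of junk rows to skip are all assumptions to check against the real workbook):

library(readxl)
library(dplyr)
library(purrr)

path <- "rim_2020_data.xlsx"  #hypothetical file name
#every sheet except the two documentation sheets holds one indicator
sheets <- setdiff(excel_sheets(path), c("Base_Variables", "Indicators"))

#read each sheet, skip the rows of "noise" above the data, and stack them
df_all <- map_dfr(sheets, function(s) {
  read_excel(path, sheet = s, skip = 3) %>%
    mutate(indicator = s)
})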
After building V1, Rym asked me to try and update the RIM concept with current data and a larger coverage area of Euro-Med-Africa. In my desk research on how to measure regional integration, I developed a hyper-fixation on using bilateral data to measure integration. In the 2020 study, the only region considered was the Union for the Mediterranean. I wanted the flexibility to define any number of regions and study their level of integration. Bilateral data provided that, and I fell down a bit of a rabbit hole. In the following sections, I am not going to explain my methodology in very close detail because that is available in the published RIM Dashboard app. I also made this PowerPoint for the EMANES conference explaining what I did. Instead, I will focus on the code and how to continue this work.
As mentioned, to update the study, I wanted to be able to define regions in a flexible manner and find a way to measure the level of integration within them. I looked at both geographic and institutional regions. I was curious to investigate whether countries are more closely linked to their physical neighbors or to countries they have entered an institutional consortium with, but I never had time. The geographic regions I sourced from the UN geoscheme. The institutional regions in Africa I pulled from the African Regional Integration Index.
I organized the regions I wanted in this Excel file, which is now called regions_old.xlsx. I rather stupidly put the geographic regions in one column and the others each in their own column as a dummy variable. I gave each region a 4-letter code that can be seen in this file, the RI Codebook. Then, in order to run my code through all of the regions, I made a file with just the isocodes and a comma-separated list of the regions each country belongs to. That file became regions.xlsx here, and the code for how I transformed regions_old.xlsx into that format is here in the GitHub, called regions.R.
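The core of that transformation is essentially a pivot and collapse. A minimal sketch under assumed column names (the geographic-region text column, whatever it is called, would need its own handling first):

library(readxl)
library(dplyr)
library(tidyr)

regions_old <- read_excel("regions_old.xlsx")

regions <- regions_old %>%
  select(iso3, where(is.numeric)) %>%                 #keep the iso3 and the dummy columns
  pivot_longer(-iso3, names_to = "region", values_to = "member") %>%
  filter(member == 1) %>%
  group_by(iso3) %>%
  summarise(regions = paste(region, collapse = ", ")) #one comma-separated list per country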
In the time I had, I only managed to make a measure of integration within a region. However, it was part of Rym's and my original ambition to also look at integration between different regions.
Now, here is where things start to get a little complicated. Because of the methodology I used, I needed to apply an equation to every region, in every year, for every indicator. This was necessary not only to create the scores, but also, in the case of bilateral data, to produce the values to display in the map. Hence, I started building some crazy loops (with some help from ChatGPT). The code is decently commented relative to the apps I have built (sorry), so if you look at the files in the folder Composite Indicator Scripts in GitHub, I hope you will be able to follow the process. I will say that the process differs in each script along the lines of whether the data is bilateral or not.
Since the process was so hands-on and required different things for each format of data imported, I have a different file for each of the imports. Some scripts take much longer to run than others. Here is a brief guide as to what each of the files in Composite Indicator Scripts is (Note: I say B for bilateral and NB for non-bilateral):
Here is the GitHub repository of the dashboard and here is the published dashboard on the Open Knowledge Platform. At this point, I think I was getting tired of building the same app over and over, so I tried to learn and apply some new skills. As such, the code of this app looks somewhat different. The main difference is the use of modules.
The benefit of modules is the ability to reuse the same code in different locations of the same app. When you are coding, ideally you don't want to repeat anything that you have already written. You want to use loops and other tools to be as concise as possible.
The reason I wanted to use modules is that I wanted separate tabs within each page for the two different types of regions. For example, in the place in the app where there is a drop-down for regions, I didn't want the user to scroll through a massive list with no distinction between geographic and institutional regions. There are a few ways I could have achieved this, like with some radio buttons or a slider between the two options, but I liked these tabs.
For each module, you define a UI and a server, sort of like it is a mini Shiny app in its own right. Then, when you want to use it, you call it into your general app. You can call a module more than once with different parameters, like you would a function. There are a lot of resources about how to use modules, so I won't go into too much detail, but for me the most important thing to remember is to put ns() around the input IDs in the UI.
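To give the idea, here is a minimal sketch of a module of the kind I mean (all names are hypothetical, not the actual RIM code):

#module UI: every input id must be wrapped in ns()
regionSelectUI <- function(id, label = "Select a Region") {
  ns <- NS(id)
  selectInput(ns("region"), label, choices = NULL)
}

#module server: fills the choices and returns the selection as a reactive
regionSelectServer <- function(id, choices) {
  moduleServer(id, function(input, output, session) {
    updateSelectInput(session, "region", choices = choices)
    reactive(input$region)
  })
}

#in the main app, call it once per tab with different parameters:
#ui: regionSelectUI("geo"); regionSelectUI("inst")
#server: geo_sel <- regionSelectServer("geo", geographic_regions)
#        inst_sel <- regionSelectServer("inst", institutional_regions)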
I wish I had more time to write about and explain how this app works because it is probably the one I am most proud of. I propose that if it becomes relevant for your work, write me an email and we can do a walkthrough over a call!
The last dashboard I was meant to build for EMEA was for the BBM, but I got caught up writing a paper on energy that took all of my time. The nice thing about this project is that all of the data is already collected and cleaned. Doriana organized it and I have it in the GitHub folder under the name bbm_clean.csv here.
I built the shell of an app and started adding a few potential visualizations. It is here in the GitHub. Really, this could be an excellent first endeavour for you to develop new skills in Shiny and design your own thing, rather than having to deal with updating the apps I have built. I suggested this type of dashboard from the Shiny gallery to Doriana for inspiration, but really it is your project!
Countrycode Package
When working with country data, it is very messy to try to merge based on country name alone. Sometimes countries will have "Republic of" in some cases and not in others. Also, over time country names can change and you will have issues with matching (e.g. Eswatini, Czechia, Myanmar, etc.). So I prefer to use iso3 codes for matching. However, not all data comes with these codes. The R package countrycode is a really great workaround for this issue.
library(countrycode)

df$iso3 <- countrycode(df$GeoAreaName, "country.name", "iso3c", warn = T)
#list the unmatched names and their correct isocodes
unmatched_names <- c("UNMATCHED COUNTRY NAME 1", "UNMATCHED COUNTRY NAME 2")
right_iso <- c("ISOCODE OF UNMATCHED COUNTRY 1", "ISOCODE OF UNMATCHED COUNTRY 2")
#fill in the isocode for each unmatched country
for (i in seq_along(unmatched_names)){
df$iso3[df$GeoAreaName == unmatched_names[i]] <- right_iso[i]
}
The first line after loading the package creates a new column of isocodes, matching the variable for country names (in this case df$GeoAreaName) with iso3c. The warn = T gives a list of unmatched country names. Often these will be international groupings like MENA and LDCs, but there are some frequently unmatched ones like "Türkiye." So below I define two vectors, one for the unmatched countries and the other for their isocodes. Then the loop replaces the isocode value of the unmatched countries.
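As an alternative to the loop, the countrycode package also has a custom_match argument that does the same fix in one call:

#a named vector maps the unmatched names directly to the right codes
df$iso3 <- countrycode(df$GeoAreaName, "country.name", "iso3c",
                       custom_match = c("Türkiye" = "TUR"))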
Publishing your Shiny App
The way in which I published the Shiny apps was using shinyapps.io. I am realising, however, as I finish this document that I connected these publications to my personal GitHub account, so I can't just put the username and password here. I suggest that if you urgently need to update/republish any of the apps, contact me. Otherwise, you can make an account and publish from scratch. Then you would just have to contact Nektar/the IT team and they will embed the Shiny app in the EMEA website.
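If you go the from-scratch route, publishing from R uses the rsconnect package. Roughly (the account name is a placeholder, and the token and secret come from the Tokens section of your shinyapps.io account page):

library(rsconnect)

#paste these values from the shinyapps.io dashboard
setAccountInfo(name = "your-account", token = "TOKEN", secret = "SECRET")

#upload everything in the app folder and publish it
deployApp(appDir = "path/to/app", appName = "tris-dashboard")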
In the Brain Capital Handover folder there are a few things I wish to explain. Here is what the folders within it contain:
Just a quick note that Sara and I wrote a paper on energy for the German think tank Bertelsmann Stiftung. All of the drafts and notes we took for it are here in the OneDrive. This work is closed, but I mention it because we really had to do a lot of research to execute it, so if you need resources regarding the energy transition in the Southern Med or EU external policy/financial mechanisms, this could be a helpful place to look. I have especially found that the Workshop Minutes I put together from the event we had with Bertelsmann Stiftung in July 2023 contain a lot of insights.
I hope that you have found this document useful! Please really feel free to contact me if you need help. Good luck!