The World Happiness Report is a landmark survey of the state of global happiness. The report was first published in 2012 and has continued to gain global recognition since then. Happiness indicators have been used increasingly by governments, organizations and civil society to supplement their policy-making decisions. Leading experts across fields of economics, psychology, survey analysis, national statistics, health, public policy and more have described how measurements of well-being can be used effectively to assess the progress of nations.
Part 1 of “The Happiness of the World”, which can be found at https://rpubs.com/gracepua/happiness_of_the_world, explored several things. It gave an overview of world happiness by visualizing the top 20 happiest countries through the years, looked at happiness across world regions, and analyzed to what extent the 6 factors (Economy, Social Support, Health, Freedom, Perceptions of Corruption and Generosity) relate to happiness.
This visualization, “The Happiness of the World Part 2”, instead aims to provide an interactive geographic display of happiness.
The World Happiness data was taken from a Kaggle dataset entitled “World Happiness Report”. The dataset can be accessed from: https://www.kaggle.com/unsdsn/world-happiness
The data initially consists of 5 CSVs, 1 for each year from 2015 to 2019. Simple data processing was done outside of R to combine the CSVs into 1 CSV.
The final dataset looks like this:
The second data file is a shapefile that is used to draw the world map. The data was taken from the Natural Earth website: https://www.naturalearthdata.com/downloads/50m-cultural-vectors/50m-admin-0-countries-2/
One challenge is that building the map requires joining 2 sets of data: the world happiness data and the shapefile data. The join is difficult because the datasets name countries differently. For example, Hong Kong appears as Hong Kong or Hong Kong S.A.R. depending on the dataset. Adding to the difficulty, the shapefile data contains up to 8 columns with different naming conventions. Hence, we need to find the correct column to join on and manually search for the country names that must be edited for the join to succeed.
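One way to make the manual search for mismatched names less tedious is to compare the two sets of names directly. The following is a minimal base R sketch on toy vectors; in practice the inputs would be happy_df$Country and whichever name column of the shapefile is chosen for the join. The variable names happy_names, shape_names and unmatched are hypothetical.

```r
# Toy name vectors standing in for the two datasets (illustrative only).
happy_names <- c("Hong Kong", "Czech Republic", "Finland")
shape_names <- c("Hong Kong S.A.R.", "Czechia", "Finland", "Antarctica")

# Names in the happiness data with no exact match in the shapefile.
# These are the candidates that need to be renamed before joining.
unmatched <- setdiff(happy_names, shape_names)
print(unmatched)
```

Running the same check against each candidate name column of the shapefile would show which column leaves the fewest names to fix by hand.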
The second challenge is that the tmap package has limited support for filters in interactive map visualizations. Hence, the different years of happiness scores, from 2015 to 2019, cannot be visualized effectively in a single interactive map. A workaround is to show all 5 years of data as side-by-side maps or to use faceting. However, this would result in a clunky visualization, which is not ideal.
Therefore, to tackle the second challenge, the interactive map only visualizes the happiness scores of countries in 2019. This also means users are shown only the most recent happiness findings.
The below design will only show happiness scores in 2019.
The first step is to install and load the necessary R packages.
tidyverse contains libraries such as readr, dplyr, tidyr and ggplot2 for data manipulation as well as exploration.
sf is a package used to encode spatial vector data. It will be used here mainly to read the shapefile that we have using st_read().
tmap is a package that will be used to visualise static and interactive thematic maps.
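The loading step might look like the sketch below. It is not the original code chunk: the variable names packages and not_installed are hypothetical, and the snippet only reports missing packages rather than installing them automatically.

```r
# Packages described above
packages <- c("tidyverse", "sf", "tmap")

# Check which of them are not yet installed
is_installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
not_installed <- packages[!is_installed]

if (length(not_installed) > 0) {
  # Install these first, e.g. install.packages(not_installed)
  message("Not installed: ", paste(not_installed, collapse = ", "))
} else {
  # Attach every package so functions such as read_csv() and st_read() are available
  invisible(lapply(packages, library, character.only = TRUE))
}
```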
The CSV file of the World Happiness dataset is loaded into the variable happy_df using the function read_csv().
The column Perceptions of Corruption needed reformatting: it was changed from the col_character() data type to the col_double() data type, hence the col_types argument.
happy_df <- read_csv("data/WorldHappiness.csv",
                     col_types = cols(`Perceptions of Corruption` = col_double()))
Next, we will also need to load the shapefile that contains all of the coordinates of the countries in the world. This will be used to draw the world map later on.
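The call that reads the shapefile is not shown in the text, so the following is only a sketch of what it could look like with st_read(). The relative path is an assumption modelled on the data/ folder used for the CSV, and the names shp_path and world_sf are hypothetical. The file check is there only so the snippet fails gently when the shapefile is not in place.

```r
# Assumed location of the Natural Earth shapefile, relative to the project folder
shp_path <- file.path("data", "ne_50m_admin_0_countries",
                      "ne_50m_admin_0_countries.shp")

if (file.exists(shp_path)) {
  # st_read() returns an sf object: one row per country plus a geometry column
  world_sf <- sf::st_read(shp_path)
} else {
  message("Shapefile not found at: ", shp_path)
}
```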
## Reading layer `ne_50m_admin_0_countries' from data source `C:\Users\gracepua\OneDrive\Desktop\Visual Analytics\assignment 5\IS428_Assignment5\data\ne_50m_admin_0_countries\ne_50m_admin_0_countries.shp' using driver `ESRI Shapefile'
## Simple feature collection with 241 features and 94 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -180 ymin: -89.99893 xmax: 180 ymax: 83.59961
## geographic CRS: WGS 84
Firstly, in this visualization, we only need the columns Country, Year, Happiness Score and the 6 factor scores from the World Happiness data (Year is kept because the maps below filter on it). Hence the first step is to remove all other columns from the data frame.
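A minimal sketch of this column selection with dplyr is shown below on a toy data frame. The values are made up, "Happiness Rank" is a hypothetical extra column included only so that something gets dropped, and Year is kept because the later map code filters on map_df$Year.

```r
library(dplyr)

# Toy stand-in for happy_df (made-up values, hypothetical extra column)
happy_toy <- tibble(
  Country = c("A", "B"),
  Year = c(2019, 2019),
  `Happiness Rank` = c(1, 2),
  `Happiness Score` = c(7.0, 6.5),
  Economy = c(1.3, 1.2),
  `Social Support` = c(1.5, 1.4),
  Health = c(0.9, 0.8),
  Freedom = c(0.6, 0.5),
  `Perceptions of Corruption` = c(0.4, 0.3),
  Generosity = c(0.2, 0.1)
)

# Columns needed for the maps and pop-ups
keep_cols <- c("Country", "Year", "Happiness Score", "Economy",
               "Social Support", "Health", "Freedom",
               "Perceptions of Corruption", "Generosity")

# all_of() errors if a listed column is missing, which catches naming slips early
happy_small <- happy_toy %>% select(all_of(keep_cols))
names(happy_small)
```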
Secondly, some country names need to be edited in the world happiness dataset as they do not match with the names in the shapefile dataset. This is necessary before going to the next step of joining the datasets together.
happy_df$Country[happy_df$Country == "Hong Kong"] <- "Hong Kong S.A.R."
happy_df$Country[happy_df$Country == "North Cyprus"] <- "Northern Cyprus"
happy_df$Country[happy_df$Country == "Czech Republic"] <- "Czechia"
happy_df$Country[happy_df$Country == "Serbia"] <- "Republic of Serbia"
happy_df$Country[happy_df$Country == "Somaliland Region"] <- "Somaliland"
happy_df$Country[happy_df$Country == "Swaziland"] <- "eSwatini"
happy_df$Country[happy_df$Country == "Palestinian Territories"] <- "Palestine"
happy_df$Country[happy_df$Country == "Congo (Brazzaville)"] <- "Republic of the Congo"
happy_df$Country[happy_df$Country == "Congo (Kinshasa)"] <- "Democratic Republic of the Congo"
happy_df$Country[happy_df$Country == "North Macedonia"] <- "Macedonia"
Lastly, the world happiness dataset and the shapefile dataset need to be combined together before the interactive map can be built.
The left_join() function is used to join the world happiness data to the shapefile data.
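The join itself is not shown, so here is a hedged sketch on toy tables. It assumes the shapefile's country name column is called ADMIN and that the shapefile table sits on the left of the join so that every polygon is kept; both are assumptions, and the scores are made up. With the real data, the left table would be the sf object returned by st_read().

```r
library(dplyr)

# Toy stand-ins: "shapes" for the shapefile attributes, "happy" for happy_df
shapes <- tibble(
  ADMIN = c("Czechia", "Hong Kong S.A.R.", "Antarctica"),
  ISO_A3 = c("CZE", "HKG", "ATA")
)
happy <- tibble(
  Country = c("Czechia", "Hong Kong S.A.R."),
  Year = c(2019, 2019),
  `Happiness Score` = c(6.0, 5.0)  # made-up values
)

# Keep every row of the left table; unmatched rows get NA for the score columns
map_toy <- left_join(shapes, happy, by = c("ADMIN" = "Country"))
print(map_toy)
```

Rows that come back with NA scores are either places with no happiness data or names that still need to be edited before joining.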
The happiness scores of 2019 are first visualized in a static map to show the latest status of happiness around the world.
tmap_style("white")
tm_shape(map_df[map_df$Year==2019, ]) +
  tm_fill("Happiness Score",
          style = "quantile",
          palette = "Greens",
          title = "Happiness Score",
          legend.is.portrait = TRUE) +
  tm_layout(main.title = "World Happiness (2019)",
            main.title.position = "center",
            main.title.size = 1.2,
            legend.height = 0.45,
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("left", "bottom"),
            legend.frame = TRUE,
            legend.bg.color = "white",
            legend.bg.alpha = 0.2,
            bg.color = "skyblue",
            inner.margins = c(0, .02, .02, .02)) +
  tm_borders(alpha = 0.5) +
  tm_text("ISO_A3", size = "AREA") +
  tm_credits("Source: World Happiness data taken from Kaggle and geospatial world countries data taken from Natural Earth",
             position = c("right", "bottom")) +
  tm_grid(alpha = 0.2) +
  tm_compass(type="8star", position = c(.65, .15), size = 3, color.light = "grey90")
A side-by-side comparison of happiness scores between 2015 and 2019 is done to show the changes in the happiness of countries over the 5-year period.
map.2019 <- tm_shape(map_df[map_df$Year==2019, ]) +
  tm_fill("Happiness Score",
          style = "quantile",
          palette = "div") +
  tm_layout(main.title = "World Happiness (2019)",
            legend.show = TRUE,
            legend.frame = TRUE,
            legend.bg.color = "white",
            legend.bg.alpha = 0.2,
            title.position = c("center", "center"),
            title.size = 20,
            bg.color = "skyblue",
            inner.margins = c(0, .02, .02, .02),
            legend.position = c("left", "bottom")) +
  tm_text("ISO_A3", size = "AREA") +
  tm_borders(alpha = 0.5) +
  tm_grid(alpha = 0.2) +
  tm_compass(type="8star",
             position = c("right", "bottom"),
             size = 3,
             color.light = "grey90")
map.2015 <- tm_shape(map_df[map_df$Year==2015, ]) +
  tm_fill("Happiness Score",
          style = "quantile",
          palette = "div") +
  tm_layout(main.title = "World Happiness (2015)",
            legend.show = TRUE,
            legend.frame = TRUE,
            legend.bg.color = "white",
            legend.bg.alpha = 0.2,
            title.position = c("center", "center"),
            title.size = 20,
            bg.color = "skyblue",
            inner.margins = c(0, .02, .02, .02),
            legend.position = c("left", "bottom")) +
  tm_text("ISO_A3", size = "AREA") +
  tm_borders(alpha = 0.5) +
  tm_grid(alpha = 0.2) +
  tm_compass(type="8star",
             position = c("right", "bottom"),
             size = 3,
             color.light = "grey90")
# ncol = 1 places the 2015 map above the 2019 map
tmap_arrange(map.2015, map.2019, ncol=1)
The main visualization shows the happiness of the world in 2019.
The interactivity is achieved by changing the tmap mode to view. When a country is clicked on the map, the pop-up box shows the happiness score and the 6 factor scores of that country in 2019. The 6 factors are Economy, Social Support, Health, Freedom, Perceptions of Corruption and Generosity.
tmap_mode("view")
tm_shape(map_df[map_df$Year==2019, ])+
  tm_polygons("Happiness Score",
              n = 10,
              palette = "div",
              # id should name a column of map_df (used as the hover label in view mode)
              id = "happiness_score",
              # variables listed in the pop-up box when a country is clicked
              popup.vars = c("Happiness Score",
                             "Economy",
                             "Social Support",
                             "Health", "Freedom",
                             "Perceptions of Corruption",
                             "Generosity")) +
  tm_scale_bar(width = 0.2) +
  tm_text("ISO_A3", size = "AREA")+
  tm_borders(alpha = 0.5)