IS434 Assignment 5
This interactive visualization aims to uncover insights on the World Happiness Report 2020: which countries are the happiest, which factors contribute to the happiness score, and how Singapore fared. It should be useful to people who want to know how happy people are in different parts of the world, and what might raise the level of happiness.
Data is taken from the World Happiness Report 2020.
The given dataset contains no geospatial data on countries. A choropleth map would be a useful way to visualise how the happiness scores of different countries are distributed geographically. The solution is to import a shapefile of country boundaries and join it to the report data by country name.
Shape downloaded from http://tapiquen-sig.jimdofree.com. Carlos Efraín Porto Tapiquén. Geography, GIS and Digital Cartography. Valencia, Spain, 2020.
There are 6 factors affecting the happiness score in this dataset. For a more focused analysis, I selected just 2 factors to explore: GDP per Capita and Generosity. This information can be visualised on a scatter plot to show how the countries are distributed across the factor scores and how each factor correlates with the happiness score. GDP per capita is measured in terms of Purchasing Power Parity (PPP). Generosity is the residual from regressing the national average of Gallup World Poll (GWP) responses to the question “Have you donated money to a charity in the past month?” on GDP per capita.
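To make the residual definition of Generosity concrete, here is a minimal sketch on made-up numbers. The `log_gdp` and `donated` values are hypothetical and not taken from the report; the point is only to show what "the residual of a regression" means.

```r
# Hypothetical national averages, for illustration only
toy <- data.frame(
  log_gdp = c(7.5, 8.5, 9.0, 9.5, 10.5, 11.0),
  donated = c(0.20, 0.18, 0.30, 0.25, 0.45, 0.35)  # share answering "yes"
)

# Regress the donation share on logged GDP per capita
fit <- lm(donated ~ log_gdp, data = toy)

# The residual plays the role of the Generosity score:
# positive = donates more than its GDP predicts, negative = less
toy$generosity <- resid(fit)
print(toy)
```

Because the score is a residual, it is centred on zero, which is why many countries in the dataset have negative Generosity values.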
It might be hard to see the difference between the happiness scores of neighbouring countries on the map. Hence, I will use a diverging palette that spans several hues instead of different shades of a single color. This makes the map more attractive and makes it easier to tell the countries’ scores apart.
Load sf, tmap, tidyverse (which contains ggplot2), plotly, viridis, hrbrthemes and ggthemes into R. These packages will be used to read the data and plot the visualizations.
packages <- c('sf', 'tmap', 'tidyverse', 'plotly', 'viridis', 'hrbrthemes', 'ggthemes')
for (p in packages) {
  # install the package first if it is not available yet
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
Load the dataset on the World Happiness Report and countries in R.
whr <- read_csv(file = "data/WHR20_DataForFigure2.1.csv")
#head(data.frame(whr))
countries <- st_read(dsn = "data/World_Countries",
layer = "World_Countries")
## Reading layer `World_Countries' from data source `C:\Users\User\Documents\data\World_Countries' using driver `ESRI Shapefile'
## Simple feature collection with 252 features and 1 field
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 83.6236
## geographic CRS: WGS 84
#head(data.frame(countries))
Merge the countries dataset with the WHR dataset by country name, then filter out the countries with ‘NA’ happiness scores.
join <- left_join(countries, whr, by = c("COUNTRY" = "Country name"))
data <- join %>% filter(!is.na(`Ladder score`))
head(data.frame(data))
## COUNTRY Regional.indicator Ladder.score
## 1 Afghanistan South Asia 2.5669
## 2 Algeria Middle East and North Africa 5.0051
## 3 Azerbaijan Commonwealth of Independent States 5.1648
## 4 Albania Central and Eastern Europe 4.8827
## 5 Armenia Commonwealth of Independent States 4.6768
## 6 Argentina Latin America and Caribbean 5.9747
## Standard.error.of.ladder.score upperwhisker lowerwhisker
## 1 0.03131143 2.628270 2.505530
## 2 0.04423601 5.091802 4.918397
## 3 0.03419724 5.231827 5.097774
## 4 0.05611573 4.992687 4.772713
## 5 0.05859536 4.791646 4.561953
## 6 0.05344176 6.079446 5.869954
## Logged.GDP.per.capita Social.support Healthy.life.expectancy
## 1 7.462861 0.4703670 52.59000
## 2 9.537965 0.8033851 65.90517
## 3 9.687727 0.8193083 65.50840
## 4 9.417931 0.6710705 68.70814
## 5 9.100476 0.7574794 66.75066
## 6 9.810955 0.9005679 68.80380
## Freedom.to.make.life.choices Generosity Perceptions.of.corruption
## 1 0.3965730 -0.09642940 0.9336866
## 2 0.4666109 -0.12110516 0.7354851
## 3 0.7868241 -0.24025528 0.5525376
## 4 0.7819942 -0.04230949 0.8963037
## 5 0.7120178 -0.13877961 0.7735448
## 6 0.8311324 -0.19491386 0.8420098
## Ladder.score.in.Dystopia Explained.by..Log.GDP.per.capita
## 1 1.972317 0.3007058
## 2 1.972317 0.9438560
## 3 1.972317 0.9902727
## 4 1.972317 0.9066530
## 5 1.972317 0.8082624
## 6 1.972317 1.0284656
## Explained.by..Social.support Explained.by..Healthy.life.expectancy
## 1 0.3564338 0.2660515
## 2 1.1430036 0.7454185
## 3 1.1806130 0.7311341
## 4 0.8304839 0.8463296
## 5 1.0345769 0.7758573
## 6 1.3725437 0.8497737
## Explained.by..Freedom.to.make.life.choices Explained.by..Generosity
## 1 0.0000000 0.13523471
## 2 0.0839438 0.11891501
## 3 0.4677347 0.04011321
## 4 0.4619459 0.17102776
## 5 0.3780758 0.10722574
## 6 0.5208403 0.07010047
## Explained.by..Perceptions.of.corruption Dystopia...residual
## 1 0.001225785 1.507236
## 2 0.129190654 1.840812
## 3 0.247307181 1.507633
## 4 0.025361285 1.640897
## 5 0.104618184 1.468162
## 6 0.060415059 2.072541
## geometry
## 1 MULTIPOLYGON (((61.27656 35...
## 2 MULTIPOLYGON (((-5.152135 3...
## 3 MULTIPOLYGON (((46.54037 38...
## 4 MULTIPOLYGON (((20.79192 40...
## 5 MULTIPOLYGON (((46.54037 38...
## 6 MULTIPOLYGON (((-71.01648 -...
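A join on country names silently drops any country whose spelling differs between the two sources, so it can be worth listing the names that fail to match before filtering. The sketch below uses hypothetical name vectors standing in for the `COUNTRY` column of `countries` and the `Country name` column of `whr`; the actual mismatches in these files, if any, are not known from the output above.

```r
# Hypothetical name columns, for illustration only
shape_names  <- c("Finland", "Singapore", "United States of America", "Russia")
report_names <- c("Finland", "Singapore", "United States", "Russia")

# Report countries that have no identically spelled shapefile entry
unmatched <- setdiff(report_names, shape_names)
print(unmatched)
```

Running the same `setdiff()` call on the real columns would show which report countries end up missing from the map and may need renaming before the join.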
Start by creating a static visualisation of the happiness scores on a world map using tmap. By default, tmap_mode is set to “plot”.
tmap_mode("plot")
tm_shape(data)+
tm_fill("Ladder score" ,
style = "quantile",
palette = "RdYlGn")+
tm_layout(legend.outside = TRUE,
legend.position = c("right", "bottom"),
frame = FALSE
) +
tm_borders(alpha = 0.5) +
tm_compass(type="8star", size = 2) +
tm_scale_bar(width = 0.15)
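The `style = "quantile"` setting used in the map bins countries so that each color class holds roughly the same number of countries. As a rough sketch of that binning in base R, the example below assumes five classes and uses only the six ladder scores printed in the `head()` output, so it illustrates the idea rather than reproducing the breaks tmap computes on the full data.

```r
# Ladder scores of the six countries printed in the head() output
scores <- c(2.5669, 5.0051, 5.1648, 4.8827, 4.6768, 5.9747)

# Five quantile classes need six break points (0%, 20%, ..., 100%)
breaks <- quantile(scores, probs = seq(0, 1, by = 0.2))

# Assign each score to its class; include.lowest keeps the minimum in class 1
classes <- cut(scores, breaks = breaks, include.lowest = TRUE, labels = FALSE)
print(data.frame(scores, classes))
```

Quantile breaks spread the palette evenly across countries, at the cost of classes that cover unequal score ranges.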
Set tmap_mode to “view” to make the world map interactive. When hovering over a country on the map, the tooltip shows the country name.
# tm_polygons() draws fill and borders together;
# border.alpha sets the transparency of the borders
map <- tm_shape(data) +
  tm_polygons("Ladder score",
              style = "quantile",
              palette = "RdYlGn",
              border.alpha = 0.5) +
  tm_layout(legend.height = 0.45,
            legend.width = 0.3,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE)
tmap_mode("view")
map
Plot scatter plots using ggplot and ggplotly to show how the selected factors relate to happiness scores; ggplotly adds interactivity to the scatter plots. Of the 6 factors that affect the happiness score, I selected the 2 that I am interested in analysing: GDP per Capita and Generosity.
Double-clicking a country in the legend isolates that country’s point on the scatter plot.
p <- data %>%
  # prepare text for tooltip
  mutate(text = paste("Country: ", COUNTRY,
                      "\nHappiness Score: ", `Ladder score`,
                      "\nLogged GDP per capita: ", `Logged GDP per capita`,
                      sep = "")) %>%
  ggplot(aes(x = `Logged GDP per capita`, y = `Ladder score`,
             color = COUNTRY, text = text)) +
  geom_point(alpha = 0.7) +
  scale_color_viridis(discrete = TRUE, guide = "none") +
  labs(title = "How does GDP affect Happiness?",
       color = "Country", x = "Logged GDP per Capita",
       y = "Happiness Score") +
  # theme_ipsum_rc() is a complete theme, so apply it before any tweaks
  theme_ipsum_rc() +
  theme(legend.position = "right")
# turn ggplot interactive with plotly
pp <- ggplotly(p, tooltip = "text")
pp
p2 <- data %>%
  # prepare text for tooltip
  mutate(text = paste("Country: ", COUNTRY,
                      "\nHappiness Score: ", `Ladder score`,
                      "\nGenerosity: ", `Generosity`,
                      sep = "")) %>%
  ggplot(aes(x = `Generosity`, y = `Ladder score`,
             color = COUNTRY, text = text)) +
  geom_point(alpha = 0.7) +
  scale_color_viridis(discrete = TRUE, guide = "none") +
  labs(title = "How does Generosity affect Happiness?",
       color = "Country", x = "Generosity",
       y = "Happiness Score") +
  # theme_ipsum_rc() is a complete theme, so apply it before any tweaks
  theme_ipsum_rc() +
  theme(legend.position = "right")
# turn ggplot interactive with plotly
pp2 <- ggplotly(p2, tooltip = "text")
pp2
According to the scatter plots, GDP per Capita has a much stronger positive correlation with the Happiness Score than Generosity does. This suggests that wealth has a large impact on happiness, probably because money can buy material comfort. Generally, countries with high generosity also have a high happiness score, which suggests that making other people happy makes one happy as well. However, there are quite a few outliers, such as Indonesia and Myanmar, which suggests that other, more important factors affect the happiness score. Hence, I conclude that of these 2 factors, GDP per Capita matters more for a country’s happiness and should be prioritised.
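The strength of each relationship can also be quantified with a Pearson correlation rather than read off the plots. The sketch below uses only the six rows printed in the `head()` output, so its numbers are illustrative and not representative of the full dataset; the short column names `ladder`, `log_gdp` and `generosity` are my own.

```r
# First six rows of the joined data, as printed in the head() output
six <- data.frame(
  ladder     = c(2.5669, 5.0051, 5.1648, 4.8827, 4.6768, 5.9747),
  log_gdp    = c(7.462861, 9.537965, 9.687727, 9.417931, 9.100476, 9.810955),
  generosity = c(-0.09642940, -0.12110516, -0.24025528,
                 -0.04230949, -0.13877961, -0.19491386)
)

# Pearson correlation of each factor with the happiness score
r_gdp <- cor(six$log_gdp, six$ladder)
r_gen <- cor(six$generosity, six$ladder)
print(c(gdp = r_gdp, generosity = r_gen))
```

Applying the same `cor()` calls to the full `whr` columns would give the figures that back up the comparison between the two factors.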
According to the world map, countries in the Nordic region, such as Finland, have very high happiness scores, whereas countries in Africa and the Middle East have very low happiness scores. This is likely due in large part to the wealth differences between the regions. Interestingly, Russia has a mid-range happiness score.
Singapore has a high happiness score as seen from the map, but surprisingly, it was not among the top few countries with extremely high happiness scores. Singapore ranked very high for GDP per capita, with a logged GDP per capita of 11.4, but quite low for Generosity, at 0.03. This suggests that Singaporeans are not particularly generous in terms of donating money, and that money alone can’t buy happiness!
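To put the logged figure in more familiar terms, it can be converted back to a dollar amount. This small sketch assumes the dataset uses the natural logarithm, which is my reading of the report and is not stated in the data shown here.

```r
# Logged GDP per capita quoted for Singapore in the text
log_gdp_sg <- 11.4

# Undo the (assumed natural) logarithm to recover GDP per person
gdp_sg <- exp(log_gdp_sg)
print(round(gdp_sg))
```

Under that assumption the value corresponds to a GDP per capita of a little under 90,000 in PPP terms.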