The World Happiness Report is a landmark survey of the state of global happiness. The report was first published in 2012 and has continued to gain global recognition since. Happiness indicators are increasingly used by governments, organizations and civil society to inform their policy-making decisions. Leading experts across economics, psychology, survey analysis, national statistics, health, public policy and more have described how measurements of well-being can be used effectively to assess the progress of nations.
GDP, household income and unemployment used to be metrics of well-being of a country. It was important to keep track of how people spend, how much they make and whether they have a job. However, these questions do not tell us anything about people’s happiness.
More leaders in the world have come to realise that they need to pay more attention to their people’s well-being. Leaders like German Chancellor Angela Merkel have touched on how well-being, including happiness, has become a guide for the nation’s decision-making. In 2018, New Zealand Prime Minister Jacinda Ardern also announced plans to include the well-being of citizens as a measure of economic success.
Beyond economic measures, other factors are equally important, such as a country’s ability to provide freedom to its people, how well the country has managed its corruption, and even how helpful and generous its people are.
Image taken from: https://www.gallup.com/analytics/247355/gallup-world-happiness-report.aspx
This visualisation report aims to review the state of happiness in the world today and how each world region has fared in terms of happiness over the years. It also aims to analyze to what extent the 6 factors (Economy, Social Support, Health, Freedom, Perceptions of Corruption and Generosity) relate to happiness.
Economy: the country’s GDP per Capita
Social Support: how much support from friends or family does the country’s population have in times of need?
Health: Life Expectancy of the country’s population
Freedom: how much freedom is given in the country to make life choices?
Perceptions of Corruption: how widespread is corruption in the country and how much trust does the population have in its government?
Generosity: how generous is the population in the country, for example in performing generous deeds?
The data was taken from a Kaggle dataset entitled “World Happiness Report”. The dataset can be accessed from: https://www.kaggle.com/unsdsn/world-happiness?select=2019.csv
The data initially consists of 5 CSVs, 1 for each year from 2015 to 2019. Simple data processing was done to combine the CSVs into 1 CSV.
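The combining step is not shown in the article. A minimal sketch of one way it could be done with dplyr follows. The two small data frames stand in for the yearly files, their values are illustrative rather than taken from the dataset, and the column headers are assumed to have been harmonised across years beforehand.

```r
library(dplyr)

# Toy stand-ins for two of the yearly files (illustrative values only).
happy_2018 <- data.frame(Country = c("Country A", "Country B"),
                         `Happiness Score` = c(7.1, 6.4),
                         check.names = FALSE)
happy_2019 <- data.frame(Country = c("Country A", "Country C"),
                         `Happiness Score` = c(7.2, 5.9),
                         check.names = FALSE)

# Name each list element by its year so bind_rows() can record it in a Year column.
yearly <- list(`2018` = happy_2018, `2019` = happy_2019)

combined <- bind_rows(yearly, .id = "Year") %>%
  mutate(Year = as.integer(Year))

# Writing the result to one CSV (hypothetical file name):
# write.csv(combined, "world_happiness_2015_2019.csv", row.names = FALSE)
```

With the real files, each element of the list would come from a call to read_csv() on the corresponding yearly CSV.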
The final dataset looks like this:
One of the design challenges is how to effectively show the rankings of countries in terms of the Happiness Score and Year without overcrowding the visualisation. Hence, in order not to squeeze all these variables into 1 graph, I have decided to create bar charts, 1 for each year, from 2015 to 2019. Each bar chart will also only include the top 20 happiest countries.
Another design challenge is to show up to 3 dimensions of the data at a glance without losing any important information. This challenge was solved by using faceting. In total, 3 faceted charts were used to visualise different combinations of variables.
Showing happiness scores of regions through the years using line faceted charts
Showing the 6 factors and their scores in the respective regions using parallel coordinates faceted charts
Showing the correlation of each factor with happiness using scatter-plot faceted charts
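As a small illustration of the faceting idea, the sketch below splits one line chart into a panel per region with facet_wrap(). The data frame `toy` and its values are made up for the example and are not drawn from the dataset.

```r
library(ggplot2)

# Made-up scores for two hypothetical regions over five years.
toy <- data.frame(
  Year   = rep(2015:2019, times = 2),
  Region = rep(c("Region A", "Region B"), each = 5),
  Score  = c(5.0, 5.1, 5.2, 5.1, 5.3, 6.8, 6.9, 6.7, 6.9, 7.0)
)

# The third dimension (Region) becomes one panel per value
# instead of being crammed into a single set of axes.
p_facet <- ggplot(toy, aes(x = Year, y = Score)) +
  geom_line() +
  facet_wrap(~ Region)
```

Printing `p_facet` draws two side-by-side panels that share the same axes, which makes the regions directly comparable.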
The first step is to install and load the necessary R packages.
tidyverse contains libraries such as readr, dplyr, tidyr and ggplot2 for data manipulation as well as exploration.
corrr is a package for exploring correlations in R.
GGally is an extension to ggplot2. The function ggparcoord() in this extension is used for plotting parallel coordinates charts.
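The loading code itself is not shown in the article. Assuming the three packages are installed from CRAN in the usual way, it would look roughly like this:

```r
# Install once per machine (uncomment if the packages are missing).
# install.packages(c("tidyverse", "corrr", "GGally"))

library(tidyverse)  # readr, dplyr, tidyr, ggplot2 and others
library(corrr)      # correlate(), focus()
library(GGally)     # ggparcoord()
```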
The CSV file of the final dataset as described in Section 2.1 is loaded into the variable df using the function read_csv.
Data formatting needed to be done to the column Perceptions of Corruption. The column was changed from col_character() data type to col_double() data type, hence the argument col_types was used.
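The loading call is not reproduced in the article, so the following is only a sketch of how col_types can force a column to double. The temporary file, the two sample rows and the `na` argument are assumptions for illustration. In particular, it is assumed that the column was read as character because of a non-numeric entry such as "N/A".

```r
library(readr)

# Write a tiny stand-in CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Country,Year,Perceptions of Corruption",
             "Country A,2019,0.39",
             "Country B,2018,N/A"),
           csv_path)

# Force the column to double; columns not named in cols() are still guessed.
# Treating "N/A" as missing is an assumption about the raw data.
df_demo <- read_csv(
  csv_path,
  col_types = cols(`Perceptions of Corruption` = col_double()),
  na = c("", "NA", "N/A")
)
```

Without the col_types argument, a column containing text such as "N/A" would be guessed as character and could not be plotted or correlated as a number.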
Some data manipulation needed to be done to generate the graphs for the Top 20 Global Rankings (2015-2019) charts in the final visualisation.
5 separate dataframes are generated, one for each year from 2015 to 2019, using the dplyr library.
filter() is used to filter the data to the respective years.
arrange() is used to sort the data in ascending order according to the Happiness Rank column so that we get the countries with highest ranked Happiness Score first.
head() is then used to only take the top 20 countries.
data.2019 <- df %>% filter(Year == "2019") %>%
  arrange(`Happiness Rank`) %>% head(20)
data.2018 <- df %>% filter(Year == "2018") %>%
  arrange(`Happiness Rank`) %>% head(20)
data.2017 <- df %>% filter(Year == "2017") %>%
  arrange(`Happiness Rank`) %>% head(20)
data.2016 <- df %>% filter(Year == "2016") %>%
  arrange(`Happiness Rank`) %>% head(20)
data.2015 <- df %>% filter(Year == "2015") %>%
  arrange(`Happiness Rank`) %>% head(20)
In order to explore the correlations of the 6 factors and how they relate to happiness, data manipulation is done to extract only the required columns.
Country, Region, Happiness Rank and Year columns are removed.
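The code for this step is not shown. A minimal sketch with dplyr's select() follows, using a toy data frame whose column names are taken from the article and whose values are illustrative. The name `df.sel.demo` is hypothetical; the article's later code refers to the result as df.sel.

```r
library(dplyr)

# Toy frame with a subset of the columns named in the article (made-up values).
df_demo <- data.frame(
  Country = c("Country A", "Country B"),
  Region = c("Region A", "Region B"),
  `Happiness Rank` = 1:2,
  `Happiness Score` = c(7.5, 7.4),
  Economy = c(1.3, 1.4),
  Year = c(2019, 2019),
  check.names = FALSE
)

# Drop the identifier columns so only Happiness Score and the factors remain.
df.sel.demo <- df_demo %>%
  select(-Country, -Region, -`Happiness Rank`, -Year)
```

Only numeric columns are left after this step, which is what a correlation function needs as input.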
A series of bar graphs are plotted in order to effectively visualise the top 20 countries in terms of Happiness Score.
x-axis: Happiness score
y-axis: Country
The code below uses data.2019 to visualise the top 20 countries in 2019. The other 4 bar charts (2015 - 2018) were produced with the same code; only the data passed to ggplot() and the year in the title differ.
ggplot(data.2019, aes(x = reorder(Country, `Happiness Score`),
                      y = `Happiness Score`,
                      fill = `Happiness Score`)) +
  geom_bar(position = 'dodge',
           color = "grey",
           stat = "identity") +
  labs(title = "Top 20 Happiest Countries in 2019",
       x = "Country") +
  coord_flip() +
  scale_fill_continuous(type = "gradient") +
  geom_text(aes(label = round(`Happiness Score`, 2),
                y = `Happiness Score` - 0.5),
            size = 3,
            colour = "white") +
  theme_minimal()
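Because the five charts differ only in their data and title, the repeated code could be wrapped in a small helper. The function `plot_top20` below is hypothetical and not part of the original analysis. It uses geom_col(), which is equivalent to geom_bar(stat = "identity").

```r
library(ggplot2)

# Hypothetical helper: draws the top-20 bar chart for one year's data frame.
# `data` is expected to have Country and `Happiness Score` columns.
plot_top20 <- function(data, year) {
  ggplot(data, aes(x = reorder(Country, `Happiness Score`),
                   y = `Happiness Score`,
                   fill = `Happiness Score`)) +
    geom_col(colour = "grey") +
    geom_text(aes(label = round(`Happiness Score`, 2),
                  y = `Happiness Score` - 0.5),
              size = 3, colour = "white") +
    coord_flip() +
    scale_fill_continuous(type = "gradient") +
    labs(title = paste("Top 20 Happiest Countries in", year),
         x = "Country") +
    theme_minimal()
}

# Usage, assuming data.2015 ... data.2019 exist as built earlier:
# plot_top20(data.2018, 2018)
```

Keeping the plotting logic in one place means a styling change only has to be made once for all five years.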
Next, a bar chart is used to visualise the correlation coefficients between Happiness Score and each of the 6 factors related to happiness.
x-axis: The 6 factors
y-axis: Correlation coefficient
The function correlate() is used to calculate the correlation coefficients first, then geom_bar() is used to plot this data.
cor.df2 <- df.sel %>% correlate() %>% focus(`Happiness Score`)
# Note: corrr 0.4.3 and later name the first column `term` instead of `rowname`.
cor.df2 %>%
  mutate(rowname = factor(rowname,
                          levels = rowname[order(-`Happiness Score`)])) %>%
  ggplot(aes(x = rowname,
             y = `Happiness Score`)) +
  geom_bar(stat = "identity",
           color = "grey",
           fill = "tomato2") +
  ylab("Correlation with Happiness Score") +
  xlab("Factor") +
  geom_text(aes(label = round(`Happiness Score`, 2),
                y = `Happiness Score` - 0.05),
            size = 3,
            colour = "white")
Faceted charts are used to visualise the available data in greater detail.
This chart is used to visualise the changes in happiness throughout the years, faceted by regions.
x-axis: Year (from 2015 to 2019)
y-axis: Happiness score (Average)
Faceted Variable: Region
ggplot(df, aes(x = Year, y = `Happiness Score`)) +
  # Mean happiness score per year, drawn as a line
  stat_summary(geom = "line",
               fun = "mean",
               color = "seagreen4",
               size = 0.5) +
  # Error bars use stat_summary()'s default summary (mean and standard error)
  stat_summary(geom = "errorbar",
               colour = "black",
               width = 0.2) +
  facet_wrap(~Region) +
  labs(title = "Happiness By Region",
       x = "Year",
       y = "Avg. Happiness Score",
       size = 2) +
  coord_cartesian(ylim = c(4, 8)) +
  theme(strip.text = element_text(size = 8,
                                  face = "bold",
                                  color = "firebrick"),
        strip.background = element_rect(fill = "bisque1")) +
  # Label each point with the rounded mean; `fun` replaces the deprecated `fun.y`
  stat_summary(aes(label = round(after_stat(y), 2)),
               fun = mean,
               geom = "text",
               size = 2,
               vjust = -2)
This chart is used to visualise the 6 factors in parallel, faceted by region. It uses only the 2019 scores.
x-axis: The 6 factors
y-axis: Factor scores
Faceted Variable: Region
# Note: data.2019 as built earlier holds only the top 20 countries, so only their
# regions appear; df %>% filter(Year == 2019) would cover every country.
ggparcoord(data.2019,
           columns = c(5:10),
           groupColumn = "Region",
           scale = "uniminmax",
           boxplot = TRUE,
           title = "World Happiness Factors by Regions (2019)") +
  facet_wrap(~ Region) +
  theme(legend.position = "none",
        axis.text.x = element_text(size = 6,
                                   angle = 45,
                                   hjust = 1),
        strip.text = element_text(size = 7,
                                  face = "bold",
                                  color = "darkslategrey"),
        strip.background = element_rect(fill = "lightcyan2")) +
  labs(x = "Factor",
       y = "Factor Score",
       size = 2)
This chart is used to visualise each of the 6 factor scores against Happiness Score, faceted by factor.
x-axis: Factor score
y-axis: Happiness score
Faceted Variable: Factor
df.sel %>%
  gather(-`Happiness Score`,
         key = "var",
         value = "Factor Score") %>%
  ggplot(aes(x = `Factor Score`,
             y = `Happiness Score`)) +
  facet_wrap(~ var, scales = "free") +
  geom_point(colour = "tomato1") +
  stat_smooth(size = 1,
              method = lm,
              colour = "black") +
  theme(strip.text = element_text(size = 8,
                                  face = "bold",
                                  color = "firebrick"))
A line chart is created to show the changes in the 6 factor scores across the years.
x-axis: Year
y-axis: Factor score (Average)
ggplot(df, aes(x = Year)) +
  stat_summary(aes(y = Economy,
                   colour = "Economy"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  stat_summary(aes(y = `Social Support`,
                   colour = "Social Support"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  stat_summary(aes(y = Health,
                   colour = "Health"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  stat_summary(aes(y = Freedom,
                   colour = "Freedom"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  stat_summary(aes(y = `Perceptions of Corruption`,
                   colour = "Perceptions of Corruption"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  stat_summary(aes(y = Generosity,
                   colour = "Generosity"),
               geom = "line",
               fun = "mean",
               size = 1.5) +
  labs(x = "Year",
       y = "Avg. Factor Score",
       size = 2,
       colour = "") +
  scale_colour_manual(values = c("steelblue4",
                                 "orange1",
                                 "gold1",
                                 "forestgreen",
                                 "mediumorchid4",
                                 "lightcoral"))
How happy were the various countries throughout the years? In this section, we explore the top 20 happiest countries from 2015 to 2019.
How happy was each world region from 2015 to 2019? In this section, we explore the average happiness scores of each region, plotted against year.
In this section, we analyse in greater detail the 6 factors that are said to affect happiness.
How much have the 6 factor scores changed through the years?
What does each factor look like in the various world regions?
To what extent do the 6 factors affect the happiness of countries? This section explores the correlation between each of the 6 factors and happiness.
Through this collection of visualisations, a few interesting discoveries were made about world happiness and the factors that affect it.
Most countries in the Top 20 rankings of Happiness Score are from the European region, with visibly no countries from Asia. Scandinavian countries like Denmark, Finland and Norway have consistently placed in the Top 6. In 2018 and 2019 especially, all 3 countries held the top 3 places. One Scandinavian country, Sweden, is clearly lagging behind.
The happiest region is Australia and New Zealand, whose happiness scores fluctuate between 7.27 and 7.32 between 2015 and 2019. North America follows close behind. Even though the top few happiest countries tend to come from Western Europe, such as the Scandinavian countries, the region as a whole did not do as well as the two regions mentioned above. Countries from the Sub-Saharan Africa region are especially unhappy, with happiness scores ranging from 4.11 to 4.3 between 2015 and 2019.
An interesting insight from this visualisation is that the Social Support score has increased through the years. Even though it dipped from 2015 to 2016, it surpassed all the other factors in 2017. Perceptions of Corruption and Generosity scores remain low throughout the years, suggesting that these 2 factors do not affect happiness as much as the other factors do.
The factor with the highest correlation to Happiness Score is Economy (GDP per Capita), with a correlation coefficient of 0.79. However, Health follows close behind, with a correlation coefficient of 0.74.
Overall, from these visualisations, we were able to find out how happy the world was throughout the years from 2015 to 2019, in terms of the various countries as well as each region collectively.
We also had a glimpse of how much the 6 factor scores have changed through the years, and got a chance to explore what each factor looks like in the different regions in 2019.
Lastly, the most significant finding was the extent to which each of the 6 factors relates to happiness.