The World Happiness Report (WHR) is a publication of the United Nations Sustainable Development Solutions Network. It contains articles and rankings of national happiness, based on respondent ratings of their own lives, which the report also correlates with various (quality of) life factors.
The first World Happiness Report was released on April 1, 2012 as a foundational text for the UN High Level Meeting: Well-being and Happiness: Defining a New Economic Paradigm, drawing international attention. The first report outlined the state of world happiness, causes of happiness and misery, and policy implications highlighted by case studies. In 2013, the second World Happiness Report was issued, and in 2015 the third. Since 2016, it has been issued on an annual basis on the 20th of March, to coincide with the UN’s International Day of Happiness.
The WHR typically uses six key variables to explain the national annual average scores across the full sample. These variables are GDP per capita, social support, healthy life expectancy, freedom, generosity, and absence of corruption.
Since the World Happiness Report focuses on ranking, in this exploration we will compare each of the metrics used, rank countries on each one of them, and contrast the earliest available data (2008) with the latest (2020; the 2020 data is used for the 2021 report).
Load Libraries
library(tidyverse)
library(dplyr)
library(ggpubr)
library(ggrepel)
library(GGally)
Load The Data
wh21_raw <- read.csv(file = "world-happiness-report-2021.csv")
wh_raw <- read.csv("world-happiness-report.csv")
wh_raw
str(wh_raw)
## 'data.frame': 1949 obs. of 11 variables:
## $ ï..Country.name : chr "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
## $ year : int 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 ...
## $ Life.Ladder : num 3.72 4.4 4.76 3.83 3.78 ...
## $ Log.GDP.per.capita : num 7.37 7.54 7.65 7.62 7.71 ...
## $ Social.support : num 0.451 0.552 0.539 0.521 0.521 0.484 0.526 0.529 0.559 0.491 ...
## $ Healthy.life.expectancy.at.birth: num 50.8 51.2 51.6 51.9 52.2 ...
## $ Freedom.to.make.life.choices : num 0.718 0.679 0.6 0.496 0.531 0.578 0.509 0.389 0.523 0.427 ...
## $ Generosity : num 0.168 0.19 0.121 0.162 0.236 0.061 0.104 0.08 0.042 -0.121 ...
## $ Perceptions.of.corruption : num 0.882 0.85 0.707 0.731 0.776 0.823 0.871 0.881 0.793 0.954 ...
## $ Positive.affect : num 0.518 0.584 0.618 0.611 0.71 0.621 0.532 0.554 0.565 0.496 ...
## $ Negative.affect : num 0.258 0.237 0.275 0.267 0.268 0.273 0.375 0.339 0.348 0.371 ...
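The `ï..` prefix on the first column name is typical of a UTF-8 byte order mark (BOM) at the start of the CSV being read as part of the header. The following is a minimal sketch, assuming the file does begin with a BOM (the source does not confirm this) and using a small temporary file with made-up contents in place of the real dataset. It shows that `read.csv()` can strip the mark when the encoding is declared as `"UTF-8-BOM"`.

```r
# Build a tiny CSV that starts with a UTF-8 byte order mark (EF BB BF).
# The file and its contents are hypothetical stand-ins for the real data.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("Country name,year\nFinland,2020\n"), con)
close(con)

# Declaring the encoding lets R drop the BOM before parsing the header,
# so the first column name comes out clean.
demo_raw <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(demo_raw)
```

If the real files were read this way, the first column would be referred to as `Country.name` instead of `ï..Country.name` in any later `select()` call.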
Changing the column names of wh_raw and naming the dataframe wh
wh <-
  wh_raw %>%
  select(country = ï..Country.name,
         life_ladder = Life.Ladder,
         log_gdp_capita = Log.GDP.per.capita,
         social_support = Social.support,
         health_birth_expectancy = Healthy.life.expectancy.at.birth,
         freedom = Freedom.to.make.life.choices,
         generosity = Generosity,
         corruption = Perceptions.of.corruption,
         year,
         positive_affect = Positive.affect,
         negative_affect = Negative.affect)
wh21_raw
str(wh21_raw)
## 'data.frame': 149 obs. of 20 variables:
## $ ï..Country.name : chr "Finland" "Denmark" "Switzerland" "Iceland" ...
## $ Regional.indicator : chr "Western Europe" "Western Europe" "Western Europe" "Western Europe" ...
## $ Ladder.score : num 7.84 7.62 7.57 7.55 7.46 ...
## $ Standard.error.of.ladder.score : num 0.032 0.035 0.036 0.059 0.027 0.035 0.036 0.037 0.04 0.036 ...
## $ upperwhisker : num 7.9 7.69 7.64 7.67 7.52 ...
## $ lowerwhisker : num 7.78 7.55 7.5 7.44 7.41 ...
## $ Logged.GDP.per.capita : num 10.8 10.9 11.1 10.9 10.9 ...
## $ Social.support : num 0.954 0.954 0.942 0.983 0.942 0.954 0.934 0.908 0.948 0.934 ...
## $ Healthy.life.expectancy : num 72 72.7 74.4 73 72.4 73.3 72.7 72.6 73.4 73.3 ...
## $ Freedom.to.make.life.choices : num 0.949 0.946 0.919 0.955 0.913 0.96 0.945 0.907 0.929 0.908 ...
## $ Generosity : num -0.098 0.03 0.025 0.16 0.175 0.093 0.086 -0.034 0.134 0.042 ...
## $ Perceptions.of.corruption : num 0.186 0.179 0.292 0.673 0.338 0.27 0.237 0.386 0.242 0.481 ...
## $ Ladder.score.in.Dystopia : num 2.43 2.43 2.43 2.43 2.43 2.43 2.43 2.43 2.43 2.43 ...
## $ Explained.by..Log.GDP.per.capita : num 1.45 1.5 1.57 1.48 1.5 ...
## $ Explained.by..Social.support : num 1.11 1.11 1.08 1.17 1.08 ...
## $ Explained.by..Healthy.life.expectancy : num 0.741 0.763 0.816 0.772 0.753 0.782 0.763 0.76 0.785 0.782 ...
## $ Explained.by..Freedom.to.make.life.choices: num 0.691 0.686 0.653 0.698 0.647 0.703 0.685 0.639 0.665 0.64 ...
## $ Explained.by..Generosity : num 0.124 0.208 0.204 0.293 0.302 0.249 0.244 0.166 0.276 0.215 ...
## $ Explained.by..Perceptions.of.corruption : num 0.481 0.485 0.413 0.17 0.384 0.427 0.448 0.353 0.445 0.292 ...
## $ Dystopia...residual : num 3.25 2.87 2.84 2.97 2.8 ...
Again, changing the column names of wh21_raw and naming the dataframe wh21
wh21 <-
  wh21_raw %>%
  select(country = ï..Country.name,
         regional_indicator = Regional.indicator,
         life_ladder = Ladder.score,
         se_ladder_score = Standard.error.of.ladder.score,
         upperwhisker,
         lowerwhisker,
         log_gdp_capita = Logged.GDP.per.capita,
         social_support = Social.support,
         health_birth_expectancy = Healthy.life.expectancy,
         freedom = Freedom.to.make.life.choices,
         generosity = Generosity,
         corruption = Perceptions.of.corruption,
         dystopia_ladder_score = Ladder.score.in.Dystopia,
         exp_log_gdp = Explained.by..Log.GDP.per.capita,
         exp_social_support = Explained.by..Social.support,
         exp_health_expectancy = Explained.by..Healthy.life.expectancy,
         exp_freedom = Explained.by..Freedom.to.make.life.choices,
         exp_generosity = Explained.by..Generosity,
         exp_corruption = Explained.by..Perceptions.of.corruption,
         dystopia_residual = Dystopia...residual
  )
Then select only the key metrics from each dataframe.
wh_alt <-
  wh %>%
  select(country, life_ladder, log_gdp_capita, social_support,
         health_birth_expectancy, freedom, generosity, corruption, year) %>%
  filter(year != 2020)
wh21_alt <-
  wh21 %>%
  select(country, life_ladder, log_gdp_capita, social_support,
         health_birth_expectancy, freedom, generosity, corruption,
         regional_indicator) %>%
  mutate(year = 2020)
# Select the 'country' and 'regional_indicator' columns of 'wh21_alt' and name the result 'continent'
continent <-
  wh21_alt %>%
  select(country, regional_indicator)
# Full join 'wh_alt' and 'continent' by country, overwriting 'wh_alt'
wh_alt <-
  full_join(wh_alt, continent, by = 'country')
colSums(is.na(wh_alt))
##                 country             life_ladder          log_gdp_capita
##                       0                       0                      29
##          social_support health_birth_expectancy                 freedom
##                      13                      52                      31
##              generosity              corruption                    year
##                      82                     104                       0
##      regional_indicator
##                       0
colSums(is.na(wh21_alt))
##                 country             life_ladder          log_gdp_capita
##                       0                       0                       0
##          social_support health_birth_expectancy                 freedom
##                       0                       0                       0
##              generosity              corruption      regional_indicator
##                       0                       0                       0
##                    year
##                       0
We see that there are several NA values in the wh_alt metrics, so we will have to deal with them separately later.
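Before deciding how to handle the missing values, it can help to see where they are concentrated. The sketch below counts NA values per year with `across()`, on a small made-up table standing in for `wh_alt`. The names `toy` and `na_by_year` and all values are hypothetical, and this is only one possible way to inspect the gaps.

```r
library(dplyr)

# Toy stand-in for wh_alt; the values are invented for illustration.
toy <- tibble(
  country    = c("A", "A", "B", "B"),
  year       = c(2008, 2009, 2008, 2009),
  generosity = c(0.1, NA, NA, NA),
  corruption = c(0.8, 0.7, NA, 0.6)
)

# Count the missing values of each metric column within every year.
na_by_year <- toy %>%
  group_by(year) %>%
  summarise(across(c(generosity, corruption), ~ sum(is.na(.x))))
na_by_year
```

A per-year breakdown like this shows whether the gaps cluster in particular survey years, which matters when only 2008 and 2020 are being compared.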
Combine the two dataframes.
wh_all <-
  bind_rows(wh_alt, wh21_alt) %>%
  mutate(year = as.character(year))
tail(wh_all, 5)
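The two dataframes hold the same columns in a different order (`year` and `regional_indicator` are swapped), and `year` is an integer in one and a double in the other. `bind_rows()` matches columns by name and coerces compatible numeric types, which the sketch below illustrates on two made-up one-row tables. The names `a`, `b`, and `combined` are hypothetical.

```r
library(dplyr)

# Two toy frames with the same columns in a different order,
# and 'year' stored as integer in one and double in the other.
a <- tibble(country = "A", life_ladder = 5.1,
            year = 2008L, regional_indicator = "X")
b <- tibble(country = "B", life_ladder = 6.2,
            regional_indicator = "Y", year = 2020)

# bind_rows() aligns columns by name; the column order follows the first frame.
combined <- bind_rows(a, b) %>%
  mutate(year = as.character(year))
combined
```

Converting `year` to character afterwards, as in the code above, makes it behave as a discrete label in later plots.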
What a difference 12 years make.
Many researchers have noted that there is a correlation between economic growth and reductions in happiness inequality, even when income inequality is increasing at the same time. One possibility is that economic growth in rich countries has translated into a more diverse society in terms of cultural expressions (e.g. through the emergence of alternative lifestyles), which has allowed people to converge in happiness even if they diverge in incomes, tastes and consumption.
For example, Zimbabwe has been dealing with disrupted livelihoods among the extreme poor. In addition, supply-side challenges facing the health system have contributed to a decline in the coverage and quality of essential health services, and the country is also facing an economic crisis exacerbated by the COVID-19 (coronavirus) pandemic.
As for regions, the main problems that worry the South Asian states are unemployment and the low standard of living of a significant part of the population. Severe overpopulation is also a big problem. In addition, the countries of South Asia suffer from hunger and a lack of clean drinking water.
ladder_dynamics <-
  wh_all %>%
  select(regional_indicator, life_ladder, year) %>%
  filter(year == 2008 | year == 2020) %>%
  group_by(regional_indicator, year) %>%
  summarise_all(mean) %>%
  pivot_wider(names_from = "year", values_from = "life_ladder") %>%
  mutate(difference = round((`2020` - `2008`) / `2008` * 100, 1)) %>%
  pivot_longer(cols = `2008`:`2020`, names_to = "year", values_to = "ladder")
ladder_dynamics %>%
  ggplot(aes(x = factor(year),
             y = ladder,
             color = regional_indicator)) +
  geom_line(aes(group = regional_indicator)) +
  geom_point(alpha = 0.8) +
  geom_text_repel(data = . %>%
                    filter(year == 2008),
                  aes(x = year, y = ladder,
                      label = regional_indicator),
                  xlim = c(NA, 1),
                  ylim = c(0.1, NA),
                  size = 2.9) +
  geom_text(data = . %>%
              filter(year == 2020),
            mapping = aes(x = year, y = ladder,
                          label = paste(difference, "%", sep = "")),
            hjust = -.25, size = 3, fontface = "bold") +
  scale_color_manual(values = c("#097168", "#097168", "#097168", "#097168", "#878787",
                                "#878787", "#878787", "#097168", "#097168", "#878787")) +
  ylab("Average Ladder Score") +
  theme_minimal() +
  theme(legend.position = "none",
        axis.title.x = element_blank())
Here we can see how each region’s average Ladder Score has changed.
Middle East and North Africa, South Asia, North America and ANZ, and Western Europe are the regions whose average happiness score was lower in 2020 than in 2008.
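The percentage labels in the chart come from the relative change between the two years, (2020 - 2008) / 2008 * 100. As a small worked sketch of that calculation, the code below applies the same `pivot_wider()` and `mutate()` steps to two made-up regions. The region names and scores are hypothetical and not taken from the report.

```r
library(dplyr)
library(tidyr)

# Toy regional averages for the two years being compared (invented values).
toy <- tibble(
  regional_indicator = c("R1", "R1", "R2", "R2"),
  year               = c("2008", "2020", "2008", "2020"),
  life_ladder        = c(5.0, 5.5, 6.0, 5.7)
)

# One column per year, then the relative change in percent.
change <- toy %>%
  pivot_wider(names_from = year, values_from = life_ladder) %>%
  mutate(difference = round((`2020` - `2008`) / `2008` * 100, 1))
change
```

Here R1 rises by 10% and R2 falls by 5%, so a negative `difference` marks a region that was less happy on average in 2020 than in 2008.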
Since the writer is an Indonesian citizen who is curious how Indonesia's happiness compares with that of the surrounding countries, the next step is to compare Indonesia with the rest of its own subregion, Southeast Asia.
sea <-
  wh_all %>%
  mutate(year = as.numeric(year)) %>%
  filter(regional_indicator == "Southeast Asia",
         year > 2007) %>%
  group_by(year) %>%
  mutate(yrrank = row_number(-life_ladder))
ggplot(sea,
       aes(x = year, y = life_ladder, label = yrrank)) +
  geom_point(aes(color = country), size = 10) +
  theme_minimal() +
  geom_line(aes(color = country)) +
  scale_x_continuous(breaks = seq(2008, 2020, 1)) +
  scale_y_continuous(breaks = seq(0, 8, 0.5)) +
  geom_text(color = 'white', size = 5) +
  scale_color_manual(values = c("#e4d1a4", "#750000", "#41b6c4", "#1d91c0", "#bcc98e",
                                "#7fcdbb", "#253494", "#081d58", "#2a7a78")) +
  labs(title = "Indonesia v. Southeast Asia",
       subtitle = "Ranking the Happiness Score of Southeast Asian Countries Across The Years",
       x = "", y = "Happiness Score") +
  theme(plot.title = element_text(size = 30, face = "bold"),
        plot.subtitle = element_text(size = 15),
        axis.title = element_text(size = 15),
        axis.text = element_text(size = 15),
        legend.direction = "horizontal",
        legend.position = "bottom",
        legend.title = element_blank(),
        panel.grid.minor = element_blank()
  )
We can see that Indonesia sits in the middle of the ranking most of the time. We can also say that the Philippines has consistently managed to improve its Happiness Score, going from 7th in 2008 to the top three over the last five years (2016 - 2020).
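The yearly rank shown inside each point comes from `row_number(-life_ladder)` within each year, which gives rank 1 to the highest score and breaks ties by row order. The sketch below, on made-up scores for three hypothetical countries, shows that behaviour next to `min_rank()`, which would let tied countries share a rank. It illustrates the ranking step only and is not a change to the analysis above.

```r
library(dplyr)

# Toy scores; countries A and B are tied in 2020 (invented values).
toy <- tibble(
  country     = c("A", "B", "C", "A", "B", "C"),
  year        = c(2008, 2008, 2008, 2020, 2020, 2020),
  life_ladder = c(5.0, 6.1, 4.2, 5.8, 5.8, 6.0)
)

ranked <- toy %>%
  group_by(year) %>%
  mutate(
    yrrank      = row_number(-life_ladder), # ties get distinct ranks, by row order
    yrrank_ties = min_rank(-life_ladder)    # tied scores share the same rank
  ) %>%
  ungroup()
ranked
```

With scores reported to three decimals, exact ties are unlikely in practice, so the choice between the two functions mostly matters for how a tie would be displayed.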
4.2 Social Support