World Happiness

About World Happiness

The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in the 2016 Update. The World Happiness Report 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.

# Define libraries to be used

library(ggplot2)
library(ggthemes)
library(glue)
library(plotly)

## 
## Attaching package: 'plotly'

## The following object is masked from 'package:ggplot2':
## 
##     last_plot

## The following object is masked from 'package:stats':
## 
##     filter

## The following object is masked from 'package:graphics':
## 
##     layout

library(tidyr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:glue':
## 
##     collapse

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Read the data from 2015 to 2019

wh_2015 <- read.csv("data/world_happiness/2015.csv")
wh_2016 <- read.csv("data/world_happiness/2016.csv")
wh_2017 <- read.csv("data/world_happiness/2017.csv")
wh_2018 <- read.csv("data/world_happiness/2018.csv")
wh_2019 <- read.csv("data/world_happiness/2019.csv")

# Observe the data

glimpse(wh_2015)

## Rows: 158
## Columns: 12
## $ Country                       <chr> "Switzerland", "Iceland", "Denmark", "No~
## $ Region                        <chr> "Western Europe", "Western Europe", "Wes~
## $ Happiness.Rank                <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1~
## $ Happiness.Score               <dbl> 7.587, 7.561, 7.527, 7.522, 7.427, 7.406~
## $ Standard.Error                <dbl> 0.03411, 0.04884, 0.03328, 0.03880, 0.03~
## $ Economy..GDP.per.Capita.      <dbl> 1.39651, 1.30232, 1.32548, 1.45900, 1.32~
## $ Family                        <dbl> 1.34951, 1.40223, 1.36058, 1.33095, 1.32~
## $ Health..Life.Expectancy.      <dbl> 0.94143, 0.94784, 0.87464, 0.88521, 0.90~
## $ Freedom                       <dbl> 0.66557, 0.62877, 0.64938, 0.66973, 0.63~
## $ Trust..Government.Corruption. <dbl> 0.41978, 0.14145, 0.48357, 0.36503, 0.32~
## $ Generosity                    <dbl> 0.29678, 0.43630, 0.34139, 0.34699, 0.45~
## $ Dystopia.Residual             <dbl> 2.51738, 2.70201, 2.49204, 2.46531, 2.45~

glimpse(wh_2016)

## Rows: 157
## Columns: 13
## $ Country                       <chr> "Denmark", "Switzerland", "Iceland", "No~
## $ Region                        <chr> "Western Europe", "Western Europe", "Wes~
## $ Happiness.Rank                <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1~
## $ Happiness.Score               <dbl> 7.526, 7.509, 7.501, 7.498, 7.413, 7.404~
## $ Lower.Confidence.Interval     <dbl> 7.460, 7.428, 7.333, 7.421, 7.351, 7.335~
## $ Upper.Confidence.Interval     <dbl> 7.592, 7.590, 7.669, 7.575, 7.475, 7.473~
## $ Economy..GDP.per.Capita.      <dbl> 1.44178, 1.52733, 1.42666, 1.57744, 1.40~
## $ Family                        <dbl> 1.16374, 1.14524, 1.18326, 1.12690, 1.13~
## $ Health..Life.Expectancy.      <dbl> 0.79504, 0.86303, 0.86733, 0.79579, 0.81~
## $ Freedom                       <dbl> 0.57941, 0.58557, 0.56624, 0.59609, 0.57~
## $ Trust..Government.Corruption. <dbl> 0.44453, 0.41203, 0.14975, 0.35776, 0.41~
## $ Generosity                    <dbl> 0.36171, 0.28083, 0.47678, 0.37895, 0.25~
## $ Dystopia.Residual             <dbl> 2.73939, 2.69463, 2.83137, 2.66465, 2.82~

glimpse(wh_2017)

## Rows: 155
## Columns: 12
## $ Country                       <chr> "Norway", "Denmark", "Iceland", "Switzer~
## $ Happiness.Rank                <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1~
## $ Happiness.Score               <dbl> 7.537, 7.522, 7.504, 7.494, 7.469, 7.377~
## $ Whisker.high                  <dbl> 7.594445, 7.581728, 7.622030, 7.561772, ~
## $ Whisker.low                   <dbl> 7.479556, 7.462272, 7.385970, 7.426227, ~
## $ Economy..GDP.per.Capita.      <dbl> 1.616463, 1.482383, 1.480633, 1.564980, ~
## $ Family                        <dbl> 1.533524, 1.551122, 1.610574, 1.516912, ~
## $ Health..Life.Expectancy.      <dbl> 0.7966665, 0.7925655, 0.8335521, 0.85813~
## $ Freedom                       <dbl> 0.6354226, 0.6260067, 0.6271626, 0.62007~
## $ Generosity                    <dbl> 0.36201224, 0.35528049, 0.47554022, 0.29~
## $ Trust..Government.Corruption. <dbl> 0.31596383, 0.40077007, 0.15352656, 0.36~
## $ Dystopia.Residual             <dbl> 2.277027, 2.313707, 2.322715, 2.276716, ~

glimpse(wh_2018)

## Rows: 156
## Columns: 9
## $ Overall.rank                 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13~
## $ Country.or.region            <chr> "Finland", "Norway", "Denmark", "Iceland"~
## $ Score                        <dbl> 7.632, 7.594, 7.555, 7.495, 7.487, 7.441,~
## $ GDP.per.capita               <dbl> 1.305, 1.456, 1.351, 1.343, 1.420, 1.361,~
## $ Social.support               <dbl> 1.592, 1.582, 1.590, 1.644, 1.549, 1.488,~
## $ Healthy.life.expectancy      <dbl> 0.874, 0.861, 0.868, 0.914, 0.927, 0.878,~
## $ Freedom.to.make.life.choices <dbl> 0.681, 0.686, 0.683, 0.677, 0.660, 0.638,~
## $ Generosity                   <dbl> 0.202, 0.286, 0.284, 0.353, 0.256, 0.333,~
## $ Perceptions.of.corruption    <chr> "0.393", "0.340", "0.408", "0.138", "0.35~

glimpse(wh_2019)

## Rows: 156
## Columns: 9
## $ Overall.rank                 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13~
## $ Country.or.region            <chr> "Finland", "Denmark", "Norway", "Iceland"~
## $ Score                        <dbl> 7.769, 7.600, 7.554, 7.494, 7.488, 7.480,~
## $ GDP.per.capita               <dbl> 1.340, 1.383, 1.488, 1.380, 1.396, 1.452,~
## $ Social.support               <dbl> 1.587, 1.573, 1.582, 1.624, 1.522, 1.526,~
## $ Healthy.life.expectancy      <dbl> 0.986, 0.996, 1.028, 1.026, 0.999, 1.052,~
## $ Freedom.to.make.life.choices <dbl> 0.596, 0.592, 0.603, 0.591, 0.557, 0.572,~
## $ Generosity                   <dbl> 0.153, 0.252, 0.271, 0.354, 0.322, 0.263,~
## $ Perceptions.of.corruption    <dbl> 0.393, 0.410, 0.341, 0.118, 0.298, 0.343,~
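The differences between the yearly layouts can also be listed programmatically instead of read off the glimpse() output. The sketch below is a minimal illustration that copies the 2015 and 2019 column names from the output above into two vectors; with the real data frames, names(wh_2015) and names(wh_2019) would supply the same vectors.

```r
# Column names copied from the glimpse() output of two of the years
cols_2015 <- c("Country", "Region", "Happiness.Rank", "Happiness.Score",
               "Standard.Error", "Economy..GDP.per.Capita.", "Family",
               "Health..Life.Expectancy.", "Freedom",
               "Trust..Government.Corruption.", "Generosity",
               "Dystopia.Residual")
cols_2019 <- c("Overall.rank", "Country.or.region", "Score", "GDP.per.capita",
               "Social.support", "Healthy.life.expectancy",
               "Freedom.to.make.life.choices", "Generosity",
               "Perceptions.of.corruption")

# Names shared by both years
shared <- intersect(cols_2015, cols_2019)
shared

# Names that exist only in 2015, and only in 2019
only_2015 <- setdiff(cols_2015, cols_2019)
only_2019 <- setdiff(cols_2019, cols_2015)
only_2015
only_2019
```

Only one column name survives unchanged between these two years, which is why the cleaning step has to rename or drop almost every column.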

What is Dystopia?

Dystopia is an imaginary country that has the world’s least-happy people. The purpose in establishing Dystopia is to have a benchmark against which all countries can be favorably compared (no country performs more poorly than Dystopia) in terms of each of the six key variables, thus allowing each sub-bar to be of positive width. The lowest scores observed for the six key variables, therefore, characterize Dystopia. Since life would be very unpleasant in a country with the world’s lowest incomes, lowest life expectancy, lowest generosity, most corruption, least freedom and least social support, it is referred to as “Dystopia,” in contrast to Utopia.

What are the residuals?

The residuals, or unexplained components, differ for each country, reflecting the extent to which the six variables either over- or under-explain average 2014-2016 life evaluations. These residuals have an average value of approximately zero over the whole set of countries. Figure 2.2 shows the average residual for each country when the equation in Table 2.1 is applied to average 2014-2016 data for the six variables in that country. We combine these residuals with the estimate for life evaluations in Dystopia so that the combined bar will always have positive values. As can be seen in Figure 2.2, although some life evaluation residuals are quite large, occasionally exceeding one point on the scale from 0 to 10, they are always much smaller than the calculated value in Dystopia, where the average life is rated at 1.85 on the 0 to 10 scale.
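In the 2015-2017 files, the six factor columns and the Dystopia.Residual column appear to add up to the Happiness.Score. The sketch below checks this for the first 2015 row only (Switzerland), using the values printed by glimpse(wh_2015). That the same holds for every row and year is an assumption that is not verified here.

```r
# First row of the 2015 data (Switzerland), values as printed by glimpse(wh_2015)
switzerland_2015 <- c(
  Economy..GDP.per.Capita.      = 1.39651,
  Family                        = 1.34951,
  Health..Life.Expectancy.      = 0.94143,
  Freedom                       = 0.66557,
  Trust..Government.Corruption. = 0.41978,
  Generosity                    = 0.29678,
  Dystopia.Residual             = 2.51738
)
reported_score <- 7.587

# Sum of the six factor contributions plus the Dystopia residual
reconstructed_score <- sum(switzerland_2015)

# Difference from the reported score (expected to be rounding error only)
reconstructed_score - reported_score
```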

Cleaning the data

Comparing the files year by year shows that the column layout changes between releases. Most changes are renamed columns, but the data after 2016 no longer include the Region of each country. The data before 2018 also carry extra columns: Standard Error in 2015, Lower and Upper Confidence Interval in 2016 (replaced by Whisker high and Whisker low in 2017), and Dystopia Residual in 2015-2017. Because these columns are absent from 2018 onward and are not among the six factors compared in this analysis, we remove them, create a Region column for the data after 2016, and rename the remaining columns to one consistent set of names. The columns to be renamed, added, and removed are listed below:

Columns to be renamed:
- Country.or.region
- Overall.rank
- Score
- Economy..GDP.per.Capita.
- Health..Life.Expectancy.
- Trust..Government.Corruption.
- Freedom
- Family

Columns to be added:
- Region (for 2017-2019)

Columns to be removed:
- Standard.Error
- Lower.Confidence.Interval
- Upper.Confidence.Interval
- Whisker.high
- Whisker.low
- Dystopia.Residual
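One way to express the renaming is a single lookup vector from old names to unified names, applied to every year. The mapping below follows the rename() calls in the cleaning code; the helper harmonise_names() and the two tiny data frames are hypothetical and only illustrate the idea in base R.

```r
# Lookup of old name -> unified name, following the rename() calls in the cleaning code
lookup <- c(
  Economy..GDP.per.Capita.      = "GDP.per.capita",
  Family                        = "Social.support",
  Health..Life.Expectancy.      = "Healthy.life.expectancy",
  Freedom                       = "Freedom.to.make.life.choices",
  Trust..Government.Corruption. = "Perceptions.of.corruption",
  Overall.rank                  = "Happiness.Rank",
  Country.or.region             = "Country",
  Score                         = "Happiness.Score"
)

# Hypothetical helper: rename whichever lookup columns are present in df
harmonise_names <- function(df, lookup) {
  hits <- names(df) %in% names(lookup)
  names(df)[hits] <- unname(lookup[names(df)[hits]])
  df
}

# Made-up one-row tables in the older and the newer naming style
old_style <- data.frame(Country = "A", Family = 1.3, Freedom = 0.6)
new_style <- data.frame(Country.or.region = "A", Score = 7.5)

old_names <- names(harmonise_names(old_style, lookup))
new_names <- names(harmonise_names(new_style, lookup))
old_names
new_names
```

Columns that are not in the lookup are left untouched, so the same call can be used on every year regardless of which naming style it uses.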

# Clean each year's data and make the column names consistent before combining them into one data frame

clean_wh2015 <- wh_2015 %>% 
  select(-c(Standard.Error, Dystopia.Residual)) %>% 
  mutate(Year = 2015) %>% 
  rename(GDP.per.capita = Economy..GDP.per.Capita., 
         Social.support = Family, 
         Healthy.life.expectancy = Health..Life.Expectancy., 
         Freedom.to.make.life.choices = Freedom,
         Perceptions.of.corruption = Trust..Government.Corruption.)

clean_wh2016 <- wh_2016 %>% 
  select(-c(Lower.Confidence.Interval, Upper.Confidence.Interval, Dystopia.Residual)) %>% 
  mutate(Year = 2016) %>%
  rename(GDP.per.capita = Economy..GDP.per.Capita.,
         Social.support = Family,
         Healthy.life.expectancy = Health..Life.Expectancy.,
         Freedom.to.make.life.choices = Freedom,
         Perceptions.of.corruption = Trust..Government.Corruption.)

clean_wh2017 <- wh_2017 %>% 
  select(-c(Whisker.high, Whisker.low, Dystopia.Residual)) %>% 
  mutate(Year = 2017) %>%
  rename(GDP.per.capita = Economy..GDP.per.Capita.,
         Social.support = Family,
         Healthy.life.expectancy = Health..Life.Expectancy.,
         Freedom.to.make.life.choices = Freedom,
         Perceptions.of.corruption = Trust..Government.Corruption.)

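clean_wh2018 <- wh_2018 %>% 
  # Perceptions.of.corruption was read as character; non-numeric entries become NA
  mutate(Perceptions.of.corruption = as.numeric(Perceptions.of.corruption),
         Year = 2018) %>%
  # fill() replaces each NA with the value from the row above it
  fill(Perceptions.of.corruption) %>%
  rename(Happiness.Rank = Overall.rank,
         Country = Country.or.region,
         Happiness.Score = Score)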

## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion

clean_wh2019 <- wh_2019 %>% 
  mutate(Year = 2019) %>%
  rename(Happiness.Rank = Overall.rank,
         Country = Country.or.region,
         Happiness.Score = Score)
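The 2018 file stores Perceptions.of.corruption as text, so as.numeric() turns any entry that is not a number into NA (hence the coercion warning), and fill() then copies the value from the row above into the gap. The sketch below reproduces that behaviour on a made-up three-row table; the "N/A" string and the country names are assumptions for illustration, not values taken from the real file.

```r
library(tidyr)

# Hypothetical column read as text because one entry is not a number
demo <- data.frame(
  Country = c("A", "B", "C"),
  Perceptions.of.corruption = c("0.393", "N/A", "0.408")
)

# Locate the entries that cannot be converted before coercing them
is_bad <- is.na(suppressWarnings(as.numeric(demo$Perceptions.of.corruption)))
demo[is_bad, ]

# Coerce, then let fill() copy the previous row's value into the gap
demo$Perceptions.of.corruption <- suppressWarnings(
  as.numeric(demo$Perceptions.of.corruption)
)
filled <- fill(demo, Perceptions.of.corruption)
filled
```

Because the rows are ordered by rank, the filled value comes from the country ranked just above. That is an imputation choice; inspecting the affected rows first makes it easier to judge whether it is acceptable.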

# Build a vector of country names for each Region, using the 2015 and 2016 data

WestEU <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Western Europe"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Western Europe"))$Country)))

AusNZ <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Australia and New Zealand"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Australia and New Zealand"))$Country)))

LatAmCar <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Latin America and Caribbean"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Latin America and Caribbean"))$Country)))

CenEstEU <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Central and Eastern Europe"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Central and Eastern Europe"))$Country)))

SubSA <- unique(c(unique((clean_wh2015 %>% 
                                 filter(Region == "Sub-Saharan Africa"))$Country),
                       unique((clean_wh2016 %>% 
                                 filter(Region == "Sub-Saharan Africa"))$Country)))

NorAm <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "North America"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "North America"))$Country)))

MidEsNorA <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Middle East and Northern Africa"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Middle East and Northern Africa"))$Country)))

SouEstAs <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Southeastern Asia"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Southeastern Asia"))$Country)))

EstAs <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Eastern Asia"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Eastern Asia"))$Country)))

SouAs <- unique(c(unique((clean_wh2015 %>% 
                                  filter(Region == "Southern Asia"))$Country),
                        unique((clean_wh2016 %>% 
                                  filter(Region == "Southern Asia"))$Country)))

# Add the "Region" column to the 2017-2019 data

clean_wh2017 <- clean_wh2017 %>%  
  mutate(Country = recode(Country, 'Taiwan Province of China' = "Taiwan",
                          'Hong Kong S.A.R., China' = "Hong Kong"), 
         Region = case_when(Country %in% WestEU ~ "Western Europe",
                            Country %in% AusNZ ~ "Australia and New Zealand",
                            Country %in% LatAmCar ~ "Latin America and Caribbean",
                            Country %in% CenEstEU ~ "Central and Eastern Europe",
                            Country %in% SubSA ~ "Sub-Saharan Africa",
                            Country %in% NorAm ~ "North America",
                            Country %in% MidEsNorA ~ "Middle East and Northern Africa",
                            Country %in% SouEstAs ~ "Southeastern Asia",
                            Country %in% EstAs ~ "Eastern Asia",
                            Country %in% SouAs ~ "Southern Asia"))

clean_wh2018 <- clean_wh2018 %>% 
  mutate(Country = recode(Country, 'Trinidad & Tobago' = "Trinidad and Tobago",
                          'Northern Cyprus' = "North Cyprus"),
         Region = case_when(Country %in% WestEU ~ "Western Europe",
                            Country %in% AusNZ ~ "Australia and New Zealand",
                            Country %in% LatAmCar ~ "Latin America and Caribbean",
                            Country %in% CenEstEU ~ "Central and Eastern Europe",
                            Country %in% SubSA ~ "Sub-Saharan Africa",
                            Country %in% NorAm ~ "North America",
                            Country %in% MidEsNorA ~ "Middle East and Northern Africa",
                            Country %in% SouEstAs ~ "Southeastern Asia",
                            Country %in% EstAs ~ "Eastern Asia",
                            Country %in% SouAs ~ "Southern Asia"))

clean_wh2019 <- clean_wh2019 %>% 
  mutate(Country = recode(Country, 'Trinidad & Tobago' = "Trinidad and Tobago",
                          'Northern Cyprus' = "North Cyprus",
                          'North Macedonia' = "Macedonia"),
         Region = case_when(Country %in% WestEU ~ "Western Europe",
                            Country %in% AusNZ ~ "Australia and New Zealand",
                            Country %in% LatAmCar ~ "Latin America and Caribbean",
                            Country %in% CenEstEU ~ "Central and Eastern Europe",
                            Country %in% SubSA ~ "Sub-Saharan Africa",
                            Country %in% NorAm ~ "North America",
                            Country %in% MidEsNorA ~ "Middle East and Northern Africa",
                            Country %in% SouEstAs ~ "Southeastern Asia",
                            Country %in% EstAs ~ "Eastern Asia",
                            Country %in% SouAs ~ "Southern Asia",
                            Country == "Gambia" ~ "Sub-Saharan Africa"))
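The per-region vectors and the repeated case_when() blocks can also be replaced by a lookup table and a join. The sketch below is a minimal illustration on made-up stand-in tables (the country rows and rank values are illustrative only); it assumes each country belongs to exactly one Region across the years that carry that column.

```r
library(dplyr)

# Tiny stand-ins for two years that have Region and a later year that does not
with_region_a <- data.frame(Country = c("Denmark", "Burundi"),
                            Region  = c("Western Europe", "Sub-Saharan Africa"))
with_region_b <- data.frame(Country = c("Denmark", "Japan"),
                            Region  = c("Western Europe", "Eastern Asia"))
later_year    <- data.frame(Country = c("Japan", "Burundi", "Gambia"),
                            Happiness.Rank = c(58, 145, 120))

# One row per country, built from every year that still has Region
region_lookup <- bind_rows(with_region_a, with_region_b) %>%
  distinct(Country, Region)

# Attach Region by country name; unmatched countries get NA
later_with_region <- later_year %>%
  left_join(region_lookup, by = "Country")

# Countries that still need a manual Region
missing_region <- later_with_region %>% filter(is.na(Region))
missing_region
```

Listing the rows with a missing Region after the join is a quick way to find the country names that need recoding or a manual assignment.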

Now that the data for each year have the same column names, we can combine them into a single data frame. We use “rbind” for this because it stacks data frames that share the same column names, appending the rows of each one without dropping or merging any of the data.

# Merge the complete data using rbind method

world_happiness <- rbind(clean_wh2015, clean_wh2016, clean_wh2017, clean_wh2018, clean_wh2019)

world_happiness
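To illustrate why the column names have to agree first, the sketch below stacks two made-up one-row tables whose columns are in a different order, and then tries a table with a mismatched column name. The tables and their values are hypothetical.

```r
# Two hypothetical yearly tables with the same columns in a different order
a <- data.frame(Country = "A", Happiness.Score = 7.5, Year = 2015)
b <- data.frame(Year = 2016, Country = "B", Happiness.Score = 3.1)

# rbind() lines the columns up by name, so the order does not matter
stacked <- rbind(a, b)
stacked

# A column name that is missing from the first table makes rbind() stop with an error
c_bad <- data.frame(Country = "C", Score = 5.0, Year = 2017)
result <- tryCatch(rbind(a, c_bad), error = function(e) "error")
result
```

dplyr::bind_rows() is a more lenient alternative that fills missing columns with NA instead of stopping, which can hide a forgotten rename.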

Although each year's data were cleaned before being combined, the values inside the columns have not been processed yet. The next step is to round the numeric columns so that the numbers are easier to read, and to convert the categorical columns to factors for memory efficiency.

# Clean the newly combined data

world_happiness <- world_happiness %>% 
  # Round every numeric column to 3 decimal places (row by row, no aggregation)
  mutate(across(where(is.numeric), ~ round(.x, digits = 3)),
         across(c(Country, Region), as.factor),
         Year = as.integer(Year))

Visualizations and Interpretations

Now that all pre-processing steps are done, we can explore the data to find what makes a country happy, using visualizations to make the data easier to understand. Each visualization is followed by its interpretation below:

# Top 10 happiest countries

ggplotly(world_happiness %>% 
  select(-Year) %>% 
  group_by(Country, Region) %>%
  # mean() ignores `digit`, so round the mean explicitly
  summarise(across(where(is.numeric), ~ round(mean(.x), 3))) %>% 
  ungroup %>% 
  arrange(desc(Happiness.Score)) %>% 
  head(10) %>% 
  mutate(popup_rank = glue("{Region},
                            Score: {Happiness.Score}")) %>% 
  ggplot(aes(reorder(Country, Happiness.Score), Happiness.Score))+
  geom_col(aes(fill=Happiness.Score, text=popup_rank))+
  coord_flip()+
  scale_fill_gradient(high="cyan2", low="turquoise3")+
  scale_y_continuous(breaks=seq(0,8,0.5))+
  labs(title = "Top 10 Happiest Countries",
       x="Country",
       y="Happiness Score")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_colorbar()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
- The top 10 happiest countries mostly come from the Western Europe region
- Western countries dominate the top 5 of the top 10 happiest countries
- The happiness score starts to drop noticeably after the top 5 happiest countries
- The happiest country is Denmark, with an average score of 7.546 over 2015-2019
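When averaging scores per country, keep in mind that base R's mean() has no rounding argument: an extra argument such as digit is silently absorbed by `...` and has no effect. The sketch below shows the difference on two made-up numbers.

```r
x <- c(1.2345, 2.3456)

# mean() has no `digit` argument, so it is silently swallowed by `...`
unrounded <- mean(x, digit = 3)

# Rounding needs an explicit call to round()
rounded <- round(mean(x), digits = 3)

unrounded
rounded
```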

# 10 least happy countries

ggplotly(world_happiness %>% 
  select(-Year) %>% 
  group_by(Country, Region) %>%
  # mean() ignores `digit`, so round the mean explicitly
  summarise(across(where(is.numeric), ~ round(mean(.x), 3))) %>% 
  ungroup %>% 
  arrange(Happiness.Score) %>% 
  head(10) %>% 
  mutate(popup_rank = glue("{Region},
                            Score: {Happiness.Score}")) %>% 
  ggplot(aes(reorder(Country, Happiness.Score), Happiness.Score))+
  geom_col(aes(fill=Happiness.Score, text=popup_rank))+
  coord_flip()+
  scale_fill_gradient(high="turquoise3", low="midnightblue")+
  scale_y_continuous(breaks=seq(0,8,0.5))+
  labs(title = "10 Least Happy Countries",
       x="Country",
       y="Happiness Score")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_colorbar()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
- Most of the 10 least happy countries are located on the African continent
- The lowest score belongs to Burundi, with an average score of 3.076

# Top 10 happiest countries trend

ggplotly(world_happiness %>% 
  filter(Happiness.Rank >= 1,
         Happiness.Rank <= 10) %>% 
  mutate(popup_rank = glue("{Country},
                            GDP per Capita: {GDP.per.capita},
                            Social Support: {Social.support},
                            Life Expectancy: {Healthy.life.expectancy},
                            Freedom of Choice: {Freedom.to.make.life.choices},
                            Gov. Trust: {Perceptions.of.corruption},
                            Generosity: {Generosity}")) %>% 
  ggplot(aes(Year, Happiness.Rank))+
  geom_line(aes(col=Country), size=1.5)+
  geom_point(aes(col=Country, text=popup_rank), size=4)+
  scale_color_brewer(palette = "Paired")+
  scale_y_continuous(trans="reverse", breaks=seq(1,158,1))+
  labs(title = "Top 10 Happiest Countries by Year",
       y="Happiness Rank")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
1. Switzerland dropped the most
2. Finland rose the most
3. New Zealand has the most stable rank
4. Denmark's rank fluctuates but stays in the top three
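Statements such as "dropped the most" or "most stable" can be backed by a small per-country summary of the rank. The sketch below is a minimal illustration on made-up ranks for three hypothetical countries; the column names follow the combined data frame.

```r
library(dplyr)

# Hypothetical ranks for three made-up countries over three years
ranks <- data.frame(
  Country = rep(c("A", "B", "C"), each = 3),
  Year = rep(2015:2017, times = 3),
  Happiness.Rank = c(1, 2, 4,   6, 5, 1,   8, 8, 8)
)

rank_change <- ranks %>%
  group_by(Country) %>%
  summarise(
    first_rank = Happiness.Rank[which.min(Year)],
    last_rank  = Happiness.Rank[which.max(Year)],
    # Negative change = moved up the ranking, positive = moved down
    change     = last_rank - first_rank,
    # Spread of the rank over the years; 0 means perfectly stable
    rank_sd    = sd(Happiness.Rank)
  ) %>%
  arrange(change)

rank_change
```

Sorting by change puts the biggest climber first and the biggest faller last, while rank_sd gives a simple stability measure.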

# Least happy countries trend

ggplotly(world_happiness %>% 
  filter(Happiness.Rank >= 148,
         Happiness.Rank <= 158) %>% 
  mutate(popup_rank = glue("{Country},
                            GDP per Capita: {GDP.per.capita},
                            Social Support: {Social.support},
                            Life Expectancy: {Healthy.life.expectancy},
                            Freedom of Choice: {Freedom.to.make.life.choices},
                            Gov. Trust: {Perceptions.of.corruption},
                            Generosity: {Generosity}")) %>% 
  ggplot(aes(Year, Happiness.Rank))+
  geom_line(aes(col=Country), size=1.5)+
  geom_point(aes(col=Country, text=popup_rank), size=4)+
  scale_y_continuous(trans="reverse", breaks=seq(1,158,1))+
  labs(title = "Least Happy Countries by Year",
       y="Happiness Rank")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
- Afghanistan has the most stable rank among the 10 least happy countries
- The 10 least happy countries change rapidly from year to year
- Central African Republic dropped the most

Correlation Between Each Determining Factor and the Rank

# Correlation between the Happiness Rank and GDP per Capita

ggplotly(world_happiness %>%
  mutate(popup_gdp = glue("{Country},
                            Rank: {Happiness.Rank},
                            GDP per Capita: {GDP.per.capita}")) %>% 
  ggplot(aes(Happiness.Rank, GDP.per.capita))+
  geom_point(aes(col=as.character(Year), text=popup_gdp), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette = "Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,10))+
  scale_y_continuous(breaks=seq(0,3,0.2))+
  labs(x="Happiness Rank", y="GDP per Capita")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- Poor positive correlation
- Countries with about the same GDP per Capita value can sit at very different ranks

# Correlation between Happiness Rank and Social Support (Social situation around an individual)

ggplotly(world_happiness %>% 
  mutate(popup_social = glue("{Country},
                            Rank: {Happiness.Rank},
                            Social Support: {Social.support}")) %>% 
  ggplot(aes(Happiness.Rank, Social.support))+
  geom_point(aes(col=as.character(Year), text=popup_social), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette="Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,15))+
  scale_y_continuous(breaks=seq(0.1,2,0.2))+
  labs(x="Happiness Rank", y="Social Support")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- Poor positive correlation
- Countries with about the same Social Support value can sit at very different ranks

# Correlation between Happiness Rank and Healthy Life Expectancy

ggplotly(world_happiness %>% 
  mutate(popup_life = glue("{Country},
                            Rank: {Happiness.Rank},
                            Life Expectancy: {Healthy.life.expectancy}")) %>% 
  ggplot(aes(Happiness.Rank, Healthy.life.expectancy))+
  geom_point(aes(col=as.character(Year), text=popup_life), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette = "Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,10))+
  scale_y_continuous(breaks=seq(0.1,2,0.1))+
  labs(x="Happiness Rank", y="Healthy Life Expectancy")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- Poor positive correlation
- Countries with the same Healthy Life Expectancy value can sit at very different ranks

# Correlation between Happiness Rank and Freedom to Make Life Choices

ggplotly(world_happiness %>%
  mutate(popup_free = glue("{Country},
                            Rank: {Happiness.Rank},
                            Freedom of Choice: {Freedom.to.make.life.choices}")) %>% 
  ggplot(aes(Happiness.Rank, Freedom.to.make.life.choices))+
  geom_point(aes(col=as.character(Year), text=popup_free), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette = "Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,15))+
  scale_y_continuous(breaks=seq(0.1,1,0.1))+
  labs(x="Happiness Rank", y="Freedom to Make Life Choices")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- There appears to be almost no correlation between the two categories
- The lowest to the highest ranks can all be found at the same Freedom to Make Life Choices value

# Correlation between Happiness Rank and Perceptions of Corruption

ggplotly(world_happiness %>% 
  mutate(popup_gov = glue("{Country},
                            Rank: {Happiness.Rank},
                           Gov. Trust: {Perceptions.of.corruption}")) %>% 
  ggplot(aes(Happiness.Rank, Perceptions.of.corruption))+
  geom_point(aes(col=as.character(Year), text=popup_gov), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette = "Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,15))+
  scale_y_continuous(breaks=seq(0.1,1,0.1))+
  labs(x="Happiness Rank", y="Trust in The Government")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- There appears to be almost no correlation between the two categories
- The lowest to the highest ranks can all be found at the same Trust in The Government value

# Correlation between Happiness Rank and Generosity

ggplotly(world_happiness %>% 
  mutate(popup_gen = glue("{Country},
                            Rank: {Happiness.Rank},
                            Generosity: {Generosity}")) %>% 
  ggplot(aes(Happiness.Rank, Generosity))+
  geom_point(aes(col=as.character(Year), text=popup_gen), size=1.5)+
  geom_smooth(method=lm, se=F, col="red")+
  scale_color_brewer(palette = "Paired")+
  scale_x_continuous(trans="reverse", breaks=seq(1,158,15))+
  scale_y_continuous(breaks=seq(0.1,1,0.1))+
  labs(x="Happiness Rank", y="Generosity")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

## `geom_smooth()` using formula 'y ~ x'

Interpretations:
- There appears to be no correlation between the two categories
- The lowest to the highest ranks can all be found at the same Generosity value
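The visual impressions from the scatter plots can be checked numerically with cor(). The sketch below is a minimal illustration that uses only the first six rows of the 2019 data, copied from the glimpse(wh_2019) output, so the resulting numbers are not representative; on the full data the same call would be applied to world_happiness with all six factor columns.

```r
# First six rows of the 2019 data, copied from the glimpse(wh_2019) output
top6_2019 <- data.frame(
  Overall.rank   = 1:6,
  GDP.per.capita = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social.support = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Generosity     = c(0.153, 0.252, 0.271, 0.354, 0.322, 0.263)
)

factors <- c("GDP.per.capita", "Social.support", "Generosity")

# Spearman correlation of each factor with the rank
# (a negative value means the factor is higher for better-ranked countries)
rank_cor <- sapply(factors, function(f) {
  cor(top6_2019$Overall.rank, top6_2019[[f]], method = "spearman")
})
round(rank_cor, 2)
```

Spearman's method is used here because the rank is ordinal. Since rank 1 is the happiest country, a strong association with happiness shows up as a strongly negative coefficient.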

The Most Prominent Factor in the Ranking System

# Trend of each category of high ranked country per year

ggplotly(world_happiness %>% 
  mutate(popup_country = glue("Happiness Score: {Happiness.Score},
                              Rank: {Happiness.Rank}")) %>% 
  gather(Category, Value, GDP.per.capita, Social.support, Healthy.life.expectancy, 
         Freedom.to.make.life.choices, Perceptions.of.corruption, Generosity) %>% 
  mutate(popup_country = glue("{Category}: {Value},
                              {popup_country}")) %>% 
  filter(Country=="Denmark") %>% 
  ggplot(aes(Year, Value))+
  geom_line(aes(col=Category), size=1.5)+
  geom_point(aes(col=Category, text=popup_country), size=3)+
  scale_color_brewer(palette = "Paired")+
  scale_y_continuous(breaks=seq(0,2,0.2))+
  # labs(y="Happiness Rank")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
- In 2016, Denmark reached rank 1 as the happiest country, while its GDP per Capita increased relative to the year before
- Increases and decreases in Social Support do not have a significant impact on the rank
- The pattern most similar to Denmark's fluctuating rank appears to be Freedom to make life choices, which suggests a positive association with the happiness rank between 2015 and 2019
- The other categories are lower in value than GDP per Capita and Social Support, but each of them stays relatively consistent
- Of the six factors, the economic one (GDP per Capita) appears to be the most valued in Denmark
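The gather() call used for these charts still works, but tidyr documents it as superseded by pivot_longer(). The sketch below shows the equivalent reshaping on Denmark's 2018 and 2019 rows, with the values copied from the glimpse() output; it is a minimal illustration and not a replacement for the chart code.

```r
library(tidyr)

# Denmark's 2018 and 2019 rows, values copied from the glimpse() output
denmark <- data.frame(
  Year = c(2018L, 2019L),
  GDP.per.capita = c(1.351, 1.383),
  Social.support = c(1.590, 1.573),
  Healthy.life.expectancy = c(0.868, 0.996),
  Freedom.to.make.life.choices = c(0.683, 0.592),
  Generosity = c(0.284, 0.252),
  Perceptions.of.corruption = c(0.408, 0.410)
)

# Long format: one row per Year and Category, with the same columns gather() produces
denmark_long <- pivot_longer(
  denmark,
  cols = -Year,
  names_to = "Category",
  values_to = "Value"
)
denmark_long
```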

# Trend of each category of low ranked country per year

ggplotly(world_happiness %>% 
  mutate(popup_country = glue("Happiness Score: {Happiness.Score},
                              Rank: {Happiness.Rank}")) %>% 
  gather(Category, Value, GDP.per.capita, Social.support, Healthy.life.expectancy, 
         Freedom.to.make.life.choices, Perceptions.of.corruption, Generosity) %>% 
  mutate(popup_country = glue("{Category}: {Value},
                              {popup_country}")) %>% 
  filter(Country=="Burundi") %>% 
  ggplot(aes(Year, Value))+
  geom_line(aes(col=Category), size=1.5)+
  geom_point(aes(col=Category, text=popup_country), size=3)+
  scale_color_brewer(palette = "Paired")+
  scale_y_continuous(breaks=seq(0,2,0.2))+
  # labs(y="Happiness Rank")+
  theme_pander()+
  theme(panel.grid=element_line(linetype="dotdash")),
  tooltip="text") %>% 
  hide_legend()

## Warning: Ignoring unknown aesthetics: text

Interpretations:
- The rank rises by 2 during a large increase in Social Support, followed by a barely noticeable increase in GDP per Capita
- The rank jumps sharply during a large increase in healthy life expectancy, followed by freedom to make life choices and trust in government, which indicates a significant impact
- The rank drops when the Generosity value drops
- People’s well-being appears to matter more than money

Conclusion

Although their correlation with the rank appears weak, the two most prominent factors in a country's happiness rank are GDP per Capita and Social Support. However, this does not mean that every country's happiness rises and falls with those two factors. Take Denmark as an example: although GDP per Capita and Social Support contribute the most to its happiness score, the changes in the other factors also matter, because the happiness score combines all six factors. Moreover, Denmark's freedom to make life choices follows a pattern similar to its rank from 2015 to 2019, which suggests that this factor has the strongest positive association with its ranking. Burundi, the country with the lowest happiness score, suggests instead that people’s well-being and social life matter more than material wealth and money. We cannot determine precisely what makes a country happy from these scores, because people’s experiences of life in a country are so diverse. However, we can treat these scores as one indication of the condition of the people in our countries.

Reference

https://www.kaggle.com/unsdsn/world-happiness