World Happiness Index

This analysis was originally published elsewhere; I have redone it here for my own practice.

\[ \text{Happy Planet Index} = \frac{\text{Life Expectancy} \times \text{Experienced Wellbeing} \times \text{Inequality of Outcomes}}{\text{Ecological Footprint}} \]

HPI tells us “how well nations are doing at achieving long, happy, sustainable lives”. The index is weighted to give progressively higher scores to nations with lower ecological footprints.
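
As a rough illustration of the formula above, the sketch below computes the simplified score for two made-up countries. The function name `hpi_score` and all input values are hypothetical. It mirrors only the simplified ratio shown here, not the official Happy Planet Index methodology, which may apply further adjustments.

```r
# Simplified score following the ratio above
# (hypothetical helper for illustration, not the official method)
hpi_score <- function(life_expectancy, wellbeing, inequality, footprint) {
  (life_expectancy * wellbeing * inequality) / footprint
}

# Made-up inputs: identical outcomes, different ecological footprints
low_footprint  <- hpi_score(life_expectancy = 75, wellbeing = 6, inequality = 0.8, footprint = 2)
high_footprint <- hpi_score(life_expectancy = 75, wellbeing = 6, inequality = 0.8, footprint = 8)

# With everything else equal, the smaller footprint gives the higher score
low_footprint > high_footprint
```

Because the footprint sits in the denominator, two countries with the same life expectancy and wellbeing can end up with very different scores.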

Libraries used

library(dplyr)
library(plotly)
library(stringr)
library(cluster)
library(FactoMineR)
library(factoextra)
library(ggplot2)
library(reshape2)
library(ggthemes)
library(NbClust)
library(readxl)
library(GGally)
library(maps)

Visualizing Correlations

# Map size and color in geom_point only, so geom_smooth does not inherit the size aesthetic
ggplot(hpi, aes(x = gdp, y = life_expectancy)) +
  geom_point(aes(size = population, color = region)) +
  coord_trans(x = 'log10') +
  geom_smooth(method = 'loess') +
  ggtitle('Life Expectancy and GDP per Capita in USD (log10)') +
  theme_classic()

ggplot(hpi, aes(x=life_expectancy, y=hpi_index)) + 
  geom_point(aes(size=population, color=region)) + 
  geom_smooth(method = 'loess') + 
  ggtitle('Life Expectancy and Happy Planet Index Score') + 
  theme_classic()

ggplot(hpi, aes(x=gdp, y=hpi_index)) + 
  geom_point(aes(size=population, color=region)) + 
  geom_smooth(method = 'loess') + 
  ggtitle('GDP per Capita(log10) and Happy Planet Index Score') +
  coord_trans(x = 'log10') +
  theme_classic()

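# Standardize the ten numeric columns (mean 0, sd 1) before correlation, PCA, and clustering
hpi[, 4:13] <- scale(hpi[, 4:13])

# Correlation heatmap; "pairwise.complete.obs" handles missing values pair by pair
ggplot(melt(cor(hpi[, 4:13], use = "pairwise.complete.obs")),
       aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(limits = c(-1, 1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Heatmap of Correlation Matrix", x = NULL, y = NULL)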

Principal Component Analysis

PCA is a procedure for identifying a smaller number of uncorrelated variables, called “principal components”, from a large set of data. The goal of principal component analysis is to explain the maximum amount of variance with the minimum number of principal components.

hpi.pca <- PCA(hpi[, 4:13], graph=FALSE)
eigenvalues <- hpi.pca$eig
fviz_screeplot(hpi.pca, addlabels = TRUE, ylim = c(0, 65))

(hpi.pca$var$contrib)

##                           Dim.1        Dim.2        Dim.3      Dim.4
## life_expectancy     12.27500107 2.298157e+00  0.002516184 18.4965447
## wellbeing           12.31846893 7.472989e-02  0.198445432 22.1593907
## happy_years         14.79371047 1.288175e-02  0.027105103  0.7180341
## footprint            9.02127688 2.471162e+01  2.982449522  0.4891428
## inequality_outcomes 13.36365052 3.049462e-01  0.010038818  9.7957329
## adj_life_expectancy 12.67789150 9.580098e-01  0.001525891 19.5589843
## adj_wellbeing       12.28394580 3.206506e-04  0.135269065 23.5949300
## hpi_index            3.57121564 5.096355e+01  5.368971166  2.1864830
## gdp                  9.68826525 1.157381e+01  1.003632002  2.3980025
## population           0.00657393 9.101975e+00 90.270046817  0.6027549
##                            Dim.5
## life_expectancy     3.179724e-01
## wellbeing           6.376146e+00
## happy_years         3.254368e-02
## footprint           7.629671e+00
## inequality_outcomes 2.976993e+00
## adj_life_expectancy 7.545215e-02
## adj_wellbeing       4.808546e+00
## hpi_index           5.284314e+00
## gdp                 7.249799e+01
## population          3.689407e-04

Variables that are strongly correlated with PC1 and PC2 are the most important in explaining the variability in the data set.
The contributions of the variables, in percent, are printed above: the larger the value, the more the variable contributes to that component.
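
To make the notion of contribution concrete, here is a minimal base-R sketch on the built-in `USArrests` data set, which is an assumption for illustration and not the HPI data. It uses `prcomp` rather than FactoMineR's `PCA`, and takes a variable's contribution to a component to be its squared loading expressed as a percentage. As far as I understand, this is the same quantity that `hpi.pca$var$contrib` reports.

```r
# PCA on standardized variables (built-in data, used only for illustration)
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Share of total variance explained by each component, in percent
var_explained <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Contribution of each variable to each component, in percent:
# the squared loadings of a component sum to 1, so each column sums to 100
contrib <- 100 * pca$rotation^2

round(var_explained, 1)
round(contrib, 1)
colSums(contrib)
```

Reading a column of `contrib` from top to bottom shows which variables dominate that component, which is how the table above can be read as well.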

fviz_pca_var(hpi.pca, col.var="contrib",gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"), repel = TRUE )

This highlights the most important variables in explaining the variations retained by the principal components.

Using PAM Clustering Analysis to group countries by wealth, development, carbon emissions, and happiness

number <- NbClust(hpi[, 4:13], distance="euclidean",
               min.nc=2, max.nc=15, method='kmeans', index='all', alphaBeale = 0.1)

## *** : The Hubert index is a graphical method of determining the number of clusters.
##                 In the plot of Hubert index, we seek a significant knee that corresponds to a 
##                 significant increase of the value of the measure i.e the significant peak in Hubert
##                 index second differences plot. 
##

## *** : The D index is a graphical method of determining the number of clusters. 
##                 In the plot of D index, we seek a significant knee (the significant peak in Dindex
##                 second differences plot) that corresponds to a significant increase of the value of
##                 the measure. 
##  
## ******************************************************************* 
## * Among all indices:                                                
## * 7 proposed 2 as the best number of clusters 
## * 7 proposed 3 as the best number of clusters 
## * 1 proposed 6 as the best number of clusters 
## * 1 proposed 7 as the best number of clusters 
## * 1 proposed 10 as the best number of clusters 
## * 2 proposed 12 as the best number of clusters 
## * 4 proposed 15 as the best number of clusters 
## 
##                    ***** Conclusion *****                            
##  
## * According to the majority rule, the best number of clusters is  2 
##  
##  
## *******************************************************************

Seven indices voted for 2 clusters and seven for 3. The majority rule reports 2, but the analysis below uses 3 clusters.

set.seed(2017)
pam <- pam(hpi[, 4:13], k = 3, diss = FALSE, keep.data = TRUE)
fviz_silhouette(pam)

##   cluster size ave.sil.width
## 1       1   43          0.46
## 2       2   66          0.32
## 3       3   31          0.37

hpi$country[pam$id.med]

## [1] "Liberia" "Romania" "Ireland"

fviz_cluster(pam, stand = FALSE, geom = "point",ellipse.type = "norm")

A world map of the three clusters

hpi['cluster'] <- as.factor(pam$clustering)
map <- map_data("world")
map <- left_join(map, hpi, by = c('region' = 'country'))

ggplot() + geom_polygon(data = map, aes(x = long, y = lat, group = group, fill=cluster, color=cluster)) +
  labs(title = "Clustering Happy Planet Index", subtitle = "Based on data from: http://happyplanetindex.org/", x = NULL, y = NULL) +
  theme_minimal()