This project investigates world happiness. Many factors contribute to a person's happiness, such as income, social support, and health. The UN has published the World Happiness Report nearly every year since 2012. The report underscores a global desire for governments to prioritize happiness and well-being when shaping their policies.
What is the happiest country in the world? Does money make people happy?
The data come from the 2023 UN World Happiness Report:
https://happiness-report.s3.amazonaws.com/2023/DataForTable2.1WHR2023.xls
The full report, which documents the data, can be found at: https://worldhappiness.report/ed/2023/
The data include the life ladder score of each country from 2005 to 2022; the happiness rankings are based on individuals' own assessments of their lives. The other variables collected are: Log GDP per capita, Social support, Healthy life expectancy at birth, Freedom to make life choices, Generosity, Perceptions of corruption, Positive affect, and Negative affect.
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.2.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.3
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.2.3
## corrplot 0.92 loaded
library(readr)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
WHR2023 <- read_csv("C:/Users/toanp/Desktop/MSCS/Classes/CSC530-DataAnalysis/Data/WHR2023.csv")
## Rows: 2199 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Country_name
## dbl (10): year, Life_ladder, Log_GDP_per_capita, Social_support, Healthy_lif...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
summary(WHR2023)
## Country_name year Life_ladder Log_GDP_per_capita
## Length:2199 Min. :2005 Min. :1.281 Min. : 5.527
## Class :character 1st Qu.:2010 1st Qu.:4.647 1st Qu.: 8.500
## Mode :character Median :2014 Median :5.432 Median : 9.499
## Mean :2014 Mean :5.479 Mean : 9.390
## 3rd Qu.:2018 3rd Qu.:6.309 3rd Qu.:10.373
## Max. :2022 Max. :8.019 Max. :11.664
## NA's :20
## Social_support Healthy_life_expectancy_at_birth Freedom_to_make_life_choices
## Min. :0.2280 Min. : 6.72 Min. :0.2580
## 1st Qu.:0.7470 1st Qu.:59.12 1st Qu.:0.6562
## Median :0.8360 Median :65.05 Median :0.7700
## Mean :0.8107 Mean :63.29 Mean :0.7479
## 3rd Qu.:0.9050 3rd Qu.:68.50 3rd Qu.:0.8590
## Max. :0.9870 Max. :74.47 Max. :0.9850
## NA's :13 NA's :54 NA's :33
## Generosity Perceptions_of_corruption Positive_affect Negative_affect
## Min. :-0.33800 Min. :0.0350 Min. :0.1790 Min. :0.0830
## 1st Qu.:-0.11200 1st Qu.:0.6880 1st Qu.:0.5720 1st Qu.:0.2080
## Median :-0.02300 Median :0.8000 Median :0.6630 Median :0.2610
## Mean : 0.00009 Mean :0.7452 Mean :0.6521 Mean :0.2715
## 3rd Qu.: 0.09200 3rd Qu.:0.8690 3rd Qu.:0.7380 3rd Qu.:0.3230
## Max. : 0.70300 Max. :0.9830 Max. :0.8840 Max. :0.7050
## NA's :73 NA's :116 NA's :24 NA's :16
str(WHR2023)
## spc_tbl_ [2,199 × 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ Country_name : chr [1:2199] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
## $ year : num [1:2199] 2008 2009 2010 2011 2012 ...
## $ Life_ladder : num [1:2199] 3.72 4.4 4.76 3.83 3.78 ...
## $ Log_GDP_per_capita : num [1:2199] 7.35 7.51 7.61 7.58 7.66 ...
## $ Social_support : num [1:2199] 0.451 0.552 0.539 0.521 0.521 0.484 0.526 0.529 0.559 0.491 ...
## $ Healthy_life_expectancy_at_birth: num [1:2199] 50.5 50.8 51.1 51.4 51.7 ...
## $ Freedom_to_make_life_choices : num [1:2199] 0.718 0.679 0.6 0.496 0.531 0.578 0.509 0.389 0.523 0.427 ...
## $ Generosity : num [1:2199] 0.168 0.191 0.121 0.164 0.238 0.063 0.106 0.082 0.044 -0.119 ...
## $ Perceptions_of_corruption : num [1:2199] 0.882 0.85 0.707 0.731 0.776 0.823 0.871 0.881 0.793 0.954 ...
## $ Positive_affect : num [1:2199] 0.414 0.481 0.517 0.48 0.614 0.547 0.492 0.491 0.501 0.435 ...
## $ Negative_affect : num [1:2199] 0.258 0.237 0.275 0.267 0.268 0.273 0.375 0.339 0.348 0.371 ...
## - attr(*, "spec")=
## .. cols(
## .. Country_name = col_character(),
## .. year = col_double(),
## .. Life_ladder = col_double(),
## .. Log_GDP_per_capita = col_double(),
## .. Social_support = col_double(),
## .. Healthy_life_expectancy_at_birth = col_double(),
## .. Freedom_to_make_life_choices = col_double(),
## .. Generosity = col_double(),
## .. Perceptions_of_corruption = col_double(),
## .. Positive_affect = col_double(),
## .. Negative_affect = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
The only step between the original dataset and the dataset used for the analysis was dropping every row that contains a missing value, so all results below are based on complete cases.
#Remove N/A Values
WHR2023 <- na.omit(WHR2023)
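Because `na.omit()` removes a whole row when any single column is missing, it can help to check how many rows are lost. The following is a minimal sketch on a small made-up data frame (`toy` and its values are illustrative assumptions, not rows from WHR2023); the same calls could be run on `WHR2023` before the cleaning step.

```r
# Toy data standing in for WHR2023 (values are invented for illustration)
toy <- data.frame(
  Country_name = c("A", "A", "B", "B", "C"),
  year         = c(2020, 2021, 2020, 2021, 2021),
  Life_ladder  = c(5.1, 5.3, 6.8, NA, 4.2),
  Generosity   = c(0.10, NA, -0.05, 0.02, NA)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Compare the row count before and after listwise deletion
rows_before  <- nrow(toy)
toy_complete <- na.omit(toy)
rows_after   <- nrow(toy_complete)

cat("Rows dropped:", rows_before - rows_after, "\n")
```

If a large share of rows disappears, an alternative worth considering is to drop missing values only in the columns a given analysis actually uses.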
HAPPIEST COUNTRY OVER YEARS
happiest_countries <- WHR2023 %>%
  group_by(year) %>%
  slice(which.max(Life_ladder))
# Print the result
happiest_countries
In recent years, according to the WHR 2023 data, Finland has been the happiest country in the world, followed by Denmark and Canada.
DOES MONEY MEAN HAPPINESS?
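One way to summarize the yearly winners is to count how many years each country ranked first. The sketch below uses a small invented data frame (`toy`, with placeholder countries "A", "B", "C") rather than the real report data, and uses `slice_max()` from dplyr as an alternative to `which.max()`.

```r
library(dplyr)

# Toy data standing in for WHR2023 (values are invented for illustration)
toy <- data.frame(
  Country_name = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  year         = c(2020, 2020, 2020, 2021, 2021, 2021, 2022, 2022, 2022),
  Life_ladder  = c(7.8, 7.6, 7.2, 7.5, 7.7, 7.1, 7.9, 7.4, 7.3)
)

# Keep the single highest-scoring country in each year
top_per_year <- toy %>%
  group_by(year) %>%
  slice_max(Life_ladder, n = 1, with_ties = FALSE) %>%
  ungroup()

# Count how many years each country ranked first, most frequent first
years_on_top <- top_per_year %>%
  count(Country_name, name = "years_on_top", sort = TRUE)

print(years_on_top)
```

Applied to the real data, a table like this would show directly which countries dominate the top spot and in how many years.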
ggplot(WHR2023, aes(x = Log_GDP_per_capita, y = Life_ladder)) +
  geom_point() +
  # Add labels and title
  labs(x = "Log GDP per Capita",
       y = "Life Ladder",
       title = "Scatter Plot between GDP and Life Ladder") +
  # Customize theme if needed
  theme_minimal()
According to the graph, there is a clear positive correlation between
happiness and money (GDP per capita): as log GDP per capita increases,
the life ladder score increases. So, is it safe to say that money brings
happiness? Let's investigate further with more information.
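The visual impression from a scatter plot can be backed by a number. As a minimal sketch, the Pearson correlation and a simple linear fit can be computed with base R; the data frame `toy` below holds invented values, so the printed figures say nothing about the real dataset.

```r
# Toy data standing in for WHR2023 (values are invented for illustration)
toy <- data.frame(
  Log_GDP_per_capita = c(7.2, 8.0, 8.6, 9.1, 9.8, 10.4, 11.0),
  Life_ladder        = c(3.9, 4.4, 4.6, 5.5, 5.9, 6.8, 7.1)
)

# Pearson correlation coefficient between the two variables
r <- cor(toy$Log_GDP_per_capita, toy$Life_ladder)

# Significance test for the correlation
test <- cor.test(toy$Log_GDP_per_capita, toy$Life_ladder)

# Simple linear regression: change in life ladder per unit of log GDP
fit   <- lm(Life_ladder ~ Log_GDP_per_capita, data = toy)
slope <- coef(fit)[["Log_GDP_per_capita"]]

cat("Correlation:", round(r, 3), "\n")
cat("p-value:", signif(test$p.value, 3), "\n")
cat("Slope:", round(slope, 3), "\n")
```

Because GDP is on a log scale, the slope describes the change in life ladder for a proportional change in GDP rather than for a fixed amount of money.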
MORE ABOUT CORRELATIONS
# Keep only the numeric variables
selected_columns <- WHR2023[, !names(WHR2023) %in% c('year', 'Country_name')]
# Calculate correlations with 'Life Ladder' (Life_ladder itself is included, with correlation 1)
correlations_with_life_ladder <- cor(selected_columns$Life_ladder, selected_columns)
single_row <- correlations_with_life_ladder[1, ]
# Order the variables from highest to lowest correlation
ordered_vars <- names(single_row)[order(-as.numeric(single_row))]
# Create a bar plot of the correlations with variable names displayed
barplot(as.numeric(single_row[ordered_vars]), names.arg = ordered_vars,
        main = "Correlation of Each Variable with Life Ladder",
        ylab = "Correlation coefficient",
        col = "skyblue",
        las = 2, # Rotates x-axis labels vertically for better visibility
        cex.names = 0.7) # Adjusts the size of variable names
Based on the correlation plot, money (GDP per capita) has the highest
correlation with the life ladder, followed by healthy life expectancy
and social support. Freedom and positive affect also contribute
substantially to happiness. Surprisingly, generosity has only a weak
correlation with the life ladder. One reason GDP per capita correlates
so strongly with happiness may be that it also contributes to the other
variables that bring happiness, such as healthy life expectancy, social
support, and freedom of choice.
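The idea that GDP partly acts through other variables could be probed by comparing a regression on GDP alone with one that also includes a second variable. The sketch below is not part of the original analysis: it uses simulated data in which the relationships are set by assumption (health depends on GDP, and happiness depends on both), purely to show what such a comparison looks like.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated data (an assumption for illustration, not WHR2023 values)
n      <- 500
gdp    <- rnorm(n)                                  # stand-in for Log_GDP_per_capita
health <- 0.8 * gdp + rnorm(n, sd = 0.5)            # health partly driven by GDP
ladder <- 0.5 * gdp + 0.5 * health + rnorm(n, sd = 0.5)
sim    <- data.frame(gdp, health, ladder)

# GDP alone: its coefficient absorbs the effect that runs through health
fit_simple <- lm(ladder ~ gdp, data = sim)

# GDP and health together: the GDP coefficient is its association
# with the ladder while holding health fixed
fit_multiple <- lm(ladder ~ gdp + health, data = sim)

gdp_alone    <- coef(fit_simple)[["gdp"]]
gdp_adjusted <- coef(fit_multiple)[["gdp"]]

cat("GDP coefficient, GDP only:       ", round(gdp_alone, 3), "\n")
cat("GDP coefficient, adjusted model: ", round(gdp_adjusted, 3), "\n")
```

If the GDP coefficient shrinks noticeably once other variables are added, that is consistent with part of its correlation with happiness running through those variables, although a regression alone does not establish causation.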
Link to presentation: https://youtu.be/zs7unP-3lf8