Background

This dataset was posted by Magnus Skonberg on the week 5 discussion board in DATA 607. The full documentation for it can be found at this link: https://www.kaggle.com/unsdsn/world-happiness

The proposed analysis is:

Compare the (Happiness) Score and GDP per capita for the Top 20 countries to observe the correlation (if there is any).

I will therefore base my analysis on the 2019 data.

Libraries

library(tidyverse)
## -- Attaching packages --------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.2
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ------------------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, units

Data Transformation

Data insights and cleansing

# Get the data

data <- read.csv("https://raw.githubusercontent.com/jnataky/DATA-607/master/A2_Various_dataset_transformation/happy_data2019.csv")
# make a copy of the data frame 

data_copy <- data

dim(data_copy)
## [1] 156   9
# Get the insights

head(data_copy)
##   Overall.rank Country.or.region Score GDP.per.capita Social.support
## 1            1           Finland 7.769          1.340          1.587
## 2            2           Denmark 7.600          1.383          1.573
## 3            3            Norway 7.554          1.488          1.582
## 4            4           Iceland 7.494          1.380          1.624
## 5            5       Netherlands 7.488          1.396          1.522
## 6            6       Switzerland 7.480          1.452          1.526
##   Healthy.life.expectancy Freedom.to.make.life.choices Generosity
## 1                   0.986                        0.596      0.153
## 2                   0.996                        0.592      0.252
## 3                   1.028                        0.603      0.271
## 4                   1.026                        0.591      0.354
## 5                   0.999                        0.557      0.322
## 6                   1.052                        0.572      0.263
##   Perceptions.of.corruption
## 1                     0.393
## 2                     0.410
## 3                     0.341
## 4                     0.118
## 5                     0.298
## 6                     0.343
# Check for missing values

sum(is.na(data_copy))
## [1] 0
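A single total does not show which columns contain the gaps when the count is above zero. As a minimal sketch, assuming a small made-up data frame (`toy` and its values are hypothetical and not part of the happiness dataset), `colSums(is.na(...))` gives a per-column count:

```r
# Toy data frame with deliberately missing values (hypothetical, for illustration only)
toy <- data.frame(
  country = c("A", "B", "C"),
  score   = c(7.1, NA, 6.5),
  gdp     = c(1.2, 1.3, NA),
  stringsAsFactors = FALSE
)

# Total number of missing values across the whole data frame
total_na <- sum(is.na(toy))

# Number of missing values in each column, which shows where the gaps are
na_by_column <- colSums(is.na(toy))

print(total_na)
print(na_by_column)
```

For the 2019 data the total is already 0, so the per-column view would only matter for a file that does contain missing values.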
# Retrieve columns names

colnames(data_copy)
## [1] "Overall.rank"                 "Country.or.region"           
## [3] "Score"                        "GDP.per.capita"              
## [5] "Social.support"               "Healthy.life.expectancy"     
## [7] "Freedom.to.make.life.choices" "Generosity"                  
## [9] "Perceptions.of.corruption"
# Rename columns

data_copy <- data_copy %>%
  rename(
    rank = Overall.rank,
    country = Country.or.region,
    happiness_score = Score,
    GDP_per_capita = GDP.per.capita,
    freedom = Freedom.to.make.life.choices,
    generosity = Generosity,
    corruption = Perceptions.of.corruption
  )

Tidying data

# make a copy of data set
data_copy1 <- data_copy
# create new data frame with necessary variables for analysis

data1 <- data_copy1 %>%
  select(country, happiness_score, GDP_per_capita)
# select top 20 countries

data2 <- head(data1, 20)
data2 %>%
  kbl(caption = "Countries rank for Happiness & GDP per capita", align = 'c') %>%
  kable_material(c("striped", "hover")) %>%
  row_spec(0, color = "indigo")
Countries rank for Happiness & GDP per capita

country          happiness_score   GDP_per_capita
Finland          7.769             1.340
Denmark          7.600             1.383
Norway           7.554             1.488
Iceland          7.494             1.380
Netherlands      7.488             1.396
Switzerland      7.480             1.452
Sweden           7.343             1.387
New Zealand      7.307             1.303
Canada           7.278             1.365
Austria          7.246             1.376
Australia        7.228             1.372
Costa Rica       7.167             1.034
Israel           7.139             1.276
Luxembourg       7.090             1.609
United Kingdom   7.054             1.333
Ireland          7.021             1.499
Germany          6.985             1.373
Belgium          6.923             1.356
United States    6.892             1.433
Czech Republic   6.852             1.269
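Selecting the top 20 with `head()` works here because the 2019 file appears to be stored in rank order already. If that could not be relied on, one option is to select by the rank column itself. The sketch below uses a small made-up table (`toy` and its values are hypothetical) and `dplyr::slice_min()` to show the difference:

```r
library(dplyr)

# Toy ranking table in shuffled order (hypothetical values, for illustration only)
toy <- data.frame(
  rank            = c(3, 1, 4, 2, 5),
  country         = c("C", "A", "D", "B", "E"),
  happiness_score = c(7.2, 7.8, 7.0, 7.5, 6.9),
  stringsAsFactors = FALSE
)

# head() returns the first rows as stored, so the result depends on row order
first_rows <- head(toy, 3)

# slice_min() keeps the rows with the smallest rank values, whatever the row order
top3 <- toy %>% slice_min(rank, n = 3)

print(first_rows$country)
print(top3$country)
```

Applied to this analysis, the same idea would mean keeping the `rank` column until after the top 20 rows have been chosen.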

Data Analysis

ggplot(data = data2) +
  geom_point(mapping = aes(x = GDP_per_capita, y = happiness_score, color = country)) +
  geom_smooth(mapping = aes( x = GDP_per_capita, y = happiness_score), se = FALSE, color = "red")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

The scatter plot shows no clear linear relationship between GDP per capita and happiness score for the top 20 countries; at most, there is a very weak association.

Let us verify this by computing the correlation coefficient:

cor(data2$GDP_per_capita, data2$happiness_score)
## [1] 0.08958574
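A correlation coefficient on its own does not say whether it differs from zero by more than chance, especially with only 20 observations. As a hedged sketch, the block below rebuilds the top 20 values from the table in this report (the `top20` data frame name is an assumption made for this example) and runs base R's `cor.test()`. A Spearman rank correlation is included as a check that is less sensitive to extreme values:

```r
# Top 20 values copied from the table in this report (2019 data)
top20 <- data.frame(
  happiness_score = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480, 7.343, 7.307, 7.278, 7.246,
                      7.228, 7.167, 7.139, 7.090, 7.054, 7.021, 6.985, 6.923, 6.892, 6.852),
  GDP_per_capita  = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452, 1.387, 1.303, 1.365, 1.376,
                      1.372, 1.034, 1.276, 1.609, 1.333, 1.499, 1.373, 1.356, 1.433, 1.269)
)

# Pearson correlation with a significance test and a 95% confidence interval
pearson <- cor.test(top20$GDP_per_capita, top20$happiness_score, method = "pearson")

# Spearman rank correlation, based on ranks rather than raw values
spearman <- cor.test(top20$GDP_per_capita, top20$happiness_score,
                     method = "spearman", exact = FALSE)

print(pearson$estimate)
print(pearson$p.value)
print(pearson$conf.int)
print(spearman$estimate)
```

With a coefficient this close to zero and a sample of 20, the Pearson p-value comes out well above 0.05, so the data give no evidence of a linear association within this group. This says nothing about the full set of 156 countries, which is not examined here.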

Findings

As Rousseau wrote in 1750: “Money buys everything, except morality and citizens.” Used ironically as “MONEY CAN’T BUY HAPPINESS”…

Within the top 20 countries, this analysis is consistent with the saying that money can’t buy happiness: the correlation between GDP per capita and happiness score is very small (about 0.09).
