Introduction:

I am interested in working with data that is relevant to my life and how I am feeling. Lately I have noticed that many of my colleagues are more concerned about their own happiness and well-being. This led me to focus on what contributes to happiness. I found datasets on Kaggle that measure happiness by country.

Coming out of the COVID lockdowns of 2020-2021, I noticed that my friends' stress levels were at an all-time high. I understand that happiness changes over time, but was it greatly affected by a pandemic-level threat like COVID? Did government controls or aid provide any relief and contribute to happiness around the world from 2020 to 2021?

#Goal and Motivation

My motivation for working on this proposal is to determine whether happiness levels increased from 2020 to 2021 as lockdowns eased, or decreased even further because of the stress that built up after the lockdowns and COVID. I understand that government played a role in how our society was able to move. I will explore how the happiness score varies across countries in different parts of the world. I will also identify the groups of countries (regions) with high happiness across the globe, and the country with the highest happiness score in each region. These regions will be highlighted on a world map indicating the happiest and saddest parts of the globe. I am determined to learn more about happiness and how to bring it to my own mental health. My personal goal is to showcase my R programming.

APPROACH

In answering the question above, the following approach was followed:

Acquire the 2020 and 2021 World Happiness Report datasets. Select the columns that relate to happiness. Check for missing values. Summarize each year with descriptive statistics. Visualize relationships between the variables with scatterplots and correlation plots. Analyze and conclude.

Data sources:

I obtained the datasets from:

  1. 2020 Happiness level: https://www.kaggle.com/datasets/londeen/world-happiness-report-2020
  2. 2021 Happiness level: https://www.kaggle.com/datasets/ajaypalsinghlo/world-happiness-report-2021

Work citation:

  3. https://scrumbook.org/retrospective-pattern-language/happiness-metric.html

Note: the datasets are uploaded to GitHub, and I read the raw files from there: https://github.com/Wilchau/Data_607_Final_Project

Setting up the data and libraries

happy_20 <- read.csv("https://raw.githubusercontent.com/Wilchau/Data_607_Final_Project/main/world_happiness_report_2020.csv", header=TRUE)

happy_21 <- read.csv("https://raw.githubusercontent.com/Wilchau/Data_607_Final_Project/main/world_happiness_report_2021.csv", header=TRUE)
library(readr)
library(readxl)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(stringr)
library(e1071)
library(tidyverse)
## ── Attaching packages
## ───────────────────────────────────────
## tidyverse 1.3.2 ──
## ✔ tibble  3.1.8     ✔ forcats 0.5.2
## ✔ purrr   0.3.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(corrplot)
## corrplot 0.92 loaded
library(RColorBrewer)
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## 
## Attaching package: 'Hmisc'
## 
## The following object is masked from 'package:e1071':
## 
##     impute
## 
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## 
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(ggpubr)
head(happy_20)
##   Country.name Regional.indicator Ladder.score Standard.error.of.ladder.score
## 1      Finland     Western Europe       7.8087                     0.03115630
## 2      Denmark     Western Europe       7.6456                     0.03349229
## 3  Switzerland     Western Europe       7.5599                     0.03501417
## 4      Iceland     Western Europe       7.5045                     0.05961586
## 5       Norway     Western Europe       7.4880                     0.03483738
## 6  Netherlands     Western Europe       7.4489                     0.02779175
##   upperwhisker lowerwhisker Logged.GDP.per.capita Social.support
## 1     7.869766     7.747634              10.63927      0.9543297
## 2     7.711245     7.579955              10.77400      0.9559908
## 3     7.628528     7.491272              10.97993      0.9428466
## 4     7.621347     7.387653              10.77256      0.9746696
## 5     7.556281     7.419719              11.08780      0.9524866
## 6     7.503372     7.394428              10.81271      0.9391388
##   Healthy.life.expectancy Freedom.to.make.life.choices  Generosity
## 1                71.90083                    0.9491722 -0.05948202
## 2                72.40250                    0.9514443  0.06620178
## 3                74.10245                    0.9213367  0.10591104
## 4                73.00000                    0.9488919  0.24694422
## 5                73.20078                    0.9557503  0.13453263
## 6                72.30092                    0.9085478  0.20761244
##   Perceptions.of.corruption Ladder.score.in.Dystopia
## 1                 0.1954446                 1.972317
## 2                 0.1684895                 1.972317
## 3                 0.3037284                 1.972317
## 4                 0.7117097                 1.972317
## 5                 0.2632182                 1.972317
## 6                 0.3647171                 1.972317
##   Explained.by..Log.GDP.per.capita Explained.by..Social.support
## 1                         1.285190                     1.499526
## 2                         1.326949                     1.503449
## 3                         1.390774                     1.472403
## 4                         1.326502                     1.547567
## 5                         1.424207                     1.495173
## 6                         1.338946                     1.463646
##   Explained.by..Healthy.life.expectancy
## 1                             0.9612714
## 2                             0.9793326
## 3                             1.0405332
## 4                             1.0008434
## 5                             1.0080719
## 6                             0.9756753
##   Explained.by..Freedom.to.make.life.choices Explained.by..Generosity
## 1                                  0.6623167                0.1596704
## 2                                  0.6650399                0.2427934
## 3                                  0.6289545                0.2690558
## 4                                  0.6619807                0.3623302
## 5                                  0.6702009                0.2879851
## 6                                  0.6136265                0.3363176
##   Explained.by..Perceptions.of.corruption Dystopia...residual
## 1                               0.4778573            2.762835
## 2                               0.4952603            2.432741
## 3                               0.4079459            2.350267
## 4                               0.1445408            2.460688
## 5                               0.4341006            2.168266
## 6                               0.3685698            2.352117
head(happy_21)
##   Country.name Regional.indicator Ladder.score Standard.error.of.ladder.score
## 1      Finland     Western Europe        7.842                          0.032
## 2      Denmark     Western Europe        7.620                          0.035
## 3  Switzerland     Western Europe        7.571                          0.036
## 4      Iceland     Western Europe        7.554                          0.059
## 5  Netherlands     Western Europe        7.464                          0.027
## 6       Norway     Western Europe        7.392                          0.035
##   upperwhisker lowerwhisker Logged.GDP.per.capita Social.support
## 1        7.904        7.780                10.775          0.954
## 2        7.687        7.552                10.933          0.954
## 3        7.643        7.500                11.117          0.942
## 4        7.670        7.438                10.878          0.983
## 5        7.518        7.410                10.932          0.942
## 6        7.462        7.323                11.053          0.954
##   Healthy.life.expectancy Freedom.to.make.life.choices Generosity
## 1                    72.0                        0.949     -0.098
## 2                    72.7                        0.946      0.030
## 3                    74.4                        0.919      0.025
## 4                    73.0                        0.955      0.160
## 5                    72.4                        0.913      0.175
## 6                    73.3                        0.960      0.093
##   Perceptions.of.corruption Ladder.score.in.Dystopia
## 1                     0.186                     2.43
## 2                     0.179                     2.43
## 3                     0.292                     2.43
## 4                     0.673                     2.43
## 5                     0.338                     2.43
## 6                     0.270                     2.43
##   Explained.by..Log.GDP.per.capita Explained.by..Social.support
## 1                            1.446                        1.106
## 2                            1.502                        1.108
## 3                            1.566                        1.079
## 4                            1.482                        1.172
## 5                            1.501                        1.079
## 6                            1.543                        1.108
##   Explained.by..Healthy.life.expectancy
## 1                                 0.741
## 2                                 0.763
## 3                                 0.816
## 4                                 0.772
## 5                                 0.753
## 6                                 0.782
##   Explained.by..Freedom.to.make.life.choices Explained.by..Generosity
## 1                                      0.691                    0.124
## 2                                      0.686                    0.208
## 3                                      0.653                    0.204
## 4                                      0.698                    0.293
## 5                                      0.647                    0.302
## 6                                      0.703                    0.249
##   Explained.by..Perceptions.of.corruption Dystopia...residual
## 1                                   0.481               3.253
## 2                                   0.485               2.868
## 3                                   0.413               2.839
## 4                                   0.170               2.967
## 5                                   0.384               2.798
## 6                                   0.427               2.580
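Before comparing the two years, it may help to confirm that both files share the same column layout and to see which countries appear in only one of them. The following is a minimal base R sketch, not part of the original analysis. The data frames `happy_20_demo` and `happy_21_demo` are hypothetical stand-ins built from the first three rows printed above; in practice the same checks would be run on `happy_20` and `happy_21`.

```r
# Hypothetical stand-ins for happy_20 and happy_21, using the first three
# rows and three of the columns printed by head() above.
happy_20_demo <- data.frame(
  Country.name = c("Finland", "Denmark", "Switzerland"),
  Regional.indicator = "Western Europe",
  Ladder.score = c(7.8087, 7.6456, 7.5599)
)
happy_21_demo <- data.frame(
  Country.name = c("Finland", "Denmark", "Switzerland"),
  Regional.indicator = "Western Europe",
  Ladder.score = c(7.842, 7.620, 7.571)
)

# TRUE when both years expose the same columns in the same order,
# which is what selecting columns by position relies on.
same_columns <- identical(names(happy_20_demo), names(happy_21_demo))

# Countries present in one year but not the other (both empty for the demo).
only_in_20 <- setdiff(happy_20_demo$Country.name, happy_21_demo$Country.name)
only_in_21 <- setdiff(happy_21_demo$Country.name, happy_20_demo$Country.name)

print(same_columns)
print(only_in_20)
print(only_in_21)
```

On the full data, `only_in_20` and `only_in_21` would list the countries that make the two years differ in size.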

Data Cleaning for 2020 and 2021

In order to focus on the variables that contribute to happiness, I will keep Country.name, Regional.indicator, Ladder.score, Logged.GDP.per.capita, Social.support, Healthy.life.expectancy, Freedom.to.make.life.choices, Generosity, Perceptions.of.corruption, and Dystopia...residual, which are columns 1, 2, 3, 7, 8, 9, 10, 11, 12, and 20.

New_20 <- happy_20 %>% select(1, 2, 3, 7,8, 9, 10, 11, 12, 20)
New_21 <- happy_21 %>% select(1, 2, 3, 7,8, 9, 10, 11, 12, 20)
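Selecting by position works only as long as both files keep the same column order. As an alternative, the sketch below selects the same ten columns by name and stops with an error if any are missing. It is written in base R for self-containment. The helper `select_by_name` and the one-row data frame `happy_demo` (the Finland 2020 row from the output above, plus one column that should be dropped) are illustrative assumptions, not part of the original code.

```r
# The ten columns kept in New_20 and New_21, by name instead of position.
keep_cols <- c(
  "Country.name", "Regional.indicator", "Ladder.score",
  "Logged.GDP.per.capita", "Social.support", "Healthy.life.expectancy",
  "Freedom.to.make.life.choices", "Generosity",
  "Perceptions.of.corruption", "Dystopia...residual"
)

# Hypothetical helper: keep only `cols`, failing if any are absent.
select_by_name <- function(df, cols) {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  df[, cols, drop = FALSE]
}

# One-row demo built from the Finland 2020 row shown by head(happy_20).
happy_demo <- data.frame(
  Country.name = "Finland",
  Regional.indicator = "Western Europe",
  Ladder.score = 7.8087,
  Standard.error.of.ladder.score = 0.03115630,  # not in keep_cols, so dropped
  Logged.GDP.per.capita = 10.63927,
  Social.support = 0.9543297,
  Healthy.life.expectancy = 71.90083,
  Freedom.to.make.life.choices = 0.9491722,
  Generosity = -0.05948202,
  Perceptions.of.corruption = 0.1954446,
  Dystopia...residual = 2.762835
)

new_demo <- select_by_name(happy_demo, keep_cols)
print(names(new_demo))
```

With dplyr loaded, `select(all_of(keep_cols))` should give the same result on the full data frames.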

Next, I check whether the new data frames New_20 and New_21 contain any missing values. Neither does.

sum(is.na(New_20))
## [1] 0
sum(is.na(New_21))
## [1] 0
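`sum(is.na(...))` returns a single total, so it cannot say where a missing value sits. If a future version of the data did contain gaps, a per-column count and a list of affected rows would be more useful. The sketch below shows this on a small demo data frame. The `NA` is inserted on purpose for illustration; the real datasets have none.

```r
# Demo data from the head() output above, with one NA inserted on purpose
# (the real New_20 and New_21 contain no missing values).
demo <- data.frame(
  Country.name = c("Finland", "Denmark", "Switzerland"),
  Ladder.score = c(7.8087, NA, 7.5599),
  Social.support = c(0.9543297, 0.9559908, 0.9428466)
)

# Number of missing values in each column.
na_per_column <- colSums(is.na(demo))

# Rows that contain at least one missing value.
rows_with_na <- demo[!complete.cases(demo), ]

print(na_per_column)
print(rows_with_na)
```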

Statistical analysis

summary(New_20)
##  Country.name       Regional.indicator  Ladder.score   Logged.GDP.per.capita
##  Length:153         Length:153         Min.   :2.567   Min.   : 6.493       
##  Class :character   Class :character   1st Qu.:4.724   1st Qu.: 8.351       
##  Mode  :character   Mode  :character   Median :5.515   Median : 9.456       
##                                        Mean   :5.473   Mean   : 9.296       
##                                        3rd Qu.:6.229   3rd Qu.:10.265       
##                                        Max.   :7.809   Max.   :11.451       
##  Social.support   Healthy.life.expectancy Freedom.to.make.life.choices
##  Min.   :0.3195   Min.   :45.20           Min.   :0.3966              
##  1st Qu.:0.7372   1st Qu.:58.96           1st Qu.:0.7148              
##  Median :0.8292   Median :66.31           Median :0.7998              
##  Mean   :0.8087   Mean   :64.45           Mean   :0.7834              
##  3rd Qu.:0.9067   3rd Qu.:69.29           3rd Qu.:0.8777              
##  Max.   :0.9747   Max.   :76.80           Max.   :0.9750              
##    Generosity       Perceptions.of.corruption Dystopia...residual
##  Min.   :-0.30091   Min.   :0.1098            Min.   :0.2572     
##  1st Qu.:-0.12701   1st Qu.:0.6830            1st Qu.:1.6299     
##  Median :-0.03366   Median :0.7831            Median :2.0463     
##  Mean   :-0.01457   Mean   :0.7331            Mean   :1.9723     
##  3rd Qu.: 0.08543   3rd Qu.:0.8492            3rd Qu.:2.3503     
##  Max.   : 0.56066   Max.   :0.9356            Max.   :3.4408
summary(New_21)
##  Country.name       Regional.indicator  Ladder.score   Logged.GDP.per.capita
##  Length:149         Length:149         Min.   :2.523   Min.   : 6.635       
##  Class :character   Class :character   1st Qu.:4.852   1st Qu.: 8.541       
##  Mode  :character   Mode  :character   Median :5.534   Median : 9.569       
##                                        Mean   :5.533   Mean   : 9.432       
##                                        3rd Qu.:6.255   3rd Qu.:10.421       
##                                        Max.   :7.842   Max.   :11.647       
##  Social.support   Healthy.life.expectancy Freedom.to.make.life.choices
##  Min.   :0.4630   Min.   :48.48           Min.   :0.3820              
##  1st Qu.:0.7500   1st Qu.:59.80           1st Qu.:0.7180              
##  Median :0.8320   Median :66.60           Median :0.8040              
##  Mean   :0.8147   Mean   :64.99           Mean   :0.7916              
##  3rd Qu.:0.9050   3rd Qu.:69.60           3rd Qu.:0.8770              
##  Max.   :0.9830   Max.   :76.95           Max.   :0.9700              
##    Generosity       Perceptions.of.corruption Dystopia...residual
##  Min.   :-0.28800   Min.   :0.0820            Min.   :0.648      
##  1st Qu.:-0.12600   1st Qu.:0.6670            1st Qu.:2.138      
##  Median :-0.03600   Median :0.7810            Median :2.509      
##  Mean   :-0.01513   Mean   :0.7274            Mean   :2.430      
##  3rd Qu.: 0.07900   3rd Qu.:0.8450            3rd Qu.:2.794      
##  Max.   : 0.54200   Max.   :0.9390            Max.   :3.482

Data Summary

After column selection, the data sets have:

  1. happy_20 -> New_20: Number of variables: 10, Number of countries: 153
  2. happy_21 -> New_21: Number of variables: 10, Number of countries: 149

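The Ladder score is the main indicator of the “Happiness” level. Based on the summary function, the maximum ladder score rose from 7.809 in 2020 to 7.842 in 2021, and the mean rose from 5.473 to 5.533. The maxima for Social support (0.9747 to 0.9830), Logged GDP per capita (11.451 to 11.647), and Healthy life expectancy (76.80 to 76.95) also increased, which gives a somewhat better picture of well-being. The stress I still see is in the maxima of Perceptions of corruption (0.9356 to 0.9390) and Dystopia residual (3.4408 to 3.482), which increased from 2020 to 2021, and in the maxima of Freedom to make life choices (0.9750 to 0.9700) and Generosity (0.56066 to 0.54200), which decreased. The means tell a slightly different story: mean Freedom to make life choices rose (0.7834 to 0.7916) and mean Perceptions of corruption fell (0.7331 to 0.7274), while mean Generosity was nearly unchanged (-0.01457 to -0.01513). Because the two years cover different numbers of countries (153 versus 149), these summary-level differences are only a rough comparison.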
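Since the two years cover different numbers of countries (153 and 149), one way to address the main question is to match countries across years and look at the change in Ladder score for each. The sketch below does this in base R with `merge()`. The data frames `scores_20` and `scores_21` are hypothetical stand-ins that hold only the six countries printed by `head()` above; on the full data the same steps would use `New_20` and `New_21`.

```r
# Ladder scores for the six countries shown by head(happy_20) and
# head(happy_21); stand-ins for the full New_20 and New_21.
scores_20 <- data.frame(
  Country.name = c("Finland", "Denmark", "Switzerland",
                   "Iceland", "Norway", "Netherlands"),
  Ladder.score = c(7.8087, 7.6456, 7.5599, 7.5045, 7.4880, 7.4489)
)
scores_21 <- data.frame(
  Country.name = c("Finland", "Denmark", "Switzerland",
                   "Iceland", "Netherlands", "Norway"),
  Ladder.score = c(7.842, 7.620, 7.571, 7.554, 7.464, 7.392)
)

# Inner join on country: only countries present in both years are kept.
both <- merge(scores_20, scores_21, by = "Country.name",
              suffixes = c(".2020", ".2021"))

# Positive values mean the score rose from 2020 to 2021.
both$change <- both$Ladder.score.2021 - both$Ladder.score.2020

# Sort from largest drop to largest gain.
both <- both[order(both$change), ]
print(both)

# Average change across the matched countries.
mean_change <- mean(both$change)
print(mean_change)
```

With all matched countries available, a paired test such as `t.test(both$Ladder.score.2021, both$Ladder.score.2020, paired = TRUE)` could indicate whether the average change differs from zero. Six countries are too few to draw any conclusion.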

Different types of presentation

ggplot(happy_20, aes(x = Healthy.life.expectancy, y = Freedom.to.make.life.choices)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

ggplot(happy_21, aes(x = Healthy.life.expectancy, y = Freedom.to.make.life.choices)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

ggplot(happy_20, aes(x = Healthy.life.expectancy, y = Social.support)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

ggplot(happy_21, aes(x = Healthy.life.expectancy, y = Social.support)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

corrplot(cor(happy_20 %>% 
               select(Ladder.score,Social.support, Freedom.to.make.life.choices,Logged.GDP.per.capita, Generosity)), 
         method="color",  
         sig.level = 0.01, insig = "blank",
         addCoef.col = "black", 
         tl.srt=45, 
         type="upper")

corrplot(cor(happy_21 %>% 
               select(Ladder.score,Social.support, Freedom.to.make.life.choices,Logged.GDP.per.capita, Generosity)), 
         method="color",  
         sig.level = 0.01, insig = "blank",
         addCoef.col = "black", 
         tl.srt=45, 
         type="upper")

##Observation

The 2020 and 2021 results suggest that freedom to make life choices and social support both contribute meaningfully to happiness. On the correlation maps, social support, freedom to make life choices, GDP per capita, and generosity all come into play as well.

#Conclusion

Happiness is defined by many variables, but professional studies such as the World Happiness Report suggest that social support, freedom to make life choices, GDP per capita, and generosity play a large role in the happiness we can achieve as a society.

#Works cited

  1. https://www.pursuit-of-happiness.org/history-of-happiness/ed-diener/
  2. https://www.kaggle.com/datasets/londeen/world-happiness-report-2020
  3. https://www.kaggle.com/datasets/ajaypalsinghlo/world-happiness-report-2021