Discussion12

happiness<-read.csv("https://raw.githubusercontent.com/hrensimin05/Cuny_DataScience/master/2019.csv")

# The data set has 156 observations and 9 variables.
# read.csv() already returns a data frame; `happy` is a copy under a shorter name.
happy <- data.frame(happiness)
head(happy)

##   Overall.rank Country.or.region Score GDP.per.capita Social.support
## 1            1           Finland 7.769          1.340          1.587
## 2            2           Denmark 7.600          1.383          1.573
## 3            3            Norway 7.554          1.488          1.582
## 4            4           Iceland 7.494          1.380          1.624
## 5            5       Netherlands 7.488          1.396          1.522
## 6            6       Switzerland 7.480          1.452          1.526
##   Healthy.life.expectancy Freedom.to.make.life.choices Generosity
## 1                   0.986                        0.596      0.153
## 2                   0.996                        0.592      0.252
## 3                   1.028                        0.603      0.271
## 4                   1.026                        0.591      0.354
## 5                   0.999                        0.557      0.322
## 6                   1.052                        0.572      0.263
##   Perceptions.of.corruption
## 1                     0.393
## 2                     0.410
## 3                     0.341
## 4                     0.118
## 5                     0.298
## 6                     0.343

summary(happy)

##   Overall.rank    Country.or.region      Score       GDP.per.capita  
##  Min.   :  1.00   Length:156         Min.   :2.853   Min.   :0.0000  
##  1st Qu.: 39.75   Class :character   1st Qu.:4.545   1st Qu.:0.6028  
##  Median : 78.50   Mode  :character   Median :5.380   Median :0.9600  
##  Mean   : 78.50                      Mean   :5.407   Mean   :0.9051  
##  3rd Qu.:117.25                      3rd Qu.:6.184   3rd Qu.:1.2325  
##  Max.   :156.00                      Max.   :7.769   Max.   :1.6840  
##  Social.support  Healthy.life.expectancy Freedom.to.make.life.choices
##  Min.   :0.000   Min.   :0.0000          Min.   :0.0000              
##  1st Qu.:1.056   1st Qu.:0.5477          1st Qu.:0.3080              
##  Median :1.272   Median :0.7890          Median :0.4170              
##  Mean   :1.209   Mean   :0.7252          Mean   :0.3926              
##  3rd Qu.:1.452   3rd Qu.:0.8818          3rd Qu.:0.5072              
##  Max.   :1.624   Max.   :1.1410          Max.   :0.6310              
##    Generosity     Perceptions.of.corruption
##  Min.   :0.0000   Min.   :0.0000           
##  1st Qu.:0.1087   1st Qu.:0.0470           
##  Median :0.1775   Median :0.0855           
##  Mean   :0.1848   Mean   :0.1106           
##  3rd Qu.:0.2482   3rd Qu.:0.1412           
##  Max.   :0.5660   Max.   :0.4530

library(tidyverse)

## -- Attaching packages ---------------------------------------------------------------------------------------------- tidyverse 1.3.0 --

## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.2
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0

## -- Conflicts ------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(corrplot)

## Warning: package 'corrplot' was built under R version 4.0.3

## corrplot 0.84 loaded

library(plotly)

## Warning: package 'plotly' was built under R version 4.0.3

## 
## Attaching package: 'plotly'

## The following object is masked from 'package:ggplot2':
## 
##     last_plot

## The following object is masked from 'package:stats':
## 
##     filter

## The following object is masked from 'package:graphics':
## 
##     layout

Correlation

cor.test(happy$Score,happy$Generosity)

## 
##  Pearson's product-moment correlation
## 
## data:  happy$Score and happy$Generosity
## t = 0.94366, df = 154, p-value = 0.3468
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.08229763  0.23022136
## sample estimates:
##        cor 
## 0.07582369
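
The test statistic, p-value, and confidence interval in this output all follow from the sample correlation and the sample size alone. The sketch below recomputes them from the two reported numbers (r = 0.07582369, n = 156) using the standard t statistic for a Pearson correlation and the Fisher z-transformation for the interval. The variable names are illustrative and not part of the original analysis.

```r
# Values copied from the cor.test() output above
r <- 0.07582369   # sample Pearson correlation
n <- 156          # number of countries (df = n - 2 = 154)

# t statistic for H0: the true correlation is 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

Because the interval contains 0 and the p-value is well above 0.05, the data do not let us reject the hypothesis of no linear correlation between Score and Generosity.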

corrplot(cor(happy %>% 
               select(Score,Generosity)), 
         method="color",  
         sig.level = 0.01, insig = "blank",
         addCoef.col = "black", 
         tl.srt=45, 
         type="upper"
         )

Linear Regression Model

model <- lm( Generosity  ~ Score  , data = happy)
summary(model)$adj.r.squared

## [1] -0.0007069411

summary(model)

## 
## Call:
## lm(formula = Generosity ~ Score, data = happy)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.18407 -0.07288 -0.00380  0.06494  0.38795 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.149762   0.037953   3.946  0.00012 ***
## Score       0.006489   0.006876   0.944  0.34682    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09529 on 154 degrees of freedom
## Multiple R-squared:  0.005749,   Adjusted R-squared:  -0.0007069 
## F-statistic: 0.8905 on 1 and 154 DF,  p-value: 0.3468
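
With a single predictor, the fit statistics in this summary are tied directly to the correlation reported earlier. The sketch below, using only numbers printed in the outputs above, shows why the adjusted R-squared comes out slightly negative: the penalty for the one predictor is larger than the tiny share of variance the model explains. Variable names here are illustrative.

```r
r <- 0.07582369  # correlation reported by cor.test()
n <- 156         # observations
k <- 1           # predictors in the model (Score)

# In simple linear regression, multiple R-squared is the squared correlation
r_squared <- r^2

# Adjusted R-squared penalises for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Overall F statistic expressed through R-squared
f_stat <- (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Slope t value: estimate divided by its standard error (rounded table values)
t_slope <- 0.006489 / 0.006876

round(c(r_squared = r_squared, adj_r_squared = adj_r_squared,
        f_stat = f_stat, t_slope = t_slope), 6)
```

The F statistic equals the square of the slope's t value (up to rounding), which is why both carry the same p-value of 0.3468.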

Residual Analysis

par.orig <- par(mfrow=c(2,2))
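# Plot residuals against the predictor on the scale used in the model (Score, not log(Score))
plot(happy$Score, resid(model), main="Predictor vs Residuals", xlab="Score", ylab="Residuals")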
abline(0,0)
plot(fitted(model), resid(model),main="Fitted vs Residuals", xlab="Fitted Values")
abline(0,0)
qqnorm(resid(model), main="QQ-Plot of Residuals")
qqline(resid(model))
hist(resid(model), main="Histogram of Residuals")

par(par.orig)
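
The four plots are read by eye. As a complement, the same residuals can be summarised numerically. The sketch below is not part of the original analysis: `residual_checks()` is a hypothetical helper, the Shapiro-Wilk test is one common choice for checking normality, and the demonstration uses simulated stand-in data so that the block runs on its own. On this report's model the call would be `residual_checks(model)`.

```r
# Hypothetical helper: numeric complements to the residual plots of an lm fit
residual_checks <- function(fit) {
  res <- resid(fit)
  list(
    # Shapiro-Wilk test: a small p-value suggests non-normal residuals
    shapiro_p  = shapiro.test(res)$p.value,
    # Mean residual: zero (up to rounding) for least squares with an intercept
    mean_resid = mean(res),
    # Correlation of |residual| with fitted values: far from 0 hints at
    # non-constant variance
    spread_cor = cor(abs(res), fitted(fit))
  )
}

# Simulated stand-in data (NOT the happiness data), used only to show the call
set.seed(1)
sim <- data.frame(x = runif(100, min = 3, max = 8))
sim$y <- 0.15 + 0.006 * sim$x + rnorm(100, sd = 0.1)
fit <- lm(y ~ x, data = sim)

checks <- residual_checks(fit)
str(checks)
```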

plot_ly(data = happy, 
        x=~Generosity, y=~Score, color=~Generosity, type = "scatter",
        text = ~paste("Country:", Country.or.region)) %>% 
        layout(title = "Happiness and Generosity", 
               xaxis = list(title = "Generosity"),
               yaxis = list(title = "Happiness Score"))

## No scatter mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode

## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.

corrplot(cor(happy %>% 
               select(Score:Perceptions.of.corruption)), 
         method="color",  
         sig.level = 0.01, insig = "blank",
         addCoef.col = "black", 
         tl.srt=45, 
         type="upper"
         )

Conclusion

After analyzing the data on global happiness levels, created by the United Nations Sustainable Development Solutions Network, I found no significant relationship between generosity and “happiness.” The Pearson correlation between Score and Generosity is 0.076 (p = 0.35), and the regression explains less than 1% of the variance (R-squared = 0.0057). I plotted the two variables against each other to confirm that there is no direct relationship between them.

I wanted to find out whether being generous really makes countries and their citizens happier, and whether generosity correlates significantly with happiness. I expected generosity to make people happier, but the analysis does not support that: in this data set, generosity is not a significant factor. The correlation plot across all variables, shown above, indicates which factor is most strongly associated with the happiness score in the World Happiness Report. It turned out to be gross domestic product (GDP) per capita: the richer the country, the happier its people tend to be.
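
The comparison across all variables can also be expressed as a ranked list rather than read off a plot. The sketch below is a minimal illustration: `rank_correlations()` is a hypothetical helper, and the demonstration data are just the six rows printed by `head(happy)` (four of the columns), which is far too few to draw conclusions from and only shows that the call runs. On the full data one would likely drop `Overall.rank` first, since the rank is derived from Score, for example `rank_correlations(subset(happy, select = -Overall.rank))`.

```r
# Hypothetical helper: rank numeric columns by absolute Pearson correlation
# with a target column (strongest association first)
rank_correlations <- function(df, target = "Score") {
  numeric_df <- df[sapply(df, is.numeric)]   # drops character columns
  stopifnot(target %in% names(numeric_df))
  predictors <- setdiff(names(numeric_df), target)
  r <- sapply(numeric_df[predictors],
              function(x) cor(x, numeric_df[[target]]))
  r[order(abs(r), decreasing = TRUE)]
}

# Six rows copied from the head(happy) output; illustration only
demo <- data.frame(
  Score          = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP.per.capita = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social.support = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Generosity     = c(0.153, 0.252, 0.271, 0.354, 0.322, 0.263)
)

ranked <- rank_correlations(demo)
print(ranked)
```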

Dominika Markowska-Desvallons

4/18/2021
