The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.
Does generosity correlate with the happiness of a country?
Each case represents a country from around the world. There are 156 observations in the given data set from 2019.
The data was collected by the Sustainable Development Solutions Network (SDSN) in 2019 as part of the World Happiness Report. The happiness scores and rankings use data from the Gallup World Poll.
The data is available online at https://www.kaggle.com/unsdsn/world-happiness. For this project, it was downloaded as a CSV file.
“World Happiness Report Happiness scored according to economic production, social support, etc.” Sustainable Development Solutions Network, https://www.kaggle.com/unsdsn/world-happiness.
The response variable is generosity, which is quantitative.
The independent variables are score, which is quantitative, and country, which is qualitative.
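library(dplyr)     # %>%, filter(), select(), arrange()
library(ggplot2)   # bar chart
library(treemap)   # treemap of the top 20 countries
library(corrplot)  # correlation plots
library(plotly)    # interactive scatter plot
happiness<-read.csv("https://raw.githubusercontent.com/hrensimin05/Cuny_DataScience/master/2019.csv")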
#there are 156 observations and 9 variables
happy<-data.frame(happiness)
#View(happy)
head(happy)
## Overall.rank Country.or.region Score GDP.per.capita Social.support
## 1 1 Finland 7.769 1.340 1.587
## 2 2 Denmark 7.600 1.383 1.573
## 3 3 Norway 7.554 1.488 1.582
## 4 4 Iceland 7.494 1.380 1.624
## 5 5 Netherlands 7.488 1.396 1.522
## 6 6 Switzerland 7.480 1.452 1.526
## Healthy.life.expectancy Freedom.to.make.life.choices Generosity
## 1 0.986 0.596 0.153
## 2 0.996 0.592 0.252
## 3 1.028 0.603 0.271
## 4 1.026 0.591 0.354
## 5 0.999 0.557 0.322
## 6 1.052 0.572 0.263
## Perceptions.of.corruption
## 1 0.393
## 2 0.410
## 3 0.341
## 4 0.118
## 5 0.298
## 6 0.343
names(happy)
## [1] "Overall.rank" "Country.or.region"
## [3] "Score" "GDP.per.capita"
## [5] "Social.support" "Healthy.life.expectancy"
## [7] "Freedom.to.make.life.choices" "Generosity"
## [9] "Perceptions.of.corruption"
summary(happy)
## Overall.rank Country.or.region Score GDP.per.capita
## Min. : 1.00 Length:156 Min. :2.853 Min. :0.0000
## 1st Qu.: 39.75 Class :character 1st Qu.:4.545 1st Qu.:0.6028
## Median : 78.50 Mode :character Median :5.380 Median :0.9600
## Mean : 78.50 Mean :5.407 Mean :0.9051
## 3rd Qu.:117.25 3rd Qu.:6.184 3rd Qu.:1.2325
## Max. :156.00 Max. :7.769 Max. :1.6840
## Social.support Healthy.life.expectancy Freedom.to.make.life.choices
## Min. :0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.056 1st Qu.:0.5477 1st Qu.:0.3080
## Median :1.272 Median :0.7890 Median :0.4170
## Mean :1.209 Mean :0.7252 Mean :0.3926
## 3rd Qu.:1.452 3rd Qu.:0.8818 3rd Qu.:0.5072
## Max. :1.624 Max. :1.1410 Max. :0.6310
## Generosity Perceptions.of.corruption
## Min. :0.0000 Min. :0.0000
## 1st Qu.:0.1087 1st Qu.:0.0470
## Median :0.1775 Median :0.0855
## Mean :0.1848 Mean :0.1106
## 3rd Qu.:0.2482 3rd Qu.:0.1412
## Max. :0.5660 Max. :0.4530
top20<-happy %>% filter(Overall.rank<=20)%>% arrange(desc(Score))
top20$label<-paste(top20$Country.or.region,top20$Overall.rank ,top20$Score ,sep="\n ")
options(repr.plot.width=12, repr.plot.height=8)
treemap(top20,
index=c("label"),
vSize="Score",
vColor="Overall.rank",
type="value",
title="Top 20 Happiness Countries -2019",
palette=terrain.colors(20),
command.line.output = TRUE,
format.legend = list(scientific = FALSE, big.mark = " "))
## Warning in if (class(try(col2rgb(palette), silent = TRUE)) == "try-error")
## stop("color palette is not correct"): the condition has length > 1 and only the
## first element will be used
# The data is already sorted by rank, so the first 20 rows are the top 20 countries
top20_bar <- happy %>% select(Country.or.region, Overall.rank, Score) %>% head(n = 20)
ggplot(top20_bar, aes(x = factor(Country.or.region, levels = Country.or.region), y = Score)) +
  geom_bar(stat = "identity", width = 0.5, fill = "blue") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.6)) +
  labs(title = "Top 20 Countries from World Happiness Report-2019", x = "Country", y = "Score") +
  coord_flip()
cor.test(happy$Score,happy$Generosity)
##
## Pearson's product-moment correlation
##
## data: happy$Score and happy$Generosity
## t = 0.94366, df = 154, p-value = 0.3468
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08229763 0.23022136
## sample estimates:
## cor
## 0.07582369
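The statistics reported by cor.test() can be checked by hand. The sketch below is not part of the original analysis. It copies r and the sample size from the output above and assumes the standard t test and Fisher z interval for a Pearson correlation, which is what cor.test() uses by default. The variable names are illustrative.

```r
# Values copied from the cor.test() output above
r <- 0.07582369   # sample Pearson correlation between Score and Generosity
n <- 156          # number of countries

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z transform
z <- atanh(r)
half_width <- qnorm(0.975) / sqrt(n - 3)
ci <- tanh(c(z - half_width, z + half_width))

print(round(c(t = t_stat, p = p_value), 4))
print(round(ci, 4))
```

These values should agree with the t statistic, p-value and confidence interval printed above. The interval includes zero, which is consistent with no linear relationship between the two variables.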
corrplot(cor(happy %>%
select(Score,Generosity)),
method="color",
sig.level = 0.01, insig = "blank",
addCoef.col = "black",
tl.srt=45,
type="upper"
)
model <- lm( Generosity ~ Score , data = happy)
summary(model)$adj.r.squared
## [1] -0.0007069411
summary(model)
##
## Call:
## lm(formula = Generosity ~ Score, data = happy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.18407 -0.07288 -0.00380 0.06494 0.38795
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.149762 0.037953 3.946 0.00012 ***
## Score 0.006489 0.006876 0.944 0.34682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09529 on 154 degrees of freedom
## Multiple R-squared: 0.005749, Adjusted R-squared: -0.0007069
## F-statistic: 0.8905 on 1 and 154 DF, p-value: 0.3468
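For a regression with a single predictor, R-squared is the square of the Pearson correlation, and the adjusted value penalizes it for the number of predictors. As a rough check that is not part of the original analysis, the sketch below recomputes R-squared, adjusted R-squared and the F statistic from the correlation reported by cor.test(), r = 0.07582369. The variable names are illustrative.

```r
# Values taken from the cor.test() and lm() output above
r <- 0.07582369   # Pearson correlation between Score and Generosity
n <- 156          # number of observations
p <- 1            # number of predictors in the model (Score only)

# With one predictor, R-squared equals the squared correlation
r_squared <- r^2

# Adjusted R-squared rescales the unexplained share by the degrees of freedom
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# F statistic for the overall regression on p and n - p - 1 degrees of freedom
f_stat <- (r_squared / p) / ((1 - r_squared) / (n - p - 1))

print(signif(c(r_squared = r_squared, adj_r_squared = adj_r_squared, f_stat = f_stat), 4))
```

The adjusted value comes out slightly negative because R-squared is below p / (n - 1) = 1/155, roughly 0.0065, which is the point at which the adjustment pushes it under zero.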
par.orig <- par(mfrow=c(2,2))
plot(happy$Score, resid(model), main="Predictor vs Residuals", xlab="Score")
abline(0,0)
plot(fitted(model), resid(model),main="Fitted vs Residuals", xlab="Fitted Values")
abline(0,0)
qqnorm(resid(model), main="QQ-Plot of Residuals")
qqline(resid(model))
hist(resid(model), main="Histogram of Residuals")
par(par.orig)
plot_ly(data = happy,
x=~Generosity, y=~Score, color=~Generosity, type = "scatter",
text = ~paste("Country:", Country.or.region)) %>%
layout(title = "Happiness and Generosity",
xaxis = list(title = "Generosity"),
yaxis = list(title = "Happiness Score"))
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
The country with the highest generosity versus the country with the lowest generosity.
max_g <- max(happy$Generosity)
subset(happy[c(2,8)], Generosity == max_g)
## Country.or.region Generosity
## 131 Myanmar 0.566
min_g <- min(happy$Generosity)
subset(happy[c(2,8)], Generosity == min_g)
## Country.or.region Generosity
## 82 Greece 0
After analyzing the global happiness data published by the United Nations Sustainable Development Solutions Network, I found no significant relationship between generosity and the happiness score (r = 0.076, p = 0.347). I then plotted the two variables against each other to confirm that there is no direct relationship between them.
The goal was to find out whether being generous really makes countries and their citizens happier, and whether generosity correlates significantly with happiness. I expected generosity to go along with happiness, but the analysis found no evidence that it does.
After computing the correlations among all the variables, shown in the graph below, I could see which factor is most closely related to the happiness score according to the World Happiness Report. Gross domestic product (GDP) per capita turned out to have the strongest relationship: the richer the country, the happier its people tend to be.
corrplot(cor(happy %>%
select(Score:Perceptions.of.corruption)),
method="color",
sig.level = 0.01, insig = "blank",
addCoef.col = "black",
tl.srt=45,
type="upper"
)
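To read the ranking of factors off as numbers rather than from the plot, the correlations with Score can be sorted by absolute size. The helper below is a minimal sketch and not part of the original analysis. The function name rank_score_correlations is hypothetical, and the small demo data frame only repeats a few columns of the six rows printed by head(happy), so its numbers are not meaningful on their own.

```r
# Hypothetical helper: correlate every numeric column with the response
# and sort the results by absolute strength, strongest first.
rank_score_correlations <- function(df, response = "Score") {
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  # Overall.rank is just an ordering of Score, so it is left out
  predictors <- setdiff(numeric_cols, c(response, "Overall.rank"))
  cors <- vapply(predictors,
                 function(col) cor(df[[response]], df[[col]]),
                 numeric(1))
  cors[order(abs(cors), decreasing = TRUE)]
}

# Demo input: the six rows shown by head(happy), used only to show the call.
# Six countries are far too few to draw any conclusion from.
demo <- data.frame(
  Overall.rank   = 1:6,
  Score          = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP.per.capita = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social.support = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Generosity     = c(0.153, 0.252, 0.271, 0.354, 0.322, 0.263)
)

demo_ranking <- rank_score_correlations(demo)
print(round(demo_ranking, 3))
```

With the full data set loaded, the call would be rank_score_correlations(happy), and the first element of the result is the factor most strongly correlated with the happiness score.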