DHACHAINEE MURUGAYAH (s3794334) RACHEAL RONALD COELHO (s3804448)
Last updated: 27 October, 2019
RPubs link: http://rpubs.com/Dhachu18/d543369
10 Happiest Countries in the World
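library(readr)    # read_csv()
library(dplyr)    # %>%, mutate(), summarise()
library(outliers) # scores(), assumed to be the source of the z-score helper used below
library(knitr)    # kable()
happy <- read_csv("happy.csv")
names(happy)[names(happy) == "Happiness score"] <- "HappinessScore"
names(happy)[names(happy) == "Explained by: Healthy life expectancy"] <- "HealthyLifeExpectancy"
head(happy)
# count missing values in each variable
x<-happy %>% summarise(MissingValuesInHealthyLifeStyle=sum(is.na(happy$HealthyLifeExpectancy)))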
y<-happy %>% summarise(MissingValuesInHappinessScore=sum(is.na(happy$HappinessScore)))
data.frame(y,x)
# replace missing values with the column mean
happy <- happy %>% mutate(HappinessScore = ifelse(is.na(HappinessScore), mean(HappinessScore, na.rm = TRUE), HappinessScore))
happy <- happy %>% mutate(HealthyLifeExpectancy = ifelse(is.na(HealthyLifeExpectancy), mean(HealthyLifeExpectancy, na.rm = TRUE), HealthyLifeExpectancy))
x<-happy %>% summarise(MissingValuesInHealthyLifeStyle=sum(is.na(happy$HealthyLifeExpectancy)))
y<-happy %>% summarise(MissingValuesInHappinessScore=sum(is.na(happy$HappinessScore)))
data.frame(y,x)
#boxplot
# pass the two columns directly; piping the whole data frame in as the first argument is not needed
boxplot(happy$HappinessScore, happy$HealthyLifeExpectancy,
names = c("HappinessScore", "HealthyLifeExpectancy"),
main = "Boxplot of Happiness Score and Life Expectancy",
xlab = "happy", ylab = "Range", col = c("yellow", "green"))
HappinessScore <- happy$HappinessScore[!is.na(happy$HappinessScore)]
z_score<-HappinessScore %>% scores(type = "z")
HappinessScore[ which( abs(z_score) >3 )]
## numeric(0)
HappinessScore[ which( abs(z_score) >3 )]<-mean(HappinessScore,na.rm=TRUE)
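The z-score rule applied above can also be sketched in base R without the scores() helper. The following is a minimal illustration on made-up numbers: the vector scores_demo and its planted extreme value are hypothetical and are not taken from happy.csv. It flags values more than 3 standard deviations from the mean and replaces them with the mean of the remaining values.

```r
# Hypothetical data: 30 well-behaved values plus one planted extreme value.
set.seed(1)
scores_demo <- c(rnorm(30, mean = 5.4, sd = 1.1), 25)

# z-score: distance from the mean in units of the standard deviation.
z <- (scores_demo - mean(scores_demo)) / sd(scores_demo)

# Flag values more than 3 standard deviations from the mean.
is_outlier <- abs(z) > 3

# Replace flagged values with the mean of the non-flagged values
# (a choice made for this sketch; the report's code uses the overall mean).
cleaned <- scores_demo
cleaned[is_outlier] <- mean(scores_demo[!is_outlier])
```

Using the mean of the non-flagged values keeps the replacement from being pulled toward the outlier itself; with no outliers, as in the Happiness Score data, the replacement step changes nothing.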
# "verticle" is not a boxplot() argument; boxplots are drawn vertically by default
boxplot(HappinessScore, main = "Box Plot of Happiness Score", ylab = "Happiness Score", col = "green")
HealthyLifeExpectancy <- happy %>% summarise(Min = min(HealthyLifeExpectancy, na.rm = TRUE),
Q1 = quantile(HealthyLifeExpectancy,probs = .25,na.rm=TRUE),
Median = median(HealthyLifeExpectancy, na.rm = TRUE),
Q3 = quantile(HealthyLifeExpectancy,probs = .75,na.rm=TRUE),
Max = max(HealthyLifeExpectancy,na.rm = TRUE),
Mean = mean(HealthyLifeExpectancy, na.rm = TRUE),
SD = sd(HealthyLifeExpectancy, na.rm = TRUE)
)
HappinessScore <- happy%>% summarise(Min = min(HappinessScore,na.rm = TRUE),
Q1 = quantile(HappinessScore,probs = .25,na.rm=TRUE),
Median = median(HappinessScore, na.rm = TRUE),
Q3 = quantile(HappinessScore,probs = .75,na.rm=TRUE),
Max = max(HappinessScore,na.rm = TRUE),
Mean = mean(HappinessScore, na.rm = TRUE),
SD = sd(HappinessScore, na.rm = TRUE)
)
combination <- rbind(HappinessScore, HealthyLifeExpectancy)
rownames(combination) <- c("HappinessScore", "HealthyLifeExpectancy")
kable(round(combination,2), caption = "Summary table of Happiness Score and Healthy Life Expectancy", row.names = TRUE)
| | Min | Q1 | Median | Q3 | Max | Mean | SD |
|---|---|---|---|---|---|---|---|
| HappinessScore | 2.85 | 4.54 | 5.38 | 6.18 | 7.77 | 5.41 | 1.11 |
| HealthyLifeExpectancy | 0.00 | 0.55 | 0.78 | 0.88 | 1.14 | 0.72 | 0.24 |
matplot(t(data.frame(happy$HealthyLifeExpectancy,happy$HappinessScore)),
type="b",
pch = 19,
col = 1,
lty = 1,
xlab = "Comparison",
ylab = "Happiness Score",
xaxt = "n")
axis(1, at=1:2,labels = c("HappinessScore","HealthyLifeExpectancy") )
#scatterplot
x <- happy$HappinessScore
y <- happy$HealthyLifeExpectancy
plot(x, y, main = "Happiness Score vs Healthy Life Expectancy",
xlab = "Happiness Score", ylab = "Healthy Life Expectancy",
pch = 19, frame = FALSE)
abline(lm(y ~ x), col="red", lty=6)
Assumptions:
Independence: each country contributes one observation, and the observations are assumed to be independent of one another.
Linearity: as shown in the scatter plot, the relationship between Happiness Score and Healthy Life Expectancy appears positive and roughly linear.
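The assumptions listed above are usually checked with residual plots after the model is fitted. The following is a minimal sketch on simulated data: the variables life_exp and happiness are hypothetical stand-ins, generated with values only loosely modelled on this report, and are not read from happy.csv.

```r
# Hypothetical data standing in for the two variables (not happy.csv).
set.seed(42)
life_exp  <- runif(156, min = 0, max = 1.14)
happiness <- 2.8 + 3.6 * life_exp + rnorm(156, sd = 0.7)

fit <- lm(happiness ~ life_exp)
res <- residuals(fit)

# Linearity and constant variance: residuals should scatter evenly around 0,
# with no curve or funnel shape.
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normality of residuals: points should lie close to the reference line.
qqnorm(res)
qqline(res)
```

The same two plots could be drawn for the model fitted to the real data by replacing fit with that model object.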
##
## Call:
## lm(formula = HappinessScore ~ HealthyLifeExpectancy, data = happy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6831 -0.4604 0.0743 0.5230 1.5904
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.7832 0.1771 15.72 <2e-16 ***
## HealthyLifeExpectancy 3.6382 0.2331 15.61 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6949 on 154 degrees of freedom
## Multiple R-squared: 0.6127, Adjusted R-squared: 0.6102
## F-statistic: 243.7 on 1 and 154 DF, p-value: < 2.2e-16
The p-value for the overall F-test is very small: F(1, 154) = 243.7, p < .001. Since the p-value is less than the 0.05 level of significance, H0 is rejected. Hence, there is statistically significant evidence of an association between Happiness Score and Healthy Life Expectancy.
## [1] 3.767232e-33
The p-value is less than 0.001. Since the p-value is less than the 0.05 level of significance, we reject H0. There is statistically significant evidence that the data fit a linear regression model.
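The overall F-test reported in the model summary can be reproduced by hand from the F statistic and its degrees of freedom. The sketch below copies those numbers from the summary output; the variable names are illustrative only. It also shows that, in a regression with a single predictor, the F statistic is the square of the slope's t statistic.

```r
# Values copied from the regression summary output.
f_stat <- 243.7
df1 <- 1
df2 <- 154

# Upper-tail probability of the F distribution: the p-value of the overall F-test.
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

# With one predictor, the F statistic equals the squared t statistic of the slope.
t_slope  <- 15.6096
f_from_t <- t_slope^2

print(p_value)
print(f_from_t)
```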
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.783194 0.1770647 15.71852 7.979033e-34
## HealthyLifeExpectancy 3.638231 0.2330765 15.60960 1.544574e-33
## 2.5 % 97.5 %
## (Intercept) 2.433405 3.132984
## HealthyLifeExpectancy 3.177791 4.098671
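The intercept (constant) is reported as a = 2.783194, with a 95% CI of [2.433405, 3.132984]. This interval does not capture the value 0 specified by H0: α = 0, so H0 is rejected. Similarly, the slope is reported as b = 3.638231, with a 95% CI of [3.177791, 4.098671]; this interval does not capture 0 either, so H0: β = 0 is rejected.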
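The confidence intervals above follow the usual form: estimate plus or minus the t critical value times the standard error. As a worked check, the sketch below rebuilds them from the estimates, standard errors, and residual degrees of freedom copied from the output; the variable names are illustrative only.

```r
# Values copied from the coefficient table.
estimate  <- c(Intercept = 2.783194, HealthyLifeExpectancy = 3.638231)
std_error <- c(Intercept = 0.1770647, HealthyLifeExpectancy = 0.2330765)
df_resid  <- 154

# Two-sided 95% interval: estimate +/- t critical value * standard error.
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = estimate - t_crit * std_error,
            upper = estimate + t_crit * std_error)
print(round(ci, 6))

# H0 (coefficient = 0) is rejected when 0 lies outside the interval.
rejects_h0 <- ci[, "lower"] > 0 | ci[, "upper"] < 0
print(rejects_h0)
```

With these inputs the bounds should agree with the reported intervals up to rounding.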
Findings:
Strengths:
Limitations:
The directions for future research:
Conclusion:
Happy humans
Kaggle, 2019. World Happiness Report 2019. [Online] Available at: https://www.kaggle.com/PromptCloudHQ/world-happiness-report-2019 [Accessed 8 October 2019].
MacMillan, A., 2018. Happiness linked to longer life. [Online] Available at: https://edition.cnn.com/2011/10/31/health/happiness-linked-longer-life/index.html [Accessed 20 October 2019].
McKenzie, D. J., 2014. Happiness - The Highest Form of Health. [Online] Available at: https://www.naturopathiccurrents.com/articles/happiness-highest-form-health [Accessed 10 October 2019].
UNRIC, 2019. The UN and happiness. [Online] Available at: https://www.unric.org/en/happiness/27709-the-un-and-happiness [Accessed 10 October 2019].
World Happiness Report, 2019. World Happiness Report 2019. [Online] Available at: https://worldhappiness.report/ed/2019/ [Accessed 10 October 2019].