Abstract

On April 25, polling firm Gallup released a report titled “Americans’ Stress, Worry and Anger Intensified in 2018”. Citizens from over 100 countries were surveyed, and on the question about stress 55% of Americans said they had experienced stress during much of the previous day. The global average was 35%. This got me thinking. I’m familiar with the World Happiness Report (WHR), in which the US ranked #19 out of 136 countries in happiness. I’m an American. If my stress score is so high, how am I also among the happiest people in the world?
Gallup didn’t supply the full data, but they did show a list of the 12 most-stressed countries and their corresponding scores. Scanning the list, I can understand countries like Greece, Venezuela, and the sub-Saharan African countries being highly stressed … but Costa Rica?! Costa Rica has always ranked high on the WHR, and this year they are #13. I wanted to check my assumption that stress and happiness are strongly negatively correlated, so I copied that list into the same Excel workbook that holds the World Happiness Report data, joined the two datasets, and checked their correlation. My expectation was that the correlation would be -0.75 or below.
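The result was that the correlation was near zero when stress was compared to the 2018 WHR happiness scores, and also near zero when I used the three most recent years (2016 to 2018) for each country.
Turns out that, among these 12 countries at least, stress scores and happiness scores are essentially uncorrelated.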

Sources

The 2019 World Happiness Report
Gallup - Americans’ Stress, Worry and Anger Intensified in 2018

Import & join data

Load libraries

library(tidyverse)
library(openxlsx)

Import most-stressed countries’ data

I copied and pasted the list of most stressed countries into a separate sheet in the World Happiness Report workbook.

stressed <- read.xlsx("happiness and stress.xlsx", sheet = "Stressed") %>%
  mutate(country = str_replace(country, "United States of America", "United States"))
head(stressed, 12)
##          country stress_pct
## 1         Greece         59
## 2    Philippines         58
## 3       Tanzania         57
## 4        Albania         55
## 5           Iran         55
## 6      Sri Lanka         55
## 7  United States         55
## 8         Uganda         53
## 9     Costa Rica         52
## 10        Rwanda         52
## 11        Turkey         52
## 12     Venezuela         52

By this measure, Americans report more stress than Venezuelans (55% versus 52%).

Import World Happiness Report data

I use this dataset quite a bit. The raw data has many missing values that cause issues, so I created a clean version with those missing values imputed. Even though there are no missing values in the data I need for this analysis, I still use the cleaned data.

# whr_2019 holds data from the 2019 report: keep 2016-2018, stressed countries only
whr_2019 <- read.xlsx("happiness and stress.xlsx", sheet = "WHR_Clean") %>%
  select(year, country, happiness) %>%
  filter(year %in% c(2016, 2017, 2018)) %>%
  filter(country %in% unique(stressed$country))
sample_n(whr_2019, 10)
##    year     country happiness
## 20 2018   Sri Lanka  4.400223
## 16 2016      Rwanda  3.332990
## 33 2016   Venezuela  4.041115
## 7  2016      Greece  5.302619
## 25 2017      Turkey  5.607262
## 19 2017   Sri Lanka  4.330945
## 34 2017   Venezuela  5.070751
## 8  2017      Greece  5.148242
## 14 2017 Philippines  5.594270
## 9  2018      Greece  5.409289

A quick diversion … Negative Correlations

The general idea is that as one factor goes up, the other factor goes down. Stress goes up and happiness goes down.
Let’s look at a more concrete example about cars. We could expect that as the weight of a car goes up, its miles per gallon (mpg) will go down.

Using the built-in mtcars dataset I will test this …

ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  stat_smooth(method = "lm") +
  ggtitle("Weight to Miles-per-gallon")

Check the correlation

cor(mtcars$mpg, mtcars$wt)
## [1] -0.8676594

This is a stronger correlation than the -0.75 threshold I have set for the stress/happiness data.
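For anyone curious what cor() is doing here, Pearson's correlation is just the covariance of the two variables scaled by both of their standard deviations. A minimal sketch on the same mtcars columns (the names r_manual and r_builtin are mine, purely for illustration):

```r
# Pearson's r by hand: covariance divided by the product of the standard deviations
r_manual  <- cov(mtcars$mpg, mtcars$wt) / (sd(mtcars$mpg) * sd(mtcars$wt))

# The built-in function, as used above
r_builtin <- cor(mtcars$mpg, mtcars$wt)

# TRUE when the two agree up to floating-point tolerance
all.equal(r_manual, r_builtin)
```

Because of the scaling, the result always lands between -1 and 1 regardless of the units, which is what makes a threshold like -0.75 comparable across cars and countries.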

Using three years of data

Luckily this is pretty straightforward:
- Join the two datasets
- Show scatterplot of Stress to Happiness
- Check the correlation

Join datasets

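Join the two datasets on country, so that every WHR row picks up that country's stress score. The stress score is from 2018 only, so it repeats across a country's three yearly rows.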

joined <- whr_2019 %>%
  left_join(stressed, by = "country")
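Both the country filter and the join match country names exactly, so a spelling difference (like the "United States of America" label fixed earlier) would silently drop a country. Here is a small sketch of how that could be checked. The two toy tables are stand-ins I made up for illustration: the scores are copied from the output shown earlier, and "Sri Lanca" is a deliberate, hypothetical typo.

```r
library(dplyr)

# Toy stand-ins for the real tables (hypothetical; not the actual workbook data)
stressed_toy <- data.frame(
  country    = c("Greece", "Sri Lanca"),   # deliberate typo for illustration
  stress_pct = c(59, 55)
)
whr_toy <- data.frame(
  year      = c(2018, 2018),
  country   = c("Greece", "Sri Lanka"),
  happiness = c(5.409289, 4.400223)
)

# Stressed countries with no matching WHR row (the filter would drop these silently)
missing_from_whr <- anti_join(stressed_toy, whr_toy, by = "country")

# WHR rows left without a stress score after the left join
joined_toy       <- left_join(whr_toy, stressed_toy, by = "country")
n_missing_stress <- sum(is.na(joined_toy$stress_pct))

missing_from_whr
n_missing_stress
```

On the real data, the same two checks would use stressed and whr_2019. An empty anti_join() result and a zero NA count mean all 12 countries made it through. This matters because cor() returns NA by default if either column contains a missing value.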

Plot it out

ggplot(joined, aes(stress_pct, happiness)) +
  geom_jitter(aes(col = country)) +
  stat_smooth(method = "lm") +
  ggtitle("Stress to Happiness")

Hmmm… that’s not at all what I expected.

Check the correlation

cor(joined$happiness, joined$stress_pct)
## [1] -0.02269669

That is practically zero.

Using current year data only

Filter existing data for only 2018

whr_2018 <- joined %>%
  filter(year == 2018)

Plot it out

ggplot(whr_2018, aes(stress_pct, happiness)) +
  geom_jitter(aes(col = country)) +
  stat_smooth(method = "lm") +
  ggtitle("2018 Stress to Happiness")

Flat line.

Check the correlation

cor(whr_2018$happiness, whr_2018$stress_pct)
## [1] 0.0006624051

Again, it is near zero, which indicates essentially no linear correlation.
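With so few data points it is fair to ask how much a single correlation value can tell us. One rough way to gauge that is a 95% confidence interval based on the Fisher z-transformation, which is the same approach cor.test() uses for Pearson correlations. The sketch below plugs in the 2018 correlation from above and assumes n = 12, one row per country. That count is my assumption from the 12-country list, not something printed in the output.

```r
# Correlation from the 2018 result above; n = 12 is an assumption (one row per country)
r <- 0.0006624051
n <- 12

# Fisher z-transformation: atanh(r) is approximately normal with SE 1 / sqrt(n - 3)
z      <- atanh(r)
margin <- qnorm(0.975) / sqrt(n - 3)

# Transform the interval endpoints back to the correlation scale
ci <- tanh(c(z - margin, z + margin))

round(ci, 2)
```

Under those assumptions the interval runs from roughly -0.57 to 0.57. That is wide, as expected for 12 points, but it still excludes the -0.75 I was expecting.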

Conclusion

The purpose of this paper was to test the hypothesis that stress and happiness are strongly negatively correlated. Happiness scores from the World Happiness Report were compared with the stress scores of the 12 most-stressed countries in two parts. Part 1 used three years of happiness data (2016 to 2018) to give more data points while staying relatively recent. Part 2 compared just the 2018 WHR scores with the stress scores, which are also from 2018.
The result was that the correlation of stress to happiness is near zero in both cases. Among these 12 countries, one does not track the other. The hypothesis that happiness is strongly negatively correlated with stress is not supported by this data.
