Please read the following report as if you’re part of a newly created task-force that includes leaders of large corporations and non-profit organizations, along with high-level government officials, in the US: essentially, people of influence who have the power to affect the economy, social support, public health, and similar areas.

The task-force was created after the newly elected president’s transition team identified this as a project that should be undertaken over the course of the next four years. This report should be the “jumping off point” for the task-force.

I’ll use “we” or “our” in this report as if I’m working with a team of analysts, but compiling this has been a solo effort.


Executive Summary

The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, and a new edition has been produced each year since.

It continues to gain global recognition as governments, organizations, and civil society increasingly use happiness indicators to inform their policy-making decisions. Over the last five years, the Happiness Score in the US has decreased and then remained stagnant. Yet, other countries that are comparable in terms of gross domestic product (GDP) and other factors have a higher Happiness Score.

In this report, we will explore how different variables are tied to happiness and attempt to build a model that shows how this score could be improved.

Purpose

Ultimately, the question we’d like to answer in this report is:

What can we do to make the US a happier country?

For the leaders of some of the largest employers and agencies able to positively affect outcomes in the US, this report should serve as a roadmap, showing both where we’ve been and where we, as a country, can go. We analyze publicly available data that is compiled annually, which allows this task-force to track progress.

The following charts and models are an analysis of the 2019 Annual World Happiness Report.

Data

Since the first report was published in 2012, the state of the world, and more specifically, the US, has changed.

Before performing an analysis or creating a model, we wanted to explore the World Happiness Report survey data.

The dataset for 2019 contains 156 observations covering 9 variables:

Range, Mean & Standard Deviation of Variables

Three things stand out in these summary statistics. Happiness Score has a wider range than any other variable. Generosity and Perception of Corruption are tightly distributed, with the smallest standard deviations. Social Support is negatively skewed, with its mean much closer to its maximum than to its minimum.

Variable                    Range             Mean         Standard Deviation
Happiness Score             2.853 to 7.769    5.4070962    1.1131199
GDP Per Capita              0 to 1.684        0.9051474    0.3983895
Generosity                  0 to 0.566        0.1848462    0.0952544
Freedom of Choice           0 to 0.631        0.3925705    0.1432895
Perception of Corruption    0 to 0.453        0.1106026    0.0945378
Healthy Life Expectancy     0 to 1.141        0.7252436    0.242124
Social Support              0 to 1.624        1.2088141    0.2991914
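
The three observations about the summary statistics can be re-checked directly from the figures in the table. The following is a minimal sketch in plain Python (the report itself was produced in R). The "mean position" measure is our own rough proxy for skew, not a formal skewness statistic, and the function names are illustrative only.

```python
# Summary statistics copied from the table above: (min, max, mean, standard deviation).
SUMMARY = {
    "Happiness Score": (2.853, 7.769, 5.4070962, 1.1131199),
    "GDP Per Capita": (0.0, 1.684, 0.9051474, 0.3983895),
    "Generosity": (0.0, 0.566, 0.1848462, 0.0952544),
    "Freedom of Choice": (0.0, 0.631, 0.3925705, 0.1432895),
    "Perception of Corruption": (0.0, 0.453, 0.1106026, 0.0945378),
    "Healthy Life Expectancy": (0.0, 1.141, 0.7252436, 0.242124),
    "Social Support": (0.0, 1.624, 1.2088141, 0.2991914),
}


def range_width(stats):
    """Distance between the minimum and maximum of a variable."""
    lo, hi, _mean, _sd = stats
    return hi - lo


def mean_position(stats):
    """Where the mean sits within the range: 0 = at the minimum, 1 = at the maximum."""
    lo, hi, mean, _sd = stats
    return (mean - lo) / (hi - lo)


# Variable with the widest range.
widest = max(SUMMARY, key=lambda name: range_width(SUMMARY[name]))
# The two variables with the smallest standard deviations.
tightest = sorted(SUMMARY, key=lambda name: SUMMARY[name][3])[:2]
# Variable whose mean sits closest to its maximum (rough sign of negative skew).
closest_to_max = max(SUMMARY, key=lambda name: mean_position(SUMMARY[name]))

for name, stats in SUMMARY.items():
    print(f"{name:<26} width={range_width(stats):.3f}  mean position={mean_position(stats):.2f}")
print("Widest range:", widest)
print("Smallest standard deviations:", tightest)
print("Mean closest to maximum:", closest_to_max)
```

A mean position well above 0.5 only hints at a long left tail; a histogram or a proper skewness calculation on the raw data is needed to confirm it.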

Histograms

By producing histograms of our data, a few notable observations appear:

  • Happiness Score is grouped around the 4 to 6.5 range.
    • We’d like to focus on moving the US from the “6s” to the “7s”.
  • Generosity and Perception of Corruption have fairly low scores.
    • This is a good time to point out the subjective nature of these variables.

Scatterplots & Correlation Matrix

By plotting each numeric variable on the X axis against Happiness Score, we can roughly see which variables are correlated with it.

Our correlation matrix supports the pattern seen in the scatterplots and helped us identify viable inputs for our model.

Modelling For Happiness

The exploratory charts and data show relationships between the variables contained in the survey and a country’s Happiness Score.

In this portion of the report, we will explore potential models for Happiness Score and try to answer our question: How can we make the US a happier country?

Potential Models

Model.1 - Multiple Linear Regression With All Variables

## 
## Call:
## lm(formula = score ~ gdpPerCap + generosity + freedomChoices + 
##     socialSupport + lifeExpectancy + perceptionCorruption, data = X2019)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.75304 -0.35306  0.05703  0.36695  1.19059 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            1.7952     0.2111   8.505 1.77e-14 ***
## gdpPerCap              0.7754     0.2182   3.553 0.000510 ***
## generosity             0.4898     0.4977   0.984 0.326709    
## freedomChoices         1.4548     0.3753   3.876 0.000159 ***
## socialSupport          1.1242     0.2369   4.745 4.83e-06 ***
## lifeExpectancy         1.0781     0.3345   3.223 0.001560 ** 
## perceptionCorruption   0.9723     0.5424   1.793 0.075053 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5335 on 149 degrees of freedom
## Multiple R-squared:  0.7792, Adjusted R-squared:  0.7703 
## F-statistic: 87.62 on 6 and 149 DF,  p-value: < 2.2e-16

Notes:

  • Generosity and Perception of Corruption are the only variables that are not statistically significant predictors of Happiness Score at the 0.05 level (p = 0.327 and p = 0.075, respectively).
  • As noted before, Perception of Corruption is one of the more subjective measures in this dataset.
  • Social Support is a stronger predictor than our team assumed it would be.
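
Each t value in the Model.1 output is simply the estimate divided by its standard error, so the significance calls above can be re-checked by hand. This is a sketch in plain Python rather than the R used for the report. The critical value of 1.976 is our assumption (an approximate two-sided 5% cutoff for 149 degrees of freedom), not a figure from the output.

```python
# Estimates and standard errors copied from the Model.1 output above.
MODEL_1 = {
    "gdpPerCap": (0.7754, 0.2182),
    "generosity": (0.4898, 0.4977),
    "freedomChoices": (1.4548, 0.3753),
    "socialSupport": (1.1242, 0.2369),
    "lifeExpectancy": (1.0781, 0.3345),
    "perceptionCorruption": (0.9723, 0.5424),
}

# Assumption: approximate two-sided 5% critical value of t with 149 degrees of freedom.
T_CRITICAL = 1.976


def t_value(estimate, std_error):
    """t statistic for testing whether a coefficient differs from zero."""
    return estimate / std_error


t_values = {name: t_value(est, se) for name, (est, se) in MODEL_1.items()}
not_significant = [name for name, t in t_values.items() if abs(t) < T_CRITICAL]

for name, t in t_values.items():
    print(f"{name:<22} t = {t:.3f}")
print("Not significant at the 0.05 level:", not_significant)
```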

Model.2 - Multiple Linear Regression With Strong Predictors

## 
## Call:
## lm(formula = score ~ gdpPerCap + freedomChoices + socialSupport + 
##     lifeExpectancy, data = X2019)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.86584 -0.34594  0.03403  0.43676  1.13076 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      1.8921     0.1994   9.491  < 2e-16 ***
## gdpPerCap        0.8105     0.2165   3.745 0.000256 ***
## freedomChoices   1.8458     0.3404   5.423 2.28e-07 ***
## socialSupport    1.0166     0.2347   4.331 2.70e-05 ***
## lifeExpectancy   1.1414     0.3373   3.384 0.000910 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5398 on 151 degrees of freedom
## Multiple R-squared:  0.7709, Adjusted R-squared:  0.7649 
## F-statistic:   127 on 4 and 151 DF,  p-value: < 2.2e-16

Notes:

  • This model is simpler than Model.1 and fits almost as well (adjusted R-squared of 0.7649 versus 0.7703). Our exploratory correlation matrix pointed the same way: neither Generosity nor Perception of Corruption was as strongly correlated with score.
  • GDP Per Capita, Freedom of Choice, Social Support, and Healthy Life Expectancy are strong predictors of Happiness.
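
One way to check that little is lost by dropping Generosity and Perception of Corruption is a partial F statistic built from the two reported R-squared values. The sketch below is in plain Python and uses only numbers printed in the Model.1 and Model.2 outputs; the function name is illustrative. The reported R-squared values are rounded, so the result is approximate.

```python
def partial_f(r2_full, r2_reduced, n_obs, n_predictors_full, n_dropped):
    """F statistic for testing whether the dropped predictors add explanatory power."""
    df_residual_full = n_obs - n_predictors_full - 1
    numerator = (r2_full - r2_reduced) / n_dropped
    denominator = (1.0 - r2_full) / df_residual_full
    return numerator / denominator


# Model.1 (6 predictors) versus Model.2 (4 predictors), 156 observations.
f_stat = partial_f(
    r2_full=0.7792,
    r2_reduced=0.7709,
    n_obs=156,
    n_predictors_full=6,
    n_dropped=2,
)
print(f"Partial F on (2, 149) degrees of freedom: {f_stat:.2f}")
```

The statistic comes out near 2.8. The 5% critical value for (2, 149) degrees of freedom is a little above 3 (an approximate textbook figure, not taken from the report), which suggests the two dropped variables do not add significant explanatory power once the other four are in the model.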

Model.3 - Simple Linear Regression - GDP Per Capita

## 
## Call:
## lm(formula = score ~ gdpPerCap, data = X2019)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.22044 -0.48361  0.00828  0.48433  1.47409 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.3993     0.1353   25.12   <2e-16 ***
## gdpPerCap     2.2181     0.1369   16.20   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.679 on 154 degrees of freedom
## Multiple R-squared:  0.6303, Adjusted R-squared:  0.6278 
## F-statistic: 262.5 on 1 and 154 DF,  p-value: < 2.2e-16

Notes:

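  • While this model is statistically significant, it is too simplistic: it explains less of the variation in Happiness Score than the multiple regression models (R-squared of 0.6303).
  • It ignores every other variable. A country with high GDP Per Capita but low values elsewhere would likely have a lower Happiness Score than this model predicts.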
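
Because Model.3 has a single predictor, its R-squared is the square of the correlation between GDP Per Capita and Happiness Score. The reported output therefore implies the following correlation (taking the positive root, since the slope is positive):

\[ r = \sqrt{R^2} = \sqrt{0.6303} \approx 0.79 \]

In other words, GDP Per Capita alone accounts for about 63% of the variation in score, compared with about 77% for Model.2.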

Results

Winning Model

The model that appears strongest for predicting Happiness Score is Model.2, Multiple Linear Regression With Strong Predictors. It includes all but two of the numeric variables, and every predictor has a p-value far below .05. (The categorical variable “Country” was not deemed useful for this type of exercise and has been excluded.)
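
To compare the three candidates on an equal footing, the adjusted R-squared values in the outputs can be reproduced from each model's R-squared, the 156 observations, and the number of predictors. This is a sketch in plain Python (the models themselves were fit in R). Small differences in the fourth decimal place are expected because the reported R-squared values are rounded.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_OBS = 156

# (Multiple R-squared, number of predictors), copied from the three model outputs.
MODELS = {
    "Model.1": (0.7792, 6),
    "Model.2": (0.7709, 4),
    "Model.3": (0.6303, 1),
}

adjusted = {name: adjusted_r2(r2, N_OBS, k) for name, (r2, k) in MODELS.items()}

for name, value in adjusted.items():
    print(f"{name}: adjusted R-squared = {value:.4f}")
```

Model.1 and Model.2 are within about 0.005 of each other, while Model.3 trails both by a wide margin, so Model.2 gives up very little fit in exchange for using only significant predictors.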

Testing Our Winning Model

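The equation for this model, with coefficients rounded to two decimal places, can be written as \(\text{score} = 1.89 + \text{gdpPerCap} \times 0.81 + \text{freedomChoices} \times 1.85 + \text{socialSupport} \times 1.02 + \text{lifeExpectancy} \times 1.14\).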

In order to validate our model, we used the sample() function to choose three countries at random, which returned rows 16, 152, and 24. Respectively, these correspond to Ireland, Rwanda, and France.

Although this is a random sample, it happens to include a country with a higher Happiness Score than the US (Ireland), one with a comparable score (France), and one with a much lower score (Rwanda).


Ireland

Equation For Estimated Score: \(\text{score} = 1.89 + 1.499 \times 0.81 + 0.516 \times 1.85 + 1.553 \times 1.02 + 0.999 \times 1.14\)

Estimated Score Using Model    Observed Score
6.78171                        7.021
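
The Ireland estimate can be reproduced with a few lines of code. The sketch below is plain Python rather than the R used for the report, and it uses the rounded coefficients from the equation above. The names `estimate_score`, `INTERCEPT`, and `COEFFICIENTS` are ours, introduced only for illustration.

```python
# Rounded Model.2 coefficients, as used in the report's equation.
INTERCEPT = 1.89
COEFFICIENTS = {
    "gdpPerCap": 0.81,
    "freedomChoices": 1.85,
    "socialSupport": 1.02,
    "lifeExpectancy": 1.14,
}


def estimate_score(inputs):
    """Apply the Model.2 equation to one country's survey values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * inputs[name] for name in COEFFICIENTS)


# Ireland's 2019 survey values, taken from the equation above.
ireland = {
    "gdpPerCap": 1.499,
    "freedomChoices": 0.516,
    "socialSupport": 1.553,
    "lifeExpectancy": 0.999,
}
ireland_observed = 7.021

ireland_estimate = estimate_score(ireland)
print(f"Estimated score: {ireland_estimate:.5f}")
print(f"Observed minus estimated: {ireland_observed - ireland_estimate:.5f}")
```

The same function can be applied to any other row of the dataset by swapping in that country's four values.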

Rwanda

Equation For Estimated Score: \(\text{score} = 1.89 + 0.359 \times 0.81 + 0.555 \times 1.85 + 0.711 \times 1.02 + 0.614 \times 1.14\)

Estimated Score Using Model    Observed Score
4.63272                        3.334

France

Equation For Estimated Score: \(\text{score} = 1.89 + 1.324 \times 0.81 + 0.436 \times 1.85 + 1.472 \times 1.02 + 1.045 \times 1.14\)

Estimated Score Using Model    Observed Score
6.46178                        6.592

Using observed values from the survey, our estimated score is close to the observed score for both Ireland and France, but the model overestimates the score for Rwanda by about 1.3 points.

Applying Our Model

We now return to our initial question: What can we do to make the US a happier country? We’ll try to answer it by using our winning model to test the effect of different actions, starting from the model’s estimate for the US.

Estimated Score Using Model    Observed Score
6.37313                        6.892

Our model produces a lower score for the US than the observed score from the survey. Since we are testing the effect of altering variables, we will use the estimated score of 6.37313 as the benchmark against which the following estimated scores are compared.

Test Action                                                   Estimated Score
Improving GDP Per Capita by 0.25                              6.57563
Improving Social Support by 0.25                              6.62813
Improving Both GDP & Social Support by 0.25                   6.83063
Improving Healthy Life Expectancy by 0.25                     6.65813
Improving Healthy Life Expectancy & Social Support by 0.25    6.91313
Improving All Variables by 0.10                               6.85513
Improving All Variables by 0.25                               7.57813
GDP constant/All other variables increase by 0.10             6.77413

Notes:

  • Improving Social Support has a larger effect on Happiness Score than improving GDP Per Capita by the same amount.
    • A full point increase in Social Support would result in an increase of about 1.02 in Happiness Score.
  • Even a slight improvement in all variables increases Happiness Score substantially.
  • Holding GDP constant and improving all other variables by 0.10 also improves the score substantially.
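
Because the model is linear, each scenario score is the benchmark plus the sum of coefficient times change for every variable that moves. The sketch below, in plain Python rather than the report's R, rebuilds the scenario table from the benchmark of 6.37313 and the rounded coefficients. The names `scenario_score` and `SCENARIOS` are ours and purely illustrative.

```python
# Model estimate for the US, used as the benchmark in the report.
BENCHMARK = 6.37313

# Rounded Model.2 coefficients, as used in the report's equation.
COEFFICIENTS = {
    "gdpPerCap": 0.81,
    "freedomChoices": 1.85,
    "socialSupport": 1.02,
    "lifeExpectancy": 1.14,
}


def scenario_score(changes):
    """Benchmark plus the effect of each change (coefficient times delta)."""
    return BENCHMARK + sum(COEFFICIENTS[name] * delta for name, delta in changes.items())


def all_variables(delta, exclude=()):
    """Same change applied to every model variable, optionally holding some constant."""
    return {name: delta for name in COEFFICIENTS if name not in exclude}


SCENARIOS = {
    "GDP Per Capita +0.25": {"gdpPerCap": 0.25},
    "Social Support +0.25": {"socialSupport": 0.25},
    "GDP & Social Support +0.25": {"gdpPerCap": 0.25, "socialSupport": 0.25},
    "Healthy Life Expectancy +0.25": {"lifeExpectancy": 0.25},
    "Life Expectancy & Social Support +0.25": {"lifeExpectancy": 0.25, "socialSupport": 0.25},
    "All variables +0.10": all_variables(0.10),
    "All variables +0.25": all_variables(0.25),
    "GDP constant, all others +0.10": all_variables(0.10, exclude=("gdpPerCap",)),
}

results = {label: scenario_score(changes) for label, changes in SCENARIOS.items()}

for label, score in results.items():
    print(f"{label:<40} {score:.5f}")
```

Laying the scenarios out this way makes it easy for the task-force to add further cases, such as a change in Freedom of Choice alone, which the table does not test.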

Issues With The Model

Like all models, ours is imperfect. One of the biggest issues is that it simplifies a complex measure. The survey this model is based on is an incredible undertaking: compiling and synthesizing a vast amount of information from over 150 countries in order to calculate an estimated score is no small feat. That said, this model doesn’t factor in disastrous weather events, acts of terror, or, as we are currently seeing, the effects of a pandemic on a country’s overall happiness.

Conclusion & Next Steps

What we hope our model demonstrates is that even small improvements in the country’s health, social support, freedom of choices, and GDP per capita can have a large impact on the Happiness Score.

As a task-force, we are in a position to influence the well-being of the US.

Next Steps

We would like to continue this discussion. Our team proposes that we:

  1. Identify proxies for measuring these variables on our own over the course of the following year.
  2. Develop improvement plans for health and social support, since this group has the ability to affect these two variables the most.
  3. Convene on a quarterly and annual basis to report on improvements.

Please contact Taylor Tuomie with questions:

World Happiness Report Datasets