Please read the following report as if you’re part of a newly created task-force that includes leaders of large corporations and non-profit organizations and high-level government officials in the US: people of influence who have the power to affect the economy, social support, public health, etc.
The task-force was created after the newly elected president’s transition team identified this as a project to be undertaken over the next four years. This report should be the “jumping off point” for the task-force.
I’ll use “we” or “our” in this report as if I’m working with a team of analysts, but compiling this has been a solo effort.
Executive Summary
The World Happiness Report is a landmark survey of the state of global Happiness. The first report was published in 2012 and has been produced each year since.
It continues to gain global recognition as governments, organizations, and civil society increasingly use Happiness indicators to inform their policy-making decisions. Over the last five years, the Happiness score in the US decreased and has since remained essentially stagnant. Yet other countries that are comparable in terms of gross domestic product (GDP) and other factors have a higher Happiness score.
In this report, we will explore how different variables are tied to Happiness and attempt to create a model for Happiness.
Purpose
Ultimately, the question we’d like to answer in this report is:
What can we do to make the US a happier country?
For the leaders of some of the largest employers and agencies who can positively affect outcomes in the US, this report should serve as a roadmap, showing both where we’ve been and where we, as a country, can potentially go. We’ll analyze publicly available data that is compiled annually, which allows this task-force to track progress.
The following charts and models are an analysis of the 2019 Annual World Happiness Report.
Data
Since the first report was published in 2012, the state of the world, and more specifically, the US, has changed.
Before performing an analysis or creating a model, we wanted to explore the World Happiness Report survey data.
The dataset for 2019 contains 156 observations covering 9 variables:
- Overall Rank - overallRank
- Country or Region - countryRegion
- Happiness Score - score
- GDP Per Capita - gdpPerCap
- Social Support - socialSupport
- Life Expectancy - lifeExpectancy
- Freedom of Choice - freedomChoices
- Generosity - generosity
- Perception of Corruption - perceptionCorruption
Range, Mean & Standard Deviation of Variables
What we found noteworthy while reviewing this basic information: Happiness Score has a wider range than the other variables; Generosity and Perception of Corruption both have tighter distributions with smaller standard deviations; and Social Support is negatively skewed, with its mean closer to the maximum of its range than to the minimum.
| Variable | Range | Mean | Standard Deviation |
|---|---|---|---|
| Happiness Score | 2.853 to 7.769 | 5.4070962 | 1.1131199 |
| GDP Per Capita | 0 to 1.684 | 0.9051474 | 0.3983895 |
| Generosity | 0 to 0.566 | 0.1848462 | 0.0952544 |
| Freedom of Choice | 0 to 0.631 | 0.3925705 | 0.1432895 |
| Perception of Corruption | 0 to 0.453 | 0.1106026 | 0.0945378 |
| Healthy Life Expectancy | 0 to 1.141 | 0.7252436 | 0.242124 |
| Social Support | 0 to 1.624 | 1.2088141 | 0.2991914 |
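As a rough check on the skewness observation, the sketch below computes where each variable's mean sits within its range, using only the summary figures reported above. This is an illustrative proxy we are assuming for convenience, not a formal skewness statistic; the helper name `mean_position` is our own.

```python
# Summary statistics copied from the table above: (min, max, mean).
summary = {
    "Happiness Score": (2.853, 7.769, 5.4070962),
    "GDP Per Capita": (0.0, 1.684, 0.9051474),
    "Generosity": (0.0, 0.566, 0.1848462),
    "Freedom of Choice": (0.0, 0.631, 0.3925705),
    "Perception of Corruption": (0.0, 0.453, 0.1106026),
    "Healthy Life Expectancy": (0.0, 1.141, 0.7252436),
    "Social Support": (0.0, 1.624, 1.2088141),
}


def mean_position(minimum, maximum, mean):
    """Return where the mean falls within the range: 0 is the min, 1 is the max."""
    return (mean - minimum) / (maximum - minimum)


positions = {name: mean_position(*stats) for name, stats in summary.items()}

# Print variables from the highest relative mean to the lowest.
for name, pos in sorted(positions.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<26} {pos:.3f}")
```

A value well above 0.5 suggests most countries cluster near the top of the range with a tail of low values, which is the pattern we describe for Social Support.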
Histograms
By producing histograms of our data, a few notable observations appear:
- Happiness Score is concentrated in the 4 to 6.5 range.
  - We’d like to focus on moving the US from the “6s” to the “7s”.
- Generosity and Perception of Corruption have fairly low scores.
  - This is a good time to point out the subjective nature of these variables.
Scatterplots & Correlation Matrix
By plotting each numeric variable on the X axis against Happiness Score, we can see which ones are correlated with Happiness Score.
Our correlation matrix further supports these relationships and will help us identify inputs for our model.
Modelling For Happiness
After reviewing the exploratory charts and data, we see clear relationships between variables contained in the survey and a country’s Happiness Score.
In this portion of the report, we will explore potential models for Happiness Score and try to answer our question: How can we make the US a happier country?
Potential Models
Model.1 - Multiple Linear Regression With All Variables
##
## Call:
## lm(formula = score ~ gdpPerCap + generosity + freedomChoices +
## socialSupport + lifeExpectancy + perceptionCorruption, data = X2019)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.75304 -0.35306 0.05703 0.36695 1.19059
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.7952 0.2111 8.505 1.77e-14 ***
## gdpPerCap 0.7754 0.2182 3.553 0.000510 ***
## generosity 0.4898 0.4977 0.984 0.326709
## freedomChoices 1.4548 0.3753 3.876 0.000159 ***
## socialSupport 1.1242 0.2369 4.745 4.83e-06 ***
## lifeExpectancy 1.0781 0.3345 3.223 0.001560 **
## perceptionCorruption 0.9723 0.5424 1.793 0.075053 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5335 on 149 degrees of freedom
## Multiple R-squared: 0.7792, Adjusted R-squared: 0.7703
## F-statistic: 87.62 on 6 and 149 DF, p-value: < 2.2e-16
Notes:
- Generosity and Perception of Corruption are the only variables that are not statistically significant predictors of Happiness Score at the 0.05 level (p = 0.327 and p = 0.075, respectively).
- As noted before, Perception of Corruption is one of the more subjective measures in this dataset.
- Social Support is a stronger predictor than our team assumed it would be.
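The significance calls above can be reproduced from the Model.1 output alone: each t value is the estimate divided by its standard error. The sketch below is a minimal illustration using the printed figures; the cutoff `T_CRITICAL` is our assumption (an approximate two-sided 5% critical value for 149 degrees of freedom), not something reported in the output.

```python
# Estimates and standard errors copied from the Model.1 output above.
model_1 = {
    "gdpPerCap": (0.7754, 0.2182),
    "generosity": (0.4898, 0.4977),
    "freedomChoices": (1.4548, 0.3753),
    "socialSupport": (1.1242, 0.2369),
    "lifeExpectancy": (1.0781, 0.3345),
    "perceptionCorruption": (0.9723, 0.5424),
}

# Assumption: approximate two-sided 5% critical value for a t distribution
# with 149 degrees of freedom.
T_CRITICAL = 1.976

# t value = estimate / standard error.
t_values = {name: est / se for name, (est, se) in model_1.items()}

# Predictors whose |t| falls below the cutoff are not significant at 0.05.
not_significant = [name for name, t in t_values.items() if abs(t) < T_CRITICAL]

for name, t in t_values.items():
    print(f"{name:<22} t = {t:.3f}")
print("Not significant at 0.05:", not_significant)
```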
Model.2 - Multiple Linear Regression With Strong Predictors
##
## Call:
## lm(formula = score ~ gdpPerCap + freedomChoices + socialSupport +
## lifeExpectancy, data = X2019)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.86584 -0.34594 0.03403 0.43676 1.13076
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.8921 0.1994 9.491 < 2e-16 ***
## gdpPerCap 0.8105 0.2165 3.745 0.000256 ***
## freedomChoices 1.8458 0.3404 5.423 2.28e-07 ***
## socialSupport 1.0166 0.2347 4.331 2.70e-05 ***
## lifeExpectancy 1.1414 0.3373 3.384 0.000910 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5398 on 151 degrees of freedom
## Multiple R-squared: 0.7709, Adjusted R-squared: 0.7649
## F-statistic: 127 on 4 and 151 DF, p-value: < 2.2e-16
Notes:
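- This model is more parsimonious than Model.1: every remaining predictor is significant at the 0.001 level, at the cost of a slightly lower adjusted R-squared (0.7649 versus 0.7703). Dropping Generosity and Perception of Corruption is consistent with our exploratory correlation matrix, where both were less strongly correlated with score.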
- GDP Per Capita, Freedom of Choice, Social Support, and Healthy Life Expectancy are strong predictors of Happiness.
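To compare Model.1 and Model.2 on an equal footing, adjusted R-squared penalizes each model for the number of predictors it uses. The sketch below applies the standard formula to the Multiple R-squared values printed in the two outputs; because those inputs are rounded to four decimals, the results can differ from the printed adjusted values in the last digit.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_OBS = 156  # observations in the 2019 dataset

# Multiple R-squared values copied from the Model.1 and Model.2 outputs.
adj_model_1 = adjusted_r_squared(0.7792, N_OBS, n_predictors=6)
adj_model_2 = adjusted_r_squared(0.7709, N_OBS, n_predictors=4)

print(f"Model.1 adjusted R-squared: {adj_model_1:.4f}")
print(f"Model.2 adjusted R-squared: {adj_model_2:.4f}")
print(f"Difference: {adj_model_1 - adj_model_2:.4f}")
```

The two values are within about 0.006 of each other, so the choice between the models rests mainly on simplicity and on the significance of the individual predictors.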
Model.3 - Simple Linear Regression - GDP Per Capita
##
## Call:
## lm(formula = score ~ gdpPerCap, data = X2019)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.22044 -0.48361 0.00828 0.48433 1.47409
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.3993 0.1353 25.12 <2e-16 ***
## gdpPerCap 2.2181 0.1369 16.20 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.679 on 154 degrees of freedom
## Multiple R-squared: 0.6303, Adjusted R-squared: 0.6278
## F-statistic: 262.5 on 1 and 154 DF, p-value: < 2.2e-16
Notes:
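- While this model is statistically significant (we can reject the null hypothesis that GDP Per Capita has no relationship with Happiness Score), it is too simplistic: it explains about 63% of the variation in score, compared with roughly 77% for the multiple regression models.
- Because the model ignores every other variable, it would overestimate the Happiness Score of a country with high GDP Per Capita but low values elsewhere.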
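For a simple linear regression, the Multiple R-squared is the square of the correlation between the predictor and the response, so the Model.3 output also tells us the GDP-to-score correlation. A minimal sketch using only the printed figures:

```python
import math

R_SQUARED_MODEL_3 = 0.6303  # Multiple R-squared from the Model.3 output
SLOPE_MODEL_3 = 2.2181      # gdpPerCap estimate from the Model.3 output

# In simple regression, r = sign(slope) * sqrt(R^2).
r = math.copysign(math.sqrt(R_SQUARED_MODEL_3), SLOPE_MODEL_3)

# Share of the variation in Happiness Score that GDP alone leaves unexplained.
unexplained = 1 - R_SQUARED_MODEL_3

print(f"Correlation between gdpPerCap and score: {r:.3f}")
print(f"Variation left unexplained by GDP alone: {unexplained:.1%}")
```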
Results
Winning Model
The model we selected for predicting Happiness Score is Model.2, Multiple Linear Regression With Strong Predictors. It factors in all but two of the numeric variables (the categorical variable “Country” was not deemed useful in this type of exercise and has been excluded), and all of its predictors have p-values far lower than 0.05. Its adjusted R-squared (0.7649) is nearly identical to that of Model.1 (0.7703) despite using two fewer predictors.
Testing Our Winning Model
With coefficients rounded to two decimal places, the equation for this model can be written as \(score = 1.89 + gdpPerCap \times 0.81 + freedomChoices \times 1.85 + socialSupport \times 1.02 + lifeExpectancy \times 1.14\)
In order to validate our model, we used the sample() function to choose three countries at random, which returned rows 16, 152, and 24. Respectively, these correspond to Ireland, Rwanda, and France.
This is a random sample; however, it’s notable that we have a country with a higher Happiness Score than the US (Ireland), one with a comparable score (France), and one with a much lower score (Rwanda).
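For readers who want to repeat the draw outside of R, the sketch below shows an equivalent random selection of three row numbers. It is not the original call: the seed value is hypothetical, and a different random number generator will not reproduce rows 16, 152, and 24.

```python
import random

N_ROWS = 156  # observations in the 2019 dataset

# Hypothetical seed, used only so this illustration is repeatable.
rng = random.Random(2019)

# Draw three distinct row numbers from 1..156, without replacement.
rows = rng.sample(range(1, N_ROWS + 1), 3)
print(rows)
```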
Ireland
Equation for Estimated Score: \(score = 1.89 + 1.499 \times 0.81 + 0.516 \times 1.85 + 1.553 \times 1.02 + 0.999 \times 1.14\)
| Estimated Score Using Model | Observed Score |
|---|---|
| 6.78171 | 7.021 |
Rwanda
Equation for Estimated Score: \(score = 1.89 + 0.359 \times 0.81 + 0.555 \times 1.85 + 0.711 \times 1.02 + 0.614 \times 1.14\)
| Estimated Score Using Model | Observed Score |
|---|---|
| 4.63272 | 3.334 |
France
Equation for Estimated Score: \(score = 1.89 + 1.324 \times 0.81 + 0.436 \times 1.85 + 1.472 \times 1.02 + 1.045 \times 1.14\)
| Estimated Score Using Model | Observed Score |
|---|---|
| 6.46178 | 6.592 |
We can see that by using observed values from the survey, our estimated score is close to the observed score for both Ireland and France, but the model overestimated the score for Rwanda by about 1.3 points.
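The three country checks can be scripted so that anyone on the task-force can rerun them. The sketch below evaluates the rounded equation exactly as written above for each sampled country, then expresses each miss in units of Model.2's residual standard error (0.5398). The function and variable names are our own, and because the coefficients are rounded to two decimals, the estimates are approximations of what the fitted model itself would return.

```python
# Rounded Model.2 coefficients, as used in the equation above.
INTERCEPT = 1.89
COEFFICIENTS = {
    "gdpPerCap": 0.81,
    "freedomChoices": 1.85,
    "socialSupport": 1.02,
    "lifeExpectancy": 1.14,
}
RESIDUAL_STD_ERROR = 0.5398  # from the Model.2 output


def estimate_score(values):
    """Apply the rounded Model.2 equation to one country's survey values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


# Survey values and observed scores, copied from the equations and tables above.
countries = {
    "Ireland": ({"gdpPerCap": 1.499, "freedomChoices": 0.516,
                 "socialSupport": 1.553, "lifeExpectancy": 0.999}, 7.021),
    "Rwanda": ({"gdpPerCap": 0.359, "freedomChoices": 0.555,
                "socialSupport": 0.711, "lifeExpectancy": 0.614}, 3.334),
    "France": ({"gdpPerCap": 1.324, "freedomChoices": 0.436,
                "socialSupport": 1.472, "lifeExpectancy": 1.045}, 6.592),
}

results = {}
for country, (values, observed) in countries.items():
    estimated = estimate_score(values)
    residual = observed - estimated  # negative means the model overestimated
    results[country] = (estimated, residual)
    print(f"{country:<8} estimated {estimated:.5f}  observed {observed:.3f}  "
          f"residual {residual:+.3f} ({residual / RESIDUAL_STD_ERROR:+.2f} RSE)")
```

A residual larger than about two residual standard errors in absolute value, as with Rwanda, marks a country the model fits poorly.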
Applying Our Model
Circling back to our initial question: What can we do to make the US a happier country? We’ll try to answer it by using our winning model to test the effect of different actions. First, we apply the model to the US itself:
| Estimated Score Using Model | Observed Score |
|---|---|
| 6.37313 | 6.892 |
Our model produces a lower score than the observed score from the survey. Since we are testing the effect of altering variables, we will use the estimated score of 6.37313 as the benchmark.
| Test Action | Estimated Score |
|---|---|
| Improving GDP Per Capita by 0.25 | 6.57563 |
| Improving Social Support by 0.25 | 6.62813 |
| Improving Both GDP & Social Support by 0.25 | 6.83063 |
| Improving Healthy Life Expectancy by 0.25 | 6.65813 |
| Improving Healthy Life Expectancy & Social Support by 0.25 | 6.91313 |
| Improving All Variables by 0.10 | 6.85513 |
| Improving All Variables by 0.25 | 7.57813 |
| GDP constant/All other variables increase by 0.10 | 6.77413 |
Notes:
- Improving Social Support has a larger effect on Happiness Score than the same improvement in GDP Per Capita.
- A full point increase in Social Support would result in an increase of about 1.02 in Happiness Score.
- Improving all four variables by a small amount (0.10 each) raises the estimated score by about 0.48.
- Holding GDP constant and improving all other variables by 0.10 still raises the estimated score by about 0.40.
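Because the model is linear, each test action simply adds the coefficient times the change to the benchmark score. The sketch below reproduces a few rows of the table this way; the function name `scenario_score` and the scenario labels are our own, and the same function can be used to try actions not listed above.

```python
BASELINE = 6.37313  # estimated US score used as the benchmark above

# Rounded Model.2 coefficients.
COEFFICIENTS = {
    "gdpPerCap": 0.81,
    "freedomChoices": 1.85,
    "socialSupport": 1.02,
    "lifeExpectancy": 1.14,
}


def scenario_score(changes):
    """Return the benchmark plus coefficient * change for each altered variable."""
    return BASELINE + sum(COEFFICIENTS[name] * delta for name, delta in changes.items())


scenarios = {
    "GDP Per Capita +0.25": {"gdpPerCap": 0.25},
    "Social Support +0.25": {"socialSupport": 0.25},
    "All variables +0.10": {name: 0.10 for name in COEFFICIENTS},
    "GDP constant, others +0.10": {
        name: 0.10 for name in COEFFICIENTS if name != "gdpPerCap"
    },
}

for label, changes in scenarios.items():
    print(f"{label:<28} {scenario_score(changes):.5f}")
```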
Issues With The Model
Like all models, ours is imperfect. One of its biggest issues is that it simplifies an incredibly complex measure. The survey this model is based on is an enormous undertaking: compiling and synthesizing a vast amount of information from over 150 countries in order to calculate an estimated score is no small feat. That said, this model doesn’t factor in disastrous weather events, acts of terror, or, as we are currently seeing, the effects of a pandemic.
Conclusion & Next Steps
What we hope our model demonstrates is that even small improvements in the country’s health, social support, freedom of choice, and GDP per capita can have a large impact on the Happiness Score.
As a task-force, we are in a position to influence the well-being of the US.
Next Steps
We would like to continue this discussion. Our team proposes that we:
- Identify proxies for measuring these variables on our own throughout the course of the following year.
- Develop improvement plans for health and social support, since this group has the ability to affect these two variables the most.
- Convene on a quarterly and annual basis to report on improvements.
Please contact Taylor Tuomie with questions: ttuomie03@hamline.edu