2026-03-26

Dataset Overview and Source

World Happiness Report

This analysis examines data from 153 countries to identify how individual metrics relate to a country’s overall happiness score.

Data Source: World Happiness Report Database

Number of Observations: 153 countries across multiple regions

Key Variables:

  • Happiness Score - Overall life satisfaction rating (0-10 scale)
  • Job Satisfaction - Work satisfaction levels (0-100 scale)
  • Corruption - Perception of corruption in government/business (0-1 scale)
  • Economy - GDP per capita contribution to happiness (0-2 scale)
  • Family - Social support and family connections (0-2 scale)
  • Health - Life expectancy contribution (0-1 scale)
  • Freedom - Perceived freedom to make life choices (0-1 scale)
  • Region - Geographic region classification (categorical)

R Code for Data Preparation

Here’s how I load and prepare the data:

# Load required libraries
library(ggplot2)
library(plotly)
library(readxl)
library(dplyr)
# Load the data
happiness <- read_excel("world_happiness.xlsx")
# Convert Job Satisfaction to numeric
happiness$`Job Satisfaction` <- as.numeric(happiness$`Job Satisfaction`)
# Create happiness level groups for analysis
happiness$Happiness_Level <- cut(
  happiness$`Happiness Score`,
  breaks = c(0, 4.5, 6, 10),
  labels = c("Low", "Medium", "High")
)

The code above begins by loading the libraries needed for plotting (ggplot2, plotly), file reading (readxl), and data manipulation (dplyr). The dataset is imported directly from an Excel file using read_excel. Job Satisfaction is converted to numeric because it was read in as a character type, which would otherwise cause errors in plotting. Finally, a new grouping variable called Happiness_Level is created by splitting the continuous Happiness Score into three categories: Low (up to 4.5), Medium (above 4.5 and up to 6), and High (above 6). This is used later in the boxplot analysis.
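To make the grouping step concrete, here is a small sketch of how cut() treats scores that sit on or near the bin edges. The six scores are illustrative values I picked for this example (only the minimum and maximum match the dataset summary); the breaks and labels are the same ones used in the preparation code.

```r
# Illustrative scores placed on and around the bin edges (not rows from the dataset)
scores <- c(2.69, 4.5, 4.51, 6, 6.01, 7.54)

# Same breaks and labels as in the data preparation step
levels_default <- cut(
  scores,
  breaks = c(0, 4.5, 6, 10),
  labels = c("Low", "Medium", "High")
)

# cut() uses right-closed intervals by default: (0, 4.5], (4.5, 6], (6, 10]
# so a score of exactly 4.5 is "Low" and a score of exactly 6 is "Medium"
data.frame(score = scores, level = levels_default)
```

If edge scores should fall into the upper group instead, cut() accepts right = FALSE, which switches to left-closed intervals.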

3D Happiness Factors Plot

3D Plot Analysis

  • Health Impact: Countries with higher health (life expectancy) scores tend to have higher happiness scores, indicating a strong positive relationship: as health scores rise, happiness scores rise with them.
  • Economic Impact: Higher Economy (GDP per capita) values are also associated with higher happiness, though the wider spread suggests the relationship is more variable than it is for health.
  • Stronger Driver: Health seems to have a more consistent relationship with happiness than economy, as the data points align more tightly along the health axis. However, higher values of both economy and health go with higher happiness scores.
  • Combined Effect: Countries with both high health and high economic values (the dark red dots) cluster at the top of the happiness scale, showing that these factors together significantly elevate overall happiness.
  • Diminishing Returns: At higher levels of GDP, increases in economy appear to contribute less dramatically to happiness unless accompanied by strong health outcomes. This is expected, since countries eventually reach a point where they cannot sustain the growth of their initial industrial phase. Nevertheless, higher economy values for a nation or region still go with higher overall happiness scores.
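The slide shows the 3D chart but not the code behind it. Below is a minimal sketch of how a chart like this might be built with plotly, not necessarily how the original was made. The toy data frame holds made-up values, and the axis assignment and the "YlOrRd" palette are my assumptions; only the column names come from the dataset.

```r
# Toy data with the dataset's column names; the values are made up for illustration
toy <- data.frame(
  Economy = c(0.3, 0.6, 0.9, 1.2, 1.5),
  Health = c(0.2, 0.4, 0.5, 0.7, 0.8),
  `Happiness Score` = c(3.5, 4.4, 5.2, 6.1, 7.0),
  check.names = FALSE
)

# Build the figure only if plotly is installed
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(
    data = toy,
    x = ~Economy,
    y = ~Health,
    z = ~`Happiness Score`,
    color = ~`Happiness Score`,  # colour points by happiness score
    colors = "YlOrRd",           # assumed palette: light yellow to dark red
    type = "scatter3d",
    mode = "markers"
  )
}
```

In an interactive session, typing fig displays the chart. Swapping toy for the real happiness data frame would plot all 153 countries.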

Job Satisfaction vs Happiness Score

Health Outcomes by Happiness Level

Corruption vs Happiness Score

Statistical Analysis: Summary Statistics

# Five-number summary of Happiness Score
happiness %>% summarise(Min = min(`Happiness Score`), Q1 = quantile(`Happiness Score`, 0.25),
  Median = median(`Happiness Score`), Q3 = quantile(`Happiness Score`, 0.75),
  Max = max(`Happiness Score`), Mean = mean(`Happiness Score`), SD = sd(`Happiness Score`))
Five-Number Summary: Happiness Score
Min    Q1    Median  Q3    Max    Mean   SD
2.69   4.5   5.28    6.1   7.54   5.35   1.13

Happiness scores range from 2.69 to 7.54, with a mean of 5.35 and a similar median of 5.28, which suggests a roughly symmetric distribution. The standard deviation of 1.13 shows moderate spread across the 153 countries studied, so countries with extremely high or extremely low happiness scores are rare. A mean of 5.35 was surprising; one might expect the average happiness score to be a bit higher overall.
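As a quick arithmetic check on the symmetry and spread claims, the sketch below works only from the summary values in the table above. The "mean plus or minus two standard deviations" band is a rule of thumb that assumes a roughly bell-shaped distribution, which is my assumption and is not verified here.

```r
# Summary values copied from the table above
stats <- c(min = 2.69, median = 5.28, max = 7.54, mean = 5.35, sd = 1.13)

# Gap between mean and median; a small gap relative to the SD hints at symmetry
mean_median_gap <- unname(stats["mean"] - stats["median"])

# Rule-of-thumb band: mean plus or minus two standard deviations
band <- unname(c(stats["mean"] - 2 * stats["sd"], stats["mean"] + 2 * stats["sd"]))

mean_median_gap
band
```

The gap is 0.07, small next to a standard deviation of 1.13, and the band runs from 3.09 to 7.61. The observed maximum of 7.54 sits inside that band and the minimum of 2.69 falls only a little below it.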

Linear Regression: Economy vs Happiness Score

Linear Regression Coefficients
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  3.2053    0.1368      23.429   < 0.0001
Economy      2.1823    0.1280      17.046   < 0.0001

Linear Regression: Freedom vs Happiness Score

Linear Regression Coefficients
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  3.5776    0.2180      16.4114  < 0.0001
Freedom      4.3371    0.5009      8.6593   < 0.0001
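The slides show the coefficient tables but not the model calls. Here is a minimal sketch of how a table of this shape could be produced with lm(). The data frame below is synthetic (made-up values generated for this example), so its numbers will not match the tables; with the real data, the same call would use the happiness data frame instead of toy.

```r
# Synthetic stand-in for the real data: values are generated for illustration only
set.seed(1)
toy <- data.frame(Economy = seq(0, 2, length.out = 30))
toy$`Happiness Score` <- 3.2 + 2.2 * toy$Economy + rnorm(30, sd = 0.5)

# Simple linear regression of happiness on a single predictor
fit <- lm(`Happiness Score` ~ Economy, data = toy)

# Coefficient table with Estimate, Std. Error, t value and Pr(>|t|) columns
coef_table <- summary(fit)$coefficients
round(coef_table, 4)

# Very small p-values round to 0 at four decimals;
# format.pval() prints an upper bound instead
format.pval(coef_table[, "Pr(>|t|)"], eps = 1e-4)

# Share of the variation in happiness explained by the predictor
summary(fit)$r.squared
```

Replacing Economy with Freedom in the formula would give the second model in the same way.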

Linear Regression: Key Findings

  • Economy is a strong predictor of happiness (R² = 0.658): The Economy score, which reflects GDP per capita, explains 65.8% of the variation in happiness scores across countries. For every one-unit increase in Economy, the predicted happiness score rises by 2.18 points. This is one of the strongest relationships in our analysis: a stronger economy is heavily correlated with a happier nation.

  • Freedom also significantly predicts happiness (R² = 0.332): Perceived freedom explains 33.2% of the variation in happiness. Of course, the “feel of freedom” is relative to the individual, but countries where citizens feel free to make their own life choices score meaningfully higher, with a 4.34-point increase in predicted happiness per unit of Freedom. Because Freedom is measured on a 0-1 scale, one unit spans the entire scale.

  • Both findings make intuitive sense: Wealthier nations can invest in infrastructure and healthcare that directly improve quality of life. Freedom allows people to take more risks. Both metrics, though subjective at some level, are widely regarded as “good things” that people would like to have. Thus, it makes sense that higher values of these metrics go with higher happiness scores.

  • Takeaway: A country’s happiness is not random. It is meaningfully driven by economic prosperity and personal freedom (among other factors in the dataset). The graphs and metrics support the view that national investment in the things that improve these metrics will improve national happiness overall.
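To show what the slopes mean in practice, the sketch below plugs a few predictor values into the two fitted lines. The coefficients are copied from the regression tables; predict_score() is a hypothetical helper written for this example, and the input values are arbitrary points on each scale.

```r
# Coefficients copied from the regression tables
economy_fit <- c(intercept = 3.2053, slope = 2.1823)
freedom_fit <- c(intercept = 3.5776, slope = 4.3371)

# Hypothetical helper: predicted happiness = intercept + slope * x
predict_score <- function(fit, x) unname(fit["intercept"] + fit["slope"] * x)

# Predicted happiness at a few Economy values (0-2 scale)
economy_pred <- predict_score(economy_fit, c(0.5, 1.0, 1.5))

# Predicted happiness at a few Freedom values (0-1 scale)
freedom_pred <- predict_score(freedom_fit, c(0.2, 0.5, 0.8))

# Change in predicted happiness across each predictor's full scale
full_range_effect <- c(Economy = 2 * 2.1823, Freedom = 1 * 4.3371)

# For a one-predictor model, the correlation is the square root of R-squared
implied_r <- sqrt(c(Economy = 0.658, Freedom = 0.332))

economy_pred
freedom_pred
full_range_effect
implied_r
```

One thing this makes visible: although the Freedom slope (4.34) is about twice the Economy slope (2.18), the two scales differ in length, so moving across the full range of either predictor shifts predicted happiness by roughly the same amount (about 4.36 for Economy and 4.34 for Freedom). The R² values remain the better guide to which predictor fits more tightly.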

Notable Insights and Conclusions

  • Economy is the strongest driver of happiness (R² = 0.658): GDP per capita (the Economy score) explains nearly 66% of happiness variation. Countries at the high end of the Economy scale average a happiness score of 6.8, while those below 0.5 average just 3.9. This is a gap of nearly 3 points associated with wealth alone.

  • Health and happiness are heavily correlated: The boxplot shows that countries with high happiness (6.0-10) have median health scores above 0.75, while countries with low happiness (0-4.5) sit below 0.30, less than half of the “high” group’s median health score. Longer, healthier lives clearly go with higher happiness scores (and perceived wellbeing).

  • Higher job satisfaction correlates with higher happiness: Western Europe, the happiest region, consistently scores 88-95 on job satisfaction. African nations, among the lowest in happiness, cluster below 65. This suggests that job satisfaction is indeed correlated with overall happiness scores.

  • Corruption erodes happiness: Countries with happiness scores of 7 show average corruption scores near 0.45, while those scoring 3 average just 0.07, meaning happier nations perceive their institutions as significantly more trustworthy. [Note that a higher “Corruption” value means less corruption is taking place. For example, Norway at 0.316 is less corrupt than Costa Panama at 0.01.]

  • Going Forward: Countries that invest in keeping their people healthy and have stronger economies consistently score higher on overall happiness. The graphs clearly support this: the gap between high and low happiness nations lines up closely with gaps in health and GDP. It’s not a coincidence; healthier and wealthier countries are consistently happier ones.
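Group comparisons like the ones in these bullets can be computed directly rather than read off a chart. Here is a minimal sketch using base R; the nine rows are made-up values for illustration, and the short column names score and health are placeholders, not the dataset's actual column names.

```r
# Made-up rows for illustration; not taken from the dataset
toy <- data.frame(
  score = c(3.2, 4.1, 4.5, 5.0, 5.6, 6.0, 6.4, 7.1, 7.5),
  health = c(0.20, 0.28, 0.35, 0.50, 0.58, 0.62, 0.78, 0.82, 0.88)
)

# Same happiness-level bins as in the data preparation step
toy$level <- cut(
  toy$score,
  breaks = c(0, 4.5, 6, 10),
  labels = c("Low", "Medium", "High")
)

# Median health score within each happiness level
median_health <- tapply(toy$health, toy$level, median)
median_health

# Overall linear association between health and happiness
cor(toy$health, toy$score)
```

On the real data, the same two calls would give the group medians behind the boxplot and a single correlation figure to quote alongside it.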

Thank You