Happiness vs. the World

Author

Sarah Abdela

Image Credit: “time to be happy, happiness concept” by Song_about_summer

https://www.worldhappiness.report/data-sharing/

Happiness Regarding Life Expectancy

This project analyzes happiness levels and life expectancy across countries using data from the World Happiness Report. The dataset contains quantitative and categorical variables related to quality of life, health, and economic conditions. The main quantitative variables used here are the happiness score (the report's three-year average life evaluation), healthy life expectancy, GDP per capita, and year; the main categorical variable is country name. The life expectancy and GDP measures are taken from the report's "Explained by" columns, so they are expressed on the report's own scale as each factor's estimated contribution to the happiness score, not in years of life or dollars. Together, these variables allow comparisons between countries and years. The purpose of the project is to explore whether health and economic conditions are related to happiness levels globally.

I chose this topic because I have always been interested in what contributes to happiness and quality of life in different countries. Mental health, stress, and overall well-being affect many people, especially students and young adults, and I wanted to better understand how health and economic stability may relate to happiness. This project also helped me learn how data analysis can be used to study real-world social and health-related issues. The World Happiness Report compiles the dataset from international surveys and official statistics gathered from multiple organizations.

# Load libraries
library(tidyverse)
library(readxl)
library(plotly)

library(ggfortify)
# Load the happiness dataset
happy <- read_excel("WHR26_Data_Figure_2.1.xlsx")
# Clean and select variables
happy_clean <- happy %>%
  select(
    Year,
    `Country name`,
    `Life evaluation (3-year average)`,
    `Explained by: Healthy life expectancy`,
    `Explained by: Log GDP per capita`
  ) %>%
  mutate(
    happiness_score = as.numeric(`Life evaluation (3-year average)`),
    life_expectancy = as.numeric(`Explained by: Healthy life expectancy`),
    gdp = as.numeric(`Explained by: Log GDP per capita`)
  )
head(happy_clean)
# A tibble: 6 × 8
   Year `Country name` `Life evaluation (3-year average)` Explained by: Health…¹
  <dbl> <chr>                                       <dbl>                  <dbl>
1  2025 Finland                                      7.76                  0.939
2  2025 Iceland                                      7.54                  0.996
3  2025 Denmark                                      7.54                  0.93 
4  2025 Costa Rica                                   7.44                  0.739
5  2025 Sweden                                       7.26                  1.03 
6  2025 Norway                                       7.24                  0.983
# ℹ abbreviated name: ¹​`Explained by: Healthy life expectancy`
# ℹ 4 more variables: `Explained by: Log GDP per capita` <dbl>,
#   happiness_score <dbl>, life_expectancy <dbl>, gdp <dbl>
# Create summary statistics
happy_summary <- happy_clean %>%
  summarize(
    avg_happiness = mean(happiness_score, na.rm = TRUE),
    avg_life = mean(life_expectancy, na.rm = TRUE),
    avg_gdp = mean(gdp, na.rm = TRUE)
  )

happy_summary
# A tibble: 1 × 3
  avg_happiness avg_life avg_gdp
          <dbl>    <dbl>   <dbl>
1          5.47    0.553    1.27

The summary statistics give an overview of the main quantitative variables before the regression and visualizations. Across all rows with available data, the average happiness score is 5.47, the average healthy life expectancy value is 0.553, and the average GDP value is 1.27. Because the last two come from the "Explained by" columns, they are on the report's contribution scale rather than in years or dollars. Each average is computed separately with na.rm = TRUE, so the three figures may be based on different numbers of rows. These values provide a baseline for judging whether a given country sits above or below the typical level on each measure.
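Since each average above skips missing values column by column, it can help to see how many values are missing and what the averages look like on rows where all three variables are present. The following is a minimal sketch of that check in base R. The data frame `toy` and every value in it are made up for illustration and only stand in for `happy_clean`.

```r
# Toy data standing in for happy_clean; all values are made up for illustration.
toy <- data.frame(
  happiness_score = c(7.5, 6.8, 5.2, 4.1, 6.0, 3.9),
  life_expectancy = c(0.95, NA, 0.55, 0.30, NA, 0.25),
  gdp             = c(1.80, 1.60, NA, 0.70, 1.30, 0.60)
)

# Count the missing values in each column.
missing_counts <- colSums(is.na(toy))
missing_counts

# Keep only rows with no missing values. These are the rows that lm()
# would use under its default na.action (na.omit).
complete_rows <- toy[complete.cases(toy), ]
nrow(complete_rows)

# Averages on the complete rows, so all three means share the same rows.
complete_means <- colMeans(complete_rows)
complete_means
```

Applied to the real data, replacing `toy` with `happy_clean[, c("happiness_score", "life_expectancy", "gdp")]` should give the same kind of overview.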

# Run multiple linear regression
model <- lm(
  happiness_score ~ life_expectancy + gdp,
  data = happy_clean
)

summary(model)

Call:
lm(formula = happiness_score ~ life_expectancy + gdp, data = happy_clean)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.01607 -0.44377  0.06577  0.50395  1.71073 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      3.02959    0.07092   42.72   <2e-16 ***
life_expectancy  2.01437    0.11787   17.09   <2e-16 ***
gdp              1.11275    0.05837   19.07   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7269 on 1013 degrees of freedom
  (1100 observations deleted due to missingness)
Multiple R-squared:  0.584, Adjusted R-squared:  0.5832 
F-statistic: 711.1 on 2 and 1013 DF,  p-value: < 2.2e-16

$$\text{Happiness Score} = \beta_0 + \beta_1(\text{Life Expectancy}) + \beta_2(\text{GDP})$$

The multiple linear regression used happiness score as the dependent variable and healthy life expectancy and GDP per capita as independent variables. Both predictors were highly significant, with p-values below 2e-16. Both coefficients were positive: holding the other predictor constant, a one-unit increase in the life expectancy measure is associated with a happiness score about 2.01 points higher, and a one-unit increase in the GDP measure with a score about 1.11 points higher. The adjusted R-squared was 0.5832, meaning that about 58% of the variation in happiness scores is explained by the two predictors. The F-statistic of 711.1 (p < 2.2e-16) indicates that the model as a whole is statistically significant. The model was fit on 1,016 observations, because 1,100 rows with missing values were dropped. Overall, the results suggest that health and economic conditions are strongly associated with happiness across countries, although a regression of this kind shows association rather than causation.
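To make the fitted equation concrete, the sketch below plugs values into it by hand. The coefficients are copied from the `summary(model)` output above. Using the two column averages from the summary table (0.553 and 1.27) as inputs is only an illustrative choice, and the names `b`, `x`, and `predicted` are introduced here for the example.

```r
# Estimates copied from the summary(model) output.
b <- c(intercept = 3.02959, life_expectancy = 2.01437, gdp = 1.11275)

# Illustrative inputs: the column averages from the summary table.
# The leading 1 multiplies the intercept.
x <- c(1, 0.553, 1.27)

# Fitted happiness score: b0 + b1 * life_expectancy + b2 * gdp.
predicted <- sum(b * x)
round(predicted, 2)
```

The result need not match the 5.47 average happiness score reported earlier, because those averages were computed column by column with `na.rm = TRUE`, while the model was fit only on rows where all three variables were present. With the fitted model object available, `predict(model, newdata = ...)` performs the same calculation.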

# Create diagnostic plots
autoplot(model)

The diagnostic plots were used to check whether the regression assumptions were reasonably met. The Residuals vs Fitted plot shows most points spread around the center line, with some outliers. The Normal Q-Q plot suggests that the residuals are close to normally distributed, with slight deviations at the ends; the residual summary (minimum -3.02, maximum 1.71) points to a somewhat longer lower tail. The Scale-Location plot shows a fairly consistent spread of residuals across the fitted values. The Residuals vs Leverage plot indicates that a few observations may have higher influence on the model. Overall, the diagnostics suggest that a linear model is a reasonable fit for the relationship between happiness score, life expectancy, and GDP.
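The visual checks can be paired with a few numeric ones. The sketch below uses base R only and simulated data, so `toy`, `toy_model`, and all of the numbers are assumptions for illustration, not results from the happiness dataset. The cutoffs (Cook's distance above 4/n, standardized residuals beyond 2) are common rules of thumb rather than strict tests.

```r
# Simulated data standing in for happy_clean; values are not real.
set.seed(1)
n <- 50
toy <- data.frame(
  life_expectancy = runif(n, 0, 1),
  gdp             = runif(n, 0, 2)
)
toy$happiness_score <- 3 + 2 * toy$life_expectancy + 1 * toy$gdp +
  rnorm(n, sd = 0.7)

toy_model <- lm(happiness_score ~ life_expectancy + gdp, data = toy)

# Influence: flag observations whose Cook's distance exceeds 4 / n.
cooks <- cooks.distance(toy_model)
influential <- which(cooks > 4 / n)

# Outliers: flag standardized residuals larger than 2 in absolute value.
std_res <- rstandard(toy_model)
large_resid <- which(abs(std_res) > 2)

# Predictor overlap: a correlation near 1 or -1 between the two predictors
# would make their individual coefficients harder to interpret.
predictor_cor <- cor(toy$life_expectancy, toy$gdp)

influential
large_resid
predictor_cor
```

Running the same calls on `model` in place of `toy_model` would list the specific rows behind the outliers and high-influence points seen in the plots.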

# Create interactive visualization
p1 <- ggplot(
  happy_clean,
  aes(
    x = life_expectancy,
    y = happiness_score,
    color = factor(Year)
  )
) +
  geom_point(size = 3, alpha = 0.7) +
  labs(
    title = "Happiness Score and Life Expectancy",
    x = "Healthy Life Expectancy",
    y = "Happiness Score",
    color = "Year",
    caption = "Source: World Happiness Report"
  ) +
  scale_color_viridis_d() +
  theme_minimal()

ggplotly(p1)

The interactive scatterplot shows the relationship between happiness score and healthy life expectancy. Each point represents one country in one year, and the color indicates the year. The plot makes it possible to see whether observations with higher life expectancy also tend to have higher happiness scores, which is the pattern the regression found. Because the plot is interactive, hovering over a point shows its values, which makes it easier to compare countries and years. Some countries appear to have consistently higher happiness scores than others, and points with similar life expectancy values can still differ in happiness, consistent with the regression explaining only part of the variation. Overall, the visualization supports the findings from the regression analysis.
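The pattern in the scatterplot can also be summarized with one number per year. The sketch below computes the correlation between life expectancy and happiness score within each year using base R. The data frame `toy` and its values are made up for illustration, and `cor_by_year` is a name introduced here.

```r
# Toy data standing in for happy_clean; all values are made up.
toy <- data.frame(
  Year            = c(2023, 2023, 2023, 2023, 2024, 2024, 2024, 2024),
  life_expectancy = c(0.30, 0.60, 0.90, 0.50, 0.25, 0.55, NA, 0.80),
  happiness_score = c(4.0, 5.5, 7.0, 6.0, 4.2, 5.0, 6.8, 6.1)
)

# Split the rows by year and compute the correlation within each group,
# ignoring rows where either value is missing.
cor_by_year <- sapply(
  split(toy, toy$Year),
  function(d) cor(d$life_expectancy, d$happiness_score, use = "complete.obs")
)

cor_by_year
```

A value close to 1 in a given year would mean that the points for that year fall near an upward-sloping line.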

https://public.tableau.com/shared/BBXZW7KXN?:display_count=n&:origin=viz_share_link

The Tableau map displays happiness scores by country on an interactive world map. Countries are shaded by happiness level: darker shades generally represent higher happiness scores, and lighter shades represent lower ones. The map shows that happiness varies considerably across regions. Countries with stronger economic conditions, higher life expectancy, and better quality of life appear to report higher happiness scores than countries facing greater economic or social challenges. Users can explore individual countries interactively, and the year filter lets viewers examine how happiness levels change over time. Overall, the Tableau visualization complements the regression analysis and the R visualizations by presenting the data in a geographic format that is easy to interpret and compare.

Background

According to the World Happiness Report, happiness and overall well-being are influenced by several important factors including health, social support, economic stability, freedom, and quality of life. Countries with stronger healthcare systems, better economic opportunities, and higher life expectancy often report higher happiness scores. Researchers also suggest that social trust, mental health, personal freedom, and community support play important roles in shaping overall well-being. Mental health and happiness are increasingly recognized as important global public health concerns because they affect individuals, families, and communities in many different ways. The World Happiness Report uses international survey data and statistical indicators to better understand what factors contribute to happiness across countries. Studying these relationships can help governments and organizations improve policies related to healthcare, quality of life, and social support systems. Understanding global happiness patterns may also help raise awareness about the importance of mental health and overall well-being.

Conclusion

Overall, this project explored the relationship between happiness, healthy life expectancy, and economic conditions across countries using data from the World Happiness Report. Using multiple linear regression, diagnostic plots, and interactive visualizations, it examined how health and economic factors relate to happiness levels around the world. The regression analysis suggested that countries with stronger economic conditions and higher life expectancy tend to report higher happiness scores. The visualizations made it easier to compare countries and understand global patterns in well-being and quality of life. One interesting finding was that happiness levels vary considerably across countries and regions, suggesting that multiple social, economic, and health-related factors contribute to overall happiness.

This project was personally meaningful to me because mental health, stress, and overall well-being are important topics that affect many people around me. Working on this analysis helped me better understand the importance of health, economic stability, and quality of life in shaping happiness globally. If I had more time, I would have liked to include additional variables related to mental health services, social support systems, or education to further explore their relationship with happiness and well-being.