2025-03-21

The Dataset

-This project uses the WHI_Inflation dataset from Kaggle.

-The dataset combines World Happiness Index (WHI) data with inflation metrics for 148 countries from 2015 to 2023.

-The dataset has 17 columns; the major ones cover the happiness score and rank, economic indicators, social factors, and geographical information.

Brief Overview

We explore the data through several visualizations to understand it better. The layout is as follows:

-World map: Geographical mapping with color intensity of countries based on their average happiness rank.

-Scatter plot: Shows the economic and political factors that may affect the happiness score. We plotted three variables: perceptions of corruption, GDP deflator growth rate, and inflation rate.

-Pie Chart: Average generosity by continent, grouped by the level of freedom people have to make choices.

-Box Plot: Shows the distribution of happiness scores across the different continents.

-Statistical Analysis: Five-number summary showing the distribution of happiness scores across continents.

Ggplot World Map

The map colors each country by its average happiness rank: lighter shades of green indicate a higher rank, and darker shades indicate a lower rank. Countries with no data in the dataset are shown in gray. This gives a rough picture of how happiness ranks vary across the world.
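
A map of this kind could be sketched with ggplot2 roughly as follows. This is a minimal sketch, not the code behind the figure: the `avg_rank` data frame, its `AvgRank` column, and the rank values in it are made up for illustration, and `map_data("world")` assumes the maps package is installed.

```r
library(ggplot2)
library(dplyr)

# Country outlines; map_data() needs the 'maps' package to be installed.
world <- map_data("world")

# Hypothetical average ranks for three countries (illustrative values only).
avg_rank <- data.frame(
  region  = c("Finland", "Canada", "Kenya"),
  AvgRank = c(1, 14, 120)
)

# left_join keeps every map polygon; countries without a rank get NA.
map_df <- left_join(world, avg_rank, by = "region")

world_map <- ggplot(map_df, aes(x = long, y = lat, group = group, fill = AvgRank)) +
  geom_polygon() +
  # Rank 1 is the happiest, so small numbers get the light end of the gradient.
  scale_fill_gradient(low = "lightgreen", high = "darkgreen", na.value = "gray80") +
  coord_quickmap() +
  labs(title = "Average Happiness Rank by Country", fill = "Average rank")

world_map
```

One thing to watch with this approach: a country whose name is spelled differently in the dataset and in the map data (the map data uses "USA" and "UK", for example) will not match in the join and will also be drawn in gray unless it is recoded first.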

Plotly 3D plot

The following is a 3D Scatter Plot of Perceptions of Corruption, GDP Deflator Growth Rate, and Inflation Rate.

3D plot Analysis

To see what impacts the happiness score, the 3D plot shows four variables: three on the axes and the happiness score as the point color. Perceptions of Corruption measures how much corruption people perceive in their country’s government and businesses. GDP Deflator Index Growth Rate measures changes in the economy-wide price level. Finally, Producer Price Inflation is the inflation rate for producers.

-The relationship is not straightforward: happiness scores tend to be higher where perceived corruption is between about 0.2 and 0.6, and lower below or above that range.

-Most GDP Deflator Index Growth Rate and Producer Price Inflation values fall between 0 and 20.
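
A 3D scatter plot with a fourth variable mapped to color could be built with plotly along these lines. This is a sketch under stated assumptions: the `toy` data frame, its column names, and its values are invented for illustration and are not taken from the dataset.

```r
library(plotly)

# Toy rows standing in for the real data; column names and values are assumptions.
toy <- data.frame(
  Corruption  = c(0.10, 0.30, 0.50, 0.70),
  GdpDeflator = c(2.0, 5.0, 12.0, 18.0),
  ProducerInf = c(1.5, 4.0, 9.0, 15.0),
  Score       = c(4.5, 6.8, 7.1, 5.0)
)

# Three variables go on the axes; the happiness score is mapped to marker color.
fig <- plot_ly(
  toy,
  x = ~Corruption, y = ~GdpDeflator, z = ~ProducerInf,
  color = ~Score,
  type = "scatter3d", mode = "markers"
)

# Label the three axes of the 3D scene.
fig <- layout(fig, scene = list(
  xaxis = list(title = "Perceptions of Corruption"),
  yaxis = list(title = "GDP Deflator Index Growth Rate"),
  zaxis = list(title = "Producer Price Inflation")
))

fig
```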

Pie Charts

To compare traits across continents, pie charts are used, one for each level of freedom. The low level is a freedom value below 0.3, medium is 0.3-0.6, and anything above 0.6 is high. Within each level, the pie chart shows the average generosity of each continent. From the charts, Europe appears to be one of the continents whose countries span different perceived levels of freedom. Additionally, not every continent has countries at all three levels of freedom.
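
The binning into three freedom levels and the per-continent averages could be sketched as below. The `toy` data frame, the `Freedom` and `Generosity` column names, and all values are assumptions made for illustration; where a value of exactly 0.6 should fall is also an assumption, since the text does not say.

```r
library(ggplot2)

# Toy data standing in for the real dataset; names and values are assumptions.
toy <- data.frame(
  Continent  = c("Europe", "Europe", "Europe", "Africa", "Oceania", "Oceania"),
  Freedom    = c(0.25, 0.45, 0.80, 0.50, 0.55, 0.90),
  Generosity = c(0.10, 0.20, 0.30, 0.15, 0.35, 0.45)
)

# Bin freedom into Low (< 0.3), Medium (0.3 up to 0.6), and High.
# With right = FALSE a value of exactly 0.6 lands in High (an assumption).
toy$FreedomLevel <- cut(
  toy$Freedom,
  breaks = c(-Inf, 0.3, 0.6, Inf),
  labels = c("Low", "Medium", "High"),
  right  = FALSE
)

# Average generosity for each continent within each freedom level.
avg <- aggregate(Generosity ~ Continent + FreedomLevel, data = toy, FUN = mean)

# One pie per freedom level: stacked bars bent into circles with coord_polar().
pies <- ggplot(avg, aes(x = "", y = Generosity, fill = Continent)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  facet_wrap(~FreedomLevel) +
  theme_void()

pies
```

In this toy data only Europe has rows in all three levels, which mirrors the kind of gap described above: a continent with no countries at a given level simply has no slice in that pie.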

Ggplot Boxplot Code

To see how happiness scores differ by continent, a box plot was used to compare the score distribution of each continent. The following is the code for the box plot of happiness score by continent:

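library(ggplot2)

# Q1 is the data frame holding the Continent and Score columns
ggplot(Q1, aes(x = Continent, y = Score)) +
  geom_boxplot(aes(color = Continent), fill = "white") +  # outline colored by continent
  labs(title = "Boxplot of Happiness Score by Continent",
       x = "Continent",
       y = "Happiness Score")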

Ggplot Happiness Score by Continent

From the plot, Oceania sits at the higher end of the happiness score spectrum compared with the other continents. Africa, on the other hand, sits at the lower end.

Statistical analysis

From the five-number summary of the 5 continents, Europe has the widest spread of happiness scores (minimum 4.10, maximum 7.84). This matches the trend seen in the pie charts, where Europe also has countries spread across the levels of freedom. Oceania has the highest median (7.28), while Africa has the lowest (4.86). The Americas are skewed toward the top (right-skewed), while Asia and Oceania are skewed toward the bottom (left-skewed). Africa and Europe look closer to symmetric, although a five-number summary alone cannot confirm a normal distribution.
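
A per-continent five-number summary in the shape of the table that follows could be computed with dplyr as sketched here. The `toy` data frame and its values are invented for illustration; only the `Continent` and `Score` column names come from the box plot code.

```r
library(dplyr)

# Toy scores for two made-up groups; values are for illustration only.
toy <- data.frame(
  Continent = rep(c("A", "B"), each = 5),
  Score     = c(1, 2, 3, 4, 5, 2, 4, 6, 8, 10)
)

# One row per continent with minimum, quartiles, median, and maximum.
summary_tbl <- toy %>%
  group_by(Continent) %>%
  summarise(
    Min    = min(Score),
    Q1     = quantile(Score, 0.25, names = FALSE),
    Median = median(Score),
    Q3     = quantile(Score, 0.75, names = FALSE),
    Max    = max(Score)
  )

summary_tbl
```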

## # A tibble: 5 × 6
##   Continent   Min    Q1 Median    Q3   Max
##   <chr>     <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1 Africa     4.12  4.62   4.86  5.20  6.10
## 2 Americas   5.49  6.11   6.34  6.93  7.43
## 3 Asia       3.57  5.11   5.88  6.33  6.85
## 4 Europe     4.10  5.72   6.36  7.13  7.84
## 5 Oceania    7.12  7.22   7.28  7.31  7.33
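
The spread and skew statements can be checked directly from the table values. The sketch below re-enters the table and computes the range, the interquartile range, and a quartile-based skewness, (Q3 + Q1 - 2 * Median) / (Q3 - Q1). That skewness measure is an addition for illustration and was not part of the original analysis; the `five_num` name and the derived column names are hypothetical.

```r
# Values copied from the five-number summary table above.
five_num <- data.frame(
  Continent = c("Africa", "Americas", "Asia", "Europe", "Oceania"),
  Min    = c(4.12, 5.49, 3.57, 4.10, 7.12),
  Q1     = c(4.62, 6.11, 5.11, 5.72, 7.22),
  Median = c(4.86, 6.34, 5.88, 6.36, 7.28),
  Q3     = c(5.20, 6.93, 6.33, 7.13, 7.31),
  Max    = c(6.10, 7.43, 6.85, 7.84, 7.33)
)

# Overall spread and spread of the middle half of the scores.
five_num$Range <- five_num$Max - five_num$Min
five_num$IQR   <- five_num$Q3 - five_num$Q1

# Quartile skewness: positive when the upper half of the box is longer,
# negative when the lower half is longer, near zero when roughly symmetric.
five_num$QuartileSkew <-
  (five_num$Q3 + five_num$Q1 - 2 * five_num$Median) / five_num$IQR

print(five_num[, c("Continent", "Range", "IQR", "QuartileSkew")])
```

With these values Europe has the largest range, the Americas come out positive, and Asia and Oceania come out negative, which agrees with the skew directions described above. Note that this measure only looks at the middle half of the data and ignores the whiskers.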

Conclusion and Thanks for your time!

Based on the pie charts and box plots, there appears to be a trend between the perceived level of freedom and the happiness scores of the countries in each continent. Continents with medium to high levels of freedom also tend to have higher happiness scores. For example, Oceania has medium and high levels of freedom, and its happiness scores are concentrated at the top of the range. In contrast, the economic and political factors in the 3D scatter plot showed no direct correlation with happiness, which indicates a more complicated relationship between those aspects of a country and its happiness.