The World Happiness Report is a survey-based report first published in 2012. I chose to cover the 2019 edition, the most recent one available to me. The rankings and data come from the Gallup World Poll. Scores are based on the Cantril ladder, where the best possible life is a 10 and the worst possible life is a 0. Interesting note: “Since life would be very unpleasant in a country with the world’s lowest incomes, lowest life expectancy, lowest generosity, most corruption, least freedom and least social support, it is referred to as ‘Dystopia,’ in contrast to Utopia.”
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(treemap)
library(RColorBrewer)
Read the CSV file containing the World Happiness Report data and assign it to the variable happiness2019.
setwd("C:/Users/baise/OneDrive/Desktop/Baidata110summer")
happiness2019 <- read_csv("happiness2019.csv")
## Rows: 156 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Country or region
## dbl (8): Overall rank, Score, GDP per capita, Social support, Healthy life e...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
summary(happiness2019)
## Overall rank Country or region Score GDP per capita
## Min. : 1.00 Length:156 Min. :2.853 Min. :0.0000
## 1st Qu.: 39.75 Class :character 1st Qu.:4.545 1st Qu.:0.6028
## Median : 78.50 Mode :character Median :5.380 Median :0.9600
## Mean : 78.50 Mean :5.407 Mean :0.9051
## 3rd Qu.:117.25 3rd Qu.:6.184 3rd Qu.:1.2325
## Max. :156.00 Max. :7.769 Max. :1.6840
## Social support Healthy life expectancy Freedom to make life choices
## Min. :0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.056 1st Qu.:0.5477 1st Qu.:0.3080
## Median :1.272 Median :0.7890 Median :0.4170
## Mean :1.209 Mean :0.7252 Mean :0.3926
## 3rd Qu.:1.452 3rd Qu.:0.8818 3rd Qu.:0.5072
## Max. :1.624 Max. :1.1410 Max. :0.6310
## Generosity Perceptions of corruption
## Min. :0.0000 Min. :0.0000
## 1st Qu.:0.1087 1st Qu.:0.0470
## Median :0.1775 Median :0.0855
## Mean :0.1848 Mean :0.1106
## 3rd Qu.:0.2482 3rd Qu.:0.1412
## Max. :0.5660 Max. :0.4530
# Clean the column names of happiness2019:
# convert them to lowercase and replace spaces with underscores
names(happiness2019) <- tolower(names(happiness2019))
names(happiness2019) <- gsub(" ","_",names(happiness2019))
head(happiness2019)
## # A tibble: 6 × 9
## overall_rank country_or_region score gdp_per_capita social_support
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1 Finland 7.77 1.34 1.59
## 2 2 Denmark 7.6 1.38 1.57
## 3 3 Norway 7.55 1.49 1.58
## 4 4 Iceland 7.49 1.38 1.62
## 5 5 Netherlands 7.49 1.40 1.52
## 6 6 Switzerland 7.48 1.45 1.53
## # … with 4 more variables: healthy_life_expectancy <dbl>,
## # freedom_to_make_life_choices <dbl>, generosity <dbl>,
## # perceptions_of_corruption <dbl>
# Display the column names of the dataset
names(happiness2019)
## [1] "overall_rank" "country_or_region"
## [3] "score" "gdp_per_capita"
## [5] "social_support" "healthy_life_expectancy"
## [7] "freedom_to_make_life_choices" "generosity"
## [9] "perceptions_of_corruption"
plot1 <- happiness2019 %>%
  ggplot(aes(score, healthy_life_expectancy)) +
  geom_point() +
  labs(x = "Score", y = "Healthy Life Expectancy", title = "Score vs. Healthy Life Expectancy")
plot1
This is just a short summary of how happier countries tend to have a higher healthy life expectancy. I created a general plot comparing the score to the level of healthy life expectancy before showing the treemap of the dataset. There is a generally positive linear relationship, which supports the idea that countries or regions with a higher healthy life expectancy tend to have a higher happiness score. In the chart, countries with a score of about 4.5 or above tend to have a good healthy life expectancy.
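The positive relationship described above can also be checked numerically. The following is a minimal sketch, not part of the original analysis: the data frame `toy` and all of its values are hypothetical stand-ins, used only so that the example runs on its own. It computes the Pearson correlation and the slope of a simple linear fit.

```r
# Hypothetical stand-in values, NOT taken from happiness2019.csv.
# They only mimic the two columns used in plot1.
toy <- data.frame(
  score = c(3.0, 4.5, 5.5, 6.5, 7.5),
  healthy_life_expectancy = c(0.40, 0.60, 0.80, 0.90, 1.00)
)

# Pearson correlation between the two variables (between -1 and 1)
r <- cor(toy$score, toy$healthy_life_expectancy)

# Simple linear regression of healthy life expectancy on score
fit <- lm(healthy_life_expectancy ~ score, data = toy)
slope <- coef(fit)[["score"]]

print(r)
print(slope)
```

With the real data, `toy` would be replaced by `happiness2019`. A correlation close to 1 and a positive slope would be consistent with what the scatter plot suggests. Adding `geom_smooth(method = "lm")` to the ggplot call would draw the fitted line on the plot itself.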
# Create a treemap for happiness2019, using a color index that displays the countries and regions in order of numerical ranking
plot2 <- treemap(happiness2019, index = "country_or_region", vSize = "score", vColor = "overall_rank", type = "manual", palette = "RdYlBu")
## Summary of Data
I am really interested in using a variety of visual displays for this dataset. This display shows the names of the countries or regions in the dataset, progressing from the “happiest” places on the left to the “least happy” places on the right. I like how you can see the different countries and regions on this visual, although some are harder to read than others. Looking at this chart, the happiest countries are among the 20 wealthiest countries in the world. To me, the countries in the first thick red column (Finland, Denmark, Norway, Iceland, the Netherlands, Switzerland and Sweden) are all ranked among the world's 20 wealthiest countries. I wonder why the United States is not in that thick red column. Also, most of the top 10 happiest countries are in Europe, and I wonder why. I am curious to know what determines the happiness of a country: is it good health facilities and good infrastructure? Does a smaller population also help a country to be happier?
Looking at the least happy countries on the chart, they are mostly in Africa and Asia. I assume this is because these are among the poorest regions in the world, especially Africa. I can give an example from my own experience. I come from Africa, where there are limited or no job opportunities, poor health care facilities, poor infrastructure and poor education. There is little hope where I come from. People die from acute illnesses because they cannot afford health care expenses, and others cannot go to school because they cannot afford tuition. In most African countries, the highest salary that people with a master's degree can earn is around $100 a month. All of these factors contribute to a country not being happy.
It would be difficult to find some countries in North and South America at the bottom of the list, because they are much better off than the majority of African countries and some countries in Asia, such as India, which is one of the most populous countries in the world.
For a future representation, I would like to try grouping the countries into score bands (0-1, 1-2, and so on). Once these groups were created, I could use a treemap to compare the size of the highest-scoring group to the lowest-scoring group. Looking closely at the data, I thought it was interesting that the top-ranked places with the highest happiness ratings were Finland, Denmark, Norway, Iceland, the Netherlands, Switzerland and Sweden. On the opposite end, Malawi, Yemen, Rwanda and Tanzania are near the bottom. With this information, it would be interesting to look at specific statistics on these places to determine why they are viewed as happy or not. This could allow a researcher to look at numerical measures such as population income, way of living, cost of living, the regulations set by government, and other factors that could explain why these places rank where they do in this dataset.
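The grouping idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `happiness_sample` is a small stand-in data frame whose first six rows are copied from the `head()` output earlier in this report, while "Example A" and "Example B" are hypothetical placeholders added so that more than one band appears. The column names `score_band` and `n_countries` are also my own choices.

```r
library(dplyr)

# Stand-in data: the first six rows come from the head() output above.
# "Example A" and "Example B" are hypothetical placeholders, not real countries.
happiness_sample <- data.frame(
  country_or_region = c("Finland", "Denmark", "Norway", "Iceland",
                        "Netherlands", "Switzerland", "Example A", "Example B"),
  score = c(7.77, 7.60, 7.55, 7.49, 7.49, 7.48, 4.20, 3.10)
)

# Cut the score into one-point bands: [0,1), [1,2), ..., [9,10]
band_counts <- happiness_sample %>%
  mutate(score_band = cut(score, breaks = 0:10, right = FALSE,
                          include.lowest = TRUE)) %>%
  count(score_band, name = "n_countries")

print(band_counts)
```

With the full dataset, `happiness_sample` would be replaced by `happiness2019`. The resulting table could then be passed to `treemap()` with `index = "score_band"` and `vSize = "n_countries"`, so that the area of each tile reflects how many countries fall into that band.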
This project really widened my research and my thinking about this topic.