Hypothesis: Modern-day happiness is often assumed to rest on money, which is treated as the primary source of purpose, comfort, and pleasure. In other words, the belief is that money is the root of happiness. By examining the data below at a high level, we can assess how true that is, especially with respect to the USA. Happiness and satisfaction are then measured on a career basis for the USA alone, using similar methods. The primary goal of this analysis is thus to disprove the notion that wealth is the primary contributor to universal happiness.

Cleaning and prepping the data

Load Libraries

suppressMessages(suppressWarnings(library(tidyr)))
suppressMessages(suppressWarnings(library(dplyr)))
suppressMessages(suppressWarnings(library(ggplot2)))
suppressMessages(suppressWarnings(library(RCurl)))
suppressMessages(suppressWarnings(library(ggrepel)))
suppressMessages(suppressWarnings(library(XML)))
suppressMessages(suppressWarnings(library(rvest)))
twenty_fifteen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2015_Happiness.csv'))
twenty_sixteen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2016_Happiness.csv'))
twenty_seventeen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2017_Happiness.csv'))

globalOECDGDP = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/API_NY.GDP.PCAP.CD_DS2_en_excel_v2.csv'))

avgworldincome = 'https://www.worlddata.info/average-income.php'
avgworldincometable  = avgworldincome %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="tabsort"]') %>%
  html_table(fill=TRUE)

careersatisfaction = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/career_satisfaction_14_11_2017.csv'))

Global Happiness Reports: 2015-2017 (cleaning and prepping the data)

twenty_fifteen_happiness_sub = twenty_fifteen_happiness[,c('Country','Happiness.Rank')]
twenty_fifteen_happiness_sub['Year'] = 2015

twenty_sixteen_happiness_sub = twenty_sixteen_happiness[,c('Country','Happiness.Rank')]
twenty_sixteen_happiness_sub['Year'] = 2016

twenty_seventeen_happiness_sub = twenty_seventeen_happiness[,c('Country','Happiness.Rank')]
twenty_seventeen_happiness_sub['Year'] = 2017

longformat_twenty_fifteen_happiness = twenty_fifteen_happiness_sub %>%
  spread(Year,Happiness.Rank)

longformat_twenty_sixteen_happiness = twenty_sixteen_happiness_sub %>%
  spread(Year,Happiness.Rank)

longformat_twenty_seventeen_happiness = twenty_seventeen_happiness_sub %>%
  spread(Year,Happiness.Rank)


happiness_ranking_df = merge(longformat_twenty_fifteen_happiness,longformat_twenty_sixteen_happiness,by= 'Country')
happiness_ranking_df = merge(happiness_ranking_df,longformat_twenty_seventeen_happiness,'Country')

happiness_ranking_df = happiness_ranking_df %>%
  gather(Year, Rank,'2015':'2017')

top30countries = subset(happiness_ranking_df,Rank<=30)
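The spread/merge/gather round trip above ends in a long table with one row per country and year. As a rough sketch of an alternative (not the method used in this report), the yearly tables can be stacked directly. The data frames and country labels below are hypothetical toy data; note that `merge()` keeps only countries present in every year, so the sketch filters to match that behaviour.

```r
library(dplyr)

# Hypothetical toy data standing in for the three yearly reports.
y2015 <- data.frame(Country = c("A", "B", "C"), Happiness.Rank = c(1, 2, 3),
                    stringsAsFactors = FALSE)
y2016 <- data.frame(Country = c("A", "B", "D"), Happiness.Rank = c(2, 1, 3),
                    stringsAsFactors = FALSE)
y2017 <- data.frame(Country = c("A", "B", "C"), Happiness.Rank = c(1, 3, 2),
                    stringsAsFactors = FALSE)

# Stack the years into long format, tagging each row with its year.
stacked <- bind_rows(
  mutate(y2015, Year = "2015"),
  mutate(y2016, Year = "2016"),
  mutate(y2017, Year = "2017")
) %>%
  rename(Rank = Happiness.Rank)

# Keep only countries reported in all three years, which mirrors the
# inner-join behaviour of chaining merge() calls on 'Country'.
complete_only <- stacked %>%
  group_by(Country) %>%
  filter(n_distinct(Year) == 3) %>%
  ungroup()

print(complete_only)
```

Countries C and D are dropped here because each is missing from one year, which is the same thing the chained merges do silently.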

Mapping the top 30 countries in terms of Happiness Ranking and determining the top absolute changes year over year

countriessorted = happiness_ranking_df[with(happiness_ranking_df,order(Country,Year,Rank)),]
countriesyoy = countriessorted %>% mutate( chg = ifelse(Country == lag(Country),Rank - lag(Rank),0))
countriesyoy['Abs_Change'] = abs(countriesyoy['chg'])
avgyoychange = countriesyoy[,c('Country','Year','Abs_Change')] %>%
  spread(Year,Abs_Change)


avgyoychange[is.na(avgyoychange)] = 0

avgyoychange["Average Change Across Yrs"] = rowMeans(avgyoychange[,2:4])

toptenchanges = avgyoychange[order(avgyoychange$`Average Change Across Yrs`,decreasing = T),]
toptenchanges = head(toptenchanges,10)

purechanges = countriesyoy[which(countriesyoy$Country %in% toptenchanges$Country),]
purechanges = purechanges[,c('Country','Year','chg')] %>%
  spread(Year,chg)
purechanges["Average Change Across Yrs"] = rowMeans(purechanges[,2:4])
purechanges = purechanges[order(purechanges$`Average Change Across Yrs`),]
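The year-over-year logic above relies on sorting and comparing each row's country with the previous row's. A minimal sketch of the same idea using grouping, on hypothetical toy ranks, is shown below. One assumption differs from the code above: the sketch averages only the two observed changes per country, whereas the report averages across three year columns with 2015 set to 0.

```r
library(dplyr)

# Hypothetical ranks for two countries across three years.
ranks <- data.frame(
  Country = rep(c("A", "B"), each = 3),
  Year    = rep(c("2015", "2016", "2017"), times = 2),
  Rank    = c(10, 14, 11, 3, 3, 8),
  stringsAsFactors = FALSE
)

# lag() inside group_by() never reaches across countries, so the first
# year of each country gets NA rather than a cross-country difference.
yoy <- ranks %>%
  arrange(Country, Year) %>%
  group_by(Country) %>%
  mutate(chg = Rank - lag(Rank)) %>%
  ungroup()

# Average absolute change per country, ignoring the NA for the first year.
avg_abs_change <- yoy %>%
  group_by(Country) %>%
  summarise(avg_abs_chg = mean(abs(chg), na.rm = TRUE)) %>%
  arrange(desc(avg_abs_chg))

print(avg_abs_change)
```

Because every country is scaled the same way, dividing by two instead of three changes the averages but not the ordering of countries.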

What was the contributing factor based on the Happiness Index Data for the countries with the top 10 largest changes

twenty_fifteen_happiness_attr = twenty_fifteen_happiness[,-2:-5]
twenty_sixteen_happiness_attr = twenty_sixteen_happiness[,-2:-6]
twenty_seventeen_happiness_attr = twenty_seventeen_happiness[,-2:-5]

twenty_fifteen_attr_changes = twenty_fifteen_happiness_attr[which(twenty_fifteen_happiness_attr$Country %in% toptenchanges$Country),]
twenty_sixteen_attr_changes = twenty_sixteen_happiness_attr[which(twenty_sixteen_happiness_attr$Country %in% toptenchanges$Country),]
twenty_seventeen_attr_changes = twenty_seventeen_happiness_attr[which(twenty_seventeen_happiness_attr$Country %in% toptenchanges$Country),]

twenty_fifteen_attr_changes['Year'] = 2015
twenty_sixteen_attr_changes['Year'] = 2016
twenty_seventeen_attr_changes['Year'] = 2017

twenty_fifteen_attr_changes=twenty_fifteen_attr_changes[,-8]
twenty_sixteen_attr_changes = twenty_sixteen_attr_changes[,-8]
twenty_seventeen_attr_changes = twenty_seventeen_attr_changes[,-8]


newnamevector = c('Country','Economy','Family','Health','Freedom','Trust in Government','Generosity','Year')

colnames(twenty_fifteen_attr_changes) = newnamevector
colnames(twenty_sixteen_attr_changes) = newnamevector
colnames(twenty_seventeen_attr_changes) = newnamevector

longdffifteen = twenty_fifteen_attr_changes %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

longdfsixteen = twenty_sixteen_attr_changes %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

longdfseventeen = twenty_seventeen_attr_changes %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

merged_attr_df = merge(longdffifteen,longdfsixteen,by=c('Country','Attribute'))
merged_attr_df = merge(merged_attr_df,longdfseventeen,by=c('Country','Attribute'))
merged_attr_df = merged_attr_df %>%
  gather(Year,Factor,'2015':'2017')

attributessorted = merged_attr_df[with(merged_attr_df,order(Country,Attribute,Year)),]
attributesyoy = attributessorted %>% mutate( chg = ifelse(Country == lag(Country) & Attribute == lag(Attribute),Factor - lag(Factor),0))
avgyoyattrchange = attributesyoy[,c('Country','Attribute','Year','chg')] %>%
  spread(Year,chg)

avgyoyattrchange[is.na(avgyoyattrchange)] = 0
avgyoyattrchange["Average Change Across Yrs"] = rowMeans(avgyoyattrchange[,3:5])

LargestContributingAttr = avgyoyattrchange[,c('Country','Attribute','Average Change Across Yrs')] %>%
  spread(Attribute,'Average Change Across Yrs')

maxattributeeachcountry = avgyoyattrchange[,c('Country','Attribute','Average Change Across Yrs')] %>% group_by(Country) %>% top_n(1,abs(`Average Change Across Yrs`))
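The last step picks, for each country, the attribute whose average change is largest in absolute value. A small sketch of that selection on hypothetical toy values follows; it uses `slice_max()`, which newer versions of dplyr (1.0.0 and later) offer in place of `top_n()`. The column name `avg_chg` is an illustrative stand-in for `Average Change Across Yrs`.

```r
library(dplyr)

# Hypothetical average changes for two countries and three attributes.
avg_changes <- data.frame(
  Country   = rep(c("A", "B"), each = 3),
  Attribute = rep(c("Economy", "Family", "Freedom"), times = 2),
  avg_chg   = c(0.05, -0.09, 0.02, 0.10, 0.03, -0.04),
  stringsAsFactors = FALSE
)

# One row per country: the attribute with the largest absolute change.
# with_ties = FALSE guarantees a single row even when two attributes tie.
top_attr <- avg_changes %>%
  group_by(Country) %>%
  slice_max(abs(avg_chg), n = 1, with_ties = FALSE) %>%
  ungroup()

print(top_attr)
```

Ranking on the absolute value keeps large declines, such as the -0.09 for country A, in contention alongside large gains.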

Examining the USA: Career Satisfaction and Global Income Standing

twenty_fifteen_happiness_USA = twenty_fifteen_happiness[,-2:-5]
twenty_sixteen_happiness_USA = twenty_sixteen_happiness[,-2:-6]
twenty_seventeen_happiness_USA = twenty_seventeen_happiness[,-2:-5]

twenty_fifteen_happiness_USA['Year'] = 2015
twenty_sixteen_happiness_USA['Year'] = 2016
twenty_seventeen_happiness_USA['Year'] = 2017

twenty_fifteen_happiness_USA=twenty_fifteen_happiness_USA[,-8]
twenty_sixteen_happiness_USA = twenty_sixteen_happiness_USA[,-8]
twenty_seventeen_happiness_USA = twenty_seventeen_happiness_USA[,-8]

colnames(twenty_fifteen_happiness_USA) = newnamevector
colnames(twenty_sixteen_happiness_USA) = newnamevector
colnames(twenty_seventeen_happiness_USA) = newnamevector


USAdffifteen = twenty_fifteen_happiness_USA %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

USAdfsixteen = twenty_sixteen_happiness_USA %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

USAdfseventeen = twenty_seventeen_happiness_USA %>%
  gather(Attribute,Factor,Economy:Generosity) %>%
  spread(Year,Factor)

merged_attr_USA = merge(USAdffifteen,USAdfsixteen,by=c('Country','Attribute'))
merged_attr_USA = merge(merged_attr_USA,USAdfseventeen,by=c('Country','Attribute'))
merged_attr_USA = merged_attr_USA %>%
  gather(Year,Factor,'2015':'2017')

attributesUSA = merged_attr_USA[which(merged_attr_USA$Country == "United States"),]
attributesUSA= attributesUSA[with(attributesUSA,order(Country,Attribute,Year)),]
attributesUSAyoy = attributesUSA %>% mutate( chg = ifelse(Country == lag(Country) & Attribute == lag(Attribute),Factor - lag(Factor),0))
avgyoyUSAchange = attributesUSAyoy[,c('Country','Attribute','Year','chg')] %>%
  spread(Year,chg)

avgyoyUSAchange[is.na(avgyoyUSAchange)] = 0
avgyoyUSAchange["Average Change Across Yrs"] = rowMeans(avgyoyUSAchange[,3:5])

Career Satisfaction

careersatisfactionUSA = careersatisfaction

factoraverages = colMeans(careersatisfactionUSA[6:12])
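# Mean salary across the careers that report a salary
salary_average = mean(careersatisfaction[!is.na(careersatisfaction$Salary....USD.),"Salary....USD."])

# Keep careers with a reported salary, sorted from highest to lowest paid (column 4 is Salary....USD.)
salarymapping = careersatisfaction[!is.na(careersatisfaction$Salary....USD.),]
salarymapping = salarymapping[order(-salarymapping[,4]),]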

careersatisfactionUSA[is.na(careersatisfactionUSA)]=0


heatmaptable = salarymapping[,-2:-6]

heatmaptable = heatmaptable[1:50,] %>%
  gather(Attribute,Factor, Fit:Salary)

analysistableheat = salarymapping[,c(-2,-3,-5)]
analysistableheat = analysistableheat[1:50,]

Average Global Income

avgworldincomedf = avgworldincometable[[1]]
colnames(avgworldincomedf) = avgworldincomedf[1,]
avgworldincomedf = avgworldincomedf[-1,-5]
row.names(avgworldincomedf) = avgworldincomedf[,1]
avgworldincomedf[] = lapply(avgworldincomedf,gsub,pattern = ',',replacement = '')
avgworldincomedf[] = lapply(avgworldincomedf,gsub,pattern = '\\$',replacement = '')
avgworldincomedf$`Average incomeannually` = as.numeric(avgworldincomedf$`Average incomeannually`)
avgworldincomedf$monthly = as.numeric(avgworldincomedf$monthly)

USAglobalincomerank = avgworldincomedf[which(avgworldincomedf$Country == 'United States'),]
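The cleaning above strips thousands separators and dollar signs before converting to numeric. A compact sketch of the same step as a reusable helper is shown below; `clean_currency` and the sample strings are hypothetical and only illustrate the pattern.

```r
# Hypothetical helper: remove '$' and ',' in one pass, then convert.
clean_currency <- function(x) {
  as.numeric(trimws(gsub("[$,]", "", x)))
}

# Illustrative strings in the kinds of formats a scraped table might contain.
raw_income <- c("65,850 $", "$4,120", "980")

clean_currency(raw_income)
```

Applying the helper only to the income columns, rather than to every column with `lapply`, would leave the country names untouched by the substitution.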

Global OECD GDP Data

globalOECDGDP[is.na(globalOECDGDP)] = 0

globalOECDGDPparsedfourteen = globalOECDGDP[,c("Country.Name","X2014")]
globalOECDGDPparsedfifteen = globalOECDGDP[,c("Country.Name","X2015")]
globalOECDGDPparsedsixteen = globalOECDGDP[,c("Country.Name","X2016")]


# Sort each year's table from the highest to the lowest GDP value
globalOECDGDPparsedfourteen = globalOECDGDPparsedfourteen[order(-globalOECDGDPparsedfourteen$X2014),]
globalOECDGDPparsedfifteen = globalOECDGDPparsedfifteen[order(-globalOECDGDPparsedfifteen$X2015),]
globalOECDGDPparsedsixteen = globalOECDGDPparsedsixteen[order(-globalOECDGDPparsedsixteen$X2016),]

toptentableoecd = merge(globalOECDGDPparsedfourteen,globalOECDGDPparsedfifteen,by = 'Country.Name')
toptentableoecd = merge(toptentableoecd,globalOECDGDPparsedsixteen,by = 'Country.Name')
toptentableoecd = toptentableoecd[with(toptentableoecd,order(-X2016,-X2015,-X2014)),]
toptentableoecdfinal = toptentableoecd[1:10,]

tennamescols = c('Countries','2014','2015','2016')

colnames(toptentableoecdfinal) = tennamescols

toptentableoecdfinal = toptentableoecdfinal %>%
  gather(Year,Average_GDP,`2014`:`2016`)

Analysis

Below is the mapping of the top 30 countries by happiness ranking from 2015 to 2017. There are some major shifts: some countries lost their standing within the top 30 entirely, while others gained a position. For instance, the Czech Republic was not in the top 30 in 2015 but made it in 2016 and 2017. The USA ranked 15th, 13th, and 14th across 2015, 2016, and 2017.

Analysis: Top 30 Ranked Countries from 2015-2017

At the macro level, it is reasonable to ask which factors contributed most for the countries that experienced the largest shifts in ranking. Taking the average absolute year-over-year change in rank for each country and then mapping the results back to the signed year-over-year changes, I selected the 10 countries that experienced the greatest shifts.

I then included the data set’s happiness factors and mapped them accordingly. This is the first indication that money is not the root of happiness: economic sentiment did not have the greatest impact for 70% of the countries mapped. In fact, it was often the compounding effect of all the attributes that changed in a given year that caused the shift in rankings.

Ten countries with largest Rank shifts: Contributing Factors

Attributes: Greatest Average Changes Across 2015-2017 for the Ten Selected Countries

print(maxattributeeachcountry)
## # A tibble: 10 x 3
## # Groups:   Country [10]
##      Country  Attribute `Average Change Across Yrs`
##       <fctr>      <chr>                       <dbl>
##  1   Algeria    Economy                  0.05085816
##  2  Bulgaria     Family                  0.10941315
##  3     Egypt     Family                  0.08349046
##  4   Hungary    Economy                  0.05502398
##  5    Latvia     Family                  0.10303165
##  6   Liberia Generosity                 -0.06822392
##  7   Nigeria     Family                  0.10381683
##  8   Romania     Family                  0.08807043
##  9 Venezuela    Freedom                 -0.09169429
## 10    Zambia    Economy                  0.05534226
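As a quick check of the 70% figure, the top attribute for each country can be transcribed from the printed tibble above and tallied. The data frame name `top_attr_printed` is just a label for this sketch.

```r
# Top attribute per country, transcribed from the printed tibble above.
top_attr_printed <- data.frame(
  Country = c("Algeria", "Bulgaria", "Egypt", "Hungary", "Latvia",
              "Liberia", "Nigeria", "Romania", "Venezuela", "Zambia"),
  Attribute = c("Economy", "Family", "Family", "Economy", "Family",
                "Generosity", "Family", "Family", "Freedom", "Economy"),
  stringsAsFactors = FALSE
)

# How often each attribute is the largest mover.
attr_counts <- table(top_attr_printed$Attribute)
print(attr_counts)

# Share of countries whose largest mover is something other than Economy.
share_not_economy <- mean(top_attr_printed$Attribute != "Economy")
print(share_not_economy)
```

Economy is the largest mover for three of the ten countries (Algeria, Hungary, and Zambia), which leaves seven of ten, or 70%, where another attribute moved more.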

Examining the USA specifically, changes in economic sentiment did not have the greatest impact on the rank shifts experienced between 2015 and 2017. The most affected factor was generosity, which declined sharply over the period, while trust in government experienced the highest positive average change.

USA: Contributing Factors

USA: Average Changes 2015-2017

print(avgyoyUSAchange)
##         Country           Attribute 2015     2016         2017
## 1 United States             Economy    0  0.11345  0.038299284
## 2 United States              Family    0 -0.19929  0.372100564
## 3 United States             Freedom    0 -0.06441  0.024110523
## 4 United States          Generosity    0  0.00972 -0.275131212
## 5 United States              Health    0 -0.08279 -0.004713372
## 6 United States Trust in Government    0 -0.01022  0.243898781
##   Average Change Across Yrs
## 1                0.05058309
## 2                0.05760352
## 3               -0.01343316
## 4               -0.08847040
## 5               -0.02916779
## 6                0.07789293
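The averages in the table above are taken over three year columns, with 2015 fixed at 0 because it has no prior year. The sketch below transcribes the printed changes, reproduces those averages, and, as an alternative that is an assumption of this sketch rather than part of the analysis, averages only the two observed changes.

```r
# Year-over-year changes for the USA, transcribed from the printed table above.
usa <- data.frame(
  Attribute = c("Economy", "Family", "Freedom", "Generosity", "Health",
                "Trust in Government"),
  chg_2016 = c(0.11345, -0.19929, -0.06441, 0.00972, -0.08279, -0.01022),
  chg_2017 = c(0.038299284, 0.372100564, 0.024110523, -0.275131212,
               -0.004713372, 0.243898781),
  stringsAsFactors = FALSE
)

# Average as printed: three columns, with the 2015 column equal to 0.
usa$avg_three_cols <- (0 + usa$chg_2016 + usa$chg_2017) / 3

# Alternative: average only the two observed changes.
usa$avg_two_changes <- (usa$chg_2016 + usa$chg_2017) / 2

# Attributes ordered by the size of the average change, largest first.
print(usa[order(-abs(usa$avg_two_changes)), c("Attribute", "avg_two_changes")])
```

The two averages differ only by a constant factor of 3/2, so the ordering of attributes is the same either way: generosity shows the largest average change in absolute terms, and trust in government the largest positive one.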

Keeping the hypothesis in mind, I then examine the USA on a micro level using career data; this, to me, is the easiest way to determine whether money is the root of happiness. It essentially tests the logic that a good job is one that pays a lot of money.

The heat map below, however, contradicts that logic: the salary factor has the lowest impact on overall job satisfaction, while fit plays the greatest role, followed by interest and environment.

Heat Map of Job Happiness USA

Overall Scores: Career Satisfaction for the Top 50 paying jobs

print(analysistableheat)
##                                          Name Salary....USD. Overall Fit
## 218                              Psychiatrist         181880     3.4 3.9
## 116                           Chief Executive         173320     3.9 4.4
## 370                    Sustainability Officer         173320     3.6 4.0
## 209                              Pediatrician         163350     4.0 4.3
## 7                                     Dentist         149540     3.0 3.5
## 471                        Petroleum Engineer         130050     3.0 3.6
## 387                         Marketing Manager         127130     3.2 3.8
## 96                     Air Traffic Controller         122340     3.3 3.8
## 49                                 Pharmacist         120950     2.8 3.5
## 338             Clinical Research Coordinator         120050     2.9 3.7
## 92                                      Pilot         118140     3.9 4.1
## 321                         Financial Manager         115320     3.0 3.8
## 334                       Bank Branch Manager         115320     2.8 3.8
## 364                                 Treasurer         115320     3.1 3.7
## 33                                     Lawyer         114970     2.7 3.4
## 274                             Sales Manager         110660     2.8 3.7
## 428                                 Physicist         109600     3.6 3.9
## 143                Computer Hardware Engineer         108430     3.6 3.9
## 192 Computer & Information Research Scientist         108360     3.5 3.9
## 266                        Purchasing Manager         106090     3.0 3.6
## 77                                 Astronomer         105410     4.0 3.9
## 123                        Aerospace Engineer         105380     3.6 4.1
## 339                        Compliance Manager         105060     2.7 3.5
## 372                      Supply Chain Manager         105060     3.0 3.7
## 376                   Investment Fund Manager         105060     3.2 3.8
## 68                        Political Scientist         104920     3.4 3.9
## 225                             Mathematician         103720     3.7 3.9
## 366                   Human Resources Manager         102780     3.2 3.9
## 354                       Fundraising Manager         101510     2.9 3.6
## 51                                Optometrist         101410     2.8 3.7
## 245                          Nuclear Engineer         100470     3.0 3.9
## 490                        Operations Manager          97270     3.2 3.9
## 78                          Chemical Engineer          96940     3.0 3.6
## 121                       Advertising Manager          96720     3.1 3.8
## 358                    Green Product Marketer          96720     3.5 3.9
## 329                                   Actuary          96700     2.6 3.4
## 272                            Sales Engineer          96340     2.9 3.6
## 452                       Physician Assistant          95820     3.1 3.8
## 316                                 Economist          95710     3.1 3.6
## 416                         Software Engineer          95510     3.2 3.8
## 431                        Nurse Practitioner          95350     3.0 3.8
## 231                         Robotics Engineer          94240     4.3 4.4
## 397                   Nanotechnology Engineer          94240     4.1 4.2
## 406                      Biochemical Engineer          94240     3.5 3.9
## 222                           Naval Architect          92930     3.3 3.9
## 489                           Marine Engineer          92930     3.6 3.8
## 165                   Health Services Manager          92810     3.0 3.8
## 190             Industrial Production Manager          92470     2.9 3.8
## 388                       Materials Scientist          91980     3.2 3.7
## 37                        Electrical Engineer          91410     3.2 3.7
##     Work.Env Interest Skill.Util Meaning Salary
## 218      3.7      3.8        3.5     3.2    3.9
## 116      4.0      4.3        4.1     3.7    3.5
## 370      3.7      4.1        3.3     3.6    3.4
## 209      3.8      4.1        4.0     4.1    4.1
## 7        3.3      3.3        3.2     2.9    3.2
## 471      3.4      3.7        3.0     3.0    3.7
## 387      3.5      3.6        3.2     2.8    3.2
## 96       3.8      3.9        3.3     3.2    3.8
## 49       3.1      3.2        2.8     2.8    3.2
## 338      3.5      3.7        2.9     2.9    2.8
## 92       3.9      4.4        3.9     3.7    3.4
## 321      3.4      3.4        3.1     2.6    3.3
## 334      3.4      3.2        3.0     2.6    3.1
## 364      3.6      3.4        3.0     2.9    3.0
## 33       3.0      3.4        3.1     2.6    3.3
## 274      3.3      3.2        2.8     2.4    3.1
## 428      3.7      4.1        3.8     3.5    3.0
## 143      3.7      4.0        3.6     3.3    3.4
## 192      3.8      3.8        3.5     3.4    3.1
## 266      3.5      3.3        3.0     2.6    3.2
## 77       4.1      4.3        3.9     3.9    3.3
## 123      3.8      4.1        3.5     3.3    3.6
## 339      3.2      3.2        2.9     2.6    3.2
## 372      3.3      3.4        3.0     2.6    3.3
## 376      3.5      3.6        3.3     2.9    3.9
## 68       3.5      4.1        3.4     3.5    2.9
## 225      3.8      4.0        3.7     3.6    3.2
## 366      3.6      3.5        3.2     2.9    3.4
## 354      3.4      3.4        3.0     3.3    2.9
## 51       3.5      3.3        3.2     2.7    3.4
## 245      3.4      4.0        3.6     3.0    3.6
## 490      3.6      3.6        3.3     2.8    3.2
## 78       3.3      3.6        3.2     2.9    3.5
## 121      3.5      3.6        3.1     2.6    3.0
## 358      3.7      3.8        3.0     3.2    2.6
## 329      3.3      3.0        2.9     2.1    3.8
## 272      3.4      3.4        2.9     2.5    3.2
## 452      3.6      4.0        3.4     3.5    3.7
## 316      3.4      3.5        3.0     2.9    3.3
## 416      3.6      3.6        3.3     2.8    3.6
## 431      3.5      4.1        3.6     3.8    3.9
## 231      4.3      4.6        4.3     4.2    3.7
## 397      4.1      4.3        4.1     4.0    3.5
## 406      3.8      4.0        3.8     3.6    3.5
## 222      3.8      3.9        3.3     3.1    3.3
## 489      3.7      4.2        3.5     3.4    3.8
## 165      3.5      3.6        3.1     3.0    3.1
## 190      3.3      3.5        3.0     2.7    3.3
## 388      3.6      3.8        3.4     3.2    3.4
## 37       3.5      3.6        3.1     3.0    3.4
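One way to probe which factor tracks overall satisfaction most closely is to correlate each factor score with the Overall score. The sketch below does this for the first ten rows of the printed table only, so it is illustrative rather than a result for the full data; the object names are labels for this sketch.

```r
# First ten rows of the printed table above (the ten highest-paying careers).
top10 <- data.frame(
  Overall    = c(3.4, 3.9, 3.6, 4.0, 3.0, 3.0, 3.2, 3.3, 2.8, 2.9),
  Fit        = c(3.9, 4.4, 4.0, 4.3, 3.5, 3.6, 3.8, 3.8, 3.5, 3.7),
  Work.Env   = c(3.7, 4.0, 3.7, 3.8, 3.3, 3.4, 3.5, 3.8, 3.1, 3.5),
  Interest   = c(3.8, 4.3, 4.1, 4.1, 3.3, 3.7, 3.6, 3.9, 3.2, 3.7),
  Skill.Util = c(3.5, 4.1, 3.3, 4.0, 3.2, 3.0, 3.2, 3.3, 2.8, 2.9),
  Meaning    = c(3.2, 3.7, 3.6, 4.1, 2.9, 3.0, 2.8, 3.2, 2.8, 2.9),
  Salary     = c(3.9, 3.5, 3.4, 4.1, 3.2, 3.7, 3.2, 3.8, 3.2, 2.8)
)

factors <- setdiff(names(top10), "Overall")

# Mean score per factor: how highly each factor is rated on average.
factor_means <- colMeans(top10[, factors])

# Correlation of each factor with Overall: how closely it tracks satisfaction.
factor_cors <- sapply(factors, function(f) cor(top10[[f]], top10$Overall))

print(round(factor_means, 2))
print(round(sort(factor_cors, decreasing = TRUE), 2))
```

A high mean score and a strong correlation answer different questions, so comparing the two on the full 50-row table would help show whether a factor is merely rated highly or actually moves with overall satisfaction.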

Using two additional data sets, OECD World GDP Data and World Average Annual Income Data, I further solidify the point that money is not the root of happiness.

The United States ranks 9th in the World Average Income Data, higher than its global happiness ranking, and 8th in the OECD 2016 data (8th in 2015 and 13th in 2014).

Despite the higher economic rankings, the USA's happiness ranking stood at 13th in 2016 and 14th in 2017.
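The comparison above sets an economic rank against a happiness rank. A minimal sketch of how such a rank gap could be computed is shown below; the country names, GDP values, and happiness ranks are all hypothetical placeholders, and missing GDP values are left unranked here rather than being set to 0.

```r
# Hypothetical GDP values for four made-up countries.
gdp <- data.frame(
  Country.Name = c("Aland", "Borduria", "Carpania", "Dorne"),
  X2016 = c(52000, NA, 61000, 48000),
  stringsAsFactors = FALSE
)

# Rank 1 = highest GDP value; countries with missing data stay NA.
gdp$gdp_rank <- rank(-gdp$X2016, ties.method = "min", na.last = "keep")

# Hypothetical happiness ranks, looked up by country name.
happiness_rank <- c(Aland = 5, Carpania = 1, Dorne = 2)
gdp$happiness_rank <- unname(happiness_rank[gdp$Country.Name])

# Positive gap: the country ranks worse on happiness than on GDP.
gdp$rank_gap <- gdp$happiness_rank - gdp$gdp_rank

print(gdp)
```

Keeping missing values as NA avoids ranking countries without data as if they had a GDP of zero, although with a descending sort they would land at the bottom either way.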

Average Annual Global Income: Top 50 countries

OECD GDP Mapping

Directionally Correct Salary Comparison

Differenceinincomesources = USAglobalincomerank$`Average incomeannually` - salary_average
print(paste("There is only a",abs(round(Differenceinincomesources,0)),"dollar difference in the average salaries calculated in the Average Annual World Income Data and Career Satisfaction data for the USA"))
## [1] "There is only a 584 dollar difference in the average salaries calculated in the Average Annual World Income Data and Career Satisfaction data for the USA"

Conclusion: The data strongly suggest that money is not the root of happiness; economic sentiment does not have the greatest impact on overall happiness at either the macro or the micro scale. However, it is important to note that this analysis only scratches the surface, and that a deeper analysis of the complexities of human happiness is warranted based on these preliminary findings.