Hypothesis: Modern-day happiness is often assumed to rest on money as the primary source of purpose, comfort, pleasure, and so on. The belief is that money is essentially the root of happiness. By examining the data below at a high level, we can assess how true that is, especially with respect to the USA. Happiness/satisfaction is then measured on a career basis for the USA alone using similar methods. The primary goal of this analysis is thus to disprove the notion that wealth is the primary contributor to universal happiness.
Cleaning and prepping the data
Load Libraries
suppressMessages(suppressWarnings(library(tidyr)))
suppressMessages(suppressWarnings(library(dplyr)))
suppressMessages(suppressWarnings(library(ggplot2)))
suppressMessages(suppressWarnings(library(RCurl)))
suppressMessages(suppressWarnings(library(ggrepel)))
suppressMessages(suppressWarnings(library(XML)))
suppressMessages(suppressWarnings(library(rvest)))
twenty_fifteen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2015_Happiness.csv'))
twenty_sixteen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2016_Happiness.csv'))
twenty_seventeen_happiness = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/2017_Happiness.csv'))
globalOECDGDP = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/API_NY.GDP.PCAP.CD_DS2_en_excel_v2.csv'))
avgworldincome = 'https://www.worlddata.info/average-income.php'
avgworldincometable = avgworldincome %>%
read_html() %>%
html_nodes(xpath='//*[@id="tabsort"]') %>%
html_table(fill=TRUE)
careersatisfaction = read.csv(text = getURL('https://raw.githubusercontent.com/manonfire86/FinalProject/master/career_satisfaction_14_11_2017.csv'))
Global Happiness Reports: 2015-2017 (cleaning and prepping the data)
twenty_fifteen_happiness_sub = twenty_fifteen_happiness[,c('Country','Happiness.Rank')]
twenty_fifteen_happiness_sub['Year'] = 2015
twenty_sixteen_happiness_sub = twenty_sixteen_happiness[,c('Country','Happiness.Rank')]
twenty_sixteen_happiness_sub['Year'] = 2016
twenty_seventeen_happiness_sub = twenty_seventeen_happiness[,c('Country','Happiness.Rank')]
twenty_seventeen_happiness_sub['Year'] = 2017
longformat_twenty_fifteen_happiness = twenty_fifteen_happiness_sub %>%
spread(Year,Happiness.Rank)
longformat_twenty_sixteen_happiness = twenty_sixteen_happiness_sub %>%
spread(Year,Happiness.Rank)
longformat_twenty_seventeen_happiness = twenty_seventeen_happiness_sub %>%
spread(Year,Happiness.Rank)
happiness_ranking_df = merge(longformat_twenty_fifteen_happiness,longformat_twenty_sixteen_happiness,by= 'Country')
happiness_ranking_df = merge(happiness_ranking_df,longformat_twenty_seventeen_happiness,'Country')
happiness_ranking_df = happiness_ranking_df %>%
gather(Year, Rank,'2015':'2017')
top30countries = subset(happiness_ranking_df,Rank<=30)
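The spread, merge, and gather round trip above ends with one row per country and year, restricted to countries present in all three reports. A shorter route to the same shape is sketched below as an alternative, not as the method used in this analysis. The data frames `y2015`, `y2016`, and `y2017` and their values are made-up stand-ins for the three `_sub` data frames.

```r
library(dplyr)

# Hypothetical stand-ins for the three *_sub data frames (values are made up).
y2015 <- data.frame(Country = c("A", "B", "C"), Happiness.Rank = c(1, 2, 3), Year = 2015)
y2016 <- data.frame(Country = c("A", "B"),      Happiness.Rank = c(2, 1),    Year = 2016)
y2017 <- data.frame(Country = c("A", "B", "D"), Happiness.Rank = c(1, 3, 2), Year = 2017)

# Stack the years, then keep only countries that appear in all three,
# which has the same effect as the two inner merges on Country.
ranking_long <- bind_rows(y2015, y2016, y2017) %>%
  rename(Rank = Happiness.Rank) %>%
  group_by(Country) %>%
  filter(n_distinct(Year) == 3) %>%
  ungroup() %>%
  arrange(Country, Year)

print(ranking_long)
```

One difference to keep in mind: Year stays numeric here, whereas gather() turns the former column names into character values.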
Mapping the top 30 countries in terms of Happiness Ranking and determining the top absolute changes year over year
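# Sort by country and year so that lag() compares consecutive years of the same country
countriessorted = happiness_ranking_df[with(happiness_ranking_df,order(Country,Year,Rank)),]
# Year-over-year rank change; 0 for each country's first year (NA on the very first row, zeroed further down)
countriesyoy = countriessorted %>% mutate( chg = ifelse(Country == lag(Country),Rank - lag(Rank),0))
countriesyoy['Abs_Change'] = abs(countriesyoy['chg'])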
avgyoychange = countriesyoy[,c('Country','Year','Abs_Change')] %>%
spread(Year,Abs_Change)
avgyoychange[is.na(avgyoychange)] = 0
avgyoychange["Average Change Across Yrs"] = rowMeans(avgyoychange[,2:4])
toptenchanges = avgyoychange[order(avgyoychange$`Average Change Across Yrs`,decreasing = T),]
toptenchanges = head(toptenchanges,10)
purechanges = countriesyoy[which(countriesyoy$Country %in% toptenchanges$Country),]
purechanges = purechanges[,c('Country','Year','chg')] %>%
spread(Year,chg)
purechanges["Average Change Across Yrs"] = rowMeans(purechanges[,2:4])
purechanges = purechanges[order(purechanges$`Average Change Across Yrs`),]
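The year-over-year logic above can also be written with a grouped lag, which avoids comparing each row's country with the previous row by hand. This is a minimal sketch on made-up ranks for two hypothetical countries, not the code that produced the results in this report.

```r
library(dplyr)

# Made-up ranks for two hypothetical countries.
ranks <- data.frame(
  Country = rep(c("A", "B"), each = 3),
  Year    = rep(2015:2017, times = 2),
  Rank    = c(10, 14, 9, 3, 3, 4)
)

avg_abs_change <- ranks %>%
  group_by(Country) %>%
  arrange(Year, .by_group = TRUE) %>%
  mutate(chg = Rank - lag(Rank)) %>%                      # NA for each country's first year
  summarise(avg_abs_chg = mean(abs(chg), na.rm = TRUE))   # average over the two real changes

print(avg_abs_change)
```

Note that the code above averages over three columns, one of which (2015) is always zero, so its averages are two thirds of what this sketch returns. The ordering of countries, and therefore the top ten, is the same either way.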
Contributing factors, based on the Happiness Index data, for the ten countries with the largest changes
twenty_fifteen_happiness_attr = twenty_fifteen_happiness[,-2:-5]
twenty_sixteen_happiness_attr = twenty_sixteen_happiness[,-2:-6]
twenty_seventeen_happiness_attr = twenty_seventeen_happiness[,-2:-5]
twenty_fifteen_attr_changes = twenty_fifteen_happiness_attr[which(twenty_fifteen_happiness_attr$Country %in% toptenchanges$Country),]
twenty_sixteen_attr_changes = twenty_sixteen_happiness_attr[which(twenty_sixteen_happiness_attr$Country %in% toptenchanges$Country),]
twenty_seventeen_attr_changes = twenty_seventeen_happiness_attr[which(twenty_seventeen_happiness_attr$Country %in% toptenchanges$Country),]
twenty_fifteen_attr_changes['Year'] = 2015
twenty_sixteen_attr_changes['Year'] = 2016
twenty_seventeen_attr_changes['Year'] = 2017
twenty_fifteen_attr_changes=twenty_fifteen_attr_changes[,-8]
twenty_sixteen_attr_changes = twenty_sixteen_attr_changes[,-8]
twenty_seventeen_attr_changes = twenty_seventeen_attr_changes[,-8]
newnamevector = c('Country','Economy','Family','Health','Freedom','Trust in Government','Generosity','Year')
colnames(twenty_fifteen_attr_changes) = newnamevector
colnames(twenty_sixteen_attr_changes) = newnamevector
colnames(twenty_seventeen_attr_changes) = newnamevector
longdffifteen = twenty_fifteen_attr_changes %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
longdfsixteen = twenty_sixteen_attr_changes %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
longdfseventeen = twenty_seventeen_attr_changes %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
merged_attr_df = merge(longdffifteen,longdfsixteen,by=c('Country','Attribute'))
merged_attr_df = merge(merged_attr_df,longdfseventeen,by=c('Country','Attribute'))
merged_attr_df = merged_attr_df %>%
gather(Year,Factor,'2015':'2017')
attributessorted = merged_attr_df[with(merged_attr_df,order(Country,Attribute,Year)),]
attributesyoy = attributessorted %>% mutate( chg = ifelse(Country == lag(Country) & Attribute == lag(Attribute),Factor - lag(Factor),0))
avgyoyattrchange = attributesyoy[,c('Country','Attribute','Year','chg')] %>%
spread(Year,chg)
avgyoyattrchange[is.na(avgyoyattrchange)] = 0
avgyoyattrchange["Average Change Across Yrs"] = rowMeans(avgyoyattrchange[,3:5])
LargestContributingAttr = avgyoyattrchange[,c('Country','Attribute','Average Change Across Yrs')] %>%
spread(Attribute,'Average Change Across Yrs')
maxattributeeachcountry = avgyoyattrchange[,c('Country','Attribute','Average Change Across Yrs')] %>% group_by(Country) %>% top_n(1,abs(`Average Change Across Yrs`))
Examining the USA: Career Satisfaction and Global Income Standing
twenty_fifteen_happiness_USA = twenty_fifteen_happiness[,-2:-5]
twenty_sixteen_happiness_USA = twenty_sixteen_happiness[,-2:-6]
twenty_seventeen_happiness_USA = twenty_seventeen_happiness[,-2:-5]
twenty_fifteen_happiness_USA['Year'] = 2015
twenty_sixteen_happiness_USA['Year'] = 2016
twenty_seventeen_happiness_USA['Year'] = 2017
twenty_fifteen_happiness_USA=twenty_fifteen_happiness_USA[,-8]
twenty_sixteen_happiness_USA = twenty_sixteen_happiness_USA[,-8]
twenty_seventeen_happiness_USA = twenty_seventeen_happiness_USA[,-8]
colnames(twenty_fifteen_happiness_USA) = newnamevector
colnames(twenty_sixteen_happiness_USA) = newnamevector
colnames(twenty_seventeen_happiness_USA) = newnamevector
USAdffifteen = twenty_fifteen_happiness_USA %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
USAdfsixteen = twenty_sixteen_happiness_USA %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
USAdfseventeen = twenty_seventeen_happiness_USA %>%
gather(Attribute,Factor,Economy:Generosity) %>%
spread(Year,Factor)
merged_attr_USA = merge(USAdffifteen,USAdfsixteen,by=c('Country','Attribute'))
merged_attr_USA = merge(merged_attr_USA,USAdfseventeen,by=c('Country','Attribute'))
merged_attr_USA = merged_attr_USA %>%
gather(Year,Factor,'2015':'2017')
attributesUSA = merged_attr_USA[which(merged_attr_USA$Country == "United States"),]
attributesUSA= attributesUSA[with(attributesUSA,order(Country,Attribute,Year)),]
attributesUSAyoy = attributesUSA %>% mutate( chg = ifelse(Country == lag(Country) & Attribute == lag(Attribute),Factor - lag(Factor),0))
avgyoyUSAchange = attributesUSAyoy[,c('Country','Attribute','Year','chg')] %>%
spread(Year,chg)
avgyoyUSAchange[is.na(avgyoyUSAchange)] = 0
avgyoyUSAchange["Average Change Across Yrs"] = rowMeans(avgyoyUSAchange[,3:5])
Career Satisfaction
careersatisfactionUSA = careersatisfaction
factoraverages = colMeans(careersatisfactionUSA[6:12])
salary_average = mean(careersatisfaction[is.na(careersatisfactionUSA$Salary....USD.)==FALSE,"Salary....USD."])
salarymapping = careersatisfaction[is.na(careersatisfactionUSA$Salary....USD.)==FALSE,]
salarymapping = salarymapping[order(-salarymapping[,4]),]
careersatisfactionUSA[is.na(careersatisfactionUSA)]=0
heatmaptable = salarymapping[,-2:-6]
heatmaptable = heatmaptable[1:50,] %>%
gather(Attribute,Factor, Fit:Salary)
analysistableheat = salarymapping[,c(-2,-3,-5)]
analysistableheat = analysistableheat[1:50,]
Average Global Income
avgworldincomedf = avgworldincometable[[1]]
colnames(avgworldincomedf) = avgworldincomedf[1,]
avgworldincomedf = avgworldincomedf[-1,-5]
row.names(avgworldincomedf) = avgworldincomedf[,1]
avgworldincomedf[] = lapply(avgworldincomedf,gsub,pattern = ',',replacement = '')
avgworldincomedf[] = lapply(avgworldincomedf,gsub,pattern = '\\$',replacement = '')
avgworldincomedf$`Average incomeannually` = as.numeric(avgworldincomedf$`Average incomeannually`)
avgworldincomedf$monthly = as.numeric(avgworldincomedf$monthly)
USAglobalincomerank = avgworldincomedf[which(avgworldincomedf$Country == 'United States'),]
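The two gsub() passes above strip thousands separators and dollar signs before the conversion to numeric. As a small illustration of the same idea in a single pass, assuming scraped strings of the form shown (the example values are hypothetical):

```r
# Hypothetical scraped strings; the real table cells may be formatted differently.
raw_income <- c("$65,850", "$5,488")

# Inside a character class, "$" and "," are literal, so one pattern removes both.
clean_income <- as.numeric(gsub("[$,]", "", raw_income))

print(clean_income)
```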
Global OECD GDP Data
globalOECDGDP[is.na(globalOECDGDP)] = 0
globalOECDGDPparsedfourteen = globalOECDGDP[,c("Country.Name","X2014")]
globalOECDGDPparsedfifteen = globalOECDGDP[,c("Country.Name","X2015")]
globalOECDGDPparsedsixteen = globalOECDGDP[,c("Country.Name","X2016")]
globalOECDGDPparsedfourteen = globalOECDGDPparsedfourteen[order(-globalOECDGDPparsedfourteen$X2014),]
globalOECDGDPparsedfifteen = globalOECDGDPparsedfifteen[order(-globalOECDGDPparsedfifteen$X2015),]
globalOECDGDPparsedsixteen = globalOECDGDPparsedsixteen[order(-globalOECDGDPparsedsixteen$X2016),]
toptentableoecd = merge(globalOECDGDPparsedfourteen,globalOECDGDPparsedfifteen,by = 'Country.Name')
toptentableoecd = merge(toptentableoecd,globalOECDGDPparsedsixteen,by = 'Country.Name')
toptentableoecd = toptentableoecd[with(toptentableoecd,order(-X2016,-X2015,-X2014)),]
toptentableoecdfinal = toptentableoecd[1:10,]
tennamescols = c('Countries','2014','2015','2016')
colnames(toptentableoecdfinal) = tennamescols
toptentableoecdfinal = toptentableoecdfinal %>%
gather(Year,Average_GDP,`2014`:`2016`)
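The analysis below quotes a single country's position in this GDP data. One way to read such a position off directly, rather than counting rows in a sorted table, is sketched here with base R's rank(). The country names and values are made up, and the 0 mimics a missing value that was replaced as in the code above.

```r
# Made-up GDP per capita values for four hypothetical countries.
gdp <- data.frame(
  Country.Name = c("A", "B", "C", "D"),
  X2016        = c(40000, 0, 75000, 52000)   # 0 stands in for a replaced NA
)

# Rank 1 is the highest value; ties share the smaller rank.
gdp$rank_2016 <- rank(-gdp$X2016, ties.method = "min")

rank_of_D <- gdp$rank_2016[gdp$Country.Name == "D"]
print(rank_of_D)
```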
Analysis
Below is the mapping of the top 30 countries based on happiness rankings from 2015 to 2017. As you can see, there are some major shifts where countries lost their standing within the top 30 entirely or gained a position within it; for instance, the Czech Republic was not present in 2015 but made the top 30 in 2016 and 2017. The USA ranked 15th, 13th, and 14th in 2015, 2016, and 2017 respectively.
Analysis: Top 30 Ranked Countries from 2015-2017

At the macro level, it is fair to ask which factors contributed most for the countries that experienced the largest shifts in rankings. Taking the average absolute year-over-year change in rank for each country, and then mapping those countries back to their signed year-over-year changes, I selected the 10 countries that experienced the greatest shifts.
I then included the data set's happiness factors and mapped them accordingly. This is the first indication that money is not the root of happiness: the Economy factor showed the largest average change for only 3 of the 10 countries, so for 70% of the countries mapped a different factor moved more. In fact, it was often the compounding effect of all attributes affected in that year that caused the shift in rankings.
Ten countries with largest Rank shifts: Contributing Factors

Attributes: Greatest Average Changes Across 2015-2017 for the Ten Selected Countries
print(maxattributeeachcountry)
## # A tibble: 10 x 3
## # Groups: Country [10]
## Country Attribute `Average Change Across Yrs`
## <fctr> <chr> <dbl>
## 1 Algeria Economy 0.05085816
## 2 Bulgaria Family 0.10941315
## 3 Egypt Family 0.08349046
## 4 Hungary Economy 0.05502398
## 5 Latvia Family 0.10303165
## 6 Liberia Generosity -0.06822392
## 7 Nigeria Family 0.10381683
## 8 Romania Family 0.08807043
## 9 Venezuela Freedom -0.09169429
## 10 Zambia Economy 0.05534226
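The table above keeps, for each country, the attribute whose average change is largest in absolute value. A minimal sketch of that selection on made-up numbers follows. It uses slice_max(), which supersedes top_n() in current dplyr, and the data frame `avg_changes` and its values are hypothetical.

```r
library(dplyr)

# Made-up average changes for two hypothetical countries.
avg_changes <- data.frame(
  Country   = rep(c("A", "B"), each = 3),
  Attribute = rep(c("Economy", "Family", "Generosity"), times = 2),
  avg_chg   = c(0.05, 0.10, -0.02, 0.01, 0.03, -0.07)
)

# Keep the row with the largest absolute change per country; the sign is preserved.
top_attr <- avg_changes %>%
  group_by(Country) %>%
  slice_max(abs(avg_chg), n = 1) %>%
  ungroup()

print(top_attr)
```

Ranking by absolute value is what lets a decline, such as Generosity for country B here, count as the largest mover.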
Examining the USA specifically, changes in economic sentiment did not have the greatest impact on the rank shifts experienced between 2015 and 2017. The most affected factor was generosity, which declined sharply over the period, while trust in government experienced the highest positive average change.
USA: Contributing Factors

USA: Average Changes 2015-2017
print(avgyoyUSAchange)
## Country Attribute 2015 2016 2017
## 1 United States Economy 0 0.11345 0.038299284
## 2 United States Family 0 -0.19929 0.372100564
## 3 United States Freedom 0 -0.06441 0.024110523
## 4 United States Generosity 0 0.00972 -0.275131212
## 5 United States Health 0 -0.08279 -0.004713372
## 6 United States Trust in Government 0 -0.01022 0.243898781
## Average Change Across Yrs
## 1 0.05058309
## 2 0.05760352
## 3 -0.01343316
## 4 -0.08847040
## 5 -0.02916779
## 6 0.07789293
Keeping the hypothesis in mind, I then examine the USA on a micro level using career data; this, to me, is the easiest way to determine whether money is the root of happiness. It essentially tests the logic that a good job is one that pays a lot of money.
The heat map below, however, contradicts that logic: the salary factor has the lowest impact on overall job satisfaction, while fit plays the greatest role, followed by interest and environment.
Heat Map of Job Happiness USA

Overall Scores: Career Satisfaction for the Top 50 paying jobs
print(analysistableheat)
## Name Salary....USD. Overall Fit
## 218 Psychiatrist 181880 3.4 3.9
## 116 Chief Executive 173320 3.9 4.4
## 370 Sustainability Officer 173320 3.6 4.0
## 209 Pediatrician 163350 4.0 4.3
## 7 Dentist 149540 3.0 3.5
## 471 Petroleum Engineer 130050 3.0 3.6
## 387 Marketing Manager 127130 3.2 3.8
## 96 Air Traffic Controller 122340 3.3 3.8
## 49 Pharmacist 120950 2.8 3.5
## 338 Clinical Research Coordinator 120050 2.9 3.7
## 92 Pilot 118140 3.9 4.1
## 321 Financial Manager 115320 3.0 3.8
## 334 Bank Branch Manager 115320 2.8 3.8
## 364 Treasurer 115320 3.1 3.7
## 33 Lawyer 114970 2.7 3.4
## 274 Sales Manager 110660 2.8 3.7
## 428 Physicist 109600 3.6 3.9
## 143 Computer Hardware Engineer 108430 3.6 3.9
## 192 Computer & Information Research Scientist 108360 3.5 3.9
## 266 Purchasing Manager 106090 3.0 3.6
## 77 Astronomer 105410 4.0 3.9
## 123 Aerospace Engineer 105380 3.6 4.1
## 339 Compliance Manager 105060 2.7 3.5
## 372 Supply Chain Manager 105060 3.0 3.7
## 376 Investment Fund Manager 105060 3.2 3.8
## 68 Political Scientist 104920 3.4 3.9
## 225 Mathematician 103720 3.7 3.9
## 366 Human Resources Manager 102780 3.2 3.9
## 354 Fundraising Manager 101510 2.9 3.6
## 51 Optometrist 101410 2.8 3.7
## 245 Nuclear Engineer 100470 3.0 3.9
## 490 Operations Manager 97270 3.2 3.9
## 78 Chemical Engineer 96940 3.0 3.6
## 121 Advertising Manager 96720 3.1 3.8
## 358 Green Product Marketer 96720 3.5 3.9
## 329 Actuary 96700 2.6 3.4
## 272 Sales Engineer 96340 2.9 3.6
## 452 Physician Assistant 95820 3.1 3.8
## 316 Economist 95710 3.1 3.6
## 416 Software Engineer 95510 3.2 3.8
## 431 Nurse Practitioner 95350 3.0 3.8
## 231 Robotics Engineer 94240 4.3 4.4
## 397 Nanotechnology Engineer 94240 4.1 4.2
## 406 Biochemical Engineer 94240 3.5 3.9
## 222 Naval Architect 92930 3.3 3.9
## 489 Marine Engineer 92930 3.6 3.8
## 165 Health Services Manager 92810 3.0 3.8
## 190 Industrial Production Manager 92470 2.9 3.8
## 388 Materials Scientist 91980 3.2 3.7
## 37 Electrical Engineer 91410 3.2 3.7
## Work.Env Interest Skill.Util Meaning Salary
## 218 3.7 3.8 3.5 3.2 3.9
## 116 4.0 4.3 4.1 3.7 3.5
## 370 3.7 4.1 3.3 3.6 3.4
## 209 3.8 4.1 4.0 4.1 4.1
## 7 3.3 3.3 3.2 2.9 3.2
## 471 3.4 3.7 3.0 3.0 3.7
## 387 3.5 3.6 3.2 2.8 3.2
## 96 3.8 3.9 3.3 3.2 3.8
## 49 3.1 3.2 2.8 2.8 3.2
## 338 3.5 3.7 2.9 2.9 2.8
## 92 3.9 4.4 3.9 3.7 3.4
## 321 3.4 3.4 3.1 2.6 3.3
## 334 3.4 3.2 3.0 2.6 3.1
## 364 3.6 3.4 3.0 2.9 3.0
## 33 3.0 3.4 3.1 2.6 3.3
## 274 3.3 3.2 2.8 2.4 3.1
## 428 3.7 4.1 3.8 3.5 3.0
## 143 3.7 4.0 3.6 3.3 3.4
## 192 3.8 3.8 3.5 3.4 3.1
## 266 3.5 3.3 3.0 2.6 3.2
## 77 4.1 4.3 3.9 3.9 3.3
## 123 3.8 4.1 3.5 3.3 3.6
## 339 3.2 3.2 2.9 2.6 3.2
## 372 3.3 3.4 3.0 2.6 3.3
## 376 3.5 3.6 3.3 2.9 3.9
## 68 3.5 4.1 3.4 3.5 2.9
## 225 3.8 4.0 3.7 3.6 3.2
## 366 3.6 3.5 3.2 2.9 3.4
## 354 3.4 3.4 3.0 3.3 2.9
## 51 3.5 3.3 3.2 2.7 3.4
## 245 3.4 4.0 3.6 3.0 3.6
## 490 3.6 3.6 3.3 2.8 3.2
## 78 3.3 3.6 3.2 2.9 3.5
## 121 3.5 3.6 3.1 2.6 3.0
## 358 3.7 3.8 3.0 3.2 2.6
## 329 3.3 3.0 2.9 2.1 3.8
## 272 3.4 3.4 2.9 2.5 3.2
## 452 3.6 4.0 3.4 3.5 3.7
## 316 3.4 3.5 3.0 2.9 3.3
## 416 3.6 3.6 3.3 2.8 3.6
## 431 3.5 4.1 3.6 3.8 3.9
## 231 4.3 4.6 4.3 4.2 3.7
## 397 4.1 4.3 4.1 4.0 3.5
## 406 3.8 4.0 3.8 3.6 3.5
## 222 3.8 3.9 3.3 3.1 3.3
## 489 3.7 4.2 3.5 3.4 3.8
## 165 3.5 3.6 3.1 3.0 3.1
## 190 3.3 3.5 3.0 2.7 3.3
## 388 3.6 3.8 3.4 3.2 3.4
## 37 3.5 3.6 3.1 3.0 3.4
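The heat map supports a visual reading of which factors matter most. One way to put a rough number on that reading, offered here as an assumption about how "impact" could be measured rather than as the method behind the heat map, is the correlation between each factor rating and the Overall score. The sketch uses eight rows copied from the printed table above, which is far too small a sample for firm conclusions.

```r
# Eight rows copied from the printed table above.
jobs <- data.frame(
  Name       = c("Psychiatrist", "Chief Executive", "Pediatrician", "Dentist",
                 "Lawyer", "Actuary", "Robotics Engineer", "Astronomer"),
  Overall    = c(3.4, 3.9, 4.0, 3.0, 2.7, 2.6, 4.3, 4.0),
  Fit        = c(3.9, 4.4, 4.3, 3.5, 3.4, 3.4, 4.4, 3.9),
  Work.Env   = c(3.7, 4.0, 3.8, 3.3, 3.0, 3.3, 4.3, 4.1),
  Interest   = c(3.8, 4.3, 4.1, 3.3, 3.4, 3.0, 4.6, 4.3),
  Skill.Util = c(3.5, 4.1, 4.0, 3.2, 3.1, 2.9, 4.3, 3.9),
  Meaning    = c(3.2, 3.7, 4.1, 2.9, 2.6, 2.1, 4.2, 3.9),
  Salary     = c(3.9, 3.5, 4.1, 3.2, 3.3, 3.8, 3.7, 3.3)
)

factors <- c("Fit", "Work.Env", "Interest", "Skill.Util", "Meaning", "Salary")

# Pearson correlation of each factor rating with the Overall score.
cor_with_overall <- sapply(jobs[factors], cor, y = jobs$Overall)

print(round(sort(cor_with_overall, decreasing = TRUE), 2))
```

On this subset, Fit tracks Overall much more closely than the Salary rating does. Running the same calculation on all 50 rows of analysistableheat would be the natural next step.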
Using two additional data sets, OECD World GDP Data and World Average Annual Income Data, I further support the point that money is not the root of happiness.
The United States ranks 9th in the World Average Income Data and 8th in the OECD 2016 Data (8th in 2015 and 13th in 2014), both higher than its global happiness ranking.
Despite the higher economic rankings, the USA's happiness ranking stood at 13th in 2016 and 14th in 2017.
Average Annual Global Income: Top 50 countries

OECD GDP Mapping

Directionally Correct Salary Comparison
Differenceinincomesources = USAglobalincomerank$`Average incomeannually` - salary_average
print(paste("There is only a",abs(round(Differenceinincomesources,0)),"dollar difference in the average salaries calculated in the Average Annual World Income Data and Career Satisfaction data for the USA"))
## [1] "There is only a 584 dollar difference in the average salaries calculated in the Average Annual World Income Data and Career Satisfaction data for the USA"
Conclusion: The data strongly suggest that money is not the root of happiness; economic sentiment does not have the greatest impact on overall happiness at either the macro or the micro scale. However, it is important to note that this analysis only scratches the surface, and that a deeper analysis of the complexities of human happiness is warranted based on these preliminary findings.