Introduction

In this project I use the dataset “happiness” from the wooldridge package. The data set consists of 17137 observations and 33 variables. It is a collection of survey responses about how happy people are and about other factors that may affect their degree of happiness. The data set comes from “Introductory Econometrics: A Modern Approach, 5th Edition” by Jeffrey M. Wooldridge, 2003.

The variable of particular interest to me is “happy”, which indicates the degree of happiness of each respondent. The aim of this project is to find out how the degree of happiness is affected by the other factors. There are many variables in this data set, and I will focus on “happy”, “year”, “female”, “black”, “income”, “educ”, “prestige”, “tvhours”, “workstat” and “attend”.

The data set needs to be processed to suit this purpose. I start by filtering out the rows that contain NA. Some invalid responses should also be excluded from the data, for example “Inapplicable” (IAP) and “Don’t know” (DK) values. Some of the categorical variables are stored as integers; I convert them into factors and relabel the levels (in Table 1, “female” and “black” appear as the factors “gender” and “color”). Finally, I give the processed data set a new name, “Happiness”, to avoid overwriting the original one. The variables of the new data set Happiness are described in Table 1.

Table 1: The description of variables in dataset happiness.

Name      Data type  Description
happy     factor     general happiness
year      integer    general social survey year for this respondent
gender    factor     gender (male and female)
color     factor     skin color (black or non-black)
income    integer    total family income
attend    factor     how often respondent attends religious services
tvhours   integer    hours per day watching tv
prestige  integer    occupational prestige score
educ      integer    highest year of school completed
workstat  factor     work force status
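
The cleaning steps described above (dropping rows with missing values, dropping invalid responses, and relabelling integer-coded variables) can be sketched as follows. This is only a minimal illustration in Python using the standard library, not the code used for the report. The rows are made up, and the assumption that 0/1 in “female” and “black” map to the labels shown is mine.

```python
# Toy rows standing in for the survey data; all values are made up.
raw_rows = [
    {"happy": "very happy", "female": 1, "black": 0, "tvhours": 2},
    {"happy": "DK", "female": 0, "black": 1, "tvhours": 3},
    {"happy": "pretty happy", "female": 0, "black": 1, "tvhours": None},
    {"happy": "not too happy", "female": 1, "black": 1, "tvhours": 5},
    {"happy": "IAP", "female": 1, "black": 0, "tvhours": 1},
]

# Assumed codes for "Inapplicable" and "Don't know".
INVALID = {"IAP", "DK"}
# Assumed mapping from the 0/1 indicators to readable labels.
GENDER_LABELS = {0: "male", 1: "female"}
COLOR_LABELS = {0: "non-black", 1: "black"}


def clean(rows):
    """Drop incomplete or invalid rows and relabel the indicator columns."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # the row contains a missing value (NA)
        if any(value in INVALID for value in row.values()):
            continue  # the row contains an invalid response
        cleaned.append({
            "happy": row["happy"],
            "gender": GENDER_LABELS[row["female"]],
            "color": COLOR_LABELS[row["black"]],
            "tvhours": row["tvhours"],
        })
    return cleaned


happiness_clean = clean(raw_rows)
for row in happiness_clean:
    print(row)
```

Only the first and fourth toy rows survive, since the others contain a missing value or an invalid response.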

Part 1

Things change as time goes by, but are people’s lives getting better? In this part I use a mosaic plot to see how happy respondents were in different years. Figure 1 shows the outcome. The widths of the bars differ from each other, which may be due to the different number of responses collected each year. It is clear that people’s lives do not seem to improve over time: the percentages of the different degrees of happiness remain mostly at the same level. “Pretty happy” accounts for the largest share every year, followed by “very happy”, while “not too happy” accounts for only a small part.
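
To make the reading of a mosaic plot concrete, the following minimal Python sketch computes the two quantities such a plot encodes: the width of each year's column (the share of all responses collected in that year) and the height of each tile (the share of each happiness level within that year). The responses are made up, and the function name `mosaic_column` is hypothetical.

```python
from collections import Counter

LEVELS = ["not too happy", "pretty happy", "very happy"]

# Made-up (year, happy) responses, for illustration only.
responses = [
    (1994, "pretty happy"), (1994, "pretty happy"),
    (1994, "very happy"), (1994, "not too happy"),
    (1996, "pretty happy"), (1996, "very happy"),
]

year_counts = Counter(year for year, _ in responses)
cell_counts = Counter(responses)
total = len(responses)


def mosaic_column(year):
    """Return the column width and the tile heights for one survey year."""
    # Width: this year's share of all responses.
    width = year_counts[year] / total
    # Heights: share of each happiness level within this year.
    heights = {
        level: cell_counts[(year, level)] / year_counts[year]
        for level in LEVELS
    }
    return width, heights


for year in sorted(year_counts):
    width, heights = mosaic_column(year)
    print(year, round(width, 3), heights)
```

A year with more responses gets a wider column, while the tile heights within each column always sum to 1, which is why differing column widths do not distort the comparison of proportions across years.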

Figure 1: A mosaic plot showing the percentage of each happiness level by year, in ascending order. The degrees of happiness are shown in different colors.

Part 2

In this part we will see how gender and skin color may affect the degree of happiness. In Figure 2, gender is mapped to the x-axis and the plot is divided into two facets according to skin color. Generally speaking, “pretty happy” makes up the largest proportion regardless of color or gender. However, we can see a distinct difference between the black and non-black groups. In the non-black group, males and females have the same proportions of the different happiness levels. “Not too happy” accounts for a relatively larger proportion among black people than among non-black people, and black women seem to suffer more than black men, as “not too happy” makes up a larger proportion of the black-female bar.

Figure 2: A faceted mosaic plot showing how skin color affects people's sense of happiness, by gender.

Part 3

In this part we will explore whether family income has a positive impact on how people feel about their lives. I use a grouped proportional bar chart. I first count the unique combinations of “income” and “happy”, then compute the proportion of each happiness level within each income group.
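
The two steps just described (counting the combinations, then dividing by the group totals) can be sketched in Python as follows. This is an illustration under assumptions, not the code behind Figure 3: the records are made up and the income group labels “low” and “high” are hypothetical.

```python
from collections import Counter, defaultdict

# Made-up (income group, happy) pairs; the group labels are hypothetical.
records = [
    ("low", "very happy"), ("low", "pretty happy"),
    ("low", "pretty happy"), ("low", "not too happy"),
    ("high", "very happy"), ("high", "very happy"),
    ("high", "pretty happy"), ("high", "pretty happy"),
    ("high", "pretty happy"),
]

# Step 1: count each unique (income, happy) combination.
pair_counts = Counter(records)
# Total number of respondents in each income group.
group_totals = Counter(income for income, _ in records)

# Step 2: proportion of each happiness level within its income group.
proportions = defaultdict(dict)
for (income, happy), count in pair_counts.items():
    proportions[income][happy] = count / group_totals[income]

for income, shares in proportions.items():
    print(income, shares)
```

Dividing by the group total rather than the overall total is what makes the bars comparable between income groups of different sizes.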

Figure 3 shows a trend: as income increases, the percentage of “not too happy” shrinks and the percentage of “very happy”, on the contrary, increases. So family income seems to have a positive impact on how people feel. But there is also an extreme case that tells a different story. In the group with a family income of less than 1000 dollars, the shares of “not too happy” and “very happy” are almost the same, at around 25%, and a 25% share of “very happy” is relatively high compared with the other income groups. So for this group of people, having the lowest family income does not make them feel less happy.

Figure 3: A grouped bar chart showing the proportion of each happiness level within each family income group. The bar position is 'dodge'.

Part 4

In this part we will explore how years of education and the occupational prestige score may relate to the happiness levels. From Figure 4 we can see that most of the points gather in the upper right part of the plot, meaning that most people have around 10 to 20 years of education and that more education seems to bring a higher occupational prestige score. The lines in the plot represent the different happiness levels. The difference between them is not large, except at 5 years of education, where people who are “very happy” have a higher prestige score. Generally speaking, however, among people with the same years of education, those with higher prestige scores are happier.

Figure 4: A scatter plot showing how years of education and occupational prestige are related to the degree of happiness. The opacity of the points is set to 0.1 and the y-axis is scaled to reduce overlapping. The points are not jittered, because jittering would make the plot difficult to read. The lines show the mean prestige score. The bin width for the binned summary is 3.
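
The binned mean behind the lines in Figure 4 can be illustrated with a minimal Python sketch. The points are made up, and the assumption that bins start at multiples of the bin width is mine; the plotting library used for the report may place its bin boundaries differently. In the figure the same summary would be computed separately for each happiness level.

```python
from collections import defaultdict
from statistics import mean

BINWIDTH = 3  # bin width in years of education, as stated in the caption

# Made-up (educ, prestige) points, for illustration only.
points = [(10, 30), (11, 40), (12, 38), (13, 42), (14, 46), (16, 60), (17, 50)]


def binned_means(pairs, binwidth):
    """Group x values into fixed-width bins and average y within each bin."""
    bins = defaultdict(list)
    for x, y in pairs:
        # Assumption: each bin is labelled by its left edge, a multiple of binwidth.
        left_edge = (x // binwidth) * binwidth
        bins[left_edge].append(y)
    return {edge: mean(values) for edge, values in sorted(bins.items())}


summary = binned_means(points, BINWIDTH)
for left_edge, avg in summary.items():
    print(f"educ in [{left_edge}, {left_edge + BINWIDTH}): mean prestige {avg}")
```

A wider bin gives a smoother line but hides local differences, such as the one noted at 5 years of education.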

Here we further explore how prestige affects people’s feelings in different years. In Figure 5 we can see that, even though the spread of the prestige scores for the different happiness levels varies from year to year, the conclusion remains unchanged: higher prestige brings people more satisfaction, because people who are “very happy” have the highest mean prestige score, followed by “pretty happy” and then “not too happy”.

Figure 5: A box plot showing how the prestige scores of the different happiness levels vary across years.

Part 5

Other factors may also influence the happiness levels, and the variable “attend” (how often the respondent attends religious services) is worth taking into account. In this part the mosaic plot shown in Figure 6 is used to illustrate its influence.

The plot shows a trend: the more often people attend religious services, the more likely they are to be happy, because the proportion of “very happy” grows as the frequency increases. This suggests that attending religious services is associated with a positive effect on how people feel.

Figure 6: A mosaic plot showing the relationship between the frequency of attending religious services and the degree of happiness.

Part 6

In this part we look at the relationship between work status, TV hours and happiness levels. All of the distributions in Figure 7 are right-skewed, but to different extents. For instance, within the same work status, the distribution for “not too happy” is skewed further to the right, meaning that people who are less happy tend to spend more time watching TV. If we compare the distributions across work statuses, we can see that those who work full time spend less time watching TV, while those who are unemployed, among others, spend more.

Figure 7: A density plot showing the distribution of TV-watching hours by work status and happiness level.
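
As a numerical counterpart to reading skew off a density plot, the following minimal Python sketch computes the moment-based skewness of a sample. A positive value indicates a right-skewed distribution, which typically also shows up as a mean larger than the median. The TV hours below are made up and do not come from the data set.

```python
from statistics import mean, median, pstdev


def skewness(values):
    """Moment-based skewness: the average cubed standardized deviation."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Made-up daily TV hours with a long right tail, for illustration only.
tv_hours = [1, 1, 2, 2, 2, 3, 3, 4, 6]

print("skewness:", round(skewness(tv_hours), 3))
print("mean:", round(mean(tv_hours), 3), "median:", median(tv_hours))
```

Comparing this value between groups would give a rough way to check which distribution is skewed further to the right, alongside the visual impression from the plot.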

Summary

In summary, many factors affect the degree of happiness in different ways.

1. The proportions of the different happiness levels do not seem to change over time.
2. Black people are more prone to a “not too happy” life than non-black people, and black women are more likely to suffer than black men.
3. Family income generally has a positive impact on the happiness levels: the higher the family income, the less likely people are to be unhappy.
4. More years of education bring higher occupational prestige, which leads to a happier life.
5. Attending religious services frequently also improves the sense of happiness, as people who attend regularly are more likely to be “very happy”.
6. Those who are less happy spend more time watching TV.