Git: https://github.com/baharhm/Homework-7

library(productplots)
library(dplyr)  # provides %>%, mutate() and select() used below
data(happy, package="productplots")
head(happy)
##   id         happy year age    sex       marital         degree       finrela
## 1  1 not too happy 1972  23 female never married       bachelor       average
## 2  2 not too happy 1972  70   male       married lt high school above average
## 3  3  pretty happy 1972  48 female       married    high school       average
## 4  4 not too happy 1972  27 female       married       bachelor       average
## 5  5  pretty happy 1972  61 female       married    high school above average
## 6  6  pretty happy 1972  26   male never married    high school above average
##      health wtssall
## 1      good  0.4446
## 2      fair  0.8893
## 3 excellent  0.8893
## 4      good  0.8893
## 5      good  0.8893
## 6      good  0.4446
HAPPY <- readRDS("data/HAPPY.rds")
  1. Data cleaning: the values “IAP”, “DK” and “NA” all encode missing values. Replace all of these instances by the value NA.
happy = replace(happy, happy == "IAP", NA)
happy = replace(happy, happy == "DK", NA)
happy = replace(happy, happy == "NA", NA)
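
The same replacement can be done in one pass over all columns. The sketch below is a minimal illustration on a made-up data frame (`toy`, its column names, and `missing_codes` are hypothetical and not part of the homework data); it assumes the columns are character vectors.

```r
# Hypothetical toy data containing the three missing-value codes.
toy <- data.frame(
  HAPPY  = c("VERY HAPPY", "IAP", "DK"),
  HEALTH = c("GOOD", "NA", "FAIR"),
  stringsAsFactors = FALSE
)

# Codes that should be treated as missing.
missing_codes <- c("IAP", "DK", "NA")

# Replace every occurrence of a missing code by a real NA, column by column.
# %in% returns FALSE (not NA) for values that are already missing.
toy[] <- lapply(toy, function(col) replace(col, col %in% missing_codes, NA))

# Count the missing values per column to check the result.
colSums(is.na(toy))
```

Counting the missing values per column afterwards is a quick way to confirm that no code was left behind.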
  2. Check the type of the variable and cast into the right type (factor variable for categorical variables). For age, change “89 OR OLDER” to 89 and assume the variable should be numeric.

  3. Bring all levels of factors into a sensible order. For marital you could e.g. order the levels according to average age.
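
The casting and ordering steps in the list above can be sketched on a small made-up data frame. The data frame `toy` and its upper-case column names are assumptions for illustration only; `reorder()` is base R and sorts the factor levels by a summary of a second variable, here the mean age.

```r
# Hypothetical toy data: age stored as text, marital status in upper case.
toy <- data.frame(
  AGE     = c("23", "89 OR OLDER", "48", "70", NA),
  MARITAL = c("NEVER MARRIED", "WIDOWED", "MARRIED", "WIDOWED", "MARRIED"),
  stringsAsFactors = FALSE
)

# Recode the top category to "89", then cast the whole column to numeric.
toy$age <- as.numeric(replace(toy$AGE, toy$AGE %in% "89 OR OLDER", "89"))

# Lower-case the labels and order the levels by average age
# (missing ages are ignored when computing the mean).
toy$marital <- reorder(factor(tolower(toy$MARITAL)), toy$age,
                       FUN = mean, na.rm = TRUE)

levels(toy$marital)
```

Ordering by a data-driven summary keeps plots and tables readable without hard-coding the level order.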

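# Lower-case the labels, then order the levels from highest to lowest degree.
# Any label not listed in `levels` becomes NA, so check levels(happy$degree) first.
happy <- happy %>% mutate(
  degree = factor(tolower(degree),
                  levels = c("graduate school",
                             "bachelor",
                             "junior college",
                             "high school",
                             "lt high school"))
)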


# Store the cleaned response as `happiness` and drop the original `happy` column.
happy = happy %>% mutate(
  happiness = factor(tolower(happy))
) %>% select(-happy)


# Order financial standing from highest to lowest.
happy = happy %>% mutate(
  finrela = factor(tolower(finrela),
                   levels = c("far above average",
                              "above average",
                              "average",
                              "below average",
                              "far below average"))
)


# Order health from best to worst.
happy = happy %>% mutate(
  health = factor(tolower(health),
                  levels = c("excellent",
                             "good",
                             "fair",
                             "poor"))
)

# marital and sex become factors; year, age and wtssall are kept unchanged.
happy = happy %>% mutate(
  marital = factor(tolower(marital)),
  sex = factor(tolower(sex))
)
saveRDS(happy,"happy.rds")

Investigate the relationship between happiness and two other variables in the data. Find a visualization that captures the relationship and write a paragraph to describe it.

Question: Find the relationship between happiness, age, and marital status.
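
One possible way to look at this question is a boxplot of age for each happiness level, split into one panel per marital status. The sketch below is only a starting point and makes two assumptions: it uses ggplot2 (not loaded elsewhere in this document) and works on the raw `happy` data from productplots, where the response column is still called `happy` rather than `happiness`. The object names `plot_data` and `p` are placeholders.

```r
library(ggplot2)
data(happy, package = "productplots")

# Keep only the three variables of interest and drop incomplete rows.
plot_data <- na.omit(happy[, c("happy", "age", "marital")])

# Age distribution per happiness level, one panel per marital status.
p <- ggplot(plot_data, aes(x = happy, y = age)) +
  geom_boxplot() +
  facet_wrap(~ marital) +
  labs(x = "Happiness", y = "Age (years)")
print(p)

# Mean age for every combination of happiness and marital status.
aggregate(age ~ happy + marital, data = plot_data, FUN = mean)
```

The table of group means complements the plot when writing the descriptive paragraph, since small differences between boxes are easier to read as numbers.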