Objectives

To submit this homework, create the document in RStudio using the knitr package (a Knit button is included in RStudio) and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.

Look at the data

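library(MASS)     # provides the housing data set
library(dplyr)    # %>%, group_by(), summarise(), mutate()
library(ggplot2)  # ggplot() and related functions
library(ggpubr)   # ggballoonplot()

str(housing)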
## 'data.frame':    72 obs. of  5 variables:
##  $ Sat : Ord.factor w/ 3 levels "Low"<"Medium"<..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Infl: Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
##  $ Type: Factor w/ 4 levels "Tower","Apartment",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Cont: Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Freq: int  21 21 28 34 22 36 10 11 36 61 ...

1. Data Balance Check

# check sample balance of each category
v1 <- housing %>%
  group_by(Infl, Type, Cont) %>%
  summarise(Freq = sum(Freq))
levels(v1$Infl) <- list("Infl-Low"="Low", "Infl-Medium"="Medium", "Infl-High"="High")
levels(v1$Cont) <- list("Cont-Low"="Low", "Cont-High"="High")

ggballoonplot(v1, x = "Infl", y = "Cont", size = "Freq", facet.by = "Type",
              fill = "Freq", ggtheme = theme_bw()) +
  scale_fill_viridis_c(option = "C")

The balance check shows that most residents in the study lived in apartments, while the fewest lived in atrium houses. More residents reported a low perceived degree of influence on the management of the property, and more reported high contact with other residents.
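As a rough numeric cross-check of the balloon plot, the marginal totals can be tabulated directly. This is a minimal sketch in base R, assuming the `housing` data comes from the MASS package; the object names `type_totals`, `infl_totals` and `cont_totals` are illustrative and not part of the original analysis.

```r
# Load only the data set; MASS ships with standard R installations.
data(housing, package = "MASS")

# Total number of residents along each factor (Freq holds the cell counts).
type_totals <- xtabs(Freq ~ Type, data = housing)
infl_totals <- xtabs(Freq ~ Infl, data = housing)
cont_totals <- xtabs(Freq ~ Cont, data = housing)

# Share of the whole sample by housing type, largest first.
round(sort(prop.table(type_totals), decreasing = TRUE), 3)

# Raw totals by perceived influence and by afforded contact.
infl_totals
cont_totals
```

Comparing these totals with the statements above is a quick way to confirm what the bubble sizes suggest visually.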

2. Number of Residents by Satisfaction and Rental Accommodation

v2 <- housing %>%
  group_by(Sat, Type) %>%
  summarise(Freq = sum(Freq))

ggplot(v2, aes(Freq, Sat)) +
  geom_point(aes(color = Type)) +
  facet_grid(Type ~ ., scales = "free", space = "free") +
  theme_light() +
  theme(strip.text.y = element_text(angle = 0),
        legend.position = "none") +
  labs(y = "Satisfaction",
       x = "Number of Residents",
       title = "Number of Residents by Satisfaction and Rental Accommodation")

Among residents of Tower and Atrium housing, more people reported higher satisfaction, while Terrace residents tended to report lower satisfaction. Apartment residents were concentrated at both ends of the scale (high and low), with fewer in the middle.
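Because the housing types differ a lot in size, raw counts can be hard to compare across panels. The sketch below, assuming `housing` from MASS, converts the counts to within-type shares; `sat_by_type` and `sat_share` are illustrative names.

```r
data(housing, package = "MASS")

# Residents cross-tabulated by housing type (rows) and satisfaction (columns).
sat_by_type <- xtabs(Freq ~ Type + Sat, data = housing)

# Convert each row to proportions so that types of different size are comparable.
sat_share <- prop.table(sat_by_type, margin = 1)
round(sat_share, 2)
```

Each row sums to 1, so a row with most of its mass in the "High" column corresponds to a housing type whose residents lean towards high satisfaction.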

3. Number of Residents by Satisfaction, Rental Accommodation and Management Influence

v3 <- housing %>%
  group_by(Sat, Type, Infl) %>%
  summarise(Freq = sum(Freq))
levels(v3$Infl) <- list("Infl-Low"="Low", "Infl-Medium"="Medium", "Infl-High"="High")

ggplot(v3, aes(x = Infl, y = Freq))+
  geom_bar(
    aes(fill = Sat), stat = "identity", color = "white",
    position = position_dodge(0.9)
    ) +
  facet_wrap(~Type) +
  guides(fill = guide_legend(title = "Satisfaction")) +
  labs(x = "Management Influence",
       y = "Number of Residents",
       title = "Number of Residents by Satisfaction, Rental Accommodation and Management Influence")

For all types of rental accommodation, residents tended to report higher satisfaction when they perceived a higher degree of influence on the management of the property. This was especially clear for Apartment and Terrace residents: more than half of them reported low satisfaction when influence was low, while more than half reported high satisfaction when influence was high.
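The "more than half" statements can be checked by computing satisfaction shares within each combination of housing type and influence level. This is a sketch under the assumption that `housing` is the MASS data set; `counts` and `share` are illustrative names.

```r
data(housing, package = "MASS")

# Three-way table of residents: Type x Infl x Sat.
counts <- xtabs(Freq ~ Type + Infl + Sat, data = housing)

# Shares of each satisfaction level within every Type x Infl cell.
share <- prop.table(counts, margin = c(1, 2))

# Share reporting low satisfaction when influence is low, by housing type.
round(share[, "Low", "Low"], 2)

# Share reporting high satisfaction when influence is high, by housing type.
round(share[, "High", "High"], 2)
```

Values above 0.5 in these two vectors correspond to the "more than half" cases described in the text.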

4. Resident Satisfaction by Rental Accommodation and Afforded Contact

v4 <- housing %>%
  mutate(
    sum_sat_score = ifelse(Sat == 'Low', 1, ifelse(Sat == 'High', 5, 3)) * Freq
  ) %>%
  group_by(Type, Cont) %>%
  summarise(
    sum_sat_score = sum(sum_sat_score),
    freq = sum(Freq)
  ) %>%
  mutate(avg_sat_score = round(sum_sat_score / freq, 2))

ggplot(v4, aes(avg_sat_score, Cont)) +
  geom_point(aes(size = freq, color = freq)) +
  facet_grid(Type ~ .) +
  scale_colour_gradient(low = "green", high = "orange") +
  guides(
    size = guide_legend(title = "Number of Residents"),
    color = guide_legend(title = "Number of Residents")
  ) +
  labs(x = "Avg Satisfaction Score",
       y = "Afforded Contact",
       title = "Resident Satisfaction by Rental Accommodation and Afforded Contact")

Generally speaking, the more contact householders were afforded with other residents (Contact), the higher their average satisfaction score. The opposite was observed among Terrace residents; this might be caused by the smaller number of Terrace residents who had low afforded contact. (A satisfaction score was assigned to each householder: 1 for low satisfaction, 3 for medium and 5 for high; the average is weighted by the number of residents in each cell.)
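The average score described above is a frequency-weighted mean, which can also be written with `weighted.mean()`. The following base R sketch assumes `housing` from MASS and the same 1/3/5 scoring; `score`, `cells`, `avg_score` and `n_residents` are illustrative names. It also lists the cell sizes, which helps judge the sample-size explanation offered for the Terrace result.

```r
data(housing, package = "MASS")

# Scores used in this report: Low = 1, Medium = 3, High = 5.
score <- unname(c(Low = 1, Medium = 3, High = 5)[as.character(housing$Sat)])

# Split the rows into Type x Cont cells (names look like "Tower.Low").
cells <- split(data.frame(score = score, Freq = housing$Freq),
               list(housing$Type, housing$Cont))

# Frequency-weighted mean score and number of residents in each cell.
avg_score   <- sapply(cells, function(d) weighted.mean(d$score, d$Freq))
n_residents <- sapply(cells, function(d) sum(d$Freq))

round(cbind(avg_score, n_residents), 2)
```

The `avg_score` column should agree with `avg_sat_score` in `v4`, since `sum(score * Freq) / sum(Freq)` is exactly what `weighted.mean()` computes.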

5. Resident Satisfaction by Rental Accommodation, Management Influence and Afforded Contact

v5 <- housing %>%
  mutate(
    sum_sat_score = ifelse(Sat == 'Low', 1, ifelse(Sat == 'High', 5, 3)) * Freq
  ) %>%
  group_by(Infl, Cont, Type) %>%
  summarise(
    sum_sat_score = sum(sum_sat_score),
    Freq = sum(Freq)
  ) %>%
  mutate(avg_sat_score = round(sum_sat_score / Freq, 2))

ggplot(v5, aes(Infl, avg_sat_score)) +
  facet_wrap(Type ~ .) +
  geom_bar(
    aes(fill = Cont), stat = "identity", color = "white",
    position = position_dodge(0.9)
  )  +
  guides(fill = guide_legend(title = "Afforded Contact")) +
  labs(x = "Management Influence",
       y = "Avg Satisfaction Score",
       title = "Resident Satisfaction by Rental Accommodation, Management Influence and Afforded Contact")

Reading from the chart, the average satisfaction score of Tower residents varied the least (roughly 3 to 4.5), while Apartment residents' satisfaction varied the most (roughly 2 to 4), depending on Management Influence and Afforded Contact.
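Ranges read off a bar chart are approximate, so it can help to compute the minimum and maximum average score per housing type directly. This is a minimal sketch assuming `housing` from MASS and the 1/3/5 scoring used above; `d`, `avg` and `spread` are illustrative names.

```r
data(housing, package = "MASS")

# Same scoring as in the report: Low = 1, Medium = 3, High = 5.
d <- housing
d$weighted <- unname(c(Low = 1, Medium = 3, High = 5)[as.character(d$Sat)]) * d$Freq

# Average score in every Type x Infl x Cont cell (sum of scores / residents).
num <- xtabs(weighted ~ Type + Infl + Cont, data = d)
den <- xtabs(Freq ~ Type + Infl + Cont, data = d)
avg <- num / den

# Minimum, maximum and spread of the cell averages for each housing type.
score_range <- t(apply(avg, 1, range))
colnames(score_range) <- c("min", "max")
spread <- cbind(score_range,
                spread = score_range[, "max"] - score_range[, "min"])
round(spread, 2)
```

Sorting the result by the `spread` column shows which housing type varies least and most, which is a more reliable basis for the comparison than the bar heights alone.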