Objectives

To submit this homework, create the document in RStudio using the knitr package (the Knit button is included in RStudio), then publish it to your RPubs account. Once it is uploaded, submit the link to the document on Canvas. Please make sure the link is hyperlinked and that I can see both the visualization and the code required to create it.

Look at the data

data(housing, package = "MASS")  # assumed source: the housing data set in the MASS package
str(housing)
## 'data.frame':    72 obs. of  5 variables:
##  $ Sat : Ord.factor w/ 3 levels "Low"<"Medium"<..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Infl: Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
##  $ Type: Factor w/ 4 levels "Tower","Apartment",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Cont: Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Freq: int  21 21 28 34 22 36 10 11 36 61 ...
head(housing)
##      Sat   Infl  Type Cont Freq
## 1    Low    Low Tower  Low   21
## 2 Medium    Low Tower  Low   21
## 3   High    Low Tower  Low   28
## 4    Low Medium Tower  Low   34
## 5 Medium Medium Tower  Low   22
## 6   High Medium Tower  Low   36

1. First plot

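# The frequency of each house type with different satisfaction levels
library(dplyr)
library(ggplot2)  # ggplot() and the geoms used in every plot
library(ggpubr)   # theme_cleveland() and theme_pubr() used in later plots

# Total number of residents for each housing type and satisfaction level
housing %>% 
  dplyr::select(Sat, Type, Freq) %>% 
  group_by(Type, Sat) %>% 
  mutate(total_freq = sum(Freq)) %>%         # group total, repeated on each row
  dplyr::select(Sat, Type, total_freq) %>%
  unique() -> df1                            # keep one row per Type/Sat pair

# One panel per satisfaction level, one bar per housing type
ggplot(df1, aes(x = Type, y = total_freq)) +
  geom_bar(stat = 'identity', aes(fill = Type)) +
  facet_wrap(~ Sat) +
  scale_fill_manual(values = c("skyblue", "royalblue", "blue", "navy")) +
  xlab('Housing Type') +
  ylab('Total Number of Residents') +
  ggtitle('Resident Satisfaction by Housing Type') +
  theme_light()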

Apartment is the most common of the four housing types. It also has the tallest bar in every satisfaction panel (Low, Medium, and High), so while many residents are highly satisfied with Apartment, a substantial number are dissatisfied with it. Tower shows a similar pattern in the Medium and High panels, where it ranks second. The likely reason is that more residents live in Apartment and Tower than in Atrium or Terrace, so their raw counts are larger. Among Atrium residents, high satisfaction is the largest group and low satisfaction the smallest.
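Because the housing types differ so much in size, shares within each type are easier to compare than raw counts. The sketch below is only an illustration: it re-enters the twelve totals from the printed df1 table, and the names df1_counts, type_share, and share are introduced here and are not part of the original analysis.

```r
library(dplyr)

# Totals copied by hand from the printed df1 table in this report
df1_counts <- data.frame(
  Type = rep(c("Tower", "Apartment", "Atrium", "Terrace"), each = 3),
  Sat = factor(rep(c("Low", "Medium", "High"), times = 4),
               levels = c("Low", "Medium", "High"), ordered = TRUE),
  total_freq = c(99, 101, 200, 271, 192, 302, 64, 79, 96, 133, 74, 70)
)

# Share of each satisfaction level within its housing type
type_share <- df1_counts %>%
  group_by(Type) %>%
  mutate(share = total_freq / sum(total_freq)) %>%
  ungroup()

print(type_share)
```

Within each housing type the shares sum to 1, so a large type such as Apartment no longer dominates the comparison.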

2. Second plot

df1
## # A tibble: 12 x 3
## # Groups:   Type, Sat [12]
##    Sat    Type      total_freq
##    <ord>  <fct>          <int>
##  1 Low    Tower             99
##  2 Medium Tower            101
##  3 High   Tower            200
##  4 Low    Apartment        271
##  5 Medium Apartment        192
##  6 High   Apartment        302
##  7 Low    Atrium            64
##  8 Medium Atrium            79
##  9 High   Atrium            96
## 10 Low    Terrace          133
## 11 Medium Terrace           74
## 12 High   Terrace           70
# One panel per housing type, one bar per satisfaction level
# theme_cleveland() comes from the ggpubr package
ggplot(df1, aes(Sat, total_freq)) +
  geom_col(aes(fill = Sat)) +
  facet_grid(. ~ Type) +
  theme_cleveland() +
  xlab('Resident Satisfaction') +
  ylab('Total Number of Residents') +
  ggtitle('Resident Satisfaction by Housing Type')

Based on the plot, Tower and Atrium lean toward high resident satisfaction, while Apartment has large numbers of both highly satisfied and dissatisfied residents. For Terrace, low satisfaction is the largest group (133 of 277 residents).

3. Third plot

# How residents' perceived influence on property management (Infl) relates to satisfaction
# Total number of residents for each influence level and satisfaction level
housing %>% 
  dplyr::select(Sat, Infl, Freq) %>% 
  group_by(Infl, Sat) %>% 
  mutate(total_freq = sum(Freq)) %>%         # group total, repeated on each row
  dplyr::select(Sat, Infl, total_freq) %>%
  unique() -> df2                            # keep one row per Infl/Sat pair

# One panel per satisfaction level; theme_pubr() comes from the ggpubr package
ggplot(df2, aes(x = Infl, y = total_freq)) +
  geom_bar(stat = 'identity', aes(fill = Infl)) +
  facet_wrap(~ Sat) +
  scale_fill_brewer(palette = "Blues") +
  theme_pubr() +
  xlab('Perceived Influence on Property Management') +
  ylab('Total Number of Residents') +
  ggtitle('Resident Satisfaction by Perceived Influence on Property Management')

Based on the plot, low perceived influence on property management (Infl) goes with low satisfaction: the Low level has the tallest bar in the Low-satisfaction panel. The converse does not hold as cleanly. In the High-satisfaction panel the Medium level has the tallest bar, with the High level below it. Because these are raw counts and fewer residents report high influence, that gap alone does not show that high influence lowers satisfaction.
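One way to check this reading is to compare satisfaction shares within each influence level instead of raw counts. The following is a minimal sketch, assuming housing is the data set shipped with the MASS package. The names infl_share and share are illustrative, and summarise() is swapped in for the mutate()/unique() pattern used in the report.

```r
library(dplyr)

# Assumption: housing is MASS::housing; data() loads it without attaching MASS
data(housing, package = "MASS")

infl_share <- housing %>%
  group_by(Infl, Sat) %>%
  summarise(total_freq = sum(Freq), .groups = "drop_last") %>%  # still grouped by Infl
  mutate(share = total_freq / sum(total_freq)) %>%              # share within each Infl level
  ungroup()

print(infl_share)
```

If the share of highly satisfied residents keeps rising from Low to High influence, the lower High-influence bar in the count plot reflects group size and not lower satisfaction.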

4. Fourth plot

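# Total number of residents for each housing type and perceived influence level (Infl)
housing %>% 
  dplyr::select(Type, Infl, Freq) %>% 
  group_by(Type, Infl) %>% 
  mutate(total_freq = sum(Freq)) %>%         # group total, repeated on each row
  dplyr::select(Type, Infl, total_freq) %>%
  unique() -> df3                            # keep one row per Type/Infl pair

# One panel per housing type, one bar per influence level
ggplot(df3, aes(Infl, total_freq)) +
  geom_col(aes(fill = Infl)) +
  facet_grid(. ~ Type) +
  theme_minimal() +
  xlab('Perceived Influence on Property Management') +
  ylab('Total Number of Residents') +
  ggtitle('Perceived Influence on Property Management by Housing Type')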

In general, high perceived influence on property management is the least common level in all four housing types, while the Medium level accounts for a large share of residents. Residents of Atrium and Terrace tend to report lower influence.
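Because the four housing types differ in size, a stacked bar scaled to 100% can make the mix of influence levels easier to compare across types. This is a hedged sketch and not part of the original assignment: it assumes housing comes from the MASS package, and p_share is a name chosen only for this example.

```r
library(ggplot2)

# Assumption: housing is MASS::housing
data(housing, package = "MASS")

# position = "fill" stacks the Freq values and rescales each bar to sum to 1
p_share <- ggplot(housing, aes(x = Type, y = Freq, fill = Infl)) +
  geom_col(position = "fill") +
  xlab("Housing Type") +
  ylab("Share of Residents") +
  theme_minimal()

print(p_share)
```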

5. Fifth plot

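# Total number of residents for each housing type and afforded contact level (Cont)
housing %>% 
  dplyr::select(Type, Cont, Freq) %>% 
  group_by(Type, Cont) %>% 
  mutate(total_freq = sum(Freq)) %>%         # group total, repeated on each row
  dplyr::select(Type, Cont, total_freq) %>%
  unique() -> df4                            # keep one row per Type/Cont pair

# One panel per housing type, one bar per contact level
ggplot(df4, aes(x = Cont, y = total_freq)) +
  geom_bar(aes(fill = Cont), stat = 'identity', color = 'black',
           position = position_dodge(0.9)) +
  facet_wrap(~ Type) +
  theme_light() +
  xlab('Afforded Contact') +
  ylab('Number of Residents')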

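Based on the plot, high afforded contact is more common than low afforded contact in Apartment, Atrium, and Terrace. In Atrium and Terrace, the low-contact group is roughly half the size of the high-contact group. Tower is the exception: low afforded contact is more common there (219 of 400 residents).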
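The low-to-high comparison can also be computed directly for each housing type. This is a minimal sketch using base R's xtabs(), again assuming housing is the MASS data set. The names contact_tab and low_to_high are illustrative only.

```r
# Assumption: housing is MASS::housing
data(housing, package = "MASS")

# Cross-tabulate resident counts: rows are housing types, columns are contact levels
contact_tab <- xtabs(Freq ~ Type + Cont, data = housing)

# Ratio of low-contact to high-contact residents within each housing type
low_to_high <- contact_tab[, "Low"] / contact_tab[, "High"]

print(contact_tab)
print(round(low_to_high, 2))
```

A ratio near 0.5 means the low-contact group is about half the size of the high-contact group, and a ratio above 1 means low contact is the more common level.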