Objectives

To submit this homework, create the document in RStudio using the knitr package (the Knit button is built into RStudio) and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.

Look at the data

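# Assumed setup (not shown in the knitted output): housing ships with MASS,
# and %>% and ggplot() are taken from the tidyverse
library(MASS)
library(tidyverse)
str(housing)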
## 'data.frame':    72 obs. of  5 variables:
##  $ Sat : Ord.factor w/ 3 levels "Low"<"Medium"<..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Infl: Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
##  $ Type: Factor w/ 4 levels "Tower","Apartment",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Cont: Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Freq: int  21 21 28 34 22 36 10 11 36 61 ...

1. First plot

# place code for vis here

df <- as.data.frame(xtabs(Freq~Sat+Type, data=housing))
df %>% ggplot(aes(x = Sat, y = Freq)) +
  geom_bar(stat = "identity", fill ="green") +
  facet_wrap(~ Type, ncol = 2) +
  labs(title = "Satisfaction Level by Housing Type",
       y = "Frequency",
       x = "Satisfaction")

Among all housing types, apartments are the most common. In terms of householders' satisfaction with their present housing circumstances, towers and atriums have the highest proportion of highly satisfied residents, while terraces have the highest proportion of low-satisfaction residents. Overall, Tower is the best option for resident satisfaction.
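
The bars above show raw counts, so the proportional claims are easier to check from a table of shares. The following is a minimal sketch, assuming housing is the data set shipped with the MASS package; the names tab, type_totals and sat_share are introduced here for illustration only.

```r
library(MASS)  # assumption: housing is the data set shipped with MASS

# Counts of satisfaction level (rows) by housing type (columns)
tab <- xtabs(Freq ~ Sat + Type, data = housing)

# Total number of residents per housing type
type_totals <- colSums(tab)
print(type_totals)

# Share of each satisfaction level within a housing type (columns sum to 1)
sat_share <- prop.table(tab, margin = 2)
print(round(sat_share, 3))
```

Reading down a column of sat_share gives the satisfaction mix for one housing type, which is the quantity the paragraph above compares.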

2. Second plot

# place code for vis here

df <- as.data.frame(xtabs(Freq~Sat+Infl+Cont, data=housing))
df$Infl <- factor(df$Infl, levels = c("Low","Medium", "High"),
                   labels = c("Infl: Low","Infl: Medium", "Infl: High"))
df$Cont <- factor(df$Cont, levels = c("Low","High"),
                   labels = c("Cont: Low","Cont: High"))
df %>% ggplot(aes(x = Sat, y = Freq)) +
  geom_bar(stat = "identity", fill ="green") +
  facet_grid(Cont~ Infl) +
  labs(title = "Satisfaction By Contact Residents and Perceived Degree of Influence",
       y = "Frequency",
       x = "Satisfaction")

Across the combinations of perceived influence and contact with other residents, the group with high perceived influence and high contact has the highest proportion of highly satisfied residents, and the group with low perceived influence and low contact has the lowest. Overall, both perceived influence and contact with other residents are positively associated with resident satisfaction.
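
To compare the panels on the same footing, the share of highly satisfied residents can be computed within each influence and contact combination. This is a rough sketch under the assumption that housing comes from MASS; sat_share and high_share are illustrative names, not part of the assignment.

```r
library(MASS)  # assumption: housing is the data set shipped with MASS

# Three-way table: satisfaction x influence x contact
tab <- xtabs(Freq ~ Sat + Infl + Cont, data = housing)

# Normalise within each (Infl, Cont) cell so the three satisfaction
# levels in a cell sum to 1
sat_share <- prop.table(tab, margin = c(2, 3))

# Share of highly satisfied residents: rows are Infl, columns are Cont
high_share <- sat_share["High", , ]
print(round(high_share, 3))
```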

3. Third plot

# place code for vis here

df <- as.data.frame(xtabs(Freq~Sat+Infl+Cont, data=housing))
df$Infl <- factor(df$Infl, levels = c("Low","Medium", "High"),
                   labels = c("Infl: Low","Infl: Medium", "Infl: High"))
df$Cont <- factor(df$Cont, levels = c("Low","High"),
                   labels = c("Cont: Low","Cont: High"))
ggplot(df, aes(x = Infl, y = Freq))+
  geom_bar(
    aes(fill = Sat), stat = "identity", color = "white",
    position = position_dodge(0.9)
    )+
  facet_wrap(~Cont) +
  labs(title = "No. of Contact Residents and Perceived Degree of Influence by Satisfaction",
       y = "Frequency",
       x = "Perceived Degree of Influence")

In the third plot we can see that the sample contains more observations with high contact overall. Groups with high contact tend to have a higher proportion of highly satisfied residents. In addition, when perceived influence is medium, residents with high contact are more likely to be highly satisfied than those with low contact.
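
Because the two contact groups differ in size, dodged bars of raw counts make proportions hard to read. One possible alternative, sketched here rather than required by the assignment, is to stack the bars to 100% with position = "fill". It assumes housing comes from MASS and that ggplot2 and scales are installed; p and cont_totals are illustrative names.

```r
library(MASS)     # assumption: housing is the data set shipped with MASS
library(ggplot2)

df <- as.data.frame(xtabs(Freq ~ Sat + Infl + Cont, data = housing))

# Group sizes, to confirm which contact level has more observations
cont_totals <- xtabs(Freq ~ Cont, data = housing)
print(cont_totals)

# Each bar is rescaled to 100%, so segment heights are shares, not counts
p <- ggplot(df, aes(x = Infl, y = Freq, fill = Sat)) +
  geom_col(position = "fill") +
  facet_wrap(~ Cont, labeller = label_both) +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(title = "Satisfaction Share by Perceived Influence and Contact",
       y = "Share of residents",
       x = "Perceived Degree of Influence")
p
```

The trade-off is that a filled chart hides the group sizes, which is why the totals are printed separately.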

4. Fourth plot

df <- as.data.frame(xtabs(Freq~Sat+Type+Cont, data=housing))
df$Cont <- factor(df$Cont, levels = c("Low","High"),
                   labels = c("Cont: Low","Cont: High"))
ggplot(df, aes(x = Type, y = Freq))+
  geom_bar(
    aes(fill = Sat), stat = "identity", color = "white",
    position = position_dodge(0.9)
    )+
  facet_wrap(~Cont) +
  labs(title = "No. of Contact Residents by Housing Type and Satisfaction",
       y = "Frequency",
       x = "Housing Type")
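
The fourth plot shows counts, so as a rough check on how the satisfaction mix shifts with contact, the share of each satisfaction level can be added as a column. This is a minimal sketch assuming housing comes from MASS; the Share column and the low_sat object are hypothetical names used only for this example.

```r
library(MASS)  # assumption: housing is the data set shipped with MASS

df <- as.data.frame(xtabs(Freq ~ Sat + Type + Cont, data = housing))

# Divide each count by the total of its (Type, Cont) group
df$Share <- df$Freq / ave(df$Freq, df$Type, df$Cont, FUN = sum)

# Keep only the low-satisfaction rows: one row per housing type and contact level
low_sat <- subset(df, Sat == "Low", select = c(Type, Cont, Share))
print(low_sat)
```

Comparing the two Cont rows for a given Type shows whether low satisfaction becomes more or less common when contact is high.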

When contact with other residents is high, most housing types show higher satisfaction. For the terrace type, however, the effect is reversed: high contact comes with a higher probability of low satisfaction.

5. Fifth plot

# place code for vis here

library(ggpubr)
## Loading required package: magrittr
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
df <- as.data.frame(xtabs(Freq~Sat+Type+Infl, data=housing))
df$Infl <- factor(df$Infl, levels = c("Low","Medium", "High"),
                   labels = c("Infl: Low","Infl: Medium", "Infl: High"))
df$Sat <- factor(df$Sat, levels = c("Low","Medium", "High"),
                   labels = c("Sat: Low","Sat: Medium", "Sat: High"))
ggballoonplot(df, x = "Infl", y = "Sat", size = "Freq",
              fill = "Freq", facet.by = "Type",
              ggtheme = theme_bw()) +
  scale_fill_viridis_c(option = "C") +
  labs(title = "Satisfaction By Perceived Degree of Influence and Housing Type",
       y = "Satisfaction",
       x = "Perceived degree of influence")

From the fifth graph we can see that perceived influence is positively associated with satisfaction. The effect is most pronounced for the Terrace accommodation type and least pronounced for Tower and Atrium.
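
One simple way to put a number on the size of this effect is to compare the share of highly satisfied residents between the high and low influence groups within each housing type. The sketch below assumes housing comes from MASS; the gain measure is an illustrative choice made here, not a formal test of significance.

```r
library(MASS)  # assumption: housing is the data set shipped with MASS

# Three-way table: satisfaction x influence x housing type
tab <- xtabs(Freq ~ Sat + Infl + Type, data = housing)

# Satisfaction shares within each (Infl, Type) cell
sat_share <- prop.table(tab, margin = c(2, 3))

# Share of highly satisfied residents: rows are Infl, columns are Type
high_share <- sat_share["High", , ]

# Difference in that share between high and low perceived influence
gain <- high_share["High", ] - high_share["Low", ]
print(round(sort(gain, decreasing = TRUE), 3))
```

A larger value of gain means that satisfaction rises more steeply with perceived influence for that housing type.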