Objectives

To submit this homework, create the document in RStudio using the knitr package (the Knit button is built into RStudio) and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure the link is a working hyperlink and that I can see both the visualization and the code required to create it.

Look at the data

str(housing)
## 'data.frame':    72 obs. of  5 variables:
##  $ Sat : Ord.factor w/ 3 levels "Low"<"Medium"<..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Infl: Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
##  $ Type: Factor w/ 4 levels "Tower","Apartment",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ Cont: Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Freq: int  21 21 28 34 22 36 10 11 36 61 ...
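
The chunks below call functions from several packages without showing where they are loaded. A minimal setup sketch follows. It assumes `housing` is the data set shipped with MASS (the `str()` output above matches it) and that `%>%`, `ggplot()` and `ggballoonplot()` come from dplyr, ggplot2 and ggpubr. The package list is inferred from the function names, not stated in the original.

```r
# Packages the code in this document appears to rely on (an inference from
# the functions used, not something the original states).
pkgs <- c("MASS", "dplyr", "ggplot2", "ggpubr")

# Report anything that is not installed instead of failing part-way through.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (!all(installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach whatever is available. MASS goes first so that dplyr's verbs
# are the ones found when names overlap.
for (p in pkgs[installed]) library(p, character.only = TRUE)

# Load the data set explicitly from MASS and inspect it.
data("housing", package = "MASS")
str(housing)
```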

1. First plot

# place code for vis here
# Number of residents (Freq) at each contact level, one panel per rental type
ggplot(housing, aes(x = Cont, y = Freq)) + 
  geom_point(shape = 9) + 
  geom_point(aes(color = Type)) + 
  facet_grid(~Type) + 
  labs(y = 'Number of residents', x = 'Contact with other residents', title = 'Contact with other residents by type of rental') 

#Summary: The plot shows how the number of residents at each level of contact with other residents varies by rental type. Apartments have the most residents reporting contact with each other, whereas Atriums have comparatively few.

2. Second plot

# place code for vis here
sa_2 = housing %>%
  group_by(Infl,Type) %>%
  summarise(Freq = sum(Freq))


ggplot(sa_2, aes(x = Infl, y = Freq)) +
  geom_bar(
    aes(fill = Type), stat = "identity", 
    position = position_dodge(0.8)
  ) + 
  facet_wrap(~Type) +
  labs(y = "Number of Residents",
       x = "Influence",
       title = "Influence vs Type of Rental")

#Summary: The plot shows the number of residents at each level of influence, by type of rental. In Apartments, more residents report medium influence than low or high influence. In Terraces, most residents report low influence, followed by medium and then high influence.

3. Third plot

# place code for vis here
sa_3 = housing %>%
  mutate( satis=ifelse(Sat=="Low",1,ifelse(Sat=="High",3,2))* Freq
  )%>%
  group_by(Type,Infl)%>%
  summarise( 
    satis=sum(satis),
    Freq=sum(Freq)
  )%>%
  mutate(avg=round(satis/Freq,2))

ggplot(sa_3, aes(x = avg, y = Infl)) + 
  geom_point(aes(size = Freq, color = Freq)) +
  facet_grid(Type ~ .) +
  labs(x = "Average Satisfaction Scores",
       y = "Influence",
       title = "Relationship between Influence, Rental Type and Average Satisfaction")

#Summary: The plot shows the relationship between Influence, Rental Type and Average Satisfaction Scores. The largest group lives in Apartments with medium influence and has an average score of around 2.12, whereas the smallest group in this comparison lives in Terraces with high influence and has an average satisfaction score just above 2.25.
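
The average score used in the third plot is a frequency-weighted mean: each satisfaction level is mapped to a score (Low = 1, Medium = 2, High = 3) and weighted by the number of residents in that row. The sketch below reproduces that calculation in base R as a cross-check. It assumes `housing` is the data set from MASS, and the names `score`, `cells` and `avg` are illustrative only.

```r
library(MASS)  # assumed source of the housing data frame

# Sat is an ordered factor Low < Medium < High, so its integer codes
# are exactly the scores 1, 2, 3 used in the third plot.
score <- as.integer(housing$Sat)

# Row indices for each Type x Infl cell (4 x 3 = 12 cells).
cells <- split(seq_len(nrow(housing)), list(housing$Type, housing$Infl))

# Frequency-weighted mean score per cell, i.e. sum(score * Freq) / sum(Freq).
avg <- sapply(cells, function(i) weighted.mean(score[i], housing$Freq[i]))
round(avg, 2)
```

Because the scores are arbitrary labels for ordered categories, the averages are only comparable between plots that use the same coding.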

4. Fourth plot

# place code for vis here
sa_4 =  housing %>%
  mutate(
    satis = ifelse(Sat == 'Low', 1, ifelse(Sat == 'High', 5, 3)) * Freq
  ) %>%
  group_by(Type, Cont) %>%
  summarise(
    satis = sum(satis),
    Freq = sum(Freq)
  ) %>%
  mutate(avg = round(satis/ Freq, 2))

ggplot(sa_4, aes(avg, Cont)) +
  geom_point(aes(size = Freq, color = Freq)) +
  facet_grid(Type ~ .) +
  scale_colour_gradient(low = "yellow", high = "red") +
  labs(x = "Average Satisfaction Scores",
       y = "Contacts",
       title = "Relationship between Contacts, Rental Type and Average Satisfaction")

#Summary: This plot shows the relationship between Contacts, Rental Type and Average Satisfaction Scores. Satisfaction is scored here as Low = 1, Medium = 3, High = 5. The largest group lives in Apartments with high contact and has an average satisfaction score of around 3.2, whereas the smallest group lives in Terraces with low contact and has an average satisfaction score of around 2.8.

5. Fifth plot

# place code for vis here
ggballoonplot(housing, x = 'Sat', y = 'Infl', size = 'Freq', facet.by = 'Type',
              fill = 'Freq', ggtheme = theme_classic2()) +
  scale_fill_viridis_c(option = 'D') +
  labs(x = "Satisfaction",
       y = "Influence",
       title = "Relationship between Influence, Rental Type and Satisfaction")

#Summary: This plot shows the relationship between Influence, Rental Type and Satisfaction. The largest group lives in Apartments with medium influence and a high satisfaction level, whereas the smallest group lives in Terraces with high influence and a low satisfaction level.
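
One caveat for the fifth plot: `housing` has two rows for every Sat, Infl and Type combination (one per level of Cont), so plotting the raw data likely draws two overlapping balloons in each cell rather than one total. A hedged sketch of summing over Cont first, assuming `housing` is the MASS data set (the name `agg` is illustrative):

```r
library(MASS)  # assumed source of the housing data frame

# Sum Freq over the two contact levels so that each
# Sat x Infl x Type combination appears exactly once.
agg <- aggregate(Freq ~ Sat + Infl + Type, data = housing, FUN = sum)

nrow(agg)  # 3 satisfaction x 3 influence x 4 rental types
head(agg)
```

Passing `agg` instead of `housing` to the balloon plot call should then give one balloon per cell, sized by the combined count.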