In this exercise you will learn to plot data using the ggplot2 package. To do so, you will work through Section 4.1, Categorical vs. Categorical, of Data Visualization with R and write up your own notes on it.

# Load package
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.2
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   0.8.3     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
# Load data
data(SaratogaHouses, package="mosaicData")
glimpse(SaratogaHouses)
## Observations: 1,728
## Variables: 16
## $ price           <int> 132500, 181115, 109000, 155000, 86060, 120000, 1…
## $ lotSize         <dbl> 0.09, 0.92, 0.19, 0.41, 0.11, 0.68, 0.40, 1.21, …
## $ age             <int> 42, 0, 133, 13, 0, 31, 33, 23, 36, 4, 123, 1, 13…
## $ landValue       <int> 50000, 22300, 7300, 18700, 15000, 14000, 23300, …
## $ livingArea      <int> 906, 1953, 1944, 1944, 840, 1152, 2752, 1662, 16…
## $ pctCollege      <int> 35, 51, 51, 51, 51, 22, 51, 35, 51, 44, 51, 51, …
## $ bedrooms        <int> 2, 3, 4, 3, 2, 4, 4, 4, 3, 3, 7, 3, 2, 3, 3, 3, …
## $ fireplaces      <int> 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ bathrooms       <dbl> 1.0, 2.5, 1.0, 1.5, 1.0, 1.0, 1.5, 1.5, 1.5, 1.5…
## $ rooms           <int> 5, 6, 8, 5, 3, 8, 8, 9, 8, 6, 12, 6, 4, 5, 8, 4,…
## $ heating         <fct> electric, hot water/steam, hot water/steam, hot …
## $ fuel            <fct> electric, gas, gas, gas, gas, gas, oil, oil, ele…
## $ sewer           <fct> septic, septic, public/commercial, septic, publi…
## $ waterfront      <fct> No, No, No, No, No, No, No, No, No, No, No, No, …
## $ newConstruction <fct> No, No, No, No, Yes, No, No, No, No, No, No, No,…
## $ centralAir      <fct> No, No, No, No, Yes, No, No, No, No, No, No, No,…

Q1 Stacked bar chart Plot the relationship between newConstruction and heating type.

ggplot(SaratogaHouses,
       aes(x = newConstruction,
           fill = heating)) +
    geom_bar(position = "stack")

Q2 What is the most common heating system overall? Discuss your reason.

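Hot air is the most common heating system overall. In the stacked bar chart, the hot air segments add up to the largest total count across the two bars.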
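The answer to Q2 can also be checked with a frequency table instead of reading it off the chart. The following is a minimal sketch, assuming the dplyr and mosaicData packages are installed; the name `heating_counts` is introduced here for illustration only.

```r
library(dplyr)

# Load the same data set used in the plots
data(SaratogaHouses, package = "mosaicData")

# Count houses per heating type, largest group first
heating_counts <- SaratogaHouses %>%
  count(heating, sort = TRUE)

heating_counts
```

The first row of `heating_counts` is the most common heating system, which should agree with the tallest combined segment in the stacked bar chart.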

Q3 Grouped bar chart Plot the relationship between newConstruction and heating type.

ggplot(SaratogaHouses,
       aes(x = newConstruction,
           fill = heating)) +
    geom_bar(position = "dodge")

Q4 Segmented bar chart Plot the relationship between newConstruction and heating type.

ggplot(SaratogaHouses,
       aes(x = newConstruction,
           fill = heating)) +
    geom_bar(position = "fill")

Q5 In which type of houses (new or old) is the proportion of hot air heating system higher? Discuss your reason.

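The proportion of hot air heating is higher in new houses. In the segmented bar chart, the hot air segment takes up a larger share of the bar for new construction than for older homes. Older homes also use hot air, but it makes up a smaller share of them.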
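The proportions behind the segmented bar chart can be computed directly, which makes the comparison in Q5 easier to justify. This is a minimal sketch, assuming dplyr and mosaicData are installed; `heating_props` and `prop` are names chosen here for illustration.

```r
library(dplyr)

data(SaratogaHouses, package = "mosaicData")

# Share of each heating type within each construction group
heating_props <- SaratogaHouses %>%
  count(newConstruction, heating) %>%
  group_by(newConstruction) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

heating_props
```

Comparing the `prop` values for hot air between the two `newConstruction` groups gives the same comparison that `position = "fill"` draws in the chart.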

Q6 Rename the construction type as new and old.

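ggplot(SaratogaHouses,
       # Name the levels explicitly so "Yes" maps to "new" and "No" to "old",
       # regardless of the order in which the factor levels are stored
       aes(x = factor(newConstruction,
                         levels = c("Yes", "No"),
                         labels = c("new", "old")),
           fill = heating)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")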
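Because `labels` in `factor()` is matched to the factor levels by position, it is worth confirming which original value ends up as new and which as old. The following is a minimal sketch of such a check, assuming the values of newConstruction are "Yes" and "No" as the glimpse output above suggests; `relabeled` and `check` are names introduced here.

```r
data(SaratogaHouses, package = "mosaicData")

# Relabel with explicit levels so the mapping does not depend on level order
relabeled <- factor(SaratogaHouses$newConstruction,
                    levels = c("Yes", "No"),
                    labels = c("new", "old"))

# Cross-tabulate the original values against the new labels
check <- table(original = SaratogaHouses$newConstruction,
               relabeled = relabeled)
check
```

If the mapping is right, all "Yes" rows fall in the new column and all "No" rows in the old column, with zeros elsewhere.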

Q7 Add labels to the axes.

Hint: See the code in 4.1.4 Improving the color and labeling.

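ggplot(SaratogaHouses,
       aes(x = factor(newConstruction,
                         levels = c("Yes", "No"),
                         labels = c("new", "old")),
           fill = heating)) +
  geom_bar(position = "fill") +
  # Show the 0-1 proportions as percentages so the axis matches its label
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percent",
       fill = "Heating",
       x = "Home Type",
       title = "Heating Type in New and Old Homes")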

Q8 Hide the messages and the code, but display results of the code from the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
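One possible way to approach Q8 is to set the chunk options once for the whole document rather than repeating them in every chunk. This is a minimal sketch using `knitr::opts_chunk$set()`, assuming the knitr package is installed and that the call is placed in a setup chunk near the top of the R Markdown file.

```r
# Set default options for every chunk that follows in the document
knitr::opts_chunk$set(
  echo = FALSE,       # hide the source code
  message = FALSE,    # hide messages such as the tidyverse startup banner
  results = "markup"  # still print results (this is knitr's default)
)
```

The same options can instead be written in an individual chunk header, for example `{r, echo=FALSE, message=FALSE}`, when only some chunks should be hidden.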

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.