This notebook demonstrates the use of dplyr and ggplot2 in creating plots of aggregates.

First load the libraries, filter the diamonds data frame to keep only rows whose x, y and z dimensions are all positive, and create the ppc variable (price per carat). Save the result as d.

library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats
# Keep rows with positive dimensions, then add price per carat
d <- diamonds %>% 
  filter(x > 0, y > 0, z > 0) %>%
  mutate(ppc = price / carat)

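Now plot the aggregates against two categorical variables: each point's size shows the number of diamonds in the group and its color shows the mean price per carat. There are six possibilities.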

d %>% 
  group_by(cut,color) %>% 
  summarize(meanPPC = mean(ppc),
            ndmnds = n()) %>% 
  ggplot(aes(x=cut,y=color)) +
     geom_point(aes(size = ndmnds,color = meanPPC))
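
The count of six can be checked with a short sketch. It assumes that the possibilities are the ordered (x, y) pairs drawn from the three categorical columns of diamonds (cut, color and clarity), so that swapping the axes counts as a different plot. The names cats and pairs are introduced here only for illustration.

```r
# The three categorical (ordered factor) columns of diamonds
cats <- c("cut", "color", "clarity")

# All ordered (x, y) pairs of distinct variables (assumed reading of "six")
pairs <- subset(
  expand.grid(x = cats, y = cats, stringsAsFactors = FALSE),
  x != y
)
pairs
nrow(pairs)
```

If axis order is ignored, `combn(cats, 2)` lists the three unordered pairs instead.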

Try another combination of variables.
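
As one possible answer, here is a sketch that swaps color for clarity. It rebuilds d the same way as above so that it runs on its own; the intermediate names agg and p are illustrative and not part of the original notebook.

```r
library(tidyverse)

# Rebuild d: positive dimensions only, plus price per carat
d <- diamonds %>%
  filter(x > 0, y > 0, z > 0) %>%
  mutate(ppc = price / carat)

# Aggregate by cut and clarity instead of cut and color
agg <- d %>%
  group_by(cut, clarity) %>%
  summarize(meanPPC = mean(ppc),
            ndmnds = n()) %>%
  ungroup()

# Point size shows the group count, point color the mean price per carat
p <- ggplot(agg, aes(x = cut, y = clarity)) +
  geom_point(aes(size = ndmnds, color = meanPPC))
p
```

Storing the summary in agg before plotting makes it easy to inspect the group counts and means directly, which helps when reading the point sizes and colors off the plot.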