Introduction to ggplot

This file walks you through the basic ggplot plot types and shows ways to make them more attractive. It corresponds to the notes:

ggplot 1 - introduction

Bar

What game has sold the fewest copies? Use geom_bar.

What game has the lowest median price? Use geom_col. Hint: create a new tibble called t2, group by game, and then use summarise to create a new column showing median price.
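As a rough sketch of the difference between the two geoms, the example below uses a small made-up tibble in place of the course data. The name `sales` and the columns `game` and `price` are assumptions; swap in whatever your data actually uses.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the course data: one row per sale.
sales <- tibble(
  game  = c("Heat", "Heat", "Heat", "Gloomhaven", "Gloomhaven", "Pandemic"),
  price = c(60, 55, 65, 120, 140, 35)
)

# geom_bar() counts the rows for each game, so the shortest bar
# belongs to the game with the fewest recorded sales.
p_bar <- ggplot(sales, aes(x = game)) +
  geom_bar()

# geom_col() plots a value we compute ourselves, here the median price.
t2 <- sales %>%
  group_by(game) %>%
  summarise(median_price = median(price))

p_col <- ggplot(t2, aes(x = game, y = median_price)) +
  geom_col()
```

Print `p_bar` or `p_col` to draw the plot. geom_bar only needs an x aesthetic because it does the counting itself, while geom_col needs both x and y.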

Smoothed line

What game has had the most price changes over time? Use geom_smooth.

What game has had the most price changes each year? Use geom_line. Hint: create a new tibble called t_line that groups by year and game, and then shows the median price.
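One possible shape for the t_line hint is sketched below on made-up data. The `sales` tibble and its `game`, `date`, and `price` columns are assumptions, and `year()` comes from lubridate.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Hypothetical stand-in for the course data.
sales <- tibble(
  game  = rep(c("Heat", "Pandemic"), each = 4),
  date  = rep(mdy(c("01/15/2021", "07/15/2021", "01/15/2022", "07/15/2022")),
              times = 2),
  price = c(60, 64, 70, 72, 35, 33, 40, 38)
)

# One row per year and game, holding the median price for that pair.
t_line <- sales %>%
  mutate(year = year(date)) %>%
  group_by(year, game) %>%
  summarise(median_price = median(price), .groups = "drop")

# Mapping colour to game draws one line per game.
p_line <- ggplot(t_line, aes(x = year, y = median_price, colour = game)) +
  geom_line()
```

Without the colour (or a group) mapping, geom_line would connect every point into a single zig-zag line.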

Point

What game has the most outlier values that could throw off our analysis? Use geom_point, and add some alpha transparency to make it easier to see over-plotting. Also try adding a little bit of jitter to see if that helps your results.
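A minimal sketch of alpha and jitter, using simulated prices rather than the course data. The data frame `sales` and its columns are assumptions, and the jitter width of 3 is an arbitrary value to experiment with.

```r
library(ggplot2)

# Simulated stand-in for the course data: weekly prices for three games.
set.seed(1)
sales <- data.frame(
  game  = rep(c("Heat", "Gloomhaven", "Pandemic"), each = 30),
  date  = rep(seq(as.Date("2021-01-01"), by = "week", length.out = 30),
              times = 3),
  price = c(rnorm(30, 60, 5), rnorm(30, 130, 15), rnorm(30, 35, 4))
)

# alpha below 1 makes stacked points look darker than isolated ones.
p_point <- ggplot(sales, aes(x = date, y = price, colour = game)) +
  geom_point(alpha = 0.3)

# geom_jitter() nudges each point by a small random amount.
# width is in x-axis units (days on a date axis); height = 0 leaves
# the prices themselves untouched.
p_jitter <- ggplot(sales, aes(x = date, y = price, colour = game)) +
  geom_jitter(alpha = 0.3, width = 3, height = 0)
```

Keeping height at 0 matters here: jittering the y values would move the very outliers you are trying to spot.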

Point + Line

Combining plots is also very helpful. First copy+paste your geom_point code, and then add geom_smooth on top of it as a second layer. Turn off the confidence band that geom_smooth draws around the line by setting se to FALSE.
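Layering might look something like the sketch below, again on simulated data with assumed column names (`game`, `date`, `price`).

```r
library(ggplot2)

# Simulated stand-in for the course data.
set.seed(1)
sales <- data.frame(
  game  = rep(c("Heat", "Pandemic"), each = 30),
  date  = rep(seq(as.Date("2021-01-01"), by = "week", length.out = 30),
              times = 2),
  price = c(rnorm(30, 60, 5), rnorm(30, 35, 4))
)

# Each geom added with + becomes a layer drawn on top of the previous one.
# Both layers inherit the aesthetics set in ggplot().
p_combo <- ggplot(sales, aes(x = date, y = price, colour = game)) +
  geom_point(alpha = 0.3) +
  geom_smooth(se = FALSE)  # se = FALSE hides the shaded band
```

Because the aesthetics live in the ggplot() call, geom_smooth fits one curve per game without any extra arguments.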

Density / Histogram

Make some plots that show the distribution of prices in our data. Try a histogram, density plot, and boxplot for each game. Note that the histogram will probably need facets to separate the games.
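The three plot types could be sketched as follows. The data is simulated, the names `sales`, `game`, and `price` are assumptions, and `bins = 20` is just a starting value.

```r
library(ggplot2)

# Simulated stand-in for the course data.
set.seed(1)
sales <- data.frame(
  game  = rep(c("Heat", "Gloomhaven", "Pandemic"), each = 50),
  price = c(rnorm(50, 60, 5), rnorm(50, 130, 15), rnorm(50, 35, 4))
)

# Histogram: one panel per game so the bars do not stack on each other.
p_hist <- ggplot(sales, aes(x = price)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ game)

# Density: overlapping curves stay readable with a transparent fill.
p_density <- ggplot(sales, aes(x = price, fill = game)) +
  geom_density(alpha = 0.4)

# Boxplot: game on x, price on y, one box per game.
p_box <- ggplot(sales, aes(x = game, y = price)) +
  geom_boxplot()
```

The boxplot draws points beyond the whiskers individually, which makes it a quick second check on the outlier question from the Point section.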

Point + Line + Facet

Facets work well with combined plots. Tweak the output from point + line to only include Heat, Gloomhaven, and Pandemic. Then, add facets by game.
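A sketch of filtering and then faceting is shown below. The data is simulated, the column names are assumptions, and "Some other game" is a placeholder for any game you want to leave out.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for the course data, with one extra game to drop.
set.seed(1)
games <- c("Heat", "Gloomhaven", "Pandemic", "Some other game")
sales <- tibble(
  game  = rep(games, each = 30),
  date  = rep(seq(as.Date("2021-01-01"), by = "week", length.out = 30),
              times = 4),
  price = c(rnorm(30, 60, 5), rnorm(30, 130, 15),
            rnorm(30, 35, 4), rnorm(30, 45, 6))
)

# Keep only the three games of interest before plotting.
t_subset <- sales %>%
  filter(game %in% c("Heat", "Gloomhaven", "Pandemic"))

# facet_wrap() gives each game its own panel.
p_facet <- ggplot(t_subset, aes(x = date, y = price)) +
  geom_point(alpha = 0.3) +
  geom_smooth(se = FALSE) +
  facet_wrap(~ game)
```

By default all panels share the same axes; `facet_wrap(~ game, scales = "free_y")` lets each panel pick its own y range if one expensive game squashes the others.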

ggplot 2 - Making a plot pretty

Line

Add a horizontal line at the median price value, and a yellow rectangle that goes from:

  • x: from min, to max date
  • y: from median price - standard deviation, to median price + standard deviation
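One way to draw the line and the rectangle described above is sketched here. It uses a tiny made-up data frame, and the use of `annotate()` for the rectangle is one option among several (geom_rect would also work).

```r
library(ggplot2)

# Hypothetical stand-in for the course data (a single game).
sales <- data.frame(
  date  = seq(as.Date("2021-01-01"), by = "month", length.out = 5),
  price = c(50, 60, 70, 80, 90)
)

# Compute the summary values once so they can be reused.
med_price <- median(sales$price)
sd_price  <- sd(sales$price)

p_band <- ggplot(sales, aes(x = date, y = price)) +
  # Rectangle first, so the points and line are drawn on top of it.
  annotate("rect",
           xmin = min(sales$date), xmax = max(sales$date),
           ymin = med_price - sd_price, ymax = med_price + sd_price,
           fill = "yellow", alpha = 0.3) +
  geom_point() +
  geom_hline(yintercept = med_price, linetype = "dashed")
```

Layer order matters: a rectangle added after geom_point would cover the points unless its alpha is low.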

Labels

Modify the following plot to have labels for the title, subtitle, caption, x, and y.
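All five labels can be set in a single `labs()` call. The sketch below uses its own made-up plot and placeholder label text, not the plot from the exercise.

```r
library(ggplot2)

# Hypothetical stand-in plot.
sales <- data.frame(
  date  = seq(as.Date("2021-01-01"), by = "month", length.out = 5),
  price = c(50, 60, 70, 80, 90)
)

p_labels <- ggplot(sales, aes(x = date, y = price)) +
  geom_point() +
  labs(
    title    = "Placeholder title",
    subtitle = "Placeholder subtitle",
    caption  = "Placeholder caption, e.g. the data source",
    x        = "Date",
    y        = "Price"
  )
```

The x and y arguments rename the axis titles only; they do not change the data or the tick labels.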

Limits

Set limits for the plot: from 0 to 200 for y, and from 2021 to 2022 for x.

Hint: use mdy to create the start/end dates.
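A sketch of the hint, on made-up data. The exact start and end dates are an assumption: here "2021 to 2022" is read as January 1, 2021 through December 31, 2022, so adjust them if your notes mean something else.

```r
library(ggplot2)
library(lubridate)

# Hypothetical stand-in for the course data.
sales <- data.frame(
  date  = seq(as.Date("2021-01-01"), by = "month", length.out = 5),
  price = c(50, 60, 70, 80, 90)
)

# mdy() parses month/day/year text into Date values.
start_date <- mdy("01/01/2021")
end_date   <- mdy("12/31/2022")

p_limits <- ggplot(sales, aes(x = date, y = price)) +
  geom_point() +
  xlim(start_date, end_date) +
  ylim(0, 200)
```

Note that xlim and ylim drop any rows outside the limits before the plot is drawn (with a warning), which can change a smoothed line. `coord_cartesian()` zooms without dropping data.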

Scales

Add scales. Use scale_x_date and scale_y_continuous.
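The two scale functions might be used as in the sketch below. The data is made up, and the break spacing, date format, and dollar labels are all assumptions to adapt to your plot.

```r
library(ggplot2)

# Hypothetical stand-in for the course data.
sales <- data.frame(
  date  = seq(as.Date("2021-01-01"), by = "month", length.out = 12),
  price = seq(40, 150, by = 10)
)

p_scales <- ggplot(sales, aes(x = date, y = price)) +
  geom_point() +
  # A tick every 3 months, labelled like "Jan 2021".
  scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
  # A tick every 50 units, labelled as currency (assumes prices in dollars).
  scale_y_continuous(breaks = seq(0, 200, by = 50),
                     labels = scales::label_dollar())
```

Both functions also accept a `limits` argument, so the limits from the previous section can be set here instead of with xlim and ylim.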