knitr::opts_chunk$set(echo = FALSE,
                      fig.align = "center",
                      fig.height = 6,
                      fig.width = 8,
                      message = FALSE,
                      warning = FALSE)

# Loading the needed packages: ggfittext will help with adding long text to a graph
pacman::p_load(tidyverse, ggfittext)

# Reading in the TV episode data set
tv_shows <- read.csv("TV Show Episodes.csv")

For this R script, we’ll be creating a tile plot (sometimes called a heat map) for episodes of different TV shows.

My fiancée and I have been watching Mad Men recently, so we’ll start by using it as our example.

Let’s create a data set named episodes that contains only the episodes from Mad Men.
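One way to do the filtering step is sketched below. The small tv_shows table is a made-up stand-in so the example runs on its own. The show and episode columns appear in this document, while episode_number and imdb_rating are assumed names that may not match the real CSV.

```r
library(dplyr)

# Placeholder stand-in for tv_shows: every value is invented for illustration.
# `episode_number` and `imdb_rating` are assumed column names.
tv_shows <- tibble(
  show = c("Mad Men", "Mad Men", "Dexter"),
  season = c(1L, 1L, 1L),
  episode_number = c(1L, 2L, 1L),
  episode = c("Placeholder title A", "Placeholder title B", "Placeholder title C"),
  imdb_rating = c(8.0, 8.5, 9.0)
)

# Keep only the rows whose show is Mad Men
episodes <- tv_shows |>
  filter(show == "Mad Men")

episodes
```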

Next, create a tile plot that has:

using geom_tile(). Save the result as gg_MM.
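A minimal sketch of the geom_tile() step follows. The aesthetic mapping is an assumption: episode number on the x-axis, season on the y-axis, and rating as the fill. The episodes data frame uses placeholder values and the assumed column names episode_number and imdb_rating.

```r
library(ggplot2)

# Placeholder data: titles and ratings are invented for illustration only.
episodes <- data.frame(
  season = c(1, 1, 2, 2),
  episode_number = c(1, 2, 1, 2),
  episode = c("Short", "A much longer placeholder episode title",
              "Another placeholder", "Last one"),
  imdb_rating = c(8.0, 8.5, 7.9, 9.1)
)

# factor() keeps both axes discrete, so each episode gets exactly one tile
gg_MM <- ggplot(
  data = episodes,
  mapping = aes(x = factor(episode_number),
                y = factor(season),
                fill = imdb_rating)
) +
  geom_tile(color = "white") +
  labs(x = "Episode", y = "Season", fill = "IMDB rating")

gg_MM
```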

You can add the title of the episode to each tile by adding geom_text(mapping = aes(label = episode)) to gg_MM.

Oh, it doesn’t look like the text wants to stay in its tile, especially if the episode title is long.

That’s where the ggfittext package comes to the rescue!

Copy and paste the code from the previous code chunk, but replace geom_text() with geom_fit_text()

Fits better, but some words are really small, and if the title is too long, it’s not added at all! So what’s happening?

By default, geom_fit_text() won’t “wrap” the text onto a new line, which would be really helpful for long titles. You can include the reflow = TRUE argument to make the text start a new line if it is too long. Additionally, you should always include contrast = TRUE to make sure the color of the text is readable regardless of the color of the background!
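The two arguments can be combined as in the sketch below. It rebuilds a small tile plot from placeholder data so it runs on its own. The mapping and the column names episode_number and imdb_rating are assumptions, and gg_MM_fit is just an illustrative name.

```r
library(ggplot2)
library(ggfittext)

# Placeholder data: titles and ratings are invented for illustration only.
episodes <- data.frame(
  season = c(1, 1, 2, 2),
  episode_number = c(1, 2, 1, 2),
  episode = c("Short", "A much longer placeholder episode title",
              "Another placeholder", "Last one"),
  imdb_rating = c(8.0, 8.5, 7.9, 9.1)
)

# fill is mapped in ggplot() so the text layer inherits it,
# which lets contrast = TRUE compare the text against the tile color
gg_MM <- ggplot(
  data = episodes,
  mapping = aes(x = factor(episode_number),
                y = factor(season),
                fill = imdb_rating)
) +
  geom_tile(color = "white")

# reflow = TRUE wraps long titles; contrast = TRUE adjusts the text color
gg_MM_fit <- gg_MM +
  geom_fit_text(mapping = aes(label = episode),
                reflow = TRUE,
                contrast = TRUE)

gg_MM_fit
```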

Unfortunately, it’s not a perfect solution. If the tile is small and the text is long, it still won’t fit :(

Tile plots for summarized data

Let’s create a tile plot for the average episode rating per season for 5 different shows:

  1. Mad Men
  2. Breaking Bad
  3. Dexter
  4. The Sopranos
  5. Game of Thrones

Start by calculating the average IMDB rating for each season. For example, Breaking Bad season 1 had an average of 8.7.

## # A tibble: 34 × 3
##    show         season season_rating
##    <chr>         <int>         <dbl>
##  1 Breaking Bad      1          8.7 
##  2 Breaking Bad      2          8.79
##  3 Breaking Bad      3          8.74
##  4 Breaking Bad      4          8.96
##  5 Breaking Bad      5          9.38
##  6 Dexter            1          8.74
##  7 Dexter            2          8.78
##  8 Dexter            3          8.44
##  9 Dexter            4          8.89
## 10 Dexter            5          8.57
## # ℹ 24 more rows
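A summary with this shape could be produced roughly as follows. This is a sketch on invented placeholder ratings, and imdb_rating is an assumed name for the per-episode rating column. The names shows_to_keep and season_ratings are also illustrative.

```r
library(dplyr)

# Placeholder episode-level data: ratings are invented, not taken from the CSV.
tv_shows <- tibble(
  show = c("Mad Men", "Mad Men", "Dexter", "Dexter", "Dexter", "Some Other Show"),
  season = c(1L, 1L, 1L, 2L, 2L, 1L),
  imdb_rating = c(8.0, 9.0, 7.0, 8.0, 8.5, 5.0)
)

shows_to_keep <- c("Mad Men", "Breaking Bad", "Dexter",
                   "The Sopranos", "Game of Thrones")

# Keep the five shows, then average the episode ratings within each season
season_ratings <- tv_shows |>
  filter(show %in% shows_to_keep) |>
  group_by(show, season) |>
  summarize(season_rating = mean(imdb_rating), .groups = "drop")

season_ratings
```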

Use the data set created above to create a tile plot with:

and include the average season rating in each tile, rounded to 1 decimal place.
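A possible version of this plot is sketched below, using a few of the season averages printed earlier. The mapping is an assumption (season on the x-axis, show on the y-axis, rating as the fill). sprintf() is used instead of round() so that a value such as 9 would still print as 9.0.

```r
library(ggplot2)

# A subset of the season averages shown in the printed tibble
season_ratings <- data.frame(
  show = rep(c("Breaking Bad", "Dexter"), each = 3),
  season = rep(1:3, times = 2),
  season_rating = c(8.70, 8.79, 8.74, 8.74, 8.78, 8.44)
)

# One tile per show and season, labeled with the rating to 1 decimal place
gg_seasons <- ggplot(
  data = season_ratings,
  mapping = aes(x = factor(season), y = show, fill = season_rating)
) +
  geom_tile(color = "white") +
  geom_text(mapping = aes(label = sprintf("%.1f", season_rating)),
            color = "white") +
  labs(x = "Season", y = NULL, fill = "Average rating")

gg_seasons
```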