knitr::opts_chunk$set(echo = FALSE,
                      fig.align = "center",
                      fig.height = 6,
                      fig.width = 8,
                      message = FALSE,
                      warning = FALSE)
# Loading the needed packages: ggfittext will help with adding long text to a graph
pacman::p_load(tidyverse, ggfittext)
# Reading in the TV episode data set and converting column names to snake_case
tv_shows <-
  read.csv("TV Show Episodes.csv") |>
  janitor::clean_names()
For this R script, we’ll be creating a tile plot (sometimes called a heat map) for episodes of different TV shows.
Let’s create a data set named episodes that contains only the episodes from Rick and Morty.
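One way to do this is with dplyr's filter(). The sketch below is self-contained, so it builds a tiny stand-in for tv_shows with made-up titles and ratings. It also assumes the show column holds the exact string "Rick and Morty", which you should check against the real file.

```r
library(dplyr)

# Stand-in for tv_shows: titles and ratings are placeholders, not real data
tv_shows <- data.frame(
  show           = c("Rick and Morty", "Rick and Morty", "Some Other Show"),
  season_number  = c(1, 1, 1),
  episode_number = c(1, 2, 1),
  episode        = c("Title A", "Title B", "Title C"),
  ep_rating      = c(8.0, 8.5, 7.0)
)

# Keep only the Rick and Morty rows
episodes <- tv_shows |>
  filter(show == "Rick and Morty")

episodes
```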
Next, create a tile plot using geom_tile() that has:
x = season number (season_number)
y = episode number (episode_number)
fill = episode rating (ep_rating)
Save the result as gg_RM.
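A minimal sketch of that plot is below. The episodes data frame here is a placeholder with made-up ratings so the code runs on its own, and the white tile border is an optional extra that the exercise does not ask for.

```r
library(ggplot2)

# Placeholder data: two seasons of three episodes with made-up ratings
episodes <- data.frame(
  season_number  = rep(1:2, each = 3),
  episode_number = rep(1:3, times = 2),
  ep_rating      = c(8.0, 8.5, 7.9, 9.1, 8.3, 7.2)
)

gg_RM <- ggplot(
  data = episodes,
  mapping = aes(x = season_number, y = episode_number, fill = ep_rating)
) +
  geom_tile(color = "white")  # optional border to separate neighbouring tiles

gg_RM
```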
We can use different scale_fill_ functions to change how the color representing rating is displayed. scale_fill_viridis_c() will use a range of colors on a continuous (gradient) scale. Include limits = c(1, 10), since IMDB ratings can be anywhere from 1 to 10.
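As a rough illustration, the scale can be added to the saved plot like this. The data are again made-up placeholders, and gg_viridis is just a name chosen for this sketch.

```r
library(ggplot2)

# Placeholder data with made-up ratings
episodes <- data.frame(
  season_number  = rep(1:2, each = 3),
  episode_number = rep(1:3, times = 2),
  ep_rating      = c(8.0, 8.5, 7.9, 9.1, 8.3, 7.2)
)

gg_RM <- ggplot(
  episodes,
  aes(x = season_number, y = episode_number, fill = ep_rating)
) +
  geom_tile()

# Pin the colour bar to the full 1-10 range instead of the observed range
gg_viridis <- gg_RM +
  scale_fill_viridis_c(limits = c(1, 10))

gg_viridis
```

Without limits, the gradient would stretch only across the ratings that appear in the data, so small differences would look more dramatic.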
Sometimes we don’t necessarily want a continuous variable to be displayed with a gradient. You can have ggplot divide the rating column into groups using scale_fill_stepsn(), which lets you pick how many color “bins” the ratings are divided into.
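Here is one possible setup, assuming one bin per whole rating point between 5 and 10. The break points, limits, and palette are choices made for this sketch, and the three ratings are placeholders picked to show the binning.

```r
library(ggplot2)

# Placeholder ratings: two fall in the 7-8 bin, one in the 8-9 bin
episodes <- data.frame(
  season_number  = c(1, 1, 1),
  episode_number = c(1, 2, 3),
  ep_rating      = c(7.1, 7.9, 8.1)
)

gg_RM <- ggplot(
  episodes,
  aes(x = season_number, y = episode_number, fill = ep_rating)
) +
  geom_tile()

gg_binned <- gg_RM +
  scale_fill_stepsn(
    colours = hcl.colors(5, "viridis"),  # five colours for five bins
    breaks  = 6:9,                       # bin edges inside the limits
    limits  = c(5, 10)
  )

gg_binned
```

Ratings below 5 would fall outside these limits and be drawn in the NA color, so widen limits if the data need it.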
The downside to using binned groups instead of a gradient for color or fill is that you lose a little specificity (a 7.1 and a 7.9 will both be yellow, even though they are fairly far apart), but it does separate groups more clearly (a 7 is markedly different from an 8).
You can add the title of the episode to each tile by adding geom_text(mapping = aes(label = episode)) to gg_RM.
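Put together, that might look like the sketch below. The titles are invented placeholders, with a couple of long ones included on purpose, and gg_text is a name made up for this example.

```r
library(ggplot2)

# Placeholder titles and ratings
episodes <- data.frame(
  season_number  = c(1, 1, 2, 2),
  episode_number = c(1, 2, 1, 2),
  episode        = c("Short Title",
                     "A Much Longer Placeholder Episode Title",
                     "Another Title",
                     "Yet Another Fairly Long Episode Title"),
  ep_rating      = c(8.0, 8.5, 9.1, 7.2)
)

gg_RM <- ggplot(
  episodes,
  aes(x = season_number, y = episode_number, fill = ep_rating)
) +
  geom_tile()

# geom_text() draws each label at a fixed size, centred on the tile
gg_text <- gg_RM +
  geom_text(mapping = aes(label = episode))

gg_text
```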
Oh, it doesn’t look like the text wants to stay inside its tile, especially if the episode title is long.
That’s where the ggfittext package comes to the rescue!
Copy and paste the code from the previous code chunk, but replace geom_text() with geom_fit_text().
Fits better, but some words are really small, and if the title is too long, it’s not added at all! So what’s happening?
By default, geom_fit_text() won’t “wrap” the text around to a new line, which would be really helpful for long titles. You can include the reflow = TRUE argument to make the text start a new line if it is too long. Additionally, you should always include contrast = TRUE to make sure the color of the text is readable regardless of the color of the background!
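A hedged sketch with both arguments is below. As I understand the ggfittext documentation, the box for each label is taken from the axis levels when the axes are discrete, so this example maps the season and episode numbers as factors. That is an assumption of this sketch and not necessarily how the original chunk was written. The titles and ratings are placeholders.

```r
library(ggplot2)
library(ggfittext)

# Placeholder titles and ratings
episodes <- data.frame(
  season_number  = c(1, 1, 2, 2),
  episode_number = c(1, 2, 1, 2),
  episode        = c("Short Title",
                     "A Much Longer Placeholder Episode Title",
                     "Another Title",
                     "Yet Another Fairly Long Episode Title"),
  ep_rating      = c(8.0, 8.5, 9.1, 7.2)
)

gg_fit <- ggplot(
  episodes,
  aes(x = factor(season_number), y = factor(episode_number),
      fill = ep_rating, label = episode)
) +
  geom_tile() +
  # reflow wraps long titles; contrast switches text colour on dark fills
  geom_fit_text(reflow = TRUE, contrast = TRUE) +
  labs(x = "season_number", y = "episode_number")

gg_fit
```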
Unfortunately it’s not a perfect solution. If the tile is small and the text is long, it still won’t fit :(
Let’s create a tile plot for the average episode rating per season for 5 different dramas:
Start by calculating the average IMDB rating for each season. For example, Breaking Bad season 1 had an average of 8.7.
##               show season_number season_rating
## 1     The Sopranos             1      8.584615
## 2     The Sopranos             2      8.592308
## 3     The Sopranos             3      8.707692
## 4     The Sopranos             4      8.392308
## 5     The Sopranos             5      8.700000
## 6     The Sopranos             6      8.700000
## 7           Dexter             1      8.741667
## 8           Dexter             2      8.766667
## 9           Dexter             3      8.425000
## 10          Dexter             4      8.883333
## 11          Dexter             5      8.558333
## 12          Dexter             6      8.166667
## 13          Dexter             7      8.691667
## 14          Dexter             8      7.450000
## 15         Mad Men             1      8.153846
## 16         Mad Men             2      8.192308
## 17         Mad Men             3      8.453846
## 18         Mad Men             4      8.661538
## 19         Mad Men             5      8.615385
## 20         Mad Men             6      8.261538
## 21         Mad Men             7      8.621429
## 22    Breaking Bad             1      8.700000
## 23    Breaking Bad             2      8.784615
## 24    Breaking Bad             3      8.746154
## 25    Breaking Bad             4      8.969231
## 26    Breaking Bad             5      9.387500
## 27 Game of Thrones             1      8.970000
## 28 Game of Thrones             2      8.820000
## 29 Game of Thrones             3      8.940000
## 30 Game of Thrones             4      9.230000
## 31 Game of Thrones             5      8.700000
## 32 Game of Thrones             6      8.990000
## 33 Game of Thrones             7      9.028571
## 34 Game of Thrones             8      6.416667
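A summary like the one above can be computed with group_by() and summarise(). The sketch below uses placeholder shows and made-up ratings so it runs on its own. The names dramas and season_ratings are invented for this example, and the real code would list the five dramas shown in the table.

```r
library(dplyr)

# Stand-in for tv_shows with made-up ratings
tv_shows <- data.frame(
  show          = c("Show A", "Show A", "Show A", "Show B", "Show B", "Show C"),
  season_number = c(1, 1, 2, 1, 1, 1),
  ep_rating     = c(8.0, 9.0, 7.5, 6.0, 7.0, 5.0)
)

# Shows to keep (placeholders for the five dramas)
dramas <- c("Show A", "Show B")

season_ratings <- tv_shows |>
  filter(show %in% dramas) |>
  group_by(show, season_number) |>
  summarise(season_rating = mean(ep_rating), .groups = "drop")

season_ratings
```

Note that group_by() sorts the groups, so the rows come back in alphabetical order by show and not in the order the shows appear in the data.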
Use the data set created above to create a tile plot with:
x = Season
y = TV Show
fill = average season rating
and include the average season rating in each tile, rounded to 1 decimal place.
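One way to sketch this plot is shown below, using only the Breaking Bad and Game of Thrones rows copied from the table above to keep the example short. The label uses sprintf("%.1f", ...) instead of round() so that a value such as 8.97 prints as "9.0" and not "9". That formatting choice, the scale limits, and the name gg_seasons are assumptions of this sketch.

```r
library(ggplot2)

# Season averages copied from the summary table (two of the five shows)
season_ratings <- data.frame(
  show = c(rep("Breaking Bad", 5), rep("Game of Thrones", 8)),
  season_number = c(1:5, 1:8),
  season_rating = c(8.700000, 8.784615, 8.746154, 8.969231, 9.387500,
                    8.970000, 8.820000, 8.940000, 9.230000, 8.700000,
                    8.990000, 9.028571, 6.416667)
)

gg_seasons <- ggplot(
  season_ratings,
  aes(x = season_number, y = show, fill = season_rating)
) +
  geom_tile(color = "white") +
  # Always show exactly one decimal place in the tile label
  geom_text(aes(label = sprintf("%.1f", season_rating)), color = "white") +
  scale_x_continuous(breaks = 1:8) +
  scale_fill_viridis_c(limits = c(1, 10)) +
  labs(x = "Season", y = "TV Show", fill = "Average rating")

gg_seasons
```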