For this week, my goals were:
library(tidytuesdayR)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ dplyr 1.0.6
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
records <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-05-25/records.csv')
summary(records)
## track type shortcut player
## Length:2334 Length:2334 Length:2334 Length:2334
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## system_played date time_period time
## Length:2334 Length:2334 Length:2334 Min. : 14.59
## Class :character Class :character Class :character 1st Qu.: 39.03
## Mode :character Mode :character Mode :character Median : 86.19
## Mean : 90.62
## 3rd Qu.:120.16
## Max. :375.83
## record_duration
## Min. : 0.0
## 1st Qu.: 6.0
## Median : 51.0
## Mean : 220.8
## 3rd Qu.: 198.8
## Max. :3659.0
ggplot(records) + geom_point(aes(x = track, y = time))
After creating a scatterplot, I thought the data would be better suited to a boxplot:
ggplot(records) + geom_boxplot(aes(x = track, y = time))
I wasn’t a huge fan of how congested the x-axis was, so I decided to swap the axes to make the track names easier to read!
ggplot(records) + geom_boxplot(aes(x = time, y = track))
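Another way to get the same horizontal layout is to keep track on the x-axis and flip the coordinate system afterwards. The sketch below assumes the `records` data frame loaded above; `flipped_boxplot` is a hypothetical helper name, not something from the original analysis.

```r
library(ggplot2)

# Hypothetical helper (not part of the original post): keeps track on x and
# time on y, then flips the coordinate system so the track names end up on
# the vertical axis.
flipped_boxplot <- function(df) {
  ggplot(df) +
    geom_boxplot(aes(x = track, y = time)) +
    coord_flip()
}

# Draw the plot only if the records data frame from above is available.
if (exists("records")) print(flipped_boxplot(records))
```

Swapping the aesthetics directly, as above, is shorter; `coord_flip()` is mainly useful when a geom only works in one orientation.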
I also mapped the variable ‘shortcut’ to the fill aesthetic, to show whether a record was set using a shortcut and whether race times were affected by such shortcuts:
ggplot(records) + geom_boxplot(aes(x = time, y = track, fill = shortcut))
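To put numbers next to the boxplot, the median time for each track and shortcut combination can be tabulated. This is a minimal sketch using base R's `aggregate()`, assuming the `records` data frame loaded above; `median_by_shortcut` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical helper (not part of the original post): median record time
# for every track/shortcut combination present in the data.
median_by_shortcut <- function(df) {
  aggregate(time ~ track + shortcut, data = df, FUN = median)
}

# Show the first few rows only if the records data frame from above is available.
if (exists("records")) print(head(median_by_shortcut(records)))
```

Comparing the two rows for a given track gives a rough sense of how much time a shortcut saves there, although the medians mix records from different years.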