Data was collected from Tidy Tuesday R (07/27/21). This data is originally from Kaggle and is a historical dataset on the modern Olympic Games. It should be noted that the Winter and Summer Games were held in the same year until 1992. This data contains information on athletes' statistics, events, and medals.
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble  3.1.2     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## v purrr   0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
## 
## -- Column specification --------------------------------------------------------
## cols(
##   id = col_double(),
##   name = col_character(),
##   sex = col_character(),
##   age = col_double(),
##   height = col_double(),
##   weight = col_double(),
##   team = col_character(),
##   noc = col_character(),
##   games = col_character(),
##   year = col_double(),
##   season = col_character(),
##   city = col_character(),
##   sport = col_character(),
##   event = col_character(),
##   medal = col_character()
## )
Data Cleaning
I wanted to take a closer look at this data to specifically learn about the Women’s Triathlon Event. After looking just at medalists for the Women’s Triathlon Event, I was surprised to learn that this event has only been a part of the Summer Olympic Games since 2000.
olympic_medalists <- olympics[!is.na(olympics$medal),]
olympic_medalists_women_triathlon <- olympic_medalists[olympic_medalists$event == "Triathlon Women's Olympic Distance",]
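The same subset can also be written with dplyr, which is already attached through the tidyverse. The following is only a minimal sketch: `toy_olympics` and its rows are made up for illustration (they are not real results) so that the example runs on its own. One difference worth knowing is that `filter()` drops rows where the condition evaluates to `NA`, while base `[` subsetting would return all-`NA` rows for them.

```r
library(dplyr)

# Hypothetical toy rows standing in for the real `olympics` data frame
toy_olympics <- tibble(
  name  = c("A", "B", "C", "D"),
  event = c("Triathlon Women's Olympic Distance",
            "Triathlon Women's Olympic Distance",
            "Triathlon Men's Olympic Distance",
            NA),
  medal = c("Gold", NA, "Silver", "Bronze")
)

# Keep only medalists in the women's triathlon.
# Rows with a missing medal or a missing event are dropped.
toy_women_triathlon <- toy_olympics %>%
  filter(!is.na(medal),
         event == "Triathlon Women's Olympic Distance")
```

With the real data, `toy_olympics` would be replaced by `olympics`.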
Plotting the Data
data = olympic_medalists_women_triathlon
ggplot(olympic_medalists_women_triathlon, aes(x = year, y = age, colour = medal)) +
  geom_point() +
  geom_line() +
  scale_colour_manual(labels = c("Bronze", "Gold", "Silver"), values = c("yellow", "blue", "red")) +
  xlab("Year") + ylab("Age") +
  labs(title = "Age of Women's Triathlon Medalists from 2000-2016") +
  theme(legend.title = element_blank())

From this plot, I can conclude that the medalists in 2008 were younger than those in the other years of competition. I can also see that the oldest woman to win gold in the triathlon event was 34 and won in Athens in 2004. The youngest woman to win gold was 27 and won in Beijing in 2008. This data shows that women aged 22 to 35 have placed as medalists from 2000 to 2016. While the medalists were youngest in Beijing in 2008, their ages have risen in the Games since, through Rio in 2016.
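Readings taken from a plot like this can be checked against a per-year summary table. The sketch below assumes a data frame with `year`, `medal`, and `age` columns. `toy_medalists` and its ages are hypothetical values for illustration only, not the real medalists' ages.

```r
library(dplyr)

# Hypothetical ages for illustration only, not the real medalists' ages
toy_medalists <- tibble(
  year  = c(2000, 2000, 2000, 2004, 2004, 2004),
  medal = c("Gold", "Silver", "Bronze", "Gold", "Silver", "Bronze"),
  age   = c(30, 28, 26, 33, 29, 31)
)

# Youngest, oldest, and mean medalist age for each Games
age_by_year <- toy_medalists %>%
  group_by(year) %>%
  summarise(
    youngest = min(age),
    oldest   = max(age),
    mean_age = mean(age),
    .groups  = "drop"
  )
```

With the real data, `toy_medalists` would be replaced by `olympic_medalists_women_triathlon`, and the resulting table would show directly which year had the youngest and oldest groups of medalists.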