Data Cleaning
I wanted to take a closer look at this data to learn specifically about the Women’s Triathlon event. After filtering down to just the medalists for that event, I was surprised to learn that it has been part of the Summer Olympic Games only since 2000.
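# Keep only the rows for athletes who won a medal
olympic_medalists <- olympics[!is.na(olympics$medal),]
# Narrow the medalists down to the women's triathlon event
olympic_medalists_women_triathlon <- olympic_medalists[olympic_medalists$event == "Triathlon Women's Olympic Distance",]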
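One way to double-check the "since 2000" observation is to list the Games years in which the event appears. The sketch below is only an illustration: `event_years` is a hypothetical helper that is not part of the original analysis, and it assumes the data frame has `event` and `year` columns like the ones used above.

```r
# Hypothetical helper (not in the original post): list the Games years in
# which a given event appears among the medalists.
event_years <- function(medalists, event_name) {
  # which() drops NA comparisons, so rows with a missing event are ignored.
  rows <- which(medalists$event == event_name)
  sort(unique(medalists$year[rows]))
}

# Only runs when the medalists data frame from the step above is available.
if (exists("olympic_medalists")) {
  print(event_years(olympic_medalists, "Triathlon Women's Olympic Distance"))
}
```

If the first year printed is 2000, that matches what the filtered data suggested.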
Plotting the Data
library(ggplot2)
# Medal levels sort alphabetically (Bronze, Gold, Silver), so name each colour explicitly
medal_colours <- c(Bronze = "#CD7F32", Gold = "gold", Silver = "grey60")
ggplot(olympic_medalists_women_triathlon, aes(x = year, y = age, colour = medal)) +
  geom_point() +
  geom_line() +
  scale_colour_manual(values = medal_colours) +
  labs(title = "Age of Women's Triathlon Medalists from 2000-2016", x = "Year", y = "Age") +
  theme(legend.title = element_blank())

From this plot, I can see that the medalists in 2008 were younger than those in any other year of competition. The oldest woman to win gold in the triathlon was 34 and won in Athens in 2004, while the youngest woman to win gold was 27 and won in Beijing in 2008. Overall, women aged 22 to 35 have placed as medalists between 2000 and 2016. After the low point in Beijing in 2008, the medalists' ages rose again, and the 2016 Rio medalists were the oldest group.
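These readings come from eyeballing the plot, so a small per-year summary can help confirm them. The following is a minimal sketch in base R, assuming the filtered data frame has numeric `year` and `age` columns. `summarise_ages_by_year` is a hypothetical helper name introduced here for illustration, not something from the original analysis.

```r
# Hypothetical helper (not in the original post): youngest, mean, and oldest
# medalist age for each Games year.
summarise_ages_by_year <- function(medalists) {
  # Split the ages into one vector per year (years come out in ascending order).
  ages_by_year <- split(medalists$age, medalists$year)
  data.frame(
    year = as.integer(names(ages_by_year)),
    youngest = vapply(ages_by_year, min, numeric(1), na.rm = TRUE),
    mean_age = vapply(ages_by_year, mean, numeric(1), na.rm = TRUE),
    oldest = vapply(ages_by_year, max, numeric(1), na.rm = TRUE),
    row.names = NULL
  )
}

# Only runs when the filtered data frame from the cleaning step is available.
if (exists("olympic_medalists_women_triathlon")) {
  print(summarise_ages_by_year(olympic_medalists_women_triathlon))
}
```

Comparing the `mean_age` and `oldest` columns across years is a quick way to check which Games had the youngest and oldest groups of medalists.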