# Install once per environment; this single call covers every package used below
install.packages(c("gganimate", "gifski", "magick", "tidyverse"))
## Installing packages into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)
library(tidyverse)   # dplyr, ggplot2, and readr are used throughout

library(gganimate)

library(gifski)

library(magick)
## Linking to ImageMagick 6.9.10.23
## Enabled features: fontconfig, freetype, fftw, lcms, pango, webp, x11
## Disabled features: cairo, ghostscript, heic, raw, rsvg
## Using 16 threads

This is the dataset you will be working with:

olympics <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-07-27/olympics.csv')



olympics_alpine <- olympics %>% 
  filter(!is.na(weight)) %>%             # only keep athletes with known weight
  filter(sport == "Alpine Skiing") %>%   # keep only alpine skiers
  mutate(
    medalist = case_when(                # add column flagging medal winners
      is.na(medal) ~ FALSE,              # NA values go to FALSE
      !is.na(medal) ~ TRUE               # non-NA values (Gold, Silver, Bronze) go to TRUE
    )
  )
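
The `case_when()` call above sends `NA` medals to `FALSE` and everything else to `TRUE`, which should be equivalent to the one-line expression `!is.na(medal)`. The sketch below checks that on a small toy tibble; `toy`, its rows, and the two comparison columns are hypothetical and are not taken from the real data.

```r
library(dplyr)

# Hypothetical stand-in for a few rows of `olympics` (values invented for illustration)
toy <- tibble(
  name  = c("A", "B", "C", "D"),
  medal = c("Gold", NA, "Bronze", NA)
)

flagged <- toy %>%
  mutate(
    # same two-branch logic as in the project code
    medalist_case_when = case_when(
      is.na(medal)  ~ FALSE,
      !is.na(medal) ~ TRUE
    ),
    # one-line equivalent
    medalist_short = !is.na(medal)
  )

# Compare the two columns row by row
identical(flagged$medalist_case_when, flagged$medalist_short)
```

If the two columns agree on the real data as well, the shorter form could replace the `case_when()` without changing any plot.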

olympics_alpine is a subset of olympics and contains only the data for alpine skiers. More information about the original olympics dataset can be found at https://github.com/rfordatascience/tidytuesday/tree/master/data/2021/2021-07-27/readme.md and https://www.sports-reference.com/olympics.html.

For this project, use olympics_alpine to answer Questions 1, 2, and 3, about the weights of alpine skiers. For Question 4, you should use the FULL dataset, olympics.

  1. Are there weight differences for male and female Olympic skiers who were successful or not in earning a medal?

The violin plot shows that male alpine skiers tend to weigh more than female skiers, with both groups displaying a roughly symmetric distribution. In the right facet, medalists appear to cluster around slightly higher weights than non-medalists, particularly among men. This may suggest that higher body mass, potentially associated with greater strength and momentum, provides a competitive advantage in alpine events.

  2. Are there weight differences for skiers who competed in different alpine skiing events?

The box plots show clear differences across alpine events. Men’s events consistently show higher median weights (around 75–85 kg), while women’s events center closer to 60 kg. Events emphasizing speed and power, such as Downhill and Super G, show higher weights, whereas technical events like Slalom show lower weights. This pattern matches the physical demands of each event, where greater mass can improve stability and acceleration on long, high-speed courses.

  3. How has the weight distribution of alpine skiers changed over the years?

The box plot over time illustrates a steady increase in athlete weight from the earliest alpine competitions in the data (the sport joined the Olympic program in 1936) to the modern era. This trend likely reflects improved training, nutrition, and equipment technology, as well as professionalization of the sport. The distribution also becomes wider in recent decades, suggesting greater specialization and diversity among athlete builds across events.

  4. Create ANY interesting animation from the full olympics dataset.

The animation expands the analysis beyond alpine skiing, revealing that average athlete weights vary substantially across sports. Strength-oriented sports like wrestling and weightlifting show the heaviest competitors, while sports that favor lighter builds, such as running and gymnastics, remain much lighter. Over time, nearly all sports show slight upward trends in average weight, mirroring the long-term pattern seen in alpine skiing.

You should make one plot per question.

Hints:

 + facet_wrap(
    # your other arguments to facet_wrap() go here
    ...,
    # this replaces "TRUE" with "medaled" and "FALSE" with "did not medal"
    labeller = as_labeller(c(`TRUE` = "medaled", `FALSE` = "did not medal"))
  )

Introduction: The project explores the relationship between athlete weight and performance outcomes among Olympic alpine skiing competitors, using the Olympics dataset from the rfordatascience TidyTuesday project, which covers athletes from 1896 to 2016. The dataset contains detailed information such as each athlete’s sex, age, height, weight, nationality, sport, event, year, and medal results. I created a subset of the data containing only alpine skiing athletes with recorded weight values. By focusing on this group, I could better understand how factors such as sex, event type, and time period relate to the physical characteristics of successful skiers. The goal is to identify trends in skier weight distributions and explore how these may have evolved throughout the history of the Winter Olympics.

Approach: My analysis was conducted using the tidyverse. It relies on the olympics_alpine subset, which includes only alpine skiers with known weights, allowing for consistent comparisons. Several variables were of primary interest, including weight, sex, event, year, and medalist, which together help describe how physical characteristics vary across demographics, competition types, and success levels.

I used ggplot2 to create clear and informative data visualizations that reveal relationships within the data. Violin plots display differences in weight distribution between male and female athletes, separated by medal status. Boxplots show how weights vary across different alpine skiing events and how the overall distribution of weights has changed over time. I included axis labels, titles, and consistent color schemes to improve interpretability in each plot.
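
The plots can be backed by a numeric summary of the same groups. The following is a minimal sketch, not part of the original analysis: `toy_alpine` is a hypothetical tibble with invented values standing in for `olympics_alpine`, and `weight_summary` is an illustrative name.

```r
library(dplyr)

# Hypothetical stand-in for `olympics_alpine` (values invented for illustration)
toy_alpine <- tibble(
  sex      = c("M", "M", "M", "F", "F", "F"),
  medalist = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
  weight   = c(88, 80, 82, 66, 60, 62)
)

# Group size, median, and interquartile range of weight for each sex x medal-status group
weight_summary <- toy_alpine %>%
  group_by(sex, medalist) %>%
  summarize(
    n             = n(),
    median_weight = median(weight),
    iqr_weight    = IQR(weight),
    .groups       = "drop"
  )

weight_summary
```

Reporting `n` alongside the median matters here because medalists are a much smaller group than non-medalists, so their violin shape rests on fewer observations.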

Finally, the full Olympics dataset was used to create an animated visualization with gganimate, illustrating how average athlete weights across all sports have evolved over time. This broader view highlights general patterns of athlete physique over decades and situates alpine skiing trends within the wider context of Olympic history. By combining static and animated visualizations, this approach provides both detailed insight into alpine skiing specifically and a broader understanding of changing athlete characteristics across the Olympics.
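
One detail worth checking before averaging: the dataset has one row per athlete per event, so an athlete who enters several events in the same Games is counted several times in a per-sport mean. The sketch below contrasts the per-entry mean with a per-athlete mean. It is an alternative under stated assumptions, not the method used in this report: `toy_olympics` and its values are invented, and it assumes the `id` column identifies an athlete.

```r
library(dplyr)

# Hypothetical rows: athlete 1 enters two swimming events in the same year
toy_olympics <- tibble(
  id     = c(1, 1, 2, 3, 3),
  year   = c(2000, 2000, 2000, 2004, 2004),
  sport  = c("Swimming", "Swimming", "Swimming", "Judo", "Judo"),
  event  = c("E1", "E2", "E1", "E3", "E4"),
  weight = c(80, 80, 70, 66, 66)
)

# Mean over entries: athletes with several events weigh more heavily in the average
per_entry <- toy_olympics %>%
  filter(!is.na(weight)) %>%
  group_by(year, sport) %>%
  summarize(mean_weight = mean(weight), .groups = "drop")

# Mean over athletes: keep one row per athlete, year, and sport before averaging
per_athlete <- toy_olympics %>%
  filter(!is.na(weight)) %>%
  distinct(id, year, sport, weight) %>%
  group_by(year, sport) %>%
  summarize(mean_weight = mean(weight), .groups = "drop")

per_entry
per_athlete
```

Either definition can be defended; the point is to state which one the animation shows.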

Analysis:

ggplot(olympics_alpine, aes(x = sex, y = weight, fill = medalist)) +
  geom_violin(trim = FALSE) +
  labs(
    title = "Weight Differences of Alpine Skiers by Sex and Medal Status",
    x = "Sex", y = "Weight (kg)"
  ) +
  facet_wrap(~medalist, labeller = as_labeller(c(`TRUE` = "Medaled", `FALSE` = "Did Not Medal"))) +
  theme_minimal()

ggplot(olympics_alpine, aes(x = event, y = weight, fill = event)) +
  geom_boxplot() +
  labs(
    title = "Weight Distribution by Alpine Skiing Event",
    x = "Event", y = "Weight (kg)"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

ggplot(olympics_alpine, aes(x = year, y = weight, group = year)) +
  geom_boxplot(fill = "skyblue") +
  labs(
    title = "Evolution of Alpine Skiers' Weight Over Time",
    x = "Year", y = "Weight (kg)"
  ) +
  theme_minimal()

library(gganimate)
library(gifski)
library(magick)

anim <- olympics |>
  filter(!is.na(weight)) |>                     # drop athletes with unknown weight
  group_by(year, sport) |>
  summarize(mean_weight = mean(weight), .groups = "drop") |>
  ggplot(aes(x = reorder(sport, mean_weight), y = mean_weight, fill = sport)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  # frame_time is interpolated between Games, so round it for a clean year label
  labs(
    title = "Average Athlete Weight by Sport, Year: {round(frame_time)}",
    x = "Sport", y = "Mean Weight (kg)"
  ) +
  theme_minimal() +
  transition_time(year) +
  ease_aes("linear")

animate(anim, renderer = gifski_renderer())

anim_save("Ryan_Jahn_Project1.gif")

Discussion: The results show clear differences in weight between male and female alpine skiers, with men being consistently heavier on average. Medal-winning athletes tend to weigh slightly more than non-medalists, which may suggest that greater body mass and strength provide an advantage in generating momentum and stability during high-speed skiing events. When comparing across events, speed events such as Downhill and Super-G feature higher average weights, while women’s events show lower averages, reflecting the distinct physical demands of each competition type.

Over time, there has been a noticeable increase in the average weight of alpine skiers, likely influenced by advances in athletic training, nutrition, and equipment technology. The animation of the broader Olympic dataset is consistent with this trend, showing that athletes across most sports have grown heavier through the decades. Overall, these findings suggest that athlete weight is influenced by both biological and sport-specific factors and that evolving performance standards have shaped the physical profiles of Olympic competitors over time.
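
The claim of rising weight over time could be quantified with a per-year summary and a simple linear fit. The sketch below is illustrative only: `toy_years` holds invented values rather than the real skier data, and a straight-line fit on yearly medians is one assumed way to summarize the trend, not a result from this report.

```r
library(dplyr)

# Hypothetical weights for four Games (values invented for illustration)
toy_years <- tibble(
  year   = c(1948, 1948, 1968, 1968, 1988, 1988, 2008, 2008),
  weight = c(68, 72, 71, 75, 74, 80, 76, 84)
)

# Median tracks the typical weight; IQR tracks how spread out the field is
by_year <- toy_years %>%
  group_by(year) %>%
  summarize(
    median_weight = median(weight),
    iqr_weight    = IQR(weight),
    .groups       = "drop"
  )

# Slope of median weight on year, rescaled to kilograms per decade
trend <- lm(median_weight ~ year, data = by_year)
slope_per_decade <- unname(coef(trend)["year"]) * 10

by_year
slope_per_decade
```

A positive slope would back the "heavier over time" reading, and a growing `iqr_weight` would back the "wider distribution" reading; neither says anything about the causes proposed above.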