# Assignment 7: Cincinnati Reds Offensive Production

## Question

This project asks: **How has Cincinnati Reds offensive production changed across recent seasons, and which players contributed most to those changes?**

I chose this topic because baseball offers well-defined batting statistics, such as home runs, RBI, OPS, and plate appearances, that make performance straightforward to compare. The Reds are also a useful team to study because their offense has changed across recent seasons as different young players have joined the roster.

## Data Collection Method

The data was collected from Baseball Reference team batting pages for the Cincinnati Reds. I scraped the batting table from multiple seasons using a separate `.R` script. The scraping script used a function to collect one season at a time and a loop to repeat the same process across multiple seasons.
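
The function-plus-loop structure described above could look roughly like the sketch below. It is not the original scraping script: the URL pattern, the `"table"` selector, and the names `scrape_season()` and `scrape_seasons()` are assumptions made for illustration, and the real page may need a more specific selector.

```r
# Sketch only: URL pattern and table selector are assumptions,
# not taken from the original scraping script.
scrape_season <- function(season, table_css = "table") {
  url <- sprintf("https://www.baseball-reference.com/teams/CIN/%d.shtml", season)
  page <- rvest::read_html(url)
  tbl <- rvest::html_table(rvest::html_element(page, table_css))
  tbl$year <- season  # tag every row with the season it came from
  tbl
}

# Loop over seasons, collect one table per season, then stack them.
scrape_seasons <- function(seasons, scrape_one = scrape_season, pause = 0) {
  results <- vector("list", length(seasons))
  for (i in seq_along(seasons)) {
    results[[i]] <- scrape_one(seasons[i])
    Sys.sleep(pause)  # optional delay between requests
  }
  do.call(rbind, results)
}

# Possible usage (makes network requests, so it is left commented out):
# reds_data <- scrape_seasons(2021:2025, pause = 5)
# write.csv(reds_data, "reds_data.csv", row.names = FALSE)
```

Passing the per-season function in as `scrape_one` keeps the loop testable with a stand-in function that does not touch the network.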

The final scraped dataset was saved as a CSV file and then imported into this Quarto document. This keeps the scraping code separate from the final report.

library(tidyverse)
library(scales)

reds_data <- read_csv("reds_data.csv")

## Data Cleaning

reds_clean <- reds_data %>%
  rename(Name = Player) %>%
  filter(!is.na(Name)) %>%
  filter(Name != "Team Totals") %>%
  mutate(
    year = as.numeric(year),
    Name = as.character(Name),
    PA = as.numeric(PA),
    HR = as.numeric(HR),
    RBI = as.numeric(RBI),
    R = as.numeric(R),
    H = as.numeric(H),
    OPS = as.numeric(OPS),
    BA = as.numeric(BA)
  )
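
Because `as.numeric()` turns any non-numeric text into `NA`, it can be worth checking which rows lost a value during conversion. The sketch below uses made-up rows, including a hypothetical stray header row, to show one way to flag them.

```r
# Made-up rows for illustration; "HR" in the second row mimics a stray header row.
toy <- data.frame(
  Name = c("Player A", "Player B"),
  HR = c("21", "HR"),
  stringsAsFactors = FALSE
)

# Convert, silencing the coercion warning so the check below does the reporting.
toy$HR_num <- suppressWarnings(as.numeric(toy$HR))

# TRUE where a non-missing original value became NA after conversion.
introduced_na <- !is.na(toy$HR) & is.na(toy$HR_num)

toy[introduced_na, ]
```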

## Analysis 1: Total Home Runs by Season

reds_clean %>%
  group_by(year) %>%
  summarise(total_HR = sum(HR, na.rm = TRUE)) %>%
  ggplot(aes(x = factor(year), y = total_HR)) +
  geom_col(fill = "#C6011F") +
  labs(
    title = "Total Reds Home Runs by Season",
    x = "Season",
    y = "Total Home Runs"
  ) +
  theme_minimal()

This chart shows whether the Reds’ power production increased or decreased across seasons. A higher bar means the team produced more home runs in that season.
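
To put numbers on the increase or decrease, the season totals can be differenced. The sketch below copies the 2022 to 2025 home run totals from this report's summary table. The 2021 total is left out because it is far out of line with the other seasons and may need checking first.

```r
# Home run totals copied from the summary table in this report (2022 to 2025).
hr_by_season <- c("2022" = 156, "2023" = 198, "2024" = 174, "2025" = 167)

hr_change <- diff(hr_by_season)                     # change vs. the prior season
hr_pct <- hr_change / head(hr_by_season, -1) * 100  # percent change vs. the prior season

data.frame(
  season = names(hr_change),
  change = as.vector(hr_change),
  pct_change = round(as.vector(hr_pct), 1)
)
```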

## Analysis 2: Total RBI by Season

reds_clean %>%
  group_by(year) %>%
  summarise(total_RBI = sum(RBI, na.rm = TRUE)) %>%
  ggplot(aes(x = factor(year), y = total_RBI)) +
  geom_col(fill = "#C6011F") +
  labs(
    title = "Total Reds RBI by Season",
    x = "Season",
    y = "Total RBI"
  ) +
  theme_minimal()

RBI helps show how often Reds hitters drove in runs. If RBI rises over time, that provides evidence that the offense became more productive.

## Analysis 3: Average OPS by Season

reds_clean %>%
  group_by(year) %>%
  summarise(avg_OPS = mean(OPS, na.rm = TRUE)) %>%
  ggplot(aes(x = factor(year), y = avg_OPS)) +
  geom_col(fill = "#C6011F") +
  labs(
    title = "Average Reds OPS by Season",
    x = "Season",
    y = "Average OPS"
  ) +
  theme_minimal()

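OPS is useful because it combines on-base ability and power. A higher average OPS suggests stronger overall offensive performance, with one caveat: the value plotted here is an unweighted mean across every player in the table, so a hitter with a handful of plate appearances counts as much as an everyday starter.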
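
One way to see how much low-playing-time hitters pull the average around is to compare the plain mean with a mean weighted by plate appearances. The rows below are made up for illustration. A PA-weighted mean is only an approximation of team OPS, since on-base percentage and slugging use different denominators.

```r
library(dplyr)

# Hypothetical player rows (made-up values) to show the difference.
toy <- data.frame(
  year = c(2023, 2023, 2023),
  Name = c("Starter", "Bench bat", "Call-up"),
  PA   = c(600, 150, 10),
  OPS  = c(0.800, 0.650, 0.200)
)

ops_summary <- toy %>%
  group_by(year) %>%
  summarise(
    avg_OPS = mean(OPS, na.rm = TRUE),                       # every player counts equally
    weighted_OPS = weighted.mean(OPS, w = PA, na.rm = TRUE)  # players count by playing time
  )

ops_summary
```

In this toy example the call-up with 10 plate appearances drags the plain mean well below the weighted one.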

## Analysis 4: Top 10 Reds Players by Home Runs

reds_clean %>%
  group_by(Name) %>%
  summarise(total_HR = sum(HR, na.rm = TRUE)) %>%
  arrange(desc(total_HR)) %>%
  slice_head(n = 10) %>%
  ggplot(aes(x = reorder(Name, total_HR), y = total_HR)) +
  geom_col(fill = "#C6011F") +
  coord_flip() +
  labs(
    title = "Top 10 Reds Players by Total Home Runs",
    x = "Player",
    y = "Total Home Runs"
  ) +
  theme_minimal()

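This chart identifies the ten players who hit the most home runs for the Reds across all of the seasons studied. Because the totals are summed by player, someone who appeared in several seasons can rank above a player with a bigger single season.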
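
Summing by player and ranking single seasons can give different leaders. The sketch below uses made-up rows to contrast the two approaches. The names `totals` and `player_seasons` are illustrative only.

```r
library(dplyr)

# Made-up rows: Player A appears in two seasons, the others in one.
toy <- data.frame(
  Name = c("Player A", "Player A", "Player B", "Player C"),
  year = c(2022, 2023, 2023, 2023),
  HR   = c(12, 14, 25, 9)
)

# Totals across seasons: one row per player.
totals <- toy %>%
  group_by(Name) %>%
  summarise(total_HR = sum(HR, na.rm = TRUE)) %>%
  arrange(desc(total_HR))

# Player-seasons: one row per player per year, with no summing.
player_seasons <- toy %>%
  arrange(desc(HR)) %>%
  mutate(label = paste(Name, year))

totals
player_seasons
```

Here Player A leads the summed totals while Player B has the best single season.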

## Analysis 5: Relationship Between Plate Appearances and Home Runs

reds_clean %>%
  filter(PA < 600) %>%  # keep only player-seasons with fewer than 600 plate appearances
  ggplot(aes(x = PA, y = HR)) +
  geom_point(alpha = 0.6, color = "#C6011F") +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  labs(
    title = "Relationship Between Plate Appearances and Home Runs",
    x = "Plate Appearances",
    y = "Home Runs"
  ) +
  theme_minimal()

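This scatterplot shows whether players with more opportunities at the plate tended to hit more home runs. Each point is one player-season, and the filter limits the plot to player-seasons with fewer than 600 plate appearances. The trend line is a linear fit that summarizes the general relationship.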
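
The trend line can also be summarized numerically. The sketch below fits the same kind of linear model that `geom_smooth(method = "lm")` draws, using made-up values rather than the Reds data.

```r
# Made-up player-seasons for illustration only.
toy <- data.frame(
  PA = c(50, 150, 300, 450, 580),
  HR = c(1, 4, 9, 15, 22)
)

fit <- lm(HR ~ PA, data = toy)

coef(fit)["PA"]          # estimated additional home runs per extra plate appearance
cor(toy$PA, toy$HR)      # strength of the linear relationship
summary(fit)$r.squared   # share of variation in HR explained by PA
```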

## Summary Table

reds_clean %>%
  group_by(year) %>%
  summarise(
    total_HR = sum(HR, na.rm = TRUE),
    total_RBI = sum(RBI, na.rm = TRUE),
    avg_OPS = mean(OPS, na.rm = TRUE),
    avg_BA = mean(BA, na.rm = TRUE)
  )
# A tibble: 5 × 5
   year total_HR total_RBI avg_OPS avg_BA
  <dbl>    <dbl>     <dbl>   <dbl>  <dbl>
1  2021      444      1512   0.459  0.159
2  2022      156       618   0.572  0.205
3  2023      198       747   0.708  0.240
4  2024      174       663   0.558  0.192
5  2025      167       677   0.637  0.221
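
The 2021 totals in the table above are more than double those of any other season, so it may be worth checking whether some rows were counted twice. The sketch below uses made-up rows to show one way to look for repeated player-season rows and recompute totals without exact duplicates. It would not catch leftover subtotal rows, which need a separate look at the `Name` values.

```r
library(dplyr)

# Made-up rows: Player A's 2021 line appears twice.
toy <- data.frame(
  year = c(2021, 2021, 2021, 2022),
  Name = c("Player A", "Player A", "Player B", "Player A"),
  HR   = c(10, 10, 5, 8)
)

# Player-season keys that occur more than once.
dupes <- toy %>%
  count(year, Name, name = "n_rows") %>%
  filter(n_rows > 1)

# Drop exact duplicate rows, then recompute the season totals.
deduped_totals <- toy %>%
  distinct() %>%
  group_by(year) %>%
  summarise(total_HR = sum(HR, na.rm = TRUE))

dupes
deduped_totals
```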

## Conclusion

Overall, this analysis evaluates Reds offensive production using several key batting statistics. Home runs and RBI show the team’s run-producing ability, while OPS gives a broader view of offensive quality. The player-level chart identifies which hitters contributed the most home runs across the seasons studied, and the scatterplot shows how playing time relates to power production.

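This evidence can be used to evaluate whether the Reds offense improved or declined across the seasons studied. In the summary table, 2023 has the highest average OPS (0.708) and batting average (0.240), and the most home runs (198) and RBI (747) among the 2022 to 2025 seasons. The 2021 totals (444 home runs, 1,512 RBI) are more than double those of any other season, which more likely points to duplicated or leftover subtotal rows in the scraped data than to a real change in production, so they should be verified before 2021 is compared with later years.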