Assignment 7
Topic
Has strikeout rate increased over time across MLB, and how does it relate to ERA and FIP? That was the question I asked myself when doing this research. I was interested to see if the hype around strikeout pitchers was validated by other performance stats. I figured ERA and FIP were good metrics for this. ERA is the average number of earned runs a pitcher gives up per 9 innings, and FIP is a pitching metric that is independent of fielding but scaled similarly to ERA. I chose to get my data from Baseball Reference using web scraping, and I plan to make visuals that compare ERA and FIP with SO9 (strikeouts per 9 innings) to see how much value strikeouts really hold.
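To make the three metrics concrete, here is a minimal sketch of how each one is computed from raw totals. The function names and the sample totals are made up for illustration, and the FIP constant of 3.10 is only a placeholder: the real constant is set each season so that league FIP matches league ERA.

```r
# Earned run average: earned runs allowed per 9 innings.
era <- function(er, ip) 9 * er / ip

# Strikeouts per 9 innings.
so9 <- function(so, ip) 9 * so / ip

# Fielding Independent Pitching, built only from home runs, walks,
# hit batters, and strikeouts. fip_constant is a placeholder value,
# not the constant for any real season.
fip <- function(hr, bb, hbp, so, ip, fip_constant = 3.10) {
  (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + fip_constant
}

# Made-up team totals, for illustration only.
era(er = 650, ip = 1440)
so9(so = 1400, ip = 1440)
fip(hr = 180, bb = 500, hbp = 60, so = 1400, ip = 1440)
```

One thing the formula makes visible is that strikeouts enter FIP directly (the `- 2 * so` term), so FIP is independent of fielding but not of strikeouts. ERA is the metric here that does not use strikeouts in its calculation.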
Importing Data and Libraries
Data Wrangling
The data needed a bit of cleanup before it was ready to analyze. Baseball Reference includes some extra rows in its tables, like repeated headers and a league average summary row, so those were removed first. Since rvest reads everything as text, all the stat columns had to be converted to numbers so R could actually work with them. K% and BB% were also calculated manually by dividing strikeout and walk totals by batters faced, since Baseball Reference doesn’t include those directly. Lastly, any rows with missing values in key columns were dropped to keep the analysis clean.
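The cleanup steps above can be sketched in base R roughly as follows. This is not the exact code used for the report: the toy `raw` table, the column names (`Tm`, `SO`, `BB`, `BF`, `ERA`), and the "League Average" label are assumptions standing in for the scraped Baseball Reference table.

```r
# Toy stand-in for a scraped table: every column is text.
# All values here are made up for illustration.
raw <- data.frame(
  Tm  = c("ARI", "ATL", "Tm", "BOS", "League Average"),
  SO  = c("1300", "1450", "SO", "", "1380"),
  BB  = c("500", "480", "BB", "510", "495"),
  BF  = c("6100", "6000", "BF", "6150", "6080"),
  ERA = c("4.10", "3.60", "ERA", "4.30", "4.00"),
  stringsAsFactors = FALSE
)

# 1. Drop repeated header rows and the league summary row.
clean <- raw[!(raw$Tm %in% c("Tm", "League Average")), ]

# 2. Convert the stat columns from text to numbers (blank cells become NA).
stat_cols <- c("SO", "BB", "BF", "ERA")
clean[stat_cols] <- lapply(clean[stat_cols], as.numeric)

# 3. Compute K% and BB% as shares of batters faced.
clean$K_pct  <- clean$SO / clean$BF
clean$BB_pct <- clean$BB / clean$BF

# 4. Drop rows that are missing any key column.
clean <- clean[complete.cases(clean[, c("SO", "BF", "ERA")]), ]

clean
```

In this toy example the BOS row is dropped in step 4 because its strikeout total is blank.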
MLB Average Strikeouts Per 9 Innings (2015-2024)
This graph shows the average strikeouts per 9 innings (SO9) for each season. From 2015 to 2020, this metric rose steadily, then dipped slightly through 2024. It is worth noting that the 2020 season had only 60 games due to COVID, instead of the normal 162. The smaller sample may be why that number is so inflated.
Distribution of Team SO9 by Season (2015-2024)
This graph is similar to the last one, but it shows how team SO9 was distributed within each season. Each box spans the middle 50% of teams (the interquartile range) around the median. Taller boxes (like 2017 and 2022) show that teams were more spread out in those seasons, while shorter boxes (like 2023 and 2024) show that most teams had relatively similar strikeout numbers. Outliers, if there are any, appear as individual points beyond the whiskers rather than as a bigger box.
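The box heights can also be checked numerically. The sketch below computes the median and interquartile range of SO9 for each season using a small made-up data frame; the `teams` table and its values are hypothetical and only show the calculation.

```r
# Made-up team SO9 values for two seasons, for illustration only.
teams <- data.frame(
  Season = rep(c(2023, 2024), each = 4),
  SO9    = c(8.2, 8.6, 9.0, 9.4, 8.0, 8.5, 8.9, 9.8)
)

# Median (the line inside each box) and IQR (the height of each box).
spread <- data.frame(
  Season = sort(unique(teams$Season)),
  median = as.numeric(tapply(teams$SO9, teams$Season, median)),
  iqr    = as.numeric(tapply(teams$SO9, teams$Season, IQR))
)

spread
```

A season with a larger `iqr` value corresponds to a taller box in the plot.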
Team SO9 vs ERA (2015-2024)
This graph shows the relationship between team ERA and SO9 (strikeouts per 9 innings). Although the trend line has a negative slope, the points are widely scattered around it, so the relationship is weak at best. This suggests that a high strikeout rate alone does not guarantee fewer runs allowed, so a pitcher known for getting a lot of strikeouts should not necessarily be valued higher than a non-strikeout pitcher on that basis alone.
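How weak the relationship is can be measured rather than judged by eye. A minimal sketch, using made-up values in place of the real team data, computes the Pearson correlation and the slope of the same kind of linear trend line drawn on the plot:

```r
# Made-up team-season values, for illustration only.
so9_vals <- c(7.5, 8.0, 8.4, 8.8, 9.1, 9.6)
era_vals <- c(4.6, 4.1, 4.5, 3.9, 4.2, 3.8)

# Pearson correlation: -1 is a perfect negative relationship, 0 is none.
r <- cor(so9_vals, era_vals)

# Share of the variation in ERA that a straight line in SO9 accounts for.
r_squared <- r^2

# Slope of the fitted line: change in ERA per extra strikeout per 9 innings.
fit   <- lm(era_vals ~ so9_vals)
slope <- unname(coef(fit)["so9_vals"])

c(r = r, r_squared = r_squared, slope = slope)
```

Running the same calculation on the real data would show whether the negative slope reflects a meaningful correlation or mostly noise.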
MLB Average ERA vs FIP Over Time (2015-2024)
This graph shows ERA and FIP for each season. We can see that ERA and FIP are nearly the same throughout this graph, but that isn’t the most interesting part. If we recall the graph of SO9 for each season, it looks nothing like this one. This further suggests that strikeouts do not directly lead to successful pitching (allowing fewer runs).
Top 10 Team-Seasons by SO9 (2015-2024)
Lastly, I wanted to take a look at the 10 team-seasons with the highest SO9 throughout this stretch of time. It is interesting to see that all of these teams came from 2021 or before. This could mean that teams aren’t trying to load up on strikeout pitchers as much as they used to, since the value placed on strikeouts may be decreasing.