# Load packages (tidyverse supplies dplyr and ggplot2, which the charts use)
library(tidyverse)
library(jsonlite)
library(magrittr)
library(httr)
library(rvest)
library(chromote)

# Read the qualified-pitcher data scraped from Baseball Savant
pitching_stats <- read.csv("pitching.csv")

# Total stolen bases given up per season
pitching_stats %>%
  group_by(Year) %>%
  summarise(Total.Stolen.Bases = sum(SB)) %>%
  ggplot(aes(x = Year, y = Total.Stolen.Bases)) +
  geom_col() +
  labs(title = "Stolen Bases Given Up By Qualified Pitchers 2018-2025",
       y = "Stolen Bases",
       x = "Year")

Assignment 7
Introduction
I will be looking at how some pitching stats have been affected by the pitch clock rules. The rules went into effect before the 2023 season. Once the pitcher gets the ball back from the catcher, they have 15 seconds to start their delivery with the bases empty or 20 seconds with runners on base, or it is an automatic ball. Another part of the rules MLB put into place is that the pitcher is allowed only two disengagements (pickoff attempts or step-offs) per plate appearance. If they try a third and do not get the runner out, it is a balk and the runners advance a base. I would expect this to make pitchers attempt fewer pickoffs and have less control over the running game, and that is one of the things I will be trying to find out.
The data that I will be using is scraped from Baseball Savant. It covers pitchers from 2018-2019 and 2021-2025 who hit the qualifying innings threshold, so nearly all of them are starters. I did not include 2020 because that season was shortened by Covid-19. The dataset includes player name, year, batters faced, home runs, strikeouts, walks, earned runs, runs, wild pitches, stolen bases, pickoff attempts at first, and pickoff attempts at first resulting in an out. I will be exploring how some of these variables changed after the 2023 rule changes, which gives me a before-and-after comparison.
Analysis
I am going to start by comparing stolen base totals by year, using the stolen base bar chart from the code at the top of this document.
It's hard to tell here how much of an effect the rule change had on stolen bases, since totals were already increasing over the previous seasons. However, the highest stolen base total in the graph came in 2024, the second season the rule was in effect. As a result, I think the rule change had some effect on the increase in stolen bases, but I do not think it was the only reason for it.
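One way to make the before-and-after comparison less dependent on reading the bars by eye is to average the yearly totals within each period. The sketch below is only an illustration: `compare_rule_periods` is a hypothetical helper name, and it assumes the `Year` column is numeric and that 2023 is the first season under the new rules.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis): total one stat per
# season, label each season as before or after the 2023 rule change, then
# average the season totals within each period.
compare_rule_periods <- function(data, stat) {
  data %>%
    group_by(Year) %>%
    summarise(Yearly.Total = sum({{ stat }}), .groups = "drop") %>%
    mutate(Period = if_else(Year >= 2023, "2023 and later", "Before 2023")) %>%
    group_by(Period) %>%
    summarise(Seasons = n(),
              Mean.Yearly.Total = mean(Yearly.Total),
              .groups = "drop")
}

# Usage, assuming pitching_stats has already been read in:
# compare_rule_periods(pitching_stats, SB)
```

Averaging season totals instead of summing them keeps the two periods comparable even though they contain different numbers of seasons. It does not separate the rule change from the upward trend that was already under way.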
Next I am going to look at how pickoff attempts have changed.
pitching_stats %>%
  group_by(Year) %>%
  summarise(Total.Pickoff.Attempt.1B = sum(Pickoff.Attempt.1B)) %>%
  ggplot(aes(x = Year, y = Total.Pickoff.Attempt.1B)) +
  geom_col() +
  labs(title = "Total Pickoff Attempts to 1B By Qualified Pitchers 2018-2025",
       y = "Total Pickoff Attempts to 1B",
       x = "Year")

There was a dip in pickoff attempts after the rules were put into place, but surprisingly there was an even bigger drop in attempts after Covid-19. I wonder what caused this, because no rule change explains it. Maybe it was just a change in strategy that caused attempts to drop like they did.
For my final visualization I am going to move more towards the pitch clock rule to see if the pitchers are being hurried and therefore more likely to make a mistake. I will be looking at the change in total home runs by year.
pitching_stats %>%
  group_by(Year) %>%
  summarise(Total.Homeruns = sum(HR)) %>%
  ggplot(aes(x = Year, y = Total.Homeruns)) +
  geom_col() +
  labs(title = "Total Homeruns Given up By Qualified Pitchers 2018-2025",
       y = "Total Homeruns",
       x = "Year")

This is pretty similar to the stolen bases graph in that the numbers dropped after Covid-19 and have been increasing since then. I do not think that the pitch clock rules really affect home runs that much. Pitchers probably got accustomed to the clock quickly, so it would not affect their pitching.
Conclusion
The pitch clock and pickoff rules have had small effects on the numbers pitchers are putting up. I do not think the rules affected much outside of base running. There are other possible explanations for why the graphs look the way they do. One is that starting pitchers are shortening their outings and there are more injuries, so fewer pitchers are hitting the qualifying mark for innings.
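The concern about fewer pitchers qualifying could be checked by counting the rows per season and by expressing each stat as a rate per batter faced instead of a raw total. The following is a minimal sketch, not part of the original analysis. `rate_by_year` is a hypothetical helper, it assumes one row per pitcher per season, and the batters-faced column is passed in as an argument because its exact name in pitching.csv is not shown in this report.

```r
library(dplyr)

# Hypothetical helper: for each season, count the rows (assumed to be one per
# qualified pitcher) and express a counting stat per 100 batters faced.
# `stat` and `batters_faced` are column names supplied by the caller.
rate_by_year <- function(data, stat, batters_faced) {
  data %>%
    group_by(Year) %>%
    summarise(Qualified.Pitchers = n(),
              Per.100.Batters.Faced = 100 * sum({{ stat }}) / sum({{ batters_faced }}),
              .groups = "drop")
}

# Usage, assuming the batters-faced column is named BF (an assumption):
# rate_by_year(pitching_stats, SB, BF)
```

If the rate moves the same way as the raw totals, the change is less likely to come from a shrinking pool of qualified pitchers alone.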