Let’s take a dive into the baseball swing. This summer, the website Baseball Savant released a plethora of new data, including a group of metrics called bat tracking. With this data, we can gain a better understanding of how the swing functions and what the ideal pieces of an elite hitter’s swing are.
library(tidyverse)
stats <- read_csv("Data/stats.csv")
Rows: 207 Columns: 37
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): last_name, first_name
dbl (36): player_id, year, player_age, ab, pa, hit, single, double, triple, ...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
knitr::include_graphics("pictures/aaron_judge.jpg")
knitr::include_graphics("pictures/Shohei_Ohtani.jpg")
How do we Judge player performance?
One of the most widely agreed-upon metrics for a hitter’s ability to produce runs is OPS. This stands for on-base plus slugging and takes into account both the player’s ability to get on base and to hit for power. While the argument could be made that swing speed gives the hitter more time to decide whether or not to swing, we will set on-base aside and focus on the batter’s ability to slug.
Slugging
Slugging is the average number of bases a player earns per at-bat. A single counts as 1.000, a home run as 4.000, and a strikeout as 0. Let’s take a glance at the top 5 in slugging during the 2024 regular season.
# A tibble: 5 × 2
Player `Slug Percentage`
<chr> <dbl>
1 Judge, Aaron 0.701
2 Ohtani, Shohei 0.646
3 Witt Jr., Bobby 0.588
4 Soto, Juan 0.569
5 Alvarez, Yordan 0.567
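A table like the one above could be produced along the following lines. This is a minimal sketch, not necessarily the code behind the original table: the `slugging()` and `top_sluggers()` helpers are hypothetical, and the `home_run` column name is an assumption about the Baseball Savant export (`single`, `double`, `triple`, and `ab` appear in the column listing printed when the file was read).

```r
library(dplyr)

# Slugging: total bases divided by at-bats.
slugging <- function(singles, doubles, triples, home_runs, at_bats) {
  (singles + 2 * doubles + 3 * triples + 4 * home_runs) / at_bats
}

# Rank hitters by slugging and keep the top n.
# Assumes columns single, double, triple, home_run (assumed name), ab,
# and a name column called "last_name, first_name".
top_sluggers <- function(data, n = 5) {
  data %>%
    mutate(slg = slugging(single, double, triple, home_run, ab)) %>%
    slice_max(slg, n = n) %>%
    transmute(Player = `last_name, first_name`,
              `Slug Percentage` = round(slg, 3))
}

# Sanity checks that mirror the definition in the text.
slugging(1, 0, 0, 0, 1)  # one single in one at-bat
slugging(0, 0, 0, 1, 1)  # one home run in one at-bat
slugging(0, 0, 0, 0, 1)  # a strikeout
```

With the data loaded earlier, the call would be `top_sluggers(stats)`, assuming the column names hold.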
You may recognize some familiar faces. While we are going to dig deep into what may predict a player’s success, it’s also helpful to see that the highest-paid and most respected players are at the top of our leaderboard.
Swing Speed
While slugging is a good metric, it only shows us the result of a player’s hitting. Baseball Savant’s recently released metric “Swing Speed” tracks the speed of each player’s bat on any given swing. This helps us look at the player’s capacity for success.
We can remove some of the luck by using Baseball Savant’s expected slugging (xSLG). It ignores the player’s ability to hit balls into the gaps, but it uses an algorithm to predict the number of bases the batter will reach based on the ball’s exit velocity, launch angle, and the player’s sprint speed.
As you can see, we get a slightly higher correlation value. This is because expected metrics help mitigate the effect of luck on a player’s results, such as facing above-average fielders or playing in a larger or smaller ballpark that turns home runs into fly outs and vice versa.
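The comparison between the two correlations can be sketched as below. The column names `avg_swing_speed`, `slg_percent`, and `xslg` are assumptions about the export, the helper is hypothetical, and the toy values are made up purely to show the call; they are not real player data.

```r
# Pearson correlation of swing speed with actual and expected slugging.
# Column names are assumed, not confirmed by the source.
swing_speed_correlations <- function(data) {
  c(
    slg  = cor(data$avg_swing_speed, data$slg_percent, use = "complete.obs"),
    xslg = cor(data$avg_swing_speed, data$xslg, use = "complete.obs")
  )
}

# Made-up values, only to demonstrate the function.
toy <- data.frame(
  avg_swing_speed = c(68, 70, 72, 74, 76),
  slg_percent     = c(0.38, 0.45, 0.41, 0.50, 0.47),
  xslg            = c(0.39, 0.42, 0.44, 0.47, 0.50)
)
swing_speed_correlations(toy)
```

On the real data the call would be `swing_speed_correlations(stats)`; a larger `xslg` value than `slg` value would match the pattern described in the text.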
Swing speed’s relationship with other metrics
Let’s take a quick glance at other metrics and see if they match what we have looked at so far.
Now that we have collected a few more statistics, we can improve our approach to finding the “ideal” swing. While it’s not perfect, we can see that exit velocity has a high correlation with swing speed, which checks out.
We can also see a moderate connection between the length of the swing and its speed, which also makes sense from a physics standpoint. The faster the bat travels, the less time it takes to complete the swing. However, the correlation is far from perfect, which means some players may accept a longer swing in order to load more and swing harder. For example, imagine punching an arcade punching bag. If you start a few inches from the bag and punch as quickly as possible, not caring about the score, you move directly forward and lose some speed and power. Now try again from the same spot, but punch as hard as possible: it makes sense that you would retract your arm first to add some speed to your fist before hitting the bag. The same goes for a player deciding how hard to swing. This is supported by our data, which shows that as swing length increases, swing speed goes up. In fact, nearly 30% of the variation in swing speed is explained by swing length.
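A figure like the “nearly 30%” above is typically the R squared of a simple linear regression. The sketch below assumes that is what was computed and that the columns are named `avg_swing_speed` and `avg_swing_length`; the helper name is hypothetical.

```r
# Share of the variation in swing speed explained by swing length (R squared).
# avg_swing_speed and avg_swing_length are assumed column names.
swing_length_r_squared <- function(data) {
  fit <- lm(avg_swing_speed ~ avg_swing_length, data = data)
  summary(fit)$r.squared
}

# With a single predictor, R squared equals the squared Pearson correlation,
# so an R squared near 0.30 corresponds to a correlation of about 0.55.
sqrt(0.30)
```

On the real data the call would be `swing_length_r_squared(stats)`.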
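One way to set up a top-50% comparison of swing speed and swing length is a median split. The sketch below is an assumption about how such a comparison could be built, not the original analysis: the helper is hypothetical and the column names `avg_swing_speed`, `avg_swing_length`, and `xslg` are assumed.

```r
library(dplyr)

# Label each hitter as at/above or below the median in swing speed and
# swing length, then compare average xSLG across the four groups.
# Column names are assumptions about the export.
median_split_summary <- function(data) {
  data %>%
    mutate(
      fast_swing = avg_swing_speed  >= median(avg_swing_speed,  na.rm = TRUE),
      long_swing = avg_swing_length >= median(avg_swing_length, na.rm = TRUE)
    ) %>%
    group_by(fast_swing, long_swing) %>%
    summarise(
      players   = n(),
      mean_xslg = mean(xslg, na.rm = TRUE),
      .groups   = "drop"
    )
}
```

Calling `median_split_summary(stats)` would return one row per combination, so hitters who are only fast, only long, both, or neither can be compared side by side.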
We can take a few things from the previous two tables. First, being in the top 50% in swing length is much less advantageous than being in the top 50% in swing speed. Furthermore, we looked deeper into swing speed using a linear regression model, which predicts that every additional 100 points of xSLG goes along with roughly 2.45 mph of swing speed. This can help us determine where a player stands relative to the rest of the field. For example, let’s take Jorge Polanco. His xSLG is .426. We can use the model to predict his expected swing speed.
61.33 + 0.426 * 24.53
[1] 71.77978
His actual average swing speed in 2024 was 69.7 mph, which is well below the projected 71.8 mph. This means he likely excels in other areas of hitting and may benefit from an off-season of rotational focus. Throwing medicine balls or underload swinging may help squeeze some extra velocity out of his swing, aiding his already solid slugging.
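The projection above can be wrapped in a small helper so the same comparison can be repeated for other hitters. The intercept (61.33) and slope (24.53) are the values quoted in the text; the function name is hypothetical, and the assumption is that these coefficients came from a fit along the lines of `lm(avg_swing_speed ~ xslg, data = stats)`.

```r
# Coefficients as quoted in the text:
# intercept 61.33 mph, slope 24.53 mph per 1.000 of xSLG.
predict_swing_speed <- function(xslg, intercept = 61.33, slope = 24.53) {
  intercept + slope * xslg
}

polanco_expected <- predict_swing_speed(0.426)  # 71.77978, as in the text
polanco_actual   <- 69.7

# Negative values mean the hitter swings slower than the model expects.
polanco_actual - polanco_expected               # about -2.08 mph
```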
Conclusion
It’s important to note that while we discovered a lot about the swing in this project, there are many other aspects of a hitter’s swing. We can quantify almost all of them, from plate discipline to contact quality. What we have determined is that physics works: by swinging harder, we have the potential to hit the ball harder and farther. We also discovered that there is a limit to that potential, as a hitter cannot simply “sell out” for swing speed by lengthening their swing.
RNZ News. “Shohei Ohtani Makes Major League Baseball History.” RNZ, 20 Sept. 2024, www.rnz.co.nz/news/sport/528539/shohei-ohtani-makes-major-league-baseball-history.
Witz, Billy. “How Aaron Judge Built Baseball’s Mightiest Swing.” The New York Times, 17 July 2017, www.nytimes.com/2017/07/17/sports/baseball/how-aaron-judge-built-baseballs-mightiest-swing.html.