Exploring Baseball Statistics - Traditional vs. “Next Gen”
Introduction
Some of the best-known baseball statistics, such as batting average or RBIs, are useless. Or, at least, so claim many baseball statisticians. These stats supposedly do not accurately represent the value a hitter provides to his team. So how do the traditional, “useless” stats relate to those developed more recently? This is a constant argument among baseball fans, and thus is deserving of greater exploration.
This project uses 2019 hitting statistics from 207 major leaguers, all of those with more than 400 plate appearances in that year. It compares traditional hitting stats with “next gen” stats, looking for relationships between the two and for the stats that best identify the top players.
All data was pulled from the stats leaderboards on fangraphs.com. As mentioned, the stats are from 2019, since 2020 was an abbreviated 60-game season due to the COVID pandemic. The table below allows for perusal of the dataset.
Data Dictionary
The table below explains the statistics used in this analysis, so that more casual baseball fans can follow the various metrics used throughout. Note that this dictionary does not encompass every statistic in the dataset, but covers all of those used in the following analysis.
Statistic | Explanation |
---|---|
AVG | Batting Average; number of hits divided by number of at-bats |
wOBA | weighted On Base Average; a rate statistic which credits hitters for the value of an outcome rather than counting all hits as equivalent |
wRC+ | weighted Runs Created Plus; standardized statistic that takes runs created and relates it to league-wide performance; 100 = league average, 120 = 20% above average, etc. |
WAR | Wins Above Replacement; a measure of player’s value based on number of wins they add over the course of a season compared to a replacement-level player |
EV | Exit Velocity; the average speed with which the ball leaves the bat when hit by the player |
Barrel. | Barrel Rate; approximates the % of batted balls that are struck squarely by the batter; a batted ball has to meet certain criteria for exit velocity and launch angle |
HardHit. | Hard Hit Rate; the percentage of a player’s batted balls which are classified as “hard hit”, meaning having an exit velocity of 95+ MPH |
xBA | expected Batting Average; the batting average predicted by a variety of factors, including the quality and location of the batter’s batted balls |
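As a rough illustration of how two of these rate statistics are built, the sketch below computes batting average from hits and at-bats, and hard hit rate from a list of exit velocities using the 95 MPH threshold given in the table. The function names and the sample numbers are made up for illustration and do not come from the dataset.

```python
HARD_HIT_MPH = 95.0  # threshold taken from the data dictionary above


def batting_average(hits, at_bats):
    """Hits divided by at-bats; returns 0.0 when there are no at-bats."""
    return hits / at_bats if at_bats else 0.0


def hard_hit_rate(exit_velocities):
    """Share of batted balls with an exit velocity of 95 MPH or more."""
    if not exit_velocities:
        return 0.0
    hard = sum(1 for ev in exit_velocities if ev >= HARD_HIT_MPH)
    return hard / len(exit_velocities)


# Hypothetical batted balls (MPH), not real player data.
sample_evs = [101.2, 88.4, 95.0, 79.9, 103.7]

print(round(batting_average(150, 500), 3))  # 0.3
print(round(hard_hit_rate(sample_evs), 3))  # 0.6
```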
Analysis
To begin our analysis, consider one of the most discussed statistics today: exit velocity, a measure of how hard a player hits the ball. Meanwhile, the traditional benchmark for hitters was the batting average. What type of relationship exists between these variables?
Clearly, no strong relationship exists. A player’s average exit velocity cannot directly predict his batting average.
What about barrel rate? What are its predictive strengths for a player’s batting average?
Once again, not a whole lot. This suggests that the quality of contact a player makes doesn’t reliably pay off in better hitting, at least in terms of batting average.
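One way to put a number on how weak or strong such a relationship is would be the Pearson correlation coefficient, which runs from -1 to 1, with values near 0 indicating little linear relationship. The sketch below is a plain-Python version; it is not necessarily how the relationships in this analysis were assessed, and the sample values are invented placeholders rather than figures from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical values for six hitters (not from the dataset).
exit_velocity = [86.5, 88.0, 89.2, 90.1, 91.4, 92.3]
batting_avg = [0.281, 0.262, 0.299, 0.255, 0.270, 0.288]

print(round(pearson_r(exit_velocity, batting_avg), 2))
```

The same helper could be pointed at any pair of columns, such as barrel rate against batting average.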
But this is odd, because simple logic would tell us that hitting the ball harder would tend to lead to better results.
So what about the newer statistics used to judge hitters, such as wRC+ or wOBA? How do those relate to the same contact measures? Let’s first look at exit velocity as a predictor for wRC+ and wOBA.
Next, barrel rate as a predictor for wRC+ and wOBA.
Suddenly, much more obvious patterns emerge. While they aren’t perfectly linear, it is obvious that batters who hit the ball harder and make contact on the barrel more often post higher wRC+ and wOBA numbers. Considering that these two variables are not part of the calculation for either wRC+ or wOBA, this is significant.
This begins to show the disconnect between traditional vs. modern baseball stats. In batting average, a hit is a hit, while with wRC+ and other new stats, more valuable hits are given, well, more value. Thus, hitting the ball harder leads to more valuable hits, creating a higher wRC+ as shown here.
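To make the "a hit is a hit" point concrete, the sketch below compares two invented hitters with identical batting averages but different mixes of hits. The weights are illustrative placeholders chosen only so that bigger hits count for more; they are not the actual wOBA weights, and real wOBA also credits walks and hit-by-pitches and uses a different denominator.

```python
# Illustrative weights only; NOT the real wOBA linear weights.
ILLUSTRATIVE_WEIGHTS = {"1B": 1.0, "2B": 1.5, "3B": 2.0, "HR": 2.5}


def simple_average(hit_counts, at_bats):
    """Batting average: every hit counts the same."""
    return sum(hit_counts.values()) / at_bats


def weighted_value(hit_counts, at_bats):
    """Weighted rate: each hit type is credited by its (illustrative) weight."""
    total = sum(ILLUSTRATIVE_WEIGHTS[kind] * n for kind, n in hit_counts.items())
    return total / at_bats


# Two hypothetical hitters, each with 150 hits in 500 at-bats.
AT_BATS = 500
singles_hitter = {"1B": 140, "2B": 8, "3B": 2, "HR": 0}
power_hitter = {"1B": 80, "2B": 35, "3B": 5, "HR": 30}

for label, hits in [("singles hitter", singles_hitter), ("power hitter", power_hitter)]:
    print(label, round(simple_average(hits, AT_BATS), 3), round(weighted_value(hits, AT_BATS), 3))
# singles hitter 0.3 0.312
# power hitter 0.3 0.435
```

Both hitters look identical by batting average, while the weighted rate separates them.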
A similar disconnect can be shown with hard hit rate.
Let’s also take a look at the players considered to be the best by each statistic, to see if there is any consistency across the new and old metrics.
Top 8 hitters by batting average:

Name | AVG | wRC+ | WAR rank |
---|---|---|---|
Tim Anderson | 0.335 | 129 | 55 |
Christian Yelich | 0.329 | 175 | 3 |
Ketel Marte | 0.329 | 150 | 6 |
DJ LeMahieu | 0.327 | 135 | 18 |
Anthony Rendon | 0.319 | 154 | 7 |
Jeff McNeil | 0.318 | 143 | 28 |
Nolan Arenado | 0.315 | 129 | 12 |
Yoan Moncada | 0.315 | 140 | 16 |
Top 8 hitters by wRC+:

Name | AVG | wRC+ | WAR rank |
---|---|---|---|
Mike Trout | 0.291 | 178 | 2 |
Christian Yelich | 0.329 | 175 | 3 |
Alex Bregman | 0.296 | 169 | 1 |
Nelson Cruz | 0.311 | 163 | 34 |
Cody Bellinger | 0.305 | 162 | 4 |
George Springer | 0.292 | 157 | 10 |
Anthony Rendon | 0.319 | 154 | 7 |
Ketel Marte | 0.329 | 150 | 6 |
This analysis shows how wRC+ better captures the value a player provides to his team while at the plate. Of the top 8 players in wRC+, 7 fall into the top 10 players in WAR. The lone exception is Nelson Cruz, whose lack of defensive value weighed heavily on his WAR.
Meanwhile, for batting average, the results are scattered, with only three of the top 8 batting averages appearing within the top 10 in WAR, and the player with the top batting average clocking in at just 55th in WAR.
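The overlap between each leaderboard and the WAR leaders can be checked with a short script. The sketch below uses only the 13 players who appear in the two tables above (values copied from those tables) and applies a top-10 WAR cutoff to both lists; the helper names are arbitrary.

```python
# (name, AVG, wRC+, WAR rank), copied from the two tables above.
players = [
    ("Tim Anderson", 0.335, 129, 55),
    ("Christian Yelich", 0.329, 175, 3),
    ("Ketel Marte", 0.329, 150, 6),
    ("DJ LeMahieu", 0.327, 135, 18),
    ("Anthony Rendon", 0.319, 154, 7),
    ("Jeff McNeil", 0.318, 143, 28),
    ("Nolan Arenado", 0.315, 129, 12),
    ("Yoan Moncada", 0.315, 140, 16),
    ("Mike Trout", 0.291, 178, 2),
    ("Alex Bregman", 0.296, 169, 1),
    ("Nelson Cruz", 0.311, 163, 34),
    ("Cody Bellinger", 0.305, 162, 4),
    ("George Springer", 0.292, 157, 10),
]


def top_n(rows, column, n=8):
    """Return the n rows with the highest value in the given column index."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]


def count_in_war_top(rows, cutoff=10):
    """Count rows whose WAR rank is at or better than the cutoff."""
    return sum(1 for row in rows if row[3] <= cutoff)


print(count_in_war_top(top_n(players, 1)))  # top 8 by AVG  -> 3
print(count_in_war_top(top_n(players, 2)))  # top 8 by wRC+ -> 7
```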
Finally, recent work in baseball statistics has recognized that values such as batting average can be driven largely by luck. In response, stats like xBA have been developed to quantify how well a batter could be expected to hit based on the quality and strength of his batted balls. Let’s use this metric to find the luckiest and unluckiest hitters in the Major Leagues. The luck column below appears to be a player’s actual batting average minus his xBA, so positive values indicate good fortune.
Luckiest hitters:

Name | luck | WAR |
---|---|---|
Tim Anderson | 0.048 | 3.4 |
Kolten Wong | 0.048 | 3.7 |
Nolan Arenado | 0.045 | 6.0 |
Xander Bogaerts | 0.042 | 6.8 |
Delino DeShields | 0.039 | 0.8 |
Unluckiest hitters:

Name | luck | WAR |
---|---|---|
Justin Smoak | -0.040 | 0.2 |
Marcell Ozuna | -0.039 | 2.5 |
Brandon Drury | -0.028 | -0.6 |
Robinson Cano | -0.023 | 0.7 |
Jurickson Profar | -0.023 | 1.4 |
Interestingly, some of the best players are also some of the luckiest. Nolan Arenado and Xander Bogaerts both fall within the top 12 players in the league in WAR, and appear third and fourth on the “luckiest” list, indicating that perhaps some of their great performance can be attributed to luck as well as skill.
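A luck ranking of this kind can be sketched in a few lines, assuming luck is defined as actual batting average minus xBA (the definition is inferred from the sign of the values above, not stated outright). In the sample below, the batting averages for Anderson and Arenado come from the earlier table and their xBA values are back-derived from the luck column under that assumption; the third row is entirely invented.

```python
# (name, AVG, xBA). xBA values for the first two rows are back-derived
# under the assumption luck = AVG - xBA; the last row is hypothetical.
hitters = [
    ("Nolan Arenado", 0.315, 0.270),
    ("Hypothetical Hitter", 0.250, 0.280),
    ("Tim Anderson", 0.335, 0.287),
]


def luck(avg, xba):
    """Assumed definition: actual minus expected batting average."""
    return round(avg - xba, 3)


# Sort from luckiest (most positive) to unluckiest (most negative).
ranked = sorted(hitters, key=lambda h: luck(h[1], h[2]), reverse=True)

for name, avg, xba in ranked:
    print(f"{name}: {luck(avg, xba):+.3f}")
# Tim Anderson: +0.048
# Nolan Arenado: +0.045
# Hypothetical Hitter: -0.030
```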
Further Analysis
This analysis demonstrates the obvious trend in baseball statistics: away from simple measures such as how often a player gets a hit, and towards more complicated formulas that individually weigh the value that each player provides to his team. The stats of today give a clearer picture of how good a player really is in comparison to his peers, and how much value he provides to his team over a different player.
To continue this exploration, applying these metrics to team data would also be useful. It is likely that an entire team’s average hard hit rate or exit velocity could be used to predict its offensive output, and that this output could be used to roughly predict how good the team would be (though the massive variable of pitching would have to be factored in as well). This type of analysis would require significantly more data, but would likely back up the theory that advanced statistics like hard hit rate do a better job of predicting a team’s outcomes than more antiquated stats like batting average.
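As a starting point for that team-level extension, the sketch below rolls player hard hit rates up to a team figure, weighting each player by plate appearances. The team codes and numbers are invented, and weighting by plate appearances is only an approximation, since hard hit rate is defined per batted ball rather than per plate appearance.

```python
from collections import defaultdict

# Hypothetical rows: (team, plate appearances, hard hit rate).
player_rows = [
    ("AAA", 600, 0.45),
    ("AAA", 400, 0.35),
    ("BBB", 500, 0.40),
    ("BBB", 500, 0.30),
]


def team_hard_hit_rate(rows):
    """Plate-appearance-weighted mean hard hit rate for each team."""
    pa_total = defaultdict(float)
    weighted_sum = defaultdict(float)
    for team, pa, rate in rows:
        pa_total[team] += pa
        weighted_sum[team] += pa * rate
    return {team: weighted_sum[team] / pa_total[team] for team in pa_total}


for team, rate in sorted(team_hard_hit_rate(player_rows).items()):
    print(team, round(rate, 3))
```

The resulting team rates could then be compared against team runs scored, once that data is gathered.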