After reading the paper “Pay and Performance in Major League Baseball”, written by Gerald W. Scully, I thought it might be interesting to extend the use of Marginal Revenue Product (MRP) to football and investigate the relationship between performance and player salary. Because performance is measured differently at each position, I opted to focus only on wide receivers (WRs) in this project. While blocking is an important aspect of WR play, their ability to get open and gain yards is what primarily earns them their pay. I will therefore estimate the impact receiving yards have on win probability, and in turn win probability’s impact on team revenue. While this paper does not provide an exact estimate of WR MRP, it recreates the framework upon which such an estimate can be built.
I used two datasets in this project. The nflreadr package contains exceptionally granular data on all aspects of the game, including the win probability and passing yards for each individual game. Due to computing limitations, I only analyzed games from the 2019 season. Additionally, some of the variables in the dataset are recorded at the drive level. For those, I simply summed the values over the entire game, which lowers the number of observations and lets me look at each game as a whole.
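The game-level aggregation described above can be sketched roughly as follows. This is a minimal illustration on a toy data frame, not the code used for the project: the column names `game_id`, `posteam`, `wp`, and `passing_yards` are assumptions about the layout of the play-by-play data, all values are invented, and averaging win probability is a guess suggested only by the variable name `avg_wp` used in the regressions.

```r
# Toy play-level data; every value is invented purely for illustration.
plays <- data.frame(
  game_id       = c("g1", "g1", "g1", "g2", "g2"),
  posteam       = c("MIN", "MIN", "MIN", "GB", "GB"),
  wp            = c(0.50, 0.55, 0.60, 0.40, 0.50),
  passing_yards = c(12, NA, 8, 20, 5)   # NA marks a non-passing play
)

# Sum passing yards per game and team; rows with NA yards are dropped by default.
yards <- aggregate(passing_yards ~ game_id + posteam, data = plays, FUN = sum)

# Average win probability per game and team (assumed meaning of avg_wp).
avg_wp <- aggregate(wp ~ game_id + posteam, data = plays, FUN = mean)
names(avg_wp)[names(avg_wp) == "wp"] <- "avg_wp"

# One row per game and team, ready for a game-level regression.
game_level <- merge(yards, avg_wp, by = c("game_id", "posteam"))
print(game_level)
```

Base R's `aggregate()` and `merge()` are used here only to keep the sketch free of extra dependencies; a grouped summary in another package would serve equally well.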
The second dataset is team revenue data from the 2019 season. I found a set of tables on the RunRepeat website, scraped them using Stata, and exported the result in a format usable by R.
For future analysis, I have also located a Kaggle dataset that contains player salaries for multiple seasons. Although it is not yet used here, it will be crucial for estimating whether athletes are over- or underpaid.
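Marginal Revenue Product in economic terms represents the ratio between the change in an entity’s revenue and the change in labor input. In mathematical terms, this can be expressed as:

$$MRP = \frac{\Delta TR}{\Delta L}$$

where $TR$ is the entity’s total revenue and $L$ is its labor input.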
For sports specifically, this formula can be recast as the ratio between the change in the entity’s revenue and the change in the team’s talent. As talent can be hard to quantify, a proxy variable must be selected. In this write-up I use passing yards.
I use simple OLS models to generate the slope estimates needed for the MRP calculation.
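The two-step estimation can be written as a chain of two slopes. The following is a sketch of the decomposition implied by the text. The symbols $TR$ (team revenue), $WP$ (win probability), and $Y$ (passing yards), along with the coefficient labels $\hat{\beta}_1$ and $\hat{\gamma}_1$, are notation introduced here for illustration only.

$$
\begin{aligned}
WP &= \beta_0 + \beta_1 Y + \varepsilon \\
TR &= \gamma_0 + \gamma_1 WP + u \\
MRP_Y &= \frac{\Delta TR}{\Delta Y} \approx \frac{\Delta TR}{\Delta WP} \cdot \frac{\Delta WP}{\Delta Y} = \hat{\gamma}_1 \cdot \hat{\beta}_1
\end{aligned}
$$

Both slopes must use the same scale for $WP$ (either 0 to 1 or percentage points) for the product to be meaningful.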
In the first step of calculating WR MRP, I estimate the impact passing yards have on the win probability of a team. The dependent variable of the model is avg_wp, which represents the expected win probability of a team in a given game. The independent variable is passing_yards, which represents the passing yards of a team in a given game.
# Step 1: regress average win probability on passing yards
Reg1 <- lm(avg_wp ~ passing_yards, data = GameDATA)
# sp = scored
summary(Reg1)
##
## Call:
## lm(formula = avg_wp ~ passing_yards, data = GameDATA)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.096032 -0.020200 -0.000389  0.016932  0.137502
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)   4.538e-01  8.689e-03  52.230  < 2e-16 ***
## passing_yards 7.920e-05  1.686e-05   4.698 4.22e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03126 on 265 degrees of freedom
## Multiple R-squared: 0.07688, Adjusted R-squared: 0.0734
## F-statistic: 22.07 on 1 and 265 DF, p-value: 4.224e-06
The results show that, on average, each additional passing yard increases a team’s win probability by 0.0079 percentage points, ceteris paribus.
The following model estimates the impact that win probability has on revenue. Teams that are more likely to win are also more likely to attract bigger crowds, and therefore generate bigger revenue in a given year.
reg2 <- lm(data = MergedData, MergedData$"2019_revenue_million" ~ avg_wp2)
summary(reg2)
##
## Call:
## lm(formula = MergedData$"2019_revenue_million" ~ avg_wp2, data = MergedData)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -122.42  -56.26  -18.38   19.41  512.75
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   400.63      99.75   4.016  7.7e-05 ***
## avg_wp2       147.02     196.58   0.748    0.455
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 104.1 on 265 degrees of freedom
## Multiple R-squared: 0.002106, Adjusted R-squared: -0.001659
## F-statistic: 0.5593 on 1 and 265 DF, p-value: 0.4552
The results are given in millions of dollars. Note that the avg_wp2 variable is on a 0 to 1 scale, meaning that a one percentage point increase corresponds to a 0.01 increase. The coefficient therefore implies that, on average, each one percentage point increase in win probability raises the team’s revenue by 1.4702 million dollars, ceteris paribus. The estimate is not statistically significant (p = 0.455), so it should be read as suggestive only.
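Multiplying the two slope estimates gives the MRP of an additional passing yard: 0.00792 percentage points of win probability per yard times $1.4702 million per percentage point is about $11,644. This implies that a receiver who gains 1,000 yards in a season contributes roughly $11.6 million to the team’s revenue. Justin Jefferson had almost exactly a thousand yards last season, so the framework would credit him with about that amount. The figure should be read with caution: the revenue coefficient is not statistically significant, and because the model uses team passing yards as its only channel, it ignores other revenue a star receiver brings in, such as Vikings jersey sales. In that respect it likely understates his contribution.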
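The arithmetic behind the MRP figure can be checked with a few lines of R. This is a sketch that simply chains the two slope estimates printed in the regression summaries. The variable names are hypothetical, and the 1,000-yard season assumes the per-yard effect scales linearly. The key point is unit consistency: both slopes must treat win probability on the same scale. Pairing the 0 to 1 scale slope from the first model with the per-percentage-point revenue effect from the second would shrink the result by a factor of 100.

```r
# Slope estimates copied from the two regression summaries.
beta_wp_per_yard <- 7.920e-05  # change in win probability (0 to 1 scale) per passing yard
gamma_rev_per_wp <- 147.02     # change in revenue (millions of dollars) per unit of win probability

# MRP per yard in dollars: chain the two slopes, then convert millions to dollars.
mrp_per_yard <- beta_wp_per_yard * gamma_rev_per_wp * 1e6

# Same calculation with both slopes expressed in percentage points.
mrp_check <- (beta_wp_per_yard * 100) * (gamma_rev_per_wp / 100) * 1e6

# Linear scaling to a hypothetical 1,000-yard season.
mrp_1000_yards <- mrp_per_yard * 1000

print(c(per_yard = mrp_per_yard, per_1000_yards = mrp_1000_yards))
```

Because the two unit conventions cancel when applied consistently, `mrp_per_yard` and `mrp_check` agree.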
In future attempts to improve the accuracy of these results, I would likely employ a two-way fixed effects (TWFE) model, using team and year fixed effects to control for some of the endogeneity in the regressions. I would also acquire more team revenue data. Finally, I would use the player salary dataset mentioned above to estimate position discrimination among football position groups.
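A two-way fixed effects specification of the revenue model could look roughly like the sketch below. It is not part of the analysis above. The panel is simulated, every number in it is invented, and the names `panel`, `revenue_million`, and `twfe` are hypothetical. The fixed effects are entered as dummy variables through `factor()` in base R's `lm()`, which is one simple way to fit such a model.

```r
set.seed(1)

# Simulated team-season panel; all values are invented for illustration only.
panel <- expand.grid(team = paste0("T", 1:8), year = 2015:2019)
team_effect <- rnorm(8)[as.integer(factor(panel$team))]
year_effect <- rnorm(5)[as.integer(factor(panel$year))]
panel$avg_wp <- 0.5 + 0.05 * team_effect + rnorm(nrow(panel), sd = 0.05)
panel$revenue_million <- 400 + 150 * panel$avg_wp +
  40 * team_effect + 10 * year_effect + rnorm(nrow(panel), sd = 5)

# Two-way fixed effects: team and year dummies absorb time-invariant team
# differences and league-wide shocks common to all teams in a year.
twfe <- lm(revenue_million ~ avg_wp + factor(team) + factor(year), data = panel)

# Slope on win probability after removing team and year effects.
print(coef(twfe)["avg_wp"])
```

With real data, this requires revenue and win probability for several seasons per team, which is why additional revenue data would be needed first.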
Scully, Gerald W. “Pay and Performance in Major League Baseball.” The American Economic Review, vol. 64, no. 6, 1974, pp. 915–30. JSTOR, http://www.jstor.org/stable/1815242. Accessed 22 Sept. 2024.
Rizzo, Nicholas. “160+ NFL Franchise Value Statistics 2021 [Research Review].” RunRepeat, 1 Nov. 1970, runrepeat.com/nfl-franchise-value.
“NFLREADR • Download NFLVERSE Data.” Dev Status, nflreadr.nflverse.com/index.html. Accessed 22 Sept. 2024.
Antonov, Aleksandr. “Football Players Salaries.” Kaggle, 6 June 2019, www.kaggle.com/datasets/trolukovich/football-players-salaries.