Introduction

Sir Jason Tibbetts has offered me a lifetime of free CAC jerseys in exchange for my best analysis of what the salaries should be for captains in the Franchise B league. (By the way, how was the name Franchise D-League not already considered?)

# Load libraries
library(ggplot2); library(caret)

# Load Data
df <- read.csv("~/Dropbox/Franchise Cap Hits/Franchise B League Draft Cap Hits.csv", nrows = 100)
names(df) <- c("PR", "Dollars")

Linear Model

First, here’s a look at a linear model of the data.

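# Creating Linear Model
LinMod1 <- lm(Dollars ~ PR, data = df)
b <- signif(coef(LinMod1)[1], digits = 2)  # intercept, rounded to 2 significant digits
a <- signif(coef(LinMod1)[2], digits = 2)  # slope, rounded to 2 significant digits
r <- cor(df$PR, df$Dollars)                # correlation; r^2 measures the quality of the fit
textlab <- paste("y = ", a, "x + ", b, sep = "")  # equation label for the plot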

# Creating graph
ggplot(data = df, aes(PR, Dollars)) +
  geom_point() +
  geom_smooth(method = lm, se = FALSE) +
  annotate("text", x = 25, y = 38, label = textlab, size = 4, parse = FALSE)

It’s not a bad fit, with r^2 = 0.60. If I break the PRs down into traditional bins of width 5 and evaluate the model at each bin’s midpoint, I get the predicted salaries in the table below. I’ve included a column with my recommended salaries in this scenario.

# Table of Linear Model Salaries
ranges <- c("< 14.9", "15-19.9", "20-24.9", "25-29.9", "30-34.9", "35-39.9", "40-44.9", "45-49.9", "50-54.9", "55-59.9", "60 +")
midpoints <- c(12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 62.5)
# Predicted salary at each bin midpoint, using the rounded slope (a) and intercept (b)
y <- a*midpoints + b
table1 <- data.frame(ranges, midpoints, y)
names(table1) <- c("APR Ranges", "Midpoints", "Dollars")
table1$Recommendation <- seq(1, 26, by = 2.5)
print(table1)
##    APR Ranges Midpoints Dollars Recommendation
## 1      < 14.9      12.5    1.45            1.0
## 2     15-19.9      17.5    3.75            3.5
## 3     20-24.9      22.5    6.05            6.0
## 4     25-29.9      27.5    8.35            8.5
## 5     30-34.9      32.5   10.65           11.0
## 6     35-39.9      37.5   12.95           13.5
## 7     40-44.9      42.5   15.25           16.0
## 8     45-49.9      47.5   17.55           18.5
## 9     50-54.9      52.5   19.85           21.0
## 10    55-59.9      57.5   22.15           23.5
## 11       60 +      62.5   24.45           26.0
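
As a rough check, the Dollars column above can be rebuilt by hand and set against the recommendation. The slope 0.46 and intercept -4.3 used here are not quoted anywhere in the analysis; they are read back from the printed table (consecutive rows differ by 2.30 over a PR step of 5), so treat them as inferred values rather than the fitted coefficients themselves.

```r
# Coefficients inferred from the printed table, not taken from the model object
slope <- 0.46       # assumed: (3.75 - 1.45) / 5
intercept <- -4.3   # assumed: 1.45 - 0.46 * 12.5

midpoints <- seq(12.5, 62.5, by = 5)
model_dollars <- slope * midpoints + intercept
recommended <- seq(1, 26, by = 2.5)

# Gap between the recommendation and the model at each bracket
check <- data.frame(midpoints, model_dollars, recommended,
                    gap = recommended - model_dollars)
print(check)
```

The gap starts slightly negative in the lowest bracket (-0.45) and grows to about 1.55 in the top one, so the recommendation tilts toward the higher-rated players relative to the straight line.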

One issue is that the linear model underestimates the values of the highest-paid players. The recommended values in this table increase by $2.50 for every jump in PR bracket, versus $2.30 for the model’s predictions, so my recommendations make up somewhat for that undervaluation. In a further attempt to reconcile that, we can look at a degree 2 polynomial model:

Polynomial Model (degree 2)

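# Creating Poly Model
# I(PR^2) builds the squared term inside the formula, so no separate PR2 vector is needed
PolyMod2 <- lm(Dollars ~ PR + I(PR^2), data = df)
c2 <- signif(coef(PolyMod2)[1], digits = 2)  # intercept
b2 <- signif(coef(PolyMod2)[2], digits = 2)  # coefficient on PR
a2 <- signif(coef(PolyMod2)[3], digits = 2)  # coefficient on PR^2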
#summary(PolyMod2) not shown

# Creating Graph
pr_grid <- seq(0, 57, 0.1)  # PR values at which to draw the fitted curve
predicted_dollars <- predict(PolyMod2, list(PR = pr_grid, PR2 = pr_grid^2))
plot(df$PR, df$Dollars, xlab = "PR", ylab = "Dollars", pch = 20)
lines(pr_grid, predicted_dollars, col = "darkgreen", lwd = 3)
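
One way to judge whether the squared term earns its keep is to compare r^2 for the two fits and run an F-test on the nested models. The sketch below uses simulated data as a stand-in, since the league's CSV is not reproduced here; the data frame `sim`, its curve shape, and the noise level are all assumptions made for illustration.

```r
# Synthetic stand-in data (hypothetical; not the league's actual cap hits)
set.seed(1)
sim <- data.frame(PR = runif(100, min = 10, max = 57))
sim$Dollars <- 2.7 - 0.069 * sim$PR + 0.0082 * sim$PR^2 + rnorm(100, sd = 3)

lin_fit  <- lm(Dollars ~ PR, data = sim)
quad_fit <- lm(Dollars ~ PR + I(PR^2), data = sim)

# r^2 cannot decrease when a term is added, so adjusted r^2 is shown as well
r2 <- c(linear = summary(lin_fit)$r.squared,
        quadratic = summary(quad_fit)$r.squared)
adj_r2 <- c(linear = summary(lin_fit)$adj.r.squared,
            quadratic = summary(quad_fit)$adj.r.squared)
print(round(rbind(r2, adj_r2), 4))

# F-test: does the squared term improve on the straight line?
print(anova(lin_fit, quad_fit))
```

Swapping `sim` for the real `df` would give the comparison for the actual data. A small p-value in the `anova()` row suggests the curvature is more than noise.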

This model has r^2 = 0.6477, a slightly better fit than the linear model’s 0.60. If I had to make recommended cap hit brackets from it, they would look like:

#Table of PolyMod salary brackets
ranges <- c("Below 14.9", "15-19.9", "20-24.9", "25-29.9", "30-34.9", "35-39.9", "40-44.9", "45-49.9", "50-54.9", "55-59.9", "60 +")
x <- c(12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 62.5)
# Predicted salary at each bin midpoint, using the rounded quadratic coefficients
y2 <- a2*x^2 + b2*x + c2
table2 <- data.frame(ranges, x, y2)
names(table2) <- c("APR Ranges", "APR Range Midpoints", "Dollars")
table2$Recommendation <- round(table2$Dollars)
print(table2)
##    APR Ranges APR Range Midpoints  Dollars Recommendation
## 1  Below 14.9                12.5  3.11875              3
## 2     15-19.9                17.5  4.00375              4
## 3     20-24.9                22.5  5.29875              5
## 4     25-29.9                27.5  7.00375              7
## 5     30-34.9                32.5  9.11875              9
## 6     35-39.9                37.5 11.64375             12
## 7     40-44.9                42.5 14.57875             15
## 8     45-49.9                47.5 17.92375             18
## 9     50-54.9                52.5 21.67875             22
## 10    55-59.9                57.5 25.84375             26
## 11       60 +                62.5 30.41875             30
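
To apply the brackets above to an individual captain, a lookup along these lines might do. The helper name `recommended_salary` is hypothetical, and the sketch assumes each bracket starts at its lower edge (15, 20, ..., 60), so a PR between 14.9 and 15 falls in the lowest bracket.

```r
# Lower edges of every bracket above the first, taken from the table's ranges
bracket_edges <- seq(15, 60, by = 5)
# Recommendation column from the polynomial table
poly_recommendation <- c(3, 4, 5, 7, 9, 12, 15, 18, 22, 26, 30)

# Hypothetical helper: map one or more PR values to the recommended salary
recommended_salary <- function(pr) {
  # findInterval() returns 0 below the first edge, so add 1 to index the vector
  poly_recommendation[findInterval(pr, bracket_edges) + 1]
}

recommended_salary(c(12, 15, 37.2, 65))
# 3 4 12 30
```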

Conclusions

Sir Tibbetts, do as you please with my impartial analysis. I think either of the recommendations would be fine, and I’d lean slightly towards the polynomial model. At the end of the day, Franchise B players, I hope my thorough analysis enhances how much fun all of you have running up and down the court chasing a basketball around like a bunch of 10-year-olds in a soccer game.

BJAX OUT