Using data from FBRef.com, I can graph each team's Goals/Game by year since 2010. To change the team, replace the row index x in goalsPerGame[x, 1:12] in the plot statement with a number from 1 to 29. If you download the CSV file from the Google Sheet, the key is:
1: NEW ENGLAND REVOLUTION 2: PHILADELPHIA UNION 3: NASHVILLE 4: INTER MIAMI 5: CHARLOTTE 6: NYCFC 7: ATLANTA 8: ORLANDO 9: RED BULLS 10: DC UNITED 11: COLUMBUS CREW 12: MONTREAL 13: CHICAGO FIRE 14: TORONTO 15: CINCINNATI 16: COLORADO RAPIDS 17: SEATTLE SOUNDERS 18: SPORTING KC 19: PORTLAND TIMBERS 20: MINNESOTA 21: VANCOUVER WHITECAPS 22: RSL 23: LA GALAXY 24: LAFC 25: SAN JOSE EARTHQUAKES 26: FC DALLAS 27: AUSTIN 28: HOUSTON DYNAMO 29: LEAGUE AVERAGE
goalsPerGame = read.csv("https://docs.google.com/spreadsheets/d/1-2VkGfZoijfjMHblG5gf31FxUJN3KpmbuBqlHQAO1PU/gviz/tq?tqx=out:csv&sheet=MLS", header = TRUE, row.names = 1) #csvData
plot(c(2010:2021), goalsPerGame[2,1:12], type = "l", col= "#cfc29c", main = "Philadelphia Union Goals/Game by Season", lwd = 3, xlab = "Year", ylab = "Goals/Game")
lines(c(2010:2021), goalsPerGame[29,1:12], col = "black", lwd = 3)
legend("topleft", legend=c("Union", "League Average"),
col=c("#cfc29c", "black"),lwd=c(3, 3))
To plot two teams on the same graph, use the lines() function.
plot(c(2010:2021), goalsPerGame[11,1:12], type = "l", col= "#fcdb04", main = "Columbus Crew vs Chicago Fire Goals/Game by Season", lwd = 3, xlab = "Year", ylab = "Goals/Game")
lines(c(2010:2021), goalsPerGame[13,1:12], col = "#fe0000", lwd = 3)
lines(c(2010:2021), goalsPerGame[29,1:12], col = "black", lwd = 3)
legend("topleft", legend=c("Crew","Fire", "League Average"),
col=c("#fcdb04","#fe0000", "black"),lwd=c(3, 3, 3))
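A minimal sketch of wrapping the plot(), lines(), and legend() calls above into one helper, so that switching teams only means changing the row numbers. The name plot_teams and its arguments are hypothetical and not part of the original analysis; it assumes goals is laid out like goalsPerGame, with one row per team and the leading columns holding the seasons in order.

```r
# Hypothetical helper: draw one line per requested row of `goals`.
# rows   - row indices from the team key (e.g. 11 = Columbus Crew)
# labels - legend text, one entry per row
# colors - line colors, one entry per row
plot_teams <- function(goals, rows, labels, colors,
                       years = 2010:2021,
                       main = "Goals/Game by Season") {
  cols <- seq_along(years)
  # Convert each one-row slice into a plain numeric vector
  series <- lapply(rows, function(r) as.numeric(unlist(goals[r, cols])))
  # Shared y-axis range so later lines are not clipped by the first plot
  ylim <- range(unlist(series), na.rm = TRUE)
  plot(years, series[[1]], type = "l", col = colors[1], lwd = 3,
       ylim = ylim, main = main, xlab = "Year", ylab = "Goals/Game")
  for (i in seq_along(series)[-1]) {
    lines(years, series[[i]], col = colors[i], lwd = 3)
  }
  legend("topleft", legend = labels, col = colors, lwd = 3)
  invisible(series)
}
```

With the data loaded, the Crew vs Fire graph would then be plot_teams(goalsPerGame, c(11, 13, 29), c("Crew", "Fire", "League Average"), c("#fcdb04", "#fe0000", "black")). Unlike the plain plot() call, the sketch sets ylim from all series, which is a design choice of this example rather than something the original plots do.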
After noticing frequently occurring peaks, where teams would immediately fall back after a good year, I decided to test how often teams decreased their goals/game in the year after an increase. Conversely, I also calculated whether teams were likely to rebound after a decrease in production from the previous year.
downContinueCount = 0
downReboundCount = 0
upContinueCount = 0
upFallCount = 0
upGrowth = c()
downGrowth = c()
# Return one team's 12 season values (2010-2021) as a plain numeric vector
row_Of = function(rowNum){
  totalnums = c()
  for (i in 1:12){
    totalnums[i] = goalsPerGame[rowNum, i]
  }
  return(totalnums)
}
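# Rows 1 to 28 are the teams; row 29 (league average) is left out
for (r in 1:28) {
  team = row_Of(r)
  # Each y looks at three consecutive seasons: y, y + 1, y + 2
  for (y in 1:10) {
    # Skip windows with missing seasons (e.g. before a team joined the league)
    if (!is.na(team[y]) && !is.na(team[y+1]) && !is.na(team[y+2])){
      if (team[y + 1] > team[y]){
        # Production went up; record the change in the following year
        upGrowth[length(upGrowth) + 1] = team[y + 2] - team[y + 1]
        if (team[y + 2] > team[y + 1]){
          upContinueCount = upContinueCount + 1
        } else if (team[y + 2] < team[y+1]) {
          upFallCount = upFallCount + 1
        }
      } else if (team[y+1] < team[y]){
        # Production went down; record the change in the following year
        downGrowth[length(downGrowth) + 1] = team[y + 2] - team[y + 1]
        if (team[y + 2] > team[y + 1]) {
          downReboundCount = downReboundCount + 1
        } else if (team[y + 2] < team[y + 1]) {
          downContinueCount = downContinueCount + 1
        }
      }
    }
  }
}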
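The same tally for a single team can also be sketched with diff() instead of explicit indexing. This is only an illustration under assumptions: classify_changes is a hypothetical name, and the input vector in the usage lines is made-up toy data, not values from the spreadsheet.

```r
# Hypothetical vectorized version of the per-team tally.
# x: numeric vector of goals/game by season, NA for seasons not played.
classify_changes <- function(x) {
  d <- diff(x)              # d[i] = x[i + 1] - x[i]
  first  <- d[-length(d)]   # change from season y to y + 1
  second <- d[-1]           # change from season y + 1 to y + 2
  # Keep only windows where all three seasons are present
  ok <- !is.na(first) & !is.na(second)
  first  <- first[ok]
  second <- second[ok]
  list(
    upContinue   = sum(first > 0 & second > 0),
    upFall       = sum(first > 0 & second < 0),
    downRebound  = sum(first < 0 & second > 0),
    downContinue = sum(first < 0 & second < 0),
    upGrowth     = second[first > 0],
    downGrowth   = second[first < 0]
  )
}

# Toy input (invented values) with one missing season
toy <- c(1.0, 1.5, 1.2, NA, 1.4, 1.6, 1.8)
res <- classify_changes(toy)
```

As in the loop, a year with no change at all is left out of the four counts, but it still contributes to upGrowth or downGrowth when it follows an increase or decrease.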
upContinueCount #Increase in current year, another increase the following year
## [1] 33
upFallCount #Increase in current year, decrease the following year
## [1] 60
For teams which had an increase in production, just 35.5% \((\frac{33}{33+60})\) had another increase the following year. For the NFL data, it was 35.3% \((\frac{120}{120+220})\). It’s somewhat disheartening to know that an increase in production has a 64.5% chance of not continuing. More on this later.
downContinueCount #Decrease in current year, another decrease the following year
## [1] 28
downReboundCount #Decrease in current year, increase the following year
## [1] 51
For teams which had a decrease in production the previous year, the “peakiness” is about the same: 64.6% \((\frac{51}{51+28})\) of teams rebounded after a down year. For the NFL data, the proportion was very similar at 65.6% \((\frac{212}{212+111})\). A possible reason is that the long-run league average has been increasing in both leagues, especially in MLS, which makes increases in production more likely.
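As a quick check on the arithmetic, the four percentages quoted above can be recomputed from the counts. The helper name rate is hypothetical; the counts are the ones reported in the text (the NFL counts are taken as given).

```r
# Hypothetical helper: share of `hits` among hits + misses, in percent
rate <- function(hits, misses) round(100 * hits / (hits + misses), 1)

mls_up_continue  <- rate(33, 60)    # 35.5: increase followed by another increase
mls_down_rebound <- rate(51, 28)    # 64.6: decrease followed by an increase
nfl_up_continue  <- rate(120, 220)  # 35.3
nfl_down_rebound <- rate(212, 111)  # 65.6
```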
mean(upGrowth)
## [1] -0.107732
sqrt(var(upGrowth))
## [1] 0.3025052
mean(downGrowth)
## [1] 0.1174118
sqrt(var(downGrowth))
## [1] 0.3249951
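Comparing the averages, teams coming off an increase lost about 0.11 goals/game the following year, while teams coming off a decrease gained about 0.12. So the “pointiness” was slightly greater for teams who had a decrease in production, and their standard deviation was greater as well (0.325 vs 0.303).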
hist(upGrowth, xlab = "Growth in G/Game the following year after an increase the previous year", col = "lightGreen")
hist(downGrowth, xlab = "Growth in G/Game the following year after a decrease the previous year", col = "red")
So a potential question to ask is, “Why don’t teams maintain increases or decreases in offensive production?” Is it just “regressing to the mean”, or maybe cap/contractual issues? Maybe it’s a bit of both, or some other factors which I have not thought about. But my main question while looking at this data is, “If teams are likely to fall back after an increase in production, why would anyone tank and try to rebuild?” The main point of a rebuild is to have a steady increase in production, with the hope of it turning into a superteam and winning a championship. But if success is unsustainable, is there any point in intentionally losing games?