The article I am working with predicts league results for the 2023 club soccer season. The article is available at https://projects.fivethirtyeight.com/soccer-predictions/.
I simplified the table to just the team name, offensive rating, and defensive rating. I then added a new column indicating whether each team's offensive rating is above or below the average across all teams.
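library(RCurl)

# Download the SPI global rankings CSV and parse it into a data frame
x <- getURL("https://raw.githubusercontent.com/bwolin99/TestRepo/main/spi_global_rankings.csv")
y <- read.csv(text = x)

# Keep only the team name, offensive rating, and defensive rating
df <- data.frame(Name = y$name, Off = y$off, Def = y$def)

# Label each team relative to the mean offensive rating
# (a team exactly at the mean is labeled "Below Average" instead of being left as NA)
df$Off_Rating <- ifelse(df$Off > mean(df$Off), "Above Average", "Below Average")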
head(df)
##              Name  Off  Def    Off_Rating
## 1 Manchester City 2.79 0.28 Above Average
## 2   Bayern Munich 3.04 0.68 Above Average
## 3       Barcelona 2.45 0.43 Above Average
## 4     Real Madrid 2.56 0.60 Above Average
## 5       Liverpool 2.63 0.67 Above Average
## 6         Arsenal 2.53 0.61 Above Average
In the future, I could use each team's SPI rating to compare different leagues. This could provide insight into which league can be considered the “strongest”, meaning the league made up of teams with the highest ratings.
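
One possible way to start on that league comparison is sketched below. It is only an illustration, not a finished analysis: `league_strength` is a hypothetical helper name, it assumes the data has a `league` column and a numeric `spi` column, and the small `toy` data frame contains made-up placeholder values rather than real ratings.

```r
# Hypothetical helper: average SPI per league, strongest league first.
# Assumes `rankings` has a `league` column and a numeric `spi` column.
league_strength <- function(rankings) {
  out <- aggregate(spi ~ league, data = rankings, FUN = mean)
  names(out)[names(out) == "spi"] <- "mean_spi"
  out <- out[order(-out$mean_spi), ]  # sort from highest to lowest mean
  rownames(out) <- NULL
  out
}

# Made-up placeholder values for illustration only, not real ratings
toy <- data.frame(
  league = c("League A", "League A", "League B", "League B"),
  spi    = c(80, 70, 90, 40)
)

ranked <- league_strength(toy)
print(ranked)
```

If the downloaded data frame `y` does contain `league` and `spi` columns, the same helper could be called as `league_strength(y)`. Using the mean is one choice among several; a median, or the mean of only the top few teams per league, might be less sensitive to leagues with many weak teams.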