Multilevel models handle data that are clustered into groups; in this example, the data are EPL players grouped by team. The idea is to include team as a random effect, which accounts for team-level influences such as different coaches' tactics and playing styles.
To start, we can assess our response variable, the overall FIFA rating score, and gain insight into the distribution of overall ratings.
Next, we look at the data for skill dribbling, which we aim to use as our predictor variable.
After filtering out some data, we have this graph showing the correlation between skill ball control and overall rating.
For the sake of this example, we only compare variables within a single league, the EPL.
Now we can look at a facet-wrapped linear model for each team in the English Premier League, showing the relationship between skill_dribbling and overall FIFA rating.
##
## Call:
## lm(formula = overall ~ skill_dribbling, data = epl)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.8975 -3.9597 -0.2916 3.6592 20.6764
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.46882 1.74192 20.94 <2e-16 ***
## skill_dribbling 0.52463 0.02482 21.14 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.443 on 564 degrees of freedom
## Multiple R-squared: 0.4421, Adjusted R-squared: 0.4411
## F-statistic: 446.9 on 1 and 564 DF, p-value: < 2.2e-16
The fitted equation is: overall rating = 36.47 + 0.52 × (dribbling skill)
For every 1-unit increase in dribbling skill, the predicted overall rating increases by 0.52 points on average.
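As a small worked example, the fitted line can be turned into a prediction function. This is only a sketch: the coefficients are copied from the `lm()` summary above, while the function name `predict_overall` and the example dribbling values are made up for illustration.

```r
# Coefficients copied from the lm() summary (overall ~ skill_dribbling)
intercept <- 36.46882
slope     <- 0.52463

# Hypothetical helper: predicted overall rating for a given dribbling score
predict_overall <- function(dribbling) {
  intercept + slope * dribbling
}

# Predictions for three illustrative dribbling scores
print(predict_overall(c(60, 70, 80)))

# A 1-unit increase in dribbling changes the prediction by exactly the slope
print(predict_overall(71) - predict_overall(70))
```

With the `epl` data frame loaded, `predict(lm(overall ~ skill_dribbling, data = epl), newdata = ...)` would give the same predictions without copying coefficients by hand.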
The purpose of the unconditional means model is to assess the amount of variation at each level, comparing variability within clubs to variability between clubs. Club is our random intercept.
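In equation form, an unconditional means model with club as the random intercept is commonly written as below. The notation is an assumption chosen to match the σ2 and τ00 labels in the model table: i indexes players, j indexes clubs, γ00 is the overall mean, and u_j is club j's deviation from it.

```latex
\begin{aligned}
\text{overall}_{ij} &= \gamma_{00} + u_{j} + \varepsilon_{ij}, \\
u_{j} &\sim \mathcal{N}(0, \tau_{00}), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2}), \\
\text{ICC} &= \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
\end{aligned}
```

The ICC is the share of the total variance in overall rating that lies between clubs rather than between players within a club.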
Here we have a plot showing the intercepts of each team for overall rating.
- Fixed effect (Intercept) = 72.86 (the baseline overall rating across clubs)
- Random effect (club): variance = 7.09, which corresponds to a standard deviation of about 2.66, so club intercepts typically deviate from 72.86 by roughly 2.66 rating points
- Chelsea has the largest intercept: 72.86 + 4.16 = 77.02
- Brentford has the smallest intercept: 72.86 - 2.95 = 69.91
- When dribbling skill is added to the model:
  - Manchester United has the largest intercept deviation (+1.26)
  - Wolverhampton has the smallest intercept deviation (-1.38)
For another model, we can add dribbling skill as a random slope. A random slope lets the effect of a predictor vary across groups, so the relationship between dribbling skill and overall rating can differ from club to club. Where such differences exist, allowing for them can improve model fit and give more accurate estimates of the effects.
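The mixed models discussed here could be specified along the following lines with `lme4`. This is a sketch rather than the original code: only the `lm()` call appears in the output above, so the `lmer()` formulas, the object names (`ri`, `ri_sd`, `rs_im`) and the simulated data frame `sim` are assumptions, and the numbers it produces are not the EPL results. It assumes the `lme4` package is installed.

```r
library(lme4)

# Simulated stand-in for the EPL data (NOT the real data): 20 clubs, 28 players each
set.seed(1)
n_clubs <- 20
n_per   <- 28
club_id <- rep(seq_len(n_clubs), each = n_per)
u0 <- rnorm(n_clubs, mean = 0, sd = 3)    # club-level intercept deviations
u1 <- rnorm(n_clubs, mean = 0, sd = 0.1)  # club-level slope deviations
dribbling <- round(runif(n_clubs * n_per, min = 40, max = 90))
overall <- 37 + u0[club_id] + (0.5 + u1[club_id]) * dribbling +
  rnorm(n_clubs * n_per, mean = 0, sd = 6)
sim <- data.frame(
  club_name       = factor(club_id),
  skill_dribbling = dribbling,
  overall         = overall
)

# Unconditional means model: random intercept for club, no predictors
ri <- lmer(overall ~ 1 + (1 | club_name), data = sim)

# Random intercept plus dribbling skill as a fixed effect
ri_sd <- lmer(overall ~ skill_dribbling + (1 | club_name), data = sim)

# Random intercept and random slope for dribbling skill
# (may warn about convergence or a singular fit on some data)
rs_im <- lmer(overall ~ skill_dribbling + (1 + skill_dribbling | club_name),
              data = sim)

# Variance components: intercept variance, slope variance, and their correlation
print(VarCorr(rs_im))

# Club-specific deviations from the fixed intercept and slope
head(ranef(rs_im)$club_name)
```

In the `(1 + skill_dribbling | club_name)` term, the part before the bar lists what is allowed to vary (intercept and slope) and the part after it names the grouping factor.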
Dependent variable in all models: overall. Predictor cells show estimate (CI), p-value.

| Predictors | LM | RI | RI_SD | RS_IM | Without_RS |
|---|---|---|---|---|---|
| (Intercept) | 36.47 (33.05 – 39.89), p <0.001 | 72.86 (71.51 – 74.21), p <0.001 | 37.48 (34.00 – 40.95), p <0.001 | 37.50 (33.11 – 41.88), p <0.001 | 60.49 (60.12 – 60.86), p <0.001 |
| skill dribbling | 0.52 (0.48 – 0.57), p <0.001 | | 0.51 (0.46 – 0.56), p <0.001 | 0.51 (0.45 – 0.57), p <0.001 | 0.09 (0.09 – 0.10), p <0.001 |
| **Random Effects** | | | | | |
| σ2 | | 67.55 | 40.31 | 39.45 | 25.33 |
| τ00 | | 7.09 club_name | 1.30 club_name | 38.39 club_name | 12.07 league_name:club_name; 3.66 club_name |
| τ11 | | | | 0.01 club_name.skill_dribbling | |
| ρ01 | | | | -0.98 club_name | |
| ICC | | 0.10 | 0.03 | 0.05 | 0.38 |
| N | | 20 club_name | 20 club_name | 20 club_name | 56 league_name; 702 club_name |
| Observations | 566 | 566 | 566 | 566 | 19239 |
| R2 / R2 adjusted (LM); marginal R2 / conditional R2 (mixed models) | 0.442 / 0.441 | 0.000 / 0.095 | 0.427 / 0.445 | 0.425 / 0.456 | 0.069 / 0.426 |
| Name | Model | AIC | AIC_wt | AICc | AICc_wt | BIC | BIC_wt | RMSE | Sigma | R2_conditional | R2_marginal | ICC | R2 | R2_adjusted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LM | lm | 3719.230 | 0.2170648 | 3719.272 | 0.2217871 | 3732.245 | 0.7851841 | 6.432091 | 6.443485 | NA | NA | NA | 0.4420716 | 0.4410824 |
| RI | lmerMod | 4023.303 | 0.0000000 | 4023.345 | 0.0000000 | 4036.318 | 0.0000000 | 8.107824 | 8.218994 | 0.0950210 | 0.0000000 | 0.0950210 | NA | NA |
| RI_SD | lmerMod | 3717.497 | 0.5162730 | 3717.568 | 0.5200159 | 3734.851 | 0.2133771 | 6.286890 | 6.349113 | 0.4451975 | 0.4272943 | 0.0312607 | NA | NA |
| RS_IM | lmerMod | 3718.818 | 0.2666621 | 3718.968 | 0.2581969 | 3744.850 | 0.0014388 | 6.181165 | 6.280851 | 0.4555591 | 0.4254237 | 0.0524479 | NA | NA |
| Without_RS | lmerMod | 118810.646 | 0.0000000 | 118810.649 | 0.0000000 | 118849.969 | 0.0000000 | 4.945197 | 5.032757 | 0.4255178 | 0.0689104 | 0.3830001 | NA | NA |
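The ICC values in the tables can be reproduced, up to rounding, from the variance components. The sketch below assumes the usual random-intercept formula, ICC = τ00 / (τ00 + σ2), with multiple random-intercept variances summed for the nested model; the helper name `icc_from_variances` is hypothetical. RS_IM is left out because, with a random slope, the ICC is not this simple ratio.

```r
# Hypothetical helper: ICC from random-intercept variance(s) and residual variance
icc_from_variances <- function(tau00, sigma2) {
  sum(tau00) / (sum(tau00) + sigma2)
}

# Variance components copied from the model table (rounded to 2 decimals there)
icc_ri         <- icc_from_variances(7.09, 67.55)             # RI
icc_ri_sd      <- icc_from_variances(1.30, 40.31)             # RI_SD
icc_without_rs <- icc_from_variances(c(12.07, 3.66), 25.33)   # Without_RS (two intercept variances)

print(round(c(RI = icc_ri, RI_SD = icc_ri_sd, Without_RS = icc_without_rs), 4))
```

Small differences from the ICC column above come from using variances that were already rounded to two decimals.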
R2 BEST MODEL: The LM has an R² of 0.442, which is higher than the marginal R² of the mixed models (0.427 for RI_SD and 0.425 for RS_IM). The marginal R² counts only the variance explained by the fixed effects, however. The conditional R², which also counts the variance explained by the random effects, is slightly higher for RI_SD (0.445) and RS_IM (0.456) than the LM's R². Because these quantities are defined differently, R² can be misleading when comparing models that differ in their form, and a higher R² does not by itself mean a model captures the structure of the data better.
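AIC BEST MODEL: RI_SD (random intercepts with skill dribbling) has the lowest AIC (3717.50). AIC estimates relative prediction error while penalizing model complexity, so the lowest value indicates the preferred model. The margin is small, though: LM (3719.23) and RS_IM (3718.82) are within 2 AIC points of RI_SD, and BIC favors LM (3732.25).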
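The AIC_wt column can be reproduced from the AIC values with the standard Akaike-weight formula, w_i = exp(-Δ_i / 2) / Σ exp(-Δ_j / 2), where Δ_i is each model's AIC minus the smallest AIC. The sketch below uses the AIC values from the comparison table and is a minimal illustration, not the code that produced the table. Without_RS is excluded because it was fitted to a different data set (19239 observations), and AIC is only comparable across models fitted to the same data.

```r
# AIC values copied from the comparison table (models fitted to the 566 EPL players)
aic <- c(LM = 3719.230, RI = 4023.303, RI_SD = 3717.497, RS_IM = 3718.818)

# Difference from the best (lowest) AIC
delta <- aic - min(aic)

# Akaike weights: relative support for each model within this candidate set
weights <- exp(-delta / 2) / sum(exp(-delta / 2))

print(round(weights, 4))
print(names(which.min(aic)))  # model with the lowest AIC
```

A weight near 0.5 for the best model, with the rest split between two close competitors, is another way of seeing that the AIC preference is not decisive.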
Teams like Manchester United and Manchester City have overall ratings well above the league average after controlling for dribbling skill.
This variation may reflect different playing styles, strategies, or team dynamics that affect ratings independently of individual dribbling skill.
The random effects show how much each team’s performance deviates from the league average while accounting for individual player skills (skill_dribbling), emphasizing the importance of both team-level differences and individual skill in predicting overall performance.