2026-04-08

Introduction

  • Goal: Predict match outcomes using player performance metrics
  • Approach: Logistic regression + visualization

Data & Variables

  • Simulated dataset (100 players)

  • Variables:

    • Kill/Death/Assist ratio (KDA)
    • Gold per Minute (GPM)
    • Damage per Minute (DPM)
    • Vision Score
    • Win/Loss
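
The slides do not show how the dataset was simulated. A minimal sketch of one way to generate such data, assuming normally distributed metrics and a logistic link for Win. The column names match the later code; every mean, standard deviation, and coefficient below is a hypothetical placeholder, not the value behind the original results.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
n <- 100      # number of simulated players, as stated above

# Hypothetical distributions for each metric (assumed, not from the slides)
data <- data.frame(
  KDA    = pmax(rnorm(n, mean = 3,   sd = 1),  0.1),  # keep KDA positive
  GPM    = rnorm(n, mean = 400, sd = 60),
  DPM    = rnorm(n, mean = 500, sd = 100),
  Vision = pmax(rnorm(n, mean = 30,  sd = 10), 0)     # vision score cannot be negative
)

# Assumed "true" log-odds of winning; coefficients are illustrative only
log_odds <- -10 + 0.3 * data$KDA + 0.012 * data$GPM +
            0.008 * data$DPM + 0.02 * data$Vision

# Draw Win (1) or Loss (0) from the implied probabilities
data$Win <- rbinom(n, size = 1, prob = plogis(log_odds))

str(data)
```

Storing Win as numeric 0/1 keeps it usable both as a plot axis and as the response in glm().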

Exploratory Plot (KDA vs Win)

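library(ggplot2)

ggplot(data, aes(x = KDA, y = Win)) +
  geom_point(alpha = 0.6) +
  # Fit a logistic curve so the smoothed line stays within [0, 1]
  geom_smooth(method = "glm", method.args = list(family = "binomial"),
              color = "blue") +
  labs(title = "KDA vs Win Probability", y = "Win (1) / Loss (0)")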

Exploratory Plot (GPM vs DPM)

ggplot(data, aes(x = GPM, y = DPM, color = factor(Win))) +
  geom_point() +
  labs(title = "GPM vs DPM by Match Outcome",
       color = "Win")

Logistic Regression Model

\[ P(Y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 KDA + \beta_2 GPM + \beta_3 DPM + \beta_4 Vision)}} \]

  • Models probability of winning
  • Uses multiple performance metrics

# Fit a logistic regression of Win on the four performance metrics
model <- glm(Win ~ KDA + GPM + DPM + Vision,
             data = data, family = "binomial")
summary(model)
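
Once fitted, the model can score new stat lines. A minimal sketch using stand-in data: the simulated values, the seed, and the two hypothetical players are assumptions for illustration, not the dataset or results from these slides.

```r
set.seed(1)  # arbitrary seed
n <- 100

# Stand-in data with the same column names as the slides (values are assumed)
data <- data.frame(KDA = rnorm(n, 3, 1), GPM = rnorm(n, 400, 60),
                   DPM = rnorm(n, 500, 100), Vision = rnorm(n, 30, 10))
data$Win <- rbinom(n, 1, plogis(-10 + 0.3 * data$KDA + 0.012 * data$GPM +
                                0.008 * data$DPM + 0.02 * data$Vision))

model <- glm(Win ~ KDA + GPM + DPM + Vision, data = data, family = "binomial")

# Two hypothetical players: one with weak stats, one with strong stats
new_players <- data.frame(KDA = c(1.5, 5), GPM = c(300, 500),
                          DPM = c(350, 700), Vision = c(15, 45))

# type = "response" returns probabilities instead of log-odds
win_prob <- predict(model, newdata = new_players, type = "response")
print(round(win_prob, 3))
```

Without type = "response", predict() returns values on the log-odds scale.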

Log-Odds (Logit) Form

\[ \log\left(\frac{P(Y=1)}{1 - P(Y=1)}\right) = \beta_0 + \beta_1 KDA + \beta_2 GPM + \beta_3 DPM + \beta_4 Vision \]

  • Transforms probability into log-odds
  • Log-odds are linear in the predictors
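
A small worked example of the two transforms, using an arbitrary example probability of 0.8. qlogis() and plogis() are the base R logit and inverse-logit functions.

```r
p <- 0.8                      # example win probability (arbitrary)
log_odds <- log(p / (1 - p))  # logit: log(0.8 / 0.2) = log(4)

# qlogis() is the built-in logit and should agree
stopifnot(isTRUE(all.equal(log_odds, qlogis(p))))

# The inverse (logistic) transform recovers the probability
p_back <- 1 / (1 + exp(-log_odds))

# plogis() is the built-in inverse logit and should agree
stopifnot(isTRUE(all.equal(p_back, plogis(log_odds))))

print(c(log_odds = log_odds, p_back = p_back))
```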

Model Interpretation

  • Positive coefficient = higher win probability as that metric increases, holding the others fixed
  • GPM and DPM are typically the strongest predictors
  • KDA contributes but is less dominant; it is likely a byproduct of the other metrics
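
Coefficients are on the log-odds scale, so exponentiating one gives an odds ratio. A minimal sketch of that arithmetic, assuming a hypothetical GPM coefficient of 0.012 (an illustrative value, not one estimated in these slides):

```r
beta_gpm <- 0.012  # hypothetical coefficient for GPM (not a fitted value)

# Odds ratio for a one-unit increase in GPM
or_per_unit <- exp(beta_gpm)

# Odds ratio for a 50-point increase in GPM: exp(0.012 * 50) = exp(0.6)
or_per_50 <- exp(beta_gpm * 50)

# The change in probability depends on the starting point;
# here the player starts at even odds (p = 0.5)
p_start  <- 0.5
odds_new <- (p_start / (1 - p_start)) * or_per_50
p_new    <- odds_new / (1 + odds_new)

print(round(c(or_per_unit = or_per_unit, or_per_50 = or_per_50, p_new = p_new), 3))
```

For a fitted model, exp(coef(model)) gives the same odds ratios for all predictors at once.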

3D Visualization (Plotly)
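
The interactive figure does not survive in this text version. A rough sketch of how such a plot might be built with the plotly package, assuming GPM, DPM, and KDA on the three axes and points colored by outcome. The axis choice and the stand-in data are assumptions, not taken from the original figure.

```r
library(plotly)

set.seed(7)  # arbitrary seed
n <- 100

# Stand-in data with the slides' column names (values are assumed)
data <- data.frame(KDA = rnorm(n, 3, 1), GPM = rnorm(n, 400, 60),
                   DPM = rnorm(n, 500, 100))
data$Win <- rbinom(n, 1, plogis(-9 + 0.3 * data$KDA + 0.012 * data$GPM +
                                0.008 * data$DPM))

# 3D scatter: one point per player, colored by match outcome
fig <- plot_ly(data, x = ~GPM, y = ~DPM, z = ~KDA,
               color = ~factor(Win),
               type = "scatter3d", mode = "markers")

# Label the axes
fig <- layout(fig, scene = list(xaxis = list(title = "GPM"),
                                yaxis = list(title = "DPM"),
                                zaxis = list(title = "KDA")))
fig
```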

Key Insights

  • Higher gold and damage are strongly linked to winning
  • Vision has a moderate impact
  • Model estimates are unlikely to be predictive of real matches, since many more variables change dynamically throughout a game

Conclusion

  • Statistical modeling helps evaluate player performance
  • Logistic regression predictions are accurate for extreme hypothetical stats
  • Can extend to real esports datasets (e.g., pro matches)