The Greatest European Clubs: A Data-Driven Analysis

🔎 Exploring club performance in UEFA history using data visualization and analytics
📊 Let’s dive into the stats behind football’s biggest legends!

Introduction

🔹 Why This Analysis?
Football isn’t just about trophies—it’s about performance, consistency, and dominance.

💡 Key Questions We’ll Explore:
✔ Do the most successful teams always have the best win percentage?
✔ How do goals scored & conceded impact success?
✔ Which clubs are statistical powerhouses?

📊 Using R for Data Visualization and Analysis!

Understanding the Dataset

Source: Sports Statistics Dataset

🧐 Key Data Columns:
- WinPercentage = (Wins / Matches Played) × 100
- Goals.For & Goals.Against = Goals Scored & Goals Conceded
- Goal.Diff = Goals.For - Goals.Against
- Titles = Number of European Cups Won
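💻 A minimal sketch of how the derived columns could be computed. It assumes raw columns named Wins and Matches (hypothetical names; the source dataset may label them differently) and uses made-up numbers for placeholder clubs, not real records:

```r
# Toy data with made-up numbers; Wins and Matches are assumed column names
toy <- data.frame(
  Club          = c("Club A", "Club B", "Club C"),
  Matches       = c(200, 150, 80),
  Wins          = c(120, 75, 20),
  Goals.For     = c(400, 250, 90),
  Goals.Against = c(200, 180, 110)
)

# Win percentage: share of matches won, on a 0-100 scale
toy$WinPercentage <- toy$Wins / toy$Matches * 100

# Goal difference: goals scored minus goals conceded
toy$Goal.Diff <- toy$Goals.For - toy$Goals.Against

print(toy[, c("Club", "WinPercentage", "Goal.Diff")])
```

Base R is used here only to keep the sketch dependency-free; the same two columns could be added with dplyr's mutate().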

Most Dominant Clubs (ggplot2)

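# Load the packages used for wrangling and plotting (df is assumed to be loaded already)
library(dplyr)
library(ggplot2)

# Select top 15 clubs by Win Percentage
top_clubs <- df %>%
  arrange(desc(WinPercentage)) %>%
  head(15)

# Plot the top 15 clubs; with coord_flip(), ascending order puts the best club at the top
ggplot(top_clubs, aes(x = reorder(Club, WinPercentage), y = WinPercentage, fill = Country)) +
  geom_col() +
  coord_flip() +
  labs(title = "Top 15 Clubs by Win Percentage", x = "Club", y = "Win Percentage (%)") +
  theme_minimal(base_size = 12)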

🎯 Key Takeaways:
- Real Madrid, Bayern Munich, and Barcelona lead in win percentage.
- Some clubs with high win rates (e.g., Manchester City) haven’t won many titles yet!

Win Percentage vs. Titles (ggplot2)

ggplot(df, aes(x = WinPercentage, y = Titles, color = Country)) +
  geom_point(size = 4, alpha = 0.7) +
  geom_smooth(method = "lm", color = "black", se = FALSE) +
  labs(title = "Win Percentage vs. Titles", x = "Win Percentage (%)", y = "Titles Won") +
  theme_minimal(base_size = 14)
## `geom_smooth()` using formula = 'y ~ x'

📌 What This Means:
- Higher win percentage correlates with more trophies.
- A few clubs (e.g., Manchester City, PSG) perform well but have few titles.
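💻 One way to make "performs well but has few titles" concrete is to look at residuals from the same linear fit that the trend line draws. The sketch below uses made-up numbers for placeholder clubs; it illustrates the idea and is not the analysis behind the plot:

```r
# Toy data with made-up numbers, standing in for df
toy <- data.frame(
  Club          = c("Club A", "Club B", "Club C", "Club D", "Club E"),
  WinPercentage = c(60, 55, 58, 45, 40),
  Titles        = c(10, 6, 0, 2, 1)
)

# Same kind of model that geom_smooth(method = "lm") fits: Titles ~ WinPercentage
fit <- lm(Titles ~ WinPercentage, data = toy)

# Negative residual = fewer titles than the trend line predicts
toy$Residual <- residuals(fit)

# Order clubs from furthest below the line to furthest above it
underperformers <- toy[order(toy$Residual), c("Club", "WinPercentage", "Titles", "Residual")]
print(underperformers)
```

In this toy example Club C, with a high win rate and no titles, lands at the top of the list; on the real data the same ordering would show which clubs sit furthest below the trend.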

3D Analysis – Goals For, Goals Against, and Titles (plotly)

# plotly provides plot_ly() and layout()
library(plotly)

# 3D scatter: one marker per club, shaded by number of titles
plot_ly(df, x = ~`Goals.For`, y = ~`Goals.Against`, z = ~Titles,
        type = "scatter3d", mode = "markers",
        marker = list(size = 6, color = ~Titles, colorscale = "Blues"),
        hoverinfo = "text",
        # Hover label shown for each club
        text = ~paste("Club:", Club,
                      "<br>Titles:", Titles,
                      "<br>Goals Scored:", `Goals.For`,
                      "<br>Goals Conceded:", `Goals.Against`)) %>%
  layout(title = "Goals For vs Goals Against vs Titles",
         scene = list(xaxis = list(title = "Goals Scored"),
                      yaxis = list(title = "Goals Conceded"),
                      zaxis = list(title = "Titles Won")))

🚀 Insights:
- High-scoring teams (Real Madrid, Barcelona) also concede a fair number of goals!
- Defense matters: clubs that concede fewer goals tend to win more titles.
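💻 Raw goal totals grow with the number of matches played, so a per-match view may help separate "plays a lot" from "defends well". A rough sketch, assuming a Matches column (a hypothetical name) and made-up numbers:

```r
# Toy data with made-up numbers; Matches is an assumed column name
toy <- data.frame(
  Club          = c("Club A", "Club B", "Club C"),
  Matches       = c(400, 200, 100),
  Goals.For     = c(900, 380, 150),
  Goals.Against = c(450, 200, 140)
)

# Per-match rates remove the effect of simply having played more games
toy$ScoredPerMatch   <- toy$Goals.For / toy$Matches
toy$ConcededPerMatch <- toy$Goals.Against / toy$Matches

# Sort from the tightest defense (fewest conceded per match) to the leakiest
print(toy[order(toy$ConcededPerMatch), ])
```

Here Club A concedes the most goals in total yet has one of the lower per-match rates, which is the pattern the first insight hints at.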

Win Percentage Formula (LaTeX)

🧮 How do we calculate win percentage?

\[ \text{Win Percentage} = \frac{\text{Wins}}{\text{Matches Played}} \times 100 \]

For example, Real Madrid’s Win Percentage is:

🏆 59.7%

Goal Impact Formula (LaTeX)

We analyze the goal difference impact using:

\[ \text{Goal Difference} = \text{Goals For} - \text{Goals Against} \]

🔍 Clubs with higher goal difference tend to win more titles!

cor_goal_diff_titles <- cor(df$`Goal.Diff`, df$Titles)
paste("Correlation:", round(cor_goal_diff_titles, 2))
## [1] "Correlation: 0.87"

📈 Correlation = 0.87
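💻 A single correlation coefficient says nothing about uncertainty, and title counts are skewed toward a handful of clubs. As a hedged sketch on made-up numbers, base R's cor.test() adds a p-value and confidence interval, and a Spearman rank correlation offers a check that is less sensitive to one outlying club:

```r
# Toy data with made-up numbers, standing in for df
toy <- data.frame(
  Goal.Diff = c(500, 320, 150, 90, 40, -20),
  Titles    = c(12, 6, 3, 1, 1, 0)
)

# Pearson correlation (what cor() computes by default), with p-value and CI
pearson <- cor.test(toy$Goal.Diff, toy$Titles, method = "pearson")

# Spearman rank correlation; exact = FALSE avoids a warning when title counts are tied
spearman <- cor.test(toy$Goal.Diff, toy$Titles, method = "spearman", exact = FALSE)

cat("Pearson r:   ", round(unname(pearson$estimate), 2), "\n")
cat("Pearson CI:  ", round(pearson$conf.int, 2), "\n")
cat("Spearman rho:", round(unname(spearman$estimate), 2), "\n")
```

If the two coefficients differ noticeably on the real data, that would suggest a few extreme clubs are driving the Pearson value.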

Key Findings

🏅 Who’s the Best?
- Real Madrid is the most dominant (14 titles, most goals scored).
- Manchester City & PSG post strong numbers but lack a long record of European titles.

Scoring Goals Matters!
- The more you score, the more likely you are to win trophies.
- But conceding fewer goals is just as important!

Final Thoughts & Future Work

💡 What’s Next?
🔹 How do finances impact performance? (Transfer spending vs. success)
🔹 Tactical analysis – defensive vs. attacking clubs
🔹 Machine Learning to predict Champions League winners

🔥 The data doesn’t lie—football success is a mix of skill, history, and statistics!

👏 Thank you! 🚀