Assignment 7 Visuals

Author

James Reddy

# load packages
library(tidyverse)
library(httr)
library(rvest)

library(polite)
library(lubridate)
library(magrittr)

library(readr)
# load data
all_players <- 
  read.csv("https://myxavier-my.sharepoint.com/:x:/g/personal/reddyj_xavier_edu/EZDCUm2VSi5BiFQmjhwtFT4BXpZvwmuOZXhp7ejcWPqqeQ?download=1")

Which Premier League club has the most red cards this season?

all_players %>%
  filter(league == "EPL") %>%
  group_by(team_title) %>%
  summarise(total_red_cards = sum(red_cards, na.rm = TRUE)) %>%
  arrange(desc(total_red_cards)) %>%
  ggplot(aes(x = reorder(team_title, total_red_cards), y = total_red_cards)) +
  geom_col(fill = "red") +
  coord_flip() +
  labs(title = "Total Red Cards by Premier League Club",
       x = "Club",
       y = "Number of Red Cards")

This graph shows that Arsenal and Ipswich are tied for the most red cards, with 5 each. Crystal Palace is right behind with 4, while most clubs have received one red card or none so far this season.

Is there any correlation between Expected Assists and Key Passes?

all_players %>%
  select(player_name, key_passes, xA) %>%
  ggplot(aes(x = key_passes, y = xA)) +
  geom_point(color = "black") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Relationship Between Key Passes and xA",
       x = "Key Passes",
       y = "Expected Assists (xA)")

The graph shows a strong positive correlation between key passes and xA: players who make more key passes tend to have a higher xA.
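To put a number on the trend seen in the plot, the Pearson correlation coefficient can be computed with base R's `cor()`. The sketch below uses a small made-up data frame (`toy_players`, a hypothetical name with invented values) that shares the `key_passes` and `xA` column names with `all_players`. Substituting `all_players` would give the coefficient for the real data, which is not reported here.

```r
# Toy data with the same column names as all_players (values are made up)
toy_players <- data.frame(
  key_passes = c(5, 12, 20, 33, 47, 60),
  xA         = c(0.4, 1.1, 2.3, 3.0, 4.8, 5.9)
)

# Pearson correlation; use = "complete.obs" drops rows with missing values
r <- cor(toy_players$key_passes, toy_players$xA, use = "complete.obs")

# A value of r close to 1 indicates a strong positive linear relationship
print(round(r, 3))
```

A coefficient supports the word "strong" more directly than the fitted line alone, since `geom_smooth(method = "lm")` draws a line regardless of how scattered the points are.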

Which teams have the most Expected Goals?

all_players %>%
  group_by(team_title, league) %>%
  summarize(total_xG = sum(xG, na.rm = TRUE), .groups = "drop") %>%
  filter(total_xG > 65) %>%
  arrange(desc(total_xG)) %>%
  ggplot(aes(x = reorder(team_title, total_xG), y = total_xG, fill = league)) +
  geom_col() +
  coord_flip() +
  labs(title = "Total xG by Team",
       x = "Team",
       y = "Total Expected Goals (xG)",
       fill = "League")

The graph shows PSG leading all teams in xG. The leagues are evenly mixed in the top half of the list, but the bottom half contains a long run of Premier League clubs.

Which player has the worst ratio of red cards to minutes played?

all_players %>%
  filter(red_cards > 0) %>%  # keep players with at least one red card (avoids dividing by zero)
  mutate(minutes_per_red_card = time / red_cards) %>%
  filter(minutes_per_red_card < 200) %>%  # keep only the worst ratios
  # negate so the fewest minutes per card sits at the top after coord_flip()
  ggplot(aes(x = reorder(player_name, -minutes_per_red_card), y = minutes_per_red_card)) +
  geom_col(fill = "red") +
  coord_flip() +
  labs(title = "Players with the Fewest Minutes per Red Card",
       x = "Player",
       y = "Minutes per Red Card")

Dele Alli has the worst ratio of red cards to minutes played. Two other players also received a red card after playing fewer than 25 minutes.
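The ratio itself is simple arithmetic: minutes played divided by red cards, where a smaller value means a worse record. The following is a minimal sketch on made-up data (`toy_players`, with invented player names and values) that reuses the `time` and `red_cards` column names from `all_players`. It filters out players with no red cards before dividing, since dividing by zero would produce `Inf`.

```r
library(dplyr)

# Toy data with the same column names as all_players (names and values are made up)
toy_players <- tibble(
  player_name = c("Player A", "Player B", "Player C", "Player D"),
  time        = c(20, 900, 1800, 2500),  # minutes played
  red_cards   = c(1, 2, 1, 0)
)

worst_ratio <- toy_players %>%
  filter(red_cards > 0) %>%                            # avoid dividing by zero
  mutate(minutes_per_red_card = time / red_cards) %>%  # e.g. 900 / 2 = 450
  arrange(minutes_per_red_card)                        # fewest minutes per card first

print(worst_ratio)
```

In this toy example, Player D is dropped for having no red cards, and Player A tops the list with one red card in 20 minutes.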

Which players have outperformed their Expected Goals the most?

all_players %>%
  select(player_name, league, goals, xG) %>%
  mutate(goal_diff = goals - xG) %>%
  filter(goal_diff > 4) %>%
  arrange(desc(goal_diff)) %>%
  ggplot(aes(x = reorder(player_name, goal_diff), y = goal_diff, fill = league)) +
  geom_col() +
  coord_flip() +
  labs(title = "Players Who Outscored Their Expected Goals (xG) by More Than 4",
       x = "Player",
       y = "Goals Above xG",
       fill = "League")

Matheus Cunha has outperformed his xG the most, and three of the top four players are in the Premier League.
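The chart above selects players with a threshold on `goal_diff`, so the number of bars depends on the data. If a fixed number of players is wanted instead, `dplyr::slice_max()` is one option. The sketch below is an illustration on made-up data (`toy_players`, with invented names and values), not the method used for the chart. It keeps the top 3 rows; `n = 10` would give a top 10.

```r
library(dplyr)

# Toy data with the same column names as all_players (names and values are made up)
toy_players <- tibble(
  player_name = c("Player A", "Player B", "Player C", "Player D", "Player E"),
  goals       = c(15, 10, 8, 20, 5),
  xG          = c(9.5, 9.0, 3.2, 18.1, 6.4)
)

top_overperformers <- toy_players %>%
  mutate(goal_diff = goals - xG) %>%                # goals scored above expectation
  slice_max(goal_diff, n = 3, with_ties = FALSE)    # exactly 3 rows, largest first

print(top_overperformers)
```

Setting `with_ties = FALSE` guarantees exactly `n` rows; the default keeps every player tied at the cutoff.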