Final Project
Data Visualization (STAT 302)
Overview💡
With the huge financial rewards that come with competing in the English Premier League, major teams are investing unprecedented amounts of capital in big data and analytics. I will analyze the entire 2021-22 season to give insight into team performance, strategy trends, and other general correlations, and to see how the data explains last season's results.
The data source I will be using is the 2021-2022 Premier League statistics offered by FBref. I imported the data into Google Sheets and downloaded it as a CSV file for this project.
Chicago-Style Citation: “Premier League Stats.” fbref.com. Accessed October 17, 2022. https://fbref.com/en/comps/9/Premier-League-Stats.
Graphic 1: Measuring Performance 📈
For my first graphic, I will analyze the performance of each team in the league by comparing its gf (total goals scored for) and x_g (total expected goals). If a team scored more goals than its expected goals, it overachieved; if it scored fewer goals than its expected goals, it underachieved. This comparison is useful for analyzing a team's collective performance and for judging whether it was simply fortunate or unfortunate in its results.
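The comparison described above comes down to the sign of gf minus x_g. The following is a minimal sketch in Python, not the code used to build the graphic. The team names and numbers are made-up placeholders, and `performance_gap` and `label` are hypothetical helpers. Only the column names gf and x_g come from the dataset.

```python
# Hypothetical rows: (team, gf, x_g). Placeholder values, NOT FBref data.
teams = [
    ("Team A", 60, 52.4),
    ("Team B", 41, 47.9),
    ("Team C", 50, 50.0),
]


def performance_gap(gf, x_g):
    """Return goals scored minus expected goals (positive = overachieved)."""
    return gf - x_g


def label(gap):
    """Classify a team by the sign of its gf - x_g gap."""
    if gap > 0:
        return "overachieved"
    if gap < 0:
        return "underachieved"
    return "matched expectation"


for team, gf, x_g in teams:
    gap = performance_gap(gf, x_g)
    print(f"{team}: {gap:+.1f} ({label(gap)})")
```

Sorting teams by this gap would give one natural ordering for a bar or dumbbell chart of over- and underachievers.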
Graphic 2: Measuring Financial Impact 📈
This second graphic illustrates the correlation between a team’s market value (market_value) and the total points (pts) it accrued in the 2021-22 season, using a scatter plot with a fitted linear regression line.
Running cor.test on the two variables gives a Pearson correlation coefficient of 0.9008644. From this number and the confidence band shaded around the regression line, we see that a team’s market value is strongly and positively correlated with its success, measured by how many points it earned in the 2021-22 season.
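For readers who want to see what the Pearson coefficient measures, here is a minimal sketch of the calculation in Python. It is an illustration only: the analysis itself used R's cor.test, which also reports a p-value and a confidence interval that this sketch does not compute. The function name `pearson_r` is hypothetical and the input values are placeholders, not the real market_value and pts columns.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, NOT the real market_value / pts data.
market_value = [100, 250, 400, 800, 950]
pts = [35, 40, 52, 70, 90]
print(round(pearson_r(market_value, pts), 3))
```

A value near 1 means the points fall close to an upward-sloping line, which is what the coefficient reported above indicates for the real data.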
Graphic 3: Playstyle Analysis 📈
This graphic analyzes each team’s playstyle and its relationship to points by looking at total touches in five zones of the field: def_pen (defensive penalty area), def_3rd (defensive third), mid_3rd (middle third), att_3rd (attacking third), and att_pen (attacking penalty area). Note that the zones are not equal in size, and each penalty area lies inside its third. By looking at every team’s total touches in each zone, we can see which kind of playstyle is most effective.
From the graphic above we can see that teams with more touches generally earned more points. While some teams, like Newcastle Utd and West Ham, found success with a more counterattack-oriented strategy, teams with a possession-based strategy that kept the ball in the middle and attacking thirds tended to do better in the Premier League.
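One way to compare playstyles across teams with very different touch totals is to convert zone counts into shares. The sketch below, in Python, is illustrative only and was not part of the original analysis. It assumes that the three thirds together cover the whole field, so the penalty-area counts are left out to avoid double counting. `zone_shares` is a hypothetical helper and the numbers are placeholders.

```python
def zone_shares(def_3rd, mid_3rd, att_3rd):
    """Return each third's share of a team's touches across the three thirds."""
    total = def_3rd + mid_3rd + att_3rd
    if total == 0:
        raise ValueError("team has no recorded touches")
    return {
        "def_3rd": def_3rd / total,
        "mid_3rd": mid_3rd / total,
        "att_3rd": att_3rd / total,
    }


# Placeholder touch counts, NOT real FBref values.
shares = zone_shares(def_3rd=200, mid_3rd=500, att_3rd=300)
for zone, share in shares.items():
    print(f"{zone}: {share:.0%}")
```

A team with a high combined mid_3rd and att_3rd share would fit the possession-based profile described above, while a high def_3rd share would point toward a deeper, counterattacking setup.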
Graphic 4: Home-Game Analysis 📈
After the COVID-19 season of 2020-2021, the league began to emphasize the importance of fans coming back into the stadium, pointing to home game advantage. With this graphic we will see how influential fans were in a team’s success by plotting home win percentage (home_percent) against away win percentage (away_percent), with a size aesthetic mapped to average attendance (attendance).
The green y = x line shows where every team would fall if home game advantage did not exist. A team above that line performed better at home, and a team below it performed better away. We see that most teams, regardless of attendance, performed better at home than away, with small teams like Brentford and Burnley improving more at home, relative to their away records, than giants like Chelsea.
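Reading the plot against the y = x line is equivalent to checking the sign of home_percent minus away_percent. The short Python sketch below illustrates that check. It assumes home_percent is on the vertical axis, and it uses hypothetical helper names and placeholder teams and values rather than the real data.

```python
def home_edge(home_percent, away_percent):
    """Vertical distance from the y = x line (positive = better at home)."""
    return home_percent - away_percent


def side_of_line(home_percent, away_percent):
    """Say where a team falls relative to the y = x reference line."""
    edge = home_edge(home_percent, away_percent)
    if edge > 0:
        return "above"
    if edge < 0:
        return "below"
    return "on"


# Placeholder rows: (team, home_percent, away_percent). NOT real data.
teams = [("Team A", 58, 37), ("Team B", 42, 47), ("Team C", 50, 50)]
for team, home, away in teams:
    print(f"{team}: {side_of_line(home, away)} the line, "
          f"edge {home_edge(home, away):+d} points")
```

The size of the edge, not just its sign, is what allows a smaller club to show a larger home advantage than a bigger club with a better overall record.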