1 Introduction

Heya, Go4ino here again with another SGC data analytics report. The original inspiration for this report was RebelFox on Twitter asking if I could investigate whether champions with different types of player bases perform differently.

For example: Is there a difference between Ornn and Sona? Ornn has been played extensively by practically every top laner in SGC, while Sona is more or less exclusively played by Dean. Does this mean Sona has higher average performance? Or is it overall better to stick primarily to the meta?

Thankfully @pookarGG’s SGC data set exists and has concrete data to analyze. I did heavy data modification / rearranging / tinkering / etc. to get the data sets I used in RStudio.


2 The Data

As mentioned previously, the original source for this data was Pookar’s SGC stats doc. Specifically, this report pulled data from the Champion By Player and DATA tabs. I put all this data together into an Excel spreadsheet to read into RStudio.

Pookar’s data covers match results from almost every AM league up to 7/27, with only BIG League missing. Even with some matches absent, there are 493 games to analyze.

The sheet is heavily automated, meaning every tab of interest is updated when match histories are entered. I chose the Champion By Player tab as it allowed me to separate champions by position more easily to distinguish flex picks, and I felt like lumping flex champs together would skew results (e.g., Top, Jungle, and Support Sett have vastly different statistics). The DATA tab was used for calculations for the entire sample population.


3 Analyzing the Data

3.1 The Basics; Categorizing and Defining the 3 Types of Playerbase

fig 1: Interactive dot-plot of player champ play rate vs win rate

note: only champions with at least 10 games played, and players with 10 games on a champ are shown.

To interact with the graph, hover over it to display options, hover over data points to display critical information, and click on the positions in the legend to show/hide data points for a specific role.

The vertical lines at 20% and 60% of total play rate are where I drew the boundaries separating the 3 factor groups. I decided on these 3 groupings based on what I perceived to be 3 distinct groups of data, and picked 20% and 60% since they divide the 3 groups fairly well. Of course, these separations are by no means written in stone; they are just my personal interpretation, based purely on Proportion of Picks and my eyeballs’ guessing power.

Important to note is that due to the minimum game requirements this graph is naturally biased towards players who have played the most games. For example, 100T FallenBandit moved to starting top during UPL playoffs and has only 8 games in the recorded data, meaning it is impossible for him to be included anywhere on fig 1. Likewise, teams who play more games per series on average are more likely to have their players appear on the graph. Let’s say Team X is extremely dominant and 2-0s every opponent, but Team Y is less consistent and all their series go to game 3. Team Y then plays 50% more games than Team X (3 per series instead of 2) despite playing the same number of Bo3 series, so the minimum game requirement inherently favors Team Y.

Points to the left are more widely played meta champs, whereas points to the right tend to have players who represent a significant portion of a given champ’s playerbase. As such I have categorized the 3 playerbase groups as: Broad, Medium, and Narrow. You can think of Broad as meta champions, Medium as somewhat meta champions that certain players/teams may favor more heavily than others, and Narrow as champions that are almost exclusively played by single players.
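The grouping described above can be sketched in a few lines of Python. This is only an illustration, not the code used for fig 1: it assumes a player's play rate is their share of all recorded games on a champion in a given position, and the row layout, the function name `classify_playerbase`, and the handling of values exactly on the 20% and 60% lines are all assumptions. The sample rows are made up.

```python
from collections import defaultdict

# Cut-offs read off the vertical lines in fig 1; which side a value exactly
# on a line falls is an assumption of this sketch.
BROAD_MAX = 0.20
NARROW_MIN = 0.60
MIN_GAMES = 10  # minimum games for both the champion and the player


def classify_playerbase(rows):
    """rows: (player, champion, position, games) tuples.

    Returns {(player, champion, position): (share, category)} for every
    row that passes both minimum-game filters.
    """
    rows = list(rows)
    totals = defaultdict(int)
    for _player, champ, pos, games in rows:
        totals[(champ, pos)] += games

    result = {}
    for player, champ, pos, games in rows:
        total = totals[(champ, pos)]
        if total < MIN_GAMES or games < MIN_GAMES:
            continue  # filtered out, like players with fewer than 10 games
        share = games / total
        if share < BROAD_MAX:
            category = "Broad"
        elif share < NARROW_MIN:
            category = "Medium"
        else:
            category = "Narrow"
        result[(player, champ, pos)] = (share, category)
    return result


# Hypothetical rows, not taken from the dataset.
demo_rows = [
    ("PlayerA", "ChampX", "Sup", 26),
    ("PlayerB", "ChampX", "Sup", 4),
    ("PlayerC", "ChampY", "Top", 12),
    ("PlayerD", "ChampY", "Top", 30),
    ("PlayerE", "ChampY", "Top", 58),
]
for key, (share, category) in classify_playerbase(demo_rows).items():
    print(key, f"{share:.2f}", category)
```

Keying the totals on (champion, position) keeps flex picks separate, which matches the reason given earlier for using the Champion By Player tab.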

As expected, the further along the x-axis you go the fewer data points there are. But are there any differences to be noted?

3.2 Does Playing Narrow Category Champions Have Any Effect On Performance Off of That Champion?

This was a question that popped into my head for the 4 players in the Narrow category. While boasting high performance on those champions they’re known for is great, can they still perform when not on said champions? Because if they struggle on different champions, well then said power picks could be a potential weakness for their teams rather than a strength.

To analyze this question I decided to perform hypothesis testing to check the differences.

Due to time limitations I’ve decided to only analyze Win rate, as it is a fairly popular metric for judging performance. However, Win rate alone won’t give us as complete a picture of champion performance as including other metrics would. For example, comparing gold/minute or lane difference @ 10 minutes could help show how well the players generate leads.

My original plan was to compare the mean Win rate for each of the 4 players on and off their champions, using 2 sample hypothesis testing. However, in order to do that each player would need at least 10 wins and 10 losses on their signature champs, and at least 10 wins and 10 losses off them. Sadly, only Dean meets these criteria. Thus for the 4 players I will just be running a single sample hypothesis test to check if they still have a significantly better Win rate than the population 50% average, and just do a direct comparison of Win rate for each player. These single sample hypothesis tests will be done basically the same as the hypothesis test in 3.1.1, with just some numbers changed. For all 4 of these tests we will be testing the following null and alternative hypotheses at a 95% confidence level:

\[ H_0: p = 0.5\\ H_1: p > 0.5 \]

Furthermore we already know that the population proportion is \(p=0.5\), aka a 50% wr.
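For anyone who wants to redo these single sample tests, here is a minimal Python sketch using the normal approximation to the binomial. The function names are mine, and it assumes the upper-tail version of the test (is the win rate significantly better than 50%?), with the standard deviation of the win count taken as \(\sqrt{np(1-p)}\).

```python
import math


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_sample_win_test(wins, games, p0=0.5):
    """Upper-tail z-test of H0: p = p0 against H1: p > p0.

    Returns (z, p_value) using the normal approximation to the binomial.
    """
    expected_wins = games * p0                      # mu = n * p
    sigma = math.sqrt(games * p0 * (1.0 - p0))      # sqrt(n * p * (1 - p))
    z = (wins - expected_wins) / sigma
    p_value = 1.0 - normal_cdf(z)
    return z, p_value


# Example: a player with 24 wins in 46 games off their signature champ.
z, p_value = one_sample_win_test(wins=24, games=46)
print(f"Z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

The same call works for any player by swapping in their wins and games off their signature champ.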

Thankfully when all 4 players are combined we meet the requirements for a 2 sample hypothesis test. As such I will run a 2 sample hypothesis test there.

3.2.1 Always Plan Ahea

Always Plan Ahea Averages
| Metric | Total | A-Sol | Non A-Sol Champs |
|---|---|---|---|
| Games | 61 | 15 | 46 |
| Wins | 36 | 12 | 24 |
| Win rate | 0.5902 | 0.8000 | 0.5217 |

table 2

\[ n = 46\\ \mu = np = 23\\ \sigma =\sqrt{np(1-p)} = \sqrt{46*0.5(1-0.5)} \approx 3.3912 \\ Z_0 = \frac{24-23}{\sigma} \approx 0.2949\\ Pvalue = 1 - P(Z \le Z_0) \approx 0.3840 > 0.05 = \alpha \]

We fail to reject \(H_0\) because our P-value is greater than \(\alpha\). Thus there is no significant statistical difference in Win rate between Always Plan Ahea when he’s not on A-Sol and the overall average.


3.2.2 Dean

Dean Averages
| Metric | Total | Sona | Non Sona Champs |
|---|---|---|---|
| Games | 59 | 26 | 33 |
| Wins | 37 | 20 | 17 |
| Win rate | 0.6271 | 0.7692 | 0.5152 |

table 3

\[ n = 33\\ \mu = np = 16.5\\ \sigma =\sqrt{np(1-p)} = \sqrt{33*0.5(1-0.5)} \approx 2.8723 \\ Z_0 = \frac{17-16.5}{\sigma} \approx 0.1741\\ Pvalue = 1 - P(Z \le Z_0) \approx 0.4309 > 0.05 = \alpha \]

We fail to reject \(H_0\) because our P-value is greater than \(\alpha\). Thus there is no significant statistical difference in Win rate between Dean when he’s not on Sona and the overall average.


3.2.3 Hunter

Hunter Averages
| Metric | Total | Nunu | Non Nunu Champs |
|---|---|---|---|
| Games | 70 | 10 | 60 |
| Wins | 39 | 7 | 32 |
| Win rate | 0.5571 | 0.7000 | 0.5333 |

table 4

\[ n = 60\\ \mu = np = 30\\ \sigma =\sqrt{np(1-p)} = \sqrt{60*0.5(1-0.5)} \approx 3.8730 \\ Z_0 = \frac{32-30}{\sigma} \approx 0.5164\\ Pvalue = 1 - P(Z \le Z_0) \approx 0.3028 > 0.05 = \alpha \]

We fail to reject \(H_0\) because our P-value is greater than \(\alpha\). Thus there is no significant statistical difference in Win rate between Hunter when he’s not on Nunu and the overall average.


3.2.4 Lobozz

Lobozz Averages
| Metric | Total | Jinx | Non Jinx Champs |
|---|---|---|---|
| Games | 61 | 15 | 46 |
| Wins | 37 | 11 | 26 |
| Win rate | 0.6066 | 0.7333 | 0.5652 |

table 5

\[ n = 46\\ \mu = np = 23\\ \sigma =\sqrt{np(1-p)} = \sqrt{46*0.5(1-0.5)} \approx 3.3912 \\ Z_0 = \frac{26-23}{\sigma} \approx 0.8846\\ Pvalue = 1 - P(Z \le Z_0) \approx 0.1882 > 0.05 = \alpha \]

We fail to reject \(H_0\) because our P-value is greater than \(\alpha\). Thus there is no significant statistical difference in Win rate between Lobozz when he’s not on Jinx and the overall average.


3.2.5 All Four

All Four Averages
| Metric | Total | Signature Champ | Non Signature Champs |
|---|---|---|---|
| Games | 250 | 66 | 184 |
| Wins | 149 | 50 | 99 |
| Win rate | 0.5960 | 0.7576 | 0.5380 |

table 6

Now for the based 2 sample hypothesis test. This will tell us if there’s any significant drop in Win rate when these players are not playing their signature champs. Ideally we want the players to be just as good off their signature champs as they are on them. Thus in this instance we want to fail to reject \(H_0\), as that would mean there is no significant difference between the two win rates.

Using the data from table 6, and a confidence level of 95% we get the following values and equations:

\[ \alpha = 0.05\\ H_0:p_1=p_2\\ H_1: p_1 > p_2\\ p_1 = \frac{50}{66} \approx0.7576 \\ p_2 = \frac{99}{184}\approx 0.5380 \\ \hat{p}=\frac{149}{250} \approx 0.5960 \\ Z_0 = \frac{p_1-p_2}{\sqrt{\hat{p} (1- \hat{p})(\frac{1}{66} + \frac{1}{184})}} \approx 3.1181 \]

Thus our resulting P-value is:

\[ 1-P(Z \le Z_0) \approx 0.00091 < 0.05 = \alpha \]

Because the P-value is lower than our \(\alpha\) we reject \(H_0\). Sadly this means that in general, these players perform significantly better when they are on their champs. I’m not significantly surprised though, since maintaining a close to 75% wr overall is giga hard to do.
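The pooled 2 sample test can be reproduced with a short Python sketch. The function name is mine, and it assumes the plain pooled z-test written out in the equations above, with no continuity correction.

```python
import math


def two_sample_win_test(wins1, games1, wins2, games2):
    """Upper-tail pooled z-test of H0: p1 = p2 against H1: p1 > p2.

    Returns (z, p_value); no continuity correction is applied.
    """
    p1 = wins1 / games1
    p2 = wins2 / games2
    pooled = (wins1 + wins2) / (games1 + games2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / games1 + 1.0 / games2))
    z = (p1 - p2) / std_err
    # Upper-tail probability 1 - P(Z <= z), computed with erfc for accuracy.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Signature champs (50 wins in 66 games) vs non signature champs (99 in 184).
z, p_value = two_sample_win_test(50, 66, 99, 184)
print(f"Z = {z:.4f}")  # Z = 3.1181
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Statistical packages that apply a continuity correction by default will report a slightly larger P-value for the same counts, so small differences from a hand calculation are expected.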

Given the results of 3.2.1-3.2.4, it appears as if ‘just ban their champ lmao’ is a good strat with low time investment, since none of these 4 players is significantly better than the average win rate off their champs.



4 Conclusions

So as we have seen these Narrow category champions are very high performers in terms of Win rate with that astonishing 75% wr. Of course it was shown that when these players are off their signature champions, they drop to basically the average Win rate which shows merit in the argument to ban said champs.

But I guess that brings us to the big question of this paper: “Are there differences in performance between champions of different playerbase types?”

To which I can confidently say yes. They have insane average win rates on those champions. It is a very massive difference, and the good kind at that.

4.1 Further Exploration

Below is a list of things I wish I had included in this report, and things one could do to further explore the data in the future. Basically all of them come down to a perfect universe where I had more time, but SGC ended like 2 months ago and school started again, so I wanted to get this out before I died of workload.

  • For the classification of playerbase types maybe I could have used a classification system like LDA/QDA.
  • Examine more metrics than Win rate.
  • Get an interview with Lobozz, Dean, and/or Hunter about how their teams draft around their champs.

4.2 Shoutouts

  • RebelFox for bringing this idea to me in the first place on Twitter.
  • Pookar for managing to even get and compile all this data in the first place.
  • Always Plan Ahea for letting me interview him to get some crucial insight.

5 Bibliography

5.1 Sources notes:

  • Pookar’s SGC stats Google sheet:
    • Has records of SGC matches between the dates of 5/12/20-7/27/20, which was patches 10.9-10.15.
    • It has the records for every player and team over 493 games, which is 4930 raw data entries.
    • Data does not include bans, only picks. This does mean some champs who were near perma ban status at certain points (e.g., Varus, Yuumi) may have much lower presence despite being S+ tier champs for large parts of the season.
    • It has data from Upsurge, Risen, LWL, Focus, and CUP.
    • BIG League, and FACEIT playoffs matches were not included due to lack of match histories to enter.
    • Lane Score stat is just XP and Gold earned at 10.
    • Lane10 is the differential vs your lane opponent.
    • Due to missing data from BIG and FACEIT, certain players and teams may be underrepresented. (E.g., 100 Next only competed in 4/6 AM leagues, and with BIG gone potentially a quarter of their games are just not included.)
  • My personal datasets:
    • All notes for Pookar’s SGC stats Google sheet apply here too.
    • Are modified versions of Pookar’s Champion By Player and DATA tabs in the CSV format so I can load them into RStudio. They also have new variables added that weren’t in the original dataset. Furthermore, most existing variables were renamed to be “code friendly”.
    • all role champs 7-27.csv is just a fusion of all 5 role’s individual spreadsheets.
    • all role champs 10g min 7-27.csv is a filtered version of all role champs 7-27.csv, keeping only observations where the champion has at least 10 games in total and the player has at least 10 games on said champion.
    • top/jung/mid/adc/sup champs 7-27.csv are the spreadsheets for each individual position. All 5 combine into all role champs 7-27.csv.
    • APA/Dean/Hunter/Lobozz avgs table.csv are just compiled averages for the metrics shown in tables 2-5.
    • You are free to download and use my datasets for whatever, but be sure to credit both me (@go4ino) and Pookar (@PookarGG).
    • raw data entry 7-27.csv is modified data from Pookar’s DATA tab.
    • raw data entry 7-27 with double data.csv is raw data entry 7-27.csv but with double the data, for the purposes of having the all roles category and being able to put the 5 positions + the overall data results in the same graph.

6 Appendix

Some random supplementary stuff that isn’t exactly relevant to the report.

6.1 fig 1 With No Restrictions

fig 2: Interactive dot-plot of player champ play rate vs win rate

6.2 KDA Bar Graphs

I had originally planned to also compare KDA alongside Win rate for the players with hypothesis testing. Sadly I have an acute case of being all dummy and no thicc and after doing a ton of data transforming and stuff realized I couldn’t do a proper hypothesis test for KDA since there are always zero death cases and I can’t plug infinity into the formulas. I didn’t want all that work to go to waste though so I’m plopping the graphs and tables here in the appendix. I could have grouped the data by player to get non infinite values but I felt like that wouldn’t be accurate enough for my tastes.
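To make the zero death problem concrete, here is a small Python sketch. It assumes the usual definition KDA = (kills + assists) / deaths; the function names are mine and the three games are made up for illustration. The KDA values in the tables below look consistent with the pooled form, for example (4.8667 + 7.0667) / 2.0667 ≈ 5.774 for A-Sol.

```python
def per_game_kda(kills, deaths, assists):
    """KDA for a single game; None when deaths == 0 (the ratio is undefined)."""
    if deaths == 0:
        return None
    return (kills + assists) / deaths


def pooled_kda(games):
    """KDA over a group of games: total (kills + assists) over total deaths."""
    kills = sum(g[0] for g in games)
    deaths = sum(g[1] for g in games)
    assists = sum(g[2] for g in games)
    if deaths == 0:
        return None
    return (kills + assists) / deaths


# Hypothetical (kills, deaths, assists) rows, not taken from the dataset.
games = [(5, 0, 7), (3, 2, 9), (6, 4, 4)]

print([per_game_kda(*g) for g in games])  # [None, 6.0, 2.5]
print(round(pooled_kda(games), 4))        # 5.6667
```

The deathless game has no per-game KDA at all, which is what blocks a per-game hypothesis test, while pooling gives one finite number per group at the cost of losing the game-to-game spread.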

6.2.1 Always Plan Ahea

Always Plan Ahea Averages
| | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| APA Total | 4.3279 | 2.7049 | 7.2623 | 4.2848 |
| A-Sol | 4.8667 | 2.0667 | 7.0667 | 5.7742 |
| Non A-Sol Champs | 4.1522 | 2.9130 | 7.3261 | 3.9403 |
| Overall Mid Avg | 4.5142 | 3.3966 | 6.6410 | 3.2843 |

table 7

fig 3


6.2.2 Dean

Dean Averages
| | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| Dean Total | 1.2542 | 3.0339 | 12.8136 | 4.6369 |
| Sona | 1.4231 | 2.8077 | 14.2308 | 5.5753 |
| Non Sona Champs | 1.1212 | 3.2121 | 11.6970 | 3.9906 |
| Overall Sup Avg | 1.4351 | 3.9260 | 10.7424 | 3.1018 |

table 8

fig 4


6.2.3 Hunter

Hunter Averages
| | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| Hunter Total | 3.4571 | 3.4714 | 10.8714 | 4.1276 |
| Nunu | 2.8000 | 2.9000 | 14.7000 | 6.0345 |
| Non Nunu Champs | 3.5667 | 3.5667 | 10.2333 | 3.8692 |
| Overall Jung Avg | 3.8377 | 3.8651 | 8.1258 | 3.0953 |

table 9

fig 5


6.2.4 Lobozz

Lobozz Averages
| | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| Lobozz Total | 5.2295 | 3.4590 | 6.6066 | 3.4218 |
| Jinx | 7.0000 | 3.8000 | 7.4667 | 3.8070 |
| Non Jinx Champs | 4.6522 | 3.3478 | 6.3261 | 3.2792 |
| Overall ADC Avg | 5.1572 | 3.3732 | 6.2535 | 3.3827 |

table 10

fig 6


6.2.5 All Four

Overall Averages
| Position | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| Sup | 1.4351 | 3.9260 | 10.7424 | 3.1018 |
| ADC | 5.1572 | 3.3732 | 6.2535 | 3.3827 |
| Mid | 4.5142 | 3.3966 | 6.6410 | 3.2843 |
| Jung | 3.8377 | 3.8651 | 8.1258 | 3.0953 |
| Top | 3.5730 | 4.0071 | 6.1978 | 2.4384 |
| All | 3.7034 | 3.7136 | 7.5921 | 3.0417 |

table 11

fig 7


6.3 KDA Table For All Roles

Basically the overall population average KDAs by role.

Overall Averages
| Position | Kills | Deaths | Assists | KDA |
|---|---|---|---|---|
| Sup | 1.4351 | 3.9260 | 10.7424 | 3.1018 |
| ADC | 5.1572 | 3.3732 | 6.2535 | 3.3827 |
| Mid | 4.5142 | 3.3966 | 6.6410 | 3.2843 |
| Jung | 3.8377 | 3.8651 | 8.1258 | 3.0953 |
| Top | 3.5730 | 4.0071 | 6.1978 | 2.4384 |
| All | 3.7034 | 3.7136 | 7.5921 | 3.0417 |

table 12