About the database

This database includes information about soccer matches played in Europe between 2008 and 2015, it is composed of 7 tables in SQL format. the number of columns and rows it contains is as follows.

Let’s start looking at the data and create some analysis.

1. I want to know the top 20 of players

2. I want to know about the leagues of the dataset

The first analysis I’m going to take the tables of player and player attributes. I’m going to join them and get to know the information I have.

Loading data

Data is stored in SQLite file so first I have to connect to the data base. Then I look at the available tables and choose the ones I’m interested in.

player and player_stats will be joined because they contain key information.

library(dplyr)
library(RSQLite)
library(grid)  
library(gridExtra) 
library(ggExtra)
library(cowplot)
library(tidyverse)

con <- dbConnect(SQLite(), dbname="database.sqlite")

# list all tables
dbListTables(con)
## [1] "Country"           "League"            "Match"            
## [4] "Player"            "Player_Attributes" "Team"             
## [7] "Team_Attributes"   "sqlite_sequence"
player       <- tbl_df(dbGetQuery(con,"SELECT * FROM player"))
player_stats <- tbl_df(dbGetQuery(con,"SELECT * FROM Player_Attributes"))

player_stats <-  player_stats %>%
  rename(player_stats_id = id) %>%
  left_join(player, by = "player_api_id")

Browsing data

I want to see what’s available in my joined table:

str(player_stats)
## tibble [183,978 × 48] (S3: tbl_df/tbl/data.frame)
##  $ player_stats_id     : int [1:183978] 1 2 3 4 5 6 7 8 9 10 ...
##  $ player_fifa_api_id.x: int [1:183978] 218353 218353 218353 218353 218353 189615 189615 189615 189615 189615 ...
##  $ player_api_id       : int [1:183978] 505942 505942 505942 505942 505942 155782 155782 155782 155782 155782 ...
##  $ date                : chr [1:183978] "2016-02-18 00:00:00" "2015-11-19 00:00:00" "2015-09-21 00:00:00" "2015-03-20 00:00:00" ...
##  $ overall_rating      : int [1:183978] 67 67 62 61 61 74 74 73 73 73 ...
##  $ potential           : int [1:183978] 71 71 66 65 65 76 76 75 75 75 ...
##  $ preferred_foot      : chr [1:183978] "right" "right" "right" "right" ...
##  $ attacking_work_rate : chr [1:183978] "medium" "medium" "medium" "medium" ...
##  $ defensive_work_rate : chr [1:183978] "medium" "medium" "medium" "medium" ...
##  $ crossing            : int [1:183978] 49 49 49 48 48 80 80 79 79 79 ...
##  $ finishing           : int [1:183978] 44 44 44 43 43 53 53 52 51 51 ...
##  $ heading_accuracy    : int [1:183978] 71 71 71 70 70 58 58 57 57 57 ...
##  $ short_passing       : int [1:183978] 61 61 61 60 60 71 71 70 70 70 ...
##  $ volleys             : int [1:183978] 44 44 44 43 43 40 32 29 29 29 ...
##  $ dribbling           : int [1:183978] 51 51 51 50 50 73 73 71 71 71 ...
##  $ curve               : int [1:183978] 45 45 45 44 44 70 70 68 68 68 ...
##  $ free_kick_accuracy  : int [1:183978] 39 39 39 38 38 69 69 69 69 69 ...
##  $ long_passing        : int [1:183978] 64 64 64 63 63 68 68 68 68 68 ...
##  $ ball_control        : int [1:183978] 49 49 49 48 48 71 71 70 70 70 ...
##  $ acceleration        : int [1:183978] 60 60 60 60 60 79 79 79 79 79 ...
##  $ sprint_speed        : int [1:183978] 64 64 64 64 64 78 78 78 78 78 ...
##  $ agility             : int [1:183978] 59 59 59 59 59 78 78 78 78 78 ...
##  $ reactions           : int [1:183978] 47 47 47 46 46 67 67 67 67 67 ...
##  $ balance             : int [1:183978] 65 65 65 65 65 90 90 90 90 90 ...
##  $ shot_power          : int [1:183978] 55 55 55 54 54 71 71 71 71 71 ...
##  $ jumping             : int [1:183978] 58 58 58 58 58 85 85 84 84 84 ...
##  $ stamina             : int [1:183978] 54 54 54 54 54 79 79 79 79 79 ...
##  $ strength            : int [1:183978] 76 76 76 76 76 56 56 56 56 56 ...
##  $ long_shots          : int [1:183978] 35 35 35 34 34 62 60 59 58 58 ...
##  $ aggression          : int [1:183978] 71 71 63 62 62 68 68 67 67 67 ...
##  $ interceptions       : int [1:183978] 70 70 41 40 40 67 67 66 66 66 ...
##  $ positioning         : int [1:183978] 45 45 45 44 44 60 60 58 58 58 ...
##  $ vision              : int [1:183978] 54 54 54 53 53 66 66 65 65 65 ...
##  $ penalties           : int [1:183978] 48 48 48 47 47 59 59 59 59 59 ...
##  $ marking             : int [1:183978] 65 65 65 62 62 76 76 76 76 76 ...
##  $ standing_tackle     : int [1:183978] 69 69 66 63 63 75 75 75 75 75 ...
##  $ sliding_tackle      : int [1:183978] 69 69 69 66 66 78 78 78 78 78 ...
##  $ gk_diving           : int [1:183978] 6 6 6 5 5 14 14 14 14 14 ...
##  $ gk_handling         : int [1:183978] 11 11 11 10 10 7 7 7 7 7 ...
##  $ gk_kicking          : int [1:183978] 10 10 10 9 9 9 9 9 9 9 ...
##  $ gk_positioning      : int [1:183978] 8 8 8 7 7 9 9 9 9 9 ...
##  $ gk_reflexes         : int [1:183978] 8 8 8 7 7 12 12 12 12 12 ...
##  $ id                  : int [1:183978] 1 1 1 1 1 2 2 2 2 2 ...
##  $ player_name         : chr [1:183978] "Aaron Appindangoye" "Aaron Appindangoye" "Aaron Appindangoye" "Aaron Appindangoye" ...
##  $ player_fifa_api_id.y: int [1:183978] 218353 218353 218353 218353 218353 189615 189615 189615 189615 189615 ...
##  $ birthday            : chr [1:183978] "1992-02-29 00:00:00" "1992-02-29 00:00:00" "1992-02-29 00:00:00" "1992-02-29 00:00:00" ...
##  $ height              : num [1:183978] 183 183 183 183 183 ...
##  $ weight              : int [1:183978] 187 187 187 187 187 146 146 146 146 146 ...

There are several observations in date_stat so I choose the latest:

latest_ps <- player_stats %>% 
  group_by(player_api_id) %>% 
  top_n(n = 1, wt = date) %>%
  as.data.frame()

I’m only interested in top 20 players so I choose them from the latest observation based on overall_rating:

top20 <- latest_ps %>% 
  arrange(desc(overall_rating)) %>% 
  head(n = 20) %>%
  as.data.frame()

Here are the key fields after filtering:

library(DT)

top20 %>% 
  select(player_name, birthday, height, weight, preferred_foot, overall_rating) %>% 
  datatable(., options = list(pageLength = 10))

Charts

Overall scores

I will start with describing the distribution of overall scores

library(DescTools)

Desc(top20$overall_rating, plotit = TRUE)
## ------------------------------------------------------------------------------ 
## top20$overall_rating (integer)
## 
##   length       n    NAs  unique     0s   mean  meanCI'
##       20      20      0       6      0  88.60   87.67
##           100.0%   0.0%           0.0%          89.53
##                                                      
##      .05     .10    .25  median    .75    .90     .95
##    87.00   87.00  87.00   88.00  89.25  90.30   93.05
##                                                      
##    range      sd  vcoef     mad    IQR   skew    kurt
##     7.00    1.98   0.02    1.48   2.25   1.41    1.14
##                                                      
## 
##    value  freq   perc  cumfreq  cumperc
## 1     87     7  35.0%        7    35.0%
## 2     88     6  30.0%       13    65.0%
## 3     89     2  10.0%       15    75.0%
## 4     90     3  15.0%       18    90.0%
## 5     93     1   5.0%       19    95.0%
## 6     94     1   5.0%       20   100.0%
## 
## ' 95%-CI (classic)

The scores are high as expected from the top football players

Correlation matrix

In order to see correlations between the variables I decided to create an interactive correlation matrix. click on data point to see a scatter plot

library(qtlcharts)

iplotCorr(top20[,10:42], reorder=TRUE)

Scatter plots

I wanted to see what’s the relationship between overall_score and other numeric variables so I made an interactive scatter plot:

library(ggvis)

measures <- names(top20[,10:42])

top20 %>% 
  ggvis(x = input_select(measures, label = "Choose the x-axis:", map = as.name)) %>% 
  layer_points(y = ~overall_rating, fill = ~player_name)

Lets talk about the leagues now

The names of every European Leagues

dbGetQuery( con,"SELECT c.name, l.name
                 FROM League as l  
                 INNER JOIN Country as c
                 ON c.id = l.country_id" )
##           name                     name
## 1      Belgium   Belgium Jupiler League
## 2      England   England Premier League
## 3       France           France Ligue 1
## 4      Germany    Germany 1. Bundesliga
## 5        Italy            Italy Serie A
## 6  Netherlands   Netherlands Eredivisie
## 7       Poland       Poland Ekstraklasa
## 8     Portugal Portugal Liga ZON Sagres
## 9     Scotland  Scotland Premier League
## 10       Spain          Spain LIGA BBVA
## 11 Switzerland Switzerland Super League

What are the names of the teams that make up these eleven Leagues?

teams <- dbGetQuery( con,"SELECT team_api_id AS 'team id',team_long_name AS 'team name',
                 name AS 'country'
                 FROM Match 
                 INNER JOIN Team
                 ON Team.team_api_id = Match.home_team_api_id
                 INNER JOIN Country 
                 ON Country.id = Match.country_id
                 group by home_team_api_id
                ORDER BY country ASC" )

teams
##     team id                    team name     country
## 1      1773          Oud-Heverlee Leuven     Belgium
## 2      4049                       Tubize     Belgium
## 3      6351                    KAS Eupen     Belgium
## 4      7947                FCV Dender EH     Belgium
## 5      8203                  KV Mechelen     Belgium
## 6      8342               Club Brugge KV     Belgium
## 7      8475             Waasland-Beveren     Belgium
## 8      8571                  KV Kortrijk     Belgium
## 9      8573                  KV Oostende     Belgium
## 10     8635               RSC Anderlecht     Belgium
## 11     9984            KSV Cercle Brugge     Belgium
## 12     9985            Standard de Liège     Belgium
## 13     9986           Sporting Charleroi     Belgium
## 14     9987                     KRC Genk     Belgium
## 15     9989                    Lierse SK     Belgium
## 16     9991                     KAA Gent     Belgium
## 17     9993                 Beerschot AC     Belgium
## 18     9994             Sporting Lokeren     Belgium
## 19     9996         Royal Excel Mouscron     Belgium
## 20     9997            Sint-Truidense VV     Belgium
## 21     9998                    RAEC Mons     Belgium
## 22     9999                KSV Roeselare     Belgium
## 23    10000             SV Zulte-Waregem     Belgium
## 24    10001                 KVC Westerlo     Belgium
## 25   274581         Royal Excel Mouscron     Belgium
## 26     8191                      Burnley     England
## 27     8197               Leicester City     England
## 28     8344                 Cardiff City     England
## 29     8455                      Chelsea     England
## 30     8456              Manchester City     England
## 31     8462                   Portsmouth     England
## 32     8466                  Southampton     England
## 33     8472                   Sunderland     England
## 34     8483                    Blackpool     England
## 35     8528               Wigan Athletic     England
## 36     8549                Middlesbrough     England
## 37     8559             Bolton Wanderers     England
## 38     8586            Tottenham Hotspur     England
## 39     8602      Wolverhampton Wanderers     England
## 40     8650                    Liverpool     England
## 41     8654              West Ham United     England
## 42     8655             Blackburn Rovers     England
## 43     8658              Birmingham City     England
## 44     8659         West Bromwich Albion     England
## 45     8667                    Hull City     England
## 46     8668                      Everton     England
## 47     8678                  Bournemouth     England
## 48     9798                      Reading     England
## 49     9817                      Watford     England
## 50     9825                      Arsenal     England
## 51     9826               Crystal Palace     England
## 52     9850                 Norwich City     England
## 53     9879                       Fulham     England
## 54    10003                 Swansea City     England
## 55    10172          Queens Park Rangers     England
## 56    10194                   Stoke City     England
## 57    10252                  Aston Villa     England
## 58    10260            Manchester United     England
## 59    10261             Newcastle United     England
## 60     4087     Évian Thonon Gaillard FC      France
## 61     4170     US Boulogne Cote D'Opale      France
## 62     6391                  GFC Ajaccio      France
## 63     7794                    SC Bastia      France
## 64     7819                      SM Caen      France
## 65     8121                   Angers SCO      France
## 66     8481            AS Nancy-Lorraine      France
## 67     8521            Stade Brestois 29      France
## 68     8550                      FC Metz      France
## 69     8576                   AC Ajaccio      France
## 70     8583                   AJ Auxerre      France
## 71     8588                      RC Lens      France
## 72     8592       Olympique de Marseille      France
## 73     8639                   LOSC Lille      France
## 74     8682                   Le Mans FC      France
## 75     8689                   FC Lorient      France
## 76     9746                  Le Havre AC      France
## 77     9747         En Avant de Guingamp      France
## 78     9748           Olympique Lyonnais      France
## 79     9827        Girondins de Bordeaux      France
## 80     9829                    AS Monaco      France
## 81     9830                    FC Nantes      France
## 82     9831                     OGC Nice      France
## 83     9836                    Dijon FCO      France
## 84     9837               Stade de Reims      France
## 85     9847          Paris Saint-Germain      France
## 86     9851             Stade Rennais FC      France
## 87     9853             AS Saint-Étienne      France
## 88     9855             Grenoble Foot 38      France
## 89     9873              Valenciennes FC      France
## 90     9874       FC Sochaux-Montbéliard      France
## 91     9941                  Toulouse FC      France
## 92    10242                 ES Troyes AC      France
## 93    10249       Montpellier Hérault SC      France
## 94   108893             AC Arles-Avignon      France
## 95     8152                 FC St. Pauli     Germany
## 96     8165               1. FC Nürnberg     Germany
## 97     8177            Hertha BSC Berlin     Germany
## 98     8178          Bayer 04 Leverkusen     Germany
## 99     8194           Fortuna Düsseldorf     Germany
## 100    8226          TSG 1899 Hoffenheim     Germany
## 101    8234             FC Ingolstadt 04     Germany
## 102    8262              SV Darmstadt 98     Germany
## 103    8295                Karlsruher SC     Germany
## 104    8350         1. FC Kaiserslautern     Germany
## 105    8357         SpVgg Greuther Fürth     Germany
## 106    8358                  SC Freiburg     Germany
## 107    8398           FC Energie Cottbus     Germany
## 108    8406                  FC Augsburg     Germany
## 109    8460              SC Paderborn 07     Germany
## 110    8697             SV Werder Bremen     Germany
## 111    8721                VfL Wolfsburg     Germany
## 112    8722                   1. FC Köln     Germany
## 113    9776       Eintracht Braunschweig     Germany
## 114    9788     Borussia Mönchengladbach     Germany
## 115    9789            Borussia Dortmund     Germany
## 116    9790                 Hamburger SV     Germany
## 117    9810          Eintracht Frankfurt     Germany
## 118    9823             FC Bayern Munich     Germany
## 119    9904                  Hannover 96     Germany
## 120    9905              1. FSV Mainz 05     Germany
## 121    9911                   VfL Bochum     Germany
## 122    9912        DSC Arminia Bielefeld     Germany
## 123   10189                FC Schalke 04     Germany
## 124   10269                VfB Stuttgart     Germany
## 125    6269                       Novara       Italy
## 126    7943                     Sassuolo       Italy
## 127    8524                     Atalanta       Italy
## 128    8529                     Cagliari       Italy
## 129    8530                      Catania       Italy
## 130    8533                Chievo Verona       Italy
## 131    8534                       Empoli       Italy
## 132    8535                   Fiorentina       Italy
## 133    8537                      Livorno       Italy
## 134    8540                      Palermo       Italy
## 135    8543                        Lazio       Italy
## 136    8551                        Siena       Italy
## 137    8564                        Milan       Italy
## 138    8600                      Udinese       Italy
## 139    8636                        Inter       Italy
## 140    8686                         Roma       Italy
## 141    8690              Reggio Calabria       Italy
## 142    9804                       Torino       Italy
## 143    9857                      Bologna       Italy
## 144    9858                      Brescia       Italy
## 145    9875                       Napoli       Italy
## 146    9876                Hellas Verona       Italy
## 147    9878                      Pescara       Italy
## 148    9880                       Cesena       Italy
## 149    9882                    Sampdoria       Italy
## 150    9885                     Juventus       Italy
## 151    9888                        Lecce       Italy
## 152    9891                    Frosinone       Italy
## 153    9976                         Bari       Italy
## 154   10167                        Parma       Italy
## 155   10233                        Genoa       Italy
## 156  208931                        Carpi       Italy
## 157    6413                   PEC Zwolle Netherlands
## 158    6433              Go Ahead Eagles Netherlands
## 159    6601                  FC Volendam Netherlands
## 160    6631                 FC Dordrecht Netherlands
## 161    7788                   SC Cambuur Netherlands
## 162    8277                      Vitesse Netherlands
## 163    8464                       N.E.C. Netherlands
## 164    8525                    Willem II Netherlands
## 165    8526                De Graafschap Netherlands
## 166    8593                         Ajax Netherlands
## 167    8611                    FC Twente Netherlands
## 168    8614             Sparta Rotterdam Netherlands
## 169    8640                          PSV Netherlands
## 170    8674                 FC Groningen Netherlands
## 171    9761                    NAC Breda Netherlands
## 172    9791              Heracles Almelo Netherlands
## 173    9803             Roda JC Kerkrade Netherlands
## 174    9839                    VVV-Venlo Netherlands
## 175    9908                   FC Utrecht Netherlands
## 176   10217                 ADO Den Haag Netherlands
## 177   10218                    Excelsior Netherlands
## 178   10219                 RKC Waalwijk Netherlands
## 179   10228                SC Heerenveen Netherlands
## 180   10229                           AZ Netherlands
## 181   10235                    Feyenoord Netherlands
## 182    1601                 Ruch Chorzów      Poland
## 183    1957        Jagiellonia Białystok      Poland
## 184    2182                  Lech Poznań      Poland
## 185    2183                  P. Warszawa      Poland
## 186    2186                     Cracovia      Poland
## 187    8019                Górnik Łęczna      Poland
## 188    8020                Polonia Bytom      Poland
## 189    8021               Zagłębie Lubin      Poland
## 190    8023               Pogoń Szczecin      Poland
## 191    8024                  Widzew Łódź      Poland
## 192    8025                Śląsk Wrocław      Poland
## 193    8027            Zawisza Bydgoszcz      Poland
## 194    8028                Piast Gliwice      Poland
## 195    8030                Lechia Gdańsk      Poland
## 196    8031                Polonia Bytom      Poland
## 197    8033   Podbeskidzie Bielsko-Biała      Poland
## 198    8242               Odra Wodzisław      Poland
## 199    8244                  Widzew Łódź      Poland
## 200    8245                Korona Kielce      Poland
## 201    8322                  Arka Gdynia      Poland
## 202    8569                GKS Bełchatów      Poland
## 203    8673               Legia Warszawa      Poland
## 204   10265                 Wisła Kraków      Poland
## 205  177361 Termalica Bruk-Bet Nieciecza      Poland
## 206    2033               S.C. Olhanense    Portugal
## 207    4064                     Feirense    Portugal
## 208    6367             Uniao da Madeira    Portugal
## 209    6403         FC Paços de Ferreira    Portugal
## 210    6421                   Leixões SC    Portugal
## 211    6547                  FC Penafiel    Portugal
## 212    7841                   Rio Ave FC    Portugal
## 213    7842                Estoril Praia    Portugal
## 214    7844            Vitória Guimarães    Portugal
## 215    7992                     Trofense    Portugal
## 216    8348                Moreirense FC    Portugal
## 217    8613                  Boavista FC    Portugal
## 218    9764               Gil Vicente FC    Portugal
## 219    9765                 Portimonense    Portugal
## 220    9768                  Sporting CP    Portugal
## 221    9771         União de Leiria, SAD    Portugal
## 222    9772                   SL Benfica    Portugal
## 223    9773                     FC Porto    Portugal
## 224    9807             CF Os Belenenses    Portugal
## 225    9809             Naval 1° de Maio    Portugal
## 226   10211                 SC Beira Mar    Portugal
## 227   10212                  CS Marítimo    Portugal
## 228   10213                      Amadora    Portugal
## 229   10214                  CD Nacional    Portugal
## 230   10215         Académica de Coimbra    Portugal
## 231   10238              Vitória Setúbal    Portugal
## 232   10264                     SC Braga    Portugal
## 233  158085                    FC Arouca    Portugal
## 234  188163                      Tondela    Portugal
## 235    8066 Inverness Caledonian Thistle    Scotland
## 236    8284                    Dundee FC    Scotland
## 237    8426         Partick Thistle F.C.    Scotland
## 238    8429       Hamilton Academical FC    Scotland
## 239    8457         Dunfermline Athletic    Scotland
## 240    8467             St. Johnstone FC    Scotland
## 241    8485                     Aberdeen    Scotland
## 242    8548                      Rangers    Scotland
## 243    8596                      Falkirk    Scotland
## 244    8597                   Kilmarnock    Scotland
## 245    8649               Ross County FC    Scotland
## 246    9800                   St. Mirren    Scotland
## 247    9860          Heart of Midlothian    Scotland
## 248    9925                       Celtic    Scotland
## 249    9927                   Motherwell    Scotland
## 250    9938                Dundee United    Scotland
## 251   10251                    Hibernian    Scotland
## 252    7869                   Córdoba CF       Spain
## 253    7878                   Granada CF       Spain
## 254    8302                   Sevilla FC       Spain
## 255    8305                    Getafe CF       Spain
## 256    8306                UD Las Palmas       Spain
## 257    8315      Athletic Club de Bilbao       Spain
## 258    8370               Rayo Vallecano       Spain
## 259    8371                   CA Osasuna       Spain
## 260    8372                     SD Eibar       Spain
## 261    8388                  CD Numancia       Spain
## 262    8394                Real Zaragoza       Spain
## 263    8479                RC Recreativo       Spain
## 264    8558                 RCD Espanyol       Spain
## 265    8560                Real Sociedad       Spain
## 266    8581                   Levante UD       Spain
## 267    8603          Real Betis Balompié       Spain
## 268    8633               Real Madrid CF       Spain
## 269    8634                 FC Barcelona       Spain
## 270    8661                 RCD Mallorca       Spain
## 271    8696             Racing Santander       Spain
## 272    9783    RC Deportivo de La Coruña       Spain
## 273    9864                    Málaga CF       Spain
## 274    9865                   UD Almería       Spain
## 275    9867                  CD Tenerife       Spain
## 276    9868         Xerez Club Deportivo       Spain
## 277    9869       Real Sporting de Gijón       Spain
## 278    9906              Atlético Madrid       Spain
## 279    9910             RC Celta de Vigo       Spain
## 280   10205                Villarreal CF       Spain
## 281   10267                  Valencia CF       Spain
## 282   10268                     Elche CF       Spain
## 283   10278      Hércules Club de Fútbol       Spain
## 284   10281              Real Valladolid       Spain
## 285    6493                AC Bellinzona Switzerland
## 286    7730           FC Lausanne-Sports Switzerland
## 287    7896                       Lugano Switzerland
## 288    7955              Neuchâtel Xamax Switzerland
## 289    9777                  Servette FC Switzerland
## 290    9824                     FC Vaduz Switzerland
## 291    9930                     FC Aarau Switzerland
## 292    9931                     FC Basel Switzerland
## 293    9956      Grasshopper Club Zürich Switzerland
## 294   10179                      FC Sion Switzerland
## 295   10190                FC St. Gallen Switzerland
## 296   10191                      FC Thun Switzerland
## 297   10192               BSC Young Boys Switzerland
## 298   10199                    FC Luzern Switzerland
## 299   10243                    FC Zürich Switzerland

There are 299 teams in al the 11 leagues, How many correspond to each country?

options(repr.plot.width = 6, repr.plot.height = 6)

teams %>% 
group_by(country) %>% 
summarise(teams_qty = n())  %>% 
arrange(desc(teams_qty)) %>% 

#Let's make a graphic 
      ggplot(aes(y = fct_reorder(country,teams_qty), x = teams_qty, fill = teams_qty )) +
       geom_bar(stat="identity",position=position_dodge(), alpha = 0.7) + theme_minimal() + theme(legend.position="none") + 
         geom_text(aes(label= teams_qty), hjust= -0.2) + labs(title="Number of teams by country", x = "Teams", y = "Country") +
           scale_fill_gradient(low="#8cd2ed",high="#003c57")

It would be logical to think that the countries with the most teams have the most games, let’s see if this is true.

Number of matches played in each country between 2008 and 2015

options(repr.plot.width = 8, repr.plot.height = 6)

dbGetQuery( con,"SELECT name AS 'country', 
      count(Match.home_team_api_id) AS games
                 FROM Match 
                 INNER JOIN Team
                 ON Team.team_api_id = Match.home_team_api_id
                 INNER JOIN Country 
                 ON Country.id = Match.country_id
                 group by country
                 ORDER BY games DESC")  %>% 

# Another graphic

  ggplot(aes(y = fct_reorder(country,games) , x = games , fill = games )) +
  geom_bar(stat="identity",position=position_dodge(), alpha = 0.6) + theme_minimal() + 
  theme(legend.position="none") + 
  geom_text(aes(label= games), hjust= -0.2) +
  labs(title="Number of games",x = "Games",y = "Country") +
  scale_fill_gradient(low="#d37578",high="#5b0000")

That’s right, the countries with the largest number of teams are the ones that play the most games.

Lets see a country more deailed (Spain)

options(repr.plot.width = 8, repr.plot.height = 10)

dbGetQuery( con,"SELECT team_long_name AS 'team_name',
      count(Match.home_team_api_id) AS games
                 FROM Match 
                 INNER JOIN Team
                 ON Team.team_api_id = Match.home_team_api_id
                 INNER JOIN Country 
                 ON Country.id = Match.country_id
                 WHERE name = 'Spain' 
                 group by team_name
                 ORDER BY games DESC")  %>% 
  # ahora hacemos una grafica con ggplot2
  ggplot(aes(y = fct_reorder(team_name,games) , x = games , fill = games )) +
  geom_bar(stat="identity",position=position_dodge(), alpha = 0.8) + theme_minimal() + 
  scale_fill_gradient(low="#4f908c",high="#6e0ff9") +  theme(legend.position="none")

The teams with more games played are the most popular ones, Will these teams be the best also? one way to know this is by the number of goals. it is a good indicator of a team’s performance.

Teams average goals.

dbGetQuery( con,"SELECT team_api_id AS 'team id',team_long_name AS 'team_name',
                 name AS 'country',season,
                AVG(home_team_goal + away_team_goal) AS avg_goals_game,  
               count(Match.home_team_api_id)  AS games
                 FROM Match 
                 INNER JOIN Team
                 ON Team.team_api_id = Match.home_team_api_id
                 INNER JOIN Country 
                 ON Country.id = Match.country_id
                 group by home_team_api_id
                ORDER BY avg_goals_game DESC 
                LIMIT 10")
##    team id            team_name     country    season avg_goals_game games
## 1     8633       Real Madrid CF       Spain 2008/2009       4.289474   152
## 2     8634         FC Barcelona       Spain 2008/2009       3.921053   152
## 3     8640                  PSV Netherlands 2008/2009       3.647059   136
## 4     8483            Blackpool     England 2010/2011       3.526316    19
## 5     9823     FC Bayern Munich     Germany 2008/2009       3.514706   136
## 6     8457 Dunfermline Athletic    Scotland 2011/2012       3.473684    19
## 7    10228        SC Heerenveen Netherlands 2008/2009       3.448529   136
## 8     8573          KV Oostende     Belgium 2013/2014       3.393939    33
## 9    10192       BSC Young Boys Switzerland 2008/2009       3.391608   143
## 10    8721        VfL Wolfsburg     Germany 2008/2009       3.352941   136