Exploring FIFA player data from a CSV file with the readr library
Identify a web URL that hosts a CSV file with several variables, import the file with a read function, and explore its data.
Data exploration is the first step of data analysis: the data are examined and visualized to gain early insight or to identify areas and patterns worth investigating further.
The exercise covers several elements:
These libraries must be installed beforehand with install.packages()
library(readr)
library(fdth)
library(dplyr)
library(plotly)
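The installation step mentioned above can be made repeatable by installing only what is missing. The following is a minimal sketch; the helper name `missing_packages` is hypothetical, and the `install.packages()` call is left commented out so that running the block does not change the system.

```r
# Return the subset of `pkgs` that cannot be loaded on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("readr", "fdth", "dplyr", "plotly"))
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```

Checking with `requireNamespace()` avoids reinstalling packages that are already available.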
The data are available at the URL: https://raw.githubusercontent.com/rpizarrog/Analisis-Inteligente-de-datos/main/datos/datos.FIFA.limpios.csv
Attributes that are of type character in the original file are imported as factors (categorical variables) because of stringsAsFactors = TRUE. Note that read.csv() is the base R reader, not a readr function.
datos <- read.csv("https://raw.githubusercontent.com/rpizarrog/Analisis-Inteligente-de-datos/main/datos/datos.FIFA.limpios.csv", stringsAsFactors = TRUE)
str(datos)
## 'data.frame': 17955 obs. of 50 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Name : Factor w/ 16956 levels "A. Ábalos","A. Abang",..: 9532 3134 12377 4098 8534 4386 9540 9746 15251 7705 ...
## $ Age : int 31 33 26 27 27 27 32 31 32 25 ...
## $ Nationality : Factor w/ 163 levels "Afghanistan",..: 7 123 21 140 14 14 36 158 140 137 ...
## $ Overall : int 94 94 92 91 91 91 91 91 91 90 ...
## $ Potential : int 94 94 93 93 92 91 91 91 91 93 ...
## $ Club : Factor w/ 651 levels " SSV Jahn Regensburg",..: 214 329 436 376 375 137 473 214 473 61 ...
## $ Preferred.Foot : Factor w/ 3 levels "","Left","Right": 2 3 3 3 3 3 3 3 3 3 ...
## $ International.Reputation: int 5 5 5 4 4 4 4 5 4 3 ...
## $ Weak.Foot : int 4 4 5 3 5 4 4 4 3 3 ...
## $ Skill.Moves : int 4 5 5 1 4 4 4 3 3 1 ...
## $ Height : Factor w/ 22 levels "","5'1","5'10",..: 10 15 12 17 4 11 11 13 13 15 ...
## $ Weight : Factor w/ 58 levels "","110lbs","115lbs",..: 23 34 19 27 21 25 17 37 33 38 ...
## $ Crossing : int 84 84 79 17 93 81 86 77 66 13 ...
## $ Finishing : int 95 94 87 13 82 84 72 93 60 11 ...
## $ HeadingAccuracy : int 70 89 62 21 55 61 55 77 91 15 ...
## $ ShortPassing : int 90 81 84 50 92 89 93 82 78 29 ...
## $ Volleys : int 86 87 84 13 82 80 76 88 66 13 ...
## $ Dribbling : int 97 88 96 18 86 95 90 87 63 12 ...
## $ Curve : int 93 81 88 21 85 83 85 86 74 13 ...
## $ FKAccuracy : int 94 76 87 19 83 79 78 84 72 14 ...
## $ LongPassing : int 87 77 78 51 91 83 88 64 77 26 ...
## $ BallControl : int 96 94 95 42 91 94 93 90 84 16 ...
## $ Acceleration : int 91 89 94 57 78 94 80 86 76 43 ...
## $ SprintSpeed : int 86 91 90 58 76 88 72 75 75 60 ...
## $ Agility : int 91 87 96 60 79 95 93 82 78 67 ...
## $ Reactions : int 95 96 94 90 91 90 90 92 85 86 ...
## $ Balance : int 95 70 84 43 77 94 94 83 66 49 ...
## $ ShotPower : int 85 95 80 31 91 82 79 86 79 22 ...
## $ Jumping : int 68 95 61 67 63 56 68 69 93 76 ...
## $ Stamina : int 72 88 81 43 90 83 89 90 84 41 ...
## $ Strength : int 59 79 49 64 75 66 58 83 83 78 ...
## $ LongShots : int 94 93 82 12 91 80 82 85 59 12 ...
## $ Aggression : int 48 63 56 38 76 54 62 87 88 34 ...
## $ Interceptions : int 22 29 36 30 61 41 83 41 90 19 ...
## $ Positioning : int 94 95 89 12 87 87 79 92 60 11 ...
## $ Vision : int 94 82 87 68 94 89 92 84 63 70 ...
## $ Penalties : int 75 85 81 40 79 86 82 85 75 11 ...
## $ Composure : int 96 95 94 68 88 91 84 85 82 70 ...
## $ Marking : int 33 28 27 15 68 34 60 62 87 27 ...
## $ StandingTackle : int 28 31 24 21 58 27 76 45 92 12 ...
## $ SlidingTackle : int 26 23 33 13 51 22 73 38 91 18 ...
## $ GKDiving : int 6 7 9 90 15 11 13 27 11 86 ...
## $ GKHandling : int 11 11 9 85 13 12 9 25 8 92 ...
## $ GKKicking : int 15 15 15 87 5 6 7 31 9 78 ...
## $ GKPositioning : int 14 14 15 88 10 8 14 33 7 88 ...
## $ GKReflexes : int 8 11 11 94 13 8 9 37 11 89 ...
## $ Valor : int 110500000 77000000 118500000 72000000 102000000 93000000 67000000 80000000 51000000 68000000 ...
## $ Estatura : num 1.7 1.88 1.75 1.93 1.8 1.73 1.73 1.83 1.83 1.88 ...
## $ PesoKgs : num 72.1 83 68 76.2 69.8 ...
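The Factor columns in the str() output above are a consequence of stringsAsFactors = TRUE. As a small illustration, the sketch below reads the same inline CSV text twice; the three-row sample and the variable names `csv_text`, `as_factor` and `as_char` are made up for the example.

```r
# A tiny inline CSV standing in for the real file (illustrative values only).
csv_text <- "Name,Age,Preferred.Foot\nA,31,Left\nB,33,Right\nC,26,Right"

# Same data, read with and without conversion of text columns to factors.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$Preferred.Foot)   # factor
class(as_char$Preferred.Foot)     # character
levels(as_factor$Preferred.Foot)  # the distinct categories, sorted
```

Factors are convenient for summary() and frequency tables, which is why the import above requests them.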
summary(datos)
## X Name Age Nationality
## Min. : 1 J. Rodríguez: 11 Min. :16.0 England : 1660
## 1st Qu.: 4490 Paulinho : 8 1st Qu.:21.0 Germany : 1198
## Median : 8978 J. Williams : 7 Median :25.0 Spain : 1072
## Mean : 8978 Felipe : 6 Mean :25.1 Argentina: 936
## 3rd Qu.:13466 J. Gómez : 6 3rd Qu.:28.0 France : 913
## Max. :17955 J. Hernández: 6 Max. :45.0 Brazil : 826
## (Other) :17911 (Other) :11350
## Overall Potential Club Preferred.Foot
## Min. :46.00 Min. :48.00 Arsenal : 33 : 48
## 1st Qu.:62.00 1st Qu.:67.00 AS Monaco : 33 Left : 4159
## Median :66.00 Median :71.00 Atlético Madrid : 33 Right:13748
## Mean :66.23 Mean :71.32 Borussia Dortmund: 33
## 3rd Qu.:71.00 3rd Qu.:75.00 Burnley : 33
## Max. :94.00 Max. :95.00 Cardiff City : 33
## (Other) :17757
## International.Reputation Weak.Foot Skill.Moves Height
## Min. :1.000 Min. :1.000 Min. :1.000 6'0 :2836
## 1st Qu.:1.000 1st Qu.:3.000 1st Qu.:2.000 5'10 :2449
## Median :1.000 Median :3.000 Median :2.000 5'9 :2203
## Mean :1.114 Mean :2.947 Mean :2.363 5'11 :2131
## 3rd Qu.:1.000 3rd Qu.:3.000 3rd Qu.:3.000 6'2 :1988
## Max. :5.000 Max. :5.000 Max. :5.000 6'1 :1885
## NA's :48 NA's :48 NA's :48 (Other):4463
## Weight Crossing Finishing HeadingAccuracy
## 165lbs : 1458 Min. : 5.00 Min. : 2.00 Min. : 4.0
## 154lbs : 1417 1st Qu.:38.00 1st Qu.:30.00 1st Qu.:44.0
## 176lbs : 1031 Median :54.00 Median :49.00 Median :56.0
## 172lbs : 972 Mean :49.75 Mean :45.59 Mean :52.3
## 159lbs : 936 3rd Qu.:64.00 3rd Qu.:62.00 3rd Qu.:64.0
## 161lbs : 929 Max. :93.00 Max. :95.00 Max. :94.0
## (Other):11212 NA's :48 NA's :48 NA's :48
## ShortPassing Volleys Dribbling Curve
## Min. : 7.00 Min. : 4.00 Min. : 4.00 Min. : 6.00
## 1st Qu.:54.00 1st Qu.:30.00 1st Qu.:49.00 1st Qu.:34.00
## Median :62.00 Median :44.00 Median :61.00 Median :49.00
## Mean :58.72 Mean :42.94 Mean :55.42 Mean :47.22
## 3rd Qu.:68.00 3rd Qu.:57.00 3rd Qu.:68.00 3rd Qu.:62.00
## Max. :93.00 Max. :90.00 Max. :97.00 Max. :94.00
## NA's :48 NA's :48 NA's :48 NA's :48
## FKAccuracy LongPassing BallControl Acceleration
## Min. : 3.00 Min. : 9.00 Min. : 5.00 Min. :12.00
## 1st Qu.:31.00 1st Qu.:43.00 1st Qu.:54.00 1st Qu.:57.00
## Median :41.00 Median :56.00 Median :63.00 Median :67.00
## Mean :42.88 Mean :52.73 Mean :58.42 Mean :64.62
## 3rd Qu.:57.00 3rd Qu.:64.00 3rd Qu.:69.00 3rd Qu.:75.00
## Max. :94.00 Max. :93.00 Max. :96.00 Max. :97.00
## NA's :48 NA's :48 NA's :48 NA's :48
## SprintSpeed Agility Reactions Balance
## Min. :12.00 Min. :14.00 Min. :21.00 Min. :16.00
## 1st Qu.:57.00 1st Qu.:55.00 1st Qu.:56.00 1st Qu.:56.00
## Median :67.00 Median :66.00 Median :62.00 Median :66.00
## Mean :64.74 Mean :63.54 Mean :61.82 Mean :63.97
## 3rd Qu.:75.00 3rd Qu.:74.00 3rd Qu.:68.00 3rd Qu.:74.00
## Max. :96.00 Max. :96.00 Max. :96.00 Max. :96.00
## NA's :48 NA's :48 NA's :48 NA's :48
## ShotPower Jumping Stamina Strength
## Min. : 2.00 Min. :15.00 Min. :12.00 Min. :17.00
## 1st Qu.:45.00 1st Qu.:58.00 1st Qu.:56.00 1st Qu.:58.00
## Median :59.00 Median :66.00 Median :66.00 Median :67.00
## Mean :55.49 Mean :65.12 Mean :63.22 Mean :65.33
## 3rd Qu.:68.00 3rd Qu.:73.00 3rd Qu.:74.00 3rd Qu.:74.00
## Max. :95.00 Max. :95.00 Max. :96.00 Max. :97.00
## NA's :48 NA's :48 NA's :48 NA's :48
## LongShots Aggression Interceptions Positioning Vision
## Min. : 3.00 Min. :11.00 Min. : 3.00 Min. : 2 Min. :10.00
## 1st Qu.:33.00 1st Qu.:44.00 1st Qu.:26.00 1st Qu.:39 1st Qu.:44.00
## Median :51.00 Median :59.00 Median :52.00 Median :55 Median :55.00
## Mean :47.13 Mean :55.88 Mean :46.69 Mean :50 Mean :53.45
## 3rd Qu.:62.00 3rd Qu.:69.00 3rd Qu.:64.00 3rd Qu.:64 3rd Qu.:64.00
## Max. :94.00 Max. :95.00 Max. :92.00 Max. :95 Max. :94.00
## NA's :48 NA's :48 NA's :48 NA's :48 NA's :48
## Penalties Composure Marking StandingTackle
## Min. : 5.00 Min. : 3.00 Min. : 3.00 Min. : 2.00
## 1st Qu.:39.00 1st Qu.:51.00 1st Qu.:30.00 1st Qu.:27.00
## Median :49.00 Median :60.00 Median :53.00 Median :55.00
## Mean :48.55 Mean :58.65 Mean :47.26 Mean :47.68
## 3rd Qu.:60.00 3rd Qu.:67.00 3rd Qu.:64.00 3rd Qu.:66.00
## Max. :92.00 Max. :96.00 Max. :94.00 Max. :93.00
## NA's :48 NA's :48 NA's :48 NA's :48
## SlidingTackle GKDiving GKHandling GKKicking GKPositioning
## Min. : 3.00 Min. : 1.00 Min. : 1.00 Min. : 1.0 Min. : 1.00
## 1st Qu.:24.00 1st Qu.: 8.00 1st Qu.: 8.00 1st Qu.: 8.0 1st Qu.: 8.00
## Median :52.00 Median :11.00 Median :11.00 Median :11.0 Median :11.00
## Mean :45.64 Mean :16.59 Mean :16.37 Mean :16.2 Mean :16.36
## 3rd Qu.:64.00 3rd Qu.:14.00 3rd Qu.:14.00 3rd Qu.:14.0 3rd Qu.:14.00
## Max. :91.00 Max. :90.00 Max. :92.00 Max. :91.0 Max. :90.00
## NA's :48 NA's :48 NA's :48 NA's :48 NA's :48
## GKReflexes Valor Estatura PesoKgs
## Min. : 1.00 Min. : 10000 Min. :1.550 Min. : 49.90
## 1st Qu.: 8.00 1st Qu.: 325000 1st Qu.:1.750 1st Qu.: 69.85
## Median :11.00 Median : 700000 Median :1.800 Median : 74.84
## Mean :16.68 Mean : 2444530 Mean :1.812 Mean : 75.28
## 3rd Qu.:14.00 3rd Qu.: 2100000 3rd Qu.:1.850 3rd Qu.: 79.83
## Max. :94.00 Max. :118500000 Max. :2.060 Max. :110.22
## NA's :48 NA's :48 NA's :48
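The summary above reports 48 NA values in most numeric columns and 48 empty entries in Preferred.Foot. A quick way to count both kinds of gaps per column is sketched below on a made-up four-row data frame (`toy` is a hypothetical stand-in; with the real data, replace it by `datos`).

```r
# Toy data with one empty category and one missing number.
toy <- data.frame(
  Preferred.Foot = factor(c("Left", "Right", "", "Right")),
  Crossing = c(84, NA, 79, 17)
)

# Number of NA values in each column.
na_per_column <- colSums(is.na(toy))

# Number of empty strings in each column (relevant for factor/text columns).
empty_per_column <- vapply(
  toy,
  function(col) sum(as.character(col) == "", na.rm = TRUE),
  integer(1)
)

na_per_column
empty_per_column
```

Empty strings are not NA, so they have to be counted separately, as done here.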
head(datos)
## X Name Age Nationality Overall Potential Club
## 1 1 L. Messi 31 Argentina 94 94 FC Barcelona
## 2 2 Cristiano Ronaldo 33 Portugal 94 94 Juventus
## 3 3 Neymar Jr 26 Brazil 92 93 Paris Saint-Germain
## 4 4 De Gea 27 Spain 91 93 Manchester United
## 5 5 K. De Bruyne 27 Belgium 91 92 Manchester City
## 6 6 E. Hazard 27 Belgium 91 91 Chelsea
## Preferred.Foot International.Reputation Weak.Foot Skill.Moves Height Weight
## 1 Left 5 4 4 5'7 159lbs
## 2 Right 5 4 5 6'2 183lbs
## 3 Right 5 5 5 5'9 150lbs
## 4 Right 4 3 1 6'4 168lbs
## 5 Right 4 5 4 5'11 154lbs
## 6 Right 4 4 4 5'8 163lbs
## Crossing Finishing HeadingAccuracy ShortPassing Volleys Dribbling Curve
## 1 84 95 70 90 86 97 93
## 2 84 94 89 81 87 88 81
## 3 79 87 62 84 84 96 88
## 4 17 13 21 50 13 18 21
## 5 93 82 55 92 82 86 85
## 6 81 84 61 89 80 95 83
## FKAccuracy LongPassing BallControl Acceleration SprintSpeed Agility Reactions
## 1 94 87 96 91 86 91 95
## 2 76 77 94 89 91 87 96
## 3 87 78 95 94 90 96 94
## 4 19 51 42 57 58 60 90
## 5 83 91 91 78 76 79 91
## 6 79 83 94 94 88 95 90
## Balance ShotPower Jumping Stamina Strength LongShots Aggression Interceptions
## 1 95 85 68 72 59 94 48 22
## 2 70 95 95 88 79 93 63 29
## 3 84 80 61 81 49 82 56 36
## 4 43 31 67 43 64 12 38 30
## 5 77 91 63 90 75 91 76 61
## 6 94 82 56 83 66 80 54 41
## Positioning Vision Penalties Composure Marking StandingTackle SlidingTackle
## 1 94 94 75 96 33 28 26
## 2 95 82 85 95 28 31 23
## 3 89 87 81 94 27 24 33
## 4 12 68 40 68 15 21 13
## 5 87 94 79 88 68 58 51
## 6 87 89 86 91 34 27 22
## GKDiving GKHandling GKKicking GKPositioning GKReflexes Valor Estatura
## 1 6 11 15 14 8 110500000 1.70
## 2 7 11 15 14 11 77000000 1.88
## 3 9 9 15 15 11 118500000 1.75
## 4 90 85 87 88 94 72000000 1.93
## 5 15 13 5 10 13 102000000 1.80
## 6 11 12 6 8 8 93000000 1.73
## PesoKgs
## 1 72.12
## 2 83.01
## 3 68.04
## 4 76.20
## 5 69.85
## 6 73.94
tail(datos)
## X Name Age Nationality Overall Potential
## 17950 17950 D. Walsh 18 Republic of Ireland 47 68
## 17951 17951 J. Lundstram 19 England 47 65
## 17952 17952 N. Christoffersson 19 Sweden 47 63
## 17953 17953 B. Worman 16 England 47 67
## 17954 17954 D. Walker-Rice 17 England 47 66
## 17955 17955 G. Nugent 16 England 46 66
## Club Preferred.Foot International.Reputation Weak.Foot
## 17950 Waterford FC Left 1 3
## 17951 Crewe Alexandra Right 1 2
## 17952 Trelleborgs FF Right 1 2
## 17953 Cambridge United Right 1 3
## 17954 Tranmere Rovers Right 1 3
## 17955 Tranmere Rovers Right 1 3
## Skill.Moves Height Weight Crossing Finishing HeadingAccuracy ShortPassing
## 17950 2 6'1 168lbs 22 23 45 25
## 17951 2 5'9 134lbs 34 38 40 49
## 17952 2 6'3 170lbs 23 52 52 43
## 17953 2 5'8 148lbs 25 40 46 38
## 17954 2 5'10 154lbs 44 50 39 42
## 17955 2 5'10 176lbs 41 34 46 48
## Volleys Dribbling Curve FKAccuracy LongPassing BallControl Acceleration
## 17950 27 21 21 27 27 32 52
## 17951 25 42 30 34 45 43 54
## 17952 36 39 32 20 25 40 41
## 17953 38 45 38 27 28 44 70
## 17954 40 51 34 32 32 52 61
## 17955 30 43 40 34 44 51 57
## SprintSpeed Agility Reactions Balance ShotPower Jumping Stamina Strength
## 17950 52 39 43 48 39 74 39 52
## 17951 57 60 49 76 43 55 40 47
## 17952 39 38 40 52 41 47 43 67
## 17953 69 50 47 58 45 60 55 32
## 17954 60 52 21 71 64 42 40 48
## 17955 55 55 51 63 43 62 47 60
## LongShots Aggression Interceptions Positioning Vision Penalties Composure
## 17950 16 44 45 20 31 38 43
## 17951 38 46 46 39 52 43 45
## 17952 42 47 16 46 33 43 42
## 17953 45 32 15 48 43 55 41
## 17954 34 33 22 44 47 50 46
## 17955 32 56 42 34 49 33 43
## Marking StandingTackle SlidingTackle GKDiving GKHandling GKKicking
## 17950 44 47 53 9 10 9
## 17951 40 48 47 10 13 7
## 17952 22 15 19 10 9 9
## 17953 32 13 11 6 5 10
## 17954 20 25 27 14 6 14
## 17955 40 43 50 10 15 9
## GKPositioning GKReflexes Valor Estatura PesoKgs
## 17950 11 13 60000 1.85 76.20
## 17951 8 9 60000 1.75 60.78
## 17952 5 12 60000 1.91 77.11
## 17953 6 13 60000 1.73 67.13
## 17954 8 9 60000 1.78 69.85
## 17955 12 9 60000 1.78 79.83
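In the rows printed above, Estatura and PesoKgs look like metric versions of Height and Weight (for example 5'7 and 159lbs next to 1.70 and 72.12). How the author derived those columns is not shown; the following is only a sketch of one conversion that reproduces the printed values. The function names `height_to_m` and `weight_to_kg` are hypothetical.

```r
# Convert heights written as feet'inches (e.g. "5'7") to meters.
height_to_m <- function(h) {
  parts <- strsplit(as.character(h), "'", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2) return(NA_real_)  # empty or malformed entries
    (as.numeric(p[1]) * 12 + as.numeric(p[2])) * 0.0254
  }, numeric(1))
}

# Convert weights written as "<number>lbs" to kilograms.
weight_to_kg <- function(w) {
  as.numeric(sub("lbs", "", as.character(w), fixed = TRUE)) * 0.45359237
}

round(height_to_m(c("5'7", "6'2")), 2)         # 1.70 1.88
round(weight_to_kg(c("159lbs", "183lbs")), 2)  # 72.12 83.01
```

Empty Height entries come back as NA, in line with the 48 NA values that summary() reports for Estatura.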
Frequency table built with the fdt_cat() function from the fdth library
tabla.frecuencia <- fdt_cat(datos$Nationality)
head(tabla.frecuencia, 10)
## Category f rf rf(%) cf cf(%)
## England 1660 0.09 9.25 1660 9.25
## Germany 1198 0.07 6.67 2858 15.92
## Spain 1072 0.06 5.97 3930 21.89
## Argentina 936 0.05 5.21 4866 27.10
## France 913 0.05 5.08 5779 32.19
## Brazil 826 0.05 4.60 6605 36.79
## Italy 702 0.04 3.91 7307 40.70
## Colombia 617 0.03 3.44 7924 44.13
## Japan 475 0.03 2.65 8399 46.78
## Netherlands 453 0.03 2.52 8852 49.30
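To make the columns of the table above concrete (f = absolute frequency, rf = relative frequency, cf = cumulative frequency), the sketch below rebuilds them with base R on a made-up vector. It approximates what fdt_cat() reports and is not the fdth implementation; the names `nat` and `freq` are hypothetical.

```r
# Illustrative category vector (not the real Nationality column).
nat <- c("England", "Germany", "England", "Spain", "England", "Germany")

# Absolute frequencies, most frequent category first.
f <- sort(table(nat), decreasing = TRUE)
rf <- as.numeric(f) / sum(f)

freq <- data.frame(
  Category = names(f),
  f        = as.integer(f),
  rf       = rf,
  rf_pct   = 100 * rf,
  cf       = cumsum(as.integer(f)),
  cf_pct   = 100 * cumsum(rf)
)
freq
```

The same arithmetic applies to the real table: England's 1660 players out of 17955 give the 9.25 % shown in the rf(%) column.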
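Bar chart: the frequency table is trimmed to its first 10 rows with head(), and functions from the plotly library build an interactive bar chart.
The chart is stored in a variable named g, which is then evaluated to display it.
The %>% operator in the following code is the pipe (available once dplyr or plotly is loaded): it passes the result of the expression on its left as the first argument of the function on its right. It is not a line-continuation symbol.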
# Interactive bar chart of the 10 most frequent nationalities
g <- plot_ly(head(tabla.frecuencia, 10)) %>%
  add_trace(x = ~Category,
            y = ~f,
            type = 'bar',
            name = 'Frequency',
            marker = list(color = '#C9EFF9'),
            # Hover text shows the relative frequency as a percentage
            hoverinfo = "text", text = ~paste(round(rf * 100, 2), "%")) %>%
  layout(title = 'Frequency of FIFA players by nationality',
         xaxis = list(title = "Nationalities"))
g  # display the chart
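As a small illustration of how the %>% operator used in the chart code works, the sketch below writes the same computation as nested calls and as a pipeline. The numbers and the names `top_nested` and `top_piped` are made up for the example; dplyr is loaded only to make %>% available.

```r
library(dplyr)  # provides the %>% pipe

values <- c(3, 1, 2)

# Nested form: read from the inside out.
top_nested <- head(sort(values, decreasing = TRUE), 2)

# Piped form: each result becomes the first argument of the next call.
top_piped <- values %>%
  sort(decreasing = TRUE) %>%
  head(2)

identical(top_nested, top_piped)  # TRUE
```

The piped form reads top to bottom in the order the steps run, which is why it is popular for multi-step chart definitions.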
tabla.frecuencia <- fdt_cat(datos$Preferred.Foot)
tabla.frecuencia
## Category f rf rf(%) cf cf(%)
## Right 13748 0.77 76.57 13748 76.57
## Left 4159 0.23 23.16 17907 99.73
## 48 0.00 0.27 17955 100.00
As before, the chart is stored in a variable named g, which is then evaluated to display it.
# Interactive bar chart of players by preferred foot
g <- plot_ly(tabla.frecuencia) %>%
  add_trace(x = ~Category,
            y = ~f,
            type = 'bar',
            name = 'Frequency',
            marker = list(color = '#C9EFF9'),
            # Hover text shows the relative frequency as a percentage
            hoverinfo = "text", text = ~paste(round(rf * 100, 2), "%")) %>%
  layout(title = 'Frequency of FIFA players by preferred foot',
         xaxis = list(title = "Preferred foot"))
g  # display the chart
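The preferred-foot table includes an unnamed category with 48 players, which also shows up as an unlabeled bar. One option is to drop it with dplyr before plotting. The sketch below assumes a hand-built data frame, `foot_counts`, that copies the counts printed above; it is a stand-in for the fdt_cat object, and `foot_clean` is a hypothetical name.

```r
library(dplyr)

# Counts copied from the frequency table above.
foot_counts <- data.frame(
  Category = c("Right", "Left", ""),
  f = c(13748, 4159, 48)
)

# Drop the empty category and recompute relative frequencies
# over the remaining players.
foot_clean <- foot_counts %>%
  filter(Category != "") %>%
  mutate(rf = f / sum(f))

foot_clean
```

Whether to drop or keep the empty category depends on the question being asked; dropping it changes the denominator of the percentages from 17955 to 17907.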
Answer the following questions about the dataset:
The dataset has 17955 observations and 50 variables.
There are more right-footed players (13748) than left-footed players (4159).
What is the mean age of the players? The arithmetic mean of Age is 25.1 years.
What is the mean height of the players? Variable: Estatura
What is the mean weight in kg of the players? Variable: PesoKgs
What is the mean economic value of the players? Variable: Valor
What is the mean rating of the players? Variable: International.Reputation
Which player has the highest aggression (or violence) value? Variable: Aggression. The most aggressive player is B. Pearson, 23, from England.
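Questions of this kind can be answered with mean() and which.max(). The sketch below assumes a small stand-in data frame, `sample_players`, built from the six rows printed by head(datos); all names introduced here are hypothetical. With the full data, use `datos` instead, and keep na.rm = TRUE because several columns contain NA values.

```r
# Stand-in data: the six players shown by head(datos).
sample_players <- data.frame(
  Name = c("L. Messi", "Cristiano Ronaldo", "Neymar Jr",
           "De Gea", "K. De Bruyne", "E. Hazard"),
  Age = c(31, 33, 26, 27, 27, 27),
  Aggression = c(48, 63, 56, 38, 76, 54)
)

# Mean of a numeric variable, ignoring missing values.
mean_age <- mean(sample_players$Age, na.rm = TRUE)

# Row of the player with the highest Aggression (first one if tied).
most_aggressive <- sample_players[which.max(sample_players$Aggression), ]

# All players sharing the maximum, in case of ties.
max_aggr <- max(sample_players$Aggression, na.rm = TRUE)
tied <- sample_players[which(sample_players$Aggression == max_aggr), ]

mean_age
most_aggressive
```

which.max() returns only the first maximum, so the `tied` data frame is worth checking before naming a single player.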