library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.3.0
## ✔ tibble 2.0.1 ✔ dplyr 0.8.0.1
## ✔ tidyr 0.8.2 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## Warning: package 'tibble' was built under R version 3.5.2
## Warning: package 'purrr' was built under R version 3.5.2
## Warning: package 'dplyr' was built under R version 3.5.2
## Warning: package 'forcats' was built under R version 3.5.2
## ── Conflicts ────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
The forcats package provides helpers for reordering factor levels (including moving specified levels to the front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into "Other", anonymising, and manually recoding).
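As a quick illustration of the helpers listed above, here is a minimal sketch on a made-up factor. The `positions` vector is hypothetical and is not taken from the FIFA data; only the forcats functions themselves are real.

```r
library(forcats)

# Hypothetical toy factor (not from the FIFA dataset)
positions <- factor(c("GK", "ST", "CB", "ST", "CM", "ST", "CB"))

# Levels are alphabetical by default
levels(positions)

# Move a specified level to the front
relevelled <- fct_relevel(positions, "ST")
levels(relevelled)

# Order levels by first appearance in the data
in_order <- fct_inorder(positions)
levels(in_order)

# Reverse the current level order
reversed <- fct_rev(positions)
levels(reversed)

# Manually recode one level: new name on the left, old name on the right
recoded <- fct_recode(positions, Keeper = "GK")
levels(recoded)
```

None of these calls change the underlying values, only how the levels are ordered or named.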
https://www.kaggle.com/karangadiya/fifa19/version/4
This dataset was downloaded from Kaggle. It contains detailed attributes for every player registered in FIFA 19, the latest edition of the premier soccer video game.
players <- read.csv("https://raw.githubusercontent.com/Chris-Ayre/TidyVerse/master/fifa%2019%20data.csv", stringsAsFactors = TRUE)
Let's explore the nationalities of the soccer players.
fct_count(players$Nationality)
## # A tibble: 164 x 2
## f n
## <fct> <int>
## 1 Afghanistan 4
## 2 Albania 40
## 3 Algeria 60
## 4 Andorra 1
## 5 Angola 15
## 6 Antigua & Barbuda 4
## 7 Argentina 937
## 8 Armenia 10
## 9 Australia 236
## 10 Austria 298
## # … with 154 more rows
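The counts above are listed in level order, which is alphabetical. To see the most common levels first, `fct_count()` accepts `sort = TRUE`. The sketch below uses a small made-up `nationality` vector (an assumption for illustration, not the real data) so the result is easy to verify by eye.

```r
library(forcats)

# Hypothetical toy factor (not from the FIFA dataset)
nationality <- factor(c("Mexico", "Canada", "Mexico", "Brazil", "Mexico", "Canada"))

# sort = TRUE orders the rows by descending count instead of level order
counts <- fct_count(nationality, sort = TRUE)
counts
```

The same argument should work on the full column, for example `fct_count(players$Nationality, sort = TRUE)`.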
We have identified 164 different nationalities among the 18,207 players in the game. However, we only wish to work with players from North America, so we use fct_other() to keep those levels and collapse every other nationality into a single Other level.
table(fct_other(players$Nationality, keep = c("United States", "Canada", "Mexico")))
##
## Canada Mexico United States Other
## 64 366 353 17424
All nationalities other than Canada, Mexico, and the United States are now labeled Other.
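`fct_other()` can also be driven from the opposite direction with `drop`, and the name of the catch-all level can be changed with `other_level`. The following is a minimal sketch on a made-up `nationality` vector; the data and the label "Rest of world" are assumptions chosen for illustration.

```r
library(forcats)

# Hypothetical toy factor (not from the FIFA dataset)
nationality <- factor(c("Mexico", "Canada", "Brazil", "United States", "Spain", "Mexico"))

# keep: every level NOT listed is collapsed into the catch-all level
kept <- fct_other(
  nationality,
  keep = c("United States", "Canada", "Mexico"),
  other_level = "Rest of world"
)
table(kept)

# drop: only the listed levels are collapsed; the default label is "Other"
dropped <- fct_other(nationality, drop = c("Brazil", "Spain"))
table(dropped)
```

In both cases the catch-all level is placed last, which matches the position of Other in the table above.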