TidyVerse Assignment

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0       ✔ purrr   0.3.0  
## ✔ tibble  2.0.1       ✔ dplyr   0.8.0.1
## ✔ tidyr   0.8.2       ✔ stringr 1.4.0  
## ✔ readr   1.3.1       ✔ forcats 0.4.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Forcats Package

The forcats package provides helpers for reordering factor levels (moving specified levels to the front, ordering by first appearance, reversing, and randomly shuffling) and tools for modifying factor levels (collapsing rare levels into "Other", anonymising, and manually recoding).
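
As a quick illustration of the reordering helpers listed above, here is a minimal sketch on a small made-up factor. The vector `f` and its values are hypothetical and are not taken from the FIFA data; the functions themselves (`fct_inorder()`, `fct_infreq()`, `fct_rev()`, `fct_relevel()`) are part of forcats.

```r
library(forcats)

# Hypothetical toy factor, used only for illustration
f <- factor(c("Mexico", "Canada", "Mexico", "Brazil", "Canada", "Mexico"))

by_default    <- levels(f)                         # alphabetical by default
by_appearance <- levels(fct_inorder(f))            # order of first appearance
by_frequency  <- levels(fct_infreq(f))             # most frequent level first
reversed      <- levels(fct_rev(f))                # default order, reversed
mexico_first  <- levels(fct_relevel(f, "Mexico"))  # move one level to the front

by_default
by_appearance
mexico_first
```

None of these calls change the underlying values; they only change the order of the levels, which is what controls the ordering in tables and plots.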

Data Source

https://www.kaggle.com/karangadiya/fifa19/version/4

This dataset was downloaded from Kaggle. It contains detailed attributes for every player registered in FIFA 19, the premier soccer video game.

players <- read.csv('https://raw.githubusercontent.com/Chris-Ayre/TidyVerse/master/fifa%2019%20data.csv', stringsAsFactors = TRUE)

Example

Let's explore the nationalities of the soccer players.

fct_count(players$Nationality)
## # A tibble: 164 x 2
##    f                     n
##    <fct>             <int>
##  1 Afghanistan           4
##  2 Albania              40
##  3 Algeria              60
##  4 Andorra               1
##  5 Angola               15
##  6 Antigua & Barbuda     4
##  7 Argentina           937
##  8 Armenia              10
##  9 Australia           236
## 10 Austria             298
## # … with 154 more rows
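
The listing above is ordered by level, which here means alphabetically. To see the largest groups first, `fct_count()` accepts `sort = TRUE`. The sketch below uses a small hypothetical vector `nat` as a stand-in for `players$Nationality` so that it runs without downloading the data.

```r
library(forcats)

# Hypothetical stand-in for players$Nationality
nat <- factor(c("England", "Spain", "England", "Germany", "Spain", "England"))

# sort = TRUE orders the rows by descending count instead of by level
counts <- fct_count(nat, sort = TRUE)
counts
```

Applied to the real data, the same call would be `fct_count(players$Nationality, sort = TRUE)`.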

We have identified 164 different nationalities among all 18,207 players in the game. However, we only wish to work with players from North America, so we need to collapse the nationalities we will not work with into a single "Other" level.

table(fct_other(players$Nationality, keep = c("United States", "Canada", "Mexico"))) 
## 
##        Canada        Mexico United States         Other 
##            64           366           353         17424

Every nationality other than these three North American countries is now labeled Other.
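
Note that `fct_other()` relabels levels but does not drop any rows. If the goal is to keep only the North American players, one possible follow-up is to subset on the collapsed factor. The sketch below assumes a small hypothetical vector `nat` in place of `players$Nationality`; the names `region` and `north_america` are also made up for the example.

```r
library(forcats)

# Hypothetical stand-in for players$Nationality
nat <- factor(c("Canada", "Brazil", "Mexico", "Spain",
                "United States", "Mexico", "France"))

# Collapse everything except the three kept levels into "Other"
region <- fct_other(nat, keep = c("United States", "Canada", "Mexico"))
fct_count(region)

# Keep only the values that were not collapsed, then drop unused levels
north_america <- droplevels(nat[region != "Other"])
levels(north_america)
```

The kept levels retain their original relative order and "Other" is placed last, which matches the column order in the table output shown earlier.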