Football Transfer Analysis.

Loading the required libraries and reading the dataset.

# Spell out FALSE: the shorthand F is an ordinary variable and can be reassigned.
library(dplyr, verbose = FALSE)
library(ggplot2, verbose = FALSE)

transferData <- read.csv("transfer_data.csv", stringsAsFactors = FALSE)

Initial Transfer Summary.

cat("There have been", nrow(transferData),"transfers that took place from 2007 - 2017.")
## There have been 6237 transfers that took place from 2007 - 2017.
cat("There have been", length(unique(transferData$PLAYER)),"players transferred from 2007 - 2017.")
## There have been 4167 players transferred from 2007 - 2017.
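There are more transfers than players, so some players moved more than once. A minimal sketch of how that could be counted, using base R only and a small made-up data frame (`toy` and its values are hypothetical, not taken from transfer_data.csv):

```r
# Toy data standing in for transferData (hypothetical values).
toy <- data.frame(
  PLAYER = c("A", "B", "A", "C", "B", "A"),
  stringsAsFactors = FALSE
)

n_transfers <- nrow(toy)                   # one row per transfer
n_players   <- length(unique(toy$PLAYER))  # distinct players

# Transfers per player, then the number of players with more than one move.
per_player <- table(toy$PLAYER)
n_repeat   <- sum(per_player > 1)

cat(n_transfers, "transfers,", n_players, "players,",
    n_repeat, "players moved more than once\n")
```

Replacing `toy` with `transferData` should give the same breakdown for the real dataset, assuming PLAYER identifies a player uniquely.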

Window Analysis.

There are usually two transfer windows: the summer (pre-season) window and the mid-season window.

transferData %>%
  filter(is.na(WINDOW)) %>%
  nrow()
## [1] 1

So, there is one row with a missing WINDOW value. Let’s remove that row.

transferData <- transferData %>%
                  filter(!is.na(WINDOW))
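The check above only looks at WINDOW. A small sketch of a broader check, on a made-up data frame (`toy` is hypothetical), that counts both NA values and blank strings in every column. Blank strings matter because read.csv keeps empty fields in character columns as "" rather than NA under its default na.strings setting:

```r
# Toy data with one NA and one blank WINDOW (hypothetical values).
toy <- data.frame(
  PLAYER = c("A", "B", "C", "D"),
  WINDOW = c("Pre-Season", NA, "Mid-Season", ""),
  stringsAsFactors = FALSE
)

# NA values per column.
na_counts <- colSums(is.na(toy))

# Blank strings per column (NA comparisons are skipped via na.rm).
blank_counts <- colSums(toy == "", na.rm = TRUE)

# Keep only rows whose WINDOW is neither NA nor blank.
clean <- toy[!is.na(toy$WINDOW) & toy$WINDOW != "", ]

print(na_counts)
print(blank_counts)
print(nrow(clean))
```

Whether the real file contains blank strings is not known from the output above; this only shows how one might look for them.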

df <- transferData %>%
      group_by(WINDOW) %>%
      summarise(Percent = round((n()*100)/nrow(transferData)) )

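# The y-axis is a percentage, so the title says so; the label is mapped
# inside aes() instead of reaching back into df.
df %>%
  ggplot(aes(x = WINDOW, y = Percent)) +
  geom_bar(stat = 'identity', fill = 'tomato') +
  ggtitle("Percentage of transfers per window (2007-2017)") +
  geom_label(aes(label = Percent))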

78% of the transfers happen during the pre-season window.
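As a cross-check on the percentages, the same share can be computed in base R with table() and prop.table(). This is a sketch on made-up values (`toy_window` is hypothetical), not the notebook's own method:

```r
# Ten hypothetical transfers: 7 pre-season, 3 mid-season.
toy_window <- c(rep("Pre-Season", 7), rep("Mid-Season", 3))

# Counts per window, converted to rounded percentages.
share <- round(100 * prop.table(table(toy_window)))

print(share)
```

Passing `transferData$WINDOW` instead of `toy_window` should reproduce the Percent column of df.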

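# Some SEASON values hold a single year ("15", "16"). Map them to the "YY/YY" form:
# a mid-season transfer belongs to the season that started the previous summer,
# a pre-season transfer to the season about to start.
transferData$SEASON[transferData$WINDOW == "Mid-Season" & transferData$SEASON == "15"] <- "14/15"
transferData$SEASON[transferData$WINDOW == "Mid-Season" & transferData$SEASON == "16"] <- "15/16"
transferData$SEASON[transferData$WINDOW == "Pre-Season" & transferData$SEASON == "15"] <- "15/16"
transferData$SEASON[transferData$WINDOW == "Pre-Season" & transferData$SEASON == "16"] <- "16/17"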
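The four assignments follow one rule, so they could be folded into a helper. A minimal sketch, assuming every bare label is a two-digit year and WINDOW has no missing values; `normalise_season` is a hypothetical name, not part of the original analysis:

```r
# Hypothetical helper: turn bare two-digit seasons into "YY/YY" labels.
normalise_season <- function(season, window) {
  bare  <- grepl("^[0-9]{2}$", season)          # only bare labels such as "15"
  yy    <- suppressWarnings(as.integer(season)) # NA for labels already in "YY/YY" form
  # Mid-season transfers belong to the season that started the year before.
  start <- ifelse(window == "Mid-Season", yy - 1L, yy)
  fixed <- sprintf("%02d/%02d", start %% 100L, (start + 1L) %% 100L)
  ifelse(bare, fixed, season)                   # leave other labels untouched
}

season <- c("15", "15", "16", "16", "14/15")
window <- c("Mid-Season", "Pre-Season", "Mid-Season", "Pre-Season", "Pre-Season")
result <- normalise_season(season, window)
print(result)
```

On the real data this would be applied as `transferData$SEASON <- normalise_season(transferData$SEASON, transferData$WINDOW)`. Unlike the explicit assignments, it would also rewrite any other bare year that happens to be in the column.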

# Count transfers per window and season, then draw one line per window.
transferData %>%
  group_by(WINDOW, SEASON) %>%
  summarise(Count = n()) %>%
  ggplot(aes(x = SEASON, y = Count, group = WINDOW)) +
  geom_line(aes(color = WINDOW)) +
  geom_point(aes(color = WINDOW)) +
  theme_bw()

The number of transfers has increased from season to season. This suggests that teams have increasingly tried to establish their presence by making more transfers every season.
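The upward trend can be put into numbers by taking the season-over-season change in transfer counts. A sketch on made-up values (`toy_season` and its counts are hypothetical), assuming the "YY/YY" labels sort chronologically, which holds for seasons within the same century:

```r
# Hypothetical season labels, one entry per transfer.
toy_season <- c(rep("14/15", 3), rep("15/16", 5), rep("16/17", 6))

# Transfers per season; table() sorts the labels alphabetically.
per_season <- table(toy_season)
counts     <- as.vector(per_season)

# Absolute and percentage change from one season to the next.
change     <- diff(counts)
pct_change <- round(100 * change / head(counts, -1), 1)

print(per_season)
print(change)
print(pct_change)
```

A run of positive values in `change` for `transferData$SEASON` would back the claim that transfers grew every season.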