The 2016-17 FA Cup semi-finalists are Chelsea (1st in the EPL), Tottenham Hotspur (2nd), Manchester City (3rd) and Arsenal (5th). Is this the best-ever line-up of semi-finalists, judged by league position? I used my engsoccerdata R package, available on CRAN, to find out. The package contains hundreds of thousands of professional soccer results, including all from England since 1877.
Here’s how I went about finding the best semi-finalists ever.
library(tidyverse)
library(engsoccerdata)
# get date of first semi for each season
facup %>%
filter(Season>=1888) %>%
filter(Season!=1945) %>%
filter(round=="s") %>%
group_by(Season) %>%
arrange(Date) %>%
filter(row_number()==1) %>%
select(Season,Date) -> semi_dates
head(semi_dates)
## Source: local data frame [6 x 2]
## Groups: Season [6]
##
## Season Date
## <int> <date>
## 1 1888 1889-03-16
## 2 1889 1890-03-08
## 3 1890 1891-02-28
## 4 1891 1892-02-27
## 5 1892 1893-03-04
## 6 1893 1894-03-10
# get teams in semis for each season
facup %>%
filter(Season>=1888) %>%
filter(Season!=1945) %>%
filter(round=="s") %>%
split(., .$Season) %>%
map(~ select(., home,visitor)) %>%
map(unlist) %>%
map(unique) -> semi_teams
head(semi_teams)
## $`1888`
## [1] "Preston North End" "Wolverhampton Wanderers"
## [3] "Blackburn Rovers" "West Bromwich Albion"
##
## $`1889`
## [1] "Blackburn Rovers" "Sheffield Wednesday"
## [3] "Wolverhampton Wanderers" "Bolton Wanderers"
##
## $`1890`
## [1] "Blackburn Rovers" "Notts County" "Sunderland"
## [4] "West Bromwich Albion"
##
## $`1891`
## [1] "Aston Villa" "West Bromwich Albion" "Nottingham Forest"
## [4] "Sunderland"
##
## $`1892`
## [1] "Everton" "Wolverhampton Wanderers"
## [3] "Preston North End" "Blackburn Rovers"
##
## $`1893`
## [1] "Bolton Wanderers" "Notts County" "Sheffield Wednesday"
## [4] "Blackburn Rovers"
Now I will use the maketable_eng function to get the league table as it stood on the date of the first semi-final. First I split the top-tier results into a list with one data frame per season. Then, for each season, I build the table from the games played up to that date and keep only the rows for that season’s semi-finalists.
# get top tier table on date of first semi.
england %>%
filter(tier==1) %>%
split(., .$Season) -> england_seasons
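# build each season's table from games played up to the first semi-final,
# then keep only the rows for that season's semi-finalists
table_res <- vector("list", nrow(semi_dates))
for(i in seq_len(nrow(semi_dates))){
  season <- as.character(semi_dates$Season[[i]]) # look up by season name, not list position
  table_res[[i]] <- maketable_eng(england_seasons[[season]] %>% filter(Date<=semi_dates$Date[[i]]),
                                  tier=1,
                                  Season=semi_dates$Season[[i]]) %>%
    filter(team %in% semi_teams[[season]])
}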
head(table_res,3)
## [[1]]
## team GP W D L gf ga gd Pts Pos
## 1 Preston North End 22 18 4 0 74 15 4.9333333 40 1
## 2 Wolverhampton Wanderers 22 12 4 6 50 37 1.3513514 28 3
## 3 Blackburn Rovers 20 9 6 5 62 42 1.4761905 24 4
## 4 West Bromwich Albion 22 10 2 10 40 46 0.8695652 22 5
##
## [[2]]
## team GP W D L gf ga gd Pts Pos
## 1 Blackburn Rovers 21 12 3 6 78 38 2.0526316 27 3
## 2 Wolverhampton Wanderers 21 9 5 7 46 37 1.2432432 23 4
## 3 Bolton Wanderers 20 9 0 11 51 58 0.8793103 18 8
##
## [[3]]
## team GP W D L gf ga gd Pts Pos
## 1 Notts County 21 10 4 7 45 34 1.323529 24 4
## 2 Blackburn Rovers 17 10 2 5 47 31 1.516129 22 5
## 3 Sunderland 20 8 5 7 43 30 1.433333 21 6
## 4 West Bromwich Albion 18 3 2 13 27 48 0.562500 8 12
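As you can see, these are the top-tier semi-finalists from 1888-89, 1889-90 and 1890-91. Only three teams appear for 1889-90 because one semi-finalist was not in the top tier: The Wednesday (now Sheffield Wednesday), who were founder members of the Football Alliance, the rival to the Football League. Note also that for these early seasons the gd column holds goal average (goals for divided by goals against, e.g. 74/15 = 4.93 for Preston) rather than goal difference.
To make this easier to compare, let’s put the results into a data frame, keep only the seasons with four top-tier semi-finalists, and sort by summed league position to get the top five seasons.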
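A quick way to see which semi-finalist is missing from a season's table is to compare the two sets of names. The sketch below hard-codes the 1889-90 values printed above rather than calling the package, so the object names `semis_1889` and `table_teams_1889` are made up for illustration. With the real objects, the same idea would be `setdiff(semi_teams[[2]], table_res[[2]]$team)`.

```r
# Semi-finalists in 1889-90, copied from the semi_teams output above
semis_1889 <- c("Blackburn Rovers", "Sheffield Wednesday",
                "Wolverhampton Wanderers", "Bolton Wanderers")

# Teams with a row in the top-tier table for that season (second table above)
table_teams_1889 <- c("Blackburn Rovers", "Wolverhampton Wanderers",
                      "Bolton Wanderers")

# Semi-finalists with no row in the top-tier table
missing_1889 <- setdiff(semis_1889, table_teams_1889)
print(missing_1889)
```

This returns the one name that is in the first vector but not the second, which here is Sheffield Wednesday.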
# top 5 Seasons for highest placed teams
table_res_pos <- lapply(table_res, function(x) x$Pos)
data.frame(
Season = semi_dates$Season,
teams = unlist(lapply(table_res_pos, length)),
total = unlist(lapply(table_res_pos, function(x) sum(as.numeric(x))))
) %>%
filter(teams==4) %>%
arrange(total) %>%
head(5)
## Season teams total
## 1 1888 4 13
## 2 2008 4 14
## 3 1964 4 15
## 4 1896 4 16
## 5 1904 4 17
This shows that in 1888-89 all four FA Cup semi-finalists came from the top tier of English football, and their summed league position was 13. We saw this above: Preston North End (1st), Wolves (3rd), Blackburn Rovers (4th) and WBA (5th).
We can look up the teams in the top three seasons like this:
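To make the counting and summing step concrete, here is a small base-R sketch that uses only the league positions printed earlier for the first three seasons. The names `pos_by_season`, `summary_df` and `complete` are made up for this illustration; the real analysis above works on `table_res_pos` across every season.

```r
# League positions of the top-tier semi-finalists, copied from the tables above
pos_by_season <- list(
  `1888` = c(1, 3, 4, 5),
  `1889` = c(3, 4, 8),      # only three top-tier teams this season
  `1890` = c(4, 5, 6, 12)
)

summary_df <- data.frame(
  Season = as.integer(names(pos_by_season)),
  teams  = lengths(pos_by_season),                 # number of top-tier semi-finalists
  total  = vapply(pos_by_season, sum, numeric(1))  # summed league position
)

# Keep only seasons with four top-tier teams, lowest (best) total first
complete <- summary_df[summary_df$teams == 4, ]
complete <- complete[order(complete$total), ]
print(complete)
```

The 1889-90 season drops out because it has only three top-tier teams, which is why the filter on `teams` matters: without it, a season with fewer teams would get an unfairly low total.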
semi_teams[which(semi_dates$Season==1888)]
## $`1888`
## [1] "Preston North End" "Wolverhampton Wanderers"
## [3] "Blackburn Rovers" "West Bromwich Albion"
semi_teams[which(semi_dates$Season==2008)]
## $`2008`
## [1] "Chelsea" "Everton" "Arsenal"
## [4] "Manchester United"
semi_teams[which(semi_dates$Season==1964)]
## $`1964`
## [1] "Leeds United" "Liverpool" "Manchester United"
## [4] "Chelsea"
So, if the teams in this season’s FA Cup semi-finals stay in the same league positions (1, 2, 3, 5), their summed position of 11 will be, by my reckoning, the best ever for a set of semi-finalists, beating the 13 of 1888-89.
Contact me - [@jalapic](https://twitter.com/jalapic) on Twitter or jc3181 AT columbia DOT edu
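That comparison is easy to check by hand, but as a short sketch in R it looks like this. The positions are the ones quoted in this post, and the object names are made up for illustration.

```r
# League positions quoted at the top of this post for the 2016-17 semi-finalists
current_pos <- c(Chelsea = 1, "Tottenham Hotspur" = 2,
                 "Manchester City" = 3, Arsenal = 5)

# Positions of the 1888-89 semi-finalists, the best total found above
best_pos <- c("Preston North End" = 1, "Wolverhampton Wanderers" = 3,
              "Blackburn Rovers" = 4, "West Bromwich Albion" = 5)

current_total <- sum(current_pos)
best_total    <- sum(best_pos)

# TRUE if this season's line-up has a lower (better) total than the previous best
beats_record <- current_total < best_total
print(c(current = current_total, previous_best = best_total))
print(beats_record)
```

The result only holds if the four teams are still in those positions on the date of the first semi-final, since that is the date the tables above are built for.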