Qatar 2022: Predict which teams will make their way to glory

Author

Andres PENA

Published

October 20, 2022

Introduction

This document presents the results of a model that predicts the outcome of every match played by the national teams at the Qatar 2022 World Cup.

The model starts from the FIFA Ranking published before the tournament. For each match, it calculates the probability that each team wins (or loses); after the match, the FIFA Ranking values of both teams are updated. This process is repeated from the opening game to the final of the Qatar 2022 World Cup.
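
The probabilities and updated values reported in the tables of this document are consistent with the formula FIFA has used for its men's ranking since 2018, so one match update can be sketched as follows. This is a minimal sketch under that assumption: `win_prob` and `update_rank` are hypothetical names, not the author's `update_point` function (whose source is not shown), and the importance value of 50 is also an assumption.

```r
# Expected result of team A against team B (FIFA ranking formula, assumed):
# W_e = 1 / (10^(-(rank_a - rank_b) / 600) + 1)
win_prob <- function(rank_a, rank_b) {
  1 / (10^(-(rank_a - rank_b) / 600) + 1)
}

# New value after a match: P = P_before + I * (W - W_e),
# with W = 1 for a win and W = 0 for a loss (draws are ignored in this sketch).
# importance = 50 is an assumption, not a value stated in the document.
update_rank <- function(rank, opponent_rank, won, importance = 50) {
  rank + importance * (as.numeric(won) - win_prob(rank, opponent_rank))
}

# Senegal (1585) vs Netherlands (1679), values from the Round-1 tables
round(100 * win_prob(1585, 1679))            # 41
round(update_rank(1679, 1585, won = TRUE))   # 1700
round(update_rank(1585, 1679, won = FALSE))  # 1564
```

The three printed values match the Senegal vs Netherlands row of the Round-1 tables, which is what suggests this formula; it does not prove that the model is implemented this way.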

Nomenclature

  • id: match id
  • group: FIFA group
  • date: match date
  • rounds: group stage (Round-1 to Round-3) & knockout rounds
  • home: home team
  • away: away team
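  • prob_home: home team win probability (%)
  • prob_away: away team win probability (%)
  • winner: winning team
  • loser: losing team
  • home_rank: home team FIFA ranking value (before the match in the "Before" tables, after the match in the "After" tables)
  • away_rank: away team FIFA ranking value (before the match in the "Before" tables, after the match in the "After" tables)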

Group stage

Only the matches involving the 8 teams with the highest FIFA ranking values are shown in this section. The results of all matches between the participating teams are shown at the end of this document.

Group stage: Round-1

The starting values correspond to the FIFA ranking of August 25, 2022, and are taken from the following link:

https://www.fifa.com/fifa-world-ranking/men?dateId=id13792
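
The code in this section relies on two objects that are not shown in this document, the schedule `sch` and the ranking table `fr`. The sketch below guesses their shape from the column names the code uses (`team`, `rank_bQ22`, `rounds`, `home`, `away`); the `id` values are placeholders, and the two left joins are a simplified alternative to the rename, `full_join` and `drop_na` sequence used by the author, not a copy of it.

```r
library(dplyr)
library(tibble)

# Assumed shape of the ranking table: one row per team, values before Qatar 2022
fr <- tribble(
  ~team,         ~rank_bQ22,
  "Senegal",     1585,
  "Netherlands", 1679,
  "England",     1737,
  "Iran",        1559
)

# Assumed shape of the schedule (the id values are placeholders)
sch <- tribble(
  ~id, ~group, ~date,        ~rounds, ~home,     ~away,
  1,   "A",    "2022-11-21", "r1",    "Senegal", "Netherlands",
  2,   "B",    "2022-11-21", "r1",    "England", "Iran"
)

# Attach the ranking value of the home team, then of the away team
r1_sketch <- sch %>%
  filter(rounds == "r1") %>%
  left_join(fr, by = c("home" = "team")) %>%
  rename(home_rank = rank_bQ22) %>%
  left_join(fr, by = c("away" = "team")) %>%
  rename(away_rank = rank_bQ22)

r1_sketch
```

A left join keeps exactly the rows of the schedule, so no `drop_na(id)` step is needed to discard teams that have no match in the filtered round.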

Code
# Data prep Round-1  
# 
# add FIFA ranking to schedule
# 
r1 <- sch %>% 
  filter(rounds == "r1") %>%
  rename(team = home) %>% 
  full_join(fr, by = "team") %>%
  rename(home = team, home_rank = rank_bQ22)

r1 <- r1 %>% 
  rename(team = away) %>% 
  full_join(fr, by = "team") %>%
  rename(away = team, away_rank = rank_bQ22) %>%
  drop_na(id)

r1 <- r1 %>% 
  select(id:date, rounds, home:away_rank)

# Final result Round-1 run update_point function 
# 
round1 <- update_point(r1, "r1", home_rank, away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Round-1
home         home_rank  away           away_rank
Senegal      1585       Netherlands    1679
England      1737       Iran           1559
France       1765       Australia      1484
Argentina    1771       Saudi Arabia   1436
Belgium      1822       Canada         1474
Spain        1717       Costa Rica     1500
Portugal     1679       Ghana          1393
Brazil       1838       Serbia         1550

Qatar 2022 - After Round-1
home         away           prob_home  prob_away  home_rank  away_rank
Senegal      Netherlands    41         59         1564       1700
England      Iran           66         34         1754       1542
France       Australia      75         25         1778       1471
Argentina    Saudi Arabia   78         22         1781       1425
Belgium      Canada         79         21         1832       1463
Spain        Costa Rica     70         30         1732       1485
Portugal     Ghana          75         25         1691       1381
Brazil       Serbia         75         25         1850       1537

Group stage: Round-2

Code
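# rebuild fr (team, rank) for Round-2
# NOTE: r1 holds the values from before the Round-1 matches, so Round-2 starts
# again from the pre-tournament values (compare the two "Before" tables);
# build fr from round1 instead if the updated values should carry over
fr <- tibble(team = c(r1$home, r1$away),
             rank = c(r1$home_rank, r1$away_rank))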
  
# add FIFA ranking to schedule
# 
r2 <- sch %>% 
  filter(rounds == "r2") %>%
  rename(team = home) %>% 
  full_join(fr, by = "team") %>%
  rename(home = team, home_rank = rank)

r2 <- r2 %>% 
  rename(team = away) %>% 
  full_join(fr, by = "team") %>%
  rename(away = team, away_rank = rank) %>%
  drop_na(id)

r2 <- r2 %>% 
  select(id:date, rounds, home:away_rank)

# Final result Round-2 run update_point function 
# 
round2 <- update_point(r2, "r2", home_rank, away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Round-2
home         home_rank  away           away_rank
Netherlands  1679       Ecuador        1464
England      1737       United States  1635
France       1765       Denmark        1665
Argentina    1771       Mexico         1650
Belgium      1822       Morocco        1558
Spain        1717       Germany        1659
Brazil       1838       Switzerland    1621
Portugal     1679       Uruguay        1641

Qatar 2022 - After Round-2
home         away           prob_home  prob_away  home_rank  away_rank
Netherlands  Ecuador        70         30         1695       1449
England      United States  60         40         1758       1615
France       Denmark        59         41         1785       1645
Argentina    Mexico         61         39         1790       1630
Belgium      Morocco        73         27         1835       1545
Spain        Germany        56         44         1739       1637
Brazil       Switzerland    70         30         1853       1606
Portugal     Uruguay        54         46         1702       1618

Group stage: Round-3

Code
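# rebuild fr (team, rank) for Round-3
# NOTE: r2 holds the values from before the Round-2 matches, so Round-3 starts
# again from the pre-tournament values (compare the "Before" tables);
# build fr from round2 instead if the updated values should carry over
fr <- tibble(team = c(r2$home, r2$away),
             rank = c(r2$home_rank, r2$away_rank))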

# add FIFA ranking to schedule
# 
r3 <- sch %>% 
  filter(rounds == "r3") %>%
  rename(team = home) %>% 
  full_join(fr, by = "team") %>%
  rename(home = team, home_rank = rank)

r3 <- r3 %>% 
  rename(team = away) %>% 
  full_join(fr, by = "team") %>%
  rename(away = team, away_rank = rank) %>%
  drop_na(id)

r3 <- r3 %>% 
  select(id:date, rounds, home:away_rank)

# Final result Round-3 run update_point function 
# 
round3 <- update_point(r3, "r3", home_rank, away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Round-3
home         home_rank  away           away_rank
Wales        1582       England        1737
Netherlands  1679       Qatar          1442
Tunisia      1508       France         1765
Poland       1546       Argentina      1771
Croatia      1632       Belgium        1822
Japan        1555       Spain          1717
South Korea  1526       Portugal       1679
Cameroon     1485       Brazil         1838

Qatar 2022 - After Round-3
home         away           prob_home  prob_away  home_rank  away_rank
Wales        England        36         64         1564       1755
Netherlands  Qatar          71         29         1694       1428
Tunisia      France         27         73         1494       1778
Poland       Argentina      30         70         1531       1786
Croatia      Belgium        33         67         1616       1838
Japan        Spain          35         65         1537       1734
South Korea  Portugal       36         64         1508       1697
Cameroon     Brazil         20         80         1475       1848
Code
# Group stage results 
# 
phase_group_results <- rbind(round1, round2, round3)

# rank after r3
# 
df <- phase_group_results %>% 
  filter(rounds == "r3") %>%
  select(group, home, away, home_rank, away_rank)

rank_after_r3 <- tibble(team = c(df$home, df$away),
                        rank = c(df$home_rank, df$away_rank))

# phase_group_winners
# 
phase_group_winners <- phase_group_results %>%
  group_by(winner) %>%
  summarise(group = unique(group),
            team = unique(winner),
            nw = n()) %>%
  filter(nw > 1) %>%
  mutate(win = ifelse(nw == 3, "1", "2"),
         winner = paste(win, group, sep = "")) %>%
  select(winner, team)

# trw: team, rank & winner
#
trw <- full_join(rank_after_r3, phase_group_winners, by = "team") %>%
  drop_na(winner)
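
The `phase_group_winners` step turns win counts into bracket labels such as "1A" and "2A": a team with three group wins is labelled first in its group and a team with two wins second. The sketch below isolates that idea on a single group. The input is illustrative (only the three Netherlands wins appear in the tables above; the other three winners are assumed), and `count()` is used in place of the author's `group_by()` and `summarise()`.

```r
library(dplyr)
library(tibble)

# Illustrative winners of the six Group A matches (partly assumed input)
results <- tibble(
  group  = "A",
  winner = c("Netherlands", "Ecuador", "Netherlands",
             "Senegal", "Netherlands", "Senegal")
)

seeds <- results %>%
  count(group, winner, name = "nw") %>%   # wins per team within each group
  filter(nw > 1) %>%                      # keep the two teams that advance
  mutate(seed = paste0(ifelse(nw == 3, "1", "2"), group)) %>%
  select(seed, team = winner)

seeds
```

This labelling only works when every group ends with exactly one team on three wins and one team on two wins and no match is drawn, which appears to hold here because the predicted winner of each match is the team with the higher win probability.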

Knockout rounds

Phase: Round of 16 - r18

In the Round of 16, the groups are paired, and the winner of each group faces the runner-up of the other: Group A is paired with Group B, C with D, E with F, and G with H.
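
The pairing rule can be written out in a few lines of base R. This is only an illustration of the rule: the object name `r16_pairs` is hypothetical, the labels follow the "1A" and "2B" convention built by `phase_group_winners`, and the row order is not the official match order.

```r
groups <- LETTERS[1:8]              # "A" to "H"
first  <- groups[c(TRUE, FALSE)]    # A, C, E, G
second <- groups[c(FALSE, TRUE)]    # B, D, F, H

# Each group winner ("1x") meets the runner-up ("2y") of the paired group
r16_pairs <- data.frame(
  home = paste0("1", c(first, second)),
  away = paste0("2", c(second, first))
)

r16_pairs
```

The first row pairs "1A" with "2B" and the fifth row pairs "1B" with "2A", giving eight matches in total.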

Code
# phase 1/8 schedule
# 
schl18 <- sch %>% filter(group == "1/8 final") %>% 
  select(id, group, date, home, away)

# join schl18 & trw to set home
# 
da1 <- schl18 %>% 
  rename(winner = home) %>% 
  full_join(trw, by = "winner") %>% 
  drop_na(id) %>%
  select(id, date, home = team, home_rank = rank)

# join schl18 & trw to set away
#
da2 <- schl18 %>% 
  rename(winner = away) %>% 
  full_join(trw, by = "winner") %>% 
  drop_na(id) %>%
  select(id, away = team, away_rank = rank)

# join to get updated schedule 1/8
#
r18 <- full_join(da1, da2,  by = "id") %>%
  mutate(group = "r18", rounds = "r18") %>%
  select("id", "group", "date", "rounds", "home",
         "away", "home_rank", "away_rank")

# Final result r18 run update_point function 
# 
round18 <- update_point(r18, "r18", home_rank, 
                   away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Round of 16
home         home_rank  away           away_rank
Netherlands  1694       United States  1656
Argentina    1786       Denmark        1682
England      1755       Senegal        1604
France       1778       Mexico         1665
Spain        1734       Croatia        1616
Brazil       1848       Uruguay        1655
Belgium      1838       Germany        1677
Portugal     1697       Switzerland    1643

Qatar 2022 - After Round of 16
home         away           prob_home  prob_away  home_rank  away_rank
Netherlands  United States  54         46         1717       1633
Argentina    Denmark        60         40         1806       1662
England      Senegal        64         36         1773       1586
France       Mexico         61         39         1798       1645
Spain        Croatia        61         39         1753       1597
Brazil       Uruguay        68         32         1864       1639
Belgium      Germany        65         35         1856       1659
Portugal     Switzerland    55         45         1719       1621

Phase: Quarterfinals - r14

Code
# match importance coefficient, used from the quarterfinals onward
I <- 60

r14 <- round18 %>% 
  mutate(game = paste("W", id, sep = "")) %>%
  mutate(rank = ifelse(home == winner, home_rank, away_rank)) %>%
  select(team = winner, game, rank)

sch14 <- sch %>% 
  filter(group == "1/4 final") %>% 
  select(id, group, date, rounds, home, away)

db <- r14 %>% 
  rename(home = game) %>% 
  full_join(sch14, by = "home") %>% 
  drop_na(id) %>%
  select(id, group, date, rounds, home = team, home_rank = rank)

dc <- r14 %>% 
  rename(away = game) %>% 
  full_join(sch14, by = "away") %>% 
  drop_na(id) %>%
  select(away = team, away_rank = rank)

r14 <- cbind(db, dc)

# Final result r14: run the update_point function
#
r14 <- r14 %>% mutate(group = "r14", rounds = "r14") %>%
  select("id", "group", "date", "rounds", "home",
         "away", "home_rank", "away_rank")

round14 <- update_point(r14, "r14", home_rank, 
                   away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Quarterfinals
home         home_rank  away           away_rank
Netherlands  1717       Argentina      1806
England      1773       France         1798
Spain        1753       Brazil         1864
Belgium      1856       Portugal       1719

Qatar 2022 - After Quarterfinals
home         away           prob_home  prob_away  home_rank  away_rank
Netherlands  Argentina      42         58         1692       1831
England      France         48         52         1744       1827
Spain        Brazil         40         60         1729       1888
Belgium      Portugal       63         37         1878       1697
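
The importance coefficient `I` scales how many points change hands in a match. The sketch below shows its effect on one quarterfinal, assuming the update follows the FIFA ranking formula; `points_gained` is a hypothetical helper, not part of the author's code, and the comparison value of 50 is an assumption.

```r
# Points gained by the winner: I * (1 - W_e), with W_e from the FIFA formula
# (assumed to be the formula behind update_point)
points_gained <- function(winner_rank, loser_rank, importance) {
  w_e <- 1 / (10^(-(winner_rank - loser_rank) / 600) + 1)
  importance * (1 - w_e)
}

# Argentina (1806) beating Netherlands (1717), values from the quarterfinal tables
gain_50 <- points_gained(1806, 1717, importance = 50)
gain_60 <- points_gained(1806, 1717, importance = 60)

round(1806 + gain_60)   # 1831, the value shown for Argentina after the quarterfinals
```

Because the gain is proportional to `I`, moving from 50 to 60 increases every point transfer by 20 percent for the same pair of ranking values.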

Phase: Semifinals - r12

Code
r12 <- round14 %>% 
  mutate(game = paste("W", id, sep = "")) %>%
  mutate(rank = ifelse(home == winner, home_rank, away_rank)) %>%
  select(team = winner, game, rank)

sch12 <- sch %>% 
  filter(group == "1/2 final") %>% 
  select(id, group, date, rounds, home, away)

cb <- r12 %>% 
  rename(home = game) %>% 
  full_join(sch12, by = "home") %>% 
  drop_na(id) %>%
  select(id, group, date, rounds, home = team, home_rank = rank)

cc <- r12 %>% 
  rename(away = game) %>% 
  full_join(sch12, by = "away") %>% 
  drop_na(id) %>%
  select(away = team, away_rank = rank)

r12 <- cbind(cb, cc)

# Final result r12: run the update_point function
#
r12 <- r12 %>% mutate(group = "r12", rounds = "r12") %>%
  select("id", "group", "date", "rounds", "home",
         "away", "home_rank", "away_rank")

round12 <- update_point(r12, "r12", home_rank, 
                   away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Semifinals
home         home_rank  away           away_rank
Argentina    1831       Brazil         1888
France       1827       Belgium        1878

Qatar 2022 - After Semifinals
home         away           prob_home  prob_away  home_rank  away_rank
Argentina    Brazil         45         55         1804       1915
France       Belgium        45         55         1800       1905
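
Each knockout phase fills its schedule by replacing placeholders of the form "W" plus a match id with the winner of that earlier match. The sketch below shows the lookup on a single fixture. The match ids and the objects `prev_round`, `next_sched` and `lookup` are hypothetical, and keyed left joins are used here instead of the author's `cbind()` of two separately joined tables.

```r
library(dplyr)
library(tibble)

# Hypothetical results of two earlier matches (ids are placeholders);
# the values are those shown for both teams before the semifinals
prev_round <- tibble(
  id     = c(57, 58),
  winner = c("Argentina", "Brazil"),
  rank   = c(1831, 1888)
)

# Hypothetical schedule row that refers to those matches by "W" + id
next_sched <- tibble(id = 61, home = "W57", away = "W58")

# Placeholder -> winning team and its current ranking value
lookup <- prev_round %>%
  transmute(game = paste0("W", id), team = winner, rank)

next_round <- next_sched %>%
  left_join(lookup, by = c("home" = "game")) %>%
  rename(home_team = team, home_rank = rank) %>%
  left_join(lookup, by = c("away" = "game")) %>%
  rename(away_team = team, away_rank = rank) %>%
  transmute(id, home = home_team, away = away_team, home_rank, away_rank)

next_round
```

Joining both sides on the placeholder key keeps the home and away teams attached to the right match even if the rows of the two intermediate tables come out in a different order.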

Phase: Final - rf

Code
rf <- round12 %>% 
  mutate(game = paste("W", id, sep = "")) %>%
  mutate(rank = ifelse(home == winner, home_rank, away_rank)) %>%
  select(team = winner, game, rank)

schf <- sch %>% 
  filter(group == "final") %>% 
  select(id, group, date, rounds, home, away)

db <- rf %>% 
  rename(home = game) %>% 
  full_join(schf, by = "home") %>% 
  drop_na(id) %>%
  select(id, group, date, rounds, home = team, home_rank = rank)

dc <- rf %>% 
  rename(away = game) %>% 
  full_join(schf, by = "away") %>% 
  drop_na(id) %>%
  select(away = team, away_rank = rank)

rf <- cbind(db, dc)

# Final result rf: run the update_point function
#
rf <- rf %>% mutate(group = "rf", rounds = "rf") %>%
  select("id", "group", "date", "rounds", "home",
         "away", "home_rank", "away_rank")

roundf <- update_point(rf, "rf", home_rank, 
                   away_rank, I, prob_home, prob_away)
Qatar 2022 - Before Finals
home         home_rank  away           away_rank
Brazil       1915       Belgium        1905

Qatar 2022 - After Finals
home         away           prob_home  prob_away  home_rank  away_rank
Brazil       Belgium        51         49         1944       1876

All match results

Below are the results of all the matches. Enter the name of a country in the Search box to see its performance at the Qatar 2022 World Cup.
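
Outside the interactive table, the same lookup can be done with a simple filter. The sketch below assumes the combined results sit in one data frame with `home` and `away` columns; `all_results` and `team_matches` are hypothetical names, and the three rows are copied from the group-stage tables above.

```r
library(dplyr)
library(tibble)

# Three matches copied from the tables above (subset of columns)
all_results <- tribble(
  ~rounds, ~home,    ~away,         ~prob_home, ~prob_away,
  "r1",    "Brazil", "Serbia",      75,         25,
  "r2",    "Brazil", "Switzerland", 70,         30,
  "r1",    "France", "Australia",   75,         25
)

# Keep every match in which the given country plays, at home or away
team_matches <- function(results, country) {
  results %>% filter(home == country | away == country)
}

brazil <- team_matches(all_results, "Brazil")
brazil
```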