World Cup Trophy
This analysis uses data from the Maven Analytics Maven World Cup
Challenge. We analyse the historical data leading up to the 2022 FIFA
World Cup in Qatar, including all the matches from the previous World
Cups, all international matches for the qualified countries, and the
groups and matches for the upcoming tournament.
Our case-study country is Brazil. The Brazil national team, nicknamed
the Seleção Canarinha, is one of the 32 footballing nations that will
participate in this year's tournament in Qatar. We will dig into the
data and tell a single-page story of the country's history with the
World Cup, its road to Qatar, and its expectations for this year's
tournament, and present our insights using data visualization.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.2
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(ggplot2)
library(lubridate)
##
## Attaching package: 'lubridate'
##
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(maps)
## Warning: package 'maps' was built under R version 4.2.2
##
## Attaching package: 'maps'
##
## The following object is masked from 'package:purrr':
##
## map
world_cup_matches <- read.csv("2022_world_cup_matches.csv")
world_cup_groups <- read.csv("2022_world_cup_groups.csv")
international_matches <- read.csv("international_matches.csv")
world_cup_games <- read.csv("world_cup_matches.csv")
world_cup <- read.csv("world_cups.csv")
Data preparation and cleaning
# Keep every international match in which Brazil was the home or away side
Brazil_international_matches <- international_matches %>%
  filter(Home.Team == "Brazil" | Away.Team == "Brazil")
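# Note: each assignment below starts again from `world_cup`, so only the last
# recode (Winner) is kept in world_cup_v2. "Germany FR" therefore still appears
# in the Runners.Up and Third columns of the tables further down.
world_cup_v2 <- world_cup %>%
  mutate(Runners.Up = recode(Runners.Up, "Germany FR" = "Germany"))
world_cup_v2 <- world_cup %>%
  mutate(Third = recode(Third, "Germany FR" = "Germany"))
world_cup_v2 <- world_cup %>%
  mutate(Winner = recode(Winner, "Germany FR" = "Germany"))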
Brazil_international_matches <- Brazil_international_matches %>%
mutate(Tournament = recode(Tournament
,"Germany FR" = "Germany"))
# World Cup matches in which Brazil was the home or away side
Brazil_wc_game <- world_cup_games %>%
  filter(Home.Team == "Brazil" | Away.Team == "Brazil")
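The three assignments to `world_cup_v2` above each start again from `world_cup`, so one way to keep every recode is to apply them in a single `mutate()`. The sketch below is only an illustration on a made-up data frame (`toy_cups` and its team names are hypothetical, not the contents of `world_cups.csv`) and assumes dplyr's `across()` is available:

```r
library(dplyr)

# Hypothetical stand-in for world_cups.csv; values are illustrative only.
toy_cups <- data.frame(
  Year       = 1:3,
  Winner     = c("Germany FR", "Team A", "Team B"),
  Runners.Up = c("Team C", "Germany FR", "Team A"),
  Third      = c("Team D", "Team E", "Germany FR")
)

# Recode all three podium columns in one step so no recode is overwritten.
toy_cups_v2 <- toy_cups %>%
  mutate(across(c(Winner, Runners.Up, Third),
                ~ recode(.x, "Germany FR" = "Germany")))

toy_cups_v2
```

With this pattern "Germany FR" is replaced in all three columns at once, so later counts by runner-up or third place would group both spellings together.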
Number of goals scored at home and number of matches played in every tournament
Brazil_international_matches %>%
group_by(Tournament) %>%
filter(Home.Team == "Brazil") %>%
summarise(total_home_goals = sum(Home.Goals),Num_of_match_played = n()) %>%
arrange(desc(total_home_goals))
## # A tibble: 15 × 3
## Tournament total_home_goals Num_of_match_played
## <chr> <int> <int>
## 1 Friendly 556 228
## 2 Copa America 349 137
## 3 FIFA World Cup qualification 174 63
## 4 Confederations Cup 54 18
## 5 Copa Roca 32 13
## 6 Copa Oswaldo Cruz 29 9
## 7 Pan American Championship 25 9
## 8 Copa Rio Branco 23 9
## 9 Copa Bernardo O'Higgins 11 4
## 10 Gold Cup 8 5
## 11 Atlantic Cup 7 2
## 12 Brazil Independence Cup 5 4
## 13 Mundialito 4 1
## 14 Superclasico de las Americas 4 3
## 15 USA Cup 4 2
Number of goals scored away and number of matches played in every tournament
Brazil_international_matches %>%
group_by(Tournament) %>%
filter(Away.Team == "Brazil") %>%
summarise(total_away_goals = sum(Away.Goals),Num_of_match_played = n()) %>%
arrange(desc(total_away_goals))
## # A tibble: 18 × 3
## Tournament total_away_goals Num_of_match_played
## <chr> <int> <int>
## 1 Friendly 392 201
## 2 FIFA World Cup qualification 105 64
## 3 Copa America 81 54
## 4 Confederations Cup 24 15
## 5 Copa Oswaldo Cruz 17 7
## 6 Copa Roca 17 10
## 7 Gold Cup 14 9
## 8 Copa Rio Branco 13 9
## 9 Pan American Championship 13 7
## 10 King's Cup 7 1
## 11 Lunar New Year Cup 7 1
## 12 Copa Bernardo O'Higgins 6 6
## 13 Tournoi de France 5 3
## 14 Superclasico de las Americas 4 5
## 15 Rous Cup 3 2
## 16 Atlantic Cup 2 3
## 17 Mundialito 2 2
## 18 USA Cup 2 1
Number of goals conceded away in all competitions
# Brazil is the away side in these rows, so the home team's goals are goals conceded
Brazil_international_matches %>%
filter(!Home.Team %in% c('Brazil'))%>%
summarise(goals_conceded = sum(Home.Goals))
## goals_conceded
## 1 408
Number of goals conceded at home in all competitions
# Brazil is the home side in these rows, so the away team's goals are goals conceded
Brazil_international_matches %>%
filter(!Away.Team %in% c('Brazil'))%>%
summarise(goals_conceded = sum(Away.Goals))
## goals_conceded
## 1 397
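The home/away split for goals conceded is easy to mix up, because the goals Brazil concedes are stored in the opponent's column. A minimal sketch of the logic on made-up matches (`toy_matches` is hypothetical, and "Team X" stands in for Brazil):

```r
library(dplyr)

# Hypothetical matches; "Team X" plays the role of Brazil.
toy_matches <- data.frame(
  Home.Team  = c("Team X", "Team A", "Team X", "Team B"),
  Away.Team  = c("Team A", "Team X", "Team B", "Team X"),
  Home.Goals = c(2, 1, 0, 3),
  Away.Goals = c(1, 1, 2, 0)
)

conceded <- toy_matches %>%
  summarise(
    # At home, Team X concedes whatever the away side scores
    conceded_at_home = sum(Away.Goals[Home.Team == "Team X"]),
    # Away, Team X concedes whatever the home side scores
    conceded_away    = sum(Home.Goals[Away.Team == "Team X"])
  )

conceded
```

Computing both figures in one `summarise()` keeps the two definitions side by side, which makes a swapped label easier to spot.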
World Cup winning teams, matches played and the number of tournament goals
world_cup %>%
group_by(Year,Winner,Matches.Played) %>%
drop_na(Goals.Scored) %>%
summarise(num_of_goals = sum(Goals.Scored)) %>%
arrange(-Matches.Played)
## `summarise()` has grouped output by 'Year', 'Winner'. You can override using
## the `.groups` argument.
## # A tibble: 21 × 4
## # Groups: Year, Winner [21]
## Year Winner Matches.Played num_of_goals
## <int> <chr> <int> <int>
## 1 1998 France 64 171
## 2 2002 Brazil 64 161
## 3 2006 Italy 64 147
## 4 2010 Spain 64 145
## 5 2014 Germany 64 171
## 6 2018 France 64 169
## 7 1982 Italy 52 146
## 8 1986 Argentina 52 132
## 9 1990 Germany FR 52 115
## 10 1994 Brazil 52 141
## # … with 11 more rows
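The `summarise()` message above appears because the result is still grouped by the remaining grouping columns. One way to silence it and return an ungrouped table is the `.groups` argument; the sketch below uses a small made-up data frame (`toy_goals` is hypothetical):

```r
library(dplyr)

# Hypothetical rows; values are illustrative only.
toy_goals <- data.frame(
  Year         = c(1, 1, 2),
  Winner       = c("A", "A", "B"),
  Goals.Scored = c(3, 4, 5)
)

res <- toy_goals %>%
  group_by(Year, Winner) %>%
  # .groups = "drop" returns an ungrouped result and suppresses the message
  summarise(num_of_goals = sum(Goals.Scored), .groups = "drop")

res
```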
Total number of teams to participate in the World Cup
Hteam <- world_cup_games %>%
select(Home.Team)
Ateam <- world_cup_games %>%
select(Away.Team)
Ateam <- rename(
Ateam, Home.Team = Away.Team
)
total_teams <- bind_rows(Ateam,Hteam)
Total number of countries to make it to the World Cup
n_distinct(total_teams)
## [1] 81
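The count above depends on stacking the home and away columns before counting, since a team may appear in only one of them. A minimal sketch of the same idea on made-up fixtures (`toy_games` is hypothetical), combining the two columns as a single vector:

```r
library(dplyr)

# Hypothetical fixtures; team names are placeholders.
toy_games <- data.frame(
  Home.Team = c("A", "B", "A"),
  Away.Team = c("B", "C", "D")
)

# Stack both columns, then count the distinct names
all_teams <- c(toy_games$Home.Team, toy_games$Away.Team)
n_teams <- n_distinct(all_teams)

n_teams
```

Here team "D" only ever plays away, so counting `Home.Team` alone would miss it.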
Number of games played in the FIFA World Cup
total_teams %>%
group_by(Home.Team) %>%
summarise(num_of_app = n()) %>%
arrange(desc(num_of_app))
## # A tibble: 81 × 2
## Home.Team num_of_app
## <chr> <int>
## 1 Brazil 109
## 2 Germany 109
## 3 Italy 83
## 4 Argentina 81
## 5 England 69
## 6 France 66
## 7 Spain 63
## 8 Mexico 57
## 9 Uruguay 56
## 10 Sweden 51
## # … with 71 more rows
Number of goals scored by Brazil in the world cup
# Goals Brazil scored as the home side
Hgoal <- world_cup_games %>%
  filter(Home.Team == 'Brazil') %>%
  summarise(Num_of_Hgoal = sum(Home.Goals))
# Goals Brazil scored as the away side
Agoal <- world_cup_games %>%
  filter(Away.Team == 'Brazil') %>%
  summarise(num_of_Agoal = sum(Away.Goals))
# Add the two one-row totals
data.frame(Brazil_wc_goals = Hgoal$Num_of_Hgoal + Agoal$num_of_Agoal)
## Brazil_wc_goals
## 1 229
Number of goals conceded by Brazil in the World Cup
# Goals scored by home opponents (Brazil was the away side)
Chgoal <- Brazil_wc_game %>%
  filter(Home.Team != 'Brazil') %>%
  summarise(Num_of_Hgoal = sum(Home.Goals))
# Goals scored by away opponents (Brazil was the home side)
Cagoal <- Brazil_wc_game %>%
  filter(Away.Team != 'Brazil') %>%
  summarise(num_of_Agoal = sum(Away.Goals))
# Add the two one-row totals
data.frame(conceded_goals = Chgoal$Num_of_Hgoal + Cagoal$num_of_Agoal)
## conceded_goals
## 1 105
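Goals scored and conceded can also be computed together by reshaping the matches so that each row describes one team in one match. This is a sketch under assumptions, not the method used above: `toy_wc` is a made-up data frame and the column names `team`, `scored` and `conceded` are introduced here for illustration.

```r
library(dplyr)

# Hypothetical matches; "Team X" plays the role of Brazil.
toy_wc <- data.frame(
  Home.Team  = c("Team X", "Team A", "Team B"),
  Away.Team  = c("Team A", "Team X", "Team C"),
  Home.Goals = c(2, 1, 4),
  Away.Goals = c(0, 3, 4)
)

# One row per team per match: the home view stacked on the away view
long <- bind_rows(
  transmute(toy_wc, team = Home.Team, scored = Home.Goals, conceded = Away.Goals),
  transmute(toy_wc, team = Away.Team, scored = Away.Goals, conceded = Home.Goals)
)

totals <- long %>%
  group_by(team) %>%
  summarise(scored = sum(scored), conceded = sum(conceded), .groups = "drop")

filter(totals, team == "Team X")
```

The long layout gives totals for every team in a single pass, so the same table could answer the question for any country, not only Brazil.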
Number of World Cup tournaments
world_cup_v2 %>%
drop_na(Goals.Scored) %>%
summarise(number_of_edition= n())
## number_of_edition
## 1 21
Total number of goals scored in the FIFA World Cup
world_cup %>%
drop_na(Goals.Scored) %>%
summarise(total_goal = sum(Goals.Scored))
## total_goal
## 1 2548
Average number of goals scored per FIFA World Cup tournament
world_cup %>%
drop_na(Goals.Scored) %>%
summarise(Avg_goal = mean(Goals.Scored))
## Avg_goal
## 1 121.3333
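This average is the total number of goals divided by the number of tournaments. A quick arithmetic check, using only the two figures printed in the outputs above (the variable names here are illustrative):

```r
# Figures copied from the outputs above
total_goal <- 2548        # total goals across all tournaments
number_of_edition <- 21   # number of tournaments

avg_goal <- total_goal / number_of_edition
round(avg_goal, 4)
```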
List of World Cup winners
world_cup_v2 %>%
group_by(Winner) %>%
drop_na(Goals.Scored) %>%
summarise(number_of_winners = n()) %>%
arrange(desc(number_of_winners))
## # A tibble: 8 × 2
## Winner number_of_winners
## <chr> <int>
## 1 Brazil 5
## 2 Germany 4
## 3 Italy 4
## 4 Argentina 2
## 5 France 2
## 6 Uruguay 2
## 7 England 1
## 8 Spain 1
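The `group_by()` + `summarise(n())` + `arrange(desc())` pattern used for these tallies can be written more compactly with dplyr's `count()`. A minimal sketch on made-up data (`toy_winners` and the column name `titles` are hypothetical):

```r
library(dplyr)

# Hypothetical list of winners, one row per tournament
toy_winners <- data.frame(Winner = c("A", "B", "A", "C", "A", "B"))

# count() groups, tallies and (with sort = TRUE) orders in one call
wins <- toy_winners %>%
  count(Winner, sort = TRUE, name = "titles")

wins
```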
List of World Cup runners-up
world_cup_v2 %>%
group_by(Runners.Up) %>%
drop_na(Goals.Scored) %>%
summarise(Runners_up = n()) %>%
arrange(desc(Runners_up))
## # A tibble: 11 × 2
## Runners.Up Runners_up
## <chr> <int>
## 1 Argentina 3
## 2 Germany FR 3
## 3 Netherlands 3
## 4 Brazil 2
## 5 Czechoslovakia 2
## 6 Hungary 2
## 7 Italy 2
## 8 Croatia 1
## 9 France 1
## 10 Germany 1
## 11 Sweden 1
List of World Cup third-place finishers
world_cup_v2 %>%
group_by(Third) %>%
drop_na(Goals.Scored) %>%
summarise(Third_place = n()) %>%
arrange(desc(Third_place))
## # A tibble: 15 × 2
## Third Third_place
## <chr> <int>
## 1 Germany 3
## 2 Brazil 2
## 3 France 2
## 4 Poland 2
## 5 Sweden 2
## 6 Austria 1
## 7 Belgium 1
## 8 Chile 1
## 9 Croatia 1
## 10 Germany FR 1
## 11 Italy 1
## 12 Netherlands 1
## 13 Portugal 1
## 14 Turkey 1
## 15 USA 1
Matches played and the number of tournament goals
world_cup %>%
  group_by(Year, Winner, Matches.Played) %>%
  drop_na(Goals.Scored) %>%
  summarise(num_of_goals = sum(Goals.Scored)) %>%
  # colour (not fill) is the aesthetic that the default point shape uses
  ggplot(aes(x = Matches.Played, y = num_of_goals, colour = Year, size = num_of_goals)) +
  geom_point() +
  labs(title = "Number of goals scored in each FIFA World Cup by number of matches played",
       subtitle = "Total goals against total matches per tournament",
       x = "Total number of matches played", y = "Total number of goals scored")
## `summarise()` has grouped output by 'Year', 'Winner'. You can override using
## the `.groups` argument.
The graph shows that as the tournament has expanded, with more participating teams and therefore more matches, the total number of goals scored per tournament has gradually risen over the years.
Number of games played in the World Cup by each participating nation (top 20)
total_teams %>%
  group_by(Home.Team) %>%
  summarise(num_of_app = n()) %>%
  arrange(desc(num_of_app)) %>%
  head(20) %>%
  # reorder() sorts the bars by games played; arrange() alone does not set bar order
  ggplot(aes(x = reorder(Home.Team, num_of_app), y = num_of_app)) +
  geom_col() +
  coord_flip() +
  # x and y name the aesthetics as mapped before coord_flip()
  labs(x = "Participating nations", y = "Number of games played",
       title = "Footballing nations and their total World Cup games")
List of FIFA World Cup winners
world_cup_v2 %>%
group_by(Winner) %>%
drop_na(Goals.Scored) %>%
summarise(number_of_winnings = n()) %>%
ggplot(aes(x = Winner, y = number_of_winnings))+
geom_col(fill = "green")+
labs(y = "Number of titles", x = "Winning nations")
The Brazil national team has won the most FIFA World Cup titles, with
a record five trophies. Germany and Italy are joint second with four
trophies each, while Spain and England have one trophy each.
List of World Cup runners-up
world_cup_v2 %>%
group_by(Runners.Up) %>%
drop_na(Goals.Scored) %>%
summarise(Runners_up = n()) %>%
arrange(desc(Runners_up)) %>%
ggplot(aes(x =Runners.Up, y =Runners_up ))+
geom_col(fill = "red")+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
labs(y= "Number of runners up", x = "Runners up Nations")