This is an extension of the tidytuesday assignment you have already done. Complete the questions below, using the screencast you chose for the tidytuesday assignment.
library(tidyverse)
library(ggthemes)
theme_set(theme_light())
wwc_outcomes <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-09/wwc_outcomes.csv")
squads <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-09/squads.csv")
codes <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-09/codes.csv")
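# Add country names, then pair each row with its opponent's score.
# Assuming each game has exactly two rows, rev() swaps the two scores within a game.
outcomes <- wwc_outcomes %>%
  left_join(codes, by = "team") %>%
  group_by(year, yearly_game_id) %>%
  mutate(opposing_score = rev(score)) %>%
  ungroup() %>%
  mutate(won_by = score - opposing_score)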
wwc_outcomes
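The `rev(score)` step above can be checked on a small example. The sketch below uses hypothetical toy rows (the team codes and scores are made up for illustration, not taken from the data) and assumes, as the real pipeline does, that every game contributes exactly two rows.

```r
library(dplyr)

# Hypothetical toy data: two games, one row per team per game
toy <- tibble::tribble(
  ~year, ~yearly_game_id, ~team, ~score,
  2019,  1,               "AAA", 3,
  2019,  1,               "BBB", 1,
  2019,  2,               "CCC", 2,
  2019,  2,               "DDD", 2
)

toy_outcomes <- toy %>%
  group_by(year, yearly_game_id) %>%
  # Within a two-row game, reversing the scores gives each team its opponent's score
  mutate(opposing_score = rev(score)) %>%
  ungroup() %>%
  # Positive = win margin, negative = loss margin, zero = draw
  mutate(won_by = score - opposing_score)

toy_outcomes
```

If a game ever had more or fewer than two rows, `rev()` would no longer line up opponents, so this trick depends on that assumption.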
The data shown describes the top 10 teams in the WWC. It also shows the groups, game streaks, and scores.
Hint: One graph of your choice.
# Of the 3 games each country plays in the "group" round, how much did they win by on average?
avg_group_scores <- outcomes %>%
  filter(round == "Group") %>%
  group_by(year, team) %>%
  summarize(avg_group_score = mean(score),
            avg_group_won_by = mean(won_by)) %>%
  ungroup()
outcomes %>%
  inner_join(avg_group_scores, by = c("year", "team")) %>%
  filter(round == "Final") %>%
  ggplot(aes(country, avg_group_won_by, fill = win_status)) +
  geom_col() +
  facet_wrap(~ year, scales = "free_x") +
  labs(title = "Does performance in the group round predict the winner of the finals?",
       subtitle = "Yes in all years except 2011. (2015 had been tied)",
       y = "Average # of goals the team had won by in the Group round",
       x = "Country",
       fill = "Result")
## What is the story behind the graph?

The story behind the graph is that the team that tends to score more and win in the group round also tends to win the tournament.

## Hide the messages, but display the code and its results on the webpage.
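One possible way to do this, assuming the page is knit from an R Markdown file with knitr, is to set global chunk options in a setup chunk near the top of the document. This is a sketch rather than the required approach; the same options can also be set on individual chunks.

```r
# Assumes knitr is installed and that this runs in a setup chunk of the .Rmd file
knitr::opts_chunk$set(
  echo = TRUE,     # display the code on the webpage
  message = FALSE  # hide messages, e.g. from library(tidyverse) and read_csv()
)
```

Results are shown by default, so no extra option is needed to display them.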