Import Data

library(tidyverse)

data <- read_csv("00_data/myData.csv")
## Rows: 900 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (11): country, city, stage, home_team, away_team, outcome, win_conditio...
## dbl   (3): year, home_score, away_score
## date  (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Explanation of Data and Variables

The dataset I chose is the FIFA World Cup data from TidyTuesday. It covers every World Cup game, starting with the first tournament in 1930, and has 900 rows and 15 variables. The variables I will use to answer my question are stage, winning_team, and outcome.

Question

Which countries/teams usually make it farthest in the World Cup tournament? Also, does being the home team rather than the away team give an advantage?

Analyzing Data

# Scatter plot of the teams that have won a game at each stage of the tournament
ggplot(data = data, mapping = aes(x = stage, y = winning_team)) + 
  geom_point()

# Bar chart of how many games ended in a home win, an away win, or a draw
ggplot(data) + 
  geom_bar(mapping = aes(x = outcome))

# Count plot of how many games at each stage ended in each outcome
ggplot(data = data) +
  geom_count(mapping = aes(x = outcome, y = stage))

Conclusions on Data

The first scatter plot shows which teams have won a game at each stage of the tournament. Argentina, Brazil, England, France, Italy, Spain, Uruguay, and Germany are the only countries/teams that have won a final since the tournament began in 1930, so they are usually among the favorites whenever the tournament comes around. Because the plot uses winning_team, it shows only the winner of each game: teams that reached a final and lost it do not appear in the finals row.
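The list of final winners can also be checked with a count instead of being read off the plot. The sketch below runs on a small made-up tibble (`toy`, with invented team names, not real results) so that it is self-contained. It assumes the final is labeled "Final" in stage, which is worth confirming with `unique(data$stage)` before running the same pipeline on `data`.

```r
library(dplyr)

# Hypothetical stand-in for `data`: invented rows, not real match results
toy <- tibble(
  stage        = c("Group", "Final", "Final", "Semifinals", "Final"),
  winning_team = c("Team A", "Team B", "Team A", "Team C", "Team B")
)

# Keep only finals, then count wins per team, most frequent first
finals_won <- toy %>%
  filter(stage == "Final") %>%   # "Final" is an assumed label
  count(winning_team, sort = TRUE)

finals_won
```

Replacing `toy` with `data` should give one row per team that has won a final, along with how many times it has done so.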

The bar chart shows that being the home team rather than the away team is an advantage: the home team has around 425 wins and the away team around 300, which is roughly 47% versus 33% of the 900 games, with the rest ending in draws.

The count plot shows that draws happen in the group stages, where teams are ranked on a points system and a game does not need a winner. From the third-place game, semi-finals, and final onward there are no draws, since each of those games must produce a winner and a loser.
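The counts behind the bar chart and the count plot can be printed as tables, which makes the home/away comparison exact rather than estimated from bar heights. This is a minimal sketch on a made-up tibble (`toy`, invented rows); the outcome codes "H", "A", and "D" are an assumption and should be checked against `unique(data$outcome)`.

```r
library(dplyr)

# Hypothetical stand-in for `data`: invented rows; "H"/"A"/"D" codes are assumed
toy <- tibble(
  stage   = c("Group", "Group", "Group", "Final", "Final"),
  outcome = c("H", "D", "A", "H", "H")
)

# Games per outcome, plus each outcome's share of all games
outcome_share <- toy %>%
  count(outcome) %>%
  mutate(share = n / sum(n))

# Games per stage and outcome: the numbers that geom_count() draws as point sizes
by_stage <- toy %>%
  count(stage, outcome)

outcome_share
by_stage
```

Run on `data`, the first table gives the exact home, away, and draw totals, and the second shows at which stages draws occur.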