Import data
library(readxl)
data <- read_excel("../00_data/Data.xlsx")
## New names:
## • `` -> `...11`
## • `` -> `...12`
## • `` -> `...13`
## • `` -> `...14`
data
## # A tibble: 10,846 × 14
## team `Team City` Population team_name year total home away week
## <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 1
## 2 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 2
## 3 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 3
## 4 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 4
## 5 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 5
## 6 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 6
## 7 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 7
## 8 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 8
## 9 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 9
## 10 Arizona Phoenix 1608139 Cardinals 2000 893926 387475 506451 10
## # ℹ 10,836 more rows
## # ℹ 5 more variables: weekly_attendance <chr>, ...11 <lgl>, ...12 <chr>,
## # ...13 <lgl>, ...14 <dbl>
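The printout above shows `weekly_attendance` read as character (`<chr>`), which would block numeric summaries and histograms of that column. A minimal sketch of one way to convert it, using a small made-up tibble (`toy`, with invented values) in place of `data`. It assumes the column came in as text because some cells hold a non-numeric placeholder such as "NA"; the output above does not confirm that.

```r
library(dplyr)

# Toy stand-in for `data`; the values are invented for illustration only.
toy <- tibble(
  team_name         = c("Cardinals", "Cardinals", "Cardinals"),
  week              = c(1, 2, 3),
  weekly_attendance = c("77434", "NA", "66009")  # assumed placeholder "NA"
)

# Convert the text column to numbers; entries that cannot be parsed become NA.
toy_clean <- toy %>%
  mutate(weekly_attendance = suppressWarnings(as.numeric(weekly_attendance)))

toy_clean
```

Applied to the real table, the same `mutate()` call would go right after the import, before any plot that uses `weekly_attendance`.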
Introduction
Questions
Variation
Visualizing distributions
ggplot(data = data) +
  geom_bar(mapping = aes(x = team_name)) +
  coord_flip()

ggplot(data = data) +
  geom_histogram(mapping = aes(x = total), binwidth = 2000)

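In the rows printed under Import data, `total` is a season-level figure repeated on every weekly row, so a histogram built on the full table counts each team-season once per week. A sketch of one way to count each season once, keeping one row per team and year before plotting. The tibble `toy` and its values are invented for illustration, and the `binwidth` is chosen for the toy data only.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `data`: season totals repeated on each weekly row (invented values).
toy <- tibble(
  team_name = rep(c("Cardinals", "Falcons"), each = 3),
  year      = 2000,
  week      = rep(1:3, times = 2),
  total     = rep(c(893926, 964579), each = 3)
)

# Keep one row per team-season so each total is counted once.
season_totals <- toy %>%
  distinct(team_name, year, total)

p <- ggplot(data = season_totals) +
  geom_histogram(mapping = aes(x = total), binwidth = 50000)
p
```

Whether to deduplicate depends on the question: the weekly rows are the right unit for `weekly_attendance`, while `total`, `home`, and `away` appear to describe whole seasons.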
ggplot(data = data, mapping = aes(x = total, colour = team)) +
  geom_freqpoly()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
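The message above appears because no bin width was given, so `stat_bin()` fell back to 30 bins. A minimal sketch of passing `binwidth` to `geom_freqpoly()` explicitly. The data frame `toy` and its values are invented, and 50000 is only an illustrative width that would need tuning against the real range of `total`.

```r
library(ggplot2)

# Toy stand-in for `data` (invented values); `team` and `total` mirror the real column names.
toy <- data.frame(
  team  = rep(c("Arizona", "Atlanta"), each = 4),
  total = c(893926, 910000, 955000, 1002000,
            964579, 980000, 1010000, 1040000)
)

# An explicit binwidth replaces the default of 30 bins, so no message is printed.
p <- ggplot(data = toy, mapping = aes(x = total, colour = team)) +
  geom_freqpoly(binwidth = 50000)
p
```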

Typical values
Unusual values
Missing values
Covariation
A categorical and continuous variable
Two categorical variables
Two continuous variables
Patterns and models