Import data

# excel file (read_excel() comes from the readxl package)
library(readxl)
data <- read_excel("../00_data/Data.xlsx")
data
## # A tibble: 10,846 × 14
##    team    `Team City` Population team_name  year  total   home   away  week
##    <chr>   <chr>            <dbl> <chr>     <dbl>  <dbl>  <dbl>  <dbl> <dbl>
##  1 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     1
##  2 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     2
##  3 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     3
##  4 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     4
##  5 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     5
##  6 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     6
##  7 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     7
##  8 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     8
##  9 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451     9
## 10 Arizona Phoenix        1608139 Cardinals  2000 893926 387475 506451    10
## # ℹ 10,836 more rows
## # ℹ 5 more variables: weekly_attendance <chr>, ...11 <lgl>, ...12 <chr>,
## #   ...13 <lgl>, ...14 <dbl>
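
The printed summary shows `weekly_attendance` stored as text (`<chr>`) and several unnamed columns (`...11` to `...14`), two of them logical. A minimal cleanup sketch follows, assuming the logical columns are empty leftovers from the spreadsheet; that is an assumption, not something the output confirms. The `toy` data frame and its values are invented stand-ins for the imported data, and `empty_col` is a hypothetical name.

```r
# Toy stand-in for the imported data; all values are invented for illustration.
toy <- data.frame(
  team_name = c("A", "A", "B"),
  weekly_attendance = c("77434", "NA", "66944"),
  empty_col = c(NA, NA, NA),  # hypothetical stray column with no data
  stringsAsFactors = FALSE
)

# Flag columns in which every value is missing, then drop them.
all_missing <- vapply(toy, function(col) all(is.na(col)), logical(1))
cleaned <- toy[, !all_missing, drop = FALSE]

# Convert attendance from text to numbers; the string "NA" becomes a real NA.
cleaned$weekly_attendance <- suppressWarnings(as.numeric(cleaned$weekly_attendance))

str(cleaned)
```

Columns such as `...12` and `...14` hold values in the real file, so they would need to be inspected before dropping anything.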

State one question

Which teams have the highest and lowest populations, and how do the remaining teams rank in between?

Plot data

library(ggplot2)
ggplot(data = data) +
  geom_point(mapping = aes(x = team_name, y = Population)) +
  coord_flip()

Interpret

The Giants and Jets have the highest population in the plot, and the Packers have the lowest. All other teams fall between these two extremes.
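
This reading can also be checked numerically rather than by eye. Because the data repeat each team once per week and year, a rough sketch is to keep one row per team, sort by population, and pick out the extremes. The `toy` data frame below is an invented stand-in with the same two column names as the imported data; its values are not real.

```r
# Toy stand-in: repeated rows per team, as in the imported data (values invented).
toy <- data.frame(
  team_name  = c("A", "A", "B", "C", "D"),
  Population = c(300, 300, 100, 200, 300),
  stringsAsFactors = FALSE
)

# Keep one row per team, then sort from lowest to highest population.
teams <- unique(toy[, c("team_name", "Population")])
teams <- teams[order(teams$Population), ]

# Report every team that ties for the extreme values.
lowest  <- teams$team_name[teams$Population == min(teams$Population)]
highest <- teams$team_name[teams$Population == max(teams$Population)]

print(teams)
```

Comparing against `min()` and `max()` rather than taking the first and last rows keeps ties, which matters when two teams share the same population value. Run on the real data, the sorted `teams` table would list the in-between teams in rank order.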