State your research question, a description of the variables you’ll use, and your data sources (please include website links if possible).
Our data cover Tom Brady’s passing statistics from the start of the 2014 season through Super Bowl LII. We want to know whether his average yards per pass attempt in a game (avg_yards) has any effect on points scored, taking into account the number of touchdowns he throws in that game (td). We plan to facet our data by game type, regular season versus playoff (playoff_reg), and use that as our categorical explanatory variable. Our dataset also includes other variables (total yards, QBR, completion percentage, and passer rating) that will help us examine our findings further.
Use the clean_names() function from the janitor package, then select() only the variables you are going to use. Example:
| date | opponent | home_away | win_loss | score | playoff_reg | total_yards | avg_yards | comp_per | td | qbr | rate | sk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2018-02-05 | PHI | N/A | L | 33-41 | Playoff | 505 | 10.52 | 0.583 | 3 | 83.8 | 115.4 | 1 |
| 2018-01-21 | JAX | Home | W | 24-20 | Playoff | 290 | 7.63 | 0.684 | 2 | 69.0 | 108.4 | 3 |
| 2018-01-13 | TEN | Home | W | 35-14 | Playoff | 337 | 6.36 | 0.660 | 3 | 82.9 | 102.5 | 0 |
| 2017-12-31 | NYJ | Home | W | 26-6 | Regular | 190 | 5.14 | 0.486 | 2 | 47.8 | 82.0 | 2 |
| 2017-12-24 | BUF | Home | W | 37-16 | Regular | 224 | 8.00 | 0.750 | 2 | 57.3 | 106.8 | 2 |
| 2017-12-17 | PIT | Away | W | 27-24 | Regular | 298 | 8.51 | 0.629 | 1 | 80.9 | 87.6 | 2 |
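
A minimal sketch of the cleaning step that could produce a table like the one above. The raw column names ("Playoff Reg", "Avg Yards", and so on) and the tiny `raw` data frame are assumptions made for illustration; only the two rows of values are copied from the table, and the real game log would be read from its source file instead.

```r
library(janitor)
library(dplyr)

# Hypothetical raw data with messy column names, standing in for the real
# game log. The values are two rows copied from the example table.
raw <- data.frame(
  "Date"        = c("2018-02-05", "2017-12-31"),
  "Playoff Reg" = c("Playoff", "Regular"),
  "Score"       = c("33-41", "26-6"),
  "Avg Yards"   = c(10.52, 5.14),
  "TD"          = c(3, 2),
  "SK"          = c(1, 2),
  check.names = FALSE  # keep the spaces and capitals so clean_names() has work to do
)

brady <- raw %>%
  clean_names() %>%                                   # "Avg Yards" becomes avg_yards, etc.
  select(date, playoff_reg, score, avg_yards, td)     # keep only the variables we plan to use

names(brady)
```

Here `sk` is dropped by `select()` simply to show the step; which columns to keep depends on the final analysis.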
Create “exploratory data analysis” visualizations of your data. At this point these are preliminary and can change for the submission, but the only requirement is that your visualizations use each of the measurement variables included in your dataset to test out if they work.
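
One possible starting point for such a plot is sketched below, using only the six rows shown in the example table. It assumes that the first number in `score` is the points scored by Brady's team, which holds for the rows shown but should be checked against the full data; the derived `points` column is a name introduced here for illustration.

```r
library(ggplot2)

# Six rows copied from the example table; the full data set would replace this.
brady <- data.frame(
  playoff_reg = c("Playoff", "Playoff", "Playoff", "Regular", "Regular", "Regular"),
  score       = c("33-41", "24-20", "35-14", "26-6", "37-16", "27-24"),
  avg_yards   = c(10.52, 7.63, 6.36, 5.14, 8.00, 8.51),
  td          = c(3, 2, 3, 2, 2, 1)
)

# Assumption: the number before the hyphen is the points scored by Brady's team.
brady$points <- as.integer(sub("-.*$", "", brady$score))

# Average yards against points scored, coloured by touchdowns,
# with one panel per game type.
p <- ggplot(brady, aes(x = avg_yards, y = points, color = factor(td))) +
  geom_point(size = 3) +
  facet_wrap(~ playoff_reg) +
  labs(x = "Average yards", y = "Points scored", color = "Touchdowns")

print(p)
```

This covers only avg_yards, td, and the score; the remaining measurement variables (total_yards, comp_per, qbr, rate) would need their own plots to meet the requirement above.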