Team info

  • Group name: Team CMKRML
  • Group members: Caroline, Katie, Melissa

Purpose

State your research question, a description of the variables you’ll use, and your data sources (please include website links if possible).

Our data cover Tom Brady’s passing statistics from the beginning of the 2014 season through Super Bowl LII. We want to know whether his average yards in a game has any effect on the points scored in that game, taking into account the number of touchdowns in that game. We plan to facet our data by game type (regular season or playoff) and use that as our categorical explanatory variable. Our dataset also includes total yards, QBR, completion percentage, and quarterback rating, which we can use to examine our findings further.

Data sources: Pro Football Reference and ESPN

Load packages and data

  1. Load all necessary packages
  2. Load the dataset, run the clean_names() function from the janitor package, and then select() only the variables you are going to use.
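The two steps above might look like the following minimal sketch. The raw header spellings (for example "Home/Away" and "Total Yards") and the inline `raw_csv` text are assumptions standing in for the real export, whose file name and column headers are not given here; the six rows are the ones shown in the example output.

```r
library(dplyr)
library(janitor)

# Hypothetical raw export: the header spellings are assumed, not taken from
# the real file. In practice this text would come from a file on disk.
raw_csv <- "Date,Opponent,Home/Away,Win/Loss,Score,Playoff/Reg,Total Yards,Avg Yards,Comp Per,TD,QBR,Rate,Sk
2018-02-05,PHI,N/A,L,33-41,Playoff,505,10.52,0.583,3,83.8,115.4,1
2018-01-21,JAX,Home,W,24-20,Playoff,290,7.63,0.684,2,69.0,108.4,3
2018-01-13,TEN,Home,W,35-14,Playoff,337,6.36,0.660,3,82.9,102.5,0
2017-12-31,NYJ,Home,W,26-6,Regular,190,5.14,0.486,2,47.8,82.0,2
2017-12-24,BUF,Home,W,37-16,Regular,224,8.00,0.750,2,57.3,106.8,2
2017-12-17,PIT,Away,W,27-24,Regular,298,8.51,0.629,1,80.9,87.6,2"

# check.names = FALSE keeps the messy headers so clean_names() has work to do
brady_raw <- read.csv(text = raw_csv, check.names = FALSE,
                      stringsAsFactors = FALSE)

brady <- brady_raw %>%
  clean_names() %>%   # lower-case snake_case column names
  select(date, opponent, home_away, win_loss, score, playoff_reg,
         total_yards, avg_yards, comp_per, td, qbr, rate, sk)
```

Reading from a real file would replace the `read.csv(text = ...)` call with a call that points at the team's own data file.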

Example:

date        opponent  home_away  win_loss  score  playoff_reg  total_yards  avg_yards  comp_per  td  qbr   rate   sk
2018-02-05  PHI       N/A        L         33-41  Playoff      505          10.52      0.583     3   83.8  115.4  1
2018-01-21  JAX       Home       W         24-20  Playoff      290          7.63       0.684     2   69.0  108.4  3
2018-01-13  TEN       Home       W         35-14  Playoff      337          6.36       0.660     3   82.9  102.5  0
2017-12-31  NYJ       Home       W         26-6   Regular      190          5.14       0.486     2   47.8  82.0   2
2017-12-24  BUF       Home       W         37-16  Regular      224          8.00       0.750     2   57.3  106.8  2
2017-12-17  PIT       Away       W         27-24  Regular      298          8.51       0.629     1   80.9  87.6   2
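Because the research question concerns points scored, while the table stores the result as a single score string, one possible way to derive a numeric points column is sketched below. It assumes the first number in score is the points for Brady's team, which is consistent with the win_loss column in the six rows shown; the names `points_for`, `points_against`, and `consistent` are hypothetical.

```r
library(dplyr)
library(tidyr)

# The six example rows, limited to the columns needed here
brady <- data.frame(
  date     = c("2018-02-05", "2018-01-21", "2018-01-13",
               "2017-12-31", "2017-12-24", "2017-12-17"),
  win_loss = c("L", "W", "W", "W", "W", "W"),
  score    = c("33-41", "24-20", "35-14", "26-6", "37-16", "27-24"),
  stringsAsFactors = FALSE
)

brady_points <- brady %>%
  # Split "33-41" into two integer columns; keep the original score column
  separate(score, into = c("points_for", "points_against"),
           sep = "-", convert = TRUE, remove = FALSE) %>%
  # Sanity check of the assumption: a win should mean more points for
  mutate(consistent = (points_for > points_against) == (win_loss == "W"))
```

If any row had `consistent` equal to FALSE, the assumption about the order of the two numbers would need to be revisited.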

Create EDA visualizations

Create “exploratory data analysis” visualizations of your data. These are preliminary and can change for the submission; the only requirement is that your visualizations use each of the measurement variables included in your dataset, to test whether they work.
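A first EDA plot for the research question could look like the sketch below: average yards against points scored, colored by touchdowns and faceted by game type. It uses only the six example rows, and `points_for` is a hypothetical column holding the first number of score (assumed to be the points for Brady's team), so this is a starting point rather than a final figure.

```r
library(ggplot2)

# Six example rows; points_for is derived from score under the assumption
# that the first number is the points scored by Brady's team
brady <- data.frame(
  playoff_reg = c("Playoff", "Playoff", "Playoff",
                  "Regular", "Regular", "Regular"),
  points_for  = c(33, 24, 35, 26, 37, 27),
  avg_yards   = c(10.52, 7.63, 6.36, 5.14, 8.00, 8.51),
  td          = c(3, 2, 3, 2, 2, 1)
)

eda_plot <- ggplot(brady, aes(x = avg_yards, y = points_for,
                              color = factor(td))) +
  geom_point(size = 3) +
  facet_wrap(~ playoff_reg) +   # one panel per game type
  labs(x = "Average yards", y = "Points scored",
       color = "Touchdowns",
       title = "Points scored vs. average yards, by game type")

eda_plot
```

The same pattern can be repeated with total_yards, comp_per, qbr, or rate on the x-axis to check that each measurement variable plots as expected.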