State your research question, a description of the variables you’ll use, and your data sources
Our purpose is to investigate if the ratings of Netflix shows have varied over time. We will investigate how IMDB ratings vary across the years 2013-2017. We are using data from Matthew Schroyer on data.world.com. data link if possible).
clean_names() function from the janitor package then select() only the variables you are going to use.Example:
| show | premiere_year | seasons_1 | imdb_rating |
|---|---|---|---|
| 1 | 2014 | 1 | 80 |
| 2 | 2013 | 1 | 89 |
| 3 | 2013 | 1 | 60 |
| 4 | 2014 | 3 | 84 |
| 5 | 2014 | 4 | 85 |
| 6 | 2015 | 3 | 79 |
Create “exploratory data analysis” visualizations of your data. At this point these are preliminary and can change for the submission, but the only requirement is that your visualizations use each of the measurement variables included in your dataset to test out if they work.