State your research question, a description of the variables you’ll use, and your data sources (please include website links if possible).
We will try to understand what affects the number of followers on Instagram for the most popular accounts. For example, does the number of posts relate to the number of followers? What kind of account draws the most followers?
Our data is from Data.world, which derives its data from Iconosquare. The data was collected on December 26, 2016.
The variables we will be using are brand, categories_1, media_posted, and num.
Use the clean_names() function from the janitor package, then select() only the variables you are going to use.

| brand | categories_1 | num | media_posted |
|---|---|---|---|
| Selena Gomez | celebrities | 105.4 | 1200 |
| Taylor Swift | celebrities | 95.2 | 958 |
| Ariana Grande | celebrities | 92.3 | 2800 |
| Beyonce | celebrities | 90.6 | 1400 |
| Kim Kardashian West | celebrities | 89.3 | 3600 |
| Cristiano Ronaldo | celebrities | 85.1 | 1600 |
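
A minimal sketch of the cleaning step that could produce the table above. The six rows are copied from that table. The raw column names (`Brand`, `Categories 1`, `Num`, `Media Posted`) are hypothetical stand-ins, since the original file's headers are not shown here. In practice the data would be read from the downloaded file rather than typed in.

```r
library(dplyr)
library(janitor)

# Six rows copied from the table above.
# ASSUMPTION: the raw column names below are illustrative only; the real
# file's headers may differ.
raw <- tibble(
  `Brand` = c("Selena Gomez", "Taylor Swift", "Ariana Grande",
              "Beyonce", "Kim Kardashian West", "Cristiano Ronaldo"),
  `Categories 1` = rep("celebrities", 6),
  `Num` = c(105.4, 95.2, 92.3, 90.6, 89.3, 85.1),
  `Media Posted` = c(1200, 958, 2800, 1400, 3600, 1600)
)

instagram <- raw %>%
  clean_names() %>%                                # convert names to snake_case
  select(brand, categories_1, num, media_posted)   # keep only the variables used

head(instagram)
```

Listing the columns explicitly in `select()` keeps the analysis limited to the four variables named earlier, even if the source file carries extra fields.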
Create “exploratory data analysis” visualizations of your data. At this point these are preliminary and can change for the submission, but the only requirement is that your visualizations use each of the measurement variables included in your dataset to test out if they work.
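
One possible preliminary plot, sketched with ggplot2 on the six rows shown in the table. It places `media_posted` on the x-axis and `num` on the y-axis, colors points by `categories_1`, and labels each point with `brand`, so all four variables appear at once. The choice of a scatterplot and the styling values are assumptions for illustration, not the final design.

```r
library(ggplot2)

# Six rows copied from the table in this proposal.
instagram <- data.frame(
  brand = c("Selena Gomez", "Taylor Swift", "Ariana Grande",
            "Beyonce", "Kim Kardashian West", "Cristiano Ronaldo"),
  categories_1 = rep("celebrities", 6),
  num = c(105.4, 95.2, 92.3, 90.6, 89.3, 85.1),
  media_posted = c(1200, 958, 2800, 1400, 3600, 1600)
)

# Scatterplot: posts vs. num, colored by category, labeled by brand.
p <- ggplot(instagram, aes(x = media_posted, y = num, color = categories_1)) +
  geom_point() +
  geom_text(aes(label = brand), vjust = -0.8, size = 3, show.legend = FALSE) +
  labs(x = "media_posted", y = "num", color = "categories_1")

print(p)
```

With the full dataset, the same code would show whether accounts that post more tend to have higher `num` values, and whether that pattern differs by category.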