Choose one of David Robinson’s tidytuesday screencasts, watch the video, and summarise. https://www.youtube.com/channel/UCeiiqmVK07qhY-wvg3IZiZQ

Q1 What is the title of the screencast?

Analyzing squirrels in NYC

Q2 When was it published?

November 1, 2019

Q3 Describe the data

Hint: What’s the source of the data; what does the row represent; how many observations?; what are the variables; and what do they mean?

Source: The data is from the NYC Squirrel Census (raw data at the NY Data portal).

Number of observations: 3,023. Each row represents one squirrel sighting.

The variables and their meaning: long (longitude), lat (latitude), unique_squirrel_id (ID tag), hectare (ID tag), shift (AM or PM), date (day and month), hectare_squirrel_number (number within chronological sequence), age (adult or juvenile), primary_fur_color (gray, cinnamon, or black), highlight_fur_color (a single color or a string of several colors among gray, cinnamon, and black), combination_of_primary_and_highlight_color (combination of the previous two), color_notes (commentary), location (ground plane or above ground), above_ground_sighter_measurement (FALSE for squirrels on ground plane), specific_location (commentary on location), running, chasing, climbing, eating, foraging, other_activities, kuks, quaas, moans, tail_flags, tail_twitches, approaches, indifferent, runs_from, other_interactions, lat_long, zip_codes, community_districts, borough_boundaries, city_council_districts, police_precincts. The variables from running onward are largely self-explanatory: they record what the squirrel was doing when it was seen (for example, running or foraging), the sounds it made, what its tail was doing, how it reacted to people, and where it was seen (for example, the city council district).

Q4-Q5 Describe how Dave approached each step of the analysis.

Hint: For example, importing data, understanding the data, data exploration, etc.

He reads through all of the data and goes through what each variable means. He also talks about the difference between the variables that are character and the ones that are logical.
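The character versus logical distinction can be illustrated with a small sketch. It is written in Python rather than the R used in the screencast, and the `column_kind` helper and the sample values are hypothetical, made up only to show the idea: columns such as running hold TRUE/FALSE values, while columns such as age hold text.

```python
def column_kind(values):
    """Return 'logical' if every value is a bool, otherwise 'character'."""
    return "logical" if all(isinstance(v, bool) for v in values) else "character"


# Hypothetical sample values, not taken from the real data set.
columns = {
    "age": ["Adult", "Juvenile", "Adult"],
    "primary_fur_color": ["Gray", "Cinnamon", "Black"],
    "running": [True, False, False],
    "foraging": [False, True, True],
}

# Classify each column by the kind of values it holds.
kinds = {name: column_kind(values) for name, values in columns.items()}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Knowing which columns are logical matters because those are the ones that can be counted or averaged directly, for example to get the share of squirrels seen running.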

Q6 Did you see anything in the video that you learned in class? Discuss in a short paragraph.

About five and a half to six minutes into the video, he creates a scatterplot using the ggplot function, which we learned to do in class. We practiced this on the quizzes with different data sets.

Q7 What is a major finding from the analysis?

Squirrels may be more likely to be gray the farther north they are in the park, and more likely to be cinnamon or black otherwise. This is based on a positive estimate for latitude with a highly significant p-value.

Another finding is that squirrels in the northwest corner of the park are more likely to run away.
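To show what a "positive estimate" means in the first finding, here is a minimal sketch that assumes the estimate comes from a logistic regression of fur color on latitude. It is written in Python rather than R, the `fit_logistic` function is a hypothetical helper, and the ten data points are made up for illustration; they are not from the squirrel census. The sketch only fits the slope and does not compute a p-value.

```python
import math


def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit P(y = 1) = sigmoid(intercept + slope * x) by gradient ascent
    on the average log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
            grad_intercept += y - p
            grad_slope += (y - p) * x
        intercept += lr * grad_intercept / n
        slope += lr * grad_slope / n
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted probability that a squirrel at position x is gray."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Made-up toy data: position from south (0.0) to north (0.9),
# and whether the squirrel seen there was gray (1) or not (0).
north_position = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
is_gray = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = fit_logistic(north_position, is_gray)
print(f"slope estimate: {slope:.2f}")
print(f"P(gray) in the south: {predict(intercept, slope, 0.0):.2f}")
print(f"P(gray) in the north: {predict(intercept, slope, 0.9):.2f}")
```

A slope above zero means the predicted chance of a squirrel being gray rises as you move north, which is the pattern described in the finding.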

Q8 What is the most interesting thing you really liked about the analysis?

What I found really interesting is that he could make inferences without having the concrete facts in front of him. One example wasn’t central to the analysis: he made a scatterplot using ‘long’ and ‘lat’, and from the shape of the points he concluded that the data is likely from Central Park. What’s interesting isn’t that he knew this, but that the outline of the park could be recreated using just those two variables and a scatterplot.

When it comes to the actual analysis, I thought it was interesting that by using the different variables he could infer which color of squirrel was more likely to be seen in certain parts of the park.
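The idea that two coordinate columns are enough to reveal a shape can be sketched without any plotting library. The example below is in Python rather than R and draws a rough text scatterplot instead of a ggplot one; the `text_scatter` helper and the coordinates are hypothetical, chosen only to form a tilted band and not taken from the real data.

```python
def text_scatter(points, width=20, height=10):
    """Render (x, y) points as rows of text, with larger y nearer the top."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each coordinate to a column and row index in the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y_max - y) / (y_max - y_min) * (height - 1))
        grid[row][col] = "*"
    return ["".join(cells) for cells in grid]


# Hypothetical (long, lat) pairs forming a band that runs from
# the southwest to the northeast.
sightings = [
    (-73.98 + 0.003 * i + offset, 40.765 + 0.004 * i)
    for i in range(10)
    for offset in (0.0, 0.004, 0.008)
]

rows = text_scatter(sightings)
for line in rows:
    print(line)
```

Even with only thirty made-up points, the printed stars trace a slanted strip, which is the same kind of clue that gave away the park in the video.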

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.