DSLabs Homework

Author

Ryan Seabold

Import libraries

library(tidyverse)
library(dslabs)

Get data

data <- brexit_polls

Make plots comparing the poll start/end date

# Create the plot
plot1 <- data |>
  ggplot(aes(x = startdate, y = leave)) +
  geom_point() +
  geom_smooth() +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll Start Date",
       x = "Poll Start Date",
       y = "Leave Percentage")

# View the plot
plot1
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Make the same plot, but using end dates, to ensure they are consistent
# Create the plot
plot2 <- data |>
  ggplot(aes(x = enddate, y = leave)) +
  geom_point() +
  geom_smooth() +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll End Date",
       x = "Poll End Date",
       y = "Leave Percentage")

# View the plot
plot2
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Make the same plots, but color the points by pollster

# Create the plot
plot3 <- data |>
  ggplot(aes(x = startdate, y = leave, color = pollster)) +
  geom_point() +
  geom_smooth(aes(color = NULL)) +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll Start Date",
       x = "Poll Start Date",
       y = "Leave Percentage",
       color = "Pollster")

# View the plot
plot3
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Make the same plot, but using end dates, to ensure they are consistent
# Create the plot
plot4 <- data |>
  ggplot(aes(x = enddate, y = leave, color = pollster)) +
  geom_point() +
  geom_smooth(aes(color = NULL)) +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll End Date",
       x = "Poll End Date",
       y = "Leave Percentage",
       color = "Pollster")

# View the plot
plot4
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

dslabs is an R package that includes many datasets useful for teaching and practice. The dataset I used, brexit_polls, contains public opinion data on Brexit from polls conducted by various organizations from January to June 2016.
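A quick way to confirm what the dataset covers is to inspect its columns and date range before plotting. The sketch below assumes the dslabs and dplyr packages are installed; the name `overview` is only illustrative and does not appear in the analysis above.

```r
library(dplyr)
library(dslabs)

# Column names, types, and a preview of the first few values
glimpse(brexit_polls)

# Earliest poll start date, latest poll end date, and number of polls
overview <- brexit_polls |>
  summarize(first_start = min(startdate),
            last_end = max(enddate),
            n_polls = n())

overview
```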

Plotting the percentage of respondents who answered “leave” against the poll start and end dates shows that public opinion on Brexit was stable from January to May and then began to shift in favor of leaving, though “leave” never reached 50%.
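The trend read off the smoothed curve can be cross-checked numerically. The following is a minimal sketch, not part of the original analysis, that groups polls by the month of their start date and reports the average and maximum “leave” share in each month. Grouping by start date rather than end date is an assumption, and `monthly_leave` is a hypothetical name.

```r
library(dplyr)
library(dslabs)

# Summarize the "leave" share by the calendar month in which each poll started
monthly_leave <- brexit_polls |>
  mutate(month = format(startdate, "%Y-%m")) |>
  group_by(month) |>
  summarize(n_polls = n(),
            mean_leave = mean(leave),
            max_leave = max(leave))

monthly_leave
```

The `mean_leave` column should move with the smoothed line in the plots, while `max_leave` shows whether any individual poll in a month sat well above that line.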

For my third variable, I used the discrete categorical variable “pollster”. Coloring the points by this variable lets us look for patterns related to the pollster, for example whether one organization consistently reported higher or lower results than the others.
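With many pollsters, the colors alone can be hard to compare, so a per-pollster summary may help. The sketch below is one possible approach and was not part of the original analysis: it computes each pollster's average “leave” share and its difference from the overall average. The names `pollster_summary` and `diff_from_overall` are hypothetical.

```r
library(dplyr)
library(dslabs)

# Average "leave" share per pollster, compared with the overall average
pollster_summary <- brexit_polls |>
  group_by(pollster) |>
  summarize(n_polls = n(),
            mean_leave = mean(leave),
            .groups = "drop") |>
  mutate(diff_from_overall = mean_leave - mean(brexit_polls$leave)) |>
  arrange(desc(mean_leave))

pollster_summary
```

These differences are only suggestive. Pollsters that ran few polls, or polled mostly late in the period when “leave” was rising, can look higher or lower for reasons unrelated to their methods.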