DSLabs Homework
Import libraries
library(tidyverse)
library(dslabs)
Get data
data <- brexit_polls
Make plots comparing the poll start/end date
# Create the plot
plot1 <- data |>
  ggplot(aes(x = startdate, y = leave)) +
  geom_point() +
  geom_smooth() +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll Start Date",
       x = "Poll Start Date",
       y = "Leave Percentage")
# View the plot
plot1
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Make the same plot, but using end dates, to ensure they are consistent
# Create the plot
plot2 <- data |>
  ggplot(aes(x = enddate, y = leave)) +
  geom_point() +
  geom_smooth() +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll End Date",
       x = "Poll End Date",
       y = "Leave Percentage")
# View the plot
plot2
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
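The start-date and end-date plots look alike only if polls are short relative to the six-month window. A minimal sketch of one way to check this numerically, assuming the `brexit_polls` columns `startdate` and `enddate` are stored as dates; the names `poll_duration` and `duration_days` are introduced here for illustration only:

```r
library(dplyr)
library(dslabs)

# Length of each poll's fieldwork period, in days
poll_duration <- brexit_polls |>
  mutate(duration_days = as.numeric(difftime(enddate, startdate, units = "days")))

# Distribution of poll durations; short durations mean the two
# date axes shift the points only slightly
summary(poll_duration$duration_days)
```

If most durations are only a few days, choosing the start date or the end date for the x-axis should make little difference to the overall trend.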
Make the same plots, but colorizing the dots by pollster
# Create the plot
plot3 <- data |>
  ggplot(aes(x = startdate, y = leave, color = pollster)) +
  geom_point() +
  geom_smooth(aes(color = NULL)) +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll Start Date",
       x = "Poll Start Date",
       y = "Leave Percentage",
       color = "Pollster")
# View the plot
plot3
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Make the same plot, but using end dates, to ensure they are consistent
# Create the plot
plot4 <- data |>
  ggplot(aes(x = enddate, y = leave, color = pollster)) +
  geom_point() +
  geom_smooth(aes(color = NULL)) +
  theme_minimal() +
  labs(title = "Leave Percentage by Poll End Date",
       x = "Poll End Date",
       y = "Leave Percentage",
       color = "Pollster")
# View the plot
plot4
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
dslabs is a package that includes many datasets useful for teaching and practice. The dataset I used, brexit_polls, contains public-opinion data on Brexit from polls conducted by various organizations between January and June of 2016.
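Before plotting, it can help to look at what the dataset contains. The following is a minimal sketch using base R functions only; it assumes nothing beyond the `brexit_polls` columns already used in this report (`startdate`, `enddate`, `pollster`):

```r
library(dslabs)

# Structure of the dataset: one row per poll
str(brexit_polls)

# Date range covered by the polls
range(brexit_polls$startdate)
range(brexit_polls$enddate)

# Number of polls contributed by each pollster
table(brexit_polls$pollster)
```

The pollster counts are worth a glance because organizations with only one or two polls will be hard to judge in the colored plots.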
Plotting the percentage of respondents who answered “leave” against the poll start and end dates shows that public opinion on Brexit was fairly stable from January to May and then began to shift in favor of leaving, though “leave” never reached 50%.
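The trend read off the plots can be cross-checked with a monthly summary. This is a sketch rather than part of the original analysis: the names `monthly_leave`, `mean_leave`, and `max_leave` are introduced here for illustration. Note that `leave` is stored in the dataset as a proportion between 0 and 1, so 50% corresponds to 0.5.

```r
library(dplyr)
library(dslabs)

# Summarize the "leave" share by the month in which each poll started
monthly_leave <- brexit_polls |>
  mutate(month = format(startdate, "%Y-%m")) |>
  group_by(month) |>
  summarize(
    n_polls = n(),
    mean_leave = mean(leave),  # average "leave" share that month
    max_leave = max(leave)     # highest single-poll "leave" share that month
  )

monthly_leave
```

Comparing `mean_leave` across months shows whether the late shift is visible in the averages, and `max_leave` shows how close any individual poll came to 0.5.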
For my third variable, I used the discrete categorical variable “pollster”. Coloring the points by pollster lets us look for patterns tied to the polling organization, for example whether one organization consistently reported higher or lower “leave” results than the others.
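The colored plots can be hard to read when many pollsters overlap, so a table may help. The sketch below is one possible way to compare pollsters, not a definitive house-effect estimate; `pollster_summary`, `deviation`, and `mean_deviation` are illustrative names, and the comparison is against the overall mean across all polls.

```r
library(dplyr)
library(dslabs)

# Compare each pollster's "leave" results with the overall average
pollster_summary <- brexit_polls |>
  mutate(deviation = leave - mean(leave)) |>  # difference from the overall mean
  group_by(pollster) |>
  summarize(
    n_polls = n(),
    mean_leave = mean(leave),
    mean_deviation = mean(deviation)
  ) |>
  arrange(desc(mean_deviation))

pollster_summary
```

This summary ignores timing: a pollster that polled mostly in June may show a higher `mean_leave` simply because opinion had shifted by then, so the table should be read alongside the plots rather than in place of them.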