This is an extension of the TidyTuesday assignment you have already done. Answer the questions below using the screencast you chose for that assignment.
library(tidyverse)
theme_set(theme_light())
horror_movies_raw <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-22/horror_movies.csv")
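horror_movies <- horror_movies_raw %>%
  arrange(desc(review_rating)) %>%
  # Pull the four-digit release year out of the title, e.g. "Title (2017)"
  extract(title, "year", "\\((\\d\\d\\d\\d)\\)$", remove = FALSE, convert = TRUE) %>%
  # Convert budget strings to numbers
  mutate(budget = parse_number(budget)) %>%
  # Split plot into director, cast sentence, and the remaining plot text
  separate(plot, c("director", "cast_sentence", "plot"), extra = "merge", sep = "\\. ", fill = "right") %>%
  # Keep one row per title (the highest-rated one, given the sort above)
  distinct(title, .keep_all = TRUE)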
The graph plots each movie's review rating against its budget, one point per movie. The two variables are the budget (x-axis, on a log scale) and the review rating (y-axis).
Hint: One graph of your choice.
horror_movies %>%
  ggplot(aes(budget, review_rating)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  geom_smooth(method = "lm")
David wants to know whether higher-budget movies end up with higher ratings. To find out, he draws a scatterplot of review rating against budget (on a log scale) and adds a linear trend line. He found that budget has almost no relationship with rating: low-budget movies are rated about the same as high-budget movies. Although some low-budget movies may not look as polished, they were rated just as highly as movies with better-looking graphics.
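One way to put a number on this reading of the plot is a correlation between the log of budget and the review rating. The following is a minimal sketch, not a step taken from this write-up: the helper name budget_rating_cor is hypothetical, and it assumes a data frame with numeric budget and review_rating columns like horror_movies above. It uses base R only, so it runs without extra packages.

```r
# Hypothetical helper (an illustration, not from the screencast):
# Pearson correlation between log10(budget) and review_rating.
budget_rating_cor <- function(movies) {
  # Keep only rows where both values are usable; log10 needs budget > 0
  usable <- !is.na(movies$budget) & movies$budget > 0 &
    !is.na(movies$review_rating)
  cor(log10(movies$budget[usable]), movies$review_rating[usable])
}

# Usage with the data frame built earlier:
# budget_rating_cor(horror_movies)
```

A value close to 0 would agree with the flat trend line in the plot, while a value closer to 1 or -1 would suggest that budget and rating move together.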