This is an extension of the tidytuesday assignment you have already done. Complete the questions below, using the screencast you chose for the tidytuesday assignment.

Import data

library(tidyverse)
theme_set(theme_light())
# Read the raw TidyTuesday horror movies data
horror_movies_raw <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-22/horror_movies.csv")
horror_movies <- horror_movies_raw %>%
  arrange(desc(review_rating)) %>%
  # Pull the four-digit year out of the trailing "(YYYY)" in the title
  extract(title, "year", "\\((\\d\\d\\d\\d)\\)$", remove = FALSE, convert = TRUE) %>%
  # Convert the budget text to a number
  mutate(budget = parse_number(budget)) %>%
  # Split the plot text into director, cast sentence, and the remaining plot
  separate(plot, c("director", "cast_sentence", "plot"), extra = "merge", sep = "\\. ", fill = "right") %>%
  # Keep one row per title
  distinct(title, .keep_all = TRUE)

Description of the data and definition of variables

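The data comes from the TidyTuesday horror movies dataset (2019-10-22). The graph plots each movie's review rating against its budget, one point per movie. The two variables used are review_rating (the rating the movie received) and budget (the movie's budget, parsed from text into a number).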

Visualize data

Hint: One graph of your choice.

horror_movies %>%
  ggplot(aes(budget, review_rating)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  geom_smooth(method = "lm")

What is the story behind the graph?

David wants to know whether higher-budget movies end up being higher-rated movies. To find out, he puts both variables on one scatterplot, with budget on a log-scaled x-axis and review rating on the y-axis, and adds a linear trend line. He found that budget has nothing to do with how a movie was rated: low-budget movies have just about the same ratings as high-budget movies. Although some low-budget movies did not look as appealing, they were rated just as highly as horror movies with better-looking graphics.
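To put a number on what the trend line shows, one option is to fit the same straight line directly. The sketch below is not from the screencast: `budget_rating_fit` is a hypothetical helper name, and it assumes a data frame with `budget` and `review_rating` columns like `horror_movies` above. Because the plot uses a log10 x-axis, the model is fitted on `log10(budget)`.

```r
# Hypothetical helper (not from the screencast): fit the straight line
# that geom_smooth(method = "lm") draws on a log10 budget axis.
budget_rating_fit <- function(movies) {
  # Keep rows where both values are present and the budget is positive,
  # since log10() is not usable for zero or negative budgets.
  usable <- movies[!is.na(movies$budget) &
                     !is.na(movies$review_rating) &
                     movies$budget > 0, ]
  lm(review_rating ~ log10(budget), data = usable)
}

# Usage, assuming horror_movies from the import step is in the environment:
# fit <- budget_rating_fit(horror_movies)
# coef(fit)               # slope = rating change per tenfold budget increase
# summary(fit)$r.squared  # share of rating variation explained by budget
```

A slope close to zero and a small R-squared would support the reading that budget and rating are essentially unrelated.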

Hide the messages, but display the code and its results on the webpage.
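One way to do this, sketched here on the assumption that the page is knitted from R Markdown with knitr, is to set chunk options once in a setup chunk near the top of the file. The same options can also be set on individual chunks instead.

```r
# Contents of a setup chunk (assumes the document is rendered with knitr).
knitr::opts_chunk$set(
  echo = TRUE,     # display the code on the page
  message = FALSE  # hide messages such as the tidyverse startup text
)
```

Results are shown by default, so no extra option is needed for them. Warnings are controlled separately by the `warning` option.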

Write your name for the author at the top.

Use the correct slug.