library(dplyr)
library(readr)
# Load the movies dataset
movies <- read_csv("https://gist.githubusercontent.com/tiangechen/b68782efa49a16edaf07dc2cdaa855ea/raw/0c794a9717f18b094eabab2cd6a6b9a226903577/movies.csv")
Rename the “Film” column to “movie_title” and “Year” to “release_year”.
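A minimal sketch of the renaming step, using dplyr's rename(), which takes new_name = old_name pairs. The two-row tibble is made up for illustration and only mimics the column names mentioned in the task; `toy` and `renamed` are hypothetical names.

```r
library(dplyr)

# Made-up rows; only the column names follow the task text
toy <- tibble(
  Film = c("Movie A", "Movie B"),
  Year = c(2008, 2010)
)

# rename() expects new_name = old_name
renamed <- toy %>%
  rename(movie_title = Film, release_year = Year)

names(renamed)
```

The same call should work on `movies` in place of `toy`, provided the loaded file really has columns named Film and Year.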
Create a new dataframe with only the columns: movie_title, release_year, Genre, Profitability,
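One way the column selection might look with select(). The column list in the task ends with a trailing comma, so this sketch assumes `Rotten Tomatoes %` is also kept, since the later filter needs it. The toy rows, the `Lead Studio` column, and the names `toy` and `selected` are illustrative assumptions.

```r
library(dplyr)

# Made-up rows with already-renamed columns plus one column to be dropped
toy <- tibble(
  movie_title = c("Movie A", "Movie B"),
  release_year = c(2008, 2010),
  Genre = c("Comedy", "Drama"),
  `Lead Studio` = c("Studio X", "Studio Y"),
  Profitability = c(1.5, 2.0),
  `Rotten Tomatoes %` = c(85, 40)
)

# Column names containing spaces or symbols must be wrapped in backticks
selected <- toy %>%
  select(movie_title, release_year, Genre, Profitability, `Rotten Tomatoes %`)
```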
Filter the dataset to include only movies released after 2000 with a Rotten Tomatoes % higher than 80.
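A short sketch of the filtering step. Conditions separated by commas inside filter() are combined with AND. The three rows below are invented so that exactly one of them passes both conditions; `toy` and `good_recent` are hypothetical names.

```r
library(dplyr)

# Made-up rows: one passes, one is too old, one is rated too low
toy <- tibble(
  movie_title = c("Movie A", "Movie B", "Movie C"),
  release_year = c(2008, 1999, 2010),
  `Rotten Tomatoes %` = c(85, 90, 40)
)

# Keep rows released after 2000 AND rated above 80
good_recent <- toy %>%
  filter(release_year > 2000, `Rotten Tomatoes %` > 80)
```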
Add a new column called “Profitability_millions” that converts the Profitability to millions of dollars.
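A possible sketch of the new column with mutate(). The task does not state the unit of Profitability, so this assumes it is stored in plain dollars and divides by one million; if the real column uses another unit, the factor would need to change. The toy values and the names `toy`, `dollars_per_million`, and `with_millions` are assumptions.

```r
library(dplyr)

# Made-up values, assumed to be in plain dollars
toy <- tibble(
  movie_title = c("Movie A", "Movie B"),
  Profitability = c(2500000, 500000)
)

# Assumed conversion factor: dollars -> millions of dollars
dollars_per_million <- 1e6

with_millions <- toy %>%
  mutate(Profitability_millions = Profitability / dollars_per_million)
```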
Sort the filtered dataset by Rotten Tomatoes % in descending order, and then by Profitability in descending order.
five <- four %>%
  arrange(desc(`Rotten Tomatoes %`), desc(Profitability_millions))
Use the pipe operator (%>%) to chain these operations together, starting with the original dataset and ending with a final dataframe that incorporates all the above transformations.
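The chained version could be sketched as follows, run here on a small made-up tibble so it works without downloading the file. It assumes the column names Film, Year, Genre, Profitability, and `Rotten Tomatoes %`, and that Profitability is in plain dollars; `toy` and `final` are hypothetical names.

```r
library(dplyr)

# Made-up rows that only mimic the assumed column names
toy <- tibble(
  Film = c("Movie A", "Movie B", "Movie C", "Movie D"),
  Year = c(2008, 2010, 1999, 2009),
  Genre = c("Comedy", "Drama", "Drama", "Comedy"),
  Profitability = c(2500000, 500000, 900000, 1200000),
  `Rotten Tomatoes %` = c(85, 92, 95, 60)
)

final <- toy %>%
  rename(movie_title = Film, release_year = Year) %>%
  select(movie_title, release_year, Genre, Profitability, `Rotten Tomatoes %`) %>%
  filter(release_year > 2000, `Rotten Tomatoes %` > 80) %>%
  mutate(Profitability_millions = Profitability / 1e6) %>%  # assumed unit: dollars
  arrange(desc(`Rotten Tomatoes %`), desc(Profitability_millions))
```

Each step receives the output of the previous one, so no intermediate dataframes are needed. Replacing `toy` with `movies` should apply the same pipeline to the loaded data if the assumptions about its columns hold.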
From the resulting data, are the best movies the most popular?
Create a summary dataframe that shows the average rating and Profitability_millions for movies by Genre. Hint: You’ll need to use group_by() and summarize().
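A minimal sketch of the per-genre summary, assuming "rating" refers to the `Rotten Tomatoes %` column. The rows are invented, and `toy`, `genre_summary`, `avg_rating`, and `avg_profitability_millions` are hypothetical names.

```r
library(dplyr)

# Made-up rows standing in for the final dataframe
toy <- tibble(
  Genre = c("Comedy", "Comedy", "Drama"),
  `Rotten Tomatoes %` = c(80, 90, 70),
  Profitability_millions = c(1, 3, 2)
)

genre_summary <- toy %>%
  group_by(Genre) %>%
  summarize(
    avg_rating = mean(`Rotten Tomatoes %`, na.rm = TRUE),
    avg_profitability_millions = mean(Profitability_millions, na.rm = TRUE),
    .groups = "drop"  # return an ungrouped dataframe
  )
```

Setting na.rm = TRUE keeps a single missing value from turning a whole genre's average into NA.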