A couple of exercises with dplyr and lubridate.

Here is what I’d like to accomplish with today’s quick exercise.

First, we’re going to load the data: a tibble of daily US box office returns for each film, from January 1st 2000 to September 24th 2023.

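library(tidyverse)  # dplyr, tidyr, tibble, and friends
library(magrittr)   # the %>% pipe
library(lubridate)  # date helpers such as wday() and year()

# Read the box office tibble straight from GitHub
t1 <- "https://github.com/thomasjwood/code_lab/raw/main/data/box_office_jan_00_sep_23.rds" %>%
  url %>%
  readRDS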

1. How have day-of-the-week box office returns changed, by year?

Generate a new variable, wday, which indicates the day of the week for each row. Report the mean inflation-adjusted box office (currently in the rev_adj variable) for each year and day of the week, and then print a summary table where the columns are the days of the week and the rows are the years.
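One possible shape for this, sketched on a small made-up tibble rather than on t1 itself. The column name `date` is an assumption (only `rev_adj` is named in the exercise), so swap in whatever the date column in t1 is actually called.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy stand-in for t1. `date` is an assumed column name; `rev_adj` comes from the exercise.
toy <- tibble(
  date    = as.Date(c("2000-01-07", "2000-01-08", "2000-01-14",
                      "2001-01-05", "2001-01-06")),
  rev_adj = c(100, 300, 200, 50, 150)
)

by_wday <- toy %>%
  mutate(
    yr   = year(date),
    wday = wday(date, label = TRUE)  # labelled day of the week (locale-dependent names)
  ) %>%
  group_by(yr, wday) %>%
  summarise(rev_adj = mean(rev_adj), .groups = "drop") %>%
  # one row per year, one column per day of the week
  pivot_wider(names_from = wday, values_from = rev_adj)

by_wday
```

Days that never occur in a given year would show up as NA in that year's row, which should not matter much for daily data covering every day of the week.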

2. What films had the highest grossing opening weekends, by year?

Estimate films’ adjusted box office for their first weekend in release. By year, report the film with the highest grossing opening weekend.
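A minimal sketch of one way to approach this, again on made-up data. It assumes a weekend means Friday through Sunday, that a film's opening weekend is the earliest such weekend in which it has any rows, and that the film and date columns are called `title` and `date` (both names are hypothetical).

```r
library(dplyr)
library(tibble)
library(lubridate)

# Toy stand-in for t1. `title` and `date` are assumed column names.
toy <- tribble(
  ~title,   ~date,        ~rev_adj,
  "Film A", "2000-01-07", 10,   # Friday
  "Film A", "2000-01-08", 20,
  "Film A", "2000-01-09", 15,
  "Film A", "2000-01-10",  3,   # Monday, not part of a weekend
  "Film A", "2000-01-14",  5,   # second weekend
  "Film B", "2000-02-04", 30,
  "Film B", "2000-02-05", 40,
  "Film B", "2000-02-06", 20,
  "Film C", "2001-01-05",  8,
  "Film C", "2001-01-06",  9
) %>%
  mutate(date = as.Date(date))

opening <- toy %>%
  # with week_start = 1, Monday is 1, so 5, 6, 7 are Fri, Sat, Sun
  filter(wday(date, week_start = 1) >= 5) %>%
  # label each weekend by the Friday that starts it
  mutate(wkend = floor_date(date, "week", week_start = 5)) %>%
  group_by(title, wkend) %>%
  summarise(wkend_gross = sum(rev_adj), .groups = "drop") %>%
  # keep each film's earliest weekend only
  group_by(title) %>%
  slice_min(wkend, n = 1) %>%
  ungroup()

top_openers <- opening %>%
  group_by(yr = year(wkend)) %>%
  slice_max(wkend_gross, n = 1) %>%
  ungroup()

top_openers
```

Films that open midweek are treated here as opening on the following Friday, which is a simplification; their Wednesday and Thursday grosses are ignored.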

3. Which films had legs?

For each film, report the adjusted weekend gross, and the change between its first and second weekend gross. (In industry parlance, a film which maintains its box office, or even grows it, is said to have legs.) By year, report the film whose gross increased the most in its second weekend.
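A sketch of the second-weekend comparison under the same assumptions as before: weekends run Friday through Sunday, the column names `title` and `date` are hypothetical, and "change" is taken to be the second weekend gross minus the first. Films with no second weekend in the data are dropped.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Toy stand-in for t1. `title` and `date` are assumed column names.
toy <- tribble(
  ~title,   ~date,        ~rev_adj,
  "Film A", "2000-01-07", 10,
  "Film A", "2000-01-08", 20,
  "Film A", "2000-01-14", 12,
  "Film A", "2000-01-15", 25,
  "Film B", "2000-02-04", 40,
  "Film B", "2000-02-05", 50,
  "Film B", "2000-02-11", 20,
  "Film B", "2000-02-12", 30,
  "Film C", "2001-01-05",  8,
  "Film C", "2001-01-06",  9,
  "Film C", "2001-01-12",  5,
  "Film C", "2001-01-13",  6,
  "Film D", "2001-01-05", 20,   # only one weekend in the data
  "Film D", "2001-01-06", 20
) %>%
  mutate(date = as.Date(date))

wkends <- toy %>%
  filter(wday(date, week_start = 1) >= 5) %>%                    # Fri, Sat, Sun
  mutate(wkend = floor_date(date, "week", week_start = 5)) %>%   # Friday of each weekend
  group_by(title, wkend) %>%
  summarise(wkend_gross = sum(rev_adj), .groups = "drop") %>%
  group_by(title) %>%
  # weekend number counted in calendar weeks since the film's first weekend
  mutate(wk_no = as.integer(wkend - min(wkend)) %/% 7L + 1L) %>%
  ungroup()

legs <- wkends %>%
  filter(wk_no <= 2) %>%
  group_by(title) %>%
  filter(n() == 2) %>%   # drop films without a second weekend
  summarise(
    yr           = year(min(wkend)),
    first_wkend  = wkend_gross[wk_no == 1],
    second_wkend = wkend_gross[wk_no == 2],
    change       = second_wkend - first_wkend
  )

best_legs <- legs %>%
  group_by(yr) %>%
  slice_max(change, n = 1) %>%
  ungroup()

best_legs
```

Using a percentage change instead of the raw difference would be a reasonable alternative, since a raw difference favours films with large opening weekends; the exercise wording ("the amount changed") suggests the raw difference, so that is what the sketch uses.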