dplyr and lubridate.

Two things I’d like to accomplish with today’s quick exercise:

tidy principles.

lubridate, which is super useful for manipulating date data, which come up in stats applications all the time.

lubridate is pedagogically advantageous: as an R programmer you will constantly find circumstances where you need to use a new R library and figure it out from stackoverflow, manuals, vignettes, ChatGPT, etc. Using a new library is a chance to develop your strengths as an independent, resourceful bench scientist.

First we’re going to load this data: it’s a tibble with daily US box office returns, for each separate film, between January 1st 2000 and September 24th 2023.
library(tidyverse)
library(magrittr)
library(lubridate)

# Read the box office tibble straight from GitHub through a URL connection
t1 <- "https://github.com/thomasjwood/code_lab/raw/main/data/box_office_jan_00_sep_23.rds" %>%
  url() %>%
  readRDS()
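Before starting the tasks, it may help to see a few of the lubridate helpers that tend to be useful for this kind of date work. This is only a small sketch on a single hand-typed date, not on the box office data; the variable name `d` is made up for illustration.

```r
library(lubridate)

# Parse an ISO-formatted string into a Date
d <- ymd("2023-09-24")

# Pull out components of the date
year(d)    # 2023
month(d)   # 9

# Day of the week: numeric by default, or a labelled ordered factor
# (the label text depends on your locale)
wday(d, week_start = 7)   # 1, because Sunday is day 1 when weeks start on Sunday
wday(d, label = TRUE)

# Round a date down to the start of its week (here, weeks starting on Monday)
floor_date(d, "week", week_start = 1)   # "2023-09-18"
```

`floor_date()` with a `week_start` argument is a convenient way to give every day in the same week a shared identifier, which is handy when grouping daily rows into weeks or weekends.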
Generate a new variable, wday, which indicates the day of the week for each row. Report the mean inflation-adjusted box office (currently in the rev_adj variable) for each year and day of the week, and then print a summary table where the days are indicated in the columns, and the rows are separated by year.
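One possible shape for this task, sketched on a tiny made-up tibble rather than on t1. Only rev_adj is named in the exercise; the `date` and `title` columns, and the object names `toy` and `by_day`, are assumptions for illustration, so check the real column names with `glimpse(t1)` before adapting it.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy stand-in for t1; `date` and `title` are assumed column names
toy <- tibble(
  title   = c("A", "A", "A", "B", "B", "B"),
  date    = ymd(c("2022-12-30", "2022-12-31", "2023-01-01",
                  "2023-01-06", "2023-01-07", "2023-01-08")),
  rev_adj = c(10, 20, 30, 40, 50, 60)
)

by_day <- toy %>%
  mutate(
    year = year(date),
    wday = wday(date, label = TRUE)   # ordered factor of weekday labels
  ) %>%
  group_by(year, wday) %>%
  summarise(mean_rev = mean(rev_adj), .groups = "drop") %>%
  # One row per year, one column per day of the week
  pivot_wider(
    names_from  = wday,
    values_from = mean_rev,
    names_sort  = TRUE    # order columns by weekday rather than first appearance
  )

by_day
```

Year and day combinations with no rows in the data show up as NA in the wide table.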
Estimate each film’s adjusted box office for its first weekend in release. By year, report the film with the highest-grossing opening weekend.
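A minimal sketch of one way to approach this, again on made-up data. It assumes a weekend means Friday through Sunday, that a film's first weekend is the earliest such weekend in which it has any rows, and that the columns are called `date` and `title`; none of these are confirmed by the exercise, and the object names are hypothetical.

```r
library(dplyr)
library(lubridate)

# Toy daily data: film A opens Friday 2023-01-06, film B opens Friday 2023-01-13
toy <- tibble(
  title   = rep(c("A", "B"), each = 8),
  date    = c(ymd("2023-01-06") + 0:7, ymd("2023-01-13") + 0:7),
  rev_adj = c(10, 20, 30, 1, 1, 1, 1, 5,    # film A
              40, 50, 60, 2, 2, 2, 2, 90)   # film B
)

weekends <- toy %>%
  # Keep Friday (5), Saturday (6), Sunday (7) when weeks start on Monday
  filter(wday(date, week_start = 1) >= 5) %>%
  # Label each weekend by the date of its Friday
  mutate(wknd = floor_date(date, "week", week_start = 5)) %>%
  group_by(title, wknd) %>%
  summarise(wknd_gross = sum(rev_adj), .groups = "drop")

# Each film's earliest weekend in the data
opening <- weekends %>%
  group_by(title) %>%
  slice_min(wknd, n = 1) %>%
  ungroup()

# The biggest opening weekend within each year
top_openers <- opening %>%
  group_by(year = year(wknd)) %>%
  slice_max(wknd_gross, n = 1) %>%
  ungroup()

top_openers
```

Films released midweek, or already in release when the data begin on January 1st 2000, would need extra care under these assumptions.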
For each film, report the adjusted weekend gross, and the amount changed between the first and second weekend gross (a film which maintains its box office, or even grows, is said in industry parlance to have legs). By year, report the film which most increased its second weekend gross.
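The comparison between weekends could be sketched as follows. It starts from a hypothetical weekend-level table (`weekends`, with made-up columns `title`, `wknd`, and `wknd_gross`) of the kind you might build by summing rev_adj over each Friday to Sunday window, and it assumes each film's second row really is the weekend directly after its first.

```r
library(dplyr)
library(lubridate)

# Hypothetical weekend totals; `wknd` is the Friday that starts each weekend
weekends <- tibble(
  title      = c("A", "A", "B", "B", "C", "C"),
  wknd       = ymd(c("2023-01-06", "2023-01-13",
                     "2023-01-13", "2023-01-20",
                     "2023-02-03", "2023-02-10")),
  wknd_gross = c(60, 45, 150, 90, 20, 35)
)

legs <- weekends %>%
  arrange(title, wknd) %>%
  group_by(title) %>%
  summarise(
    year        = year(first(wknd)),     # year of the opening weekend
    first_wknd  = first(wknd_gross),
    second_wknd = nth(wknd_gross, 2),    # NA if the film has only one weekend
    change      = second_wknd - first_wknd
  )

# The film with the largest second-weekend increase in each year
best_legs <- legs %>%
  group_by(year) %>%
  slice_max(change, n = 1) %>%
  ungroup()

best_legs
```

Whether to measure the change in dollars, as here, or as a proportion of the first weekend is a judgment call; a proportional measure keeps very large releases from dominating.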