Data downloading, processing and presenting using markdown

1. Introduction

I analyze the catalogue of movie and TV show listings that Netflix updates regularly. The data frame consists of twelve columns describing each production, including type of production (movie or TV show), title, director, cast, countries of production, and duration (running time of the movie or TV show). The code below removes rows with “NA” in the “country” column, converts the “duration” column from character strings to numeric values, and creates several graphs plotting movie duration for different groupings of countries. One summary table is also provided.

2. Questions

Has the duration of movies decreased over time?
Have movies made in East Asia increased in running time?

3. Code and annotation

Load libraries:

library(tidyverse)
library(ggplot2)
library(knitr)
library(readr)

Load the downloaded data

netflix_titles <- read_csv("netflix_titles.csv")

Process data and create graphs

Question 1: Has the duration of movies decreased over time?

  • Remove all TV productions, keep movies
  • Remove “ min” from the “duration” column
  • Convert the “duration” column from character to numeric format
  • Remove rows with NA in the “country” column
Movies_minutes <- netflix_titles %>%
  filter(type == "Movie") %>%                              # keep movies only
  mutate(duration = str_remove_all(duration, " min")) %>%  # "90 min" -> "90"
  mutate(duration = as.numeric(duration)) %>%              # character -> numeric
  drop_na(country)                                         # drop rows with no country listed
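
As a quick check of the cleaning steps above, here is a minimal sketch on a made-up four-row tibble. The rows and the names `toy` and `toy_minutes` are invented for illustration and do not come from the Netflix file. It applies the same filter, string removal, numeric conversion, and NA removal, so the effect of each step is easy to see.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Made-up rows for illustration only (not taken from netflix_titles.csv)
toy <- tibble::tibble(
  type     = c("Movie", "TV Show", "Movie", "Movie"),
  country  = c("Japan", "United States", NA, "China, Canada"),
  duration = c("99 min", "2 Seasons", "120 min", "96 min")
)

toy_minutes <- toy %>%
  filter(type == "Movie") %>%                                  # drops the TV show
  mutate(duration = as.numeric(str_remove(duration, " min"))) %>%  # "99 min" -> 99
  drop_na(country)                                             # drops the movie with no country

toy_minutes
```

Only the two movies with a listed country remain, and `duration` is now a numeric column. Filtering on type first matters here: a value such as "2 Seasons" would become NA (with a warning) if it reached `as.numeric()`.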
  • Create plot of movie release year vs duration
duration <- ggplot(Movies_minutes, aes(x = release_year, y = duration)) +
  geom_point(color = "grey") +
  geom_smooth(method = "lm", se = FALSE, color = "gray48") +
  scale_y_continuous() +
  xlim(1942, 2021)+
  theme_bw()+
  labs( x = "Release year",
        y = "Movie duration (min.)",
        subtitle  = "Figure 1: Length of movies, 1942 - 2021")

duration  
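
The trend line in Figure 1 comes from `geom_smooth(method = "lm")`, which fits a straight line of duration on release year. To put a number on the trend, the same kind of model can be fitted directly with `lm()` and its slope read off. The sketch below uses made-up values (the data frame `toy_movies` is hypothetical) so that it runs on its own; with the real data, `Movies_minutes` would take its place.

```r
# Made-up example data: four movies whose running time falls steadily
toy_movies <- data.frame(
  release_year = c(1950, 1970, 1990, 2010),
  duration     = c(140, 128, 116, 104)
)

# Fit duration as a linear function of release year
fit <- lm(duration ~ release_year, data = toy_movies)

# The slope is the estimated change in minutes per year
slope_per_year   <- coef(fit)[["release_year"]]
slope_per_decade <- 10 * slope_per_year

slope_per_decade
```

A negative slope means running times fall over time; for these toy values it works out to -6 minutes per decade. Note that `xlim()` in the plot removes observations outside 1942 to 2021 before the line is fitted, so the real data would need the same year filter for the numbers to match the figure.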

Question 2: Have movies made in East Asia increased in running time?

  • Create a dataset that keeps only rows whose “country” column lists at least one East Asian country

    • Create a search pattern of East Asian countries (“|” means “or”):
countries <- 'China|Hong Kong|Japan|South Korea|Singapore|Thailand|Taiwan'
    • Create a data frame of the movies that match the pattern:
Asia <- filter(Movies_minutes, str_detect(country, countries, negate = FALSE))  # negate = FALSE keeps matching rows
  • Create plot of year vs duration:
durationAsia <- ggplot(Asia, aes(x = release_year, y = duration)) +
  geom_point() +
  theme_bw() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  scale_y_continuous() +
  labs( x = "Release year",
      y = "Movie duration (min.)",
      subtitle  = "Figure 2: Length of movies produced in East Asian countries (and other countries), 1942 - 2021")

durationAsia
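
Because `countries` holds a single pattern with alternatives separated by `|`, `str_detect()` returns TRUE whenever any of the names appears anywhere in the “country” string. That is why co-productions with other countries stay in `Asia`. A small sketch with made-up country strings shows the behaviour (the vector `example_country` and the shortened `pattern` are for illustration only):

```r
library(stringr)

pattern <- "China|Japan"   # shortened, illustrative pattern

example_country <- c(
  "Japan",                               # single East Asian country
  "South Africa, United States, Japan",  # co-production that includes Japan
  "United States",                       # no match
  NA                                     # missing country
)

# TRUE where any alternative in the pattern occurs in the string
matches <- str_detect(example_country, pattern)

# negate = TRUE flips TRUE and FALSE but leaves NA as NA
non_matches <- str_detect(example_country, pattern, negate = TRUE)

matches
non_matches
```

The co-production matches just like the single-country entry. The missing value gives NA in both cases, and `filter()` drops rows whose condition is NA, which is one more reason to remove NA countries beforehand.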

  • Create dataset without East Asian countries
NotAsia <- filter(Movies_minutes, str_detect(country, countries, negate = TRUE))  # negate = TRUE keeps non-matching rows
  • Create plot of year vs duration
durationNotAsia <- ggplot(NotAsia, aes(x = release_year, y = duration)) +
  geom_point() +
  theme_bw() +
  geom_smooth(method = "lm", se = FALSE, color = "darkturquoise") +
  scale_y_continuous() +
  labs(
    x = "Release year",
    y = "Movie duration (min.)",
    subtitle  = "Figure 3: Length of movies not produced in East Asia, 1942-2021")

durationNotAsia

  • Plot movie duration across all three groups
durationALL <- ggplot(Movies_minutes, aes(x = release_year, y = duration)) +
  geom_smooth(method = "lm", se = FALSE, color = "gray48") +
  scale_y_continuous() +
  xlim(1942, 2021) +
  theme_bw() +
  labs( x = "Release year",
        y = "Movie duration (min.)",
        title = "Figure 4: A comparison of movie duration across three groups, 1942-2021.",
        subtitle  =  "Red = E. Asia          Grey = all films          Turquoise = outside of E. Asia") +
  geom_smooth(data = Asia, method = "lm", se = FALSE, color = "red") +
  geom_smooth(data = NotAsia, method = "lm", se = FALSE, color = "darkturquoise")

durationALL
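
Figure 4 spells out its colour key by hand in the subtitle. An alternative, sketched here under my own assumptions, is to label each movie with its group in one data frame and map colour to that label, so that ggplot2 draws the legend itself. The rows, the column name `region`, and the object names below are made up for illustration; with the real data, `Movies_minutes` would replace `toy_movies`.

```r
library(dplyr)
library(stringr)
library(ggplot2)

east_asia_pattern <- "China|Hong Kong|Japan|South Korea|Singapore|Thailand|Taiwan"

# Made-up rows for illustration only
toy_movies <- tibble::tibble(
  release_year = c(1980, 2000, 2020, 1980, 2000, 2020),
  duration     = c(90, 100, 110, 150, 125, 100),
  country      = c("Japan", "China, Canada", "South Korea",
                   "United States", "France", "India")
)

# Label each movie with its group instead of building separate data frames
toy_grouped <- toy_movies %>%
  mutate(region = if_else(str_detect(country, east_asia_pattern),
                          "E. Asia", "Outside E. Asia"))

# Mapping colour to the label gives one trend line per group plus a legend
comparison <- ggplot(toy_grouped,
                     aes(x = release_year, y = duration, color = region)) +
  geom_smooth(method = "lm", se = FALSE) +
  scale_color_manual(values = c("E. Asia" = "red",
                                "Outside E. Asia" = "darkturquoise")) +
  theme_bw() +
  labs(x = "Release year", y = "Movie duration (min.)", color = "Group")

comparison
```

This sketch draws only the two country groups. The grey line for all films is left out to keep the example short.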

Summary table with caption
knitr::kable( head(Asia), caption = "Table 1. Movies produced in East Asia") 
Table 1. Movies produced in East Asia
| show_id | type | title | director | cast | country | date_added | release_year | rating | duration | listed_in | description |
|---|---|---|---|---|---|---|---|---|---|---|---|
| s39 | Movie | Birth of the Dragon | George Nolfi | Billy Magnussen, Ron Yuan, Qu Jingjing, Terry Chen, Vanness Wu, Jin Xing, Philip Ng, Xia Yu, Yu Xia | China, Canada, United States | September 16, 2021 | 2017 | PG-13 | 96 | Action & Adventure, Dramas | A young Bruce Lee angers kung fu traditionalists by teaching outsiders, leading to a showdown with a Shaolin master in this film based on real events. |
| s47 | Movie | Safe House | Daniel Espinosa | Denzel Washington, Ryan Reynolds, Vera Farmiga, Brendan Gleeson, Sam Shepard, Rubén Blades, Nora Arnezeder, Robert Patrick, Liam Cunningham, Joel Kinnaman | South Africa, United States, Japan | September 16, 2021 | 2012 | R | 115 | Action & Adventure | Young CIA operative Matt Weston must get a dangerous criminal out of an agency safe house that’s come under attack and get him to a securer location. |
| s52 | Movie | InuYasha the Movie 2: The Castle Beyond the Looking Glass | Toshiya Shinohara | Kappei Yamaguchi, Satsuki Yukino, Mieko Harada, Koji Tsujitani, Houko Kuwashima, Kumiko Watanabe, Noriko Hidaka, Kenichi Ogata, Toshiyuki Morikawa, Izumi Ogami | Japan | September 15, 2021 | 2002 | TV-14 | 99 | Action & Adventure, Anime Features, International Movies | With their biggest foe seemingly defeated, InuYasha and his friends return to everyday life. But the peace is soon shattered by an emerging new enemy. |
| s53 | Movie | InuYasha the Movie 3: Swords of an Honorable Ruler | Toshiya Shinohara | Kappei Yamaguchi, Satsuki Yukino, Koji Tsujitani, Houko Kuwashima, Kumiko Watanabe, Ken Narita, Akio Otsuka, Kikuko Inoue | Japan | September 15, 2021 | 2003 | TV-14 | 99 | Action & Adventure, Anime Features, International Movies | The Great Dog Demon beaqueathed one of the Three Swords of the Fang to each of his two sons. Now the evil power of the third sword has been awakened. |
| s54 | Movie | InuYasha the Movie 4: Fire on the Mystic Island | Toshiya Shinohara | Kappei Yamaguchi, Satsuki Yukino, Koji Tsujitani, Houko Kuwashima, Kumiko Watanabe, Noriko Hidaka, Ken Narita, Cho, Mamiko Noto, Nobutoshi Canna | Japan | September 15, 2021 | 2004 | TV-PG | 88 | Action & Adventure, Anime Features, International Movies | Ai, a young half-demon who has escaped from Horai Island to try to help her people, returns with potential saviors InuYasha, Sesshomaru and Kikyo. |
| s55 | Movie | InuYasha the Movie: Affections Touching Across Time | Toshiya Shinohara | Kappei Yamaguchi, Satsuki Yukino, Koji Tsujitani, Houko Kuwashima, Kumiko Watanabe, Kenichi Ogata, Noriko Hidaka, Hisako Kyoda, Ken Narita, Tomokazu Seki | Japan | September 15, 2021 | 2001 | TV-PG | 100 | Action & Adventure, Anime Features, International Movies | A powerful demon has been sealed away for 200 years. But when the demon’s son is awakened, the fate of the world is in jeopardy. |

4. Conclusion

  • Figure 1 shows that, for all movies regardless of country of production, the trend line falls from ~140 to ~100 minutes between 1942 and 2021.
  • Figure 2 shows that movies listing at least one East Asian country under country of production, possibly alongside other countries, slightly increased in duration, from ~90 to ~110 minutes (1973 to 2021). (Note that I assigned a handful of countries to the “countries” pattern to represent East Asia. I could not figure out how to exclude rows that list other countries alongside a country in East Asia.)
  • Figure 3 shows that films not produced in East Asian countries (as defined by the “countries” pattern) decreased in running time, from ~150 to ~100 minutes.
  • Figure 4 plots the trends for all three groups of movies and highlights the sharp difference in running-time trends over the past six decades between movies produced or partially produced in East Asia (China, Hong Kong, Japan, South Korea, Singapore, Thailand, and Taiwan) and movies not produced in those countries.
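
One possible way to handle the limitation noted for Figure 2 is to split each “country” entry into its individual countries and keep a row only when every listed country is in the East Asian set. The sketch below is an assumption on my part, not something I ran on the full data: it assumes countries are separated by a comma and optional spaces, and it uses made-up rows (the names `toy_movies`, `east_asia`, and `only_east_asia` are hypothetical).

```r
library(dplyr)
library(stringr)

# One element per country, instead of a single "|" pattern
east_asia <- c("China", "Hong Kong", "Japan", "South Korea",
               "Singapore", "Thailand", "Taiwan")

# Made-up rows for illustration only
toy_movies <- tibble::tibble(
  title   = c("A", "B", "C", "D"),
  country = c("Japan",
              "China, Canada, United States",
              "China, Hong Kong",
              "United States")
)

only_east_asia <- toy_movies %>%
  # split "China, Hong Kong" into c("China", "Hong Kong")
  mutate(country_list = str_split(country, ",\\s*")) %>%
  # keep a row only if all of its countries are in the East Asian set
  filter(vapply(country_list,
                function(x) all(x %in% east_asia),
                logical(1))) %>%
  select(-country_list)

only_east_asia
```

Here movies A and C are kept, while the co-production B and the non-Asian movie D are dropped. Entries with stray separators or unusual spellings in the real file would fail the exact match and be excluded, so the result should be inspected before plotting.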