I want to build a small database of episode data for the TV show Frasier, pulling the IMDb information as JSON through the OMDb API.

omdbAPI

library("tidyverse")
library("omdbapi")

Rather than parse the raw responses myself, I will use the omdbapi package and its wrapper functions. First, though, I need to set my API key, which I keep private.
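One way to keep the key out of the script is to read it from an environment variable. The sketch below is only an illustration: `get_omdb_key()` is a hypothetical helper, and the variable name `OMDB_API_KEY` is an assumption, so check the package documentation for the name it actually expects.

```r
# Hypothetical helper: read the API key from an environment variable so that
# it never appears in the script or in version control.
# The variable name OMDB_API_KEY is an assumption, not confirmed by the package.
get_omdb_key <- function(var = "OMDB_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Set ", var, " (for example in ~/.Renviron) before running the queries.")
  }
  key
}
```

For a single session the variable can be set with `Sys.setenv(OMDB_API_KEY = "your-key-here")`. Putting the same line (without `Sys.setenv`) in `~/.Renviron` makes it available every time R starts.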

For example, here is the information about the pilot episode.

s1e01 <- find_by_title("Frasier", type="series", season=1, episode=1)
str(s1e01)
## Classes 'omdb', 'tbl_df', 'tbl' and 'data.frame':    1 obs. of  24 variables:
##  $ Title     : chr "The Good Son"
##  $ Year      : chr "1993"
##  $ Rated     : chr "TV-PG"
##  $ Released  : Date, format: "1993-09-16"
##  $ Season    : chr "1"
##  $ Episode   : chr "1"
##  $ Runtime   : chr "24 min"
##  $ Genre     : chr "Comedy"
##  $ Director  : chr "James Burrows"
##  $ Writer    : chr "David Angell (created by), Peter Casey (created by), David Lee (created by), David Angell, Peter Casey, David L"| __truncated__
##  $ Actors    : chr "Kelsey Grammer, Jane Leeves, David Hyde Pierce, Peri Gilpin"
##  $ Plot      : chr "Having returned to his native Seattle after living for many years in Boston, psychiatrist Frasier Crane is stil"| __truncated__
##  $ Language  : chr "English"
##  $ Country   : chr "USA"
##  $ Awards    : chr "N/A"
##  $ Poster    : chr "http://ia.media-imdb.com/images/M/MV5BMjg3MTAzNjYxNl5BMl5BanBnXkFtZTgwNTE3MzM0MjE@._V1_SX300.jpg"
##  $ Ratings   :List of 1
##   ..$ :List of 2
##   .. ..$ Source: chr "Internet Movie Database"
##   .. ..$ Value : chr "8.6/10"
##  $ Metascore : chr "N/A"
##  $ imdbRating: num 8.6
##  $ imdbVotes : num 582
##  $ imdbID    : chr "tt0582531"
##  $ seriesID  : chr "tt0106004"
##  $ Type      : chr "episode"
##  $ Response  : chr "True"
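Note that Season, Episode, and Runtime come back as character strings. A minimal sketch of converting them to numbers follows. It uses a hand-built one-row data frame with values copied from the output above instead of a live API call, and the `RuntimeMin` column name is my own invention.

```r
# One-row stand-in for the pilot record, with values copied from the str()
# output above (no network call needed).
pilot <- data.frame(
  Title   = "The Good Son",
  Season  = "1",
  Episode = "1",
  Runtime = "24 min",
  stringsAsFactors = FALSE
)

# Season and Episode are digit strings, so as.integer() converts them directly.
pilot$Season  <- as.integer(pilot$Season)
pilot$Episode <- as.integer(pilot$Episode)

# Runtime carries a " min" suffix; strip it before converting.
# RuntimeMin is a hypothetical column name chosen for this example.
pilot$RuntimeMin <- as.numeric(sub(" min$", "", pilot$Runtime))

str(pilot)
```

With numeric columns, sorting and filtering by season or episode behaves as expected (for example, episode 10 sorts after episode 9 and not after episode 1).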

Loops

Next, I hope to create a simple database of the TV episodes. Fortunately, each of the 11 seasons of Frasier had exactly 24 episodes, so the structure is easy to follow.

On reflection, a single data frame is simpler than a matrix of data frames, and I can reshape or filter it with dplyr later. As a first test, the loop below fetches only the first two seasons.

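# Start with an empty data frame and append one row per episode.
episode_df <- data.frame()
for(season in 1:2){      # first two seasons only, as a test
  for(episode in 1:24){  # 24 episodes per season
    episode_df <- rbind(episode_df,
                        find_by_title("Frasier",
                                      type="series",
                                      season=season,
                                      episode=episode))
  }
}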
nrow(episode_df)
## [1] 48
ncol(episode_df)
## [1] 24
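Growing a data frame with `rbind` inside a loop copies it on every iteration, which becomes slow when fetching all 11 seasons. An alternative sketch collects the rows in a list and binds them once at the end. Here `fetch_episode()` is a hypothetical stand-in for the `find_by_title()` call so that the example runs without a network connection or API key, and `episode_df2` is just an illustrative name.

```r
# Hypothetical stand-in for find_by_title("Frasier", type = "series", ...).
# It returns a one-row data frame so the sketch runs offline.
fetch_episode <- function(season, episode) {
  data.frame(Season  = as.character(season),
             Episode = as.character(episode),
             stringsAsFactors = FALSE)
}

# Every season/episode pair for the first two seasons.
# expand.grid varies its first argument fastest, so episodes cycle within a season.
grid <- expand.grid(episode = 1:24, season = 1:2)

# Fetch each episode into a list; a failed request becomes NULL instead of
# stopping the whole run.
rows <- Map(function(s, e) {
  tryCatch(fetch_episode(s, e), error = function(err) NULL)
}, grid$season, grid$episode)

# Drop the failures, then bind all rows in one step.
rows <- Filter(Negate(is.null), rows)
episode_df2 <- do.call(rbind, rows)

nrow(episode_df2)
```

In real use, the body of `fetch_episode()` would be replaced by the `find_by_title()` call from the loop above. `dplyr::bind_rows(rows)` is another option for the final step.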

One of my favorite episodes is "The Innkeepers" (season 2, episode 23):

episode_df %>%
  filter(Season == "2") %>%
  filter(Episode == "23")
## # A tibble: 1 x 24
##            Title  Year Rated   Released Season Episode Runtime  Genre
##            <chr> <chr> <chr>     <date>  <chr>   <chr>   <chr>  <chr>
## 1 The Innkeepers  1995 TV-PG 1995-05-16      2      23  23 min Comedy
## # ... with 16 more variables: Director <chr>, Writer <chr>, Actors <chr>,
## #   Plot <chr>, Language <chr>, Country <chr>, Awards <chr>, Poster <chr>,
## #   Ratings <list>, Metascore <chr>, imdbRating <dbl>, imdbVotes <dbl>,
## #   imdbID <chr>, seriesID <chr>, Type <chr>, Response <chr>