library(tidyverse)
library(jsonlite)
library(magrittr)
library(httr)

# Build the request url for one movie from an api key and a title
movie_GET_url <- function(api_key, title) {
  base_url <- "http://www.omdbapi.com/?"
  title <- paste("&t=", title, sep = "")
  api_key <- paste("apikey=", api_key, sep = "")
  movie_url <- paste(base_url, api_key, title, sep = "")
  return(movie_url)
}

Open Movie Database API Tutorial
Intro
I will be demonstrating how to use the Open Movie Database (OMDb) API to get a movie's title, year, runtime, genre, director, awards, ratings, box office, etc. This API can be quite useful because it allows you to see if a movie's reviews correlate with how well the movie does at the box office. You could also see if there is a correlation between the genre of a movie and its box office numbers or reviews. There are plenty of interesting things that you could find out from this data set.
Tutorial
To start the tutorial, the base url that the api uses is http://www.omdbapi.com/?. In order to use the api you are going to need an api key, which you can get at https://www.omdbapi.com/apikey.aspx. With the api key added, the link should look like https://www.omdbapi.com/?apikey=[YOUR APIKEY]. Next we need to add some parameters. The only required parameter is either the movie title or the IMDb ID; the rest of the parameters are optional. The way to format each parameter is:
movie title: “&t=movie_title” (add a plus for any spaces in the title)
year: “&y=year”
type: “&type=[movie, series, or episode]”
IMDb ID: “&i=IMDb_ID”
The url at least needs to look like https://www.omdbapi.com/?apikey=[YOUR APIKEY]&t=movie_title or https://www.omdbapi.com/?apikey=[YOUR APIKEY]&i=IMDb_ID. You can paste this url into a web browser to see the raw data, but we can also request it from R and put it into a data frame.
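As a rough illustration of the parameter format listed above, here is a small base R sketch that assembles a url from whichever parameters are supplied. The helper name omdb_query_url and its argument names are my own placeholders, not part of the api or of this tutorial's functions. It percent-encodes each value and then writes spaces as a plus, following the note about spaces in the title.

```r
# Hypothetical helper (not from the tutorial): assemble an OMDb request url.
omdb_query_url <- function(api_key, title = NULL, imdb_id = NULL,
                           year = NULL, type = NULL) {
  if (is.null(title) && is.null(imdb_id)) {
    stop("Supply either a title or an IMDb ID.")
  }
  # Percent-encode a value, then write spaces as "+" instead of "%20".
  encode <- function(x) {
    encoded <- utils::URLencode(as.character(x), reserved = TRUE, repeated = TRUE)
    gsub("%20", "+", encoded, fixed = TRUE)
  }
  # Keep only the parameters that were actually supplied.
  params <- list(apikey = api_key, t = title, i = imdb_id, y = year, type = type)
  params <- params[!vapply(params, is.null, logical(1))]
  query <- paste0(names(params), "=", vapply(params, encode, character(1)),
                  collapse = "&")
  paste0("https://www.omdbapi.com/?", query)
}

omdb_query_url("YOUR_APIKEY", title = "Iron Man", year = 2008, type = "movie")
# "https://www.omdbapi.com/?apikey=YOUR_APIKEY&t=Iron+Man&y=2008&type=movie"
```

Only the parameters you pass end up in the url, so the same helper covers both the title form and the IMDb ID form.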
First we need a function that creates the url. This is the movie_GET_url function defined at the start of this document. It combines the base url, the api key and the title in order to create a link to access the data from the api. I used the api key and the title as parameters so that I can change them depending on my api key and on the movie title I am using. Next we need to create a function that will put the data into a data frame.
# Request one movie from the api and return the result as a data frame
movie_df <- function(api_key, title) {
  movie_url <- movie_GET_url(api_key, title)
  movie_data <-
    movie_url %>%
    GET() %>%
    content(as = "text", encoding = "UTF-8") %>%
    fromJSON() %>%
    as.data.frame()
  return(movie_data)
}

This function calls movie_GET_url, the url function from earlier, to build the request for the data and then puts the result in a data frame. I piped the url into GET(), then content(), and then fromJSON() in order to retrieve the data. I then piped the result into as.data.frame() to turn it into a data frame and returned the data frame. All I need to do now is call the function with my api key and the movie title I chose, which was Iron Man. I will get 3 rows per movie because each reviewer and their score of the movie needs its own row.
Title Year Rated Released Runtime Genre Director
1 Iron Man 2008 PG-13 02 May 2008 126 min Action, Adventure, Sci-Fi Jon Favreau
2 Iron Man 2008 PG-13 02 May 2008 126 min Action, Adventure, Sci-Fi Jon Favreau
3 Iron Man 2008 PG-13 02 May 2008 126 min Action, Adventure, Sci-Fi Jon Favreau
Writer
1 Mark Fergus, Hawk Ostby, Art Marcum
2 Mark Fergus, Hawk Ostby, Art Marcum
3 Mark Fergus, Hawk Ostby, Art Marcum
Actors
1 Robert Downey Jr., Gwyneth Paltrow, Terrence Howard
2 Robert Downey Jr., Gwyneth Paltrow, Terrence Howard
3 Robert Downey Jr., Gwyneth Paltrow, Terrence Howard
Plot
1 After being held captive in an Afghan cave, billionaire engineer Tony Stark creates a unique weaponized suit of armor to fight evil.
2 After being held captive in an Afghan cave, billionaire engineer Tony Stark creates a unique weaponized suit of armor to fight evil.
3 After being held captive in an Afghan cave, billionaire engineer Tony Stark creates a unique weaponized suit of armor to fight evil.
Language
1 English, Persian, Urdu, Arabic, Kurdish, Hindi, Hungarian
2 English, Persian, Urdu, Arabic, Kurdish, Hindi, Hungarian
3 English, Persian, Urdu, Arabic, Kurdish, Hindi, Hungarian
Country Awards
1 United States, Canada Nominated for 2 Oscars. 24 wins & 73 nominations total
2 United States, Canada Nominated for 2 Oscars. 24 wins & 73 nominations total
3 United States, Canada Nominated for 2 Oscars. 24 wins & 73 nominations total
Poster
1 https://m.media-amazon.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX300.jpg
2 https://m.media-amazon.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX300.jpg
3 https://m.media-amazon.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX300.jpg
Ratings.Source Ratings.Value Metascore imdbRating imdbVotes
1 Internet Movie Database 7.9/10 79 7.9 1,213,667
2 Rotten Tomatoes 94% 79 7.9 1,213,667
3 Metacritic 79/100 79 7.9 1,213,667
imdbID Type DVD BoxOffice Production Website Response
1 tt0371746 movie N/A $319,034,126 N/A N/A True
2 tt0371746 movie N/A $319,034,126 N/A N/A True
3 tt0371746 movie N/A $319,034,126 N/A N/A True
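To see where the 3 rows come from, here is a small sketch that runs the same fromJSON() and as.data.frame() steps on a hand-written JSON string instead of a live response. The string is a trimmed stand-in that I wrote for illustration and only keeps a few of the fields shown in the output above, so treat it as an assumption about the response shape rather than the exact payload.

```r
library(jsonlite)

# Hand-written stand-in for an OMDb response (illustrative, not a real payload).
sample_json <- '{
  "Title": "Iron Man",
  "Year": "2008",
  "Ratings": [
    {"Source": "Internet Movie Database", "Value": "7.9/10"},
    {"Source": "Rotten Tomatoes", "Value": "94%"},
    {"Source": "Metacritic", "Value": "79/100"}
  ],
  "Metascore": "79"
}'

parsed <- fromJSON(sample_json)

# "Ratings" is parsed as its own 3-row data frame; the other fields are single values.
str(parsed, max.level = 1)

# as.data.frame() repeats the single values so they line up with the 3 rating rows.
sample_df <- as.data.frame(parsed)
nrow(sample_df)
names(sample_df)

# One way to get back to one row per movie: drop the rating columns and de-duplicate.
keep <- !startsWith(names(sample_df), "Ratings.")
movie_only <- unique(sample_df[, keep, drop = FALSE])
nrow(movie_only)
```

If you only care about movie-level fields such as the box office, the last step gives one row per movie; if you want to compare reviewers, keep all 3 rows.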
If I want to go through multiple movies, I would need to use a for loop to add every movie in a list of movies to the data frame.
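The for loop mentioned above could be sketched like this. The helper name collect_movies and its fetch_one argument are my own placeholders, not part of the tutorial's code. fetch_one stands for any function that takes a title and returns a data frame, such as a wrapper around movie_df with your api key filled in. The sketch assumes every movie comes back with the same columns, which rbind() needs.

```r
# Hypothetical helper: run fetch_one on each title and stack the results.
collect_movies <- function(titles, fetch_one) {
  results <- vector("list", length(titles))
  for (i in seq_along(titles)) {
    results[[i]] <- fetch_one(titles[[i]])
  }
  # rbind() requires that every data frame has the same column names.
  do.call(rbind, results)
}

# Stand-in fetcher so the sketch runs without an api key or network access.
fake_fetch <- function(title) {
  data.frame(
    Title = title,
    Ratings.Source = c("Internet Movie Database", "Rotten Tomatoes", "Metacritic")
  )
}

all_movies <- collect_movies(c("Iron Man", "Thor"), fake_fetch)
nrow(all_movies)  # 6: three rating rows for each of the two titles
```

With a real key, the call would look something like collect_movies(c("Iron+Man", "Thor"), function(t) movie_df(my_api_key, t)), where my_api_key is a placeholder for your own key.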