Introduction

The data for this exercise comes from the Open Movie Database (OMDb) API, a web service that returns movie information. The API is hosted at www.omdbapi.com, where you can also request an API key. Once a key is granted, you can retrieve detailed information about any movie of your choosing.

The variables available for each movie include:

Title: The name of the movie.
Year: The release year of the movie.
Rated: The age rating of the movie.
Released: The release date of the movie.
Runtime: The duration of the movie.
Genre: The genre(s) of the movie.
Director: The director(s) of the movie.
Writer: The writer(s) of the movie.
Actors: The leading actors in the movie.
Plot: A brief summary of the movie’s storyline.
Language: The language(s) of the movie.
Country: The country(ies) where the movie was produced.
Awards: Any awards the movie has won.
Ratings: Ratings from various sources, such as IMDb and Rotten Tomatoes.

API URL

A complete request URL, with the movie title and API key filled in, looks like this:

http://www.omdbapi.com/?t=MovieTitle&apikey=myAPIKEY
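
As a minimal sketch of how such a URL could be assembled in R, the helper below percent-encodes the title so that spaces and punctuation do not break the query string. The name build_omdb_url is hypothetical (it is not part of any package), and "myAPIKEY" is the same placeholder used above.

```r
# Hypothetical helper: assemble an OMDb request URL from a title and an API key
build_omdb_url <- function(title, api_key) {
  paste0(
    "http://www.omdbapi.com/",
    "?t=", utils::URLencode(title, reserved = TRUE),  # encode spaces etc.
    "&apikey=", api_key
  )
}

url <- build_omdb_url("The Dark Knight", "myAPIKEY")
print(url)
# [1] "http://www.omdbapi.com/?t=The%20Dark%20Knight&apikey=myAPIKEY"
```

If you prefer not to build the string by hand, httr's GET() also accepts a query argument (a named list) and encodes the values for you.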

Formatting Raw Data

Once the API processes your request, it returns a JSON object that can be converted into a usable data frame in R. We use the jsonlite package for this purpose.

Converting JSON to Data Frame

The fromJSON() function from jsonlite parses the JSON response into a named R list, which can then be reshaped into a data frame. The example below fetches data for the movie “Inception” and parses it:

library(httr)

library(jsonlite)

library(ggplot2)

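# Read the API key from an environment variable instead of hard-coding it
# (the variable name OMDB_API_KEY is an assumption; use whichever name you set)
api_key <- Sys.getenv("OMDB_API_KEY")
title <- "Inception"

# URLencode() keeps titles that contain spaces valid inside the URL
response <- GET(paste0("http://www.omdbapi.com/?t=", URLencode(title), "&apikey=", api_key))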

# Parse the JSON response body into a named R list
movie_data <- fromJSON(rawToChar(response$content))

# Viewing the movie data
print(movie_data)

## $Title
## [1] "Inception"
## 
## $Year
## [1] "2010"
## 
## $Rated
## [1] "PG-13"
## 
## $Released
## [1] "16 Jul 2010"
## 
## $Runtime
## [1] "148 min"
## 
## $Genre
## [1] "Action, Adventure, Sci-Fi"
## 
## $Director
## [1] "Christopher Nolan"
## 
## $Writer
## [1] "Christopher Nolan"
## 
## $Actors
## [1] "Leonardo DiCaprio, Joseph Gordon-Levitt, Elliot Page"
## 
## $Plot
## [1] "A thief who steals corporate secrets through the use of dream-sharing technology is given the inverse task of planting an idea into the mind of a C.E.O., but his tragic past may doom the project and his team to disaster."
## 
## $Language
## [1] "English, Japanese, French"
## 
## $Country
## [1] "United States, United Kingdom"
## 
## $Awards
## [1] "Won 4 Oscars. 159 wins & 220 nominations total"
## 
## $Poster
## [1] "https://m.media-amazon.com/images/M/MV5BMjAxMzY3NjcxNF5BMl5BanBnXkFtZTcwNTI5OTM0Mw@@._V1_SX300.jpg"
## 
## $Ratings
##                    Source  Value
## 1 Internet Movie Database 8.8/10
## 2         Rotten Tomatoes    87%
## 3              Metacritic 74/100
## 
## $Metascore
## [1] "74"
## 
## $imdbRating
## [1] "8.8"
## 
## $imdbVotes
## [1] "2,530,924"
## 
## $imdbID
## [1] "tt1375666"
## 
## $Type
## [1] "movie"
## 
## $DVD
## [1] "20 Jun 2013"
## 
## $BoxOffice
## [1] "$292,587,330"
## 
## $Production
## [1] "N/A"
## 
## $Website
## [1] "N/A"
## 
## $Response
## [1] "True"
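
The printed result above is a named list, with Ratings nested inside it as its own data frame. One possible way to reshape such a list into a one-row data frame is sketched below. The JSON string is a shortened, hand-written stand-in for the response (values copied from the output above, not a live API call), and the names scalar_fields, movie_df, Runtime_min, and ratings_df are illustrative.

```r
library(jsonlite)

# Shortened stand-in for an OMDb response; values taken from the output above
json_text <- '{
  "Title": "Inception", "Year": "2010", "Runtime": "148 min",
  "imdbRating": "8.8",
  "Ratings": [
    {"Source": "Internet Movie Database", "Value": "8.8/10"},
    {"Source": "Rotten Tomatoes", "Value": "87%"}
  ]
}'

movie <- fromJSON(json_text)

# Keep only the length-one fields; Ratings is a nested data frame
is_scalar <- vapply(movie, function(x) is.atomic(x) && length(x) == 1, logical(1))
scalar_fields <- movie[is_scalar]
movie_df <- as.data.frame(scalar_fields, stringsAsFactors = FALSE)

# The API returns everything as text, so convert numeric fields explicitly
movie_df$Year <- as.integer(movie_df$Year)
movie_df$Runtime_min <- as.integer(sub(" min$", "", movie_df$Runtime))
movie_df$imdbRating <- as.numeric(movie_df$imdbRating)

# The nested ratings can be kept as a separate table
ratings_df <- movie$Ratings

str(movie_df)
print(ratings_df)
```

Keeping the ratings in a separate table avoids forcing a variable number of rating sources into fixed columns.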

Visual Representation

While the OMDb API primarily provides textual information about movies, you can create visual representations of the data, such as comparing the IMDb ratings of several movies or analyzing the distribution of movie genres.
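
For the genre idea, note that the Genre field arrives as a single comma-separated string, so it has to be split before it can be counted. The sketch below uses base R only. The Inception string matches the output shown earlier; the other two strings are hand-written illustrative values rather than retrieved API output, and the object names are hypothetical.

```r
# Genre strings keyed by title (only the Inception value comes from the output above;
# the other two are illustrative assumptions)
genres <- c(
  "Inception"       = "Action, Adventure, Sci-Fi",
  "Interstellar"    = "Adventure, Drama, Sci-Fi",
  "The Dark Knight" = "Action, Crime, Drama"
)

# Split each string on commas (plus optional whitespace) and flatten to one vector
all_genres <- unlist(strsplit(genres, ",\\s*"), use.names = FALSE)

# Tabulate and convert to a data frame with Genre and Count columns
genre_counts <- as.data.frame(
  table(Genre = all_genres),
  responseName = "Count",
  stringsAsFactors = FALSE
)

# Most frequent genres first
genre_counts <- genre_counts[order(-genre_counts$Count), ]
print(genre_counts)
```

A data frame in this shape can be passed straight to ggplot() with Genre on one axis and Count on the other.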

Comparing Christopher Nolan Movie Ratings

movies <- c("Inception", "Interstellar", "The Dark Knight", "Dunkirk", "Memento", "The Prestige")
# Start with an empty data frame and append one row per movie
movie_ratings <- data.frame(Title = character(), IMDb_Rating = numeric(), stringsAsFactors = FALSE)

for (movie in movies) {
  # Request and parse the record for this title
  response <- GET(paste0("http://www.omdbapi.com/?t=", URLencode(movie), "&apikey=", api_key))
  movie_data <- fromJSON(rawToChar(response$content))

  # imdbRating arrives as text, so convert it to a number before storing it
  movie_ratings <- rbind(movie_ratings, data.frame(Title = movie, IMDb_Rating = as.numeric(movie_data$imdbRating)))
}

print(movie_ratings)

##             Title IMDb_Rating
## 1       Inception         8.8
## 2    Interstellar         8.7
## 3 The Dark Knight         9.0
## 4         Dunkirk         7.8
## 5         Memento         8.4
## 6    The Prestige         8.5
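
The loop above assumes that every lookup succeeds. A more defensive variant is sketched below: it records NA when a title is not found or has no numeric rating, and binds all rows once at the end instead of growing the data frame on each iteration. The helper rating_row is hypothetical, and the parsed lists are hand-written stand-ins for fromJSON() results. In particular, the shape of the failed lookup (Response set to "False" plus an Error message) is an assumption about how the API reports errors.

```r
# Hypothetical helper: build one row from a parsed response (a named list)
rating_row <- function(title, movie_data) {
  found <- identical(movie_data$Response, "True") && !is.null(movie_data$imdbRating)
  # as.numeric("N/A") yields NA with a warning, which is silenced here
  rating <- if (found) suppressWarnings(as.numeric(movie_data$imdbRating)) else NA_real_
  data.frame(Title = title, IMDb_Rating = rating, stringsAsFactors = FALSE)
}

# Hand-written stand-ins for parsed responses; the second mimics a failed lookup
parsed <- list(
  "Inception"       = list(Response = "True", imdbRating = "8.8"),
  "Not A Real Film" = list(Response = "False", Error = "Movie not found!")
)

# Build every row first, then bind once
rows <- Map(rating_row, names(parsed), parsed)
movie_ratings_safe <- do.call(rbind, rows)
rownames(movie_ratings_safe) <- NULL

print(movie_ratings_safe)
```

Rows with NA can then be dropped or reported before plotting, rather than stopping the whole loop with an error.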

ggplot(movie_ratings, aes(x = Title, y = IMDb_Rating, fill = Title)) +
  geom_col() +
  geom_text(aes(label = IMDb_Rating), vjust = -0.3) +
  labs(title = "IMDb Ratings of Christopher Nolan Movies", x = "Movie", y = "IMDb Rating") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position = "none")
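
By default the bars appear in alphabetical order of Title. As one possible refinement, the sketch below orders them by rating with reorder() and flips the axes so the titles read horizontally. The ratings are copied from the printed table above so that the block runs without an API call; the fill colour and axis limits are arbitrary choices.

```r
library(ggplot2)

# Ratings copied from the printed table above
movie_ratings <- data.frame(
  Title = c("Inception", "Interstellar", "The Dark Knight",
            "Dunkirk", "Memento", "The Prestige"),
  IMDb_Rating = c(8.8, 8.7, 9.0, 7.8, 8.4, 8.5),
  stringsAsFactors = FALSE
)

# reorder() turns Title into a factor whose levels follow ascending rating
movie_ratings$Title <- reorder(movie_ratings$Title, movie_ratings$IMDb_Rating)

p <- ggplot(movie_ratings, aes(x = Title, y = IMDb_Rating)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = IMDb_Rating), hjust = -0.2) +
  scale_y_continuous(limits = c(0, 10)) +  # leave room for the labels
  coord_flip() +
  labs(title = "IMDb Ratings of Christopher Nolan Movies",
       x = "Movie", y = "IMDb Rating") +
  theme_minimal()

print(p)
```

Sorting by value makes the highest- and lowest-rated films easier to pick out than an alphabetical axis does.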

Summary

The OMDb API gives movie enthusiasts and data analysts detailed, per-title information about films, returned as JSON from a single request. Its ease of use and the depth of data available make it a good fit for projects ranging from simple movie lookups to more complex analyses of film industry trends.

Exploring Movie Data with the OMDb API

Matt DePrey

March 24, 2024