This tutorial shows you how to use the OMDb API. OMDb (the Open Movie Database) is a web service that returns movie information based on search parameters such as keywords in the title, year of release and type of media. It is an excellent source for collecting movie data without paying high costs for access to an IMDb API or risking being blacklisted from movie websites for web scraping.

To begin, you will need to register for your own API key. A free key allows up to 1,000 calls to the database per day, while a paid key raises that limit and comes with other benefits.

To register, go to this website: https://www.omdbapi.com/apikey.aspx

Register with your email, name and the reason you are requesting an API key. Once you’ve submitted the form, check your email and activate the key using the link in the message. You are now ready to begin searching with your new key.

We’ll now begin building our URLs. First, load in all of the necessary libraries for this script.

library(tidyverse)
library(jsonlite)
library(magrittr)
library(httr) 

Next, we’ll begin putting the pieces together. All searches require the given endpoint, your API key and a search term. For this example, we’ll use year of release and type of media as well.

Let’s say we want to search for movies with “summer” in the title that were released in 2009. The pieces of the URL would look like this:

omdb_endpoint <- "http://www.omdbapi.com/?" # Required endpoint
api_key <- "175397df"                       # Required API key
omdb_search <- "summer"                     # Required search term
omdb_type <- "movie"                        # Type of media (movie/series/episode)
omdb_year <- "2009"                         # Year of release
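Hard-coding a key in a script makes it easy to share by accident. As a minimal sketch, the key could instead be read from an environment variable. The helper name `get_omdb_key` and the variable name `OMDB_API_KEY` are assumptions made for this illustration; OMDb does not require either.

```r
# Hypothetical helper: read the OMDb key from an environment variable.
# The name OMDB_API_KEY is an assumption for this sketch, not an OMDb convention.
get_omdb_key <- function(var = "OMDB_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    # Fail early with a clear message instead of sending an empty key.
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  key
}

# Usage, once the variable is set (for example in ~/.Renviron):
# api_key <- get_omdb_key()
```

With this in place, the `api_key` line above can be replaced by the call shown in the usage comment, and the key itself never appears in the script.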

Now, we’ll put the URL together.

omdb_url <- paste0(omdb_endpoint,
                   "apikey=", api_key,
                   "&s=", omdb_search,
                   "&type=", omdb_type,
                   "&y=", omdb_year)
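Pasting the pieces together works for a one-word search term such as “summer”, but a term containing spaces or other special characters needs to be URL-encoded first. The sketch below shows one way to do that with base R. The function name `build_omdb_url` is hypothetical; the parameter names `apikey`, `s`, `type` and `y` are the ones already used above.

```r
# Hypothetical helper: build an OMDb search URL with URL-encoded values.
build_omdb_url <- function(endpoint, api_key, search, type = NULL, year = NULL) {
  params <- list(apikey = api_key, s = search, type = type, y = year)
  # Drop optional parameters that were not supplied.
  params <- params[!vapply(params, is.null, logical(1))]
  # Percent-encode each value so spaces and symbols are safe in a URL.
  encoded <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(endpoint, paste0(names(encoded), "=", encoded, collapse = "&"))
}

# Example with a multi-word search term:
# build_omdb_url("http://www.omdbapi.com/?", api_key, "500 days", "movie", "2009")
```

If you prefer to stay within httr, passing the parameters as `GET(url, query = list(...))` should achieve the same encoding.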

Now, we’ll put together the GET request. A single search request returns one page of up to 10 results as JSON, so no looping is needed for this example and the process is quite simple.

First, initialize the data frame that will hold the results.

omdb_db <- data.frame()

Then, we’ll build the function:

omdbsearch_df <- function(url) {

  # Request the URL and parse the JSON body into an R list.
  omdb_json <-
    url %>%
    GET() %>%
    content(as = "text", encoding = "UTF-8") %>%
    fromJSON()

  # The Search element holds the table of matching titles.
  omdb_data <- omdb_json$Search

  return(omdb_data)
}
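When a search has no matches or the key is invalid, OMDb replies with a `Response` field of `"False"` and an `Error` message, and there is no `Search` element, so the function above returns `NULL`. A more defensive parsing step might look like the sketch below. The name `parse_omdb_search` is hypothetical, and returning an empty data frame on error is a design choice for this illustration.

```r
library(jsonlite)

# Hypothetical helper: turn the JSON text of a search response into a data frame.
parse_omdb_search <- function(json_text) {
  parsed <- fromJSON(json_text)
  # OMDb signals a failed search with Response = "False" plus an Error message.
  if (identical(parsed$Response, "False")) {
    warning("OMDb returned an error: ", parsed$Error, call. = FALSE)
    return(data.frame())
  }
  parsed$Search
}

# Usage with httr (not run here because it needs a network connection):
# resp <- GET(omdb_url)
# stop_for_status(resp)
# omdb_db <- parse_omdb_search(content(resp, as = "text", encoding = "UTF-8"))
```

Keeping the parsing separate from the request also makes it possible to try the function on a saved response without spending one of your daily calls.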

Now, we can run the function.

omdb_db <- omdbsearch_df(omdb_url)

The omdb_db data frame should now hold a table of movies matching your search terms, with the title, year of release, IMDb ID, type of media and a link to the movie poster.
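Each search request returns at most 10 titles. The response also carries a `totalResults` field, and further pages can be requested with the `page` parameter. If you need more than the first page, the URLs could be generated along these lines. The function name `omdb_page_urls` is hypothetical, and the default page size of 10 is an assumption based on that behaviour.

```r
# Hypothetical helper: build one URL per results page.
omdb_page_urls <- function(base_url, total_results, per_page = 10) {
  # totalResults arrives as a string, so convert it before dividing.
  n_pages <- ceiling(as.numeric(total_results) / per_page)
  if (is.na(n_pages) || n_pages < 1) {
    return(character(0))
  }
  paste0(base_url, "&page=", seq_len(n_pages))
}

# Usage (not run here because it needs a network connection):
# urls <- omdb_page_urls(omdb_url, total_results = "23")
# all_pages <- lapply(urls, omdbsearch_df)
# omdb_db <- dplyr::bind_rows(all_pages)
```

Every page counts as one call against the daily limit, so it is worth checking `totalResults` before requesting everything.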

Congratulations! You are now ready to use the OMDb API.