Overview

The New York Times website provides a rich set of APIs, described at https://developer.nytimes.com/apis. I selected the ‘Movie Reviews’ API for this analysis.

Load required libraries

Step 1 is to install and load the libraries needed to extract data from the API.

knitr::opts_chunk$set(eval = TRUE, results = FALSE)
library(tidyverse)
library(dplyr)
library(httr)
library(jsonlite)
library(kableExtra)
library(glue)
library(rmarkdown)

Connect to the API

Build the request URL from the API key, then convert the JSON response to a data frame:

# Build the request URL; `apikey` must already hold your NYT developer key (it is not defined in the code shown here)
url <- paste0("https://api.nytimes.com/svc/movies/v2/reviews/search.json?query=&api-key=", apikey)

# Convert the JSON response to an R data frame
raw_data <- fromJSON(url, flatten = TRUE) %>% data.frame()
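
The string concatenation above works for an empty query, but a search term containing spaces or punctuation needs escaping. Here is a minimal sketch, assuming the key is kept in an environment variable. The variable name `NYT_API_KEY` and the helper `build_reviews_url` are illustrative and not part of the original code.

```r
# Hypothetical helper: builds the review-search URL and URL-encodes both values.
build_reviews_url <- function(api_key, query = "") {
  base <- "https://api.nytimes.com/svc/movies/v2/reviews/search.json"
  paste0(
    base,
    "?query=", utils::URLencode(query, reserved = TRUE),
    "&api-key=", utils::URLencode(api_key, reserved = TRUE)
  )
}

# Assumption: the key was exported beforehand under the name NYT_API_KEY.
apikey <- Sys.getenv("NYT_API_KEY")
url <- build_reviews_url(apikey, query = "big lebowski")
```

Keeping the key out of the script also means the document can be shared without exposing it.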

Read Data

#raw_data %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), fixed_thead = T)
paged_table(raw_data, options = list(rows.print = 5))
# | status <chr> | copyright <chr> | has_more <lgl>
1 | OK | Copyright (c) 2020 The New York Times Company. All Rights Reserved. | TRUE
2 | OK | Copyright (c) 2020 The New York Times Company. All Rights Reserved. | TRUE
3 | OK | Copyright (c) 2020 The New York Times Company. All Rights Reserved. | TRUE
4 | OK | Copyright (c) 2020 The New York Times Company. All Rights Reserved. | TRUE
5 | OK | Copyright (c) 2020 The New York Times Company. All Rights Reserved. | TRUE
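
The has_more value is TRUE in every row shown, which indicates that further reviews are available beyond this batch. A rough sketch of collecting several pages follows. It assumes the endpoint accepts an `offset` parameter in steps of 20 (worth checking against the API documentation) and that every page returns the same columns. The names `fetch_all_pages`, `fetch_page` and `fake_fetch` are hypothetical and not part of the original code.

```r
# Hypothetical helper: calls fetch_page(offset) until has_more is FALSE or
# max_pages is reached, then stacks the per-page results into one data frame.
fetch_all_pages <- function(fetch_page, max_pages = 5, page_size = 20) {
  pages <- list()
  for (i in seq_len(max_pages)) {
    offset <- (i - 1) * page_size
    page <- fetch_page(offset)
    pages[[i]] <- page$results
    if (!isTRUE(page$has_more)) break
  }
  do.call(rbind, pages)
}

# Stand-in fetcher used for illustration; a real one might look like
#   function(offset) jsonlite::fromJSON(paste0(url, "&offset=", offset), flatten = TRUE)
fake_fetch <- function(offset) {
  list(has_more = offset < 40, results = data.frame(id = offset + 1:20))
}

all_reviews <- fetch_all_pages(fake_fetch)
nrow(all_reviews)  # 60: three pages of 20 rows each
```

Passing the fetcher in as an argument keeps the paging logic testable without touching the network.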

Let’s pull the title, review headline, critic’s-pick flag, and reviewer byline from this data and analyze them.

data_final <- raw_data %>% select(results.display_title,results.headline,results.critics_pick,results.byline)
new_names <- c('Title','Review','Critics_pick','Reviewer')
colnames(data_final) <- new_names

Analysis

We now have a list of reviewed movies, with a flag marking the critically acclaimed ones, that we can add to our watch list.

# Split each headline at "Review:" and keep only the review text that follows it
data_final <- data_final %>% separate(Review, into = c("Title_again", "Review"), sep = "Review:")
# Drop the leftover Title_again column
data_final <- data_final %>% select(Title, Review, Critics_pick, Reviewer)
data_final %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), fixed_thead = TRUE)
Title | Review | Critics_pick | Reviewer
What the Constitution Means to Me | Pursuits of Happiness | 1 | Elisabeth Vincentelli
Synchronic | Twisted, Trippy Trips Through Time | 1 | Glenn Kenny
Ham on Rye | Coming of Age, With Existential Unease | 1 | Glenn Kenny
The Witches | A Tale of Mice and Women, Toil and Trouble | 0 | Manohla Dargis
Bad Hair | That Weave Is Killer | 0 | Teo Bugbee
Friendsgiving | Dysfunction With All the Trimmings | 0 | Lovia Gyarkye
Radium Girls | When Work Takes a Toxic Turn | 0 | Kristen Yoonsoo Kim
The Place of No Words | Shared Illusions | 0 | Ben Kenigsberg
Midnight in Paris | Everybody on the Dance Floor | 0 | Ben Kenigsberg
Over the Moon | After Loss, a Lunar Adventure | 0 | Natalia Winkelman
Coming Home Again | Confronting Mortality Through Cooking | 0 | Glenn Kenny
Borat Subsequent Moviefilm | More Cultural Learnings | 0 | Devika Girish
Rebecca | A Classic Tale, but There’s Only One Hitch | 0 | A.O. Scott
White Noise | Hearing Dog Whistles Loud and Clear | 0 | Ben Kenigsberg
Belly of the Beast | Fighting for Incarcerated Women | 1 | Lovia Gyarkye
White Riot | When Punk’s Stars Banded Against Racism | 1 | Glenn Kenny
Martin Eden | Reading and Writing His Way Out of the Pit | 1 | Manohla Dargis
The Goddess of Fortune | Family Drama Under Sunny Italian Skies | 0 | Glenn Kenny
David Byrne’s American Utopia | Opening a Wide, Wonderful World | 0 | Manohla Dargis
Love and Monsters | Coming-of-Age After the Apocalypse | 0 | Lovia Gyarkye
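
To make the split concrete, here is a small base-R sketch of the same idea on made-up input. The headline format (title, then "Review:", then the headline text) is inferred from the separate() call above, the sample strings are assumptions, and `extract_review` is a hypothetical helper. It also trims the leading space and returns headlines without the marker unchanged, whereas separate() would leave NA in the Review column for such rows.

```r
# Hypothetical helper: returns the text after the first "Review:", trimmed;
# headlines without the marker are returned as-is.
extract_review <- function(headline) {
  trimws(sub("^.*?Review:", "", headline, perl = TRUE))
}

headlines <- c(
  "'Synchronic' Review: Twisted, Trippy Trips Through Time",  # assumed raw format
  "A Headline Without the Marker"                             # made-up edge case
)
extract_review(headlines)
```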

Conclusion

Ever wondered, though, whether there could be reviewer bias? For example, my wife likes every movie she watches, whereas it’s hard to impress me unless there’s depth in the story. Could there be a similar bias among the reviewers?

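# Count reviews and critic's picks per reviewer
Critics <- data_final %>% group_by(Reviewer) %>% summarise(All_movies = n(), Acclaimed = sum(Critics_pick))
# Keep reviewers with more than one review and compute the share of picks (in percent)
Critics_review <- Critics %>% filter(All_movies > 1) %>% group_by(Reviewer) %>% summarise(acclaim_rate = round(Acclaimed / All_movies * 100, 0))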

Critics_review %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), fixed_thead = T)
Reviewer | acclaim_rate
Ben Kenigsberg | 0
Glenn Kenny | 60
Lovia Gyarkye | 33
Manohla Dargis | 33

It seems that Ben, like me, is hard to impress, whereas Glenn is easier to impress. This is purely a hypothesis: a detailed analysis would need to account for confounding variables such as movie genre and cast. It would also definitely need more than 20 sample data points!
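
To illustrate how little 20 reviews can tell us, here is a quick check that goes beyond the original analysis: Fisher's exact test on the two extreme reviewers, using counts read off the tables above (Glenn Kenny: 3 picks out of 5 reviews; Ben Kenigsberg: 0 out of 3). It is a sketch of one possible sanity check, not a full analysis.

```r
# Rows: reviewer; columns: critic's pick vs. not a pick (counts from the tables above)
picks <- matrix(
  c(3, 2,   # Glenn Kenny: 3 picks, 2 non-picks
    0, 3),  # Ben Kenigsberg: 0 picks, 3 non-picks
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Glenn Kenny", "Ben Kenigsberg"), c("pick", "not_pick"))
)

result <- fisher.test(picks)
round(result$p.value, 3)  # about 0.196, well above the usual 0.05 threshold
```

With a p-value near 0.2, the gap between the two reviewers is compatible with chance, which is in line with the caution above.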