Introduction

This project investigates whether Fandango’s movie rating system changed following a 2015 analysis by Walt Hickey, which revealed a potential bias in their displayed ratings. Hickey’s analysis suggested that Fandango was rounding ratings upwards, potentially inflating the perceived quality of movies on their platform.
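
To make the idea of upward rounding concrete, here is a minimal sketch in base R that contrasts rounding an average to the nearest half star with always rounding up to the next half star. The function names and the example averages are hypothetical, and the exact rule Fandango applied is an assumption here, not something this project verifies.

```r
# Hypothetical helpers: two ways to turn an average rating into half-star steps
round_to_nearest_half <- function(x) round(x * 2) / 2   # conventional rounding
round_up_to_half <- function(x) ceiling(x * 2) / 2      # always rounds upward

# Illustrative averages (not taken from the datasets)
avg <- c(3.6, 4.1, 4.3)

rounding_demo <- data.frame(
  average = avg,
  nearest = round_to_nearest_half(avg),
  rounded_up = round_up_to_half(avg)
)
rounding_demo
```

Whenever the two columns disagree, the upward rule displays a higher star value than the underlying average supports, which is the kind of inflation at issue.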

Our goal is to analyze movie rating data from before and after Hickey’s analysis to determine if there is statistical evidence to suggest that Fandango addressed the alleged bias in their rating system. By comparing the distributions of movie ratings before and after the reported fix, we aim to gain insights into the integrity of movie ratings on Fandango and assess the impact of data journalism on industry practices.

library(readr)
previous <- read_csv('fandango_score_comparison.csv')
after <- read_csv('movie_ratings_16_17.csv')

head(previous)
## # A tibble: 6 × 22
##   FILM       RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User  IMDB
##   <chr>               <dbl>               <dbl>      <dbl>           <dbl> <dbl>
## 1 Avengers:…             74                  86         66             7.1   7.8
## 2 Cinderell…             85                  80         67             7.5   7.1
## 3 Ant-Man (…             80                  90         64             8.1   7.8
## 4 Do You Be…             18                  84         22             4.7   5.4
## 5 Hot Tub T…             14                  28         29             3.4   5.1
## 6 The Water…             63                  62         50             6.8   7.2
## # ℹ 16 more variables: Fandango_Stars <dbl>, Fandango_Ratingvalue <dbl>,
## #   RT_norm <dbl>, RT_user_norm <dbl>, Metacritic_norm <dbl>,
## #   Metacritic_user_nom <dbl>, IMDB_norm <dbl>, RT_norm_round <dbl>,
## #   RT_user_norm_round <dbl>, Metacritic_norm_round <dbl>,
## #   Metacritic_user_norm_round <dbl>, IMDB_norm_round <dbl>,
## #   Metacritic_user_vote_count <dbl>, IMDB_user_vote_count <dbl>,
## #   Fandango_votes <dbl>, Fandango_Difference <dbl>
head(after)
## # A tibble: 6 × 15
##   movie         year metascore  imdb tmeter audience fandango n_metascore n_imdb
##   <chr>        <dbl>     <dbl> <dbl>  <dbl>    <dbl>    <dbl>       <dbl>  <dbl>
## 1 10 Cloverfi…  2016        76   7.2     90       79      3.5        3.8    3.6 
## 2 13 Hours      2016        48   7.3     50       83      4.5        2.4    3.65
## 3 A Cure for …  2016        47   6.6     40       47      3          2.35   3.3 
## 4 A Dog's Pur…  2017        43   5.2     33       76      4.5        2.15   2.6 
## 5 A Hologram …  2016        58   6.1     70       57      3          2.9    3.05
## 6 A Monster C…  2016        76   7.5     87       84      4          3.8    3.75
## # ℹ 6 more variables: n_tmeter <dbl>, n_audience <dbl>, nr_metascore <dbl>,
## #   nr_imdb <dbl>, nr_tmeter <dbl>, nr_audience <dbl>

Selecting only the columns used in the analysis

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
fandango_previous <- previous %>%
  select(FILM , Fandango_Stars , Fandango_Ratingvalue , Fandango_votes , Fandango_Difference)

fandango_after <- after %>%
  select(movie, year , fandango)

head(fandango_previous)
## # A tibble: 6 × 5
##   FILM    Fandango_Stars Fandango_Ratingvalue Fandango_votes Fandango_Difference
##   <chr>            <dbl>                <dbl>          <dbl>               <dbl>
## 1 Avenge…            5                    4.5          14846                 0.5
## 2 Cinder…            5                    4.5          12640                 0.5
## 3 Ant-Ma…            5                    4.5          12055                 0.5
## 4 Do You…            5                    4.5           1793                 0.5
## 5 Hot Tu…            3.5                  3             1021                 0.5
## 6 The Wa…            4.5                  4              397                 0.5
head(fandango_after)
## # A tibble: 6 × 3
##   movie                    year fandango
##   <chr>                   <dbl>    <dbl>
## 1 10 Cloverfield Lane      2016      3.5
## 2 13 Hours                 2016      4.5
## 3 A Cure for Wellness      2016      3  
## 4 A Dog's Purpose          2017      4.5
## 5 A Hologram for the King  2016      3  
## 6 A Monster Calls          2016      4

Refining Our Goal: Popular Movies in 2015 vs. 2016

Our initial goal was to assess changes in Fandango’s rating system after Hickey’s analysis. However, the available datasets, sampled with specific criteria (minimum ratings, release year), aren’t representative of all Fandango ratings. This limits our ability to draw broad conclusions.

Therefore, we’ll adjust our focus to compare:

Fandango ratings for popular movies released in 2015.

Fandango ratings for popular movies released in 2016.

We define “popular” as having at least 30 fan ratings on Fandango. While this is a narrower scope, it still provides insight into potential shifts in Fandango’s practices.

A Caveat Regarding the 2016 Sample: The 2016 dataset lacks explicit fan rating counts. To check that this sample reflects “popular” movies (at least 30 ratings), we’ll randomly select 10 movies and look up how many ratings each one has. Because Fandango’s own 5-star fan ratings are no longer available to us, the counts below are Rotten Tomatoes audience review counts, used as a rough proxy.

set.seed(1)
sample_n(fandango_after, size = 10)
## # A tibble: 10 × 3
##    movie                     year fandango
##    <chr>                    <dbl>    <dbl>
##  1 Hands of Stone            2016      4  
##  2 The Bye Bye Man           2017      3  
##  3 Our Kind of Traitor       2016      3.5
##  4 The Autopsy of Jane Doe   2016      4.5
##  5 Dirty Grandpa             2016      3.5
##  6 Arsenal                   2017      3.5
##  7 The Light Between Oceans  2016      4  
##  8 Exposed                   2016      2.5
##  9 Jason Bourne              2016      4  
## 10 Before I Fall             2017      3.5
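set.seed(1)
# Named sampled_movies to avoid shadowing base::sample()
sampled_movies <- sample_n(fandango_after, size = 10)

# Create a single-column tibble of Rotten Tomatoes review counts
# (looked up manually, in the same order as the sampled movies)
reviews <- tibble(reviews = c(13569, 74904, 24293, 4141, 30183, 48952, 14328, 59359, 54765, 82222))
bind_cols(sampled_movies, reviews)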
## # A tibble: 10 × 4
##    movie                     year fandango reviews
##    <chr>                    <dbl>    <dbl>   <dbl>
##  1 Hands of Stone            2016      4     13569
##  2 The Bye Bye Man           2017      3     74904
##  3 Our Kind of Traitor       2016      3.5   24293
##  4 The Autopsy of Jane Doe   2016      4.5    4141
##  5 Dirty Grandpa             2016      3.5   30183
##  6 Arsenal                   2017      3.5   48952
##  7 The Light Between Oceans  2016      4     14328
##  8 Exposed                   2016      2.5   59359
##  9 Jason Bourne              2016      4     54765
## 10 Before I Fall             2017      3.5   82222

All ten sampled movies have far more than 30 reviews, but these are Rotten Tomatoes Verified Audience counts, and that user base may be larger than Fandango’s. We cannot say with confidence that these numbers are comparable to Fandango fan rating counts. In addition, time has passed since Hickey’s analysis, giving more fans an opportunity to submit reviews. So even if we still had access to Fandango’s 5-star fan ratings, we could not compare the number of fan ratings we see now with the number Hickey observed.

Let’s move on to the fandango_previous dataframe, which does include the number of fan ratings for each movie. The documentation states clearly that it contains only movies with at least 30 fan ratings, but it takes only a couple of seconds to double-check.

sum(fandango_previous$Fandango_votes < 30)
## [1] 0
head(fandango_previous$FILM, n = 10)
##  [1] "Avengers: Age of Ultron (2015)" "Cinderella (2015)"             
##  [3] "Ant-Man (2015)"                 "Do You Believe? (2015)"        
##  [5] "Hot Tub Time Machine 2 (2015)"  "The Water Diviner (2015)"      
##  [7] "Irrational Man (2015)"          "Top Five (2014)"               
##  [9] "Shaun the Sheep Movie (2015)"   "Love & Mercy (2015)"
unique(fandango_after$year)
## [1] 2016 2017
library(stringr)
fandango_previous <- fandango_previous %>% 
  mutate(year = str_sub(FILM , -5 , -2))
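
The year extraction above relies on every title ending in a four-digit year wrapped in parentheses. As a hedged sketch, the snippet below shows what the positional `str_sub()` call returns for two titles from the dataset and compares it with a pattern-based alternative using `str_match()`. The variable names are hypothetical, and the pattern-based version is only a suggestion for guarding against titles that do not follow the expected format.

```r
library(stringr)

# Two titles as they appear in the FILM column
films <- c("Avengers: Age of Ultron (2015)", "Top Five (2014)")

# Positional approach: characters 5 through 2 counted from the end
year_by_position <- str_sub(films, -5, -2)

# Pattern-based alternative: four digits inside trailing parentheses;
# titles that do not match would yield NA instead of arbitrary characters
year_by_pattern <- str_match(films, "\\((\\d{4})\\)$")[, 2]

# Optional: convert to integer so the type matches a numeric year column
year_as_integer <- as.integer(year_by_pattern)

data.frame(films, year_by_position, year_by_pattern, year_as_integer)
```

Both approaches return character values, so the new column is text unless it is converted explicitly.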

Examine the frequency distribution of the year column, then isolate the movies released in 2015.

fandango_previous %>% 
  group_by(year) %>%
  summarise(Freq = n())
## # A tibble: 2 × 2
##   year   Freq
##   <chr> <int>
## 1 2014     17
## 2 2015    129
table(fandango_previous$year)
## 
## 2014 2015 
##   17  129
fandango_2015 <- fandango_previous %>%
  filter(year == "2015")  # year is a character column here
table(fandango_2015$year)
## 
## 2015 
##  129
head(fandango_after)
## # A tibble: 6 × 3
##   movie                    year fandango
##   <chr>                   <dbl>    <dbl>
## 1 10 Cloverfield Lane      2016      3.5
## 2 13 Hours                 2016      4.5
## 3 A Cure for Wellness      2016      3  
## 4 A Dog's Purpose          2017      4.5
## 5 A Hologram for the King  2016      3  
## 6 A Monster Calls          2016      4
table(fandango_after$year)
## 
## 2016 2017 
##  191   23
fandango_2016 <- fandango_after %>% 
  filter(year == 2016)
table(fandango_2016$year)
## 
## 2016 
##  191

Comparing Distribution Shapes for 2015 and 2016

library(ggplot2)
# 2015 dataframe is specified in the ggplot call
ggplot(data = fandango_2015,
       aes(x = Fandango_Stars)) +
  geom_density() +
  # 2016 dataframe is specified in the second geom_density() call
  geom_density(data = fandango_2016, 
               aes(x = fandango), color = "blue") +
  labs(title = "Comparing distribution shapes for Fandango's ratings\n(2015 vs 2016)",
       x = "Stars",
       y = "Density") +
  scale_x_continuous(breaks = seq(0, 5, by = 0.5), 
                     limits = c(0, 5))

The distributions reveal two key observations:

Left Skew: Both 2015 and 2016 ratings exhibit a strong left skew, indicating that Fandango movies generally receive high ratings. This pattern, combined with Fandango’s role as a ticket vendor, raises potential concerns about rating bias, though exploring this is beyond the scope of our current analysis.

Shift Leftward: The 2016 rating distribution is slightly shifted to the left compared to 2015. This suggests a potential, albeit small, decrease in average ratings for popular movies on Fandango between the two years.
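
The left skew read off the density plot can also be checked numerically. The sketch below, in base R, rebuilds the two rating vectors from the percentage tables reported further down in this write-up (assuming those percentages correspond to the whole-number counts used here) and computes a moment-based sample skewness. The helper `sample_skewness()` is hypothetical, not part of any package used in this project.

```r
# Ratings rebuilt from the frequency tables (assumed counts: 129 and 191 movies)
ratings_2015 <- rep(c(3, 3.5, 4, 4.5, 5), times = c(11, 23, 37, 49, 9))
ratings_2016 <- rep(c(2.5, 3, 3.5, 4, 4.5, 5), times = c(6, 14, 46, 77, 47, 1))

# Moment-based skewness: third central moment divided by variance^(3/2)
sample_skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

skew_2015 <- sample_skewness(ratings_2015)
skew_2016 <- sample_skewness(ratings_2016)

# Negative values indicate a longer tail toward low ratings (left skew)
c(`2015` = skew_2015, `2016` = skew_2016)
```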

fandango_2015 %>% 
  group_by(Fandango_Stars) %>% 
  summarize(Percentage = n() / nrow(fandango_2015) * 100)
## # A tibble: 5 × 2
##   Fandango_Stars Percentage
##            <dbl>      <dbl>
## 1            3         8.53
## 2            3.5      17.8 
## 3            4        28.7 
## 4            4.5      38.0 
## 5            5         6.98
fandango_2016 %>% 
  group_by(fandango) %>% 
  summarize(Percentage = n() / nrow(fandango_2016) * 100)
## # A tibble: 6 × 2
##   fandango Percentage
##      <dbl>      <dbl>
## 1      2.5      3.14 
## 2      3        7.33 
## 3      3.5     24.1  
## 4      4       40.3  
## 5      4.5     24.6  
## 6      5        0.524

In 2016, very high ratings (4.5 and 5 stars) had lower percentages than in 2015. Under 1% of the 2016 movies had a perfect rating of 5 stars, compared with close to 7% in 2015. Ratings of 4.5 were also more common in 2015: the share of movies rated 4.5 was about 13 percentage points higher in 2015 (38.0%) than in 2016 (24.6%).

The minimum rating is also lower in 2016: 2.5 stars instead of 3 stars, the 2015 minimum. There is clearly a difference between the two frequency distributions.

For some other ratings, the percentage went up in 2016. A greater share of movies received 3.5 and 4 stars in 2016 than in 2015. Since 3.5 and 4.0 are still high ratings, this complicates the direction of the change we saw on the kernel density plots.
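
Reading two separate tables side by side is error-prone, so one option is to join them and compute the change in percentage points directly. This is a minimal sketch using dplyr. The counts are reconstructed from the percentages reported above (an assumption), and the names `freq_2015`, `freq_2016`, and `comparison` are hypothetical.

```r
library(dplyr)

# Counts per star value, reconstructed from the percentage tables (assumed)
freq_2015 <- tibble(stars = c(3, 3.5, 4, 4.5, 5),
                    n_2015 = c(11, 23, 37, 49, 9))
freq_2016 <- tibble(stars = c(2.5, 3, 3.5, 4, 4.5, 5),
                    n_2016 = c(6, 14, 46, 77, 47, 1))

comparison <- full_join(freq_2015, freq_2016, by = "stars") %>%
  # A star value absent in one year counts as zero movies
  mutate(n_2015 = coalesce(n_2015, 0),
         n_2016 = coalesce(n_2016, 0),
         pct_2015 = n_2015 / sum(n_2015) * 100,
         pct_2016 = n_2016 / sum(n_2016) * 100,
         # Positive values mean the rating became more common in 2016
         diff_pp = pct_2016 - pct_2015) %>%
  arrange(stars)

comparison
```

The `diff_pp` column is negative for 4.5 and 5 stars and positive for 3.5 and 4 stars, which matches the mixed picture described in the text.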

Determining the Direction of the Change

Let’s take a couple of summary metrics to get a more precise picture about the direction of the change. In what follows, we’ll compute the mean, the median, and the mode for both distributions and then use a bar graph to plot the values.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Mode function adapted from Stack Overflow
# (note: defining it as `mode` masks base::mode() in this session)
mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

summary_2015 <- fandango_2015 %>% 
  summarize(year = "2015",
    mean = mean(Fandango_Stars),
    median = median(Fandango_Stars),
    mode = mode(Fandango_Stars))

summary_2016 <- fandango_2016 %>% 
  summarize(year = "2016",
            mean = mean(fandango),
            median = median(fandango),
            mode = mode(fandango))

# Combine 2015 & 2016 summary dataframes
summary_df <- bind_rows(summary_2015, summary_2016)

# Gather combined dataframe into a format ready for ggplot
summary_df <- summary_df %>% 
  gather(key = "statistic", value = "value", - year)

summary_df
## # A tibble: 6 × 3
##   year  statistic value
##   <chr> <chr>     <dbl>
## 1 2015  mean       4.09
## 2 2016  mean       3.89
## 3 2015  median     4   
## 4 2016  median     4   
## 5 2015  mode       4.5 
## 6 2016  mode       4
ggplot(data = summary_df, aes(x = statistic, y = value, fill = year)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Comparing summary statistics: 2015 vs 2016",
       x = "",
       y = "Stars")

The mean rating was lower in 2016 by approximately 0.2 stars, a drop of almost 5% relative to the mean rating in 2015.

means <- summary_df %>% 
  filter(statistic == "mean")

means %>% 
  summarize(change = (value[1] - value[2]) / value[1])
## # A tibble: 1 × 1
##   change
##    <dbl>
## 1 0.0484

While the median is the same for both distributions, the mode is lower in 2016 by 0.5. Coupled with what we saw for the mean, the direction of the change we saw on the kernel density plot is confirmed: on average, popular movies released in 2016 were rated slightly lower than popular movies released in 2015.
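
The comparison so far is purely descriptive. As a hedged sketch of how one might ask whether a gap of about 0.2 stars could plausibly arise by chance, the base R code below runs a simple two-sided permutation test on the difference in means. It is not part of the original analysis. The rating vectors are reconstructed from the frequency tables above (an assumption), and because neither dataset is a random sample of all Fandango movies, any resulting p-value should be read as indicative only.

```r
# Ratings rebuilt from the frequency tables (assumed counts)
ratings_2015 <- rep(c(3, 3.5, 4, 4.5, 5), times = c(11, 23, 37, 49, 9))
ratings_2016 <- rep(c(2.5, 3, 3.5, 4, 4.5, 5), times = c(6, 14, 46, 77, 47, 1))

observed_diff <- mean(ratings_2015) - mean(ratings_2016)

pooled <- c(ratings_2015, ratings_2016)
idx_2015 <- seq_along(ratings_2015)
n_perm <- 5000

set.seed(1)
perm_diffs <- replicate(n_perm, {
  # Shuffle the year labels by permuting the pooled ratings
  shuffled <- sample(pooled)
  mean(shuffled[idx_2015]) - mean(shuffled[-idx_2015])
})

# Two-sided p-value with the usual +1 correction so it is never exactly zero
p_value <- (sum(abs(perm_diffs) >= abs(observed_diff)) + 1) / (n_perm + 1)

c(observed_diff = observed_diff, p_value = p_value)
```

A small p-value would suggest that the year labels matter for the mean rating, but it would not by itself say why the ratings differ.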

Conclusion

Our analysis showed that there’s indeed a slight difference between Fandango’s ratings for popular movies in 2015 and Fandango’s ratings for popular movies in 2016. We also determined that, on average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.

We cannot be completely sure what caused the change, but the chances are very high that it was caused by Fandango fixing the biased rating system after Hickey’s analysis.