This project investigates whether Fandango’s movie rating system changed following a 2015 analysis by Walt Hickey, which revealed a potential bias in their displayed ratings. Hickey’s analysis suggested that Fandango was rounding ratings upwards, potentially inflating the perceived quality of movies on their platform.
Our goal is to analyze movie rating data from before and after Hickey’s analysis to determine if there is statistical evidence to suggest that Fandango addressed the alleged bias in their rating system. By comparing the distributions of movie ratings before and after the reported fix, we aim to gain insights into the integrity of movie ratings on Fandango and assess the impact of data journalism on industry practices.
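To make the alleged bias concrete, the sketch below contrasts rounding a rating to the nearest half star with always rounding up to the next half star. The example ratings and the helper names `round_nearest_half` and `round_up_half` are illustrative assumptions, not values or functions taken from Hickey’s data or from Fandango’s system.

```r
# Hypothetical average ratings (illustrative values, not taken from either dataset)
ratings <- c(4.1, 3.6, 4.5)

# Round to the nearest half star
round_nearest_half <- function(x) round(x * 2) / 2

# Always round up to the next half star
round_up_half <- function(x) ceiling(x * 2) / 2

comparison <- data.frame(
  rating  = ratings,
  nearest = round_nearest_half(ratings),
  up      = round_up_half(ratings)
)

# Gap between the rounded-up display value and the underlying rating
comparison$inflation <- comparison$up - comparison$rating
print(comparison)
```

A rating that already sits on a half-star boundary is unaffected by either rule, so any inflation would come only from ratings that fall between two half-star steps.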
library(readr)
previous <- read_csv('fandango_score_comparison.csv')
after <- read_csv('movie_ratings_16_17.csv')
head(previous)
## # A tibble: 6 × 22
## FILM RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Avengers:… 74 86 66 7.1 7.8
## 2 Cinderell… 85 80 67 7.5 7.1
## 3 Ant-Man (… 80 90 64 8.1 7.8
## 4 Do You Be… 18 84 22 4.7 5.4
## 5 Hot Tub T… 14 28 29 3.4 5.1
## 6 The Water… 63 62 50 6.8 7.2
## # ℹ 16 more variables: Fandango_Stars <dbl>, Fandango_Ratingvalue <dbl>,
## # RT_norm <dbl>, RT_user_norm <dbl>, Metacritic_norm <dbl>,
## # Metacritic_user_nom <dbl>, IMDB_norm <dbl>, RT_norm_round <dbl>,
## # RT_user_norm_round <dbl>, Metacritic_norm_round <dbl>,
## # Metacritic_user_norm_round <dbl>, IMDB_norm_round <dbl>,
## # Metacritic_user_vote_count <dbl>, IMDB_user_vote_count <dbl>,
## # Fandango_votes <dbl>, Fandango_Difference <dbl>
head(after)
## # A tibble: 6 × 15
## movie year metascore imdb tmeter audience fandango n_metascore n_imdb
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 10 Cloverfi… 2016 76 7.2 90 79 3.5 3.8 3.6
## 2 13 Hours 2016 48 7.3 50 83 4.5 2.4 3.65
## 3 A Cure for … 2016 47 6.6 40 47 3 2.35 3.3
## 4 A Dog's Pur… 2017 43 5.2 33 76 4.5 2.15 2.6
## 5 A Hologram … 2016 58 6.1 70 57 3 2.9 3.05
## 6 A Monster C… 2016 76 7.5 87 84 4 3.8 3.75
## # ℹ 6 more variables: n_tmeter <dbl>, n_audience <dbl>, nr_metascore <dbl>,
## # nr_imdb <dbl>, nr_tmeter <dbl>, nr_audience <dbl>
Select only the columns used in the analysis.
library(dplyr)
fandango_previous <- previous %>%
  select(FILM, Fandango_Stars, Fandango_Ratingvalue, Fandango_votes, Fandango_Difference)
fandango_after <- after %>%
  select(movie, year, fandango)
head(fandango_previous)
## # A tibble: 6 × 5
## FILM Fandango_Stars Fandango_Ratingvalue Fandango_votes Fandango_Difference
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Avenge… 5 4.5 14846 0.5
## 2 Cinder… 5 4.5 12640 0.5
## 3 Ant-Ma… 5 4.5 12055 0.5
## 4 Do You… 5 4.5 1793 0.5
## 5 Hot Tu… 3.5 3 1021 0.5
## 6 The Wa… 4.5 4 397 0.5
head(fandango_after)
## # A tibble: 6 × 3
## movie year fandango
## <chr> <dbl> <dbl>
## 1 10 Cloverfield Lane 2016 3.5
## 2 13 Hours 2016 4.5
## 3 A Cure for Wellness 2016 3
## 4 A Dog's Purpose 2017 4.5
## 5 A Hologram for the King 2016 3
## 6 A Monster Calls 2016 4
Refining Our Goal: Popular Movies in 2015 vs. 2016
Our initial goal was to assess changes in Fandango’s rating system after Hickey’s analysis. However, the available datasets, sampled with specific criteria (minimum ratings, release year), aren’t representative of all Fandango ratings. This limits our ability to draw broad conclusions.
Therefore, we’ll adjust our focus to compare:
Fandango ratings for popular movies released in 2015.
Fandango ratings for popular movies released in 2016.
We define “popular” as having at least 30 fan ratings on Fandango. While this is a narrower scope, it still provides insight into potential shifts in Fandango’s practices.
A Caveat Regarding the 2016 Sample: The 2016 dataset lacks explicit fan rating counts. To check that this sample really consists of “popular” movies (at least 30 ratings), we’ll randomly select 10 movies and manually look up how many audience ratings each one has. The counts recorded below come from Rotten Tomatoes’ Verified Audience reviews rather than from Fandango itself. This helps us judge the sample’s suitability for our revised analysis.
set.seed(1)
sample_n(fandango_after, size = 10)
## # A tibble: 10 × 3
## movie year fandango
## <chr> <dbl> <dbl>
## 1 Hands of Stone 2016 4
## 2 The Bye Bye Man 2017 3
## 3 Our Kind of Traitor 2016 3.5
## 4 The Autopsy of Jane Doe 2016 4.5
## 5 Dirty Grandpa 2016 3.5
## 6 Arsenal 2017 3.5
## 7 The Light Between Oceans 2016 4
## 8 Exposed 2016 2.5
## 9 Jason Bourne 2016 4
## 10 Before I Fall 2017 3.5
set.seed(1)
# Store the same 10-movie sample (named `sampled` to avoid masking base::sample())
sampled <- sample_n(fandango_after, size = 10)
# Single-column tibble of Rotten Tomatoes review counts, one per sampled movie
reviews <- tibble(reviews = c(13569, 74904, 24293, 4141, 30183, 48952, 14328, 59359, 54765, 82222))
bind_cols(sampled, reviews)
## # A tibble: 10 × 4
## movie year fandango reviews
## <chr> <dbl> <dbl> <dbl>
## 1 Hands of Stone 2016 4 13569
## 2 The Bye Bye Man 2017 3 74904
## 3 Our Kind of Traitor 2016 3.5 24293
## 4 The Autopsy of Jane Doe 2016 4.5 4141
## 5 Dirty Grandpa 2016 3.5 30183
## 6 Arsenal 2017 3.5 48952
## 7 The Light Between Oceans 2016 4 14328
## 8 Exposed 2016 2.5 59359
## 9 Jason Bourne 2016 4 54765
## 10 Before I Fall 2017 3.5 82222
All ten sampled movies have well over 30 ratings, but these counts come from Rotten Tomatoes’ Verified Audience, whose user base may be larger than Fandango’s. We cannot say with confidence whether these review counts are comparable to Fandango’s fan rating counts. In addition, time has passed since Hickey’s analysis, giving more fans an opportunity to submit reviews. So even if we still had access to Fandango’s 5-star fan ratings, we could not compare the number of fan ratings we see now with the number Hickey observed.
Let’s move on to the fandango_previous dataframe, which does include the number of fan ratings for each movie. The documentation states clearly that it contains only movies with at least 30 fan ratings, but it takes only a couple of seconds to double-check.
sum(fandango_previous$Fandango_votes < 30)
## [1] 0
head(fandango_previous$FILM, n = 10)
## [1] "Avengers: Age of Ultron (2015)" "Cinderella (2015)"
## [3] "Ant-Man (2015)" "Do You Believe? (2015)"
## [5] "Hot Tub Time Machine 2 (2015)" "The Water Diviner (2015)"
## [7] "Irrational Man (2015)" "Top Five (2014)"
## [9] "Shaun the Sheep Movie (2015)" "Love & Mercy (2015)"
unique(fandango_after$year)
## [1] 2016 2017
library(stringr)
fandango_previous <- fandango_previous %>%
  mutate(year = str_sub(FILM, -5, -2))
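The str_sub() call above relies on every title ending in a four-digit year in parentheses. A more defensive variant, sketched here in base R, pulls the year with a regular expression and returns NA when the pattern is absent. The helper name `extract_year` is hypothetical, and the last title in the example is invented purely to show the NA case.

```r
# Titles in the "Name (YYYY)" form used by the FILM column; the last entry is an
# invented title with no year, included only to show the NA case
films <- c("Avengers: Age of Ultron (2015)", "Top Five (2014)", "Untitled Film")

# Hypothetical helper: extract a trailing "(YYYY)" as an integer, or NA if absent
extract_year <- function(titles) {
  matches <- regmatches(titles, regexec("\\(([0-9]{4})\\)$", titles))
  years <- vapply(
    matches,
    function(m) if (length(m) == 2) m[2] else NA_character_,
    character(1)
  )
  as.integer(years)
}

years <- extract_year(films)
print(years)
```

Returning an integer also means the later comparison against 2015 is numeric rather than relying on R coercing the number to a string.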
Examine the frequency distribution of the year column, then isolate the movies released in 2015.
fandango_previous %>%
  group_by(year) %>%
  summarise(Freq = n())
## # A tibble: 2 × 2
## year Freq
## <chr> <int>
## 1 2014 17
## 2 2015 129
table(fandango_previous$year)
##
## 2014 2015
## 17 129
fandango_2015 <- fandango_previous %>%
  filter(year == 2015)
table(fandango_2015$year)
##
## 2015
## 129
head(fandango_after)
## # A tibble: 6 × 3
## movie year fandango
## <chr> <dbl> <dbl>
## 1 10 Cloverfield Lane 2016 3.5
## 2 13 Hours 2016 4.5
## 3 A Cure for Wellness 2016 3
## 4 A Dog's Purpose 2017 4.5
## 5 A Hologram for the King 2016 3
## 6 A Monster Calls 2016 4
table(fandango_after$year)
##
## 2016 2017
## 191 23
fandango_2016 <- fandango_after %>%
  filter(year == 2016)
table(fandango_2016$year)
##
## 2016
## 191
# Compare the shapes of the 2015 and 2016 rating distributions
library(ggplot2)
# The 2015 dataframe is specified in the ggplot() call
ggplot(data = fandango_2015,
       aes(x = Fandango_Stars)) +
  geom_density() +
  # The 2016 dataframe is specified in the second geom_density() call
  geom_density(data = fandango_2016,
               aes(x = fandango), color = "blue") +
  labs(title = "Comparing distribution shapes for Fandango's ratings\n(2015 vs 2016)",
       x = "Stars",
       y = "Density") +
  scale_x_continuous(breaks = seq(0, 5, by = 0.5),
                     limits = c(0, 5))
The distributions reveal two key observations:
Left Skew: Both 2015 and 2016 ratings exhibit a strong left skew, indicating that Fandango movies generally receive high ratings. This pattern, combined with Fandango’s role as a ticket vendor, raises potential concerns about rating bias, though exploring this is beyond the scope of our current analysis.
Shift Leftward: The 2016 rating distribution is slightly shifted to the left compared to 2015. This suggests a potential, albeit small, decrease in average ratings for popular movies on Fandango between the two years.
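The left skew described above can also be quantified. The sketch below computes a simple moment-based skewness, where negative values indicate a left skew. The per-rating counts are an assumption: they are back-calculated from the percentage tables in this report and the sample sizes of 129 and 191 movies, not read from the raw data, and the helper `sample_skewness` is hypothetical.

```r
# Star ratings rebuilt from counts back-calculated from the reported percentages
# (assumption: 129 movies for 2015 and 191 movies for 2016)
stars_2015 <- rep(c(3, 3.5, 4, 4.5, 5), times = c(11, 23, 37, 49, 9))
stars_2016 <- rep(c(2.5, 3, 3.5, 4, 4.5, 5), times = c(6, 14, 46, 77, 47, 1))

# Hypothetical helper: third central moment divided by the cubed standard deviation
sample_skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

skew_2015 <- sample_skewness(stars_2015)
skew_2016 <- sample_skewness(stars_2016)
print(c(`2015` = skew_2015, `2016` = skew_2016))
```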
fandango_2015 %>%
  group_by(Fandango_Stars) %>%
  summarize(Percentage = n() / nrow(fandango_2015) * 100)
## # A tibble: 5 × 2
## Fandango_Stars Percentage
## <dbl> <dbl>
## 1 3 8.53
## 2 3.5 17.8
## 3 4 28.7
## 4 4.5 38.0
## 5 5 6.98
fandango_2016 %>%
  group_by(fandango) %>%
  summarize(Percentage = n() / nrow(fandango_2016) * 100)
## # A tibble: 6 × 2
## fandango Percentage
## <dbl> <dbl>
## 1 2.5 3.14
## 2 3 7.33
## 3 3.5 24.1
## 4 4 40.3
## 5 4.5 24.6
## 6 5 0.524
In 2016, very high ratings (4.5 and 5 stars) made up a smaller share of movies than in 2015. Under 1% of the 2016 movies had a perfect 5-star rating, compared with close to 7% in 2015. Ratings of 4.5 were also more common in 2015: their share was about 13 percentage points higher than in 2016 (38.0% vs. 24.6%).
The minimum rating is also lower in 2016: 2.5 stars instead of 3 stars, the minimum in 2015. There is clearly a difference between the two frequency distributions.
For some other ratings, the percentage went up in 2016. A greater percentage of movies received 3.5 and 4 stars in 2016 than in 2015. Because 3.5 and 4.0 are still high ratings, this makes the direction of the change less clear-cut than it appeared on the kernel density plots.
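One way to see both effects at once is to line up the two percentage tables and subtract them. The sketch below does this in base R with the percentages copied from the tables above (2015 had no 2.5-star movies, so that share is entered as 0). The object names are illustrative.

```r
# Percentages copied from the two frequency tables in this report
stars    <- c(2.5, 3, 3.5, 4, 4.5, 5)
pct_2015 <- c(0, 8.53, 17.8, 28.7, 38.0, 6.98)
pct_2016 <- c(3.14, 7.33, 24.1, 40.3, 24.6, 0.524)

# Difference in percentage points (2016 minus 2015)
shift <- data.frame(
  stars     = stars,
  pct_2015  = pct_2015,
  pct_2016  = pct_2016,
  pp_change = pct_2016 - pct_2015
)
print(shift)

# Share of movies rated 4.5 stars or higher in each year
high_2015 <- sum(pct_2015[stars >= 4.5])
high_2016 <- sum(pct_2016[stars >= 4.5])
print(c(high_2015 = high_2015, high_2016 = high_2016))
```

Positive values in `pp_change` mark ratings that became more common in 2016. The gains at 3.5 and 4 stars come alongside losses at 4.5 and 5 stars.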
Determining the Direction of the Change
Let’s compute a few summary statistics to get a more precise picture of the direction of the change. In what follows, we’ll compute the mean, the median, and the mode of both distributions and then plot the values in a bar graph.
library(tidyverse)
# Mode function from Stack Overflow
mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
summary_2015 <- fandango_2015 %>%
  summarize(year = "2015",
            mean = mean(Fandango_Stars),
            median = median(Fandango_Stars),
            mode = mode(Fandango_Stars))
summary_2016 <- fandango_2016 %>%
  summarize(year = "2016",
            mean = mean(fandango),
            median = median(fandango),
            mode = mode(fandango))
# Combine 2015 & 2016 summary dataframes
summary_df <- bind_rows(summary_2015, summary_2016)
# Gather combined dataframe into a format ready for ggplot
summary_df <- summary_df %>%
  gather(key = "statistic", value = "value", -year)
summary_df
## # A tibble: 6 × 3
## year statistic value
## <chr> <chr> <dbl>
## 1 2015 mean 4.09
## 2 2016 mean 3.89
## 3 2015 median 4
## 4 2016 median 4
## 5 2015 mode 4.5
## 6 2016 mode 4
ggplot(data = summary_df, aes(x = statistic, y = value, fill = year)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Comparing summary statistics: 2015 vs 2016",
       x = "",
       y = "Stars")
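A note on the mode helper used above: it counts how often each distinct value occurs and returns the most frequent one, and which.max() breaks ties in favour of the value that appears first in the data. The sketch below uses the same logic under the hypothetical name `stat_mode` (chosen here so that it does not mask base R’s `mode()`), with made-up input vectors.

```r
# Same logic as the helper above, under a hypothetical name that avoids masking base::mode()
stat_mode <- function(x) {
  ux <- unique(x)                    # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux))   # how often each distinct value occurs
  ux[which.max(counts)]              # first value with the highest count
}

# Made-up ratings for illustration
clear_mode <- stat_mode(c(3, 4, 4, 4.5))    # 4 occurs twice, the others once
tied_mode  <- stat_mode(c(4, 4.5, 4.5, 4))  # 4 and 4.5 tie; 4 appears first
print(c(clear_mode, tied_mode))
```

With tied counts the result depends on row order, which is worth keeping in mind when two ratings are almost equally common.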
The mean rating was about 0.2 stars lower in 2016, a drop of almost 5% relative to the mean rating in 2015.
means <- summary_df %>%
  filter(statistic == "mean")
means %>%
  summarize(change = (value[1] - value[2]) / value[1])
## # A tibble: 1 × 1
## change
## <dbl>
## 1 0.0484
While the median is the same for both distributions, the mode is lower in 2016 by 0.5. Coupled with what we saw for the mean, the direction of the change we saw on the kernel density plot is confirmed: on average, popular movies released in 2016 were rated slightly lower than popular movies released in 2015.
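Whether a gap of about 0.2 stars is larger than chance alone would produce can be checked with a permutation test on the difference in means. The sketch below is an illustration rather than part of the original analysis: the ratings are rebuilt from counts back-calculated from the percentage tables (assuming 129 and 191 movies), the number of permutations is arbitrary, and even a small p-value would say nothing about why the ratings changed.

```r
# Ratings rebuilt from counts back-calculated from the reported percentages (assumption)
stars_2015 <- rep(c(3, 3.5, 4, 4.5, 5), times = c(11, 23, 37, 49, 9))
stars_2016 <- rep(c(2.5, 3, 3.5, 4, 4.5, 5), times = c(6, 14, 46, 77, 47, 1))

observed_diff <- mean(stars_2015) - mean(stars_2016)

# Shuffle the year labels and recompute the difference in means each time
set.seed(1)
pooled <- c(stars_2015, stars_2016)
n_2015 <- length(stars_2015)
perm_diffs <- replicate(5000, {
  shuffled <- sample(pooled)
  mean(shuffled[seq_len(n_2015)]) - mean(shuffled[-seq_len(n_2015)])
})

# Two-sided p-value: share of shuffles at least as extreme as the observed gap
p_value <- mean(abs(perm_diffs) >= abs(observed_diff))
print(c(observed_diff = observed_diff, p_value = p_value))
```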
Our analysis showed that there’s indeed a slight difference between Fandango’s ratings for popular movies in 2015 and Fandango’s ratings for popular movies in 2016. We also determined that, on average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.
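We cannot be sure what caused the change. The most likely explanation is that Fandango fixed its biased rating system after Hickey’s analysis, but because neither sample is representative of all Fandango ratings, this remains an inference rather than a demonstrated cause.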