Introduction

Fandango Media, LLC is an American ticketing company that sells movie tickets and provides television and streaming media information. Fandango’s website also offers exclusive film clips, trailers, celebrity interviews, user reviews, movie descriptions, and some web-based games to Fandango members.

In October 2015, a data journalist named Walt Hickey analyzed movie ratings data and found strong evidence to suggest that Fandango’s rating system was biased. Fandango displays a 5-star rating system on its website, where the minimum rating is 0 stars and the maximum is 5 stars.

Hickey found a significant discrepancy between the number of stars displayed to users and the actual rating: the displayed value was almost always the actual rating rounded up to the nearest half-star.
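To make the rounding concrete, here is a minimal base R sketch that contrasts conventional rounding to the nearest half-star with always rounding up. The ratings vector and both helper functions are hypothetical illustrations, not Fandango's actual code.

```r
# Hypothetical average ratings, chosen only for illustration
actual <- c(4.1, 4.2, 4.5, 3.6)

# Conventional rounding to the nearest half-star
round_nearest_half <- function(x) round(x * 2) / 2

# Always rounding up to the next half-star (the pattern described above)
round_up_half <- function(x) ceiling(x * 2) / 2

# Side-by-side comparison of the two rounding rules
data.frame(actual,
           nearest    = round_nearest_half(actual),
           rounded_up = round_up_half(actual))
```

Under conventional rounding a 4.1 would display as 4 stars, while rounding up displays it as 4.5 stars.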

In this project, I analyze more recent movie ratings data to determine whether there has been any change in Fandango’s rating system after Hickey’s analysis.

Understanding the data

One of the best ways to determine whether Fandango’s rating system changed after Hickey’s analysis is to compare its characteristics before and after the analysis. Fortunately, there is ready-made data for both periods: fandango_score_comparison.txt, the data used for Hickey’s analysis, and movie_ratings_16_17.txt, which covers movies released in 2016 and 2017.

I start by reading in both data sets and getting familiar with their structure.

library(readr)
library(tibble)

previous <- read_csv('fandango_score_comparison.txt')
after    <- read_csv('movie_ratings_16_17.txt')

glimpse(previous)
## Rows: 146
## Columns: 22
## $ FILM                       <chr> "Avengers: Age of Ultron (2015)", "Cinderel~
## $ RottenTomatoes             <dbl> 74, 85, 80, 18, 14, 63, 42, 86, 99, 89, 84,~
## $ RottenTomatoes_User        <dbl> 86, 80, 90, 84, 28, 62, 53, 64, 82, 87, 77,~
## $ Metacritic                 <dbl> 66, 67, 64, 22, 29, 50, 53, 81, 81, 80, 71,~
## $ Metacritic_User            <dbl> 7.1, 7.5, 8.1, 4.7, 3.4, 6.8, 7.6, 6.8, 8.8~
## $ IMDB                       <dbl> 7.8, 7.1, 7.8, 5.4, 5.1, 7.2, 6.9, 6.5, 7.4~
## $ Fandango_Stars             <dbl> 5.0, 5.0, 5.0, 5.0, 3.5, 4.5, 4.0, 4.0, 4.5~
## $ Fandango_Ratingvalue       <dbl> 4.5, 4.5, 4.5, 4.5, 3.0, 4.0, 3.5, 3.5, 4.0~
## $ RT_norm                    <dbl> 3.70, 4.25, 4.00, 0.90, 0.70, 3.15, 2.10, 4~
## $ RT_user_norm               <dbl> 4.30, 4.00, 4.50, 4.20, 1.40, 3.10, 2.65, 3~
## $ Metacritic_norm            <dbl> 3.30, 3.35, 3.20, 1.10, 1.45, 2.50, 2.65, 4~
## $ Metacritic_user_nom        <dbl> 3.55, 3.75, 4.05, 2.35, 1.70, 3.40, 3.80, 3~
## $ IMDB_norm                  <dbl> 3.90, 3.55, 3.90, 2.70, 2.55, 3.60, 3.45, 3~
## $ RT_norm_round              <dbl> 3.5, 4.5, 4.0, 1.0, 0.5, 3.0, 2.0, 4.5, 5.0~
## $ RT_user_norm_round         <dbl> 4.5, 4.0, 4.5, 4.0, 1.5, 3.0, 2.5, 3.0, 4.0~
## $ Metacritic_norm_round      <dbl> 3.5, 3.5, 3.0, 1.0, 1.5, 2.5, 2.5, 4.0, 4.0~
## $ Metacritic_user_norm_round <dbl> 3.5, 4.0, 4.0, 2.5, 1.5, 3.5, 4.0, 3.5, 4.5~
## $ IMDB_norm_round            <dbl> 4.0, 3.5, 4.0, 2.5, 2.5, 3.5, 3.5, 3.5, 3.5~
## $ Metacritic_user_vote_count <dbl> 1330, 249, 627, 31, 88, 34, 17, 124, 62, 54~
## $ IMDB_user_vote_count       <dbl> 271107, 65709, 103660, 3136, 19560, 39373, ~
## $ Fandango_votes             <dbl> 14846, 12640, 12055, 1793, 1021, 397, 252, ~
## $ Fandango_Difference        <dbl> 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5~
glimpse(after)
## Rows: 214
## Columns: 15
## $ movie        <chr> "10 Cloverfield Lane", "13 Hours", "A Cure for Wellness",~
## $ year         <dbl> 2016, 2016, 2016, 2017, 2016, 2016, 2016, 2016, 2016, 201~
## $ metascore    <dbl> 76, 48, 47, 43, 58, 76, 54, 34, 60, 38, 59, 53, 81, 25, 3~
## $ imdb         <dbl> 7.2, 7.3, 6.6, 5.2, 6.1, 7.5, 7.4, 6.2, 7.1, 5.0, 7.2, 4.~
## $ tmeter       <dbl> 90, 50, 40, 33, 70, 87, 77, 30, 61, 0, 66, 45, 94, 4, 17,~
## $ audience     <dbl> 79, 83, 47, 76, 57, 84, 79, 50, 66, 27, 71, 16, 82, 22, 5~
## $ fandango     <dbl> 3.5, 4.5, 3.0, 4.5, 3.0, 4.0, 4.5, 4.0, 4.0, 3.5, 4.0, 3.~
## $ n_metascore  <dbl> 3.80, 2.40, 2.35, 2.15, 2.90, 3.80, 2.70, 1.70, 3.00, 1.9~
## $ n_imdb       <dbl> 3.60, 3.65, 3.30, 2.60, 3.05, 3.75, 3.70, 3.10, 3.55, 2.5~
## $ n_tmeter     <dbl> 4.50, 2.50, 2.00, 1.65, 3.50, 4.35, 3.85, 1.50, 3.05, 0.0~
## $ n_audience   <dbl> 3.95, 4.15, 2.35, 3.80, 2.85, 4.20, 3.95, 2.50, 3.30, 1.3~
## $ nr_metascore <dbl> 4.0, 2.5, 2.5, 2.0, 3.0, 4.0, 2.5, 1.5, 3.0, 2.0, 3.0, 2.~
## $ nr_imdb      <dbl> 3.5, 3.5, 3.5, 2.5, 3.0, 4.0, 3.5, 3.0, 3.5, 2.5, 3.5, 2.~
## $ nr_tmeter    <dbl> 4.5, 2.5, 2.0, 1.5, 3.5, 4.5, 4.0, 1.5, 3.0, 0.0, 3.5, 2.~
## $ nr_audience  <dbl> 4.0, 4.0, 2.5, 4.0, 3.0, 4.0, 4.0, 2.5, 3.5, 1.5, 3.5, 1.~

To focus only on the columns needed for this analysis, I will separate the relevant variables into a new dataframe.

library(dplyr)
fandango_previous <- previous %>% select('FILM', 'Fandango_Stars', 'Fandango_Ratingvalue', 'Fandango_votes', 'Fandango_Difference')
fandango_after <- after %>% select('movie', 'year', 'fandango')

glimpse(fandango_previous)
## Rows: 146
## Columns: 5
## $ FILM                 <chr> "Avengers: Age of Ultron (2015)", "Cinderella (20~
## $ Fandango_Stars       <dbl> 5.0, 5.0, 5.0, 5.0, 3.5, 4.5, 4.0, 4.0, 4.5, 4.5,~
## $ Fandango_Ratingvalue <dbl> 4.5, 4.5, 4.5, 4.5, 3.0, 4.0, 3.5, 3.5, 4.0, 4.0,~
## $ Fandango_votes       <dbl> 14846, 12640, 12055, 1793, 1021, 397, 252, 3223, ~
## $ Fandango_Difference  <dbl> 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,~
glimpse(fandango_after)
## Rows: 214
## Columns: 3
## $ movie    <chr> "10 Cloverfield Lane", "13 Hours", "A Cure for Wellness", "A ~
## $ year     <dbl> 2016, 2016, 2016, 2017, 2016, 2016, 2016, 2016, 2016, 2016, 2~
## $ fandango <dbl> 3.5, 4.5, 3.0, 4.5, 3.0, 4.0, 4.5, 4.0, 4.0, 3.5, 4.0, 3.5, 4~

The goal of the analysis is to reach a conclusion about the fairness of Fandango ratings. The population of interest is all the movies on Fandango’s website, irrespective of release year. For the conclusion to be valid, the sample has to be representative of that population.

The sampling criteria used for Hickey’s analysis are:

The sampling criteria used in obtaining the data for 2016 and 2017 are:

The above information was obtained from the README.md of each data set’s GitHub repository. Neither sample appears to be random. In the earlier data set, movies with fewer than 30 fan ratings were not considered at all, and for the second data set it is unclear how many votes and reviews a movie needed in order to be included.

Changing the goal of the analysis

As the data is not representative, I can either change the goal of the analysis or obtain new data. I will choose the former and place limitations on the initial goal.

Goal: To determine if there is any difference between Fandango’s ratings for popular movies in 2015 and in 2016.

Isolating the Samples Needed

First, I will check whether the data sets contain only popular movies, meaning movies with at least 30 fan ratings on Fandango.

sum(fandango_previous$Fandango_votes < 30)
## [1] 0
set.seed(1)
sample_n(fandango_after, size = 10)
## # A tibble: 10 x 3
##    movie                     year fandango
##    <chr>                    <dbl>    <dbl>
##  1 Hands of Stone            2016      4  
##  2 The Bye Bye Man           2017      3  
##  3 Our Kind of Traitor       2016      3.5
##  4 The Autopsy of Jane Doe   2016      4.5
##  5 Dirty Grandpa             2016      3.5
##  6 Arsenal                   2017      3.5
##  7 The Light Between Oceans  2016      4  
##  8 Exposed                   2016      2.5
##  9 Jason Bourne              2016      4  
## 10 Before I Fall             2017      3.5

After checking the number of fan ratings for the movies above, I discovered that, as of August 2019, Fandango no longer uses the 5-star fan ratings described above. Instead, Fandango now uses the Rotten Tomatoes verified Audience Score. The fan rating counts I found on Rotten Tomatoes are used below:

set.seed(1)
sampled <- sample_n(fandango_after, size = 10)

#Single column tibble of Rotten Tomatoes review counts
rt_reviews <- tibble(reviews = c(13569, 74904, 24293, 4141, 30183, 48952, 14328, 59359, 54765, 82222))

bind_cols(sampled, rt_reviews)
## # A tibble: 10 x 4
##    movie                     year fandango reviews
##    <chr>                    <dbl>    <dbl>   <dbl>
##  1 Hands of Stone            2016      4     13569
##  2 The Bye Bye Man           2017      3     74904
##  3 Our Kind of Traitor       2016      3.5   24293
##  4 The Autopsy of Jane Doe   2016      4.5    4141
##  5 Dirty Grandpa             2016      3.5   30183
##  6 Arsenal                   2017      3.5   48952
##  7 The Light Between Oceans  2016      4     14328
##  8 Exposed                   2016      2.5   59359
##  9 Jason Bourne              2016      4     54765
## 10 Before I Fall             2017      3.5   82222

All sampled movies have well above 30 fan ratings, but it is important to note that the Rotten Tomatoes user base may be larger than Fandango’s and that more ratings may have been posted since the data was collected.

Next, I will check if the release years are in fact 2015 and 2016. In the first data set for 2015 movies, the year is not explicitly present as a separate column. It can be extracted from the FILM column.

head(fandango_previous$FILM, 10)
##  [1] "Avengers: Age of Ultron (2015)" "Cinderella (2015)"             
##  [3] "Ant-Man (2015)"                 "Do You Believe? (2015)"        
##  [5] "Hot Tub Time Machine 2 (2015)"  "The Water Diviner (2015)"      
##  [7] "Irrational Man (2015)"          "Top Five (2014)"               
##  [9] "Shaun the Sheep Movie (2015)"   "Love & Mercy (2015)"
library(stringr)
fandango_previous <- fandango_previous %>% mutate(Year = str_sub(FILM, -5, -2))
head(fandango_previous, 10)
## # A tibble: 10 x 6
##    FILM    Fandango_Stars Fandango_Rating~ Fandango_votes Fandango_Differ~ Year 
##    <chr>            <dbl>            <dbl>          <dbl>            <dbl> <chr>
##  1 Avenge~            5                4.5          14846              0.5 2015 
##  2 Cinder~            5                4.5          12640              0.5 2015 
##  3 Ant-Ma~            5                4.5          12055              0.5 2015 
##  4 Do You~            5                4.5           1793              0.5 2015 
##  5 Hot Tu~            3.5              3             1021              0.5 2015 
##  6 The Wa~            4.5              4              397              0.5 2015 
##  7 Irrati~            4                3.5            252              0.5 2015 
##  8 Top Fi~            4                3.5           3223              0.5 2014 
##  9 Shaun ~            4.5              4              896              0.5 2015 
## 10 Love &~            4.5              4              864              0.5 2015
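The negative indices passed to str_sub() count from the end of the string, so positions -5 to -2 pick out the four characters just before the closing parenthesis. The sketch below reproduces that in base R on two titles copied from the FILM column, and adds a stricter variant that is an illustration only: it assumes a title should yield a year only when it really ends in "(YYYY)".

```r
# Two titles copied from the FILM column above
film <- c("Avengers: Age of Ultron (2015)", "Top Five (2014)")

# Base R equivalent of str_sub(film, -5, -2): the four characters
# that precede the closing parenthesis
n <- nchar(film)
substr(film, n - 4, n - 1)

# Stricter alternative: extract the digits only when the title ends in
# "(YYYY)"; titles without that pattern are left as NA
year <- rep(NA_character_, length(film))
hit <- grepl("\\([0-9]{4}\\)$", film)
year[hit] <- sub(".*\\(([0-9]{4})\\)$", "\\1", film[hit])
year
```

The positional version is shorter, but it silently returns wrong characters for a title that lacks the trailing year, which is why a pattern check can be worth the extra lines.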

Next, I check the frequency distribution of the year column in each data set, then keep only the 2015 rows from the first and the 2016 rows from the second.

fandango_previous %>% group_by(Year) %>% summarise(Freq = n())
## # A tibble: 2 x 2
##   Year   Freq
##   <chr> <int>
## 1 2014     17
## 2 2015    129
fandango_2015 <- fandango_previous %>% filter(Year == "2015")
unique(fandango_2015$Year)
## [1] "2015"
fandango_after %>% group_by(year) %>% summarise(Freq = n())
## # A tibble: 2 x 2
##    year  Freq
##   <dbl> <int>
## 1  2016   191
## 2  2017    23
fandango_2016 <- fandango_after %>% filter(year == 2016)
unique(fandango_2016$year)
## [1] 2016

Comparing Distribution Shapes for 2015 and 2016

I will use a simple kernel density plot to compare the distribution of movie ratings for the two years.

library(ggplot2)
# 2015 dataframe is specified in the ggplot call
ggplot(data = fandango_2015, 
               aes(x = Fandango_Stars)) +
  geom_density() +
  # 2016 dataframe is specified in the second geom_density() call
  geom_density(data = fandango_2016, 
               aes(x = fandango), color = "blue") +
  labs(title = "Comparing distribution shapes for Fandango's ratings\n(2015 vs 2016)",
       x = "Stars",
       y = "Density") +
  scale_x_continuous(breaks = seq(0, 5, by = 0.5), 
                     limits = c(0, 5))

Two aspects of the figure above are striking:

Both distributions are left skewed, which suggests that movies on Fandango are given mostly high and very high fan ratings.

The 2016 distribution is shifted slightly to the left of the 2015 distribution, which is the interesting part for this analysis. It suggests that there was indeed a difference between Fandango’s ratings for popular movies in 2015 and in 2016, and the direction of the difference is also clear: ratings in 2016 were slightly lower than in 2015.

fandango_2015 %>% 
  group_by(Fandango_Stars) %>% 
  summarize(Percentage = n() / nrow(fandango_2015) * 100)
## # A tibble: 5 x 2
##   Fandango_Stars Percentage
##            <dbl>      <dbl>
## 1            3         8.53
## 2            3.5      17.8 
## 3            4        28.7 
## 4            4.5      38.0 
## 5            5         6.98
fandango_2016 %>% 
  group_by(fandango) %>% 
  summarize(Percentage = n() / nrow(fandango_2016) * 100)
## # A tibble: 6 x 2
##   fandango Percentage
##      <dbl>      <dbl>
## 1      2.5      3.14 
## 2      3        7.33 
## 3      3.5     24.1  
## 4      4       40.3  
## 5      4.5     24.6  
## 6      5        0.524

In 2016, very high ratings (4.5 and 5 stars) had lower percentages compared to 2015. In 2016, under 1% of the movies had a perfect rating of 5 stars, compared to 2015 when the percentage was close to 7%. Ratings of 4.5 were also more common in 2015: the share of movies rated 4.5 was approximately 13 percentage points higher in 2015 than in 2016.

The minimum rating is also lower in 2016: 2.5 stars instead of 3, the 2015 minimum. There is clearly a difference between the two frequency distributions.
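The gap between two percentages is measured in percentage points, which is different from a relative change. A small worked sketch, using the values copied from the two frequency tables above (the variable names are illustrative):

```r
# Percentages copied from the two frequency tables above
pct_2015 <- c("4.5" = 38.0, "5" = 6.98)
pct_2016 <- c("4.5" = 24.6, "5" = 0.524)

# Absolute change, in percentage points
pct_2015 - pct_2016

# Relative change, as a percentage of the 2015 share
(pct_2015 - pct_2016) / pct_2015 * 100
```

For 4.5-star ratings the absolute gap is 13.4 percentage points, which corresponds to a relative drop of about 35% from the 2015 share.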

For some other ratings, the percentage went up in 2016: a greater percentage of movies received 3.5 and 4 stars than in 2015. Since 3.5 and 4 stars are still high ratings, this challenges the direction of the change seen on the kernel density plot.

Determining the Direction of the Change

I will compute a few summary metrics to get a more precise picture of the direction of the change. In what follows, I compute the mean, the median, and the mode for both distributions and then use a bar graph to plot the values.

library(tidyr)
# Mode function from stackoverflow
mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

summary_2015 <- fandango_2015 %>% 
  summarize(year = "2015",
    mean = mean(Fandango_Stars),
    median = median(Fandango_Stars),
    mode = mode(Fandango_Stars))

summary_2016 <- fandango_2016 %>% 
  summarize(year = "2016",
            mean = mean(fandango),
            median = median(fandango),
            mode = mode(fandango))

# Combine 2015 & 2016 summary dataframes
summary_df <- bind_rows(summary_2015, summary_2016)

# Gather combined dataframe into a format ready for ggplot
summary_df <- summary_df %>% 
  gather(key = "statistic", value = "value", - year)
summary_df
## # A tibble: 6 x 3
##   year  statistic value
##   <chr> <chr>     <dbl>
## 1 2015  mean       4.09
## 2 2016  mean       3.89
## 3 2015  median     4   
## 4 2016  median     4   
## 5 2015  mode       4.5 
## 6 2016  mode       4
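Since base R has no built-in function for the statistical mode, it may help to see how the mode() helper above arrives at its answer. The following sketch walks through its steps on a toy vector chosen only for illustration:

```r
# Toy ratings vector, chosen only for illustration
x <- c(4.0, 4.5, 4.0, 3.5, 4.0, 4.5)

ux     <- unique(x)       # distinct values, in order of first appearance
idx    <- match(x, ux)    # position of each rating within ux
counts <- tabulate(idx)   # number of occurrences of each distinct value
ux[which.max(counts)]     # the value with the highest count
```

Because which.max() returns the first maximum, a tie between two ratings is resolved in favour of whichever value appears first in the data.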

The data is ready to be plotted for an easy comparison.

ggplot(data = summary_df, aes(x = statistic, y = value, fill = year)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Comparing summary statistics: 2015 vs 2016",
       x = "",
       y = "Stars")

The mean rating was lower in 2016 by approximately 0.2 stars. This is a drop of almost 5% relative to the mean rating in 2015. The mode (the most frequently given rating) is lower in 2016 by 0.5 stars, while the median is the same in both years.
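The relative drop can be checked directly from the rounded means in the summary table. This is a quick sketch with the two values copied by hand, so the result is only as precise as that rounding:

```r
# Means copied from the summary table above (rounded to two decimals)
mean_2015 <- 4.09
mean_2016 <- 3.89

abs_drop <- mean_2015 - mean_2016        # difference in stars
rel_drop <- abs_drop / mean_2015 * 100   # drop relative to the 2015 mean, in percent

round(abs_drop, 2)
round(rel_drop, 1)
```

This gives a drop of 0.2 stars, or roughly 4.9% of the 2015 mean.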

Conclusion

The analysis showed that there is indeed a slight difference between Fandango’s ratings for popular movies in 2015 and in 2016. It also showed that, on average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.

The cause of the change cannot be stated with certainty, but it is plausible that it was caused by Fandango fixing the biased rating system after Hickey’s analysis.