1 Overview

This report tidies and transforms a FiveThirtyEight dataset from Walt Hickey’s 2015 investigation of Fandango’s movie ratings. The article found evidence that displayed star ratings on Fandango were systematically higher than the underlying averages. We demonstrate reproducible data import, rigorous cleaning, summary statistics, and inflation comparisons across sites, and we discuss the limitations of the sample.
Article: Be Suspicious Of Online Movie Ratings, Especially Fandango’s.

2 Reproducible data access

All data are loaded directly from public GitHub URLs in FiveThirtyEight’s data repository—no local files—so the workflow is fully reproducible.

library(tidyverse)
library(janitor)
library(readr)
library(glue)
library(knitr)
library(scales)
library(httr)

This section defines the direct URLs for the raw data files on GitHub and checks that they are accessible before proceeding, so the analysis never depends on local files. If a URL is unreachable, the process stops with an informative error message.

# Steps 1-2: Define source URLs and check connectivity (graceful failure)
url_comparison <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_score_comparison.csv"
url_scrape     <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_scrape.csv"

safe_head <- function(u) {
  # Issue an HTTP HEAD request; try() keeps a network error from aborting the render
  resp <- try(httr::HEAD(u), silent = TRUE)
  # Treat only a successful response with status code 200 as "reachable"
  if (inherits(resp, "response") && httr::status_code(resp) == 200) return(TRUE)
  FALSE
}

ok1 <- safe_head(url_comparison)
ok2 <- safe_head(url_scrape)

resp_tbl <- tibble::tibble(
  url = c(url_comparison, url_scrape),
  status = c(ifelse(ok1, "OK", "ERROR"), ifelse(ok2, "OK", "ERROR"))
)
kable(resp_tbl, caption = "Remote data availability check (HTTP 200 = OK)")
Remote data availability check (HTTP 200 = OK)
url status
https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_score_comparison.csv OK
https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_scrape.csv OK
if(!ok1 || !ok2) {
  stop("One or more data URLs unreachable. Check internet connection or the URL paths.")
}

3 Load and inspect raw data

This section loads the raw data directly from the verified URLs and displays a basic glimpse of each dataset, so we understand the initial structure, data types, and column names. raw_comp contains 146 films with review scores from multiple sites, while raw_scrape is a simpler, larger snapshot (510 films) of Fandango’s own displayed stars, rating values, and vote counts. This initial inspection reveals the column names that will need to be cleaned and highlights the data points available for our analysis.

# Step 3: Read data and clean column names
raw_comp  <- readr::read_csv(url_comparison, show_col_types = FALSE) |> janitor::clean_names()
raw_scrape <- readr::read_csv(url_scrape, show_col_types = FALSE) |> janitor::clean_names()

# Step 4: Basic glimpse for teaching clarity
glimpse(raw_comp)
## Rows: 146
## Columns: 22
## $ film                       <chr> "Avengers: Age of Ultron (2015)", "Cinderel…
## $ rotten_tomatoes            <dbl> 74, 85, 80, 18, 14, 63, 42, 86, 99, 89, 84,…
## $ rotten_tomatoes_user       <dbl> 86, 80, 90, 84, 28, 62, 53, 64, 82, 87, 77,…
## $ metacritic                 <dbl> 66, 67, 64, 22, 29, 50, 53, 81, 81, 80, 71,…
## $ metacritic_user            <dbl> 7.1, 7.5, 8.1, 4.7, 3.4, 6.8, 7.6, 6.8, 8.8…
## $ imdb                       <dbl> 7.8, 7.1, 7.8, 5.4, 5.1, 7.2, 6.9, 6.5, 7.4…
## $ fandango_stars             <dbl> 5.0, 5.0, 5.0, 5.0, 3.5, 4.5, 4.0, 4.0, 4.5…
## $ fandango_ratingvalue       <dbl> 4.5, 4.5, 4.5, 4.5, 3.0, 4.0, 3.5, 3.5, 4.0…
## $ rt_norm                    <dbl> 3.70, 4.25, 4.00, 0.90, 0.70, 3.15, 2.10, 4…
## $ rt_user_norm               <dbl> 4.30, 4.00, 4.50, 4.20, 1.40, 3.10, 2.65, 3…
## $ metacritic_norm            <dbl> 3.30, 3.35, 3.20, 1.10, 1.45, 2.50, 2.65, 4…
## $ metacritic_user_nom        <dbl> 3.55, 3.75, 4.05, 2.35, 1.70, 3.40, 3.80, 3…
## $ imdb_norm                  <dbl> 3.90, 3.55, 3.90, 2.70, 2.55, 3.60, 3.45, 3…
## $ rt_norm_round              <dbl> 3.5, 4.5, 4.0, 1.0, 0.5, 3.0, 2.0, 4.5, 5.0…
## $ rt_user_norm_round         <dbl> 4.5, 4.0, 4.5, 4.0, 1.5, 3.0, 2.5, 3.0, 4.0…
## $ metacritic_norm_round      <dbl> 3.5, 3.5, 3.0, 1.0, 1.5, 2.5, 2.5, 4.0, 4.0…
## $ metacritic_user_norm_round <dbl> 3.5, 4.0, 4.0, 2.5, 1.5, 3.5, 4.0, 3.5, 4.5…
## $ imdb_norm_round            <dbl> 4.0, 3.5, 4.0, 2.5, 2.5, 3.5, 3.5, 3.5, 3.5…
## $ metacritic_user_vote_count <dbl> 1330, 249, 627, 31, 88, 34, 17, 124, 62, 54…
## $ imdb_user_vote_count       <dbl> 271107, 65709, 103660, 3136, 19560, 39373, …
## $ fandango_votes             <dbl> 14846, 12640, 12055, 1793, 1021, 397, 252, …
## $ fandango_difference        <dbl> 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5…
glimpse(raw_scrape)
## Rows: 510
## Columns: 4
## $ film   <chr> "Fifty Shades of Grey (2015)", "Jurassic World (2015)", "Americ…
## $ stars  <dbl> 4.0, 4.5, 5.0, 5.0, 4.5, 4.5, 4.5, 4.0, 5.0, 3.5, 5.0, 5.0, 4.5…
## $ rating <dbl> 3.9, 4.5, 4.8, 4.8, 4.5, 4.3, 4.2, 4.0, 4.5, 3.4, 4.5, 4.5, 4.3…
## $ votes  <dbl> 34846, 34390, 34085, 33538, 15749, 15337, 15205, 14998, 14846, …

4 Data summary and missing values

This section provides a quick overview of the dataset’s scope and integrity. It calculates the total number of movies in the sample (146), identifies the years covered by the data (2014, 2015), and, most importantly, checks for missing values. The table confirms that this particular dataset is complete, with zero missing values across all columns. This initial check is critical for ensuring the reliability of subsequent analysis and data transformations.

# Step 5: Summarize coverage and missingness
n_movies <- nrow(raw_comp)
years <- sort(unique(stringr::str_extract(raw_comp$film, "\\d{4}")))
n_years <- length(years)

missing_summary <- raw_comp |>
  summarise(across(everything(), ~sum(is.na(.)))) |>
  pivot_longer(everything(), names_to = "column", values_to = "n_missing")

n_missing_total <- sum(missing_summary$n_missing)

glue("Movies in sample: {n_movies}")
## Movies in sample: 146
glue("Years covered: {paste(years, collapse = ', ')}")
## Years covered: 2014, 2015
glue("Total missing values: {n_missing_total}")
## Total missing values: 0
kable(missing_summary, caption = "Missing values per column")
Missing values per column
column n_missing
film 0
rotten_tomatoes 0
rotten_tomatoes_user 0
metacritic 0
metacritic_user 0
imdb 0
fandango_stars 0
fandango_ratingvalue 0
rt_norm 0
rt_user_norm 0
metacritic_norm 0
metacritic_user_nom 0
imdb_norm 0
rt_norm_round 0
rt_user_norm_round 0
metacritic_norm_round 0
metacritic_user_norm_round 0
imdb_norm_round 0
metacritic_user_vote_count 0
imdb_user_vote_count 0
fandango_votes 0
fandango_difference 0

5 Tidy transformations

This section is dedicated to tidying and transforming the raw data into a clean, analysis-ready format. We’re performing several key operations here:

  • Extracting and separating: The film title and year are extracted from the film column into their own distinct title and year columns.

  • Renaming for clarity: Abbreviated column names (like fandango_ratingvalue) are renamed to be more descriptive and easier to understand (fandango_rating_value).

  • Creating a new target variable: A new column, inflated_flag, is created to represent the core finding of the article—whether a movie’s displayed Fandango star rating is at least 0.5 stars higher than its actual rating. This logical flag serves as our target variable for any future modeling or analysis.

Finally, a subset of the most relevant columns is selected and arranged, producing the final, cleaned data frame that will be used for all subsequent analysis.

# Step 6: Parse title/year, rename columns, compute inflation flag
comp1 <- raw_comp |>
  mutate(
    title = stringr::str_remove(film, "\\s*\\(\\d{4}\\)$"),
    year  = as.integer(stringr::str_extract(film, "\\d{4}(?=\\)$)"))
  )

# Step 7: Rename abbreviations -> full names (keep normalized 0-5 metrics)
# Only two columns need new names; all other columns are already clean after clean_names()
comp2 <- comp1 |>
  rename(
    fandango_rating_value = fandango_ratingvalue,
    metacritic_user_norm  = metacritic_user_nom   # spelled *_nom in source
  )

# Step 8: Compute inflation target variable
comp3 <- comp2 |>
  mutate(
    fandango_diff_calc = fandango_stars - fandango_rating_value,
    inflated_flag = fandango_diff_calc >= 0.5
  )

# Step 9: Final, analysis-ready subset of columns with human-friendly names
movies_clean <- comp3 |>
  select(
    title, year,
    fandango_stars, fandango_rating_value,
    imdb_norm,
    rotten_tomatoes_norm = rt_norm,
    rotten_tomatoes_user_norm = rt_user_norm,
    metacritic_norm,
    metacritic_user_norm,
    fandango_difference, fandango_diff_calc, inflated_flag,
    imdb_user_vote_count, metacritic_user_vote_count, fandango_votes
  ) |>
  arrange(desc(fandango_difference), title)

movies_clean |> head() |> kable(caption = "Preview: cleaned movie ratings (subset of columns)")
Preview: cleaned movie ratings (subset of columns)
title year fandango_stars fandango_rating_value imdb_norm rotten_tomatoes_norm rotten_tomatoes_user_norm metacritic_norm metacritic_user_norm fandango_difference fandango_diff_calc inflated_flag imdb_user_vote_count metacritic_user_vote_count fandango_votes
Ant-Man 2015 5.0 4.5 3.90 4.00 4.50 3.20 4.05 0.5 0.5 TRUE 103660 627 12055
Avengers: Age of Ultron 2015 5.0 4.5 3.90 3.70 4.30 3.30 3.55 0.5 0.5 TRUE 271107 1330 14846
Black Sea 2015 4.0 3.5 3.20 4.10 3.00 3.10 3.30 0.5 0.5 TRUE 16547 37 218
Cinderella 2015 5.0 4.5 3.55 4.25 4.00 3.35 3.75 0.5 0.5 TRUE 65709 249 12640
Do You Believe? 2015 5.0 4.5 2.70 0.90 4.20 1.10 2.35 0.5 0.5 TRUE 3136 31 1793
Far From The Madding Crowd 2015 4.5 4.0 3.60 4.20 3.85 3.55 3.75 0.5 0.5 TRUE 12129 35 804

The transformed data frame movies_clean provides a clean, analysis-ready summary of movie ratings from multiple sources, with key columns for each film’s title, year, Fandango displayed stars, actual Fandango rating, normalized scores from IMDb, Rotten Tomatoes, and Metacritic, as well as vote counts. The crucial addition is inflated_flag, which is TRUE for movies where Fandango’s displayed star rating is at least 0.5 stars higher than its actual average—this matches the article’s central claim. All previewed movies show inflated_flag = TRUE, although the preview is sorted by the largest differences, so it shows the most inflated titles first. This table is now well-structured for further analysis, visualization, or modeling.
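
Since the preview shows only the top of that sorted list, a quick tabulation over all 146 films gives a better sense of how often the flag is actually TRUE. A minimal sketch (output not reproduced here):

# Sketch: how many of the 146 films carry an inflated displayed rating?
movies_clean |>
  count(inflated_flag) |>
  mutate(share = n / sum(n))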

5.1 Tidy long format (site × metric)

This code block transforms the data from a wide format (where each site’s rating is a separate column) to a tidy long format. This is a critical step for cross-site analysis and visualization, as it makes it possible to compare different rating sources (e.g., IMDb, Rotten Tomatoes, Metacritic) and types (user vs. critic) in a single, unified data frame. The process involves:

  • Pivoting: All rating columns are “gathered” into two new columns: one (site_metric) to hold the original column name and another (score_0to5) for the rating value.

  • Separating and Cleaning: The combined site_metric column is then split into site (e.g., “rotten_tomatoes”) and type (e.g., “user” or “critic”) for easier filtering and grouping. This prepares the data for powerful operations like grouping by site to calculate average scores.

# Step 10: Pivot scores into a tidy long format for cross-site analysis
ratings_long <- comp3 |>
  select(title, year, fandango_stars, fandango_rating_value,
         imdb_norm, rt_norm, rt_user_norm, metacritic_norm, metacritic_user_norm) |>
  mutate(fandango_norm = fandango_rating_value) |>
  select(-fandango_rating_value) |>
  pivot_longer(
    cols = c(fandango_norm, imdb_norm, rt_norm, rt_user_norm, metacritic_norm, metacritic_user_norm),
    names_to = "site_metric",
    values_to = "score_0to5"
  ) |>
  separate_wider_delim(site_metric, delim = "_", names = c("site", "type", "extra"), too_few = "align_start") |>
  mutate(
    site = recode(site, "rt" = "rotten_tomatoes"),  # other site prefixes already match
    type = dplyr::case_when(
      site == "fandango" ~ "user",
      type == "norm" ~ "critic",          # e.g., rt_norm, metacritic_norm
      type == "user" ~ "user",
      TRUE ~ type
    )
  ) |>
  select(title, year, site, type, score_0to5)

ratings_long |> slice(1:8) |> kable(caption = "Preview: tidy long ratings (0–5 scale)")
Preview: tidy long ratings (0–5 scale)
title year site type score_0to5
Avengers: Age of Ultron 2015 fandango user 4.50
Avengers: Age of Ultron 2015 imdb critic 3.90
Avengers: Age of Ultron 2015 rotten_tomatoes critic 3.70
Avengers: Age of Ultron 2015 rotten_tomatoes user 4.30
Avengers: Age of Ultron 2015 metacritic critic 3.30
Avengers: Age of Ultron 2015 metacritic user 3.55
Cinderella 2015 fandango user 4.50
Cinderella 2015 imdb critic 3.55

The ratings_long data frame restructures the ratings into a “tidy” long format, where each row represents a single movie, a specific rating site (e.g., Fandango, IMDb, Rotten Tomatoes, Metacritic), and the type of rater (user or critic) alongside the normalized score (0–5 scale). This format makes it easy to compare scores across sites and rating types, and enables powerful summary and visualization options—such as calculating averages by site or faceting plots. For example, the preview shows “Avengers: Age of Ultron” receiving different scores from Fandango (user: 4.50), IMDb (critic: 3.90), Rotten Tomatoes (critic: 3.70, user: 4.30), and Metacritic (critic: 3.30, user: 3.55), highlighting differences in ratings across platforms and perspectives.
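
As an illustration of what the long format enables, a minimal sketch of a site-by-type summary (means and spread; not part of the report’s shown output) might look like this:

# Sketch: average and spread of normalized scores by site and rater type
ratings_long |>
  group_by(site, type) |>
  summarise(
    mean_score = mean(score_0to5, na.rm = TRUE),
    sd_score   = sd(score_0to5, na.rm = TRUE),
    .groups = "drop"
  )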

6 Inflation rate comparison across sites

This code block calculates and compares the mean movie scores across all major rating sites. This is the central quantitative part of the analysis, as it allows us to verify the article’s main claim: that Fandango’s ratings are systematically higher than those from other platforms.

  • Grouped Means: The code first groups the ratings_long data frame by site and calculates the average score_0to5 for each.

  • Inflation Rate: It then computes the Fandango inflation rate by taking the mean of the inflated_flag column. Because TRUE is treated as 1 and FALSE as 0 in R, the mean of this column is simply the proportion of movies with an inflated rating (a small illustration follows this list).

  • Combined Results: The two results are combined into a single table, providing a clear and concise summary that highlights how Fandango’s ratings compare to those from IMDb, Rotten Tomatoes, and Metacritic.
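
As a generic illustration of that coercion (not tied to this dataset), the mean of a logical vector equals the share of TRUE values:

# Illustration only: logicals are coerced to 1/0, so the mean is the proportion of TRUEs
mean(c(TRUE, FALSE, FALSE, TRUE))   # 0.5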

# Step 11: Compare mean scores across sites and inflation rate for Fandango
site_means <- ratings_long |>
  group_by(site) |>
  summarise(mean_score = mean(score_0to5, na.rm = TRUE), .groups = "drop")

inflation_rate <- mean(comp3$inflated_flag)
site_means <- bind_rows(
  site_means,
  tibble(site = "fandango_inflation_rate", mean_score = inflation_rate)
)
kable(site_means, digits = 2, caption = "Mean score per site and Fandango inflation rate")
Mean score per site and Fandango inflation rate
site mean_score
fandango 3.85
imdb 3.37
metacritic 3.10
rotten_tomatoes 3.12
fandango_inflation_rate 0.09

The output table summarizes the average movie ratings for each site and the Fandango inflation rate. Fandango’s mean score (3.85) is noticeably higher than IMDb (3.37), Metacritic (3.10), and Rotten Tomatoes (3.12), confirming the article’s claim of systematic inflation. The additional row, fandango_inflation_rate, shows that 9% of movies had their Fandango displayed star rating inflated by at least 0.5 stars compared to the underlying average. This concise comparison demonstrates both the relative inflation of Fandango ratings and the prevalence of the inflation phenomenon in the sample.

7 Exploratory graphics

This section uses graphics to summarize the data visually and reinforce the findings from the article. The first plot, Displayed stars vs. actual average on Fandango, is a scatter plot that directly compares the two Fandango rating metrics.

  • Parity Line: The dashed line represents parity, where the displayed stars are exactly equal to the actual rating value.

  • Inflation: Any point that falls above this line represents a movie where the displayed star rating was inflated, or rounded up, compared to its true average. This visualization powerfully demonstrates the systematic upward rounding that was the central finding of the original FiveThirtyEight investigation.

# Step 12: Displayed stars vs actual average on Fandango
ggplot(comp3, aes(x = fandango_rating_value, y = fandango_stars)) +
  geom_point(alpha = 0.6) +
  geom_abline(slope = 1, intercept = 0, linetype = 2) +
  labs(
    title = "Displayed stars vs. actual average on Fandango (2015 sample)",
    subtitle = "Dashed line = parity; points above line indicate inflation",
    x = "Actual average (ratingValue, 0–5)",
    y = "Displayed stars (0–5)"
  )

The scatter plot visualizes the relationship between Fandango’s displayed star ratings and their actual average values for the 2015 movie sample. Each point represents a movie. The dashed line indicates perfect parity—where the displayed stars would match the true average. Points above this line show cases where Fandango rounded up its displayed rating, visually confirming the systematic inflation described in the article. Most movies have displayed ratings that are equal to or higher than their actual average, demonstrating the prevalence of Fandango’s “rounding up” practice.

This section visualizes the distribution of movie scores from different websites on a common 0–5 scale. This allows for a direct comparison of how other sites’ ratings are distributed relative to each other, and implicitly, to Fandango’s.

  • Histograms: The code creates a separate histogram for each site (excluding Fandango) to show the frequency of ratings at different score levels.

  • Insights: By filtering out Fandango, we can see that the ratings from sites like Rotten Tomatoes, IMDb, and Metacritic generally have a more varied and spread-out distribution across the 0–5 scale. This contrasts with Fandango, which, as the prior plot showed, has a much higher and more compressed distribution due to its rounding-up policy. This plot provides a powerful visual argument supporting the article’s findings.

# Step 13: Cross-site distributions on common 0–5 scale
ratings_long |>
  filter(site != "fandango") |>
  mutate(site = str_to_title(str_replace_all(site, "_", " "))) |>
  ggplot(aes(x = score_0to5)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ site) +
  labs(
    title = "User/critic scores across sites on the same 0–5 scale",
    x = "Score (0–5)", y = "Count"
  )

This set of histograms shows how movie scores are distributed on a 0–5 scale for IMDb, Metacritic, and Rotten Tomatoes. Each panel reveals the frequency of scores for each site. IMDb and Metacritic ratings tend to cluster toward the middle of the scale (between 2.5 and 4), while Rotten Tomatoes displays a wider, more varied spread, including a notable number of higher ratings. The differences in shape and spread highlight that other sites provide a broader range of scores, in contrast to Fandango’s compressed and consistently higher ratings. This visualization supports the article’s finding: Fandango’s scores are systematically rounded up, while other platforms show more diversity and less inflation.

This section also provides a direct summary of the most egregious examples of rating inflation found in the dataset. It creates a table of the top 10 movies where the displayed star rating was most inflated compared to the actual average rating.

  • Ranking: The code calculates the difference between fandango_stars and fandango_rating_value and then sorts the movies in descending order based on this difference.

  • Final Output: The resulting table highlights specific movies that saw the greatest “round-up” in their Fandango rating, providing concrete evidence to support the article’s claim of systematic rating inflation. This gives the audience a clear picture of which movies were most affected by Fandango’s rating policy.

# Step 14: Top-10 inflation by displayed minus actual Fandango rating (stars)
top_inflated <- comp3 |>
  arrange(desc(fandango_diff_calc)) |>
  slice(1:10) |>
  select(title, year, fandango_stars, fandango_rating_value, fandango_diff_calc)
kable(top_inflated, digits = 2, caption = "Top-10 inflation by displayed minus actual Fandango rating (stars)")
Top-10 inflation by displayed minus actual Fandango rating (stars)
title year fandango_stars fandango_rating_value fandango_diff_calc
Avengers: Age of Ultron 2015 5.0 4.5 0.5
Cinderella 2015 5.0 4.5 0.5
Ant-Man 2015 5.0 4.5 0.5
Do You Believe? 2015 5.0 4.5 0.5
Hot Tub Time Machine 2 2015 3.5 3.0 0.5
The Water Diviner 2015 4.5 4.0 0.5
Irrational Man 2015 4.0 3.5 0.5
Top Five 2014 4.0 3.5 0.5
Shaun the Sheep Movie 2015 4.5 4.0 0.5
Love & Mercy 2015 4.5 4.0 0.5

The “Top-10 inflation” table lists movies where the difference between Fandango’s displayed stars and the actual average rating is the largest (0.5 stars in every case shown). This direct comparison highlights the specific films most affected by Fandango’s rounding-up practice, with popular titles like “Avengers: Age of Ultron,” “Cinderella,” and “Ant-Man” all showing a 0.5-star inflation. These examples provide clear, concrete evidence of the systematic inflation described in the article and demonstrate how the effect was not limited to obscure titles but also impacted major releases.

8 Deliverable: final analysis-ready data frame

This section provides the final deliverable of the assignment: the cleaned and transformed data frame. It takes the movies_clean data frame, which was prepared in the previous steps, and assigns it to a new variable called final_df. This data frame contains a carefully selected subset of columns with meaningful names and includes the key target variable (inflated_flag). The table shown is a preview of the first 10 rows, demonstrating that the data is now structured and ready for any future analysis, such as modeling, visualization, or statistical inference. This step formally concludes the data transformation portion of the project.

# Step 15: Final subset for downstream analysis
final_df <- movies_clean
kable(head(final_df, 10), caption = "Final deliverable: cleaned subset + target")
Final deliverable: cleaned subset + target
title year fandango_stars fandango_rating_value imdb_norm rotten_tomatoes_norm rotten_tomatoes_user_norm metacritic_norm metacritic_user_norm fandango_difference fandango_diff_calc inflated_flag imdb_user_vote_count metacritic_user_vote_count fandango_votes
Ant-Man 2015 5.0 4.5 3.90 4.00 4.50 3.20 4.05 0.5 0.5 TRUE 103660 627 12055
Avengers: Age of Ultron 2015 5.0 4.5 3.90 3.70 4.30 3.30 3.55 0.5 0.5 TRUE 271107 1330 14846
Black Sea 2015 4.0 3.5 3.20 4.10 3.00 3.10 3.30 0.5 0.5 TRUE 16547 37 218
Cinderella 2015 5.0 4.5 3.55 4.25 4.00 3.35 3.75 0.5 0.5 TRUE 65709 249 12640
Do You Believe? 2015 5.0 4.5 2.70 0.90 4.20 1.10 2.35 0.5 0.5 TRUE 3136 31 1793
Far From The Madding Crowd 2015 4.5 4.0 3.60 4.20 3.85 3.55 3.75 0.5 0.5 TRUE 12129 35 804
Hot Tub Time Machine 2 2015 3.5 3.0 2.55 0.70 1.40 1.45 1.70 0.5 0.5 TRUE 19560 88 1021
Irrational Man 2015 4.0 3.5 3.45 2.10 2.65 2.65 3.80 0.5 0.5 TRUE 2680 17 252
Leviathan 2014 4.0 3.5 3.85 4.95 3.95 4.60 3.60 0.5 0.5 TRUE 22521 145 64
Love & Mercy 2015 4.5 4.0 3.90 4.45 4.35 4.00 4.25 0.5 0.5 TRUE 5367 54 864

The final_df data frame offers a fully cleaned and analysis-ready subset of the movie ratings data, with each row representing a film and its ratings from multiple sources. It includes clear, human-friendly column names and the key target variable inflated_flag, which identifies movies where Fandango’s displayed star rating was at least 0.5 higher than its actual average. The preview demonstrates that the dataset is now well-structured for downstream analysis, such as statistical modeling or further visualization, meeting the requirements for a reproducible and transparent data science workflow.

8.1 Limitations

This dataset, from the 2015 FiveThirtyEight investigation, has several inherent limitations:

  • Temporal Scope:
    The data is a snapshot in time, covering movies from 2014-2015. It doesn’t reflect any changes Fandango may have made to its rating practices after the article’s publication.

  • Selection Bias:
    The sample primarily includes widely released movies that appear on multiple major review sites, potentially excluding independent or international films. This might not be representative of the entire cinematic landscape.

  • Methodological Differences:
    The various review sites (IMDb, Rotten Tomatoes, Metacritic) use different underlying methodologies and user bases. Their scores aren’t a “ground truth” but rather a separate perspective, each with its own biases.

  • Data Integrity:
    While we addressed minor issues like the column typo (metacritic_user_nom), other subtle inconsistencies may exist and could affect more complex analyses.

9 Conclusions and Recommendations

9.1 Key Findings

Our analysis confirms the central claim of the FiveThirtyEight article: the displayed Fandango star rating is systematically inflated compared to its true numerical average. The inflated_flag—defined as a displayed rating at least 0.5 stars above the actual average—applies to roughly 9% of the films in this sample, and Fandango’s mean score (3.85) is the highest of any site examined. By contrast, other sites like IMDb, Rotten Tomatoes, and Metacritic exhibit a more varied and less skewed distribution of scores across the full 0–5 range.

9.2 Future Work and Extensions

To build on this project, we recommend the following:

  • Temporal Analysis:
    Obtain more recent data to determine if Fandango’s rating inflation persisted or changed in the years following the original article. This would provide a valuable update to the initial findings.

  • Weighted Averages:
    Incorporate the vote counts from each site to create a more robust analysis. This would give more weight to movies with a larger number of reviews, improving the accuracy of cross-site comparisons (a starting-point sketch follows this list).

  • Predictive Modeling:
    Use the tidy long format data to build a predictive model. We could use scores from other sites to predict a movie’s Fandango star rating, helping to quantify the exact degree of inflation (also sketched below).

  • Threshold Sensitivity:
    Explore the impact of changing the inflation threshold (e.g., from 0.5 stars to 0.4 or 0.6) to test the robustness of our inflated_flag finding (also sketched below).
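
The sketch below shows one way the last three extensions could begin, using the movies_clean columns defined in Step 9; the predictors, weights, and thresholds are illustrative choices, not results reported here.

# Sketch only: starting points for the extensions above (assumes movies_clean from Step 9)

# (a) Vote-weighted mean of Fandango's actual average, so heavily rated films count more
weighted_fandango <- with(movies_clean,
                          weighted.mean(fandango_rating_value, w = fandango_votes))

# (b) A simple linear model predicting displayed stars from other sites' normalized scores
stars_model <- lm(
  fandango_stars ~ imdb_norm + rotten_tomatoes_norm + rotten_tomatoes_user_norm +
    metacritic_norm + metacritic_user_norm,
  data = movies_clean
)

# (c) How the inflation rate changes as the threshold moves around 0.5 stars
thresholds <- c(0.4, 0.5, 0.6)
threshold_sensitivity <- tibble::tibble(
  threshold = thresholds,
  inflation_rate = vapply(thresholds,
                          function(t) mean(movies_clean$fandango_diff_calc >= t),
                          numeric(1))
)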

10 References

Hickey, W. (2015). Be Suspicious Of Online Movie Ratings, Especially Fandango’s. FiveThirtyEight. https://fivethirtyeight.com/features/fandango-movies-ratings/

FiveThirtyEight data repository, fandango folder (fandango_score_comparison.csv, fandango_scrape.csv). https://github.com/fivethirtyeight/data/tree/master/fandango

11 Session info

sessionInfo()
## R version 4.4.1 (2024-06-14 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] httr_1.4.7      scales_1.3.0    knitr_1.49      glue_1.7.0     
##  [5] janitor_2.2.1   lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1  
##  [9] dplyr_1.1.4     purrr_1.0.2     readr_2.1.5     tidyr_1.3.1    
## [13] tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.9        generics_0.1.3    stringi_1.8.4     hms_1.1.3        
##  [5] digest_0.6.37     magrittr_2.0.3    evaluate_1.0.3    grid_4.4.1       
##  [9] timechange_0.3.0  fastmap_1.2.0     jsonlite_1.9.0    jquerylib_0.1.4  
## [13] cli_3.6.3         rlang_1.1.4       crayon_1.5.3      bit64_4.6.0-1    
## [17] munsell_0.5.1     withr_3.0.2       cachem_1.1.0      yaml_2.3.10      
## [21] tools_4.4.1       parallel_4.4.1    tzdb_0.4.0        colorspace_2.1-1 
## [25] curl_6.2.1        vctrs_0.6.5       R6_2.6.1          lifecycle_1.0.4  
## [29] snakecase_0.11.1  bit_4.5.0.1       vroom_1.6.5       pkgconfig_2.0.3  
## [33] pillar_1.10.1     bslib_0.9.0       gtable_0.3.6      xfun_0.51        
## [37] tidyselect_1.2.1  rstudioapi_0.17.1 farver_2.1.2      htmltools_0.5.8.1
## [41] rmarkdown_2.29    labeling_0.4.3    compiler_4.4.1