This report tidies and transforms a FiveThirtyEight dataset from Walt
Hickey’s 2015 investigation of Fandango’s movie
ratings. The article found evidence that displayed star
ratings on Fandango were systematically higher than the
underlying averages. We demonstrate reproducible data import and
cleaning, compute summary statistics, compare Fandango with other
rating sites, and discuss limitations.
Article: Be Suspicious Of Online Movie Ratings, Especially Fandango’s.
All data are loaded directly from public GitHub URLs in FiveThirtyEight’s data repository rather than from local files, so anyone with network access can rerun the workflow.
library(tidyverse)
library(janitor)
library(readr)
library(glue)
library(knitr)
library(scales)
library(httr)
This section defines the direct URLs for the raw data files on GitHub and checks that each one responds with HTTP 200 before proceeding. If either URL is unreachable, the script stops with an informative error message instead of failing later during import.
# Step 1: Define source URLs and check connectivity (graceful failure)
url_comparison <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_score_comparison.csv"
url_scrape <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_scrape.csv"
safe_head <- function(u) {
  # TRUE only when the HEAD request succeeds with HTTP 200; network errors give FALSE
  resp <- try(httr::HEAD(u), silent = TRUE)
  inherits(resp, "response") && httr::status_code(resp) == 200
}
ok1 <- safe_head(url_comparison)
ok2 <- safe_head(url_scrape)
resp_tbl <- tibble::tibble(
  url = c(url_comparison, url_scrape),
  status = ifelse(c(ok1, ok2), "OK", "ERROR")
)
kable(resp_tbl, caption = "Remote data availability check (HTTP 200 = OK)")
url | status |
---|---|
https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_score_comparison.csv | OK |
https://raw.githubusercontent.com/fivethirtyeight/data/master/fandango/fandango_scrape.csv | OK |
if (!ok1 || !ok2) {
  stop("One or more data URLs unreachable. Check internet connection or the URL paths.")
}
This section loads the raw data directly from the verified URLs, standardizes the column names with janitor::clean_names(), and displays a glimpse of each dataset to show its structure, data types, and column names. raw_comp contains scores for each film from several sites (Rotten Tomatoes, Metacritic, IMDb, and Fandango), while raw_scrape is a simpler table holding only Fandango’s own stars, rating, and vote counts for a larger set of films. The inspection also surfaces two names that are renamed later: fandango_ratingvalue and the misspelled metacritic_user_nom.
# Step 3: Read data and clean column names
raw_comp <- readr::read_csv(url_comparison, show_col_types = FALSE) |> janitor::clean_names()
raw_scrape <- readr::read_csv(url_scrape, show_col_types = FALSE) |> janitor::clean_names()
# Step 4: Basic glimpse for teaching clarity
glimpse(raw_comp)
## Rows: 146
## Columns: 22
## $ film <chr> "Avengers: Age of Ultron (2015)", "Cinderel…
## $ rotten_tomatoes <dbl> 74, 85, 80, 18, 14, 63, 42, 86, 99, 89, 84,…
## $ rotten_tomatoes_user <dbl> 86, 80, 90, 84, 28, 62, 53, 64, 82, 87, 77,…
## $ metacritic <dbl> 66, 67, 64, 22, 29, 50, 53, 81, 81, 80, 71,…
## $ metacritic_user <dbl> 7.1, 7.5, 8.1, 4.7, 3.4, 6.8, 7.6, 6.8, 8.8…
## $ imdb <dbl> 7.8, 7.1, 7.8, 5.4, 5.1, 7.2, 6.9, 6.5, 7.4…
## $ fandango_stars <dbl> 5.0, 5.0, 5.0, 5.0, 3.5, 4.5, 4.0, 4.0, 4.5…
## $ fandango_ratingvalue <dbl> 4.5, 4.5, 4.5, 4.5, 3.0, 4.0, 3.5, 3.5, 4.0…
## $ rt_norm <dbl> 3.70, 4.25, 4.00, 0.90, 0.70, 3.15, 2.10, 4…
## $ rt_user_norm <dbl> 4.30, 4.00, 4.50, 4.20, 1.40, 3.10, 2.65, 3…
## $ metacritic_norm <dbl> 3.30, 3.35, 3.20, 1.10, 1.45, 2.50, 2.65, 4…
## $ metacritic_user_nom <dbl> 3.55, 3.75, 4.05, 2.35, 1.70, 3.40, 3.80, 3…
## $ imdb_norm <dbl> 3.90, 3.55, 3.90, 2.70, 2.55, 3.60, 3.45, 3…
## $ rt_norm_round <dbl> 3.5, 4.5, 4.0, 1.0, 0.5, 3.0, 2.0, 4.5, 5.0…
## $ rt_user_norm_round <dbl> 4.5, 4.0, 4.5, 4.0, 1.5, 3.0, 2.5, 3.0, 4.0…
## $ metacritic_norm_round <dbl> 3.5, 3.5, 3.0, 1.0, 1.5, 2.5, 2.5, 4.0, 4.0…
## $ metacritic_user_norm_round <dbl> 3.5, 4.0, 4.0, 2.5, 1.5, 3.5, 4.0, 3.5, 4.5…
## $ imdb_norm_round <dbl> 4.0, 3.5, 4.0, 2.5, 2.5, 3.5, 3.5, 3.5, 3.5…
## $ metacritic_user_vote_count <dbl> 1330, 249, 627, 31, 88, 34, 17, 124, 62, 54…
## $ imdb_user_vote_count <dbl> 271107, 65709, 103660, 3136, 19560, 39373, …
## $ fandango_votes <dbl> 14846, 12640, 12055, 1793, 1021, 397, 252, …
## $ fandango_difference <dbl> 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5…
glimpse(raw_scrape)
## Rows: 510
## Columns: 4
## $ film <chr> "Fifty Shades of Grey (2015)", "Jurassic World (2015)", "Americ…
## $ stars <dbl> 4.0, 4.5, 5.0, 5.0, 4.5, 4.5, 4.5, 4.0, 5.0, 3.5, 5.0, 5.0, 4.5…
## $ rating <dbl> 3.9, 4.5, 4.8, 4.8, 4.5, 4.3, 4.2, 4.0, 4.5, 3.4, 4.5, 4.5, 4.3…
## $ votes <dbl> 34846, 34390, 34085, 33538, 15749, 15337, 15205, 14998, 14846, …
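The glimpse shows every numeric column parsed as <dbl>, including the vote counts. If stricter typing is wanted, column types can be declared at import so that unexpected values are reported instead of silently changing a column's type. The sketch below assumes readr 2.x and uses a small inline CSV (two rows copied from the output above) in place of the remote file; the chosen types are an illustration, not part of the original workflow.

```r
library(readr)

# Inline stand-in for fandango_scrape.csv (two rows from the glimpse output)
csv_text <- I("film,stars,rating,votes
Fifty Shades of Grey (2015),4.0,3.9,34846
Jurassic World (2015),4.5,4.5,34390
")

scrape_typed <- read_csv(
  csv_text,
  col_types = cols(
    film   = col_character(),
    stars  = col_double(),
    rating = col_double(),
    votes  = col_integer()  # vote counts are whole numbers
  )
)

# Values that could not be parsed under the declared types are listed here
problems(scrape_typed)
```

With explicit types, a later change in the source file shows up in problems() rather than as a quietly different column class.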
This section provides a quick overview of the dataset’s scope and integrity. It calculates the total number of movies in the sample (146), identifies the years covered by the data (2014, 2015), and, most importantly, checks for missing values. The table confirms that this particular dataset is complete, with zero missing values across all columns. This initial check is critical for ensuring the reliability of subsequent analysis and data transformations.
# Step 5: Summarize coverage and missingness
n_movies <- nrow(raw_comp)
years <- sort(unique(stringr::str_extract(raw_comp$film, "\\d{4}")))
n_years <- length(years)
missing_summary <- raw_comp |>
  summarise(across(everything(), ~ sum(is.na(.x)))) |>
  pivot_longer(everything(), names_to = "column", values_to = "n_missing")
n_missing_total <- sum(missing_summary$n_missing)
glue("Movies in sample: {n_movies}")
## Movies in sample: 146
glue("Years covered: {paste(years, collapse = ', ')}")
## Years covered: 2014, 2015
glue("Total missing values: {n_missing_total}")
## Total missing values: 0
kable(missing_summary, caption = "Missing values per column")
column | n_missing |
---|---|
film | 0 |
rotten_tomatoes | 0 |
rotten_tomatoes_user | 0 |
metacritic | 0 |
metacritic_user | 0 |
imdb | 0 |
fandango_stars | 0 |
fandango_ratingvalue | 0 |
rt_norm | 0 |
rt_user_norm | 0 |
metacritic_norm | 0 |
metacritic_user_nom | 0 |
imdb_norm | 0 |
rt_norm_round | 0 |
rt_user_norm_round | 0 |
metacritic_norm_round | 0 |
metacritic_user_norm_round | 0 |
imdb_norm_round | 0 |
metacritic_user_vote_count | 0 |
imdb_user_vote_count | 0 |
fandango_votes | 0 |
fandango_difference | 0 |
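The year summary in Step 5 takes the first four-digit run found in each film string. That gives the right answer for this sample, but a title that itself contains four digits would be misread. A minimal sketch of the difference, using hypothetical film strings in the dataset's "Title (YYYY)" format:

```r
library(stringr)

# Hypothetical film strings; the second title contains a four-digit number
films <- c("Ant-Man (2015)", "2001: A Space Odyssey (1968)")

# Unanchored: returns the first four-digit run, which may come from the title
first_digits <- str_extract(films, "\\d{4}")

# Anchored: only the four digits directly before the closing parenthesis at the end
release_year <- str_extract(films, "\\d{4}(?=\\)$)")

first_digits  # "2015" "2001"
release_year  # "2015" "1968"
```

Anchoring the pattern to the trailing parenthesis keeps the coverage summary tied to the release year rather than to digits in the title.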
This section is dedicated to tidying and transforming the raw data into a clean, analysis-ready format. We’re performing several key operations here:
Extracting and separating: The film title and year are extracted from the film column into their own distinct title and year columns.
Renaming for clarity: Two column names are fixed: fandango_ratingvalue becomes fandango_rating_value, and the misspelled metacritic_user_nom becomes metacritic_user_norm.
Creating a new target variable: A new column, inflated_flag, marks whether a movie’s displayed Fandango star rating is at least 0.5 stars higher than its underlying average rating. This logical flag captures the rounding behavior at the center of the article and can serve as a target variable for later modeling or analysis.
Finally, a subset of the most relevant columns is selected and arranged, producing the final, cleaned data frame that will be used for all subsequent analysis.
# Step 6: Parse title/year, rename columns, compute inflation flag
comp1 <- raw_comp |>
  mutate(
    title = stringr::str_remove(film, "\\s*\\(\\d{4}\\)$"),
    year = as.integer(stringr::str_extract(film, "\\d{4}(?=\\)$)"))
  )
# Step 7: Rename the two columns that need fixing (all other names are kept as-is)
comp2 <- comp1 |>
  rename(
    fandango_rating_value = fandango_ratingvalue,
    metacritic_user_norm = metacritic_user_nom # spelled *_nom in source
  )
# Step 8: Compute inflation target variable
comp3 <- comp2 |>
  mutate(
    fandango_diff_calc = fandango_stars - fandango_rating_value,
    inflated_flag = fandango_diff_calc >= 0.5
  )
# Step 9: Final, analysis-ready subset of columns with human-friendly names
movies_clean <- comp3 |>
  select(
    title, year,
    fandango_stars, fandango_rating_value,
    imdb_norm,
    rotten_tomatoes_norm = rt_norm,
    rotten_tomatoes_user_norm = rt_user_norm,
    metacritic_norm,
    metacritic_user_norm,
    fandango_difference, fandango_diff_calc, inflated_flag,
    imdb_user_vote_count, metacritic_user_vote_count, fandango_votes
  ) |>
  arrange(desc(fandango_difference), title)
movies_clean |> head() |> kable(caption = "Preview: cleaned movie ratings (subset of columns)")
title | year | fandango_stars | fandango_rating_value | imdb_norm | rotten_tomatoes_norm | rotten_tomatoes_user_norm | metacritic_norm | metacritic_user_norm | fandango_difference | fandango_diff_calc | inflated_flag | imdb_user_vote_count | metacritic_user_vote_count | fandango_votes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ant-Man | 2015 | 5.0 | 4.5 | 3.90 | 4.00 | 4.50 | 3.20 | 4.05 | 0.5 | 0.5 | TRUE | 103660 | 627 | 12055 |
Avengers: Age of Ultron | 2015 | 5.0 | 4.5 | 3.90 | 3.70 | 4.30 | 3.30 | 3.55 | 0.5 | 0.5 | TRUE | 271107 | 1330 | 14846 |
Black Sea | 2015 | 4.0 | 3.5 | 3.20 | 4.10 | 3.00 | 3.10 | 3.30 | 0.5 | 0.5 | TRUE | 16547 | 37 | 218 |
Cinderella | 2015 | 5.0 | 4.5 | 3.55 | 4.25 | 4.00 | 3.35 | 3.75 | 0.5 | 0.5 | TRUE | 65709 | 249 | 12640 |
Do You Believe? | 2015 | 5.0 | 4.5 | 2.70 | 0.90 | 4.20 | 1.10 | 2.35 | 0.5 | 0.5 | TRUE | 3136 | 31 | 1793 |
Far From The Madding Crowd | 2015 | 4.5 | 4.0 | 3.60 | 4.20 | 3.85 | 3.55 | 3.75 | 0.5 | 0.5 | TRUE | 12129 | 35 | 804 |
The transformed data frame movies_clean provides an analysis-ready
summary of movie ratings from multiple sources, with columns for each
film’s title, year, Fandango displayed stars, underlying Fandango
rating, normalized scores from IMDb, Rotten Tomatoes, and Metacritic,
and vote counts. The key addition is inflated_flag, which is TRUE for
movies whose displayed star rating is at least 0.5 stars above the
underlying average. Every previewed row shows inflated_flag = TRUE, but
only because the table is sorted by descending fandango_difference; the
preview by itself says nothing about how common the flag is across the
sample. In the previewed rows, fandango_diff_calc also matches the
fandango_difference column shipped with the source data. The table is
now well-structured for further analysis, visualization, or modeling.
This code block transforms the data from a wide format (where each site’s rating is a separate column) to a tidy long format. This is a critical step for cross-site analysis and visualization, as it makes it possible to compare different rating sources (e.g., IMDb, Rotten Tomatoes, Metacritic) and types (user vs. critic) in a single, unified data frame. The process involves:
Pivoting: All rating columns are “gathered” into two new columns: one (site_metric) to hold the original column name and another (score_0to5) for the rating value.
Separating and Cleaning: The combined site_metric column is then split into site (e.g., “rotten_tomatoes”) and type (e.g., “user” or “critic”) for easier filtering and grouping. This prepares the data for powerful operations like grouping by site to calculate average scores.
# Step 10: Pivot scores into a tidy long format for cross-site analysis
ratings_long <- comp3 |>
  select(title, year, fandango_stars, fandango_rating_value,
         imdb_norm, rt_norm, rt_user_norm, metacritic_norm, metacritic_user_norm) |>
  mutate(fandango_norm = fandango_rating_value) |>
  select(-fandango_rating_value) |>
  pivot_longer(
    cols = c(fandango_norm, imdb_norm, rt_norm, rt_user_norm, metacritic_norm, metacritic_user_norm),
    names_to = "site_metric",
    values_to = "score_0to5"
  ) |>
  separate_wider_delim(site_metric, delim = "_", names = c("site", "type", "extra"), too_few = "align_start") |>
  mutate(
    site = recode(site,
      "rt" = "rotten_tomatoes",
      "imdb" = "imdb",
      "metacritic" = "metacritic",
      "fandango" = "fandango"
    ),
    type = dplyr::case_when(
      site %in% c("fandango", "imdb") ~ "user", # both are audience ratings
      type == "norm" ~ "critic", # e.g., rt_norm, metacritic_norm
      type == "user" ~ "user",
      TRUE ~ type
    )
  ) |>
  select(title, year, site, type, score_0to5)
ratings_long |> slice(1:8) |> kable(caption = "Preview: tidy long ratings (0–5 scale)")
title | year | site | type | score_0to5 |
---|---|---|---|---|
Avengers: Age of Ultron | 2015 | fandango | user | 4.50 |
Avengers: Age of Ultron | 2015 | imdb | user | 3.90 |
Avengers: Age of Ultron | 2015 | rotten_tomatoes | critic | 3.70 |
Avengers: Age of Ultron | 2015 | rotten_tomatoes | user | 4.30 |
Avengers: Age of Ultron | 2015 | metacritic | critic | 3.30 |
Avengers: Age of Ultron | 2015 | metacritic | user | 3.55 |
Cinderella | 2015 | fandango | user | 4.50 |
Cinderella | 2015 | imdb | user | 3.55 |
The ratings_long data frame restructures the ratings into a “tidy” long
format, where each row holds one score: a movie, a rating site
(Fandango, IMDb, Rotten Tomatoes, or Metacritic), the type of rater
(user or critic), and the normalized score on the 0–5 scale. IMDb’s
score is an audience rating, so it is labeled as a user score even
though its column name (imdb_norm) carries no “user” token. This format
makes it easy to compare scores across sites and rating types, and
supports summaries and plots such as averages by site or faceted
histograms. For example, the preview shows “Avengers: Age of Ultron”
scoring 4.50 on Fandango (user), 3.90 on IMDb (user), 3.70 and 4.30 on
Rotten Tomatoes (critic and user), and 3.30 and 3.55 on Metacritic
(critic and user), which illustrates how ratings differ across
platforms and perspectives.
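Inferring site and rater type from pieces of a column name is compact, but it depends on every column following the same naming pattern. One alternative, sketched here under the assumption that an explicit mapping is acceptable, is to join a small lookup table after pivoting. The toy data and the metric_key table are hypothetical; only the column names mirror those in comp3.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical two-film sample using the same column names as comp3
toy <- tribble(
  ~title,   ~fandango_rating_value, ~imdb_norm, ~rt_norm, ~rt_user_norm,
  "Film A", 4.5,                    3.9,        3.7,      4.3,
  "Film B", 4.0,                    3.5,        4.2,      4.0
)

# Explicit lookup: one row per score column, stating its site and rater type
metric_key <- tribble(
  ~site_metric,            ~site,             ~type,
  "fandango_rating_value", "fandango",        "user",
  "imdb_norm",             "imdb",            "user",
  "rt_norm",               "rotten_tomatoes", "critic",
  "rt_user_norm",          "rotten_tomatoes", "user"
)

toy_long <- toy |>
  pivot_longer(-title, names_to = "site_metric", values_to = "score_0to5") |>
  left_join(metric_key, by = "site_metric") |>
  select(title, site, type, score_0to5)

toy_long
```

The lookup keeps each label visible in one place, and a column missing from metric_key surfaces as NA in site and type rather than receiving a guessed label.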
This code block calculates and compares the mean movie scores across all major rating sites. This is the central quantitative part of the analysis, as it allows us to verify the article’s main claim: that Fandango’s ratings are systematically higher than those from other platforms.
Grouped Means: The code first groups the ratings_long data frame by site and calculates the average score_0to5 for each. For Rotten Tomatoes and Metacritic, this pools critic and user scores into a single mean.
Inflation Rate: It then computes the Fandango inflation rate by taking the mean of the inflated_flag column. Because TRUE is treated as 1 and FALSE as 0 in R, the mean of this column is the proportion of movies with an inflated rating; multiplying by 100 gives the percentage.
Combined Results: The two results are combined into a single table, providing a clear and concise summary that highlights how Fandango’s ratings compare to those from IMDb, Rotten Tomatoes, and Metacritic.
# Step 11: Compare mean scores across sites and inflation rate for Fandango
site_means <- ratings_long |>
  group_by(site) |>
  summarise(mean_score = mean(score_0to5, na.rm = TRUE), .groups = "drop")
inflation_rate <- mean(comp3$inflated_flag)
site_means <- bind_rows(
  site_means,
  tibble(site = "fandango_inflation_rate", mean_score = inflation_rate)
)
kable(site_means, digits = 2, caption = "Mean score per site and Fandango inflation rate")
site | mean_score |
---|---|
fandango | 3.85 |
imdb | 3.37 |
metacritic | 3.10 |
rotten_tomatoes | 3.12 |
fandango_inflation_rate | 0.09 |
The output table summarizes the average movie rating for each site and
the Fandango inflation rate. Fandango’s mean score (3.85) is noticeably
higher than IMDb (3.37), Metacritic (3.10), and Rotten Tomatoes (3.12),
which is consistent with the article’s claim that Fandango scores run
high. Note that the Fandango mean uses the underlying rating value, not
the displayed stars, and that the Metacritic and Rotten Tomatoes means
pool critic and user scores. The additional row,
fandango_inflation_rate, is a proportion rather than a score: about 9%
of movies (roughly 13 of 146) had their displayed star rating inflated
by at least 0.5 stars compared to the underlying average. Upward gaps
smaller than 0.5 stars are not counted by this flag.
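Because the per-site means pool critic and user scores, a breakdown by site and rater type can make the comparison easier to read. The following is a minimal sketch on made-up scores, not the report's data; it also counts the scores behind each mean so that group sizes stay visible.

```r
library(dplyr)
library(tibble)

# Made-up long-format scores for two films
toy_long <- tribble(
  ~title,   ~site,             ~type,    ~score_0to5,
  "Film A", "fandango",        "user",   4.5,
  "Film A", "rotten_tomatoes", "critic", 3.0,
  "Film A", "rotten_tomatoes", "user",   4.0,
  "Film B", "fandango",        "user",   4.0,
  "Film B", "rotten_tomatoes", "critic", 2.0,
  "Film B", "rotten_tomatoes", "user",   3.0
)

# One mean per site and rater type, with the number of scores behind it
site_type_means <- toy_long |>
  group_by(site, type) |>
  summarise(mean_score = mean(score_0to5), n_scores = n(), .groups = "drop")

site_type_means
```

Keeping the inflation rate in its own object, rather than appending it to a column of mean scores, would likewise avoid mixing a proportion with values on the 0–5 scale.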
This section uses visualizations to provide a visual summary of the data and reinforce the findings from the article. The first plot, Displayed stars vs. actual average on Fandango, is a scatter plot that directly compares the two Fandango rating metrics.
Parity Line: The dashed line represents parity, where the displayed stars are exactly equal to the actual rating value.
Inflation: Any point that falls above this line represents a movie where the displayed star rating was inflated, or rounded up, compared to its true average. This visualization powerfully demonstrates the systematic upward rounding that was the central finding of the original FiveThirtyEight investigation.
# Step 12: Displayed stars vs actual average on Fandango
ggplot(comp3, aes(x = fandango_rating_value, y = fandango_stars)) +
  geom_point(alpha = 0.6) +
  geom_abline(slope = 1, intercept = 0, linetype = 2) +
  labs(
    title = "Displayed stars vs. actual average on Fandango (2015 sample)",
    subtitle = "Dashed line = parity; points above line indicate inflation",
    x = "Actual average (ratingValue, 0–5)",
    y = "Displayed stars (0–5)"
  )
The scatter plot shows the relationship between Fandango’s displayed
star ratings and the underlying average values for the 146 films in the
sample. Each point represents a movie, although films with identical
values are drawn on top of each other. The dashed line marks parity,
where the displayed stars would match the underlying average. Points
above this line are cases where Fandango rounded its displayed rating
up. Most movies have displayed ratings that are equal to or higher than
their underlying average, which is consistent with the upward rounding
described in the article.
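Ratings are reported to one decimal place and stars in half-star steps, so several films can share the same coordinates and alpha blending only partly reveals the overlap. One option, sketched here with made-up values, is ggplot2's geom_count(), which sizes each point by the number of films at that position. The counts at the end are a simple numeric companion to the plot.

```r
library(ggplot2)

# Made-up ratings; the first two films share the same coordinates
toy <- data.frame(
  rating = c(4.5, 4.5, 4.0, 4.1, 3.9),
  stars  = c(5.0, 5.0, 4.0, 4.5, 4.0)
)

# geom_count() maps the number of observations at each location to point size
p <- ggplot(toy, aes(x = rating, y = stars)) +
  geom_count(alpha = 0.6) +
  geom_abline(slope = 1, intercept = 0, linetype = 2) +
  labs(x = "Actual average (0-5)", y = "Displayed stars (0-5)", size = "Films")

# Films above and exactly on the parity line
n_above <- sum(toy$stars > toy$rating)
n_on    <- sum(toy$stars == toy$rating)
```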
___________________________________________________________________________________________________________________________________
This section visualizes the distribution of movie scores from different websites on a common 0–5 scale. This allows for a direct comparison of how other sites’ ratings are distributed relative to each other, and implicitly, to Fandango’s.
Histograms: The code creates a separate histogram for each site (excluding Fandango) to show the frequency of ratings at different score levels.
Insights: With Fandango filtered out, the panels show how ratings from Rotten Tomatoes, IMDb, and Metacritic spread across the 0–5 scale. Fandango’s own distribution is not drawn here, so any contrast with Fandango rests on the site means reported earlier (3.85 for Fandango versus 3.10 to 3.37 for the other sites) rather than on this plot.
# Step 13: Cross-site distributions on common 0–5 scale
ratings_long |>
  filter(site != "fandango") |>
  mutate(site = str_to_title(str_replace_all(site, "_", " "))) |>
  ggplot(aes(x = score_0to5)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ site) +
  labs(
    title = "User/critic scores across sites on the same 0–5 scale",
    x = "Score (0–5)", y = "Count"
  )
This set of histograms shows how movie scores are distributed on a 0–5
scale for IMDb, Metacritic, and Rotten Tomatoes; the Metacritic and
Rotten Tomatoes panels pool critic and user scores. IMDb and Metacritic
ratings tend to cluster toward the middle of the scale (between 2.5 and
4), while Rotten Tomatoes displays a wider spread, including a notable
number of higher ratings. Because Fandango is excluded from the plot,
the comparison with Fandango relies on the site means: the other sites
average between 3.10 and 3.37, against 3.85 for Fandango. The
histograms therefore show the range of scores on the other platforms;
they do not by themselves demonstrate Fandango’s rounding.
__________________________________________________________________________________________________________________________________
This section lists ten of the films with the largest gap between the
displayed star rating and the underlying average rating on Fandango.
Ranking: The code sorts the movies by fandango_diff_calc (fandango_stars minus fandango_rating_value, computed in Step 8) in descending order and keeps the first ten rows.
Final Output: The resulting table names specific movies whose Fandango rating received the largest “round-up” in the sample, giving concrete examples of the rounding described in the article. Films with equal gaps are tied, so their order in the table is not a ranking.
# Step 14: Top-10 inflation by displayed minus actual Fandango rating (stars)
top_inflated <- comp3 |>
  arrange(desc(fandango_diff_calc)) |>
  slice(1:10) |>
  select(title, year, fandango_stars, fandango_rating_value, fandango_diff_calc)
kable(top_inflated, digits = 2, caption = "Top-10 inflation by displayed minus actual Fandango rating (stars)")
title | year | fandango_stars | fandango_rating_value | fandango_diff_calc |
---|---|---|---|---|
Avengers: Age of Ultron | 2015 | 5.0 | 4.5 | 0.5 |
Cinderella | 2015 | 5.0 | 4.5 | 0.5 |
Ant-Man | 2015 | 5.0 | 4.5 | 0.5 |
Do You Believe? | 2015 | 5.0 | 4.5 | 0.5 |
Hot Tub Time Machine 2 | 2015 | 3.5 | 3.0 | 0.5 |
The Water Diviner | 2015 | 4.5 | 4.0 | 0.5 |
Irrational Man | 2015 | 4.0 | 3.5 | 0.5 |
Top Five | 2014 | 4.0 | 3.5 | 0.5 |
Shaun the Sheep Movie | 2015 | 4.5 | 4.0 | 0.5 |
Love & Mercy | 2015 | 4.5 | 4.0 | 0.5 |
The “Top-10 inflation” table lists movies with the largest difference between Fandango’s displayed stars and the underlying average rating. The difference is 0.5 stars in every row, which is the maximum in this sample. Since about 13 films share that gap (9% of 146), the ten shown are the first ten tied films rather than a strict ranking. Popular titles like “Avengers: Age of Ultron,” “Cinderella,” and “Ant-Man” all show a 0.5-star gap, which indicates that the rounding was not limited to obscure titles but also affected major releases.
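When many films tie at the largest gap, cutting at a fixed row count drops some of them arbitrarily. dplyr's slice_max() keeps every row tied at the cutoff when with_ties = TRUE, which is its default. A small sketch with made-up gaps:

```r
library(dplyr)
library(tibble)

# Made-up gaps: three films tie at the maximum
toy <- tibble(
  title = c("A", "B", "C", "D"),
  gap   = c(0.5, 0.5, 0.5, 0.3)
)

# Fixed cut: keeps exactly two rows and silently drops the third tied film
top2_cut <- toy |>
  arrange(desc(gap)) |>
  slice(1:2)

# Tie-aware cut: keeps every film tied at the cutoff value
top2_ties <- toy |>
  slice_max(gap, n = 2, with_ties = TRUE)

nrow(top2_cut)   # 2
nrow(top2_ties)  # 3
```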
This section provides the final deliverable of the assignment: the cleaned and transformed data frame. It takes the movies_clean data frame, which was prepared in the previous steps, and assigns it to a new variable called final_df. This data frame contains a carefully selected subset of columns with meaningful names and includes the key target variable (inflated_flag). The table shown is a preview of the first 10 rows, demonstrating that the data is now structured and ready for any future analysis, such as modeling, visualization, or statistical inference. This step formally concludes the data transformation portion of the project.
# Step 15: Final subset for downstream analysis
final_df <- movies_clean
kable(head(final_df, 10), caption = "Final deliverable: cleaned subset + target")
title | year | fandango_stars | fandango_rating_value | imdb_norm | rotten_tomatoes_norm | rotten_tomatoes_user_norm | metacritic_norm | metacritic_user_norm | fandango_difference | fandango_diff_calc | inflated_flag | imdb_user_vote_count | metacritic_user_vote_count | fandango_votes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ant-Man | 2015 | 5.0 | 4.5 | 3.90 | 4.00 | 4.50 | 3.20 | 4.05 | 0.5 | 0.5 | TRUE | 103660 | 627 | 12055 |
Avengers: Age of Ultron | 2015 | 5.0 | 4.5 | 3.90 | 3.70 | 4.30 | 3.30 | 3.55 | 0.5 | 0.5 | TRUE | 271107 | 1330 | 14846 |
Black Sea | 2015 | 4.0 | 3.5 | 3.20 | 4.10 | 3.00 | 3.10 | 3.30 | 0.5 | 0.5 | TRUE | 16547 | 37 | 218 |
Cinderella | 2015 | 5.0 | 4.5 | 3.55 | 4.25 | 4.00 | 3.35 | 3.75 | 0.5 | 0.5 | TRUE | 65709 | 249 | 12640 |
Do You Believe? | 2015 | 5.0 | 4.5 | 2.70 | 0.90 | 4.20 | 1.10 | 2.35 | 0.5 | 0.5 | TRUE | 3136 | 31 | 1793 |
Far From The Madding Crowd | 2015 | 4.5 | 4.0 | 3.60 | 4.20 | 3.85 | 3.55 | 3.75 | 0.5 | 0.5 | TRUE | 12129 | 35 | 804 |
Hot Tub Time Machine 2 | 2015 | 3.5 | 3.0 | 2.55 | 0.70 | 1.40 | 1.45 | 1.70 | 0.5 | 0.5 | TRUE | 19560 | 88 | 1021 |
Irrational Man | 2015 | 4.0 | 3.5 | 3.45 | 2.10 | 2.65 | 2.65 | 3.80 | 0.5 | 0.5 | TRUE | 2680 | 17 | 252 |
Leviathan | 2014 | 4.0 | 3.5 | 3.85 | 4.95 | 3.95 | 4.60 | 3.60 | 0.5 | 0.5 | TRUE | 22521 | 145 | 64 |
Love & Mercy | 2015 | 4.5 | 4.0 | 3.90 | 4.45 | 4.35 | 4.00 | 4.25 | 0.5 | 0.5 | TRUE | 5367 | 54 | 864 |
The final_df data frame offers a fully cleaned and analysis-ready
subset of the movie ratings data, with each row representing a film and
its ratings from multiple sources. It includes clear, human-friendly
column names and the key target variable inflated_flag, which
identifies movies where Fandango’s displayed star rating was at least
0.5 higher than its actual average. The preview demonstrates that the
dataset is now well-structured for downstream analysis, such as
statistical modeling or further visualization, meeting the requirements
for a reproducible and transparent data science workflow.
___________________________________________________________________________________________________________________________________
This dataset, from the 2015 FiveThirtyEight investigation, has several
inherent limitations:
Temporal Scope:
The data is a snapshot in time, covering movies from 2014-2015. It
doesn’t reflect any changes Fandango may have made to its rating
practices after the article’s publication.
Selection Bias:
The sample primarily includes widely-released movies that appear on
multiple major review sites, potentially excluding independent or
international films. This might not be representative of the entire
cinematic landscape.
Methodological Differences:
The various review sites (IMDb, Rotten Tomatoes, Metacritic) use
different underlying methodologies and user bases. Their scores aren’t a
“ground truth” but rather a separate perspective, each with its own
biases.
Data Integrity:
While we addressed minor issues like the column typo
(metacritic_user_nom), other subtle inconsistencies may exist and could
affect more complex analyses.
Our analysis is consistent with the central claim of the FiveThirtyEight
article: the displayed Fandango star rating tends to exceed its
underlying numerical average. The inflated_flag, which we defined as a
difference of at least 0.5 stars, is TRUE for about 9% of films in this
dataset; smaller upward gaps are not captured by the flag. By contrast,
scores from IMDb, Rotten Tomatoes, and Metacritic average lower (3.10
to 3.37, against 3.85 for Fandango) and spread more widely across the
0–5 range.
To build on this project, we recommend the following:
Temporal Analysis:
Obtain more recent data to determine if Fandango’s rating inflation
persisted or changed in the years following the original article. This
would provide a valuable update to the initial findings.
Weighted Averages:
Incorporate the vote counts from each site to create a more robust
analysis. This would give more weight to movies with a larger number of
reviews, improving the accuracy of cross-site comparisons.
Predictive Modeling:
Use the tidy long format data to build a predictive model. We could use
scores from other sites to predict a movie’s Fandango star rating,
helping to quantify the exact degree of inflation.
Threshold Sensitivity:
Explore the impact of changing the inflation threshold (e.g., from 0.5
stars to 0.4 or 0.6) to test the robustness of our inflated_flag
finding.
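The threshold check in the last recommendation can be sketched as a sweep over candidate cutoffs. The values below are made up for illustration, and the small tolerance is an assumption added here: differences of one-decimal numbers are not always exact in floating point, so a gap that equals the threshold on paper can fall just below it.

```r
# Made-up displayed stars and underlying ratings for five films
stars  <- c(5.0, 4.5, 4.0, 4.5, 3.5)
rating <- c(4.5, 4.1, 3.9, 4.3, 3.2)
gap <- stars - rating

# Without a tolerance, 3.5 - 3.2 >= 0.3 evaluates to FALSE in floating point
tol <- 1e-8
thresholds <- c(0.1, 0.2, 0.3, 0.4, 0.5)

# Share of films flagged as inflated at each threshold
sensitivity <- data.frame(
  threshold = thresholds,
  share_flagged = sapply(thresholds, function(t) mean(gap >= t - tol))
)

sensitivity
```

On the real data, the same sweep over fandango_diff_calc would show how quickly the flagged share changes as the threshold moves away from 0.5.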
fandango_score_comparison.csv and fandango_scrape.csv pulled via raw GitHub URLs.
sessionInfo()
## R version 4.4.1 (2024-06-14 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] httr_1.4.7 scales_1.3.0 knitr_1.49 glue_1.7.0
## [5] janitor_2.2.1 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
## [9] dplyr_1.1.4 purrr_1.0.2 readr_2.1.5 tidyr_1.3.1
## [13] tibble_3.2.1 ggplot2_3.5.1 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 generics_0.1.3 stringi_1.8.4 hms_1.1.3
## [5] digest_0.6.37 magrittr_2.0.3 evaluate_1.0.3 grid_4.4.1
## [9] timechange_0.3.0 fastmap_1.2.0 jsonlite_1.9.0 jquerylib_0.1.4
## [13] cli_3.6.3 rlang_1.1.4 crayon_1.5.3 bit64_4.6.0-1
## [17] munsell_0.5.1 withr_3.0.2 cachem_1.1.0 yaml_2.3.10
## [21] tools_4.4.1 parallel_4.4.1 tzdb_0.4.0 colorspace_2.1-1
## [25] curl_6.2.1 vctrs_0.6.5 R6_2.6.1 lifecycle_1.0.4
## [29] snakecase_0.11.1 bit_4.5.0.1 vroom_1.6.5 pkgconfig_2.0.3
## [33] pillar_1.10.1 bslib_0.9.0 gtable_0.3.6 xfun_0.51
## [37] tidyselect_1.2.1 rstudioapi_0.17.1 farver_2.1.2 htmltools_0.5.8.1
## [41] rmarkdown_2.29 labeling_0.4.3 compiler_4.4.1