Video games. The manifestation of our childhood daydreams. Whether we wanted to be an elite street racer, a superhero in a dystopian suburbia, or anything in between, it seemed it was all one power button and two AA batteries away. My dataset looks at some of the top-performing video games and the mark they left on the community. Sina Ghafouri, an MSc (Master of Science) student in Complex Systems Physics at Shahid Beheshti University, scraped Metacritic.com, a well-respected review-aggregation website. Metacritic.com distills the opinions of respected critics and publishes that data to give consumers insight into the entertainment sector before they spend money. This dataset in particular looks at video games that are playable on the Xbox Series X. I am a huge video game geek, and part of my passion is trying to understand what happens behind the scenes, looking at video games not just for their performance and storytelling, but also for their influence and culture.
The dataset has many variables; the ones listed below are those pertinent to our investigation and analysis:
| Variable Name | Description |
|---|---|
| Name | Name of the video game |
| Developer | Name of the company that developed the game |
| Release Date | Date that the game was released |
| Metascore | Aggregated professional-critic score |
| Userscore | Aggregated score given to the game by the general public |
| # of User Reviews | Number of general users that reviewed the game |
| Genres | Genres attributed to the video game |
Variance and exploration: 1. Are there any hidden gems in the dataset? (Defined as having a low Metascore but a high userscore.) 2. Is there any pattern in the differences between acclaimed critics and the general public?
Statistical Analysis: Is there a statistically significant difference between the mean Metascore and the mean userscore?
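The hidden-gem definition in the first question can be made concrete with a pair of cutoffs. Below is a minimal sketch in base R on a toy data frame. The games, scores, and the `meta_max` / `user_min` thresholds are all assumptions chosen for illustration, not values taken from the dataset.

```r
# Toy data: hypothetical games and scores, NOT taken from the dataset.
toy <- data.frame(
  Name      = c("Game A", "Game B", "Game C", "Game D"),
  Metascore = c(62, 88, 70, 55),     # critics, 0-100 scale
  Userscore = c(8.4, 6.1, 7.2, 8.0), # users, 0-10 scale
  stringsAsFactors = FALSE
)

# Put both scores on the 0-100 scale, then take user minus critic.
toy$Userscore  <- toy$Userscore * 10
toy$Difference <- toy$Userscore - toy$Metascore

# Assumed cutoffs for "low Metascore, high userscore"; adjust as needed.
meta_max <- 70
user_min <- 80

# Keep games that critics scored below meta_max and users scored at or above user_min.
hidden_gems <- toy[toy$Metascore < meta_max & toy$Userscore >= user_min, ]
print(hidden_gems)
```

Where the cutoffs sit is a judgment call; a percentile-based rule would work just as well as fixed numbers.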
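library(tidyverse)
library(plotly)  # used for the interactive 3D scatter plot later on
sxgames <- read_csv("xsxgame.csv")
# Coerce userscores to numeric; entries that are not numbers become NA
sxgames$Userscore <- as.numeric(sxgames$Userscore)
# Replace spaces with underscores in column names and developer names
colnames(sxgames) <- gsub(" ", "_", colnames(sxgames))
sxgames$Developer <- gsub(" ", "_", sxgames$Developer)
# Drop columns that are not used in this investigation
sxgames <- sxgames %>% select(-c(Link, Distributer, Also_On, Summary, Meta_Status, Critic_Mixed, Critic_Positive, Critic_Negative, Userscore_Status, Awards))
# Take the first token of Genres (letters, digits, apostrophes, hyphens) as the primary genre
sxgames$Genre <- str_extract(sxgames$Genres, "([[:alnum:]]|'|-)+")
sxgames <- rename(sxgames, User_Reviews = `#_of_User_Reviews`)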
We not only loaded the data but also cleaned it: replacing spaces with underscores, making sure each variable has an appropriate class, and removing variables that won’t be of use in this investigation.
One of the biggest issues with video games occurs on the social level: the disconnect between how much users actually enjoy a game and the praise given to its developers by respected critics. Let’s test whether Metascores and userscores (scores from the general public) differ significantly.
Looking at the missing data first:
colSums(is.na(sxgames))
## Name Developer Release_Date Metascore
## 0 1 0 0
## #_of_Critic_Reviews Userscore User_Reviews User_Positive
## 0 18 3 3
## User_Mixed User_Negative Genres #_Of_Players
## 3 3 0 36
## Rating Genre
## 16 0
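The counts above show 18 missing userscores. A short sketch of how such gaps can arise from the numeric coercion and how they are usually handled follows. The raw values here are hypothetical, and the `"tbd"` placeholder is an assumption about what a non-numeric entry might look like, not something confirmed from the file.

```r
# Hypothetical raw values; "tbd" is an assumed placeholder, not confirmed from the file.
raw_userscore <- c("8.1", "tbd", "6.5", NA, "7.0")

# as.numeric() turns anything it cannot parse into NA (and emits a warning).
userscore <- suppressWarnings(as.numeric(raw_userscore))

# Count the gaps and summarise only the values that are present.
n_missing     <- sum(is.na(userscore))
mean_complete <- mean(userscore, na.rm = TRUE)

cat("missing:", n_missing, "\n")
cat("mean of non-missing values:", mean_complete, "\n")
```

For the unpaired test used later, `t.test()` drops missing values from each vector before computing, so games without a userscore simply do not contribute to the userscore mean.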
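We first have to make the userscores compatible: userscores are on a 0 to 10 scale while Metascores are on a 0 to 100 scale, so we multiply the userscores by 10.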
sxgames <- sxgames %>% mutate(Userscore = Userscore * 10)
boxplot(sxgames$Metascore, sxgames$Userscore,
names = c("Metacritic Scores", "User Scores"),
xlab = "Score Source", ylab = "Score Given",
col = c("orange", "purple"))
We see a notable difference visually, but how statistically significant are the differences?
H₀: μ_Metascore = μ_Userscore
Hₐ: μ_Metascore ≠ μ_Userscore
t.test(sxgames$Metascore, sxgames$Userscore)
##
## Welch Two Sample t-test
##
## data: sxgames$Metascore and sxgames$Userscore
## t = 7.511, df = 309.65, p-value = 6.323e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 7.551141 12.911772
## sample estimates:
## mean of x mean of y
## 75.59031 65.35885
Our p-value is 6.323e-13, far below the conventional 0.05 threshold, so we reject the null hypothesis. We have sufficient evidence to conclude that the mean Metascore (75.59) and the mean userscore (65.36) differ, with critics scoring games between 7.55 and 12.91 points higher on average (95% confidence interval).
Negative values in the Difference column (Userscore minus Metascore) indicate that the userscore was lower than the Metascore, and the further a value is below 0, the less the general public enjoyed the game relative to the critics. We’ll look at games with a negative difference and more than 50 user reviews.
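Because every game carries both a Metascore and a userscore, a paired test on the per-game differences is a natural cross-check of the two-sample result. This is a minimal sketch on made-up numbers, offered as an alternative view rather than as the analysis used here; none of the values below come from the dataset.

```r
# Made-up scores for six hypothetical games; NOT values from the dataset.
metascore <- c(82, 75, 90, 68, 77, 85)
userscore <- c(70, 72, 61, 66, 58, 80)  # assumed to be already rescaled to 0-100

# Each game has both scores, so test the per-game differences directly.
differences   <- userscore - metascore
paired_result <- t.test(differences, mu = 0)  # one-sample test on the differences

# Equivalent call through the paired interface of t.test().
paired_alt <- t.test(userscore, metascore, paired = TRUE)

print(paired_result)
```

The two calls give the same statistic and p-value; the paired form removes game-to-game variation that the unpaired Welch test treats as noise.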
# Difference = Userscore - Metascore; negative means users rated the game lower than critics
sxgames <- sxgames %>% mutate(Difference = Userscore - Metascore)
neg_games <- sxgames %>% filter(Difference < 0, User_Reviews > 50)
plot_ly(neg_games, x = ~Release_Date, y = ~Difference, z = ~User_Reviews, color = ~Genre) %>%
layout(title = "User Reviews vs. Difference (Userscore - Metascore) vs. Release Date",
scene = list(xaxis = list(title = list(text = "Release Date", font = list(size = 8))),
yaxis = list(title = list(text = "Difference (Userscore - Metascore)", font = list(size = 8))),
zaxis = list(title = list(text = "# of User Reviews", font = list(size = 8))))) %>%
add_markers(text = ~paste("Title: ", Name, "<br>",
"Developer: ", Developer, "<br>",
"Genre: ", Genre, "<br>",
"Release Date: ", Release_Date, "<br>",
"# of User Reviews: ", User_Reviews, "<br>",
"Difference: ", Difference, "<br>"),
hoverinfo = "text")
It is very interesting to see our suspicion borne out. In the top quadrant (x = Userscore, y = # of User Reviews), every single game was created by a AAA developer. Three of the top 5 games with the biggest differences were created by Electronic Arts. Allow me to paint just part of the bigger picture. In 2022 alone, EA stated that its revenue was just shy of $7 billion, with only 7 games going to market that year. Many of these games felt rushed, with users experiencing countless bugs and glitches and having to go through the frustration of downloading gigabytes upon gigabytes of updates that did not improve the game by much. And the Xbox market? It was the biggest 1-star parade of frustration and disappointment. The reviews were so low that they turned EA’s review platform in the Microsoft Store into a humor-posting carnival. Yet somehow, critics are giving these video games high scores. Unfortunately, I do not have the data necessary to look further into this, but my passion for illustrating this truth will not die out.
Video games are not going anywhere. They have been, since their inception, a crucial part of world culture. They’re a form of expression, an adventure, and a testament to how far technology has come. They’re also a testament that there’s more than meets the eye. With billions of dollars circulating, one can only assume that these powerhouses have remained in their position by making the right judgments, or perhaps in this case, the “write” judgments.
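The "top 5 biggest differences" ranking can be checked with a sort and a count per developer. Here is a minimal sketch in base R on a toy data frame; the game names, developer labels, and numbers are hypothetical stand-ins, not rows from the dataset.

```r
# Hypothetical games, developers, and differences; NOT taken from the dataset.
toy <- data.frame(
  Name         = paste("Game", LETTERS[1:7]),
  Developer    = c("Dev_1", "Dev_2", "Dev_1", "Dev_3", "Dev_1", "Dev_2", "Dev_3"),
  Difference   = c(-45, -12, -38, -5, -51, -20, 3),
  User_Reviews = c(300, 80, 120, 40, 500, 60, 90),
  stringsAsFactors = FALSE
)

# Same filter as the analysis: users scored lower than critics, more than 50 reviews.
neg <- toy[toy$Difference < 0 & toy$User_Reviews > 50, ]

# Most negative differences first, then keep the top five.
top5 <- head(neg[order(neg$Difference), ], 5)

# Count how often each developer appears among those five.
dev_counts <- sort(table(top5$Developer), decreasing = TRUE)
print(top5)
print(dev_counts)
```

Run against `neg_games`, the same three steps would list which developers dominate the largest gaps.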
Sources: