Assignment 8

Introduction

EA Sports is known for popular sports games like Madden, FIFA (now EA Sports FC, or EAFC), and NHL. In recent years, players have started to turn on EA, arguing that the company has gotten lazy and that its games lack detail because it has no real competitor in football or soccer games. The question I want to answer is: are the reviews for Madden and EAFC similar, even though the two games have different production teams and different audiences?

Loading Data

Load the necessary packages.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(dplyr)
library(rvest)

Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding
library(jsonlite)

Attaching package: 'jsonlite'

The following object is masked from 'package:purrr':

    flatten
library(ggplot2)
library(ggwordcloud)
library(lubridate) 
library(tidytext)  
library(textdata)  
library(widyr)     
library(igraph)    

Attaching package: 'igraph'

The following objects are masked from 'package:lubridate':

    %--%, union

The following objects are masked from 'package:dplyr':

    as_data_frame, groups, union

The following objects are masked from 'package:purrr':

    compose, simplify

The following object is masked from 'package:tidyr':

    crossing

The following object is masked from 'package:tibble':

    as_data_frame

The following objects are masked from 'package:stats':

    decompose, spectrum

The following object is masked from 'package:base':

    union
library(ggraph)   

I scraped a mix of positive and negative reviews. There are 100 reviews for each game.

EAFC_Reviews_Clean<-read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/shannonr1_xavier_edu/IQDkvjrmUQyVTYGlfmqc4i1-AUS_OFT9IQMDNbICwKPqm-c?download=1")
New names:
• `` -> `...1`
Rows: 100 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): reviews.review
dbl (1): ...1
lgl (1): reviews.steam_purchase

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Madden_Reviews_Clean<-read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/shannonr1_xavier_edu/IQAkZBzKlLPuSI2gQq-SODnaATe4A4b8GP5mKpfCuyHcNsY?download=1")
New names:
• `` -> `...1`
Rows: 100 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): reviews.review
dbl (1): ...1
lgl (1): reviews.steam_purchase

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Tidying the Data

Next, I tidy the data to find the key words reviewers use, which makes it possible to tell whether they are saying positive or negative things about each game.

bing <- 
  get_sentiments("bing")
tidy_EAFC<-
  EAFC_Reviews_Clean %>% 
  unnest_tokens(word, reviews.review) %>% 
  anti_join(stop_words) 
Joining with `by = join_by(word)`
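To make the tidying step concrete, here is a minimal sketch on a made-up one-line review. The object `toy_review` and its text are hypothetical and not part of the scraped data. It shows `unnest_tokens()` splitting the text into one lowercase word per row and `anti_join()` dropping common stop words. Passing `by = "word"` explicitly should also avoid the "Joining with" message.

```r
library(dplyr)
library(tidytext)

# Hypothetical one-row stand-in for a review table (not the scraped data)
toy_review <- tibble(id = 1, text = "The gameplay is fun")

toy_tokens <- toy_review %>%
  unnest_tokens(word, text) %>%        # one lowercase token per row
  anti_join(stop_words, by = "word")   # drop stop words such as "the" and "is"

toy_tokens
```

The same two steps are what produce `tidy_EAFC` and `tidy_Madden` from the `reviews.review` column.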
tidy_EAFC %>% 
  group_by(word) %>% 
  summarize(n = n()) %>% 
  arrange(-n)
# A tibble: 676 × 2
   word         n
   <chr>    <int>
 1 game        65
 2 football    15
 3 gameplay    15
 4 play        14
 5 shit        14
 6 fifa        12
 7 time        10
 8 ea           9
 9 feels        8
10 fun          8
# ℹ 666 more rows
tidy_EAFC %>% 
  pairwise_count(item = word,        
                 feature = reviews.steam_purchase, 
                 upper = FALSE) %>%  
  arrange(-n)
# A tibble: 228,150 × 3
   item1 item2     n
   <chr> <chr> <dbl>
 1 fc    25        2
 2 game  rush      1
 3 game  5         1
 4 rush  5         1
 5 game  mode      1
 6 rush  mode      1
 7 5     mode      1
 8 game  fun       1
 9 rush  fun       1
10 5     fun       1
# ℹ 228,140 more rows
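One thing worth noting about the counts above: `pairwise_count()` counts how many distinct `feature` values contain both words. Because `reviews.steam_purchase` is a logical column, there are at most two groups, so no pair can exceed 2. A minimal sketch of counting co-occurrence within each review instead is below. It uses a toy table with a hypothetical `review_id` column; in the real data the `...1` column might serve that role, but that is an assumption about what the column holds.

```r
library(dplyr)
library(tidytext)
library(widyr)

# Toy stand-in for the scraped data; review_id is a hypothetical per-review key
toy_reviews <- tibble(
  review_id = 1:4,
  review = c("fun game great gameplay",
             "bad game buggy gameplay",
             "fun game",
             "fun football game")
)

toy_pairs <- toy_reviews %>%
  unnest_tokens(word, review) %>%
  pairwise_count(item = word,
                 feature = review_id,   # count pairs within the same review
                 upper = FALSE) %>%
  arrange(desc(n))

toy_pairs
```

With one group per review, `n` reads as "number of reviews in which both words appear", which is usually what a co-occurrence table is meant to show.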
EAFC_counts <- 
  tidy_EAFC %>% 
  group_by(word) %>% 
  summarize(n = n()) %>% 
  inner_join(bing)
Joining with `by = join_by(word)`
tidy_Madden<-
  Madden_Reviews_Clean %>% 
  unnest_tokens(word, reviews.review) %>% 
  anti_join(stop_words)
Joining with `by = join_by(word)`
tidy_Madden %>% 
  group_by(word) %>% 
  summarize(n = n()) %>% 
  arrange(-n)
# A tibble: 736 × 2
   word         n
   <chr>    <int>
 1 game        95
 2 madden      36
 3 play        34
 4 ea          18
 5 pc          14
 6 bad         13
 7 buy         13
 8 football    13
 9 time        13
10 fun         12
# ℹ 726 more rows
tidy_Madden %>% 
  pairwise_count(item = word,        
                 feature = reviews.steam_purchase, 
                 upper = FALSE) %>%  
  arrange(-n)
# A tibble: 263,390 × 3
   item1  item2       n
   <chr>  <chr>   <dbl>
 1 game   madden      2
 2 game   win         2
 3 madden win         2
 4 game   plays       2
 5 madden plays       2
 6 win    plays       2
 7 game   offense     2
 8 madden offense     2
 9 win    offense     2
10 plays  offense     2
# ℹ 263,380 more rows
Madden_counts <- 
  tidy_Madden %>% 
  group_by(word) %>% 
  summarize(n = n()) %>% 
  inner_join(bing)
Joining with `by = join_by(word)`
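Since the question is whether the two games are reviewed similarly, one simple way to compare them is the share of sentiment-bearing words that are negative. The sketch below is one possible approach, not a result: `sentiment_share()` is a hypothetical helper, and `toy_counts` contains made-up numbers that only mimic the columns of `Madden_counts` and `EAFC_counts` (`word`, `n`, `sentiment`).

```r
library(dplyr)

# Hypothetical helper: total word occurrences per sentiment and their share
sentiment_share <- function(counts, game) {
  counts %>%
    group_by(sentiment) %>%
    summarize(total = sum(n), .groups = "drop") %>%
    mutate(share = total / sum(total), game = game)
}

# Made-up counts for illustration only (same columns as the *_counts tables)
toy_counts <- tibble(
  word = c("fun", "bad", "broken"),
  n = c(6, 3, 1),
  sentiment = c("positive", "negative", "negative")
)

toy_share <- sentiment_share(toy_counts, "toy")
toy_share
```

Applied to the real tables, something like `bind_rows(sentiment_share(Madden_counts, "Madden"), sentiment_share(EAFC_counts, "EAFC"))` would put the two games side by side.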

Madden 26 Visualizations

For the first visualization, I combined the top positive and negative words to see whether one sentiment dominates the top of the chart. For Madden, negative words seem to dominate the top, with many of the positive words falling near the middle.

Madden_counts %>% 
  group_by(sentiment) %>% 
  slice_max(n, n = 10) %>%    
  ungroup() %>% 
  ggplot(aes(x = reorder(word, n), y = n, fill = sentiment)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Most Common Positive and Negative Words in Madden Reviews",
    x = "Word",
    y = "Count"
  ) +
  scale_fill_manual(values = c("positive" = "steelblue", "negative" = "firebrick")) +
  theme_minimal()

For the next visualization, I wanted the words to pop out at you, since these are the words that could sway someone toward or away from buying the game. At first glance, I see a lot of red and a lot of words you don't want to see when deciding whether to buy a game.

ggplot(Madden_counts, aes(
  label = word,
  size  = n,
  color = sentiment
)) +
  geom_text_wordcloud_area(eccentricity = 1) +
  scale_size_area(max_size = 20) +
  scale_color_manual(values = c("positive" = "steelblue", "negative" = "firebrick")) +
  theme_minimal() +
  labs(
    title = "Word Bubble of Madden Reviews",
    size = "Frequency"
  )

The last visualization shows the most frequent words in the reviews, split by sentiment. It shows that the negative words appear more often than the positive ones, which again is not something you want to see when buying a game.

Madden_counts %>%
  group_by(sentiment) %>% 
  slice_max(n, n = 10) %>% 
  ungroup() %>% 
  ggplot(aes(x = reorder(word, n), y = n, fill = sentiment)) +
  geom_col() +
  facet_wrap(~ sentiment, scales = "free_y") +
  coord_flip() +
  labs(
    title = "Top Words Contributing to Positive and Negative Sentiment",
    x = "Word",
    y = "Frequency"
  ) +
  theme_minimal()

EAFC Visualizations

I made the same three visualizations for EAFC 26. These reviews were a lot worse, and the language suggests that the EAFC fan base is more passionate about its game than Madden's. You can tell these fans badly want change.

EAFC_counts %>% 
  group_by(sentiment) %>% 
  slice_max(n, n = 10) %>%    
  ungroup() %>% 
  ggplot(aes(x = reorder(word, n), y = n, fill = sentiment)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Most Common Positive and Negative Words in EAFC Reviews",
    x = "Word",
    y = "Count"
  ) +
  scale_fill_manual(values = c("positive" = "steelblue", "negative" = "firebrick")) +
  theme_minimal()

ggplot(EAFC_counts, aes(
  label = word,
  size  = n,
  color = sentiment
)) +
  geom_text_wordcloud_area(eccentricity = 1) +
  scale_size_area(max_size = 20) +
  scale_color_manual(values = c("positive" = "steelblue", "negative" = "firebrick")) +
  theme_minimal() +
  labs(
    title = "Word Bubble of EAFC Reviews",
    size = "Frequency"
  )

EAFC_counts %>%
  group_by(sentiment) %>% 
  slice_max(n, n = 10) %>% 
  ungroup() %>% 
  ggplot(aes(x = reorder(word, n), y = n, fill = sentiment)) +
  geom_col() +
  facet_wrap(~ sentiment, scales = "free_y") +
  coord_flip() +
  labs(
    title = "Top Words Contributing to Positive and Negative Sentiment",
    x = "Word",
    y = "Frequency"
  ) +
  theme_minimal()

Conclusion

In conclusion, the reviews for both games lean negative, and the EAFC reviews are even worse than the Madden ones. EA is doing a terrible job of producing games and is putting out a product that reviewers feel scams its fan base. These reviews suggest that the market needs a competitor, which would push EA to focus on its games and put more detail into them.