Advanced Data Analytics 1 Assignment 2

library(tidyverse)
library(tidytext)
library(wordcloud)
library(topicmodels)
library(topicdoc)
library(ggplot2)
library(reshape2)
library(stopwords)
library(textdata)


gamestop <- read_csv("gamestop_product_reviews.csv")
mcdonalds <- read_csv("mcdonalds_reviews.csv")

Question 1 Text and Sentiment Analysis

(A)

(B)

(i)

In general, negative words are more common in the reviews than positive words, which suggests customers are more likely to leave negative comments than positive ones. In McDonald's case, the three most common negative words each have a count of over 150, whereas only the most common positive word exceeds 150.
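
A comparison like this can be produced by joining the tokenised reviews to a sentiment lexicon and counting words per sentiment. The sketch below is only illustrative: it uses the Bing lexicon and a toy data frame, and the column names `id` and `review` are assumptions rather than the actual columns of the review files.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for the review data; `id` and `review` are assumed column names.
reviews <- tibble(
  id = 1:3,
  review = c("The fries were cold and the service was slow",
             "Great burger and friendly staff",
             "Terrible wait, rude staff, bad coffee")
)

# One row per word, keeping only words that appear in the Bing lexicon,
# then counting how often each word occurs within each sentiment.
sentiment_counts <- reviews %>%
  unnest_tokens(word, review) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(sentiment, word, sort = TRUE)

# Total number of positive and negative word occurrences.
sentiment_totals <- count(sentiment_counts, sentiment, wt = n, name = "total")
```

Sorting `sentiment_counts` within each sentiment gives the most common positive and negative words, which can then be compared against a threshold such as 150.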

(C)

(D)

(E)

(F)

(i)

When customers use the word “waiting”, it is usually in the context of poor customer service and long delays, both in-store and at the drive-through. Customers tend to attribute this to inattentive staff or inefficiency. The wait is usually for food, assistance, or payment. Some customers believe it also leads to incorrect orders or cold food.

(ii)

These reviews are quite mixed. Some customers express excitement about the shake, or nostalgia because of its limited availability. Others express disappointment, either because some locations are less consistent than others or because they find the drink too sweet.

(iii)

When the ice cream machine is mentioned, the comments are generally quite negative. They usually describe frustration over power outages or the machine being switched off late at night. This recurring theme can leave customers completely dissatisfied.
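
Readings like those in (i) to (iii) rely on pulling out the reviews that mention a given keyword. A minimal sketch of that step, assuming a text column called `review` (a hypothetical name) and using a hypothetical helper `mentions()`, could look like this:

```r
library(dplyr)
library(stringr)

# Toy stand-in for the review data; `id` and `review` are assumed column names.
reviews <- tibble(
  id = 1:4,
  review = c("Kept waiting 20 minutes at the drive-through",
             "The shake was too sweet",
             "Ice cream machine was off again",
             "Still waiting for my order, food was cold")
)

# Hypothetical helper: return the reviews that mention a keyword, ignoring case.
mentions <- function(data, keyword) {
  filter(data, str_detect(review, regex(keyword, ignore_case = TRUE)))
}

waiting_reviews <- mentions(reviews, "waiting")
machine_reviews <- mentions(reviews, "ice cream machine")
```

The filtered reviews can then be read in full to judge the context in which the keyword is used.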

(G)

Question 2 Topic Modelling Analysis

(A)

<<DocumentTermMatrix (documents: 77821, terms: 9604)>>
Non-/sparse entries: 77821/747315063
Sparsity           : 100%
Maximal term length: 27
Weighting          : term frequency (tf)
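
The matrix above has the same number of non-sparse entries as documents, which suggests that each document contains a single term. If the intention is one document per review, a sketch along the following lines may help. It uses toy data, and the column names `id` and `review` are assumptions.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for the product reviews; `id` and `review` are assumed column names.
reviews <- tibble(
  id = 1:3,
  review = c("pokemon graphics are fun",
             "batteries last a long time",
             "fun game with great graphics")
)

# Count each word within each review, after dropping stop words.
word_counts <- reviews %>%
  unnest_tokens(word, review) %>%
  anti_join(get_stopwords(), by = "word") %>%
  count(id, word)

# Cast to a document-term matrix with one row per review id.
dtm <- cast_dtm(word_counts, document = id, term = word, value = n)
dim(dtm)
```

Because `document = id` refers to the review identifier and not to a per-word row number, each review contributes several terms to its row of the matrix.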

(B)

# A tibble: 101 × 3
   topic term         beta
   <int> <chr>       <dbl>
 1     1 pokemon    0.0808
 2     1 graphics   0.0323
 3     1 energizer  0.0199
 4     1 fun        0.0190
 5     1 xbox       0.0176
 6     1 lot        0.0168
 7     1 zelda      0.0151
 8     1 black      0.0148
 9     1 2          0.0133
10     1 absolutely 0.0106
# ℹ 91 more rows
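
A table of top terms per topic like the one above is typically obtained by tidying the fitted model's beta matrix and keeping the highest-probability terms in each topic. The following is a sketch on toy word counts, not the code used for the output above; the seed, `k = 2`, and the toy words are illustrative assumptions.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Toy per-document word counts (illustrative only).
word_counts <- tibble(
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  word = c("pokemon", "graphics", "fun",
           "batteries", "energizer", "last",
           "zelda", "graphics", "fun",
           "batteries", "charge", "last"),
  n = 1
)

dtm <- cast_dtm(word_counts, id, word, n)

# Fit a small LDA model; the seed makes the fit reproducible.
lda <- LDA(dtm, k = 2, control = list(seed = 1234))

# Per-topic word probabilities, keeping the top 3 terms in each topic.
# with_ties = FALSE returns exactly 3 rows per topic even when betas tie.
top_terms <- tidy(lda, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 3, with_ties = FALSE) %>%
  ungroup()
```

With the default `with_ties = TRUE`, tied beta values can return more rows than requested, which is one possible reason a top-10 table for 10 topics could contain 101 rows.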

(C)

Topic 1: These could be comments about the graphics in Pokemon or Zelda being good.

Topic 2: Potentially about batteries for appliances such as games or monitors being a good alternative to a plug-in option.

Topic 3: Gaming on a TV is easier and more worthwhile than alternatives such as monitors or computers.

Topic 4: There is a longer recommended play time for this series, but it is worthwhile.

Topic 5: A really good game with good features.

Topic 6: The quality of the batteries is really good; they potentially last longer than others.

Topic 7:

Topic 8:

Topic 9:

Topic 10:

   topic_num topic_size mean_token_length dist_from_corpus tf_df_dist doc_prominence topic_coherence topic_exclusivity
1          1   947.9462               5.5        0.5925200          0              0       -421.3962          9.773387
2          2   926.5020               5.7        0.5600071          0              0       -441.3778          9.624160
3          3   971.1871               5.7        0.5984456          0              0       -407.8515          9.901533
4          4   958.5754               5.3        0.6059046          0              0       -416.8882          9.777006
5          5   946.9539               5.2        0.5668348          0              0       -434.9903          9.650003
6          6   965.7248               6.5        0.6023261          0              0       -407.8311          9.846372
7          7   974.3125               4.8        0.5640667          0              0       -431.9216          9.608833
8          8   974.5684               5.3        0.6078920          0              0       -403.3100          9.747891
9          9   965.4095               4.7        0.5811292          0              0       -412.6744          9.739809
10        10   972.8203               4.6        0.5736287          0              0       -417.0966          9.628106