MID-TERM BRIGHT EBEDOT

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Install the necessary packages. Comment these lines out after installation.

#install.packages('tm')
#install.packages('RColorBrewer')
#install.packages('wordcloud')
#install.packages("slam", type = "binary")

Include the packages.

library('tm')
## Loading required package: NLP
library('RColorBrewer')
library('wordcloud')
library('slam')

Process data

ZyngaData <- readRDS("Zynga.RDS")
tweets <- ZyngaData$text

# swap out all non-alphanumeric characters
# Note that the definition of what constitutes a letter, a number, or a punctuation mark varies slightly depending on your locale, so you may need to experiment a little to get exactly what you want.
# str_replace_all(tweets, "[^[:alnum:]]", " ")
# iconv(tweets, from = 'UTF-8', to = 'ASCII//TRANSLIT')
# Encoding(tweets)  <- "UTF-8"

# Function to clean tweets
clean.text = function(x)
{
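  # remove the retweet marker "rt" (this pattern is case-sensitive and also matches "rt" inside words)
  x = gsub("rt", "", x)
  # remove @mentions
  x = gsub("@\\w+", "", x)
  # remove punctuation
  x = gsub("[[:punct:]]", "", x)
  # remove numbers
  x = gsub("[[:digit:]]", "", x)
  # remove links (punctuation is already gone, so each URL is a single "http..." token)
  x = gsub("http\\w+", "", x)
  # remove runs of two or more spaces, tabs, or "|" characters
  x = gsub("[ |\t]{2,}", "", x)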
  # remove blank spaces at the beginning
  x = gsub("^ ", "", x)
  # remove blank spaces at the end
  x = gsub(" $", "", x)
  # tolower
  # x = tolower(x)
  return(x)
}

# clean tweets
tweets = clean.text(tweets)
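
The cleaned tweets printed later in this notebook contain artifacts such as "Staed" (apparently from "Started") and "withon". These trace back to the patterns in clean.text: "rt" is matched inside words, and runs of whitespace are deleted rather than collapsed to one space. A minimal sketch of a stricter variant follows. The name clean_text_strict and the sample tweet are hypothetical and not part of the original analysis, and using this function would change the word counts reported below.

```r
# Hypothetical stricter variant of clean.text (an assumption, not the original method).
clean_text_strict <- function(x) {
  # remove links first, while their punctuation is still intact
  x <- gsub("http\\S+", "", x)
  # remove the retweet marker only when it stands alone as a word
  x <- gsub("\\bRT\\b", "", x, perl = TRUE)
  # remove @mentions
  x <- gsub("@\\w+", "", x)
  # remove punctuation and digits
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  # collapse runs of spaces and tabs into one space instead of deleting them
  x <- gsub("[ \t]+", " ", x)
  # trim leading and trailing whitespace
  trimws(x)
}

# Made-up tweet for illustration only.
sample_tweet <- "RT @user: Started  playing Zynga Poker! http://t.co/abc123"
cleaned_sample <- clean_text_strict(sample_tweet)
cleaned_sample
```

Removing links before punctuation keeps the URL pattern simple, and matching "RT" on word boundaries leaves words such as "Started" untouched.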

Create word cloud of tweets

corpus = Corpus(VectorSource(tweets))

# create term-document matrix
tdm = TermDocumentMatrix(
  corpus,
  control = list(
    wordLengths=c(3,50),
    removePunctuation = TRUE,
    stopwords = c("the", "a", stopwords("english")),
    removeNumbers = TRUE,
    # tolower may cause trouble on Windows because of UTF-8 encoding, so it is set to FALSE
    tolower = FALSE) )

# convert to a dense matrix; this may consume nearly 1 GB of RAM
tdm = as.matrix(tdm)

# get word counts in decreasing order
word_freqs = sort(rowSums(tdm), decreasing=TRUE) 

#check top 50 most mentioned words
head(word_freqs, 50)
##                  Zynga                  needs                  Poker 
##                    895                    591                    545 
##                   just                   card                   sent 
##                    500                    469                    469 
##                 raised                Deborah                  Scott 
##                    401                    311                    264 
##                looking                 Prized                    now 
##                    257                    219                    213 
##                    can                  Petra                  adult 
##                    198                    187                    186 
##                 Jeneva                  found                rewards 
##                    174                    166                    159 
##                  trees                   Play FarmVilleCountryEscape 
##                    158                    157                    155 
##                    How                    car                    bit 
##                    143                    141                    140 
##                  video            sponsorship                RTHeres 
##                    139                    139                    138 
##                needing                  shook                  gotas 
##                    138                    138                    137 
##                 Points                  Betty                  Fruit 
##                    129                    125                    113 
##                Kathryn         FarmVilleOnWeb                    The 
##                    112                    111                    108 
##                 Career                 Spring                    get 
##                    105                    104                    102 
##                 Corner                       ️                   Game 
##                    102                     98                     98 
##                    You                 County                    use 
##                     94                     94                     92 
##                    Hat                  Black                   game 
##                     92                     92                     88 
##                   King                    win 
##                     88                     87
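
The as.matrix conversion above is what drives the memory use. Since the slam package is already loaded, one possible alternative is to sum the rows of the sparse matrix directly with slam's row_sums, which should give the same word counts without building the dense matrix. The sketch below uses a small hand-built sparse matrix (the terms and counts are made up for illustration) rather than the Zynga data.

```r
library(slam)

# Toy 3-term x 2-document sparse matrix; all values are illustrative only.
toy_tdm <- simple_triplet_matrix(
  i = c(1L, 1L, 2L, 3L),   # term (row) index
  j = c(1L, 2L, 1L, 2L),   # document (column) index
  v = c(2, 1, 4, 3),       # counts
  nrow = 3, ncol = 2,
  dimnames = list(Terms = c("poker", "card", "farm"), Docs = c("1", "2"))
)

# Row sums computed on the sparse representation, sorted as in the notebook.
toy_freqs <- sort(row_sums(toy_tdm), decreasing = TRUE)
toy_freqs
```

On the real data, the equivalent call would be row_sums(tdm) applied to the term-document matrix before it is converted with as.matrix.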
# remove the top words that don't generate insights (here the nine most frequent, "Zynga" through "Scott")
word_freqs = word_freqs[-(1:9)]  # "1:9" selects the 1st through 9th words in the list, which we remove

# create a data frame with words and their frequencies
dm = data.frame(word=names(word_freqs), freq=word_freqs)

# Plot the corpus as a colored word cloud; needs the RColorBrewer package

wordcloud(head(dm$word, 200), head(dm$freq, 200), random.order=FALSE, colors=brewer.pal(8, "Dark2"))
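
Dropping words by position, as in word_freqs[-(1:9)], depends on the current sort order of the counts. One alternative, sketched here under the assumption that the uninformative words are known in advance, is to drop them by name. The vector drop_words and the toy counts are hypothetical.

```r
# Toy frequency vector; names and counts are illustrative only.
sample_freqs <- c(Zynga = 10, Poker = 7, looking = 5, rewards = 3)

# Hypothetical list of words judged uninformative for this analysis.
drop_words <- c("Zynga", "Poker")

# Keep only the words whose names are not in drop_words.
kept_freqs <- sample_freqs[!names(sample_freqs) %in% drop_words]
kept_freqs
```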

#check top 50 most mentioned words
head(word_freqs, 50)
##                looking                 Prized                    now 
##                    257                    219                    213 
##                    can                  Petra                  adult 
##                    198                    187                    186 
##                 Jeneva                  found                rewards 
##                    174                    166                    159 
##                  trees                   Play FarmVilleCountryEscape 
##                    158                    157                    155 
##                    How                    car                    bit 
##                    143                    141                    140 
##                  video            sponsorship                RTHeres 
##                    139                    139                    138 
##                needing                  shook                  gotas 
##                    138                    138                    137 
##                 Points                  Betty                  Fruit 
##                    129                    125                    113 
##                Kathryn         FarmVilleOnWeb                    The 
##                    112                    111                    108 
##                 Career                 Spring                    get 
##                    105                    104                    102 
##                 Corner                       ️                   Game 
##                    102                     98                     98 
##                    You                 County                    use 
##                     94                     94                     92 
##                    Hat                  Black                   game 
##                     92                     92                     88 
##                   King                    win                  Check 
##                     88                     87                     86 
##              FarmVille                Nesting                  Horse 
##                     83                     83                     82 
##                  Dolls                 Mobile                 Ribbon 
##                     82                     82                     81 
##                  today                    yet 
##                     80                     80
# I see some words I don't know or understand, so I retrieve the tweets that contain them
# Here I retrieve all the tweets that contain the lowercase word "zynga" (grep is case-sensitive by default)

index = grep("zynga", tweets)
tweets[index]
##  [1] "sino may mga zynga poker dyan\U0001f602 SEND NYO YUNG INVITER CODE NYO SAKINNNN NEED MONEYYY HAHAHA"                       
##  [2] "Peut zynga rebondir avec le jeu en lignetableau déchange readwrite bitcoin"                                                
##  [3] "zynga"                                                                                                                     
##  [4] "anyone wna play zynga poker with mi"                                                                                       
##  [5] "\ni just create this profile to give advise if you can please to give few option to zynga poker \nsomething li"            
##  [6] "how have I lost k onBall Pool amp M on zynga poker"                                                                        
##  [7] "So good History of fbs api is a core history of the web From friend feed zynga Twitter denying Sna"                        
##  [8] "بضايق فشخ لما اخسر الفلوس اللي معايا في zynga poker وبقعد مش لاقي حاجه اعملها"                                             
##  [9] "hello i wonder if you can talk with zynga poker texas holdem developer to fix the app some things is access"               
## [10] "Thx for the dinner Best Western \U0001f64f\U0001f3fcBukan zynga poker \U0001f605aqwestobatBest Western Premier"            
## [11] "zynga poker na lang"                                                                                                       
## [12] "Svp mon joue ne veut pas se débloquer aidez moi svp zynga"                                                                 
## [13] "fearless leader visiting zynga banglore"                                                                                   
## [14] "you purposely put your phone on charge while playing zynga poker so the battery doesnt run out and you end up losing chips"
## [15] "zynga bagay diyan"                                                                                                         
## [16] "le pire que jai trouvé cest zynga qui te demande presque dargumenter\n\net paypal ptn paypal\n\nJe me"                     
## [17] "I thanked zynga for helping you last week"                                                                                 
## [18] "RTStaed withon the stock market bought some Fitbit shares and and a zynga share Im atnow I dont know how you guy"          
## [19] "Staed withon the stock market bought some Fitbit shares and and a zynga share Im atnow I dont know h"                      
## [20] "naay jesa nga maadik sa zynga"                                                                                             
## [21] "You work for zynga pat Nice piece right"                                                                                   
## [22] "Heres my cellphoneI gave up likeyears ago Pretty sure zynga poker has pics of my pebus on their servers"                   
## [23] "B masyukkk zynga poker \U0001f600"                                                                                         
## [24] "RTMafia wars zynga poker dayyummmmmmmmmmmm"                                                                                
## [25] "lmao zyngas gdc pay WOULD be street racing themed"                                                                         
## [26] "Potamil zynga poker"                                                                                                       
## [27] "I only play poker on zynga now"                                                                                            
## [28] "Ive finally reachedon zynga poker If only it was real money"
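
Because grep is case-sensitive by default and the tweets were not lowercased, the search above returns only tweets containing the lowercase form "zynga". A minimal sketch of a case-insensitive search follows, using a made-up vector in place of the real tweets.

```r
# Illustrative stand-in for the cleaned tweets vector.
sample_tweets <- c("I play Zynga Poker", "zynga poker na lang", "FarmVille is fun")

# ignore.case = TRUE matches "Zynga", "zynga", "ZYNGA", and so on.
index_any_case <- grep("zynga", sample_tweets, ignore.case = TRUE)
sample_tweets[index_any_case]
```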

This study of Zynga suggests a need to grow the game's reach worldwide. Increasing communication between players should help to increase revenue. Based on the word cloud, I would suggest that Zynga add bonus points to its games, which should help boost players' motivation. I also found that users are more active on weekends than on weekdays, which gives Zynga an opportunity to engage players at those times.