MID-TERM BRIGHT EBEDOT
Install the necessary packages. Comment these lines out after installation.
#install.packages('tm')
#install.packages('RColorBrewer')
#install.packages('wordcloud')
#install.packages("slam", type = "binary")
Include the packages.
library('tm')
## Loading required package: NLP
library('RColorBrewer')
library('wordcloud')
library('slam')
Process data
ZyngaData <- readRDS("Zynga.RDS")
tweets <- ZyngaData$text
# swap out all non-alphanumeric characters
# Note that the definition of what constitutes a letter, a number, or a punctuation mark varies slightly depending on your locale, so you may need to experiment a little to get exactly what you want.
# str_replace_all(tweets, "[^[:alnum:]]", " ")
# iconv(tweets, from = 'UTF-8', to = 'ASCII//TRANSLIT')
# Encoding(tweets) <- "UTF-8"
# Function to clean tweets
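clean.text = function(x)
{
  # remove "rt" (case-sensitive; this also strips "rt" inside words, e.g. "start" becomes "sta")
  x = gsub("rt", "", x)
  # remove @mentions
  x = gsub("@\\w+", "", x)
  # remove punctuation
  x = gsub("[[:punct:]]", "", x)
  # remove numbers
  x = gsub("[[:digit:]]", "", x)
  # remove links (punctuation is already gone, so a URL is a single "http..." token)
  x = gsub("http\\w+", "", x)
  # remove runs of two or more spaces or tabs (this joins the neighbouring words)
  x = gsub("[ |\t]{2,}", "", x)
  # remove a blank space at the beginning
  x = gsub("^ ", "", x)
  # remove a blank space at the end
  x = gsub(" $", "", x)
  # lowercasing is disabled
  # x = tolower(x)
  return(x)
}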
# clean tweets
tweets = clean.text(tweets)
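The patterns above have two side effects that are visible in the output further down: `gsub("rt", "", x)` also strips "rt" inside words, and replacing runs of spaces with an empty string glues neighbouring words together (for example "RTHeres"). Below is a minimal sketch of a more conservative variant. The name `clean_text_strict` is hypothetical, and this function is not what produced the results shown in this notebook.

```r
# Hypothetical stricter variant of clean.text(); base R only.
clean_text_strict <- function(x) {
  # remove links first, while the URL is still intact
  x <- gsub("http[^[:space:]]*", "", x)
  # remove the retweet marker only when it stands alone as a word
  x <- gsub("\\bRT\\b", "", x)
  # remove @mentions
  x <- gsub("@\\w+", "", x)
  # remove punctuation and digits
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  # collapse runs of whitespace into one space instead of deleting them
  x <- gsub("[[:space:]]+", " ", x)
  # trim leading and trailing whitespace
  trimws(x)
}

# Made-up example tweet, not taken from the Zynga data
clean_text_strict("RT @user: Started playing   poker http://t.co/abc123 today!")
```

Collapsing whitespace to a single space keeps word boundaries intact, so the term-document matrix would count "RT" and "Heres" as separate terms.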
Create word cloud of tweets
corpus = Corpus(VectorSource(tweets))
# create term-document matrix
tdm = TermDocumentMatrix(
  corpus,
  control = list(
    wordLengths = c(3, 50),
    removePunctuation = TRUE,
    stopwords = c("the", "a", stopwords("english")),
    removeNumbers = TRUE,
    # tolower may cause trouble on Windows because of UTF-8 encoding, so it is set to FALSE
    tolower = FALSE))
# convert to a matrix; this may consume nearly 1 GB of RAM
tdm = as.matrix(tdm)
# get word counts in decreasing order
word_freqs = sort(rowSums(tdm), decreasing=TRUE)
#check top 50 most mentioned words
head(word_freqs, 50)
## Zynga needs Poker
## 895 591 545
## just card sent
## 500 469 469
## raised Deborah Scott
## 401 311 264
## looking Prized now
## 257 219 213
## can Petra adult
## 198 187 186
## Jeneva found rewards
## 174 166 159
## trees Play FarmVilleCountryEscape
## 158 157 155
## How car bit
## 143 141 140
## video sponsorship RTHeres
## 139 139 138
## needing shook gotas
## 138 138 137
## Points Betty Fruit
## 129 125 113
## Kathryn FarmVilleOnWeb The
## 112 111 108
## Career Spring get
## 105 104 102
## Corner ️ Game
## 102 98 98
## You County use
## 94 94 92
## Hat Black game
## 92 92 88
## King win
## 88 87
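Because `tolower = FALSE`, the listing counts case variants such as "Game" and "game" as separate terms. One way to merge them after the fact is sketched below on a small made-up vector rather than the real `word_freqs`. The names `freqs` and `merged` and all values are illustrative assumptions.

```r
# Toy frequencies standing in for word_freqs (values are illustrative only)
freqs <- c(Game = 3, game = 2, Poker = 5, poker = 1)

# Group the counts by lowercased name and sum each group
merged <- vapply(split(freqs, tolower(names(freqs))), sum, numeric(1))

# Sort in decreasing order, as done for word_freqs
merged <- sort(merged, decreasing = TRUE)
merged
```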
# remove the top words which don't generate insights
word_freqs = word_freqs[-(1:9)] # here "1:9" means the 1st through 9th words in the list, which we want to remove
# create a data frame with words and their frequencies
dm = data.frame(word=names(word_freqs), freq=word_freqs)
# plot the corpus as a colored word cloud; needs the RColorBrewer package
wordcloud(head(dm$word, 200), head(dm$freq, 200), random.order=FALSE, colors=brewer.pal(8, "Dark2"))
#check top 50 most mentioned words
head(word_freqs, 50)
## looking Prized now
## 257 219 213
## can Petra adult
## 198 187 186
## Jeneva found rewards
## 174 166 159
## trees Play FarmVilleCountryEscape
## 158 157 155
## How car bit
## 143 141 140
## video sponsorship RTHeres
## 139 139 138
## needing shook gotas
## 138 138 137
## Points Betty Fruit
## 129 125 113
## Kathryn FarmVilleOnWeb The
## 112 111 108
## Career Spring get
## 105 104 102
## Corner ️ Game
## 102 98 98
## You County use
## 94 94 92
## Hat Black game
## 92 92 88
## King win Check
## 88 87 86
## FarmVille Nesting Horse
## 83 83 82
## Dolls Mobile Ribbon
## 82 82 81
## today yet
## 80 80
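Removing terms by position, as in `word_freqs[-(1:9)]`, drops different words whenever the ranking changes. Below is a small sketch of removing them by name instead. The `drop_words` vector is an assumption, filled in with the nine top terms from the first listing, and `freqs` is a toy stand-in for `word_freqs`.

```r
# Toy stand-in for word_freqs, using a few counts from the first listing
freqs <- c(Zynga = 895, needs = 591, Poker = 545, looking = 257, Prized = 219)

# Terms to exclude, named explicitly instead of by rank
drop_words <- c("Zynga", "needs", "Poker", "just", "card", "sent",
                "raised", "Deborah", "Scott")

# Keep only the terms whose names are not in drop_words
kept <- freqs[!(names(freqs) %in% drop_words)]
kept
```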
# I see some words I don't know or understand, so I retrieve the tweets that contain them
# Retrieve all the tweets that contain "zynga" (grep is case-sensitive, so "Zynga" is not matched)
index = grep("zynga", tweets)
tweets[index]
## [1] "sino may mga zynga poker dyan\U0001f602 SEND NYO YUNG INVITER CODE NYO SAKINNNN NEED MONEYYY HAHAHA"
## [2] "Peut zynga rebondir avec le jeu en lignetableau déchange readwrite bitcoin"
## [3] "zynga"
## [4] "anyone wna play zynga poker with mi"
## [5] "\ni just create this profile to give advise if you can please to give few option to zynga poker \nsomething li"
## [6] "how have I lost k onBall Pool amp M on zynga poker"
## [7] "So good History of fbs api is a core history of the web From friend feed zynga Twitter denying Sna"
## [8] "بضايق فشخ لما اخسر الفلوس اللي معايا في zynga poker وبقعد مش لاقي حاجه اعملها"
## [9] "hello i wonder if you can talk with zynga poker texas holdem developer to fix the app some things is access"
## [10] "Thx for the dinner Best Western \U0001f64f\U0001f3fcBukan zynga poker \U0001f605aqwestobatBest Western Premier"
## [11] "zynga poker na lang"
## [12] "Svp mon joue ne veut pas se débloquer aidez moi svp zynga"
## [13] "fearless leader visiting zynga banglore"
## [14] "you purposely put your phone on charge while playing zynga poker so the battery doesnt run out and you end up losing chips"
## [15] "zynga bagay diyan"
## [16] "le pire que jai trouvé cest zynga qui te demande presque dargumenter\n\net paypal ptn paypal\n\nJe me"
## [17] "I thanked zynga for helping you last week"
## [18] "RTStaed withon the stock market bought some Fitbit shares and and a zynga share Im atnow I dont know how you guy"
## [19] "Staed withon the stock market bought some Fitbit shares and and a zynga share Im atnow I dont know h"
## [20] "naay jesa nga maadik sa zynga"
## [21] "You work for zynga pat Nice piece right"
## [22] "Heres my cellphoneI gave up likeyears ago Pretty sure zynga poker has pics of my pebus on their servers"
## [23] "B masyukkk zynga poker \U0001f600"
## [24] "RTMafia wars zynga poker dayyummmmmmmmmmmm"
## [25] "lmao zyngas gdc pay WOULD be street racing themed"
## [26] "Potamil zynga poker"
## [27] "I only play poker on zynga now"
## [28] "Ive finally reachedon zynga poker If only it was real money"
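`grep()` is case-sensitive by default, which is why the lowercase pattern "zynga" returns only 28 tweets while the capitalized term "Zynga" tops the frequency table with a count of 895. Below is a minimal sketch of a case-insensitive lookup, shown on a made-up vector (`sample_tweets` is hypothetical) instead of the real `tweets`.

```r
# Made-up examples standing in for the cleaned tweets
sample_tweets <- c("I only play poker on zynga now",
                   "Zynga Poker needs a fix",
                   "FarmVille is fun")

# ignore.case = TRUE matches "zynga", "Zynga", "ZYNGA", and so on
index <- grep("zynga", sample_tweets, ignore.case = TRUE)
sample_tweets[index]
```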
This study of Zynga suggests that the company needs to grow its games worldwide. Increasing communication between players should help increase revenue. Based on the word cloud, I would suggest that Zynga add bonus points to its games, which would help boost player motivation. Another finding is that users are more active on weekends than on weekdays, so Zynga has an opportunity to cater to players during that time.