Objective

Brexit is the term for the departure of the United Kingdom from the European Union, which was still only a prospect when the campaign began. On June 23, 2016, the people of Britain voted for a British exit in a historic referendum. We analyzed tweets on #brexit from April 15, 2016 to July 29, 2017 to understand people's sentiments.

Dataset Information:

Tweets on Brexit from London, dated from April 15th 2016 (when the campaign officially started) until July 29th 2017.
News Source : http://www.businessinsider.com/brexit-campaign-starts-april-15-2016-4
Data Source : #brexit using the Twitter API

Analysis Details

Part 1 :

Extract tweets on Brexit from London, dated from April 15th 2016 (when the campaign officially started) until July 29th 2017.
Look at the tweet status source and plot the source platform.
Clean the tweets and remove stop words.
Find the frequency of words and visualize the word cloud.

Part 2 : Data Exploration for Sentiment Analysis

Find term associations for the terms "remain" and "leave"
Score the sentiment against positive and negative words
Visualize the sentiments of the tweets

In particular, the first phase consisted of establishing the R connection to the Twitter API.
In the second phase, we created a filter function to clean the text of our data, and we prepared the lexicon used to assess polarity.
In the third phase, we classified the sentiment using a simple sentiment scoring function.
Finally, we analyzed the results and commented on them.

Environment Setup

library(twitteR)  # setup_twitter_oauth(), searchTwitteR(), twListToDF()
library(ROAuth)   # OAuthFactory
# consumerKey, consumerSecret, reqURL, accessURL and authURL are assumed to be
# defined beforehand with the credentials and endpoints of your own Twitter app.
#Cred <- setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)
Cred <- OAuthFactory$new(consumerKey=consumerKey,
                         consumerSecret=consumerSecret,
                         requestURL=reqURL,
                         accessURL=accessURL,
                         authURL=authURL)

save(Cred, file='twitter authentication.Rdata')
# After the first run, you can start from the next line (with the libraries loaded)
load('twitter authentication.Rdata')
setup_twitter_oauth(consumerKey, consumerSecret, access_token=NULL, access_secret=NULL)

## [1] "Using browser based authentication"

Extract, Import and Read

# Extract tweets on Brexit, dated from April 15th 2016 (when the campaign officially started) until July 29th 2017
# Source : http://www.businessinsider.com/brexit-campaign-starts-april-15-2016-4

brexit_tweets <- searchTwitteR("brexit", n=3000, lang='en', since='2016-04-15', until='2017-07-29')
# The London filter is left disabled here; to enable it, add e.g. geocode="51.507351,-0.127758,60km"

Convert the list of tweets into a data frame

brexit_tweetDF<-twListToDF(brexit_tweets)
head(brexit_tweetDF)

##                                                                                                                                               text
## 1                 @1PabloAngel As I say all the time, the only way out of Brexit is if the electorate demand it stops. Which labour could exploit.
## 2 RT @ReutersUK: Daily Briefing\n- British consumer morale sinks\n- Brexit talks run behind schedule \n- France boasts strong GDP growth\nhttps:/<U+0085>
## 3     RT @MikeCarlton01: i read Boris Johnson's Sydney speech.  Amusing, elegant. Truly, one ot the great minds ot the 18th century. https://t.co<U+0085>
## 4     RT @LibDemPress: Good to see Sadiq Khan agreeing with our call for final vote on Brexit deal with choice to stay in EU. Now over to Corbyn.<U+0085>
## 5  RT @TheSussexSquare: As Brexit negotiators try 2agree status of EU nationals,NHS looks 2recruit 2,000 foreign GPs targeting EU countries &amp;<U+0085>
## 6     RT @laylamoran: As a British EU negotiator, I can tell you that Brexit is going to be far worse than anyone could have guessed https://t.co<U+0085>
##   favorited favoriteCount   replyToSN             created truncated
## 1     FALSE             1 1PabloAngel 2017-07-28 23:59:59     FALSE
## 2     FALSE             0        <NA> 2017-07-28 23:59:59     FALSE
## 3     FALSE             0        <NA> 2017-07-28 23:59:58     FALSE
## 4     FALSE             0        <NA> 2017-07-28 23:59:56     FALSE
## 5     FALSE             0        <NA> 2017-07-28 23:59:55     FALSE
## 6     FALSE             0        <NA> 2017-07-28 23:59:54     FALSE
##           replyToSID                 id replyToUID
## 1 891084037217017856 891085860267012096  600397034
## 2               <NA> 891085859356848128       <NA>
## 3               <NA> 891085855682469888       <NA>
## 4               <NA> 891085847549927424       <NA>
## 5               <NA> 891085844571979776       <NA>
## 6               <NA> 891085840675459074       <NA>
##                                                                           statusSource
## 1   <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>
## 2                   <a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>
## 3 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
## 4 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
## 5    <a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>
## 6   <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>
##       screenName retweetCount isRetweet retweeted longitude latitude
## 1    mikey_rains            0     FALSE     FALSE        NA       NA
## 2     albertsdav           17      TRUE     FALSE        NA       NA
## 3    markproffit          141      TRUE     FALSE        NA       NA
## 4  ianstotesbury          577      TRUE     FALSE        NA       NA
## 5 HeslingLaolcom            2      TRUE     FALSE        NA       NA
## 6    sam_iam1992          522      TRUE     FALSE        NA       NA

library(dplyr)
brexit_tweetDF<-as_tibble(brexit_tweetDF)  # as.tbl() is deprecated in recent dplyr versions
head(brexit_tweetDF)

## # A tibble: 6 x 16
##                                                                          text
##                                                                         <chr>
## 1 @1PabloAngel As I say all the time, the only way out of Brexit is if the el
## 2 "RT @ReutersUK: Daily Briefing\n- British consumer morale sinks\n- Brexit t
## 3 RT @MikeCarlton01: i read Boris Johnson's Sydney speech.  Amusing, elegant.
## 4 RT @LibDemPress: Good to see Sadiq Khan agreeing with our call for final vo
## 5 RT @TheSussexSquare: As Brexit negotiators try 2agree status of EU national
## 6 RT @laylamoran: As a British EU negotiator, I can tell you that Brexit is g
## # ... with 15 more variables: favorited <lgl>, favoriteCount <dbl>,
## #   replyToSN <chr>, created <dttm>, truncated <lgl>, replyToSID <chr>,
## #   id <chr>, replyToUID <chr>, statusSource <chr>, screenName <chr>,
## #   retweetCount <dbl>, isRetweet <lgl>, retweeted <lgl>, longitude <lgl>,
## #   latitude <lgl>

# Save the tweets to disk for later reuse
setwd("C:/DSLA/Twitter/Brexit_analysis")
write.csv(brexit_tweetDF, file = "C:/DSLA/Twitter/Brexit_analysis/brexit_tweetDF.csv", row.names = TRUE)

Take a look at the tweet status source

# Get source of tweets from statusSource
library(tidyr)
head(brexit_tweetDF$statusSource,20)

##  [1] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
##  [2] "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>"                  
##  [3] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
##  [4] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
##  [5] "<a href=\"http://twitter.com/#!/download/ipad\" rel=\"nofollow\">Twitter for iPad</a>"   
##  [6] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
##  [7] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
##  [8] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
##  [9] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
## [10] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
## [11] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
## [12] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
## [13] "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>"                  
## [14] "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>"                  
## [15] "<a href=\"http://twitter.com/#!/download/ipad\" rel=\"nofollow\">Twitter for iPad</a>"   
## [16] "<a href=\"http://twitter.com/#!/download/ipad\" rel=\"nofollow\">Twitter for iPad</a>"   
## [17] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"
## [18] "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Lite</a>"                
## [19] "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"  
## [20] "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>"

# Keep only the tweets sent from the official iPhone, Android and iPad clients
brexit_tweets_tbl <- brexit_tweetDF %>%
  select(id, statusSource, text, created) %>%
  tidyr::extract(statusSource, "source", "Twitter for (.*?)<")%>%
  filter(source %in% c("iPhone", "Android","iPad")) 
head(brexit_tweets_tbl)

## # A tibble: 6 x 4
##                   id  source
##                <chr>   <chr>
## 1 891085860267012096  iPhone
## 2 891085855682469888 Android
## 3 891085847549927424 Android
## 4 891085844571979776    iPad
## 5 891085840675459074  iPhone
## 6 891085828918837248  iPhone
## # ... with 2 more variables: text <chr>, created <dttm>
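
The extraction step above can also be sketched in base R, without tidyr. This is only an illustration on three hand-written statusSource strings: the vector `toy_source` is hypothetical (not the real data), and the pattern `"Twitter for ([^<]*)<"` is an assumed equivalent of the one used above.

```r
# Toy statusSource values (hypothetical, same format as the real column)
toy_source <- c(
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'
)

# regexec() returns the full match plus the captured group for each string
matches <- regmatches(toy_source, regexec("Twitter for ([^<]*)<", toy_source))

# Keep the captured client name, or NA when the pattern does not match
toy_client <- vapply(matches,
                     function(m) if (length(m) == 2) m[2] else NA_character_,
                     character(1))

# Count tweets per client; table() drops the NA entries by default
client_counts <- table(toy_client)
print(client_counts)
```

Sources such as "Twitter Web Client" do not match the pattern, which is why they disappear from the plot of source platforms.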

# Plot the source platforms
library(ggplot2)
qplot(brexit_tweets_tbl$source, xlab = "Source of Tweets", geom = "bar",
      fill=I("lightblue"),
      col=I("black"))

# We can see that most of the tweets come from the iPhone client

Data Munging

# Transform the text of the tweets into a Document Term Matrix
# First, clean the text
library(tm)
# Drop characters that cannot be converted to UTF-8 (e.g. some smileys) to help cleaning
brexit_tweetDF$text<-sapply(brexit_tweetDF$text,function(x) iconv(x ,to="UTF-8",sub = "" ))
# Create a collection of documents in which each tweet text is one document
brexit_tweetCorpus <- Corpus(VectorSource(brexit_tweetDF$text))

Text Filtering

# Usual transformations for cleaning the text
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, tolower)
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, removePunctuation)
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, removeNumbers)
# Remove URLs. This runs after removePunctuation, so each link has already been
# collapsed into a single alphanumeric token starting with "http".
removeURL <- function(x) gsub("http[[:alnum:]]*", "", x)
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, removeURL)
twtrStopWords <- c(stopwords("english"),'with','the','an')
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, removeWords, twtrStopWords) # remove stop words

# Remove the search term, which is obviously very frequent.
brexit_tweetCorpus <- tm_map(brexit_tweetCorpus, removeWords, c("brexit"))
# inspect(brexit_tweetCorpus)
brexit_tweetCorpus

## <<SimpleCorpus>>
## Metadata:  corpus specific: 1, document level (indexed): 0
## Content:  documents: 3000
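
To see what these cleaning steps do to a single tweet, here is a minimal base R sketch. The helper `clean_tweet` is hypothetical and only approximates the tm transformations with plain `gsub()` calls (stop word removal is left out); the sample text is one of the tweets shown further down in the raw output.

```r
# Hypothetical helper: base R approximation of the tm cleaning steps, in the same order
clean_tweet <- function(x) {
  x <- tolower(x)                        # lower-case
  x <- gsub("[[:punct:]]", "", x)        # remove punctuation (links become "httpstco...")
  x <- gsub("[[:digit:]]", "", x)        # remove numbers
  x <- gsub("http[[:alnum:]]*", "", x)   # remove the collapsed link tokens
  x <- gsub("\\s+", " ", x)              # squeeze repeated whitespace
  trimws(x)
}

toy_tweet <- "RT @Independent: Councils need 'billions' in aid to replace EU funding https://t.co/f4fW3jBTS7"
cleaned <- clean_tweet(toy_tweet)
print(cleaned)
```

The order matters: if the URL pattern ran before punctuation removal, it would only strip the leading "https" and leave the rest of the link in the text.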

head(brexit_tweets,20)

## [[1]]
## [1] "mikey_rains: @1PabloAngel As I say all the time, the only way out of Brexit is if the electorate demand it stops. Which labour could exploit."
## 
## [[2]]
## [1] "albertsdav: RT @ReutersUK: Daily Briefing\n- British consumer morale sinks\n- Brexit talks run behind schedule \n- France boasts strong GDP growth\nhttps:/<U+0085>"
## 
## [[3]]
## [1] "markproffit: RT @MikeCarlton01: i read Boris Johnson's Sydney speech.  Amusing, elegant. Truly, one ot the great minds ot the 18th century. https://t.co<U+0085>"
## 
## [[4]]
## [1] "ianstotesbury: RT @LibDemPress: Good to see Sadiq Khan agreeing with our call for final vote on Brexit deal with choice to stay in EU. Now over to Corbyn.<U+0085>"
## 
## [[5]]
## [1] "HeslingLaolcom: RT @TheSussexSquare: As Brexit negotiators try 2agree status of EU nationals,NHS looks 2recruit 2,000 foreign GPs targeting EU countries &amp;<U+0085>"
## 
## [[6]]
## [1] "sam_iam1992: RT @laylamoran: As a British EU negotiator, I can tell you that Brexit is going to be far worse than anyone could have guessed https://t.co<U+0085>"
## 
## [[7]]
## [1] "goldie_dyke: RT @TomLondon6: Brexit is a national disaster \nBlame lies with\n1 Cameron\n2 Murdoch, Dacre etc\n3 Farage, Johnson etc\n4 BBC for appalling cov<U+0085>"
## 
## [[8]]
## [1] "PoliticalJenga: RT @TheEconomist: Thanks to the financial crisis and Brexit, Britain has lost all of its global functions in one great rush https://t.co/jP<U+0085>"
## 
## [[9]]
## [1] "alienobserver1: RT @Independent: Councils need 'billions' in aid to replace EU funding https://t.co/f4fW3jBTS7"
## 
## [[10]]
## [1] "ianstotesbury: RT @vincecable: Why would #business be satisfied with two year #Brexit transition? Transition to what? Kicking can down the road isnt a str<U+0085>"
## 
## [[11]]
## [1] "CathyMcRorie: RT @JamesMelville: Brexiters say, \"respect the democratic will.\"\nIn the case of Brexit, I won't. I have never respected decisions being mad<U+0085>"
## 
## [[12]]
## [1] "JohnPCBiggs: RT @AMDWaters: Brussels threatens to block US trade deal with UK. How can anyone not see the nature of this monster?  Its a tyrant. https:/<U+0085>"
## 
## [[13]]
## [1] "toadflaxmillion: Well done Sadiq - standing up for Londoners interests on Brexit https://t.co/t7cFGjgGIi"
## 
## [[14]]
## [1] "miltoncontact: US political insults sink to new low. Brexit planes and NI. Norwich outing. https://t.co/hhcag1X6JV #BrexiTrumpDiary"
## 
## [[15]]
## [1] "LadyTyke49: RT @agapanthus49: Chlorine washed chicken scare stories just anti-Brexit smears, we've been eating chlorine washed salad for years &amp; drink<U+0085>"
## 
## [[16]]
## [1] "Mariebe10098426: RT @SimonFRCox: Labour can stop Brexit <U+0096> but only with fresh vote, says Sadiq Khan https://t.co/EcoaKOctrY"
## 
## [[17]]
## [1] "RoyMotteram: RT @JeanneBartram: Theresa May 'interfered with crucial immigration report to play up benefits tourism' against all evidence https://t.co/8<U+0085>"
## 
## [[18]]
## [1] "MagDMongoose: RT @AMDWaters: Brussels threatens to block US trade deal with UK. How can anyone not see the nature of this monster?  Its a tyrant. https:/<U+0085>"
## 
## [[19]]
## [1] "OKLumberjack: RT @morninggloria: \"And England. That's a mess. They tried to do Brexit, and they made a big mess of their schools. Dead girls in the alive<U+0085>"
## 
## [[20]]
## [1] "juliemiles66: RT @trishgreenhalgh: \"As a British EU negotiator, I can tell you that Brexit is going to be far worse than anyone could have guessed\".\n htt<U+0085>"

Create a Document Term Matrix

brexit_tweetDTM<-DocumentTermMatrix(brexit_tweetCorpus,list(termFreq=1))
#inspect(brexit_tweetDTM)

# Save the document-term matrix in the working directory
# (the file name below is an illustrative choice)
setwd("C:/DSLA/Twitter/Brexit_analysis")
saveRDS(brexit_tweetDTM, file = "brexit_tweetDTM.rds", ascii = TRUE)
brexit_tweetDTM

## <<DocumentTermMatrix (documents: 3000, terms: 5248)>>
## Non-/sparse entries: 29820/15714180
## Sparsity           : 100%
## Maximal term length: 31
## Weighting          : term frequency (tf)
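
As a rough illustration of what a document-term matrix holds, the sketch below builds one by hand in base R for three made-up documents (`toy_docs` is hypothetical data, and this is not how tm builds its matrix internally). It also computes the share of zero cells. In the output above, 15714180 of the 3000 x 5248 = 15744000 cells are empty, which is about 99.8% and is printed as 100%.

```r
# Three made-up, already cleaned documents
toy_docs <- c("labour vote remain", "leave vote", "remain remain")

# Tokenize on whitespace and collect the vocabulary
tokens <- strsplit(toy_docs, "\\s+")
vocab  <- sort(unique(unlist(tokens)))

# One row per document, one column per term, cells hold term counts
toy_dtm <- t(vapply(tokens,
                    function(tok) as.integer(table(factor(tok, levels = vocab))),
                    integer(length(vocab))))
colnames(toy_dtm) <- vocab

# Total frequency of each term across all documents
term_freq <- colSums(toy_dtm)

# Sparsity: proportion of cells that are zero
sparsity <- sum(toy_dtm == 0) / length(toy_dtm)

print(toy_dtm)
print(term_freq)
print(sparsity)
```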

Finding the frequent terms

# Find frequent terms
brexit_freqTerms<-findFreqTerms(brexit_tweetDTM,lowfreq = 10)
head(brexit_freqTerms)

## [1] "electorate" "labour"     "say"        "time"       "way"       
## [6] "british"

# Find their frequencies
brexit_term.freq<-colSums(as.matrix(brexit_tweetDTM))
brexit_term.freq.df<-data.frame(term = names(brexit_term.freq),freq=brexit_term.freq)
head(brexit_term.freq.df)

##                  term freq
## demand         demand    4
## electorate electorate   11
## exploit       exploit    1
## labour         labour  220
## pabloangel pabloangel    2
## say               say   51

Visualizing the Wordcloud

library(wordcloud)
# Set random.order = FALSE to place the most frequent words in the center.
# Set scale = c(max_word_size, min_word_size) to keep the fonts readable.
wordcloud(words = brexit_freqTerms,
          freq = brexit_term.freq.df[brexit_term.freq.df$term %in% brexit_freqTerms,2],
          max.words = 100,
          colors = rainbow(50),random.color = TRUE, random.order = FALSE,scale = c(3.7,1.0))

# names(brexit_term.freq)

Data Exploration for Sentiment Analysis

#Check for any NA values
any(is.na(brexit_freqTerms))

## [1] FALSE

# We see no missing values among the frequent terms.


# Find word associations for the term "remain"
term.association<-findAssocs(brexit_tweetDTM,terms = "remain",corlimit = 0.2)
# Plot it
remainDF <- data.frame(term=names(term.association$remain),freq=term.association$remain)
g_remain <-ggplot(remainDF,aes(x=term,y=freq)) +
  geom_bar(stat = "identity", fill = "green") +
  xlab("Terms")+
  ylab("Associations to the term 'remain'")+
  coord_flip()
# g_remain

# Find word associations for the term "leave"
term.association2 <- findAssocs(brexit_tweetDTM,terms = "leave", corlimit = 0.2)
# Plot it
leaveDF <- data.frame(term=names(term.association2$leave),freq=term.association2$leave)
g_leave <-ggplot(leaveDF,aes(x=term,y=freq)) +
  geom_bar(stat = "identity",fill = "#FF6666")+
  xlab("Terms") + 
  ylab("Associations to the term 'leave'")+
  coord_flip()
#g_leave

library(gridExtra)
# Show both plots side by side
grid.arrange(g_remain,g_leave, nrow=1, ncol=2)

#library(graph)
#library(Rgraphviz)
#plot(tdm, term = freq.terms, corThreshold = 0.1, weighting = T)
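
The association scores plotted above are correlations between term columns of the document-term matrix. A minimal sketch of that idea in base R, on a made-up 4 x 4 matrix (`toy_dtm` is hypothetical, and the code only mimics the idea behind findAssocs() under the assumption that it reports correlations at or above corlimit):

```r
# Made-up document-term matrix: 4 documents, 4 terms
toy_dtm <- matrix(c(1, 0, 1, 0,
                    1, 0, 1, 1,
                    0, 1, 0, 1,
                    0, 1, 0, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(NULL, c("remain", "leave", "stay", "vote")))

# Correlate the "remain" column with every other term column
others <- colnames(toy_dtm) != "remain"
assoc  <- cor(toy_dtm[, "remain"], toy_dtm[, others])[1, ]
assoc  <- round(assoc, 2)

# Keep only the terms at or above the correlation limit, strongest first
corlimit <- 0.2
strong <- sort(assoc[assoc >= corlimit], decreasing = TRUE)

print(assoc)
print(strong)
```

Here "stay" always appears with "remain" (correlation 1), "leave" never does (correlation -1), and "vote" is unrelated (correlation 0), so only "stay" passes the limit.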

Sentiment Function

library(plyr)
library(stringr)

# Score each sentence as (number of positive words) - (number of negative words)
score.sentiment = function(sentences, pos.words, neg.words, .progress = 'none')
{
  require(plyr)
  require(stringr)
  scores <- laply(sentences, function(sentence, pos.words, neg.words){
    # Strip punctuation, control characters and digits, then lower-case
    sentence <- gsub('[[:punct:]]', "", sentence)
    sentence <- gsub('[[:cntrl:]]', "", sentence)
    sentence <- gsub('\\d+', "", sentence)
    sentence <- tolower(sentence)
    # Split the sentence into words
    word.list <- str_split(sentence, '\\s+')
    words <- unlist(word.list)
    # match() returns NA for words that are not in the lexicon
    pos.matches <- match(words, pos.words)
    neg.matches <- match(words, neg.words)
    pos.matches <- !is.na(pos.matches)
    neg.matches <- !is.na(neg.matches)
    score <- sum(pos.matches) - sum(neg.matches)
    return(score)
  },
  pos.words, neg.words, .progress=.progress)
  # Return the scores alongside the original text
  scores.df <- data.frame(score=scores, text=sentences)
  return(scores.df)
}
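
To make the scoring rule concrete, here is a small base R sketch of the same logic applied to three short texts. The helper `toy_score` and the tiny lexicons `toy.pos` and `toy.neg` are hypothetical stand-ins, not the word lists used in the analysis.

```r
# Hypothetical base R version of the per-sentence scoring step
toy_score <- function(sentence, pos.words, neg.words) {
  sentence <- gsub("[[:punct:]]", "", sentence)
  sentence <- gsub("[[:cntrl:]]", "", sentence)
  sentence <- gsub("\\d+", "", sentence)
  words <- unlist(strsplit(tolower(sentence), "\\s+"))
  # positive hits minus negative hits
  sum(words %in% pos.words) - sum(words %in% neg.words)
}

# Tiny made-up lexicons and texts
toy.pos <- c("good", "great", "welcome")
toy.neg <- c("bad", "disaster", "worse")
toy.text <- c("Good to see a great deal!",
              "Brexit is a national disaster",
              "Talks resume on Monday")

toy.scores <- vapply(toy.text, toy_score, integer(1),
                     pos.words = toy.pos, neg.words = toy.neg,
                     USE.NAMES = FALSE)
print(data.frame(score = toy.scores, text = toy.text))
```

A score above zero means more positive than negative words were found, a score below zero the opposite, and zero means either no lexicon words or an equal number of both.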

Scoring Tweets & Adding a column

#Load sentiment word lists
list.pos = scan('C:/DSLA/Twitter/Brexit_analysis/positive-words.txt', what='character', comment.char=';')
list.neg = scan('C:/DSLA/Twitter/Brexit_analysis/negative-words.txt', what='character', comment.char=';')

#Add words to list
pos.words = c(list.pos, 'upgrade')
neg.words = c(list.neg, 'wtf', 'wait','waiting', 'epicfail', 'mechanical')
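
The `comment.char = ';'` argument is what lets scan() skip header lines in the word lists. A minimal sketch with a temporary file, assuming (as the scan() calls above suggest) that header lines in the lexicon files start with `;`; the file contents here are made up:

```r
# Write a made-up lexicon file whose header lines start with ';'
lex_file <- tempfile(fileext = ".txt")
writeLines(c("; toy lexicon header (ignored by scan)",
             ";",
             "good",
             "great",
             "welcome"),
           lex_file)

# Lines starting with ';' are treated as comments and skipped
toy.list.pos <- scan(lex_file, what = "character", comment.char = ";", quiet = TRUE)

# Extend the list with an extra domain-specific word, as done above
toy.pos.words <- c(toy.list.pos, "upgrade")
print(toy.pos.words)

unlink(lex_file)
```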

# Import the tweets saved as CSV
DatasetBrexit <- read.csv("C:/DSLA/Twitter/Brexit_analysis/brexit.df.csv")
head(DatasetBrexit)

##                                                                                                                                               text
## 1     RT @vincecable: Warmly welcome .@SadiqKhan statement supporting #Referendum on #Brexit deal. Serious political heavyweight on board. .@jere<U+0085>
## 2     @GuitarMoog @JohnRentoul @BarbaraSpiegelh And forever. That's just who they are. Always people like that in society<U+0085> https://t.co/jC8jkNovks
## 3   RT @RCorbettMEP: #BBCNewsnight \nI warned some time ago that #Brexit will cause massive probs for #airline s  and air manufacturing: \nhttps:<U+0085>
## 4     RT @TheStephenRalph: The needs of the many outweigh the needs of the few or the @Nigel_Farage  #Brexit #stopbrexit @UKLabour @LibDems #Labo<U+0085>
## 5 RT @UKDemocracyNow2: @mybaldswede2 @Arron_banks UK already had freedom &amp; sovereignty. You've been sold a pup. #Brexit https://t.co/wwWHFjJ9<U+0085>
## 6     RT @vincecable: Warmly welcome .@SadiqKhan statement supporting #Referendum on #Brexit deal. Serious political heavyweight on board. .@jere<U+0085>
##   favorited favoriteCount  replyToSN             created truncated
## 1     FALSE             0       <NA> 2017-07-29 00:42:41     FALSE
## 2     FALSE             0 GuitarMoog 2017-07-29 00:42:29      TRUE
## 3     FALSE             0       <NA> 2017-07-29 00:41:38     FALSE
## 4     FALSE             0       <NA> 2017-07-29 00:41:26     FALSE
## 5     FALSE             0       <NA> 2017-07-29 00:41:20     FALSE
## 6     FALSE             0       <NA> 2017-07-29 00:41:20     FALSE
##     replyToSID           id replyToUID
## 1           NA 8.910966e+17         NA
## 2 8.910957e+17 8.910966e+17  720975284
## 3           NA 8.910963e+17         NA
## 4           NA 8.910963e+17         NA
## 5           NA 8.910963e+17         NA
## 6           NA 8.910963e+17         NA
##                                                                           statusSource
## 1    <a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>
## 2 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
## 3 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
## 4                 <a href="http://www.gooseford.com" rel="nofollow">BrexitDebateUK</a>
## 5                 <a href="http://www.gooseford.com" rel="nofollow">BrexitDebateUK</a>
## 6 <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
##        screenName retweetCount isRetweet retweeted longitude latitude
## 1         NtNigel          292      TRUE     FALSE        NA       NA
## 2 TheStephenRalph            1     FALSE     FALSE        NA       NA
## 3          pw_pwd           38      TRUE     FALSE        NA       NA
## 4  BrexitDebateUK            1      TRUE     FALSE        NA       NA
## 5  BrexitDebateUK            1      TRUE     FALSE        NA       NA
## 6          pw_pwd          292      TRUE     FALSE        NA       NA
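
In the output above the id column is printed as 8.910966e+17, because read.csv() parsed the tweet ids as numbers and values of this size cannot all be represented exactly. If exact ids matter, one option is to read them as character strings. A minimal sketch with a temporary file and two ids taken from the earlier output (the data frame `toy_ids` is hypothetical):

```r
# Two tweet ids from the earlier output, stored as text
toy_ids <- data.frame(id = c("891085860267012096", "891085859356848128"),
                      text = c("first tweet", "second tweet"),
                      stringsAsFactors = FALSE)

tmp_csv <- tempfile(fileext = ".csv")
write.csv(toy_ids, tmp_csv, row.names = FALSE)

# Default behaviour: the id column is converted to a number
ids_numeric <- read.csv(tmp_csv)

# Forcing the column class keeps every digit of the id
ids_char <- read.csv(tmp_csv, colClasses = c(id = "character"))

print(class(ids_numeric$id))
print(ids_char$id)

unlink(tmp_csv)
```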

DatasetBrexit$text<-as.factor(DatasetBrexit$text)

Score the sentiments of all tweets

brexit_tweet.scores <- score.sentiment(DatasetBrexit$text,pos.words,neg.words,.progress = "text")

  |                                                                       
  |===========================                                      |  42%
  |                                                                       
  |============================                                     |  42%
  |                                                                       
  |============================                                     |  43%
  |                                                                       
  |============================                                     |  44%
  |                                                                       
  |=============================                                    |  44%
  |                                                                       
  |=============================                                    |  45%
  |                                                                       
  |==============================                                   |  45%
  |                                                                       
  |==============================                                   |  46%
  |                                                                       
  |==============================                                   |  47%
  |                                                                       
  |===============================                                  |  47%
  |                                                                       
  |===============================                                  |  48%
  |                                                                       
  |================================                                 |  48%
  |                                                                       
  |================================                                 |  49%
  |                                                                       
  |================================                                 |  50%
  |                                                                       
  |=================================                                |  50%
  |                                                                       
  |=================================                                |  51%
  |                                                                       
  |=================================                                |  52%
  |                                                                       
  |==================================                               |  52%
  |                                                                       
  |==================================                               |  53%
  |                                                                       
  |===================================                              |  53%
  |                                                                       
  |===================================                              |  54%
  |                                                                       
  |===================================                              |  55%
  |                                                                       
  |====================================                             |  55%
  |                                                                       
  |====================================                             |  56%
  |                                                                       
  |=====================================                            |  56%
  |                                                                       
  |=====================================                            |  57%
  |                                                                       
  |=====================================                            |  58%
  |                                                                       
  |======================================                           |  58%
  |                                                                       
  |======================================                           |  59%
  |                                                                       
  |=======================================                          |  59%
  |                                                                       
  |=======================================                          |  60%
  |                                                                       
  |=======================================                          |  61%
  |                                                                       
  |========================================                         |  61%
  |                                                                       
  |========================================                         |  62%
  |                                                                       
  |=========================================                        |  62%
  |                                                                       
  |=========================================                        |  63%
  |                                                                       
  |=========================================                        |  64%
  |                                                                       
  |==========================================                       |  64%
  |                                                                       
  |==========================================                       |  65%
  |                                                                       
  |===========================================                      |  65%
  |                                                                       
  |===========================================                      |  66%
  |                                                                       
  |===========================================                      |  67%
  |                                                                       
  |============================================                     |  67%
  |                                                                       
  |============================================                     |  68%
  |                                                                       
  |=============================================                    |  68%
  |                                                                       
  |=============================================                    |  69%
  |                                                                       
  |=============================================                    |  70%
  |                                                                       
  |==============================================                   |  70%
  |                                                                       
  |==============================================                   |  71%
  |                                                                       
  |==============================================                   |  72%
  |                                                                       
  |===============================================                  |  72%
  |                                                                       
  |===============================================                  |  73%
  |                                                                       
  |================================================                 |  73%
  |                                                                       
  |================================================                 |  74%
  |                                                                       
  |================================================                 |  75%
  |                                                                       
  |=================================================                |  75%
  |                                                                       
  |=================================================                |  76%
  |                                                                       
  |==================================================               |  76%
  |                                                                       
  |==================================================               |  77%
  |                                                                       
  |==================================================               |  78%
  |                                                                       
  |===================================================              |  78%
  |                                                                       
  |===================================================              |  79%
  |                                                                       
  |====================================================             |  79%
  |                                                                       
  |====================================================             |  80%
  |                                                                       
  |====================================================             |  81%
  |                                                                       
  |=====================================================            |  81%
  |                                                                       
  |=====================================================            |  82%
  |                                                                       
  |======================================================           |  82%
  |                                                                       
  |======================================================           |  83%
  |                                                                       
  |======================================================           |  84%
  |                                                                       
  |=======================================================          |  84%
  |                                                                       
  |=======================================================          |  85%
  |                                                                       
  |========================================================         |  85%
  |                                                                       
  |========================================================         |  86%
  |                                                                       
  |========================================================         |  87%
  |                                                                       
  |=========================================================        |  87%
  |                                                                       
  |=========================================================        |  88%
  |                                                                       
  |==========================================================       |  88%
  |                                                                       
  |==========================================================       |  89%
  |                                                                       
  |==========================================================       |  90%
  |                                                                       
  |===========================================================      |  90%
  |                                                                       
  |===========================================================      |  91%
  |                                                                       
  |===========================================================      |  92%
  |                                                                       
  |============================================================     |  92%
  |                                                                       
  |============================================================     |  93%
  |                                                                       
  |=============================================================    |  93%
  |                                                                       
  |=============================================================    |  94%
  |                                                                       
  |=============================================================    |  95%
  |                                                                       
  |==============================================================   |  95%
  |                                                                       
  |==============================================================   |  96%
  |                                                                       
  |===============================================================  |  96%
  |                                                                       
  |===============================================================  |  97%
  |                                                                       
  |===============================================================  |  98%
  |                                                                       
  |================================================================ |  98%
  |                                                                       
  |================================================================ |  99%
  |                                                                       
  |=================================================================|  99%
  |                                                                       
  |=================================================================| 100%

# Save the scored tweets to a CSV file in the project directory
setwd("C:/DSLA/Twitter/Brexit_analysis")
write.csv(brexit_tweet.scores, file = "C:/DSLA/Twitter/Brexit_analysis/brexitScores.csv", row.names = TRUE)
View(brexit_tweet.scores)

Visualize the sentiments of the tweets

hist(brexit_tweet.scores$score, main = "Scores of each tweet", xlab = "Scores", border="blue", col = "green")

# The histogram above shows how many tweets received each sentiment score.
#  The x-axis shows the score of each tweet as a negative integer, zero, or a positive integer.
#  A positive score represents positive or good sentiment associated with that particular tweet.
#  A negative score represents negative sentiment.
#  A score of zero indicates a neutral sentiment.
#  The more positive the score, the more positive the sentiment of the person tweeting.

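# qplot() comes from ggplot2; wrap constant colours in I() so they are not treated as aesthetic mappings
library(ggplot2)
qplot(brexit_tweet.scores$score, bins = 30, main = "Scores of each tweet", xlab = "Scores", colour = I("blue"), fill = I("green"))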

# Out of the 3000 tweets that we fetched with "#brexit":
# The largest share, more than 1000, are neutral.
# Around 800 have positive sentiments.
# 500-600 tweets show negative sentiments.
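
The counts above were read off the histogram. As a minimal sketch, the same breakdown could be computed directly by labelling each score by its sign and tabulating the labels. The data frame `example_scores` below is a hypothetical stand-in for `brexit_tweet.scores`, and it assumes only a numeric `score` column; its values are made up for illustration and are not results from the analysis.

```r
# Hypothetical stand-in for brexit_tweet.scores (illustrative values only).
example_scores <- data.frame(score = c(-2, -1, 0, 0, 0, 1, 3))

# Label each tweet by the sign of its score:
# below zero is negative, zero is neutral, above zero is positive.
sentiment <- factor(sign(example_scores$score),
                    levels = c(-1, 0, 1),
                    labels = c("negative", "neutral", "positive"))

# Count the tweets in each sentiment class.
sentiment_counts <- table(sentiment)
print(sentiment_counts)
```

Replacing `example_scores` with `brexit_tweet.scores` would give exact class counts in place of the approximate figures read from the plot.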

Brexit data analysis

Nazima Khan

July 29, 2017
