Text Analysis in Political Speeches

The objective of this analysis is to show NLP capabilities that go beyond simply counting words: analyzing the context, content, and use of language in a genuinely useful way. Applications range from academia, research, intelligence, and literary analysis to open-source competitive analysis and copyright work.

The documents used are the National Security Strategy documents for 2002 and 2017, compared side by side. The ‘rvest’ package can be used to identify text on the web and bring it into R (the Chrome extension ‘SelectorGadget’ helps pinpoint the relevant CSS selectors), although here the documents are read from PDF files. The main analysis package is ‘qdap’.
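
As a point of reference, a minimal sketch of the web-scraping route with ‘rvest’ might look like the snippet below; the URL and the CSS selector are hypothetical placeholders that you would swap for the real page and the selector SelectorGadget identifies.

library(rvest)
# Hypothetical URL and CSS selector -- replace with the real page and the
# selector found with SelectorGadget
page <- read_html("https://example.com/national-security-strategy-2017")
nssWeb <- html_text(html_nodes(page, ".document-body p"))
head(nssWeb)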

Extracting the Text

library(rJava)
library(rvest)
## Loading required package: xml2
library(qdap)
## Loading required package: qdapDictionaries
## Loading required package: qdapRegex
## Loading required package: qdapTools
## Loading required package: RColorBrewer
## 
## Attaching package: 'qdap'
## The following object is masked from 'package:rvest':
## 
##     %>%
## The following object is masked from 'package:base':
## 
##     Filter
library(SnowballC)
library(tm)
## Loading required package: NLP
## 
## Attaching package: 'NLP'
## The following object is masked from 'package:qdap':
## 
##     ngrams
## 
## Attaching package: 'tm'
## The following objects are masked from 'package:qdap':
## 
##     as.DocumentTermMatrix, as.TermDocumentMatrix

Gather the text from both files and generate a corpus for each:

library(pdftools)

files1 <- "NSS2017.pdf"
files2 <- "NSS2002.pdf"
Rpdf <- readPDF(control = list(text = "-layout"))
pdf17 <- Corpus(URISource(files1), readerControl = list(reader = Rpdf))
pdf02 <- Corpus(URISource(files2), readerControl = list(reader = Rpdf))

Tidying the text

You can explore the text in each corpus as you wish. We will need to put the text into a data frame, but there are some cleaning tasks that need to be done first.

Text17 <- pdf17
Text02 <- pdf02
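
If you want a quick look at what was read in before any cleaning, the corpus can be inspected directly; this optional sanity check just prints the number of documents and the first 300 characters (an arbitrary cut-off).

# Optional sanity check: number of documents read and a snippet of the raw text
length(pdf17)
substr(paste(content(pdf17[[1]]), collapse = " "), 1, 300)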

If you end up with strange characters in your text, change the character encoding with the iconv() function; end-of-line markers also need to be added.

Text17 <- iconv(Text17, "latin1", "ASCII", "")
Text02 <- iconv(Text02, "latin1", "ASCII", "")
#adding end of lines while collapsing the pages into a single string
Text17 <- paste(Text17, collapse = " \n")
Text02 <- paste(Text02, collapse = " \n")

This is where the first ‘qdap’ function comes into play: qprep(). This function is a wrapper for a number of other cleaning functions, and using it speeds up pre-processing. The functions it passes the text through are as follows:

1. bracketX() - removes bracketed text
2. replace_abbreviation() - expands abbreviations
3. replace_number() - turns numbers into words, e.g. 100 becomes one hundred
4. replace_symbol() - turns symbols into words, e.g. @ becomes ‘at’
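
To see what these wrapped functions do individually, here is a small illustration on a made-up sentence (the input string is only an example):

example <- "The budget grew by 100% (roughly) & Dr. Smith signed off"
bracketX(example)              # drops the bracketed "(roughly)"
replace_abbreviation(example)  # expands common abbreviations such as Dr.
replace_number(example)        # 100 becomes one hundred
replace_symbol(example)        # % becomes percent, & becomes and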

This chunk of code does the above and also replaces contractions, removes the top 100 stopwords, and strips the text of unwanted characters. Note that we keep periods and question marks to assist in sentence creation for the deeper analysis later.

Prep17 <- qprep(Text17)
Prep02 <- qprep(Text02)

Prep17 <- replace_contraction(Prep17)
Prep02 <- replace_contraction(Prep02)

Rm17 <- rm_stopwords(Prep17, Top100Words, separate = F)
Rm02 <- rm_stopwords(Prep02, Top100Words, separate = F)

Strip17 <- strip(Rm17, char.keep = c("?", "."))
Strip02 <- strip(Rm02, char.keep = c("?", "."))

One of the things I’ll do is fill the spaces between words that belong together, such as a person’s name, so they stay together in the analysis. The commented-out ‘keep’ list below provides an example and would be passed to the space_fill() function; you could include several other phrases.
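
As a toy illustration of what space_fill() does (the sentence and the keep terms here are only examples):

keepExample <- c("United States", "North Korea")
# phrases in 'keepExample' get their internal spaces replaced with a filler
# (by default "~~") so they survive tokenization as single units
space_fill("The United States engaged North Korea on missile defense", keepExample)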

It is also now time to put both documents into one data frame, consisting of the text for each respective administration (labeled below by president in the ‘candidate’ column).

#keep <- c("United States", "Hillary Clinton", "Donald Trump", "middle class", "Supreme Court")
#Fill17 <- data.frame(space_fill(Strip17, keep))   # version with space filling applied
Fill17 <- data.frame(Strip17)
Fill02 <- data.frame(Strip02)
colnames(Fill17)[1] <- "text"
colnames(Fill02)[1] <- "text"
Fill17$candidate <- "Trump"
Fill02$candidate <- "Bush"
df1 <- rbind(Fill17, Fill02)

Critical to any analysis with the ‘qdap’ package is splitting the text into sentences with the sentSplit() function. It also creates the ‘tot’ variable, or ‘turn of talk’ index, which would be important for analyzing debates; analysis of dialogue is very easy with this package.
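
For a quick feel of what sentSplit() produces before applying it to our data, qdap ships with a small demo data set called DATA; splitting its ‘state’ column shows the sentence-level text alongside the added index.

# Demo on qdap's built-in DATA: note the 'tot' (turn of talk) column it creates
head(sentSplit(DATA, "state"))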

df2 <- sentSplit(df1, "text")
## Warning in sentSplit(df1, "text"): The following problems were detected:
## non character, missing ending punctuation, indicating incomplete
## 
## *Consider running `check_text`
str(df1)  # the original two-row data frame, for comparison
## 'data.frame':    2 obs. of  2 variables:
##  $ text     : Factor w/ 2 levels "list terrorists battlefields syria iraq continue pursuing until destroyed. americas allies contributing our com"| __truncated__,..: 1 2
##  $ candidate: chr  "Trump" "Bush"

We’ve now come to the point where stemming would be implemented, that is, reducing a word to its root, e.g. stems, stemming, stemmed all become stem. ‘qdap’ has some flexibility in comparing stemmed versus non-stemmed text, as we shall soon see.
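
As a quick illustration of stemming, using the SnowballC package loaded earlier:

# all three inflected forms reduce to the same root, "stem"
wordStem(c("stems", "stemming", "stemmed"))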

Preliminary analysis

Word Count Plots

I’ll start out with the standard word frequency analysis using the freq_terms() function (qdap also offers bag_o_words() and word_count() for raw counts). Here I create a data frame of the most frequent terms in each document and compare them in plots.

freq <- freq_terms(df1$text)
plot(freq)

trumpFreq <- df1[df1$candidate == "Trump", ]
trumpFreq <- freq_terms(trumpFreq$text)
bushFreq <- df1[df1$candidate == "Bush", ]
bushFreq <- freq_terms(bushFreq$text)
# par(mfrow=c(1,2))
plot(trumpFreq)

plot(bushFreq)

No surprise that the 2017 strategy (Trump) hits terms like “trade”, “violence”, “immigration” and “law”. Comparing the two plots shows how differently the 2002 strategy (Bush) frames its priorities. Nothing about children or families?

Word Frequency Matrix

Creating a word frequency matrix, which provides the counts for each word by document:

wordMat <- wfm(df1$text, df1$candidate)
wordMat[c(1:5, 350:354), ]
##           Bush Trump
## ability      7    30
## able        15    16
## abroad       2     8
## abuse        1     4
## abused       2     0
## combining    1     0
## comes        2     0
## comfort      1     0
## coming       1     2
## command      4     3

Word Cloud of Stemmed Words

Now we generate a word cloud, using stemmed words:

trans_cloud(df1$text, df1$candidate, stem = T, min.freq = 20, title=TRUE)

## (wordcloud warnings omitted: several stemmed words, such as 'societi', 'freedom' and 'homeland', could not be fit on the page and were not plotted)

There you have it: with stemming, the clouds sharpen the comparison between the two documents. In the 2017 cloud (Trump) the topics are fewer and more concentrated, with “country”, “america” and “nation” drawing his topic line, while the 2002 cloud (Bush) highlights a different mix of themes.

Word Association

A great function is word_associate(), which can also build word clouds based on that association. Let’s give “china” and “terror” a try; the selection of terms is a little biased, to make the example clear.

#word_associate(df1$text, df1$candidate, match.string = "china", wordcloud = T)
#word_associate(df1$text, df1$candidate, match.string = "terror", wordcloud = T)

No commentary needed; the output is fairly self-explanatory about where each document focuses on, or changes the connotation of, the selected target word.

Word Stats

A complete explanation of the stats is available under ?word_stats.

Here is a quick reference:

n.tot - number of turns of talk
n.sent - number of sentences
n.words - number of words
n.char - number of characters
n.syl - number of syllables
n.poly - number of polysyllables
sptot - syllables per turn of talk
wptot - words per turn of talk
wps - words per sentence
cps - characters per sentence
sps - syllables per sentence
psps - poly-syllables per sentence
cpw - characters per word
spw - syllables per word
n.state - number of statements
n.quest - number of questions
n.exclm - number of exclamations
n.incom - number of incomplete statements
p.state - proportion of statements
p.quest - proportion of questions
p.exclm - proportion of exclamations
p.incom - proportion of incomplete statements
n.hapax - number of hapax legomenon
n.dis - number of dis legomenon
grow.rate - proportion of hapax legomenon to words
prop.dis - proportion of dis legomenon to words

ws <- word_stats(df1$text, df1$candidate, rm.incomplete = T)
## Warning in word_stats(df1$text, df1$candidate, rm.incomplete = T): 
##   Some rows contain double punctuation.  Suggested use of sentSplit function.
## Warning in word_stats(df1$text, df1$candidate, rm.incomplete = T): Some sentences do not have standard qdap punctuation endmarks.
##   Use $mpun for a list of observations with missing endmarks.
plot(ws, label = T, lab.digits = 2)
## Warning: attributes are not identical across measure variables; they will
## be dropped
## Warning: Ignoring unknown aesthetics: fill

The breakdown in the counts of sentences and words is interesting: the 2017 document is more than twice as long as the 2002 one. I’m curious as to what questions each document asks and how they are incorporated. Without the analysis we might have assumed that Trump’s document would use fewer words, or fewer polysyllabic words, which is not true (whether those words are the relevant ones is still an open question).

Question Extraction

The next analysis allows you to extract all the questions posed in each document (yes, mostly rhetorical) and tells you the type of each question.

#x1 <- question_type(df1$text, grouping.var = df1$candidate)
#x1
#truncdf(x1$raw)

OK, we’ve learned from the question-type output that rows 473 and 474 should be thrown out. It also looks like we have the classic use of anaphora by Trump, the technique of repeating the first word or words of several consecutive sentences. Churchill used it quite a bit, e.g. “We shall not flag or fail. We shall go on to the end. We shall fight in France, we shall…”

# inspect a few of the flagged sentences, then drop rows 473 and 474
df2[c(161:163), 3]
df2[c(473:474), 3]
df2 <- df2[c(-473, -474), ]

Advanced analysis

NLP Wrapper

The pos functions are wrappers around openNLP’s part-of-speech tagger. Here is a tag dictionary for reference and interpretation:

##    Tag  Description                             
## 1  CC   Coordinating conjunction                
## 2  CD   Cardinal number                         
## 3  DT   Determiner                              
## 4  EX   Existential there                       
## 5  FW   Foreign word                            
## 6  IN   Preposition or subordinating conjunction
## 7  JJ   Adjective                               
## 8  JJR  Adjective, comparative                  
## 9  JJS  Adjective, superlative                  
## 10 LS   List item marker                        
## 11 MD   Modal                                   
## 12 NN   Noun, singular or mass                  
## 13 NNS  Noun, plural                            
## 14 NNP  Proper noun, singular                   
## 15 NNPS Proper noun, plural                     
## 16 PDT  Predeterminer                           
## 17 POS  Possessive ending                       
## 18 PRP  Personal pronoun                        
## 19 PRP$ Possessive pronoun                      
## 20 RB   Adverb                                  
## 21 RBR  Adverb, comparative                     
## 22 RBS  Adverb, superlative                     
## 23 RP   Particle                                
## 24 SYM  Symbol                                  
## 25 TO   to                                      
## 26 UH   Interjection                            
## 27 VB   Verb, base form                         
## 28 VBD  Verb, past tense                        
## 29 VBG  Verb, gerund or present participle      
## 30 VBN  Verb, past participle                   
## 31 VBP  Verb, non-3rd person singular present   
## 32 VBZ  Verb, 3rd person singular present       
## 33 WDT  Wh-determiner                           
## 34 WP   Wh-pronoun                              
## 35 WP$  Possessive wh-pronoun                   
## 36 WRB  Wh-adverb
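
To see these tags applied to a single sentence, qdap’s pos() function can be run on a short example (the sentence below is made up purely for illustration):

# Tag one example sentence; pos_by() below does the same thing by grouping variable
pos("The strategy protects American interests abroad")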

Be advised that this takes some time, which you can track with a progress bar. Notice, for example, how the two documents compare in their use of interjections.

posbydf <- pos_by(df1$text, grouping.var = df1$candidate)
names(posbydf)
##  [1] "text"         "POStagged"    "POSprop"      "POSfreq"     
##  [5] "POSrnp"       "percent"      "zero.replace" "pos.by.freq" 
##  [9] "pos.by.prop"  "pos.by.rnp"
plot(posbydf, values = T, digits = 2)
## Warning: Ignoring unknown aesthetics: fill

Readability Score

Readability scores were originally designed to measure the difficulty of text. Scores are generally based on the number of words, syllables, polysyllables and word length. While these scores are not specifically designed for, or tested on, speech, they can be useful indicators of text complexity.

automated_readability_index(df1$text, df1$candidate)
## Warning in automated_readability_index(df1$text, df1$candidate): 
##   Some rows contain double punctuation.  Suggested use of sentSplit function.
##   candidate word.count sentence.count character.count Automated_Readability_Index
## 1      Bush       6467              1           44736                    3244.652
## 2     Trump      16658              1          104097                    8337.003
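
The enormous index values above reflect the warning: each document is being treated as a single sentence, so the sentence count is 1. Below is a sketch of the same call on the sentence-split data frame built earlier, which should give far more sensible scores.

# re-run on the sentence-level data frame produced by sentSplit()
automated_readability_index(df2$text, df2$candidate)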

Linguistic Diversity Stats

Diversity stats measure language “richness”, or rather how expansive a speaker’s vocabulary is. The results indicate similar use of vocabulary, certainly not unusual given the assistance of professional writers.

diversity(df1$text, df1$candidate)
##   candidate    wc simpson shannon collision berger_parker brillouin
## 1      Bush  6467   0.998   7.004     6.045         0.026     6.576
## 2     Trump 16658   0.997   7.245     5.967         0.024     6.928

Formal or Contextual?

Formality contextualizes the text by comparing formal parts of speech (noun, adjective, preposition and article) versus contextual parts of speech (pronoun, verb, adverb, interjection). A plot for analysis is available. Scores closer to 100 are more formal and those closer to 1 are more contextual.

form <- formality(df1$text, df1$candidate)
form
##   candidate word.count formality
## 1      Bush       6467     71.67
## 2     Trump      16658     71.62
plot(form)

Polarity Measures AKA Sentiment Analysis

Polarity measures sentence sentiment. A plot is available. What we see is that, on average, Trump was slightly more negative.

pol <- polarity(df1$text, df1$candidate)
## Warning in polarity(df1$text, df1$candidate): 
##   Some rows contain double punctuation.  Suggested use of `sentSplit` function.
plot(pol)
## Warning: `show_guide` has been deprecated. Please use `show.legend`
## instead.
## Warning: Ignoring unknown aesthetics: x
## Warning: `show_guide` has been deprecated. Please use `show.legend`
## instead.
## Warning: Removed 2 rows containing missing values (geom_errorbarh).
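
Beyond the summary plot, the polarity object can be queried directly; a quick sketch of two ways to look at it:

# group-level averages and a peek at the sentence-level scores
scores(pol)
head(counts(pol))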

Lexical dispersion

The lexical dispersion plot allows one to see where a word occurs throughout the text. It is an interesting way to watch how topics shift over the course of a document. Note that you could also feed in the terms from freq_terms() should you so choose.

dispersion_plot(df1$text, c("terror", "defense", "threat", "china", "russia"), df1$candidate)

Stemmed vs. Unstemmed

Finally, an example of a gradient wordcloud, which produces one wordcloud colored by a binary grouping variable. Let’s do one with words not stemmed and one with stemming included.

gradient_cloud(df1$text, df1$candidate, min.freq = 20, stem = F)

gradient_cloud(df1$text, df1$candidate, min.freq = 20, stem = T)
## Warning in stemmer(text.var): The following row(s) do have standard qdap punctuation endmarks:
##   rows: 1, 2
## (wordcloud warnings omitted: many low-frequency stemmed words could not be fit on the page and were not plotted)

Conclusion

After the analysis, you may be surprised by how close the two documents are to each other in the language used, their structure, and their emphasis in messaging. Stylistically the two administrations are hard to tell apart; the clearer differences show up in the topics each strategy stresses rather than in how it is written.