Overview:

Politicians are often perceived as repetitive and as steering their conversations toward false hope. The data set I found examines the words and phrases repeated by politicians from the Democratic and Republican parties. It contains every one-word phrase that was mentioned in at least 10 speeches and every two- or three-word phrase that was mentioned in at least five speeches. I would like to see which party repeats its words more.

Speech <- read.csv("https://raw.githubusercontent.com/Wilchau/StatusofState/main/words.csv")

A first look at the dataset shows that some of the column names are unclear or do not obviously identify the party they refer to.

head(Speech)
##            phrase              category d_speeches r_speeches total
## 1    minimum wage economy/fiscal issues          9          0     9
## 2    clean energy    energy/environment         11          1    12
## 3  climate change    energy/environment         13          2    15
## 4    gun violence         crime/justice          8          0     8
## 5 affordable care                               10          1    11
## 6   international                                0         10    10
##   percent_of_d_speeches percent_of_r_speeches      chi2        pval
## 1                 39.13                  0.00 10.565217 0.001152355
## 2                 47.83                  3.70 10.074611 0.001503264
## 3                 56.52                  7.41  9.986581 0.001576851
## 4                 34.78                  0.00  9.391304 0.002180170
## 5                 43.48                  3.70  8.931196 0.002803407
## 6                  0.00                 37.04  8.518519 0.003515506

I rename phrase to Popular Phrase, d_speeches to Democrats, r_speeches to Republican, percent_of_d_speeches to % of D, and percent_of_r_speeches to % of R. This shortens the names and makes clear which party each column refers to. The remaining columns become Category, Total per category, chi2val, and pval.

colnames(Speech) <- c("Popular Phrase", "Category", "Democrats", "Republican", "Total per category", "% of D", "% of R","chi2val", "pval")
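Assigning every name at once with colnames() depends on the columns arriving in exactly that order. As a minimal sketch, the same renaming can be done by matching on the old names, so that a reordered file would not silently mislabel a column. The data frame `sample_speech` and the lookup vector `new_names` are hypothetical stand-ins built from the first rows printed by head(Speech); they are not part of the original analysis.

```r
# Hypothetical sample: first three rows and five columns from head(Speech)
sample_speech <- data.frame(
  phrase     = c("minimum wage", "clean energy", "climate change"),
  category   = c("economy/fiscal issues", "energy/environment", "energy/environment"),
  d_speeches = c(9, 11, 13),
  r_speeches = c(0, 1, 2),
  total      = c(9, 12, 15)
)

# Lookup from old column name to new column name (illustrative helper)
new_names <- c(
  phrase     = "Popular Phrase",
  category   = "Category",
  d_speeches = "Democrats",
  r_speeches = "Republican",
  total      = "Total per category"
)

# Rename only the columns found in the lookup; any others keep their names
matched <- names(sample_speech) %in% names(new_names)
names(sample_speech)[matched] <- unname(new_names[names(sample_speech)[matched]])

names(sample_speech)
```

Columns whose new names contain spaces or symbols, such as a "% of D" column, have to be accessed with backticks or double brackets, while Democrats and Republican still work with `$`.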

Next, I check the total counts of repeated phrases for Democrats and Republicans and find that Republicans have the higher total. Exploring the Republican counts further may be a good topic for a future study.

sum(Speech$Democrats)
## [1] 16413
sum(Speech$Republican)
## [1] 19228
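Both totals can also be computed in one call. Because the two parties did not necessarily give the same number of speeches, it may help to put the totals on a per-speech footing as well. The sketch below uses a hypothetical six-row sample copied from the head(Speech) output, with the percent columns shortened to `pct_d` and `pct_r` for illustration. It assumes that the number of speeches per party can be recovered from a count and its percentage (9 speeches at 39.13% suggests about 23 Democratic speeches). That assumption is mine and is not stated by the dataset.

```r
# Hypothetical sample: the six rows shown by head(Speech)
sample_speech <- data.frame(
  Democrats  = c(9, 11, 13, 8, 10, 0),
  Republican = c(0, 1, 2, 0, 1, 10),
  pct_d      = c(39.13, 47.83, 56.52, 34.78, 43.48, 0.00),
  pct_r      = c(0.00, 3.70, 7.41, 0.00, 3.70, 37.04)
)

# Both party totals in one call
totals <- colSums(sample_speech[, c("Democrats", "Republican")])

# Assumption: speeches per party = count / (percent / 100),
# estimated from the first row where the party has a nonzero count
d_row <- which(sample_speech$Democrats > 0)[1]
r_row <- which(sample_speech$Republican > 0)[1]
n_d <- round(sample_speech$Democrats[d_row] / (sample_speech$pct_d[d_row] / 100))
n_r <- round(sample_speech$Republican[r_row] / (sample_speech$pct_r[r_row] / 100))

# Average number of phrase mentions per speech for each party
per_speech <- totals / c(n_d, n_r)
per_speech
```

The six sample rows are not representative of the full file, so the per-speech values they produce say nothing about the real comparison; only the method carries over.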

Conclusion

Based on this simple observation, Republicans appear to repeat more words and phrases in their speeches. This may indicate that their speeches are more focused, while Democrats may cover a wider range of topics or use different phrases within a given category, for example several health-specific phrases instead of repeating “Healthcare”. Further study should be done to see how each party uses repeated words and whether that repetition brings them any success.
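One way to start that follow-up is to total the counts per category, which would show whether one party concentrates its repeated phrases in fewer topics. This is only a sketch: `sample_speech` is a hypothetical six-row sample copied from the head(Speech) output, and relabeling blank categories as "uncategorized" is my own assumption.

```r
# Hypothetical sample: the six rows shown by head(Speech), renamed columns
sample_speech <- data.frame(
  Category   = c("economy/fiscal issues", "energy/environment",
                 "energy/environment", "crime/justice", "", ""),
  Democrats  = c(9, 11, 13, 8, 10, 0),
  Republican = c(0, 1, 2, 0, 1, 10)
)

# Assumption: give blank categories an explicit label so they stay visible
sample_speech$Category[sample_speech$Category == ""] <- "uncategorized"

# Sum both party columns within each category
by_category <- aggregate(
  cbind(Democrats, Republican) ~ Category,
  data = sample_speech,
  FUN = sum
)

by_category
```

Run against the full Speech data frame, the same call would give one row per category with the two party totals side by side.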