library(dplyr)           # provides the %>% pipe used below
library(tidytext)        # package for text analysis
library(readxl)          # reads Excel files, the format I used for the data

In this notebook, I analyze the manifestos of six mass killers. In particular, I want to look at their vocabularies and see which words are most distinctive of each author. To accomplish this, I am using the tidytext and readxl packages in R.

  1. First, I will tell R to load the text and then “unnest” the words, which means breaking each manifesto down into its constituent words, one word per row, with spaces and punctuation removed. By default, unnest_tokens also converts every word to lowercase.
manifestos <- read_excel("manifestos.xlsx")

manifestos_words <- manifestos %>%
  unnest_tokens(word, text)
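
To make the unnesting step concrete, here is a minimal sketch on a tiny made-up table. The `toy` data frame, its `author` column, and its two sentences are hypothetical stand-ins, not the real manifestos; only the `text` column name and the `unnest_tokens(word, text)` call mirror the code above.

```r
library(dplyr)      # %>% and tibble()
library(tidytext)   # unnest_tokens()

# Hypothetical stand-in for the spreadsheet: one row per document,
# with the full text of each document in a column named `text`.
toy <- tibble(
  author = c("A", "B"),
  text   = c("Hello, world!", "The world is big.")
)

# One row per word; punctuation is dropped and words are lowercased.
# The `text` column is replaced by the new `word` column.
toy_words <- toy %>%
  unnest_tokens(word, text)

toy_words$word
# "hello" "world" "the" "world" "is" "big"
```

Each word keeps the other columns of the row it came from (here `author`), which is what should make it possible to compare vocabularies across documents later on.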