Speeches of Hitler, Mussolini, and Trump

Author

Sami Engel

For my project, I want to analyze the speeches of Hitler, Mussolini, and Trump, three men who have embraced dictatorial and fascist leadership styles. I predict that, to rally the crowds and cults all three men succeeded in building, their speeches would carry mostly negative sentiment and rely on short, common words so that as many people as possible could understand them. I obtained the text of Hitler’s speeches from https://archive.org/stream/TheSpeechesOfAdolfHitler19211941/hitler-speeches-collection_djvu.txt. The text of Mussolini’s speeches comes from https://www.gutenberg.org/files/62754/62754-0.txt. Lastly, the text of Trump’s speeches comes from https://github.com/ryanmcdermott/trump-speeches/blob/master/speeches.txt.

First, we need to load the following packages and read in the downloaded text files:

Code
library(tidyverse)
library(tidytext)
library(textdata)
library(readr)
library(wordcloud2)
library(gutenbergr)
library(ggthemes)
library(plotly)
Code
# First look at each raw file (library(readr) is already loaded above)
pg62754 <- read_csv("~/Desktop/BadMenSpeeches/pg62754.txt")
Hitler_txt <- read_csv("~/Desktop/BadMenSpeeches/Hitler.txt.rtf")
Trump <- read_csv("~/Desktop/BadMenSpeeches/Trump.txt")

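Next, I read in Mussolini’s, Hitler’s, and Trump’s speeches, split them into individual words, remove stop words (very common words such as “the” and “and”), and count how often each remaining word appears.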

Code
# Mussolini: read the text, split it into words, drop stop words, count
pg62754 <- read_csv("~/Desktop/BadMenSpeeches/pg62754.txt", col_names = FALSE)

pg62754 |>
  unnest_tokens(word, X1) |>
  anti_join(stop_words, by = "word") |>
  count(word, sort = TRUE) |>
  mutate(Speaker = "Mussolini") -> Mussolini

# Hitler: same steps
Hitler_txt <- read_csv("~/Desktop/BadMenSpeeches/Hitler.txt.rtf", col_names = FALSE)

Hitler_txt |>
  unnest_tokens(word, X1) |>
  anti_join(stop_words, by = "word") |>
  count(word, sort = TRUE) |>
  mutate(Speaker = "Hitler") -> Hitler

# Trump: same steps (the raw text object is overwritten by the word counts)
Trump <- read_csv("~/Desktop/BadMenSpeeches/Trump.txt", col_names = FALSE)

Trump |>
  unnest_tokens(word, X1) |>
  anti_join(stop_words, by = "word") |>
  count(word, sort = TRUE) |>
  mutate(Speaker = "Trump") -> Trump
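
One caveat about the loading step: `read_csv()` splits each line at commas, so `X1` holds only the text before the first comma on a line. A minimal sketch of an alternative, assuming the goal is to tokenize every full line, reads the file with `read_lines()` instead. The temporary file and the `count_words()` helper are hypothetical stand-ins and not part of my original analysis, and switching to this approach would change the counts reported in this post.

```r
library(dplyr)
library(readr)
library(tidytext)

# Hypothetical helper: read a plain-text file line by line,
# then count the words that are not stop words
count_words <- function(path, speaker) {
  tibble(text = read_lines(path)) |>
    unnest_tokens(word, text) |>
    anti_join(stop_words, by = "word") |>
    count(word, sort = TRUE) |>
    mutate(Speaker = speaker)
}

# Toy file standing in for one of the speech files
demo_path <- tempfile(fileext = ".txt")
write_lines(
  c("We will win, and we will win again.", "The crisis is over, friends."),
  demo_path
)

demo_counts <- count_words(demo_path, "Demo")
demo_counts
```

Because each line is kept whole, the second “win” after the comma in the toy text is still counted.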

Now that all three files have been loaded and counted, we can combine the three tables into one to look for similarities.

Code
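# The three tables share every column, so these joins simply stack their rows
Trump |>
  full_join(Mussolini, by = c("word", "n", "Speaker")) |>
  full_join(Hitler, by = c("word", "n", "Speaker")) -> BadMen

# Each speaker-word pair is already a single row, so this count is 1 everywhere
BadMen |>
  group_by(Speaker) |>
  count(word, sort = TRUE)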
# A tibble: 20,804 × 3
# Groups:   Speaker [3]
   Speaker word           n
   <chr>   <chr>      <int>
 1 Hitler  1              1
 2 Hitler  1,000          1
 3 Hitler  1,042          1
 4 Hitler  10             1
 5 Hitler  10,000         1
 6 Hitler  10,000,000     1
 7 Hitler  10,572         1
 8 Hitler  100            1
 9 Hitler  104            1
10 Hitler  10th           1
# ℹ 20,794 more rows
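
Because `BadMen` already holds one row per speaker and word, counting again returns 1 for every row, as the output above shows. A minimal sketch of ranking by the existing `n` column instead, using a small made-up table (`toy_counts` and its numbers are hypothetical, not the real data):

```r
library(dplyr)

# Made-up counts with the same columns as BadMen
toy_counts <- tibble(
  word = c("people", "country", "war", "italy", "people", "germany"),
  n = c(50, 30, 40, 60, 70, 20),
  Speaker = c("Trump", "Trump", "Mussolini", "Mussolini", "Hitler", "Hitler")
)

# Keep the single most frequent word for each speaker
top_words <- toy_counts |>
  group_by(Speaker) |>
  slice_max(n, n = 1, with_ties = FALSE) |>
  ungroup()

top_words
```

Raising the `n = 1` argument of `slice_max()` would return the top several words per speaker instead of just one.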

We can score each word with the AFINN sentiment lexicon, then look at the most negative and most positive words used by each man. A word cloud shows how often each of those words was used.

Code
BadMen |>
  inner_join(get_sentiments('afinn')) -> BadMen_sentiment

First, the 30 most negative words (by AFINN score) in Trump’s speeches, sized by how often he used them.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Trump") |>
  arrange(value) |>  # ascending: most negative AFINN scores first
  head(30) |>
  wordcloud2()

It seems that the most common negative words Trump includes in his speeches are “bad” and “horrible.”
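
The code above sorts by AFINN score, so the cloud holds the 30 most strongly negative words rather than the 30 most frequent ones. A sketch of the alternative selection, assuming the aim were the most frequently used negative words; `toy_cloud_input` and its numbers are made up for illustration and share only the column layout of `BadMen_sentiment`:

```r
library(dplyr)

# Made-up rows with the same columns as BadMen_sentiment
toy_cloud_input <- tibble(
  word = c("bad", "horrible", "catastrophic", "win", "problem"),
  n = c(120, 45, 2, 200, 80),
  Speaker = "Trump",
  value = c(-3, -3, -4, 4, -2)
)

# Keep negative words only, then take the most frequent ones
most_frequent_negative <- toy_cloud_input |>
  filter(Speaker == "Trump", value < 0) |>
  slice_max(n, n = 3, with_ties = FALSE) |>
  select(word, n)  # wordcloud2() reads the first two columns as word and frequency

most_frequent_negative
```

The resulting two-column table could then be passed to `wordcloud2()` in the same way as the clouds in this post.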

Now, the 30 most positive words (by AFINN score) in Trump’s speeches, again sized by how often he used them.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Trump") |>
  arrange(desc(value)) |>
  head(30) |>
  wordcloud2()

It looks like Trump says “win” and “love” the most among the positive words he includes in his speeches.

Next, let’s look at the 30 most negative words in Mussolini’s speeches.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Mussolini") |>
  arrange(value) |>  # ascending: most negative AFINN scores first
  head(30) |>
  wordcloud2()

Based on this word cloud, “crisis” was the negative word Mussolini used most, along with “dead” and “lost.”

Now, the 30 most positive words in Mussolini’s speeches.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Mussolini") |>
  arrange(desc(value)) |>
  head(30) |>
  wordcloud2()

Mussolini said “love” the most, along with “perfectly” and “win.”

Lastly, let’s examine the 30 most negative words in Hitler’s speeches.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Hitler") |>
  arrange(value) |>  # ascending: most negative AFINN scores first
  head(30) |>
  wordcloud2()

In this word cloud, “lost” and “crisis” stand out as the negative words Hitler used most throughout his speeches.

And now, let’s visualize the 30 most positive words in Hitler’s speeches.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Hitler") |>
  arrange(desc(value)) |>
  head(30) |>
  wordcloud2()

Among these words, Hitler used “won”, “win”, and “love” most often throughout his speeches.

The word clouds show interesting similarities in the negative and positive words the three men used most. Trump used the word “bad” a lot, as well as synonyms such as “terrible” and “horrible.” Mussolini focused more on death and suffering, with “dead,” “die,” and “destruction” appearing often. Hitler said “crisis” and “lost” a lot throughout his speeches, perhaps to stoke fear among his listeners about the state of Germany. The positive words are also similar: all three men said “love” and “win” most often. I infer that each leader wanted to build trust and connection with his followers and to make them believe he would lead their country and people to victory against their enemies.

I predicted that, on average, each leader’s sentiment throughout his speeches would be negative, so here I can check whether I am correct.

Code
BadMen_sentiment |>
  filter(Speaker %in% "Trump") -> Trump_sentiment

BadMen_sentiment |>
  filter(Speaker %in% "Mussolini") -> Mussolini_sentiment

BadMen_sentiment |>
  filter(Speaker %in% "Hitler") -> Hitler_sentiment
Code
mean(Trump_sentiment$value)
[1] -0.3678977
Code
mean(Mussolini_sentiment$value)
[1] -0.2750846
Code
mean(Hitler_sentiment$value)
[1] -0.3725

Trump’s mean sentiment is -0.3678977, Mussolini’s is -0.2750846, and Hitler’s is -0.3725. These means are computed over the distinct sentiment-bearing words each man used. On average, each leader’s vocabulary carried a negative sentiment, and the scores are relatively close. Hitler has the most negative sentiment, followed very closely by Trump, and then Mussolini.
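
The means above give every distinct word equal weight, however often it was spoken. A sketch of a frequency-weighted version, assuming the `n` column should act as the weight; `toy_sentiment` is made-up illustrative data with the same columns as `BadMen_sentiment`, not my real results:

```r
library(dplyr)

# Made-up rows with the same columns as BadMen_sentiment
toy_sentiment <- tibble(
  word = c("bad", "win", "crisis", "love"),
  n = c(10, 30, 5, 5),
  Speaker = c("A", "A", "B", "B"),
  value = c(-3, 4, -3, 3)
)

# Compare the per-distinct-word mean with the mean weighted by word frequency
sentiment_summary <- toy_sentiment |>
  group_by(Speaker) |>
  summarize(
    unweighted = mean(value),
    weighted = weighted.mean(value, w = n)
  )

sentiment_summary
```

In this toy table, speaker A’s frequent use of one positive word pulls the weighted mean well above the unweighted one, which is the kind of difference the weighting is meant to reveal.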

To find each man’s most frequently used words overall, I need to strip punctuation such as apostrophes and filter out contractions and filler words that carry little meaning, such as “im” and “its.” This shows what each man liked to emphasize to his followers.

Code
BadMen |>
  mutate(word = gsub(pattern = '[[:punct:]]', replacement = '', word)) -> BadMen2
Code
BadMen2 |>
  filter(Speaker %in% "Trump") |>
  filter(!word %in% c("its", "im", "were", "theyre", "thats", "hes", "ive", "youre", "cant", "didnt", "dont", "lot") ) |>
  arrange(desc(n)) |>
  left_join(get_sentiments("afinn")) |>
  head(10) |>
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) +
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Trump Top Ten Words")

Trump overwhelmingly said “people” most throughout his speeches. Having observed him over the last eight years, I have seen that he uses his speeches to talk about other people, criticize particular groups, and speak directly to his audiences. As a presidential candidate and president, it also makes sense that he would say “country” often, both to discuss the United States and to describe America’s relationships with other nations. Trump also enjoys talking about money, whether the nation’s economic state or his own finances, mostly the latter.

Code
BadMen2 |>
  filter(Speaker %in% "Mussolini") |>
  filter(!word %in% c("its", "im", "were", "theyre", "thats", "hes", "ive", "youre", "cant", "didnt", "dont", "lot") ) |>
  arrange(desc(n)) |>
  left_join(get_sentiments("afinn")) |>
  head(10) |>
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) +
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Mussolini Top Ten Words")

Mussolini’s most popular words include “Italy,” “Italian,” and “war,” which all make sense: he led Italy, spoke to Italian audiences in most of his speeches, and was in power during World War II, so his speeches would often concern war.

Code
BadMen2 |>
  filter(Speaker %in% "Hitler") |>
  filter(!word %in% c("its", "im", "were", "theyre", "thats", "hes", "ive", "youre", "cant", "didnt", "dont", "lot") ) |>
  arrange(desc(n)) |>
  left_join(get_sentiments("afinn")) |>
  head(10) |>
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) +
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Hitler Top Ten Words")

Hitler’s most popular words included “people,” “German,” and “Germany.” Like Trump, Hitler used “people” both to talk about and blame enemy groups and to rally the people of Germany to believe in him. Like Mussolini, Hitler said “German” and “Germany” so often because he was the leader of Germany and was mostly speaking to or about the German people.

For the last portion of my study, I want to compare the average length of the words in each leader’s speeches, because I hypothesize that the words would be relatively short to appeal to a larger and less educated crowd.

Code
# Word length in characters; refer to the column directly inside mutate()
BadMen2 |>
  mutate(Length = str_length(word)) -> BadMen2_length

# Average length over each speaker's distinct words
BadMen2_length |>
  group_by(Speaker) |>
  summarize(average = mean(Length)) |>
  ggplot(aes(Speaker, average, fill = Speaker)) +
  geom_col() +
  labs(x = "Speaker", y = "Average Word Length by Letter", title = "Average word length in Speakers' Speeches")
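
The plot above averages over each speaker’s distinct words, so a rare long word counts as much as a short word repeated hundreds of times. A sketch of weighting by frequency instead, assuming the `n` column should serve as the weight; `toy_words` and its numbers are made up and are not the speech data:

```r
library(dplyr)
library(stringr)

# Made-up counts with the same columns as BadMen2
toy_words <- tibble(
  word = c("bad", "good", "tremendous", "people", "nation"),
  n = c(40, 60, 5, 30, 10),
  Speaker = c("A", "A", "A", "B", "B")
)

# Average word length per distinct word vs. per word actually spoken
length_summary <- toy_words |>
  mutate(Length = str_length(word)) |>
  group_by(Speaker) |>
  summarize(
    per_distinct_word = mean(Length),
    per_spoken_word = weighted.mean(Length, w = n)
  )

length_summary
```

For a claim about what audiences actually heard, the frequency-weighted figure is arguably the closer measure, since it reflects how often each word was spoken.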

Hitler’s average word length was the longest of the three leaders, but all three averages are relatively close, with Trump’s being the shortest. Trump uses simple words like “good” and “bad” to appeal to his target audiences.

Overall, these results support my hypothesis: the speeches of Trump, Mussolini, and Hitler carried generally negative sentiment and relied on short, common words.