HEAT MAPS OF DEBATE SPEECH in R

SUMMARY

This is a continuation of BIAS AND CONTEXT IN PRESIDENTIAL DEBATE TEXTS, which focused on a “Bag of Words” approach to analyzing the text of presidential debates.

This analysis shows a “Heat Map” of frequent words. It is not a new analysis so much as a better way of visualizing the same data.

DATA SOURCES AND METHODS

The text of the presidential debates was downloaded from the UCSB Presidency Project. Transcripts were pasted into Apple Pages and stored as unformatted .txt files.
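
As a rough illustration of how a stored .txt transcript might be read back into R, here is a minimal sketch. The file contents, the `SPEAKER: text` line format, and the names `path`, `speaker`, and `by.speaker` are all assumptions made for the example, not details taken from the original analysis.

```r
## Hypothetical transcript written to a temp file so the sketch runs on its own.
path <- tempfile(fileext = ".txt")
writeLines(c("CLINTON: We need to defeat ISIS.",
             "SANDERS: This is an international issue.",
             "CLINTON: I think we can."), path)

lines <- readLines(path, encoding = "UTF-8")

## Assumed format: each turn starts with a speaker label followed by a colon.
speaker <- toupper(sub(":.*$", "", lines))
text    <- trimws(sub("^[^:]*:", "", lines))

## Concatenate everything each speaker said into one string per speaker.
by.speaker <- tapply(text, speaker, paste, collapse = " ")
```

Real transcripts will likely need more cleanup (moderator turns, audience notes, multi-line turns), so this should be read as a starting point only.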

    ## Filter text
    word.filter <- "terror"

    ## Start and stop word frequency rank
    n.s <- 1    ## Start
    n.w <- 20   ## Number of words
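
The report does not show how these parameters are applied, so the following is only one plausible reading: `word.filter` keeps the passages that match the pattern, and `n.s` and `n.w` pick a window out of the ranked word list. The sample lines and the names `kept`, `ranked`, and `top` are hypothetical.

```r
word.filter <- "terror"
n.s <- 1    ## Start rank
n.w <- 20   ## Number of words

## Hypothetical transcript lines standing in for the real debate text.
lines <- c("We must confront terrorism abroad.",
           "Taxes are too high.",
           "Terrorists will be defeated.")

## Assumption: the filter keeps only lines that contain the pattern.
kept <- lines[grepl(word.filter, lines, ignore.case = TRUE)]

## Strip punctuation, lowercase, split on whitespace, and rank by frequency.
words  <- unlist(strsplit(tolower(gsub("[[:punct:]]", "", kept)), "\\s+"))
ranked <- sort(table(words), decreasing = TRUE)

## Take ranks n.s through n.s + n.w - 1, clipped to the available words.
idx <- seq(n.s, min(n.s + n.w - 1, length(ranked)))
top <- ranked[idx]
```

With this reading, raising `n.s` skips the most frequent words and `n.w` controls how many rows end up in the table.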

CANDIDATE WORD FREQUENCIES

We can check word frequency directly by tokenizing the text and counting single words. (Note: this partially duplicates the work done in the first analysis. Because the word vector analysis below uses some of its output, it is reproduced here in a slightly different format as a quality check.)
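
A minimal base R sketch of tokenizing and counting might look like the following. The sample text and the helper `tokenize` are hypothetical, and the actual analysis may well have used a text-mining package instead.

```r
## Hypothetical sample text; the real input would be the debate transcripts.
speech <- c(
  clinton = "We need to defeat ISIS. We need allies.",
  cruz    = "Radical Islamic terrorism. ISIS will be defeated."
)

## Lowercase, replace anything that is not a letter or apostrophe with a
## space, then split on whitespace.
tokenize <- function(txt) {
  txt <- tolower(txt)
  txt <- gsub("[^a-z' ]", " ", txt)
  tokens <- unlist(strsplit(txt, "\\s+"))
  tokens[tokens != ""]
}

tokens <- lapply(speech, tokenize)

## Count each word per candidate against a shared vocabulary.
vocab <- sort(unique(unlist(tokens)))
freq  <- sapply(tokens, function(tk) as.integer(table(factor(tk, levels = vocab))))
rownames(freq) <- vocab

## Add a combined column and sort by it, most frequent first.
freq <- cbind(freq, all = rowSums(freq))
freq <- freq[order(-freq[, "all"]), ]
```

Counting against a shared `vocab` keeps a zero for words a candidate never used, which is what allows the per-candidate columns to line up in one table.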

There are a total of 925 words in the combined vocabulary of the candidates.

word	trump	sanders	clinton	rubio	cruz	all
isis	0	4	6	7	14	31
terrorism	1	8	7	1	12	29
terrorists	0	2	5	6	12	25
radical	1	0	1	3	18	23
islamic	1	0	0	0	18	19
need	0	2	7	1	8	18
will	0	0	4	1	13	18
think	0	4	10	0	2	16
people	2	1	6	5	0	14
now	0	1	1	5	6	13
going	0	5	3	2	0	10
international	0	7	1	0	0	8
got	0	5	1	0	1	7
issue	0	6	0	1	0	7
records	0	0	0	6	0	6
say	2	1	2	0	1	6
things	2	0	0	0	1	3
opened	2	0	0	0	0	2
SUM	46	310	402	330	821	1909

- Hillary Clinton spoke 402 words, with a vocabulary of 289 words.
- Bernie Sanders spoke 310 words, with a vocabulary of 211 words.
- Donald Trump spoke 46 words, with a vocabulary of 42 words.
- Ted Cruz spoke 821 words, with a vocabulary of 469 words.
- Marco Rubio spoke 330 words, with a vocabulary of 218 words.
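
For clarity, "words" here counts every token spoken, while "vocabulary" counts distinct tokens. A tiny sketch with a hypothetical token vector shows the difference:

```r
## Hypothetical tokens for one candidate.
tokens <- c("we", "need", "to", "defeat", "isis", "we", "need", "allies")

total.words <- length(tokens)          ## every token, repeats included
vocab.size  <- length(unique(tokens))  ## distinct tokens only
```

In this example `total.words` is 8 and `vocab.size` is 6, because "we" and "need" each appear twice.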

A “heat map” of frequent words shows several interesting patterns. For instance, all candidates but one use the word “people” with high frequency. Conversely, only one candidate mentions the word “tax” frequently.
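
A heat map like this can be drawn with base R's `heatmap()`. The sketch below uses five rows copied from the frequency table above. The choice of rows, the color palette, and the decision to turn off clustering and scaling are my assumptions for the example, not necessarily what produced the original figure.

```r
## Five rows taken from the frequency table above (per-candidate counts).
freq <- matrix(
  c(0, 4, 6, 7, 14,
    1, 8, 7, 1, 12,
    0, 2, 5, 6, 12,
    1, 0, 1, 3, 18,
    2, 1, 6, 5,  0),
  ncol = 5, byrow = TRUE,
  dimnames = list(
    c("isis", "terrorism", "terrorists", "radical", "people"),
    c("trump", "sanders", "clinton", "rubio", "cruz"))
)

## Rowv = NA and Colv = NA switch off dendrograms and reordering;
## scale = "none" plots raw counts instead of row-standardized values.
## The palette is reversed so that higher counts appear in deeper red.
heatmap(freq, Rowv = NA, Colv = NA, scale = "none",
        col = rev(heat.colors(16)), margins = c(6, 8))
```

Because the candidates spoke very different amounts, raw counts favor whoever talked most. Dividing each column by that candidate's total before plotting would be one way to compare relative emphasis instead.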

CONCLUSIONS

Word choices vary from candidate to candidate. Filtering for specific text and counting words reveals interesting and potentially exploitable patterns.

WW44SS

April 5, 2016
