Hillary Clinton was Secretary of State from 2009 to 2013. She used her own email server and her own BlackBerry for both personal and government email. She was investigated twice, and neither investigation led to criminal charges. The last FBI investigation was timed in a way that interfered with a potentially successful run for the U.S. presidency. One criticism leveled at Secretary Clinton, whether or not any criminal wrongdoing was found, was that one of the missions of the Department of State is to safeguard the safety and security of the United States and its citizens, both at home and abroad, and that by having sensitive federal communications sent to a personal server and BlackBerry she was going against the advice of her digital policy/cyber team.
A word cloud is a graphical depiction of the most frequently occurring terms in a text after the text has been cleaned up, that is, after removing stop words and other words that do not contribute to the analysis.
This exercise is essentially a data cleaning exercise, with a visualization at the end.
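To make the idea concrete, here is a minimal base R sketch on made-up words, not on the email data. The word vector is an assumption chosen for illustration; the point is only that the term counts left after cleaning are what a word cloud turns into font sizes. The exact scaling used by a plotting package may differ from the simple ratio shown here.

```r
# Made-up, already cleaned words (an assumption for illustration)
clean_words <- c("meeting", "security", "meeting", "president",
                 "security", "meeting")

# Count each term and sort with the most frequent first
term_counts <- sort(table(clean_words), decreasing = TRUE)

# Relative size of each term compared with the most frequent one
relative_size <- as.numeric(term_counts) / max(term_counts)
names(relative_size) <- names(term_counts)
print(round(relative_size, 2))
```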
library(tidyverse) # metapackage of all tidyverse packages
library(dplyr)
library(tidytext)
library(stringr)
library(tidyr)
library(scales)
library(DT)
library(knitr)
library(tm)
library(tibble)
library(magrittr)
library(purrr)
library(readr)
library(devtools)
library(wordcloud)
library(wordcloud2)
df <- read.csv("/cloud/project/HC.csv", header=TRUE, stringsAsFactors=FALSE)
head(df, 2)
## docID subject documentClass
## 1 C06245106 (NO SUBJECT) Litigation_F-2016-07895_48
## 2 C06245079 Delivery Status Notification (Failure) Litigation_F-2016-07895_48
## pdfLink
## 1 DOCUMENTS/Litigation_F-2016-07895_48/F-2016-07895/DOC_0C06245106/C06245106.pdf
## 2 DOCUMENTS/Litigation_F-2016-07895_48/F-2016-07895/DOC_0C06245079/C06245079.pdf
## originalLink docDate postedDate from to
## 1 NA 2018-10-04 Tom Buffenbarger Hillary Clinton
## 2 NA 2018-10-04 Mail Delivery System Hillary Clinton
## messageNumber caseNumber
## 1 F-2016-07895
## 2 F-2016-07895
## docText
## 1 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09/26/2018From: Buffenbarger TomTo: Hillary Clinton (hdr22@clinfonemail.com) hdr22@clintonemail.comGood afternoon, Madam Secretary,RELEASE IN PART B6Today's New York Times confirms what we knew lay just over the horizon. "Rapid Declines in Manufactur ing Spread GlobalAnxiety" details what one analyst calls a "classic adverse feedback loop" Unless that loop is reversed, the contraction inmanufacturing could increase instability in many countries and it could undermine America's own recovery.Unfortunately, industrial output indicators lag behind events by 60 to 90 days. The magnitude of the manufacturing crisis isonly now becoming evident. Next month's G-20 meeting may come too soon to focus on this newest crisis, but I am confidentyou will find a way to start the consultations required for a coordinated and effective global recovery.Good luck in London and just know I and your union remain exceedingly proud of the work you do for all Americans.Tom BuffenbargerNotice: This message is intended for the addressee only and may contain privileged and/or confidentia I information. Use ordissemination by anyone other than the intended recipient is prohibited.UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09126/2018\f
## 2 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018Delivery Status Notification (Failure)From:To:Subject:Mail Delivery System MAILER-DAEMON@smip07.bis.na.bl ackberry.comSRSO.xqtKNI-I.AC.att.blackberry.ne 1.1r15@srs.bis.na.b lackberry.comDelivery Status Notification (Failure)The following message to was undeliverable.The reason for the problem:5.1.0 - Unknown address error 550-1#5.1.0 Address rejectedEmbedded MessageFinal-Recipient:Action: failedStatus: 5.0.0 (permanent failure)Remote-MTA: dns, [207.105.247.120]Diagnostic-Code: smtp, 5.1.0 - Unknown address error 550-'#5.1.0 Address rejected0) End of Embedded Message Embedded Message From: "H"To: "Jim Steinberg"Cc: "Thomas E. Donilon"Date: Sat, 4 Apr 2009 13:14:57Subject: IraqRELEASE IN PARTB5,B6B6(delivery attempts:UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018\f\nUNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018The President wants I told him you would lead it B5for State starting next week and that Chris Hill should be involved. He agreed and I talked to Donilon who said there had beensome work already but he'd get the broader policy review organized. B5The President was particularly focussed onI had already asked Jeff Feltmann to give me info about what NEA was doing in and about Iraq so there is material for us toreview.Let's discuss Monday. Thx.End of Embedded MessageB5135UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 0912012018\f
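# Keep columns 8, 9 and 12 (from, to, docText)
df <- df[c(8, 9, 12)]
# Of those three, keep the first and third (from, docText)
df <- df[c(1, 3)]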
head(df,2)
## from
## 1 Tom Buffenbarger
## 2 Mail Delivery System
## docText
## 1 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09/26/2018From: Buffenbarger TomTo: Hillary Clinton (hdr22@clinfonemail.com) hdr22@clintonemail.comGood afternoon, Madam Secretary,RELEASE IN PART B6Today's New York Times confirms what we knew lay just over the horizon. "Rapid Declines in Manufactur ing Spread GlobalAnxiety" details what one analyst calls a "classic adverse feedback loop" Unless that loop is reversed, the contraction inmanufacturing could increase instability in many countries and it could undermine America's own recovery.Unfortunately, industrial output indicators lag behind events by 60 to 90 days. The magnitude of the manufacturing crisis isonly now becoming evident. Next month's G-20 meeting may come too soon to focus on this newest crisis, but I am confidentyou will find a way to start the consultations required for a coordinated and effective global recovery.Good luck in London and just know I and your union remain exceedingly proud of the work you do for all Americans.Tom BuffenbargerNotice: This message is intended for the addressee only and may contain privileged and/or confidentia I information. Use ordissemination by anyone other than the intended recipient is prohibited.UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09126/2018\f
## 2 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018Delivery Status Notification (Failure)From:To:Subject:Mail Delivery System MAILER-DAEMON@smip07.bis.na.bl ackberry.comSRSO.xqtKNI-I.AC.att.blackberry.ne 1.1r15@srs.bis.na.b lackberry.comDelivery Status Notification (Failure)The following message to was undeliverable.The reason for the problem:5.1.0 - Unknown address error 550-1#5.1.0 Address rejectedEmbedded MessageFinal-Recipient:Action: failedStatus: 5.0.0 (permanent failure)Remote-MTA: dns, [207.105.247.120]Diagnostic-Code: smtp, 5.1.0 - Unknown address error 550-'#5.1.0 Address rejected0) End of Embedded Message Embedded Message From: "H"To: "Jim Steinberg"Cc: "Thomas E. Donilon"Date: Sat, 4 Apr 2009 13:14:57Subject: IraqRELEASE IN PARTB5,B6B6(delivery attempts:UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018\f\nUNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018The President wants I told him you would lead it B5for State starting next week and that Chris Hill should be involved. He agreed and I talked to Donilon who said there had beensome work already but he'd get the broader policy review organized. B5The President was particularly focussed onI had already asked Jeff Feltmann to give me info about what NEA was doing in and about Iraq so there is material for us toreview.Let's discuss Monday. Thx.End of Embedded MessageB5135UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 0912012018\f
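Selecting columns by position works, but it silently picks the wrong columns if the file layout ever changes. As a hedged alternative, the sketch below selects by name on a small made-up data frame; the rows are invented for illustration, and only the column names `from`, `to` and `docText` come from the output shown earlier.

```r
# Toy data frame reusing some column names from the real data (rows are made up)
toy_df <- data.frame(
  docID   = c("A1", "A2"),
  from    = c("Sender One", "Sender Two"),
  to      = c("Recipient", "Recipient"),
  docText = c("first body", "second body"),
  stringsAsFactors = FALSE
)

# Selecting by name keeps working even if the column order changes
toy_small <- toy_df[, c("from", "docText")]
print(names(toy_small))
```

On the real data the equivalent single step would be `df[, c("from", "docText")]`.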
df <- rename(df, From = from, Email = docText)
head(df, 2)
## From
## 1 Tom Buffenbarger
## 2 Mail Delivery System
## Email
## 1 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09/26/2018From: Buffenbarger TomTo: Hillary Clinton (hdr22@clinfonemail.com) hdr22@clintonemail.comGood afternoon, Madam Secretary,RELEASE IN PART B6Today's New York Times confirms what we knew lay just over the horizon. "Rapid Declines in Manufactur ing Spread GlobalAnxiety" details what one analyst calls a "classic adverse feedback loop" Unless that loop is reversed, the contraction inmanufacturing could increase instability in many countries and it could undermine America's own recovery.Unfortunately, industrial output indicators lag behind events by 60 to 90 days. The magnitude of the manufacturing crisis isonly now becoming evident. Next month's G-20 meeting may come too soon to focus on this newest crisis, but I am confidentyou will find a way to start the consultations required for a coordinated and effective global recovery.Good luck in London and just know I and your union remain exceedingly proud of the work you do for all Americans.Tom BuffenbargerNotice: This message is intended for the addressee only and may contain privileged and/or confidentia I information. Use ordissemination by anyone other than the intended recipient is prohibited.UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245106 Date: 09126/2018\f
## 2 UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018Delivery Status Notification (Failure)From:To:Subject:Mail Delivery System MAILER-DAEMON@smip07.bis.na.bl ackberry.comSRSO.xqtKNI-I.AC.att.blackberry.ne 1.1r15@srs.bis.na.b lackberry.comDelivery Status Notification (Failure)The following message to was undeliverable.The reason for the problem:5.1.0 - Unknown address error 550-1#5.1.0 Address rejectedEmbedded MessageFinal-Recipient:Action: failedStatus: 5.0.0 (permanent failure)Remote-MTA: dns, [207.105.247.120]Diagnostic-Code: smtp, 5.1.0 - Unknown address error 550-'#5.1.0 Address rejected0) End of Embedded Message Embedded Message From: "H"To: "Jim Steinberg"Cc: "Thomas E. Donilon"Date: Sat, 4 Apr 2009 13:14:57Subject: IraqRELEASE IN PARTB5,B6B6(delivery attempts:UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018\f\nUNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 09120/2018The President wants I told him you would lead it B5for State starting next week and that Chris Hill should be involved. He agreed and I talked to Donilon who said there had beensome work already but he'd get the broader policy review organized. B5The President was particularly focussed onI had already asked Jeff Feltmann to give me info about what NEA was doing in and about Iraq so there is material for us toreview.Let's discuss Monday. Thx.End of Embedded MessageB5135UNCLASSIFIED U.S. Department of State Case No. F-2016-07895 Doc No. C06245079 Date: 0912012018\f
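# Split each email into one lowercase word token per row
tidy_HC <- df %>%
  unnest_tokens(word, Email)
# Load tidytext's built-in stop-word list and drop those words
data(stop_words)
tidy_HC2 <- tidy_HC %>%
  anti_join(stop_words, by = "word")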
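What these two steps do can be sketched in base R on made-up data. This is only a rough stand-in: the real tokenizer behind `unnest_tokens` handles punctuation differently (for example, it keeps tokens such as `state.gov` intact), and the two toy emails and the four-word stop list are assumptions for illustration.

```r
# Two made-up emails (assumptions for illustration)
toy <- data.frame(
  From  = c("Sender One", "Sender Two"),
  Email = c("The meeting is on Monday.", "Please call the Secretary."),
  stringsAsFactors = FALSE
)

# Lowercase and split on non-alphanumeric characters (rough stand-in for unnest_tokens)
tokens <- strsplit(tolower(toy$Email), "[^a-z0-9]+")

# One row per token, carrying the sender along
toy_tidy <- data.frame(
  From = rep(toy$From, lengths(tokens)),
  word = unlist(tokens),
  stringsAsFactors = FALSE
)
toy_tidy <- toy_tidy[toy_tidy$word != "", ]

# Stand-in for anti_join(stop_words): drop rows whose word is in the list
toy_stop <- c("the", "is", "on", "please")
toy_clean <- toy_tidy[!toy_tidy$word %in% toy_stop, ]
print(toy_clean)
```

Because every token is a single lowercase word, anything compared against the tokens later also has to be a single lowercase word to match.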
# Custom stop words. unnest_tokens() yields single lowercase tokens, so only
# single lowercase entries can match; multi-word or uppercase entries such as
# "Case No" or "UNCLASSIFIED" never would and are left out. Duplicates are dropped.
uni_sw <- data.frame(word = c(
  # days, months and other calendar words
  "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday",
  "january", "march", "april", "june", "july", "august", "september", "october",
  "day", "time", "pm", "date",
  # email headers and release boilerplate
  "state.gov", "gov", "clintonemail.com", "clintonemail", "hrod17", "email",
  "http", "mailto", "fw", "cc", "original", "message", "subject", "doc",
  "unclassified", "department", "u.s", "b5", "b6", "from", "fyi",
  # other words excluded from the analysis
  "world", "government", "people", "al", "staff", "country", "told", "call",
  "hillary", "office",
  # numbers: "0" to "9", "00" to "31", years, and other numeric tokens
  as.character(0:9), sprintf("%02d", 0:31), "45",
  "2009", "2010", "2011", "2012", "2014", "2015", "2016", "2018",
  "07895", "20439", "1.4"
), stringsAsFactors = FALSE)
tidy_HC3 <- tidy_HC2 %>%
  anti_join(uni_sw, by = "word")
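Many of the custom stop words are numbers. As a possible alternative to listing them one by one, numeric tokens can be dropped with a pattern. The sketch below uses base R on a made-up token column; note that it removes every numeric token, which is broader than the hand-written list, so it is a different technique rather than an exact equivalent.

```r
# Made-up tokens mixing words and numbers (assumptions for illustration)
toy_words <- data.frame(
  word = c("meeting", "2009", "04", "1.4", "b6", "security", "07895"),
  stringsAsFactors = FALSE
)

# TRUE for tokens made only of digits, optionally with "." between digit runs (e.g. "1.4")
is_numeric_token <- grepl("^[0-9]+(\\.[0-9]+)*$", toy_words$word)

# Keep only the non-numeric tokens
toy_filtered <- toy_words[!is_numeric_token, , drop = FALSE]
print(toy_filtered$word)
```

Tokens such as `b6` contain a letter, so the pattern leaves them in; they would still need an explicit stop-word entry.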
HC_count <- tidy_HC3 %>%
  count(word, sort = TRUE)
head(HC_count, 20)
## word n
## 1 cid 190355
## 2 release 34002
## 3 cheryl 29366
## 4 huma 25779
## 5 mills 25448
## 6 sullivan 25132
## 7 jacob 24223
## 8 abedin 20719
## 9 president 17879
## 10 secretary 17072
## 11 hdr22 13613
## 12 security 13458
## 13 clinton 13080
## 14 meeting 13038
## 15 foreign 12283
## 16 house 11641
## 17 united 11220
## 18 news 11070
## 19 millscd 10601
## 20 obama 10547
wordcloud2(HC_count, size = 2.3, minRotation = -pi/6, maxRotation = -pi/6, rotateRatio = 1)
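The call above passes every remaining term to the plot. If rendering is slow or the cloud is crowded, one option is to keep only the most frequent terms first. The sketch below shows that trimming step in base R on made-up counts; the cutoff `top_n` and the toy words are hypothetical, and the plotting call is left as a comment so the example runs without the wordcloud2 package.

```r
# Made-up word counts in the same word / n layout as HC_count
toy_counts <- data.frame(
  word = c("alpha", "beta", "gamma", "delta", "epsilon"),
  n    = c(50L, 7L, 120L, 3L, 19L),
  stringsAsFactors = FALSE
)

# Order by frequency (highest first) and keep the top_n most frequent rows
top_n <- 3  # hypothetical cutoff, chosen only for this example
ordered <- toy_counts[order(toy_counts$n, decreasing = TRUE), ]
top_words <- head(ordered, top_n)
print(top_words)

# The trimmed data frame could then be plotted in place of the full one, e.g.
# wordcloud2(top_words, size = 2.3)
```

Since `HC_count` is already sorted by `count(word, sort = TRUE)`, `head(HC_count, 200)` alone would be enough on the real data; 200 is an arbitrary illustrative value.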