Sentiment Analysis

Row

Who Has Traction On Twitter?

How Are They Feeling?

The red dashed line in each density plot marks the median sentiment score across all tweets from both accounts. A score of 0 indicates neutral language. A negative score means negative words outnumber positive ones, and a positive score means the opposite.

Row

Total Retweets Trump

958025

Total likes Trump

1761127

Total Retweets JOE

1610608

Total likes JOE

7707869

What Words Do They Use Most Frequently?

Column

JOE’s Words

Column

Don’s Words

Word Cloud

Sadly, it isn’t currently possible to show both word clouds at the same time. It appears to be an issue with flexdashboard, and a fix is being worked on.

Since Don is far more active on Twitter than Joe (after all, he is a slightly younger man), I have included only the word cloud of his words for now.

Wordcloud Don

---
title: "Trump & Joe: The Latest 100 Tweets"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    source_code: embed
    social: ["menu"]
    theme: paper
---
```{r include=FALSE}
library(flexdashboard)
library(dplyr)
library(rtweet)
library(SentimentAnalysis) #sentiment analysis
library(ggplot2)


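# To connect to Twitter, read the credentials from environment variables
# (for example set in ~/.Renviron) rather than hard-coding them: with
# `source_code: embed`, anything written here is published with the dashboard.
# The variable names below are placeholders; use whichever names you have set.

api_key <- Sys.getenv("TWITTER_API_KEY")
api_secret_key <- Sys.getenv("TWITTER_API_SECRET_KEY")
access_token <- Sys.getenv("TWITTER_ACCESS_TOKEN")
access_token_secret <- Sys.getenv("TWITTER_ACCESS_TOKEN_SECRET")

## authenticate with the app keys and access tokens (no browser prompt needed)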
token <- create_token(
  app = "JOE-DON-in-100-tweets",
  consumer_key = api_key,
  consumer_secret = api_secret_key,
  access_token = access_token,
  access_secret = access_token_secret)
```

```{r data gathering, include=FALSE}
Don <- get_timelines("realdonaldtrump", n = 100)
JOE <- get_timelines("JoeBiden", n = 100)
```

```{r popularity}
popularity <- bind_rows(
  tibble(ID = "TRUMP", likes = Don$favorite_count, Retweets = Don$retweet_count, time = Don$created_at),
  tibble(ID = "BIDEN", likes = JOE$favorite_count, Retweets = JOE$retweet_count, time = JOE$created_at)
)

```
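The totals shown later in the value boxes can also be derived from a table shaped like `popularity` in one step. The following is a minimal base-R sketch with made-up numbers; `toy_popularity` and `totals` are illustrative names, not objects the dashboard creates.

```r
# Made-up engagement numbers standing in for the real `popularity` table.
toy_popularity <- data.frame(
  ID       = c("TRUMP", "TRUMP", "BIDEN", "BIDEN"),
  likes    = c(100, 250, 400, 50),
  Retweets = c(30, 70, 20, 10)
)

# Sum likes and retweets per account.
totals <- aggregate(cbind(likes, Retweets) ~ ID, data = toy_popularity, FUN = sum)

# Flag the account with the most likes, mirroring the value-box colour logic.
totals$color <- ifelse(totals$likes == max(totals$likes), "success", "primary")
print(totals)
```

Keeping the totals in one table avoids recomputing the same sums for each value box.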

Sentiment Analysis
======================================================================

Row
-----------------------------------------------------------------
### Who Has Traction On Twitter?

```{r likes and Rts}
popularity %>%
  arrange(desc(time)) %>%
  ggplot(aes(x = likes, y = time, size = Retweets, color = ID)) +
  facet_grid(~ID) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(.1, 9)) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Who Has Traction On Twitter?", subtitle = "Last 100 Tweets", caption = "\nSource: Data collected from Twitter's REST API via rtweet")

```


### How Are They Feeling? 

The red dashed line in each density plot marks the median sentiment score across all tweets from both accounts. 
A score of 0 indicates neutral language. A negative score means negative words outnumber positive ones, and a positive score means the opposite. 

```{r include=FALSE}
library(textdata)
library(tidytext)
library(purrr) # to use the map function
```

```{r function to get the sentiments out, include=FALSE}
# Score one tweet with the Bing lexicon: +1 per positive word, -1 per negative word.
sentiment.1 <- function(twt) {
  twt_tbl <- tibble(text = twt) %>%
    mutate(stripped_text = gsub("http\\S+", "", text)) %>% # strip URLs (\\S+, not \\s+)
    unnest_tokens(word, stripped_text) %>%
    anti_join(stop_words, by = "word") %>%
    inner_join(get_sentiments("bing"), by = "word") %>%
    count(word, sentiment, sort = TRUE) %>%
    mutate(score = case_when(
      sentiment == "negative" ~ n * (-1),
      sentiment == "positive" ~ n * 1
    ))

  # A tweet with no lexicon words gets a neutral score of 0.
  sent.score <- if (nrow(twt_tbl) == 0) 0 else sum(twt_tbl$score)

  list(score = sent.score, twt_tbl = twt_tbl)
}
```
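To make the scoring rule concrete, here is a minimal base-R sketch of the same arithmetic: each positive word adds 1, each negative word subtracts 1, and a tweet with no lexicon words scores 0. The six-word `toy_lexicon`, the helper `toy_score()`, and the example tweets are illustrative assumptions. They stand in for the Bing lexicon and real tweets, and the crude tokenizer here skips the stop-word step.

```r
# Hypothetical miniature lexicon; the dashboard itself uses get_sentiments("bing").
toy_lexicon <- data.frame(
  word      = c("great", "win", "strong", "bad", "fake", "weak"),
  sentiment = c("positive", "positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Score one tweet: +1 per positive word, -1 per negative word, 0 if none match.
toy_score <- function(twt) {
  no_urls <- gsub("http\\S+", "", twt)                    # drop URLs
  words   <- strsplit(tolower(no_urls), "[^a-z']+")[[1]]  # crude tokenizer
  hits    <- toy_lexicon$sentiment[match(words, toy_lexicon$word)]
  sum(hits == "positive", na.rm = TRUE) - sum(hits == "negative", na.rm = TRUE)
}

toy_score("A great win for a strong economy https://t.co/abc")  # three positive words
toy_score("Fake news is bad, but we had a great day")           # two negative, one positive
toy_score("Heading to Ohio tomorrow")                           # no lexicon words
```

Because the score is a simple count difference, long tweets with many lexicon words can reach larger absolute values than short ones.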

```{r testing and gathering the data, include=FALSE}
Don_tested <- lapply(Don$text, function(x){sentiment.1(x)})
JOE_tested <- lapply(JOE$text, function(x){sentiment.1(x)})

sentiments <- bind_rows(
  tibble(ID = "TRUMP", score = unlist(map(Don_tested,'score'))),
  tibble(ID = "BIDEN", score = unlist(map(JOE_tested,'score')))
)

```

```{r plot}
ggplot(sentiments, aes(x = score)) +
  facet_grid(~ID) +
  geom_density(bw = .5) +
  geom_vline(xintercept = median(sentiments$score), color = "red", linetype = "dashed") +
  labs(title = "Sentiment Analysis for Biden & Trump", subtitle = "Based On The Last 100 Tweets", caption = "\nSource: Data collected from Twitter's REST API via rtweet") +
  ylab(label = "Density") +
  theme_classic()
```
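The plot above draws one pooled median in both panels. If a separate median per account is preferred, one possible variation is to compute the medians per `ID` and pass them to `geom_vline()` as layer data. This is a sketch under assumptions: `toy_sentiments` holds made-up scores, and `medians` is an illustrative name.

```r
# Made-up scores standing in for the real `sentiments` table.
toy_sentiments <- data.frame(
  ID    = rep(c("TRUMP", "BIDEN"), each = 5),
  score = c(-2, -1, 0, 1, 3, 0, 1, 1, 2, 4)
)

# One median per account instead of a single pooled median.
medians <- aggregate(score ~ ID, data = toy_sentiments, FUN = median)
print(medians)

# Passing `medians` as layer data lets facet_grid() put each line in its own panel.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy_sentiments, aes(x = score)) +
    facet_grid(~ID) +
    geom_density(bw = .5) +
    geom_vline(data = medians, aes(xintercept = score),
               color = "red", linetype = "dashed") +
    theme_classic()
  print(p)
}
```

Because `medians` carries the same `ID` column used for faceting, each dashed line appears only in the panel of its own account.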

Row
-----------------------------------------------------------------

### Total Retweets Trump

```{r}
Retweets.D <- sum(Don$retweet_count)
valueBox(Retweets.D, icon = "fa-comments")
```

### Total likes Trump

```{r}
Likes.D <- sum(Don$favorite_count)
valueBox(Likes.D, icon = "fa-comments", color = ifelse(sum(Don$favorite_count) > sum(JOE$favorite_count), "success","primary"))
```

### Total Retweets JOE

```{r}
Retweets.J <- sum(JOE$retweet_count)
valueBox(Retweets.J, icon = "fa-comments")
```

### Total likes JOE

```{r}
likes.J <- sum(JOE$favorite_count)
valueBox(likes.J, icon = "fa-comments",color = ifelse(sum(JOE$favorite_count) > sum(Don$favorite_count), "success","primary"))
```



What Words Do They Use Most Frequently? 
============================================================

```{r include=FALSE}
library(wordcloud2)
```


```{r data filtering again, include=FALSE}
data("stop_words")
J.TBL <- JOE %>% unnest_tokens(word, text)

J.TBL<- J.TBL %>% anti_join(stop_words)

J.TBL <- J.TBL %>% count(word, sort = TRUE)

J.TBL <- J.TBL %>% filter(!word %in% c('t.co', 'https', 'president', 'i’ll','it’s'), n > 3)
```

```{r data filtering, include=FALSE}
D.TBL <- Don %>% unnest_tokens(word, text)

D.TBL<- D.TBL %>% anti_join(stop_words)

D.TBL <- D.TBL %>% count(word, sort = TRUE)

D.TBL <- D.TBL %>% filter(n >= 3, !word %in% c('t.co', 'https', 'president', 'it’s', "realdonaldtrump", "amp"))

```
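The two filtering chunks above follow the same recipe: split the tweets into words, drop stop words, count, and keep only the frequent words. A minimal base-R sketch of that recipe on made-up input follows. `toy_tweets` and `toy_stop_words` are illustrative assumptions, and the regular-expression tokenizer is only a rough stand-in for `unnest_tokens()`.

```r
# Made-up tweets and a tiny stop-word list, both purely illustrative.
toy_tweets     <- c("We will build back better", "Build the economy back", "Vote, vote, vote!")
toy_stop_words <- c("we", "will", "the")

# Tokenize: lower-case, then split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(toy_tweets), "[^a-z']+"))
tokens <- tokens[tokens != "" & !tokens %in% toy_stop_words]

# Count and sort, then keep words used at least twice.
freq <- sort(table(tokens), decreasing = TRUE)
freq[freq >= 2]
```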
Column
-------------------------------------
### JOE's Words
```{r message=FALSE, warning=FALSE}

J.TBL %>%
 filter(n >= 4) %>%
 ggplot() +
 aes(x = word, weight = n) +
 geom_bar(fill = "#0c4c8a") +
 labs(x = "WORD", y = "COUNT", title = "Joe's Most Used Words", subtitle = "Based on His Last 100 Tweets") +
 coord_flip() +
 theme_classic()
```
Column
---------------------------------------------------------------------------------------------------
### Don's Words
```{r message=FALSE, warning=FALSE}
D.TBL %>%
 filter(n >= 5 ) %>%
 ggplot() +
 aes(x = word, weight = n) +
 geom_bar(fill = "#cb181d") +
 labs(x = "WORD", y = "COUNT", title = "Don's Most Used Words", subtitle = "Based on His Last 100 Tweets") +
 coord_flip() +
 theme_classic()
```

Word Cloud {data-width=350}
===================================================================================================

Sadly, it isn't currently possible to show both word clouds at the same time. It appears to be an issue with [flexdashboard](https://github.com/Lchiffon/wordcloud2/issues/60#issuecomment-608360195), and a fix is being worked on. 

Since Don is far more active on Twitter than Joe (after all, he is a slightly younger man), I have included only the word cloud of his words for now. 

```{r include=FALSE}
library(wordcloud2)

```

### Wordcloud Don 

```{r don words}

wordcloud2(D.TBL, size= 0.7)

```

```{r joe words, eval=FALSE, include=FALSE}
### wordcloud Joe {.no-padding}

wordcloud2(J.TBL, size= 0.7 )
```