Prepare

In my independent analysis, I will be exploring how disability services are discussed on Twitter with a specific focus on sentiment. My research question is: What is the overall ratio of negative to positive sentiments around disability services on Twitter?

Install and Load Packages

First, I installed and loaded the packages I intended to use for my analysis.

install.packages("dplyr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("tidyverse")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("ggplot2")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("readr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("rtweet")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("readxl")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("scales")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("textdata")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(readr)
library(tidyr)
library(rtweet)
library(writexl)
library(readxl)
library(tidytext)
library(textdata)
library(ggplot2)
library(textdata)
library(scales)
## 
## Attaching package: 'scales'
## The following object is masked from 'package:readr':
## 
##     col_factor

Import Data

Next, I imported my data from Twitter.

app_name <- "GWiedrich"
api_key <- "XZcZIGvZCh8R9VTdser7SbJIa"
api_secret_key <- "E5ZLKuiAp6FzJ7X1BZzvUjNp3lhZrp3z6WYsAo8IOGSaswXTn2"
access_token <- "1577294143713611776-VjiSkvfGsn73XClahUhMSFfCyO2B5v"
access_token_secret <- "sxP4wmGmkwqvHCeuhKC2v43622oDOUGomRajLsKaxHfUi"

token <- create_token(
  app = app_name,
  consumer_key = api_key,
  consumer_secret = api_secret_key,
  access_token = access_token,
  access_secret = access_token_secret)
## Warning: `create_token()` was deprecated in rtweet 1.0.0.
## ℹ See vignette('auth') for details
## Saving auth to '/cloud/home/r1591092/.config/R/rtweet/create_token.rds'
get_token()
## Warning: `get_token()` was deprecated in rtweet 1.0.0.
## ℹ Please use `auth_get()` instead.
## <Token>
## <oauth_endpoint>
##  request:   https://api.twitter.com/oauth/request_token
##  authorize: https://api.twitter.com/oauth/authenticate
##  access:    https://api.twitter.com/oauth/access_token
## <oauth_app> rtweet
##   key:    XZcZIGvZCh8R9VTdser7SbJIa
##   secret: <hidden>
## <credentials> oauth_token, oauth_token_secret
## ---

I created a dictionary using the follow hashtags and phrases:

  • #disabilityservices

  • #DRO

  • #DSS

  • disability services

  • disability resources

This dictionary was used to find the data from Twitter for my analysis.

dis_dictionary <- c("#disabilityservices OR #DRO",
                     '"#DSS"',
                     '"#disabilityresources"',
                     '"disability services"',
                     '"disability resources"')

dis_tweets <- search_tweets2(dis_dictionary,
                              n=5000,
                              include_rts = FALSE)

write_xlsx(dis_tweets, "dis_tweets.xlsx")

Wrangle

After I had my tweets, I started to get them into a format where I could create a meaningful model for my research question.

dis_text <-
  dis_tweets %>%
  filter(lang == "en") %>%
  select(id, created_at, full_text)
tweet_tokens <- 
  dis_tweets %>%
  unnest_tokens(output = word, 
                input = full_text, 
                token = "words")
tidy_tweets <-
  tweet_tokens %>%
  anti_join(stop_words, by = "word")

count(tidy_tweets, word, sort = T)
## # A tibble: 6,450 × 2
##    word           n
##    <chr>      <int>
##  1 https        685
##  2 t.co         685
##  3 disability   418
##  4 dss          393
##  5 services     358
##  6 ji           143
##  7 amp          114
##  8 people        95
##  9 の            83
## 10 saint         74
## # … with 6,440 more rows
## ℹ Users data at users_data()
tidy_tweets <-
  tweet_tokens %>%
  anti_join(stop_words) %>%
  filter(!word == c("https", "t.co")) %>%
  filter(lang == "en") %>%
  select(id, text, word)
## Joining with `by = join_by(word)`
tidy_tweets
## # A tibble: 13,402 × 3
##         id text                                                            word 
##      <dbl> <chr>                                                           <chr>
##  1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pico…
##  2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… bt   
##  3 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… blue…
##  4 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… clas…
##  5 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… spp  
##  6 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… rasp…
##  7 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pi   
##  8 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pico 
##  9 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… subs…
## 10 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… https
## # … with 13,392 more rows
## ℹ Users data at users_data()

I decided to use the four sentiment dictionaries we used in the walkthrough to get a variety of perspectives to use together.

afinn <- get_sentiments("afinn")
bing <- get_sentiments("bing")
nrc <- get_sentiments("nrc")
loughran <- get_sentiments("loughran")

Explore

Then I began manipulating my data and seeing what sense I could make of it.

sentiment_afinn <- inner_join(tidy_tweets, afinn, by = "word")

sentiment_afinn
## # A tibble: 1,042 × 4
##         id text                                                      word  value
##      <dbl> <chr>                                                     <chr> <dbl>
##  1 1.63e18 "are falling on Earth one by one,  disappearing deep in … fall…    -1
##  2 1.63e18 "Wow, listen to this Meidas Touch Network video re the G… wow       4
##  3 1.63e18 "Wow, listen to this Meidas Touch Network video re the G… sign…     1
##  4 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… shar…     1
##  5 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… chan…     2
##  6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… fool     -2
##  7 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… smart     1
##  8 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… smart     1
##  9 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… cool      1
## 10 1.63e18 "@janeway888 It is #GameOver for the wicked who want to … wick…    -2
## # … with 1,032 more rows
## ℹ Users data at users_data()
sentiment_bing <- inner_join(tidy_tweets, bing, by = "word")

sentiment_bing
## # A tibble: 1,173 × 4
##         id text                                                    word  senti…¹
##      <dbl> <chr>                                                   <chr> <chr>  
##  1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… clas… positi…
##  2 1.63e18 "are falling on Earth one by one,  disappearing deep i… fall… negati…
##  3 1.63e18 "are falling on Earth one by one,  disappearing deep i… dark  negati…
##  4 1.63e18 "Wow, listen to this Meidas Touch Network video re the… wow   positi…
##  5 1.63e18 "Wow, listen to this Meidas Touch Network video re the… sign… positi…
##  6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a … motl… negati…
##  7 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a … fool  negati…
##  8 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… smart positi…
##  9 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… smart positi…
## 10 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… cool  positi…
## # … with 1,163 more rows, and abbreviated variable name ¹​sentiment
## ℹ Users data at users_data()
sentiment_nrc <- inner_join(tidy_tweets, nrc, by = "word")
## Warning in inner_join(tidy_tweets, nrc, by = "word"): Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 27 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
##   warning.
sentiment_nrc
## # A tibble: 6,108 × 4
##         id text                                                    word  senti…¹
##      <dbl> <chr>                                                   <chr> <chr>  
##  1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… clas… positi…
##  2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… subs… antici…
##  3 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… antici…
##  4 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… joy    
##  5 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… positi…
##  6 1.63e18 "are falling on Earth one by one,  disappearing deep i… fall… negati…
##  7 1.63e18 "are falling on Earth one by one,  disappearing deep i… fall… sadness
##  8 1.63e18 "are falling on Earth one by one,  disappearing deep i… endl… anger  
##  9 1.63e18 "are falling on Earth one by one,  disappearing deep i… endl… fear   
## 10 1.63e18 "are falling on Earth one by one,  disappearing deep i… endl… joy    
## # … with 6,098 more rows, and abbreviated variable name ¹​sentiment
## ℹ Users data at users_data()
ts_plot(dis_tweets, by = "days")

summary_bing <- sentiment_bing %>% 
  count(sentiment, sort = TRUE) %>% 
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "bing") %>%
  relocate(lexicon)

summary_bing
## # A tibble: 1 × 4
##   lexicon negative positive sentiment
##   <chr>      <int>    <int>     <int>
## 1 bing         499      674       175
summary_afinn <- sentiment_afinn %>% 
  summarise(sentiment = sum(value)) %>% 
  mutate(lexicon = "AFINN") %>%
  relocate(lexicon)

summary_afinn
## # A tibble: 1 × 2
##   lexicon sentiment
##   <chr>       <dbl>
## 1 AFINN         443
summary_nrc <- sentiment_nrc %>% 
  filter(sentiment == c("positive", "negative")) %>%
  count(sentiment, sort = TRUE) %>% 
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "nrc") %>%
  relocate(lexicon)

summary_nrc
## # A tibble: 1 × 4
##   lexicon negative positive sentiment
##   <chr>      <int>    <int>     <int>
## 1 nrc          526      617        91
sentiment_loughran <- inner_join(tidy_tweets, loughran, by = "word")
## Warning in inner_join(tidy_tweets, loughran, by = "word"): Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 1314 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
##   warning.
summary_loughran <- sentiment_loughran %>% 
  filter(sentiment == c("positive", "negative")) %>%
  count(sentiment, sort = TRUE) %>% 
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "loughran") %>%
  relocate(lexicon)
## Warning: There was 1 warning in `filter()`.
## ℹ In argument: `sentiment == c("positive", "negative")`.
## Caused by warning in `sentiment == c("positive", "negative")`:
## ! longer object length is not a multiple of shorter object length
summary_loughran
## # A tibble: 1 × 4
##   lexicon  negative positive sentiment
##   <chr>       <int>    <int>     <int>
## 1 loughran      170       77       -93
sentiment_joined <- full_join(summary_loughran, summary_nrc)
## Joining with `by = join_by(lexicon, negative, positive, sentiment)`
sentiment_joined <- full_join(sentiment_joined, summary_afinn)
## Joining with `by = join_by(lexicon, sentiment)`
sentiment_joined <- full_join(sentiment_joined, summary_bing)
## Joining with `by = join_by(lexicon, negative, positive, sentiment)`
head(sentiment_bing)
## # A tibble: 6 × 4
##        id text                                                     word  senti…¹
##     <dbl> <chr>                                                    <chr> <chr>  
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… clas… positi…
## 2 1.63e18 "are falling on Earth one by one,  disappearing deep in… fall… negati…
## 3 1.63e18 "are falling on Earth one by one,  disappearing deep in… dark  negati…
## 4 1.63e18 "Wow, listen to this Meidas Touch Network video re the … wow   positi…
## 5 1.63e18 "Wow, listen to this Meidas Touch Network video re the … sign… positi…
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a l… motl… negati…
## # … with abbreviated variable name ¹​sentiment
## ℹ Users data at users_data()
head(sentiment_afinn)
## # A tibble: 6 × 4
##        id text                                                       word  value
##     <dbl> <chr>                                                      <chr> <dbl>
## 1 1.63e18 "are falling on Earth one by one,  disappearing deep in t… fall…    -1
## 2 1.63e18 "Wow, listen to this Meidas Touch Network video re the GO… wow       4
## 3 1.63e18 "Wow, listen to this Meidas Touch Network video re the GO… sign…     1
## 4 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… shar…     1
## 5 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… chan…     2
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… fool     -2
## ℹ Users data at users_data()
head(sentiment_loughran)
## # A tibble: 6 × 4
##        id text                                                     word  senti…¹
##     <dbl> <chr>                                                    <chr> <chr>  
## 1 1.63e18 "are falling on Earth one by one,  disappearing deep in… disa… negati…
## 2 1.63e18 "at the Nairobi (Kenya) Declaration.Since 1998, the Exe… hidd… uncert…
## 3 1.63e18 "Questions about some filing phrases? It is important t… ques… negati…
## 4 1.63e18 "Smart Home Devices Unwanted Connection: Who Has Contro… unwa… negati…
## 5 1.63e18 "NDIS has transformed daily life and given us the suppo… succ… positi…
## 6 1.63e18 "With NDIS, every day is an opportunity to thrive and s… oppo… positi…
## # … with abbreviated variable name ¹​sentiment
## ℹ Users data at users_data()
head(sentiment_nrc)
## # A tibble: 6 × 4
##        id text                                                     word  senti…¹
##     <dbl> <chr>                                                    <chr> <chr>  
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… clas… positi…
## 2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… subs… antici…
## 3 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… antici…
## 4 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… joy    
## 5 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… positi…
## 6 1.63e18 "are falling on Earth one by one,  disappearing deep in… fall… negati…
## # … with abbreviated variable name ¹​sentiment
## ℹ Users data at users_data()

I ended up wanting to look at a wordcloud, so I installed and loaded the appropriate package and proceeded to make a word cloud of the words in my tweets.

install.packages("wordcloud2")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(wordcloud2)
tidy_tweets_count <- count(tidy_tweets, word, sort = TRUE)

wordcloud2(tidy_tweets_count)

Model

Looking at the sentiment values for my data across the four different sentiment dictionaries, we can see that there is generally a more positive sentiment around this topic on Twitter recently.

sentiment_joined
## # A tibble: 4 × 4
##   lexicon  negative positive sentiment
##   <chr>       <int>    <int>     <dbl>
## 1 loughran      170       77       -93
## 2 nrc           526      617        91
## 3 AFINN          NA       NA       443
## 4 bing          499      674       175

Communicate

  1. Findings. What did you ultimately find? How do your “data products” help to illustrate these findings? What conclusions can you draw from your analysis?

    • I found that there was a generally positive sentiment around disability services on Twitter recently.
  2. Discussion. What were some of the strengths and weaknesses of your analysis? How might your audience use this information? How might you revisit or improve upon this analysis in the future?

    • This was hard for me to wrap my head around what I should be showing or demonstrating. This is pretty typical when I’m learning new things, so I’m hoping it will click as we continue this semester.

    • I don’t think my data was the best to work with. I realized a bit too far into my analysis that I had forgotten to remove some useless stopwords (see the word cloud) and I wasn’t sure when I reached the point where I had done what I set out to do.

    • I feel like I can figure out what to do technically, but I’m still having a hard time with coming up with something original and executing it confidently.