In my independent analysis, I explore how disability services are discussed on Twitter, with a specific focus on sentiment. My research question is: what is the overall ratio of negative to positive sentiment around disability services on Twitter?
First, I installed and loaded the packages I intended to use for my analysis.
install.packages("dplyr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("tidyverse")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("ggplot2")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("readr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("rtweet")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("readxl")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("scales")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("textdata")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(readr)
library(tidyr)
library(rtweet)
library(writexl)
library(readxl)
library(tidytext)
library(textdata)
library(ggplot2)
library(scales)
##
## Attaching package: 'scales'
## The following object is masked from 'package:readr':
##
## col_factor
Next, I imported my data from Twitter.
app_name <- "GWiedrich"
api_key <- "YOUR_API_KEY"                        # credentials redacted
api_secret_key <- "YOUR_API_SECRET_KEY"
access_token <- "YOUR_ACCESS_TOKEN"
access_token_secret <- "YOUR_ACCESS_TOKEN_SECRET"
token <- create_token(
  app = app_name,
  consumer_key = api_key,
  consumer_secret = api_secret_key,
  access_token = access_token,
  access_secret = access_token_secret)
## Warning: `create_token()` was deprecated in rtweet 1.0.0.
## ℹ See vignette('auth') for details
## Saving auth to '/cloud/home/r1591092/.config/R/rtweet/create_token.rds'
get_token()
## Warning: `get_token()` was deprecated in rtweet 1.0.0.
## ℹ Please use `auth_get()` instead.
## <Token>
## <oauth_endpoint>
## request: https://api.twitter.com/oauth/request_token
## authorize: https://api.twitter.com/oauth/authenticate
## access: https://api.twitter.com/oauth/access_token
## <oauth_app> rtweet
## key: <redacted>
## secret: <hidden>
## <credentials> oauth_token, oauth_token_secret
## ---
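As the warnings note, create_token() and get_token() are deprecated in rtweet 1.0. A minimal sketch of the current approach with the same four credentials, assuming rtweet_bot() and auth_as() as described in vignette('auth'):
auth <- rtweet_bot(api_key = api_key,
                   api_secret = api_secret_key,
                   access_token = access_token,
                   access_secret = access_token_secret)
auth_as(auth)  # subsequent rtweet calls use this auth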
I created a dictionary using the following hashtags and phrases:
#disabilityservices
#DRO
#DSS
#disabilityresources
disability services
disability resources
This dictionary was used to pull the matching tweets from Twitter for my analysis.
dis_dictionary <- c("#disabilityservices OR #DRO",
                    '"#DSS"',
                    '"#disabilityresources"',
                    '"disability services"',
                    '"disability resources"')
dis_tweets <- search_tweets2(dis_dictionary,
                             n = 5000,
                             include_rts = FALSE)
write_xlsx(dis_tweets, "dis_tweets.xlsx")
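Because the tweets are cached to disk, a later session could reload them with readxl instead of re-querying the API (a sketch, assuming the file sits in the working directory):
dis_tweets <- read_xlsx("dis_tweets.xlsx")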
After I had my tweets, I started getting them into a format where I could build a meaningful model for my research question. First I tokenized the full pull into words; filtering to English and dropping URL fragments happens in a second pass below.
tweet_tokens <-
  dis_tweets %>%
  unnest_tokens(output = word,
                input = full_text,
                token = "words")
tidy_tweets <-
  tweet_tokens %>%
  anti_join(stop_words, by = "word")
count(tidy_tweets, word, sort = TRUE)
## # A tibble: 6,450 × 2
## word n
## <chr> <int>
## 1 https 685
## 2 t.co 685
## 3 disability 418
## 4 dss 393
## 5 services 358
## 6 ji 143
## 7 amp 114
## 8 people 95
## 9 の 83
## 10 saint 74
## # … with 6,440 more rows
## ℹ Users data at users_data()
tidy_tweets <-
  tweet_tokens %>%
  anti_join(stop_words, by = "word") %>%
  filter(!word %in% c("https", "t.co")) %>%
  filter(lang == "en") %>%
  select(id, text, word)
tidy_tweets
## # A tibble: 13,402 × 3
## id text word
## <dbl> <chr> <chr>
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pico…
## 2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… bt
## 3 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… blue…
## 4 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… clas…
## 5 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… spp
## 6 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… rasp…
## 7 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pi
## 8 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… pico
## 9 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… subs…
## 10 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberry Pi Pi… https
## # … with 13,392 more rows
## ℹ Users data at users_data()
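The printed tokens show that "https" and "t.co" URL fragments still slip through and end up among the most frequent words. One possible fix, assuming stringr (installed with the tidyverse), would be to strip the t.co links from the text before tokenizing so the fragments never become tokens; a sketch:
library(stringr)
# Hypothetical pre-tokenization cleanup: delete t.co link fragments,
# then tokenize the cleaned text as above.
tweets_no_urls <-
  dis_tweets %>%
  mutate(full_text = str_remove_all(full_text, "https://t\\.co/\\S+"))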
I decided to use the four sentiment dictionaries we used in the walkthrough so I could compare perspectives across lexicons.
afinn <- get_sentiments("afinn")
bing <- get_sentiments("bing")
nrc <- get_sentiments("nrc")
loughran <- get_sentiments("loughran")
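These lexicons are structured differently, which shapes the summaries below: AFINN scores each word on an integer scale from -5 to 5, while bing, nrc, and loughran attach categorical labels. A quick check:
head(afinn)  # columns: word, value (numeric score from -5 to 5)
head(bing)   # columns: word, sentiment ("positive"/"negative")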
Then I began manipulating my data and seeing what sense I could make of it.
sentiment_afinn <- inner_join(tidy_tweets, afinn, by = "word")
sentiment_afinn
## # A tibble: 1,042 × 4
## id text word value
## <dbl> <chr> <chr> <dbl>
## 1 1.63e18 "are falling on Earth one by one, disappearing deep in … fall… -1
## 2 1.63e18 "Wow, listen to this Meidas Touch Network video re the G… wow 4
## 3 1.63e18 "Wow, listen to this Meidas Touch Network video re the G… sign… 1
## 4 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… shar… 1
## 5 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… chan… 2
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a le… fool -2
## 7 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… smart 1
## 8 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… smart 1
## 9 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool T… cool 1
## 10 1.63e18 "@janeway888 It is #GameOver for the wicked who want to … wick… -2
## # … with 1,032 more rows
## ℹ Users data at users_data()
sentiment_bing <- inner_join(tidy_tweets, bing, by = "word")
sentiment_bing
## # A tibble: 1,173 × 4
## id text word senti…¹
## <dbl> <chr> <chr> <chr>
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… clas… positi…
## 2 1.63e18 "are falling on Earth one by one, disappearing deep i… fall… negati…
## 3 1.63e18 "are falling on Earth one by one, disappearing deep i… dark negati…
## 4 1.63e18 "Wow, listen to this Meidas Touch Network video re the… wow positi…
## 5 1.63e18 "Wow, listen to this Meidas Touch Network video re the… sign… positi…
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a … motl… negati…
## 7 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a … fool negati…
## 8 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… smart positi…
## 9 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… smart positi…
## 10 1.63e18 "Smart Home Devices Got a Smart Plug? Here Are 10 Cool… cool positi…
## # … with 1,163 more rows, and abbreviated variable name ¹sentiment
## ℹ Users data at users_data()
sentiment_nrc <- inner_join(tidy_tweets, nrc, by = "word")
## Warning in inner_join(tidy_tweets, nrc, by = "word"): Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 27 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
## warning.
sentiment_nrc
## # A tibble: 6,108 × 4
## id text word senti…¹
## <dbl> <chr> <chr> <chr>
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… clas… positi…
## 2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspber… subs… antici…
## 3 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… antici…
## 4 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… joy
## 5 1.63e18 "I saw how much of an impact Recreational Therapy can … recr… positi…
## 6 1.63e18 "are falling on Earth one by one, disappearing deep i… fall… negati…
## 7 1.63e18 "are falling on Earth one by one, disappearing deep i… fall… sadness
## 8 1.63e18 "are falling on Earth one by one, disappearing deep i… endl… anger
## 9 1.63e18 "are falling on Earth one by one, disappearing deep i… endl… fear
## 10 1.63e18 "are falling on Earth one by one, disappearing deep i… endl… joy
## # … with 6,098 more rows, and abbreviated variable name ¹sentiment
## ℹ Users data at users_data()
I also plotted the volume of tweets by day:
ts_plot(dis_tweets, by = "days")
summary_bing <- sentiment_bing %>%
  count(sentiment, sort = TRUE) %>%
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "bing") %>%
  relocate(lexicon)
summary_bing
## # A tibble: 1 × 4
## lexicon negative positive sentiment
## <chr> <int> <int> <int>
## 1 bing 499 674 175
summary_afinn <- sentiment_afinn %>%
  summarise(sentiment = sum(value)) %>%
  mutate(lexicon = "AFINN") %>%
  relocate(lexicon)
summary_afinn
## # A tibble: 1 × 2
## lexicon sentiment
## <chr> <dbl>
## 1 AFINN 443
summary_nrc <- sentiment_nrc %>%
  filter(sentiment %in% c("positive", "negative")) %>%
  count(sentiment, sort = TRUE) %>%
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "nrc") %>%
  relocate(lexicon)
summary_nrc
## # A tibble: 1 × 4
## lexicon negative positive sentiment
## <chr> <int> <int> <int>
## 1 nrc 526 617 91
sentiment_loughran <- inner_join(tidy_tweets, loughran, by = "word")
## Warning in inner_join(tidy_tweets, loughran, by = "word"): Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 1314 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
## warning.
summary_loughran <- sentiment_loughran %>%
  filter(sentiment %in% c("positive", "negative")) %>%
  count(sentiment, sort = TRUE) %>%
  spread(sentiment, n) %>%
  mutate(sentiment = positive - negative) %>%
  mutate(lexicon = "loughran") %>%
  relocate(lexicon)
summary_loughran
## # A tibble: 1 × 4
## lexicon negative positive sentiment
## <chr> <int> <int> <int>
## 1 loughran 170 77 -93
sentiment_joined <- bind_rows(summary_loughran, summary_nrc, summary_afinn, summary_bing)
head(sentiment_bing)
## # A tibble: 6 × 4
## id text word senti…¹
## <dbl> <chr> <chr> <chr>
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… clas… positi…
## 2 1.63e18 "are falling on Earth one by one, disappearing deep in… fall… negati…
## 3 1.63e18 "are falling on Earth one by one, disappearing deep in… dark negati…
## 4 1.63e18 "Wow, listen to this Meidas Touch Network video re the … wow positi…
## 5 1.63e18 "Wow, listen to this Meidas Touch Network video re the … sign… positi…
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a l… motl… negati…
## # … with abbreviated variable name ¹sentiment
## ℹ Users data at users_data()
head(sentiment_afinn)
## # A tibble: 6 × 4
## id text word value
## <dbl> <chr> <chr> <dbl>
## 1 1.63e18 "are falling on Earth one by one, disappearing deep in t… fall… -1
## 2 1.63e18 "Wow, listen to this Meidas Touch Network video re the GO… wow 4
## 3 1.63e18 "Wow, listen to this Meidas Touch Network video re the GO… sign… 1
## 4 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… shar… 1
## 5 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… chan… 2
## 6 1.63e18 "$ALC $DRO ASX shares: Invest in these 2 stocks for a leg… fool -2
## ℹ Users data at users_data()
head(sentiment_loughran)
## # A tibble: 6 × 4
## id text word senti…¹
## <dbl> <chr> <chr> <chr>
## 1 1.63e18 "are falling on Earth one by one, disappearing deep in… disa… negati…
## 2 1.63e18 "at the Nairobi (Kenya) Declaration.Since 1998, the Exe… hidd… uncert…
## 3 1.63e18 "Questions about some filing phrases? It is important t… ques… negati…
## 4 1.63e18 "Smart Home Devices Unwanted Connection: Who Has Contro… unwa… negati…
## 5 1.63e18 "NDIS has transformed daily life and given us the suppo… succ… positi…
## 6 1.63e18 "With NDIS, every day is an opportunity to thrive and s… oppo… positi…
## # … with abbreviated variable name ¹sentiment
## ℹ Users data at users_data()
head(sentiment_nrc)
## # A tibble: 6 × 4
## id text word senti…¹
## <dbl> <chr> <chr> <chr>
## 1 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… clas… positi…
## 2 1.63e18 "PicoDro BT via Bluetooth Classic SPP with the Raspberr… subs… antici…
## 3 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… antici…
## 4 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… joy
## 5 1.63e18 "I saw how much of an impact Recreational Therapy can h… recr… positi…
## 6 1.63e18 "are falling on Earth one by one, disappearing deep in… fall… negati…
## # … with abbreviated variable name ¹sentiment
## ℹ Users data at users_data()
I also wanted to look at a word cloud, so I installed and loaded the wordcloud2 package and made a word cloud of the words in my tweets.
install.packages("wordcloud2")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(wordcloud2)
tidy_tweets_count <- count(tidy_tweets, word, sort = TRUE)
wordcloud2(tidy_tweets_count)
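The cloud is dominated by a handful of very frequent tokens. One way to make it more readable would be to drop rare words before plotting (a sketch; the cutoff of 5 is arbitrary):
wordcloud2(filter(tidy_tweets_count, n >= 5))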
Looking at the sentiment values for my data across the four sentiment dictionaries, three of the four (AFINN, bing, and nrc) show a net positive sentiment around this topic on Twitter recently; only the loughran lexicon, which was built for financial text, comes out net negative.
sentiment_joined
## # A tibble: 4 × 4
## lexicon negative positive sentiment
## <chr> <int> <int> <dbl>
## 1 loughran 170 77 -93
## 2 nrc 526 617 91
## 3 AFINN NA NA 443
## 4 bing 499 674 175
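My research question asks for a ratio rather than a difference, so the table can also be read directly against that question by dividing negative counts by positive counts for the lexicons that report counts (AFINN only reports a summed score, so it drops out of this view); a minimal sketch:
sentiment_joined %>%
  filter(!is.na(negative)) %>%
  mutate(neg_pos_ratio = negative / positive)
On the counts above, that ratio is roughly 0.74 for bing, 0.85 for nrc, and 2.2 for loughran, which matches the net-positive reading from two of the three count-based lexicons.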
Wrapping my head around what I should be showing or demonstrating was hard for me. That's pretty typical when I'm learning new things, so I'm hoping it will click as we continue this semester.
I don't think my data was the best to work with. I realized a bit too far into my analysis that I had left in some useless tokens, like the "https" and "t.co" URL fragments visible in the word cloud, and I wasn't sure when I had reached the point of having done what I set out to do.
I feel like I can figure out the technical steps, but I'm still having a hard time coming up with something original and executing it confidently.