TidyVerse Recipe

How to get Twitter data with rtweet in R

Install and load packages (and vignettes for further documentation)

#devtools::install_github("mkearney/rtweet") # latest working version of rtweet; this is the preferred version to use
#packageVersion("rtweet")

#install.packages("tidyverse")
library(rtweet)
library(tidyverse)

## -- Attaching packages ---------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --

## v ggplot2 2.2.1     v purrr   0.2.5
## v tibble  1.4.2     v dplyr   0.7.5
## v tidyr   0.8.1     v stringr 1.3.1
## v readr   1.1.1     v forcats 0.3.0

## -- Conflicts ------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter()  masks stats::filter()
## x purrr::flatten() masks rtweet::flatten()
## x dplyr::lag()     masks stats::lag()

library(knitr)

## quick overview of rtweet functions
## vignette("intro", package = "rtweet")
## working with the stream
## vignette("stream", package = "rtweet")
## troubleshooting
## vignette("FAQ", package = "rtweet")

Twitter API

Main Steps:

1) Apply for Twitter API access at: https://developer.twitter.com/en/apply-for-access (a Twitter account is required to go through the process)

Fill out the application form by answering the following questions:

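a) The core use case, intent, or business purpose for your use of the Twitter APIs

b) If you intend to analyze Tweets, Twitter users, or their content, share details about the analyses you plan to conduct and the methods or techniques

c) If your use involves Tweeting, Retweeting, or liking content, share how you will interact with Twitter users or their content

d) If you’ll display Twitter content off of Twitter, explain how and where Tweets and Twitter content will be displayed to users of your product or service, including whether Tweets and Twitter content will be displayed at row level or aggregated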

2) Create a Twitter App and obtain access tokens

See the auth vignette (https://rtweet.info/articles/auth.html) for instructions on obtaining access to Twitter’s APIs

3) Create a token in R. The commands are commented out and the keys masked because this step depends on the individual Twitter app created

## appname <- "twitter_analysis"
## key <- "12345678901234567890"
## secret <- "12345678901234567890abcdefghijk"
# create token named "twitter_token"
## twitter_token <- rtweet::create_token(app = appname,
##                                       consumer_key = key,
##                                       consumer_secret = secret)

Creating the token in R opens an interactive authentication step in the browser. Click the “Authorize the app” button to finalize the process.

Save twitter_token in your home directory. The commands are commented out because the token is created interactively and cannot be reproduced within an Rmd file.

# path of home directory
## home_directory <- "C:/DATA/R Working Dir"
# combine with name for token
## file_name <- file.path(home_directory,
##                        "twitter_token.rds")
# save token to home directory
## saveRDS(twitter_token, file = file_name)

# assuming you followed the procedures to create "file_name"
# in the previous code chunk, the code below should
# create and save your environment variable
# (fill = TRUE ends the entry with a newline)
## cat(paste0("TWITTER_PAT=", file_name),
##     file = file.path(home_directory, ".Renviron"),
##     append = TRUE, fill = TRUE)
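
The save-and-reload round trip can be tried without a real token. The following is a minimal sketch under stated assumptions: a folder inside the session's temporary directory stands in for the home directory, and a plain list stands in for twitter_token. Both are placeholders for illustration and are not part of the original recipe. Note that readRenviron() sets TWITTER_PAT for the current session only.

```r
# Hypothetical stand-ins: a temp folder instead of the real home
# directory, and a dummy list instead of a real rtweet token.
home_directory <- file.path(normalizePath(tempdir(), winslash = "/"),
                            "fake_home")
dir.create(home_directory, showWarnings = FALSE)
dummy_token <- list(app = "twitter_analysis")

# Save the (dummy) token as an .rds file
file_name <- file.path(home_directory, "twitter_token.rds")
saveRDS(dummy_token, file = file_name)

# Append the environment variable to .Renviron; fill = TRUE ends the
# entry with a newline so the line is complete
renviron <- file.path(home_directory, ".Renviron")
cat(paste0("TWITTER_PAT=", file_name),
    file = renviron, append = TRUE, fill = TRUE)

# Load the file into the current session and read the token back
readRenviron(renviron)
token_path <- Sys.getenv("TWITTER_PAT")
restored <- readRDS(token_path)
identical(restored, dummy_token)
```

With a real token, the same steps would point at the actual home directory, and a new R session would pick up TWITTER_PAT from .Renviron at startup.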

Start collecting and analyzing some Twitter data

## search for 3000 tweets using the rstats hashtag
rt <- rtweet::search_tweets("#rstats", n = 3000, include_rts = FALSE)
## preview tweets data
rt %>% dplyr::glimpse(10)

## Observations: 2,865
## Variables: 88
## $ user_id                 <chr> ...
## $ status_id               <chr> ...
## $ created_at              <dttm> ...
## $ screen_name             <chr> ...
## $ text                    <chr> ...
## $ source                  <chr> ...
## $ display_text_width      <dbl> ...
## $ reply_to_status_id      <chr> ...
## $ reply_to_user_id        <chr> ...
## $ reply_to_screen_name    <chr> ...
## $ is_quote                <lgl> ...
## $ is_retweet              <lgl> ...
## $ favorite_count          <int> ...
## $ retweet_count           <int> ...
## $ hashtags                <list> ...
## $ symbols                 <list> ...
## $ urls_url                <list> ...
## $ urls_t.co               <list> ...
## $ urls_expanded_url       <list> ...
## $ media_url               <list> ...
## $ media_t.co              <list> ...
## $ media_expanded_url      <list> ...
## $ media_type              <list> ...
## $ ext_media_url           <list> ...
## $ ext_media_t.co          <list> ...
## $ ext_media_expanded_url  <list> ...
## $ ext_media_type          <chr> ...
## $ mentions_user_id        <list> ...
## $ mentions_screen_name    <list> ...
## $ lang                    <chr> ...
## $ quoted_status_id        <chr> ...
## $ quoted_text             <chr> ...
## $ quoted_created_at       <dttm> ...
## $ quoted_source           <chr> ...
## $ quoted_favorite_count   <int> ...
## $ quoted_retweet_count    <int> ...
## $ quoted_user_id          <chr> ...
## $ quoted_screen_name      <chr> ...
## $ quoted_name             <chr> ...
## $ quoted_followers_count  <int> ...
## $ quoted_friends_count    <int> ...
## $ quoted_statuses_count   <int> ...
## $ quoted_location         <chr> ...
## $ quoted_description      <chr> ...
## $ quoted_verified         <lgl> ...
## $ retweet_status_id       <chr> ...
## $ retweet_text            <chr> ...
## $ retweet_created_at      <dttm> ...
## $ retweet_source          <chr> ...
## $ retweet_favorite_count  <int> ...
## $ retweet_retweet_count   <int> ...
## $ retweet_user_id         <chr> ...
## $ retweet_screen_name     <chr> ...
## $ retweet_name            <chr> ...
## $ retweet_followers_count <int> ...
## $ retweet_friends_count   <int> ...
## $ retweet_statuses_count  <int> ...
## $ retweet_location        <chr> ...
## $ retweet_description     <chr> ...
## $ retweet_verified        <lgl> ...
## $ place_url               <chr> ...
## $ place_name              <chr> ...
## $ place_full_name         <chr> ...
## $ place_type              <chr> ...
## $ country                 <chr> ...
## $ country_code            <chr> ...
## $ geo_coords              <list> ...
## $ coords_coords           <list> ...
## $ bbox_coords             <list> ...
## $ status_url              <chr> ...
## $ name                    <chr> ...
## $ location                <chr> ...
## $ description             <chr> ...
## $ url                     <chr> ...
## $ protected               <lgl> ...
## $ followers_count         <int> ...
## $ friends_count           <int> ...
## $ listed_count            <int> ...
## $ statuses_count          <int> ...
## $ favourites_count        <int> ...
## $ account_created_at      <dttm> ...
## $ verified                <lgl> ...
## $ profile_url             <chr> ...
## $ profile_expanded_url    <chr> ...
## $ account_lang            <chr> ...
## $ profile_banner_url      <chr> ...
## $ profile_background_url  <chr> ...
## $ profile_image_url       <chr> ...
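
Several of the columns listed above, such as hashtags, are list columns, so they need flattening before they can be counted. A minimal sketch, assuming a small hand-made tibble (toy, with invented values) that mimics a few of those columns; real data would come from search_tweets():

```r
library(dplyr)

# Hypothetical toy data with a few of the columns shown by glimpse()
toy <- tibble(
  screen_name    = c("user_a", "user_b", "user_a"),
  favorite_count = c(3L, 10L, 0L),
  retweet_count  = c(1L, 4L, 0L),
  hashtags       = list(c("rstats", "dataviz"), "RStats", NA_character_)
)

# Rank tweets by combined favorites and retweets
top_tweets <- toy %>%
  mutate(engagement = favorite_count + retweet_count) %>%
  arrange(desc(engagement)) %>%
  select(screen_name, engagement)

# hashtags is a list column: flatten it, lower-case it, and tabulate
# (table() drops the NA used here for a tweet without hashtags)
tag_counts <- sort(table(tolower(unlist(toy$hashtags))), decreasing = TRUE)

top_tweets
tag_counts
```

Lower-casing before tabulating treats #rstats and #RStats as the same tag, which is a design choice of this sketch rather than something rtweet does.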

## plot time series
ts_plot(rt) +
  ggplot2::theme_minimal() +
  ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of #rstats Twitter statuses from past 9 days",
    caption = "\nSource: Data collected from Twitter's REST API via rtweet"
  )

Maps

## search for 1000 tweets sent from the US
rt <- search_tweets(
  "lang:en", geocode = lookup_coords("usa"), n = 1000
)

## create lat/lng variables using all available tweet and profile geo-location data
rt <- lat_lng(rt)

## plot state boundaries
par(mar = c(0, 0, 0, 0))
maps::map("state", lwd = .25)

## plot lat and lng points onto state map
with(rt, points(lng, lat, pch = 20, cex = .75, col = rgb(0, .3, .7, .75)))
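
Not every tweet carries usable geo-location data, so it can help to check how many rows actually end up on the map. A minimal sketch, assuming a hand-made data frame (toy, with invented values) that has the lat and lng columns used in the plot above, with NA where no location is available:

```r
# Hypothetical toy data with lat/lng columns like those plotted above;
# NA marks tweets with no usable geo-location
toy <- data.frame(
  screen_name = c("user_a", "user_b", "user_c"),
  lat = c(40.7, NA, 34.1),
  lng = c(-74.0, NA, -118.2)
)

# Keep only rows where both coordinates are present
geo <- toy[!is.na(toy$lat) & !is.na(toy$lng), ]

# Share of tweets that can be placed on the map
share_mapped <- nrow(geo) / nrow(toy)
share_mapped
```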

Streaming tweets

## stream tweets mentioning Machine Learning for 60 seconds
stream_tweets(
  "Machine Learning",
  timeout = 60,
  file_name = "tweetsaboutml.json",
  parse = FALSE
)

## Streaming tweets for 60 seconds...

## Finished streaming tweets!

## streaming data saved as tweetsaboutml.json

## read in the data as a tidy tbl data frame
mlt <- parse_stream("tweetsaboutml.json")

## opening file input connection.

## Found 5 records...
## Imported 5 records. Simplifying...

## closing file input connection.

select(mlt, location, text) %>% kable()

location	text
Golden, CO	Amazon debuts a scale model autonomous car to teach developers machine learning. https://t.co/99y2DGyQVI
Dortmund, Deutschland	Microsoft and @Nielsen aim to change the #CPG industry by bringing data and machine learning together. Read more on @Forbes: https://t.co/1WFK103jf1 https://t.co/pEe6lxsy9A
HollyWood, CA	Amazon debuts a scale model autonomous car to teach developers machine learning - https://t.co/SBTAyOwNcB https://t.co/0kLqPyMR4W
<U+093F>,	“Pattern Recognition and Machine Learning” by Dr. @ChrisBishopMSFT is now available for free. Download today for your introduction to the fields of pattern recognition and machine learning: https://t.co/w0GwuuGPlo
San Francisco Bay Area, CA	The GANfather @goodfellow_ian talking about Adversarial Machine Learning @Perimeter. The ML and Theoretical Physics worlds collide! https://t.co/8zn6jUiJSi
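
Tweet text often contains line breaks and shortened links, which makes tables like the one above hard to read. A minimal base R sketch of one way to tidy the text column before display, assuming two hand-made strings (with invented link IDs) shaped like the tweets above:

```r
# Hypothetical toy texts shaped like the `text` column above
text <- c(
  "Amazon debuts a scale model autonomous car.\nhttps://t.co/xxxxxxxxxx",
  "The ML and Theoretical Physics worlds collide! https://t.co/yyyyyyyyyy"
)

# Drop links, then collapse runs of whitespace (including newlines)
clean <- gsub("https?://\\S+", "", text)
clean <- trimws(gsub("\\s+", " ", clean))
clean
```

On real data the same two gsub() calls could be applied to mlt$text before passing it to kable().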

Timelines

## get the most recent 3000 tweets from cnn, BBCWorld, and foxnews
tmls <- get_timelines(c("cnn", "BBCWorld", "foxnews"), n = 3000)

## plot the frequency of tweets for each user over time
tmls %>%
  dplyr::filter(created_at > "2018-10-01") %>%
  dplyr::group_by(screen_name) %>%
  ts_plot("days", trim = 1L) +
  ggplot2::geom_point() +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    legend.title = ggplot2::element_blank(),
    legend.position = "bottom",
    plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of Tweets posted by news organization",
    subtitle = "Tweet counts aggregated by day from October 2018",
    caption = "\nSource: Data collected from Twitter's REST API via rtweet"
  )
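
The numbers behind a plot like this can also be inspected directly. The following is a rough sketch of a per-account daily count, assuming a hand-made tibble (toy, with invented timestamps) in place of the result of get_timelines(). It illustrates the kind of aggregation being plotted and is not rtweet's actual implementation.

```r
library(dplyr)

# Hypothetical toy timeline; real data would come from get_timelines()
toy <- tibble(
  screen_name = c("cnn", "cnn", "BBCWorld", "cnn"),
  created_at  = as.POSIXct(
    c("2018-10-01 08:00:00", "2018-10-01 17:30:00",
      "2018-10-01 09:15:00", "2018-10-02 11:45:00"),
    tz = "UTC"
  )
)

# Count tweets per account per calendar day (UTC)
daily <- toy %>%
  mutate(day = as.Date(created_at, tz = "UTC")) %>%
  count(screen_name, day)
daily
```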

Trends and Favorite Topics

## Get the 30 most recently favorited tweets by CBS
cbs <- get_favorites("cbs", n = 30)
select(cbs, location, text) %>% head() %>% kable()

location	text
	muppet arms Finally!!! Thank you thank you @CBS!!! Cant wait to see you grace my TV again @PhilKeoghan!!! https://t.co/RmxoSdNbLm
United States	This is my favorite part of #Rudolph does that make me weird? https://t.co/fXPiBccVol
Los Angeles	Clarice had maaaaaaad game. The way she fluttered those eyelashes and whispered in Rudolphs ear…. Those are the moves of a pro. Im over here taking !! #Rudolph #shethinksimcuuuute https://t.co/HQEGHsVVOS
	Can’t get enough of ’Rudolph the Red-nosed Reindeer…@cbs https://t.co/iZ6NGClUlc
Portland, OR	Time to start playing the ‘Rudolph’ music https://t.co/vASqKMHzil
Oakland, CA	Pound 4 Pound still THE best Xmas Special year after year. #Rudolph now 50+ yrs old! https://t.co/3HInygJbSb

## Discover what's currently trending in NYC.
ny <- get_trends("New York")
select(ny, trend, url) %>% head() %>% kable()

trend	url
#ALegendaryChristmas	http://twitter.com/search?q=%23ALegendaryChristmas
#Survivor	http://twitter.com/search?q=%23Survivor
#RockCenterXMAS	http://twitter.com/search?q=%23RockCenterXMAS
#STAR	http://twitter.com/search?q=%23STAR
#Riverdale	http://twitter.com/search?q=%23Riverdale
Roy Williams	http://twitter.com/search?q=%22Roy+Williams%22

TidyVerse Recipe - rtweet

humbertohp

November 27, 2018

References:

http://www.storybench.org/get-twitter-data-rtweet-r/ ; https://rtweet.info/
