I chose Jimmy Fallon for my search. This first chunk retrieves up to 1,000 recent tweets that other people have posted containing “Jimmy Fallon”, excluding retweets.
fallon <- search_tweets("Jimmy Fallon", n = 1000, include_rts = FALSE)
Using the above data, this next chunk creates a table of the 25 most retweeted tweets containing “Jimmy Fallon”.
fallon %>%
select(text, retweet_count) %>%
top_n(25, retweet_count) %>% # get the 25 most retweeted tweets (ranking column named explicitly)
arrange(desc(retweet_count)) %>% # sort in descending order of popularity
datatable()
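The ranking step can be tried without Twitter access. This is a minimal sketch on a made-up data frame (`toy_tweets` and its values are hypothetical, not real search results). It uses `slice_max()` from dplyr 1.0.0 or later as an alternative to `top_n()`, since it names the ranking column explicitly and returns the rows already sorted.

```r
library(dplyr)

# Hypothetical stand-in for the `fallon` data frame returned by search_tweets()
toy_tweets <- tibble(
  text = c("tweet a", "tweet b", "tweet c", "tweet d"),
  retweet_count = c(3L, 40L, 12L, 0L)
)

# Keep the 2 most retweeted tweets; slice_max() ranks by the named column
# and returns rows ordered from most to least retweeted
top_two <- toy_tweets %>%
  select(text, retweet_count) %>%
  slice_max(retweet_count, n = 2)

print(top_two)
```

With the real data, the same call with `n = 25` should be able to replace the `top_n()` and `arrange()` steps together.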
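This next chunk then retrieves the tweets that Jimmy Fallon himself has posted on his own Twitter timeline. It requests up to 5,000 of the most recent ones, although the Twitter API generally returns no more than about 3,200 tweets per timeline.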
fallon_tweets <- get_timeline("jimmyfallon", n = 5000)
Using the timeline retrieved above, this chunk counts the hashtags Jimmy Fallon uses most often and displays them in a table.
fallon_tweets %>%
select(hashtags) %>% # Focus on the hashtags
unnest(cols = c(hashtags)) %>% # Separate multiple hashtags into one row each
mutate(hashtags = tolower(hashtags)) %>% # make all hashtags lowercase
count(hashtags, sort = TRUE) %>% # count how often they appear
datatable() # create an interactive table
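To see what the unnesting step does, here is a small sketch on invented data. `toy_timeline` and its hashtags are hypothetical. It assumes, as the chunk above implies, that `hashtags` is a list-column with one character vector per tweet, and it further assumes that tweets without hashtags hold `NA`. The extra `filter()` is my addition so that tweets without hashtags are not counted as a hashtag called `NA`.

```r
library(dplyr)
library(tidyr)

# Hypothetical timeline: `hashtags` is a list-column, one vector per tweet
toy_timeline <- tibble(
  status_id = c("1", "2", "3"),
  hashtags = list(c("FallonTonight", "Hashtags"), "fallontonight", NA_character_)
)

hashtag_counts <- toy_timeline %>%
  select(hashtags) %>%
  unnest(cols = c(hashtags)) %>%        # one row per hashtag
  filter(!is.na(hashtags)) %>%          # drop tweets that had no hashtag (assumed NA)
  mutate(hashtags = tolower(hashtags)) %>%
  count(hashtags, sort = TRUE)          # most frequent hashtag first

print(hashtag_counts)
```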
This next chunk counts how many tweets he posted on each day and returns the result as a table.
fallon_tweets %>%
group_by(day = date(created_at)) %>% # extract the date, group by it
summarize(tweets_per_day = n()) # count the number of tweets each day
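This chunk now finds the average number of tweets Jimmy Fallon posts per day. Because the grouping only contains days that appear in the data, the average is taken over days on which he tweeted at least once.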
fallon_tweets %>%
group_by(day = date(created_at)) %>% # extract the date, group by it
summarize(tweets_per_day = n()) %>% # count the number of tweets each day
summarize(mean(tweets_per_day))
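The difference between averaging over active days and averaging over every calendar day can be checked on a toy example. The timestamps below are invented, and filling in silent days with `tidyr::complete()` is one possible approach that I am assuming here, not something the chunk above does.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical timestamps: 2 tweets on Mar 1, 1 on Mar 2, none on Mar 3, 1 on Mar 4
toy_timeline <- tibble(
  created_at = ymd_hms(c(
    "2020-03-01 14:00:00", "2020-03-01 18:30:00",
    "2020-03-02 09:15:00",
    "2020-03-04 20:45:00"
  ), tz = "UTC")
)

# Tweets per day, only for days that appear in the data
daily <- toy_timeline %>%
  count(day = date(created_at), name = "tweets_per_day")

# Mean over days with at least one tweet (what the chunk above computes)
mean_active <- mean(daily$tweets_per_day)

# Mean over every calendar day in the range, counting silent days as 0
mean_all <- daily %>%
  complete(day = seq(min(day), max(day), by = "day"),
           fill = list(tweets_per_day = 0L)) %>%
  summarize(avg = mean(tweets_per_day)) %>%
  pull(avg)

print(c(active_days = mean_active, all_days = mean_all))
```

Here three active days give 4/3 tweets per day, while the four calendar days give exactly 1, so the two definitions can differ noticeably when there are gaps.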
Now, using plotly, this next chunk shows the number of tweets per day from Jimmy Fallon as a histogram rather than a table.
fallon_tweets %>%
mutate(day = date(created_at)) %>%
plot_ly(x = ~day) %>%
add_histogram() %>%
layout(title = "Number of tweets from @jimmyfallon")
This next chunk then counts how many tweets Jimmy Fallon posts in each hour of the day. The timestamps are stored in UTC, so they are first converted to his local time zone, which I assumed to be Los Angeles.
fallon_tweets %>%
mutate(time = with_tz(created_at, "America/Los_Angeles")) %>%
mutate(time = hour(time)) %>%
count(time) %>%
datatable(options = list(pageLength = 24), rownames = FALSE)
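The effect of the time zone conversion is easy to verify on a couple of timestamps. This sketch uses two invented UTC times (not real tweets) and shows that `with_tz()` keeps the same instant but changes the clock reading, which can also move a tweet onto a different weekday.

```r
library(lubridate)

# Hypothetical UTC timestamps standing in for created_at
created_at <- ymd_hms(c("2020-01-15 03:30:00", "2020-07-15 18:05:00"), tz = "UTC")

# Same instants, shown on the Los Angeles clock
local_time <- with_tz(created_at, "America/Los_Angeles")

hour(created_at)   # hours in UTC: 3 and 18
hour(local_time)   # hours in Los Angeles: 19 (winter, UTC-8) and 11 (summer, UTC-7)

# The first tweet falls on Wednesday in UTC but on Tuesday in Los Angeles
wday(created_at)   # 4 and 4 (Sunday = 1)
wday(local_time)   # 3 and 4
```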
This next chunk then turns the table into a plotly histogram of how many tweets Jimmy Fallon posts at each hour.
fallon_tweets %>%
mutate(time = with_tz(created_at, "America/Los_Angeles")) %>% # convert to Los Angeles time zone
mutate(time = hour(time)) %>% # extract the hour
plot_ly(x = ~time) %>% # create plotly graph
add_histogram() %>% # make histogram
layout(title = "When Does @jimmyfallon Tweet?",
xaxis = list(title = "Time of Day (0 = midnight)"),
yaxis = list(title = "Number of Tweets"))
This next chunk then groups the tweets by day of the week, showing the total number of tweets Jimmy Fallon has posted on each weekday.
fallon_tweets %>%
mutate(Day = wday(with_tz(created_at, "America/Los_Angeles"), # weekday in the same time zone used for the hourly counts
label = TRUE)) %>% # use labels (Sun, Mon, etc) rather than numbers
count(Day) %>% # count the number of tweets on each weekday
datatable(rownames = FALSE)
The next chunk then converts the table into a histogram showing the same tweet counts for each weekday.
fallon_tweets %>%
mutate(Day = wday(with_tz(created_at, "America/Los_Angeles"), # weekday in the same time zone used for the hourly counts
label = TRUE)) %>% # use labels (Sun, Mon, etc) rather than numbers
plot_ly(x = ~Day) %>% # create plotly graph
add_histogram() %>% # make histogram
layout(title = "When Does @jimmyfallon Tweet?",
xaxis = list(title = "Days of the week"),
yaxis = list(title = "Number of Tweets"))
This last chunk combines the information from the histograms into a heat map showing at which hour of each weekday Jimmy Fallon tweets the most.
fallon_tweets %>%
mutate(local = with_tz(created_at, "America/Los_Angeles")) %>% # convert once so day and hour share a time zone
mutate(day = wday(local, label = TRUE)) %>%
mutate(hour = hour(local)) %>%
plot_ly(x = ~day, y = ~hour) %>%
add_histogram2d(nbinsx = 7, nbinsy = 24) %>%
layout(title = "When Does @jimmyfallon Tweet?",
xaxis = list(title = "Days of the week"),
yaxis = list(title = "Hour of day (0 = midnight)"))
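The numbers behind a heat map like this are just counts for each weekday and hour pair. As a rough sketch on invented timestamps (`toy_timeline` is hypothetical), the same binning can be written as a table, deriving both the weekday and the hour from one converted local time so that they always refer to the same clock.

```r
library(dplyr)
library(lubridate)

# Hypothetical UTC timestamps standing in for created_at
toy_timeline <- tibble(
  created_at = ymd_hms(c(
    "2020-01-15 03:30:00",  # Tuesday 19:30 in Los Angeles
    "2020-01-15 03:45:00",  # Tuesday 19:45 in Los Angeles
    "2020-01-15 20:00:00"   # Wednesday 12:00 in Los Angeles
  ), tz = "UTC")
)

heat_counts <- toy_timeline %>%
  mutate(local = with_tz(created_at, "America/Los_Angeles"),
         day = wday(local, label = TRUE),  # weekday on the local clock
         hour = hour(local)) %>%           # hour on the local clock
  count(day, hour)                         # one row per (weekday, hour) cell

print(heat_counts)
```

Each row of `heat_counts` corresponds to one coloured cell of the heat map, which makes it a convenient way to check the plot against the raw counts.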