Description:

This is an attempt at an analysis of Twitter data. Being a Marvel fan, I thought it would be fun to explore tweets about Marvel's new movie that came out this week, Avengers: Endgame.

Intentionally Not Displaying my_token to access Twitter but Validating its Authenticity

identical(my_token, get_token())
## [1] FALSE

Visualizing The Frequency of Tweets Using #AvengersEndgame

Per Thirty-Second Interval and Per Three-Hour Interval

library(rtweet)   # provides ts_plot()
library(ggplot2)
# Tweet counts in thirty-second bins
ts_plot(avenger_tweet, by = "30secs", tz = "UTC") +
  ggplot2::theme_minimal() +
  ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of #AvengersEndgame Twitter status",
    subtitle = "Tweet counts using thirty-second intervals",
    caption = "Collected From Twitter's API"
  )

library(ggplot2)
ts_plot(avenger_tweet, by = "3hours", tz = "UTC") +
  ggplot2::theme_minimal() +
  ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of #AvengersEndgame Twitter status",
    subtitle = "Tweet counts using three-hour intervals",
    caption = "Collected From Twitter's API"
  )

Avengers: Endgame was released between April 24 and 26, depending on the country. In the United States, it opened on April 26. The three-hour plot puts into perspective that with every passing hour, more people around the world are tweeting about the movie. The movie, created by Marvel Studios and Disney, is expected to shatter box office records.
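To make the binning behind these plots concrete, here is a minimal base R sketch of counting tweets per three-hour interval. It uses a handful of made-up timestamps in place of avenger_tweet$created_at, and it only illustrates the idea. It is not necessarily how ts_plot() aggregates internally. The names created_at, interval_secs, bin_start, and tweet_counts are my own.

```r
# Hypothetical timestamps standing in for avenger_tweet$created_at (not real data)
created_at <- as.POSIXct(
  c("2019-04-25 00:10:00", "2019-04-25 01:30:00", "2019-04-25 03:05:00",
    "2019-04-25 04:00:00", "2019-04-25 05:59:00", "2019-04-25 06:00:00"),
  tz = "UTC"
)

# Width of one bin in seconds (three hours)
interval_secs <- 3 * 60 * 60

# Round each timestamp down to the start of its three-hour bin (UTC)
bin_start <- as.POSIXct(
  floor(as.numeric(created_at) / interval_secs) * interval_secs,
  origin = "1970-01-01", tz = "UTC"
)

# Count how many tweets fall into each bin
tweet_counts <- as.data.frame(
  table(bin_start = format(bin_start, "%Y-%m-%d %H:%M")),
  responseName = "n"
)
print(tweet_counts)
```

Changing interval_secs to 30 would give the thirty-second version of the same count.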

Count of Tweets by Platform

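# Count tweets per posting platform and compute each platform's share of all tweets
aveng_platform <- avenger_tweet %>%
  group_by(source) %>%
  summarize(n = n()) %>%
  mutate(percent_of_tweets = n / sum(n)) %>%
  arrange(desc(n))
# Show the five most common platforms
aveng_platform %>% slice(1:5)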
## # A tibble: 5 x 3
##   source                  n percent_of_tweets
##   <chr>               <int>             <dbl>
## 1 Twitter for Android  3590            0.451 
## 2 Twitter for iPhone   2500            0.314 
## 3 Twitter Web Client    728            0.0915
## 4 Twitter Web App       461            0.0580
## 5 Instagram             178            0.0224

Most Recent 7000 Tweets from the Avengers and Marvel Studios Accounts

library(rtweet)
library(dplyr)
library(tidyverse)
library(ggplot2)
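# Note: the Twitter API returns at most roughly 3,200 recent tweets per account,
# so n = 7000 acts as an upper bound on the request
account_timeline <- get_timelines(c("Avengers", "MarvelStudios"), n = 7000)
## plot daily tweet counts per account since April 19, 2019
account_timeline %>%
  # compare against an explicit UTC date-time rather than a bare string
  dplyr::filter(created_at > as.POSIXct("2019-04-19", tz = "UTC")) %>%
  dplyr::group_by(screen_name) %>%
  ts_plot("days") +
  ggplot2::geom_point() +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    legend.title = ggplot2::element_blank(),
    legend.position = "bottom",
    plot.title = ggplot2::element_text(face = "bold")
  ) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of Tweets posted by Marvel Platforms",
    subtitle = "By Day",
    caption = "Collected from Twitter's API"
  )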

* With April 22 being the launch party for Avengers: Endgame, it is no surprise that the tweet frequency spiked. It is also interesting to see that when one of the accounts is less active, the other account seems to pick up the slack. This is visible in the days April 22-24.
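One rough way to check the "picks up the slack" impression is to compare the two accounts' daily counts directly. The sketch below uses a small invented timeline in place of account_timeline (the rows are made up, not collected tweets), and toy_timeline, daily_counts, and slack_cor are hypothetical names. A negative correlation would be consistent with the accounts alternating, though a few days of data are far too few to conclude anything.

```r
# Hypothetical timeline standing in for account_timeline (not real tweet data)
toy_timeline <- data.frame(
  screen_name = c("Avengers", "Avengers", "Avengers", "MarvelStudios",
                  "MarvelStudios", "MarvelStudios", "MarvelStudios", "Avengers"),
  created_at = as.POSIXct(
    c("2019-04-22 10:00:00", "2019-04-22 15:00:00", "2019-04-22 20:00:00",
      "2019-04-22 12:00:00", "2019-04-23 09:00:00", "2019-04-23 13:00:00",
      "2019-04-23 18:00:00", "2019-04-24 11:00:00"),
    tz = "UTC"
  )
)

# Reduce each timestamp to its calendar day (UTC)
toy_timeline$day <- format(toy_timeline$created_at, "%Y-%m-%d", tz = "UTC")

# Rows are days, columns are accounts, cells are tweet counts
daily_counts <- table(toy_timeline$day, toy_timeline$screen_name)
print(daily_counts)

# Correlation between the two accounts' daily counts
slack_cor <- cor(daily_counts[, "Avengers"], daily_counts[, "MarvelStudios"])
print(slack_cor)
```

With real data, the same table could be built from account_timeline after the date filter, using its screen_name and created_at columns.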

Summary:

While I do not use Twitter as a social media platform myself, it was amusing to work with Twitter and analyze its tweets. Because this assignment allowed us to be more creative, I wanted to implement visualizations that we have not used yet. The rtweet package allowed that. I am excited to learn more about the rtweet package and about working with Twitter data.